ScalingPersonalizedWebSearch-ComputerScience缩放的个性化Web搜索-计算机科学

上传人:ra****d 文档编号:251984162 上传时间:2024-11-11 格式:PPT 页数:31 大小:210.50KB
返回 下载 相关 举报
ScalingPersonalizedWebSearch-ComputerScience缩放的个性化Web搜索-计算机科学_第1页
第1页 / 共31页
ScalingPersonalizedWebSearch-ComputerScience缩放的个性化Web搜索-计算机科学_第2页
第2页 / 共31页
ScalingPersonalizedWebSearch-ComputerScience缩放的个性化Web搜索-计算机科学_第3页
第3页 / 共31页
点击查看更多>>
资源描述
Click to edit Master title style,Click to edit Master text styles,Second level,Third level,Fourth level,Fifth level,*,*,Scaling Personalized Web Search,Glen Jeh,Jennfier Widom Stanford University,Presented by Li-Tal Mashiach,SearchEngineTechnology course(236620),Technion,Todays topics,Overview,Motivation,Personal PageRank Vector,Efficient calculation of PPV,Experimental results,Discussion,PageRank Overview,Ranking method of web pages based on the link structure of the web,Important pages are those linked-to by many important pages,Original PageRank has no initial preference for any particular pages,PageRank Overview,The,ranking is based on the probability that a,random surfer,will visit a certain page at a given time,E(p),can be:,Uniformly distributed,Biased distributed,Motivation,We would like to give higher importance to user selected pages,User may have a set,P,of,preferred pages,Instead of jumping,to,any,random page,with probability c,the jump is restricted to P,That way,we increase the probability that the random surfer will stay in the near environment of pages in P,Considering P will create a,personalized view,of the importance of pages on the web,Personalized PageRank Vector(PPV),Restrict preference sets P to subsets of a set of hub pages H-set of pages with high PageRank,PPV is a vector of length n,where n is the number of pages on the web,PPVp=the importance of page p,PPV Equation,u preference vector,|u|=1,u(p)=the amount of preference for page p,A n,x,n matrix,c the probability the random surfer jumps to a page in P,PPV Problem,Not practical to compute PPVs during query time,Not practical to compute and store offline,There are preference sets,How to calculate PPV?How to do it efficiently?,Main Steps to solution,Break down,preference vectors,into,common,components,Computation divided between,offline,(lots of time)and,online,(focused computation),Eliminates,redundant,computation,Linearity Theorem,The solution to a linear combination of preference vectors is the same linear combination of the corresponding PPVs .,Let,x,i,be a,unit vector,Let,r,i,be the PPV corresponding to x,i,called,hub vector,Example,r1,r2,r12,x1,x2,x12,Personal preferences of David,rk,Good,but not enough,If hub vector r,i,for each page in H can be computed ahead of time and stored,then computing PPV is easier,The number of pre-computed PPV decrease from to|H|.,But.,Each hub vector computation requires multiple scans of the web graph,Time and space grow linearly with|H|,The solution so far is impractical,Decomposition of Hub Vectors,In order to compute and store the hub vectors efficiently,we can further break them down into,Partial vector,unique component,Hubs skeleton,encode interrelationships among hub vectors,Construct into full,hub vector,during query time,Saves computation time and storage due to sharing of components among hub vectors,Inverse P-distance,Hub vector,r,p,can be represented as,inverse,P-distance vector,l(t)the number of edges in path t,P(t)the probability of traveling on path t,Partial Vectors,Breaking,r,p,into into two components:,Partial Vectors-,computed without using any intermediate nodes from H,The rest,For well-chosen sets H,it will be true that for many pages p,q,Partial Vector,Paths that going through some page,Precompute and store the partial vector,Cheaper to compute and store than,Decreases as|H|increases,Add at query time to compute the full hub vector,But,Computing and storing could be expensive as itself,Good,but not enough,Hubs Skeleton,Breaking down :,Hubs skeleton,-The set of distances among hub,giving the interrelationships among partial vectors,for each p,r,p,(H),has size at most|H|,much smaller than the full hub vector,Partial Vectors,Hubs skeleton,Handling the case p or q is itself in H,Paths that go through some page,Example,H,a,b,d,c,Putting it all together,Given a chosen reference set P,Form a preference vector u,Calculate hub vector for each,i,k,Combine the hub vectors,Pre-computed of partial vectors,Hubs skeleton may be deferred to query time,Algorithms,Decomposition theorem,Basic dynamic programming algorithm,Partial vectors-Selective expansion algorithm,Hubs skeleton-Repeated squaring algorithm,Decomposition theorem,The basis vector,r,p,is the average of the basis vectors of its out-neighbors,plus a compensation factor,Define relationships among basis vectors,Having computed the basis vectors of ps out-neighbors to certain precision,we can use the theorem to compute,r,p,to greater precision,Basic dynamic programming algorithm,Using the decomposition theory,we can build a,dynamic programming algorithm,which iteratively improves the precision of the calculation,On iteration k,only paths with length k-1 are being considered,The error is reduced by a factor of 1-c on each iteration,Computing partial vectors,Selective expansion algorithm,Tours passing through a hub page H are never considered,The expansion from p will stop when reaching page from H,Computing hubs skeleton,Repeated squari
展开阅读全文
相关资源
正为您匹配相似的精品文档
相关搜索

最新文档


当前位置:首页 > 商业管理 > 商业计划


copyright@ 2023-2025  zhuangpeitu.com 装配图网版权所有   联系电话:18123376007

备案号:ICP2024067431-1 川公网安备51140202000466号


本站为文档C2C交易模式,即用户上传的文档直接被用户下载,本站只是中间服务平台,本站所有文档下载所得的收益归上传人(含作者)所有。装配图网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私,请立即通知装配图网,我们立即给予删除!