XML代价估计方法研究综述课件

上传人:txadgkn****dgknqu... 文档编号:252471550 上传时间:2024-11-16 格式:PPT 页数:45 大小:401.56KB
返回 下载 相关 举报
XML代价估计方法研究综述课件_第1页
第1页 / 共45页
XML代价估计方法研究综述课件_第2页
第2页 / 共45页
XML代价估计方法研究综述课件_第3页
第3页 / 共45页
点击查看更多>>
资源描述
,Click to edit Master title style,Click to edit Master text styles,Second level,Third level,Fourth level,Fifth level,*,单击此处编辑母版标题样式,单击此处编辑母版文本样式,第二级,第三级,第四级,第五级,*,XML,代价估计方法研究综述,XML Group,XML 代价估计方法研究综述,1,Outline,研究代价估计的必要性,xml,中代价估计研究的难点,背景知识介绍,现有方法分类,Outline研究代价估计的必要性,2,研究代价估计的必要性,查询优化的基础,不同分支对查询目标的选择率不同,如果选择性低的分支先于选择性高的分支执行,就会有效的减少中间结果从而提供查询效率,结构连接在,XML,中是一个很重要的操作,连接的顺序的选择需要计算不同连接顺序的代价,及早给用户提供反馈信息,研究代价估计的必要性查询优化的基础,3,xml,中代价估计研究的难点,XML,数据的不规则性是对传统统计信息方法的重要挑战,具体表现在以下几个方面:,路径依赖性,,如同为,name,结点,在,person,下和在,city,下出现意义就完全不同,结构相关性,嵌套在不同祖先下面的同类结点的个数差别可能很大,如,book,结点下的,author,个数是不确定的,值相关性,/purchase/Itemtype=bookprice100,XML,的有序性制约了转换规则的灵活性,所有这些问题,都使得在,xml,中采用传统的代价估计方法不切实际,会带来很大的误差。针对,xml,数据的特点,我们应该寻求一种新的代价估计方法。,xml中代价估计研究的难点XML数据的不规则性是对传统统计信,4,Background Knowledge,XML Data Model,A large,node-labeled tree T(V,E),d,0,a,1,a,2,a,3,p,4,p,5,n,6,t,14,k,15,y,13,1999,y,16,t,17,k,18,k,19,n,7,b,9,p,8,y,20,t,21,k,22,y,24,t,25,k,26,n,10,b,12,p,9,t,23,t,26,2002,2001,2000,Example XML Document Tree,Background Knowledge XML Data,5,Background Knowledge,XML Data Model,A large,node-labeled tree T(V,E),d,0,a,1,a,2,a,3,p,4,p,5,n,6,t,14,k,15,y,13,1999,y,16,t,17,k,18,k,19,n,7,b,9,p,8,y,20,t,21,k,22,y,24,t,25,k,26,n,10,b,12,p,9,t,23,t,26,2002,2001,2000,Example XML Document,Background Knowledge XML Data,6,Background Knowledge,XML Query Model,A node-labeled query tree TQ,Each node of T,Q,is labeled with a variable name q,i,in Q,Each edge(q,i,q,j,)of T,Q,is annotated with an XPath expression path(q,i,q,j,)that describes the specific structural constraints specified in Q,Query Fragment,:,for$b in doc(“bib.xml”)/bib/book,$a in$b/author,$t in$b/title,bib,book,author,title,root,$b,$a,$t,Twig Query Model,Background KnowledgeXML Query,7,现有估计方法的分类,路径选择性计算,Twig,匹配个数统计,值谓词选择率估计,结构连接代价估计,现有估计方法的分类路径选择性计算,8,Overview,data tree,different summarization methods based on:,path,all,m,-length path,F/B-stability,Lore,McHugh et al,VLDB99,pruning,PathTree,Markov Table,Aboulnaga et al,VLDB01,XSketch,Polyzotis et al,VLDB02,t,ype,XSketches,Polyzotis et al,SIGMOD02,+value,+,twig,XSketches,Polyzotis et al,SIGMOD0,4,Region Code,2D Model,1D Model,StatiX,Freire et al,SIGMOD02,PH Histogram,Wu Y EDBT 2002,Interva&Position Model,H.Lu,SIGMOD 2003,Overviewdata treedifferent sum,9,路径选择性计算,问题描述,/,a/bc/d/eg/e/f/*/*/e/f,路径选择性计算问题描述 /a/bc/d/eg/,10,Path Tree&,Markov Table,Ashraf Aboulnaga,Alaa R.Alameldeen,and Jeffry F.Naughton.Estimating the selectivity of XML path expressions for Internet scale applications.VLDB 2001,Path Tree&Markov Table Ashra,11,Path Tree,Construction,Start from an original path tree,Count of nodes in,absolute paths,Aggregate the tree to fit in the limited space,Sibling-*,Estimation,Match the query against the path tree,matching*as the last resort.,Need to decide whether to use total count or average count.,Path TreeConstruction,12,An XML document and its path tree,A,1,B,2,C,1,D,1,D,1,E,3,Estimation:count(A/B/D)=1,count(A/C/D)=1,An XML document and its path t,13,Aggregate Path Tree,A,1,C,9,B,13,G,10,F,15,H,6,K,12,E,5,D,7,K,11,J,4,I,2,Aggregate Path TreeA1C9B13G10F,14,Aggregate Path Tree,A,1,C,9,B,13,G,10,F,15,H,6,K,12,E,5,D,7,K,11,J,4,I,2,红色结点表示那些不频繁出现的结点,需要删除,从而保证,path,树能够放入内存,Aggregate Path TreeA1C9B13G10F,15,Sibling-*Summarization,A,1,C,9,B,13,*,F,15,*,K,*,f=23,n=2,6,8,3,把查询和,path,树匹配,需要决定用总数还是平均数,估计方法,Estimation:count(/C/H/K)=11,count(/C/*/K)=23,Sibling-*SummarizationA1C9B13,16,Markov Table,构造一个表,存储长度=2。,当查询路径长度=,m,时,可以直接从表中读出结果,否则,用公式计算。,例如,,m3,,查询为/,A/B/C/D,,则,当表不能装入内存的时候,进行剪枝操作。,f(B/C/D),f(B/C),f(A/B/C/D),f(A/B/C),Markov Table构造一个表,存储长度=m的不同路径,17,Markov Table:Example,a,b,b,c,d,d,e,e,e,f,d,a,1,a/b,2,b,2,a/c,1,c,1,b/d,2,d,3,c/d,1,e,3,c/e,3,f,1,c/f,1,a,b,d,d,e,1,2,2,1,3,c,f,1,1,m=2,b,2,c/e,3,d,3,*,3/3,e,3,c/*,2/2,a/b,2,*,/*,1/1,b/d,2,suffix-*,Q1:a/c/e,Markov Table:Exampleabbcdde,18,Twig,匹配个数统计,问题描述,Twig Query(basic building block of declarative query languages for XML),FOR$f IN document(“person.xml”)/department/faculty,FOR$r in$f/RA,FOR$t in$f/TA,Department,Faculty,RA,TA,$f,$r,$t,Twig 匹配个数统计问题描述FOR$f IN docum,19,XSketch,方法,N.Polyzotis,M.Garofalakis.Statistical Synopses for Graph-Structured XML Databases,In Proc.of ACM SIGMOD Conf.2002,N.Polyzotis and M.Garofalakis.Structure and value synopses for xml data graphs.In Proceedings of 28,th,VLDB Conf.,2002.,N.Polyzotis,M.Garofalakis,and Yannis Iosnnidis.Selectivity Estimation for XML Twigs.In Proc.of the 20th Intl.Conf.on Data Engineering,2004,N.Polyzotis and M.Garofalakis and Y.Ioannidis.Approximate XML Query Answers.In Proc.ACM SIGMOD 2004,XSketch方法 N.Polyzotis,M.Garof,20,d,0,a,1,a,2,a,3,p,4,p,5,n,6,t,14,k,15,y,13,1999,y,16,t,17,k,18,k,19,n,7,b,9,p,8,y,20,t,21,k,22,y,24,t,25,k,26,n,10,b,12,p,9,t,23,t,26,2002,2001,2000,D(1),A(3),N(3),P(4),B(2),Y(4),K(6),T(6),H(Y),B/F,B/F,B/F,B/F,B,B/F,F,F,XSketch,方法,d0a1a2a3p4p5n6t14k15y131999y16,21,B/F stability,Definition,:,Let U,V be sets of elements in an XML data Tree T.We say that V is backward-stable(B-stale)with respect to U,iff for each vV there exists a u U such tha
展开阅读全文
相关资源
正为您匹配相似的精品文档
相关搜索

最新文档


当前位置:首页 > 办公文档 > 教学培训


copyright@ 2023-2025  zhuangpeitu.com 装配图网版权所有   联系电话:18123376007

备案号:ICP2024067431-1 川公网安备51140202000466号


本站为文档C2C交易模式,即用户上传的文档直接被用户下载,本站只是中间服务平台,本站所有文档下载所得的收益归上传人(含作者)所有。装配图网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私,请立即通知装配图网,我们立即给予删除!