清华大学数据库原理课件第七章

上传人:无*** 文档编号:241589683 上传时间:2024-07-07 格式:PPT 页数:61 大小:473.50KB
返回 下载 相关 举报
清华大学数据库原理课件第七章_第1页
第1页 / 共61页
清华大学数据库原理课件第七章_第2页
第2页 / 共61页
清华大学数据库原理课件第七章_第3页
第3页 / 共61页
点击查看更多>>
资源描述
数据库原理2003年/秋Chapter7RelationalDatabaseDesign(c1)12Main ContentslBoyce-Codd Normal FormlThird Normal FormlMultivalued Dependencies lFourth Normal FormlOverall Database Design Process3Boyce-Codd Normal Forml is trivial(i.e.,)l is a superkey for RA relation schema R is in BCNF with respect to a set F of functional dependencies if for all functional dependencies in F+of the form ,where R and R,at least one of the following holds:4ExamplelR=(A,B,C)F=A B B CKey=AlR is not in BCNFlDecomposition R1=(A,B),R2=(B,C)R1 and R2 in BCNFLossless-join decompositionDependency preserving5Testing for BCNFlTo check if a non-trivial dependency in F+causes a violation of BCNFpute+(the attribute closure of)2.verify that it includes all attributes of R,that is,it is a superkey of R6Simplified Testing for BCNFlTo check if a relation schema R with a given set of functional dependencies F is in BCNF,it suffices to check only the dependencies in the given set F for violation of BCNF,rather than checking all dependencies in F+We can show that if none of the dependencies in F causes a violation of BCNF,then none of the dependencies in F+will cause a violation of BCNF either7No Test on a Decomposition of RlConsider R(A,B,C,D),with F=A B,B CDecompose R into R1(A,B)and R2(A,C,D)Neither of the dependencies in F contain only attributes from(A,C,D)so we might be mislead into thinking R2 satisfies BCNFIn fact,dependency A C in F+shows R2 is not in BCNFlHowever,using only F is incorrect when testing a relation in a decomposition of R8BCNF Decomposition Algorithmresult:=R;done:=false;compute F+;while(not done)doif(there is a schema Ri in result that is not in BCNF)then beginlet be a nontrivial functional dependency that holds on Ri such that Ri is not in F+,and =;result:=(result Ri)(Ri )(,);endelse done:=true;Ri-Ri?9Example of BCNF DecompositionlR=(branch-name,branch-city,assets,customer-name,loan-number,amount)F=branch-name assets branch-cityloan-number amount branch-nameKey=loan-number,customer-namelDecompositionR1=(branch-name,branch-city,assets)R2=(branch-name,customer-name,loan-number,amount)R3=(branch-name,loan-number,amount)R4=(customer-name,loan-number)lFinal decomposition R1,R3,R410Testing Decomposition with F+lA relation Ri in a decomposition of RTest Ri for BCNF with respect to the restriction of F to Ri (that is,all FDs in F+that contain only attributes from Ri)11Testing Decomposition with FlA relation Ri in a decomposition of Rfor every set of attributes Ri,check that+(the attribute closure of)either includes no attribute of Ri-,or includes all attributes of RiIf the condition is violated by some in F,the dependency (+-)Rican be shown to hold on Ri,and Ri violates BCNFWe use above dependency to decompose Ri12BCNF and DPIt is not always possible to get a BCNF decomposition that is dependency preserving,overnormalizelR=(J,K,L)F=JK L,L KTwo candidate keys=JK and JLlR is not in BCNFlAny decomposition of R will fail to preserve JK L13Main ContentslBoyce-Codd Normal FormlThird Normal FormlMultivalued Dependencies lFourth Normal FormlOverall Database Design Process14Motivation of 3NFlThere are some situations where BCNF is not dependency preserving,and Efficient checking for FD violation on updates is importantlSolutiondefine a weaker normal form,called Third Normal Form(3NF)15Third Normal FormlA relation schema R is in third normal form if for all:in F+at least one of the following holds:is trivial(i.e.,)is a superkey for REach attribute A in is contained in a candidate key for RlCan the candidate key be replaced by superkey?lShould each attribute be in the same candidate key?163NF and BCNFlIf a relation is in BCNF it is in 3NF(since in BCNF one of the first two conditions above must hold)lThird condition is a minimal relaxation of BCNF to ensure dependency preservation17Example of 3NFlR=(J,K,L)Two candidate keys,JK and JLR is in 3NFJK LJK is a superkeyL KK is contained in a candidate keylBCNF decomposition has (JL)and(LK)F=JK L,L KTesting for JK L requires a join18Redundancy in 3NFlThere is some redundancy in this schemalEquivalent to example in book:Banker-schema=(branch-name,customer-name,banker-name)banker-name branch-namebranch-name customer-name banker-nameb1 c1 k1b1 c2 k1b1 c1 k219Testing for 3NFlTesting for 3NF has been shown to be NP-hardlNeed to check only FDs in F,need not check all FDs in F+lUse attribute closure to check,for each dependency ,if is a superkeylIf is not a superkey,we have to verify if each attribute in is contained in a candidate key of R203NF Decomposition AlgorithmLet Fc be a canonical cover for F;i:=0;for each functional dependency in Fc do if none of the schemas Rj,1 j i contains then begini:=i +1;Ri :=end/end of if and forif none of the schemas Rj,1 j i contains a candidate key for Rthen begini:=i +1;Ri:=any candidate key for R;end/end of ifreturn(R1,R2,.,Ri)213NF Decomposition Algorithm(c1)lThe previous algorithm ensureseach relation schema Ri is in 3NFdecomposition is dependency preserving and lossless-jointhe result is not uniquely defined 22ExamplelRelation schemaBanker-info-schema=(branch-name,customer-name,banker-name,office-number)lThe functional dependencies banker-name branch-name office-numbercustomer-name branch-name banker-namelThe keycustomer-name,branch-name23Banker-info-schema to 3NFlThe for loop in the algorithm causes us to include the following schemas in our decomposition:Banker-office-schema=(banker-name,branch-name,office-number)Banker-schema=(customer-name,branch-name,banker-name)lSince Banker-schema contains a candidate key for Banker-info-schema,we are done with the decomposition process24Comparison of BCNF and 3NFlIt is always possible to decompose a relation into relations in 3NF and the decomposition is lossless,andthe dependencies are preservedlIt is always possible to decompose a relation into relations in BCNF and the decomposition is lossless,howerverit may not be possible to preserve dependencies25Example in 3NFJj1j2j3nullLl1l1l1l2Kk1k1k1k2Example of problems due to redundancy in 3NFlR=(J,K,L)F=JK L,L KJ=street,K=city,and L=zip26Problems in 3NF lA schema that is in 3NF but not in BCNF has the problems of repetition of information(e.g.,the relationship l1,k1)need to use null values(e.g.,to represent the relationship l2,k2 where there is no corresponding value for J).27Tradeoff in DesignlGoal for a relational database design is:BCNFLossless joinDependency preservationlIf we cannot achieve this,we accept one ofLack of dependency preservation Redundancy due to use of 3NF28Main ContentslBoyce-Codd Normal FormlThird Normal FormlMultivalued Dependencies lFourth Normal FormlOverall Database Design Process29Beyond BCNFlThere are database schemas in BCNF that do not seem to be sufficiently normalized lConsider a database classes(course,teacher,book)such that(c,t,b)classes means that t is qualified to teach c,and b is a required textbook for c30lThe database is supposed to list for each course the set of teachers any one of which can be the courses instructor,and the set of books,all of which are required for the course(no matter who teaches it)A Classes Instancecourseteacherbookdatabasedatabasedatabasedatabasedatabasedatabaseoperating systemsoperating systemsoperating systemsoperating systemsAviAviHankHankSudarshanSudarshanAviAvi Jim Jim DB ConceptsUllmanDB ConceptsUllmanDB ConceptsUllmanOS ConceptsShawOS ConceptsShawcoursesteachersbooksC-T-B31lSince there are not non-trivial dependencies,(course,teacher,book)is the only key,and therefore the relation is in BCNF lInsertion anomalies(插入异常)if Sara is a new teacher that can teach database,two tuples need to be inserted(database,Sara,DB Concepts)(database,Sara,Ullman)Problems in Classes Table32lTherefore,it is better to decompose classes into:courseteacherdatabasedatabasedatabaseoperating systemsoperating systemsAviHankSudarshanAvi Jimteachescoursebookdatabasedatabaseoperating systemsoperating systemsDB ConceptsUllmanOS ConceptsShawtextDecompose Classes Table33MVD DefinitionLet R be a relation schema and let R and R lThe multivalued dependency(多值依赖)holds on R if in any legal relation r(R),for all pairs for tuples t1 and t2 in r such that t1=t2,there exist tuples t3 and t4 in r such that:t1=t2 =t3=t4 t3 =t1 t3R =t2R t4 =t2 t4R =t1R 34Tabular representation of t1=t2 =t3=t4 t3 =t1 t3R =t2R t4 =t2t4R =t1R 35Another MVD DefinitionLet R be a relation schema with a set of attributes that are partitioned into 3 nonempty subsets(非空子集)Y,Z,WlWe say that Y Z(Y multidetermines Z)if and only if for all possible relations r(R)r and rthen r and rlTrivial multivalued dependency,if W is emptyNote that since the behavior of Z and W are identical it follows that Y Z if Y W 36MVD ExamplelIn our example:course teachercourse booklThe above formal definition is supposed to formalize the notion that given a particular value of Y(course)it has associated with it a set of values of Z(teacher)and a set of values of W(book),and these two sets are in some sense independent of each otherlNote:If Y Z then Y ZIndeed we have(in previous slide)Z1=Z2R=Y Z (R-Y-Z)37Use of MVDlWe use multivalued dependencies in two ways:1.To test relations to determine whether they are legal under a given set of functional and multivalued dependencies2.To specify constraints on the set of legal relations.We shall thus concern ourselves only with relations that satisfy a given set of FDs and MVDs38ConstructionlIf a relation r fails to satisfy a given multivalued dependency,we can construct a relations r that does satisfy the multivalued dependency by adding tuples to r lExample in bookBC-schema39Theory of MVDslFrom the definition of multivalued dependency,we can derive the following rule:If ,then That is,every functional dependency is also a multivalued dependency40Closure D+of DlThe closure(闭包闭包)D+of D is the set of all functional and multivalued dependencies logically implied by D We can compute D+from D,using the formal definitions of FDs and MVDsWe can manage with such reasoning for very simple multivalued dependencies,which seem to be most common in practiceFor complex dependencies,it is better to reason about sets of dependencies using a system of inference rules41Main ContentslBoyce-Codd Normal FormlThird Normal FormlMultivalued Dependencies lFourth Normal FormlOverall Database Design Process42Fourth Normal FormlA relation schema R is in 4NF with respect to a set D of functional and multivalued dependencies if for all multivalued dependencies in D+of the form ,where R and R,at least one of the following hold:is trivial(i.e.,or =R)is a superkey for schema RlIf a relation is in 4NF it is in BCNF43Restriction of MVDslThe restriction of D to Ri is the set Di consisting ofAll functional dependencies in D+that include only attributes of RiAll multivalued dependencies of the form (Ri)where Ri and is in D+444NF Decomposition Algorithm result:=R;done:=false;compute D+;Let Di denote the restriction of D+to Ri while(not done)if(there is a schema Ri in result that is not in 4NF)then begin let be a nontrivial multivalued dependency that holds on Ri such that Ri is not in Di,and;result:=(result-Ri)(Ri-)(,);end else done:=true;45Decomposition ExamplelR=(A,B,C,G,H,I)F=A B,B HI,CG H lR is not in 4NF since A B and A is not a superkey for RlDecompositiona)R1=(A,B)(R1 is in 4NF)b)R2=(A,C,G,H,I)(R2 is not in 4NF)c)R3=(C,G,H)(R3 is in 4NF)d)R4=(A,C,G,I)(R4 is not in 4NF)46Decomposition Example(c1)Since A B and B HI,A HI,A Ie)R5=(A,I)(R5 is in 4NF)f)R6=(A,C,G)(R6 is in 4NF)47Main ContentslBoyce-Codd Normal FormlThird Normal FormlMultivalued Dependencies lFourth Normal FormlOverall Database Design Process48Overall Database Design ProcesslWe have assumed schema R is given1.R could have been generated when converting E-R diagram to a set of tables2.R could have been a single relation containing all attributes that are of interest(called universal relation 泛关系泛关系)lNormalization breaks R into smaller relations.3.R could have been the result of some ad hoc design of relations,which we then test/convert to normal form49ER Model and NormalizationlWhen an E-R diagram is carefully designed,identifying all entities correctly,the tables generated from the E-R diagram should not need further normalizationlHowever,in a real(imperfect)design there can be FDs from non-key attributes of an entity to other attributes of the entity50ER Model and Normalization(c1)lExampleemployee entity with attributes department-number and department-address,and an FD department-number department-addressGood design would have made department an entitylFDs from non-key attributes of a relationship set possible,but rare most relationships are binary 51Universal RelationlDangling tuples Tuples that“disappear”in computing a joinLet r1(R1),r2(R2),.,rn(Rn)be a set of relationsA tuple r of the relation ri is a dangling tuple if r is not in the relation:Ri(r1 r2 rn)lThe relation r1 r2 rn is called a universal relation since it involves all the attributes in the“universe”defined by R1 R2 Rn 52Dangling TupleslIf dangling tuples are allowed in the database,instead of decomposing a universal relation,we may prefer to synthesize a collection of normal form schemas from a given set of attributeslDangling tuples may occur in practical database applicationslThey represent incomplete information 53Example for Dangling TupleslWe may want to break up information about loans into:(branch-name,loan-number)(loan-number,amount)(loan-number,customer-name)lUniversal relation would require null values,and have dangling tuples54Universal Relation ApproachlA particular decomposition defines a restricted form of incomplete information that is acceptable in our databaseAbove decomposition requires at least one(至少一个)of customer-name,branch-name or amount in order to enter a loan number without using null valuesRules out storing of customer-name,amount without an appropriate loan-number(since it is a key,it cant be null either!)55Universal Relation Approach(c1)lUniversal relation requires unique attribute names unique role assumptione.g.customer-name,branch-namelReuse of attribute names is natural in SQL since relation names can be prefixed to disambiguate names(消除名字的二义性)56Denormalization for PerformancelExample displaying customer-name along with account-number and balance requires join of account with depositorlAlternative 1 Use denormalized relation containing attributes of account as well as depositor with all above attributeslfaster lookuplExtra space and extra execution time for updateslextra coding work for programmer and possibility of error in extra code57Denormalization for Performance(c1)lAlternative 2use a materialized view defined as account depositorlBenefits and drawbacks same as above,except no extra coding work for programmer and avoids possible errors58QuestionlRelationship among 3NF,BCNF and 4NFGive an example of 3NF,BCNF,and 4NF respectively59HomeworklExercise 7.11Compute the closure of the set of functional dependencies,lExercise 7.29explain why 4NF is normal form more desirable than BCNF 60SQL&FDlSQL does not provide a direct way of specifying functional dependencies other than primary key or unique constraintslWe can specify FDs using assertions(断言),but they are expensive to testlEven if we had a dependency preserving decomposition,using SQL we would not be able to efficiently test a functional dependency whose left hand side is not a key61Testing for FDs Across RelationslIf decomposition is not dependency preserving,we can have an extra materialized view(物化视物化视图图)for each dependency in Fc that is not preserved in the decompositionThe materialized view is defined as a projection on of the join of the relations in the decomposition
展开阅读全文
相关资源
相关搜索

最新文档


当前位置:首页 > 管理文书 > 施工组织


copyright@ 2023-2025  zhuangpeitu.com 装配图网版权所有   联系电话:18123376007

备案号:ICP2024067431-1 川公网安备51140202000466号


本站为文档C2C交易模式,即用户上传的文档直接被用户下载,本站只是中间服务平台,本站所有文档下载所得的收益归上传人(含作者)所有。装配图网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私,请立即通知装配图网,我们立即给予删除!