Database Review Questions 2: Answers and Analysis

Review Questions (2)

1. For each of the following figures, determine whether G1 and G2 are bisimilar (bisimulation), and explain why.

[Figure: transition graphs G1 and G2 for parts (1) and (2); edges are labeled a, b, c and d]

Answer:
(1) Label the states in the figure and construct the relation [...]. It shows that G2 can simulate G1. Now consider the converse direction. One a-transition in G2 corresponds to two a-transitions in G1, namely [...]. However, [...] has both a b-transition and a c-transition, whereas G1 only contains states that offer b alone or c alone. Therefore G1 and G2 are not bisimilar.
(2) Label the states as in the figure and construct the relation [...]. Every state of G1 can be simulated by a state of G2. Considering [...], every state of G2 can likewise be simulated by a state of G1. Therefore G1 and G2 are bisimilar.

2. Given the following data graph, give its Strong DataGuide.

[Figure: data graph]

Answer: Strong DataGuide graph [figure]

3. Consider the relation, r, shown in Figure 5.27. Give the result of the following queries:

Figure 5.27

Query 1:
    select building, room_number, time_slot_id, count(*)
    from r
    group by rollup (building, room_number, time_slot_id)

Query 2:
    select building, room_number, time_slot_id, count(*)
    from r
    group by cube (building, room_number, time_slot_id)

Answer:
Below, A = building, B = room_number, C = time_slot_id. Each table lists every tuple of r together with the count(*) of the group it falls into under each grouping. In the actual query output each group appears once, with NULL in the columns that are not grouped on.

Query 1 returns the union, with duplicates retained, of the results of the following four groupings.
Number of groupings produced: 4.
First: group by A, B, C
Second: group by A, B
Third: group by A
Fourth: group by NULL. There is no actual "group by NULL" syntax; it is used here only for convenience. It means no grouping at all, that is, one aggregate over all rows. For example, if the aggregate function were SUM, it would sum over all rows that satisfy the condition.

building  room_number  time_slot_id   (A,B,C)  (A,B)  (A)  NULL
Garfield  359          A              1        2      2    6
Garfield  359          B              1        2      2    6
Saucon    651          A              1        1      2    6
Saucon    550          C              1        1      2    6
Painter   705          D              1        1      2    6
Painter   403          D              1        1      2    6

Query 2: The only difference between group by with a rollup clause and group by with a cube clause is that cube produces more groupings. There is one grouping for every combination of the columns listed after cube (combinations are unordered), so three columns give 2^3 = 8 groupings.
The result is the union, with duplicates retained, of the results of the following eight groupings.
Number of groupings produced: 8.
First: group by A, B, C
Second: group by A, B
Third: group by A, C
Fourth: group by B, C
Fifth: group by A
Sixth: group by B
Seventh: group by C
Eighth: group by NULL

building  room_number  time_slot_id   (A,B,C)  (A,B)  (A,C)  (B,C)  (A)  (B)  (C)  NULL
Garfield  359          A              1        2      1      1      2    2    2    6
Garfield  359          B              1        2      1      1      2    2    1    6
Saucon    651          A              1        1      1      1      2    1    2    6
Saucon    550          C              1        1      1      1      2    1    1    6
Painter   705          D              1        1      2      1      2    1    2    6
Painter   403          D              1        1      2      1      2    1    2    6

4. Disks and Access Time
Consider a disk with a sector size of 512 bytes, 63 sectors per track, 16,383 tracks per surface, and 8 double-sided platters (i.e., 16 surfaces). The disk platters rotate at 7,200 rpm (revolutions per minute). The average seek time is 9 msec, whereas the track-to-track seek time is 1 msec.
Suppose that a page size of 4096 bytes is chosen. Suppose that a file containing 1,000,000 records of 256 bytes each is to be stored on such a disk. No record is allowed to span two pages (use these numbers in appropriate places in your calculation).
(a) What is the capacity of the disk?
(b) If the file is arranged sequentially on the disk, how many cylinders are needed?
(c) How much time is required to read this file sequentially?
(d) How much time is needed to read 10% of the pages in the file randomly?

Answer:
(a) Capacity = sector size * num. of sectors per track * num. of tracks per surface * num. of surfaces = 512 * 63 * 16383 * 16 = 8,455,200,768 bytes.
(b) File: 1,000,000 records of 256 bytes each.
Num. of records per page: 4096 / 256 = 16.
1,000,000 / 16 = 62,500 pages, or 62,500 * 8 = 500,000 sectors.
Each cylinder has 63 * 16 = 1,008 sectors.
So we need 500,000 / 1,008 = 496.031746 cylinders, i.e., 496 full cylinders plus part of a 497th.
(c) We analyze the cost using the following three components:
Seek time: This access seeks the initial position of the file (whose cost can be approximated using the average seek time) and then seeks between adjacent tracks 496 times (whose cost is the track-to-track seek time). So the seek time is 0.009 + 496 * 0.001 = 0.505 seconds.
Rotational delay: The transfer time of one track of data, i.e., one full rotation, is 1 / (7200/60) = 0.0083 seconds. For this question, we use 0.0083/2 = 0.00415 seconds as an estimate of the rotational delay (other numbers between 0 and 0.00415 are also fine).
So the rotational delay for 497 seeks is 0.00415 * 497 = 2.06255 seconds.
Transfer time: It takes 0.0083 * (500,000/63) = 65.8730159 seconds to transfer the data in 500,000 sectors.
Therefore, the total access time is 0.505 + 2.06255 + 65.8730159 = 68.4405659 seconds.
(d) Number of pages = 6,250 (10% of 62,500).
Time cost per page: 0.009 (seek) + 0.0083/2 (rotational delay) + 0.0083 * 8/63 (transfer) = 0.0142 seconds.
Total cost = 6,250 * 0.0142 = 88.77 seconds (computed with the unrounded per-page cost of 0.014204 seconds).

5. Disk Page Layout
The figure below shows a page containing variable-length records. The page size is 1 KB (1024 bytes). It contains 3 records, some free space, and a slot directory, in that order. Each record has its record id, in the form Rid = (page id, slot number), as well as its start and end addresses in the page, as shown in the figure.
Now a new record of size 200 bytes needs to be inserted into this page. Apply the record insertion operation with page compaction, if necessary. Show the content of the slot directory after the new record is inserted. Assume that you have only the page, not any other temporary space, to work with.

Answer: The content of the slot directory, from left to right, is:
(650, 200), (0, 200), (500, 150), (200, 300), 4, 850

6. Buffer Management for File and Index Accesses
Consider the following two relations:
- student(snum:integer, sname:char(30), major:char(25), standing:char(2), age:integer)
- enrolled(snum:integer, cname:char(40))
The following index is available: a B+ Tree index on the snum attribute of the student relation.
Assume that the buffer size is large enough to store multiple paths of each B+ Tree but not an entire tree.
(a) Consider Query 1 and Query 2, which retrieve the snums of students who have taken Database Systems and Operating Systems, respectively, from the enrolled table. We know that Query 1 will be executed before Query 2, and both queries are executed using a file scan of the enrolled table. Which replacement policy would you recommend for the buffer manager to use to support this workload?
(b) Now assume that we have retrieved the snums of students who have taken Database Systems from the enrolled table. In the exact order of the retrieved snums (not necessarily in sorted order), we then retrieve the names of those students via repeated lookups in the B+ Tree on student.snum. For these repeated accesses to the index on student.snum, which replacement policy would you recommend for efficient buffer management?

Query 1:
    select e.snum from student s, enrolled e where s.snum = e.snum and cname like 'Database Systems';
Query 2:
    select e.snum from student s, enrolled e where s.snum = e.snum and cname like 'Operating Systems';

Answer:
(a) Not LRU. MRU or LRU-2. Query 2 rescans the same pages in the same order as Query 1, so if the file does not fit in the buffer, LRU evicts each page before it is requested again (sequential flooding).
(b) Now we have repeated equality searches over the B+ Tree on student.snum, with no duplicate values in the search key (because the schema does not allow a student to take the same course twice). Possible answers:
- The B+ Tree pages close to the top are repeatedly accessed, but those at the leaf level are rarely reused. So we can use LRU.
- LIFO (Last In First Out) is another possible answer. Over time, most nodes close to the top have been cached in memory, so the nodes recently read from the disk are mostly close to the leaves. LIFO will therefore replace those leaf or close-to-leaf nodes to make room for the newly requested nodes.
- We also accept other answers if the student can justify them well.
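The recommendation in 6(a) can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the 100-page file and the 50-frame buffer are hypothetical numbers that do not come from the question, and `count_hits` is an illustrative helper, not the API of any real buffer manager. It replays two back-to-back sequential scans of the same file and counts buffer hits under LRU and under MRU.

```python
from collections import OrderedDict


def count_hits(accesses, capacity, policy):
    """Replay page accesses against a buffer of `capacity` frames.

    policy is "LRU" (evict the least recently used page) or
    "MRU" (evict the most recently used page). Returns the hit count.
    """
    pool = OrderedDict()  # page id -> None, ordered least to most recently used
    hits = 0
    for page in accesses:
        if page in pool:
            hits += 1
            pool.move_to_end(page)  # this page is now the most recently used
            continue
        if len(pool) >= capacity:
            # last=False removes the oldest entry (LRU victim),
            # last=True removes the newest entry (MRU victim).
            pool.popitem(last=(policy == "MRU"))
        pool[page] = None
    return hits


# Hypothetical workload: a 100-page file scanned twice (Query 1, then Query 2)
# with only 50 buffer frames available.
file_pages = 100
frames = 50
workload = list(range(file_pages)) * 2

lru_hits = count_hits(workload, frames, "LRU")
mru_hits = count_hits(workload, frames, "MRU")
print("LRU hits:", lru_hits)  # 0: every page is evicted before the rescan reaches it
print("MRU hits:", mru_hits)  # 50: the pages kept from the first scan are reused
```

With these assumed sizes, LRU never hits because the second scan always asks for the page that was evicted longest ago, while MRU keeps a stable prefix of the file resident. LRU-2 is not modeled here.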
