数据结构与算法分析第二版英文版课件chapter8

上传人:痛*** 文档编号:241899042 上传时间:2024-08-03 格式:PPT 页数:38 大小:230.56KB
返回 下载 相关 举报
数据结构与算法分析第二版英文版课件chapter8_第1页
第1页 / 共38页
数据结构与算法分析第二版英文版课件chapter8_第2页
第2页 / 共38页
数据结构与算法分析第二版英文版课件chapter8_第3页
第3页 / 共38页
点击查看更多>>
资源描述
Primary vs.Secondary StoragePrimary storage:Main memory(RAM)Secondary Storage:Peripheral devicesDisk drivesTape drives8/3/20241优秀课件,精彩无限!Primary vs.Secondary StoragePComparisonsRAM is usually volatile.RAM is about 1/4 million times faster than disk.MediumEarly 1996 Mid 1997 Early 2000RAM$45.007.001.50Disk0.250.100.01Floppy0.500.360.25Tape0.030.010.0018/3/20242优秀课件,精彩无限!ComparisonsMediumEarly 1996 MiGolden Rule of File ProcessingMinimize the number of disk accesses!1.Arrange information so that you get what you want with few disk accesses.2.Arrange information to minimize future disk accesses.An organization for data on disk is often called a file structure(p261).Disk-based space/time tradeoff(p74):Compress information to save processing time by reducing disk accesses.8/3/20243优秀课件,精彩无限!Golden Rule of File ProcessingDisk Drives8/3/20244优秀课件,精彩无限!Disk Drives7/29/20234优秀课件,精彩无限SectorsA sector is the basic unit of I/O.Interleaving factor:Physical distance between logically adjacent sectors on a track.8/3/20245优秀课件,精彩无限!Sectors7/29/20235优秀课件,精彩无限!TermsLocality of Reference:When record is read from disk,next request is likely to come from near the same place in the file.Cluster:Smallest unit of file allocation,usually several sectors.Extent:A group of physically contiguous clusters.Internal fragmentation:Wasted space within sector if record size does not match sector size;wasted space within cluster if file size is not a multiple of cluster size.8/3/20246优秀课件,精彩无限!TermsLocality of Reference:WhSeek TimeSeek time:Time for I/O head to reach desired track.Largely determined by distance between I/O head and desired track.Track-to-track time:Minimum time to move from one track to an adjacent track.Average Seek time:Average time to reach a track for random access.8/3/20247优秀课件,精彩无限!Seek TimeSeek time:Time for IOther FactorsRotational Delay or Latency:Time for data to rotate under I/O head.One half of a rotation on average.At 7200 rpm,this is 8.3/2=4.2ms.Transfer time:Time for data to move under the I/O head.At 7200 rpm:Number of sectors read/Number of sectors per track*8.3ms.8/3/20248优秀课件,精彩无限!Other FactorsRotational Delay Disk Spec Example16.8 GB disk on 10 platters=1.68GB/platter13,085 tracks/platter256 sectors/track512 bytes/sectorTrack-to-track seek time:2.2 msAverage seek time:9.5ms4KB clusters,32 clusters/track.Interleaving factor of 3.5400RPM8/3/20249优秀课件,精彩无限!Disk Spec Example16.8 GB disk Disk Access Cost Example(1)Read a 1MB file divided into 2048 records of 512 bytes(1 sector)each.Assume all records are on 8 contiguous tracks.First track:9.5+11.1/2+3 x 11.1=48.4 msRemaining 7 tracks:2.2+11.1/2+3 x 11.1=41.1 ms.Total:48.4+7*41.1=335.7ms8/3/202410优秀课件,精彩无限!Disk Access Cost Example(1)ReDisk Access Cost Example(2)Read a 1MB file divided into 2048 records of 512 bytes(1 sector)each.Assume all file clusters are randomly spread across the disk.256 clusters.Cluster read time is (3 x 8)/256 of a rotation for about 1 ms.256(9.5+11.1/2+11.1x3 x(8/256)is about 4119.2 ms.8/3/202411优秀课件,精彩无限!Disk Access Cost Example(2)ReHow Much to Read?Read time for one track:9.5+11.1/2+3 x 11.1=48.4ms.Read time for one sector:9.5+11.1/2+(1/256)11.1=15.1ms.Read time for one byte:9.5+11.1/2=15.05 ms.Nearly all disk drives read/write one sector at every I/O access.Also referred to as a page.P288:8.2 if clusters are spread randomly across the disk?8/3/202412优秀课件,精彩无限!How Much to Read?Read time forBuffersThe information in a sector is stored in a buffer or cache.If the next I/O access is to the same buffer,then no need to go to disk.There are usually one or more input buffers and one or more output buffers.8/3/202413优秀课件,精彩无限!BuffersThe information in a seBuffer PoolsA series of buffers used by an application to cache disk data is called a buffer pool.Virtual memory uses a buffer pool to imitate greater RAM memory by actually storing information on disk and“swapping”between disk and RAM.8/3/202414优秀课件,精彩无限!Buffer PoolsA series of bufferBuffer Pools8/3/202415优秀课件,精彩无限!Buffer Pools7/29/202315优秀课件,精彩Organizing Buffer PoolsWhich buffer should be replaced when new data must be read?First-in,First-out:Use the first one on the queue.Least Frequently Used(LFU):Count buffer accesses,reuse the least used.Least Recently used(LRU):Keep buffers on a linked list.When buffer is accessed,bring it to front.Reuse the one at end.If 3 9 10 4 7 3 5 3 8 2 7 come into 3-buffer pool(empty initially)?8/3/202416优秀课件,精彩无限!Organizing Buffer PoolsWhich bBufferpool ADT(1)class BufferPool /(1)Message Passingpublic:virtual void insert(void*space,int sz,int pos)=0;virtual void getbytes(void*space,int sz,int pos)=0;class BufferPool /(2)Buffer Passingpublic:virtual void*getblock(int block)=0;virtual void dirtyblock(int block)=0;virtual int blocksize()=0;8/3/202417优秀课件,精彩无限!Bufferpool ADT(1)class BufferDesign IssuesDisadvantage of message passing:Messages are copied and passed back and forth.Disadvantages of buffer passing:The user is given access to system memory(the buffer itself)The user must explicitly tell the buffer pool when buffer contents have been modified,so that modified data can be rewritten to disk when the buffer is flushed.The pointer might become stale when the buffer pool replaces the contents of a buffer.8/3/202418优秀课件,精彩无限!Design IssuesDisadvantage of mProgrammers View of FilesLogical view of files:An array of bytes.A file pointer marks the current position.Three fundamental operations:Read bytes from current position(move file pointer)Write bytes to current position(move file pointer)Set file pointer to specified byte position.8/3/202419优秀课件,精彩无限!Programmers View of FilesLogiC+File Functions#include void fstream:open(char*name,openmode mode);Example:ios:in|ios:binaryvoid fstream:close();fstream:read(char*ptr,int numbytes);fstream:write(char*ptr,int numbtyes);fstream:seekg(int pos);fstream:seekg(int pos,ios:curr);fstream:seekp(int pos);fstream:seekp(int pos,ios:end);8/3/202420优秀课件,精彩无限!C+File Functions#include fsExternal SortingProblem:Sorting data sets too large to fit into main memory.Assume data are stored on disk drive.To sort,portions of the data must be brought into main memory,processed,and returned to disk.An external sort should minimize disk accesses.8/3/202421优秀课件,精彩无限!External SortingProblem:SortiModel of External ComputationSecondary memory is divided into equal-sized blocks(512,1024,etc)A basic I/O operation transfers the contents of one disk block to/from main memory.Under certain circumstances,reading blocks of a file in sequential order is more efficient.(When?)Primary goal is to minimize I/O operations.Assume only one disk drive is available.8/3/202422优秀课件,精彩无限!Model of External ComputationSKey SortingOften,records are large,keys are small.Ex:Payroll entries keyed on ID numberApproach 1:Read in entire records,sort them,then write them out again.Approach 2:Read only the key values,store with each key the location on disk of its associated record.After keys are sorted the records can be read and rewritten in sorted order.8/3/202423优秀课件,精彩无限!Key SortingOften,records are Simple External Mergesort(1)Quicksort requires random access to the entire set of records.Better:Modified Merge-sort algorithm.Process n elements in Q(log n)passes.A group of sorted records is called a run 顺串.8/3/202424优秀课件,精彩无限!Simple External Mergesort(1)QSimple External Mergesort(2)Split the file into two files.Read in a block from each file.Take first record from each block,output them in sorted order.Take next record from each block,output them to a second file in sorted order.Repeat until finished,alternating between output files.Read new input blocks as needed.Repeat steps 2-5,except this time input files have runs of two sorted records that are merged together.Each pass through the files provides larger runs.8/3/202425优秀课件,精彩无限!Simple External Mergesort(2)SSimple External Mergesort(3)8/3/202426优秀课件,精彩无限!Simple External Mergesort(3)7Problems with Simple MergesortIs each pass through input and output files sequential?What happens if all work is done on a single disk drive?How can we reduce the number of Merge-sort passes?In general,external sorting consists of two phases:Break the files into initial runsMerge the runs together into a single run.8/3/202427优秀课件,精彩无限!Problems with Simple MergesortBreaking a File into RunsGeneral approach:Read as much of the file into memory as possible.Perform an in-memory sort.Output this group of records as a single run.8/3/202428优秀课件,精彩无限!Breaking a File into RunsGenerReplacement Selection(1)Break available memory into an array for the heap,an input buffer,and an output buffer.Fill the array from disk.Make a min-heap.Send the smallest value(root)to the output buffer.8/3/202429优秀课件,精彩无限!Replacement Selection(1)BreakReplacement Selection(2)If the next key in the file is greater than the last value output,thenReplace the root with this keyelseReplace the root with the last key in the arrayAdd the next record in the file to a new heap(actually,stick it at the end of the array).8/3/202430优秀课件,精彩无限!Replacement Selection(2)If thRS Example8/3/202431优秀课件,精彩无限!RS Example7/29/202331优秀课件,精彩无限Problems with Simple MergeSimple merge-sort:Place runs into two files.Merge the first two runs to output file,then next two runs,etc.Repeat process until only one run remains.How many passes for r initial runs?Is there benefit from sequential reading?Is working memory well used?Need a way to reduce the number of passes.8/3/202432优秀课件,精彩无限!Problems with Simple MergeSimpMulti-way Merge(1)With replacement selection,each initial run is several blocks long.Assume each run is placed in separate file.Read the first block from each file into memory and perform an r-way merge.When a buffer becomes empty,read a block from the appropriate run file.Each record is read only once from disk during the merge process.8/3/202433优秀课件,精彩无限!Multi-way Merge(1)With replacMulti-way Merge(2)In practice,use only one file and seek to appropriate block.8/3/202434优秀课件,精彩无限!Multi-way Merge(2)In practiceLimits to Multi-way Merge(1)Assume working memory is b blocks in size.How many runs can be processed at one time?The runs are 2b blocks long(on average).How big a file can be merged in one pass?8/3/202435优秀课件,精彩无限!Limits to Multi-way Merge(1)ALimits to Multi-way Merge(2)Larger files will need more passes-but the run size grows quickly!In k merge passes,we process 2b(k+1)blocks.Example:0.5MB working memory,4KB blocks,yield 128 blocks for working memory.Average run size is 1MB,so 128MB can be sorted in one pass on average.16GB in two merge passes.8/3/202436优秀课件,精彩无限!Limits to Multi-way Merge(2)LGeneral PrinciplesA good external sorting algorithm will seek to do the following:Make the initial runs as long as possible.At all stages,overlap重叠、并行 input,processing and output as much as possible.Use as much working memory as possible.Applying more memory usually speeds processing.If possible,use additional disk drives for more overlapping of processing with I/O,and allow for more sequential file processing.8/3/202437优秀课件,精彩无限!General PrinciplesA good exterExerciseP2888.2,8.3,8.88/3/202438优秀课件,精彩无限!ExerciseP2887/29/202338优秀课件,精彩
展开阅读全文
相关资源
正为您匹配相似的精品文档
相关搜索

最新文档


当前位置:首页 > 管理文书 > 施工组织


copyright@ 2023-2025  zhuangpeitu.com 装配图网版权所有   联系电话:18123376007

备案号:ICP2024067431-1 川公网安备51140202000466号


本站为文档C2C交易模式,即用户上传的文档直接被用户下载,本站只是中间服务平台,本站所有文档下载所得的收益归上传人(含作者)所有。装配图网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私,请立即通知装配图网,我们立即给予删除!