Performance Measurement 1

Performance

Execution time (latency): the time between the start and the completion of an event.

Performance is inversely proportional to execution time:

$$\text{Performance} \propto \frac{1}{\text{Execution time}}$$

Throughput (bandwidth): the total amount of work done in a given time.

Machine X is n% faster than Machine Y:

$$\frac{\text{Execution time}_Y}{\text{Execution time}_X} = 1 + \frac{n}{100}$$

Performance Measurement 2

Example: Machine A runs a program in 10 seconds. Machine B runs the same program in 15 seconds. A is _% faster than B.

Make the Common Case Fast

Perhaps the most important and pervasive principle of computer design is to make the common case fast: in making a design trade-off, favor the frequent case over the infrequent case.

Improving the frequent event, rather than the rare event, will obviously help performance. Example: the overflow case and the no-overflow case in addition.

Amdahl's Law 1

Amdahl's Law states that the performance improvement to be gained from using some faster mode of execution is limited by the fraction of the time the faster mode can be used.

Amdahl's Law 2

$$\text{Speedup} = \frac{\text{Performance for entire task using the enhancement when possible}}{\text{Performance for entire task without using the enhancement}} = \frac{\text{Execution time for entire task without using the enhancement}}{\text{Execution time for entire task using the enhancement when possible}}$$

Amdahl's Law 3

$$\text{Execution time}_{new} = \text{Execution time}_{old} \times \left[ (1 - f_E) + \frac{f_E}{s_E} \right]$$

where
f_E: fraction of the execution time that can use the enhancement
s_E: improvement gained by the enhancement mode

Amdahl's Law 3 (continued)

$$\text{Speedup} = \frac{\text{Execution time}_{old}}{\text{Execution time}_{new}} = \frac{1}{(1 - f_E) + \dfrac{f_E}{s_E}}$$

Amdahl's Law 4

Example: An enhancement runs 10 times faster than the original machine, but it is usable only 40% of the time. Then the speedup = _.

Sol: f_E = 0.4, s_E = 10

$$\text{Speedup} = \frac{1}{(1 - 0.4) + \dfrac{0.4}{10}} = \frac{1}{0.64} \approx 1.56$$

Amdahl's Law can also be applied to compare two CPU design alternatives. For example, implementations of floating-point (FP) square root vary significantly in performance, especially among processors designed for graphics.

Suppose FP square root (FPSQR) is responsible for 20% of the execution time of a critical graphics benchmark. One proposal is to enhance the FPSQR hardware and speed up this operation by a factor of 10. The other alternative is just to try to make all FP instructions in the graphics processor run faster by a factor of 1.6; FP instructions are responsible for a total of 50% of the execution time for the application. Compare these two design alternatives.

Answer: we can compare these two alternatives by comparing the speedups:

$$\text{Speedup}_{FPSQR} = \frac{1}{(1 - 0.2) + \dfrac{0.2}{10}} = \frac{1}{0.82} \approx 1.22$$

$$\text{Speedup}_{FP} = \frac{1}{(1 - 0.5) + \dfrac{0.5}{1.6}} = \frac{1}{0.8125} \approx 1.23$$

Improving the performance of the FP operations overall is slightly better because of the higher frequency.

Amdahl's Law 6

Extreme cases:
f_E = 0: Speedup = 1
f_E = 1: Speedup = s_E

f_E: fraction enhanced
s_E: speedup of the enhanced mode

CPU Performance 1

Most computers are constructed using a clock running at a constant rate.

The clock is referred to by its length in time, e.g., 10 ns, or by its rate, e.g., 100 MHz.

1 ms = 10^{-3} sec
1 μs = 10^{-6} sec
1 ns = 10^{-9} sec
1 Hz = 1/sec
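
The "n% faster" definition and the Amdahl's Law speedup formula can be checked numerically. The following is a minimal Python sketch, not part of the original slides. The function names `percent_faster` and `amdahl_speedup` are hypothetical, and `percent_faster` assumes the conventional definition Execution time_Y / Execution time_X = 1 + n/100.

```python
def percent_faster(time_x: float, time_y: float) -> float:
    """Return n such that machine X is n% faster than machine Y.

    Assumes the definition time_y / time_x = 1 + n/100.
    """
    if time_x <= 0 or time_y <= 0:
        raise ValueError("execution times must be positive")
    return (time_y / time_x - 1.0) * 100.0


def amdahl_speedup(f_e: float, s_e: float) -> float:
    """Overall speedup from Amdahl's Law.

    f_e: fraction of the original execution time that can use the enhancement
    s_e: speedup of the enhanced mode itself
    """
    if not 0.0 <= f_e <= 1.0:
        raise ValueError("f_e must lie in [0, 1]")
    if s_e <= 0:
        raise ValueError("s_e must be positive")
    return 1.0 / ((1.0 - f_e) + f_e / s_e)


if __name__ == "__main__":
    # Machine A: 10 s, Machine B: 15 s
    print(f"A is {percent_faster(10, 15):.0f}% faster than B")   # 50

    # Enhancement 10x faster, usable 40% of the time
    print(f"speedup = {amdahl_speedup(0.4, 10):.2f}")            # 1.56

    # Two design alternatives for the graphics benchmark
    fpsqr = amdahl_speedup(0.2, 10)    # faster FPSQR hardware only
    fp_all = amdahl_speedup(0.5, 1.6)  # all FP instructions faster
    print(f"FPSQR: {fpsqr:.2f}, all FP: {fp_all:.2f}")           # 1.22, 1.23

    # Extreme cases: nothing enhanced, everything enhanced
    print(amdahl_speedup(0.0, 10), amdahl_speedup(1.0, 10))
```

Under these assumptions the sketch gives the same comparison as the slides: speeding up all FP instructions by 1.6 edges out the 10x FPSQR improvement, because FP instructions account for a larger share of the execution time.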