Homework 1

1. Explain the concepts:
   - Computer architecture (CA), advanced computer architecture (Advanced CA)
   - Amdahl's law
   - Scalar processing, look-ahead, PVP, SMP, MPP, DSM, COW, GCE, CISC, RISC, VMM
   - Supercomputer, SVM, mainframe, computer system on chip, parallel architecture in a single chip, Moore's law
   - UMA, NUMA, COMA, CC-NUMA, NORMA, shell architecture
   - PRAM, BSP, LogP
2. Explain Flynn's classification and its semantics.
3. Where do the difficulties of parallel processing lie? What do parallel, concurrent, and simultaneous each mean?
4. Assume a system component is sped up by a factor of 10, and that this component previously accounted for 40% of the system's processing time. By how much does overall system performance improve?
5. Draw the memory hierarchy chart.

Homework 2

1. Explain the concepts: RISC, CISC, VLIW, superscalar, superpipeline, superscalar-superpipeline, IPC, single issue, multiple issue, OOO, multithreading.
2. For the ideal case, give the performance of superscalar, superpipeline, and superscalar-superpipeline processors, and give examples with N = 8 together with their average IPC.
3. Describe the CPU technology of recent years and give an example that illustrates its technical parameters.
4. Give the computing trace of $a_{i+7} = b_{j+1} + c_{k+8} + d_{m+10}$ on the T9000.
5. How can the multicore memory wall problem be solved?
6. Simple computer design test: Suppose there is a simple CPU with A15-A0, D7-D0, read, and write; a register and tri-state buffers for an I/O device; an 8K×8 ROM with A12-A0, D7-D0, CE, and OE; and an 8K×8 RAM with A12-A0, D7-D0, CE, RD, and WR. Design the circuit of a simple computer.

Homework

1) A program runs on a 40 MHz processor. It contains 45000 integer arithmetic instructions at 1 clock cycle each, 32000 instructions with data operations at 2 clock cycles each, 15000 floating-point instructions at 2 clock cycles each, and 8000 instructions such as JMP at 2 clock cycles each. Give its CPI, its MIPS rate, and the CPU execution time.
2) If the communication cost on an SP2 machine is $t(m) = 46 + 0.035m$, give its asymptotic bandwidth r and its half-peak message length m1/2.
3) If an N×N matrix A is multiplied by an N×N matrix B, the time is $T_1 = CN^3$ s, but the execution time on a machine with n nodes is $T_n = (CN^3/n + bN^2/\sqrt{N})$ s, where C, N, and b are constants. Give the speedup under fixed workload, fixed time, and fixed memory usage.
4) (Isoefficiency) See: Kumar V, Rao V N. Parallel depth first search, part II: Analysis. Int'l J. of Parallel Programming, 1987, 16(6): 501-519.
5) (Isospeed) See: Sun X H, et al. Scalability of parallel algorithm-machine combinations. IEEE Transactions on Parallel and Distributed Systems, 1994, 5(6): 519-613.
6) (Isolatency) See: Zhang X D, et al. Latency metric: an experimental method for measuring and evaluating parallel program and architecture scalability. J. of Parallel and Distributed Computing, 1994, 22: 392-410.

2024/11/15

Homework

1. Explain the concepts: static network, dynamic network, crossbar, multiple buses.
2. Draw a mesh with N = 25.
3. Give the interconnection function expression for simulating a single-stage cube network with a shuffle-exchange network.
4. How many different bus arbitration algorithms are there? What are the merits and demerits of each?

Homework

1. Given P1, P2, P3, P4 and M1, M2, M3, M4, use 2×2 crossbar switch boxes to design a connection network for P1-M2, P2-M4, P3-M1, P4-M3.
2. What is DS-Link?
3. What is the difference between message passing and shared memory?
4. What is wormhole communication, and what is its performance?
5. Explain Myrinet, HiPPI, FDDI, ATM, SCI, and 100BaseT.
6. In a hypercube multicomputer with wormhole routing, suppose every pair of neighbouring nodes is connected by a pair of one-way channels in opposite directions. Prove that routing based on the cube encoding cannot deadlock on this system.

Optional (extracurricular major assignment)

Choose one representative small benchmark program. Requirements:
1) Analyze its source code and write out its data execution flow (including data types, concrete values, and timing relationships).
2) Based on 1), perform a data-flow analysis of its object code and write out the data execution flow (including data types, concrete values, and timing relationships). Observe the differences from the intermediate results of 1) and list them in a table.
3) Based on 1) and 2), apply data-prefetch optimization to the data execution flow and run it on a real computer with a cache. Give the specific prefetch optimization method and a table of the improved measured times.

(Bonus, 5-10 points) Write a parallel program with the PVM or MPI parallel software tools. At least 2 or 3 parallel tasks must cooperate to solve some problem.

For each of the following: What is its architecture? What are its key technologies and theory? What problems does it face?
1. Earth Simulator
2. Blue Gene
3. Beowulf with PoPC cluster
4. Grid computing
5. Pervasive computing
6. P2P computing
7. Special PoPC clusters such as a Web cache cluster

Homework 1

1. H0(n)=nmH0(,-n,)/(,-,)+n(1-m)H0; Hc(n)=nmH0(,-n,)/(,-,)+n(1-m)H0. Please delete the, and, by using, then draw the function graph when m=0.5, =0.2.

Homework 2

MESI protocol: can you fill in the states?

Event         State A      State B         Notes
Initial       Invalid      Invalid (I)     Data not loaded
CPU A read    Exclusive    Invalid (I)     Read misses in the cache; line is loaded
CPU B read    Shared       Shared (S)      Read misses in the cache; line is shared after loading
CPU A write   Modified     Invalid (I)     Write hit
CPU B read    Shared       Shared (S)      Read miss; line is loaded
CPU B write   Invalid      Modified (M)    Write hit

On checkpointing for availability

A CHECKPOINT (a, b, c) can occur at three levels: kernel, library, and application.

[Figure: Checkpoint consistency snapshot. Labels: a, b, c, d, x, y, z; processes P, Q, R. a: consistent; b: not consistent.]

A set of checkpoints is called a consistent snapshot if there is no message that one process's checkpoint has already received while another process's checkpoint has not yet sent it.

[Figure: labels a, b, x, y, z; processes P, Q, R; C?]

If th
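The Amdahl's law question above (a component sped up 10 times that previously accounted for 40% of processing time) comes down to a single formula, overall speedup = 1 / ((1 - f) + f / s). The following is a minimal sketch of that calculation, not an official solution; the function name `amdahl_speedup` and its parameter names are chosen here for illustration only.

```python
def amdahl_speedup(fraction_enhanced: float, component_speedup: float) -> float:
    """Overall speedup when a fraction of the original run time is accelerated.

    fraction_enhanced: share of the ORIGINAL execution time spent in the component (0..1).
    component_speedup: factor by which that component alone becomes faster (> 0).
    """
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if component_speedup <= 0.0:
        raise ValueError("component_speedup must be positive")
    # The unaffected part keeps its time; the enhanced part shrinks by the factor.
    new_time = (1.0 - fraction_enhanced) + fraction_enhanced / component_speedup
    return 1.0 / new_time


# Values from the homework question: 40% of the time, sped up 10 times.
overall = amdahl_speedup(0.40, 10.0)
print(f"overall speedup = {overall:.4f}")
```

With these inputs the new run time is 0.6 + 0.04 = 0.64 of the original, so the system as a whole runs about 1.56 times faster even though the component itself is 10 times faster.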
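The CPI/MIPS question and the SP2 communication-cost question are both plain arithmetic, so a short sketch may help with checking hand calculations. The instruction counts, cycle counts, clock rate, and the coefficients 46 and 0.035 come from the questions; the function names, the dictionary layout, and the reading of the linear model as startup time plus per-unit transfer time are assumptions made here. The questions do not state units for t(m) or m, so the second result is only meaningful once those units are fixed.

```python
# (instruction count, clock cycles per instruction) for each class in the CPI question.
INSTRUCTION_MIX = {
    "integer arithmetic": (45_000, 1),
    "data operations": (32_000, 2),
    "floating point": (15_000, 2),
    "control (JMP etc.)": (8_000, 2),
}
CLOCK_HZ = 40e6  # 40 MHz processor


def cpu_metrics(mix, clock_hz):
    """Return (CPI, MIPS rate, execution time in seconds) for an instruction mix."""
    total_instructions = sum(count for count, _ in mix.values())
    total_cycles = sum(count * cycles for count, cycles in mix.values())
    cpi = total_cycles / total_instructions   # average cycles per instruction
    mips = clock_hz / (cpi * 1e6)             # million instructions per second
    exec_time_s = total_cycles / clock_hz     # cycles divided by clock rate
    return cpi, mips, exec_time_s


def linear_cost_parameters(startup, per_unit):
    """For t(m) = startup + per_unit * m, return (asymptotic bandwidth, half-peak length).

    The asymptotic bandwidth is the limit of m / t(m) as m grows: 1 / per_unit.
    The half-peak length is the m at which m / t(m) reaches half of that limit,
    which works out to startup / per_unit.
    """
    asymptotic_bandwidth = 1.0 / per_unit
    half_peak_length = startup / per_unit
    return asymptotic_bandwidth, half_peak_length


cpi, mips, exec_time_s = cpu_metrics(INSTRUCTION_MIX, CLOCK_HZ)
print(f"CPI = {cpi:.2f}, MIPS = {mips:.2f}, time = {exec_time_s * 1e3:.3f} ms")

r_inf, m_half = linear_cost_parameters(46.0, 0.035)
print(f"asymptotic bandwidth = {r_inf:.2f}, half-peak length = {m_half:.1f}")
```

The program executes 100000 instructions in 155000 cycles, giving a CPI of 1.55, roughly 25.8 MIPS, and 3.875 ms of CPU time. If t(m) is read as microseconds and m as bytes (an assumption), the communication figures are about 28.57 MB/s and about 1314 bytes.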