Chapter 10  Design Optimization and Design Methods

10.1 Area Optimization

Optimizing FPGA/CPLD resource usage has practical value:
(1) Optimization allows a smaller programmable logic device to be used, which lowers system cost.
(2) Many programmable logic devices (for example, some vendors' CPLDs) have limited routing resources, so excessive resource usage seriously degrades circuit performance.
(3) It leaves more programmable resources free for later upgrades, making it easier to add product features.
(4) For most programmable logic devices, heavy resource usage noticeably increases device power consumption.

10.1.1 Resource Sharing

[Example 10-1]

    LIBRARY ieee;
    USE ieee.std_logic_1164.all;
    USE ieee.std_logic_unsigned.all;
    USE ieee.std_logic_arith.all;

    ENTITY multmux IS
      PORT(A0, A1, B : IN  std_logic_vector(3 downto 0);
           sel       : IN  std_logic;
           Result    : OUT std_logic_vector(7 downto 0));
    END multmux;

    ARCHITECTURE rtl OF multmux IS
    BEGIN
      -- Multiply first, then select: infers two multipliers and one mux.
      process(A0, A1, B, sel)
      begin
        if (sel = '0') then Result <= A0 * B;
        else                Result <= A1 * B;
        end if;
      end process;
    END rtl;

Figure 10-1  RTL structure of the multiply-then-select design
Figure 10-2  RTL structure of the select-then-multiply design

[Example 10-2]

    -- Assumes an entity muxmult with the same ports as multmux in Example 10-1.
    ARCHITECTURE rtl OF muxmult IS
      signal temp : std_logic_vector(3 downto 0);
    BEGIN
      -- Select first, then multiply: one mux feeds a single shared multiplier.
      -- temp is in the sensitivity list so that simulation matches the hardware.
      process(A0, A1, B, sel, temp)
      begin
        if (sel = '0') then temp <= A0;
        else                temp <= A1;
        end if;
        result <= temp * B;
      end process;
    END rtl;

Figure 10-3  A counter-example of resource sharing

10.1.2 Logic Optimization

[Example 10-3]

    LIBRARY ieee;
    USE ieee.std_logic_1164.all;
    use ieee.std_logic_unsigned.all;
    use ieee.std_logic_arith.all;

    ENTITY mult1 IS
      PORT(clk : in  std_logic;
           ma  : in  std_logic_vector(11 downto 0);
           mc  : out std_logic_vector(23 downto 0));
    END mult1;

    ARCHITECTURE rtl OF mult1 IS
      signal ta, tb : std_logic_vector(11 downto 0);
    BEGIN
      process(clk) begin
        if (clk'event and clk = '1') then
          ta <= ma;
          tb <= "100110111001";  -- fixed coefficient; this value is assumed for illustration
          mc <= ta * tb;
        end if;
      end process;
    END rtl;

A two-input multiplier is built here:

    mc <= ta * tb;

[Example 10-4]

    LIBRARY ieee;
    USE ieee.std_logic_1164.all;
    use ieee.std_logic_unsigned.all;
    use ieee.std_logic_arith.all;

    ENTITY mult2 IS
      PORT(clk : in  std_logic;
           ma  : in  std_logic_vector(11 downto 0);
           mc  : out std_logic_vector(23 downto 0));
    END mult2;

    ARCHITECTURE rtl OF mult2 IS
      signal ta : std_logic_vector(11 downto 0);
      -- tb is declared as a constant so the synthesizer can simplify the multiplier;
      -- the value is assumed for illustration.
      constant tb : std_logic_vector(11 downto 0) := "100110111001";
    BEGIN
      process(clk) begin
        if (clk'event and clk = '1') then
          ta <= ma;
          mc <= ta * tb;
        end if;
      end process;
    END rtl;

10.1.3 Serialization

[Example 10-5]

    LIBRARY ieee;
    USE ieee.std_logic_1164.all;
    use ieee.std_logic_unsigned.all;
    use ieee.std_logic_arith.all;

    ENTITY pmultadd IS
      PORT(clk            : in  std_logic;
           a0, a1, a2, a3 : in  std_logic_vector(7 downto 0);
           b0, b1, b2, b3 : in  std_logic_vector(7 downto 0);
           yout           : out std_logic_vector(15 downto 0));
    END pmultadd;

    ARCHITECTURE p_arch OF pmultadd IS
    BEGIN
      -- Fully parallel: four multipliers and the adders all work in one clock cycle.
      process(clk) begin
        if (clk'event and clk = '1') then
          yout <= (a0 * b0) + (a1 * b1) + (a2 * b2) + (a3 * b3);
        end if;
      end process;
    END p_arch;

The design multiplies and accumulates eight 8-bit operands into a 16-bit result, that is, $y_{out} = a_0 b_0 + a_1 b_1 + a_2 b_2 + a_3 b_3$.

Figure 10-4  RTL structure of the parallel multiply-add

[Example 10-6]

    LIBRARY ieee;
    USE ieee.std_logic_1164.all;
    use ieee.std_logic_unsigned.all;
    use ieee.std_logic_arith.all;

    ENTITY smultadd IS
      PORT(clk, start     : in  std_logic;
           a0, a1, a2, a3 : in  std_logic_vector(7 downto 0);
           b0, b1, b2, b3 : in  std_logic_vector(7 downto 0);
           yout           : out std_logic_vector(15 downto 0));
    END smultadd;

    ARCHITECTURE s_arch OF smultadd IS
      signal cnt        : std_logic_vector(2 downto 0);
      signal tmpa, tmpb : std_logic_vector(7 downto 0);
      signal tmp, ytmp  : std_logic_vector(15 downto 0);
    BEGIN
      -- The counter selects one operand pair per clock cycle.
      tmpa <= a0 when cnt = 0 else
              a1 when cnt = 1 else
              a2 when cnt = 2 else
              a3 when cnt = 3 else
              a0;
      tmpb <= b0 when cnt = 0 else
              b1 when cnt = 1 else
              b2 when cnt = 2 else
              b3 when cnt = 3 else
              b0;
      -- A single multiplier is shared by all four products.
      tmp <= tmpa * tmpb;

      process(clk) begin
        if (clk'event and clk = '1') then
          if (start = '1') then          -- restart: clear counter and accumulator
            cnt  <= "000";
            ytmp <= (others => '0');
          elsif (cnt < 4) then           -- accumulate one product per cycle
            cnt  <= cnt + 1;
            ytmp <= ytmp + tmp;
          elsif (cnt = 4) then           -- all four products summed: output result
            yout <= ytmp;
          end if;
        end if;
      end process;
    END s_arch;

Figure 10-5  Serialized structure

10.2 Speed Optimization

10.2.1 Pipelined Design

Clearly, a signal in this design needs at least Ta to travel from input to output; in other words, the period of the clock signal clk cannot be shorter than Ta.

Figure 10-7 uses pipelining; its maximum frequency is:

Figure 10-8  Pipeline operation

[Example 10-7]

    LIBRARY ieee;
    USE ieee.std_logic_1164.all;
    use ieee.std_logic_unsigned.all;
    use ieee.std_logic_arith.all;

    ENTITY adder4 IS
      PORT(clk            : in  std_logic;
           a0, a1, a2, a3 : in  std_logic_vector(7 downto 0);
           yout           : out std_logic_vector(9 downto 0));
    END adder4;

    ARCHITECTURE normal_arch OF adder4 IS
      signal t0, t1, t2, t3   : std_logic_vector(7 downto 0);
      signal addtmp0, addtmp1 : std_logic_vector(8 downto 0);
    BEGIN
      -- Register the inputs.
      process(clk) begin
        if (clk'event and clk = '1') then
          t0 <= a0; t1 <= a1; t2 <= a2; t3 <= a3;
        end if;
      end process;

      -- Both adder levels are combinational and sit between the same two register stages.
      addtmp0 <= ('0' & t0) + t1;
      addtmp1 <= ('0' & t2) + t3;

      -- Register the output.
      process(clk) begin
        if (clk'event and clk = '1') then
          yout <= ('0' & addtmp0) + addtmp1;
        end if;
      end process;
    END normal_arch;

[Example 10-8]

    LIBRARY ieee;
    USE ieee.std_logic_1164.all;
    use ieee.std_logic_unsigned.all;
    use ieee.std_logic_arith.all;

    ENTITY pipeadd IS
      PORT(clk            : in  std_logic;
           a0, a1, a2, a3 : in  std_logic_vector(7 downto 0);
           yout           : out std_logic_vector(9 downto 0));
    END pipeadd;

    ARCHITECTURE pipelining_arch OF pipeadd IS
      signal t0, t1, t2, t3   : std_logic_vector(7 downto 0);
      signal addtmp0, addtmp1 : std_logic_vector(8 downto 0);
    BEGIN
      -- Register the inputs.
      process(clk) begin
        if (clk'event and clk = '1') then
          t0 <= a0; t1 <= a1; t2 <= a2; t3 <= a3;
        end if;
      end process;

      -- The partial sums are registered too, so each clock period covers only one adder level.
      process(clk) begin
        if (clk'event and clk = '1') then
          addtmp0 <= ('0' & t0) + t1;
          addtmp1 <= ('0' & t2) + t3;
          yout    <= ('0' & addtmp0) + addtmp1;
        end if;
      end process;
    END pipelining_arch;

10.2.2 Register Balancing

If the delays of the two combinational logic blocks differ too much, for example T1 is greater than T2, then the overall operating frequency Fmax is determined by T1, that is, by the largest…
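The dependence of Fmax on the slower block can be written out as a short worked equation. This is only a sketch under simplifying assumptions that do not come from the slides: register setup time and clock-to-output delay are ignored, ideal balancing is taken to mean both blocks end up with the same delay, the symbols $T'$ and $F'_{max}$ are introduced here for illustration, and the numeric delays are hypothetical.

```latex
\begin{align*}
% Before balancing: the slower block (delay T_1 > T_2) limits the clock.
F_{max} &\approx \frac{1}{\max(T_1, T_2)} = \frac{1}{T_1} \\
% Ideal balancing: logic is moved across the middle register until both delays are equal.
T' &= \frac{T_1 + T_2}{2}, \qquad
F'_{max} \approx \frac{1}{T'} = \frac{2}{T_1 + T_2} \\
% Hypothetical numbers: T_1 = 6 ns, T_2 = 2 ns.
F_{max} &\approx \frac{1}{6\,\text{ns}} \approx 166.7\ \text{MHz}, \qquad
F'_{max} \approx \frac{1}{4\,\text{ns}} = 250\ \text{MHz}
\end{align*}
```

The total combinational delay $T_1 + T_2$ is unchanged in this sketch. Only its distribution across the two register stages changes, which is why the gain shrinks to nothing as $T_1$ and $T_2$ approach each other.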