John Cooleys 設計大賽

Created: November-22, 2018

這個例子直接來自 John Cooley 在 SNUG'95（Synopsys 使用者組會議）的設計競賽。本次比賽旨在反對 VHDL 和 Verilog 設計師遇到的相同設計問題。約翰想到的可能是確定哪種語言最有效。結果是，9 位 Verilog 設計師中有 8 位成功完成了設計競賽，但 5 位 VHDL 設計師都沒有。希望使用提出的方法，我們會做得更好。

產品規格

我們的目標是在普通的可綜合 VHDL（實體和架構）中設計一個同步的 3×5，可載入，模數 512 計數器，帶有進位輸出，借位輸出和奇偶校驗輸出。計數器是一個 9 位無符號計數器，因此它的範圍在 0 到 511 之間。計數器的介面規範如下表所示：

名稱	位寬	方向	描述
時鐘	1	輸入	主時鐘; 計數器在 CLOCK 的上升沿同步
`DI`	9	輸入	資料輸入匯流排; 當 UP 和 DOWN 都為低時，計數器載入 DI
`UP`	1	輸入	Up-by-3 計數命令; 當 UP 為高而 DOWN 為低時，計數器增加 3，繞其最大值（511）
下	1	輸入	按 5 計數命令; 當 DOWN 為高且 UP 為低時，計數器減少 5，繞其最小值（0）
`CO`	1	輸出	執行訊號; 只有當計數超過最大值（511）並因此環繞時才為高
`BO`	1	輸出	借出訊號; 僅當向下計數低於最小值（0）並因此環繞時才高
做	9	輸出	輸出匯流排; 櫃檯的當前價值; 當 UP 和 DOWN 都為高時，計數器保持其值
`PO`	1	輸出	奇偶校驗訊號; 當計數器的當前值包含偶數 1 時為高

當計數超過其最大值或向下計數低於其最小值時，計數器會迴繞：

逆流值	上下	反擊下一個值	下一個 CO	下一個 BO	下一個 PO
`X`	00	DI	0	0	奇偶校驗（DI）
`X`	11	X	0	0	奇偶校驗（x）的
0≤x≤508	10	X + 3	0	0	奇偶校驗（X + 3）
`509`	10	0	1	0	1
`510`	10	1	1	0	0
`511`	10	2	1	0	0
5≤x≤511	01	X-5	0	0	奇偶校驗（X-5）
`4`	01	511	0	1	0
`3`	01	510	0	1	1
`2`	01	509	0	1	1
`1`	01	508	0	1	0
`0`	01	507	0	1	1

框圖

基於這些規範，我們可以開始設計框圖。我們先來代表一下介面：

StackOverflow 文件

我們的電路有 4 個輸入（包括時鐘）和 4 個輸出。下一步是確定我們將使用多少暫存器和組合塊以及它們的角色。對於這個簡單的例子，我們將專門用一個組合塊來計算計數器的下一個值，執行和借出。另一個組合塊將用於計算奇偶校驗輸出的下一個值。計數器的當前值，執行和借出將儲存在暫存器中，而奇偶校驗輸出的當前值將儲存在單獨的暫存器中。結果如下圖所示：

StackOverflow 文件

檢查框圖是否符合我們的 10 個設計規則很快就完成了：

我們的外部介面由大的周圍矩形正確表示。
我們的 2 個組合塊（圓形）和 2 個暫存器（方形）明顯分開。
我們只使用上升沿觸發暫存器。
我們只使用一個時鐘。
我們有 4 個內部箭頭（訊號），4 個輸入箭頭（輸入埠）和 4 個輸出箭頭（輸出埠）。
我們的箭都沒有幾個起源。三個有幾個目的地（clock，ncnt 和 do）。
我們的 4 個輸入箭頭都不是我們內部塊的輸出。
我們的三個輸出箭頭只有一個原點和一個目標。但是 do 有兩個目的地：外面和我們的組合塊之一。這違反了規則 8，如果我們想要遵守 2008 年之前的 VHDL 版本，必須通過插入新的組合塊來修復：

StackOverflow 文件

我們現在有 5 個內部訊號（cnt，nco，nbo，ncnt 和 npo）。
圖中只有一個週期，由 cnt 和 ncnt 組成。週期中有一個方塊。

在 2008 年之前的 VHDL 版本中進行編碼

在 VHDL 中翻譯我們的框圖非常簡單。計數器的當前值範圍為 0 到 511，因此我們將使用 9 位 bit_vector 訊號來表示它。唯一的微妙之處在於需要對相同資料執行按位（如計算奇偶校驗）和算術運算。庫 ieee 的標準 numeric_bit 包解決了這個問題：它宣告瞭一個 unsigned 型別，與 bit_vector 具有完全相同的宣告，並過載算術運算子，使得它們可以採用 unsigned 和整數的任意混合。為了計算執行和借出，我們將使用 10 位 unsigned 臨時值。

庫宣告和實體：

library ieee;
use ieee.numeric_bit.all;

entity cooley is
  port(
        clock: in  bit;
        up:    in  bit;
        down:  in  bit;
        di:    in  bit_vector(8 downto 0);
        co:    out bit;
        bo:    out bit;
        po:    out bit;
        do:    out bit_vector(8 downto 0)
      );
end entity cooley;

該架構的骨架是：

architecture arc1 of cooley is
  signal cnt:  unsigned(8 downto 0);
  signal ncnt: unsigned(8 downto 0);
  signal nco:  bit;
  signal nbo:  bit;
  signal npo:  bit;
begin
    <...processes...>
end architecture arc1;

我們的 5 個塊中的每一個都被建模為一個過程。與我們的兩個暫存器對應的同步過程非常容易編碼。我們只使用編碼示例中提出的模式。例如，儲存奇偶校驗輸出標誌的暫存器被編碼：

  poreg: process(clock)
  begin
    if rising_edge(clock) then
      po <= npo;
    end if;
  end process poreg;

以及儲存 co，bo 和 cnt 的另一個暫存器：

  cobocntreg: process(clock)
  begin
    if rising_edge(clock) then
      co  <= nco;
      bo  <= nbo;
      cnt <= ncnt;
    end if;
  end process cobocntreg;

重新命名組合過程也很簡單：

  rename: process(cnt)
  begin
    do <= (others => '0');
    do <= bit_vector(cnt);
  end process rename;

奇偶校驗計算可以使用變數和簡單迴圈：

  parity: process(ncnt)
    variable tmp: bit;
  begin
    tmp := '0';
    npo <= '0';
    for i in 0 to 8 loop
      tmp := tmp xor ncnt(i);
    end loop;
    npo <= not tmp;
  end process parity;

最後的組合過程是最複雜的，但嚴格應用所提出的翻譯方法也很容易：

  u3d5: process(up, down, di, cnt)
    variable tmp: unsigned(9 downto 0);
  begin
    tmp  := (others => '0');
    nco  <= '0';
    nbo  <= '0';
    ncnt <= (others => '0');
    if up = '0' and down = '0' then
      ncnt <= unsigned(di);
    elsif up = '1' and down = '1' then
      ncnt <= cnt;
    elsif up = '1' and down = '0' then
      tmp   := ('0' & cnt) + 3;
      ncnt  <= tmp(8 downto 0);
      nco   <= tmp(9);
    elsif up = '0' and down = '1' then
      tmp   := ('0' & cnt) - 5;
      ncnt  <= tmp(8 downto 0);
      nbo   <= tmp(9);
    end if;
  end process u3d5;

注意，也可以合併兩個同步過程，並且可以在簡單的並發訊號分配中簡化我們的組合過程之一。完整的程式碼，包含庫和包宣告，以及建議的簡化如下：

library ieee;
use ieee.numeric_bit.all;

entity cooley is
  port(
        clock: in  bit;
        up:    in  bit;
        down:  in  bit;
        di:    in  bit_vector(8 downto 0);
        co:    out bit;
        bo:    out bit;
        po:    out bit;
        do:    out bit_vector(8 downto 0)
      );
end entity cooley;

architecture arc2 of cooley is
  signal cnt:  unsigned(8 downto 0);
  signal ncnt: unsigned(8 downto 0);
  signal nco:  bit;
  signal nbo:  bit;
  signal npo:  bit;
begin
  reg: process(clock)
  begin
    if rising_edge(clock) then
      co  <= nco;
      bo  <= nbo;
      po  <= npo;
      cnt <= ncnt;
    end if;
  end process reg;

  do <= bit_vector(cnt);

  parity: process(ncnt)
    variable tmp: bit;
  begin
    tmp := '0';
    npo <= '0';
    for i in 0 to 8 loop
      tmp := tmp xor ncnt(i);
    end loop;
    npo <= not tmp;
  end process parity;

  u3d5: process(up, down, di, cnt)
    variable tmp: unsigned(9 downto 0);
  begin
    tmp  := (others => '0');
    nco  <= '0';
    nbo  <= '0';
    ncnt <= (others => '0');
    if up = '0' and down = '0' then
      ncnt <= unsigned(di);
    elsif up = '1' and down = '1' then
      ncnt <= cnt;
    elsif up = '1' and down = '0' then
      tmp   := ('0' & cnt) + 3;
      ncnt  <= tmp(8 downto 0);
      nco   <= tmp(9);
    elsif up = '0' and down = '1' then
      tmp   := ('0' & cnt) - 5;
      ncnt  <= tmp(8 downto 0);
      nbo   <= tmp(9);
    end if;
  end process u3d5;
end architecture arc2;

走得更遠

所提出的方法簡單且安全，但它依賴於可以放寬的若干約束。

跳過程式框圖

經驗豐富的設計人員可以跳過簡單設計的方框圖。但他們仍然認為硬體第一。他們畫在頭上，而不是在一張紙上，但他們不知何故繼續畫畫。

使用非同步重置

在某些情況下，非同步復位（或集合）可以提高設計質量。所提出的方法僅支援同步復位（即在時鐘上升沿時考慮的復位）：

  process(clock)
  begin
    if rising_edge(clock) then
      if reset = '1' then
        o <= reset_value_for_o;
      else
        o <= i;
      end if;
    end if;
  end process;

具有非同步重置的版本通過在靈敏度列表中新增重置訊號並賦予其最高優先順序來修改我們的模板：

  process(clock, reset)
  begin
    if reset = '1' then
      o <= reset_value_for_o;
    elsif rising_edge(clock) then
      o <= i;
    end if;
  end process;

合併幾個簡單的過程

我們已經在我們的示例的最終版本中使用了它。合併多個同步程序（如果它們都具有相同的時鐘）是微不足道的。將多個組合過程合併為一個也很簡單，只是對框圖的簡單重組。

我們還可以將一些組合過程與同步過程合併。但為了做到這一點，我們必須回到我們的框圖並新增第十一條規則：

通過在它們周圍繪製一個外殼，將幾個圓形塊和至少一個方塊組合在一起。還附上可以的箭頭。如果箭頭未到達或從機箱外部進入，則不要讓箭頭穿過機箱的邊界。完成後，檢視機箱的所有輸出箭頭。如果它們中的任何一個來自機箱的圓形塊或者也是機箱的輸入，我們就無法在同步過程中合併這些過程。我們可以。

例如，在我們的計數器示例中，我們無法將這兩個程序分組到下圖的紅色機箱中：

StackOverflow 文件

因為 ncnt 是外殼的輸出，其原點是圓形（組合）塊。但我們可以分組：

StackOverflow 文件

內部訊號 npo 將變得無用，最終的過程將是：

  poreg: process(clock)
    variable tmp: bit;
  begin
    if rising_edge(clock) then
      tmp := '0';
      for i in 0 to 8 loop
        tmp := tmp xor ncnt(i);
      end loop;
      po <= not tmp;
    end if;
  end process poreg;

也可以與其他同步過程合併：

  reg: process(clock)
    variable tmp: bit;
  begin
    if rising_edge(clock) then
      co  <= nco;
      bo  <= nbo;
      cnt <= ncnt;
      tmp := '0';
      for i in 0 to 8 loop
        tmp := tmp xor ncnt(i);
      end loop;
      po <= not tmp;
    end if;
  end process reg;

分組甚至可以是：

StackOverflow 文件

通向更簡單的架構：

architecture arc5 of cooley is
  signal cnt: unsigned(8 downto 0);
begin
  process(clock)
    variable ncnt: unsigned(9 downto 0);
    variable tmp:  bit;
  begin
    if rising_edge(clock) then
      ncnt := '0' & cnt;
      co   <= '0';
      bo   <= '0';
      if up = '0' and down = '0' then
        ncnt := unsigned('0' & di);
      elsif up = '1' and down = '0' then
        ncnt := ncnt + 3;
        co   <= ncnt(9);
      elsif up = '0' and down = '1' then
        ncnt := ncnt - 5;
        bo   <= ncnt(9);
      end if;
      tmp := '0';
      for i in 0 to 8 loop
        tmp := tmp xor ncnt(i);
      end loop;
      po  <= not tmp;
      cnt <= ncnt(8 downto 0);
    end if;
  end process;

  do <= bit_vector(cnt);
end architecture arc5;

有兩個過程（do 的並發訊號分配是等效過程的簡寫）。只有一個過程的解決方案留作練習。要注意，它提出了有趣和微妙的問題。

更進一步

電平觸發鎖存器，時鐘下降沿，多個時鐘（以及時鐘域之間的重新同步），同一訊號的多個驅動器等都不是邪惡的。它們有時很有用。但是學習如何使用它們以及如何避免相關的陷阱遠遠超出了 VHDL 對數字硬體設計的簡短介紹。

編碼 VHDL 2008

VHDL 2008 引入了一些修改，我們可以使用它們來進一步簡化程式碼。在這個例子中，我們可以從 2 個修改中受益：

輸出埠可以讀取，我們不再需要 cnt 訊號，
一元 xor 運算子可用於計算奇偶校驗。

VHDL 2008 程式碼可以是：

library ieee;
use ieee.numeric_bit.all;

entity cooley is
  port(
        clock: in  bit;
        up:    in  bit;
        down:  in  bit;
        di:    in  bit_vector(8 downto 0);
        co:    out bit;
        bo:    out bit;
        po:    out bit;
        do:    out bit_vector(8 downto 0)
      );
end entity cooley;

architecture arc6 of cooley is
begin
  process(clock)
    variable ncnt: unsigned(9 downto 0);
  begin
    if rising_edge(clock) then
      ncnt := unsigned('0' & do);
      co   <= '0';
      bo   <= '0';
      if up = '0' and down = '0' then
        ncnt := unsigned('0' & di);
      elsif up = '1' and down = '0' then
        ncnt := ncnt + 3;
        co   <= ncnt(9);
      elsif up = '0' and down = '1' then
        ncnt := ncnt - 5;
        bo   <= ncnt(9);
      end if;
      po <= not (xor ncnt(8 downto 0));
      do <= bit_vector(ncnt(8 downto 0));
    end if;
  end process;
end architecture arc6;