This style has one process, and uses synthesiser-determined state encodings. The two outputs are generated directly from the clocked process, which guarantees that the module outputs are registered.
There are a number of interesting points to note about this style:
- The specific Verilog parameter encodings are not important, since the design intent is that the synthesiser should replace the codings anyway. The codings are specified arbitrarily here as 0, 1, 2, and 3; this is sufficient to allow the RTL code to simulate, and to allow the synthesiser to analyse the code without errors. However, as it turns out, the codings are not quite as arbitrary as they may appear to be; see the synthesis results below.
- The Verilog module declaration is now output reg, rather than simply output, since the outputs are driven by an always block. The reg keyword is optional in the module declaration in the DUT section of the testbench code (fsm.tv), so the same testbench may be used.
- The SCK and BUSY outputs are driven from the clocked process. This is a little harder to code than Style #1 and Style #2, since we have to decode the conditions that will result in the required output after the next clock edge. For example, if the FSM is currently in st3, we set BUSY active because we know that the next state will be st1, and BUSY is required in st1. The next clock edge will change the state from st3 to st1, and will change BUSY from 0 to 1.
- The Verilog code requires a default clause in the case statement. This is not, strictly speaking, essential, but without it XST issues a warning, and in RTL simulation a metavalue in the state register is latched: no case branch matches, so STATEREG is never reassigned. The VHDL code does not require a default (others), since both the simulator and the synthesiser agree that STATEREG has only 4 possible values.
The major complication in Style #3 is that we have to be sure that all the outputs (in this case, there are only two) are always defined in every branch of the case statement. There are 3 ways to do this:
- Every branch can simply explicitly list every output that has to be assigned, setting it to 0 or 1. This is tedious and impractical if there are a large number of outputs.
- By keeping track of what the current value of an output is, you can reduce verbosity by only assigning to the output when it changes. This is what the two examples below do. While this is generally concise, it is possibly more error-prone.
- The outputs can all be set to a default value just before the case statement; the case statement branches can then just assign to an output when it needs a non-default value. This works in both VHDL and Verilog (there is a common misconception that this doesn't work in Verilog). This option is used in Style #4.
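The third option can be sketched as follows. This is not one of the two fsm3 listings, and it is not the Style #4 source; it is a minimal illustration, assuming the same ports, states and reset values as fsm3.v. The module names FSM_DEFAULTS and FSM_DEFAULTS_TB are hypothetical. The defaults are the values required in st1, and each branch overrides them only where the next state needs something different. The result should match fsm3.v for every state reachable after a reset, but unlike fsm3.v it drives SCK low in st0 rather than holding it.

```verilog
// Sketch only: Style #3 FSM with default output assignments before the case.
// FSM_DEFAULTS is a hypothetical name; ports and states follow fsm3.v.
module FSM_DEFAULTS (
  input      CLK, SRST, LOAD, TC,
  output reg SCK, BUSY);

  parameter [1:0] st0 = 0, st1 = 1, st2 = 2, st3 = 3;

  reg [1:0] STATEREG;

  always @(posedge CLK)
    if (SRST) begin
      STATEREG <= st3;
      BUSY     <= 0;
      SCK      <= 1;
    end else begin
      // Defaults: the output values required in st1. For non-blocking
      // assignments made in one process, the last one executed wins,
      // so any branch below can override these.
      BUSY <= 1;
      SCK  <= 0;
      case (STATEREG)
        st3: STATEREG <= st1;
        st0: if (LOAD) STATEREG <= st1;
             else      BUSY <= 0;        // staying idle in st0
        st1: begin
          STATEREG <= st2;
          SCK      <= 1;
        end
        default: begin                   // st2
          if (TC) begin
            STATEREG <= st0;
            BUSY     <= 0;
          end else
            STATEREG <= st1;
        end
      endcase
    end
endmodule

// Small self-checking testbench (also hypothetical) that walks the FSM
// through reset, two SCK pulses, the return to st0, and a new LOAD.
module FSM_DEFAULTS_TB;
  reg  CLK = 0;
  reg  SRST, LOAD, TC;
  wire SCK, BUSY;
  integer errors = 0;

  FSM_DEFAULTS dut (.CLK(CLK), .SRST(SRST), .LOAD(LOAD), .TC(TC),
                    .SCK(SCK), .BUSY(BUSY));

  always #5 CLK = ~CLK;

  // Apply inputs, wait for one rising edge, then compare the outputs.
  task step(input srst, input load, input tc, input exp_sck, input exp_busy);
    begin
      SRST = srst; LOAD = load; TC = tc;
      @(posedge CLK);
      #1;  // let the non-blocking updates settle before sampling
      if (SCK !== exp_sck || BUSY !== exp_busy) begin
        errors = errors + 1;
        $display("mismatch at %0t: SCK=%b BUSY=%b", $time, SCK, BUSY);
      end
    end
  endtask

  initial begin
    //   SRST LOAD TC  SCK BUSY
    step(1,   0,   0,  1,  0);  // reset into st3
    step(0,   0,   0,  0,  1);  // st3 -> st1
    step(0,   0,   0,  1,  1);  // st1 -> st2
    step(0,   0,   0,  0,  1);  // st2 -> st1 (TC = 0)
    step(0,   0,   0,  1,  1);  // st1 -> st2
    step(0,   0,   1,  0,  0);  // st2 -> st0 (TC = 1)
    step(0,   0,   0,  0,  0);  // st0 holds while LOAD = 0
    step(0,   1,   0,  0,  1);  // st0 -> st1 (LOAD = 1)
    if (errors == 0) $display("all checks passed");
    $finish;
  end
endmodule
```

The expected values in the testbench are the per-state outputs implied by fsm3.v: SCK/BUSY are 0/0 in st0, 0/1 in st1, 1/1 in st2, and 1/0 in st3. Any Verilog-2001 simulator should be able to run the pair as a single file.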
Verilog
fsm3.v
    module FSM (
      input      CLK, SRST, LOAD, TC,
      output reg SCK, BUSY);

      parameter [1:0]
        st0 = 0,  // important: this shows an
        st1 = 1,  // initial coding of 0123
        st2 = 2,
        st3 = 3;

      reg [1:0] STATEREG;

      always @(posedge CLK)
        if(SRST) begin
          STATEREG <= st3;
          BUSY     <= 0;
          SCK      <= 1;
        end else
          case(STATEREG)
            st3: begin
              STATEREG <= st1;
              BUSY     <= 1;
              SCK      <= 0;
            end
            st0:
              if(LOAD) begin
                STATEREG <= st1;
                BUSY     <= 1;
              end
            st1: begin
              STATEREG <= st2;
              SCK      <= 1;
            end
            default: begin
              SCK <= 0;
              if(TC) begin
                STATEREG <= st0;
                BUSY     <= 0;
              end else
                STATEREG <= st1;
            end
          endcase
    endmodule
VHDL
fsm3.vhd
    library IEEE;
    use IEEE.std_logic_1164.all;

    entity FSM is
      port (
        CLK, SRST, LOAD, TC : in  std_logic;
        SCK, BUSY           : out std_logic);
    end entity FSM;

    architecture RTL of FSM is
      type FSMTYPE is (st0, st1, st2, st3);
      signal STATEREG : FSMTYPE;
    begin
      FSM : process (CLK) is
      begin
        if rising_edge(CLK) then
          if SRST = '1' then
            STATEREG <= st3;
            BUSY     <= '0';
            SCK      <= '1';
          else
            case STATEREG is
              when st3 =>
                STATEREG <= st1;
                BUSY     <= '1';
                SCK      <= '0';
              when st0 =>
                if LOAD = '1' then
                  STATEREG <= st1;
                  BUSY     <= '1';
                end if;
              when st1 =>
                STATEREG <= st2;
                SCK      <= '1';
              when st2 =>
                SCK <= '0';
                if TC = '1' then
                  STATEREG <= st0;
                  BUSY     <= '0';
                else
                  STATEREG <= st1;
                end if;
            end case;
          end if;
        end if;
      end process FSM;
    end architecture RTL;
Synthesis
The XST synthesis results, with default automatic FSM encoding, turned up some surprises. Both the Verilog and the VHDL code produced a one-hot FSM, with two additional registers for the SCK and BUSY outputs, as might be expected. However, the specific implementations differed:
Source | Mode | SCK/BUSY | Period, ns | LUT2 | LUT3 | LUT4 | LUT5 | LUT6 | FDR | FDS | FDRS | Total |
---|---|---|---|---|---|---|---|---|---|---|---|---|
fsm3.v | Auto | Reg/Reg | 1.103 | 1 | 2 | 1 | 2 | 2 | 2 | 10 | ||
fsm3.vhd | Auto | Reg/Reg | 1.298 | 1 | 1 | 2 | 1 | 3 | 2 | 1 | 11 |
On the face of it, this makes little sense; the only difference between the VHDL and the Verilog code is that the designer specified explicit state codings (0123) in the Verilog code, which were replaced by XST with 4-bit one-hot encodings. The synthesiser might be expected to ignore the initial codings. However, some experimentation showed that the initial codings were actually used by XST; different initial codings resulted in different one-hot codings, implementations, and period estimates.
There are 24 ways (4!) to specify the initial Verilog coding for this simple FSM. I tried 4 of these. Three of them (the initial 0123 coding above, and two new codings of 1302 and 3210) all produced a period estimate of 1.103ns, using one-hot coding. However, the fourth (2013) produced an estimate of 1.042ns. Coding 2013 turned up another surprise: XST's auto mode did not use a one-hot encoding, but instead took 2013 as a user-defined coding, and produced a binary FSM with two additional registers for the SCK and BUSY outputs. The technology diagram showed 4 flip-flops and 4 LUTs.
Given this, I repeated synthesis of both sources for all 7 explicit extraction styles provided by XST, together with 'none' (6). The Verilog source was restored to an initial coding of 0123 (as in the source code above). The table below gives XST's estimated minimum period, together with XST's assigned coding when it produced a 2-bit FSM (in the 'Binary' column) and a 4-bit one-hot FSM, for all extraction styles:
Coding | Verilog period, ns | Verilog binary | Verilog one-hot | VHDL period, ns | VHDL binary | VHDL one-hot |
---|---|---|---|---|---|---|
Auto | 1.103 | | 4281 | 1.298 | | 1482 |
Compact | 1.042 | 2130 | | 1.042 | 0231 | |
Sequential | 1.042 | 2130 | | 1.042 | 0231 | |
Gray | 1.110 | 3120 | | 1.110 | 0321 | |
Johnson | 1.110 | 3120 | | 1.110 | 0321 | |
User | 1.042 | 0123 | | 1.042 | 0123 | |
One-hot | 1.103 | | 4281 | 1.298 | | 1482 |
Speed1 | 1.103 | | 2418 | 1.298 | | 8214 |
None | 1.110 | | | 1.373 | | |
All the extraction styles which have an entry in the 'Binary' column produced a 2-bit state register, together with two additional bits for the SCK and BUSY outputs, giving a total of 4 flip-flops. The two styles which produced one-hot FSMs, and the 'None' style, produced 6 flip-flops.
One clear conclusion is that auto extraction may not produce the best results. Additionally, for at least some of the possible settings of the Verilog state parameters, the one-hot Verilog FSMs are faster than the corresponding VHDL one-hot FSM. The one-hot FSMs were only compared for 3 of the possible 24 settings of the Verilog parameters; it is possible (or even likely) that one of the other settings would produce a result which is equivalent to the VHDL result.
Finally, it is also clear that none of the synthesis results were as efficient as the Style #1 and Style #2 results (2 flip-flops and 2 LUTs, at 0.926ns). Since the required functionality is identical, it does seem likely that some modification of the Verilog and VHDL code could produce better results. However, it's not obvious what these modifications would be. Simply breaking the clocked process down into two clocked processes (Style #4) has no effect on the synthesis results.
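Since auto extraction was not the fastest choice here, it may be useful to pin the encoding in the source rather than in the tool options. The sketch below assumes XST's fsm_encoding synthesis attribute, placed on the state register declaration; the attribute name and the "user" value string are quoted from memory of the XST User Guide and should be checked against it before use. The module name FSM_ENC is hypothetical, and the body is otherwise the fsm3.v logic. This is not how the results above were obtained; it is only one possible way to request the 'User' style, which gave 1.042ns with the 0123 coding.

```verilog
// Sketch only: requesting a specific FSM encoding from XST in the source.
// Assumption: XST honours (* fsm_encoding = "..." *) on the state register.
// FSM_ENC is a hypothetical name; the logic is copied from fsm3.v.
module FSM_ENC (
  input      CLK, SRST, LOAD, TC,
  output reg SCK, BUSY);

  parameter [1:0] st0 = 0, st1 = 1, st2 = 2, st3 = 3;

  // "user" asks the synthesiser to keep the parameter values above as the
  // state codes instead of re-encoding them. Other simulators and tools
  // should simply ignore the attribute.
  (* fsm_encoding = "user" *) reg [1:0] STATEREG;

  always @(posedge CLK)
    if (SRST) begin
      STATEREG <= st3;
      BUSY     <= 0;
      SCK      <= 1;
    end else
      case (STATEREG)
        st3: begin
          STATEREG <= st1;
          BUSY     <= 1;
          SCK      <= 0;
        end
        st0:
          if (LOAD) begin
            STATEREG <= st1;
            BUSY     <= 1;
          end
        st1: begin
          STATEREG <= st2;
          SCK      <= 1;
        end
        default: begin
          SCK <= 0;
          if (TC) begin
            STATEREG <= st0;
            BUSY     <= 0;
          end else
            STATEREG <= st1;
        end
      endcase
endmodule
```

An attribute in the source travels with the code, which helps when the same file is synthesised from several projects; the drawback is that it is tool-specific, so the estimates in the table above would still need to be re-checked for any other synthesiser.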