Finite State Machines

This section contains a number of examples of Verilog and VHDL FSMs. All are synthesisable, but the precise synthesis results will depend on the coding style, and on any recoding of the FSM which may be carried out by the synthesiser. The source code contains no synthesiser directives to select a specific coding style, since these are vendor-specific.

All the examples implement this simple FSM:

The FSM has three inputs (SRST, LOAD and TC), and two outputs (BUSY and SCK). SRST is a synchronous reset; the FSM resets to state 3.
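As a reading aid, the behaviour can be sketched in Verilog as follows. This is a minimal sketch, not one of the numbered coding styles: the transitions are inferred from the vectors in fsm.tv (see the Testbench section), and using the {SCK, BUSY} output pair directly as the state register is an assumption made for brevity.

```verilog
// Sketch only: transitions inferred from the fsm.tv vectors, not copied
// from any of the example source files.
module FSM(
   input  CLK, SRST, LOAD, TC,
   output SCK, BUSY);

   // Assumed output-encoded states: the state register is {SCK, BUSY}.
   localparam [1:0] S0 = 2'b00,
                    S1 = 2'b01,
                    S2 = 2'b11,
                    S3 = 2'b10;

   reg [1:0] state;
   assign {SCK, BUSY} = state;

   always @(posedge CLK)
      if (SRST)
         state <= S3;                       // synchronous reset to state 3
      else
         case (state)
            S3:      state <= S1;           // unconditional
            S1:      state <= S2;           // unconditional
            S2:      state <= TC   ? S0 : S1;
            S0:      state <= LOAD ? S1 : S0;
            default: state <= S3;
         endcase
endmodule
```

Because every state maps to a unique output combination, the state can be observed from the module interface alone, which is what the testbench below relies on.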


Testbench

A single testbench is used for all the examples (both VHDL and Verilog). This FSM is simple enough to be tested by cycling through all the states from a reset. In general, however, a test for a more complex FSM would need to preload specific states (using the .P directive), and then test the state space around each of those states.

A significant issue with FSM verification is that the synthesiser may change the state coding, or the number of bits in the state register. If the testbench explicitly tests the state register, or makes any assumptions about the contents of the state register, then that testbench will only work for the original RTL code. A testbench might, for example, check that an FSM is in state ST0; if the synthesiser has recoded ST0 from 00 to 1000, then the testbench will fail with the post-synthesis netlist.

It is not generally practical to construct different testbenches for the RTL code and the netlist, as the synthesis output will change as different coding algorithms are tried (XST, for example, has 8 different algorithms). If the synthesiser is set to auto-encoding, it might also later change the algorithm if an external factor changes. This testbench avoids the problem by checking only the signals at the module interface, and not the state register itself. It may appear that the testbench does test the state register, but S0, S1, S2, and S3 are defined as combinations of the two output signals (SCK and BUSY, in that order), rather than as specific values of a state register.
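For comparison with the fsm.tv vectors that follow, the same interface-only approach can be sketched as a hand-written Verilog testbench. This is illustrative only: the module name fsm_tb, the task apply_vector, the instance name dut, and the 10 ns vector period are assumptions, and a module FSM with the port list given in fsm.tv is assumed to be compiled alongside it.

```verilog
`timescale 1ns/1ps
module fsm_tb;
   // Expected values are {SCK, BUSY} pairs seen at the module interface,
   // mirroring the S0..S3 macros in fsm.tv. No state register is referenced.
   localparam [1:0] S0 = 2'b00, S1 = 2'b01, S2 = 2'b11, S3 = 2'b10;

   reg  CLK = 0, SRST, LOAD, TC;
   wire SCK, BUSY;
   integer fails = 0;

   FSM dut(.CLK(CLK), .SRST(SRST), .LOAD(LOAD), .TC(TC),
           .SCK(SCK), .BUSY(BUSY));

   // Drive the inputs, apply one rising clock edge, then compare the outputs.
   task apply_vector(input srst, input load, input tc, input [1:0] expected);
      begin
         SRST = srst; LOAD = load; TC = tc;
         #5 CLK = 1;
         #1 if ({SCK, BUSY} !== expected) begin
               fails = fails + 1;
               $display("FAIL at %0t: got %b, expected %b",
                        $time, {SCK, BUSY}, expected);
            end
         #4 CLK = 0;
      end
   endtask

   // A fragile alternative would be a hierarchical check such as
   //    if (dut.state !== 2'b00) ...
   // (dut.state is a hypothetical name), which stops working as soon as
   // the synthesiser recodes or renames the state register.

   initial begin
      apply_vector(1'b1, 1'bx, 1'bx, S3);   // synchronous reset
      apply_vector(1'b0, 1'bx, 1'bx, S1);
      apply_vector(1'b0, 1'bx, 1'bx, S2);
      apply_vector(1'b0, 1'bx, 1'b0, S1);
      apply_vector(1'b0, 1'bx, 1'bx, S2);
      apply_vector(1'b0, 1'bx, 1'b1, S0);
      apply_vector(1'b0, 1'b0, 1'bx, S0);
      apply_vector(1'b0, 1'b1, 1'bx, S1);
      apply_vector(1'b1, 1'bx, 1'bx, S3);   // reset
      $display("done: %0d fails", fails);
      $finish;
   end
endmodule
```

The comparison uses !== so that an unknown (X) output counts as a mismatch rather than being silently ignored.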

fsm.tv

#define S0 0,0
#define S1 0,1
#define S2 1,1
#define S3 1,0

DUT {
   module FSM(
      input  CLK, SRST, LOAD, TC,
      output SCK, BUSY)
   [CLK, SRST, LOAD, TC] -> [SCK, BUSY]
   create_clock CLK       // ';' termination optional in DUT section
}

[.C, 1, .X, .X] -> [S3]
[.C, 0,   ,   ] -> [S1]   // leave blank inputs unchanged at .X
[.C, 0,   ,   ] -> [S2]
[.C, 0,   ,  0] -> [S1]
[.C, 0,   , .X] -> [S2]
[.C, 0,   ,  1] -> [S0]
[.C, 0,  0, .X] -> [S0]
[.C, 0,  1, .X] -> [S1]
[.C, 1, .X, .X] -> [S3]   // reset

The expected testbench output is:

# (Log)        (90 ns) 9 vectors executed (18 passes, 0 fails)


Synthesis summary

All the code examples produce exactly the same FSM, which is shown in the diagram above. There is one minor difference: SCK and BUSY are coded as combinatorial outputs in Style #2 and Style #5, and as registered outputs in the other styles. However, coding an output as sequential or combinatorial may not actually have the desired effect in the synthesis netlist, for these reasons:

  • If a synthesiser recodes the states, it may have to convert sequential outputs to combinatorial ones; see, for example, the auto-mode output for Style #1.
  • XST, at least, is capable of rolling combinatorial outputs back into a register; see the output for Style #2 and Style #5.

This needs to be considered carefully, as it has important implications for methodologies which require subblock outputs to be registered.

Implementing a combinatorial output as a register improves module timing, and is clearly beneficial. However, its usefulness is limited by the fact that there is no simple way of determining whether or not an output was registered. This means that this feature cannot be relied on if there is a methodology requirement to register the outputs of subblocks, and the FSM outputs are the subblock outputs.
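The difference between the two output styles in the source code can be illustrated with a small fragment. This is a sketch under stated assumptions, not an extract from any of the examples: the module name output_styles, the next_state input, the binary encoding of S1 and S2, and the BUSY_comb and BUSY_reg names are all hypothetical. BUSY is taken to be high in S1 and S2, as in the fsm.tv macros.

```verilog
module output_styles(
   input            CLK,
   input      [1:0] next_state,   // hypothetical: driven by next-state logic elsewhere
   output reg [1:0] state,
   output           BUSY_comb,
   output reg       BUSY_reg);

   localparam [1:0] S1 = 2'd1, S2 = 2'd2;   // hypothetical encoding

   always @(posedge CLK)
      state <= next_state;

   // Combinatorial style: decoded from the current state, so the output
   // passes through logic after the state flip-flops.
   assign BUSY_comb = (state == S1) || (state == S2);

   // Registered style: decoded from the next state and captured on the
   // same clock edge as the state register, so the output comes directly
   // from a flip-flop and changes in the same cycle as BUSY_comb.
   always @(posedge CLK)
      BUSY_reg <= (next_state == S1) || (next_state == S2);
endmodule
```

As written, the two outputs carry the same value after every clock edge; only the logic between the flip-flops and the pin differs. The synthesiser remains free to restructure either form, so the only dependable check is to inspect the netlist.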

There was a large variation in the synthesis results, with estimated cycle times ranging from 0.926ns to 1.298ns. The number of flip-flops varied from 2 to 6, and the total flip-flop and LUT count varied from 4 to 11. The results are summarised in the table below.

Synthesis results
File       Mode  SCK/BUSY   Period (ns)  Primitive counts  Total
fsm1.v     Auto  Comb/Comb  1.103        2 2 2 1 1         8
fsm1.vhd   Auto  Comb/Comb  1.103        2 2 2 1 1         8
fsm1.v     User  Reg/Reg    0.926        1 1 1 1           4
fsm1.vhd   User  Reg/Reg    0.926        1 1 1 1           4
fsm2.v     Auto  Reg/Comb   0.926        2 1 2             5
fsm2.vhd   Auto  Reg/Reg    0.926        1 1 1 1           4
fsm3.v     Auto  Reg/Reg    1.103        1 2 1 2 2 2       10
fsm3.vhd   Auto  Reg/Reg    1.298        1 1 2 1 3 2 1     11
fsm4b.v    Auto  Reg/Reg    1.103        1 3 1 2 1 2       10
fsm4b.vhd  Auto  Reg/Reg    1.107        2 1 2 2 2         9
fsm4b.v    User  Reg/Reg    1.110        1 2 1 1 2 1       8
fsm4b.vhd  User  Reg/Reg    1.107        1 1 1 2 1         6
fsm5.v     Auto  Comb/Comb  1.103        2 2 2 1 1         8
fsm5.vhd   Auto  Reg/Reg    0.926        1 1 1 1           4
fsm5.v     User  Reg/Comb   0.926        2 1 2             5
fsm5.vhd   User  Reg/Comb   0.926        2 1 2             5

Primitive counts are the non-blank cells of the LUT2, LUT3, LUT4, LUT5, LUT6, FD, FDR, FDS and FDRS columns, in that order; Total is their sum.