FIFO test example

The FIFO test code is a complete example which tests a Xilinx CoreGen FIFO. You can download the code as a zip file from here. The FIFO itself is a netlist which was created by the Xilinx core generator, but the simulation also requires a small number of UNISIM components (the UNISIM library contains models of the device primitives, such as flops, for functional simulation). The required models are included in the example, so it's not necessary to install the Xilinx tools or libraries to run the example. The FIFO is small (16 wide by 31 deep, dual-clock, distributed RAM) to make it easier to fill and empty it during testing.

The testbench has a number of new concepts in it:

  • It is multi-threaded (with one thread for writing to the FIFO, and one for reading from it)
  • It uses pass-by-reference parameters in function calls
  • The FIFO has two independent clock inputs. In normal circumstances, these would both be generated by the testbench. However, Maia testbenches can respond to DUT-output clocks, as well as generating DUT-input clocks.
    To demonstrate this, the testbench (ftest1.tv) does not instantiate the FIFO directly, but instead instantiates a wrapper module (tfifo_wrapper.v). The wrapper contains a behavioural model for a 250MHz clock, which it drives to both the FIFO and the testbench, for FIFO reads. The testbench creates and drives a 156.25MHz clock for FIFO writes. The testbench itself therefore has both an input clock, and an output clock. This isn't necessary for the FIFO test, of course: it just demonstrates the use of DUT-output clocks.
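
As a quick cross-check of the two clock frequencies quoted above, the short Python sketch below converts a frequency in MHz to a period in nanoseconds. It is a stand-alone calculation, not Maia code, and the function name period_ns is purely illustrative. The results match the -period values used in the create_clock declarations in section 3.

```python
def period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


write_period = period_ns(156.25)   # CLK156, generated by the testbench
read_period = period_ns(250.0)     # CLK250, generated by the wrapper

print(write_period, read_period)   # prints: 6.4 4.0
```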

The code in this example is Verilog-only, for simplicity.


1: Defines

The two seeds (for the LFSR generator and the rand function), and the test count (the number of words to read and write) are defined with #define. In practice, these would normally be passed in from the test environment (as mtv -DTEST_COUNT=50000, for example), so they are only defined here if they do not already have values:

#ifndef LFSR_SEED
#define LFSR_SEED 0xcdef
#endif

2: Global variables

The global variables are any variables declared outside a function (randSeed, cyclesEF, and cyclesFF), and all the DUT module ports. The only DUT module ports which are directly referred to in the testbench are EF and FF, which are the FIFO empty and full flags.


3: The DUT section

The DUT section includes a declaration of the module to be tested, together with clock, signal, and timing declarations. A 'new'-style Verilog module declaration can be copied directly into the DUT section (note that the Verilog reg keyword is optional, and has been omitted here). The clocks and drives for this testbench have been declared as follows:

1   create_clock CLK156 -period 6.4;       // we drive this clock
2   create_clock CLK250 -period 4.0;       // this is a DUT output
3
4   // reset
5   [CLK156, RST156];
6
7   // write to the DUT
8   [CLK156, WREN, DIN];
9
10  // read from the DUT
11  [CLK250, RDEN] -> [DOUT];

Drive statements are identified by a signature, which is made up of the number of signals on the LHS, and the RHS, of the declaration. These three declarations have signatures of (2:0), (3:0), and (2:1). Any drive statement that appears in the code with the form [A,B], for example, is then known to correspond to the declaration on line 5 above, and A will be driven to CLK156, while B will be driven to RST156. The device can therefore be reset with a statement of the form [.C,1].

If it is necessary to have two declarations with the same signature, they must be differentiated by adding a textual label, and a ':' character, prior to both the declaration, and any drive statements corresponding to that declaration.
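
The signature rule can be illustrated with a few lines of Python. This is only a sketch of the counting described above, using simple string handling on the three declarations from this example; it is not how the Maia compiler is implemented, and the function name drive_signature is hypothetical.

```python
import re


def drive_signature(decl: str) -> tuple:
    """Return (LHS count, RHS count) for a drive declaration such as
    '[CLK250, RDEN] -> [DOUT];'. Illustrative string handling only."""
    # Each bracketed group is a comma-separated signal list
    groups = re.findall(r"\[([^\]]*)\]", decl)
    counts = [len([s for s in g.split(",") if s.strip()]) for g in groups]
    lhs = counts[0]
    rhs = counts[1] if len(counts) > 1 else 0   # no '->' part means 0
    return (lhs, rhs)


for decl in ("[CLK156, RST156];",
             "[CLK156, WREN, DIN];",
             "[CLK250, RDEN] -> [DOUT];"):
    print(decl, drive_signature(decl))
```

The three declarations give (2, 0), (3, 0), and (2, 1), so each can be identified without a label.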

This testbench will use 3 drives:

  • Line 5 is the declaration of a drive which will advance CLK156, while setting RST156 as required. No other DUT inputs are changed during this operation, so this can be used to reset the DUT, or clear the reset, at any time.
  • Line 8 declares a drive which will advance CLK156, while setting WREN and DIN as required. This will be used to write to the FIFO.
  • Line 11 declares a drive which will advance CLK250, while setting RDEN as required; it will also automatically test DOUT after the clock edge (if the test value is not a 'null' value; in other words, it is not omitted, or given as '-'). This will be used to read from the FIFO, and test the output.
    Note that CLK250 is actually an input to the testbench, and is driven by the wrapper, so the TB does not technically 'advance' the clock. See note (9).

If a testbench has to operate with multiple clocks, then it must drive (or respond to) the clocks from different threads. If you attempt to drive two clocks from the same thread, then they will just 'take turns', instead of running concurrently.

In this case, the TB drives CLK156 from the writer thread, while 'driving' CLK250 in the reader thread. The two threads run concurrently, so a waveform display will show both clocks running. Note that any DUT input (including any declared clock) can be driven by at most one thread; this is covered in more detail below. The declarations on lines 5 and 8 both drive CLK156, so the writer thread carries out both the reset and the FIFO write operations.


4: The main function

main simply initiates the reader and writer threads before terminating:

main() {
   int tid;
   exec thread_write(tid);
   exec thread_read (tid);
}

The exec statement starts a function in a new thread. The function must have at least one formal parameter. The first parameter must be an integer reference; the new thread ID is returned to the caller in this integer, but is unused in this example. Functions which are started in this way are called 'thread functions', and do not return to the caller.

The program terminates when any thread calls exit, or when all threads have terminated by 'falling off the bottom'. The program will also terminate automatically after a predetermined number of DUT failures or run-time errors (run mtv --help for details).

DUT ports and signals can only be driven by a single thread. The compiler detects any attempt to drive a port or signal from multiple threads, and reports an error (16). main runs in its own thread, so this example has 3 threads. If the FIFO reset operation in the writer thread ([.C,1]) is moved to main, for example, compilation will fail, because CLK156 will then be driven from two threads.
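
The single-driver rule can be pictured as a simple ownership check. The Python sketch below is an illustration of the rule only, under the assumption that each thread's driven signals are known as a set; it does not reflect the compiler's actual implementation, and the function name is hypothetical.

```python
from collections import defaultdict


def find_multiply_driven(drives: dict) -> dict:
    """drives maps a thread name to the set of signals it drives.
    Returns each signal driven by more than one thread, together with
    the sorted list of threads that drive it."""
    owners = defaultdict(list)
    for thread, signals in drives.items():
        for sig in signals:
            owners[sig].append(thread)
    return {s: sorted(t) for s, t in owners.items() if len(t) > 1}


# The example as written: every signal has exactly one driving thread.
# (CLK250 is listed under the reader because its drive statement names it,
# although the wrapper actually generates that clock.)
as_written = {
    "thread_write": {"CLK156", "RST156", "WREN", "DIN"},
    "thread_read": {"CLK250", "RDEN"},
}

# Reset moved into main: main now uses the [CLK156, RST156] drive.
reset_in_main = {
    "main": {"CLK156", "RST156"},
    "thread_write": {"CLK156", "WREN", "DIN"},
    "thread_read": {"CLK250", "RDEN"},
}

print(find_multiply_driven(as_written))      # no conflicts
print(find_multiply_driven(reset_in_main))   # CLK156 has two drivers
```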


5: The writer thread

If the FIFO is not full, the writer thread waits for a small number of cycles before writing new data to the FIFO. The number of wait cycles is set by the wait1_count function, which returns an integer in the range [0,5], which is heavily weighted towards 0 (in other words, no waits):

/**
 * Return a cycle count which is used to insert a pseudo-random (PR) delay
 * when writing to the FIFO
 *
 * @return The cycle count
 */
int wait1_count() {
   result = rand(randSeed, 0, 99);
   if     (result < 80) result = 0;       // 80% 0 cycle
   else if(result < 85) result = 1;       //  5% 1 cycle
   else if(result < 90) result = 2;       //  5% 2 cycle
   else if(result < 94) result = 3;       //  4% 3 cycle
   else if(result < 98) result = 4;       //  4% 4 cycle
   else                 result = 5;       //  2% 5 cycle
}  // wait1_count()

randSeed is a global variable, because it is required by both the reader and writer threads. Note that the function assigns to the built-in result variable to return a value, rather than using a return statement.
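
The weighting can be checked by enumerating every possible draw. The Python sketch below copies the thresholds from wait1_count and assumes that rand(randSeed, 0, 99) returns a uniformly distributed integer in the range 0 to 99 inclusive (an assumption based on the percentages in the comments). It is a stand-alone model, not Maia code.

```python
from collections import Counter


def wait1_from_draw(draw: int) -> int:
    """Map a draw in 0..99 to a wait count, using the same thresholds
    as wait1_count."""
    if draw < 80:
        return 0
    if draw < 85:
        return 1
    if draw < 90:
        return 2
    if draw < 94:
        return 3
    if draw < 98:
        return 4
    return 5


# Count how many of the 100 equally likely draws give each wait count
tally = Counter(wait1_from_draw(d) for d in range(100))

# Mean wait for a single call, in cycles
mean_wait = sum(count * n for count, n in tally.items()) / 100

print(dict(tally))   # prints: {0: 80, 1: 5, 2: 5, 3: 4, 4: 4, 5: 2}
print(mean_wait)     # prints: 0.53
```

The tally agrees with the percentages in the comments, and a single call returns 0.53 cycles on average.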

The writer thread repeats this process until it has written TEST_COUNT words. The write data is created by a 16-bit LFSR, which is advanced on every word write:

/**
 * Calculate and return the next value in an LFSR16 sequence. This is a
 * maximal-length 16-bit Galois LFSR. 0 is not a valid value for the LFSR, so
 * is checked for and modified.
 *
 * @param lfsr  The current LFSR value on entry, and the new one on return
 */
void lfsr_next(bit16 &lfsr) {
   if(lfsr == 0)
      lfsr = 1;

   bool lsb = lfsr & 1;
   lfsr >>= 1;
   if(lsb)
      lfsr ^= 0xB400;
}

Note that the lfsr parameter is passed by reference, so the function directly modifies whatever the caller passes as the parameter. The function doesn't need to return anything directly, so has a void return type.
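
The following Python model of lfsr_next can be used to check the sequence outside the simulator. It is a sketch rather than part of the example: Python has no pass-by-reference integers, so the model returns the new value instead, and the lfsr_period helper is an addition for illustration only.

```python
def lfsr_next(lfsr: int) -> int:
    """Return the value that follows 'lfsr' in the 16-bit Galois LFSR
    sequence used by the testbench (feedback mask 0xB400)."""
    if lfsr == 0:          # 0 is not a valid state; treat it as 1
        lfsr = 1
    lsb = lfsr & 1
    lfsr >>= 1
    if lsb:
        lfsr ^= 0xB400
    return lfsr


def lfsr_period(seed: int) -> int:
    """Count the steps needed to return to a non-zero seed value."""
    if seed == 0:
        raise ValueError("seed must be non-zero")
    value = lfsr_next(seed)
    steps = 1
    while value != seed:
        value = lfsr_next(value)
        steps += 1
    return steps


print(hex(lfsr_next(0xCDEF)))   # prints: 0xd2f7
print(lfsr_period(0xCDEF))      # prints: 65535
```

A period of 65535 (2^16 - 1) confirms that the sequence is maximal-length, so the test data does not repeat within the default 5000 words.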

The thread_write function is:

/**
 * The writer thread: writes TEST_COUNT LFSR words to the FIFO
 *
 * @param tid  The thread ID for this thread
 */
void thread_write(int &tid) {
   int   i;
   int   words_written = 0;
   bit16 lfsr = LFSR_SEED;

   // fifo reset: careful, see PG057
   for(i=0; i<4; i++)
     [.C, 1];
   for(i=0; i<26; i++)
     [.C, 0];

   while(true) {
      if(FF) {                            // wait 1 cycle if the fifo is full
         [.C, 0, .X];                     // drive data as X
         ++cyclesFF;                      // record full cycle count
         continue;
      }

      for(i=0; i < wait1_count(); i++)
         [.C, 0, .X];                     // PR wait before writing this word

      [.C, 1, lfsr];                      // write the test data
      lfsr_next(lfsr);                    // advance the test data
      if(++words_written >= TEST_COUNT)
         break;
   }

   [.C, 0, .X];                           // clean up: leave WREN low
}  // thread_write()

The tid parameter receives the thread ID of the new thread. It is passed by reference because the same integer is used to return the new thread ID to the caller when the thread is created (this is an internal operation; the programmer doesn't have to do anything). All thread functions must have an integer reference as the first parameter. The new thread is created by an exec call in main.

The writer thread resets the FIFO, and writes data to it, as described above. cyclesFF is a global variable which records the number of times that the FIFO was found to be full, for reporting purposes.


6: The reader thread

The reader thread operates in much the same way as the writer thread. If the FIFO is not empty, the reader thread inserts a small number of waits before reading a word from the FIFO. The number of wait cycles is set by the wait2_count function, which again returns an integer which is weighted towards zero. However, in this case, the returned wait count is generally larger than the count used when writing to the FIFO. The read clock is faster than the write clock, so a longer read wait is needed to balance the read and write rates, allowing the FIFO level to fluctuate between empty and full.
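
The balancing argument can be made concrete with a rough calculation. The Python sketch below assumes that each word costs one transfer cycle plus the mean number of wait cycles, and it ignores the cycles lost when the FIFO is full or empty. The function name and the example inputs are illustrative; they are not taken from wait2_count, which is not listed here.

```python
def balancing_read_wait(mean_write_wait: float,
                        write_period_ns: float = 6.4,
                        read_period_ns: float = 4.0) -> float:
    """Mean read wait (in read-clock cycles) at which the reader's
    average time per word equals the writer's.
    Assumes one transfer cycle plus the mean wait per word, and
    ignores full/empty stall cycles."""
    write_time_per_word = (1.0 + mean_write_wait) * write_period_ns
    return write_time_per_word / read_period_ns - 1.0


# With no write waits at all, the reader still needs about 0.6 wait
# cycles per word on average to avoid draining the FIFO.
print(round(balancing_read_wait(0.0), 3))

# With a mean write wait of 0.53 cycles (illustrative input), the
# balancing read wait is roughly 1.45 cycles.
print(round(balancing_read_wait(0.53), 3))
```

A mean read wait close to the balancing value lets random variation move the FIFO towards both the full and the empty level during a run.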

The reader thread maintains its own copy of an lfsr variable, so that it can test the data out of the FIFO. The lfsr_next function is therefore called by, and runs in, both the reader and writer threads (which means that lfsr_next cannot write to an HDL port or signal, since these can be written by at most one thread).

The reader thread code is shown below. Note that the reader terminates the program by calling exit:

/**
 * The reader thread: reads TEST_COUNT LFSR words from the FIFO
 *
 * @param tid  The thread ID for this thread
 */
void thread_read(int &tid) {
   int   i;
   int   words_read = 0;
   bit16 lfsr = LFSR_SEED;

   while(true) {
      if(EF) {                            // wait 1 cycle if the fifo is empty
         [.C, 0] -> [];                   // ignore DOUT; don't test
         ++cyclesEF;                      // record empty cycle count
         continue;
      }

      for(i=0; i < wait2_count(); i++)
         [.C, 0] -> [];                   // PR wait before reading this word
      [.C, 1] -> [lfsr];                  // read the fifo, check the output

      if(++words_read >= TEST_COUNT) {
         report(
            "finished: %d words read. The FIFO was empty for %d cycles, "
            "and full for %d cycles\n", words_read, cyclesEF, cyclesFF);
         exit(0);                         // test finished
      }
      lfsr_next(lfsr);
   }
}  // thread_read()

7: Test result reporting

The reader thread reports the test results when it terminates. At the simplest level, we know that the test has passed if mtv reports that there were TEST_COUNT passes and no failures. This reporting happens automatically, and doesn't require any programmer intervention; the report statement in the reader thread is therefore not strictly necessary. The default value of TEST_COUNT is 5000, and we test the value of DOUT 5000 times (with -> [lfsr]), so we expect 5000 passes, and no failures.

However, this isn't always the whole story. In this case, the FIFO test is useless if the FIFO never fills, or never empties. Most DUTs have some similar issue, so a raw pass/fail statistic is often not very informative on its own.

For this test, the writer and reader threads therefore record the number of cycles on which they found the FIFO to be full, or empty, as the global cyclesFF and cyclesEF counts, and the reader thread reports these numbers on termination. A more sophisticated test would record the number of transitions, rather than the raw number of cycles, or might occasionally stop the write and read operations in order to force a FIFO empty or full condition. This version of the test produces this report on completion, after any additional simulator output has been removed:

finished: 5000 words read. The FIFO was empty for 173 cycles, and full for 42 cycles
(Log)        (40281.9 ns) exit (code 0)
(Log)        (40281.9 ns) 16345 vectors executed (5000 passes, 0 fails)

The inclusion of the 'finished' message gives some confidence that the test has actually done something useful, and has read 5000 words, and has actually both filled and emptied the FIFO. If you don't add any more details to your test output, then anyone else who has to maintain, modify, or fix your code is going to find it difficult to find out what your test has actually done. If this test is part of a set of unit tests for your code, then the 3 lines above should form the contents of your golden logfile.
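
One way to compare a run against a golden logfile is to reduce the raw simulator output to the lines of interest first. The Python sketch below is an assumption about how such a filter might look, not a tool supplied with the example: it keeps the 'finished:' line and any line starting with '(Log)', and removes the '# ' prefix which ModelSim adds to each line (see the transcript in section 8).

```python
def normalise_log(raw: str) -> list:
    """Reduce raw simulator output to the lines used in the golden log.
    Assumes that the lines of interest start with 'finished:' or
    '(Log)', optionally preceded by a '# ' transcript prefix."""
    kept = []
    for line in raw.splitlines():
        line = line.rstrip()
        if line.startswith("# "):          # ModelSim transcript prefix
            line = line[2:]
        if line.startswith("finished:") or line.startswith("(Log)"):
            kept.append(line)
    return kept


golden = [
    "finished: 5000 words read. The FIFO was empty for 173 cycles, "
    "and full for 42 cycles",
    "(Log)        (40281.9 ns) exit (code 0)",
    "(Log)        (40281.9 ns) 16345 vectors executed (5000 passes, 0 fails)",
]

# Two hypothetical raw outputs: one plain, one with '# ' prefixes
plain_output = "some simulator banner\n" + "\n".join(golden) + "\n"
prefixed_output = "\n".join("# " + l for l in plain_output.splitlines())

print(normalise_log(plain_output) == golden)      # prints: True
print(normalise_log(prefixed_output) == golden)   # prints: True
```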

Here are two other issues you might want to consider when creating a golden logfile for automated testing:

  1. Try not to have report output from different threads at the same simulation time. You might, for example, be tempted to create a message at the start of the reader and writer threads that says something like report("In reader thread; TID %d\n", tid). However, both these statements execute at the same simulation time (there are no time-consuming statements between the two exec statements). Both report statements will print correctly, but simulator A might output the reader thread message first, while simulator B might output the writer thread message first. If you put these messages in the golden logfile, one of the simulators will produce output which fails the logfile comparison.
  2. For longer integration tests, you might want to put in messages such as 'completed reset', giving a cycle count or a simulation time. The problem here is that minor changes in your chip are likely to change the specific time at which you complete reset, for example, or your PLLs lock, by a few cycles. You don't want these changes to cause regression failures, so it can make sense to round any time that you report. FAQ 17 gives a function for reporting times to the nearest microsecond, for example.
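
The rounding suggested in the second point can be sketched as follows. This Python function is an illustration only; it is not the function given in FAQ 17, and its name is hypothetical.

```python
def round_to_us(time_ns: float) -> int:
    """Round a simulation time in nanoseconds to the nearest microsecond."""
    return int(time_ns / 1000.0 + 0.5)


# Two hypothetical reset-completion times, a couple of cycles apart,
# produce the same reported value.
print(round_to_us(1234.5), round_to_us(1247.3))   # prints: 1 1

# The completion time from this example
print(round_to_us(40281.9))                       # prints: 40
```

Rounding does not remove the problem entirely, since a time that sits close to a rounding boundary can still change the reported value, but it makes small timing shifts much less likely to break the logfile comparison.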

8: Running the test

The default simulators.conf already contains two entries (for Icarus and ModelSim) which look in a local 'xvlog' directory for additional sources. If you are using one of these simulators, you can therefore run the test directly with rtv as follows (this assumes a tcsh shell; if you are using bash use export instead of setenv):

evan 82 > setenv RTV_SIMULATOR xicarus
evan 83 > rtv ftest1.tv tfifo_wrapper.v xvlog/netlist/FIFO_16x31_sim_netlist.v
...extraneous simulator output removed    
finished: 5000 words read. The FIFO was empty for 173 cycles, and full for 42 cycles
(Log)        (40281.9 ns) exit (code 0)
(Log)        (40281.9 ns) 16345 vectors executed (5000 passes, 0 fails)
evan 84 > setenv RTV_SIMULATOR xmodelsim
evan 85 > rtv ftest1.tv tfifo_wrapper.v xvlog/netlist/FIFO_16x31_sim_netlist.v
...extraneous simulator output removed    
# finished: 5000 words read. The FIFO was empty for 173 cycles, and full for 42 cycles
# (Log)        (40281.9 ns) exit (code 0)
# (Log)        (40281.9 ns) 16345 vectors executed (5000 passes, 0 fails)

In general, however, you will first need to run the individual steps manually to confirm that you are compiling everything necessary, with the correct switches, and to find out whether a default entry in simulators.conf is sufficient to run the test automatically with rtv. The mrun.sh script below carries out a manual run with ModelSim. When you are happy that you can carry out the simulation in this way, you should make any necessary adjustments to simulators.conf:

#!/bin/bash

# exit early on error
set -e

rm -rf work
vlib work
mtv ftest1.tv
vlog -y xvlog/unisims +libext+.v \
     xvlog/glbl.v    \
     test.v          \
     tfifo_wrapper.v \
     xvlog/netlist/FIFO_16x31_sim_netlist.v

# run a batch-mode simulation
vsim work.top work.glbl -c -do "run -all; quit"

# or run up a GUI for debug
# vsim work.top work.glbl