Repository of Examples
The FVM provides a repository of examples to showcase how a number of designs of increasing complexity can be verified with it. While a small number of designs have been developed by the methodology authors, the majority of the designs come from third-party libraries or developers.
Examples developed by the FVM team are provided in the FVM repository and covered by the same License as the rest of the provided code.
Examples developed by third parties have their own Free/Open-Source license and
are not provided in the FVM repository. Instead, git clone commands are run
from the relevant formal.py scripts in order to download the code.
In this section, the most relevant aspects of each example and how to approach
its verification are explained, but users can find the complete commented
properties and formal.py script in the FVM repository.
Suggested order
Examples are shown in this section in the suggested order for users, organized in difficulty levels:
Complexity |
Example |
Library / Developer |
Difficulty |
|---|---|---|---|
Trivial |
Counter |
FVM team |
Simple logic with a single path |
Priority arbiter |
Open Logic |
Static arbitration scheme with low complexity |
|
Dualcounter |
FVM team |
Structure remains simple despite duplication of logic |
|
Round robin arbiter |
Open Logic |
Moderate complexity due to round-robin policy |
|
Easy |
UART transmitter |
FVM team |
Finite state machine with sequential behavior |
AXI4-lite slave |
Open Logic |
Multiple interfaces must be simultaneously verified |
|
Medium |
SDRAM controller |
Joshua Bassett |
Control and timing behavior is dependent on the issued commands |
Linear interpolator |
FVM team |
Requires data integrity checks over 12 cycles |
|
Synchronous FIFO |
Open Logic |
Unbounded timing relation between read and write complicates proofs |
|
Intermediate |
32-bit divider |
GRLIB |
High arithmetic complexity significantly increases the state space |
Asynchronous FIFO |
Open Logic |
Multi-clock domain interactions |
|
Difficult |
IPv6 transceiver |
Pile of Cores |
Protocol complexity due to buffering and large headers with variable latency |
Trivial examples
Counter
The first trivial example was the counter, explained in Your first example.
As this design is trivial and a counter is not too suited to be modeled in
terms of input/output transactions, drom2psl is not used for this design.
Dualcounter
This dualcounter compares whether the most significant bit of two counters
is equal and shows the result in the output equalmsb. In the
reachability step, we can see that the equalmsb = '0' toggle is
unreachable, so we know that the output of this dualcounter will always be
correct.
Priority Arbiter
This example verifies a simple priority arbiter from the Open Logic library: https://github.com/open-logic/open-logic. In this example, many things can be checked simply with assertions, for example, that the output always has 0 or 1 bit asserted; that if there is no request, the output is 0; that if there is a request, the output is one-hot… Additionally, to fully test the functionality, we can create a function that returns the expected value for the arbiter result. This can be done either in the PSL file or in a VHDL package, although the latter is recommended (by defining the function in a VHDL package, the PSL files are simpler and we can easily reuse the function in simulation if needed).
Round Robin Arbiter
This example also belongs to the OpenLogic library. Again, it’s a simple
example, and some of the priority arbiter’s priorities can be reused. However,
this case is more complicated due to the round-robin algorithm and the control
logic, which requires a handshake. We can define a sequence for the handshake
since it will be reused quite a bit. Thus, this will be the first example
design in which we will use drom2psl.
Again, it is very convenient to define a function that models the round-robin behavior, since it’s very simple, so we’ll define it in a VHDL package.
Let’s try creating a sequence with
drom2psl. We want to ensure that if there has been a valid grant before, the
next grant (whenever it happens) will be successful. The function that predicts
the round robin needs the request and the last grant as inputs.
Wavedrom diagram for the last_valid_grant sequence. Cycles where the
clock signal has the two wavy lines mean “0 or more cycles”. The data type
for Out_Grant is specified in the JSON file
as an extra field, type, of the Out_Grant wavelane.
This is the generated PSL:
vunit last_valid_grant {
sequence last_valid_grant (
hdltype std_logic_vector(Width_g-1 downto 0) last_grant
) is {
((Out_Ready = '1') and (Out_Valid = '1') and (Out_Grant = last_grant))[*1]; -- 1 cycle
((Out_Ready = '0') and (Out_Valid = '0'))[*]; -- 0 or more cycles
((Out_Ready = '0') and (Out_Valid = '1'))[*]; -- 0 or more cycles
((Out_Ready = '1') and (Out_Valid = '1'))[*1] -- 1 cycle
};
}
With this, we can ensure that if we have a grant and then we have another
(which can happen in any possible cycle), the actual grant will be equal to the
output of the function that predicts it. We can verify that the property holds
true for all possible values by creating a symbolic constant last_grant
(explained in symbolic-constants in Techniques to reduce proof complexity).
assert always ( {last_valid_grant(last_grant)} |->
{Out_Grant = round_robin_arbiter(In_Req, last_grant) } ) abort Rst;
Easy examples
UART transmitter
This example is still simple, but it already includes a state machine,
albeit one with very linear behavior. With drom2psl we can define the input
transaction in the Precond group and the output transaction in the Cond
group (or Cond_non_overlap) if we want to generate a property with an
implication.
The generated PSL is as follows:
vunit uart_prop {
sequence uart_prop_Precond (
hdltype std_logic_vector(7 downto 0) d
) is {
((rd_en = '1'))[*1]; -- 1 cycle
((data = d))[*1] -- 1 cycle
};
sequence uart_prop_Cond_non_overlap (
hdltype std_logic_vector(7 downto 0) d;
hdltype std_logic parity_value
) is {
((TX = '0'))[*2]; -- 2 cycles
((TX = d(0)))[*2]; -- 2 cycles
((TX = d(1)))[*2]; -- 2 cycles
((TX = d(2)))[*2]; -- 2 cycles
((TX = d(3)))[*2]; -- 2 cycles
((TX = d(4)))[*2]; -- 2 cycles
((TX = d(5)))[*2]; -- 2 cycles
((TX = d(6)))[*2]; -- 2 cycles
((TX = d(7)))[*2]; -- 2 cycles
((TX = parity_value))[*2]; -- 2 cycles
((TX = '1'))[*2] -- 2 cycles
};
-- Relational operands between sequences may be, among others:
-- && : both must happen and last exactly the same number of cycles
-- & : both must happen, without any requirement on their durations
-- |-> : implication: both must happen, with the first cycle of the second occurring during the last cycle of the first
-- |=> : non-overlapping implication: both must happen, with the first cycle of the second occuring the cycle after the last cycle of the first
property uart_prop (
hdltype std_logic_vector(7 downto 0) d;
hdltype std_logic parity_value;
hdltype std_ulogic fvm_rst
) is
always ( uart_prop_Precond(d) |=> uart_prop_Cond_non_overlap(d, parity_value) ) abort (fvm_rst = '1');
}
From our PSL, the property can be used simply with:
assert_tx:
assert uart_prop(data_uart, calc_parity(data_uart), rst);
Caution
Note that the automatic property reset is active high. If your design uses an active low reset, you can use assert uart_prop(data_uart, calc_parity(data_uart), not rst);
This property verifies the entire functionality of the design, but it must be
noted that in the property’s precondition there is an output (rd_en) and,
when a precondition of a property depends on a design output, you must make
sure that that specific design output is activated when expected, to avoid
verification gaps. So, we need to verify somehow that rd_en is activated
when we want it to be; the example code shows how we do this in this case.
Note
For clarity of the wavedrom diagrams, a sequence where each bit lasts for
two clock cycles has been used for the documentation. The actual VHDL has a
generic, BIT_DURATION, which is set to a default value of 5. Depending
on the actual clock frequency used and the desired baudrate, this
BIT_DURATION generic could be set to different values, for example using
a design configuration, as explained in design_configurations.
Future versions of drom2psl may support stretching parts of the sequences
to accomodate sequences where some durations are set by generics, but in the
meantime engineers can always manually call drom2psl, modify the generated
PSL file by changing all the [*5] to [*BIT_DURATION], and add the
modified PSL source instead of the JSON source to the fvm framework
instance.
AXI-4 Lite Slave
This example also belongs to the OpenLogic library. This AXI Slave has a state machine that is no longer as linear (it has both TX and RX) and also has quite a few interfaces. This makes writing sequences for reuse more important, both in terms of speed and clarity.
A very interesting way to use drom2psl is to define one of the
functionalities of our design in wavedrom, in this case READ, and cover it
to see if it can occur:
With the wavedrom, we can generate assertions to write, either manually or
with drom2psl. In this case, it’s clear that three things happen:
1) The AR channel requests data from an address, 2) The data is read from the
register bank for the requested address, and 3) the R channel reads the data.
Furthermore, we can observe that in 1) the input transaction could be the
AR channel (let’s call it a) and the output transaction the read portion of
the register bank (let’s call it b); in 2) the input transaction is the
read portion of the register bank and the output transaction is the write
portion of the register bank (let’s call it c); and in 3) the input
transaction is the write portion of the register bank and the output
transaction is the R channel (let’s call it d). So, conceptually, we want
assertions that relate a -> b, b -> c, and c -> d, and with those
we verify all parts of the READ functionality.
Medium examples
SDRAM controller
This example comes from a popular Github repo: https://github.com/nullobject/sdram-fpga.
This is the first example where complexity reduction techniques are applied,
although it’s quite easy. There’s a startup delay of over 2000 cycles
parameterized with a generic, so it can simply be reduced without affecting
the design at all, as it’s just a startup delay. Regarding the design, it’s
similar to the AXI slave with many states (WRITE, READ, REFRESH…)
and many interfaces, so using sequences remains essential.
With drom2psl we can easily assert a write, with an input transaction
using the user interface and an output transaction using the SDRAM interface:
drom2psl generates the following code:
vunit write_seq {
sequence write_seq_Precond (
hdltype unsigned(22 downto 0) addr_value;
hdltype std_logic_vector(31 downto 0) data_value
) is {
((addr = addr_value) and (data = data_value) and (we = '1') and (req = '1') and (ack = '0'))[*1]; -- 1 cycle
((addr = addr_value) and (data = data_value) and (we = '1') and (req = '1') and (ack = '1'))[*1] -- 1 cycle
};
sequence write_seq_Cond (
hdltype unsigned(22 downto 0) addr_value;
hdltype std_logic_vector(31 downto 0) data_value
) is {
((sdram_a = addr_value(20 downto 8)) and (sdram_ba = addr_value(22 downto 21)) and (sdram_cs_n = '0') and (sdram_ras_n = '0') and (sdram_cas_n = '1') and (sdram_we_n = '1'))[*1]; -- 1 cycle
((sdram_a = "0010" & addr_value(7 downto 0) & '0') and (sdram_ba = addr_value(22 downto 21)) and (sdram_dq = data_value(31 downto 16)) and (sdram_cs_n = '0') and (sdram_ras_n = '1') and (sdram_cas_n = '0') and (sdram_we_n = '0'))[*1]; -- 1 cycle
((sdram_dq = data_value(15 downto 0)) and (sdram_cs_n = '0') and (sdram_ras_n = '1') and (sdram_cas_n = '1') and (sdram_we_n = '1'))[*1] -- 1 cycle
};
-- Relational operands between sequences may be, among others:
-- && : both must happen and last exactly the same number of cycles
-- & : both must happen, without any requirement on their durations
-- |-> : implication: both must happen, with the first cycle of the second occurring during the last cycle of the first
-- |=> : non-overlapping implication: both must happen, with the first cycle of the second occuring the cycle after the last cycle of the first
property write_seq (
hdltype unsigned(22 downto 0) addr_value;
hdltype std_logic_vector(31 downto 0) data_value;
hdltype std_ulogic fvm_rst
) is
always ( write_seq_Precond(addr_value, data_value) |-> write_seq_Cond(addr_value, data_value) ) abort (fvm_rst = '1');
}
And with just one line of code we can verify that the writes are working properly!
write_is_correct:
assert write_seq(addr_sig, data_sig, reset);
Linear interpolator
This designs makes a linear interpolation betwen two inputs, which are
fixed-point complex numbers defined as a VHDL record type that has two
fields: re (for the real part) and im (for the imaginary part). If the
input valid is asserted, the inferior and superior inputs are
interpolated for 12 cycles.
This example will be the first where assumptions are crucial. Due to how the design works, while it is interpolating:
its
inferiorandsuperiorinputs must not change, andits
validinput cannot be asserted
Thus, we use assumptions to ensure that if valid is asserted, the
superior and inferior inputs remain stable during the 12 interpolation
cycles. Similarly, if valid is asserted, it will not be re-asserted in the
next 12 cycles.
-- Assume inferior and superior do not change during the 12 cycles output
assume always ({valid} |=> {stable(inferior)[*12]}) abort rst;
assume always ({valid} |=> {stable(superior)[*12]}) abort rst;
-- Assume the interpolator will not be activated on the 12 cycles after
-- being activated (the interpolator needs 12 cycles to output the 12
-- interpolated data)
assume_not_activated_when_interpolating:
assume always {valid} |=> {(not valid)[*12]};
In addition to this, this example is the first with some computational complexity, due to its 10-bit by 5-bit multipliers.
Using drom2psl we can generate a property with these transactions:
And we can use the property this way (note that interpolator_predict is a
function that we define to predict the interpolated values):
interpolator_works_correctly:
assert interpolator_prop(input_tran, interpolator_predict(input_tran), rst);
Synchronous FIFO
This example also belongs to the OpenLogic library. This example is one of the most important, since FIFOs and memories are a recurring element in most designs, and will be responsible for a large part of their difficulty.
First, we need to understand why memory complicates our designs so much. Taking the example of a relatively small memory, 256x8 bits, that already gives us 2048 state bits, which is quite a lot in itself. Furthermore, with 256 memory locations, if it’s accessed once per cycle, it will take at least 256 cycles for the properties to verify that all memory locations are accessible, resulting in high latency. In fact, memory locations are often more detrimental due to the latency they generate than the data size, since the latter can be simplified by using symbolic constants.
The section Techniques to reduce proof complexity offers several tips for dealing with memory. The PSL code for this FIFO also reveals which techniques are used.
Intermediate examples
32-bit divider
This example belongs to the GRLIB IP library from Gaisler:
https://www.gaisler.com/grlib-ip-library. This is the first example in which
several libraries are used; the sources must be added in the order the
libraries are compiled, as explained in
fvm.FvmFramework.add_vhdl_source().
This case presents enormous computational complexity: a 64-bit dividend divided by a 32-bit divisor. This will make it impossible to verify data integrity for all combinations (296), just as it would be impossible to do so with simulation.
Therefore, to verify data integrity, we need to reduce the state space, that is, the possible combinations. The best way to do this is to use assumptions, although it would be advisable for the assumptions to occupy random bins so that different values are used during different executions. With 224 combinations, the property takes 1 hour. With 221 combinations, it takes 10 minutes.
-- We can assume some bits are always constant, although we should
-- randomize it for different executions
assume always ( divi.y(29 downto 20) = "0101010101" );
assume always ( divi.op1(29 downto 20) = "1101011010" );
assume always ( divi.op2(29 downto 20) = "0100001110" );
-- Or we can directly assume ranges
assume always ( 10000 > signed(divi.y) >= -10000);
assume always ( 20000 > signed(divi.op1) >= 0);
Asynchronous FIFO
This example also belongs to the OpenLogic library.
If we specify the clock and reset domains in formal.py, the steps
resets and clocks are very useful for checking clock and reset
domain crossing.
Regarding FIFO handling, it’s the same as the synchronous FIFO. The difference lies in the clocks. Until now, we defined the default clock at the top of the PSL file, and all properties used that clock. Now we’ll have to define the clock in each property (although PSL allows for more than one clock in a property, it’s not recommended, and most tools probably won’t support it):
-- Assert the FIFO level calculated on the write side is always
-- less than the depth of the FIFO
assert_in_level_within_bounds:
assert always ( (unsigned(In_Level) <= Depth_g) abort In_Rst) @rising_edge(In_Clk);
So, what do we do if we want a property that relates signals from different clock domains? Well, we can pass either of the two signals through a synchronizer (it can be a single-stage synchronizer because in formal verification there is no metastability) and implement the property on the clock of both. But what if we want an input to only move within one clock domain, something the formal solver doesn’t necessarily know? Some tools allow us to specify clock domains for inputs, but in PSL we can do something similar by creating a signal and assigning it to the desired clock domain (without specifying any value), and then making an assumption relating that signal to the original:
signal out_ready_reg, out_ready_reg_in_clk : std_logic;
process (In_Clk)
begin
if rising_edge(In_Clk) then
out_ready_reg_in_clk <= out_ready_reg;
end if;
end process;
assume always (Out_Ready = out_ready_reg_in_clk);
Difficult examples
IPv6 transceiver
This example belongs to the PoC (Pile of Cores) library: https://github.com/VLSI-EDA/PoC. This is the most complicated of all, both because of its size and because it has some of the complexity elements at the same time: bigger state space, a number of memories inside, and a latency which cannot be arbitrarily reduced due to the fact that the frames require specific headers. This results in greater verification efforts, requiring more and more complex properties. Therefore, using sequences and reusing code becomes more necessary.