Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect always_comb behavior, possible sensitivity issue #872

Closed
mkorbel1 opened this issue Feb 22, 2023 · 16 comments
Closed

Incorrect always_comb behavior, possible sensitivity issue #872

mkorbel1 opened this issue Feb 22, 2023 · 16 comments

Comments

@mkorbel1
Copy link

I'm working on finalizing another generator for ROHD that uses CIRCT instead of the native ROHD SystemVerilog generator. I've used both generators to create the same design in SystemVerilog to be run through iverilog. The original implementation passes, and the new one generated by CIRCT fails. I don't see an obvious reason why and so I'm suspecting a bug in iverilog.

I'm running on:
Icarus Verilog version 11.0 (stable) (s20150603-1127-gcf0bf4d9)

The original passing version:

// 
// Generated by ROHD - www.github.com/intel/rohd
// Generation time: 2023-02-22 13:51:52.465253
// ROHD Version: 0.4.2
// 

module SimplerExample(
input logic [7:0] a,
output logic [7:0] b
);
logic [7:0] inner;
//  combinational
always_comb begin
  inner = 8'hf;
  b = (a & inner);
  inner = 8'h0;
end

endmodule : SimplerExample
module tb;
logic [7:0] a;
logic [7:0] b;
SimplerExample dut(.a(a), .b(b));
initial begin
#1
a = 255;
#2
if(b !== 15) $error($sformatf("Expected b=0xf, but found b=0x%x with inputs {a: 255}", b));
#8
$finish;
end
endmodule

The new failing version, which should be equivalent:

// Generated by CIRCT firtool-1.31.0
module SimplerExample(	// tmp_circt/tmp_circt388006263.mlir:7:1
  input  [7:0] a,	// tmp_circt/tmp_circt388006263.mlir:7:27
  output [7:0] b	// tmp_circt/tmp_circt388006263.mlir:7:39
);

  wire [7:0] _GEN;	// tmp_circt/tmp_circt388006263.mlir:22:17
  reg  [7:0] _GEN_0;	// tmp_circt/tmp_circt388006263.mlir:12:6
  reg  [7:0] _GEN_1;	// tmp_circt/tmp_circt388006263.mlir:13:6
  always_comb begin	// tmp_circt/tmp_circt388006263.mlir:14:1
    _GEN_0 = 8'hF;	// tmp_circt/tmp_circt388006263.mlir:9:6, :15:1
    _GEN_1 = _GEN;	// tmp_circt/tmp_circt388006263.mlir:16:1, :22:17
    _GEN_0 = 8'h0;	// tmp_circt/tmp_circt388006263.mlir:10:6, :17:1
  end // always_comb
  assign _GEN = a & _GEN_0;	// tmp_circt/tmp_circt388006263.mlir:19:10, :22:17
  assign b = _GEN_1;	// tmp_circt/tmp_circt388006263.mlir:20:6, :23:1
endmodule


module tb;
logic [7:0] a;
logic [7:0] b;
SimplerExample dut(.a(a), .b(b));
initial begin
#1
a = 255;
#2
if(b !== 15) $error($sformatf("Expected b=0xf, but found b=0x%x with inputs {a: 255}", b));
#8
$finish;
end
endmodule

The failing version prints the following:

ERROR: tmp_test/tmp_test66809390.sv:28: Expected b=0xf, but found b=0x00 with inputs {a: 255}
       Time: 3  Scope: tb

This is based on a unit test from ROHD found here: https://github.com/intel/rohd/blob/main/test/comb_math_test.dart

@larsclausen
Copy link
Collaborator

I'm not sure this is not undefined behavior. This looks almost exactly like the example given in the TRM in section 4.8 "Race conditions".

@mkorbel1
Copy link
Author

That section you describe appears to be describing race conditions between a hardware assignment and non-synthesizable procedural code in an initial block. This example that I included has only synthesizable code in it, where the code in the always_comb block should produce hardware equivalent to "procedurally" executing the blocking assignments in order. I think the expected hardware behavior is unambiguous in this case?

@mkorbel1
Copy link
Author

Or if you meant a race between the mini-testbench and the hardware, I think the delay after assignment and before the check should resolve that

@larsclausen
Copy link
Collaborator

I don't think there is any differences in the rules for execution of blocking statements regardless of whether is initial, always_comb or any of the other procedural blocks.

@mkorbel1
Copy link
Author

What is the race you think might exist in this example? And why does the first example pass, and the second example fail?

@red0bear
Copy link

SystemVerilog on icarus verilog still recent work till i remember , you could try use verilog declarations to combinational e clocked pulse and compare results to give some clean view about problem if possible.

@larsclausen
Copy link
Collaborator

The first example has a single process. Within a single process the order of execution is defined. So there is fully deterministic.

The second example has two processes. These are running concurrently and the order of execution is non-deterministic. The TRM says that when running a blocking statement "execution may then continue with
the next sequential statement or with other Active or Reactive events.". So both executing the next sequential statement, which is _GEN_0 = 8'h0; or picking _GEN = a & _GEN_0; from the event queue and executing it should be valid implementations.

@mkorbel1
Copy link
Author

Don't the assign statements act as reactive non-blocking assignments? The full paragraph from 4.9.3 says:

When the process is returned (or if it returns immediately if no delay is specified), the process performs the
assignment to the left-hand side and enables any events based upon the update of the left-hand side. The
values at the time the process resumes are used to determine the target(s). Execution may then continue with
the next sequential statement or with other Active or Reactive events.

I would interpret this to mean that when the always_comb executes, the order of operations should be:

_GEN_0 = 8'hF;
// now _GEN_0 assignment enables events based on its update, namely:
assign _GEN = a & _GEN_0;
// "execution may now continue with the next sequential statement"
 _GEN_1 = _GEN;
// now again the other assign statement is enabled
assign b = _GEN_1;
// and finally, we can continue with the end of the block
_GEN_0 = 8'h0;

Effectively, this causes b = a & 8'hF

Am I misunderstanding something here?

@larsclausen
Copy link
Collaborator

Maybe my understanding is wrong, I don't know. But enables any events based upon the update of the left-hand side should mean that the continuous assignment gets placed in the active region. Not that it immediately executes.

@mkorbel1
Copy link
Author

I feel like an interpretation implying that the above design can have multiple behavioral interpretations opens the concern that any mixing of assign and always_comb on related signals could produce unexpected results. I looked through other parts of the document to try to find a firm answer.

In 10.3.2 on continuous assignments:

The continuous assignment statement shall place a continuous assignment on a net or variable data type. [...] Assignments on nets or variables shall be continuous and automatic. In other words, whenever an operand in the right-hand expression changes value, the whole right-hand side shall be evaluated. If the new value is different from the previous value, then the new value shall be assigned to the left-hand side.

I read this to imply that it should functionally behave like a constantly present assignment of combinational logic.

In 4.9.2 on procedural continuous assignments:

A procedural continuous assignment (which is the assign or force statement; see 10.6) corresponds to a process that is sensitive to the source elements in the expression. When the value of the expression changes, it causes an active update event to be added to the event region, using current values to determine the target.

This section sounds like your explanation of it being just placed in a region, however, in section 4.3:

Processes are sensitive to update events. When an update event is executed, all the processes that are sensitive to that event are considered for evaluation in an arbitrary order.

I think this could imply that the simulator could instead implement the order in the functionally equivalent following way which still permits the arbitrary execution order? Though something doesn't fully add up here either since it seems like we could get an infinite loop with this approach. Perhaps the interpretation of "previous value" is key here, since I think you could perhaps argue the "previous value" of _GEN was zero?

// first line of always_comb executes
_GEN_0 = 8'hF;

// _GEN_0 change triggers assignment to _GEN, which is scheduled in the region "using current values", thus save _GEN_0=8'hF, a=15'hFF for later
// TODO #1: assign _GEN = a & _GEN_0;

// second line of always_comb executes, _GEN_1 = 0
_GEN_1 = _GEN

// _GEN_1 triggers assignment to b, which is scheduled with saved value _GEN_1=0 for later
// TODO #2: assign b = _GEN_1;

// final line of always_comb executes, _GEN_0 = 0
_GEN_0 = 8'h0;

// ?? does this re-trigger `assign _GEN = a & _GEN_0;` again?  cause an infinite loop?  or is there something stopping that?

// execute TODO #1 with then-current value, _GEN=8'hF
assign _GEN = a & _GEN_0;

// execute TODO #2 with then-current value, _GEN_1=0
assign b = _GEN_1;

// always_comb is re-triggered via an update event since _GEN is a sensitivity
 _GEN_0 = 8'hF;
 
// again, trigger the assignment, values are the same as last time
// TODO #3: assign _GEN = a & _GEN_0;

// second line of always_comb executes, _GEN_1 = 8'hF, a new value
_GEN_1 = _GEN;

// _GEN_1 triggers assignment to b, which is scheduled with saved value _GEN_1=8'hF for later
// TODO #4: assign b = _GEN_1;

// final line of always_comb executes, _GEN_0 = 0
_GEN_0 = 8'h0;

// ?? again, does this retrigger `assign _GEN = a & _GEN_0;`?

// execute TODO #3 with then-current value, but there is no change, no re-trigger of always_comb
assign _GEN = a & _GEN_0;

// execute TODO #4 with then-current value, update b to 8'hF
assign b = _GEN_1;
 

@larsclausen
Copy link
Collaborator

The with (System)Verilog is that it very easy to create code that has non-deterministic behavior in simulation.

There is also this discussion that suggest that there is a unwritten rule to not interrupt a begin end block as long as there are no explicit delay instructions as that will create a whole new set of issues.

@martinwhitaker
Copy link
Collaborator

There is also this discussion that suggest that there is a unwritten rule to not interrupt a begin end block as long as there are no explicit delay instructions as that will create a whole new set of issues.

From that discussion

There is the rigamarole about allowing a begin-end to suspend in favor of executing another process, and then resume later. If taken to extremes, this would make the language very hard to use, since many common practices would become nondeterministic. This seems to have been stated partly to allow for simulator optimizations involving inlining of one process into another. For example, when an always block updates a variable, a simulator might immediately update a net
that is assigned from that variable. This can be viewed as suspending the always block, executing the continuous assignment, and then resuming the always block.

vvp does exactly that. However, depending on the complexity of the RHS expression in the continuous assignment, it may suspend the execution of the continuous assignment before it updates the target net. This is an optimisation that avoids (fully) evaluating the expression multiple times if more than one of the expression's primary operands changes in the same time step.

@mkorbel1
Copy link
Author

I ran a couple more experiments and it seems like the always_comb doesn't even properly detect a as a sensitivity.

Keeping the testbench the same:

module tb;
  logic [7:0] a;
  logic [7:0] b;
  SimplerExample dut(.a(a), .b(b));
  initial begin
  #1
  a = 255;
  #2
  if(b !== 15) $error($sformatf("Expected b=0xf, but found b=0x%x with inputs {a: 255}", b));
  #8
  $finish;
  end
endmodule

I added a $display in the always_comb and got the following:
Module:

module SimplerExample(	// tmp_circt/tmp_circt388006263.mlir:7:1
    input  [7:0] a,	// tmp_circt/tmp_circt388006263.mlir:7:27
    output [7:0] b	// tmp_circt/tmp_circt388006263.mlir:7:39
  );
  
    wire [7:0] _GEN;	// tmp_circt/tmp_circt388006263.mlir:22:17
    reg  [7:0] _GEN_0;	// tmp_circt/tmp_circt388006263.mlir:12:6
    reg  [7:0] _GEN_1;	// tmp_circt/tmp_circt388006263.mlir:13:6
    always_comb begin	// tmp_circt/tmp_circt388006263.mlir:14:1
      $display($sformatf("@%t: _GEN=%x", $time, _GEN));
      
      _GEN_0 = 8'hF;	// tmp_circt/tmp_circt388006263.mlir:9:6, :15:1
      _GEN_1 = _GEN;	// tmp_circt/tmp_circt388006263.mlir:16:1, :22:17
      _GEN_0 = 8'h0;	// tmp_circt/tmp_circt388006263.mlir:10:6, :17:1
    end // always_comb
    assign _GEN = a & _GEN_0;	// tmp_circt/tmp_circt388006263.mlir:19:10, :22:17
    assign b = _GEN_1;	// tmp_circt/tmp_circt388006263.mlir:20:6, :23:1
endmodule

Results:

bug.sv:11: warning: System task ($display) cannot be synthesized in an always_comb process.
@                   0: _GEN=xx
@                   0: _GEN=00
ERROR: bug.sv:30: Expected b=0xf, but found b=0x00 with inputs {a: 255}
       Time: 3  Scope: tb

Note that there's only execution at time 0, even though the TB drives a at time 1. By commenting out the line that resets _GEN_0 to 0, I get these results:

Module:

module SimplerExample(	// tmp_circt/tmp_circt388006263.mlir:7:1
    input  [7:0] a,	// tmp_circt/tmp_circt388006263.mlir:7:27
    output [7:0] b	// tmp_circt/tmp_circt388006263.mlir:7:39
  );
  
    wire [7:0] _GEN;	// tmp_circt/tmp_circt388006263.mlir:22:17
    reg  [7:0] _GEN_0;	// tmp_circt/tmp_circt388006263.mlir:12:6
    reg  [7:0] _GEN_1;	// tmp_circt/tmp_circt388006263.mlir:13:6
    always_comb begin	// tmp_circt/tmp_circt388006263.mlir:14:1
      $display($sformatf("@%t: _GEN=%x", $time, _GEN));
      
      _GEN_0 = 8'hF;	// tmp_circt/tmp_circt388006263.mlir:9:6, :15:1
      _GEN_1 = _GEN;	// tmp_circt/tmp_circt388006263.mlir:16:1, :22:17
    //   _GEN_0 = 8'h0;	// tmp_circt/tmp_circt388006263.mlir:10:6, :17:1
    end // always_comb
    assign _GEN = a & _GEN_0;	// tmp_circt/tmp_circt388006263.mlir:19:10, :22:17
    assign b = _GEN_1;	// tmp_circt/tmp_circt388006263.mlir:20:6, :23:1
endmodule

Results:

bug.sv:11: warning: System task ($display) cannot be synthesized in an always_comb process.
@                   0: _GEN=xx
@                   0: _GEN=0x
@                   1: _GEN=0f

Now at time 1 the always_comb re-executes.

Doesn't this imply that this is a sensitivity issue rather than a race or non-determinism between processes?

@martinwhitaker
Copy link
Collaborator

Doesn't this imply that this is a sensitivity issue rather than a race or non-determinism between processes?

No, because a is not referenced directly by any expression in the always_comb block. The always_comb block should only be sensitive to _GEN.

What is happening is:

  1. The always_comb block is automatically executed at time 0.
  2. The simulator executes _GEN_0 = 8'hF. This adds an evaluation event for _GEN to the active queue.
  3. The simulator now has a choice of either executing the next statement in the always_comb block or of executing the evaluation event for _GEN. In this case, vvp notices that the expression for _GEN is a logical operation and defers execution of the evaluation event (moving it to the back of the active queue), so chooses to execute the next statement, _GEN_1=_GEN. This adds an evaluation event for b to the active queue.
  4. The simulator now has a choice of either executing the next statement in the always_comb block or of executing the evaluation event for _GEN or of executing the evaluation event for b. In this case vvp notices that the expression for b is a simple assignment, so chooses to execute that evaluation event, and. if the value of 'b' changes, then chooses to execute the update event for 'b'.
  5. The simulator now has a choice of either executing the next statement in the always_comb block or of executing the evaluation event for _GEN. vvp has already moved that evaluation event to the back of the queue, so chooses to execute the next statement, _GEN_0=8'h0.
  6. There are no more statements in the always_comb block. so that process becomes inactive. The only active event left is the evaluation event for _GEN, so that is executed next. The first time round, the value of _GEN changes from 8'hx to 8'h0, so the update event for _GEN is executed, which re-activates the always_comb block (going back to step 1). The second time round, the value of _GEN doesn't change, and hence the always_comb block is never triggered again.

I believe this behaviour is entirely consistent with the scheduling semantics given in the IEEE standard.

You might want to test your code at https://www.edaplayground.com/. I think you'll find that other simulators behave the same way in this case.

@mkorbel1
Copy link
Author

Thank you for your detailed explanation. I'm pretty surprised and I haven't been able to find a contrasting point in the spec so far.

I'm also having a hard time coming up with a code generation solution which can treat logic feeding into an always_comb block as a sensitivity to that logic in the way I expected. Even with something like always @(<list of expected sensitivities>), there's still the non-determinism on execution order of the statements in the always block vs. the order of the assign statements. Even worse, said sensitivities may exist outside the scope of the current module.

@martinwhitaker
Copy link
Collaborator

As the current behaviour both conforms to the IEEE standard and matches the behaviour of other simulators, closing as invalid.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants