Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

frontends/desc64: Transition Regbus master to AXI master and improve performance #2

Closed

Conversation

Freakness109
Copy link
Collaborator

@Freakness109 Freakness109 commented Aug 7, 2022

This commit series transitions the descriptor frontend from using regbus for its master interface to using AXI.
Tested under Questa vsim 2019.3.

@Freakness109 Freakness109 force-pushed the desc64_regbus_to_axi branch from 4a2dbeb to 3f1d584 Compare August 7, 2022 17:25
@Freakness109 Freakness109 marked this pull request as ready for review August 7, 2022 17:31
@micprog
Copy link
Member

micprog commented Aug 7, 2022

Please make sure to also update src/systems/cva6_desc accordingly to avoid any incompatibilities

@Freakness109 Freakness109 force-pushed the desc64_regbus_to_axi branch from 9f67d5e to 3e21e56 Compare August 7, 2022 19:33
@@ -16,6 +16,9 @@ module idma_desc64_top #(
/// or see the idma backend documentation for more details
parameter type reg_rsp_t = logic,
parameter type reg_req_t = logic,
/// axi interface types. Use the AXI_TYPEDEF_ALL macros to define the types
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe comment here that this is the AXI bus fetching the descriptors. Might be helpful as your DMA will have two different AXI buses at the top

master_req.ar.addr = submitter_current_addr_q;
master_req.ar.len = 'd3; /* a descriptor is 4 words long (3 + 1) */
master_req.ar.size = 'b011; /* 8 */
master_req.ar.burst = BURST_INCR;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

directly reference the type here: axi_pkg::BURST_INCR and remove the import above.

master_req.ar.id = '0;
master_req.ar.addr = submitter_current_addr_q;
master_req.ar.len = 'd3; /* a descriptor is 4 words long (3 + 1) */
master_req.ar.size = 'b011; /* 8 */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you derive this from the DW?

// TODO: make sure that a burst does not cross a 4-KB boundary
always_comb begin : proc_submitter_axi_ar
master_req.ar = '0;
master_req.ar.id = '0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could make sense to be controllable via a parameter.

// ignore the b channel, we have no error reporting atm
assign master_req.b_ready = '1;

always_comb begin : proc_feedback_axi_aw
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

id, and size as mentioned above

always_comb begin : proc_feedback_axi_w
master_req.w = '0;
master_req.w.data = ~'0;
master_req.w.strb = ~'0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'1


always_comb begin : proc_feedback_axi_w
master_req.w = '0;
master_req.w.data = ~'0;
Copy link
Collaborator

@thommythomaso thommythomaso Aug 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'x

master_req.w = '0;
master_req.w.data = ~'0;
master_req.w.strb = ~'0;
master_req.w.last = '1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the width is known: 1'b1. Please check all the other occurrences of this.

@@ -114,8 +112,8 @@ module dma_desc_wrap #(
.idma_rsp_t ( idma_rsp_t ),
.idma_eh_req_t ( idma_pkg::idma_eh_req_t ),
.idma_busy_t ( idma_pkg::idma_busy_t ),
.axi_req_t ( axi_slv_req_t ),
Copy link
Collaborator

@thommythomaso thommythomaso Aug 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe think about if error handling could be valuable in your case. Otherwise, we can leave it disconnected

@thommythomaso
Copy link
Collaborator

A few additional comments:

  • Can you do an entry in jobs.json with the corresponding testbench and synthesis parameters? You can just add man_simple pointing to /dev/zero in your case. Be sure your TB exits with a $finish.
  • FSMs to control AXI buses are usually destined to achieve poor performance only. Be sure to check the bus utilization of your manager interface carefully. Optimally there should be no state changes needed to handshake a signal.

@Freakness109 Freakness109 force-pushed the desc64_regbus_to_axi branch 5 times, most recently from a85fbbe to aa8d65c Compare August 15, 2022 10:32
@Freakness109 Freakness109 force-pushed the desc64_regbus_to_axi branch 4 times, most recently from 354527e to 537609d Compare August 24, 2022 15:41
@micprog
Copy link
Member

micprog commented Aug 25, 2022

  • Can you do an entry in jobs.json with the corresponding testbench and synthesis parameters? You can just add man_simple pointing to /dev/zero in your case. Be sure your TB exits with a $finish.

Make sure to run make gen_ci(and possibly other targets) and commit the changes, as the CI checks that the commit is clean, i.e. any generated files correspond to what they're supposed to be generated to. Currently, the github CI is failing due to a mismatch in the generated .gitlab-ci.yml.

@Freakness109 Freakness109 marked this pull request as draft August 25, 2022 19:47
@Freakness109 Freakness109 force-pushed the desc64_regbus_to_axi branch 2 times, most recently from c9714ae to d84c804 Compare September 2, 2022 11:02
Axel Vanoni and others added 7 commits October 11, 2022 18:26
Now that we changed the frontend, we also need to update its
synth module.
It lacks all verification instrumentation. For verification use _top.
The ready-valid signals were one cycle behind the data, leading to
wrong data being pushed to the fifo under continuous writes.
With the backend supporting handshaking, it isn't needed any longer.
Now featuring:
- Data-Flow design
- Bug-free(TM) implementation
@Freakness109 Freakness109 changed the title frontends/desc64: Transition Regbus master to AXI master frontends/desc64: Transition Regbus master to AXI master, improve performance and add prefetching Oct 11, 2022
@Freakness109 Freakness109 changed the title frontends/desc64: Transition Regbus master to AXI master, improve performance and add prefetching frontends/desc64: Transition Regbus master to AXI master and improve performance Oct 11, 2022
thommythomaso and others added 16 commits October 11, 2022 19:20
- These parameters provoke a crash on questasim:

vsim tb_idma_desc64_bench -t 1ps \
    -GNumberOfTests=10 \
    -GChainedDescriptors=10 \
    -GSimulationTimeoutCycles=200000 \
    -GNumContiguous=20 \
    -GMemLatency=16 \
    -GRAWCouplingAvail=0 \
    -GNSpeculation=20 \
    -GTransferLength=256 \
    -GDoIRQ=1 \
    +trace_file=trace-debug.log \
    -voptargs=+acc
This fixes some errors with Questa somehow logging the axi responses
as Z and allows us to measure utilization over the entire duration of
a transfer (not only the time the iDMA is busy, but also count CPU time)

Additionally, relay the NSpeculation, InputFifoDepth and PendingFifo
parameters up to the wrapper instantiation site.
As there still is a logic loop, it will probably fail to compile. Add it
none the less.
Loop was flush->speculation_valid->address to compare->flush.
We now flush the speculation fifo one cycle later while holding
the ready/valid at zero, in order not to miss anything.
In order to synthesize the cva6-desc64 system package, we need more
modules that weren't on the list yet. Add them.
On flush, we had one extra cycle until the next AR goes out. Fix that.
@Freakness109
Copy link
Collaborator Author

Now with the latest changes from the dataflow branch. What still needs to be done:

  • Evaluate whether to keep the testbench without the backend
  • Maybe add the Linux driver from Gitlab here as a patch file (and remove the questionable hotqueueing)
  • Move the bare-metal driver from the cva6-repo on Gitlab to here
  • Maybe test more size/prefetching combinations to tease out more potential deadlocks and logic loops.
  • Need to check if I messed something up with the linting rules
  • More comments in the implementation would be nice

Copy link
Collaborator

@thommythomaso thommythomaso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An initial pass looks good. If it is ok for you, I will close this PR, take the branch back into the pulp-platform namespace, rebase and merge it there.

@thommythomaso
Copy link
Collaborator

Replaced by #18.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants