SoC Design
112 Autumn
07
黃鉦淳
王語
何佳玲
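This is an R&D project based on the Caravel SoC. To speed up memory reads and offload the CPU, the project implements several IPs and memory control circuits in the user project area, together with the corresponding firmware, to accelerate a set of workloads.
Development items
- Replace CPU execution of compute-intensive operations (FIR, matrix multiplication, and quicksort) with dedicated IPs
- Design a DMA so that the dedicated IPs can access memory directly
- Design a Memory Arbiter
- Optimize the firmware to reduce external memory accesses and increase hardware utilization
- Design a Data FIFO with a prefetch function, used as a cache
Task list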
- UART transfer
- FIR
- Matrix multiply
- Quick sorting
Accelerator IP
- Direct Memory Access(DMA) with AXI-stream protocol
- Block RAM controller with AXI-stream protocol
- Memory Arbiter
- FIFO
- Domain specific IP for FIR, Matrix multiply, Quick sorting (ASIC in the picture below)
- 112 SOC Lab - Final Project
int main() {
    // mprj_init
    // la_init
    // uart_interrupt_init
    // apply_config
    for (int i = 0; i < TIMES_RERUN; i++) {
        // Workload
        fir(); matmul(); qsort();
        // Workload_check
        fir_check(); matmul_check(); qsort_check();
    }
}
- Generate the project firmware and run the simulation on Vivado
cd ~/testbench
make
make[1]: Entering directory '~/testbench'
make[1]: Leaving directory '~/testbench'
Reading main.hex
main.hex loaded into memory
Memory 5 bytes = 0x6f 0x00 0x00 0x0b 0x13
VCD info: dumpfile main.vcd opened for output.
Times = 1/1 - UART
Times = 1/1 - Hardware(check)
Times = 1/1 - Hardware
tx_data[0] = 1'b0
tx_data[1] = 1'b0
tx_data[2] = 1'b0
tx_data[3] = 1'b0
tx_data[4] = 1'b0
tx_data[5] = 1'b0
tx_data[6] = 1'b0
tx_data[7] = 1'b0
tx complete - data: 8'd000, 8'h00
Test start - FIR
Test end - FIR
Test start - matmul
Test end - matmul
Test start - qsort
Test end - qsort
Test check start - FIR
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 0, golden ans[15:0] = 0
FIR passed - pattern # 0
ans[31:16] = 65535, golden ans[31:16] = 65535
ans[15:0] = 65526, golden ans[15:0] = 65526
FIR passed - pattern # 1
ans[31:16] = 65535, golden ans[31:16] = 65535
ans[15:0] = 65507, golden ans[15:0] = 65507
FIR passed - pattern # 2
ans[31:16] = 65535, golden ans[31:16] = 65535
ans[15:0] = 65511, golden ans[15:0] = 65511
FIR passed - pattern # 3
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 35, golden ans[15:0] = 35
FIR passed - pattern # 4
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 158, golden ans[15:0] = 158
FIR passed - pattern # 5
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 337, golden ans[15:0] = 337
FIR passed - pattern # 6
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 539, golden ans[15:0] = 539
FIR passed - pattern # 7
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 732, golden ans[15:0] = 732
FIR passed - pattern # 8
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 915, golden ans[15:0] = 915
FIR passed - pattern # 9
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 1098, golden ans[15:0] = 1098
FIR passed - pattern #10
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 1281, golden ans[15:0] = 1281
FIR passed - pattern #11
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 1464, golden ans[15:0] = 1464
FIR passed - pattern #12
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 1647, golden ans[15:0] = 1647
FIR passed - pattern #13
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 1830, golden ans[15:0] = 1830
FIR passed - pattern #14
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 2013, golden ans[15:0] = 2013
FIR passed - pattern #15
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 2196, golden ans[15:0] = 2196
FIR passed - pattern #16
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 2379, golden ans[15:0] = 2379
FIR passed - pattern #17
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 2562, golden ans[15:0] = 2562
FIR passed - pattern #18
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 2745, golden ans[15:0] = 2745
FIR passed - pattern #19
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 2928, golden ans[15:0] = 2928
FIR passed - pattern #20
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 3111, golden ans[15:0] = 3111
FIR passed - pattern #21
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 3294, golden ans[15:0] = 3294
FIR passed - pattern #22
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 3477, golden ans[15:0] = 3477
FIR passed - pattern #23
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 3660, golden ans[15:0] = 3660
FIR passed - pattern #24
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 3843, golden ans[15:0] = 3843
FIR passed - pattern #25
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 4026, golden ans[15:0] = 4026
FIR passed - pattern #26
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 4209, golden ans[15:0] = 4209
FIR passed - pattern #27
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 4392, golden ans[15:0] = 4392
FIR passed - pattern #28
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 4575, golden ans[15:0] = 4575
FIR passed - pattern #29
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 4758, golden ans[15:0] = 4758
FIR passed - pattern #30
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 4941, golden ans[15:0] = 4941
FIR passed - pattern #31
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 5124, golden ans[15:0] = 5124
FIR passed - pattern #32
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 5307, golden ans[15:0] = 5307
FIR passed - pattern #33
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 5490, golden ans[15:0] = 5490
FIR passed - pattern #34
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 5673, golden ans[15:0] = 5673
FIR passed - pattern #35
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 5856, golden ans[15:0] = 5856
FIR passed - pattern #36
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 6039, golden ans[15:0] = 6039
FIR passed - pattern #37
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 6222, golden ans[15:0] = 6222
FIR passed - pattern #38
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 6405, golden ans[15:0] = 6405
FIR passed - pattern #39
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 6588, golden ans[15:0] = 6588
FIR passed - pattern #40
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 6771, golden ans[15:0] = 6771
FIR passed - pattern #41
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 6954, golden ans[15:0] = 6954
FIR passed - pattern #42
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 7137, golden ans[15:0] = 7137
FIR passed - pattern #43
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 7320, golden ans[15:0] = 7320
FIR passed - pattern #44
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 7503, golden ans[15:0] = 7503
FIR passed - pattern #45
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 7686, golden ans[15:0] = 7686
FIR passed - pattern #46
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 7869, golden ans[15:0] = 7869
FIR passed - pattern #47
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 8052, golden ans[15:0] = 8052
FIR passed - pattern #48
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 8235, golden ans[15:0] = 8235
FIR passed - pattern #49
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 8418, golden ans[15:0] = 8418
FIR passed - pattern #50
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 8601, golden ans[15:0] = 8601
FIR passed - pattern #51
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 8784, golden ans[15:0] = 8784
FIR passed - pattern #52
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 8967, golden ans[15:0] = 8967
FIR passed - pattern #53
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 9150, golden ans[15:0] = 9150
FIR passed - pattern #54
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 9333, golden ans[15:0] = 9333
FIR passed - pattern #55
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 9516, golden ans[15:0] = 9516
FIR passed - pattern #56
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 9699, golden ans[15:0] = 9699
FIR passed - pattern #57
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 9882, golden ans[15:0] = 9882
FIR passed - pattern #58
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 10065, golden ans[15:0] = 10065
FIR passed - pattern #59
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 10248, golden ans[15:0] = 10248
FIR passed - pattern #60
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 10431, golden ans[15:0] = 10431
FIR passed - pattern #61
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 10614, golden ans[15:0] = 10614
FIR passed - pattern #62
ans[31:16] = 0, golden ans[31:16] = 0
ans[15:0] = 10797, golden ans[15:0] = 10797
FIR passed - pattern #63
Test check end - FIR
Test check start - matmul
ans = 62, golden ans = 62
matmul passed - pattern #00
ans = 68, golden ans = 68
matmul passed - pattern #01
ans = 74, golden ans = 74
matmul passed - pattern #02
ans = 80, golden ans = 80
matmul passed - pattern #03
ans = 62, golden ans = 62
matmul passed - pattern #04
ans = 68, golden ans = 68
matmul passed - pattern #05
ans = 74, golden ans = 74
matmul passed - pattern #06
ans = 80, golden ans = 80
matmul passed - pattern #07
ans = 62, golden ans = 62
matmul passed - pattern #08
ans = 68, golden ans = 68
matmul passed - pattern #09
ans = 74, golden ans = 74
matmul passed - pattern #10
ans = 80, golden ans = 80
matmul passed - pattern #11
ans = 62, golden ans = 62
matmul passed - pattern #12
ans = 68, golden ans = 68
matmul passed - pattern #13
ans = 74, golden ans = 74
matmul passed - pattern #14
ans = 80, golden ans = 80
matmul passed - pattern #15
Test check end - matmul
Test check start - qsort
ans = 40, golden ans = 40
qsort passed - pattern #0
ans = 893, golden ans = 893
qsort passed - pattern #1
ans = 2541, golden ans = 2541
qsort passed - pattern #2
ans = 2669, golden ans = 2669
qsort passed - pattern #3
ans = 3233, golden ans = 3233
qsort passed - pattern #4
ans = 4267, golden ans = 4267
qsort passed - pattern #5
ans = 4622, golden ans = 4622
qsort passed - pattern #6
ans = 5681, golden ans = 5681
qsort passed - pattern #7
ans = 6023, golden ans = 6023
qsort passed - pattern #8
ans = 9073, golden ans = 9073
qsort passed - pattern #9
Test check end - qsort
main_tb.v:88: $finish called at 3251237500 (1ps)
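The FIR check prints each 32-bit result as two 16-bit halves, so a pair such as `ans[31:16] = 65535` and `ans[15:0] = 65526` is easier to read once recombined. The following is a minimal sketch, assuming the FIR outputs are two's-complement 32-bit values (the log itself does not state this); `fir_combine_halves` is a hypothetical helper name, not part of the project firmware.

```c
#include <stdint.h>

/* Hypothetical helper: rebuild a signed 32-bit value from the two 16-bit
 * halves that the log prints as ans[31:16] and ans[15:0].
 * Assumption: the result is a two's-complement 32-bit integer. */
static int32_t fir_combine_halves(uint16_t hi, uint16_t lo)
{
    uint32_t raw = ((uint32_t)hi << 16) | (uint32_t)lo;

    /* Portable two's-complement decode that avoids an out-of-range cast. */
    if (raw & 0x80000000u)
        return -(int32_t)(~raw) - 1;
    return (int32_t)raw;
}
```

Under that assumption, pattern #1 (65535, 65526) reads as -10 and pattern #4 (0, 35) reads as 35.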
Base | End | Hardware | Description |
---|---|---|---|
3800_0000 | 3800_04FF | BRAM_u0 | Initialized data |
3800_1000 | 3800_1FFF | BRAM_u0 | RISC-V Instructions |
3800_7000 | 3800_7FFF | BRAM_u1 | Calculated Result |
3000_8000 | 3000_8000 | DMA_Controller | DMA_cfg |
3000_8004 | 3000_8004 | DMA_Controller | DMA_addr |
3100_0000 | 3100_0000 | uart_ctrl | RX_DATA |
3100_0004 | 3100_0004 | uart_ctrl | TX_DATA |
3100_0008 | 3100_0008 | uart_ctrl | STAT_REG |
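The address map above can be mirrored as a small lookup table, which is handy for checking which hardware block an address falls into. This is only a sketch: the type and function names (`hw_block_t`, `region_t`, `hw_for_address`) are hypothetical, and the ranges are copied from the table as written, with each register row treated as the single address it lists.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical identifiers for the hardware blocks named in the table. */
typedef enum {
    HW_NONE,
    HW_BRAM_U0,
    HW_BRAM_U1,
    HW_DMA_CONTROLLER,
    HW_UART_CTRL
} hw_block_t;

typedef struct {
    uint32_t   base;  /* first address of the region (inclusive) */
    uint32_t   end;   /* last address of the region (inclusive)  */
    hw_block_t hw;
} region_t;

/* Base/End values copied from the address map. */
static const region_t memory_map[] = {
    { 0x38000000u, 0x380004FFu, HW_BRAM_U0 },        /* initialized data    */
    { 0x38001000u, 0x38001FFFu, HW_BRAM_U0 },        /* RISC-V instructions */
    { 0x38007000u, 0x38007FFFu, HW_BRAM_U1 },        /* calculated result   */
    { 0x30008000u, 0x30008000u, HW_DMA_CONTROLLER }, /* DMA_cfg             */
    { 0x30008004u, 0x30008004u, HW_DMA_CONTROLLER }, /* DMA_addr            */
    { 0x31000000u, 0x31000000u, HW_UART_CTRL },      /* RX_DATA             */
    { 0x31000004u, 0x31000004u, HW_UART_CTRL },      /* TX_DATA             */
    { 0x31000008u, 0x31000008u, HW_UART_CTRL }       /* STAT_REG            */
};

/* Return the block that owns addr, or HW_NONE if the address is unmapped. */
static hw_block_t hw_for_address(uint32_t addr)
{
    for (size_t i = 0; i < sizeof memory_map / sizeof memory_map[0]; i++) {
        if (addr >= memory_map[i].base && addr <= memory_map[i].end)
            return memory_map[i].hw;
    }
    return HW_NONE;
}
```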
// ~/testbench/main_tb.v
assign checkbits = mprj_io[31:16];
checkbits | Hardware | Meaning |
---|---|---|
16'hAB00 | FIR | testbench has received CPU - FIR start signal |
16'hAB01 | FIR | testbench has received CPU - FIR end signal |
16'hAB10 | matmul | testbench has received CPU - matmul start signal |
16'hAB11 | matmul | testbench has received CPU - matmul end signal |
16'hAB20 | qsort | testbench has received CPU - qsort start signal |
16'hAB21 | qsort | testbench has received CPU - qsort end signal |
16'hAB30 | FIR | testbench has received CPU - FIR_check start signal |
16'hAB31 | FIR | testbench has received CPU - FIR_check end signal |
16'hAB40 | matmul | testbench has received CPU - matmul_check start signal |
16'hAB41 | matmul | testbench has received CPU - matmul_check end signal |
16'hAB50 | qsort | testbench has received CPU - qsort_check start signal |
16'hAB51 | qsort | testbench has received CPU - qsort_check end signal |
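The codes in the table follow a regular pattern: the high byte is `0xAB`, bits [7:4] select the stage, and bit 0 distinguishes start (0) from end (1). The sketch below encodes that pattern and shifts the code into the `mprj_io[31:16]` position that the testbench samples. The names (`cb_stage_t`, `checkbits_encode`, `checkbits_to_mprj_word`, `checkbits_is_valid`) are hypothetical, and how the firmware actually drives `mprj_io` is not shown in this document.

```c
#include <stdint.h>

/* Stage numbers as they appear in bits [7:4] of the checkbits codes. */
typedef enum {
    CB_FIR          = 0,
    CB_MATMUL       = 1,
    CB_QSORT        = 2,
    CB_FIR_CHECK    = 3,
    CB_MATMUL_CHECK = 4,
    CB_QSORT_CHECK  = 5
} cb_stage_t;

/* Build the 16-bit code: 0xAB in the high byte, stage in [7:4], end flag in bit 0. */
static uint16_t checkbits_encode(cb_stage_t stage, int is_end)
{
    return (uint16_t)(0xAB00u | ((unsigned)stage << 4) | (is_end ? 1u : 0u));
}

/* Place the code on bits [31:16], matching `assign checkbits = mprj_io[31:16]`. */
static uint32_t checkbits_to_mprj_word(uint16_t code)
{
    return (uint32_t)code << 16;
}

/* Return 1 if the code is one of the twelve values listed in the table. */
static int checkbits_is_valid(uint16_t code)
{
    unsigned stage = (code >> 4) & 0xFu;
    unsigned phase = code & 0xFu;
    return (code >> 8) == 0xABu && stage <= (unsigned)CB_QSORT_CHECK && phase <= 1u;
}
```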
         +------+------+-------+------+---------+--------+
DMA_cfg  | done | idle | start | type | channel | length |
         | [12] | [11] | [10]  | [9]  | [8:7]   | [6:0]  |
         +------+------+-------+------+---------+--------+
DMA_addr | addr_DMA2RAM                                  |
         | [12:0]                                        |
         +-----------------------------------------------+
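The bit positions in the `DMA_cfg` and `DMA_addr` layout translate directly into masks and shifts. The sketch below packs a configuration word from those fields; it is an illustration only. The meaning of `type` and `channel`, and whether firmware sets `start` in the same write as the other fields, are assumptions not confirmed by this document, and all names are hypothetical.

```c
#include <stdint.h>

/* Bit positions taken from the DMA_cfg layout. */
#define DMA_CFG_DONE          (1u << 12)
#define DMA_CFG_IDLE          (1u << 11)
#define DMA_CFG_START         (1u << 10)
#define DMA_CFG_TYPE_SHIFT    9u
#define DMA_CFG_TYPE_MASK     0x1u   /* 1 bit  : [9]   */
#define DMA_CFG_CHANNEL_SHIFT 7u
#define DMA_CFG_CHANNEL_MASK  0x3u   /* 2 bits : [8:7] */
#define DMA_CFG_LENGTH_MASK   0x7Fu  /* 7 bits : [6:0] */
#define DMA_ADDR_MASK         0x1FFFu /* 13 bits: [12:0] */

/* Pack a config word with the start bit set (assumed usage). */
static uint32_t dma_cfg_pack(unsigned type, unsigned channel, unsigned length)
{
    return DMA_CFG_START
         | ((type & DMA_CFG_TYPE_MASK) << DMA_CFG_TYPE_SHIFT)
         | ((channel & DMA_CFG_CHANNEL_MASK) << DMA_CFG_CHANNEL_SHIFT)
         | (length & DMA_CFG_LENGTH_MASK);
}

/* Test the done flag in a value read back from DMA_cfg. */
static int dma_cfg_is_done(uint32_t cfg)
{
    return (cfg & DMA_CFG_DONE) != 0;
}

/* Keep only the 13 address bits that DMA_addr holds. */
static uint32_t dma_addr_pack(uint32_t addr)
{
    return addr & DMA_ADDR_MASK;
}
```

For example, `dma_cfg_pack(1, 2, 64)` yields `0x740`: start (bit 10), type (bit 9), channel 2 in [8:7], and length 64 in [6:0].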
         +-------------------------------------------------------------------+
RX_DATA  | DATA BITS                                                         |
         | [7:0]                                                             |
         +-------------------------------------------------------------------+
TX_DATA  | DATA BITS                                                         |
         | [7:0]                                                             |
         +-----------+-------------+---------+----------+---------+----------+
STAT_REG | Frame Err | Overrun Err | Tx_full | Tx_empty | Rx_full | Rx_empty |
         | [5]       | [4]         | [3]     | [2]      | [1]     | [0]      |
         +-----------+-------------+---------+----------+---------+----------+
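The `STAT_REG` flags can be tested with simple masks before touching `TX_DATA` or `RX_DATA`. The following is a sketch of that decoding; the macro and function names are hypothetical, and the polling policy (send when `Tx_full` is clear, read when `Rx_empty` is clear) is an assumption about how the flags are meant to be used.

```c
#include <stdint.h>

/* Bit positions taken from the STAT_REG layout. */
#define UART_STAT_RX_EMPTY    (1u << 0)
#define UART_STAT_RX_FULL     (1u << 1)
#define UART_STAT_TX_EMPTY    (1u << 2)
#define UART_STAT_TX_FULL     (1u << 3)
#define UART_STAT_OVERRUN_ERR (1u << 4)
#define UART_STAT_FRAME_ERR   (1u << 5)

/* A byte can be queued for transmit when the TX side is not full. */
static int uart_tx_ready(uint32_t stat)
{
    return (stat & UART_STAT_TX_FULL) == 0;
}

/* A received byte is available when the RX side is not empty. */
static int uart_rx_available(uint32_t stat)
{
    return (stat & UART_STAT_RX_EMPTY) == 0;
}

/* Either error flag set means the last transfer should not be trusted. */
static int uart_has_error(uint32_t stat)
{
    return (stat & (UART_STAT_FRAME_ERR | UART_STAT_OVERRUN_ERR)) != 0;
}
```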
File : Linker Script
MEMORY {
vexriscv_debug : ORIGIN = 0xf00f0000, LENGTH = 0x00000100
dff : ORIGIN = 0x00000000, LENGTH = 0x00000400
dff2 : ORIGIN = 0x00000400, LENGTH = 0x00000200
flash : ORIGIN = 0x10000000, LENGTH = 0x01000000
mprj : ORIGIN = 0x31000000, LENGTH = 0x00100000
rawdata : ORIGIN = 0x38000000, LENGTH = 0x00000500
mprjram : ORIGIN = 0x38001000, LENGTH = 0x00001000
hk : ORIGIN = 0x26000000, LENGTH = 0x00100000
csr : ORIGIN = 0xf0000000, LENGTH = 0x00010000
}
BRAM_u0 and BRAM_u1 do not operate concurrently.
| Priority | BRAM_u0 | BRAM_u1 |
|---|---|---|
| Highest | CPU (Write) | DMA (Write) |
| | CPU (Prefetch) | CPU (Read) |
| | DMA (Read) | |
| Lowest | CPU (Read) | |
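The priority order in the table amounts to a fixed-priority arbiter: among the pending requests, the one highest in the column wins. A minimal software model of that selection is sketched below. The request flags and function names are hypothetical, and the real Arbiter is hardware whose handshake details are not modelled here.

```c
#include <stddef.h>

/* Request sources as bit flags (hypothetical encoding). */
enum {
    REQ_NONE         = 0,
    REQ_CPU_WRITE    = 1 << 0,
    REQ_CPU_PREFETCH = 1 << 1,
    REQ_DMA_READ     = 1 << 2,
    REQ_CPU_READ     = 1 << 3,
    REQ_DMA_WRITE    = 1 << 4
};

/* Return the first request in `order` that is set in `pending`. */
static unsigned pick_first(unsigned pending, const unsigned *order, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (pending & order[i])
            return order[i];
    }
    return REQ_NONE;
}

/* BRAM_u0: CPU write > CPU prefetch > DMA read > CPU read. */
static unsigned arbitrate_bram_u0(unsigned pending)
{
    static const unsigned order[] = {
        REQ_CPU_WRITE, REQ_CPU_PREFETCH, REQ_DMA_READ, REQ_CPU_READ
    };
    return pick_first(pending, order, sizeof order / sizeof order[0]);
}

/* BRAM_u1: DMA write > CPU read. */
static unsigned arbitrate_bram_u1(unsigned pending)
{
    static const unsigned order[] = { REQ_DMA_WRITE, REQ_CPU_READ };
    return pick_first(pending, order, sizeof order / sizeof order[0]);
}
```

With both a DMA read and a CPU read pending on BRAM_u0, the model grants the DMA read, which matches the ordering in the table.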
- dma : DMA controller
- brc0 : BRAM controller u0
- brc1 : BRAM controller u1
- abt : Arbiter
- d0 : data0
- d1 : data1
- a0 : addr0
- a1 : addr1
- DMA reads data from BRAM_u0 (BRAM controller u0 -> DMA controller)
- Arbiter is idle
                0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
clk             |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
dma_r_addr      ___/a0|a1|a2|a3|a4|a5|a6|a7|a8|a9|aA|aB\____________________________________
dma_r_ready     ___/‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾\____________________________________
abt_r_ack       ___/‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾\____________________________________
brc0_out_valid  _________________________________/‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾\______
brc0_data_out   _________________________________/d0|d1|d2|d3|d4|d5|d6|d7|d8|d9|dA|dB\______
                   |<-----------10T------------->|
Minimum latency: 10T. (Adding a cache at the DMA could reduce the latency to 1T, and possibly to 0T.)
- Arbiter is busy
                0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
clk             |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
dma_r_addr      ___/‾a0‾‾|a1|a2|a3|a4|a5|a6|a7|a8|a9|aA|aB\_________________________________
dma_r_ready     ___/‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾\_________________________________
abt_r_ack       _________/‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾\_________________________________
brc0_out_valid  _______________________________________/‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾\
brc0_data_out   _______________________________________/d0|d1|d2|d3|d4|d5|d6|d7|d8|d9|dA|dB\
                         |<-----------10T------------->|
- Interrupted by CPU
                0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
clk             |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
dma_r_addr      ___/‾a0‾‾|a1|a2|a3|a4|a5|‾a6‾‾|a7|a8|a9|aA|aB\_________________________________
dma_r_ready     ___/‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾\_________________________________
abt_r_ack       _________/‾‾‾‾‾‾‾‾‾‾‾‾‾‾\__/‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾\_________________________________
brc0_out_valid  _______________________________________/‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾\__/‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾\
brc0_data_out   _______________________________________/d0|d1|d2|d3|d4|d5|xx|d6|d7|d8|d9|dA|dB\
                         |<-----------10T------------->|
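Reading the three read-timing diagrams above together, the first data word appears 10 cycles after `abt_r_ack` rises, and a 12-word burst finishes one cycle later for every cycle the arbiter takes the bus away. The sketch below captures that arithmetic. It is an interpretation of the diagrams rather than a specification, and the function names are hypothetical.

```c
/* Read latency shown in the diagrams: 10 cycles from abt_r_ack to first data. */
#define BRAM_U0_READ_LATENCY 10u

/* Cycle at which brc0_out_valid first rises, given the cycle where
 * abt_r_ack rises. */
static unsigned first_data_cycle(unsigned ack_cycle)
{
    return ack_cycle + BRAM_U0_READ_LATENCY;
}

/* Cycle at which brc0_out_valid finally falls for a burst of `words` reads
 * that lost the arbiter for `stall_cycles` cycles in total. */
static unsigned burst_end_cycle(unsigned ack_cycle, unsigned words,
                                unsigned stall_cycles)
{
    return first_data_cycle(ack_cycle) + words + stall_cycles;
}
```

With these definitions, the idle case (ack at cycle 1) ends at cycle 23, the busy case (ack at cycle 3) ends at cycle 25, and the interrupted case (ack at cycle 3, one stalled cycle) ends at cycle 26, which is where `brc0_out_valid` falls in each diagram.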
- DMA writes data to BRAM_u1 (DMA controller -> BRAM controller u1)
- DMA has the highest priority at the Arbiter
                0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15
clk             |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
dma_w_addr      ___/‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾\______
dma_w_valid     ___/‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾\______
dma_w_data      ___/d0|d1|d2|d3|d4|d5|d6|d7|d8|d9|dA|dB\______
Latency: 0T