What is Ibex?
Ibex was initially developed as part of the PULP platform under the name "Zero-riscy" and was later contributed to lowRISC, which maintains and further develops it. It is under active development.
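Ibex is a production-grade, open-source 32-bit RISC-V processor written in SystemVerilog. It is small (around 11,000 lines) but complete. It supports the RV32I, RV32C, RV32M, and RV32B extensions as well as M-Mode and U-Mode, and it fully implements the control and status registers, interrupts and exceptions, and debug support defined by the RISC-V specification, which makes it a good fit for embedded systems.
The overall architecture is as follows: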

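Pipeline
Ibex uses a two-stage pipeline by default and also supports a three-stage pipeline as an experimental feature. The two stages are:
- Instruction Fetch (IF): fetches instructions from memory through a prefetch buffer. It can deliver one instruction per cycle as long as the instruction-side memory can keep up.
- Decode/Execute (ID/EX): decodes the instruction and executes it immediately. All operations, including register reads and writes and memory accesses, happen in this stage.
Ibex supports multi-cycle instructions. Every instruction needs at least two cycles to pass through the pipeline, and an instruction that needs more cycles stalls the pipeline for several cycles. The instruction types and their stall cycles are: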
| Instruction Type | Stall Cycles | Description |
|---|---|---|
| Integer Computational | 0 | Integer Computational Instructions are defined in the RISC-V RV32I Base Integer Instruction Set. |
| CSR Access | 0 | CSR Access Instructions are defined in ‘Zicsr’ of the RISC-V specification. |
| Load/Store | 1 - N | Both loads and stores stall for at least one cycle to await a response. For loads this response is the load data (which is written directly to the register file the same cycle it is received). For stores this is whether an error was seen or not. The longer the data side memory interface takes to receive a response the longer loads and stores will stall. |
| Multiplication | 0/1 (Single-Cycle Multiplier), 2/3 (Fast Multi-Cycle Multiplier), clog2(op_b)/32 (Slow Multi-Cycle Multiplier) | Single-cycle: 0 for MUL, 1 for MULH. Fast multi-cycle: 2 for MUL, 3 for MULH. Slow multi-cycle: clog2(op_b) for MUL, 32 for MULH. See details in Multiplier/Divider Block (MULT/DIV). |
| Division / Remainder | 1 or 37 | 1 stall cycle if divide by 0, otherwise full long division. See details in Multiplier/Divider Block (MULT/DIV). |
| Jump | 1 - N | Minimum one cycle stall to flush the prefetch counter and begin fetching from the new Program Counter (PC). The new PC request will appear on the instruction-side memory interface the same cycle the jump instruction enters ID/EX. The longer the instruction-side memory interface takes to receive data the longer the jump will stall. |
| Branch (Not-Taken) | 0 | Any branch where the condition is not met will not stall. |
| Branch (Taken) | 2 - N, or 1 - N (Branch Target ALU enabled) | Any branch where the condition is met will stall for 2 cycles, because in the first cycle the branch is in ID/EX the ALU is used to calculate the branch condition. In the following cycle the ALU is used again to calculate the branch target, and the branch then proceeds as Jump does above (flush IF stage and prefetch buffer, new PC on instruction-side memory interface the same cycle it is calculated). The longer the instruction-side memory interface takes to receive data the longer the branch will stall. With the parameter BranchTargetALU set to 1, a separate ALU calculates the branch target while the main ALU calculates the branch condition, so 1 less stall cycle is required. |
| Instruction Fence | 1 - N | The FENCE.I instruction as defined in ‘Zifencei’ of the RISC-V specification. Internally it is implemented as a jump (which does the required flushing) so it has the same stall characteristics (see above). |
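The table can be turned into a small lookup to estimate how many stall cycles a short instruction trace costs. This is only a sketch of the numbers above, not a model of the real core: the function name, the instruction kind strings, the `extra_wait` parameter (cycles the memory interface takes beyond the minimum, i.e. the "N" part), and the choice of `"fast"` as the default multiplier are all assumptions made for illustration.

```python
def clog2(value):
    """Ceiling of log2, returning 0 for inputs of 0 or 1."""
    return 0 if value <= 1 else (value - 1).bit_length()


def stall_cycles(kind, *, extra_wait=0, multiplier="fast", op_b=0,
                 divide_by_zero=False, branch_target_alu=False):
    """Return the stall cycles the table lists for one instruction.

    extra_wait is a hypothetical knob: the cycles the memory interface
    takes beyond the minimum (the "N" part of "1 - N").
    """
    if kind in ("alu", "csr", "branch_not_taken"):
        return 0
    if kind in ("load", "store", "jump", "fence_i"):
        return 1 + extra_wait
    if kind == "branch_taken":
        # The separate branch target ALU saves one cycle.
        return (1 if branch_target_alu else 2) + extra_wait
    if kind in ("mul", "mulh"):
        table = {
            "single": {"mul": 0, "mulh": 1},
            "fast": {"mul": 2, "mulh": 3},
            "slow": {"mul": clog2(op_b), "mulh": 32},
        }
        return table[multiplier][kind]
    if kind in ("div", "rem"):
        return 1 if divide_by_zero else 37
    raise ValueError(f"unknown instruction kind: {kind}")


trace = ["alu", "load", "branch_taken", "mul", "div"]
total = sum(stall_cycles(kind) for kind in trace)
print(total)  # 0 + 1 + 2 + 2 + 37 = 42
```

The division row dominates any trace that contains it, which is why the divide-by-zero shortcut is listed separately.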
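Instruction Fetch (IF)
Ibex fetches instructions from memory through a prefetch buffer. An instruction cache exists only in draft form and has to be enabled explicitly. The prefetch buffer fetches instructions linearly until it is full, and each instruction is stored in the fetch queue together with its PC. When a jump is executed, the control logic of the IF stage flushes the prefetch buffer.
Ibex supports RV32C (the compressed instruction extension). Compressed instructions are expanded in the IF stage, so ID/EX needs no special handling for them.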
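To make the idea of expanding compressed instructions concrete, here is a minimal sketch for a single instruction, C.ADDI, following the RISC-V encoding (a 16-bit instruction is any half-word whose two lowest bits are not `11`). The function names are made up for this example, and the code says nothing about how Ibex's own decoder is organized; it only shows the kind of mapping the IF stage performs.

```python
def is_compressed(halfword):
    """A 16-bit instruction has its two lowest bits different from 0b11."""
    return (halfword & 0x3) != 0x3


def expand_c_addi(instr16):
    """Expand C.ADDI rd, imm into the 32-bit ADDI rd, rd, imm."""
    if (instr16 & 0x3) != 0x1 or (instr16 >> 13) != 0:
        raise ValueError("not a C.ADDI encoding")
    rd = (instr16 >> 7) & 0x1F
    # imm[5] is bit 12, imm[4:0] are bits 6:2; sign-extend from 6 bits.
    imm = (((instr16 >> 12) & 0x1) << 5) | ((instr16 >> 2) & 0x1F)
    if imm & 0x20:
        imm -= 0x40
    # I-type layout: imm[11:0] | rs1 | funct3=000 | rd | opcode=0x13
    return ((imm & 0xFFF) << 20) | (rd << 15) | (rd << 7) | 0x13


print(hex(expand_c_addi(0x0505)))  # c.addi a0, 1 -> addi a0, a0, 1 = 0x150513
```

After this step the rest of the pipeline only ever sees the 32-bit form.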
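The branch prediction policy is also configurable. With BranchPrediction set to 1, branches to a negative offset are assumed to always be taken. I have not yet worked out what the default policy is.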
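The "negative offset means taken" rule is easy to express in code. The sketch below decodes the offset of a B-type branch according to the RISC-V encoding and predicts taken when it is negative. The function names are hypothetical, and this covers only the rule described above, not whatever else the real predictor may do.

```python
def b_type_offset(instr):
    """Extract the signed branch offset from a 32-bit B-type instruction."""
    imm = ((instr >> 31) & 0x1) << 12
    imm |= ((instr >> 7) & 0x1) << 11
    imm |= ((instr >> 25) & 0x3F) << 5
    imm |= ((instr >> 8) & 0xF) << 1
    if imm & 0x1000:          # sign-extend from 13 bits
        imm -= 0x2000
    return imm


def predict_taken(instr):
    """Static rule: backward branches (negative offset) are predicted taken."""
    return b_type_offset(instr) < 0


print(predict_taken(0xFE000EE3))  # beq x0, x0, -4 -> True
print(predict_taken(0x00000463))  # beq x0, x0, +8 -> False
```

Backward branches usually close loops, which is the usual motivation for this kind of static rule.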
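Externally, the IF interface (the instruction-side memory interface) performs only word-aligned instruction fetches. A misaligned instruction fetch is handled as two separate word-aligned fetches. Internally, the core can handle both word-aligned and half-word-aligned instruction addresses in order to support compressed instructions. The LSB of the instruction address is ignored internally.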
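The following sketch shows how a 32-bit instruction that starts at a half-word-aligned address ends up as two word-aligned fetches. It assumes a little-endian memory modeled as a dictionary from word-aligned addresses to 32-bit words, and it fetches on demand, whereas the real prefetch buffer streams words ahead of time. The `fetch` function is an illustration only.

```python
def fetch(mem_words, pc):
    """Return (instruction, list of word-aligned addresses accessed)."""
    pc &= ~0x1                      # the address LSB is ignored
    word_addr = pc & ~0x3
    accesses = [word_addr]
    first = mem_words[word_addr]
    if (pc & 0x2) == 0:
        raw = first                 # word-aligned: one fetch is enough
    else:
        raw = first >> 16           # instruction starts in the upper half
        if (raw & 0x3) == 0x3:      # 32-bit instruction: need the next word
            second_addr = word_addr + 4
            accesses.append(second_addr)
            raw |= (mem_words[second_addr] & 0xFFFF) << 16
    if (raw & 0x3) != 0x3:          # compressed: keep only 16 bits
        raw &= 0xFFFF
    return raw, accesses


# Layout: c.addi at 0x0, addi a0,a0,1 at 0x2 (spans two words), c.nop at 0x6
mem = {0x0: 0x05130505, 0x4: 0x00010015}
print(fetch(mem, 0x2))  # (0x150513 as an int, [0, 4])
```

Only the 32-bit instruction at address 0x2 needs both words; the compressed instructions at 0x0 and 0x6 each need one.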
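Decode/Execute (ID/EX)
The ID/EX stage is made up of several blocks:
- Instruction Decode Block: controls the overall execution flow of ID/EX
- Decoder: performs the actual decoding and generates the control signals
- Controller: maintains the processor's finite state machine; in particular it sets the PC on jumps and handles exceptions and interrupts
- Registers: the register file; this processor implements no forwarding in the register file
- EX Block: mainly instantiates the ALU and the multiplier/divider
- ALU: implements the RV32I arithmetic and logic operations, supports the multiplier/divider, and computes jump target addresses
- Multiplier/Divider Block (MULT/DIV): implements the RV32M extension
- CSR: the control and status registers, including the registers for exception/interrupt handling, the performance counters, and the debug registers
- LSU: the load-store unit, responsible for reading and writing memory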
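Load-Store Unit (LSU)
As a 32-bit RISC-V processor, Ibex can load and store words, half-words, and bytes. Every load or store stalls the pipeline for at least one cycle. Ibex supports misaligned memory accesses: one misaligned access (one that crosses a 32-bit word boundary) results in two aligned accesses, and the LSU merges the two. This happens whether or not the access succeeds. If the first aligned access fails, the LSU still receives the data that memory returns and ignores it, which keeps the behavior correct.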
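The split into two aligned accesses can be sketched as follows. The code assumes little-endian byte order and a 4-bit byte-enable mask in which bit i selects byte i of the word; both function names are invented for this example, and the real LSU does this in hardware over two consecutive requests.

```python
def split_access(addr, size):
    """Return the word-aligned requests as (address, byte_enable) pairs."""
    if size not in (1, 2, 4):
        raise ValueError("size must be 1, 2, or 4 bytes")
    offset = addr & 0x3
    word_addr = addr - offset
    mask = ((1 << size) - 1) << offset     # up to 7 bits wide
    requests = [(word_addr, mask & 0xF)]
    if mask >> 4:                          # bytes spill into the next word
        requests.append((word_addr + 4, mask >> 4))
    return requests


def merge_load(first_word, second_word, addr, size):
    """Combine two aligned words into the value of a misaligned load."""
    offset = addr & 0x3
    combined = (second_word << 32) | first_word
    return (combined >> (8 * offset)) & ((1 << (8 * size)) - 1)


print(split_access(0x1001, 4))  # [(4096, 14), (4100, 1)]
print(hex(merge_load(0x44332211, 0x88776655, 0x1001, 4)))  # 0x55443322
```

A half-word at offset 2 stays within one word and produces a single request, while a half-word at offset 3 is split.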
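The LSU handles all interaction between the processor and memory (loads and stores). What matters most is the protocol the LSU uses. The LSU interface is: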
| Signal | Direction | Description |
|---|---|---|
| data_req_o | output | Request valid, must stay high until data_gnt_i is high for one cycle |
| data_addr_o[31:0] | output | Address, word aligned |
| data_we_o | output | Write Enable, high for writes, low for reads. Sent together with data_req_o |
| data_be_o[3:0] | output | Byte Enable. Is set for the bytes to write/read, sent together with data_req_o |
| data_wdata_o[31:0] | output | Data to be written to memory, sent together with data_req_o |
| data_gnt_i | input | The other side accepted the request. Outputs may change in the next cycle. |
| data_rvalid_i | input | data_err_i and data_rdata_i hold valid data when data_rvalid_i is high. This signal will be high for exactly one cycle per request. |
| data_err_i | input | Error response from the bus or the memory: request cannot be handled. High in case of an error. |
| data_rdata_i[31:0] | input | Data read from memory |
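- To access memory, the LSU drives the 32-bit word-aligned address data_addr_o together with the request flag data_req_o, and holds them until it receives data_gnt_i. For a write it also drives the 32-bit data data_wdata_o and the byte-enable flags data_be_o. Once memory has received the signals and data it returns data_gnt_i, which tells the LSU that the request has been accepted and will be handled. Memory acknowledges right away, but it may take several more cycles to actually serve the request because memory access is slow.
- After the LSU receives the acknowledgement data_gnt_i, it can send the next request. Memory keeps track of the request it has already accepted, so nothing is lost.
- Once memory has served a request, it sends data_rvalid_i to the LSU for exactly one cycle. The LSU checks data_err_i to see whether the request was served correctly and then accepts or ignores data_rdata_i. In short, data_gnt_i tells the LSU that memory has received the request, and data_rvalid_i tells the LSU that memory has served it.
- Several requests may be accepted but not yet served at the same time. The protocol treats all memory requests as in order (no reordering), and memory returns one data_rvalid_i for every request it serves.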
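The handshake above can be tried out with a small cycle-by-cycle model. Everything in it is an assumption made for illustration: the `SimpleMemory` class grants every request in the cycle it arrives, answers a fixed number of cycles later, captures the result at grant time, and reports an error for addresses outside a tiny address range. The tuple returned by `step` mirrors data_gnt_i, data_rvalid_i, data_err_i, and data_rdata_i.

```python
from collections import deque


class SimpleMemory:
    """Toy slave: grants at once, answers `latency` cycles later, in order."""

    def __init__(self, latency=2, size_bytes=64):
        self.latency = latency
        self.size_bytes = size_bytes
        self.words = {}          # word-aligned address -> 32-bit value
        self.pending = deque()   # (ready_cycle, err, rdata) in request order

    def step(self, now, req):
        """Advance one cycle. req is None or (addr, we, be, wdata)."""
        rvalid, err, rdata = False, False, 0
        if self.pending and self.pending[0][0] <= now:
            _, err, rdata = self.pending.popleft()
            rvalid = True        # high for exactly one cycle per request

        gnt = req is not None
        if gnt:
            addr, we, be, wdata = req
            bad = addr % 4 != 0 or not 0 <= addr < self.size_bytes
            value = 0
            if not bad:
                old = self.words.get(addr, 0)
                if we:
                    for i in range(4):            # apply byte enables
                        if (be >> i) & 1:
                            mask = 0xFF << (8 * i)
                            old = (old & ~mask) | (wdata & mask)
                    self.words[addr] = old
                else:
                    value = old
            self.pending.append((now + self.latency, bad, value))
        return gnt, rvalid, err, rdata


def run_lsu(mem, requests, max_cycles=100):
    """Issue requests back to back and collect (err, rdata) per response."""
    todo, responses, outstanding = deque(requests), [], 0
    for now in range(max_cycles):
        req = todo[0] if todo else None   # hold the request until granted
        gnt, rvalid, err, rdata = mem.step(now, req)
        if rvalid:
            responses.append((err, rdata))
            outstanding -= 1
        if gnt:
            todo.popleft()
            outstanding += 1
        if not todo and outstanding == 0:
            break
    return responses


reqs = [(0x10, 1, 0xF, 0xDEADBEEF), (0x10, 1, 0x1, 0xAA),
        (0x10, 0, 0xF, 0), (0x1000, 0, 0xF, 0)]
print(run_lsu(SimpleMemory(latency=3), reqs))
```

With a latency of 3, three requests are in flight before the first response arrives, and the responses come back in the order the requests were granted.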
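Register File
Ibex provides 32 general-purpose registers (RV32I) or 16 (RV32E), where x0 is hard-wired to zero. The register file has two read ports and one write port, and register file data is available in the same cycle a read is requested. There is no write-to-read forwarding path, so if a register is read and written at the same time, the read returns the current value rather than the value being written.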
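The read-during-write behavior is worth seeing in a tiny model. The class below is a behavioral sketch with hypothetical names: both reads sample the stored state before the write of the same cycle is applied, and writes to x0 are dropped because x0 always reads as zero in RISC-V.

```python
class RegisterFile:
    """Two read ports, one write port, no write-to-read forwarding."""

    def __init__(self, num_regs=32):
        self.regs = [0] * num_regs

    def cycle(self, raddr_a, raddr_b, waddr=None, wdata=0):
        """Model one clock cycle and return (rdata_a, rdata_b)."""
        # Reads see the state from before this cycle's write.
        rdata_a = self.regs[raddr_a]
        rdata_b = self.regs[raddr_b]
        if waddr is not None and waddr != 0:   # x0 stays zero
            self.regs[waddr] = wdata & 0xFFFFFFFF
        return rdata_a, rdata_b


rf = RegisterFile()
print(rf.cycle(5, 0, waddr=5, wdata=42))  # (0, 0): old value of x5
print(rf.cycle(5, 0))                     # (42, 0): write is now visible
```

In a design where every operation happens in a single ID/EX stage, a read and a write of the same instruction never need to see each other, so the missing forwarding path costs nothing there.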
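Ibex implements three versions of the register file:
- Flip-flop-based version: preferred for Verilator
- Latch-based version: preferred for ASIC implementations
- FPGA version: preferred for FPGAs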
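Interrupts/Exceptions, Performance Counters, Control and Status Registers, Debug Support, Physical Memory Protection (PMP)
Ibex supports all of the features above as defined by the RISC-V specification.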
