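I've noticed a lot of SRIO threads on the forum lately, and I happened to put together some SRIO notes at a customer's request, so I'm sharing them here to get the discussion going.
First, to be honest, I'm no SRIO expert; I haven't even read the whole SRIO specification, O(∩_∩)O~. I went through a lot of material and books to write this up for the customer, so if anything is wrong, discussion and corrections are welcome. (There are plenty of SRIO gurus on the forum, such as Zhan and Allen, hehe...)
SRIO isn't actually the generic name; the term you can really google is LP-Serial (I'll keep you guessing for now about what that stands for). It is a protocol that specifies how two devices that both follow it can communicate. Note that it is defined between two devices, not three or four, and getting this right matters. Say you have a switch connected to 3 DSPs and 2 FPGAs: all 5 can communicate over SRIO, but in essence each of the 3 DSPs and 2 FPGAs is talking to the switch. In other words, at the link level this is a point-to-point protocol. I bring this up because many customers report that packets sent from a DSP to an FPGA never arrive. If there is a switch in between, check the DSP-to-switch link and the switch-to-FPGA link separately instead of looking at DSP-to-FPGA as a whole. That is the essence of the protocol.
Now for the protocol itself: it specifies that SRIO carries data at the physical layer in fixed packet formats. If you are building a raw SRIO IP, you have to assemble these packets by hand; if you are using a TI chip, congratulations, TI's LSU assembles them for you and you only need to configure the LSU registers. So when people ask how to fill in the LSU, the answer is that once you understand the packet formats in the protocol and how the LSU fields map onto them, you will have no questions left. (Easier said than done, of course...)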
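As a rough illustration of how a logical operation maps onto the packet-format fields that the LSU ends up carrying, the sketch below pairs a few common operations with their ftype/ttype encodings from the RapidIO specification. The enum, struct and function names are made up for this example and are not TI CSL or LLD identifiers.

```c
#include <stdint.h>

/* Hypothetical names for this sketch; they are not TI CSL or LLD identifiers. */
typedef enum {
    SRIO_OP_NREAD,
    SRIO_OP_NWRITE,
    SRIO_OP_NWRITE_R,
    SRIO_OP_SWRITE,
    SRIO_OP_MAINT_READ,
    SRIO_OP_MAINT_WRITE,
    SRIO_OP_DOORBELL
} srio_op_t;

typedef struct {
    uint8_t ftype; /* format type: selects the packet class */
    uint8_t ttype; /* transaction type inside that class; 0 where the class has none */
} srio_pkt_type_t;

/* Return the ftype/ttype pair carried in the packet header for an operation. */
srio_pkt_type_t srio_pkt_type(srio_op_t op)
{
    srio_pkt_type_t t = { 0, 0 };

    switch (op) {
    case SRIO_OP_NREAD:       t.ftype = 2;  t.ttype = 4; break;
    case SRIO_OP_NWRITE:      t.ftype = 5;  t.ttype = 4; break;
    case SRIO_OP_NWRITE_R:    t.ftype = 5;  t.ttype = 5; break;
    case SRIO_OP_SWRITE:      t.ftype = 6;               break;
    case SRIO_OP_MAINT_READ:  t.ftype = 8;  t.ttype = 0; break;
    case SRIO_OP_MAINT_WRITE: t.ftype = 8;  t.ttype = 1; break;
    case SRIO_OP_DOORBELL:    t.ftype = 10;              break;
    }
    return t;
}
```

For example, srio_pkt_type(SRIO_OP_MAINT_READ) yields ftype 8 and ttype 0, which is the pair a maintenance read request uses.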
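That brings us to SRIO error detection. When we hit an SRIO error (setting aside hardware signal-quality errors, which call for looking at the eye diagram), the first thing we usually check is the SPn_ERR_STAT register at offset 0x158. Its bit fields are listed in the table below. The register can be viewed in three parts: the port status, the input and output error-stop bits, and the input and output retry-stop bits. Each part is explained in turn below.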
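Bit | Name | Description
0 | Port_Uninitialized | The input and output ports are not initialized. Bit 0 and bit 1 are mutually exclusive: exactly one of them is 1 at any time (set and cleared by hardware)
1 | Port_Ok | The input and output ports are initialized and both sides are exchanging error-free control symbols (set and cleared by hardware)
2 | Port_Error | The input or output port hit an error that the hardware cannot recover from, mainly a link-response that was never received or an invalid response
4 | Port_Write_Pnd | The port needs to issue a port-write maintenance operation to report the error status. The port-write recipient is configured in advance; when a port error occurs, a port-write maintenance packet is sent to it
8 | Input_Error_STP | The input port has detected a transmission error and is in the error-stopped state (set and cleared by hardware)
9 | Input_Error_ENC | The input port has encountered a transmission error at some point; set together with bit 8, write 1 to clear
10 | Input_Retry_STP | The input port is in the retry-stopped state
16 | Output_Error_STP | The output port has detected a transmission error and is in the error-stopped state (set and cleared by hardware)
17 | Output_Error_ENC | The output port has encountered a transmission error at some point; set together with bit 16, write 1 to clear
18 | Output_Retry_STP | The output port is in the retry-stopped state (set and cleared by hardware)
19 | Output_Retried | Output-port retry flag; set together with bit 18, write 1 to clear
20 | Output_Retry_Enc | The output port has been in the output retry state at some point
24 | Output_Degrd_Enc | The output port's error count has reached or exceeded the degraded threshold
25 | Output_Fail_Enc | The output port's error count has reached or exceeded the failed threshold
26 | Output_Pkt_Drop | The output port has dropped a packet (switch devices only)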
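To make the bit table easier to use while debugging, here is a minimal decoding sketch. The bit positions come from the table; the macro and function names are my own, and reading the actual register value is left to whatever access method your platform provides.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Bit positions from the SPn_ERR_STAT table; the names are this sketch's own. */
#define ERR_STAT_PORT_UNINIT       (1u << 0)
#define ERR_STAT_PORT_OK           (1u << 1)
#define ERR_STAT_PORT_ERROR        (1u << 2)
#define ERR_STAT_INPUT_ERR_STP     (1u << 8)
#define ERR_STAT_INPUT_ERR_ENC     (1u << 9)
#define ERR_STAT_INPUT_RETRY_STP   (1u << 10)
#define ERR_STAT_OUTPUT_ERR_STP    (1u << 16)
#define ERR_STAT_OUTPUT_ERR_ENC    (1u << 17)
#define ERR_STAT_OUTPUT_RETRY_STP  (1u << 18)
#define ERR_STAT_OUTPUT_RETRIED    (1u << 19)

typedef struct { uint32_t mask; const char *name; } err_stat_bit_t;

static const err_stat_bit_t err_stat_bits[] = {
    { 1u << 0,  "Port_Uninitialized" }, { 1u << 1,  "Port_Ok" },
    { 1u << 2,  "Port_Error" },         { 1u << 4,  "Port_Write_Pnd" },
    { 1u << 8,  "Input_Error_STP" },    { 1u << 9,  "Input_Error_ENC" },
    { 1u << 10, "Input_Retry_STP" },    { 1u << 16, "Output_Error_STP" },
    { 1u << 17, "Output_Error_ENC" },   { 1u << 18, "Output_Retry_STP" },
    { 1u << 19, "Output_Retried" },     { 1u << 20, "Output_Retry_Enc" },
    { 1u << 24, "Output_Degrd_Enc" },   { 1u << 25, "Output_Fail_Enc" },
    { 1u << 26, "Output_Pkt_Drop" },
};

/* 1 if Port_Ok is set and no stop or unrecoverable condition is active. */
int err_stat_link_clear(uint32_t value)
{
    const uint32_t blocking = ERR_STAT_PORT_ERROR |
                              ERR_STAT_INPUT_ERR_STP | ERR_STAT_INPUT_RETRY_STP |
                              ERR_STAT_OUTPUT_ERR_STP | ERR_STAT_OUTPUT_RETRY_STP;
    return (value & ERR_STAT_PORT_OK) != 0 && (value & blocking) == 0;
}

/* Mask of the sticky bits that the table marks as write-1-to-clear. */
uint32_t err_stat_sticky_clear_mask(void)
{
    return ERR_STAT_INPUT_ERR_ENC | ERR_STAT_OUTPUT_ERR_ENC | ERR_STAT_OUTPUT_RETRIED;
}

/* Print the raw value followed by the name of every bit that is set. */
void err_stat_print(uint32_t value)
{
    size_t i;
    printf("SPn_ERR_STAT = 0x%08lX\n", (unsigned long)value);
    for (i = 0; i < sizeof err_stat_bits / sizeof err_stat_bits[0]; i++) {
        if (value & err_stat_bits[i].mask) {
            printf("  %s\n", err_stat_bits[i].name);
        }
    }
}
```

A value of 0x00000002 (only Port_Ok set) counts as a clear link, while 0x00010002 does not, because Output_Error_STP (bit 16) is active.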
Port uninitialized and Port Ok
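Port uninitialized and port OK form a pair: the port is always in exactly one of the two states. A port normally starts out uninitialized and only reaches the port OK state after the user has configured and initialized it. Port initialization mainly consists of aligning the receive clock window and settling the port width; in most cases the port width is fixed by configuration, so only the receive clock window needs adjusting.
To adjust the receive clock window, the two connected devices continuously send each other training control symbols and link-request control symbols. A port that successfully receives and detects a control symbol answers with an idle control symbol, and a port that receives an idle control symbol clears the port uninitialized state and moves to port OK.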
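In software, the initialization described here usually shows up as a poll on the status register until Port_Ok appears. The sketch below assumes a caller-supplied read function (hypothetical, so that it can run against a simulated register) and a poll-count timeout; on real hardware you would read SPn_ERR_STAT through your own register access layer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ERR_STAT_PORT_UNINIT  (1u << 0)
#define ERR_STAT_PORT_OK      (1u << 1)

/* Hypothetical accessor type: returns the current SPn_ERR_STAT value. */
typedef uint32_t (*read_err_stat_fn)(void *ctx);

/* Return 0 once Port_Ok is set, or -1 if max_polls reads pass without it. */
int wait_port_ok(read_err_stat_fn read_err_stat, void *ctx, uint32_t max_polls)
{
    uint32_t i;

    assert(read_err_stat != NULL);
    for (i = 0; i < max_polls; i++) {
        uint32_t v = read_err_stat(ctx);
        if ((v & ERR_STAT_PORT_OK) && !(v & ERR_STAT_PORT_UNINIT)) {
            return 0;
        }
    }
    return -1;
}

/* Simulated register: reports "uninitialized" for *ctx more reads, then Port_Ok. */
uint32_t fake_err_stat(void *ctx)
{
    uint32_t *remaining = (uint32_t *)ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return ERR_STAT_PORT_UNINIT;
    }
    return ERR_STAT_PORT_OK;
}

/* Usage example against the simulated register. */
int wait_port_ok_demo(uint32_t reads_until_ok, uint32_t max_polls)
{
    uint32_t remaining = reads_until_ok;
    return wait_port_ok(fake_err_stat, &remaining, max_polls);
}
```

Bounding the poll count keeps a link that never trains from hanging the caller; a real implementation would more likely bound it by elapsed time.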
Input and Output Error Stop
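Input error stop and output error stop always come as a pair.
How the error occurs:
Device A sends a packet to device B.
Device B finds an error in a received idle control symbol or packet, so it enters the input error stop state (that bit is set to 1, and input error encountered is set as well).
Device B sends a PNA (packet-not-accepted) control symbol to device A.
On receiving the PNA, device A stops sending anything, keeps a copy of the packet that failed, and enters the output error stop state (that bit is set to 1, and output error encountered is set as well).
How the error is recovered:
Precondition: device A is in output error stop and device B is in input error stop.
Device A sends a link-request to device B.
Device B answers device A with a link-response and clears its input error stop state.
Device A receives the link-response and clears its output error stop state.
Device A resumes by resending the packet that failed, or by sending a higher-priority packet.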
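The error-stop sequence can be traced with a small simulation of the status bits on both ends. This is only a sketch under my own naming; it models the bit changes described in the steps, not the control symbols themselves.

```c
#include <stdint.h>

#define IN_ERR_STP   (1u << 8)   /* Input_Error_STP  */
#define IN_ERR_ENC   (1u << 9)   /* Input_Error_ENC  */
#define OUT_ERR_STP  (1u << 16)  /* Output_Error_STP */
#define OUT_ERR_ENC  (1u << 17)  /* Output_Error_ENC */

typedef struct { uint32_t err_stat; } port_t;
typedef struct { port_t a; port_t b; } link_t;

/* B sees a corrupted packet or control symbol and will answer with a PNA. */
void b_detects_error(port_t *b)          { b->err_stat |= IN_ERR_STP | IN_ERR_ENC; }

/* A receives the PNA and stops transmitting. */
void a_receives_pna(port_t *a)           { a->err_stat |= OUT_ERR_STP | OUT_ERR_ENC; }

/* B answers A's link-request with a link-response and leaves the stopped state. */
void b_answers_link_request(port_t *b)   { b->err_stat &= ~IN_ERR_STP; }

/* A receives the link-response and leaves the stopped state. */
void a_receives_link_response(port_t *a) { a->err_stat &= ~OUT_ERR_STP; }

/* Run the failure, and optionally the recovery, and return both ports' status. */
link_t error_stop_demo(int recover)
{
    link_t l = { { 0 }, { 0 } };

    b_detects_error(&l.b);
    a_receives_pna(&l.a);
    if (recover) {
        b_answers_link_request(&l.b);
        a_receives_link_response(&l.a);
    }
    return l;
}
```

After recovery the stop bits are gone but the "encountered" bits stay set, which matches the table: they are sticky and have to be cleared by writing 1.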
Input and Output Retry Stop
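Input retry stop and output retry stop always come as a pair.
How the error occurs:
Device A sends a packet to device B.
Device B hits a temporary problem that keeps it from accepting the packet (for example, no free buffer is available), so it discards the packet and enters the input retry stop state (that bit is set to 1, and input retry encountered is set as well).
Device B sends a PR (packet-retry) control symbol to device A.
On receiving the PR, device A stops sending anything, keeps a copy of the packet that failed, and enters the output retry stop state (that bit is set to 1, and output retry encountered is set as well).
How the error is recovered:
Precondition: device A is in output retry stop and device B is in input retry stop.
Device A sends a restart-from-retry to device B.
On receiving the restart-from-retry, device B clears its input retry stop state and starts accepting packets again.
Device A clears its output retry stop state and resumes by resending the packet that failed, or by sending a higher-priority packet.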
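The retry-stop sequence can be sketched the same way, this time driven by receive buffers running out. All names are hypothetical, and the model assumes the receiver frees exactly one buffer before the sender restarts, purely to keep the example short.

```c
typedef struct {
    int free_buffers;
    int input_retry_stop;
} rx_port_t;

typedef struct {
    int output_retry_stop;
    int retries;
} tx_port_t;

/* Offer one packet to the receiver: 1 = accepted, 0 = answered with packet-retry. */
int rx_offer_packet(rx_port_t *rx)
{
    if (rx->input_retry_stop) {
        return 0;                   /* everything is discarded until restart-from-retry */
    }
    if (rx->free_buffers == 0) {
        rx->input_retry_stop = 1;   /* no buffer: discard the packet, send PR */
        return 0;
    }
    rx->free_buffers--;
    return 1;
}

/* The receiver sees restart-from-retry and accepts packets again. */
void rx_restart_from_retry(rx_port_t *rx)
{
    rx->input_retry_stop = 0;
}

/* Send one packet, retrying up to max_attempts times: 1 = delivered, 0 = gave up. */
int tx_send_packet(tx_port_t *tx, rx_port_t *rx, int max_attempts)
{
    int attempt;

    for (attempt = 0; attempt < max_attempts; attempt++) {
        if (rx_offer_packet(rx)) {
            return 1;
        }
        tx->output_retry_stop = 1;  /* PR received: stop sending, keep the packet */
        tx->retries++;
        rx->free_buffers++;         /* assumption: the receiver drains one buffer */
        rx_restart_from_retry(rx);  /* sender issues restart-from-retry */
        tx->output_retry_stop = 0;  /* sender resumes with the same packet */
    }
    return 0;
}

/* Usage example: send `packets` packets and return how many retries were needed. */
int retry_demo(int free_buffers, int packets)
{
    rx_port_t rx = { free_buffers, 0 };
    tx_port_t tx = { 0, 0 };
    int i;

    for (i = 0; i < packets; i++) {
        if (!tx_send_packet(&tx, &rx, 4)) {
            return -1;
        }
    }
    return tx.retries;
}
```

With two free buffers and three packets, the third packet is retried once; with enough buffers for every packet no retry happens at all.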
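Note that this register is only the most basic one for diagnosing SRIO error states. There are more advanced ones, but I only half understand them myself, so I'll share those once I've studied some more!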
First, I don't often give praise for support but I must say Travis, Karthik and Derek from TI have been extremely instrumental in getting my SRIO environment to work and bringing me up to speed on tips and tricks for SRIO. It has been very nice to see success and progress. Thanks again guys!
So in an effort to consolidate much of what I have learned, I will post here some information that I would have found extremely helpful 5 weeks ago :)
My environment:
I was using an aTCA chassis with an MCH that has an SRIO switch. For more on my setup, please see this post: http://e2e.ti.com/support/dsp/c6000_multi-core_dsps/f/639/t/164695.aspx There I list the parts I am using and some commands that I found helpful when working with the MCH.
My Goal:
Get DSP-A on C6678 EVM-A to send a message to DSP-B on C6678 EVM-B via an SRIO switch.
TI Example Programs:
When the TI programs are in loopback mode, they seem to work just fine. They use the CSL (Chip Support Library) to access registers. The CSL is nice. Plus, once you open a CSL handle, you can access the structure with all the registers yourself (see the example in the 'Things to check after "Port Ok"' section).
To switch between loopback and "normal" mode, use these CSL calls:
CSL_SRIO_SetLoopbackMode(hSrio, 0);
CSL_SRIO_SetNormalMode(hSrio,0);
While it seemed to make sense for me to use the MultiCoreLoopback example or the ChipToChip example, it turns out these are extremely complex and make it difficult to learn/understand what is going on.
Travis recommended using the loopbackDioIsr example project. This project is simpler and uses fewer of the queuing capabilities of the DSP. The program essentially accesses an exact memory location, as given in the LSU (Load/Store Unit) registers, at the destination ID. So in the case of a loopback, it is just another location in DSP-A. In normal mode, it is a memory location in DSP-B (assuming you set up the destination IDs). Be careful: this also means you can write to any location in memory - any!
Switch and routing issues:
- Remember that all device IDs have to be unique. So if you use the same example program on each DSP (A & B), it won't work unless you change the device IDs.
- Remember that the switch needs to be configured to properly route the destination IDs on the packets. This can be done with maintenance packets or by direct configuration of the switch. In my case, the switch has a default configuration file that I modified to route from DSP-A to DSP-B.
- Remember to make sure that the switch enables input and output on the port you are using. If they are not enabled, only maintenance packets will be received. All other types will be dropped.
- My switch could only accept one connection from one device ID. The way the TI SRIO works is that you can have multiple port connections, but they will always have the same device ID. The examples are written to make four 1-lane connections to the other device. You might need to adjust this portion of the examples if your switch is like mine.
- Travis says "port-writes" should be disabled unless you have a specific reason to use them.
- Some switches need to be specially enabled to accept 16-bit destination IDs. Just something to keep in mind.
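To make the routing and port-enable points concrete, here is a toy model of a switch's forwarding decision. It is a sketch with invented names and a deliberately simplified rule set (8-bit destination IDs only, no hop count handling), not a description of any particular switch.

```c
#include <stdint.h>
#include <string.h>

#define NUM_PORTS          4
#define NO_ROUTE           0xFFu
#define FTYPE_MAINTENANCE  8u

/* Toy switch: a destination-ID lookup table plus per-port enable flags. */
typedef struct {
    uint8_t route[256];              /* destination ID -> output port */
    uint8_t input_enable[NUM_PORTS];
    uint8_t output_enable[NUM_PORTS];
} switch_model_t;

/* Start with no routes and every port's input and output disabled. */
void switch_init(switch_model_t *sw)
{
    memset(sw->route, NO_ROUTE, sizeof sw->route);
    memset(sw->input_enable, 0, sizeof sw->input_enable);
    memset(sw->output_enable, 0, sizeof sw->output_enable);
}

/* Return the output port for a packet, or -1 if the switch drops it. */
int switch_forward(const switch_model_t *sw, uint8_t in_port,
                   uint8_t dest_id, uint8_t ftype)
{
    uint8_t out = sw->route[dest_id];

    if (in_port >= NUM_PORTS || out == NO_ROUTE || out >= NUM_PORTS) {
        return -1;                   /* no usable route for this destination ID */
    }
    if (ftype != FTYPE_MAINTENANCE &&
        (!sw->input_enable[in_port] || !sw->output_enable[out])) {
        return -1;                   /* only maintenance packets pass a disabled port */
    }
    return out;
}

/* Usage example: DSP-A on port 0 sends to destination ID 0x02, routed to port 1. */
int switch_demo(int enable_ports, uint8_t ftype)
{
    switch_model_t sw;

    switch_init(&sw);
    sw.route[0x02] = 1;
    if (enable_ports) {
        sw.input_enable[0] = 1;
        sw.output_enable[1] = 1;
    }
    return switch_forward(&sw, 0, 0x02, ftype);
}
```

In this model a write packet (ftype 5) is dropped until the ports are enabled, while a maintenance packet (ftype 8) gets through either way, which is the symptom described in the bullets.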
Understanding Ack IDs (from Travis):
Normal handshaking at the physical layer would be like this:
Device A sends a packet to Device B with ackID n.
There is a transmission error on packet ackID n.
Device B sees a CRC error and goes into the input error stopped state (drops all new RX packets).
Device B sends a PNA control symbol to Device A.
On reception of the PNA, Device A goes into the output error stopped state (stops sending any new packets).
Device A sends an LR input-status control symbol to Device B.
On reception of the LR input-status, Device B sends a link maintenance response control symbol indicating that packet ackID n was the PNA packet. Device B then also returns to normal mode.
On reception of the link maintenance response control symbol, Device A goes into normal mode and starts resending packets to Device B, starting with packet ackID n.
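The ackID bookkeeping in this handshake can be sketched as a small simulation. The names are mine, the 5-bit ackID width is an assumption of the sketch, and packets are sent one at a time rather than pipelined to keep it short.

```c
#include <stdint.h>

#define ACKID_MASK 0x1Fu   /* assumption for this sketch: 5-bit ackID */

typedef struct {
    uint8_t next_ackid;       /* ackID the sender uses for its next packet */
    uint8_t output_err_stop;
} sender_t;

typedef struct {
    uint8_t expected_ackid;   /* ackID the receiver expects next */
    uint8_t input_err_stop;
} receiver_t;

/* Return 1 if the packet is accepted, 0 if it is answered with a PNA or dropped. */
int receiver_accept(receiver_t *rx, uint8_t ackid, int crc_ok)
{
    if (rx->input_err_stop) {
        return 0;                         /* drops all packets while stopped */
    }
    if (!crc_ok || ackid != rx->expected_ackid) {
        rx->input_err_stop = 1;           /* enters input error stopped, sends PNA */
        return 0;
    }
    rx->expected_ackid = (uint8_t)((rx->expected_ackid + 1u) & ACKID_MASK);
    return 1;
}

/* Answer an LR input-status: report the expected ackID and return to normal mode. */
uint8_t receiver_link_response(receiver_t *rx)
{
    rx->input_err_stop = 0;
    return rx->expected_ackid;
}

/* Rewind the sender to the ackID reported in the link response. */
void sender_recover(sender_t *tx, uint8_t ackid_status)
{
    tx->next_ackid = ackid_status;
    tx->output_err_stop = 0;
}

/* Deliver `total` packets; packet number `corrupt_at` fails once (-1 for none).
 * Returns the number of transmissions that were needed. */
int ackid_demo(int total, int corrupt_at)
{
    sender_t tx = { 0, 0 };
    receiver_t rx = { 0, 0 };
    int transmissions = 0, delivered = 0, corrupted_once = 0;

    while (delivered < total) {
        int crc_ok = !(delivered == corrupt_at && !corrupted_once);

        if (!crc_ok) {
            corrupted_once = 1;
        }
        transmissions++;
        if (receiver_accept(&rx, tx.next_ackid, crc_ok)) {
            tx.next_ackid = (uint8_t)((tx.next_ackid + 1u) & ACKID_MASK);
            delivered++;
        } else {
            tx.output_err_stop = 1;       /* PNA received */
            sender_recover(&tx, receiver_link_response(&rx));
        }
    }
    return transmissions;
}
```

Sending four packets with the third one corrupted takes five transmissions: the failed packet is resent with the same ackID once the link response has been processed.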
Things to check after "Port Ok":
After you get a port ok, if you are having problems sending messages here are some registers you should check (the listed register is for Port 0).
- ERR_Stat (TI register 0xB158)
- LM_RESP (TI register 0xB144)
- ERR_Det (TI register 0xC040)
- SP0_CTL (TI register 0xB15C)
These collectively told me that the switch was not accepting my packets, and in the end led to the discovery that the switch had not enabled input and output messaging and was only accepting maintenance packets.
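A small helper that dumps the four Port 0 registers listed above in one go can save time while debugging. The sketch below uses those offsets relative to the SRIO register base; the function names are my own, and a plain array stands in for the hardware so the example can run anywhere. On a real device you would pass the SRIO configuration base address instead.

```c
#include <stdint.h>
#include <stdio.h>

/* Port 0 offsets as listed above, relative to the SRIO register base. */
#define SRIO_SP0_LM_RESP   0xB144u
#define SRIO_SP0_ERR_STAT  0xB158u
#define SRIO_SP0_CTL       0xB15Cu
#define SRIO_SP0_ERR_DET   0xC040u

/* Read one 32-bit register at a byte offset from the base. */
uint32_t srio_read32(const volatile uint32_t *base, uint32_t offset)
{
    return base[offset / 4u];
}

/* Print the four Port 0 registers worth checking after "Port Ok". */
void srio_dump_port0(const volatile uint32_t *base)
{
    printf("LM_RESP  = 0x%08lX\n", (unsigned long)srio_read32(base, SRIO_SP0_LM_RESP));
    printf("ERR_Stat = 0x%08lX\n", (unsigned long)srio_read32(base, SRIO_SP0_ERR_STAT));
    printf("SP0_CTL  = 0x%08lX\n", (unsigned long)srio_read32(base, SRIO_SP0_CTL));
    printf("ERR_Det  = 0x%08lX\n", (unsigned long)srio_read32(base, SRIO_SP0_ERR_DET));
}

/* Stand-in for the register space, large enough to cover the highest offset. */
static uint32_t fake_regs[SRIO_SP0_ERR_DET / 4u + 1u];

/* Usage example: pretend the port reports Port_Ok, dump, and return ERR_Stat. */
uint32_t srio_dump_demo(void)
{
    fake_regs[SRIO_SP0_ERR_STAT / 4u] = 0x00000002u;
    srio_dump_port0(fake_regs);
    return srio_read32(fake_regs, SRIO_SP0_ERR_STAT);
}
```

Passing the base as a pointer keeps the helper independent of how the registers are mapped, so the same code can be pointed at a CSL register overlay or at a test buffer.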
If you ever see the Output Error Stop condition or the Input Error Stop condition, there is a magic number that is to be written to a register. In fact, Travis recommends doing this no matter what after receiving "Port Ok".
hSrio->RIO_PLM[i].RIO_PLM_SP_LONG_CS_TX1 = 0x2003F044;
System_printf("SRIO (Core %i): Correct Output Error Stop Condition.\n", coreNum);
After you have sent messages using the LSU, there is an LSU status register that is very helpful for indicating if the transfer was good or not.
Maintenance Packets:
Here is a short bit of code that I wrote to read a register from the switch via a maintenance packet. This function will work with the dioIsr example. Sorry about the formatting.
static Int32 maintanenceReadReg(Srio_SockHandle handle, UInt32 srioReg)
{
    Srio_SockAddrInfo to;
    uint32_t          startTime;
    uint16_t          counter;
    uint8_t          *pReadRespBuf;

    /* Allocate the 4-byte buffer that receives the read response. */
    pReadRespBuf = (uint8_t *)Osal_srioDataBufferMalloc(4);
    if (pReadRespBuf == NULL)
    {
        System_printf("Error: pReadRespBuf Memory allocation failed.\n");
        return -1;
    }

    /* Pre-fill with a known pattern so an unanswered read is easy to spot. */
    for (counter = 0; counter < 4; counter++)
    {
        pReadRespBuf[counter] = 0x55;
    }

    to.dio.rapidIOMSB = 0x0;
    to.dio.rapidIOLSB = srioReg;          /* register offset in the target, e.g. 0x15C */
    to.dio.dstID      = DEVICE_ID4_8BIT;
    to.dio.ttype      = 0;                /* Read */
    to.dio.ftype      = 8;                /* Maintenance packets */

    /* Send the DIO Information. */
    if (Srio_sockSend_DIO(handle, pReadRespBuf, 4, (Srio_SockAddrInfo *)&to) < 0)
    {
        System_printf("Error: (Core %d): Could not send message.\n", coreNum);
        Osal_srioDataBufferFree(pReadRespBuf, 4);
        return -1;
    }

    /* Wait for the interrupt to occur without touching the peripheral. */
    /* Other useful work could be done here such as by invoking a scheduler */
    startTime = TSCL;
    while ((!srioLsuIsrServiced) && ((TSCL - startTime) < SRIO_DIO_LSU_ISR_TIMEOUT));
    if (!srioLsuIsrServiced)
    {
        /* The buffer is not freed here: the transfer may still complete and write to it. */
        System_printf("ISR didn't happen within set time - %d cycles. Example failed !!!\n", SRIO_DIO_LSU_ISR_TIMEOUT);
        return -1;
    }

    /* pReadRespBuf now holds the 4 response bytes; inspect or copy them before freeing. */
    Osal_srioDataBufferFree(pReadRespBuf, 4);
    return 0;
}
Calling the function:
maintanenceReadReg(mySrioSocket, 0x15C);
CSL_SRIO_ClearLSUPendingInterrupt (hSrioCSL, 0xFFFFFFFF, 0xFFFFFFFF);
srioLsuIsrServiced = 0;
Clearing the pending LSU interrupt is important: it has to happen after each transmission (at least in this example).
Various Other Posts to check:
http://e2e.ti.com/support/dsp/c6000_multi-core_dsps/f/639/t/168310.aspx
http://e2e.ti.com/support/dsp/c6000_multi-core_dsps/f/639/t/167006.aspx
http://e2e.ti.com/support/dsp/c6000_multi-core_dsps/f/639/t/165949.aspx
I am sure I have forgotten a few things, but hopefully this will get you started. Post away; hopefully Travis, Derek or Karthik will see it and be able to help!
Good luck!
Brandy
PS - thanks again guys, it feels great to be moving forward!!