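1. NVMe Overview
- NVMe is a high-performance, scalable host controller interface for PCIe-based solid-state drives.
- The defining feature of NVMe is that it provides multiple queues for processing I/O commands. A single NVMe device supports up to 64K I/O queues, and each I/O queue can hold up to 64K commands.
- To issue an I/O command, the host places the command in a submission queue (SQ) and then notifies the NVMe device by writing to a doorbell register (DB).
- Once the device has processed the command, it writes the result to a completion queue (CQ) and raises an interrupt to notify the host.
- NVMe uses MSI/MSI-X and interrupt coalescing to improve interrupt-handling performance.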
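The submit, doorbell, and completion steps above can be sketched as a pair of ring buffers. This is a minimal simulation under stated assumptions, not real NVMe data structures: every `toy_*` name is hypothetical, the entries are far smaller than real 64-byte commands, the "device" is an ordinary function, and the host reaps completions by comparing indices rather than by taking an interrupt or checking a phase tag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_QUEUE_DEPTH 8 /* toy size; a real queue may hold up to 64K entries */

/* Hypothetical, heavily simplified queue entries. */
struct toy_cmd { uint16_t cid; uint64_t lba; };
struct toy_cpl { uint16_t cid; uint16_t status; };

struct toy_qpair {
    struct toy_cmd sq[TOY_QUEUE_DEPTH];
    struct toy_cpl cq[TOY_QUEUE_DEPTH];
    uint16_t sq_tail;    /* host: next free SQ slot */
    uint16_t sq_head;    /* device: next SQ slot to consume */
    uint16_t cq_tail;    /* device: next free CQ slot */
    uint16_t cq_head;    /* host: next CQ slot to reap */
    uint16_t sq_tail_db; /* stands in for the SQ tail doorbell register */
};

static uint16_t toy_next(uint16_t idx)
{
    return (uint16_t)((idx + 1) % TOY_QUEUE_DEPTH);
}

/* Host side: place a command in the SQ, then ring the doorbell. */
bool toy_submit(struct toy_qpair *qp, uint16_t cid, uint64_t lba)
{
    assert(qp != NULL);
    if (toy_next(qp->sq_tail) == qp->sq_head) {
        return false; /* SQ full: one slot stays empty to tell full from empty */
    }
    qp->sq[qp->sq_tail] = (struct toy_cmd){ .cid = cid, .lba = lba };
    qp->sq_tail = toy_next(qp->sq_tail);
    qp->sq_tail_db = qp->sq_tail; /* the "doorbell write" exposes the command */
    return true;
}

/* Device side (simulated): consume commands up to the doorbell value and
 * post one successful completion per command. Stops early if the CQ is full.
 * Returns the number of commands processed. */
int toy_device_process(struct toy_qpair *qp)
{
    assert(qp != NULL);
    int n = 0;
    while (qp->sq_head != qp->sq_tail_db &&
           toy_next(qp->cq_tail) != qp->cq_head) {
        struct toy_cmd cmd = qp->sq[qp->sq_head];
        qp->sq_head = toy_next(qp->sq_head);
        qp->cq[qp->cq_tail] = (struct toy_cpl){ .cid = cmd.cid, .status = 0 };
        qp->cq_tail = toy_next(qp->cq_tail);
        n++;
    }
    return n;
}

/* Host side: reap one completion if one is available. */
bool toy_reap(struct toy_qpair *qp, struct toy_cpl *out)
{
    assert(qp != NULL && out != NULL);
    if (qp->cq_head == qp->cq_tail) {
        return false; /* nothing completed yet */
    }
    *out = qp->cq[qp->cq_head];
    qp->cq_head = toy_next(qp->cq_head);
    return true;
}
```

A caller would zero-initialize a `struct toy_qpair`, call `toy_submit()` a few times, let `toy_device_process()` run, and then drain results with `toy_reap()`. Because each queue pair has its own indices, several pairs can be driven independently, which is the point of NVMe's multi-queue design.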
2. SPDK Overview
The Storage Performance Development Kit (SPDK) provides a set of tools and libraries for writing high-performance, scalable, user-mode storage applications. It achieves high performance by moving all of the necessary drivers into user space and operating in a polled mode instead of relying on interrupts, which avoids kernel context switches and eliminates interrupt-handling overhead.
The bedrock of SPDK is a user-space, polled-mode, asynchronous, lockless NVMe driver. This provides zero-copy, highly parallel access directly to an SSD from a user-space application. The driver is written as a C library with a single public header. Similarly, SPDK provides a user-space driver for the I/OAT DMA engine present on many Intel Xeon-based platforms, with all of the same properties as the NVMe driver.
SPDK also provides NVMe-oF and iSCSI servers built on top of these user-space drivers that are capable of serving disks over the network. The standard Linux kernel iSCSI and NVMe-oF initiators (or even the Windows iSCSI initiator) can be used to connect clients to the servers. These servers can be up to an order of magnitude more CPU-efficient than other implementations.
SPDK is an open-source, BSD-licensed set of C libraries and executables hosted on GitHub. All new development is done on the master branch, and stable releases are created quarterly. Contributors and users are welcome to submit patches, file issues, and ask questions on the mailing list.
3. SPDK NVMe Driver Overview
The NVMe driver is a C library that may be linked directly into an application, providing direct, zero-copy data transfer to and from NVMe SSDs. It is entirely passive, meaning that it spawns no threads and only performs actions in response to function calls from the application itself. The library controls NVMe devices by directly mapping the PCI BAR into the local process and performing memory-mapped I/O (MMIO). I/O is submitted asynchronously via queue pairs (QPs), and the general flow is not entirely dissimilar from Linux's libaio.
For further details, see the SPDK documentation for the NVMe driver.
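The passive, asynchronous flow described above (submit with a callback, then poll for completions from the application's own thread) can be illustrated with a small simulation. This is a sketch under stated assumptions, not the SPDK API: all `toy_*` names are hypothetical, no hardware is touched, and every queued request is treated as complete the moment the application polls. The shape is loosely modeled on SPDK's `spdk_nvme_ns_cmd_read()` and `spdk_nvme_qpair_process_completions()`, whose real signatures take more parameters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_INFLIGHT 4 /* toy limit on outstanding requests */

/* Completion callback: invoked on the caller's thread during polling. */
typedef void (*toy_io_cb)(void *cb_arg, int status);

struct toy_request { uint64_t lba; toy_io_cb cb; void *cb_arg; };

/* Hypothetical queue pair: a FIFO of requests awaiting completion. */
struct toy_io_qpair {
    struct toy_request inflight[TOY_MAX_INFLIGHT];
    uint32_t head;  /* oldest outstanding request */
    uint32_t count; /* number of outstanding requests */
};

/* Queue a read and return immediately; -1 means the queue pair is full. */
int toy_submit_read(struct toy_io_qpair *qp, uint64_t lba,
                    toy_io_cb cb, void *cb_arg)
{
    assert(qp != NULL && cb != NULL);
    if (qp->count == TOY_MAX_INFLIGHT) {
        return -1;
    }
    uint32_t slot = (qp->head + qp->count) % TOY_MAX_INFLIGHT;
    qp->inflight[slot] = (struct toy_request){ .lba = lba, .cb = cb,
                                               .cb_arg = cb_arg };
    qp->count++;
    return 0;
}

/* Poll for completions. Nothing happens unless the application calls this:
 * the library owns no thread. In this toy, max == 0 means "no limit".
 * Returns the number of callbacks invoked. */
uint32_t toy_process_completions(struct toy_io_qpair *qp, uint32_t max)
{
    assert(qp != NULL);
    uint32_t done = 0;
    while (qp->count > 0 && (max == 0 || done < max)) {
        struct toy_request req = qp->inflight[qp->head];
        qp->head = (qp->head + 1) % TOY_MAX_INFLIGHT;
        qp->count--;            /* free the slot before the callback runs */
        req.cb(req.cb_arg, 0);  /* simulated device always reports success */
        done++;
    }
    return done;
}

/* Example callback: count successful completions. */
static void toy_count_done(void *cb_arg, int status)
{
    if (status == 0) {
        (*(int *)cb_arg)++;
    }
}

/* Submit three reads, then poll until all three callbacks have fired.
 * Returns the number of completed reads, or -1 if a submit failed. */
int toy_demo(void)
{
    struct toy_io_qpair qp = {0};
    int done = 0;
    for (uint64_t lba = 0; lba < 3; lba++) {
        if (toy_submit_read(&qp, lba, toy_count_done, &done) != 0) {
            return -1;
        }
    }
    while (done < 3) {
        toy_process_completions(&qp, 0);
    }
    return done;
}
```

Calling `toy_demo()` from an application's `main` exercises the whole flow. The resemblance to libaio is in the split between a non-blocking submit call and a separate call that harvests completions; the difference in this style is that completions arrive as callbacks on the polling thread rather than as an array of events.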
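4. Other NVMe Driver Implementations
- Linux kernel NVMe driver: download the kernel source from www.kernel.org or a mirror, then read include/linux/nvme.h and drivers/nvme.
- NVMeDirect: an open-source project started by Korean developers. It is similar to the SPDK NVMe driver but depends heavily on the Linux kernel NVMe driver implementation. In effect, NVMeDirect is a user-space I/O framework that stands on the shoulders of giants.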
Do not let what you cannot do interfere with what you can do.