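A small task over the past couple of days was tuning a MongoDB server. I happen to have been interested in Linux performance diagnosis and tuning lately, so I used the task as an excuse to read more books and articles.
One new thing I learned about is the Linux disk I/O queue scheduling policy. At least MySQL and PostgreSQL both recommend adjusting it: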
- http://www.mysqlperformanceblog.com/2009/01/30/linux-schedulers-in-tpcc-like-benchmark/
- http://www.cybertec.at/postgresql-linux-kernel-io-tuning/
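The simple explanation
Simply put, Linux offers three scheduling policies for disk I/O: cfq, deadline, and noop.
- cfq: short for Completely Fair Queuing. It is a complex policy that creates queues per process and tries to stay fair across processes (which means it does not account for the different costs of reads and writes)
- deadline: a simpler policy with just two queues, one for reads and one for writes (which clearly helps read-heavy systems). The name comes from the kernel assigning each I/O operation an expiry time
- noop: the simplest policy, with a single queue and only some basic request merging
Given the differences in hardware and in real workloads (read/write ratio, sequential versus random access), the simple explanation above does not help much with the actual choice. Which one to pick still has to be verified by testing. Still, here are a few points for reference: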
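- According to several articles, deadline and noop do not differ much from each other, but both differ considerably from cfq
- Data storage systems such as MySQL should not use cfq (time-series databases may be different, though some people say they have never seen deadline do worse than cfq)
- For disks on virtual machines, the simpler noop is recommended, since how the data actually reaches the disk is decided by the virtualization layer
- The defaults on the few VMs I have at hand: centos6 is cfq, ubuntu12.04 is xxxx, centos7 and ubuntu14.04 are deadline (this only covers these few machines, and I do not know whether they are representative)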
The following command shows the current setting for a disk:
# cat /sys/block/sda/queue/scheduler
[noop] deadline cfq
(The scheduler in square brackets is the one currently selected.)
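To check more than one disk at a time, the bracketed entry can be pulled out with a small script. This is a minimal sketch, not a standard tool: the function names `active_scheduler` and `list_schedulers` are made up for illustration, and it assumes the scheduler files live under `/sys/block/<device>/queue/scheduler` as in the command above.

```shell
#!/bin/sh
# Print the active scheduler (the bracketed entry) from one scheduler file.
# $1: path to a scheduler file, e.g. /sys/block/sda/queue/scheduler
# (hypothetical helper name, not part of any standard tool)
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$1"
}

# Print "<device>: <active scheduler>" for every device under a block directory.
# $1: directory to scan (assumed default: /sys/block)
list_schedulers() {
    root="${1:-/sys/block}"
    for f in "$root"/*/queue/scheduler; do
        # Skip devices without a readable scheduler file (or an unmatched glob).
        [ -r "$f" ] || continue
        dev="${f#"$root"/}"   # drop the leading directory
        dev="${dev%%/*}"      # keep only the device name
        printf '%s: %s\n' "$dev" "$(active_scheduler "$f")"
    done
}

list_schedulers
```

If a device's file contains no bracketed entry, the part after the colon is simply left empty.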
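The setting can be changed on the fly as follows (it takes effect immediately; the plain redirect needs a root shell):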
echo deadline > /sys/block/sda/queue/scheduler
# or
echo deadline | sudo tee /sys/block/sda/queue/scheduler
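Since the set of available schedulers differs from machine to machine, it can help to check that the requested name is actually listed before writing it. The following is a sketch under that assumption; `set_scheduler` is a hypothetical helper name, and it still has to run as root against the real sysfs file.

```shell
#!/bin/sh
# Switch a device to the requested scheduler, but only if the file lists it.
# $1: scheduler name, e.g. deadline
# $2: path to the scheduler file, e.g. /sys/block/sda/queue/scheduler
# (hypothetical helper name, not part of any standard tool)
set_scheduler() {
    want="$1"
    file="$2"
    # Strip the square brackets so the active entry compares like the others.
    available=$(sed 's/[][]//g' "$file")
    for s in $available; do
        if [ "$s" = "$want" ]; then
            printf '%s\n' "$want" > "$file"
            return 0
        fi
    done
    echo "scheduler '$want' is not listed in $file (available: $available)" >&2
    return 1
}
```

A call would look like `set_scheduler deadline /sys/block/sda/queue/scheduler`. On a name that is not listed, the function leaves the file untouched and returns a non-zero status.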
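Appendix: the relevant passage in High Performance MySQL
Chapter 9 of High Performance MySQL, Operating System and Hardware Optimization (page 434 of the English edition), contains the following (note that this is the third edition; the second edition does not have this section):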
Choosing a Disk Queue Scheduler
On GNU/Linux, the queue scheduler determines the order in which requests to a block device are actually sent to the underlying device. The default is Completely Fair Queueing, or cfq. It’s okay for casual use on laptops and desktops, where it helps prevent I/O starvation, but it’s terrible for servers. It causes very poor response times under the types of workload that MySQL generates, because it stalls some requests in the queue needlessly.
You can see which schedulers are available, and which one is active, with the following command:
$ cat /sys/block/sda/queue/scheduler
noop deadline [cfq]
You should replace sda with the device name of the disk you’re interested in. In our example, the square brackets indicate which scheduler is in use for this device. The other two choices are suitable for server-class hardware, and in most cases they work about equally well. The noop scheduler is appropriate for devices that do their own scheduling behind the scenes, such as hardware RAID controllers and SANs, and deadline is fine both for RAID controllers and disks that are directly attached. Our benchmarks show very little difference between these two. The main thing is to use anything but cfq, which can cause severe performance problems.
Take this advice with a grain of salt, though, because the disk schedulers actually come in many variations in different kernels, and there is no indication of that in their names.
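Both the notes above and the book excerpt end on the same advice: measure it yourself. The sketch below is not from the book. It is one rough way to time the same workload under each scheduler a device offers and then restore the original setting. `compare_schedulers` is a hypothetical helper name, the workload command is whatever you pass in, and the timing has one-second resolution. A real comparison would need a representative workload, root privileges, and some care with the page cache.

```shell
#!/bin/sh
# Time one workload under every scheduler a device lists, then restore the original.
# $1: scheduler file, e.g. /sys/block/sda/queue/scheduler
# remaining arguments: the workload command to run (supply your own)
# (hypothetical helper name, not part of any standard tool)
compare_schedulers() {
    file="$1"
    shift
    # Remember the active (bracketed) scheduler and the full list up front.
    original=$(sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$file")
    available=$(sed 's/[][]//g' "$file")
    for s in $available; do
        printf '%s\n' "$s" > "$file"
        start=$(date +%s)
        "$@" > /dev/null 2>&1
        end=$(date +%s)
        printf '%s: %s s\n' "$s" "$((end - start))"
    done
    # Put the original scheduler back if one was detected.
    if [ -n "$original" ]; then
        printf '%s\n' "$original" > "$file"
    fi
}
```

A call would look like `compare_schedulers /sys/block/sda/queue/scheduler ./my_workload.sh`, where `my_workload.sh` stands in for your own test script.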