This article is reposted from https://www.cnblogs.com/LubinLew/p/cpu_affinity.html; copyright belongs to the original author.
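0. Background
Hyper-Threading: uses special hardware instructions to present one physical core as two logical cores,
so that a single processor can exploit thread-level parallelism. This makes it compatible with multithreaded operating systems and software, reduces CPU idle time, and improves CPU efficiency.
The familiar terms "dual-core, four-thread" and "quad-core, eight-thread" refer to CPUs that support Hyper-Threading.
Physical CPU: an actual CPU installed in the machine. For example, if one 8-core CPU is installed on your motherboard, the number of physical CPUs is 1. The physical CPU count is simply the number of CPUs installed on the motherboard.
Logical CPU: a CPU can generally have multiple cores, and Intel's Hyper-Threading (HT) technology can logically double the number of CPU cores.
Logical CPUs = physical CPUs x cores per CPU x 2 (if HT is supported and enabled)
// This assumes all CPUs are the same model. If they differ, add up the count for each CPU individually instead of multiplying by the number of physical CPUs.
// Example: a computer with one 4-core CPU that supports HT and has it enabled has 1 x 4 x 2 = 8 logical CPUs.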
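The formula above can be sketched in C as follows. This is only an illustration: the cpu_package struct and the logical_cpus() helper are hypothetical names, not part of any system API. The sketch sums package by package, so it also covers the case where the CPU models differ.

```c
#include <stddef.h>

/* Hypothetical description of one physical CPU package. */
struct cpu_package {
    int cores;       /* physical cores in this package */
    int ht_enabled;  /* non-zero if HT is supported and enabled */
};

/* Sum logical CPUs package by package, so different models can be mixed. */
static int logical_cpus(const struct cpu_package *pkgs, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += pkgs[i].cores * (pkgs[i].ht_enabled ? 2 : 1);
    return total;
}
```

For the single 4-core CPU with HT enabled from the example, calling logical_cpus() on one {4, 1} entry gives 8.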
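Viewing CPU information on Linux. Most CPU information is available in /proc/cpuinfo:
# Number of physical CPUs
cat /proc/cpuinfo|grep "physical id"|sort -u|wc -l
# Number of cores in each physical CPU
cat /proc/cpuinfo|grep "cpu cores"|uniq
# Number of logical CPUs
cat /proc/cpuinfo|grep "processor"|wc -l
# CPU model name
cat /proc/cpuinfo|grep "name"|cut -f2 -d:|uniq
Finding which logical CPU a process is running on
ps -eo pid,args,psr
# Meaning of the fields:
#   pid  - process ID
#   args - the command line the process was started with
#   psr  - the logical CPU assigned to the process
Example: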
[~]# ps -eo pid,args,psr | grep nginx
9073 nginx: master process /usr/ 1
9074 nginx: worker process 0
9075 nginx: worker process 1
9076 nginx: worker process 2
9077 nginx: worker process 3
13857 grep nginx 3
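A program can also ask which logical CPU it is currently running on, which corresponds to the psr column shown above. A minimal sketch, assuming glibc's sched_getcpu() is available; the wrapper name current_cpu() is made up for illustration.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Return the logical CPU the calling thread is running on, or -1 on error.
 * The value may already be stale when it is returned, because the scheduler
 * can migrate the thread at any time unless its affinity pins it. */
static int current_cpu(void)
{
    return sched_getcpu();
}
```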
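Finding a thread's TID on Linux
The TID is the thread ID. It is entirely different from the thread ID that POSIX represents with pthread_t.
A thread created by the POSIX thread library on Linux is really a lightweight process (LWP), and the TID is that thread's real PID.
getpid() does not return it, however. Linux defines a gettid() system call, but glibc versions before 2.30 provide no wrapper for it, so the TID is usually obtained in one of the following ways.
// In a program
#define _GNU_SOURCE
#include <unistd.h>       /* syscall */
#include <sys/syscall.h>  /* __NR_gettid, SYS_gettid */
pid_t tid;
tid = syscall(__NR_gettid); // or syscall(SYS_gettid)
// On the command line: 3 methods (the third is recommended)
(1) ps -efL | grep prog_name
(2) ls /proc/pid/task    // each directory name is a TID
(3) ps -To 'pid,lwp,psr,cmd' -p PID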
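1. CPU affinity
1.1 Basic concepts
CPU affinity is a scheduler property that "binds" a process to one CPU or to a set of CPUs.
On an SMP (Symmetric Multi-Processing) architecture, the Linux scheduler honors the CPU affinity setting: the process runs only on the CPUs it is bound to and never on any other CPU.
The Linux scheduler also supports natural CPU affinity: it tries to keep a process on the same CPU, so processes do not usually migrate between processors frequently, and fewer migrations mean less overhead.
Because the author of a program knows it better than the scheduler does, we can assign CPU cores to it by hand, so that it does not put too much load on CPU0 and so that a critical process is not crowded together with many others. Setting CPU affinity can therefore improve the performance of some programs.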
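1.2 Representation
CPU affinity is expressed as a bitmask. Each bit represents one CPU, and a bit set to 1 means "bound".
The lowest bit represents the first logical CPU and the highest bit represents the last logical CPU.
The mask is typically written in hexadecimal, for example:
0x00000001 is processor #0
0x00000003 is processors #0 and #1
0xFFFFFFFF is all processors (#0 through #31)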
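To make the bit layout concrete, here is a small sketch that turns a list of logical CPU numbers into such a mask. The helper name mask_from_cpus() is hypothetical, and the sketch assumes at most 64 CPUs so that the mask fits in a uint64_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Set bit n for every logical CPU n in the list; bit 0 is the first CPU.
 * CPU numbers outside 0..63 are ignored in this sketch. */
static uint64_t mask_from_cpus(const int *cpus, size_t n)
{
    uint64_t mask = 0;
    for (size_t i = 0; i < n; i++)
        if (cpus[i] >= 0 && cpus[i] < 64)
            mask |= (uint64_t)1 << cpus[i];
    return mask;
}
```

A list containing only CPU 0 gives 0x1, and a list containing CPUs 0 and 1 gives 0x3, matching the first two masks listed above.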
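2. The taskset command
The taskset command gets or sets CPU affinity.
# Command-line forms
taskset [options] mask command [arg]...
taskset [options] -p [mask] pid
PARAMETER
mask : the CPU affinity. Without the -c option the value is hexadecimal, whether or not it carries a 0x prefix;
       with the -c option it is a list of decimal CPU numbers.
command : a command or executable program
arg : arguments of command
pid : process ID, which can be obtained with ps, top, pidof and similar commands
OPTIONS
-a, --all-tasks (not available in older versions)
    This option relates to the Linux TID concept: it applies the CPU affinity setting to every TID in the process.
    The TID is the thread ID. It is entirely different from the thread ID that POSIX represents with pthread_t.
    A thread created by the POSIX thread library on Linux is really a process (LWP), and the TID is that thread's real PID.
-p, --pid
    Operate on an existing PID instead of launching a new program.
-c, --cpu-list
    Specify the CPU affinity as a list of CPU numbers instead of a bitmask, for example 0,5,7,9-11.
-h, --help
    Display usage information and exit.
-V, --version
    Output version information and exit.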
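The relationship between the two mask notations can be illustrated by converting a -c style list into the equivalent bitmask. This is a simplified sketch, not taskset's actual parser: cpu_list_to_mask() is a hypothetical helper that handles only comma-separated numbers and a-b ranges for CPUs 0 to 63.

```c
#include <stdint.h>
#include <stdlib.h>

/* Convert a list such as "0,5,7,9-11" into a bitmask.
 * Returns 0 on a malformed list or a CPU number outside 0..63. */
static uint64_t cpu_list_to_mask(const char *list)
{
    uint64_t mask = 0;
    const char *p = list;

    while (*p != '\0') {
        char *end;
        long first = strtol(p, &end, 10);
        if (end == p || first < 0 || first > 63)
            return 0;
        long last = first;
        if (*end == '-') {               /* a range "first-last" */
            p = end + 1;
            last = strtol(p, &end, 10);
            if (end == p || last < first || last > 63)
                return 0;
        }
        for (long cpu = first; cpu <= last; cpu++)
            mask |= (uint64_t)1 << cpu;
        if (*end == ',')
            end++;                       /* skip the separator */
        else if (*end != '\0')
            return 0;                    /* unexpected character */
        p = end;
    }
    return mask;
}
```

With this sketch, the list 0,5,7,9-11 maps to the mask 0xea1, so "taskset -c 0,5,7,9-11" and "taskset 0xea1" describe the same set of CPUs.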
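USAGE
1) Run a new program with a given CPU affinity
taskset [-c] mask command [arg]...
Example: run ls on CPU0 to list everything under /etc/init.d
taskset -c 0 ls -al /etc/init.d/
2) Show the CPU affinity of a running process
taskset -p pid
Example: show the CPU affinity of the init process (PID=1)
taskset -p 1
3) Change the CPU affinity of a running process
taskset -p[c] mask pid
Example: open two terminals, run top in the first one, and in the second one
first run:   [~]# ps -eo pid,args,psr | grep top   # get the PID of top and the CPU it is running on
then run:    [~]# taskset -cp new_cpu pid          # change the CPU that top runs on
finally run: [~]# ps -eo pid,args,psr | grep top   # check whether the change took effect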
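PERMISSIONS
A user can set the CPU affinity of a process that the user owns. Trying to set it for another user's process fails with "Operation not permitted". The root user is not subject to this restriction.
Any user can read the CPU affinity of any process.
The taskset command is implemented with the sched_getaffinity() and sched_setaffinity() interfaces; after reading section 3 you should be able to write your own taskset.
If you are interested, have a look at its source code: ftp://ftp.kernel.org/pub/linux/utils/util-linux/vX.YZ/util-linux-X.YZ-xxx.tar.gz /schedutils/taskset.c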
3. Programming API
The following APIs are used to set and get CPU affinity.
#define _GNU_SOURCE
#include <sched.h>
#include <pthread.h> // for the pthread functions (the last 4); note that <pthread.h> includes <sched.h>

/* MACRO */
/* The following macros are provided to operate on the CPU set set */

/* Clears set, so that it contains no CPUs */
void CPU_ZERO(cpu_set_t *set);
void CPU_ZERO_S(size_t setsize, cpu_set_t *set);

/* Add CPU cpu to set */
void CPU_SET(int cpu, cpu_set_t *set);
void CPU_SET_S(int cpu, size_t setsize, cpu_set_t *set);

/* Remove CPU cpu from set */
void CPU_CLR(int cpu, cpu_set_t *set);
void CPU_CLR_S(int cpu, size_t setsize, cpu_set_t *set);

/* Test to see if CPU cpu is a member of set */
int CPU_ISSET(int cpu, cpu_set_t *set);
int CPU_ISSET_S(int cpu, size_t setsize, cpu_set_t *set);

/* Return the number of CPUs in set */
int CPU_COUNT(cpu_set_t *set);
int CPU_COUNT_S(size_t setsize, cpu_set_t *set);

/* The following macros perform logical operations on CPU sets */

/* Store the logical AND of the sets srcset1 and srcset2 in destset (which may be one of the source sets). */
void CPU_AND(cpu_set_t *destset, cpu_set_t *srcset1, cpu_set_t *srcset2);
void CPU_AND_S(size_t setsize, cpu_set_t *destset, cpu_set_t *srcset1, cpu_set_t *srcset2);

/* Store the logical OR of the sets srcset1 and srcset2 in destset (which may be one of the source sets). */
void CPU_OR(cpu_set_t *destset, cpu_set_t *srcset1, cpu_set_t *srcset2);
void CPU_OR_S(size_t setsize, cpu_set_t *destset, cpu_set_t *srcset1, cpu_set_t *srcset2);

/* Store the logical XOR of the sets srcset1 and srcset2 in destset (which may be one of the source sets). */
void CPU_XOR(cpu_set_t *destset, cpu_set_t *srcset1, cpu_set_t *srcset2);
void CPU_XOR_S(size_t setsize, cpu_set_t *destset, cpu_set_t *srcset1, cpu_set_t *srcset2);

/* Test whether two CPU set contain exactly the same CPUs. */
int CPU_EQUAL(cpu_set_t *set1, cpu_set_t *set2);
int CPU_EQUAL_S(size_t setsize, cpu_set_t *set1, cpu_set_t *set2);

/* The following macros are used to allocate and deallocate CPU sets: */

/* Allocate a CPU set large enough to hold CPUs in the range 0 to num_cpus-1 */
cpu_set_t *CPU_ALLOC(int num_cpus);

/* Return the size in bytes of the CPU set that would be needed to hold CPUs in the range 0 to num_cpus-1.
   This macro provides the value that can be used for the setsize argument in the CPU_*_S() macros */
size_t CPU_ALLOC_SIZE(int num_cpus);

/* Free a CPU set previously allocated by CPU_ALLOC(). */
void CPU_FREE(cpu_set_t *set);

/* API */

/* Set the CPU affinity for a task */
int sched_setaffinity(pid_t pid, size_t cpusetsize, const cpu_set_t *mask);

/* Get the CPU affinity for a task */
int sched_getaffinity(pid_t pid, size_t cpusetsize, cpu_set_t *mask);

/* set CPU affinity attribute in thread attributes object */
int pthread_attr_setaffinity_np(pthread_attr_t *attr, size_t cpusetsize, const cpu_set_t *cpuset);

/* get CPU affinity attribute in thread attributes object */
int pthread_attr_getaffinity_np(const pthread_attr_t *attr, size_t cpusetsize, cpu_set_t *cpuset);

/* set CPU affinity of a thread */
int pthread_setaffinity_np(pthread_t thread, size_t cpusetsize, const cpu_set_t *cpuset);

/* get CPU affinity of a thread */
int pthread_getaffinity_np(pthread_t thread, size_t cpusetsize, cpu_set_t *cpuset);
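The macros come in two forms, with and without the _S suffix. As the declarations show, the _S macros take one extra parameter, setsize.
Functionally, the _S macros operate on dynamically allocated CPU sets, that is, sets allocated with the CPU_ALLOC macro.
The setsize argument can be obtained with the CPU_ALLOC_SIZE macro. The examples below show how both are used.
There are only 6 API functions. The first 2 set and get the CPU affinity of a process; note that when their first argument, pid, is 0, the call applies to the calling thread.
The last 4 set and get the CPU affinity of threads. sched_setaffinity() can in fact also set the CPU affinity of a single thread when it is given a TID, which is the TID concept mentioned under taskset's -a option.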
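The counting and logical macros listed above can be tried on hand-built sets before touching real affinity masks. A minimal sketch using only the non-_S macros; fill_set(), count_common_cpus() and count_allowed_cpus() are hypothetical helper names chosen for this illustration.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stddef.h>

/* Clear the set, then add every CPU number from the array. */
static void fill_set(cpu_set_t *set, const int *cpus, size_t n)
{
    CPU_ZERO(set);
    for (size_t i = 0; i < n; i++)
        CPU_SET(cpus[i], set);
}

/* Number of CPUs that appear in both lists (CPU_AND, then CPU_COUNT). */
static int count_common_cpus(const int *a, size_t na,
                             const int *b, size_t nb)
{
    cpu_set_t sa, sb, both;
    fill_set(&sa, a, na);
    fill_set(&sb, b, nb);
    CPU_AND(&both, &sa, &sb);
    return CPU_COUNT(&both);
}

/* Number of CPUs the calling thread may run on, or -1 on error. */
static int count_allowed_cpus(void)
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    if (sched_getaffinity(0, sizeof(mask), &mask) == -1)
        return -1;
    return CPU_COUNT(&mask);
}
```

For the lists {0, 1, 2} and {1, 2, 3}, count_common_cpus() reports 2, because only CPUs 1 and 2 are in both sets.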
3.1 Example 1: getting the CPU affinity of the current process in two ways (macros without and with the _S suffix)
#define _GNU_SOURCE
#include <sched.h>
#include <unistd.h> /* sysconf */
#include <stdlib.h> /* exit */
#include <stdio.h>

int main(void)
{
    int i, nrcpus;
    cpu_set_t mask;
    unsigned long bitmask = 0;

    CPU_ZERO(&mask);

    /* Get the CPU affinity for a pid */
    if (sched_getaffinity(0, sizeof(cpu_set_t), &mask) == -1) {
        perror("sched_getaffinity");
        exit(EXIT_FAILURE);
    }

    /* get logical cpu number */
    nrcpus = sysconf(_SC_NPROCESSORS_CONF);

    for (i = 0; i < nrcpus; i++) {
        if (CPU_ISSET(i, &mask)) {
            bitmask |= (unsigned long)0x01 << i;
            printf("processor #%d is set\n", i);
        }
    }
    printf("bitmask = %#lx\n", bitmask);

    exit(EXIT_SUCCESS);
}
/*----------------------------------------------------------------*/
#define _GNU_SOURCE
#include <sched.h>
#include <unistd.h> /* sysconf */
#include <stdlib.h> /* exit */
#include <stdio.h>

int main(void)
{
    int i, nrcpus;
    cpu_set_t *pmask;
    size_t cpusize;
    unsigned long bitmask = 0;

    /* get logical cpu number */
    nrcpus = sysconf(_SC_NPROCESSORS_CONF);

    pmask = CPU_ALLOC(nrcpus);
    if (pmask == NULL) { /* allocation can fail */
        perror("CPU_ALLOC");
        exit(EXIT_FAILURE);
    }
    cpusize = CPU_ALLOC_SIZE(nrcpus);
    CPU_ZERO_S(cpusize, pmask);

    /* Get the CPU affinity for a pid */
    if (sched_getaffinity(0, cpusize, pmask) == -1) {
        perror("sched_getaffinity");
        CPU_FREE(pmask);
        exit(EXIT_FAILURE);
    }

    for (i = 0; i < nrcpus; i++) {
        if (CPU_ISSET_S(i, cpusize, pmask)) {
            bitmask |= (unsigned long)0x01 << i;
            printf("processor #%d is set\n", i);
        }
    }
    printf("bitmask = %#lx\n", bitmask);

    CPU_FREE(pmask);
    exit(EXIT_SUCCESS);
}
Output (on a 4-core CPU):
[cpu_affinity #1]$ gcc -g -Wall cpu_affinity.c
[cpu_affinity #2]$ taskset 1 ./a.out
processor #0 is set
bitmask = 0x1
[cpu_affinity #3]$ taskset 1 ./a.out
processor #0 is set
bitmask = 0x1
[cpu_affinity #4]$ taskset 2 ./a.out
processor #1 is set
bitmask = 0x2
[cpu_affinity #5]$ taskset 3 ./a.out
processor #0 is set
processor #1 is set
bitmask = 0x3
[cpu_affinity #6]$ taskset 4 ./a.out
processor #2 is set
bitmask = 0x4
[cpu_affinity #7]$ taskset 5 ./a.out
processor #0 is set
processor #2 is set
bitmask = 0x5
[cpu_affinity #8]$ taskset 6 ./a.out
processor #1 is set
processor #2 is set
bitmask = 0x6
[cpu_affinity #9]$ taskset 7 ./a.out
processor #0 is set
processor #1 is set
processor #2 is set
bitmask = 0x7
[cpu_affinity #10]$ taskset 8 ./a.out
processor #3 is set
bitmask = 0x8
[cpu_affinity #11]$ taskset 9 ./a.out
processor #0 is set
processor #3 is set
bitmask = 0x9
[cpu_affinity #12]$ taskset A ./a.out
processor #1 is set
processor #3 is set
bitmask = 0xa
[cpu_affinity #13]$ taskset B ./a.out
processor #0 is set
processor #1 is set
processor #3 is set
bitmask = 0xb
[cpu_affinity #14]$ taskset C ./a.out
processor #2 is set
processor #3 is set
bitmask = 0xc
[cpu_affinity #15]$ taskset D ./a.out
processor #0 is set
processor #2 is set
processor #3 is set
bitmask = 0xd
[cpu_affinity #16]$ taskset E ./a.out
processor #1 is set
processor #2 is set
processor #3 is set
bitmask = 0xe
[cpu_affinity #17]$ taskset F ./a.out
processor #0 is set
processor #1 is set
processor #2 is set
processor #3 is set
bitmask = 0xf
[cpu_affinity #18]$ taskset 0 ./a.out
sched_setaffinity: Invalid argument
failed to set pid 0's affinity.
3.2 Example 2: setting the CPU affinity of a process, then reading it back and printing it
#define _GNU_SOURCE
#include <sched.h>
#include <unistd.h> /* sysconf */
#include <stdlib.h> /* exit */
#include <stdio.h>

int main(void)
{
    int i, nrcpus;
    cpu_set_t mask;
    unsigned long bitmask = 0;

    CPU_ZERO(&mask);
    CPU_SET(0, &mask); /* add CPU0 to cpu set */
    CPU_SET(2, &mask); /* add CPU2 to cpu set */

    /* Set the CPU affinity for a pid */
    if (sched_setaffinity(0, sizeof(cpu_set_t), &mask) == -1) {
        perror("sched_setaffinity");
        exit(EXIT_FAILURE);
    }

    CPU_ZERO(&mask);

    /* Get the CPU affinity for a pid */
    if (sched_getaffinity(0, sizeof(cpu_set_t), &mask) == -1) {
        perror("sched_getaffinity");
        exit(EXIT_FAILURE);
    }

    /* get logical cpu number */
    nrcpus = sysconf(_SC_NPROCESSORS_CONF);

    for (i = 0; i < nrcpus; i++) {
        if (CPU_ISSET(i, &mask)) {
            bitmask |= (unsigned long)0x01 << i;
            printf("processor #%d is set\n", i);
        }
    }
    printf("bitmask = %#lx\n", bitmask);

    exit(EXIT_SUCCESS);
}
3.3 Example 3: setting the CPU affinity of a thread, then reading it back and printing it
This example comes from the Linux man page.
#define _GNU_SOURCE
#include <pthread.h> // no need to include <sched.h> separately
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>

#define handle_error_en(en, msg) \
        do { errno = en; perror(msg); exit(EXIT_FAILURE); } while (0)

int main(int argc, char *argv[])
{
    int s, j;
    cpu_set_t cpuset;
    pthread_t thread;

    thread = pthread_self();

    /* Set affinity mask to include CPUs 0 to 7 */
    CPU_ZERO(&cpuset);
    for (j = 0; j < 8; j++)
        CPU_SET(j, &cpuset);

    s = pthread_setaffinity_np(thread, sizeof(cpu_set_t), &cpuset);
    if (s != 0) {
        handle_error_en(s, "pthread_setaffinity_np");
    }

    /* Check the actual affinity mask assigned to the thread */
    s = pthread_getaffinity_np(thread, sizeof(cpu_set_t), &cpuset);
    if (s != 0) {
        handle_error_en(s, "pthread_getaffinity_np");
    }

    printf("Set returned by pthread_getaffinity_np() contained:\n");
    for (j = 0; j < CPU_SETSIZE; j++) // CPU_SETSIZE is a macro defined in <sched.h>, usually 1024
    {
        if (CPU_ISSET(j, &cpuset)) {
            printf(" CPU %d\n", j);
        }
    }

    exit(EXIT_SUCCESS);
}
3.4 Example 4: using sched_setaffinity to set the CPU affinity of a thread
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>       /* perror */
#include <stdlib.h>      /* exit */
#include <unistd.h>      /* syscall */
#include <sys/types.h>   /* pid_t */
#include <sys/syscall.h> /* __NR_gettid, SYS_gettid */

int main(void)
{
    pid_t tid;
    cpu_set_t mask;

    CPU_ZERO(&mask);
    CPU_SET(0, &mask); /* add CPU0 to cpu set */
    CPU_SET(2, &mask); /* add CPU2 to cpu set */

    // get the tid (the thread's PID; a thread is a lightweight process, so it is essentially a process)
    tid = syscall(__NR_gettid); // or syscall(SYS_gettid);

    /* Set the CPU affinity for a pid */
    if (sched_setaffinity(tid, sizeof(cpu_set_t), &mask) == -1) {
        perror("sched_setaffinity");
        exit(EXIT_FAILURE);
    }

    exit(EXIT_SUCCESS);
}
----------------------------------------------------------------------------------------------------------------
References:
- http://www.yboren.com/posts/44412.html?utm_source=tuicool
- http://www.ibm.com/developerworks/cn/linux/l-affinity.html
- http://saplingidea.iteye.com/blog/633616
- http://blog.csdn.net/ttyttytty12/article/details/11726569
- https://en.wikipedia.org/wiki/Processor_affinity
- http://blog.chinaunix.net/uid-23622436-id-3311579.html
- http://www.cnblogs.com/emanlee/p/3587571.html
- http://blog.chinaunix.net/uid-26651253-id-3342161.html
- http://blog.csdn.net/delphiwcdj/article/details/8476547
- http://www.man7.org/linux/man-pages/man3/pthread_setaffinity_np.3.html
- http://www.man7.org/linux/man-pages/man3/pthread_attr_setaffinity_np.3.html
- man CPU_SET taskset