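Anyone who has reverse-engineered ELF binaries knows that GCC's stack canary is read from fs:0x28 on x86_64 and from gs:0x14 on 32-bit x86. Far fewer people know how the canary is generated, or why it is read from that location.
The analysis below walks through the glibc source, using x86_64 as the example platform.
First question: why %fs:0x28? glibc uses the %fs register to point at the thread's TLS information. The TLS header structure looks like this: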
typedef struct
{
  void *tcb;                /* Pointer to the TCB.  Not necessarily the
                               thread descriptor used by libpthread.  */
  dtv_t *dtv;
  void *self;               /* Pointer to the thread descriptor.  */
  int multiple_threads;
  int gscope_flag;
  uintptr_t sysinfo;
  uintptr_t stack_guard;    /* the canary, at offset 0x28 */
  uintptr_t pointer_guard;
  ……
} tcbhead_t;
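So %fs:0x28 actually reads the stack_guard field of the current thread's control block, and that field is already fixed by the time the thread has been created. Now for the second question: how is stack_guard assigned?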
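After the Linux loader has finished loading the ELF image, it sets the entry point to _start and places the arguments for _start on the stack. The code for _start lives in sysdeps/x86_64/start.S.
_start fetches its arguments from the stack and then calls __libc_start_main(), which likewise runs before main():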
ENTRY (_start)
    /* Clearing frame pointer is insufficient, use CFI. */
    cfi_undefined (rip)
    /* Clear the frame pointer. The ABI suggests this be done, to mark
       the outermost frame obviously. */
    xorl %ebp, %ebp

    /* Extract the arguments as encoded on the stack and set up
       the arguments for __libc_start_main (int (*main) (int, char **, char **),
           int argc, char *argv,
           void (*init) (void), void (*fini) (void),
           void (*rtld_fini) (void), void *stack_end).
       The arguments are passed via registers and on the stack:
    main: %rdi
    argc: %rsi
    argv: %rdx
    init: %rcx
    fini: %r8
    rtld_fini: %r9
    stack_end: stack. */
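__libc_start_main() first derives the canary from the global variable _dl_random, then stores it in the TLS stack_guard field through the THREAD_SET_STACK_GUARD macro. In glibc 2.22 this code belongs to the statically linked build of __libc_start_main(); for dynamically linked programs the dynamic linker performs the equivalent setup.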
  /* Set up the stack checker's canary. */
  uintptr_t stack_chk_guard = _dl_setup_stack_chk_guard (_dl_random);
# ifdef THREAD_SET_STACK_GUARD
  THREAD_SET_STACK_GUARD (stack_chk_guard);
# else
  __stack_chk_guard = stack_chk_guard;
# endif
Where does _dl_random come from? It is set in two places in the glibc source, with essentially the same implementation:
/* Random data provided by the kernel. */
void *_dl_random;

      case AT_RANDOM:
        _dl_random = (void *) av->a_un.a_val;
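Note the av variable: tracing it back shows that it ultimately derives from the argv argument of __libc_start_main(), because the auxiliary vector sits on the stack right after the environment pointers, which in turn follow argv. In other words, _dl_random is supplied by the loader. The AT_RANDOM entry means that the kernel has placed random bytes in the process's address space for canary generation; the entry's value is the address of those bytes. It can be inspected with the following command:
kiiim@ubuntu :~/glibc-2.22$ LD_SHOW_AUXV=1 /bin/true | grep AT_RANDOM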
AT_RANDOM: 0x7fffdaf776e9
Now let's look at the actual code to see what this kernel interface is and how the canary's value is obtained.
rand_size = CONFIG_SECURITY_AUXV_RANDOM_SIZE * sizeof(unsigned long);
u_rand_bytes = NULL;
if (rand_size) {
    unsigned char k_rand_bytes[CONFIG_SECURITY_AUXV_RANDOM_SIZE * sizeof(unsigned long)];
    get_random_bytes(k_rand_bytes, rand_size);
    u_rand_bytes = (elf_addr_t __user *)STACK_ALLOC(p, rand_size);
    if (__copy_to_user(u_rand_bytes, k_rand_bytes, rand_size))
        return -EFAULT;
}
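The bytes are generated in the kernel by get_random_bytes() and copied to user space with __copy_to_user(). get_random_bytes() is also the recommended way to obtain secure random numbers inside the kernel. Its implementation is shown below: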
http://lxr.free-electrons.com/source/drivers/char/random.c
void get_random_bytes(void *buf, int nbytes)
{
#if DEBUG_RANDOM_BOOT > 0
    if (unlikely(nonblocking_pool.initialized == 0))
        printk(KERN_NOTICE "random: %pF get_random_bytes called "
               "with %d bits of entropy available\n",
               (void *) _RET_IP_,
               nonblocking_pool.entropy_total);
#endif
    trace_get_random_bytes(nbytes, _RET_IP_);
    extract_entropy(&nonblocking_pool, buf, nbytes, 0, 0);
}
EXPORT_SYMBOL(get_random_bytes);
For comparison, here is the kernel implementation behind a read of /dev/urandom:
static ssize_t
urandom_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos)
{
    int ret;
    if (unlikely(nonblocking_pool.initialized == 0))
        printk_once(KERN_NOTICE "random: %s urandom read "
                    "with %d bits of entropy available\n",
                    current->comm, nonblocking_pool.entropy_total);
    nbytes = min_t(size_t, nbytes, INT_MAX >> (ENTROPY_SHIFT + 3));
    ret = extract_entropy_user(&nonblocking_pool, buf, nbytes);
    trace_urandom_read(8 * nbytes, ENTROPY_BITS(&nonblocking_pool),
                       ENTROPY_BITS(&input_pool));
    return ret;
}
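get_random_bytes() and a read of /dev/urandom are therefore implemented in essentially the same way: both draw random data from the "entropy pool" (here nonblocking_pool) through the extract_entropy* functions. The only difference is that one is used in kernel space and returns its result in a kernel buffer, while the other serves user space and returns its result in a user buffer.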
Next, let's see how a program uses the canary. Consider the function a():
void a() {
    int a = 3;
    char str[16];
}
The disassembly on x86_64 is as follows:
(gdb) disass a
Dump of assembler code for function a:
0x000000000040055d <+0>: push %rbp
0x000000000040055e <+1>: mov %rsp,%rbp
0x0000000000400561 <+4>: sub $0x30,%rsp
0x0000000000400565 <+8>: mov %fs:0x28,%rax
0x000000000040056e <+17>: mov %rax,-0x8(%rbp)
0x0000000000400572 <+21>: xor %eax,%eax
0x0000000000400574 <+23>: movl $0x3,-0x24(%rbp) ; variables reordered: a sits at a lower address than str
0x000000000040057b <+30>: mov -0x8(%rbp),%rax
0x000000000040057f <+34>: xor %fs:0x28,%rax
0x0000000000400588 <+43>: je 0x40058f <a+50>
0x000000000040058a <+45>: callq 0x400440 <__stack_chk_fail@plt>
0x000000000040058f <+50>: leaveq
0x0000000000400590 <+51>: retq
End of assembler dump.
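As the listing shows, GCC's stack protector also reorders local variables. Unlike Microsoft's implementation, however, GCC does not XOR the canary with ebp after fetching it; the value is stored on the stack unchanged. Consequently every canary within the same thread has the same value, which debugging confirms: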
Breakpoint 1, 0x000000000040056e in a () at 1.c:4
4 void a() {
(gdb) p/x $rax
$1 = 0xc609d364696f6000
(gdb) c
Continuing.
Breakpoint 2, 0x00000000004005a2 in b () at 1.c:9
9 void b() {
(gdb) p/x $rax
$2 = 0xc609d364696f6000
(gdb) c
Continuing.
Breakpoint 1, 0x000000000040056e in a () at 1.c:4
4 void a() {
(gdb) p/x $rax
$3 = 0xc609d364696f6000
(gdb)