Linux ELF Format Analysis


http://www.cnblogs.com/hzl6255/p/3312262.html

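ELF (Executable and Linking Format) is a standard file format for executables, object files, shared libraries, and core dumps. It was developed and published by UNIX System Laboratories as part of the ABI (Application Binary Interface).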

Some related history:
- UNIX: originally used the a.out format, which was replaced by COFF in System V and finally by ELF in SVR4.
- Windows: uses PE, a variant of the COFF format.
- Mac OS X: uses the Mach-O format.

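There are four types of ELF file:
1. Relocatable file (Relocatable): a .o file produced by the compiler and assembler; it needs further processing by the linker
2. Executable file (Executable): all relocations are done and all symbols are resolved, except perhaps shared library symbols that must be resolved at run time
3. Shared object file (Shared Object): a dynamic library (.so)
4. Core dump file (Core File)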

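1. ELF File Structure

The structure of an ELF file can be described from two points of view:
1. Compilers, assemblers, linkers: the file consists of sections, described by the section header table
2. System loader: the file consists of segments, described by the program header table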

[Figure: ELF_struct]

TIP:
- A single segment usually consists of several sections.
- Relocatable files need a section header table. Executable files need a program header table, and usually keep a section header table as well (the a.out examined below has both). Shared object files have both.
- Sections are intended for further processing by a linker, while segments are intended to be mapped into memory.
- Only the ELF header has a fixed position, at the start of the file. The positions of the program header table and the section header table are given by the ELF header.

ELF data representation: six data types (32-bit)

Name            Size  Alignment  Purpose
Elf32_Addr      4     4          Unsigned program address
Elf32_Off       4     4          Unsigned file offset
Elf32_Half      2     2          Unsigned medium integer
Elf32_Word      4     4          Unsigned integer
Elf32_Sword     4     4          Signed integer
unsigned char   1     1          Unsigned small integer

@1: 

ELF header: located at the start of the file, it describes the organization of the whole file and occupies 52 bytes in a 32-bit ELF file.

#define EI_NIDENT (16)
typedef struct
{
  unsigned char e_ident[EI_NIDENT];   /* Magic number and other info */
  Elf32_Half    e_type;               /* Object file type */
  Elf32_Half    e_machine;            /* Architecture */
  Elf32_Word    e_version;            /* Object file version */
  Elf32_Addr    e_entry;              /* Entry point virtual address */
  Elf32_Off     e_phoff;              /* Program header table file offset */
  Elf32_Off     e_shoff;              /* Section header table file offset */
  Elf32_Word    e_flags;              /* Processor-specific flags */
  Elf32_Half    e_ehsize;             /* ELF header size in bytes */
  Elf32_Half    e_phentsize;          /* Program header table entry size */
  Elf32_Half    e_phnum;              /* Program header table entry count */
  Elf32_Half    e_shentsize;          /* Section header table entry size */
  Elf32_Half    e_shnum;              /* Section header table entry count */
  Elf32_Half    e_shstrndx;           /* Section header string table index */
} Elf32_Ehdr;
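The 52-byte figure can be checked against the struct itself. The following is a minimal sketch, not the system's <elf.h>: the typedefs are stand-ins built on <stdint.h> that assume the sizes from the data type table above, and elf32_ehdr_size is a hypothetical helper added only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in typedefs (assumption: sizes as listed in the data type table). */
typedef uint32_t Elf32_Addr;
typedef uint32_t Elf32_Off;
typedef uint16_t Elf32_Half;
typedef uint32_t Elf32_Word;

#define EI_NIDENT 16

typedef struct {
    unsigned char e_ident[EI_NIDENT];
    Elf32_Half    e_type;
    Elf32_Half    e_machine;
    Elf32_Word    e_version;
    Elf32_Addr    e_entry;
    Elf32_Off     e_phoff;
    Elf32_Off     e_shoff;
    Elf32_Word    e_flags;
    Elf32_Half    e_ehsize;
    Elf32_Half    e_phentsize;
    Elf32_Half    e_phnum;
    Elf32_Half    e_shentsize;
    Elf32_Half    e_shnum;
    Elf32_Half    e_shstrndx;
} Elf32_Ehdr;

/* Compile-time checks: 16 identification bytes plus 36 bytes of fields. */
static_assert(sizeof(Elf32_Ehdr) == 52, "ELF32 header should be 52 bytes");
static_assert(offsetof(Elf32_Ehdr, e_entry) == 24, "e_entry follows e_version");
static_assert(offsetof(Elf32_Ehdr, e_phoff) == 28, "e_phoff follows e_entry");
static_assert(offsetof(Elf32_Ehdr, e_shoff) == 32, "e_shoff follows e_phoff");

/* Hypothetical helper: exposes the header size at run time. */
size_t elf32_ehdr_size(void)
{
    return sizeof(Elf32_Ehdr);
}
```

Because every member is naturally aligned, the compiler inserts no padding, which is why the in-memory struct and the on-disk header have the same size here.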

Let's look at a very basic ELF header:

[root@bogon ~]# readelf -h a.out 
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Intel 80386
  Version:                           0x1
  Entry point address:               0x80482a0                 /* e_entry */
  Start of program headers:          52 (bytes into file)      /* e_phoff */
  Start of section headers:          1992 (bytes into file)    /* e_shoff: See Starting address of section headers */
  Flags:                             0x0
  Size of this header:               52 (bytes)                /* e_ehsize */
  Size of program headers:           32 (bytes)                /* e_phentsize */
  Number of program headers:         8                         /* e_phnum */
  Size of section headers:           40 (bytes)                /* e_shentsize */
  Number of section headers:         29                        /* e_shnum */
  Section header string table index: 26                        /* e_shstrndx */

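From this ELF header we can read the following:
- a.out is a 32-bit, little-endian executable for the Intel 80386
- execution starts at virtual address 0x80482a0
- the program header table starts at file offset 52, right after the 52-byte ELF header, and has 8 entries of 32 bytes each
- the section header table starts at file offset 1992 and has 29 entries of 40 bytes each
- section 26 holds the section name string table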
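The first lines of the readelf output (Magic, Class, Data, Version) all come from the e_ident bytes. As a rough sketch of how those bytes can be interpreted, the function below decodes the first seven of them. IdentInfo and decode_ident are hypothetical names introduced for this example; the byte positions (4 for class, 5 for data encoding, 6 for version) follow the ELF specification.

```c
#include <assert.h>
#include <string.h>

enum { IDENT_SIZE = 16 };

/* Hypothetical result type for this sketch. */
typedef struct {
    int is_elf;         /* 1 if the four magic bytes match */
    int is_64bit;       /* class byte: 1 = ELF32, 2 = ELF64 */
    int is_big_endian;  /* data byte: 1 = little endian, 2 = big endian */
    int version;        /* version byte, 1 = current */
} IdentInfo;

IdentInfo decode_ident(const unsigned char ident[IDENT_SIZE])
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };
    IdentInfo info = { 0, 0, 0, 0 };

    assert(ident != NULL);

    /* Anything that does not start with 7f 45 4c 46 is not an ELF file. */
    if (memcmp(ident, magic, sizeof magic) != 0)
        return info;

    info.is_elf = 1;
    info.is_64bit = (ident[4] == 2);
    info.is_big_endian = (ident[5] == 2);
    info.version = ident[6];
    return info;
}
```

Fed the Magic bytes shown above (7f 45 4c 46 01 01 01 ...), this reports a 32-bit, little-endian file with version 1, in line with the Class, Data and Version lines of the readelf output.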

@2:

Section header: contains the information that describes one section.

Each section header occupies 40 bytes (the value of e_shentsize)

/* Section header.  */
typedef struct
{
  Elf32_Word    sh_name;        /* Section name (string tbl index) */
  Elf32_Word    sh_type;        /* Section type */
  Elf32_Word    sh_flags;       /* Section flags */
  Elf32_Addr    sh_addr;        /* Section virtual addr at execution */
  Elf32_Off     sh_offset;      /* Section file offset */
  Elf32_Word    sh_size;        /* Section size in bytes */
  Elf32_Word    sh_link;        /* Link to another section */
  Elf32_Word    sh_info;        /* Additional section information */
  Elf32_Word    sh_addralign;   /* Section alignment */
  Elf32_Word    sh_entsize;     /* Entry size if section holds table */
} Elf32_Shdr;
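Since every entry of a header table has the same size, the file offset of entry i is simply the table offset plus i times the entry size. The sketch below works this out with the numbers from the readelf -h output above; table_entry_offset and check_readelf_values are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* File offset of entry `index` in a table that starts at `table_off`
 * and whose entries are `entsize` bytes each. */
uint32_t table_entry_offset(uint32_t table_off, uint16_t entsize, uint16_t index)
{
    return table_off + (uint32_t)entsize * index;
}

/* Worked values taken from the a.out example in this article. */
void check_readelf_values(void)
{
    /* Section header 26 (e_shstrndx): 1992 + 26 * 40 = 3032. */
    assert(table_entry_offset(1992, 40, 26) == 3032);

    /* One past the last of the 29 section headers: 1992 + 29 * 40 = 0xc50. */
    assert(table_entry_offset(1992, 40, 29) == 0xc50);

    /* One past the last of the 8 program headers: 52 + 8 * 32 = 0x134. */
    assert(table_entry_offset(52, 32, 8) == 0x134);
}
```

The two end offsets line up with the rest of the example: 0x134 is the file offset of .interp, the first section after the program header table, and 0xc50 is the file offset of .symtab, which follows the section header table.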

Section Type(*sh_type*) 

PROGBITS:           This holds program contents including code, data, and debugger information.
NOBITS:             Like PROGBITS, but it occupies no space in the file.
SYMTAB and DYNSYM:  These hold symbol tables.
STRTAB:             This is a string table, like the one used in a.out.
REL and RELA:       These hold relocation information.
DYNAMIC and HASH:   These hold information related to dynamic linking.

Some common sections are listed below:

.text:  (PROGBITS:ALLOC+EXECINSTR)
     Executable code
.data:  (PROGBITS:ALLOC+WRITE)
     Initialized data
.rodata:(PROGBITS:ALLOC)
     Read-only data
.bss:   (NOBITS:ALLOC+WRITE)
     Uninitialized data, set to 0 at run time
.rel.text, .rel.data, and .rel.rodata:(REL)
     Relocation information for static linking
.rel.plt: (REL)
     The list of PLT elements that are subject to relocation during dynamic linking (if the PLT is used)
.rel.dyn: (REL)
     The relocations for dynamically linked symbols that do not go through the PLT
.symtab:
     Symbol table
.strtab:
     String table
.shstrtab:
     Section string table, holding the section names
.init, .fini: (PROGBITS:ALLOC+EXECINSTR)
     Program initialization and termination code
.interp: (PROGBITS:ALLOC)
     This section holds the path name of a program interpreter. At present it is used to run the run-time dynamic linker, which loads the program and links in any required shared libraries.
.got, .plt: (PROGBITS)
     The Global Offset Table and the Procedure Linkage Table (jump table) used for dynamic linking.
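The attribute names in the list above (WRITE, ALLOC, EXECINSTR) correspond to bits in sh_flags. As a small sketch of how those bits map to the W, A and X letters that readelf prints, assuming the standard values SHF_WRITE = 0x1, SHF_ALLOC = 0x2 and SHF_EXECINSTR = 0x4 (section_flags_to_string is a hypothetical helper, and other flag bits are ignored here):

```c
#include <assert.h>
#include <stdint.h>

#define SHF_WRITE     0x1u  /* section is writable at run time */
#define SHF_ALLOC     0x2u  /* section occupies memory at run time */
#define SHF_EXECINSTR 0x4u  /* section contains machine instructions */

/* Writes the letters for the three basic flags into `out`, in the
 * order W, A, X, followed by a terminating NUL. */
void section_flags_to_string(uint32_t sh_flags, char out[4])
{
    int n = 0;

    assert(out != 0);

    if (sh_flags & SHF_WRITE)
        out[n++] = 'W';
    if (sh_flags & SHF_ALLOC)
        out[n++] = 'A';
    if (sh_flags & SHF_EXECINSTR)
        out[n++] = 'X';
    out[n] = '\0';
}
```

With this mapping, .text (ALLOC+EXECINSTR) has flags 0x6 and yields "AX", while .data and .bss (ALLOC+WRITE) have flags 0x3 and yield "WA".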

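TIP: the difference between the symbol table (symtab) and the string table (strtab)
strtab stores the null-terminated name strings used inside the ELF file (function names, variable names, and so on)
symtab records the functions and variables themselves (the symbols); it is mainly used at link time to resolve address references between object files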

Below is a basic section header table [0x7c8 = 1992]

[root@bogon ~]# readelf -S a.out 
There are 29 section headers, starting at offset 0x7c8:
Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .interp           PROGBITS        08048134 000134 000013 00   A  0   0  1
  [ 2] .note.ABI-tag     NOTE            08048148 000148 000020 00   A  0   0  4
  [ 3] .hash             HASH            08048168 000168 000024 04   A  4   0  4
  [ 4] .dynsym           DYNSYM          0804818c 00018c 000040 10   A  5   1  4
  [ 5] .dynstr           STRTAB          080481cc 0001cc 000045 00   A  0   0  1
  [ 6] .gnu.version      VERSYM          08048212 000212 000008 02   A  4   0  2
  [ 7] .gnu.version_r    VERNEED         0804821c 00021c 000020 00   A  5   1  4
  [ 8] .rel.dyn          REL             0804823c 00023c 000008 08   A  4   0  4
  [ 9] .rel.plt          REL             08048244 000244 000010 08   A  4  11  4
  [10] .init             PROGBITS        08048254 000254 000017 00  AX  0   0  4
  [11] .plt              PROGBITS        0804826c 00026c 000030 04  AX  0   0  4
  [12] .text             PROGBITS        080482a0 0002a0 000198 00  AX  0   0 16
  [13] .fini             PROGBITS        08048438 000438 00001c 00  AX  0   0  4
  [14] .rodata           PROGBITS        08048454 000454 00000c 00   A  0   0  4
  [15] .eh_frame_hdr     PROGBITS        08048460 000460 00001c 00   A  0   0  4
  [16] .eh_frame         PROGBITS        0804847c 00047c 000058 00   A  0   0  4
  [17] .ctors            PROGBITS        080494d4 0004d4 000008 00  WA  0   0  4
  [18] .dtors            PROGBITS        080494dc 0004dc 000008 00  WA  0   0  4
  [19] .jcr              PROGBITS        080494e4 0004e4 000004 00  WA  0   0  4
  [20] .dynamic          DYNAMIC         080494e8 0004e8 0000c8 08  WA  5   0  4
  [21] .got              PROGBITS        080495b0 0005b0 000004 04  WA  0   0  4
  [22] .got.plt          PROGBITS        080495b4 0005b4 000014 04  WA  0   0  4
  [23] .data             PROGBITS        080495c8 0005c8 000004 00  WA  0   0  4
  [24] .bss              NOBITS          080495cc 0005cc 000008 00  WA  0   0  4
  [25] .comment          PROGBITS        00000000 0005cc 000114 00      0   0  1
  [26] .shstrtab         STRTAB          00000000 0006e0 0000e5 00      0   0  1
  [27] .symtab           SYMTAB          00000000 000c50 000440 10     28  49  4
  [28] .strtab           STRTAB          00000000 001090 000249 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings)
  I (info), L (link order), G (group), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)

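String table:

A string here is a null-terminated sequence of characters that represents the name of a symbol or a section; each string is referenced by its index into the table.
For the section name string table [.shstrtab], the ELF header member e_shstrndx identifies the section that holds it,
and the index of each name is stored in the sh_name member of the corresponding Elf32_Shdr.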
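In practice the index is a byte offset into the string table, so looking up a name amounts to pointer arithmetic plus a bounds check. A minimal sketch, assuming the table has already been read into memory (strtab_get is a hypothetical helper, not part of any library):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the null-terminated name that starts at byte offset `index`
 * in a string table of `size` bytes, or NULL if the index is out of
 * range or no terminator is found before the end of the table. */
const char *strtab_get(const char *strtab, size_t size, size_t index)
{
    assert(strtab != NULL);

    if (index >= size)
        return NULL;

    /* Guard against a corrupt table whose last string is unterminated. */
    if (memchr(strtab + index, '\0', size - index) == NULL)
        return NULL;

    return strtab + index;
}
```

For a table whose bytes are "\0.interp\0.text\0", index 1 gives ".interp" and index 9 gives ".text". Index 0 always refers to the empty string, which is why section 0 in the table above has no name.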


Symbol table:

Holds the information needed to locate and relocate a program's symbol definitions and references


Relocation table:


@3: 

Program header: describes how to create the process image; the program header table contains one entry for each segment

Each program header occupies 32 bytes (the value of e_phentsize)

typedef struct
{
  Elf32_Word    p_type;        /* Segment type */
  Elf32_Off     p_offset;      /* Segment file offset */
  Elf32_Addr    p_vaddr;       /* Segment virtual address */
  Elf32_Addr    p_paddr;       /* Segment physical address */
  Elf32_Word    p_filesz;      /* Segment size in file */
  Elf32_Word    p_memsz;       /* Segment size in memory */
  Elf32_Word    p_flags;       /* Segment flags */
  Elf32_Word    p_align;       /* Segment alignment */
} Elf32_Phdr;

Type of segment(*p_type*)

PT_PHDR:    Specifies the location and size of the program header table itself, both in the file and in the memory image of the program.
PT_LOAD:    This segment is a loadable segment.
PT_DYNAMIC: This array element specifies dynamic linking information.
PT_INTERP:  This element specifies the location and size of a null-terminated path name to invoke as an interpreter.

Below is an example of a program header table

[root@bogon ~]# readelf -l a.out 
Elf file type is EXEC (Executable file)
Entry point 0x80482a0
There are 8 program headers, starting at offset 52
Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR           0x000034 0x08048034 0x08048034 0x00100 0x00100 R E 0x4
  INTERP         0x000134 0x08048134 0x08048134 0x00013 0x00013 R   0x1
      [Requesting program interpreter: /lib/ld-linux.so.2]
  LOAD           0x000000 0x08048000 0x08048000 0x004d4 0x004d4 R E 0x1000
  LOAD           0x0004d4 0x080494d4 0x080494d4 0x000f8 0x00100 RW  0x1000
  DYNAMIC        0x0004e8 0x080494e8 0x080494e8 0x000c8 0x000c8 RW  0x4
  NOTE           0x000148 0x08048148 0x08048148 0x00020 0x00020 R   0x4
  GNU_EH_FRAME   0x000460 0x08048460 0x08048460 0x0001c 0x0001c R   0x4
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4
 Section to Segment mapping:
  Segment Sections...
   00     
   01     .interp 
   02     .interp .note.ABI-tag .hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame 
   03     .ctors .dtors .jcr .dynamic .got .got.plt .data .bss 
   04     .dynamic 
   05     .note.ABI-tag 
   06     .eh_frame_hdr 
   07
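The LOAD entries above tie virtual addresses to file offsets: an address inside a segment sits at p_offset + (vaddr - p_vaddr), provided it falls within the first p_filesz bytes. The sketch below applies that rule. LoadSegment is a hypothetical, trimmed-down stand-in for Elf32_Phdr, and vaddr_to_offset is an illustrative helper rather than what the kernel loader does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of Elf32_Phdr, enough for address translation. */
typedef struct {
    uint32_t p_offset;  /* segment file offset */
    uint32_t p_vaddr;   /* segment virtual address */
    uint32_t p_filesz;  /* bytes of the segment stored in the file */
    uint32_t p_memsz;   /* bytes of the segment in memory */
} LoadSegment;

/* Stores the file offset that backs `vaddr` in *off and returns 1,
 * or returns 0 if no segment has file contents for that address. */
int vaddr_to_offset(const LoadSegment *segs, size_t n,
                    uint32_t vaddr, uint32_t *off)
{
    assert(segs != NULL && off != NULL);

    for (size_t i = 0; i < n; i++) {
        if (vaddr < segs[i].p_vaddr)
            continue;
        uint32_t delta = vaddr - segs[i].p_vaddr;
        if (delta < segs[i].p_filesz) {
            *off = segs[i].p_offset + delta;
            return 1;
        }
    }
    return 0;
}
```

With the two LOAD entries from the readelf -l output, the entry point 0x80482a0 maps to file offset 0x2a0, which is where .text starts. The address of .bss, 0x80495cc, lies exactly p_filesz bytes into the second segment, so it has no file contents: the 8 bytes by which MemSiz (0x100) exceeds FileSiz (0xf8) are the .bss area that is zero-filled at load time.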

@4:

Section: holds the actual contents of the object file (instructions, data, symbol table, relocation information, and so on)

2. Analyzing ELF Files

Many tools can be used to analyze ELF files.

Besides readelf, used above, there are objdump, objcopy, and others.

# objdump -x /bin/ls                         # show all headers of the ELF file, including its sections
# objdump -j .data -s /bin/ls                # show the contents of the given section
#
# objcopy -O binary -j .text a.out text.bin  # dump the .text section into the file text.bin

A complete analysis tutorial: the ELF file chapter of <Linux C編程一站式學習> (Linux C Programming: One-Stop Learning)

3. Parsing ELF Files

ELF files are parsed in many places. This is how Linux loads an ELF file:

execve() –> sys_execve() –> do_execve() –> search_binary_handler() -elf-> load_elf_binary()/load_elf_library()

readelf in binutils shows very clearly how an ELF file is parsed

The open-source project ELFToolChain

atratus/coLinux/LINE: the ELF loaders in these projects are worth studying

4. References

RefSpecs:    Linux Foundation Referenced Specifications

SysV ABI:    System V ABI

ELF spec:    Executable and Linking Format Specification V1.2

ELF format:  ELF Format

PE format:   PE Format

