COFF - Object (Intermediate) File Format Explained


 

 

 

G Common Object File Format (COFF)

 

Overall structure

File header

Optional header

Section headers

Raw data sections

COFF relocation information

Line number information

Symbol table

Additional symbols

String table

 

This section describes the Common Object File Format, COFF, used by the linker. A COFF object file is the intermediate file that the linker combines with others to produce an executable.

 

 

Overall structure

The COFF object format is used both for object files (.o extension) and for executable files. Some of the information is present only in object files; other information is present only in executable files.

 

Table G-1   COFF file components

Section

Description

File header

Contains general information; always present.

Optional header

Contains information about an executable file; usually present only in executables.

Section header

Contains information about the different COFF sections; one for each section.

Raw data sections

One for each section containing raw data, such as machine instructions and initialized variables.

Relocation information

Contains information about unresolved references to symbols in other modules; one for each section having external references. Usually present only in object files, not in executable files.

Line number information

Contains debugging information about source line numbers; one for each section if compiled with the -g option.

Symbol table

Contains information about all the symbols in the object file; present in an executable file unless it has been stripped.

String table

Contains long symbol names (names longer than eight bytes).

The following figure shows the COFF file structure:

[Figure: COFF file structure]

 

File header

The file header contains general information about the object file and has the following structure from the file filehdr.h:

 

struct filehdr {
    unsigned short  f_magic;    /* magic number */
    unsigned short  f_nscns;    /* number of sections */
    long            f_timdat;   /* date stamp */
    long            f_symptr;   /* fileptr to symtab */
    long            f_nsyms;    /* symtab count */
    unsigned short  f_opthdr;   /* sizeof(optional hdr) */
    unsigned short  f_flags;    /* flags */
};
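As a minimal sketch of how a tool might read this header, the block below decodes the seven fields from a raw byte buffer instead of overlaying the struct, because `long` is no longer guaranteed to be 4 bytes on a modern host. It assumes 4-byte `long` fields (a 20-byte header) and big-endian byte order; the names `coff_filehdr`, `coff_parse_filehdr`, `be16` and `be32` are hypothetical and do not come from filehdr.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define COFF_FILHSZ 20u   /* 2+2+4+4+4+2+2 bytes, assuming 4-byte longs */

/* Fixed-width mirror of struct filehdr (hypothetical name). */
struct coff_filehdr {
    uint16_t f_magic;
    uint16_t f_nscns;
    uint32_t f_timdat;
    uint32_t f_symptr;
    uint32_t f_nsyms;
    uint16_t f_opthdr;
    uint16_t f_flags;
};

/* Read a big-endian 16-bit value. */
static uint16_t be16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Read a big-endian 32-bit value. */
static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Decode the file header from buf; returns 0 on success, -1 if buf is too short. */
static int coff_parse_filehdr(const unsigned char *buf, size_t len,
                              struct coff_filehdr *out)
{
    assert(buf != NULL && out != NULL);
    if (len < COFF_FILHSZ)
        return -1;
    out->f_magic  = be16(buf);
    out->f_nscns  = be16(buf + 2);
    out->f_timdat = be32(buf + 4);
    out->f_symptr = be32(buf + 8);
    out->f_nsyms  = be32(buf + 12);
    out->f_opthdr = be16(buf + 16);
    out->f_flags  = be16(buf + 18);
    return 0;
}
```

A caller would read the first 20 bytes of the file into a buffer, call `coff_parse_filehdr`, and then compare `f_magic` against the value expected for the target before trusting the remaining fields.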

 


Table G-2   COFF header fields 

Field  

Description  

 

f_magic

Magic number used to identify the file as a COFF file. It has the value 0x170 for the PowerPC family of processors.

f_nscns

Number of sections this file contains.

f_timdat

Creation time of the file, represented as a 32-bit value.

f_symptr

File offset of the symbol table.

f_nsyms

Number of entries in the symbol table.

f_opthdr

Number of bytes in the optional header.

f_flags

Bit field containing the following flags:

F_RELFLG (0x1)

Set if the COFF file does not contain relocation information; normally true only for executable files.

F_EXEC (0x2)

Set if the file is executable and all references are resolved.

F_LNNO (0x4)

Set if the COFF file does not contain line number information; this symbolic debugging information can be stripped with the -s option or the strip program.

F_LSYMS (0x8)

Set if the COFF file does not contain local symbols; these symbols can be stripped with the -X and -x options to the assembler and linker.

F_AR32W (0x200)

Set if the file uses big-endian byte order.
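Three of these flags have an inverted sense: the bit is set when the corresponding information is absent. The following sketch turns f_flags into positive "has it" answers. The flag values are those listed above; the struct `coff_contents` and the function `coff_decode_flags` are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define F_RELFLG 0x0001u  /* set: no relocation information */
#define F_EXEC   0x0002u  /* set: executable, all references resolved */
#define F_LNNO   0x0004u  /* set: no line number information */
#define F_LSYMS  0x0008u  /* set: no local symbols */
#define F_AR32W  0x0200u  /* set: big-endian byte order */

/* Positive-sense view of f_flags (hypothetical helper type). */
struct coff_contents {
    int relocs;      /* relocation information is present */
    int linenos;     /* line number information is present */
    int local_syms;  /* local symbols are present */
    int executable;  /* file is executable */
    int big_endian;  /* file is big-endian */
};

/* Translate the raw bit field into the positive-sense view. */
static void coff_decode_flags(unsigned flags, struct coff_contents *out)
{
    assert(out != NULL);
    out->relocs     = (flags & F_RELFLG) == 0;
    out->linenos    = (flags & F_LNNO)   == 0;
    out->local_syms = (flags & F_LSYMS)  == 0;
    out->executable = (flags & F_EXEC)   != 0;
    out->big_endian = (flags & F_AR32W)  != 0;
}
```

For example, a fully linked big-endian executable that still carries debug information would typically have f_flags equal to F_RELFLG | F_EXEC | F_AR32W, which decodes to "no relocations, executable, big-endian, line numbers and local symbols present".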

 

Optional header

The optional header contains information about an executable file and has the following structure from the file aouthdr.h:

typedef struct aouthdr {
    short   magic;              /* a.out magic */
    short   vstamp;             /* version stamp */
    long    tsize;              /* .text size */
    long    dsize;              /* .data size */
    long    bsize;              /* .bss size */
    long    entry;              /* entry point */
    long    text_start;         /* fileptr to .text */
    long    data_start;         /* fileptr to .data */
} AOUTHDR;

 


Table G-3   COFF optional (executable) header fields 

Field  

Description  

magic  

Value 0x10b.  

vstamp  

Set by the option -VS, but not used by the linker.  

tsize  

Size of the .text section.  

dsize  

Size of the .data section.  

bsize  

Size of the .bss section.  

entry  

Entry point in the executable program where execution will begin. The default entry point is the symbol start defined in the file function main(). The -e option can change this to any other symbol in the program.  

text_start 

File offset to the .text section in the COFF file.  

data_start 

File offset to the .data section in the COFF file.  

 

Section headers

There is one section header for each section in the COFF file; the number of section headers is given by the f_nscns field in the COFF file header. Section headers have the following structure from the file scnhdr.h:

struct scnhdr {                     /* modified COFF */
    char            s_name[8];      /* section name */
    long            s_paddr;        /* physical address */
    long            s_vaddr;        /* virtual address */
    long            s_size;         /* size of section in bytes */
    long            s_scnptr;       /* fileptr to raw data */
    long            s_relptr;       /* fileptr to reloc */
    long            s_lnnoptr;      /* fileptr to lineno */
    unsigned short  s_nreloc;       /* reloc count; the original lists both unsigned long and unsigned short */
    unsigned short  s_nlnno;        /* line number count; same remark as s_nreloc */
    long            s_flags;        /* flags */
};

 


 

 

#define SCNHDR struct scnhdr 

#define SCNHSZ sizeof(SCNHDR) 
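To locate an individual section header, a reader needs the position of the section header table. The sketch below assumes the conventional COFF layout, in which the section headers follow the optional header directly, and a 20-byte file header; neither assumption is stated in the text above. Because the width of s_nreloc and s_nlnno is uncertain, the size of one section header is passed in as a parameter instead of being hard-coded. The function name `coff_scnhdr_offset` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define COFF_FILHSZ 20u   /* size of the file header, assuming 4-byte longs */

/*
 * File offset of section header number `index` (0-based).
 * Returns 0 if `index` is not smaller than f_nscns; 0 can never be a
 * valid result because the file header occupies the start of the file.
 */
static uint32_t coff_scnhdr_offset(uint16_t f_opthdr, uint16_t f_nscns,
                                   uint16_t index, uint32_t scnhsz)
{
    assert(scnhsz > 0);
    if (index >= f_nscns)
        return 0;
    return COFF_FILHSZ + (uint32_t)f_opthdr + (uint32_t)index * scnhsz;
}
```

With a 28-byte optional header and 40-byte section headers, the first section header would start at offset 48 and the second at offset 88.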

Table G-4   COFF section header fields 

Field  

Description  

 

s_name[8]

Eight-byte, null-terminated section name. Standard names include .text, .data, and .bss.

s_paddr

Physical start address of the section. It is usually set to the same value as s_vaddr, but can be set to a different value with a command in the linker command language. This can be useful when initialized data is physically allocated to a ROM address, but moved to a logical address in RAM at start-up.

s_vaddr

Logical start address of the section as allocated by the assembler or linker.

s_size

Size in bytes of the memory allocated to the section.

s_scnptr

File offset to the raw data of the section. Note that the .bss section does not have any raw data since it will be initialized by the operating system.

s_relptr

File offset to the relocation information of the section.

s_lnnoptr

File offset to the line number information of the section.

s_nreloc

Number of relocation information entries.

s_nlnno

Number of line number information entries.

s_flags

Bit field containing the following flags:

STYP_TEXT (0x20)

Set for a .text section.

STYP_DATA (0x40)

Set for a .data section.

STYP_BSS (0x80)

Set for a .bss (uninitialized data) section.

STYP_INFO (0x200)

Set for a .comment section.

The following table shows the correspondence between the type-spec as defined on p.409 and the COFF section flags assigned to the output section.

Table G-5   type-spec - COFF section flag correspondence

type-spec  

Section flags (s_flags)  

BSS  

STYP_BSS  

COMMENT  

STYP_INFO  

CONST  

STYP_DATA  

DATA  

STYP_DATA  

TEXT  

STYP_TEXT  
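The correspondence in Table G-5 is a plain lookup, which can be sketched as follows. The flag values are the ones listed for s_flags; the function name `coff_flags_for_type_spec` and the -1 return value for an unknown type-spec are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STYP_TEXT 0x20L
#define STYP_DATA 0x40L
#define STYP_BSS  0x80L
#define STYP_INFO 0x200L

/* Return the s_flags value for a type-spec name, or -1 if it is unknown. */
static long coff_flags_for_type_spec(const char *type_spec)
{
    static const struct { const char *name; long flags; } map[] = {
        { "BSS",     STYP_BSS  },
        { "COMMENT", STYP_INFO },
        { "CONST",   STYP_DATA },
        { "DATA",    STYP_DATA },
        { "TEXT",    STYP_TEXT },
    };
    size_t i;

    assert(type_spec != NULL);
    for (i = 0; i < sizeof map / sizeof map[0]; i++) {
        if (strcmp(type_spec, map[i].name) == 0)
            return map[i].flags;
    }
    return -1;
}
```

Note that CONST and DATA both map to STYP_DATA, so the section flags alone do not distinguish constant data from writable data.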

 

Raw data sections

The raw data sections contain the actual raw data for each section.

Table G-6   COFF section names

.text

Machine instructions, constant data, and strings.

.sdata2

Small constant data; see the Set size limit for "small const" variables (-Xsmall-const=n), p.106.

.data

Initialized data.

.sdata

Small initialized data; see the Set size limit for "small data" variables (-Xsmall-data=n), p.106.

.bss

Uninitialized data; does not have any raw data.

.sbss

Small uninitialized data.

.comment

Comments from #ident directives in C.

.init

Code that is to be executed before the main() function.

.fini

Code that is to be executed when the user program has finished execution.

.eini

The instructions of the .fini code; the .init, .fini, and .eini sections should be placed one after another in memory.

 

COFF relocation information

The relocation information segment contains information about unresolved references. Since compilers and assemblers do not know at what absolute memory address a symbol will be allocated, and since they are unaware of definitions of symbols in other files, every reference to such a symbol creates a relocation entry. The relocation entry points to the address where the reference is made and to the symbol table entry for the symbol that is referenced. The linker uses this information to fill in the correct address after it has allocated addresses to all symbols.

When an offset is added to a symbol in the assembly source,

lwz     r3,(var+16)(r0)

move.l  var+16,d0

that offset is stored in the address field of the instruction, so that adding the real address of the symbol to the address field yields a correct reference.

The relocation segment does not exist in executable files.

A relocation entry has the following structure from the file reloc.h:

struct reloc {                  /* modified COFF */
    long            r_vaddr;    /* address of the reference, relative to the section */
    long            r_symndx;   /* index into the symbol table */
    unsigned short  r_type;     /* relocation type */
    unsigned short  r_offset;   /* high 16 bits of the offset added to the symbol */
};

#define RELOC   struct reloc

#define RELSZ   sizeof(RELOC)   /* also given as 10, the size without r_offset */

 


Table G-7   COFF relocation entry fields

Field

Description

r_vaddr

The relative address of the area within the current section to be patched with the correct address. The value is an offset measured from the start of the section.

r_symndx

Index into the symbol table pointing to the entry describing the symbol that is referenced at r_vaddr.

r_type

Type of addressing mode used; it describes whether the mode is absolute or relative, and the size of the addressing mode. See the table below for relocation types used by the Wind River tools.

r_offset

The high 16 bits of any offset that is added to the symbol in the R_HVRT16, R_LVRT16, and R_HAVRT16 relocation modes. Since the address field in the instruction is only 16 bits, it cannot represent a large offset. Example:

addis r13,r0,(var+0x123456)@ha

The address field in the addis instruction will contain 0x3456 and r_offset will contain 0x12.
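The split described for r_offset can be sketched in a few lines: the low 16 bits of the offset go into the instruction's address field and the high 16 bits go into r_offset. The names `split_offset`, `coff_split_offset` and `coff_join_offset` are hypothetical, and treating the two halves as a plain unsigned concatenation is an assumption based only on the 0x123456 example above.

```c
#include <assert.h>
#include <stdint.h>

/* The two halves of a symbol offset (hypothetical helper type). */
struct split_offset {
    uint16_t field;     /* low 16 bits, stored in the instruction */
    uint16_t r_offset;  /* high 16 bits, stored in the relocation entry */
};

/* Recombine the halves into the full 32-bit offset. */
static uint32_t coff_join_offset(uint16_t field, uint16_t r_offset)
{
    return ((uint32_t)r_offset << 16) | field;
}

/* Split a 32-bit offset into the address field and r_offset parts. */
static struct split_offset coff_split_offset(uint32_t offset)
{
    struct split_offset s;

    s.field    = (uint16_t)(offset & 0xFFFFu);
    s.r_offset = (uint16_t)(offset >> 16);
    /* Sanity check: the split must be lossless. */
    assert(coff_join_offset(s.field, s.r_offset) == offset);
    return s;
}
```

Applied to the example above, `coff_split_offset(0x123456)` yields a field of 0x3456 and an r_offset of 0x12.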

  

 

Table G-8   COFF relocation types

Relocation type   Number   Description

R_RELWORD         16       16-bit absolute address:
                           lwz    r3,var(r0)

R_HVRT16          131      Higher 16 bits of an absolute address:
                           addis  r3,r0,var@h

R_LVRT16          132      Lower 16 bits of an absolute address:
                           lwz    r3,var@l(r0)

R_HAVRT16         136      Adjusted higher 16 bits of an absolute address. If the lower 16 bits are a negative number, one is added to the upper 16 bits:
                           addis  r3,r0,var@ha

R_PCR16S2         137      16-bit PC-relative address where the lower two bits are ignored:
                           bc     4,2,label

R_PCR26S2         138      26-bit PC-relative address where the lower two bits are ignored:
                           bl     func

R_REL16S2         139      16-bit absolute address where the lower two bits are ignored:
                           bca    4,2,label

R_REL26S2         140      26-bit absolute address where the lower two bits are ignored:
                           bla    func
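The difference between the higher (@h) and adjusted higher (@ha) halves matters because the lower half is sign-extended when it is used as a 16-bit displacement. The sketch below shows the arithmetic for a 32-bit address; the helper names (`ppc_hi`, `ppc_lo`, `ppc_ha`, `ppc_split_ha_lo`, `ppc_rebuild`) are hypothetical and are not part of any Wind River header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Higher 16 bits (@h). */
static uint16_t ppc_hi(uint32_t addr) { return (uint16_t)(addr >> 16); }

/* Lower 16 bits (@l). */
static uint16_t ppc_lo(uint32_t addr) { return (uint16_t)(addr & 0xFFFFu); }

/* Adjusted higher 16 bits (@ha): one more than @h when bit 15 is set. */
static uint16_t ppc_ha(uint32_t addr)
{
    return (uint16_t)((addr + 0x8000u) >> 16);
}

/* Compute the @ha / @l pair used by an addis + displacement sequence. */
static void ppc_split_ha_lo(uint32_t addr, uint16_t *ha, uint16_t *lo)
{
    assert(ha != NULL && lo != NULL);
    *ha = ppc_ha(addr);
    *lo = ppc_lo(addr);
}

/* Rebuild the address the way the hardware does: (ha << 16) + sign-extended lo. */
static uint32_t ppc_rebuild(uint16_t ha, uint16_t lo)
{
    uint32_t ext = (lo & 0x8000u) ? (0xFFFF0000u | lo) : lo;

    return ((uint32_t)ha << 16) + ext;   /* wraps modulo 2^32 */
}
```

For the address 0x12348000 the plain higher half is 0x1234, but the lower half 0x8000 is negative when sign-extended, so the adjusted higher half is 0x1235; adding the sign-extended lower half then gives back 0x12348000.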

 

Line number information

The line number information segment contains the mapping from source line numbers to machine instruction addresses used by symbolic debuggers. This information is only available if the -g option is specified to the compiler.

Line number entries for a section form groups of pairs where the first pair in a group is a pointer to the function containing the source. After that, every source line that has generated any instruction has an entry specifying the line number relative to the beginning of the function, and the corresponding instruction address. Normally only the .text section has line number information. The following table demonstrates the layout of the line number entries:

A line number entry has the following structure from the file linenum.h:

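struct lineno {
    union {
        long        l_symndx;   /* symbol table index of the function, if l_lnno == 0 */
        long        l_paddr;    /* instruction address of the line, otherwise */
    } l_addr;
    unsigned short  l_lnno;     /* line number; the original lists both unsigned long and unsigned short */
};

#define LINENO      struct lineno

#define LINESZ      sizeof(LINENO)   /* also given as 6 */

Table G-9   COFF line number fields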

 


Field  

Description  

l_symndx  

Symbol table index for a new function; only valid if l_lnno is set to zero.  

l_paddr  

Instruction address corresponding to the source line l_lnno.  

l_lnno  

Source line relative to the start of the current function.  
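The grouping described above (an entry with l_lnno equal to zero opens a function, and the entries that follow map relative lines to addresses) can be walked as in the following sketch. The fixed-width struct `coff_lineno`, which folds the union into one `l_addr` field, and the function `coff_find_line` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* In-memory form of one line number entry (hypothetical, fixed-width). */
struct coff_lineno {
    uint32_t l_addr;  /* l_symndx when l_lnno == 0, otherwise l_paddr */
    uint16_t l_lnno;  /* 0 opens a function group; else line relative to it */
};

/*
 * Look up the instruction address of relative line `line` inside the
 * function whose symbol table index is `fn_symndx`.
 * Returns 1 and stores the address in *addr, or returns 0 if not found.
 */
static int coff_find_line(const struct coff_lineno *tab, size_t n,
                          uint32_t fn_symndx, uint16_t line, uint32_t *addr)
{
    int in_function = 0;
    size_t i;

    assert(tab != NULL && addr != NULL);
    assert(line != 0);  /* 0 is reserved for the group header entry */

    for (i = 0; i < n; i++) {
        if (tab[i].l_lnno == 0) {
            /* Group header: l_addr holds the function's symbol index. */
            in_function = (tab[i].l_addr == fn_symndx);
        } else if (in_function && tab[i].l_lnno == line) {
            *addr = tab[i].l_addr;
            return 1;
        }
    }
    return 0;
}
```

A debugger-style tool would first resolve the function name to its symbol table index and then call `coff_find_line` with the line number counted from the start of that function.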

 

Symbol table

The symbol table is an array of entries containing information about the symbols referenced in the COFF file. A symbol table entry has the following structure from the file syms.h:

struct syment {
    union {
        char        _n_name[8];     /* symbol name, if it fits in 8 bytes */
        struct {
            long    _n_zeroes;      /* zero if the name is in the string table */
            long    _n_offset;      /* offset into the string table */
        } _n_n;
        char        *_n_nptr[2];    /* allows for overlays */
    } _n;
    long            n_value;
    short           n_scnum;
    unsigned short  n_type;
    char            n_sclass;
    char            n_numaux;
    short           n_pad;
};

#define SYMENT      struct syment

#define SYMESZ      20              /* with n_pad; also given as 18, the size without it */

#define n_name      _n._n_name

#define n_nptr      _n._n_nptr[1]

#define n_zeroes    _n._n_n._n_zeroes

#define n_offset    _n._n_n._n_offset

 


Table G-10   COFF symbol table fields

Field

Description

n_name

Name of the symbol if the length is less than or equal to 8 bytes. If it is less than 8 bytes the name is terminated by a null character.

n_zeroes

Zero if a symbol name is longer than 8 bytes. This field overlaps the first 4 bytes of n_name.

n_offset

An offset into the string table, measured from the start of the table, if n_zeroes is zero.

n_nptr

This pointer allows for overlays.

n_value

A value whose contents depend on the symbol type. Normally it contains the address of the symbol, or its size if the symbol is a common block. A zero value indicates an undefined symbol if n_scnum is also zero.

n_scnum

Section number of the symbol, starting with one. A zero value indicates one of two things:

If n_value is zero then the symbol is an undefined symbol that must be defined in another file.

If n_value is not zero then the symbol is a common block of size n_value. All common blocks with the same name are combined by the linker and put in the .bss section, unless some other file defines that symbol in a section.

n_type

Type of the symbol; only set if compiled with -g.

n_sclass

Storage class of the symbol. There are over 20 storage classes, but most are used only with the -g compiler option. The two classes of interest to the linker are C_EXT, external storage, and C_STAT, static (local to the file) storage.

n_numaux

Number of auxiliary entries used by the symbol.

n_pad

Pads the structure to a multiple of four bytes; it carries no information.

 

Any auxiliary entries to a symbol are stored immediately after the symbol in the table. They are mainly used for symbolic debugging (-g option) and are not discussed here.
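The rules for n_scnum and n_value, together with the fact that auxiliary entries sit directly after their symbol, can be combined into a short table walk. This is a sketch only: the trimmed struct `coff_sym`, the enum `coff_sym_kind`, and both functions are hypothetical names, and the COFF_SYM_OTHER case for negative section numbers is an assumption, since the table above does not describe them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum coff_sym_kind {
    COFF_SYM_UNDEFINED,  /* n_scnum == 0 and n_value == 0 */
    COFF_SYM_COMMON,     /* n_scnum == 0 and n_value != 0 (size of block) */
    COFF_SYM_DEFINED,    /* n_scnum >= 1: defined in that section */
    COFF_SYM_OTHER       /* negative n_scnum: not covered by the text */
};

/* Only the fields needed for classification (hypothetical, fixed-width). */
struct coff_sym {
    int16_t  n_scnum;
    uint32_t n_value;
    uint8_t  n_numaux;
};

/* Apply the n_scnum / n_value rules from the symbol table fields. */
static enum coff_sym_kind coff_classify(int16_t n_scnum, uint32_t n_value)
{
    if (n_scnum == 0)
        return n_value == 0 ? COFF_SYM_UNDEFINED : COFF_SYM_COMMON;
    return n_scnum > 0 ? COFF_SYM_DEFINED : COFF_SYM_OTHER;
}

/* Count undefined symbols, stepping over each symbol's auxiliary entries. */
static size_t coff_count_undefined(const struct coff_sym *tab, size_t nentries)
{
    size_t count = 0;
    size_t i;

    assert(tab != NULL);
    for (i = 0; i < nentries; i += 1 + (size_t)tab[i].n_numaux) {
        if (coff_classify(tab[i].n_scnum, tab[i].n_value) == COFF_SYM_UNDEFINED)
            count++;
    }
    return count;
}
```

Skipping `n_numaux` slots matters because an auxiliary entry has a different layout; reading it as a symbol could make it look like an undefined symbol by accident.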

 

Additional symbols

Wind River uses special COFF symbols as follows:

Table G-11   Special COFF Symbols

Extension  

Description  

!sn!section-name  

Long section-name.  

!cd!name  

COMDAT-section-name. See Mark sections as COMDAT for linker collapse (-Xcomdat), p.71.  

!sf!flags  

Section flags (a: allocate, w: write, x: execute, b: bss/nocode).  

!al!value  

Section alignment.  

!wk!symbol-name  

Weak symbol. See weak pragma, p.138.  

 

String table

The string table contains the null terminated names of symbols longer than eight characters. Those symbols point into the string table through an offset, n_offset. The first four bytes of the string table contain the size of the table and after that all strings are stored sequentially.
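Resolving a symbol's name therefore takes two paths: an inline name of up to eight bytes, or an offset into the string table when the first four bytes of the name field are zero. The sketch below assumes big-endian fields and that n_offset is measured from the start of the string table, so that valid offsets begin after the four-byte size field. The function names `coff_symbol_name` and `be32` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a big-endian 32-bit value. */
static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Return the name of a symbol given its raw 8-byte name field.
 * Short names are copied into inline_buf (9 bytes, always terminated).
 * Long names are returned as a pointer into strtab.
 * Returns NULL if the string table offset is missing or out of range.
 */
static const char *coff_symbol_name(const unsigned char name_field[8],
                                    const unsigned char *strtab,
                                    uint32_t strtab_size,
                                    char inline_buf[9])
{
    uint32_t offset;

    assert(name_field != NULL && inline_buf != NULL);

    if (be32(name_field) != 0) {
        /* n_zeroes is non-zero: the name is stored inline. */
        memcpy(inline_buf, name_field, 8);
        inline_buf[8] = '\0';  /* an 8-character name has no terminator */
        return inline_buf;
    }

    /* n_zeroes is zero: the next four bytes are n_offset. */
    offset = be32(name_field + 4);
    if (strtab == NULL || offset < 4 || offset >= strtab_size)
        return NULL;
    /* Refuse names that run off the end of the table. */
    if (memchr(strtab + offset, '\0', strtab_size - offset) == NULL)
        return NULL;
    return (const char *)(strtab + offset);
}
```

Here `strtab_size` is the number of bytes actually loaded for the string table, which lets the function reject offsets and unterminated names that would read past the buffer.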

 

 

 

 

support@windriver.com 

Copyright © 2002, Wind River Systems, Inc. All rights reserved.

