GNU LD Script Study Notes

Reposted article, 2018-02-01 11:56. Tags: Linux / ld / embedded

What is an LD script (linker script)?

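GNU ld is a linker. Strictly speaking, ld is not part of GCC; it belongs to the binutils package. For embedded development, however, the downloadable Linaro GCC toolchain does bundle arm-linux-gnueabihf-ld. At work I often use ARM scatter files, which serve much the same purpose as an LD script but are considerably less capable. My impression (not checked against ARM's documentation) is that this is also why Arm Compiler 6, with its Clang-based armclang, leans toward GNU-style LD scripts: ARM no longer wants to maintain a compiler of its own and would rather concentrate on optimizing the translation of LLVM IR (Clang's intermediate code) into ARM instructions.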

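Every link is controlled by an LD script, written in what is called the linker command language. The script's main job is to describe how the sections of the input files map to the output file, and how the output file is laid out in memory (the memory layout). When developing on top of an operating system you rarely touch LD scripts, because if no script is given with the -T command-line option, ld falls back to a built-in default script. You can print that script with ld --verbose; for example, arm-linux-gnueabihf-ld --verbose prints the following: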

/* Script for -z combreloc: combine and sort reloc sections */
/* Copyright (C) 2014-2017 Free Software Foundation, Inc.
   Copying and distribution of this script, with or without modification,
   are permitted in any medium without royalty provided the copyright
   notice and this notice are preserved.  */
OUTPUT_FORMAT("elf32-littlearm", "elf32-bigarm",
	      "elf32-littlearm")
OUTPUT_ARCH(arm)
ENTRY(_start)
SEARCH_DIR("=/home/tcwg-buildslave/workspace/tcwg-make-release/builder_arch/amd64/label/tcwg-x86_64-build/target/arm-linux-gnueabihf/_build/builds/destdir/x86_64-unknown-linux-gnu/arm-linux-gnueabihf/lib"); SEARCH_DIR("=/usr/local/lib"); SEARCH_DIR("=/lib"); SEARCH_DIR("=/usr/lib");
SECTIONS
{
  /* Read-only sections, merged into text segment: */
  PROVIDE (__executable_start = SEGMENT_START("text-segment", 0x00010000)); . = SEGMENT_START("text-segment", 0x00010000) + SIZEOF_HEADERS;
  .interp         : { *(.interp) }
  .note.gnu.build-id : { *(.note.gnu.build-id) }
  .hash           : { *(.hash) }
  .gnu.hash       : { *(.gnu.hash) }
  .dynsym         : { *(.dynsym) }
  .dynstr         : { *(.dynstr) }
  .gnu.version    : { *(.gnu.version) }
  .gnu.version_d  : { *(.gnu.version_d) }
  .gnu.version_r  : { *(.gnu.version_r) }
  .rel.dyn        :
    {
      *(.rel.init)
      *(.rel.text .rel.text.* .rel.gnu.linkonce.t.*)
      *(.rel.fini)
      *(.rel.rodata .rel.rodata.* .rel.gnu.linkonce.r.*)
      *(.rel.data.rel.ro .rel.data.rel.ro.* .rel.gnu.linkonce.d.rel.ro.*)
      *(.rel.data .rel.data.* .rel.gnu.linkonce.d.*)
      *(.rel.tdata .rel.tdata.* .rel.gnu.linkonce.td.*)
      *(.rel.tbss .rel.tbss.* .rel.gnu.linkonce.tb.*)
      *(.rel.ctors)
      *(.rel.dtors)
      *(.rel.got)
      *(.rel.bss .rel.bss.* .rel.gnu.linkonce.b.*)
      PROVIDE_HIDDEN (__rel_iplt_start = .);
      *(.rel.iplt)
      PROVIDE_HIDDEN (__rel_iplt_end = .);
    }
  .rela.dyn       :
    {
      *(.rela.init)
      *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*)
      *(.rela.fini)
      *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*)
      *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*)
      *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*)
      *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*)
      *(.rela.ctors)
      *(.rela.dtors)
      *(.rela.got)
      *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*)
      PROVIDE_HIDDEN (__rela_iplt_start = .);
      *(.rela.iplt)
      PROVIDE_HIDDEN (__rela_iplt_end = .);
    }
  .rel.plt        :
    {
      *(.rel.plt)
    }
  .rela.plt       :
    {
      *(.rela.plt)
    }
  .init           :
  {
    KEEP (*(SORT_NONE(.init)))
  }
  .plt            : { *(.plt) }
  .iplt           : { *(.iplt) }
  .text           :
  {
    *(.text.unlikely .text.*_unlikely .text.unlikely.*)
    *(.text.exit .text.exit.*)
    *(.text.startup .text.startup.*)
    *(.text.hot .text.hot.*)
    *(.text .stub .text.* .gnu.linkonce.t.*)
    /* .gnu.warning sections are handled specially by elf32.em.  */
    *(.gnu.warning)
    *(.glue_7t) *(.glue_7) *(.vfp11_veneer) *(.v4_bx)
  }
  .fini           :
  {
    KEEP (*(SORT_NONE(.fini)))
  }
  PROVIDE (__etext = .);
  PROVIDE (_etext = .);
  PROVIDE (etext = .);
  .rodata         : { *(.rodata .rodata.* .gnu.linkonce.r.*) }
  .rodata1        : { *(.rodata1) }
  .ARM.extab   : { *(.ARM.extab* .gnu.linkonce.armextab.*) }
   PROVIDE_HIDDEN (__exidx_start = .);
  .ARM.exidx   : { *(.ARM.exidx* .gnu.linkonce.armexidx.*) }
   PROVIDE_HIDDEN (__exidx_end = .);
  .eh_frame_hdr : { *(.eh_frame_hdr) *(.eh_frame_entry .eh_frame_entry.*) }
  .eh_frame       : ONLY_IF_RO { KEEP (*(.eh_frame)) *(.eh_frame.*) }
  .gcc_except_table   : ONLY_IF_RO { *(.gcc_except_table
  .gcc_except_table.*) }
  .gnu_extab   : ONLY_IF_RO { *(.gnu_extab*) }
  /* These sections are generated by the Sun/Oracle C++ compiler.  */
  .exception_ranges   : ONLY_IF_RO { *(.exception_ranges
  .exception_ranges*) }
  /* Adjust the address for the data segment.  We want to adjust up to
     the same address within the page on the next page up.  */
  . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE));
  /* Exception handling  */
  .eh_frame       : ONLY_IF_RW { KEEP (*(.eh_frame)) *(.eh_frame.*) }
  .gnu_extab      : ONLY_IF_RW { *(.gnu_extab) }
  .gcc_except_table   : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) }
  .exception_ranges   : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) }
  /* Thread Local Storage sections  */
  .tdata	  : { *(.tdata .tdata.* .gnu.linkonce.td.*) }
  .tbss		  : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) }
  .preinit_array     :
  {
    PROVIDE_HIDDEN (__preinit_array_start = .);
    KEEP (*(.preinit_array))
    PROVIDE_HIDDEN (__preinit_array_end = .);
  }
  .init_array     :
  {
    PROVIDE_HIDDEN (__init_array_start = .);
    KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*)))
    KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin?.o *crtend.o *crtend?.o ) .ctors))
    PROVIDE_HIDDEN (__init_array_end = .);
  }
  .fini_array     :
  {
    PROVIDE_HIDDEN (__fini_array_start = .);
    KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*)))
    KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin?.o *crtend.o *crtend?.o ) .dtors))
    PROVIDE_HIDDEN (__fini_array_end = .);
  }
  .ctors          :
  {
    /* gcc uses crtbegin.o to find the start of
       the constructors, so we make sure it is
       first.  Because this is a wildcard, it
       doesn't matter if the user does not
       actually link against crtbegin.o; the
       linker won't look for a file to match a
       wildcard.  The wildcard also means that it
       doesn't matter which directory crtbegin.o
       is in.  */
    KEEP (*crtbegin.o(.ctors))
    KEEP (*crtbegin?.o(.ctors))
    /* We don't want to include the .ctor section from
       the crtend.o file until after the sorted ctors.
       The .ctor section from the crtend file contains the
       end of ctors marker and it must be last */
    KEEP (*(EXCLUDE_FILE (*crtend.o *crtend?.o ) .ctors))
    KEEP (*(SORT(.ctors.*)))
    KEEP (*(.ctors))
  }
  .dtors          :
  {
    KEEP (*crtbegin.o(.dtors))
    KEEP (*crtbegin?.o(.dtors))
    KEEP (*(EXCLUDE_FILE (*crtend.o *crtend?.o ) .dtors))
    KEEP (*(SORT(.dtors.*)))
    KEEP (*(.dtors))
  }
  .jcr            : { KEEP (*(.jcr)) }
  .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) }
  .dynamic        : { *(.dynamic) }
  . = DATA_SEGMENT_RELRO_END (0, .);
  .got            : { *(.got.plt) *(.igot.plt) *(.got) *(.igot) }
  .data           :
  {
    PROVIDE (__data_start = .);
    *(.data .data.* .gnu.linkonce.d.*)
    SORT(CONSTRUCTORS)
  }
  .data1          : { *(.data1) }
  _edata = .; PROVIDE (edata = .);
  . = .;
  __bss_start = .;
  __bss_start__ = .;
  .bss            :
  {
   *(.dynbss)
   *(.bss .bss.* .gnu.linkonce.b.*)
   *(COMMON)
   /* Align here to ensure that the .bss section occupies space up to
      _end.  Align after .bss to ensure correct alignment even if the
      .bss section disappears because there are no input sections.
      FIXME: Why do we need it? When there is no .bss section, we don't
      pad the .data section.  */
   . = ALIGN(. != 0 ? 32 / 8 : 1);
  }
  _bss_end__ = . ; __bss_end__ = . ;
  . = ALIGN(32 / 8);
  . = SEGMENT_START("ldata-segment", .);
  . = ALIGN(32 / 8);
  __end__ = . ;
  _end = .; PROVIDE (end = .);
  . = DATA_SEGMENT_END (.);
  /* Stabs debugging sections.  */
  .stab          0 : { *(.stab) }
  .stabstr       0 : { *(.stabstr) }
  .stab.excl     0 : { *(.stab.excl) }
  .stab.exclstr  0 : { *(.stab.exclstr) }
  .stab.index    0 : { *(.stab.index) }
  .stab.indexstr 0 : { *(.stab.indexstr) }
  .comment       0 : { *(.comment) }
  /* DWARF debug sections.
     Symbols in the DWARF debugging sections are relative to the beginning
     of the section so we begin them at 0.  */
  /* DWARF 1 */
  .debug          0 : { *(.debug) }
  .line           0 : { *(.line) }
  /* GNU DWARF 1 extensions */
  .debug_srcinfo  0 : { *(.debug_srcinfo) }
  .debug_sfnames  0 : { *(.debug_sfnames) }
  /* DWARF 1.1 and DWARF 2 */
  .debug_aranges  0 : { *(.debug_aranges) }
  .debug_pubnames 0 : { *(.debug_pubnames) }
  /* DWARF 2 */
  .debug_info     0 : { *(.debug_info .gnu.linkonce.wi.*) }
  .debug_abbrev   0 : { *(.debug_abbrev) }
  .debug_line     0 : { *(.debug_line .debug_line.* .debug_line_end ) }
  .debug_frame    0 : { *(.debug_frame) }
  .debug_str      0 : { *(.debug_str) }
  .debug_loc      0 : { *(.debug_loc) }
  .debug_macinfo  0 : { *(.debug_macinfo) }
  /* SGI/MIPS DWARF 2 extensions */
  .debug_weaknames 0 : { *(.debug_weaknames) }
  .debug_funcnames 0 : { *(.debug_funcnames) }
  .debug_typenames 0 : { *(.debug_typenames) }
  .debug_varnames  0 : { *(.debug_varnames) }
  /* DWARF 3 */
  .debug_pubtypes 0 : { *(.debug_pubtypes) }
  .debug_ranges   0 : { *(.debug_ranges) }
  /* DWARF Extension.  */
  .debug_macro    0 : { *(.debug_macro) }
  .debug_addr     0 : { *(.debug_addr) }
  .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) }
  .note.gnu.arm.ident 0 : { KEEP (*(.note.gnu.arm.ident)) }
  /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) }
}

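The default script

As you can see, the default script is far more complex than anything you would write by hand. It turns out GCC emits a great many sections!

Understanding LD scripts

An LD script is made up of commands. As an example, let us look at u-boot-spl.lds for the am335x in u-boot. Here is the file: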

/* Source form: U-Boot runs this file through the C preprocessor, which
   expands the CONFIG_* macros and joins the backslash-continued lines. */
MEMORY { .sram : ORIGIN = CONFIG_SPL_TEXT_BASE,\
        LENGTH = CONFIG_SPL_MAX_SIZE }
MEMORY { .sdram : ORIGIN = CONFIG_SPL_BSS_START_ADDR, \
        LENGTH = CONFIG_SPL_BSS_MAX_SIZE }

OUTPUT_FORMAT("elf32-littlearm", "elf32-littlearm", "elf32-littlearm")
OUTPUT_ARCH(arm)
ENTRY(_start)
SECTIONS
{
    .text      :
    {
        __start = .;
        *(.vectors)
        arch/arm/cpu/armv7/start.o    (.text*)
        *(.text*)
    } >.sram

    . = ALIGN(4);
    .rodata : { *(SORT_BY_ALIGNMENT(.rodata*)) } >.sram

    . = ALIGN(4);
    .data : { *(SORT_BY_ALIGNMENT(.data*)) } >.sram

    . = ALIGN(4);
    .u_boot_list : {
        KEEP(*(SORT(.u_boot_list*)));
    } >.sram

    . = ALIGN(4);
    __image_copy_end = .;

    .end :
    {
        *(.__end)
    }

    _image_binary_end = .;

    .bss :
    {
        . = ALIGN(4);
        __bss_start = .;
        *(.bss*)
        . = ALIGN(4);
        __bss_end = .;
    } >.sdram
}

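This file uses the following commands:

MEMORY: describes where the memories on the board are located. ORIGIN is the start address and LENGTH is the size in bytes; ORIGIN may be abbreviated to org or o, and LENGTH to len or l.
OUTPUT_FORMAT: specifies the format of the output file. Because all three arguments are the same here, the output is little-endian ARM ELF regardless of which endianness option is given on the command line.
OUTPUT_ARCH: specifies the output architecture.
ENTRY: specifies the entry point. Note that it uses the _start symbol defined in the code; in other words, a script can refer directly to symbols in the symbol table.
SECTIONS: the most important command in a script, and nearly every LD script has one. It specifies how input sections map to the output file, among other things. Reading a SECTIONS block requires several concepts, explained one by one below.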

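object

The linker's job is to combine several input files into one output file. All of these files are called object files (including the final executable). An object file holds many things, but most importantly a set of sections: a section in an input file is an input section, and a section in the output file is an output section.

section

Every section has a name (such as .text/.data/.bss) and a size, and most sections also carry data (.bss is an example of a section with a size but no data).

Some sections are loadable, such as .text: their contents must be loaded into memory at run time.
Others are allocatable, such as .bss: memory must be set aside for them at run time, but nothing is loaded into that memory (it is usually zeroed).
Any other section typically holds only debugging information.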

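VMA/LMA

Every loadable or allocatable section has two addresses:

VMA: Virtual Memory Address, the address of the section while the program is running (under an operating system this is usually a virtual address mapped by the MMU).
LMA: Load Memory Address, the address where the section sits before the program starts running.

Under an operating system the executable is loaded by an external loader, and VMA and LMA are normally the same. In bare-metal embedded code, however, the LMA may be in ROM: the program starts executing from ROM, and the initialization code (code whose VMA equals its LMA, or position-independent code) is responsible for copying every section whose VMA differs from its LMA to its VMA.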
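The copy step described above can be sketched in C. This is a minimal illustration rather than any particular project's startup code: copy_section, zero_section, fake_rom and fake_ram are hypothetical names, and the two arrays stand in for the ROM (LMA) and RAM (VMA) regions. On real hardware the start and end addresses would come from symbols defined in the linker script.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy an initialized section from its load address (LMA, e.g. ROM)
   to its run address (VMA, e.g. RAM), one byte at a time. */
void copy_section(const uint8_t *lma_start, uint8_t *vma_start, uint8_t *vma_end)
{
    assert(vma_end >= vma_start);
    const uint8_t *src = lma_start;
    for (uint8_t *dst = vma_start; dst < vma_end; ++dst)
        *dst = *src++;
}

/* Clear an allocatable-only section such as .bss: it has a VMA and a size,
   but no data to load, so the startup code just zeroes it. */
void zero_section(uint8_t *start, uint8_t *end)
{
    assert(end >= start);
    memset(start, 0, (size_t)(end - start));
}

/* Hypothetical stand-ins: fake_rom plays the LMA region, fake_ram the VMA
   region. Real code would use addresses exported by the linker script. */
static const uint8_t fake_rom[4] = { 1, 2, 3, 4 };
static uint8_t fake_ram[8] = { 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF };

/* Returns 1 if RAM holds the copied data followed by a zeroed region. */
int demo_startup(void)
{
    copy_section(fake_rom, fake_ram, fake_ram + 4);  /* ".data": LMA -> VMA */
    zero_section(fake_ram + 4, fake_ram + 8);        /* ".bss": zero only   */
    return memcmp(fake_ram, fake_rom, 4) == 0
        && fake_ram[4] == 0 && fake_ram[7] == 0;
}
```

The copy routine itself has to run from a place where VMA and LMA agree (or be position-independent), which is exactly the constraint the paragraph above describes.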

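location counter

Inside the SECTIONS command, "." is a special symbol that stands for the current VMA. It starts at 0 and can be increased by assigning to it; each time a new output section is created, "." also advances by that section's size. Assigning directly to "." can leave gaps, which are filled in (the fill value can be specified). Note that "." cannot be moved backwards by assignment inside an output section, and outside an output section it cannot be moved backwards if that would create overlapping LMAs; ld reports an error when it detects this.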

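With these concepts in hand, let us go back to SECTIONS:

Input sections are specified as file-name(section-name) or archive-name:file-name(section-name); every one of these names may use wildcards such as * and ?.
Output sections are specified as section-name : { input-sections }; which output section names are meaningful depends on the executable file format.
Assignments: expressions terminated by a semicolon, used to define symbols directly or to change ".".
>region specifies which memory holds the section's VMA (its run-time address); to make it easier to map a script onto different memories, you can also define a REGION_ALIAS.
ALIGN(size) returns the value of "." rounded up to a multiple of size bytes; on its own it does not change ".", which is why the script writes . = ALIGN(4);.
SORT_BY_ALIGNMENT sorts by alignment in descending order, placing the most strictly aligned sections first to reduce padding.
SORT is an alias for SORT_BY_NAME and sorts by name in ascending order.
KEEP retains the sections even if no code references them, so that link-time garbage collection (--gc-sections) does not discard them (assembly or other external code uses this initialization data).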
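To make the arithmetic behind ALIGN concrete, here is a small C sketch of the rounding it performs. align_up is a hypothetical helper written for this note, not something ld provides, and it assumes the alignment is a power of two and that the addition does not overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align, the same rounding that
   ALIGN(align) applies to the location counter.
   Assumption: align is a non-zero power of two. */
uint32_t align_up(uint32_t addr, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}
```

For example, align_up(0x1001, 4) gives 0x1004, while an address that is already aligned, such as 0x1000, is returned unchanged. Like ALIGN, the helper only computes a value; the counter moves only when the result is assigned back, as in . = ALIGN(4);.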

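That is enough to read most of the script above. A few further points strike me as important:

Symbols created directly in an LD script also go into the symbol table, but they differ from symbols defined in C code. The symbol table maps names to addresses. A symbol from the code is allocated a location in memory, so at run time it has a value. A symbol created in an LD script is not allocated any memory, so it has only an address and no storage (no value). To refer to one from C, either declare it extern as a variable and take its address with &, or declare it extern as an array and use the array name directly.
When ". = exp" appears inside the {} of an output section, it is equivalent to ". = start VMA of that output section + exp".
Assigning symbols outside the {} of an output section is risky, because ld may place orphan sections in between and the symbol may not get the value you expect. Writing ". = .;" restricts where ld may place orphan sections; see the discussion of the location counter in [1].

For a more detailed treatment of LD scripts, see the references below. Compared with ARM scatter files, GNU ld scripts are more flexible. What I like most is being able to define symbol names for arbitrary locations instead of relying on magic names such as Image$$RO$$Base.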
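The two ways of referring to a linker-defined symbol from C can be sketched as follows. This assumes an ELF target linked with GNU ld (or a compatible linker) whose script defines __bss_start, _edata and _end, as the default script shown earlier does; bss_span and edata_address are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* These symbols are defined by the linker script, not by any C file.
   Declared as arrays, the name itself evaluates to the symbol's address. */
extern char __bss_start[];
extern char _end[];

/* Declared as a plain variable, the address must be taken with &.
   Reading _edata itself would fetch whatever byte happens to live there. */
extern char _edata;

/* Number of bytes between __bss_start and _end. */
size_t bss_span(void)
{
    uintptr_t start = (uintptr_t)__bss_start;
    uintptr_t end = (uintptr_t)_end;
    assert(end >= start);
    return (size_t)(end - start);
}

/* Address marking the end of the initialized data. */
uintptr_t edata_address(void)
{
    return (uintptr_t)&_edata;
}
```

The array form tends to be less error-prone, because there is no variable whose "value" could be read by accident.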

References

[1] https://sourceware.org/binutils/docs/ld/Scripts.html#Scripts

[2] https://wiki.osdev.org/Linker_Scripts
