x86 calling conventions

This article is a repost (see the original). 2013-09-15 21:09. Tags: C/C++, IDA, Windows, Delphi

http://zh.wikipedia.org/wiki/X86%E8%B0%83%E7%94%A8%E7%BA%A6%E5%AE%9A

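This article describes the calling conventions used on the x86 architecture. A calling convention describes the interface of called code:

the order in which atomic (scalar) parameters, or the individual parts of a complex parameter, are allocated;
how parameters are passed (on the stack, in registers, or a mix of both);
which of the caller's registers the callee must preserve;
how the stack is prepared for a function call, and how it is restored afterwards.

This is closely related to how a programming language assigns sizes and formats to its types. Another closely related topic is name mangling (name decoration), which determines how symbol names in the code map to the symbol names used by the linker.

Calling conventions, type representations, and name mangling are collectively known as the application binary interface (ABI).

Compilers often differ subtly in how they implement these conventions, so code compiled by different compilers can be hard to link together.

On the other hand, some conventions serve as an API standard (stdcall, for example) and are implemented quite consistently across compilers.

Caller clean-up: cdecl, syscall, optlink

In these conventions the caller removes the arguments from the stack, which makes variable argument lists such as printf() possible.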
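As a minimal sketch of why caller clean-up matters, the C function below takes a variable number of arguments. The name sum_ints and its leading count parameter are hypothetical, chosen only for illustration. The callee walks the argument list but never learns how many bytes were pushed, so only the caller, which built the list, can remove it.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical example: adds up `count` int arguments.
   The callee reads the arguments but cannot tell how many bytes the caller
   pushed, so under cdecl the caller is the one who removes them. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;
    int i;

    assert(count >= 0);
    va_start(args, count);
    for (i = 0; i < count; i++) {
        total += va_arg(args, int);  /* read the next int argument */
    }
    va_end(args);
    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) works with any argument count because the clean-up code is generated at the call site, where the count is known.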

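cdecl

cdecl (C declaration) is a calling convention that originated with the C language and is used by many C compilers for the x86 architecture.
In cdecl, subroutine arguments are passed on the stack. Integer values and memory addresses are returned in EAX; floating-point values are returned in the x87 register ST0.
EAX, ECX, and EDX are caller-saved; the remaining registers (EBX, EBP, ESI, EDI) are callee-saved.
The x87 floating-point registers ST0 to ST7 must be empty (popped or freed) when a new function is called, and ST1 to ST7 must be empty when the function exits.
In C, function arguments are pushed onto the stack in reverse order (right to left). On GNU/Linux, GCC sets the de facto standard for this convention.
Since GCC 4.5, the stack must be aligned to a 16-byte boundary when a function is called (earlier versions required only 4-byte alignment).

cdecl is usually the default calling convention of x86 C compilers, and many compilers provide options to change the default automatically.
To specify cdecl manually for a function, a compiler may support syntax such as:

void _cdecl funct();

The _cdecl modifier must appear in the function prototype and in the function declaration, where it overrides any other setting in effect.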

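syscall

Similar to cdecl: arguments are pushed right to left. EAX, ECX, and EDX are not preserved. The size of the parameter list, in doublewords, is passed in AL.

syscall is the standard calling convention of the 32-bit OS/2 API.

optlink

Arguments are also pushed right to left. The three leftmost (lexically first) arguments are passed in EAX, EDX, and ECX, and up to four floating-point arguments are passed in ST(0) through ST(3), although space for them is still reserved in the argument list on the stack. The return value is in EAX or ST(0). The registers EBP, EBX, ESI, and EDI are preserved.

optlink is used by the IBM VisualAge compilers.

Callee clean-up: pascal, register, stdcall, fastcall (Microsoft, Borland)

When the callee cleans the arguments off the stack, the number of bytes to remove must be known at compile time. These conventions are therefore incompatible with variable argument lists such as printf().
They may be more space-efficient, however, because the stack-unwinding code does not have to be generated at every call site.
Functions that use these conventions are easy to recognize in assembly code because they unwind the stack just before returning.
The x86 ret instruction accepts an optional 16-bit operand giving the number of stack bytes to release once the return address has been popped. The code looks like this: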

ret 12
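The operand of ret can be derived from the parameter list. The sketch below is a hypothetical helper (callee_cleanup_bytes is not part of any compiler or library). It assumes every stack argument is padded to a multiple of 4 bytes, as on 32-bit x86, so three int arguments give the 12 shown above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of bytes a callee-clean function would pass to
   `ret`, assuming each argument is padded to a multiple of 4 bytes. */
unsigned callee_cleanup_bytes(const size_t *arg_sizes, size_t arg_count)
{
    unsigned total = 0;
    size_t i;

    assert(arg_sizes != NULL || arg_count == 0);
    for (i = 0; i < arg_count; i++) {
        /* round the argument size up to the next multiple of 4 */
        total += (unsigned)((arg_sizes[i] + 3u) & ~(size_t)3u);
    }
    return total;
}
```

Two char parameters, for instance, still occupy two full stack slots and give 8.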

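pascal

Based on the calling convention of the Pascal language: parameters are pushed left to right (the opposite of cdecl), and the callee is responsible for cleaning the stack before returning. This convention was common in 16-bit APIs such as OS/2 1.x, Microsoft Windows 3.x, and Borland Delphi 1.x.

register

An alias for Borland fastcall.

stdcall

A variation on the Pascal convention: the callee still cleans the stack, but the parameters are pushed right to left, as in cdecl.
EAX, ECX, and EDX are designated for use within the function (the callee may overwrite them), and the return value is placed in EAX.

stdcall is the standard calling convention for the Microsoft Win32 API and for Open Watcom C++.

fastcall

This convention has not been standardized and is implemented differently by different compilers. A typical fastcall convention passes one or more arguments in registers, which reduces the number of memory accesses per call.

Microsoft fastcall

The Microsoft or GCC __fastcall convention (also known as __msfastcall) passes the first two arguments (left to right) in ECX and EDX; the remaining arguments are pushed onto the stack right to left.

Borland fastcall

The first three arguments, taken left to right, are passed in EAX, EDX, and ECX. The remaining arguments are pushed onto the stack, also left to right.

This is the default calling convention of the 32-bit Embarcadero Delphi compiler, where it is known as register. Some versions of Linux on i386 also use this convention.

Caller or callee clean-up: thiscall

thiscall

This convention is used for calling C++ non-static member functions. There are two main versions of thiscall, depending on the compiler and on whether the function takes a variable number of arguments.

With GCC, thiscall is almost identical to cdecl: the caller cleans the stack and the parameters are passed right to left. The difference is the this pointer, which is pushed onto the stack last, as if it were the first parameter in the function prototype.

With Microsoft Visual C++, the this pointer is passed in ECX and the callee cleans the stack, mirroring the stdcall convention used in C for this compiler and in Windows API functions.
When a function takes a variable number of arguments, the caller cleans the stack (compare cdecl). The thiscall convention can be specified explicitly only in Microsoft Visual C++ 2005 and later.
On other compilers thiscall is not a keyword (disassemblers such as IDA nevertheless use __thiscall).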

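x86-64 calling conventions

The x86-64 calling conventions take advantage of the additional registers to pass parameters. There are also fewer incompatible conventions, although two main ones remain.

Microsoft x64 calling convention

The Microsoft x64 calling convention uses RCX, RDX, R8, and R9 for the first four integer or pointer arguments (left to right),
and XMM0, XMM1, XMM2, and XMM3 for floating-point arguments.
Additional arguments are pushed onto the stack (right to left).

Integer return values are placed in RAX, floating-point return values in XMM0.

Parameters shorter than 64 bits are not zero-extended; the high bits contain garbage.

When compiling for Windows x64 there is only one calling convention, the one described here; the various 32-bit conventions have been unified into one.

In the Microsoft x64 convention, the caller must allocate 32 bytes of "shadow space" on the stack right before calling a function (regardless of how many parameters are actually passed) and must pop it after the call.

The shadow space gives the callee room to spill RCX, RDX, R8, and R9, and it must be provided even for functions that take fewer than four parameters.

For example, a function with five integer parameters receives the first four in registers, and the fifth is pushed on the stack just above the shadow space.

On entry to the callee, the stack therefore consists of, in ascending address order, the return address, the 32 bytes of shadow space, and then the fifth parameter.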
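The stack layout on entry can be written down as a small calculation. The helper below is a hypothetical sketch (ms_x64_arg_slot_offset is not a real API). It assumes integer or pointer arguments that each occupy one 8-byte slot, with the return address at [RSP].

```c
#include <assert.h>

/* Hypothetical helper: byte offset from RSP, at callee entry, of the stack
   slot that belongs to the 1-based argument `index`.
   [RSP+0] holds the return address, slots 1..4 form the 32-byte shadow space
   (home slots for RCX, RDX, R8, R9), and slot 5 onward holds stack arguments. */
int ms_x64_arg_slot_offset(int index)
{
    assert(index >= 1);
    return 8 * index;
}
```

Under these assumptions the shadow space spans [RSP+8] up to [RSP+40), and the fifth parameter is found at [RSP+40].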

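On x86-64, Visual Studio 2008 stores floating-point numbers in XMM6 and XMM7 (as well as XMM8 through XMM15).

Consequently, user-written assembly language routines for x86-64 must preserve XMM6 and XMM7 (on x86 they did not need to).

In other words, an assembly routine ported from x86 to x86-64 must be updated to save XMM6 and XMM7 on entry and restore them before returning.

System V AMD64 ABI

This convention is followed mainly on Solaris, GNU/Linux, FreeBSD, and other non-Microsoft operating systems.

The first six integer arguments are passed in RDI, RSI, RDX, RCX, R8, and R9, while XMM0 through XMM7 are used for floating-point arguments.

For system calls, R10 is used instead of RCX. As in the Microsoft x64 convention, additional arguments are pushed onto the stack and the return value is stored in RAX.

Unlike the Microsoft convention, no shadow space is provided. On function entry, the return address is adjacent to the seventh integer argument on the stack.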
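The register assignment for integer arguments can be sketched as a lookup table. The function names below are hypothetical and cover only integer and pointer arguments of an ordinary function call (not system calls, and not floating-point or aggregate arguments).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: where the 1-based integer argument `index` is passed
   under the System V AMD64 convention described above. */
const char *sysv_int_arg_location(int index)
{
    static const char *const regs[6] = { "RDI", "RSI", "RDX", "RCX", "R8", "R9" };

    assert(index >= 1);
    return index <= 6 ? regs[index - 1] : "stack";
}

/* Returns 1 when the argument travels in a register, 0 when it is pushed. */
int sysv_int_arg_in_register(int index)
{
    return strcmp(sysv_int_arg_location(index), "stack") != 0;
}
```

The seventh integer argument is the first one to land on the stack, which is why it sits next to the return address on function entry.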

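Differences between the calling conventions (pascal, fastcall, stdcall, thiscall, cdecl)

http://blog.csdn.net/maotoula/article/details/6762062

I. Function calling conventions
A function calling convention is an agreement between the caller and the called function body on parameter passing, return-value passing, stack clean-up, and register usage.
It is a strict contract that requires binary-level compatibility: if the caller and the function body use different conventions, the program may misbehave at run time. The calling convention must be treated as part of the function declaration.

II. Common function calling conventions

Function calling conventions in VC6:

        Convention     Stack clean-up  Parameter passing
        __cdecl        caller          right to left, on the stack
        __stdcall      callee          right to left, on the stack
        __fastcall     callee          first two arguments in registers (ECX, EDX), the rest right to left on the stack
        thiscall       callee          this pointer in ECX by default, other parameters pushed right to left

__cdecl is the default calling convention for C/C++. VC has no thiscall keyword; thiscall is the default convention for class member functions.
The main (or wmain) function in C/C++ must use __cdecl; this cannot be changed.
The default calling convention can usually be changed through compiler settings. If your code depends on a particular calling convention, state it explicitly.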
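One way to state the convention explicitly is to put it in both the function definition and the function pointer type. The sketch below uses hypothetical names (EXAMPLE_STDCALL, binary_op, add_ints, apply_op). The macro expands to the compiler's stdcall spelling on 32-bit x86 and to nothing elsewhere, which is an assumption made here so the example also builds on platforms that have a single convention.

```c
#include <assert.h>
#include <stddef.h>

/* EXAMPLE_STDCALL is a hypothetical macro name, not a standard one. */
#if defined(_MSC_VER) && defined(_M_IX86)
#  define EXAMPLE_STDCALL __stdcall
#elif defined(__GNUC__) && defined(__i386__)
#  define EXAMPLE_STDCALL __attribute__((stdcall))
#else
#  define EXAMPLE_STDCALL
#endif

/* The convention is part of the function pointer type as well. */
typedef int (EXAMPLE_STDCALL *binary_op)(int, int);

int EXAMPLE_STDCALL add_ints(int a, int b)
{
    return a + b;
}

/* Calls through the pointer using the convention recorded in its type. */
int apply_op(binary_op op, int a, int b)
{
    assert(op != NULL);
    return op(a, b);
}
```

Because the convention is spelled out in the typedef, changing the project-wide default does not change how apply_op calls its callback.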

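Function calling conventions in Delphi 6:

        Convention     Stack clean-up  Parameter passing
        register       callee          left to right, registers first (EAX, EDX, ECX), then the stack
        pascal         callee          left to right, on the stack
        cdecl          caller          right to left, on the stack (compatible with the C/C++ default)
        stdcall        callee          right to left, on the stack (compatible with __stdcall in VC)
        safecall       callee          right to left, on the stack (same as stdcall)

The default calling convention in Delphi is register. In my opinion it is also the most efficient one, while cdecl is, overall, the least efficient.

The __fastcall convention in VC is generally slightly less efficient than register.

Function calling conventions in C++Builder 6:

        Convention     Stack clean-up  Parameter passing
        __fastcall     callee          left to right, registers first (EAX, EDX, ECX), then the stack (compatible with Delphi's register)
        register       callee          left to right, registers first (EAX, EDX, ECX), then the stack (compatible with Delphi's register)
        __pascal       callee          left to right, on the stack
        __cdecl        caller          right to left, on the stack (compatible with the C/C++ default)
        __stdcall      callee          right to left, on the stack (compatible with __stdcall in VC)
        __msfastcall   callee          first two arguments in registers (ECX, EDX), the rest right to left on the stack (compatible with __fastcall in VC)

Of the common calling conventions, only cdecl requires the caller to clean the stack.

C/C++ functions may take a variable number of arguments, as printf does. Because the function body does not know how many arguments the caller pushed,
it cannot easily work out how to clean the stack, so the best approach is to leave the clean-up to the caller. This is presumably why the cdecl convention exists.

VB generally uses the stdcall convention. (PS: is there a stronger guarantee?)
The Windows API generally uses the stdcall convention. (PS: is there a stronger guarantee?)
For calls between different languages (for example, through a DLL), stdcall is recommended because it is the most widely supported across languages.

III. How function return values are passed

   A return value can be thought of as an extra out parameter of the call; how it is passed is also part of the calling convention.
   When a function returns a value, 32-bit data such as int and pointers (including 32-bit structs) is generally returned in EAX (bool and char in AL, short in AX).
64-bit values such as __int64 and 64-bit structs are returned in the register pair EDX:EAX (similarly, a 32-bit integer in a 16-bit environment is returned in DX:AX).
Structs of other sizes are returned through memory, with their address in EAX (so returning a type whose size is not 1, 2, 4, or 8 bytes may be less efficient).
   For parameter and return-value passing, reference types can be treated the same as pointers.
   float and double (and extended in Delphi) are returned in the floating-point register ST(0).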
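The EDX:EAX pairing for 64-bit return values can be modelled in portable C. This is only an illustrative sketch: reg_pair, split_edx_eax, join_edx_eax, and check_round_trip are hypothetical names, and the struct merely stands in for the two registers.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the two registers that carry a 64-bit return value. */
typedef struct {
    uint32_t eax;  /* low 32 bits  */
    uint32_t edx;  /* high 32 bits */
} reg_pair;

/* Splits a 64-bit value the way it would be placed in EDX:EAX. */
reg_pair split_edx_eax(uint64_t value)
{
    reg_pair r;
    r.eax = (uint32_t)(value & 0xFFFFFFFFu);
    r.edx = (uint32_t)(value >> 32);
    return r;
}

/* Reassembles the 64-bit value, as the caller would after the call. */
uint64_t join_edx_eax(reg_pair r)
{
    return ((uint64_t)r.edx << 32) | r.eax;
}

/* Sanity check: splitting and joining gives back the original value. */
void check_round_trip(uint64_t value)
{
    assert(join_edx_eax(split_edx_eax(value)) == value);
}
```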

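1.__cdecl
The so-called C calling convention. Arguments are pushed right to left and the caller pops them off the stack. Remember: the stack memory used for passing arguments is maintained by the caller, which is why functions with variable arguments, such as printf, must use this convention.
The return value is in EAX. When the compiler generates the decorated name for a function that uses this convention, it only prefixes the exported function name with an underscore: _functionname.

2.__stdcall
Arguments are pushed right to left and the callee pops them off the stack. On Win32 the WINAPI, CALLBACK, and PASCAL macros in the Windows headers expand to __stdcall, and the convention is used throughout the Win32 API. Remember: the function clears the stack itself on exit, and the return value is in EAX.
The __stdcall convention prefixes the exported function name with an underscore and appends an "@" sign followed by the number of bytes of its arguments: _functionname@number. For example, the decorated name of int func(int a, double b) is _func@12.

3.__fastcall
The main feature of __fastcall is speed, because it passes arguments in registers (in practice, ECX and EDX carry the first two arguments of DWORD size or smaller, the remaining arguments are still pushed right to left, and the called function cleans the argument stack before returning). The __fastcall convention prefixes the exported function name with an "@" sign and appends another "@" sign followed by the number of bytes of its arguments: @functionname@number.
It closely resembles __stdcall; the only difference is that the first two arguments travel in registers. Note that the two register arguments are assigned left to right (the first argument goes in ECX, the second in EDX), while the others are pushed right to left. The return value is again in EAX.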
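The three decoration schemes described so far can be summarized in a few lines of C. This is a sketch only: decorate_name and the convention enum are hypothetical, and the code simply formats strings according to the rules in the text rather than querying any compiler.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef enum { CONV_CDECL, CONV_STDCALL, CONV_FASTCALL } convention;

/* Hypothetical helper: writes the decorated form of `name` into `out`.
   `arg_bytes` is the total size of the arguments in bytes; it is ignored
   for cdecl. Returns the value returned by snprintf. */
int decorate_name(char *out, size_t out_size, convention conv,
                  const char *name, unsigned arg_bytes)
{
    assert(out != NULL && name != NULL);
    assert(strlen(name) > 0);

    switch (conv) {
    case CONV_STDCALL:
        return snprintf(out, out_size, "_%s@%u", name, arg_bytes);   /* _name@N */
    case CONV_FASTCALL:
        return snprintf(out, out_size, "@%s@%u", name, arg_bytes);   /* @name@N */
    case CONV_CDECL:
    default:
        return snprintf(out, out_size, "_%s", name);                 /* _name   */
    }
}
```

For int func(int a, double b) the argument size is 4 + 8 = 12 bytes, which gives the _func@12 mentioned above.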

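4.__pascal
Parameters are passed left to right, the return value is in EAX, and the callee cleans the stack.

5.__thiscall

Applies only to C++ member functions. The this pointer is stored in the ECX register and the parameters are pushed right to left. In VC6 thiscall is not a keyword, so the programmer cannot specify it explicitly.

The default calling convention can be selected in the project settings under Setting...\C/C++ \Code Generation; the default is __cdecl.

Function calling methods: Stdcall Cdecl Fastcall WINAPI CALLBACK PASCAL Thiscall Fortran Syscall Declspec(Naked)

http://www.cnitblog.com/textbox/archive/2010/03/10/64575.html

Modern programming languages offer a surprising number of ways to call a function. To understand them fully you need to look at the assembly code. Each has its own characteristics,
but they differ mainly in the following respects:

1. How parameters are handled (the order in which they are passed, and whether they are stored on the stack or in registers)
2. How the function ends (clean-up, such as who restores the stack: the function itself, or the caller after the call)

Theory:

__cdecl  The caller balances the stack; parameters are pushed right to left. This is the default calling convention for C and C++ programs. Every call site
           contains code to clear the stack, so the resulting executable is larger than one that calls __stdcall functions.
           VC prefixes the compiled function name with an underscore. It is also the default calling convention in MFC.
__stdcall, WINAPI, CALLBACK, PASCAL    The callee balances the stack; parameters are pushed right to left. On Win32 the WINAPI, CALLBACK, and PASCAL
           macros expand to __stdcall, and the convention is typically used by the Win32 API. The function clears the stack itself on exit. VC prefixes
           the compiled function name with an underscore and appends "@" followed by the number of bytes of its parameters.

__fastcall The callee balances the stack; parameters go into registers first, then onto the stack. True to its name, its main feature is speed, because
          it passes arguments in registers (in practice, ECX and EDX carry the first two arguments of DWORD size or smaller, the remaining arguments are
          still pushed right to left, and the called function cleans the argument stack before returning). Its name decoration differs from the previous
          two: VC prefixes the compiled function name with "@" and appends "@" followed by the number of bytes of its parameters.

__thiscall The callee balances the stack; parameters are pushed right to left and the this pointer is passed in ECX. Applies only to C++ member
           functions. In VC6 thiscall is not a keyword, so the programmer cannot specify it explicitly.


__declspec(naked) Strictly speaking this is a function attribute rather than a calling convention. It is rarely seen and generally not recommended. The
          compiler adds no prologue or epilogue code to such a function and, more unusually, you cannot use return to return a value; the result has
          to be returned with inline assembly. It is typically used for low-level code such as real-mode drivers.

Practice: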

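#include <stdio.h>
#include <stdarg.h>
#include <windows.h>  /* assumed here as the source of the lowercase 'pascal' macro used below */

int __stdcall test_stdcall(char para1, char para2)
{ 
  para1 = para2;
  return 0;
}
/* Prints each variadic argument until a '\0' terminator is read. */
int __cdecl test_cdecl(char para, ...)
{ 
  char p = '\n';
  va_list marker;
  va_start( marker, para );
  while( p != '\0' )
  { 
    /* Standard C requires va_arg( marker, int ) here, because char arguments are
       promoted to int; the char form is kept to match the MSVC listing below. */
    p = va_arg( marker, char);
    printf("%c\n", p);
  }
  va_end( marker );
  return 0;
}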

int pascal test_pascal(char para1, char para2)
{ 
  return 0;
}

int __fastcall test_fastcall(char para1, char para2, char para3, char para4)
{ 
  para1 = (char)1;
  para2 = (char)2;
  para3 = (char)3;
  para4 = (char)4;
  return 0;
}
__declspec(naked) void __stdcall test_naked(char para1, char para2)
{ 
  __asm
  { 
    push ebp
    mov ebp, esp
    push eax
    mov al,byte ptr [ebp + 0Ch]
    xchg byte ptr [ebp + 8],al
    pop eax
    pop ebp
    ret 8
  }
//    return ;
}

int main( int argc, char* argv[ ] )
{
  test_stdcall( 'a', 'b' );
  test_cdecl( 'c', 'd', 'e', 'f', 'g', 'h', '\0' );
  test_pascal( 'e', 'f' );
  test_fastcall( 'g', 'h', 'i', 'j' );
  test_naked( 'k', 'l' );
  return 0;
}

The disassembly is as follows:

int main(int argc, char* argv[])
{
00411350  push        ebp  
00411351  mov         ebp,esp 
00411353  sub         esp,0C0h 
00411359  push        ebx  
0041135A  push        esi  
0041135B  push        edi  
0041135C  lea         edi,[ebp-0C0h] 
00411362  mov         ecx,30h 
00411367  mov         eax,0CCCCCCCCh 
0041136C  rep stos    dword ptr es:[edi] 

    test_stdcall( 'a', 'b' );
0041136E  push        62h  
00411370  push        61h  
00411372  call        _test_stdcall@8

    test_cdecl( 'c','d','e','f','g' ,'h' ,'\0');
00411377  push        0    
00411379  push        68h  
0041137B  push        67h  
0041137D  push        66h  
0041137F  push        65h  
00411381  push        64h  
00411383  push        63h  
00411385  call        _test_cdecl
0041138A  add         esp,1Ch ; the caller restores the stack to its state before the _test_cdecl arguments were pushed: add esp,n*4 with n=7 arguments

    test_fastcall( 'g', 'h', 'i', 'j' );
0041138D  push        6Ah  
0041138F  push        69h  
00411391  mov         dl,68h 
00411393  mov         cl,67h 
00411395  call        test_fastcall

    test_naked( 'k', 'l');
0041139A  push        6Ch  
0041139C  push        6Bh  
0041139E  call        _test_naked

    return 0;
004113A3  xor         eax,eax 
}

int __stdcall test_stdcall(char para1, char para2)
{
004111F0  push        ebp  
004111F1  mov         ebp,esp 
004111F3  sub         esp,0C0h 

004111F9  push        ebx  
004111FA  push        esi  
004111FB  push        edi  
004111FC  lea         edi,[ebp-0C0h] 
00411202  mov         ecx,30h 
00411207  mov         eax,0CCCCCCCCh 
0041120C  rep stos    dword ptr es:[edi] ; fill the local area starting at EDI with 0CCCCCCCCh
    para1 = para2;
0041120E  mov         al,byte ptr [para2] ;mov al,byte ptr[ebp+c]
00411211  mov         byte ptr [para1],al ;mov byte ptr[ebp+8],al
    return 0;
00411214  xor         eax,eax
00411216  pop         edi  
00411217  pop         esi  
00411218  pop         ebx  

00411219  mov         esp,ebp 
0041121B  pop         ebp  
0041121C  ret          8 ; restore the stack to its state before the arguments were pushed; with two parameters this is ret 8, equivalent to pop eip followed by add esp,8
}

int __cdecl test_cdecl(char para,... )
{
00411230  push        ebp  
00411231  mov         ebp,esp 
00411233  sub         esp,0D8h 
0041123C  lea         edi,[ebp-0D8h] 
00411242  mov         ecx,36h 
00411247  mov         eax,0CCCCCCCCh 
0041124C  rep stos    dword ptr es:[edi] 
    char    p = '\n';
0041124E  mov         byte ptr [p],0Ah 
    va_list marker;
    va_start( marker, para );
00411252  lea         eax,[ebp+0Ch] 
00411255  mov         dword ptr [marker],eax 
    while( p != '\0' )
00411258  movsx       eax,byte ptr [p] 
0041125C  test        eax,eax 
0041125E  je          test_cdecl+60h (411290h) 
    {
        p = va_arg( marker, char);
00411260  mov         eax,dword ptr [marker] 
00411263  add         eax,4 
00411266  mov         dword ptr [marker],eax 
00411269  mov         ecx,dword ptr [marker] 
0041126C  mov         dl,byte ptr [ecx-4] 
0041126F  mov         byte ptr [p],dl 
        printf("%c\n", p);
00411272  movsx       eax,byte ptr [p] 
00411276  mov         esi,esp 
00411278  push        eax  
00411279  push        offset string "%c\n" (41401Ch) 
0041127E  call        dword ptr [__imp__printf (416180h)] 
00411284  add         esp,8 
0041128E  jmp         test_cdecl+28h (411258h) 
    }
    va_end( marker );
00411290  mov         dword ptr [marker],0 
    return 0;
00411297  xor         eax,eax 
004112A9  mov         esp,ebp 
004112AB  pop         ebp  
004112AC  ret    
}
     
int __fastcall test_fastcall(char para1, char para2, char para3, char para4)
{
004112D0  push        ebp  
004112D1  mov         ebp,esp 
004112D3  sub         esp,0D8h  
004112DD  lea         edi,[ebp-0D8h] 
004112E3  mov         ecx,36h 
004112E8  mov         eax,0CCCCCCCCh 
004112ED  rep stos    dword ptr es:[edi] 
004112EF  pop         ecx  
004112F0  mov         byte ptr [ebp-14h],dl 
004112F3  mov         byte ptr [ebp-8],cl 
    para1 = (char)1;
004112F6  mov         byte ptr [para1],1 
    para2 = (char)2;
004112FA  mov         byte ptr [para2],2 
    para3 = (char)3;
004112FE  mov         byte ptr [para3],3 
    para4 = (char)4;
00411302  mov         byte ptr [para4],4 
    return 0;
00411306  xor         eax,eax  
0041130B  mov         esp,ebp 
0041130D  pop         ebp  
0041130E  ret         8  ; ECX and EDX carry two of the four parameters, so only two were pushed and this is ret 4*2
}
     

__declspec(naked) void __stdcall test_naked(char para1, char para2)
{
00411330  push        ebp      ; the compiler adds no prologue or stack clean-up here; the code is copied exactly as written
00411331  mov         ebp,esp 
00411333  push        eax  
00411334  mov         al,byte ptr [para2]    
00411337  xchg        al,byte ptr [para1] 
0041133A  pop         eax  
0041133B  pop         ebp  
0041133C  ret         8  
}

http://securityetalii.es/2013/01/20/calling-conventions-hunting/

Calling Conventions Hunting

Posted on 20/01/2013 by Adrián

When trying to understand a binary, it’s key to be able to identify functions, and with them, their parameters and local variables. This will help the reverser figuring out APIs, data structures, etc. In short, gaining a deep understanding of the software. When dealing with functions, it’s essential to be able to identify the calling convention in use, as many times that will allow the reverser to perform educated guesses on the arguments and local variables used by the function. I’ll try to describe here a couple of points that may aid in identifying the calling convention of any given function and the number and ordering of its parameters.

Calling Conventions

A calling convention defines how functions are called in a program. They influence how data (arguments/variables) is laid on the stack when the function call takes place. A comprehensive definition of calling conventions is beyond the scope of this blog, nonetheless the most common ones are briefly described below.

cdecl

Description: Standard C/C++ calling convention. Allows functions to receive a dynamic number of parameters.

Cleans the stack: The caller is responsible for restoring the stack after making a function call.

Arguments passed: On the stack. Arguments are pushed in reverse order (i.e. from right to left): the last argument is pushed first and the first argument is pushed last, so the first argument ends up at the lowest address.

void _cdecl fun();

fastcall

Description: Slightly better performance calling convention.

Cleans the stack: The callee is responsible for restoring the stack before returning.

Arguments passed: First two arguments are passed in registers (ECX and EDX). The rest are passed through the stack.

void __fastcall func();

stdcall

Description: Very common in Windows (used by most APIs).

Cleans the stack: The callee is responsible for cleaning up the stack before returning. Usually by means of a RETN #N instruction.

Arguments passed: On the stack. Arguments are pushed from right to left (the same as cdecl), so the first argument is pushed last.

void __stdcall fun();

thiscall

Description: Used when a C++ method with a static number of parameters is called. Designed specifically to improve the performance of OO languages (VC++ reserves ECX for the this pointer; GCC pushes the this pointer onto the stack last). When a dynamic number of parameters is required, compilers usually fall back to cdecl and pass the this pointer as the first parameter on the stack.

Cleans the stack: In GCC, caller cleans the stack. In Microsoft VC++ the callee is responsible for cleaning up.

Arguments passed: From right to left (as cdecl). The last argument is pushed first, and the first argument is pushed last.

void __thiscall func();

Let the small table below serve as a quick reminder.

http://www.cs.virginia.edu/~evans/cs216/guides/x86.html

Calling Convention

To allow separate programmers to share code and develop libraries for use by many programs,

and to simplify the use of subroutines in general, programmers typically adopt a common calling convention.

The calling convention is a protocol about how to call and return from routines.

For example, given a set of calling convention rules, a programmer need not examine the definition of a subroutine to determine

how parameters should be passed to that subroutine.

Furthermore, given a set of calling convention rules, high-level language compilers can be made to follow the rules,

thus allowing hand-coded assembly language routines and high-level language routines to call one another.

In practice, many calling conventions are possible.

We will use the widely used C language calling convention.

Following this convention will allow you to write assembly language subroutines that are safely callable from C (and C++) code,

and will also enable you to call C library functions from your assembly language code.

The C calling convention is based heavily on the use of the hardware-supported stack.

It is based on the push, pop, call, and ret instructions.

Subroutine parameters are passed on the stack.

Registers are saved on the stack, and local variables used by subroutines are placed in memory on the stack.

The vast majority of high-level procedural languages implemented on most processors have used similar calling conventions.

The calling convention is broken into two sets of rules.

The first set of rules is employed by the caller of the subroutine, and the second set of rules is observed by the writer of the subroutine (the callee).

It should be emphasized that mistakes in the observance of these rules quickly result in fatal program errors

since the stack will be left in an inconsistent state; thus meticulous care should be used when implementing the call convention in your own subroutines.

A good way to visualize the operation of the calling convention is to draw the contents of the nearby region of the stack during subroutine execution. The image above depicts the contents of the stack during the execution of a subroutine with three parameters and three local variables. The cells depicted in the stack are 32-bit wide memory locations, thus the memory addresses of the cells are 4 bytes apart. The first parameter resides at an offset of 8 bytes from the base pointer. Above the parameters on the stack (and below the base pointer), the call instruction placed the return address, thus leading to an extra 4 bytes of offset from the base pointer to the first parameter. When the ret instruction is used to return from the subroutine, it will jump to the return address stored on the stack.

Caller Rules

To make a subroutine call, the caller should:

Before calling a subroutine, the caller should save the contents of certain registers that are designated caller-saved. The caller-saved registers are EAX, ECX, EDX. Since the called subroutine is allowed to modify these registers, if the caller relies on their values after the subroutine returns, the caller must push the values in these registers onto the stack (so they can be restored after the subroutine returns).
To pass parameters to the subroutine, push them onto the stack before the call. The parameters should be pushed in inverted order (i.e. last parameter first). Since the stack grows down, the first parameter will be stored at the lowest address (this inversion of parameters was historically used to allow functions to be passed a variable number of parameters).
To call the subroutine, use the call instruction. This instruction places the return address on top of the parameters on the stack, and branches to the subroutine code. This invokes the subroutine, which should follow the callee rules below.

After the subroutine returns (immediately following the call instruction), the caller can expect to find the return value of the subroutine in the register EAX. To restore the machine state, the caller should:

Remove the parameters from stack. This restores the stack to its state before the call was performed.
Restore the contents of caller-saved registers (EAX, ECX, EDX) by popping them off of the stack. The caller can assume that no other registers were modified by the subroutine.

Example
The code below shows a function call that follows the caller rules. The caller is calling a function _myFunc that takes three integer parameters. First parameter is in EAX, the second parameter is the constant 216; the third parameter is in memory location var.

push [var] ; Push last parameter first
push 216   ; Push the second parameter
push eax   ; Push first parameter last

call _myFunc ; Call the function (assume C naming)

add esp, 12

Note that after the call returns, the caller cleans up the stack using the add instruction.

We have 12 bytes (3 parameters * 4 bytes each) on the stack, and the stack grows down.

Thus, to get rid of the parameters, we can simply add 12 to the stack pointer.

The result produced by _myFunc is now available for use in the register EAX.

The values of the caller-saved registers (ECX and EDX) may have been changed.

If the caller uses them after the call, it would have needed to save them on the stack before the call and restore them after it.

Callee Rules

The definition of the subroutine should adhere to the following rules at the beginning of the subroutine:

Push the value of EBP onto the stack, and then copy the value of ESP into EBP using the following instructions:
```
    push ebp
    mov  ebp, esp
```
This initial action maintains the base pointer, EBP. The base pointer is used by convention as a point of reference for finding parameters and local variables on the stack.
When a subroutine is executing, the base pointer holds a copy of the stack pointer value from when the subroutine started executing.
Parameters and local variables will always be located at known, constant offsets away from the base pointer value.

We push the old base pointer value at the beginning of the subroutine so that we can later restore the appropriate base pointer value for the caller when the subroutine returns.
Remember, the caller is not expecting the subroutine to change the value of the base pointer.
We then move the stack pointer into EBP to obtain our point of reference for accessing parameters and local variables.
Next, allocate local variables by making space on the stack. Recall, the stack grows down, so to make space on the top of the stack, the stack pointer should be decremented.
The amount by which the stack pointer is decremented depends on the number and size of local variables needed.
For example, if 3 local integers (4 bytes each) were required, the stack pointer would need to be decremented by 12 to make space for these local variables
(i.e., sub esp, 12). As with parameters, local variables will be located at known offsets from the base pointer.
Next, save the values of the callee-saved registers that will be used by the function.
To save registers, push them onto the stack. The callee-saved registers are EBX, EDI, and ESI
(ESP and EBP will also be preserved by the calling convention, but need not be pushed on the stack during this step).

After these three actions are performed, the body of the subroutine may proceed. When the subroutine returns, it must follow these steps:

Leave the return value in EAX.
Restore the old values of any callee-saved registers (EBX, EDI, and ESI) that were modified.
The register contents are restored by popping them from the stack. The registers should be popped in the inverse order that they were pushed.
Deallocate local variables.
The obvious way to do this might be to add the appropriate value to the stack pointer (since the space was allocated by subtracting the needed amount from the stack pointer).
In practice, a less error-prone way to deallocate the variables is to move the value in the base pointer into the stack pointer:
mov esp, ebp.
This works because the base pointer always contains the value that the stack pointer contained immediately prior to the allocation of the local variables.
Immediately before returning, restore the caller's base pointer value by popping EBP off the stack.
Recall that the first thing we did on entry to the subroutine was to push the base pointer to save its old value.
Finally, return to the caller by executing a ret instruction. This instruction will find and remove the appropriate return address from the stack.

Note that the callee's rules fall cleanly into two halves that are basically mirror images of one another.

The first half of the rules apply to the beginning of the function, and are commonly said to define the prologue to the function.

The latter half of the rules apply to the end of the function, and are thus commonly said to define the epilogue of the function.

Example
Here is an example function definition that follows the callee rules:

.486
.MODEL FLAT
.CODE
PUBLIC _myFunc
_myFunc PROC
  ; Subroutine Prologue
  push ebp     ; Save the old base pointer value.
  mov ebp, esp ; Set the new base pointer value.
  sub esp, 4   ; Make room for one 4-byte local variable.
  push edi     ; Save the values of registers that the function
  push esi     ; will modify. This function uses EDI and ESI.
  ; (no need to save EBX, EBP, or ESP)

  ; Subroutine Body
  mov eax, [ebp+8]   ; Move value of parameter 1 into EAX
  mov esi, [ebp+12]  ; Move value of parameter 2 into ESI
  mov edi, [ebp+16]  ; Move value of parameter 3 into EDI

  mov [ebp-4], edi   ; Move EDI into the local variable
  add [ebp-4], esi   ; Add ESI into the local variable
  add eax, [ebp-4]   ; Add the contents of the local variable
                     ; into EAX (final result)

  ; Subroutine Epilogue 
  pop esi      ; Recover register values
  pop  edi
  mov esp, ebp ; Deallocate local variables
  pop ebp ; Restore the caller's base pointer value
  ret
_myFunc ENDP
END

The subroutine prologue performs the standard actions of saving a snapshot of the stack pointer in EBP (the base pointer),
allocating local variables by decrementing the stack pointer, and saving register values on the stack.

In the body of the subroutine we can see the use of the base pointer.
Both parameters and local variables are located at constant offsets from the base pointer for the duration of the subroutine's execution.
In particular, we notice that since parameters were placed onto the stack before the subroutine was called, they are always located below the base pointer (i.e. at higher addresses) on the stack.
The first parameter to the subroutine can always be found at memory location [EBP+8], the second at [EBP+12], the third at [EBP+16].

Similarly, since local variables are allocated after the base pointer is set, they always reside above the base pointer (i.e. at lower addresses) on the stack.

In particular, the first local variable is always located at [EBP-4], the second at [EBP-8], and so on.

This conventional use of the base pointer allows us to quickly identify the use of local variables and parameters within a function body.
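The fixed offsets described above follow a simple pattern, sketched here in C. The function names are hypothetical, and the sketch assumes the frame layout of this guide: every parameter and local variable is 4 bytes wide, the saved EBP is at [EBP], and the return address is at [EBP+4].

```c
#include <assert.h>

/* Hypothetical helper: offset from EBP of the n-th parameter (1-based).
   The saved EBP and the return address take the first 8 bytes. */
int param_offset(int n)
{
    assert(n >= 1);
    return 8 + 4 * (n - 1);
}

/* Hypothetical helper: offset from EBP of the n-th 4-byte local variable
   (1-based). Locals lie at lower addresses, hence the negative sign. */
int local_offset(int n)
{
    assert(n >= 1);
    return -4 * n;
}
```

With these helpers, the third parameter resolves to [EBP+16] and the second local variable to [EBP-8], matching the offsets quoted in the text.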

The function epilogue is basically a mirror image of the function prologue.
The caller's register values are recovered from the stack, the local variables are deallocated by resetting the stack pointer,
the caller's base pointer value is recovered, and the ret instruction is used to return to the appropriate code location in the caller.
