Hello World in C on Linux, Explained

The first C program

For most people learning C, the first program they meet is the classic Hello World, which prints the string "Hello World" on the current terminal.
The code is as follows:

#include <stdio.h>

int main(void)
{
  printf("Hello World\n");  /* print the greeting followed by a newline */
  return 0;
}

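On a GNU/Linux system with the gcc compiler, the steps to compile and run helloworld are:

Write the code above in the vi editor and save it as helloworld.c
Compile the source into the executable helloworld: gcc -o helloworld helloworld.c
Run the helloworld program in the current directory: ./helloworld

The terminal then prints Hello World.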

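How the program runs

Executable programs on GNU/Linux are binary files in the ELF format, which plays the same role as the exe (PE) format on Windows. A shell such as Bash starts the program by forking and calling execve;
the kernel then maps the file into the memory of the new process (not a new thread of the shell) and execution begins. The file command shows the format of the resulting binary: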
:~$ file helloworld
helloworld: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=203388067920d237ab234e8eb97714f56919799f, not stripped

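Compiling and linking

Producing an executable from source code takes several steps, the most important being compilation and linking. In the procedure above, gcc drove both of them.
The two steps can also be run separately: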

gcc -c helloworld.c
ld -o helloworld helloworld.o -dynamic-linker /lib64/ld-linux-x86-64.so.2 /usr/lib/x86_64-linux-gnu/crt1.o /usr/lib/x86_64-linux-gnu/crti.o /usr/lib/x86_64-linux-gnu/crtn.o -lc

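As you can see, even the simple helloworld program depends on a number of system files, chiefly the C runtime startup objects (crt1.o, crti.o and crtn.o, where crt stands for C RunTime) and the system C library, glibc.
The exact paths and steps differ between platforms; adding -v to the gcc command prints the complete compile and link commands it runs.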

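The runtime

In the source code the program appears to start at main, but during compilation and linking the toolchain adds runtime startup code, so main is not the real entry point. nm lists the addresses and symbols in the program: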
$ nm helloworld

0000000000600734 D __bss_start
0000000000600730 D __data_start
0000000000600730 W data_start
0000000000600570 d _DYNAMIC
0000000000600734 D _edata
0000000000600738 D _end
0000000000400464 T _fini
0000000000600708 d _GLOBAL_OFFSET_TABLE_
                 w __gmon_start__
0000000000400340 T _init
0000000000600570 d __init_array_end
0000000000600570 d __init_array_start
000000000040047c R _IO_stdin_used
0000000000400460 T __libc_csu_fini
00000000004003f0 T __libc_csu_init
                 U __libc_start_main@@GLIBC_2.2.5
00000000004003a0 T main
                 U puts@@GLIBC_2.2.5
00000000004003c0 T _start

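The listing shows that main is not at the start of the code segment: the entry point is _start, which calls __libc_start_main, which in turn calls main. Passing the linker's -Map option through gcc generates a map file that makes the code segment, data segment and other layout details easy to inspect:
gcc -o helloworld helloworld.c -Wl,-Map,helloworld.map
helloworld.map clearly shows the text section that contains main, together with the related addresses.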

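Link libraries

At run time the dynamic linker searches for shared libraries in roughly this order:

The search path embedded in the executable when it was linked (rpath);
The search path given by the environment variable LD_LIBRARY_PATH;
The search paths configured in /etc/ld.so.conf;
The default search paths /lib and /usr/lib.
To embed a search path for a library in a non-standard location, pass it to the linker through gcc with -Wl,-rpath,DIR.
Here we use only the standard C library, which ldd resolves to /lib/x86_64-linux-gnu/libc.so.6. The GLIBC_2.2.5 suffix in the nm output is the version of the symbols the program binds to, not the release number of the installed library.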
ldd helloworld
linux-vdso.so.1 => (0x00007ffd493f3000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f5f12756000)
/lib64/ld-linux-x86-64.so.2 (0x00007f5f12b20000)

Compiler optimization

The C library function we call explicitly is printf, which is declared in stdio.h:

/* Write formatted output to stdout.

   This function is a possible cancellation point and therefore not
   marked with __THROW.  */
extern int printf (const char *__restrict __format, ...);

Yet the nm output shows that the executable actually references puts from the C library. The assembly output shows the call even more clearly:
gcc -S helloworld.c
cat helloworld.s

        .file   "helloworld.c"
        .section        .rodata
.LC0:
        .string "Hello World"
        .text
        .globl  main
        .type   main, @function
main:
.LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        movl    $.LC0, %edi
        call    puts
        nop
        popq    %rbp
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc
.LFE0:
        .size   main, .-main
        .ident  "GCC: (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609"
        .section        .note.GNU-stack,"",@progbits

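When the format string is a plain string ending in a newline, with nothing that needs converting, gcc replaces the printf call with puts. Optimizations like this are sometimes invisible to the programmer,
so when the exact calls matter it is worth checking the generated code to see what the compiler did.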

How Hello World is printed

From the analysis above we know that helloworld mainly calls puts to print. An implementation of puts in the glibc sources looks like this:

/* Write the string in S and a newline to stdout.  */
int
puts (const char *s)
{
  return fputs (s, stdout) || putchar ('\n') == EOF ? EOF : 0;
}

The function calls fputs to send the string to stdout (standard output) and then sends a newline character, which also goes to stdout:

/* Write the character C on stdout.  */
int
putchar (int c)
{
  return __putc (c, stdout);
}

stdout, stdin and stderr

So what is stdout, and how does glibc connect stdout to our terminal?
In glibc, stdout is a pointer to FILE:

/* Standard streams.  */
extern FILE *stdin, *stdout, *stderr;
#ifdef __STRICT_ANSI__
/* ANSI says these are macros; satisfy pedants.  */
#define	stdin	stdin
#define	stdout	stdout
#define	stderr	stderr
#endif

These three pointers wrap the three standard file descriptors, numbered 0, 1 and 2:

/* Standard streams.  */
#define	READ		1, 0
#define	WRITE		0, 1
#define	BUFFERED	0
#define	UNBUFFERED	1
#define	stdstream(name, next, fd, readwrite, unbuffered)		      \
    {									      \
      _IOMAGIC,								      \
      NULL, NULL, NULL, NULL, 0,					      \
      (void *) fd,							      \
      { readwrite, /* ... */ },						      \
      { NULL, NULL, NULL, NULL, NULL },					      \
      { NULL, NULL },							      \
      -1, -1,								      \
      (next),								      \
      NULL, '\0', 0,							      \
      0, 0, unbuffered, 0, 0, 0, 0					      \
    }
static FILE stdstreams[3] =
  {
    stdstream (&stdstreams[0], &stdstreams[1], STDIN_FILENO, READ, BUFFERED),
    stdstream (&stdstreams[1], &stdstreams[2], STDOUT_FILENO, WRITE, BUFFERED),
    stdstream (&stdstreams[2], NULL, STDERR_FILENO, WRITE, UNBUFFERED),
  };
FILE *stdin = &stdstreams[0];
FILE *stdout = &stdstreams[1];
FILE *stderr = &stdstreams[2];

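From this code we can tell that:

Only stderr is unbuffered; stdin and stdout are buffered, so characters written to stdout may not appear immediately. (When stdout is a terminal it is normally line-buffered, so a trailing newline flushes it.)
stdin is read-only, while stdout and stderr are write-only; the result of other operations, such as reading from stdout, is unpredictable.
The fd is assigned directly with an explicit cast, which assumes that 0, 1 and 2 are already open descriptors; otherwise input and output fail with errors.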

So when are the standard descriptors opened?

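stdio and tty

By default stdio is tied to a tty. A system can have many users, and one user can open several terminals, yet the output of printf and similar functions appears on the current terminal,
because each session's standard streams refer to its own tty. In the glibc sources, the place where the standard descriptors 0, 1 and 2 are set up is: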
login_tty.c

int
login_tty(fd)
	int fd;
{
	(void) setsid();
#ifdef TIOCSCTTY
	if (ioctl(fd, TIOCSCTTY, (char *)NULL) == -1)
		return (-1);
#else
	{
	  /* This might work.  */
	  char *fdname = ttyname (fd);
	  int newfd;
	  if (fdname)
	    {
	      if (fd != 0)
		(void) close (0);
	      if (fd != 1)
		(void) close (1);
	      if (fd != 2)
		(void) close (2);
	      newfd = open (fdname, O_RDWR);
	      (void) close (newfd);
	    }
	}
#endif
	(void) dup2(fd, 0);
	(void) dup2(fd, 1);
	(void) dup2(fd, 2);
	if (fd > 2)
		(void) close(fd);
	return (0);
}

At each login, the fd passed in by the login program is duplicated with dup2 onto descriptors 0, 1 and 2.
stdin, stdout and stderr therefore refer to the same file: the tty used by the current login.

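From memory to the device

The shell starts our helloworld program and it is loaded into memory, so the "Hello World" string also sits in memory. How does it reach the tty device?
A tty is an abstract device: behind it may be an LCD display, a serial port, or an LED display. How the tty maps onto the hardware and how the output flows
depends on the specific device driver, which is a topic of its own. Roughly, the data flows like this:

The output device is bound to the tty, so writing to the tty hands the data to the display device's driver
The driver finally sends the string data to the device by DMA or over some other bus
The device displays the string we want to see, "Hello World"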
