x86 Linux Hello World Example

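This is a basic Hello World program in NASM assembly for 32-bit x86 Linux, using system calls directly (without any libc function calls). It's a lot to take in, but it becomes understandable over time. Lines starting with a semicolon (;) are comments.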

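If you aren't already familiar with low-level Unix systems programming, you may prefer to write only functions in asm and call them from a C or C++ program. That way you can concentrate on learning how to handle registers and memory, without also having to learn the POSIX system-call API and the ABI for using it.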

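This makes two system calls: write(2) and _exit(2) (not the exit(3) libc wrapper, which flushes stdio buffers and so on). (Technically, _exit() calls sys_exit_group rather than sys_exit, but that only matters in a multi-threaded process.) See also syscalls(2) for documentation on system calls in general, and on the difference between making them directly and going through the libc wrapper functions.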
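As a rough C analogue of the two calls the assembly program makes, the sketch below skips the dedicated write(2) and _exit(2) wrappers and goes through the generic syscall(2) function instead. The helper names raw_write and raw_exit are made up for this illustration, and the code assumes Linux with glibc (or another libc that provides syscall()).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes the syscall() declaration in <unistd.h> */
#endif
#include <stddef.h>
#include <sys/syscall.h>       /* SYS_write, SYS_exit */
#include <unistd.h>            /* syscall() */

/* Hypothetical helper: write(2) without its dedicated libc wrapper.
   Returns the byte count, or -1 with errno set (syscall() does that translation). */
long raw_write(int fd, const void *buf, size_t len)
{
    return syscall(SYS_write, fd, buf, len);
}

/* Hypothetical helper: sys_exit, the call the assembly version uses.
   Unlike exit(3), nothing flushes stdio buffers or runs atexit handlers. */
void raw_exit(int status)
{
    syscall(SYS_exit, status);
}
```

Calling raw_write(1, "Hello, world!\n", 14) followed by raw_exit(0) issues the same two system calls as the assembly program.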

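In summary, a system call is made by placing the args in the appropriate registers and the system call number in eax, then executing an int 0x80 instruction. See also "What are the return values of system calls in Assembly?" for more explanation of how the asm syscall interface is documented using C syntax.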
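One difference between the raw interface and the C documentation is how failure is reported: on Linux, a raw return value in the range -4095 to -1 is a negated errno code, and anything else is a success value. The sketch below shows the translation that libc wrappers perform; the function name is hypothetical and exists only for illustration.

```c
#include <errno.h>

/* Hypothetical helper: convert a raw Linux syscall return value (as left in eax)
   into the libc convention of "return -1 and set errno". */
long libc_style_result(long raw)
{
    if (raw < 0 && raw >= -4095) {   /* -4095..-1 encodes -errno */
        errno = (int)-raw;
        return -1;
    }
    return raw;                      /* success: a byte count, a file descriptor, ... */
}
```

In asm, this means a failed write shows up as a small negative number in eax after int 0x80, with no separate errno variable involved.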

The system call numbers for the 32-bit ABI are in /usr/include/i386-linux-gnu/asm/unistd_32.h (/usr/include/x86_64-linux-gnu/asm/unistd_32.h has the same contents).

#include <sys/syscall.h> will ultimately include the right file, so you can run echo '#include <sys/syscall.h>' | gcc -E - -dM | less to see the macro definitions (see this answer for more about finding constants for asm in C headers).
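As an alternative to paging through preprocessor output, a small C function can return the constants the compiler actually sees. This is a sketch only; the function names are made up for illustration, and the values depend on the ABI being compiled for.

```c
#include <sys/syscall.h>   /* SYS_write, SYS_exit (aliases for __NR_write, __NR_exit) */

/* Hypothetical helpers returning the call numbers for the target ABI. */
long nr_write(void) { return SYS_write; }
long nr_exit(void)  { return SYS_exit; }
```

Built with gcc -m32, these return 4 and 1, the values the assembly program loads into eax. A 64-bit x86-64 build returns 1 and 60 instead, which is one reason 32-bit and 64-bit code cannot share call numbers.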

section .text             ; Executable code goes in the .text section
global _start             ; The linker looks for this symbol to set the process entry point, so execution starts here
;;;a name followed by a colon defines a symbol.  The global _start directive modifies it so it's a global symbol, not just one that we can CALL or JMP to from inside the asm.
;;; note that _start isn't really a "function".  You can't return from it, and the kernel passes argc, argv, and env differently than main() would expect.
 _start:
    ;;; write(1, msg, len);
    ; Start by moving the arguments into registers, where the kernel will look for them
    mov     edx,len       ; 3rd arg goes in edx: buffer length
    mov     ecx,msg       ; 2nd arg goes in ecx: pointer to the buffer
    ;Set output to stdout (goes to your terminal, or wherever you redirect or pipe)
    mov     ebx,1         ; 1st arg goes in ebx: Unix file descriptor. 1 = stdout, which is normally connected to the terminal.

    mov     eax,4         ; system call number (from SYS_write / __NR_write from unistd_32.h).
    int     0x80          ; generate an interrupt, activating the kernel's system-call handling code.  64-bit code uses a different instruction, different registers, and different call numbers.
    ;; eax = return value, all other registers unchanged.

    ;;;Second, exit the process.  There's nothing to return to, so we can't use a ret instruction (like we could if this was main() or any function with a caller)
    ;;; If we don't exit, execution continues into whatever bytes are next in the memory page,
    ;;; typically leading to a segmentation fault because the padding 00 00 decodes to  add [eax],al.

    ;;; _exit(0);
    xor     ebx,ebx       ; first arg = exit status = 0.  (will be truncated to 8 bits).  Zeroing registers is a special case on x86, and mov ebx,0 would be less efficient.
                      ;; leaving out the zeroing of ebx would mean we exit(1), i.e. with an error status, since ebx still holds 1 from earlier.
    mov     eax,1         ; put __NR_exit into eax
    int     0x80          ;Execute the Linux function

section     .rodata       ; Section for read-only constants

             ;; msg is a label, and in this context doesn't need to be msg:.  It could be on a separate line.
             ;; db = Data Bytes: assemble some literal bytes into the output file.
msg     db  'Hello, world!',0xa     ; ASCII string constant plus a newline (0xa = decimal 10)

             ;;  No terminating zero byte is needed, because we're using write(), which takes a buffer + length instead of an implicit-length string.
             ;; To make this a C string that we could pass to puts or strlen, we'd need a terminating 0 byte. (e.g. "...", 0xa, 0)

len     equ $ - msg       ; Define an assemble-time constant (not stored by itself in the output file, but will appear as an immediate operand in insns that use it)
                          ; Calculate len = string length.  subtract the address of the start
                          ; of the string from the current position ($)
  ;; equivalently, we could have put a str_end: label after the string and done   len equ str_end - msg
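
For readers coming from C, a rough analogue of the msg / len pair is an array plus a sizeof expression, both resolved at build time. This is only a sketch: the names mirror the labels in the listing, and the two accessor functions are hypothetical.

```c
#include <stddef.h>

/* The array includes the terminating 0 byte that C appends to string literals... */
static const char msg[] = "Hello, world!\n";

/* ...so subtract 1 to get the same value as `len equ $ - msg`. */
enum { len = sizeof msg - 1 };

/* Hypothetical accessors so other code can use the constant and the buffer. */
size_t msg_len(void) { return len; }
const char *msg_ptr(void) { return msg; }
```

As with equ, len occupies no storage of its own; the compiler folds it into the instructions that use it.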

On Linux, you can save this file as Hello.asm and build a 32-bit executable from it with these commands:

nasm -felf32 Hello.asm                  # assemble as 32-bit code.  Add -Worphan-labels -g -Fdwarf  for debug symbols and warnings
gcc -nostdlib -static -m32 Hello.o -o Hello     # link without CRT startup code or libc, making a static binary (-static also avoids problems where gcc defaults to PIE)

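See this answer for more details on building assembly into 32-bit or 64-bit, static or dynamically linked Linux executables, for NASM/YASM syntax or for GNU AT&T syntax with GNU as directives. (Key point: make sure to use -m32 or the equivalent when building 32-bit code on a 64-bit host, or you will get confusing problems at run time.)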

You can trace its execution with strace to see the system calls it makes:

$ strace ./Hello 
execve("./Hello", ["./Hello"], [/* 72 vars */]) = 0
[ Process PID=4019 runs in 32 bit mode. ]
write(1, "Hello, world!\n", 14Hello, world!
)         = 14
_exit(0)                                = ?
+++ exited with 0 +++

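The trace on stderr and the regular output on stdout both go to the terminal here, so they are interleaved on the line showing the write system call. You can redirect the output or trace to a file if you care. Notice that this lets us see the syscall return values easily, without adding code to print them, and is actually easier for this purpose than using a regular debugger such as gdb.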
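strace is not the only way to observe the final result: the value passed to _exit becomes the process's exit status, truncated to 8 bits as noted in the listing, and a parent process can read it with waitpid(2). A minimal sketch, with a hypothetical helper name:

```c
#include <sys/types.h>
#include <sys/wait.h>      /* waitpid(), WIFEXITED, WEXITSTATUS */
#include <unistd.h>        /* fork(), _exit() */

/* Hypothetical helper: run a child that calls _exit(status) and return
   the exit status the parent observes, or -1 on failure. */
int observed_exit_status(int status)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(status);                 /* child: exit immediately, no stdio flush */

    int wstatus;
    if (waitpid(pid, &wstatus, 0) != pid || !WIFEXITED(wstatus))
        return -1;
    return WEXITSTATUS(wstatus);       /* only the low 8 bits survive */
}
```

For example, a child that exits with 0x101 is seen by the parent as having exited with status 1. From a shell, echo $? after ./Hello shows the same value.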

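The x86-64 version of this program would be very similar: it passes the same args to the same system calls, just in different registers, and uses the syscall instruction instead of int 0x80.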
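To make the register differences concrete without a second assembly listing, here is a sketch in C using GCC/Clang extended inline asm. It assumes a 64-bit x86-64 Linux target, where the call number goes in rax and the first three args in rdi, rsi and rdx; on any other target it falls back to libc's syscall() so that it still compiles. The function name is hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <sys/syscall.h>   /* SYS_write (1 on x86-64, not 4 as in the 32-bit ABI) */
#include <unistd.h>        /* syscall(), used only by the fallback */

/* Hypothetical helper: write(fd, buf, len) via the x86-64 `syscall` instruction.
   On x86-64 it returns the raw kernel value (a negated errno code on failure). */
long write_via_syscall_insn(int fd, const void *buf, size_t len)
{
#if defined(__linux__) && defined(__x86_64__) && !defined(__ILP32__)
    long ret;
    __asm__ volatile (
        "syscall"
        : "=a"(ret)                  /* rax: return value      */
        : "a"((long)SYS_write),      /* rax: system call number */
          "D"((long)fd),             /* rdi: 1st arg           */
          "S"(buf),                  /* rsi: 2nd arg           */
          "d"(len)                   /* rdx: 3rd arg           */
        : "rcx", "r11", "memory");   /* `syscall` overwrites rcx and r11 */
    return ret;
#else
    /* Fallback for other targets: returns -1 and sets errno on failure. */
    return syscall(SYS_write, fd, buf, len);
#endif
}
```

Unlike int 0x80, the syscall instruction does not leave every other register unchanged: rcx and r11 are overwritten, which is why they appear in the clobber list.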