How do I call the exit function in the C library

  • 2020-06-19 11:13:50
  • OfStack

Compiling the greater-than operator

The original plan was to compile if expressions, but I found nothing that could serve as the test part of an if, so I decided to first implement the comparison of two numbers. I chose the greater-than operator, that is, >, as the example. So my goal is to compile code like this


(> 1 2)

The result of the comparison is placed in the EAX register. Since the language is still very rudimentary and has no Boolean type, I follow C: the numeric value 0 stands for logical false and any other value for logical true. So when the expression above is translated into assembly code and run, the value in the EAX register should be 0.

Compiling the greater-than operator and putting the result into the EAX register requires three new instructions: CMP, JG, and JMP. My idea is to put the first operand into the EAX register and the second into the EBX register, then compare the two registers with the CMP instruction. If the value in EAX is greater than the value in EBX, the JG instruction jumps to a MOV instruction that sets EAX to 1. Otherwise JG does not jump, and execution falls through to the next MOV instruction, which writes 0 to EAX, followed by a JMP that skips over the MOV that writes 1. The idea is pretty simple.
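
The control flow described above can be traced with a very small interpreter. The following is only an illustrative Python model, not part of the Lisp compiler: the names `run` and `greater_than_program` and the labels `greater` and `end` are made up for this sketch, and a single boolean stands in for the CPU flags that CMP sets and JG reads.

```python
def run(program):
    """Interpret a tiny instruction list and return the final value of EAX.

    Items are either label strings or (mnemonic, operand...) tuples,
    with operands in the same source-then-destination order as the listings.
    """
    labels = {item: i for i, item in enumerate(program) if isinstance(item, str)}
    regs = {"eax": 0, "ebx": 0}
    greater = False                      # stand-in for the flags set by CMP
    pc = 0
    while pc < len(program):
        item = program[pc]
        pc += 1
        if isinstance(item, str):        # a label: nothing to execute
            continue
        op, *args = item
        if op == "movl":                 # (movl source destination)
            src, dst = args
            regs[dst] = regs[src] if src in regs else src
        elif op == "cmpl":               # (cmpl x y): is y greater than x?
            x, y = args
            greater = regs[y] > regs[x]
        elif op == "jg":                 # jump only if the comparison said "greater"
            if greater:
                pc = labels[args[0]]
        elif op == "jmp":                # unconditional jump
            pc = labels[args[0]]
    return regs["eax"]


def greater_than_program(a, b):
    """Instruction sequence for (> a b), mirroring the description above."""
    return [
        ("movl", a, "eax"),
        ("movl", b, "ebx"),
        ("cmpl", "ebx", "eax"),
        ("jg", "greater"),
        ("movl", 0, "eax"),
        ("jmp", "end"),          # without this jump, EAX would be overwritten with 1
        "greater",
        ("movl", 1, "eax"),
        "end",
    ]


print(run(greater_than_program(1, 2)))
print(run(greater_than_program(3, 2)))
```

Removing the `("jmp", "end")` entry makes every comparison return 1, which is why the JMP is needed.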

Before modifying jjcc2, you also need to add support for > to inside-out/aux, but that is nothing special: just add > to the argument of member, since > is just another symbol. Then change jjcc2 to the following form


(defun jjcc2 (expr globals)
  "A compiler that supports the four arithmetic operations on two numbers"
  (check-type globals hash-table)
  (cond ((eq (first expr) '+)
         `((movl ,(get-operand expr 0) %eax)
           (movl ,(get-operand expr 1) %ebx)
           (addl %ebx %eax)))
        ((eq (first expr) '-)
         `((movl ,(get-operand expr 0) %eax)
           (movl ,(get-operand expr 1) %ebx)
           (subl %ebx %eax)))
        ((eq (first expr) '*)
         ;; The product of the two numbers is placed in the register that holds the second operand.
         ;; Because we agreed that EAX is the register from which the continuation reads the final result, the second operand should be EAX.
         `((movl ,(get-operand expr 0) %eax)
           (movl ,(get-operand expr 1) %ebx)
           (imull %ebx %eax)))
        ((eq (first expr) '/)
         `((movl ,(get-operand expr 0) %eax)
           (cltd)
           (movl ,(get-operand expr 1) %ebx)
           (idivl %ebx)))
        ((eq (first expr) 'progn)
         (let ((result '()))
           (dolist (expr (rest expr))
             (setf result (append result (jjcc2 expr globals))))
           result))
        ((eq (first expr) 'setq)
         ;; An assignment is compiled simply by treating the assigned symbol as a global variable and then moving the contents of the EAX register into it.
         ;; TODO: (second expr) must be a symbol here.
         ;; FIXME: I don't know what initial value it should get, so just write 0 for now!
         (setf (gethash (second expr) globals) 0)
         (values (append (jjcc2 (third expr) globals)
                         ;; To keep the stringify function simple, the RIP-relative form is constructed directly here.
                         `((movl %eax ,(get-operand expr 0))))
                 globals))
        ((eq (first expr) '_exit)
         ;; Since _exit is known to take only one argument, just put its first operand into the EDI register.
         ;; TODO: A better approach would be a separate function that handles argument passing (so that it matches the calling convention).
         `((movl ,(get-operand expr 0) %edi)
           (movl #x2000001 %eax)
           (syscall)))
        ((eq (first expr) '>)
         ;; This is the way I could come up with, given my limited knowledge of assembly language, to get the result of the comparison into the EAX register.
         (let ((label-greater-than (intern (symbol-name (gensym)) :keyword))
               (label-end (intern (symbol-name (gensym)) :keyword)))
           ;; According to this article (https://en.wikibooks.org/wiki/X86_Assembly/Control_Flow#Comparison_Instructions),
           ;; the number to the left of the greater-than sign should be the second operand of the CMP instruction and the one on the right the first operand.
           `((movl ,(get-operand expr 0) %eax)
             (movl ,(get-operand expr 1) %ebx)
             (cmpl %ebx %eax)
             (jg ,label-greater-than)
             (movl $0 %eax)
             (jmp ,label-end)
             ,label-greater-than
             (movl $1 %eax)
             ,label-end)))))

You can then run the following code in the REPL


(let* ((ht (make-hash-table))
 (asm (jjcc2 (inside-out '(_exit (> 1 2))) ht)))
 (stringify asm ht))

The output assembly code is


 .data
G809: .long 0
 .section __TEXT,__text,regular,pure_instructions
 .globl _main
_main:
 MOVL $1, %EAX
 MOVL $2, %EBX
 CMPL %EBX, %EAX
 JG G810
 MOVL $0, %EAX
 JMP G811
G810:
 MOVL $1, %EAX
G811:
 MOVL %EAX, G809(%RIP)
 MOVL G809(%RIP), %EDI
 MOVL $33554433, %EAX
 SYSCALL
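
The constant #x2000001 in the Lisp source and $33554433 in the output are the same number written in hexadecimal and decimal. On 64-bit macOS, the system call number placed in EAX carries a class in its upper bits: class 2 (the BSD/Unix class) shifted left by 24 bits, combined with the call number, which is 1 for exit. The short check below uses Python purely as a calculator; the constant names mirror the ones in the XNU headers but are written here only for illustration.

```python
SYSCALL_CLASS_UNIX = 2      # class of BSD/Unix system calls on macOS
SYSCALL_CLASS_SHIFT = 24    # the class sits above the low 24 bits
SYS_EXIT = 1                # number of the exit system call within that class

exit_syscall = (SYSCALL_CLASS_UNIX << SYSCALL_CLASS_SHIFT) | SYS_EXIT
print(hex(exit_syscall), exit_syscall)
```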

After compiling, linking, and running it, you should get the expected result. Now for the main topic of this article

Call the exit function of the C standard library

With the greater-than operator (>) implemented in the introduction above, compiling if expressions follows easily, so I won't explain it in detail. In this section, I'll show you how to generate assembly code that calls the exit(3) function from the C standard library.
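
As a brief aside on the if expression before moving on: the layout that the if branch of jjcc2 produces (see the final listing in this article) can be sketched as follows. This is a Python illustration rather than the Lisp implementation; `fresh_label` and `compile_if` are hypothetical names, the branches are passed in as already-compiled lists of assembly lines, and the test code is assumed to leave its result in EAX.

```python
import itertools

_counter = itertools.count(1)


def fresh_label(prefix="L"):
    """Return a label that has not been handed out before (hypothetical helper)."""
    return f"{prefix}{next(_counter)}"


def compile_if(test_code, then_code, else_code):
    """Lay out test, then-branch and else-branch around two fresh labels."""
    label_else = fresh_label()
    label_end = fresh_label()
    return (
        test_code
        + ["CMPL $0, %EAX", f"JE {label_else}"]    # 0 means false: go to the else branch
        + then_code
        + [f"JMP {label_end}", f"{label_else}:"]   # skip the else branch after the then branch
        + else_code
        + [f"{label_end}:"]
    )


for line in compile_if(["MOVL $0, %EAX"], ["MOVL $10, %EAX"], ["MOVL $20, %EAX"]):
    print(line)
```

The structure is the same as for the greater-than operator: one conditional jump, one unconditional jump, and two labels.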

There is no built-in function called EXIT in Common Lisp, so as with the earlier implementation of _exit, I add a new branch that recognizes the symbol exit as (first expr). Calling the exit function in the C standard library requires following the calling convention. For a function like exit, which takes only one argument, this is simple: just do the same as for _exit and put the argument into EDI. In the beginning, I wrote code like this


(defun jjcc2 (expr globals)
  ;; Irrelevant parts omitted
  (cond ;; Irrelevant parts omitted
        ((member (first expr) '(_exit exit))
         ;; For now, hardcode the check for whether a function comes from the C standard library.
         `((movl ,(get-operand expr 0) %edi)
           (call :|_exit|)))))

Compiling (exit 1) yields the following code


 .data
 .section __TEXT,__text,regular,pure_instructions
 .globl _main
_main:
 MOVL $1, %EDI
 CALL _exit

However, after this code is compiled and linked, running it always ends in a segmentation fault. After some searching, I learned that when calling a C function on macOS the stack must be aligned to 16 bytes, which I understood to mean that the pointer to the top of the stack must be a multiple of 16. Therefore, I modified jjcc2 into the following form
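
Rounding an address down to a multiple of 16 amounts to clearing its low 4 bits with a bitwise AND. The Python sketch below shows only the arithmetic; `align_down_16` is a hypothetical helper and the addresses are made-up example values, not ones taken from a real process.

```python
def align_down_16(address: int) -> int:
    """Clear the low 4 bits so the result is a multiple of 16."""
    return address & ~0xF


for rsp in (0x7FFEEFBFF9A8, 0x7FFEEFBFF9A0, 0x7FFEEFBFF9AF):
    aligned = align_down_16(rsp)
    print(hex(rsp), "->", hex(aligned), "multiple of 16:", aligned % 16 == 0)
```

Because the stack grows toward lower addresses, rounding down only moves the stack pointer further into unused space, so it cannot overwrite anything already on the stack.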


(defun jjcc2 (expr globals)
  ;; Irrelevant parts omitted
  (cond ;; Irrelevant parts omitted
        ((member (first expr) '(_exit exit))
         ;; For now, hardcode the check for whether a function comes from the C standard library.
         `((movl ,(get-operand expr 0) %edi)
           ;; According to this answer (https://stackoverflow.com/questions/12678230/how-to-print-argv0-in-nasm), calling a C function on macOS requires the stack to be aligned to 16 bytes.
           ;; Crudely align the top of the stack. Because the stack grows toward lower addresses, clearing the low 4 bits of the address is enough.
           (and ,(format nil "$0x~X" #XFFFFFFF0) %esp)
           (call :|_exit|)))))

It turned out that this still doesn't work. In the end, I had no choice but to write a simple piece of C code and generate assembly code from it with gcc -S to see how the stack alignment should be handled. After a lot of fiddling, it turned out that the RSP register has to be adjusted rather than the ESP register. I don't know why, since ESP is just the low 32 bits of RSP.
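
A plausible explanation, which the text above does not state: in 64-bit mode, an instruction that writes a 32-bit register such as ESP zero-extends its result into the full 64-bit register. An AND on %esp would therefore not only clear the low 4 bits but also wipe the upper 32 bits of RSP, leaving the stack pointer at an address that most likely is not part of the stack at all. The Python model below illustrates that effect; `and_esp` and `and_rsp` are hypothetical helpers and the starting address is a made-up example.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def and_esp(rsp: int, imm: int) -> int:
    """Model `and $imm, %esp`: the 32-bit result is zero-extended into RSP."""
    return (rsp & MASK32) & (imm & MASK32)


def and_rsp(rsp: int, imm: int) -> int:
    """Model `and $imm, %rsp`: all 64 bits of the register take part."""
    return rsp & imm & MASK64


rsp = 0x00007FFEEFBFF9A8                         # made-up 64-bit stack address
print(hex(and_esp(rsp, 0xFFFFFFF0)))             # upper 32 bits are lost
print(hex(and_rsp(rsp, 0xFFFFFFFFFFFFFFF0)))     # only the low 4 bits change
```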

Finally, jjcc2 compiles (exit 1) successfully when written as follows


(defun jjcc2 (expr globals)
  "A compiler that supports the four arithmetic operations on two numbers"
  (check-type globals hash-table)
  (cond ((eq (first expr) '+)
         `((movl ,(get-operand expr 0) %eax)
           (movl ,(get-operand expr 1) %ebx)
           (addl %ebx %eax)))
        ((eq (first expr) '-)
         `((movl ,(get-operand expr 0) %eax)
           (movl ,(get-operand expr 1) %ebx)
           (subl %ebx %eax)))
        ((eq (first expr) '*)
         ;; The product of the two numbers is placed in the register that holds the second operand.
         ;; Because we agreed that EAX is the register from which the continuation reads the final result, the second operand should be EAX.
         `((movl ,(get-operand expr 0) %eax)
           (movl ,(get-operand expr 1) %ebx)
           (imull %ebx %eax)))
        ((eq (first expr) '/)
         `((movl ,(get-operand expr 0) %eax)
           (cltd)
           (movl ,(get-operand expr 1) %ebx)
           (idivl %ebx)))
        ((eq (first expr) 'progn)
         (let ((result '()))
           (dolist (expr (rest expr))
             (setf result (append result (jjcc2 expr globals))))
           result))
        ((eq (first expr) 'setq)
         ;; An assignment is compiled simply by treating the assigned symbol as a global variable and then moving the contents of the EAX register into it.
         ;; TODO: (second expr) must be a symbol here.
         ;; FIXME: I don't know what initial value it should get, so just write 0 for now!
         (setf (gethash (second expr) globals) 0)
         (values (append (jjcc2 (third expr) globals)
                         ;; To keep the stringify function simple, the RIP-relative form is constructed directly here.
                         `((movl %eax ,(get-operand expr 0))))
                 globals))
        ;; ((eq (first expr) '_exit)
        ;;  ;; Since _exit is known to take only one argument, just put its first operand into the EDI register.
        ;;  ;; TODO: A better approach would be a separate function that handles argument passing (so that it matches the calling convention).
        ;;  `((movl ,(get-operand expr 0) %edi)
        ;;    (movl #x2000001 %eax)
        ;;    (syscall)))
        ((eq (first expr) '>)
         ;; This is the way I could come up with, given my limited knowledge of assembly language, to get the result of the comparison into the EAX register.
         (let ((label-greater-than (intern (symbol-name (gensym)) :keyword))
               (label-end (intern (symbol-name (gensym)) :keyword)))
           ;; According to this article (https://en.wikibooks.org/wiki/X86_Assembly/Control_Flow#Comparison_Instructions),
           ;; the number to the left of the greater-than sign should be the second operand of the CMP instruction and the one on the right the first operand.
           `((movl ,(get-operand expr 0) %eax)
             (movl ,(get-operand expr 1) %ebx)
             (cmpl %ebx %eax)
             (jg ,label-greater-than)
             (movl $0 %eax)
             (jmp ,label-end)
             ,label-greater-than
             (movl $1 %eax)
             ,label-end)))
        ((eq (first expr) 'if)
         ;; Assume that the result of the if test expression is also placed in the %eax register, so it is enough to compare the value in %eax with 0 (as in C).
         (let ((label-else (intern (symbol-name (gensym)) :keyword))
               (label-end (intern (symbol-name (gensym)) :keyword)))
           (append (jjcc2 (second expr) globals)
                   `((cmpl $0 %eax)
                     (je ,label-else))
                   (jjcc2 (third expr) globals)
                   `((jmp ,label-end)
                     ,label-else)
                   (jjcc2 (fourth expr) globals)
                   `(,label-end))))
        ((member (first expr) '(_exit exit))
         ;; For now, hardcode the check for whether a function comes from the C standard library.
         `((movl ,(get-operand expr 0) %edi)
           ;; According to this answer (https://stackoverflow.com/questions/12678230/how-to-print-argv0-in-nasm), calling a C function on macOS requires the stack to be aligned to 16 bytes.
           ;; Crudely align the top of the stack. Because the stack grows toward lower addresses, clearing the low 4 bits of the address is enough.
           (and ,(format nil "$0x~X" #XFFFFFFFFFFFFFFF0) %rsp)
           (call :|_exit|)))))

The generated assembly code is shown below


  .data
  .section __TEXT,__text,regular,pure_instructions
  .globl _main
_main:
  MOVL $1, %EDI
  AND $0xFFFFFFFFFFFFFFF0, %RSP
  CALL _exit

At this point I thought that supporting other functions from the C standard library would just be a matter of following the same path, which seemed simple enough. That was naive of me.
