GCC Calling Conventions
Question 1: What is the advantage of using both callee-saved and caller-saved registers? Why can’t all the registers be either callee-saved or caller-saved?
GCC dictates how the stack is used. The contract between caller and callee on x86 is:
- At entry to a function (i.e., just after call):
- %eip points to the first instruction of function
- %esp+4 points at first argument
- %esp points at return address
- After ret instruction
- %eip contains return address
- %esp points at the arguments pushed by the caller
- Called function may have trashed arguments
- %eax (and %edx, if return type is 64-bit) contains return value (or trash if the function is void)
- %eax, %edx, and %ecx may be trashed
- %ebp, %ebx, %esi, and %edi must contain their contents from the time of the call
- %eax, %ecx, %edx are “caller save” registers
- %ebp, %ebx, %esi, and %edi are “callee save” registers
- The above text is a part of course notes and seems to be taken from https://www.cse.iitd.ernet.in/~sbansal/os/lec/l5.html
- Google Search
- Calling conventions for different C++ compilers and operating systems - 57 Page PDF (by Agner Fog, Technical University of Denmark).
- The 64 bit x86 C Calling Convention - 6 Page PDF Chapter. This chapter was derived from a document written by Adam Ferrari and later updated by Alan Batson, Mike Lack, Anita Jones, and Aaron Bloomfield. It does not discuss caller-saved vs. callee-saved registers or the reasons for the split.
- It’s beneficial for a calling convention to designate both caller-save registers and callee-save registers. If the convention designated all registers as callee-save, then subroutines would not be able to use any registers at all without saving them onto the stack first — which would be a waste, since some of the saved registers would be transient values that the calling subroutine did not care about long-term. And if the convention designated all registers as caller-save, then programmers would be forced to save many registers before every call to a subroutine and to restore them afterwards, lengthening the amount of time to call a subroutine. Ref: 1, 2
- In general, neither caller‐save nor callee‐save is “best”:
- If caller isn’t using a register, caller‐save is better
- If callee doesn’t need a register, callee‐save is better
- If “do need to save”, callee‐save generally makes smaller programs
- Functions are called from multiple places
- So… “some of each” and compiler tries to “pick registers” that minimize amount of saving/restoring
- Ref: https://courses.cs.washington.edu/courses/cse410/17wi/lectures/CSE410-L13-procedures-II_17wi.pdf
- Caller-save are volatile, Callee-save are non-volatile. Ref
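To make the trade-off concrete, here is a minimal C sketch. The function names `square` and `sum_of_squares` are made up for illustration. `total` is live across the call to `square`, so a compiler following the convention above would typically keep it in a callee-saved register (or spill it to the stack), while `t` is dead once it has been passed and costs nothing in a caller-saved register. The actual allocation depends on the compiler version and flags, so treat the comments as expectations, not guarantees.

```c
#include <assert.h>

/* Hypothetical helper. noinline (a GCC/Clang attribute) keeps a real call
 * in the generated code so that the register convention matters. */
__attribute__((noinline)) int square(int x)
{
    /* Leaf function: it can usually work entirely in caller-saved
     * registers (%eax, %ecx, %edx) and save nothing. */
    return x * x;
}

int sum_of_squares(const int *v, int n)
{
    assert(n >= 0);
    int total = 0;              /* live across the call: a callee-saved
                                 * register (e.g. %ebx) avoids a
                                 * save/restore around every call */
    for (int i = 0; i < n; i++) {
        int t = v[i];           /* dead after being passed: a caller-saved
                                 * register is free here */
        total += square(t);
    }
    return total;
}
```

Compiling with `gcc -m32 -O1 -S` and reading the assembly shows which registers were actually chosen.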
Question 2: Why do we need to save all the registers on the stack on an interrupt? Can we only save callee-saved registers?
From Course Notes: Interrupts are events from external devices that force a processor (CPU) to execute a different stream of instructions. Interrupts can be triggered by multiple devices. To distinguish between interrupts from different devices, a vector number is associated with every interrupt. The x86 hardware supports 256 different interrupt vectors. The IDT (Interrupt Descriptor Table) contains the handler for every interrupt. On an interrupt, the hardware automatically saves EFLAGS, CS, and EIP on the stack before jumping to the target handler. An IRET instruction automatically pops EIP, CS, and EFLAGS from the stack and restores them.
- When a hardware interrupt occurs on an x86, the flags and the return code segment+offset are pushed onto the stack. Then interrupts are disabled. This sets the stage for the interrupt routine to service the interrupt: it can switch stacks or do whatever else it needs, then either re-enable interrupts and continue processing, or return from the interrupt. The iret instruction pops the previously saved flags (including the interrupt flag, which was originally enabled) and the return location so that the interrupted routine can continue processing none the wiser. Ref
- GitLab Source Code of some part of GCC: https://gitlab.indel.ch/thirdparty/gcc/commit/5ed3cc7b66af4758f7849ed6f65f4365be8223be
- (This reference describes ARM Cortex-M, not x86.) For regular function calls we use the registers and stack to pass parameters, but interrupt threads have logically separate registers and stack. More specifically, registers are automatically saved by the processor as it switches from the main program (foreground thread) to the interrupt service routine (background thread). Exiting an ISR will restore the registers back to their previous values. Thus, all parameter passing must occur through global memory. One cannot pass data from the main program to the interrupt service routine using registers or the stack. Ref: http://users.ece.utexas.edu/~valvano/Volume1/E-Book/C12_Interrupts.htm
- The compiler automatically saves and restores any registers that it uses in an interrupt function.
If it sees that the interrupt function calls another function, it will save and restore all of the working registers that the called function might use.
It is not necessary to be concerned with individual registers when programming in C, since the W register array is a compiler-managed resource.
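One way to see why saving only the callee-saved registers is not enough: the interrupted code never executed a call, so it may be holding live values in %eax, %ecx, or %edx. The sketch below is a toy model in plain C, not real interrupt code. The `regs_t` struct and all function names are assumptions made for illustration. The handler body behaves like an ordinary compiled C function, which preserves callee-saved registers but is free to trash the caller-saved ones.

```c
#include <assert.h>

/* Toy model of the x86 general-purpose registers (not real hardware state). */
typedef struct {
    int eax, ecx, edx;          /* caller-saved */
    int ebx, esi, edi, ebp;     /* callee-saved */
} regs_t;

/* Hypothetical handler body. Like any function that follows the calling
 * convention, it restores the callee-saved register it uses but leaves
 * garbage in the caller-saved ones. */
static void handler_body(regs_t *r)
{
    int saved_ebx = r->ebx;     /* prologue: save the callee-saved register */
    r->eax = 0xdead;
    r->ecx = 0xbeef;
    r->edx = 0xf00d;
    r->ebx = 99;                /* use %ebx as scratch */
    r->ebx = saved_ebx;         /* epilogue: restore it */
}

/* Entry stub that relies only on the calling convention. */
void interrupt_callee_saved_only(regs_t *r)
{
    assert(r);
    handler_body(r);            /* %eax, %ecx, %edx come back trashed */
}

/* Entry stub that saves every register first (in the spirit of pusha/popa). */
void interrupt_save_all(regs_t *r)
{
    assert(r);
    regs_t saved = *r;          /* save all registers */
    handler_body(r);
    *r = saved;                 /* restore all registers */
}
```

In this model, only `interrupt_save_all` returns the interrupted code to exactly the state it was in, which is what an interrupt must guarantee because it can arrive between any two instructions.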
Saving Registers - Recursive Interrupts
Question 3: Do you think that saving registers on the stack also works with recursive interrupts (a recursive interrupt means an interrupt handler can itself be interrupted)? Justify your answer.
Question 4: Look at the implementation of “printf” in the Pintos source code. printf takes a variable number of arguments. A function that receives a variable number of arguments must declare the name of at least one parameter. Given the name of a parameter, the “va_start” routine can retrieve additional parameters. Do you think the compiler can retrieve the first argument if the arguments are passed in the reversed order, i.e., the argument just after the return address is the nth argument? Justify your answer.
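A possible way to reason about Question 3 is that a stack is last-in, first-out, which matches the order in which nested handlers finish. The sketch below is a toy model with a single register and a small software stack. All names (`stack_mem`, `stack_top`, `reg_eax`, `nested_handler`) are made up for illustration, and it assumes the stack is large enough for the deepest nesting.

```c
#include <assert.h>

#define STACK_SLOTS 64

int stack_mem[STACK_SLOTS];
int stack_top = STACK_SLOTS;    /* grows downward, like the x86 stack */
int reg_eax;                    /* the single modeled register */

static void push(int v)
{
    assert(stack_top > 0);      /* nesting must be bounded by the stack size */
    stack_mem[--stack_top] = v;
}

static int pop(void)
{
    assert(stack_top < STACK_SLOTS);
    return stack_mem[stack_top++];
}

/* Hypothetical handler that is itself interrupted `depth` more times. */
void nested_handler(int depth)
{
    push(reg_eax);                      /* save the interrupted code's value */
    reg_eax = 1000 + depth;             /* the handler uses the register */
    if (depth > 0)
        nested_handler(depth - 1);      /* a nested interrupt arrives here */
    assert(reg_eax == 1000 + depth);    /* the nested handler restored it */
    reg_eax = pop();                    /* restore before returning */
}
```

Each handler pops exactly what it pushed, so every level sees its own values again once the level above it returns.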
- Godbolt - Compiler Explorer. Link: https://godbolt.org/#g:!((g:!((g:!((h:codeEditor,i:(fontScale:1.7000000000000006,j:1,options:(colouriseAsm:'0',compileOnChange:'0'),source:'long+pcount_r(unsigned+long+x)+%7B%0A++if+(x+%3D%3D+0)%0A++++return+0%3B%0A++else%0A++++return+(x%261)%2Bpcount_r(x+%3E%3E+1)%3B%0A%7D%0A'),l:'5',n:'1',o:'C%2B%2B+source+%231',t:'0')),k:50,l:'4',n:'0',o:'',s:0,t:'0'),(g:!((h:compiler,i:(compiler:g520,filters:(b:'0',commentOnly:'0',directives:'0'),fontScale:1.7000000000000006,options:'-O1'),l:'5',n:'0',o:'%231+with+x86-64+gcc+5.2',t:'0')),k:50,l:'4',n:'0',o:'',s:0,t:'0')),l:'2',n:'0',o:'',t:'0')),version:4) from some famous course
- CSE IIT-Delhi Course
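For Question 4, a small variadic function helps show what the callee relies on. Under the contract in Question 1, the first argument is always at %esp+4 no matter how many arguments follow, so the callee can read it first and use it to learn how many more to fetch. If the nth argument sat next to the return address instead, the first argument's offset would depend on n, which the callee does not know in advance. The function `sum_ints` below is a hypothetical example, not the Pintos printf.

```c
#include <assert.h>
#include <stdarg.h>

/* Sums `count` int arguments. `count` is the named parameter that
 * va_start needs in order to locate the unnamed arguments after it. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    assert(count >= 0);
    va_start(ap, count);                /* start just past the last named parameter */
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);       /* fetch the next unnamed int argument */
    va_end(ap);
    return total;
}
```

printf works the same way, except that the format string plays the role of `count` by telling the callee how many arguments to fetch and of which types.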