Is GCC broken when taking the address of an argument on ARM7TDMI?

Tags: assembly arm gcc c
Question!

My C code snippet takes the address of an argument and stores it in a volatile memory location (preprocessed code):

void foo(unsigned int x) {
    *(volatile unsigned int*)(0x4000000 + 0xd4) = (unsigned int)(&x);
}

int main() {
    foo(1);
    while(1);
}

I used an SVN version of GCC for compiling this code. At the end of function foo I would expect to have the value 1 stored in the stack and, at 0x40000d4, an address pointing to that value. When I compile without optimizations using the flag -O0, I get the expected ARM7TMDI assembly output (commented for your convenience):

        .align  2
        .global foo
        .type   foo, %function
foo:
        @ Function supports interworking.
        @ args = 0, pretend = 0, frame = 8
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        sub     sp, sp, #8
        str     r0, [sp, #4]     @ 3. Store the argument on the stack
        mov     r3, #67108864
        add     r3, r3, #212
        add     r2, sp, #4       @ 4. Address of the stack variable
        str     r2, [r3, #0]     @ 5. Store the address at 0x40000d4
        add     sp, sp, #8
        bx      lr
        .size   foo, .-foo
        .align  2
        .global main
        .type   main, %function
main:
        @ Function supports interworking.
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        stmfd   sp!, {r4, lr}
        mov     r0, #1           @ 1. Pass the argument in register 0
        bl      foo              @ 2. Call function foo
.L4:
        b       .L4
        .size   main, .-main
        .ident  "GCC: (GNU) 4.4.0 20080820 (experimental)"

It clearly stores the argument first on the stack and from there stores it at 0x40000d4. When I compile with optimizations using -O1, I get something unexpected:

        .align  2
        .global foo
        .type   foo, %function
foo:
        @ Function supports interworking.
        @ args = 0, pretend = 0, frame = 8
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        sub     sp, sp, #8
        mov     r2, #67108864
        add     r3, sp, #4        @ 3. Address of *something* on the stack
        str     r3, [r2, #212]    @ 4. Store the address at 0x40000d4
        add     sp, sp, #8
        bx      lr
        .size   foo, .-foo
        .align  2
        .global main
        .type   main, %function
main:
        @ Function supports interworking.
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        stmfd   sp!, {r4, lr}
        mov     r0, #1           @ 1. Pass the argument in register 0
        bl      foo              @ 2. Call function foo
.L4:
        b       .L4
        .size   main, .-main
        .ident  "GCC: (GNU) 4.4.0 20080820 (experimental)"

This time the argument is never stored on the stack even though something from the stack is still stored at 0x40000d4.

Is this just expected/undefined behaviour? Have I done something wrong or have I in fact found a Compiler Bug™?



Answers

Tomi Kyöstilä wrote

development for the Game Boy Advance. I was reading about its DMA system and I experimented with it by creating single-color tile bitmaps. The idea was to have the indexed color be passed as an argument to a function which would use DMA to fill a tile with that color. The source address for the DMA transfer is stored at 0x40000d4.

That's a perfectly valid thing for you to do, and I can see how the (unexpected) code you got with the -O1 optimization wouldn't work.

I see the (expected) code you got with the -O0 optimization does what you expect -- it puts value of the color you want on the stack, and a pointer to that color in the DMA transfer register.

However, even the (expected) code you got with the -O0 optimization wouldn't work, either. By the time the DMA hardware gets around to taking that pointer and using it to read the desired color, that value on the stack has (probably) long been overwritten by other subroutines or interrupt handlers or both. And so both the expected and the unexpected code result in the same thing -- the DMA is (probably) going to fetch the wrong color.

I think you really intended to store the color value in some location where it stays safe until the DMA is finished reading it. So a global variable, or a function-local static variable such as

// Warning: Three Star Programmer at work

// Warning: untested code.
void foo(unsigned int x) {
    static volatile unsigned int color = x; // "static" so it's not on the stack
    volatile unsigned int** dma_register =
        (volatile unsigned int**)(0x4000000 + 0xd4);
    *dma_register = &color;
}

int main() {
    foo(1);
    while(1);
}

Does that work for you?

You see I use "volatile" twice, because I want to force two values to be written in that particular order.



One thing to note is that according to the standard, casts are r-values. GCC used to allow it, but in recent versions has become a bit of a standards stickler.

I don't know if it will make a difference, but you should try this:

void foo(unsigned int x) {
    volatile unsigned int* ptr = (unsigned int*)(0x4000000 + 0xd4);
    *ptr = (unsigned int)(&x);
}

int main() {
    foo(1);
    while(1);
}

Also, I doubt you intended it, but you are storing the address of the function local x (which is a copy of the int you passed). You likely want to make foo take an "unsigned int *" and pass the address of what you really want to store.

So I feel a more proper solution would be this:

void foo(unsigned int *x) {
    volatile unsigned int* ptr = (unsigned int*)(0x4000000 + 0xd4);
    *ptr = (unsigned int)(x);
}

int main() {
    int x = 1;
    foo(&x);
    while(1);
}

EDIT: finally, if you code breaks with optimizations it is usually a sign that your code is doing something wrong.



I think Even T. has the answer. You passed in a variable, you cannot take the address of that variable inside the function, you can take the address of a copy of that variable though, btw that variable is typically a register so it doesnt have an address. Once you leave that function its all gone, the calling function loses it. If you need the address in the function you have to pass by reference not pass by value, send the address. It looks to me that the bug is in your code, not gcc.

BTW, using *(volatile blah *)0xabcd or any other method to try to program registers is going to bite you eventually. gcc and most other compilers have this uncanny way of knowing exactly the worst time to strike.

Say the day you change from this

*(volatile unsigned int *)0x12345 = someuintvariable;

to

*(volatile unsigned int *)0x12345 = 0x12;

A good compiler will realize that you are only storing 8 bits and there is no reason to waste a 32 bit store for that, depending on the architecture you specified, or the default architecture for that compiler that day, so it is within its rights to optimize that to an strb instead of an str.

After having been burned by gcc and others with this dozens of times I have resorted to forcing the issue:

.globl PUT32
PUT32:
   str r1,[r0]
   bx lr





   PUT32(0x12345,0x12);

Costs a few extra clock cycles but my code continues to work yesterday, today, and will work tomorrow with any optimization flag. Not having to re-visit old code and sleeping peacefully through the night is worth a few extra clock cycles here and there.

Also if your code breaks when you compile for release instead of compile for debug, that also means it is most likely a bug in your code.

By : dwelch


This video can help you solving your question :)
By: admin