Cortex-M loading 32-bit variable optimization

I'm compiling the test code below, which writes a 32-bit variable through a pointer: once using byte accesses and once using a single word access. The byte-wise version stores the value twice, first in little-endian and then in big-endian byte order.

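#include <stdint.h>

/* Byte-wise stores: 8 bytes in total. */
void load_data_8(uint32_t value, void* d) {
    uint8_t* d_ptr = d;

    /* Little-endian byte order: least significant byte first. */
    *d_ptr++ = (value>>0)&0xFF;
    *d_ptr++ = (value>>8)&0xFF;
    *d_ptr++ = (value>>16)&0xFF;
    *d_ptr++ = (value>>24)&0xFF;

    /* Big-endian byte order: most significant byte first. */
    *d_ptr++ = (value>>24)&0xFF;
    *d_ptr++ = (value>>16)&0xFF;
    *d_ptr++ = (value>>8)&0xFF;
    *d_ptr++ = (value>>0)&0xFF;
}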

void load_data_32(uint32_t value, void* d) {
    uint32_t* d_ptr = d;

    *d_ptr = value;
}

Compiler: ARM GCC 11.2.1
Compiler flags: -mcpu=cortex-m7 -O3 (the Cortex-M7 supports unaligned word accesses)
The compiler produces the following:

load_data_8:
        rev     r3, r0
        str     r0, [r1]  @ unaligned
        str     r3, [r1, #4]      @ unaligned
        bx      lr
load_data_32:
        str     r0, [r1]
        bx      lr
main:
        movs    r0, #0
        bx      lr

If I compile the same code for the Cortex-M0+, which has no hardware support for unaligned memory access, I get this:

Compiler flags: -mcpu=cortex-m0plus -O3

load_data_8:
        push    {r4, lr}
        lsrs    r3, r0, #8
        lsrs    r2, r0, #16
        uxtb    r4, r0
        uxtb    r3, r3
        uxtb    r2, r2
        lsrs    r0, r0, #24
        strb    r4, [r1]
        strb    r3, [r1, #1]
        strb    r2, [r1, #2]
        strb    r0, [r1, #3]
        strb    r0, [r1, #4]
        strb    r2, [r1, #5]
        strb    r3, [r1, #6]
        strb    r4, [r1, #7]
        pop     {r4, pc}
load_data_32:
        str     r0, [r1]
        bx      lr

C-M7 test: Why does the compiler annotate the stores in load_data_8 with @ unaligned on the Cortex-M7, but not the store in load_data_32? How does the compiler know that the data pointer in load_data_32 won't be unaligned?
C-M0+ test: Why doesn't it produce the same kind of code for load_data_8 and load_data_32, given that the first four byte stores write the same 32 bits in the CPU's native (little) endianness? From the core's point of view, what difference does it make whether the pointer type is 8-bit or 32-bit, given that the bytes land at consecutive addresses?
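One way to probe both questions is to change only the alignment information the compiler sees and compare the generated assembly. The sketch below assumes GCC or Clang (__builtin_assume_aligned is a compiler builtin, not standard C), and the function names are hypothetical, not part of the code above. The idea is that a uint32_t* is assumed to be suitably aligned for uint32_t, because dereferencing a misaligned one is undefined behavior, whereas a uint8_t* may point to any address. On a core where an unaligned str faults, byte stores through a pointer of unknown alignment cannot safely be merged into a word store. Whether GCC 11.2.1 actually merges the stores in the second function for -mcpu=cortex-m0plus is something I have not verified.

```c
#include <stdint.h>
#include <string.h>

/* uint8_t* only promises 1-byte alignment, so any merged word store
   would have to be legal at an arbitrary address. */
void store_bytes(uint32_t value, void *d) {
    uint8_t *p = d;
    p[0] = (uint8_t)(value >> 0);
    p[1] = (uint8_t)(value >> 8);
    p[2] = (uint8_t)(value >> 16);
    p[3] = (uint8_t)(value >> 24);
}

/* Hypothetical variant: the same byte stores, but the caller promises
   that d is 4-byte aligned. Passing a misaligned pointer is undefined
   behavior. */
void store_bytes_aligned(uint32_t value, void *d) {
    uint8_t *p = __builtin_assume_aligned(d, 4);
    p[0] = (uint8_t)(value >> 0);
    p[1] = (uint8_t)(value >> 8);
    p[2] = (uint8_t)(value >> 16);
    p[3] = (uint8_t)(value >> 24);
}

/* memcpy makes no alignment assumption about d and stores the value
   in the target's native byte order. */
void store_memcpy(uint32_t value, void *d) {
    memcpy(d, &value, sizeof value);
}
```

Compiling this with the same flags as above (for example with -S to get assembly) and comparing the three functions should show which stores the compiler is willing to combine on each core.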
