31 Replies Latest reply on Feb 15, 2019 6:45 AM by ilg

    Debugging issues with linker optimization

    brendansimon

      I've been having no end of trouble getting an existing IAR EWARM project ported to GNU MCU Eclipse and being able to debug it.

      Finally I found one of the root causes -- the "link time optimizer (-flto)"

      With this optimization disabled, my app loads and debugs ok.

      With this optimization enabled, all the interrupt handlers get optimized out.  GDB starts up, initializes some stuff, and then shuts down straight away.

      The only way I could get any interrupt handlers to appear in the map file (and thus the output elf file) is to use the `used` gcc function attribute.  This is not really a solution as all the `weak` default handlers never get included, which is generally what you want if you don't override them with your own handlers.

      Even if I use the `used` attribute on all the functions, GDB still doesn't play ball.  The same symptom of immediate shutdown prevails, so there is probably some other stuff that the link time optimizer removes.
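
For reference, `weak` and `used` are not mutually exclusive in GCC. The following is a minimal sketch of combining them, not the project's actual startup code: `default_hits` and `default_handler_hits` are hypothetical names added so the behaviour can be observed on a host build, and the weak alias assumes an ELF target such as arm-none-eabi. A real default handler would typically spin forever instead of counting.

```c
#include <stdint.h>

/* Hypothetical counter so the sketch can be observed on a host build. */
static volatile uint32_t default_hits;

/* Shared fallback. `used` asks the compiler to emit it even when it
 * sees no direct caller, which is the situation LTO can create. */
__attribute__((used)) void Default_Handler(void)
{
    default_hits++; /* a real handler would typically spin in while (1) */
}

/* Weak alias: resolves to Default_Handler unless another translation
 * unit provides a strong definition of TIM2_IRQHandler. */
void TIM2_IRQHandler(void) __attribute__((weak, alias("Default_Handler")));

/* Hypothetical accessor, used only to check the sketch. */
uint32_t default_handler_hits(void)
{
    return default_hits;
}
```

Note that `used` only speaks to the compiler. Whether the linker then keeps the section under `--gc-sections` still depends on the linker script (for example a `KEEP` entry) or on a reference from a section that is kept.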

      Is this an issue worth reporting on the GNU MCU Eclipse issue tracker, or is it a GCC issue to be reported upstream somewhere?

      Thanks, Brendan.

        • Re: Debugging issues with linker optimization
          brendansimon

          The very latest GNU MCU Eclipse ARM GCC release (2 Feb 2019) has some fixes for LTO.

          GDB no longer shuts down.

          However I still can't debug the program.  The interrupt handlers have still been optimized out (e.g. Reset_Handler does not exist in the linker map).

          • Re: Debugging issues with linker optimization
            ilg

            > all the interrupt handlers get optimized out

             

            that's normal, you need a carefully constructed linker script to 'KEEP' the array of vectors even if it is not referenced. not a bug.

             

            the blinky project tutorial is functional, even with LTO in both debug/release, as long as you are not on windows, where debug is not functional. I don't know where you got your linker scripts and your startup vectors, but I suggest you take a look at the functional code in the blinky project and adjust your files.

             

            and, for the moment, LTO is still 'experimental', when it works it is great, but sometimes the linker crashes.

              • Re: Debugging issues with linker optimization
                brendansimon

                My linker scripts are exactly the same as blinky, because I stole them, along with 99% of the startup code.  I did modify _startup to use initialise functions from a ported IAR project, which uses STM32 STD PERIPHERAL libraries (i.e. not the newer STM32 HAL libraries).  And I excluded all other files that were not needed for a successful build (_initialize_hardware.c, _write.c, stm32f4xx_hal_msp.c, system_stm32f4xx.c, _reset_hardware.c, _cxx.cpp, _sbrk.c, _syscalls.c, assert.c).

                 

                It all works fine with LTO optimization disabled, i.e. interrupts work and I can debug it.  I'm using macOS, and I'm sure I did some quick debug tests on Windows 10 too and they worked ok (from memory).

                 

                I tried rebuilding the blinky example with LTO enabled, and I found that the interrupt handlers also disappeared from the map file (1004 lines with LTO enabled, 5418 lines with LTO disabled).

                 

                Toolchain is /Users/brendan/Library/xPacks/@gnu-mcu-eclipse/arm-none-eabi-gcc/8.2.1-1.2.1/.content/bin

                 

                Maybe using the GCC 7 release (v7.3.1-1.1.1) is more stable and recommended for production code?  I tried 7.3.1 too and it does the same thing.

                 

                From your description of LTO, it sounds like most/all serious development would have LTO disabled.  I presume that still has enough optimization to weed out unused functions, etc, to reduce code size?  If that is the case, then there is no pressing need to use LTO.

                 

                Thanks,

                Brendan.

                  • Re: Debugging issues with linker optimization
                    ilg

                    > most/all serious development would have LTO disabled

                     

                    it happens that these days I did a lot of work to make it work, and for now I think I have a toolchain version that works for all my tests.

                     

                    so, I'm getting somehow more optimistic about LTO ;-)

                     

                    for authoritative information, here is an article:

                     

                    Honza Hubička's Blog: GCC 8: link time and interprocedural optimization

                     

                    as for your case, please do the following: instantiate the STM32F4 template with the -flto enabled. then build the Debug configuration and run the result on QEMU, exactly as explained in the blinky tutorial. or at least enable the listing and check if the vectors are there. repeat for the Release configuration. in my tests everything is fine, the application blinks, both with gcc 7 and gcc 8.

                     

                    once you have both Debug and Release configurations functional, check the differences to your application.

                     

                    right now I'm building a new toolchain release (8.2.1-1.4) that is expected to work with -flto and -g/-g3 even on Windows. Linux and macOS were already functional in 8.2.1-1.3.

                      • Re: Debugging issues with linker optimization
                        brendansimon

                        The project fails to build with LTO enabled (undefined symbols `_isatty` and `_fstat`).  The build succeeds with LTO disabled.

                         

                        Here is the failed linker output (LTO enabled):

                         

                        Building file: ../src/stm32f4xx_hal_msp.c

                        Invoking: GNU ARM Cross C Compiler

                        arm-none-eabi-gcc -mcpu=cortex-m4 -mthumb -mfloat-abi=soft -Og -fmessage-length=0 -fsigned-char -ffunction-sections -fdata-sections -flto -fno-move-loop-invariants -Wunused -Wuninitialized -Wall -Wextra -Wmissing-declarations -Wconversion -Wpointer-arith -Wpadded -Wshadow -Wlogical-op -Waggregate-return -Wfloat-equal  -g3 -DDEBUG -DUSE_FULL_ASSERT -DOS_USE_SEMIHOSTING -DTRACE -DOS_USE_TRACE_SEMIHOSTING_DEBUG -DSTM32F407xx -DUSE_HAL_DRIVER -DHSE_VALUE=8000000 -I"../include" -I"../system/include" -I"../system/include/cmsis" -I"../system/include/stm32f4-hal" -std=gnu11 -Wmissing-prototypes -Wstrict-prototypes -Wbad-function-cast -Wno-missing-prototypes -Wno-missing-declarations -MMD -MP -MF"src/stm32f4xx_hal_msp.d" -MT"src/stm32f4xx_hal_msp.d" -c -o "src/stm32f4xx_hal_msp.o" "../src/stm32f4xx_hal_msp.c"

                        Finished building: ../src/stm32f4xx_hal_msp.c

                         

                        Building target: Blinky_20190204.elf

                        Invoking: GNU ARM Cross C++ Linker

                        arm-none-eabi-g++ -mcpu=cortex-m4 -mthumb -mfloat-abi=soft -Og -fmessage-length=0 -fsigned-char -ffunction-sections -fdata-sections -flto -fno-move-loop-invariants -Wunused -Wuninitialized -Wall -Wextra -Wmissing-declarations -Wconversion -Wpointer-arith -Wpadded -Wshadow -Wlogical-op -Waggregate-return -Wfloat-equal  -g3 -T mem.ld -T libs.ld -T sections.ld -nostartfiles -Xlinker --gc-sections -L"../ldscripts" -Wl,-Map,"Blinky_20190204.map" --specs=nano.specs -o "Blinky_20190204.elf"  ./system/src/stm32f4-hal/stm32f4xx_hal.o ./system/src/stm32f4-hal/stm32f4xx_hal_cortex.o ./system/src/stm32f4-hal/stm32f4xx_hal_dfsdm.o ./system/src/stm32f4-hal/stm32f4xx_hal_flash.o ./system/src/stm32f4-hal/stm32f4xx_hal_gpio.o ./system/src/stm32f4-hal/stm32f4xx_hal_iwdg.o ./system/src/stm32f4-hal/stm32f4xx_hal_pwr.o ./system/src/stm32f4-hal/stm32f4xx_hal_rcc.o  ./system/src/newlib/_cxx.o ./system/src/newlib/_exit.o ./system/src/newlib/_sbrk.o ./system/src/newlib/_startup.o ./system/src/newlib/_syscalls.o ./system/src/newlib/assert.o  ./system/src/diag/Trace.o ./system/src/diag/trace_impl.o  ./system/src/cortexm/_initialize_hardware.o ./system/src/cortexm/_reset_hardware.o ./system/src/cortexm/exception_handlers.o  ./system/src/cmsis/system_stm32f4xx.o ./system/src/cmsis/vectors_stm32f407xx.o  ./src/BlinkLed.o ./src/Timer.o ./src/_initialize_hardware.o ./src/_write.o ./src/main.o ./src/stm32f4xx_hal_msp.o  

                        /Users/brendan/Library/xPacks/@gnu-mcu-eclipse/arm-none-eabi-gcc/8.2.1-1.3.1/.content/bin/../lib/gcc/arm-none-eabi/8.2.1/../../../../arm-none-eabi/bin/ld: /Users/brendan/Library/xPacks/@gnu-mcu-eclipse/arm-none-eabi-gcc/8.2.1-1.3.1/.content/bin/../lib/gcc/arm-none-eabi/8.2.1/../../../../arm-none-eabi/lib/thumb/v7e-m/nofp/libg_nano.a(lib_a-fstatr.o): in function `_fstat_r':

                        fstatr.c:(.text._fstat_r+0xe): undefined reference to `_fstat'

                        /Users/brendan/Library/xPacks/@gnu-mcu-eclipse/arm-none-eabi-gcc/8.2.1-1.3.1/.content/bin/../lib/gcc/arm-none-eabi/8.2.1/../../../../arm-none-eabi/bin/ld: /Users/brendan/Library/xPacks/@gnu-mcu-eclipse/arm-none-eabi-gcc/8.2.1-1.3.1/.content/bin/../lib/gcc/arm-none-eabi/8.2.1/../../../../arm-none-eabi/lib/thumb/v7e-m/nofp/libg_nano.a(lib_a-isattyr.o): in function `_isatty_r':

                        isattyr.c:(.text._isatty_r+0xc): undefined reference to `_isatty'

                        collect2: error: ld returned 1 exit status

                        make: *** [Blinky_20190204.elf] Error 1

                         

                        23:20:54 Build Failed. 3 errors, 2 warnings. (took 9s.592ms)

                         

                        Here is the successful output (LTO disabled):

                         

                        Building file: ../src/stm32f4xx_hal_msp.c

                        Invoking: GNU ARM Cross C Compiler

                        arm-none-eabi-gcc -mcpu=cortex-m4 -mthumb -mfloat-abi=soft -Og -fmessage-length=0 -fsigned-char -ffunction-sections -fdata-sections -fno-move-loop-invariants -Wunused -Wuninitialized -Wall -Wextra -Wmissing-declarations -Wconversion -Wpointer-arith -Wpadded -Wshadow -Wlogical-op -Waggregate-return -Wfloat-equal  -g3 -DDEBUG -DUSE_FULL_ASSERT -DOS_USE_SEMIHOSTING -DTRACE -DOS_USE_TRACE_SEMIHOSTING_DEBUG -DSTM32F407xx -DUSE_HAL_DRIVER -DHSE_VALUE=8000000 -I"../include" -I"../system/include" -I"../system/include/cmsis" -I"../system/include/stm32f4-hal" -std=gnu11 -Wmissing-prototypes -Wstrict-prototypes -Wbad-function-cast -Wno-missing-prototypes -Wno-missing-declarations -MMD -MP -MF"src/stm32f4xx_hal_msp.d" -MT"src/stm32f4xx_hal_msp.d" -c -o "src/stm32f4xx_hal_msp.o" "../src/stm32f4xx_hal_msp.c"

                        Finished building: ../src/stm32f4xx_hal_msp.c

                         

                        Building target: Blinky_20190204.elf

                        Invoking: GNU ARM Cross C++ Linker

                        arm-none-eabi-g++ -mcpu=cortex-m4 -mthumb -mfloat-abi=soft -Og -fmessage-length=0 -fsigned-char -ffunction-sections -fdata-sections -fno-move-loop-invariants -Wunused -Wuninitialized -Wall -Wextra -Wmissing-declarations -Wconversion -Wpointer-arith -Wpadded -Wshadow -Wlogical-op -Waggregate-return -Wfloat-equal  -g3 -T mem.ld -T libs.ld -T sections.ld -nostartfiles -Xlinker --gc-sections -L"../ldscripts" -Wl,-Map,"Blinky_20190204.map" --specs=nano.specs -o "Blinky_20190204.elf"  ./system/src/stm32f4-hal/stm32f4xx_hal.o ./system/src/stm32f4-hal/stm32f4xx_hal_cortex.o ./system/src/stm32f4-hal/stm32f4xx_hal_dfsdm.o ./system/src/stm32f4-hal/stm32f4xx_hal_flash.o ./system/src/stm32f4-hal/stm32f4xx_hal_gpio.o ./system/src/stm32f4-hal/stm32f4xx_hal_iwdg.o ./system/src/stm32f4-hal/stm32f4xx_hal_pwr.o ./system/src/stm32f4-hal/stm32f4xx_hal_rcc.o  ./system/src/newlib/_cxx.o ./system/src/newlib/_exit.o ./system/src/newlib/_sbrk.o ./system/src/newlib/_startup.o ./system/src/newlib/_syscalls.o ./system/src/newlib/assert.o  ./system/src/diag/Trace.o ./system/src/diag/trace_impl.o  ./system/src/cortexm/_initialize_hardware.o ./system/src/cortexm/_reset_hardware.o ./system/src/cortexm/exception_handlers.o  ./system/src/cmsis/system_stm32f4xx.o ./system/src/cmsis/vectors_stm32f407xx.o  ./src/BlinkLed.o ./src/Timer.o ./src/_initialize_hardware.o ./src/_write.o ./src/main.o ./src/stm32f4xx_hal_msp.o  

                        Finished building target: Blinky_20190204.elf

                         

                        Invoking: GNU ARM Cross Create Flash Image

                        arm-none-eabi-objcopy -O ihex "Blinky_20190204.elf"  "Blinky_20190204.hex"

                        Finished building: Blinky_20190204.hex

                         

                        Invoking: GNU ARM Cross Print Size

                        arm-none-eabi-size --format=berkeley "Blinky_20190204.elf"

                           text       data        bss        dec        hex    filename

                          13307        156        784      14247       37a7    Blinky_20190204.elf

                        Finished building: Blinky_20190204.siz

                         

                         

                        23:16:49 Build Finished. 0 errors, 2 warnings. (took 6s.740ms)

                          • Re: Debugging issues with linker optimization
                            ilg

                            oops!

                             

                            this is a new problem, related to newlib...

                             

                            in all my tests I used the Freestanding version (select it during project creation).

                             

                            I planned to address the newlib issues at a later time, but it looks like I have to do it earlier :-(

                             

                            do you really need posix support in your apps?

                              • Re: Debugging issues with linker optimization
                                brendansimon

                                no I don't need posix support.  I was just following the blinky tutorial exactly to test your assertion

                                 

                                So now it builds with both LTO enabled and disabled, however I get the same symptoms.  i.e. smaller map file with no interrupt handlers with LTO enabled, and larger map file and interrupt handlers with LTO disabled.

                                 

                                My other project is free standing too.  Maybe the freestanding (or lack of posix syscalls) causes the above symptoms?

                                  • Re: Debugging issues with linker optimization
                                    ilg

                                    > i.e. smaller map file with no interrupt handlers with LTO enabled

                                     

                                    strange. in my case the vectors are there, and the app is functional:

                                     

                                     

                                    Disassembly of section .isr_vector:
                                    
                                    08000000 <__isr_vectors>:
                                    8000000: 00 00 02 20 ad 03 00 08 d9 01 00 08 c5 01 00 08    ... ............
                                    8000010: c1 01 00 08 ad 01 00 08 99 01 00 08 00 00 00 00    ................
                                      ...
                                    800002c: 95 01 00 08 91 01 00 08 00 00 00 00 8d 01 00 08    ................
                                    800003c: fd 06 00 08 89 01 00 08 89 01 00 08 89 01 00 08    ................
                                    800004c: 89 01 00 08 89 01 00 08 89 01 00 08 89 01 00 08    ................
                                    800005c: 89 01 00 08 89 01 00 08 89 01 00 08 89 01 00 08    ................
                                    800006c: 89 01 00 08 89 01 00 08 89 01 00 08 89 01 00 08    ................
                                    800007c: 89 01 00 08 89 01 00 08 89 01 00 08 89 01 00 08    ................
                                    800008c: 89 01 00 08 89 01 00 08 89 01 00 08 89 01 00 08    ................
                                    800009c: 89 01 00 08 89 01 00 08 89 01 00 08 89 01 00 08    ................
                                    80000ac: 89 01 00 08 89 01 00 08 89 01 00 08 89 01 00 08    ................
                                    80000bc: 89 01 00 08 89 01 00 08 89 01 00 08 89 01 00 08    ................
                                    80000cc: 89 01 00 08 89 01 00 08 89 01 00 08 89 01 00 08    ................
                                    80000dc: 89 01 00 08 89 01 00 08 89 01 00 08 89 01 00 08    ................
                                    80000ec: 89 01 00 08 89 01 00 08 89 01 00 08 89 01 00 08    ................
                                    80000fc: 89 01 00 08 89 01 00 08 89 01 00 08 89 01 00 08    ................
                                    800010c: 89 01 00 08 89 01 00 08 89 01 00 08 89 01 00 08    ................
                                    800011c: 89 01 00 08 89 01 00 08 89 01 00 08 89 01 00 08    ................
                                    800012c: 89 01 00 08 89 01 00 08 89 01 00 08 89 01 00 08    ................
                                    800013c: 89 01 00 08 89 01 00 08 89 01 00 08 89 01 00 08    ................
                                    800014c: 89 01 00 08 89 01 00 08 89 01 00 08 89 01 00 08    ................
                                    800015c: 89 01 00 08 89 01 00 08 89 01 00 08 89 01 00 08    ................
                                    800016c: 89 01 00 08 89 01 00 08 89 01 00 08 89 01 00 08    ................
                                    800017c: 00 00 00 00 89 01 00 08 89 01 00 08                ............
                                    
                                    08000188 <ADC_IRQHandler>:
                                    Default_Handler():
                                    /Users/ilg/Desktop/eclipse-workspace-2018-12/f4b-fs/Debug/../system/src/cmsis/vectors_stm32f407xx.c:349
                                    
                                    void __attribute__ ((section(".after_vectors")))
                                    Default_Handler(void)
                                    {
                                    #if defined(DEBUG)
                                    __DEBUG_BKPT();
                                    8000188: be00      bkpt 0x0000
                                    /Users/ilg/Desktop/eclipse-workspace-2018-12/f4b-fs/Debug/../system/src/cmsis/vectors_stm32f407xx.c:353
                                    #endif
                                    while (1)
                                      {
                                      }
                                    800018a: e7fe      b.n 800018a <ADC_IRQHandler+0x2>
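
The raw bytes in the `.isr_vector` dump above are little-endian 32-bit words. On Cortex-M, word 0 is the initial stack pointer and word 1 is the reset vector, and handler entries have bit 0 set to mark Thumb code. A minimal sketch of decoding them by hand follows; `le32` and `handler_address` are hypothetical helper names, not part of any toolchain.

```c
#include <stdint.h>

/* Assemble a 32-bit word from four bytes stored least significant first. */
uint32_t le32(const uint8_t b[4])
{
    return (uint32_t)b[0]
         | (uint32_t)b[1] << 8
         | (uint32_t)b[2] << 16
         | (uint32_t)b[3] << 24;
}

/* Vector entries carry the Thumb bit in bit 0; clear it to get the
 * address at which the handler appears in the listing. */
uint32_t handler_address(uint32_t vector)
{
    return vector & ~UINT32_C(1);
}
```

Applied to the dump, `00 00 02 20` decodes to 0x20020000 (initial stack pointer), `ad 03 00 08` to 0x080003ad (reset vector), and the repeated `89 01 00 08` to 0x08000189, which after clearing the Thumb bit is 0x08000188, the address shown for `ADC_IRQHandler` / `Default_Handler`.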

                                     

                                     

                                    and the build looks like:

                                     

                                    14:52:30 **** Incremental Build of configuration Debug for project f4b-fs ****
                                    make all 
                                    Building target: f4b-fs.elf
                                    Invoking: GNU ARM Cross C++ Linker
                                    arm-none-eabi-g++ -mcpu=cortex-m4 -mthumb -mfloat-abi=soft -Og -fmessage-length=0 -fsigned-char -ffunction-sections -fdata-sections -ffreestanding -flto -fno-move-loop-invariants -Wall -Wextra  -g3 -T mem.ld -T libs.ld -T sections.ld -nostartfiles -Xlinker --gc-sections -L"../ldscripts" -Wl,-Map,"f4b-fs.map" --specs=nano.specs -o "f4b-fs.elf"  ./system/src/stm32f4-hal/stm32f4xx_hal.o ./system/src/stm32f4-hal/stm32f4xx_hal_cortex.o ./system/src/stm32f4-hal/stm32f4xx_hal_dfsdm.o ./system/src/stm32f4-hal/stm32f4xx_hal_flash.o ./system/src/stm32f4-hal/stm32f4xx_hal_gpio.o ./system/src/stm32f4-hal/stm32f4xx_hal_iwdg.o ./system/src/stm32f4-hal/stm32f4xx_hal_pwr.o ./system/src/stm32f4-hal/stm32f4xx_hal_rcc.o  ./system/src/newlib/_cxx.o ./system/src/newlib/_exit.o ./system/src/newlib/_sbrk.o ./system/src/newlib/_startup.o ./system/src/newlib/_syscalls.o ./system/src/newlib/assert.o  ./system/src/diag/Trace.o ./system/src/diag/trace_impl.o  ./system/src/cortexm/_initialize_hardware.o ./system/src/cortexm/_reset_hardware.o ./system/src/cortexm/exception_handlers.o  ./system/src/cmsis/system_stm32f4xx.o ./system/src/cmsis/vectors_stm32f407xx.o  ./src/BlinkLed.o ./src/Timer.o ./src/_initialize_hardware.o ./src/_write.o ./src/main.o ./src/stm32f4xx_hal_msp.o   
                                    Finished building target: f4b-fs.elf
                                     
                                    Invoking: GNU ARM Cross Create Flash Image
                                    arm-none-eabi-objcopy -O ihex "f4b-fs.elf"  "f4b-fs.hex"
                                    Finished building: f4b-fs.hex
                                     
                                    Invoking: GNU ARM Cross Create Listing
                                    arm-none-eabi-objdump --source --all-headers --demangle --line-numbers --wide "f4b-fs.elf" > "f4b-fs.lst"
                                    Finished building: f4b-fs.lst
                                     
                                    Invoking: GNU ARM Cross Print Size
                                    arm-none-eabi-size --format=berkeley "f4b-fs.elf"
                                       text   data    bss    dec    hex filename
                                       8716    164    496   9376   24a0 f4b-fs.elf
                                    Finished building: f4b-fs.siz
                                     
                                    
                                    
                                    14:52:31 Build Finished. 0 errors, 0 warnings. (took 1s.118ms)
                                    • Re: Debugging issues with linker optimization
                                      ilg

                                      Brendan,

                                       

                                      After some failures and lengthy retries, I finally managed to install a brand new Win 10 VM, with everything freshly installed.

                                       

                                      Unfortunately I still cannot reproduce your case. Both the Debug & Release configurations of the STM32F4 blinky project (Freestanding, Semihosting DEBUG), with LTO enabled (without -g for now), create correct binaries, with the array of vectors at 0x08000000 and the handlers immediately after.

                                       

                                      Even more, I executed the Debug binary on QEMU and it is functional, it blinks the 4 leds as expected.

                                       

                                      I have no idea what happens in your environment to make the vectors disappear. Can you copy here the content at 0x08000000 (use the listing file)?

                                • Re: Debugging issues with linker optimization
                                  brendansimon

                                  Hi Liviu,

                                   

                                  Just wondering how soon the next gcc release is going to be, so I can have LTO working on Windows, without the linker errors

                                  • invalid string offset 114950144 >= 1780 for section `.strtab'
                                  • ELF section name out of range
                                  • error: lto-wrapper failed

                                   

                                  Thanks, Brendan.

                            • Re: Debugging issues with linker optimization
                              brendansimon

                              SOLVED !!

                               

                              Turns out the vector and exception files were missing some `used` attributes.  Maybe they were never there, but more likely is that I accidentally removed them when I was experimenting/investigating issues and I forgot to put them back (or something along those lines).

                               

                              Without the `used` attribute on the `__isr_vectors` symbol, the vectors were being optimized out and the `_start()` function was being placed at 0x08000000 (where the vectors normally live).

                               

                              vectors_stm32f407xx.c

                              __attribute__ ((section(".isr_vector"),used))     <==== `used` was missing
                              pHandler __isr_vectors[] =

                              exception_handlers.c

                              void __attribute__ ((section(".after_vectors"),weak,used))     <==== `used` was missing
                              HardFault_Handler_C (ExceptionStackFrame* frame __attribute__((unused)),
                                                   uint32_t lr __attribute__((unused)))

                              void __attribute__ ((section(".after_vectors"),weak,used))     <==== `used` was missing
                              BusFault_Handler_C (ExceptionStackFrame* frame __attribute__((unused)),
                                                  uint32_t lr __attribute__((unused)))

                              void __attribute__ ((section(".after_vectors"),weak,used))     <==== `used` was missing
                              UsageFault_Handler_C (ExceptionStackFrame* frame __attribute__((unused)),
                                                    uint32_t lr __attribute__((unused)))
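
The fix above can be sketched as a small, self-contained table. This is an illustration under stated assumptions, not the contents of `vectors_stm32f407xx.c`: every name ending in `_sketch` is hypothetical, the table has only four slots, and slot 0 is left as a null placeholder where a real target would store the initial stack pointer. It assumes an ELF target, where a custom section name such as `.isr_vector` is accepted as written.

```c
#include <stddef.h>

typedef void (*pHandler)(void);

/* Stand-in handlers; real ones live in the startup code. */
void Reset_Handler_sketch(void) {}
void Default_Handler_sketch(void) {}

/* `section` places the table where the linker script expects it;
 * `used` keeps the compiler (including LTO) from discarding it
 * just because no C code refers to it. */
__attribute__((section(".isr_vector"), used))
pHandler const isr_vectors_sketch[] = {
    (pHandler)0,             /* slot 0: initial stack pointer on a real target */
    Reset_Handler_sketch,    /* slot 1: reset vector */
    Default_Handler_sketch,  /* NMI, as an example */
    Default_Handler_sketch,  /* HardFault, as an example */
};

/* Hypothetical helper: number of entries in the sketch table. */
size_t isr_vector_count_sketch(void)
{
    return sizeof isr_vectors_sketch / sizeof isr_vectors_sketch[0];
}
```

Because the handlers are referenced from the table, keeping the table alive is usually enough to keep them too, provided the linker script also retains the `.isr_vector` section.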