How to convert from an armlink scatter file to a GNU ld linker script

For some of my research projects, I use the Intel iMote (ISN100-BA) platform, provided by Intel research labs Cambridge. The default TinyOS platform code uses GNU gcc for compiling, but depends on the linker of the ARM Developer Suite (ADS 1.2 or RealView) for the final step. Replacing it with the GNU linker ld was a non-trivial task, part of which was converting the memory layout from the armlink scatter file to a GNU ld linker script.

In the following, I use the ScatterFile.txt shipped with the iMote TinyOS platform, but the steps should be applicable to other files as well. Please email me when you have successfully converted other files, so that I can add more examples to this page. For brevity, I have removed most comments from the original file:


LoadFlashCode    0x004000  ABSOLUTE 0x7C000
{
    FlashCode2 +0  ;
    {
        *	(FlashBoot)   ; area is found in SU_FlashBoot.s
    }
    FlashCode3 +0
    {
        *         (+RO)
    }

    RomLinkV1 0x200000 OVERLAY
    {
        SU_ROMLinkV1.o        (+RO)
    }
    RomLinkV2 0x200000 OVERLAY
    {
        SU_ROMLinkV2.o        (+RO)
    }

    IRamStack +0 ABSOLUTE
    {
        * (Stacks)
    }

    IRamCode  +0 ABSOLUTE
    {
	TM_FlashDrvAmend.o    (+RO)
        TM_IRamCode.o         (+RO)
    }

    IRamNI +0
    {
      *	(ResetFlags)
      SU_PowerUpCheckVar.o (+ZI)
    }

    IRamRW +0
    {
        * ( +RW )
    }

    IRamZI +0
    {
        * ( +ZI )
    }

    BRamNI 0x20CB00 ABSOLUTE
    {
    }

    BRamZI 0x20F900 ABSOLUTE 0x700
    {
        BBD_BufferSramVar.o   ( +ZI)
        LM_BufferSramVar.o    ( +ZI)
        BP_BufferSramVar.o    ( +ZI)
    }

    XRamZI 0x400000 ABSOLUTE
    {
      BP_DummyXramVar.o ( +ZI)
    }
}

 

There are numerous examples for GNU ld linker scripts on the web, also for ARM7TDMI (which is used on the iMote). These can be used for the general structure. But to generate a working file, we first need to find out the segments used by the target platform, that is, start and end addresses for loading and executing code and data. Using the manual for armlink, these can be extracted from the above file (yes, reverse engineering).
The first line

LoadFlashCode    0x004000  ABSOLUTE 0x7C000

(the name is ignored) tells us that the flash memory starts at 0x4000 with a size of 0x7C000 bytes; this is called the "load region" in armlink terms and defines what the generated binary file will look like. In the first block (blocks inside the load region define "exec regions", which may be mapped to different addresses),

FlashCode2 +0  ;
{
    * (FlashBoot)
}

the boot loader code is put at the beginning of the load region. The name is again arbitrary and can be ignored for the address layout, but the +0 is an offset for the block, relative to the end of the preceding region (or, as this is the first region, relative to the load region start address). Since the offset is zero, this tells us that the platform executes this code at the same address at which it is loaded. That is, the load region and the exec region for initialization code start at the same absolute address. This is to be expected for most platforms.
Although the name is not relevant for the linker itself, exec regions automatically generate Load$$Name symbols, which can be referenced in the code. This is typically used by the boot loader to set up the run-time environment before starting the custom code, i.e. for transforming the load segmentation into the exec segmentation. Since GNU ld does not generate these symbols automatically, we will need to do so explicitly, as described below.
There are two components in the lines within these blocks: a module selector, which matches input file names (object or library files), and an (optional) input section selector in parentheses, which matches the section that the compiler created. Both components need to match for the line to include anything in the output. The line in the block above matches any input file (i.e. don't care) that defines a section named FlashBoot (which is, in this case, contained in the file SU_FlashBoot.o). Then the next block

FlashCode3 +0
{
    * (+RO)
}

simply appends all the other executable code and constant values (which are marked read-only, i.e. as an RO section, by the compiler). Again, the +0 is an offset and tells us that our own executable code is mapped to the same addresses for loading and for execution. Now it gets slightly more complicated. The next two exec regions

RomLinkV1 0x200000 OVERLAY
{
    SU_ROMLinkV1.o        (+RO)
}
RomLinkV2 0x200000 OVERLAY
{
    SU_ROMLinkV2.o        (+RO)
}

use the same absolute address during execution, namely 0x200000, and the OVERLAY keyword. This means that two different code blocks (code because the section selectors match RO) can be available in the area starting at address 0x200000 and that the executed code somehow needs to select which one is used (in the iMote TinyOS case, this is done by the boot loader). For linking, this means that two code blocks that are separate in the load region (i.e. the binary file to be flashed) need to use the same start address when referenced for execution. The modules referenced by the module selectors, i.e. SU_ROMLinkV1.o and SU_ROMLinkV2.o (contained in the motelib.a file), are appended to the code area in the load region.
Although this is not made explicit, the new base address tells us that a new segment starts here. We can guess (taking the region names into account) that the segment starting at address 0x4000 is flash memory, both at load time and at execution time, and that the segment starting at address 0x200000 is RAM. The data sheet of the iMote tells us that the RAM has a size of 64 kB, or 0x10000 bytes, so we can use this to verify that the RAM area is sufficient for the executable.
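On the GNU ld side, the closest counterpart is the OVERLAY command inside SECTIONS. The following is only a sketch and not taken from the platform files: the output section names .romlinkv1 and .romlinkv2 are made up, the input section names .text and .rodata are assumed to be what +RO corresponds to in these objects, the file patterns are assumed to match the members of motelib.a, and the load address assumes a symbol __end_of_text__ that marks the end of the flash-resident code, as in the linker script developed further down.

```ld
/* Fragment for use inside SECTIONS { ... }, after the flash-resident code. */
OVERLAY 0x200000 : AT (__end_of_text__)
{
  /* Both sections get the same execution address, 0x200000 ... */
  .romlinkv1 { *SU_ROMLinkV1.o (.text .rodata) }
  .romlinkv2 { *SU_ROMLinkV2.o (.text .rodata) }
  /* ... but consecutive load addresses, starting at __end_of_text__. */
} > RAM
```

For each section in an OVERLAY, ld provides the symbols __load_start_NAME and __load_stop_NAME (here __load_start_romlinkv1 and so on), which the code selecting the overlay could use to copy the chosen block to 0x200000.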
The following regions therefore define the RAM layout during execution. A small part at the start is already consumed by the overlaid regions, followed by

IRamStack +0 ABSOLUTE
{
    * (Stacks)
}

which defines our stack region. The +0 ABSOLUTE address modifier tells us that this exec region follows the preceding one, but that it does not inherit its OVERLAY flag. As with the first exec region, this matches a specific section name (defined in motelib.a). Then two more code modules (only the code and constant data parts, marked RO, of two specific modules, again contained in motelib.a)

IRamCode  +0 ABSOLUTE
{
    TM_FlashDrvAmend.o    (+RO) ;//Ammendments to Flash driver code
    TM_IRamCode.o         (+RO)
}

are appended. The boot loader needs to copy them from their load address to their execution address; the two now differ, since the modules are appended to the generated binary that is loaded into the flash, but are executed from RAM addresses. The next exec region
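In ld terms, this split between load and execution address can be expressed by giving an output section a RAM region for execution and a flash region for storage. The fragment below is a hedged sketch, not the original platform script: the section name .iramcode and the three symbols are hypothetical, the memory regions RAM and FLASH are those of the MEMORY block defined further down, and .text/.rodata are assumed to be the input section names behind +RO.

```ld
/* Fragment for use inside SECTIONS { ... }; names are illustrative. */
.iramcode : {
  __iramcode_start__ = .;                 /* execution (RAM) start address */
  *TM_FlashDrvAmend.o (.text .rodata)
  *TM_IRamCode.o      (.text .rodata)
  __iramcode_end__ = .;                   /* execution (RAM) end address */
} > RAM AT > FLASH                        /* runs in RAM, is stored in flash */
__iramcode_load__ = LOADADDR(.iramcode);  /* flash address to copy from */
```

A boot loader could then copy __iramcode_end__ - __iramcode_start__ bytes from __iramcode_load__ to __iramcode_start__. Symbols of this kind take the role of the ones that armlink generates on its own.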

IRamNI +0
{
    * (ResetFlags)
    SU_PowerUpCheckVar.o (+ZI)
}

defines another part identified by section name and room for some variables that should be initialized with zero (the ZI attribute, which indicates that this is "BSS" data, i.e. uninitialized static variables). Finally, the next two exec regions

IRamRW +0
{
    * ( +RW )
}
IRamZI +0
{
    * ( +ZI )
}

define the RAM areas for all other initialized (RW) and uninitialized (ZI) variables that have not been added by previous regions. These are the normal "DATA" and "BSS" segments. Then follows a seemingly nonsensical, empty exec region
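The difference between RW and ZI carries over to ld directly: initialized data needs a copy of its start values in the flash image, while zero-initialized data only needs reserved addresses. A minimal sketch, assuming GCC-style input section names (.data, .bss, COMMON) and the RAM and FLASH regions of the MEMORY block defined further down; all symbol names are hypothetical:

```ld
/* Fragment for use inside SECTIONS { ... }; symbol names are illustrative. */
.data : {
  __data_start__ = .;
  *(.data .data.*)
  __data_end__ = .;
} > RAM AT > FLASH                 /* start values are stored in flash */
__data_load__ = LOADADDR(.data);

.bss (NOLOAD) : {
  __bss_start__ = .;
  *(.bss .bss.* COMMON)
  __bss_end__ = .;
} > RAM                            /* takes no space in the binary file */
```

The start-up code would copy the .data start values from __data_load__ and clear the range between __bss_start__ and __bss_end__ before calling the main program.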

BRamNI 0x20CB00 ABSOLUTE
{
}

that only defines the matching symbols at address 0x20CB00 but does not reserve any further space (in neither the load nor the exec areas). At the very end of the available RAM block, the exec region

BRamZI 0x20F900 ABSOLUTE 0x700
{
    BBD_BufferSramVar.o   ( +ZI)
    LM_BufferSramVar.o    ( +ZI)
    BP_BufferSramVar.o    ( +ZI)
}

reserves space for uninitialized variables (ZI) defined in three modules within the last 0x700 bytes, starting at address 0x20F900. Taking module names into account, we can guess that this is a small buffered SRAM area, possibly used for performance reasons (a detailed data sheet would make this guessing unnecessary...). Note that this again does not consume any space in the load region, because this exec region only defines uninitialized variables. Strictly speaking, the earlier description of "all other variables that have not been added by previous regions" is not completely correct: armlink does not assign input sections in the order in which the regions appear, but gives each input section to the line with the most specific matching selector.
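The BRamZI window can be cross-checked with simple arithmetic: 0x20F900 + 0x700 = 0x210000, which is exactly the end of the 64 kB RAM starting at 0x200000. In an ld linker script, such a fixed, size-limited region could be sketched as follows. The output section name .bramzi is made up, the input section names .bss and COMMON are assumptions about what +ZI corresponds to in these objects, and RAM refers to the memory region defined further down.

```ld
/* Fragment for use inside SECTIONS { ... }; a sketch, names are illustrative. */
.bramzi 0x20F900 (NOLOAD) : {
  *BBD_BufferSramVar.o (.bss COMMON)
  *LM_BufferSramVar.o  (.bss COMMON)
  *BP_BufferSramVar.o  (.bss COMMON)
} > RAM

/* armlink enforces the 0x700 byte limit itself; with ld we check it by hand. */
ASSERT(SIZEOF(.bramzi) <= 0x700, "BRamZI does not fit into its 0x700 byte window")
ASSERT(0x20F900 + 0x700 == ORIGIN(RAM) + LENGTH(RAM), "BRamZI window does not end at the end of RAM")
```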
Finally, the last exec region

XRamZI 0x400000 ABSOLUTE
{
    BP_DummyXramVar.o ( +ZI)
}

defines yet another segment, starting at address 0x400000. It also doesn't append anything to the load region, but reserves space for uninitialized variables (ZI) of a specific module.
Summarizing this analysis, the scatter file defines three regions:

  • Start address 0x4000 with a length of 0x7C000: This is mapped in both the load and the execution areas and defines the flash memory. Some code (marked RO) is executed in-place, and other code and constants (marked RO) and start values for initialized variables (marked RW) are stored in the load area (in the binary file and loaded into the flash memory) and need to be copied to the other, writable segments by the boot loader before executing the main program.
  • Start address 0x200000 with a length of 0x10000: This is the normal RAM for the execution area, and the linker reserves space for defined variables here.
  • Start address 0x400000 with unknown length: Contains only some uninitialized variables. This means that the linker only needs to resolve some symbols to lie within this segment, but does not need to put anything in there.

Now we need to write an ld linker script that creates the same layout. A complete reference for ld linker scripts can be found in the GNU ld manual. The initial definition of the three regions is simple:

MEMORY
{
  FLASH (rx)  : ORIGIN = 0x004000, LENGTH = 0x7C000
  RAM   (rwx) : ORIGIN = 0x200000, LENGTH = 0x10000
  RAM2  (rwx) : ORIGIN = 0x400000, LENGTH = 0x10000
}

As we don't really know the length of the second RAM region but the LENGTH attribute is mandatory, we need to guess something. As it is just used to check for overflowing regions, this should not lead to any problems. Note that the addresses specified in the MEMORY layout refer to the executable area, i.e. exec regions in armlink terms. The main part of an ld linker script is its SECTIONS block, which we start with the code area that is mapped to the same address in the load and execution areas, i.e. the flash:

SECTIONS
{
  .text : {
    * (FlashBoot)
    __end_of_boot__ = ABSOLUTE(.);

    * (.text)
    * (.rodata)
    * (.rodata*)
    * (.glue_7)
    * (.glue_7t)
  } > FLASH
  __end_of_text__ = .;

This first output section is named .text, which is the standard name for the code section in ELF format files. It first includes the section named FlashBoot from any of the input files and then all remaining .text, .rodata, and special glue code sections from any of the input files. The addresses are set automatically, because the whole output section is put into the defined FLASH memory region. Additionally, we define the symbol __end_of_boot__ to contain the end address of the boot loader code, for later use.
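One caveat when continuing from here: unlike armlink, ld places each input section with the first statement in the script that matches it. The catch-all lines above would therefore also pick up the code of the modules that are meant to run from RAM, provided their code sections are indeed named .text. A possible way around this, sketched here for the .text line only and assuming that the file patterns also match the members of motelib.a, is EXCLUDE_FILE:

```ld
/* Possible variant of the "* (.text)" line in the .text output section. */
* (EXCLUDE_FILE (*TM_FlashDrvAmend.o *TM_IRamCode.o
                 *SU_ROMLinkV1.o *SU_ROMLinkV2.o) .text)
```

The .rodata lines would need the same treatment, since an EXCLUDE_FILE list written this way only applies to the section name that directly follows it.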

This page was last modified on 2010-05-03