Raspberry Pi Pico (RP2040) SRAM and Flash Programming

Program the Pico (RP2040) from the perspective of an 8-bit MCU developer, focusing on the details of SRAM and flash memory programming.

--by Captdam @ Feb 14, 2026 Aug 27, 2025


I have been working with 8-bit MCUs like the 8051 and AVR for a long time, but I am new to 32-bit MCUs like the RP2040 on the Raspberry Pi Pico board. In this blog, I will record my first trial on the Pico from the perspective of an 8-bit MCU developer. I will focus on programming the on-chip run-time memory and the off-chip flash memory.

I will be writing bare-metal Assembly and C language applications that directly write and read the MCU registers instead of using libraries. This exposes me to the actual system control registers, allowing me to understand the full functionality of the MCU.


This article is intended for developers who are familiar with 8-bit MCUs and use Assembly and C language to develop bare metal applications, but new to 32-bit RP2040 and ARM Cortex-M0+.

Since we are creating bare metal applications, we will be directly writing to and reading from the MCU control registers. No library is used.

We will rely heavily on the RP2040 document. It includes all the information we need about the MCU control registers.

Because the RP2040 document gets updated, and for some reason links to the old document are redirected to the new one, I decided to keep a copy of the current version (2025-02-20) on my server. You may also obtain the document from the official link here (as of 2026-02-10).

RP2040 Document Colophon

Difference Between 8-bit MCU and 32-bit MCU

Other than the difference in data width, the memory architecture also differs between them.

8-bit MCUs in Harvard Architecture

Harvard Architecture
Harvard Architecture Found on ATtiny24

8-bit MCUs like the 8051 and AVR use a Harvard architecture. The instruction (program) bus is connected to the program memory, and the data bus is connected to the data memory. The two do not cross.

The program is saved in program memory, which in most cases is the on-chip flash. When we download the binary to the MCU, we burn the program to the on-chip flash. When the MCU starts, the CPU execution unit reads the on-chip flash directly.

Other than the flash, there is the data memory. It can be an on-chip SRAM, or an external memory chip, or both. This memory is for data only, the CPU cannot execute from it.

32-bit MCUs in Von Neumann Architecture

Von Neumann Architecture
Von Neumann Architecture Found on RP2040 ©RP2040

32-bit MCUs like the RP2040 use a Von Neumann architecture. The instruction bus and the data bus are connected to the same memory. There is only one universal memory interface; on the RP2040, that's the AHB-Lite crossbar.

More specifically, there is one memory space. From the CPU's perspective, it only sees one memory space. Regions in this universal memory space can be specialized: some allow both write and read, some are read-only; some can be used for data only, some for program only, some for both; and some are mapped to MCU control registers. The memory bus maps the different physical memories into one memory space.
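To make the idea of one memory space more concrete, here is a minimal Python sketch (it runs on the PC, not on the Pico) that maps an address to the coarse region it falls in. The base addresses follow the address map in the RP2040 datasheet as I read it; the region names and the helper region_of are my own and purely illustrative. The sketch ignores unmapped gaps between regions.

```python
# Coarse model of the RP2040 address map as seen by the CPU.
# Base addresses: my reading of the RP2040 datasheet's address map.
# The names and this helper are illustrative, not an official API.
REGIONS = [
    (0x00000000, "ROM (on-chip bootloader)"),
    (0x10000000, "XIP (external flash)"),
    (0x20000000, "SRAM"),
    (0x40000000, "APB peripherals (control registers)"),
    (0x50000000, "AHB-Lite peripherals"),
    (0xD0000000, "SIO (single-cycle IO)"),
    (0xE0000000, "Cortex-M0+ internal peripherals"),
]


def region_of(address):
    """Return the name of the region with the highest base at or below address."""
    name = None
    for base, region_name in REGIONS:
        if address >= base:
            name = region_name
    return name


# Addresses that appear in this article's blink program
for address in (0x20041000, 0x4000F000, 0x400140CC, 0xD0000020):
    print(hex(address), "->", region_of(address))
```

The point is only that program code, data, and control registers all live behind plain addresses; which physical block answers depends on the address range.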

The CPU is connected to the main memory, not to a flash that stores the program in a non-volatile way. To execute a program, a bootloader must load it into an executable region of the main memory before the CPU can execute it.

RP2040 supports execute in place (XIP). It appears as if the CPU can execute directly from off-chip flash (or any device connected to the QSPI port); however, the XIP hardware first caches the program in on-chip memory. Therefore, the CPU is actually executing the program from on-chip memory.

HelloWorld.asm

We will write a very simple program that blinks the on-board LED on GPIO25.

Project file for this example can be found here.

ASM Program

Following is our program source code in assembly language in file main.s:

Architecture

RP2040 Architecture
RP2040 Architecture ©RP2040

.cpu cortex-m0plus
.thumb
.align 2
.thumb_func
			

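At the beginning, we tell the assembler that we are using the Cortex-M0+ CPU (the CPU used by the RP2040).

This CPU supports the Thumb instruction set only, in which most instructions are 2 bytes wide (a few, such as bl, are 4 bytes). Therefore, we tell the assembler to use the Thumb instruction set and to align the code. Note that the GNU assembler for ARM treats the argument of .align as a power of two, so .align 2 requests 4-byte alignment, which also satisfies the 2-byte alignment that Thumb code needs.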

Entry Point

RP2040 SRAM Address
RP2040 SRAM Address ©RP2040

.global reset
reset:
	ldr	r0, =0x20041000		@ Stack Pointer @ SRAM bank 4
	mov	sp, r0
			

Our entry point is routine reset. We will make the label reset global. In that way, the reset routine will be visible in the linking stage.

We want to use SRAM bank 4 (address 0x20040000 - 0x20040FFF, 4KiB in size) as our stack. The ARM Cortex-M0+ stack pointer (SP) points to the last stacked item, and the stack grows downwards; therefore, the initial SP must be the address right after the end of the region, that is, 0x20041000. We load this address into SP.
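The stack address can be double-checked with a little arithmetic. The following Python sketch assumes the SRAM layout given in the datasheet (four 64KiB striped banks followed by two 4KiB banks); the constant and function names are mine, for illustration only.

```python
# SRAM layout assumed from the RP2040 datasheet:
# banks 0-3 are 64 KiB each (striped), banks 4 and 5 are 4 KiB each.
SRAM_BASE = 0x20000000
STRIPED_SIZE = 4 * 64 * 1024
SMALL_BANK_SIZE = 4 * 1024


def small_bank_range(bank):
    """Return (first_address, one_past_last_address) of SRAM bank 4 or 5."""
    if bank not in (4, 5):
        raise ValueError("only banks 4 and 5 are modelled here")
    start = SRAM_BASE + STRIPED_SIZE + (bank - 4) * SMALL_BANK_SIZE
    return start, start + SMALL_BANK_SIZE


start, end = small_bank_range(4)
# The stack is full-descending: SP starts one past the last byte of the region.
initial_sp = end
print(hex(start), hex(end - 1), hex(initial_sp))
# prints: 0x20040000 0x20040fff 0x20041000
```

The first push decrements SP before storing, so the first stacked word lands at 0x20040FFC, inside bank 4.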

Enable Output

RP2040 RESETS_RESET
Reset control. If a bit is set it means the peripheral is in reset. 0 means the peripheral's reset is deasserted. @ 0x4000C000 ©RP2040

	ldr	r3, =0x4000f000		@RESETS_BASE + RESET  + 0x3000
	mov	r2, #32			@ 1 << 5 (IO_BANK0)
	str	r2, [r3, #0]
			

After power-up, the user IOs (IO Bank 0, GPIO 0 - 29) are held in reset; hence, we need to bring them out of reset before use. To do so, we need to clear bit 5 (IO_BANK0) of the RESET register (address 0x4000C000).

Furthermore, adding an offset of 0x3000 to a register address gives an alias where writing 1 to a bit atomically clears that bit. Therefore, we write (1<<5) to address 0x4000F000.

RP2040 Atomic Register Access
Atomic Register Access ©RP2040
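As a rough model of the atomic aliases, the Python sketch below computes the alias address and simulates what a write through each alias does to a 32-bit register. The offsets are the ones described in the datasheet's atomic register access section; the function names and the starting register value are hypothetical and only serve the example.

```python
# Alias offsets for atomic register access on RP2040 peripherals
ALIAS_OFFSETS = {"rw": 0x0000, "xor": 0x1000, "set": 0x2000, "clr": 0x3000}


def alias_address(register_address, operation):
    """Address to write to in order to perform the given atomic operation."""
    return register_address + ALIAS_OFFSETS[operation]


def apply_write(current_value, operation, written_value):
    """Model the effect of one write through an alias on a 32-bit register."""
    if operation == "rw":
        result = written_value
    elif operation == "xor":
        result = current_value ^ written_value
    elif operation == "set":
        result = current_value | written_value
    elif operation == "clr":
        result = current_value & ~written_value
    else:
        raise ValueError("unknown operation: " + operation)
    return result & 0xFFFFFFFF


RESETS_RESET = 0x4000C000
IO_BANK0_BIT = 1 << 5

print(hex(alias_address(RESETS_RESET, "clr")))  # 0x4000f000
# Hypothetical register content, just to show that only bit 5 changes
print(hex(apply_write(0x01FFFFFF, "clr", IO_BANK0_BIT)))  # 0x1ffffdf
```

The benefit over a read-modify-write sequence is that the other bits are never rewritten, so nothing else touching the same register in between can be disturbed.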

IO Mode

RP2040 IO_BANK0_GPIO25_CTRL
GPIO 25 control including function select and overrides. @ 0x400140CC ©RP2040
RP2040 SIO_GPIO_OE
GPIO output enable @ 0xD0000020 ©RP2040

	ldr	r3, =0x400140cc		@ IO_BANK0_BASE + GPIO25_CTRL
	mov	r2, #5			@ Function 5 (SIO)
	str	r2, [r3, #0]

	ldr	r3, =0xd0000020		@ SIO_BASE + GPIO_OE
	ldr	r2, =(1<<25)
	str	r2, [r3, #0]
			

To set the function of GPIO25, we write the function number to address 0x400140CC (GPIO25_CTRL). To use it as a simple IO, write 5 (for SIO).

RP2040 Function Select
RP2040 GPIO Function Select ©RP2040

To use GPIO25 as an output, write 1 to bit 25 of the register at address 0xD0000020 (GPIO_OE).
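The two magic numbers above are not arbitrary. As a small Python sketch, assuming the IO_BANK0 layout from the datasheet (each GPIO has a STATUS word followed by a CTRL word, 8 bytes per pin), the addresses and masks can be derived from the pin number. The helper names are mine.

```python
IO_BANK0_BASE = 0x40014000
SIO_BASE = 0xD0000000
SIO_GPIO_OE = SIO_BASE + 0x20
FUNCSEL_SIO = 5


def gpio_ctrl_address(pin):
    """Address of GPIOn_CTRL: 8 bytes per pin, CTRL is the second word."""
    if not 0 <= pin <= 29:
        raise ValueError("IO Bank 0 only has GPIO 0 - 29")
    return IO_BANK0_BASE + 8 * pin + 4


def gpio_mask(pin):
    """Bit mask for one pin in the SIO GPIO registers."""
    return 1 << pin


print(hex(gpio_ctrl_address(25)))  # 0x400140cc
print(hex(SIO_GPIO_OE))            # 0xd0000020
print(hex(gpio_mask(25)))          # 0x2000000
```

Changing the pin number is then a matter of recomputing these values instead of looking each address up again.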

Blink

GPIO output value XOR
GPIO output value XOR @ 0xd000001c ©RP2040

blink:
	ldr	r3, =0xd000001c		@ SIO_BASE + GPIO_XOR
	ldr	r2, =(1<<25)	@ GPIO25
	str	r2, [r3, #0]
	
	ldr	r0, =650000
	bl	delay
	b	blink

delay:
	mov	r4,r0
loop:
	sub	r4,r4,#1
	cmp	r4, #0
	bne	loop
	bx	lr

.align 4
			

The routine blink blinks the LED. To blink the LED on GPIO25, we XOR (toggle) that bit. To toggle it atomically, write 1 to bit 25 of the register at address 0xD000001C (GPIO_OUT_XOR).

The routine delay is used to generate some busy-delay between each XOR operation using the loop under the label loop.

These labels / routines are used only in this file and don't need to be visible in the linking stage; therefore, they don't need to be global.

It is a good idea to restore word alignment at the end of the file. Strictly speaking, Thumb code on the Cortex-M0+ only needs 2-byte alignment, but 32-bit data and other ARM instruction sets need 4 bytes, so let's stay safe. Be aware that the GNU assembler for ARM treats the argument of .align as a power of two: .align 4 requests a 16-byte boundary, and .align 2 is enough for a 4-byte one.
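One detail worth checking against the GNU assembler manual: on ARM targets, the operand of .align is, as far as I know, an exponent (the number of low-order zero bits in the resulting address), not a byte count. The Python sketch below only models that rounding; align_up is a hypothetical helper, and 0x2E is the offset right after the final bx lr of this program (22 instructions, one of which, bl, is 4 bytes long).

```python
def align_up(offset, exponent):
    """Round offset up to the next multiple of 2**exponent."""
    boundary = 1 << exponent
    return (offset + boundary - 1) & ~(boundary - 1)


end_of_code = 0x2E  # offset right after "bx lr" in this program

print(hex(align_up(end_of_code, 2)))  # 0x30 (4-byte boundary)
print(hex(align_up(end_of_code, 4)))  # 0x30 (16-byte boundary, same here by luck)
print(hex(align_up(0x32, 2)))         # 0x34
print(hex(align_up(0x32, 4)))         # 0x40 (where the two really differ)
```

For this particular program both forms pad by one half-word, so the difference does not show up in the binary.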

To assemble this code, use arm-none-eabi-as --warn --fatal-warnings -g main.s -o main.o, result saved in file main.o.

Linker Script

Once assembled, we have to link the program for the target platform. Assembling only translates the source into machine code in a relocatable object file; linking assigns the final memory addresses, which makes the program loadable.

Following is our linker script in file main.ld:


MEMORY {
	SRAM(rwx) : ORIGIN = 0x20000000, LENGTH = 264k
}

ENTRY(reset)

SECTIONS {
	. = ORIGIN(SRAM);
	.text : {
		*(.text)
	} >SRAM
}
	

In this linker script, we tell the linker:

  1. There is a readable, writable, and executable memory called SRAM, starting at address 0x20000000, that is 264KiB long.
  2. Entry point is the routine reset. Start execution from that position.
  3. Put the text segment (program code) in SRAM, starting from the origin (beginning) of SRAM.
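The numbers in the MEMORY line can be sanity-checked on the PC. This Python sketch assumes the usual linker-script convention that the k suffix means 1024 bytes; the helper names are mine, and the 76-byte program size is the one the binary dump later in this article shows.

```python
def parse_length(text):
    """Convert a linker-script style length such as '264k' into bytes."""
    suffixes = {"k": 1024, "m": 1024 * 1024}
    text = text.strip().lower()
    if text[-1] in suffixes:
        return int(text[:-1], 0) * suffixes[text[-1]]
    return int(text, 0)


def region_end(origin, length_text):
    """First address after the memory region."""
    return origin + parse_length(length_text)


def fits(origin, length_text, start, size):
    """True if [start, start + size) lies completely inside the region."""
    return origin <= start and start + size <= region_end(origin, length_text)


SRAM_ORIGIN = 0x20000000
print(parse_length("264k"))                    # 270336
print(hex(region_end(SRAM_ORIGIN, "264k")))    # 0x20042000
print(fits(SRAM_ORIGIN, "264k", SRAM_ORIGIN, 76))  # True
```

The region ends at 0x20042000, so both the program at the origin and the stack in bank 4 (below 0x20041000) are inside the memory the linker script declares.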

To link this program, use arm-none-eabi-ld -nostdlib -nostartfiles -T main.ld main.o -o main.elf, result saved in file main.elf

UF2 File

Now, the program is ready. To download it to the Pico, we need one more step: packing it into the uf2 format for download via USB.

The RP2040 provides a very easy way to download a program. Compared to the traditional method that requires a physical programmer (or a special interface that is not common on modern PCs, such as UART), the RP2040 supports download over USB. When it starts in USB boot mode (see Download to Pico below), the RP2040 presents itself as a mass storage device. This allows the developer to download the program file the same way as copying a file to a USB disk.

To generate the uf2 file, use pico-elf2uf2 main.elf main.uf2, result saved in file main.uf2.

You can download the SDK source code, but you will need to build the tool before you can run it on your computer.

You can find the elf2uf2 program in pico-sdk/elf2uf2 (SDK version 1). See source code in /tools/elf2uf2 in SDK1.5.1 on GitHub.

I copied the elf2uf2 program into my environment /bin to use it directly.

We can open the uf2 file in hex mode to check its content, use od -t x4 main.uf2:


Addr    X0       X4       X8       XC	
0000000 0a324655 9e5d5157 00002000 20000000
0000020 00000100 00000000 00000001 e48bff56
0000040 4685480b 22204b0b 4b0b601a 601a2205
0000060 4a0b4b0a 4b0b601a 601a4a09 f000480a
0000100 e7f8f801 3c011c04 d1fc2c00 46c04770
0000120 20041000 4000f000 400140cc d0000020
0000140 02000000 d000001c 0009eb10 00000000
0000160 00000000 00000000 00000000 00000000
*
0000760 00000000 00000000 00000000 0ab16f30
0001000
	

Cross-referencing the UF2 format, we can understand the content as follows:

UF2 File Content

| Offset, Size | Name | Content | Comment |
|---|---|---|---|
| 0, 8 | Magic number: 0x0A324655, 0x9E5D5157 | 0a324655 9e5d5157 | |
| 8, 4 | Flag | 00002000 | familyID present - when set, the fileSize/familyID holds a value identifying the board family (usually corresponds to an MCU) |
| 12, 4 | Address in flash (memory for RP2040) where the data should be written | 20000000 | Same as the linker script ORIGIN(SRAM), beginning of SECTIONS |
| 16, 4 | Number of bytes used in data (often 256) | 00000100 | 256 bytes long |
| 20, 4 | Sequential block number; starts at 0 | 00000000 | 0, first block |
| 24, 4 | Total number of blocks in file | 00000001 | 1 block only |
| 28, 4 | File size or board family ID or zero | e48bff56 | Family ID for RP2040, see this list |
| 32, 476 | Data, padded with zeros | 4685480b... | The program, verify with arm-none-eabi-objdump -d main.elf |
| 508, 4 | Final magic number: 0x0AB16F30 | 0ab16f30 | |
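The same decoding can be done programmatically. Below is a minimal Python sketch that parses one 512-byte UF2 block using the field layout from the table above. The function names are mine, and build_example_block rebuilds only the header values seen in the dump; its payload is all zeros, not the real program.

```python
import struct

UF2_MAGIC_START0 = 0x0A324655
UF2_MAGIC_START1 = 0x9E5D5157
UF2_MAGIC_END = 0x0AB16F30


def parse_uf2_block(block):
    """Decode the header of one 512-byte UF2 block into a dictionary."""
    if len(block) != 512:
        raise ValueError("a UF2 block is exactly 512 bytes")
    # Eight little-endian 32-bit words at offset 0, final magic at offset 508
    (magic0, magic1, flags, target_address, payload_size,
     block_number, block_count, family_or_size) = struct.unpack_from("<8I", block, 0)
    (magic_end,) = struct.unpack_from("<I", block, 508)
    if (magic0, magic1, magic_end) != (UF2_MAGIC_START0, UF2_MAGIC_START1, UF2_MAGIC_END):
        raise ValueError("bad magic number")
    return {
        "flags": flags,
        "target_address": target_address,
        "payload_size": payload_size,
        "block_number": block_number,
        "block_count": block_count,
        "family_or_size": family_or_size,
        "payload": block[32:32 + payload_size],
    }


def build_example_block():
    """Header values from the dump above, with a placeholder zero payload."""
    header = struct.pack("<8I", UF2_MAGIC_START0, UF2_MAGIC_START1, 0x00002000,
                         0x20000000, 0x100, 0, 1, 0xE48BFF56)
    return header + bytes(476) + struct.pack("<I", UF2_MAGIC_END)


info = parse_uf2_block(build_example_block())
print(hex(info["target_address"]), info["payload_size"], hex(info["family_or_size"]))
# prints: 0x20000000 256 0xe48bff56
```

To inspect a real file, one could read main.uf2 in binary mode and pass each 512-byte slice to parse_uf2_block.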

Download to Pico

When we power the Pico up, the CPU starts executing the on-chip bootloader (often called the first stage bootloader). This bootloader is located in the boot ROM and cannot be changed. It allows the Pico to be connected to a PC via USB as a mass storage device (like a USB disk) for programming.

The RP2040 starts as a USB mass storage device if the BOOTSEL button is pressed (it is connected to the on-board flash CSn pin, RP2040 pin 56), or if there is no valid program in the flash.

We can program the chip by copying the uf2 program file into the mass storage device.

Once the uf2 file is received, the bootloader will try to copy its content to the specified location.

LED on Pico

We can see the LED on the Pico board blinking.

Download to SRAM vs to Flash

Cut the power, then connect the power. The program disappeared. The Pico starts as a mass storage device. Where the h... is my program?

Download to SRAM

In our linker script, we have:


MEMORY {
	SRAM(rwx) : ORIGIN = 0x20000000, LENGTH = 264k
}
	

This causes the uf2 file to use the 0x20000000 as target address. Therefore, the bootloader will copy the content of the uf2 file (which is our program) to address 0x20000000. For RP2040, this address is mapped to SRAM.

The SRAM is volatile. Once the power is lost, its content is lost.

In fact, even if the program were not lost in SRAM, the CPU would still get stuck at the on-chip bootloader stage after a reset. The bootloader either waits for a new uf2 file if the flash content is invalid, or loads the program from the flash if it is valid.

Download to Flash

According to the datasheet, the XIP address starts from 0x10000000. This address maps the external 2MiB on-board flash chip.

RP2040's flash address width allows a size of up to 16MiB (0x01000000). The flash is mirrored 4 times, spanning from 0x10000000 to 0x13FFFFFF, for different cache strategies. 0x10000000 - 0x10FFFFFF represents the most basic one, the “cacheable, allocating - Normal cache operation”. For the on-chip bootloader, writing to 0x10000000 - 0x10FFFFFF means write to the physical flash; writing to 0x11000000 - 0x13FFFFFF is not allowed.
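The mirroring can be expressed as simple address arithmetic. In this Python sketch, the first alias name comes from the text above; the other three names are my reading of the datasheet and should be treated as assumptions, as should the helper names.

```python
XIP_BASE = 0x10000000
FLASH_WINDOW = 0x01000000  # 16 MiB of flash address space per alias

# Alias names in address order (only the first is quoted in this article)
ALIASES = [
    "cacheable, allocating",
    "cacheable, non-allocating",
    "non-cacheable, allocating",
    "non-cacheable, non-allocating",
]


def decode_xip_address(address):
    """Split an XIP address into (alias name, offset inside the flash)."""
    if not XIP_BASE <= address < XIP_BASE + 4 * FLASH_WINDOW:
        raise ValueError("not an XIP address")
    alias_index = (address - XIP_BASE) // FLASH_WINDOW
    flash_offset = (address - XIP_BASE) % FLASH_WINDOW
    return ALIASES[alias_index], flash_offset


def bootloader_accepts(address):
    """Only the first alias is a valid uf2 target for flash programming."""
    return XIP_BASE <= address < XIP_BASE + FLASH_WINDOW


print(decode_xip_address(0x10000000))   # ('cacheable, allocating', 0)
print(decode_xip_address(0x13000100))   # ('non-cacheable, non-allocating', 256)
print(bootloader_accepts(0x11000000))   # False
```

All four aliases reach the same physical flash byte for the same offset; only the cache behaviour on the way differs.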

To download to the XIP (flash) memory, we need to link the program to that location so that the on-chip bootloader copies our program into the physical flash.

Let's modify the linker script to write to 0x10000000, as follows:


MEMORY {
	Flash(rx) : ORIGIN = 0x10000000, LENGTH = 2048k
}

ENTRY(reset)

SECTIONS {
	. = ORIGIN(Flash);
	.text : {
		*(.text)
	} >Flash
}
	

Keep the assembly code source file untouched. Relink, regenerate the uf2 file, use arm-none-eabi-ld -nostdlib -nostartfiles -T main.ld main.o -o main.elf then pico-elf2uf2 main.elf main.uf2.

If we check the content of the newly generated uf2 file, we can find that the content at file offset 12 (Address in flash where the data should be written) changed to 0x10000000. This tells the on-chip bootloader to write the program to 0x10000000, which is mapped to the on-board flash.

Program the chip by copying the uf2 program file into the mass storage device. It failed. The LED is not blinking. The chip stalls at on-chip bootloader stage. Shortly after Pico received the uf2 file, it presents itself as a mass storage device again.

If there is a valid program in the flash, it will be overwritten.

The Checksum

RP2040 Boot Sequence
RP2040 Boot Sequence ©RP2040

From the CPU's perspective, there is no such thing as a programmed or an un-programmed flash. Reading the flash always produces some result, either 1 or 0; there is nothing like "undefined". An un-programmed flash returns random bytes, or all empty bytes (there is always some data).

At power-up, the on-chip bootloader reads the first 256 bytes from the on-board flash. It then uses a checksum, stored in the last 4 bytes of that block, to verify the data.

This explains the failure: we had the program in flash, but without a correct checksum. So, the on-chip bootloader refused to run our program.
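The decision the on-chip bootloader makes can be sketched in a few lines of Python. This is a simplified model, not the boot ROM code: it assumes the CRC-32 parameters discussed later in this article (polynomial 0x04C11DB7, initial value 0xFFFFFFFF, most significant bit first, no final XOR) and a checksum stored little-endian in the last 4 bytes. All function names are mine.

```python
import struct

POLYNOMIAL = 0x04C11DB7
SEED = 0xFFFFFFFF


def boot2_crc(data, seed=SEED):
    """Bitwise CRC-32, MSB first, no input/output reflection, no final XOR."""
    crc = seed
    for byte in data:
        crc ^= byte << 24
        for _ in range(8):
            if crc & 0x80000000:
                crc = ((crc << 1) ^ POLYNOMIAL) & 0xFFFFFFFF
            else:
                crc = (crc << 1) & 0xFFFFFFFF
    return crc


def flash_is_bootable(first_256_bytes):
    """Check the 252 payload bytes against the checksum in the last 4 bytes."""
    if len(first_256_bytes) != 256:
        raise ValueError("expected exactly 256 bytes")
    payload = first_256_bytes[:252]
    (stored,) = struct.unpack("<I", first_256_bytes[252:])
    return boot2_crc(payload) == stored


def boot_decision(bootsel_pressed, first_256_bytes):
    """Simplified boot flow: USB mass storage unless the flash verifies."""
    if bootsel_pressed or not flash_is_bootable(first_256_bytes):
        return "usb mass storage"
    return "run from flash"


# A block of all zero bytes never verifies, so the chip stays in USB mode
print(boot_decision(False, bytes(256)))  # usb mass storage
```

Our first flash attempt corresponds to the failing case: the program bytes were there, but the last 4 bytes of the block did not match the CRC of the first 252.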

Download to Flash

Now, we will work on programming the flash, so the program is preserved after power is lost.

Project file for this example can be found here.

The Source Code

Before we add the checksum, we have to make a slight modification on our assembly source code:


.cpu cortex-m0plus
.thumb
.align 2
.thumb_func

.section .boot2, "ax"
.global reset
reset:
	ldr	r0, =0x20041000		@ Stack Pointer @ SRAM bank 4
	mov	sp, r0

	ldr	r3, =0x4000f000		@RESETS_BASE + RESET  + 0x3000
	mov	r2, #32			@ 1 << 5 (IO_BANK0)
	str	r2, [r3, #0]

	ldr	r3, =0x400140cc		@ IO_BANK0_BASE + GPIO25_CTRL
	mov	r2, #5			@ Function 5 (SIO)
	str	r2, [r3, #0]

	ldr	r3, =0xd0000020		@ SIO_BASE + GPIO_OE
	ldr	r2, =(1<<25)
	str	r2, [r3, #0]

blink:
	ldr	r3, =0xd000001c		@ SIO_BASE + GPIO_XOR
	ldr	r2, =(1<<25)	@ GPIO25
	str	r2, [r3, #0]
	
	ldr	r0, =650000
	bl	delay
	b	blink

delay:
	mov	r4,r0
loop:
	sub	r4,r4,#1
	cmp	r4, #0
	bne	loop
	bx	lr

.align 4
	

We add .section .boot2, "ax". This names the following program code boot2 and marks the section as allocatable and executable.

Save this source code as boot2_src.s.

Assemble it, use arm-none-eabi-as --warn --fatal-warnings -g boot2_src.s -o boot2_src.o, result saved in file boot2_src.o.


We also need some modification on the linker script:


MEMORY {
	FLASH(rx) : ORIGIN = 0x10000000, LENGTH = 2048k
	SRAM(rwx) : ORIGIN = 0x20000000, LENGTH = 264k
}

SECTIONS {
	. = ORIGIN(FLASH);
	.text : {
		KEEP(*(.boot2))
	} >FLASH
}
	

Store the boot2 section at the beginning of flash.

Save this linker script as boot2_src.ld.

Link it, use arm-none-eabi-ld -nostdlib -nostartfiles -T boot2_src.ld boot2_src.o -o boot2_src.elf, result saved in file boot2_src.elf.


Generate a binary file containing the program code, use arm-none-eabi-objcopy -O binary boot2_src.elf boot2_src.bin, result saved in file boot2_src.bin. We can check the content of this binary file using od -t x4 boot2_src.bin:


0000000 4685480b 22204b0b 4b0b601a 601a2205
0000020 4a0b4b0a 4b0b601a 601a4a09 f000480a
0000040 e7f8f801 3c011c04 d1fc2c00 46c04770
0000060 20041000 4000f000 400140cc d0000020
0000100 02000000 d000001c 0009eb10
0000114
	

To be honest, it is necessary neither to name the section boot2 in the source code nor to specify the memories and sections in the linker script. All of that information is discarded in the raw binary file.

However, it is a good idea to properly name and link them, so we don't get confused in the future.

Append the Checksum

The RP2040 document says that the on-chip bootloader loads the first 256 bytes from the flash into SRAM bank 5 and verifies the checksum.

We will add the checksum using the tool that comes with the SDK. Use pico-pad_checksum -s 0xFFFFFFFF boot2_src.bin boot2.s, result saved in boot2.s. We can open this assembly source code file in text mode:


// Padded and checksummed version of: boot2_src.bin

.cpu cortex-m0plus
.thumb

.section .boot2, "ax"

.byte 0x0b, 0x48, 0x85, 0x46, 0x0b, 0x4b, 0x20, 0x22, 0x1a, 0x60, 0x0b, 0x4b, 0x05, 0x22, 0x1a, 0x60
.byte 0x0a, 0x4b, 0x0b, 0x4a, 0x1a, 0x60, 0x0b, 0x4b, 0x09, 0x4a, 0x1a, 0x60, 0x0a, 0x48, 0x00, 0xf0
.byte 0x01, 0xf8, 0xf8, 0xe7, 0x04, 0x1c, 0x01, 0x3c, 0x00, 0x2c, 0xfc, 0xd1, 0x70, 0x47, 0xc0, 0x46
.byte 0x00, 0x10, 0x04, 0x20, 0x00, 0xf0, 0x00, 0x40, 0xcc, 0x40, 0x01, 0x40, 0x20, 0x00, 0x00, 0xd0
.byte 0x00, 0x00, 0x00, 0x02, 0x1c, 0x00, 0x00, 0xd0, 0x10, 0xeb, 0x09, 0x00, 0x00, 0x00, 0x00, 0x00
.byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
.byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
.byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
.byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
.byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
.byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
.byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
.byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
.byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
.byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
.byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xda, 0x37, 0x23, 0x75
	

As we can see, this file contains the program code (see file boot2_src.bin) plus 4 bytes of checksum at the end of the file. The total length is 256 bytes.
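The relation between the od dump of boot2_src.bin and this .byte listing can be reproduced with a short Python sketch. The word values are copied from the dump above and the checksum value from the last four bytes of the listing; the function names are mine, and the checksum is taken as given here rather than computed.

```python
import struct

# 32-bit words as printed by "od -t x4 boot2_src.bin"
OD_WORDS = [
    0x4685480B, 0x22204B0B, 0x4B0B601A, 0x601A2205,
    0x4A0B4B0A, 0x4B0B601A, 0x601A4A09, 0xF000480A,
    0xE7F8F801, 0x3C011C04, 0xD1FC2C00, 0x46C04770,
    0x20041000, 0x4000F000, 0x400140CC, 0xD0000020,
    0x02000000, 0xD000001C, 0x0009EB10,
]
# Last four bytes of the listing (da 37 23 75) read as a little-endian word
CHECKSUM_FROM_LISTING = 0x752337DA


def words_to_bytes(words):
    """od prints little-endian words; turn them back into the byte stream."""
    return b"".join(struct.pack("<I", word) for word in words)


def pad_and_append(program, checksum, total_size=256):
    """Zero-pad the program and append the checksum in the last 4 bytes."""
    room = total_size - 4
    if len(program) > room:
        raise ValueError("program does not fit in %d bytes" % room)
    return program + bytes(room - len(program)) + struct.pack("<I", checksum)


program = words_to_bytes(OD_WORDS)
image = pad_and_append(program, CHECKSUM_FROM_LISTING)
print(len(program), len(image))            # 76 256
print(image[:4].hex(), image[-4:].hex())   # 0b488546 da372375
```

This also explains why the dump shows 4685480b while the listing starts with 0x0b, 0x48, 0x85, 0x46: it is the same data, printed once as little-endian words and once as bytes.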

Note that at the top of this assembly code, the data has been given the section name boot2. This is because in most cases, this code (the first 256 bytes read by the on-chip bootloader) is used as the second stage bootloader, which loads the main program.

You can find the pad_checksum program in pico-sdk/src/rp2_common/boot_stage2 (SDK version 1). See source code in /src/rp2_common/boot_stage2 in SDK1.5.1 on GitHub.

I copied the pad_checksum program into my environment /bin to use it directly.

This tool is a Python script. By default, it uses 256 as pad size, 0 as seed (initial value). We are OK with 256 pad size, but we would like to change the initial value (seed) to 0xFFFFFFFF by using -s 0xFFFFFFFF.

Inside this tool, on line 42, there is ((binascii.crc32(bytes(bitrev(b, 8) for b in idata_padded), args.seed ^ 0xffffffff) ^ 0xffffffff) & 0xffffffff, 32). The Python documentation states: "binascii.crc32: Compute CRC-32, the unsigned 32-bit checksum of data, starting with an initial CRC of value. … The algorithm is consistent with the ZIP file checksum." According to the GZIP File Format Specification, page 12, the polynomial in the ZIP algorithm is 0xedb88320, which is the bit-reversed form of 0x04c11db7, the polynomial required by the RP2040 on-chip bootloader.
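To convince myself that the bit reversals really turn the ZIP-style CRC into the one the boot ROM wants, here is a small Python sketch. The bitrev helper and checksum_like_tool are my own reconstruction modelled on the expression quoted above, not a copy of the tool, and the sketch only covers the seed 0xFFFFFFFF used in this article. The value 0x0376E6E7 is the published check value of the CRC-32/MPEG-2 parameter set (polynomial 0x04C11DB7, initial value 0xFFFFFFFF, no reflection, no final XOR) for the ASCII string 123456789.

```python
import binascii


def bitrev(value, width):
    """Reverse the bit order of value within the given width."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def checksum_like_tool(data, seed=0xFFFFFFFF):
    """Reflected (ZIP-style) CRC turned into an MSB-first CRC by bit reversal.

    Modelled on the expression quoted above; intended for seed 0xFFFFFFFF.
    """
    reversed_bytes = bytes(bitrev(b, 8) for b in data)
    raw = (binascii.crc32(reversed_bytes, seed ^ 0xFFFFFFFF) ^ 0xFFFFFFFF) & 0xFFFFFFFF
    return bitrev(raw, 32)


# The two polynomial constants are the same polynomial in opposite bit order
print(hex(bitrev(0x04C11DB7, 32)))               # 0xedb88320
# Check value of CRC-32/MPEG-2 for the standard test string
print(hex(checksum_like_tool(b"123456789")))     # 0x376e6e7
```

Reversing the bits of every input byte and of the final result is what lets the tool reuse the library routine instead of implementing the MSB-first CRC by hand.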

UF2 File with Checksum

Finally, we can download the program with proper checksum.

Assemble the padded and checksummed assembly code, use arm-none-eabi-as --warn --fatal-warnings -g boot2.s -o boot2.o, result saved in file boot2.o.

Create a new linker script and save it as main.ld:


MEMORY {
	FLASH(rx) : ORIGIN = 0x10000000, LENGTH = 2048k
	SRAM(rwx) : ORIGIN = 0x20000000, LENGTH = 264k
}

SECTIONS {
	. = ORIGIN(FLASH);
	.text : {
		KEEP(*(.boot2))
	} >FLASH
}
	

This places the boot2 section (our program with checksum) at the beginning of the flash (mapped to XIP at address 0x10000000). It asks the on-chip bootloader to burn the program to flash after it receives the uf2 file.

Link it, use arm-none-eabi-ld -nostdlib -nostartfiles -T main.ld boot2.o -o main.elf, result saved in file main.elf.

Generate uf2 file, use pico-elf2uf2 main.elf boot2.uf2, result saved in file boot2.uf2.

Program the chip. Remove the power and reapply the power.

Our program is kept.

The Next Step

As we can see, the on-chip bootloader only loads 256 bytes of data from flash (including 4 bytes of checksum), which leaves room for only 126 Thumb instructions (2 bytes each). This is far too small for a reasonable program.

Therefore, we have to introduce the second stage bootloader (boot2) that fits into the 256-byte space, saved in flash. Its only purpose is to prepare the main program execution.

In my next article, we will discuss the SDK 2nd stage bootloader.