Print string on 68HC11
This article illustrates how a string is laid out in memory and how to print a string on a PC using C and on a microprocessor using assembly. The assembly language of the 68HC11 is used as the example.
68hc11, 6800, microprocessor, ASCII, string, memory structure, printf, C, Assembly
--by Captdam @ Sep 2, 2024
I wrote this guide a few years ago when I was a teaching assistant at university. The course covered the architecture of RISC and CISC microprocessors and assembly programming on the 68HC11. I recently found this article in my Gist, and I think it is worth putting up here on my blog.
One assignment was to print a text (an ASCII string), and lots of students were having difficulty with it. It was hard to explain by email because of all the limitations of email, and it was inefficient to repeat the same text to every student; therefore, I am writing this article to help students understand how to do that assignment and how microprocessors work.
Helpful tips
I want to put these tips at the beginning of this article. Please please please do these!
First, print the reference sheet and have it on hand. The reference sheet is your best friend when programming a microprocessor: it lists all the instructions we can use and how to use them. If you want to dive deeper to see how the hardware works, read the reference sheet. The reference sheet can be found here: https://www.nxp.com/docs/en/reference-manual/M68HC11RM.pdf
Second, use the “Run step” function of the simulator. Do not expect the program to run flawlessly the first time. It’s 99.99% guaranteed that the program will have some bugs on the first run. I would be nervous if my program ran correctly the first time, because that usually means a serious bug is hiding and a disaster is about to happen. Set some breakpoints and run step by step. Unlike computer programming, where the system is highly abstracted, a microprocessor relies on different pieces of hardware working together. Watch how every register changes as you advance the program.
Third, always reset the simulator after modifying code to clear the memory.
String encoding
PC example
Let's start with computer programming. When writing software, we usually have data and executable code. Data holds the information we are interested in, and code manipulates that data. There are two types of data: variables in RAM, which can be modified by the program and must be created at run time, and read-only constants in ROM, which come with the program and cannot be modified (for microprocessors with a Harvard architecture). In this article, we will focus on read-only constants. Consider the following code in C:
const char* const myString = "This is some data.";
printf("%s", myString);
In this program, myString is our data; it contains the text we want to print out. printf is our code; it instructs how to work with the data.
When we compile this program, the compiler will first encode our data "This is some data." in ASCII format (a commonly used encoding method to represent human-readable characters). The encoded data will be stored somewhere in our program, and the compiler remembers where the data is stored.
Let's say our data is stored at address 0x1234 (0x means hexadecimal; I personally prefer 0x over $ because 0x is more commonly used). In more detail, a chunk of memory is used to store the string: the first character 'T' is stored at 0x1234, the second character 'h' at 0x1235, the third character 'i' at 0x1236, and so on. The address of the first character, 0x1234, is bound to the name of the data, myString. For the compiler, when we say myString, it means address 0x1234.
Then, the compiler will compile the code printf("%s", myString);. As we can see, printf has two parameters. When executing this program, the first parameter "%s" tells the computer to handle the data as a string. In more technical terms, that means a null-terminated array of characters, here in ASCII encoding.
The second parameter myString tells the computer where the data is. In our case, myString means address 0x1234. Therefore, the computer will start to fetch data from 0x1234 and print whatever is stored at 0x1234 on screen based on ASCII encoding. After printing whatever is in 0x1234, the computer will print whatever is in 0x1235, and so on.
So, when will the computer stop printing? In other words, how can the computer know it has reached the end of our data string? When we encode our data string, we put a null terminator (a byte of 0) at the end of the string (in C, writing the text in double quotes tells the compiler to append the 0 byte for us). When the computer prints our data and sees a 0, it knows that it has reached the end of the string. Therefore, the printf function terminates.
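To make the hidden 0 byte visible, here is a minimal sketch in C. The names demoString, visible_length, stored_size and byte_after_text are made up for this illustration, and the text is declared as an array (not a pointer as in the example above) so that sizeof can report every stored byte.

```c
#include <stddef.h>
#include <string.h>

/* Same text as in the article. Declared as an array (a choice made for
   this sketch) so that sizeof reports every stored byte. */
static const char demoString[] = "This is some data.";

/* Characters a reader sees: the terminator is not counted. */
size_t visible_length(void) {
    return strlen(demoString);
}

/* Bytes actually stored in the program: the terminator is included. */
size_t stored_size(void) {
    return sizeof demoString;
}

/* The byte stored right after the last visible character. */
char byte_after_text(void) {
    return demoString[visible_length()];
}
```

The stored size is one byte larger than the visible length, and that extra byte is the 0 the compiler appended for us.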
Microprocessor example
myString
fcc 'This is some data.'
fcb 0
On microprocessors, we take the same approach in assembly language as in C. Let's walk through the example above.
First, we have a label called myString. Similar to our C example above, it will be converted to an address by the assembler.
The fcc (form constant character) is a pseudo-instruction. It cannot be assembled into an instruction the microprocessor hardware can understand; instead, it is used to instruct the assembler. This fcc tells the assembler to convert the string on this line into binary using the ASCII encoding method. In our case, the string 'This is some data.' will be stored in our program using ASCII. Notice that a null terminator is not appended.
The fcb (form constant byte) is a pseudo-instruction too. It tells the assembler to store the given value directly in our program. In our program, 0 (the null terminator) is stored.
The following table shows what is in the ROM (program space) after assembling. Assume we have 6 bytes of code/data before myString:
Real Address    Address relative to label    Content
0x05            myString-1                   Something before our string
0x06            myString                     T (first character of the string, located at myString+0)
0x07            myString+1                   h
0x08            myString+2                   i
0x09            myString+3                   s
......
0x15            myString+15                  t
0x16            myString+16                  a
0x17            myString+17                  . (last character of the string)
0x18            myString+18                  \0 (null terminator)
0x19            myString+19                  Something after our string
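Each real address is simply the address of the label plus the offset. The following C sketch recomputes that layout, assuming the label sits at 0x06 (6 bytes of code/data before it). LABEL_ADDR, real_address, terminator_offset and print_layout are names invented for this illustration.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define LABEL_ADDR 0x06u  /* assumed address of the myString label */

static const char text[] = "This is some data.";

/* Real address of the byte stored at a given offset from the label. */
uint16_t real_address(size_t offset) {
    return (uint16_t)(LABEL_ADDR + offset);
}

/* Offset of the null terminator: the same number as the string length. */
size_t terminator_offset(void) {
    return strlen(text);
}

/* Print one row per stored byte: real address, relative address, content. */
void print_layout(void) {
    size_t last = terminator_offset();
    for (size_t i = 0; i <= last; i++) {
        if (text[i] != '\0')
            printf("0x%02X  myString+%zu  %c\n", (unsigned)real_address(i), i, text[i]);
        else
            printf("0x%02X  myString+%zu  \\0\n", (unsigned)real_address(i), i);
    }
}
```

Calling print_layout() lists every byte of the string, including the terminator as the final row.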
Print one character
On a PC, we rely on the underlying operating system (OS) to display data. Our program will prepare data in our program’s memory, in other words, a buffer in user space. Then, we invoke a system call with a pointer to this buffer. Then, the OS will use the pointer to retrieve the data and do all the necessary tasks to put them on display.
However, this is not the case for microprocessors. There is no OS; we have to do everything ourselves. We perform this task by reading and displaying the characters one at a time. You may already have noticed that our microprocessor is just a simple IC without any display on it. How can we print our data? We will use an external device for this purpose: a serial port. We will let the microprocessor output the characters through the serial port. The serial port can be connected to a PC running a serial port monitor or to an LCD display with a serial input.
The microprocessor comes with a UART. Besides setting up the mode and baud rate of the UART, all we need to do to send a character is to write it to a special register, SCDR, at address 0x102F, as described in the reference sheet.
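ldaa #'X' ; '#' means immediate: load the ASCII code of 'X' itself, not the memory at that address
staa $102f ; Write it to SCDR to start the transmission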
In the above example, we first load the character 'X' into accumulator A. Then, we write the content of accumulator A, which is the ASCII code of 'X', to address 0x102F. This write operation triggers the UART hardware's transmit operation. Of course, we should add code to check the status of the UART transmitter and wait if there is an ongoing transmission.
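The "check the status and wait" step can be sketched as follows. This is a PC-side model in C, not code for the chip: scsr and scdr are plain variables standing in for the status register (SCSR, at 0x102E on the 68HC11) and the data register (SCDR, at 0x102F), TDRE is the "transmit data register empty" flag in bit 7 of SCSR, and read_scsr with its busy_polls counter is a made-up stand-in for hardware that stays busy for three polls.

```c
#include <stdint.h>

#define TDRE 0x80u   /* bit 7 of SCSR: Transmit Data Register Empty */

/* Simulated registers (plain variables in this sketch). */
static uint8_t scsr;            /* status register; TDRE starts cleared (busy) */
static uint8_t scdr;            /* data register */
static int     busy_polls = 3;  /* hypothetical: transmitter frees up after 3 polls */

/* Hypothetical stand-in for reading the status register from hardware. */
static uint8_t read_scsr(void) {
    if (busy_polls > 0)
        busy_polls--;           /* still shifting out the previous character */
    else
        scsr |= TDRE;           /* transmitter is now free */
    return scsr;
}

/* Wait until the transmitter is free, then write the character.
   Returns how many polls found the transmitter busy. */
int send_char(uint8_t c) {
    int waits = 0;
    while ((read_scsr() & TDRE) == 0)
        waits++;
    scdr = c;                   /* on hardware, this write starts the transmission */
    scsr &= (uint8_t)~TDRE;     /* the model marks the transmitter busy again */
    busy_polls = 3;
    return waits;
}
```

On the real chip the same idea is a short loop that reads the status register and branches back while the flag is clear, placed just before the store to 0x102F.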
You may have had the experience that after a few days away from a program, you just cannot remember what you wrote in it. Imagine that you have been away from the computer for a week and you come back to this program; you will wonder what the **** address 0x102F is. Therefore, let's pack this code into a subroutine.
sendChar
staa $102f
rts
This subroutine writes the content of the accumulator to the UART data register to initiate a transmit operation. From now on, we can forget the details of how to use the UART. We simply load whatever we want to send into accumulator A and call the sendChar subroutine. The following example shows how to send the two characters OK:
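ldaa #'O' ; Load the ASCII code of 'O' (immediate)
jsr sendChar
ldaa #'K' ; Load the ASCII code of 'K' (immediate)
jsr sendChar
bra * ; Halt here so execution does not fall through into sendChar
sendChar
staa $102f
rts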
In fact, this is how the interface of a C function such as int putchar(int c) can be implemented: load a character into a register, then call the function to print the character in that register.
Print string
Although we can load and print one character at a time like the above "OK" example, it is very inefficient.
As mentioned before, a string is marked by a null terminator placed at its end. Therefore, we can print the characters of the string in order until we see the null terminator.
Let's first start with C code on the computer:
const char* const myString = "This is some data.";
for (const char* p = myString; *p != '\0'; p++) {
    putchar(*p);
}
// Or, in more detail
const char* ptr = myString;
while(1) {
    char current = *ptr;
    if (current == '\0') break; // '\0' is the null terminator and it represents the number 0
    putchar(current);
    ptr++;
}
After compiling, the variable myString will hold a pointer to the data (more specifically, the address of the first character). We load the first character, print it, advance the pointer, load the second character, print it, and so on. Once we see the null terminator, we stop.
Now, let’s convert the C language into assembly for 68HC11:
ldx #myString ; Load the address of the data into index register IX (pointer)
loop
ldaa 0,x ; Load A from the address in IX, with offset 0
beq done ; If the loaded byte is 0, the zero flag is set: branch to done
jsr sendChar ; Print character
inx ; Advance pointer
bra loop ; Continue the loop
done
bra * ; Halt
sendChar
staa $102f
rts
myString
fcc 'This is some data.'
fcb 0
The 68HC11 has two special 16-bit registers, the index registers IX and IY (the earlier 6800 had only one index register). An index register is used to access memory by address, similar to the idea of a pointer in C.
Let’s read this assembly code together:
We first load the address of the data myString into our index register IX. At this moment, IX holds the address of the first character T.
Next, we load the accumulator with the value pointed to by IX with 0 offset. In other words, we load T into the accumulator.
As the reference sheet mentions, the ldaa instruction updates the zero flag according to the value loaded: the zero flag is set if the loaded value is zero and cleared otherwise. If it is set, we have reached the end of the string and we should stop; otherwise, we should keep printing and advancing. The beq instruction on the next line causes the program to branch to the done label if the zero flag is set, effectively ending the loop.
Now, we have the character to print in the accumulator. We can call the sendChar subroutine to print that character.
Finally, we advance the pointer by increasing it by one, and go back to the beginning of the loop to repeat the same task with the new pointer.
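If you want to follow the walkthrough on a PC before using the simulator, here is a small C model of it. It is only a sketch under my own assumptions: rom is an array standing in for program memory, the string is assumed to sit at 0x06, ix, a and zero_flag are ordinary variables playing the roles of IX, accumulator A and the zero flag, and send_char just records the character instead of touching a UART.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STR_ADDR 0x06u            /* assumed address of the string in ROM */

static uint8_t rom[64];           /* simulated program memory */
static char    sent[32];          /* characters handed to send_char */
static size_t  sent_len;

/* Stand-in for the sendChar subroutine: record the character in A. */
static void send_char(uint8_t a) {
    sent[sent_len++] = (char)a;
    sent[sent_len] = '\0';
}

/* Runs the print loop and returns the final value of the index register. */
uint16_t run_print_loop(void) {
    static const char text[] = "This is some data.";
    memcpy(rom + STR_ADDR, text, sizeof text);   /* text plus terminator */
    sent_len = 0;
    sent[0] = '\0';

    uint16_t ix = STR_ADDR;       /* ldx: IX = address of the first character */
    for (;;) {
        uint8_t a = rom[ix];      /* load A from the address in IX, offset 0 */
        int zero_flag = (a == 0); /* the load sets the zero flag when the byte is 0 */
        if (zero_flag)            /* beq done */
            break;
        send_char(a);             /* jsr sendChar */
        ix++;                     /* inx */
    }                             /* bra loop */
    return ix;
}
```

When the loop ends, the model's index register points at the null terminator, one byte past the last printed character.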
Now, let's create a subroutine for our string printing function:
ldx #str1 ; Load the address of the first string
jsr sendString
ldx #str2 ; Load the address of the second string
jsr sendString
bra * ; Halt here so execution does not fall through into sendString
sendString ; Print string pointed to by IX
loop
ldaa 0,x ; Load A from pointer
beq done ; If the loaded byte is 0, the zero flag is set: branch to done
jsr sendChar ; Print character
inx ; Advance pointer
bra loop ; Continue the loop
done
rts
sendChar ; Print character in ACCA
staa $102f
rts
str1
fcc '111'
fcb 0
str2
fcc '222'
fcb 0
In the above example, the caller must load the pointer into index register IX before calling the subroutine. The subroutine prints until a null terminator is loaded. Note that it changes both accumulator A and IX along the way.
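The same calling convention can be written in C as a sketch. Here the parameter s plays the role of IX, and the function returns the pointer's final value to show where IX ends up. The names send_string, send_char, line and line_len are invented for this illustration, and line is a buffer standing in for the serial line.

```c
#include <stddef.h>

static char   line[64];    /* stand-in for the serial line */
static size_t line_len;

/* Stand-in for sendChar: append one character to the buffer. */
static void send_char(char c) {
    if (line_len + 1 < sizeof line) {
        line[line_len++] = c;
        line[line_len] = '\0';
    }
}

/* The caller passes the pointer (like loading IX before jsr).
   The returned pointer is where the pointer stops: at the terminator. */
const char *send_string(const char *s) {
    while (*s != '\0') {   /* stop when the null terminator is loaded */
        send_char(*s);     /* print the current character */
        s++;               /* advance the pointer */
    }
    return s;
}
```

Calling send_string("111") and then send_string("222") leaves the six characters 111222 in line, just like the two jsr sendString calls in the assembly version.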