Print string on 68HC11

This article illustrates how a string is laid out in memory and how to print a string on a PC using C and on a microprocessor using assembly. The 68HC11 assembly language is used as the example.

--by Captdam @ Sep 2, 2024

I wrote this guide a few years ago when I was a teaching assistant at university. The course covered the architecture of RISC and CISC microprocessors and assembly programming for the 68HC11. I recently found this article in my Gist, and I think it is worth putting up here on my blog.

One assignment was to print a text (an ASCII string), and many students had difficulty with it. Explaining it by email was hard because of the limitations of email, and repeating the same text to every student was inefficient; therefore, I wrote this article to help students understand how to do the assignment and how microprocessors work.

Helpful tips

I want to put these tips at the beginning of this article. Please please please do these!

First, print the reference sheet and keep it on hand. The reference sheet is your best friend when programming a microprocessor: it lists all the instructions we can use and how to use them. If you want to dive deeper and see how the hardware works, read the reference sheet. It can be found here: https://www.nxp.com/docs/en/reference-manual/M68HC11RM.pdf

Second, use the “Run step” function of the simulator. Do not expect the program to run flawlessly the first time; it is almost guaranteed to have some bugs. I would be nervous if my program ran correctly the first time, because that usually means a serious bug is hiding and a disaster is about to happen. Set some breakpoints and run step by step. Unlike PC programming, where the system is highly abstracted, a microprocessor relies on several pieces of hardware working together. Watch how every register changes as you advance the program.

Third, always reset the simulator after modifying code to clear the memory.

String encoding

PC example

Let's start with PC programming. When writing software, we usually have data and executable code. Data holds the information we are interested in, and code manipulates the data. There are two types of data: variables in RAM, which can be modified by the program and must be created at run time, and read-only constants in ROM, which come with the program and cannot be modified (for microprocessors with a Harvard architecture). In this article, we will focus on read-only constants. Consider the following code in C:


const char* const myString = "This is some data.";
printf("%s", myString);
	

In this program, myString is our data: it contains the text we want to print. printf is our code: it tells the computer how to work with the data.

When we compile this program, the compiler first encodes our data “This is some data.” in ASCII (a commonly used encoding for human-readable characters). The encoded data is stored somewhere in our program, and the compiler remembers where.

Let’s say our data is stored at address 0x1234 (0x means hexadecimal; I personally prefer 0x over $ because 0x is more widely used). In more detail, a chunk of memory holds the string: the first character ‘T’ is stored at 0x1234, the second character ‘h’ at 0x1235, the third character ‘i’ at 0x1236, and so on. The address of the first character, 0x1234, is bound to the name myString. For the compiler, myString means address 0x1234.

Then the compiler compiles the code printf(“%s”, myString);. As we can see, printf takes two parameters. When the program runs, the first parameter “%s” tells the computer to handle the data as a string. In more technical terms, the data is a null-terminated array of characters, here in ASCII encoding.

The second parameter myString tells the computer where the data is. In our case, myString means address 0x1234. Therefore, the computer fetches the byte at 0x1234 and prints it on screen as an ASCII character. It then prints whatever is at 0x1235, and so on.

So, when does the computer stop printing? In other words, how does it know it has reached the end of our string? When the string is encoded, a null terminator (a byte of 0) is placed at its end (in C, writing the string in double quotes tells the compiler to append the 0 byte for us). When the computer sees a 0 while printing, it knows it has reached the end of the string, and printf returns.
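To make the null terminator visible, here is a minimal C sketch. The helper names count_chars and dump_bytes are made up for this illustration (the standard library already offers strlen for the counting part); the sketch only assumes a hosted C compiler.

```c
#include <stddef.h>
#include <stdio.h>

/* Count the characters before the null terminator, the same way
   printf("%s") finds the end of a string. Hypothetical helper. */
size_t count_chars(const char *s) {
    size_t n = 0;
    while (s[n] != 0) {   /* stop at the byte of 0 */
        n++;
    }
    return n;
}

/* Print every byte of the string, including the terminator, in hex. */
void dump_bytes(const char *s) {
    size_t n = count_chars(s);
    for (size_t i = 0; i <= n; i++) {   /* <= so the 0 byte is shown too */
        printf("offset %zu: 0x%02X\n", i, (unsigned char)s[i]);
    }
}
```

Calling dump_bytes("OK") prints three lines: 0x4F for ‘O’, 0x4B for ‘K’, and 0x00 for the terminator that the double quotes added for us. For the same reason, sizeof("This is some data.") is 19 although the text has only 18 characters.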

Microprocessor example


myString
	fcc	'This is some data.'
	fcb	0
	

On a microprocessor, assembly language takes the same approach as C. Let’s look at the example above.

First, we have a label called myString. Similar to our C example, it will be converted to an address by the assembler.

The fcc is a pseudo-instruction. It is not assembled into an instruction the microprocessor hardware can understand; instead, it instructs the assembler. Here, fcc tells the assembler to convert the string on this line into bytes using ASCII encoding. In our case, the string ‘This is some data.’ is stored in our program in ASCII. Notice that no null terminator is appended.

The fcb is a pseudo-instruction too. It tells the assembler to store the given byte value directly in our program. In our program, a 0 (the null terminator) is stored.

The following table shows what is in the ROM (program space) after assembling. Assume we have 6 bytes of code/data before myString:


Real Address	Address relative to label	Content
0x05		myString-1			Something before our string
0x06		myString			T	(First character in the string is located at myString+0)
0x07		myString+1			h
0x08		myString+2			i
0x09		myString+3			s
......
0x15		myString+15			t
0x16		myString+16			a
0x17		myString+17			.	(Last character of the string)
0x18		myString+18			\0	(Null-terminator)
0x19		myString+19			Something after our string
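
The layout in the table can be reproduced with a small C sketch that imitates what the assembler does for fcc followed by fcb 0. The rom array, its size, and the function name place_string are assumptions made for this illustration; a real assembler writes into the program image instead.

```c
#include <stddef.h>
#include <string.h>

#define ROM_SIZE 64            /* assumed size of the simulated ROM */
#define MYSTRING_ADDR 0x06     /* 6 bytes of code/data come before the label */

static unsigned char rom[ROM_SIZE];

/* Imitate "fcc 'text'" followed by "fcb 0" at the given address.
   Returns the address of the first byte after the terminator,
   or 0 if the string does not fit in the simulated ROM. */
size_t place_string(size_t addr, const char *text) {
    size_t len = strlen(text);
    if (addr + len + 1 > ROM_SIZE) {
        return 0;
    }
    memcpy(&rom[addr], text, len);  /* fcc: characters only, no terminator */
    rom[addr + len] = 0;            /* fcb 0: the null terminator */
    return addr + len + 1;
}
```

With place_string(MYSTRING_ADDR, "This is some data."), the 18 characters occupy 0x06 to 0x17, the terminator lands at 0x18, and the function returns 0x19, the first address after our string.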
	

Print one character

On a PC, we rely on the underlying operating system (OS) to display data. Our program prepares the data in its own memory, in other words, in a buffer in user space. We then invoke a system call with a pointer to this buffer, and the OS uses the pointer to retrieve the data and does everything necessary to put it on the display.
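
As a rough sketch of that hand-off, assuming a POSIX system such as Linux or macOS, the write system call takes exactly such a pointer to a user-space buffer. The wrapper name print_buffer is made up for this example.

```c
#include <string.h>
#include <unistd.h>   /* POSIX: write(), STDOUT_FILENO */

/* Hand a user-space buffer to the OS for display.
   Returns the number of bytes the OS accepted, or -1 on error. */
ssize_t print_buffer(const char *buffer) {
    return write(STDOUT_FILENO, buffer, strlen(buffer));
}
```

Note that write needs an explicit length, because the OS does not look for a null terminator; here strlen finds it on our side before the call.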

However, this is not the case for microprocessors. There is no OS, so we have to do everything ourselves. We perform this task by reading and sending the characters one at a time. You may have noticed that our microprocessor is just a simple IC without any display on it. How can we print our data? We will use an external device for this purpose: a serial port. The microprocessor outputs the characters through the serial port, which can be connected to a PC running a serial port monitor or to an LCD display with a serial input.

The microprocessor comes with a UART. Besides setting up the mode and baud rate of the UART, all we need to do to send a character is write it to a special register, SCDR, at address 0x102F, as described in the reference sheet.


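	ldaa	#'X'		; Load the ASCII code of 'X' (# means an immediate value)
	staa	$102f		; Write it to the UART data register SCDR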
	

In the above example, we first load a character ‘X’ into accumulator A. Then, we write the content in accumulator A, which contains the ASCII code of character ‘X’, into address 0x102F. This write operation triggers the UART hardware’s transmitting operation. Of course, we should add code to check the status of the UART transmitter and wait if there is ongoing transmission.
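
The status check mentioned above can be sketched in C against a stand-in for the hardware. This is only an illustration: the struct SciRegs and the function send_char_blocking are made-up names, and the register details (status register SCSR at 0x102E, with bit 7, TDRE, meaning the transmit data register is empty) are my reading of the reference sheet, so please verify them there.

```c
#include <stdint.h>

#define SCSR_TDRE 0x80   /* bit 7 of SCSR: transmit data register empty */

/* Stand-in for the two UART registers; on the real chip they are assumed
   to sit at 0x102E (SCSR) and 0x102F (SCDR). */
typedef struct {
    volatile uint8_t scsr;   /* status register */
    volatile uint8_t scdr;   /* data register */
} SciRegs;

/* Wait until the transmitter can take a byte, then write it. */
void send_char_blocking(SciRegs *sci, uint8_t c) {
    while ((sci->scsr & SCSR_TDRE) == 0) {
        /* busy-wait: the previous character is still being moved out */
    }
    sci->scdr = c;   /* on real hardware this write starts the transmission */
}
```

On a PC the struct is just memory, so the TDRE bit has to be set by hand before calling the function; on the microprocessor the hardware sets it when the transmitter is ready. In assembly, the same idea becomes a short loop that reads the status register and branches back until the bit is set.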

You may have had the experience that, after a few days away from a program, you just cannot remember what you wrote in it. Imagine that you have been away from the computer for a week; you come back to this program and wonder what the **** address 0x102F is. Therefore, let’s pack this code into a subroutine.


sendChar
	staa	$102f
	rts
	

This subroutine writes the content of accumulator A into the UART data register to initiate a transmission. From now on, we can forget the details of how to use the UART. We simply load whatever we want to send into accumulator A and call the sendChar subroutine. The following example shows how to send the two characters OK:


	ldaa	#'O'
	jsr	sendChar
	ldaa	#'K'
	jsr	sendChar
	bra	*		; Halt here so execution does not fall into sendChar

sendChar
	staa	$102f
	rts
	

In fact, this is how a character-output function such as C's putchar is typically called at the machine level: the caller loads a character into a register, then calls the function to print the character in that register.

Print string

Although we can load and print one character at a time, as in the above "OK" example, it is very inefficient.

As mentioned before, a string ends with a null-terminator. Therefore, we can print the characters of the string in order until we see the null-terminator.

Let’s start with C code on a PC:


const char* const myString = "This is some data.";

for (const char* p = myString; *p != '\0'; p++) {
	putchar(*p);	// putchar (from <stdio.h>) prints one character
}

// Or, in more detail

const char* ptr = myString;	// The pointer itself must stay modifiable
while (1) {
	char current = *ptr;
	if (current == '\0') break; // '\0' is the zero-terminator and it represents number 0
	putchar(current);
	ptr++;
}
	

After compiling, the variable myString holds a pointer to the data (more specifically, the address of the first character). We load the first character, print it, advance the pointer, load the second character, print it, and so on. Once we see the zero-terminator, we stop.

Now, let’s convert the C language into assembly for 68HC11:


	ldx	#myString	; Load address of data to index register (pointer)
loop
	ldaa	0,x		; Load A from pointer
	beq	done		; If previous load is 0 which sets zero flag, branch to done label
	jsr	sendChar	; Print character
	inx			; Advance pointer
	bra	loop		; Continue the loop
done
	bra	*		; halt

sendChar
	staa	$102f
	rts

myString
	fcc	'This is some data.'
	fcb	0
	

68HC11 has two special 16-bit registers, the index registers IX and IY (the earlier 6800 had only one index register). An index register is used to access memory at the address it holds, similar to a pointer in C.
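
To connect the index register with C pointers, here is a small sketch of how an indexed access finds its address: the content of the index register plus a small unsigned offset written in the instruction. The function names and the memory array passed in are assumptions for this illustration only.

```c
#include <stdint.h>

/* Effective address of an indexed access: the 16-bit index register
   plus an unsigned 8-bit offset taken from the instruction. */
uint16_t effective_address(uint16_t ix, uint8_t offset) {
    return (uint16_t)(ix + offset);
}

/* Read one byte through the "pointer", like p[offset] in C. */
uint8_t load_indexed(const uint8_t *memory, uint16_t ix, uint8_t offset) {
    return memory[effective_address(ix, offset)];
}
```

With IX = 0x1234, an offset of 0 reads the byte at 0x1234 and an offset of 2 reads the byte at 0x1236, just like p[0] and p[2] on a C pointer.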

Let’s read this assembly code together:

We first load the address of the data myString into our index register IX. At this moment, IX holds the address of the first character T.

Next, we load the accumulator with the value pointed by IX with 0 offset. In other words, load T into the accumulator.

As the reference sheet mentions, the ldaa instruction updates the zero flag according to the value loaded: the flag is set if the value is zero and cleared otherwise. If it is set, we have reached the end of the string and should stop; otherwise, we should keep printing and advancing. The beq instruction on the next line branches to the done label if the zero flag is set, effectively ending the loop.

Now, we have the character to print in the accumulator. We can call the sendChar subroutine to print that character.

Finally, we advance the pointer by one and go to the beginning of the loop to repeat the same task with the new pointer.
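
If you want to follow the loop without the simulator, the walkthrough above can be traced with a small C sketch that keeps the index register, accumulator A, and the zero flag as plain variables. The Cpu struct and run_print_loop are hypothetical names, and the output buffer stands in for the serial port; this is an illustration of the steps, not an emulator of the 68HC11.

```c
#include <stddef.h>
#include <stdint.h>

/* Simulated machine state: only what the loop touches. */
typedef struct {
    uint16_t ix;   /* index register: the pointer */
    uint8_t  a;    /* accumulator A */
    int      z;    /* zero flag */
} Cpu;

/* Run the print loop over `memory`, starting at address `start`.
   Characters "sent" to the serial port are collected in `out`
   (capacity out_size, always null-terminated). Returns the count sent. */
size_t run_print_loop(const uint8_t *memory, uint16_t start,
                      char *out, size_t out_size) {
    Cpu cpu = { start, 0, 0 };             /* load address into IX */
    size_t sent = 0;
    if (out_size == 0) {
        return 0;
    }
    for (;;) {
        cpu.a = memory[cpu.ix];            /* load A from pointer */
        cpu.z = (cpu.a == 0);              /* the load updates the zero flag */
        if (cpu.z) break;                  /* beq done */
        if (sent + 1 >= out_size) break;   /* simulation only: buffer full */
        out[sent++] = (char)cpu.a;         /* jsr sendChar */
        cpu.ix++;                          /* inx */
    }                                      /* bra loop */
    out[sent] = '\0';
    return sent;
}
```

Each line of the loop body maps to one instruction of the assembly version, so stepping through this function in a debugger shows the same register changes you should see in the simulator.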

Now, let's create a subroutine for our string printing function:


	ldx	#str1
	jsr	sendString
	ldx	#str2
	jsr	sendString
	bra	*		; Halt here so execution does not fall into sendString

sendString			; Print string pointed by IX
loop
	ldaa	0,x		; Load A from pointer
	beq	done		; If previous load is 0 which sets zero flag, branch to done label
	jsr	sendChar	; Print character
	inx			; Advance pointer
	bra	loop		; Continue the loop
done
	rts
	

sendChar			; Print character in ACCA
	staa	$102f
	rts

	
str1
	fcc	'111'
	fcb	0
str2
	fcc	'222'
	fcb	0
	

In the above example, the caller must load the pointer into index register IX before calling the subroutine, which prints until a zero-terminator is loaded.
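
For comparison, here is a C counterpart of the two subroutines, written as a sketch. The port_log buffer is an assumption that stands in for the serial port so the result can be inspected on a PC; send_char and send_string mirror sendChar and sendString.

```c
#include <stddef.h>

static char port_log[64];      /* simulated serial output, for illustration */
static size_t port_len = 0;

/* Counterpart of sendChar: "transmit" one character. */
void send_char(char c) {
    if (port_len + 1 < sizeof port_log) {
        port_log[port_len++] = c;
        port_log[port_len] = '\0';
    }
}

/* Counterpart of sendString: the caller passes the pointer (IX). */
void send_string(const char *ix) {
    while (*ix != '\0') {   /* load through the pointer, stop on zero */
        send_char(*ix);     /* jsr sendChar */
        ix++;               /* inx */
    }                       /* leaving the loop corresponds to rts */
}
```

Calling send_string("111") and then send_string("222") leaves "111222" in port_log. One difference is worth noticing: C passes the pointer by value, so the caller's copy is untouched, while in the assembly version IX is left pointing at the zero-terminator and accumulator A holds 0 when the subroutine returns. The caller has to reload IX before the next call, which is what the second ldx does.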