Read Me!!
What is NyanSim?
NyanSim is an LPC2478 simulator designed for the UNSW DESN2000 (rip 2142) labs. It interprets your executable (.axf) and simulates what the actual QVGA board would be doing.
Sounds cool, how do I use it?
You'll need to first build a project in uVision, in order to get the .axf file. If it doesn't compile in uVision, the .axf file will not be created/updated. Once you have successfully built your project, click on "Select a file", navigate to the output folder in your project directory, and upload the .axf file. Congrats, your program has been loaded onto the virtual microcontroller! You can "Run" your program, "Stop" it and "Reset" it. NyanSim also offers primitive debugging features, including a disassembly window and "Step" and "Run to cursor" commands. The cursor is set by selecting a line in the disassembly window (it will appear blue). The instruction about to be executed is highlighted yellow.
Usage notes
Some things to keep in mind:
- NyanSim only works with ARM instructions, not Thumb. It will throw an error if you try to give it Thumb instructions. Make sure to disable Thumb instructions in uVision by going to "Project->Options for Target", and disabling "Enable ARM/Thumb Interworking" in both the "C/C++" and "Asm" tabs. You'll need to build again after this.
- NyanSim runs slower than the actual chip (see below for more info). The real QVGA board runs at 72MHz, but this simulator runs at about 10-40MHz (depending on your processor and browser).
- Some QVGA board functionalities are either not implemented, or partially implemented. See below for a full list of what is currently supported.
- The yellow highlighted line in the disassembly window is not the PC, but PC-8 (the instruction about to be executed). Note that on reset, the PC will start at 8.
- You may need to modify some parts of the labs to be compatible with the simulator. See below for details.
It's not working :-(
- Ensure you have built your project, and it compiled successfully.
- Ensure you've read the usage notes above, particularly regarding Thumb instructions.
If all else fails, there is the possibility it's a bug. Flick me an email via the contact page of this site, attaching a .zip of your project.
Why is it named NyanSim?
You'll work out why once you get your "play_song" project working :-)
Notes per lab
- Lab 0: hexloader/demo won't work, as it is compiled into a .hex file rather than the .axf file that is used for the rest of the labs.
- Lab 4: you don't need to change anything, but for blinky note that the LEDs will blink about 2-3 times slower than they will on the real board.
- Lab 5: the sine_wave project is currently not supported, as you need to use an oscilloscope. For the play_song project, since you are using one of the timers on the QVGA board, you will need to consider the difference in speed between the real board and your web browser. This will affect how you set up your prescale register. I recommend running the project with no modification in your browser, observing the clock speed achieved (under the registers), and setting up the prescale register appropriately.
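As a rough worked example of the Lab 5 prescale adjustment, here is a small Python sketch. It assumes the timer counter advances once every PR + 1 input clocks (which is how the LPC2478 timer prescaler behaves), and it assumes the speed shown under the registers is the rate the simulated timer is clocked at. The function name and the numbers are hypothetical; substitute the prescale value from your own lab code and the speed you actually observe.

```python
def scaled_prescale(pr_real, f_real_hz, f_sim_hz):
    """Return a prescale value that keeps the timer tick rate roughly the same.

    The timer counter advances once every (PR + 1) input clocks, so the
    tick rate is f / (PR + 1). Holding that rate constant gives:
        PR_sim + 1 = (PR_real + 1) * f_sim / f_real
    """
    divider = round((pr_real + 1) * f_sim_hz / f_real_hz)
    # A divider below 1 is impossible, so clamp it (PR cannot go below 0).
    return max(divider, 1) - 1


# Hypothetical numbers: the lab code uses PR = 71 at 72 MHz (a 1 MHz tick),
# and the simulator reports roughly 20 MHz in your browser.
print(scaled_prescale(71, 72_000_000, 20_000_000))  # prints 19
```

Since the achieved speed varies with your processor and browser, treat the result as a starting point and adjust by ear.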
Supported functionalities and peripherals
The more that are implemented, the slower the simulator becomes. Thus, some functionalities are only partially implemented as below. If you need a functionality for your project that is currently not implemented, please contact me via the email on the contact page to request that it be implemented.
- LED1-8 are supported. These are set over I2C, see note below on I2C.
- Everything on the daughterboard is fully supported - RGB LEDs, push buttons, LED ladder. Both buttons can be considered to be debounced.
- DAC and volume potentiometer are supported. Bias in the DACR is not implemented.
- Timer0 is implemented with all 4 match registers. Interrupts in the MCR are not supported. Timers 1-3 are not supported.
- ADC and potentiometer are supported. The ADC currently only supports channels AD0.2 (red potentiometer) and AD0.1 (light sensor on breakout board). Other channels are not supported. Burst mode is not supported. CLKDIV is not considered, since it takes considerable time for the CPU to request the channel value from the GUI.
- LCD screen is supported. The LCD screen is set up over SPI. See note below on SPI.
- Touch screen is supported. Only 8-bit resolution with differential reference and power down between conversions is supported.
- I2C is implemented quite spartanly. The processor only works in master transmit mode, and there is only one slave - the LED dimmer that drives LED1-8 and PB1-4. Because only master transmit is implemented, only the LEDs work.
- SPI is also implemented quite spartanly. The 2 SPI peripherals are the LCD screen and the touchscreen controller. SPI is used to set up the LCD screen in lcd_display_init(), and since this is code you will not modify, the simulator just assumes that you have set it up correctly. All SPI communication to the LCD screen is ignored. Whilst SPI is actually implemented for the touch screen controller, it only works with the S0SPCR configured as per Lab 6. Furthermore, SCLK and thus S0SPCCR are ignored.
If you're interested...
NyanSim is implemented in 2 halves - the frontend that presents the GUI you interact with, and a background thread called a Web Worker that implements the LPC2478 processor. Both are needed because if I were to implement the CPU along with the GUI, the GUI would become quite unresponsive, as it would need to wait for the CPU to finish what it was doing before responding to your input (pushing a button etc). No one likes an unresponsive webpage!
This actually works quite well, because it means that the CPU just continually executes instructions, sending messages to the GUI and responding to messages from the GUI as required, which is similar to how the communication protocols (SPI, I2C) work anyway! You might notice when running NyanSim that an entire core of your computer is being hogged by a Chrome resource - that's the Web Worker! It is working super hard to make the simulator as fast as possible.
The first thing that happens is that your uploaded .axf file is parsed, and memory is initialised. It's worth noting that the .axf file produced by uVision is an ELF file, and so it has a standard format. You can see in the disassembly window above that I've been able to pull out some crude debugging info, such as function names.
Memory cannot be implemented in its entirety - the LPC2478 uses 32-bit addresses, which means a 4GB address space. Imagine your Chrome tab trying to use 4GB of memory! To get around this, memory is implemented as a sparse tree. The tree has 4 levels, and is implemented with arrays. Each node in the tree has 256 children. This means that I effectively split memory up into 256^3 = 16777216 blocks of 256 bytes, which are allocated only if they are used by your project. This also means that if you try to use memory that has not been initialised, the simulator crashes. Hopefully you are initialising your memory! (think DCD vibes)
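To make the tree idea concrete, here is a minimal Python sketch of a 4-level sparse memory. NyanSim itself is written in JavaScript and this is not its actual code; the class and method names are made up for illustration. It assumes each of the top three address bytes selects a child at one level, and the lowest byte is the offset into a 256-byte block.

```python
class UninitialisedMemory(Exception):
    """Raised when reading a block that was never allocated (hypothetical name)."""


class SparseMemory:
    """Byte-addressable 32-bit memory that allocates 256-byte blocks lazily."""

    def __init__(self):
        # Root node: 256 child slots, all empty until first written.
        self.root = [None] * 256

    def write_byte(self, addr, value):
        node = self.root
        # Walk the top two address bytes, creating inner nodes as needed.
        for shift in (24, 16):
            index = (addr >> shift) & 0xFF
            if node[index] is None:
                node[index] = [None] * 256
            node = node[index]
        # The third address byte selects a 256-byte leaf block.
        index = (addr >> 8) & 0xFF
        if node[index] is None:
            node[index] = bytearray(256)
        node[index][addr & 0xFF] = value & 0xFF

    def read_byte(self, addr):
        node = self.root
        for shift in (24, 16, 8):
            node = node[(addr >> shift) & 0xFF]
            if node is None:
                raise UninitialisedMemory(f"read of address 0x{addr:08X}")
        return node[addr & 0xFF]


mem = SparseMemory()
mem.write_byte(0x40000010, 0xAB)
print(hex(mem.read_byte(0x40000010)))  # prints 0xab
```

In this sketch, a read from a block that was never written raises an error, while untouched bytes inside an already-allocated block simply read as 0. Whether NyanSim behaves the same way at that granularity is an assumption.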
So how does the processor execute instructions? When parsing your .axf file, NyanSim grabs all your (encoded) instructions, decodes them, and works out exactly what they are. Then, rather than storing an array with the machine or assembly code for the instructions, NyanSim stores an array of functions, where each function corresponds to a particular instruction (MOV, STR, ADD etc). So NyanSim doesn't really do any pipelining, but rather just executes instruction after instruction by calling the function at the index given by PC-8, and then incrementing the PC. The speed of the processor is therefore limited by how fast these functions execute. On the LPC2478, the number of clock cycles per instruction is usually 1-3, but implementing these instructions as functions in JavaScript is nowhere near as fast. Consider what I have to do (sequentially) in JavaScript that happens concurrently in the LPC2478 for an ADD instruction:
- Check the condition flags in the CPSR and work out whether to continue executing the ADD
- Perform the addition
- Set the destination register with the result
- Potentially generate N, Z, C, V and update the CPSR if S is set in the instruction
- Increment the PC
- Increment Timer0 if enabled
- Check if Timer0 has matched any of the 4 match registers, and potentially do something if this is the case
This all takes time, and so an average simulated clock speed of ~20MHz is actually quite impressive given what the simulated processor is actually doing. Above I mentioned that some functionalities are only partially implemented to avoid unnecessarily slowing down the processor. Imagine if I implemented Timers 1-3 as well - on every instruction we would now have to increment these timers as well as check if they had matched any of their respective match registers - expensive!
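The per-instruction steps listed above can be sketched roughly as follows. This is illustrative Python, not NyanSim's JavaScript, and every name in it (Cpu, make_add, always) is hypothetical. It assumes the function array holds one entry per 4-byte instruction, keeps the PC outside the register list for simplicity, only handles a register-register ADD, and ignores the timer prescaler.

```python
def make_add(rd, rn, rm, set_flags, cond):
    """Decode once: return a function that performs ADD rd, rn, rm when called."""
    def execute(cpu):
        if cond(cpu):  # check the condition against the CPSR flags
            a, b = cpu.regs[rn], cpu.regs[rm]
            result = (a + b) & 0xFFFFFFFF
            cpu.regs[rd] = result
            if set_flags:  # only when S is set in the instruction
                cpu.n = result >> 31
                cpu.z = int(result == 0)
                cpu.c = int(a + b > 0xFFFFFFFF)
                # Overflow: operands share a sign that the result does not.
                cpu.v = ((~(a ^ b) & (a ^ result)) >> 31) & 1
        cpu.pc += 4
    return execute


def always(cpu):
    """The AL condition: execute regardless of flags."""
    return True


class Cpu:
    def __init__(self, program):
        self.regs = [0] * 15       # R0-R14; PC kept separately in this sketch
        self.n = self.z = self.c = self.v = 0
        self.pc = 8                # PC reads 8 ahead, so it starts at 8 on reset
        self.program = program     # one pre-decoded function per instruction
        self.timer_enabled = False
        self.timer_count = 0
        self.match = [None] * 4    # four match registers (None = unused here)
        self.matched = []          # indices of match registers that have fired

    def step(self):
        # Call the function for the instruction at PC-8 (4 bytes per instruction).
        self.program[(self.pc - 8) // 4](self)
        if self.timer_enabled:
            self.timer_count += 1
            for i, value in enumerate(self.match):
                if value == self.timer_count:
                    self.matched.append(i)


cpu = Cpu([make_add(0, 1, 2, True, always)])
cpu.regs[1], cpu.regs[2] = 0xFFFFFFFF, 1
cpu.step()
print(cpu.regs[0], cpu.z, cpu.c, cpu.pc)  # prints 0 1 1 12
```

Even in this stripped-down form, every step() does a function call, a condition check, flag arithmetic and a timer scan, which gives a feel for why each extra peripheral costs speed.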
One further point to mention is that both the DAC data and LCD screen data are buffered rather than being continuously sent to the GUI. Imagine if I didn't do this - every time your code sets a pixel in the LCD screen a message would be sent from the CPU worker to the GUI - 240*320 messages for a screen refresh! This would overload the GUI and be a performance bottleneck. So instead, the CPU buffers the data to be sent to both the LCD screen and the speaker, and sends it at 20ms and 500ms intervals respectively. This is why when using the speaker there is a discontinuity every 0.5s - this is the join between 2 speaker buffers.
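The buffering idea can be sketched like this. Again, this is illustrative Python rather than NyanSim's JavaScript, with hypothetical names throughout. The send callback stands in for a message from the worker to the GUI, and the time is passed in explicitly so the example is deterministic.

```python
class PixelBuffer:
    """Collects pixel writes and hands them over in one batch per interval."""

    def __init__(self, send, interval_ms):
        self.send = send              # stand-in for a worker-to-GUI message
        self.interval_ms = interval_ms
        self.pending = {}             # (x, y) -> colour; later writes overwrite earlier ones
        self.last_flush_ms = 0

    def set_pixel(self, x, y, colour, now_ms):
        self.pending[(x, y)] = colour
        # Only send once the interval has elapsed since the last flush.
        if now_ms - self.last_flush_ms >= self.interval_ms:
            self.flush(now_ms)

    def flush(self, now_ms):
        if self.pending:
            self.send(self.pending)
            self.pending = {}
        self.last_flush_ms = now_ms


messages = []
lcd = PixelBuffer(messages.append, interval_ms=20)
for y in range(320):
    for x in range(240):
        lcd.set_pixel(x, y, 0xFFFF, now_ms=0)
lcd.flush(now_ms=20)
print(len(messages), len(messages[0]))  # prints 1 76800
```

A full-screen redraw here produces one message carrying 76800 pixels instead of 76800 separate messages. The same pattern with a 500ms interval would apply to the speaker samples.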