Sunday, March 25, 2018

Announcing ubaboot

ubaboot is a 512-byte open-source USB bootloader for the ATmega32U4.

100% size-optimized assembly packs many features into a small footprint:
  • Flash write and read (verify)
  • EEPROM write and read (verify)
  • Signature, lock, and fuse read

Its tiny size allows up to the maximum hardware-supported 31.5 KiB for user programs.
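The 31.5 KiB figure follows directly from the chip's 32 KiB of flash minus a 512-byte boot section. A quick check of the arithmetic (the variable names are only for illustration):

```python
# ATmega32U4 flash size and the smallest boot section the hardware supports.
FLASH_BYTES = 32 * 1024
BOOT_BYTES = 512

# Everything below the boot section is available to the user program.
user_bytes = FLASH_BYTES - BOOT_BYTES
user_kib = user_bytes / 1024

# With the boot section at the top of flash, it starts where user space ends.
boot_start = user_bytes

print(user_bytes, user_kib, hex(boot_start))
```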

Check it out here: https://github.com/rrevans/ubaboot


How does it work?

ubaboot attaches to the system as a custom USB device. An included sample user-mode pyusb driver works on Linux; it programs and verifies the chip's memories by sending control transfers. README.md contains a full description of the protocol if you wish to write your own driver or work with other platforms.
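Every control transfer begins with an 8-byte setup packet whose layout is fixed by the USB specification. As a rough sketch of what a driver sends, the snippet below packs one by hand. The request code REQ_EXAMPLE_READ and the use of vendor-type requests are assumptions made for illustration; the real request numbers and field meanings are in README.md.

```python
import struct

# Setup packet layout from the USB spec: bmRequestType, bRequest (1 byte
# each), then wValue, wIndex, wLength (2 bytes each, little-endian).
SETUP_FORMAT = "<BBHHH"

# bmRequestType values for vendor requests addressed to the device.
# Assumption: ubaboot's actual request type may differ; see README.md.
VENDOR_OUT = 0x40  # host-to-device
VENDOR_IN = 0xC0   # device-to-host

# Hypothetical request code, NOT taken from the ubaboot protocol.
REQ_EXAMPLE_READ = 0x01


def build_setup(bm_request_type, b_request, w_value, w_index, w_length):
    """Pack the five setup fields into the 8 bytes sent on the wire."""
    return struct.pack(SETUP_FORMAT, bm_request_type, b_request,
                       w_value, w_index, w_length)


# Example: ask for 64 bytes, passing an address-like value in wIndex.
packet = build_setup(VENDOR_IN, REQ_EXAMPLE_READ, 0, 0x1234, 64)
print(packet.hex())
```

In practice pyusb builds this packet for you: its `ctrl_transfer(bmRequestType, bRequest, wValue, wIndex, data_or_wLength)` method takes the same five fields.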


How is it so small?

  • No interrupts. The ISR table itself takes a lot of space, as do the stack pushes/pops needed to enter and exit interrupts. Instead the main loop polls for events.
  • All data is held in registers. There are no variables in RAM (and no stack). Absolute/indirect accesses to RAM/stack require too many instructions.
  • Logic is optimized to fall through. Instead of branching to the end, the logic falls through to the next comparison and branches over each of the remaining cases.
  • Event handling uses bit twiddling to match both state and event in a single compare. State numbers are picked so that masking one interrupt bit produces a unique value for every combination of event and state. The dispatch then uses a series of single 8-bit comparisons.
  • Jumps and subroutines are used to reuse code. The setup logic is laid out specifically to optimize for this. Flash and descriptor reads use the same code path.
  • Setup logic eagerly loads registers used for most transfers. This avoids repeating those loads for each command.
  • Hardware registers are accessed through Y+offset loads/stores. Normal absolute register loads/stores (LDS/STS) use 2 words each while indirect loads/stores with displacement (LDD/STD) use 1 word.
  • USB setup headers (8 bytes) are read directly into %r2..%r9 through their aliases in data space ($2..$9) via indirect Z pointer access.
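The state-and-event matching trick is easier to see in a small model. The sketch below is written in Python rather than AVR assembly, and every constant in it is hypothetical: the event bits, state numbers, and handler names are made up to show the idea and are not the encoding ubaboot actually uses. The point is that when state numbers occupy bits the event flags never touch, a single masked value identifies both, and dispatch becomes a chain of plain equality tests.

```python
# Hypothetical event flags, one bit each (illustrative values only).
EV_TX_READY = 0x01
EV_RX_DATA = 0x04
EV_SETUP = 0x08
EVENT_MASK = EV_TX_READY | EV_RX_DATA | EV_SETUP

# Hypothetical state numbers, chosen so they never overlap EVENT_MASK.
ST_IDLE = 0x00
ST_READ = 0x10
ST_WRITE = 0x20

STATES = (ST_IDLE, ST_READ, ST_WRITE)
EVENTS = (EV_TX_READY, EV_RX_DATA, EV_SETUP)


def make_key(state, flags):
    """Combine state and pending event into one 8-bit value."""
    return ((flags & EVENT_MASK) | state) & 0xFF


def dispatch(state, flags):
    """Pick a handler using one equality test per case."""
    key = make_key(state, flags)
    if key == ST_IDLE | EV_SETUP:
        return "parse setup"
    if key == ST_READ | EV_TX_READY:
        return "send next chunk"
    if key == ST_WRITE | EV_RX_DATA:
        return "store received chunk"
    # No case matched: fall out of the chain and keep polling.
    return "ignore"


# Every (state, event) pair maps to a distinct key.
all_keys = {make_key(s, e) for s in STATES for e in EVENTS}
print(len(all_keys), dispatch(ST_READ, EV_TX_READY))
```

Each `if` here stands in for one 8-bit compare followed by a branch over the case body, which is also the fall-through layout described in the list above. This model assumes only one event bit is pending at a time; bits outside EVENT_MASK are discarded by the mask.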

This visualization shows how it all fits together:

-Bob