BIO: The Bao I/O Coprocessor

This post introduces BIO (Bao I/O), an open-source, RISC-V-based I/O coprocessor designed for the Baochip-1x SoC as an alternative to the Raspberry Pi's PIO. It compares the two architectures in detail, showing how a RISC-centric approach can yield a more compact, higher-clock-rate design even though it needs more instructions per task. The article should interest hardware enthusiasts on HN: it examines the fundamental trade-offs between CISC and RISC philosophies in specialized hardware and backs them with technical explanations and code examples.

Score: 7
Comments: 0
Highest Rank: #3
Time on Front Page: 3h
First Seen: Mar 23, 5:00 PM
Last Seen: Mar 23, 7:00 PM

The Lowdown

The article introduces BIO (Bao I/O), an I/O coprocessor developed for the mostly open-source 22nm Baochip-1x SoC, drawing a direct comparison to the well-known Raspberry Pi PIO. The author, bunnie, outlines the motivation behind creating an alternative by first dissecting the PIO's architecture and then presenting the BIO's RISC-based design.

  • I/O coprocessors are designed to offload time-critical I/O tasks from main CPUs, ensuring deterministic response times and reducing jitter.
  • The author began by cloning the Raspberry Pi PIO to understand its functionality and found it surprisingly resource-intensive in FPGAs: it consumed more logic area and had worse timing than a full RISC-V CPU core.
  • The PIO's high resource usage and timing issues are attributed to its complex, CISC-like instruction set, in which a single instruction can perform several operations at once (e.g., the nominal op, program counter management, data rotation, FIFO checks, pin side-setting, interrupt handling). Supporting all of these in one cycle requires extensive hardware such as barrel shifters.
  • A caveat is raised regarding potential patent encumbrance by the Raspberry Pi Foundation on the PIO's implementation, making open-source re-implementations risky.
  • Influenced by RISC philosophy, the author conceived BIO as a 'RISC-all-the-things' I/O coprocessor, prioritizing simplicity and leveraging the vast software ecosystem of RISC-V.
  • The BIO is built around a compact RV32E PicoRV32 core, augmented with custom 'register queues' (x16-x31) that implement full/empty blocking semantics, crucial for deterministic inter-core communication and I/O.
  • Key architectural features include blocking registers for shared FIFOs (x16-x19), a 'halt to quantum' register (x20) for precise timing, and a 'halt to event' register (x30) for synchronization, enabling cycle-accurate control without complex cycle-counting in software.
  • An optional BDMA extension allows BIO cores to function as smart DMA engines, capable of complex data transformations, with memory access controlled by a security-focused whitelist.
  • Programming examples demonstrate DMA operations in which multiple BIO cores collaborate, and an SPI bit-banging implementation that uses the 'halt to quantum' register for precise timing.
  • A direct architectural comparison shows that the BIO has a smaller logic area (half of the PIO's) and a higher clock rate (over 4x in the ASIC flow), at the expense of requiring more instructions per task. The BIO uses a larger, private 4 KiB instruction memory per core, in contrast to the PIO's small, shared 32-entry instruction memory.
  • To facilitate development, a C toolchain for BIO programs was created, leveraging Zig's clang to compile C code into Rust assembly macros, ensuring compatibility with the Xous OS's pure-Rust build environment.
  • Resources including example libraries, RTL, and unit tests are provided, alongside information on the 'Dabao' development board for accessing physical BIO hardware.
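
The full/empty blocking semantics of the register queues can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the BIO's RTL or API: the `RegisterQueue` class, the `run` function, the queue depth, and the one-access-per-cycle timing are all hypothetical and chosen for illustration. The point it shows is that a fast producer core simply stalls when the queue is full, so no data is lost or reordered and no polling loop is needed in software.

```python
from collections import deque


class RegisterQueue:
    """Toy model of a FIFO mapped onto a register (hypothetical, for illustration)."""

    def __init__(self, depth):
        self.depth = depth      # assumed depth; not taken from the article
        self.items = deque()

    def try_write(self, value):
        """Return True if the write completes, False if the core must stall (full)."""
        if len(self.items) >= self.depth:
            return False
        self.items.append(value)
        return True

    def try_read(self):
        """Return (True, value) if data is available, (False, None) on a stall (empty)."""
        if not self.items:
            return False, None
        return True, self.items.popleft()


def run(depth, n_words, consumer_period):
    """Step a producer core and a slower consumer core one cycle at a time."""
    q = RegisterQueue(depth)
    next_word = 0
    received = []
    producer_stalls = 0
    consumer_stalls = 0
    cycle = 0
    while len(received) < n_words:
        # Producer: attempts one write per cycle until every word has been sent.
        if next_word < n_words:
            if q.try_write(next_word):
                next_word += 1
            else:
                producer_stalls += 1   # blocked on a full queue
        # Consumer: attempts one read every `consumer_period` cycles.
        if cycle % consumer_period == 0:
            ok, value = q.try_read()
            if ok:
                received.append(value)
            else:
                consumer_stalls += 1   # blocked on an empty queue
        cycle += 1
    return received, producer_stalls, consumer_stalls
```

Calling `run(depth=2, n_words=6, consumer_period=3)` pairs a producer with a consumer that is three times slower. In this model the words arrive in order and the producer accumulates stall cycles instead of overrunning the queue, which is the behavior the blocking registers are meant to provide in hardware.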

In essence, the BIO represents a thoughtful re-evaluation of I/O coprocessor design through a RISC lens, prioritizing area efficiency and broad software tooling compatibility over the PIO's highly specialized, complex instruction set. While it may not achieve the same peak bit-banging speed for simple tasks as the PIO, its RISC-V foundation, blocking registers, and 'quantum' timing mechanism make it a more flexible and adaptable solution for offloading complex protocol stacks and achieving precise real-time control within a smaller hardware footprint.
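
The effect of the 'quantum' timing mechanism can be sketched with a short model. This is a minimal illustration under assumptions, not a description of the actual hardware: the functions `next_quantum` and `run_loop` are hypothetical, quantum boundaries are assumed to fall on multiples of `quantum` core cycles, and a halt is assumed to release at the first boundary strictly after the current cycle. It shows why software does not need to count cycles: loop bodies of different lengths still resume on evenly spaced boundaries, as long as each body fits within one quantum.

```python
def next_quantum(cycle, quantum):
    """First quantum boundary strictly after `cycle` (boundaries at multiples of quantum)."""
    return (cycle // quantum + 1) * quantum


def run_loop(body_cycles, quantum):
    """Return the cycle at which each loop iteration resumes after halting to the quantum.

    body_cycles: cycles of work done in each iteration (may vary per iteration).
    quantum:     spacing of quantum boundaries in core cycles (assumed value).
    """
    cycle = 0
    wake_times = []
    for cost in body_cycles:
        cycle += cost                         # variable-length work for this iteration
        cycle = next_quantum(cycle, quantum)  # model of halting until the next boundary
        wake_times.append(cycle)
    return wake_times
```

With `run_loop([3, 7, 5, 2], quantum=10)` every iteration resumes exactly one quantum after the previous one even though the bodies differ in length. With `run_loop([3, 12, 4], quantum=10)` the second body overruns its quantum and the loop skips a boundary, which is the failure mode to watch for when choosing a quantum in a design like this.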