Oneshot Timer Drivers
The NuttX timing subsystem consists of four layers:
- 1 Hardware Timer Drivers: Includes implementations of various hardware
timer drivers.
- 2 Timer Driver Abstraction: Such as Oneshot and Timer, which provide
timer hardware abstraction.
- 3 OS Timer Interfaces: Timer and Alarm, offering relative and
absolute timer interfaces.
- 4 OS Timer Abstraction: The wdog (watchdog) module manages software timers
and provides a unified timer API to upper layers.
Here we focus on the oneshot timer driver abstraction.
Oneshot is the timer driver abstraction that provides:
Unified API for different timer hardware.
Functionally correct and optimized time conversion between cycle counts and natural time.
Background
Computers typically rely on hardware cycle counters that count level changes of an external clock signal. The signal is generated by a crystal oscillator, passed through a Phase-Locked Loop (PLL) for frequency multiplication, and then fed to the hardware cycle counter as its clock input.
The counter counts up or down with each level change, which enables hardware-based timing. To generate timing interrupts, timer hardware includes a comparator that triggers a CPU interrupt when a down-counter reaches zero or when the cycle counter matches a preset value.
Based on the functions of the timers, we can abstract a minimal set of capabilities that a timer should provide:
Read the current cycle count
Trigger an event at an absolute cycle count
Trigger an event after a relative cycle count
From an OS perspective, the second and third capabilities are functionally equivalent as long as the first is available: either one can be emulated with the other plus a read of the current count. By reprogramming the timer on each expiry, they can also emulate periodic timers. Although the two are equally expressive, timers that only support relative delays tend to be less accurate and less efficient than those that support absolute timing, because the current time must be read to compute the delay, which adds CPU overhead and affects both timing precision and performance.
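The equivalence between absolute and relative programming can be sketched in a few lines of C. This is only an illustration: the function names are hypothetical, clkcnt_t is assumed to be a 64-bit unsigned type, and counter wrap-around is deliberately ignored.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t clkcnt_t;  /* Assumption: cycle counts fit in 64 bits */

/* Absolute deadline -> relative delay.  Needed when the hardware only has
 * a down-counter.  The caller must supply a fresh reading of the counter,
 * which is the extra cost described above.
 */

static clkcnt_t sketch_abs_to_delay(clkcnt_t now, clkcnt_t expire)
{
  return expire > now ? expire - now : 0;  /* Already expired: fire at once */
}

/* Relative delay -> absolute compare value, for compare-match hardware. */

static clkcnt_t sketch_delay_to_abs(clkcnt_t now, clkcnt_t delay)
{
  assert(delay <= UINT64_MAX - now);  /* Wrap-around is not handled here */
  return now + delay;
}

/* Periodic emulation: re-arm from the previous expiry rather than from
 * "now", so that interrupt latency does not accumulate as drift.
 */

static clkcnt_t sketch_next_period(clkcnt_t last_expire, clkcnt_t period)
{
  return last_expire + period;
}
```

Both conversion helpers take the current count as an input, while the periodic helper does not need it at all when the hardware accepts an absolute compare value.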
Oneshot Drivers API
Oneshot currently offers new count-based interfaces and also keeps the timespec-based interfaces for compatibility with older drivers. We strongly recommend the count-based interfaces for their better performance. They are also easier to implement: a driver only needs to read and write timer registers and does not have to perform error-prone time conversion.
In count-based interface design, oneshot adopts the following principles:
- Minimalist design: Significantly simplifies the implementation
for drivers.
- Count-based interfaces: Uses count cycles as the unit for both reading
time and setting timers.
- Supports both absolute and relative timers: Compatible with underlying
timer hardware, regardless of whether it uses absolute or relative timing.
- No status returns: Since read/write operations on timer hardware should
not fail, any failure should result in an assertion at the driver level.
- No callbacks or parameters: All expiration callback and parameter
management is handled at the upper layer, preventing thread-unsafe usage.
The count-based interface is as follows:
clkcnt_t (*current)(FAR struct oneshot_lowerhalf_s *lower);
void (*start)(FAR struct oneshot_lowerhalf_s *lower, clkcnt_t delay);
void (*start_absolute)(FAR struct oneshot_lowerhalf_s *lower, clkcnt_t cnt);
void (*cancel)(FAR struct oneshot_lowerhalf_s *lower);
clkcnt_t (*max_delay)(FAR struct oneshot_lowerhalf_s *lower);
The above count-based interfaces provide functions for:
getting the current timer count,
starting a relative timer,
starting an absolute timer,
canceling a timer event
and getting the maximum timer delay.
Note that a driver using the count-based API should call
oneshot_count_init during initialization to tell the upper layer
the timer frequency.
The count-based interfaces are enabled via CONFIG_ONESHOT_COUNT.
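As a rough illustration of how little a count-based driver has to do, the sketch below implements the five operations against simulated registers. It does not use the real NuttX headers: the struct layouts, the mock_ names, and MOCK_MAX_DELAY are assumptions made so the example is self-contained, and the call to oneshot_count_init is omitted because its signature is not given here.

```c
#include <assert.h>
#include <stdint.h>

#define FAR                        /* NuttX pointer qualifier, empty here */
#define MOCK_MAX_DELAY UINT32_MAX  /* Assumption: 32-bit compare range */

typedef uint64_t clkcnt_t;         /* Assumption: 64-bit cycle count */

struct oneshot_lowerhalf_s;

/* Local mirror of the five count-based operations listed above */

struct mock_oneshot_ops_s
{
  clkcnt_t (*current)(FAR struct oneshot_lowerhalf_s *lower);
  void (*start)(FAR struct oneshot_lowerhalf_s *lower, clkcnt_t delay);
  void (*start_absolute)(FAR struct oneshot_lowerhalf_s *lower, clkcnt_t cnt);
  void (*cancel)(FAR struct oneshot_lowerhalf_s *lower);
  clkcnt_t (*max_delay)(FAR struct oneshot_lowerhalf_s *lower);
};

/* Mock lower half: an ops table plus simulated hardware registers */

struct oneshot_lowerhalf_s
{
  FAR const struct mock_oneshot_ops_s *ops;
  clkcnt_t counter;  /* Simulated free-running counter */
  clkcnt_t compare;  /* Simulated compare register */
  int armed;         /* Non-zero while an event is pending */
};

static clkcnt_t mock_current(FAR struct oneshot_lowerhalf_s *lower)
{
  return lower->counter;
}

static void mock_start_absolute(FAR struct oneshot_lowerhalf_s *lower,
                                clkcnt_t cnt)
{
  lower->compare = cnt;
  lower->armed = 1;
}

static void mock_start(FAR struct oneshot_lowerhalf_s *lower, clkcnt_t delay)
{
  /* No status return: an invalid request is a bug, so assert instead */

  assert(delay <= MOCK_MAX_DELAY);
  mock_start_absolute(lower, lower->counter + delay);
}

static void mock_cancel(FAR struct oneshot_lowerhalf_s *lower)
{
  lower->armed = 0;
}

static clkcnt_t mock_max_delay(FAR struct oneshot_lowerhalf_s *lower)
{
  (void)lower;
  return MOCK_MAX_DELAY;
}

static const struct mock_oneshot_ops_s g_mock_ops =
{
  mock_current, mock_start, mock_start_absolute, mock_cancel, mock_max_delay
};
```

None of the operations returns a status or stores a callback, which matches the design principles listed earlier: expiration handling stays in the upper layer.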
The following are the deprecated timespec interfaces:
int (*max_delay)(FAR struct oneshot_lowerhalf_s *lower, FAR struct timespec *ts);
int (*start)(FAR struct oneshot_lowerhalf_s *lower, FAR const struct timespec *ts);
int (*cancel)(FAR struct oneshot_lowerhalf_s *lower, FAR struct timespec *ts);
int (*current)(FAR struct oneshot_lowerhalf_s *lower, FAR struct timespec *ts);
They provide functions for:
getting the maximum timer delay,
starting a relative timer,
canceling a timer event
and getting the current time.
ClockCount
The recommended oneshot APIs are all count-based, so how is time conversion handled? A unified ClockCount layer (clockcount.h) provides fast and safe time conversions, including:
count to timespec
count to tick
timespec to count
tick to count
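For reference, a plain, unoptimized version of these four conversions might look like the sketch below. The function names and parameters are hypothetical and are not the clockcount.h API; the timer frequency and tick rate are assumed to fit in 32 bits. Splitting the value into whole seconds and a remainder keeps the intermediate products within 64 bits, at the cost of a division (plus remainder) and a second division per conversion.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define SKETCH_NSEC_PER_SEC 1000000000ull

typedef uint64_t clkcnt_t;  /* Assumption: 64-bit cycle count */

/* count -> timespec */

static struct timespec sketch_cnt2ts(clkcnt_t cnt, uint32_t freq)
{
  struct timespec ts;

  assert(freq != 0);
  ts.tv_sec  = (time_t)(cnt / freq);
  ts.tv_nsec = (long)((cnt % freq) * SKETCH_NSEC_PER_SEC / freq);
  return ts;
}

/* timespec -> count (tv_nsec is assumed to be below one second) */

static clkcnt_t sketch_ts2cnt(const struct timespec *ts, uint32_t freq)
{
  return (clkcnt_t)ts->tv_sec * freq +
         (clkcnt_t)ts->tv_nsec * freq / SKETCH_NSEC_PER_SEC;
}

/* count -> tick */

static uint64_t sketch_cnt2tick(clkcnt_t cnt, uint32_t freq,
                                uint32_t tick_per_sec)
{
  assert(freq != 0);
  return cnt / freq * tick_per_sec + cnt % freq * tick_per_sec / freq;
}

/* tick -> count */

static clkcnt_t sketch_tick2cnt(uint64_t tick, uint32_t freq,
                                uint32_t tick_per_sec)
{
  assert(tick_per_sec != 0);
  return tick / tick_per_sec * freq +
         tick % tick_per_sec * freq / tick_per_sec;
}
```

With a 24 MHz timer and 100 ticks per second, 36000000 counts convert to 1 s plus 500000000 ns, or 150 ticks.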
Every time conversion involves at least two divisions, so clockcount implements two methods to accelerate it:
1. Invariant Divisor Division Optimization: Used for converting counts to seconds or ticks. It can be enabled via
CONFIG_ONESHOT_FAST_DIVISION. This optimization transforms a division into:
one unsigned high multiplication (UMULH),
one subtraction,
one addition, and
one logical right shift (LShR).
Please note that Invariant Divisor Division Optimization does not necessarily provide a performance advantage; the result depends on the relative cost of the UMULH and UDIV instructions on the target CPU. For example, on early ARMv8-A platforms (Cortex-A53), UMULH took 6 cycles, so enabling the optimization was actually less efficient than dividing directly with the UDIV instruction.
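To make the idea concrete, here is a minimal sketch of exact division by a fixed divisor using the well-known Granlund-Montgomery round-up scheme. It is not taken from clockcount.h, and the real implementation may differ: the names are hypothetical, the divisor is assumed to fit in 32 bits, unsigned __int128 (a GCC/Clang extension on 64-bit targets) stands in for UMULH, and this general form uses two shifts rather than one.

```c
#include <assert.h>
#include <stdint.h>

/* Precomputed parameters for one fixed divisor (hypothetical layout) */

struct sketch_invdiv_s
{
  uint64_t mult;  /* Magic multiplier m' */
  uint8_t  sh1;   /* min(l, 1) */
  uint8_t  sh2;   /* max(l - 1, 0) */
};

/* High 64 bits of a 64x64 multiplication (UMULH on AArch64) */

static uint64_t sketch_umulh(uint64_t a, uint64_t b)
{
  return (uint64_t)(((unsigned __int128)a * b) >> 64);
}

/* Done once per divisor, e.g. when the timer frequency becomes known */

static struct sketch_invdiv_s sketch_invdiv_init(uint32_t d)
{
  struct sketch_invdiv_s p;
  unsigned int l = 0;

  assert(d != 0);
  while ((1ull << l) < d)  /* l = ceil(log2(d)), at most 32 */
    {
      l++;
    }

  /* m' = floor(2^64 * (2^l - d) / d) + 1 */

  p.mult = (uint64_t)((((unsigned __int128)((1ull << l) - d)) << 64) / d) + 1;
  p.sh1  = (uint8_t)(l > 0 ? 1 : 0);
  p.sh2  = (uint8_t)(l > 0 ? l - 1 : 0);
  return p;
}

/* n / d without a divide instruction: multiply-high, subtract, add, shift */

static uint64_t sketch_invdiv(const struct sketch_invdiv_s *p, uint64_t n)
{
  uint64_t t = sketch_umulh(p->mult, n);

  return (t + ((n - t) >> p->sh1)) >> p->sh2;
}
```

The expensive part (a 128-bit division) happens only in the init step; each later division costs one multiply-high plus a few single-cycle operations, which is why the benefit depends on how UMULH compares with UDIV on the target CPU.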
2. Multiply-Shift Approximate Division: Used to convert delta counts into nanoseconds or ticks.
Note that this method is enabled by default. If extremely precise time conversion is required, it should be disabled. This method trades a slight loss of precision (a few nanoseconds) for better performance. However, because the multiplication can overflow, it is only suitable for relative time conversions. The first method is exact but takes about 6-9 CPU cycles. The approximate approach requires only one unsigned multiplication and one LShR, typically consuming around 4 CPU cycles, which makes it significantly faster.
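A minimal sketch of the multiply-shift idea follows. It illustrates the general technique rather than the clockcount.h code: the names are hypothetical, and the 32-bit bounds on the multiplier and on the delta are assumptions chosen so that the 64-bit product cannot overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Precomputed scale factor: to_hz / from_hz ~= mult / 2^shift */

struct sketch_mulshift_s
{
  uint32_t mult;
  uint32_t shift;
};

/* Pick the largest shift for which the multiplier still fits in 32 bits */

static struct sketch_mulshift_s sketch_mulshift_init(uint32_t from_hz,
                                                     uint32_t to_hz)
{
  struct sketch_mulshift_s p;
  uint32_t shift = 32;
  uint64_t mult;

  assert(from_hz != 0);
  mult = ((uint64_t)to_hz << shift) / from_hz;
  while (mult > UINT32_MAX && shift > 0)
    {
      shift--;
      mult = ((uint64_t)to_hz << shift) / from_hz;
    }

  assert(mult <= UINT32_MAX);
  p.mult  = (uint32_t)mult;
  p.shift = shift;
  return p;
}

/* Convert a small delta: one multiplication and one logical right shift.
 * Large (absolute) values would overflow the product, hence the assert.
 */

static uint64_t sketch_mulshift(const struct sketch_mulshift_s *p,
                                uint64_t delta)
{
  assert(delta <= UINT32_MAX);
  return (delta * p->mult) >> p->shift;
}
```

For a 24 MHz timer converted to nanoseconds, this sketch selects shift 26 and multiplier 2796202666, and a delta of 24000 counts (exactly 1 ms) converts to 999999 ns. The multiplier is rounded down, so the result can fall slightly short of the exact value, which is the precision trade-off described above.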
Combining methods 1 and 2 gives time conversion that is both fast and precise.