The Kernel Address Sanitizer (KASAN)

Overview

Kernel Address Sanitizer (KASAN) is a dynamic memory safety error detector designed to find out-of-bounds and use-after-free bugs.

NuttX currently provides two KASAN modes:

  1. Generic KASAN

  2. Software Tag-Based KASAN

Generic KASAN, enabled with CONFIG_MM_KASAN_GENERIC, is the mode intended for debugging, similar to Linux user-space ASan. This mode is supported on many CPU architectures, but it has significant performance and memory overheads. The current NuttX Generic KASAN detects out-of-bounds accesses to memory allocated by the default NuttX heap allocator, which depends on CONFIG_MM_DEFAULT_MANAGER or CONFIG_MM_TLSF_MANAGER, as well as out-of-bounds accesses to global variables.

Software Tag-Based KASAN (SW_TAGS KASAN), enabled with CONFIG_MM_KASAN_SW_TAGS, can be used for both debugging and testing. This mode is only supported on arm64, but its moderate memory overhead allows using it for testing on memory-restricted devices with real workloads.

Support

Architectures

Generic KASAN is supported on x86_64, arm, arm64, riscv, xtensa, and other architectures.

Software Tag-Based KASAN is supported only on arm64.

Usage

To enable Generic KASAN, configure the kernel with:

CONFIG_MM_KASAN=y
CONFIG_MM_KASAN_ALL=y
CONFIG_MM_KASAN_GENERIC=y

To also enable out-of-bounds detection for global variables, add the following option to the configuration above:

CONFIG_MM_KASAN_GLOBAL=y

To enable Software Tag-Based KASAN, configure the kernel with:

CONFIG_MM_KASAN=y
CONFIG_MM_KASAN_ALL=y
CONFIG_MM_KASAN_SW_TAGS=y

Implementation details

Generic KASAN:

Generic KASAN is built with the compiler option -fsanitize=kernel-address. Compile-time instrumentation is used to insert memory access checks: the compiler inserts function calls (__asan_load*(addr), __asan_store*(addr)) before each memory access of size 1, 2, 4, 8, or 16 bytes. These functions check whether a memory access is valid by checking the corresponding shadow memory.
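To illustrate what such a check amounts to, the following is a minimal sketch and not the NuttX implementation. All names (demo_poison, demo_check, demo_load4, g_arena, g_shadow) are hypothetical, and the shadow layout used here (one shadow byte per tracked byte) is chosen only for readability. demo_load4() is written by hand to mimic what an instrumented 4-byte load conceptually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARENA_SIZE 64

/* Hypothetical tracked memory and its shadow: one shadow byte per
 * arena byte, non-zero meaning "poisoned".  This layout is only for
 * illustration; it is not the NuttX shadow format.
 */

static uint8_t g_arena[ARENA_SIZE];
static uint8_t g_shadow[ARENA_SIZE];
static unsigned int g_reports;

/* Mark [off, off + len) as poisoned or accessible. */

static void demo_poison(size_t off, size_t len, bool poison)
{
  assert(off <= ARENA_SIZE && len <= ARENA_SIZE - off);
  memset(&g_shadow[off], poison ? 1 : 0, len);
}

/* Stand-in for the check behind __asan_load4()/__asan_store4(): every
 * byte touched by the access must be inside the arena and unpoisoned.
 */

static bool demo_check(const void *addr, size_t size)
{
  uintptr_t base = (uintptr_t)g_arena;
  uintptr_t p = (uintptr_t)addr;
  size_t i;

  if (p < base || size > ARENA_SIZE || p - base > ARENA_SIZE - size)
    {
      g_reports++;
      return false;
    }

  for (i = 0; i < size; i++)
    {
      if (g_shadow[p - base + i] != 0)
        {
          g_reports++; /* A real implementation would print a report */
          return false;
        }
    }

  return true;
}

/* What an instrumented 4-byte load conceptually looks like: check the
 * shadow first, then perform the access.
 */

static bool demo_load4(const void *addr, uint32_t *out)
{
  if (!demo_check(addr, 4))
    {
      return false;
    }

  memcpy(out, addr, 4);
  return true;
}
```

With bytes 8 to 15 unpoisoned, a load at offset 12 passes the check, while a load at offset 14 touches two poisoned bytes and is rejected.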

The NuttX implementation differs slightly from Linux in three ways. First, the source of the shadow area is different: NuttX takes the shadow area from the end of each heap. During heap initialization, the end of the heap is offset and a kasan region is set up in the space reserved there. The regions of multiple heaps are chained together in a linked list.
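The layout described above can be sketched as follows. This is an illustration under stated assumptions, not the NuttX code: the names rgn_region_s, rgn_register_heap and rgn_find are hypothetical, and the sizing rule (one shadow bit per machine word) is one plausible way to split a heap between usable memory and shadow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct rgn_region_s
{
  struct rgn_region_s *next; /* Next region in the list */
  uintptr_t begin;           /* First covered address */
  uintptr_t end;             /* One past the last covered address */
  uint8_t shadow[];          /* Shadow bytes follow the header */
};

#define RGN_WORD  sizeof(uintptr_t)
#define RGN_ALIGN _Alignof(struct rgn_region_s)

static struct rgn_region_s *g_rgn_list;

/* Carve a region header plus shadow out of the tail of a heap and
 * link it into the list.  Returns the bytes left for the allocator.
 */

static size_t rgn_register_heap(void *heap, size_t size)
{
  const size_t per = 8 * RGN_WORD + 1; /* 8 words plus 1 shadow byte */
  struct rgn_region_s *region;
  size_t avail;
  size_t nshadow;
  size_t usable;

  assert((uintptr_t)heap % RGN_ALIGN == 0);
  if (size <= sizeof(*region))
    {
      return 0;
    }

  avail   = size - sizeof(*region);
  nshadow = (avail + per - 1) / per;
  usable  = (avail - nshadow) & ~(RGN_ALIGN - 1);

  region        = (struct rgn_region_s *)((uint8_t *)heap + usable);
  region->begin = (uintptr_t)heap;
  region->end   = (uintptr_t)heap + usable;
  memset(region->shadow, 0, size - usable - sizeof(*region));

  region->next = g_rgn_list;
  g_rgn_list   = region;
  return usable;
}

/* Find the region that covers an address, or NULL if none does. */

static struct rgn_region_s *rgn_find(const void *addr)
{
  struct rgn_region_s *region;

  for (region = g_rgn_list; region != NULL; region = region->next)
    {
      if ((uintptr_t)addr >= region->begin &&
          (uintptr_t)addr < region->end)
        {
          return region;
        }
    }

  return NULL;
}
```

Each registered heap gives up a small tail for its own bookkeeping, so an access check only has to walk the list to find the shadow that belongs to an address.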

Secondly, to reduce memory consumption, the NuttX implementation uses a bitmap for detection. For example, on a 32-bit machine, memory is tracked in four-byte groups and the kasan module reserves one shadow bit per group, so four bytes allocated by the NuttX heap allocator are covered by a single bit. A shadow bit of 0 means the memory group can be accessed; a shadow bit of 1 means it is inaccessible.
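A minimal sketch of such a bitmap is shown below. It assumes one shadow bit per machine word (four bytes on a 32-bit target), with 1 meaning inaccessible as described above. The names bm_mark, bm_is_poisoned, g_bm_mem and g_bm_bits are hypothetical and do not correspond to NuttX symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BM_WORD  sizeof(uintptr_t) /* One shadow bit per machine word */
#define BM_WORDS 64

static uintptr_t g_bm_mem[BM_WORDS];     /* Tracked memory */
static uint8_t g_bm_bits[BM_WORDS / 8];  /* 1 bit per word, 1 = poisoned */

/* Set or clear the shadow bit of every word that overlaps
 * [addr, addr + size).
 */

static void bm_mark(const void *addr, size_t size, bool poisoned)
{
  uintptr_t off = (uintptr_t)addr - (uintptr_t)g_bm_mem;
  size_t last = (off + size - 1) / BM_WORD;
  size_t i;

  assert((uintptr_t)addr >= (uintptr_t)g_bm_mem);
  assert(size > 0 && last < BM_WORDS);

  for (i = off / BM_WORD; i <= last; i++)
    {
      if (poisoned)
        {
          g_bm_bits[i / 8] |= (uint8_t)(1u << (i % 8));
        }
      else
        {
          g_bm_bits[i / 8] &= (uint8_t)~(1u << (i % 8));
        }
    }
}

/* Return true if any word overlapping [addr, addr + size) is poisoned. */

static bool bm_is_poisoned(const void *addr, size_t size)
{
  uintptr_t off = (uintptr_t)addr - (uintptr_t)g_bm_mem;
  size_t last = (off + size - 1) / BM_WORD;
  size_t i;

  assert((uintptr_t)addr >= (uintptr_t)g_bm_mem);
  assert(size > 0 && last < BM_WORDS);

  for (i = off / BM_WORD; i <= last; i++)
    {
      if ((g_bm_bits[i / 8] & (1u << (i % 8))) != 0)
        {
          return true;
        }
    }

  return false;
}
```

With a four-byte group, this layout costs one shadow bit per 32 bits of tracked memory. The trade-off in this sketch is granularity: an overflow that stays inside the last word of an allocation is not visible to a one-bit-per-word bitmap.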

Thirdly, the implementation of global variable out-of-bounds detection in NuttX is also different from Linux. Due to the particularity of the shadow region, NuttX needs to construct kasan regions separately for the data and bss segments where the global variables are located. When compiling, add the compile option ‘--param asan-globals=1’. The compiler then stores the information about all global variables in special sections such as ‘.data..LASAN0’. These sections can be parsed using the following structure:

struct kasan_global {
  const void *beg;                /* Address of the beginning of the global variable. */
  size_t size;                    /* Size of the global variable. */
  size_t size_with_redzone;       /* Size of the variable + size of the redzone. 32 bytes aligned. */
  const void *name;               /* Name of the global variable. */
  const void *module_name;        /* Name of the module where the global variable is declared. */
  unsigned long has_dynamic_init; /* This is needed for C++. */

  /* Points to the source location (file name, line, and column)
   * where the global variable is defined. */

  struct kasan_source_location *location;
  char *odr_indicator;
};
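As a sketch of how these records might be consumed, the code below classifies an address as inside a global variable, inside its redzone, or unknown. The structure is repeated so that the example is self-contained; the function glb_classify and the enum glb_class_e are hypothetical and not part of NuttX, and the table in the usage note is filled in by hand rather than by the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct kasan_source_location; /* Opaque here; its layout is not needed */

struct kasan_global
{
  const void *beg;
  size_t size;
  size_t size_with_redzone;
  const void *name;
  const void *module_name;
  unsigned long has_dynamic_init;
  struct kasan_source_location *location;
  char *odr_indicator;
};

enum glb_class_e
{
  GLB_UNKNOWN = 0, /* Not described by any record */
  GLB_VALID,       /* Inside [beg, beg + size) */
  GLB_REDZONE      /* Inside [beg + size, beg + size_with_redzone) */
};

/* Classify an address against a table of global variable records. */

static enum glb_class_e glb_classify(const struct kasan_global *tab,
                                     size_t n, const void *addr)
{
  uintptr_t a = (uintptr_t)addr;
  size_t i;

  assert(tab != NULL || n == 0);

  for (i = 0; i < n; i++)
    {
      uintptr_t beg = (uintptr_t)tab[i].beg;

      if (a >= beg && a - beg < tab[i].size_with_redzone)
        {
          return a - beg < tab[i].size ? GLB_VALID : GLB_REDZONE;
        }
    }

  return GLB_UNKNOWN;
}
```

For a record with size 10 and size_with_redzone 64, offsets 0 to 9 are valid and offsets 10 to 63 fall in the redzone, which is the range a region builder would mark as inaccessible.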

To reduce the amount of compiler-generated data occupying the already precious flash space, NuttX links the program in multiple passes: scripts extract the global variable information from the ELF file, construct the regions and shadow of the global variables according to the kasan region rules, and emit them as an array that is finally linked into the program. The program then adds this array to kasan’s region linked list.

The data generated by the compiler is placed in a non-existent memory block. After the build completes, this segment is deleted and is not copied into the final binary image that is flashed to the board.

Software Tag-Based KASAN:

Software Tag-Based KASAN uses a software memory tagging approach to checking access validity. It is currently only implemented for the arm64 architecture.

Software Tag-Based KASAN uses the Top Byte Ignore (TBI) feature of arm64 CPUs to store a pointer tag in the top byte of kernel pointers. It uses shadow memory to store the memory tag associated with each heap-allocated memory cell (therefore, it dedicates 1/8 of the kernel memory to shadow memory).

On each memory allocation, Software Tag-Based KASAN generates a random tag, tags the allocated memory with this tag, and embeds the same tag into the returned pointer.

Software Tag-Based KASAN uses compile-time instrumentation to insert checks before each memory access. These checks make sure that the tag of the memory that is being accessed is equal to the tag of the pointer that is used to access this memory. In case of a tag mismatch, Software Tag-Based KASAN prints a bug report.
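The tagging scheme can be sketched in portable C as follows. This is a simulation on integer address values, not the NuttX implementation: nothing is dereferenced, because ignoring the top byte is done by arm64 hardware. All names (tag_set, tag_get, tag_strip, tag_alloc, tag_check, TAG_BASE) are hypothetical, the 8-byte granule is an assumption derived from the 1/8 shadow ratio mentioned above, and the tag is passed in by the caller instead of being random so that the example is deterministic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TAG_SHIFT   56                   /* Top byte of a 64-bit pointer */
#define TAG_GRANULE 8                    /* Assumed bytes per shadow byte */
#define TAG_CELLS   32                   /* Granules in the demo arena */
#define TAG_BASE    UINT64_C(0x40000000) /* Hypothetical arena address */

static uint8_t g_tag_shadow[TAG_CELLS];  /* Memory tag of each granule */

static uint64_t tag_set(uint64_t addr, uint8_t tag)
{
  return (addr & ~(UINT64_C(0xff) << TAG_SHIFT)) |
         ((uint64_t)tag << TAG_SHIFT);
}

static uint8_t tag_get(uint64_t ptr)
{
  return (uint8_t)(ptr >> TAG_SHIFT);
}

static uint64_t tag_strip(uint64_t ptr)
{
  return tag_set(ptr, 0);
}

/* Tag the granules of [off, off + size) and return a tagged pointer. */

static uint64_t tag_alloc(size_t off, size_t size, uint8_t tag)
{
  size_t i;

  assert(size > 0 && off % TAG_GRANULE == 0);
  assert(off + size <= TAG_CELLS * TAG_GRANULE);

  for (i = off / TAG_GRANULE; i <= (off + size - 1) / TAG_GRANULE; i++)
    {
      g_tag_shadow[i] = tag;
    }

  return tag_set(TAG_BASE + off, tag);
}

/* The inserted check: the pointer tag must equal the memory tag of
 * every granule touched by the access.
 */

static bool tag_check(uint64_t ptr, size_t size)
{
  uint64_t addr = tag_strip(ptr);
  uint64_t off;
  uint64_t i;

  if (size == 0 || addr < TAG_BASE)
    {
      return false;
    }

  off = addr - TAG_BASE;
  if (off >= TAG_CELLS * TAG_GRANULE ||
      size > TAG_CELLS * TAG_GRANULE - off)
    {
      return false;
    }

  for (i = off / TAG_GRANULE; i <= (off + size - 1) / TAG_GRANULE; i++)
    {
      if (g_tag_shadow[i] != tag_get(ptr))
        {
          return false; /* Tag mismatch: a bug report would be printed */
        }
    }

  return true;
}
```

In this model an access past the end of an allocation reaches a granule with a different tag, and a stale pointer fails the check once its memory has been retagged by a later allocation, which corresponds to the out-of-bounds and use-after-free cases.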

For developers

Ignoring accesses

To exclude a module you are writing from compiler instrumentation, add the option ‘CFLAGS += -fno-sanitize=kernel-address’ to that module. To exclude a single file, set the flag for its object file only, for example: special_file.o: CFLAGS = -fno-sanitize=kernel-address