02 - Bootloader Essentials and Kernel Entry

With the development environment from the previous tutorial in place, we can turn to the first code our operating system runs: the handoff from the bootloader to the kernel entry point. In this tutorial, we'll examine the source files (loader.asm, kernel.h, kernel.cpp), the custom headers (stdint.h, types.h), the linker script (linker.ld), and the build system (Makefile) to understand how these components work together to bring our OS to life.

Our main goal here is to successfully transition control from the system firmware and GRUB (the bootloader) to our custom kernel, creating a solid foundation for everything we'll build on top of it.


Folder structure

hashx86/
├── kernel.cpp
├── linker.ld
├── Makefile
├── include/
│   ├── kernel.h
│   ├── stdint.h
│   └── types.h
└── asm/
    └── loader.asm

The Multiboot Standard and loader.asm

The Multiboot Standard is a specification that defines how a bootloader like GRUB should load an operating system kernel. By following it, our kernel can be loaded by any compliant bootloader, and it starts from a well-defined machine state: 32-bit protected mode with paging disabled, interrupts off, and a pointer to a structure describing memory and boot modules.

Our loader.asm file contains the very first code that runs after GRUB hands control over to our kernel.

; Refer to https://www.gnu.org/software/grub/manual/multiboot/multiboot.html#Header-layout for more info.

; ---------------------------------------------------------------------------------------------- ;
;                                   Multiboot header (VGA)                                       ;
; ---------------------------------------------------------------------------------------------- ;
FMBALIGN    equ 1<<0                        ; Align loaded modules on page boundaries            ;
MEMINFO     equ 1<<1                        ; Provide memory map                                 ;
VIDINFO     equ 0<<2                        ; Video mode set                                     ;
;                                                                                                ;
FLAGS       equ FMBALIGN | MEMINFO | VIDINFO; This is the multiboot 'flag' field                 ;
MAGIC       equ 0x1BADB002                  ; Magic Number                                       ;
CHECKSUM    equ -(MAGIC + FLAGS)            ; Checksum                                           ;
;                                                                                                ;
section .multiboot                                                                               ;
align 4                                                                                          ;
    dd MAGIC                     ; Store the magic number                                        ;
    dd FLAGS                     ; Store the flags value                                         ;
    dd CHECKSUM                  ; Store the checksum value                                      ;
; ---------------------------------------------------------------------------------------------- ;

; Define the text section where executable code is placed
section .text
extern kernelMain        ; Declare an external reference to the kernel's entry point function (kernelMain)
extern callConstructors  ; Declare an external reference for calling constructors (C++ global/static constructors)
global loader            ; Make the loader function globally accessible

; The loader function is the entry point executed after the bootloader loads the kernel
loader:
    mov esp, kernel_stack     ; Point ESP at the top of the kernel stack (it grows downward)

    push eax                  ; 2nd argument to kernelMain: Multiboot magic number
    push ebx                  ; 1st argument to kernelMain: address of the Multiboot info structure

    call callConstructors     ; Run C++ global constructors (may clobber EAX, so it comes after the pushes)
    call kernelMain           ; Call the kernel's main entry function

; Infinite loop to halt the CPU if something goes wrong
_stop:
    cli                       ; Clear interrupts
    hlt                       ; Halt the CPU
    jmp _stop                 ; Jump to _stop, causing an infinite loop

; Define the BSS section, which holds uninitialized data
section .bss
resb 8*1024                   ; Reserve 8 KB of space for the kernel's stack

kernel_stack:                 ; Label at the top (highest address) of the kernel stack

Breaking Down loader.asm:

1. Multiboot Header (section .multiboot):

This header must lie entirely within the first 8 KB (8192 bytes) of our kernel image, aligned on a 4-byte boundary, so GRUB can find it and recognize the file as a valid Multiboot kernel.

  • MAGIC (0x1BADB002): A special value that tells GRUB this is a Multiboot kernel.

  • FLAGS: A bitmask that tells GRUB what features we need.

    • FMBALIGN (1<<0): Asks GRUB to align all boot modules on page boundaries.

    • MEMINFO (1<<1): Requests memory map information from GRUB.

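    • VIDINFO (0<<2): Bit 2 requests a specific video mode and, when set, requires extra video fields in the header. Writing 0<<2 leaves the bit clear, so we're not asking for a video mode right now; the constant just marks where the flag would go.

  • CHECKSUM: Calculated as -(MAGIC + FLAGS). GRUB requires that MAGIC + FLAGS + CHECKSUM = 0 when the three fields are added as 32-bit unsigned values (that is, modulo 2^32) for the header to be valid.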

  • align 4: Makes sure the header is aligned to 4-byte boundaries, which is required by the standard.
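The checksum rule is plain 32-bit unsigned arithmetic, so we can check it on the host before touching the assembler. Here is a minimal C++ sketch of that calculation. The constants are copied from loader.asm, while the helper names multibootChecksum and headerIsValid are made up for this illustration and are not part of the kernel sources.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from loader.asm.
constexpr std::uint32_t MAGIC    = 0x1BADB002u;
constexpr std::uint32_t FMBALIGN = 1u << 0;
constexpr std::uint32_t MEMINFO  = 1u << 1;
constexpr std::uint32_t VIDINFO  = 0u << 2;
constexpr std::uint32_t FLAGS    = FMBALIGN | MEMINFO | VIDINFO;

// Hypothetical helper: the 32-bit value that makes magic + flags + checksum wrap to 0.
constexpr std::uint32_t multibootChecksum(std::uint32_t magic, std::uint32_t flags) {
    return 0u - (magic + flags);  // unsigned arithmetic wraps modulo 2^32
}

// Hypothetical helper: the validity test described above.
constexpr bool headerIsValid(std::uint32_t magic, std::uint32_t flags, std::uint32_t checksum) {
    return static_cast<std::uint32_t>(magic + flags + checksum) == 0u;
}

static_assert(FLAGS == 0x3u, "only bits 0 and 1 are set");
static_assert(multibootChecksum(MAGIC, FLAGS) == 0xE4524FFBu, "checksum for our header");
static_assert(headerIsValid(MAGIC, FLAGS, multibootChecksum(MAGIC, FLAGS)), "sum wraps to zero");
```

If you later change FLAGS (for example, to set bit 2), the checksum changes with it, which is why loader.asm computes it from the other two constants instead of hard-coding a number.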

2. Code Section (section .text):

  • extern kernelMain, extern callConstructors: These tell the assembler that these functions are defined somewhere else (in our C++ code) and will be linked together later.

  • global loader: Exports the loader symbol so the linker can see it. The ENTRY(loader) directive in linker.ld then selects it as our kernel's entry point.

  • loader Function:

    • mov esp, kernel_stack: Points the stack pointer at our allocated kernel stack. This is crucial because the Multiboot specification leaves ESP undefined at entry, so we must set up a stack before the first push or call.

    • call callConstructors: Calls a C++ function that runs global and static object constructors. This is necessary for C++ runtime initialization.

    • push eax, push ebx: GRUB passes important information in these registers: eax holds the bootloader magic number 0x2BADB002 (not the 0x1BADB002 header magic), and ebx holds the physical address of the Multiboot information structure. We push them onto the stack as arguments to kernelMain. Under the cdecl calling convention, arguments are pushed right to left, so eax (the second parameter) goes first.

    • call kernelMain: Transfers control to our main C++ kernel function.

  • _stop Loop:

    • cli: Disables hardware interrupts.

    • hlt: Puts the CPU in a low-power halt state until an interrupt occurs (which we've disabled).

    • jmp _stop: Creates an infinite loop to keep the CPU halted and prevent it from executing random memory if the kernel fails.

3. Uninitialized Data Section (section .bss):

  • resb 8*1024: Reserves 8KB of uninitialized memory for our kernel stack.

  • kernel_stack: A label pointing to the end (highest address) of this reserved space. This is where we initialize the stack pointer esp, since stacks grow downwards on x86.
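To see why the label sits after the reserved bytes and why eax is pushed before ebx, we can model the stack on the host. The sketch below is only a simulation: ToyStack, simulateEntry, and the sample values used with them are invented for this illustration, and on real hardware the CPU performs the pushes, not C++ code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical host-side model of the 8 KB kernel stack from loader.asm.
struct ToyStack {
    static constexpr std::size_t kWords = 8 * 1024 / 4;  // 8 KB of 32-bit slots
    std::uint32_t slots[kWords] = {};
    std::size_t esp = kWords;  // like kernel_stack: one past the last reserved slot

    // x86 push: decrement the stack pointer first, then store the value.
    void push(std::uint32_t value) { slots[--esp] = value; }

    // Read the slot 'index' words above the current stack pointer.
    std::uint32_t at(std::size_t index) const { return slots[esp + index]; }
};

// Replays what is on the stack when kernelMain starts running.
inline ToyStack simulateEntry(std::uint32_t eax, std::uint32_t ebx,
                              std::uint32_t returnAddress) {
    ToyStack s;
    s.push(eax);            // push eax -> second argument (magicnumber)
    s.push(ebx);            // push ebx -> first argument (multiboot_structure)
    s.push(returnAddress);  // call kernelMain pushes the return address
    return s;
}
```

In this model the return address ends up at at(0), the first parameter at at(1), and the second at at(2), which is the layout a cdecl function expects on entry.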

The Linker Script (linker.ld)

The linker script tells the linker (ld) exactly how to arrange the different sections of our compiled code into the final executable binary. It defines the memory layout of our kernel.

ENTRY(loader)
OUTPUT_FORMAT(elf32-i386)
OUTPUT_ARCH(i386:i386)

SECTIONS
{
    . = 0x0100000;
    __kernel_section_start = .;
    .text : ALIGN(4)
    {
        __kernel_text_section_start = .;
        *(.multiboot)  /* Must be in first 8KB of file */
        code = .; _code = .; __code = .;
        *(.text*)
        . = ALIGN(4096);
        __kernel_text_section_end = .;
    }

    .data : ALIGN(4)
    {
        __kernel_data_section_start = .;
        data = .; _data = .; __data = .;
        *(.data)
        . = ALIGN(4096);
        __kernel_data_section_end = .;
    }

    .init_array : ALIGN(4)
    {
        start_ctors = .;
        KEEP(*(.init_array))
        KEEP(*(SORT_BY_INIT_PRIORITY(.init_array.*)))
        end_ctors = .;
    }

    .rodata : ALIGN(4)
    {
        __kernel_rodata_section_start = .;
        *(.rodata)
        __kernel_rodata_section_end = .;
    }

    .bss : ALIGN(4)
    {
        __kernel_bss_section_start = .;
        bss = .; _bss = .; __bss = .;
        *(.bss)
        *(COMMON)
        . = ALIGN(4096);
        __kernel_bss_section_end = .;
    }

    /DISCARD/ :
    {
        *(.fini_array*)
        *(.comment)
    }

    end = .; _end = .; __end = .;
    __kernel_section_end = .;
}

Breaking Down linker.ld:

  1. ENTRY(loader): Tells the linker that the loader symbol from loader.asm is our entry point.

  2. OUTPUT_FORMAT(elf32-i386) & OUTPUT_ARCH(i386:i386): Specifies that we want a 32-bit ELF binary for the i386 architecture.

  3. SECTIONS Block: This is where we define how input sections from our object files get arranged in the final binary.

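    • . = 0x0100000;: Sets the location counter to 1MB (0x100000), so everything that follows is linked to load from that address. This is standard practice since the memory below 1MB holds the real-mode interrupt vector table, the BIOS data area, video memory, and firmware ROM.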

    • .text: Contains executable code.

      • *(.multiboot): Importantly, this places our .multiboot section from loader.asm at the very beginning, ensuring it's within the first 8KB as required.

      • *(.text*): Includes all other executable code sections.

    • .data: Contains initialized global and static variables.

    • .init_array: This is crucial for C++ global constructors. The linker creates start_ctors and end_ctors symbols that mark the beginning and end of an array of function pointers to these constructors.

    • .rodata: Contains read-only data like string literals.

    • .bss: Contains uninitialized global and static variables. The section occupies no space in the file; the linker only records its size, and GRUB zeroes the memory when it loads the ELF image.

    • /DISCARD/: Sections listed here are ignored and not included in the final binary, which helps reduce file size.
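The script uses ALIGN in two ways: ALIGN(4) on each output section, and . = ALIGN(4096) to pad a section out to the next page boundary. The rounding itself is simple arithmetic. Below is a small C++ sketch of it; alignUp is a hypothetical helper written for this illustration, and it assumes the alignment is a power of two.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper mirroring what ALIGN(n) does to the location counter:
// round 'address' up to the next multiple of 'alignment' (a power of two).
constexpr std::uint32_t alignUp(std::uint32_t address, std::uint32_t alignment) {
    return (address + alignment - 1u) & ~(alignment - 1u);
}

static_assert(alignUp(0x00100000u, 4096u) == 0x00100000u, "aligned addresses stay put");
static_assert(alignUp(0x00100001u, 4096u) == 0x00101000u, "one byte over rounds up a full page");
static_assert(alignUp(0x0010000Du, 4u) == 0x00100010u, "ALIGN(4) rounds to a 4-byte boundary");
```

This is why each of our .text, .data, and .bss sections ends on a 4 KB boundary even when its contents are only a few bytes long.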

The Kernel Entry (kernel.h and kernel.cpp)

These files define our initial C++ entry point and handle the execution of global constructors.

kernel.h

#ifndef KERNEL_H
#define KERNEL_H

#include <types.h> // Assumes this provides uint32_t etc.

/**
 * @typedef constructor
 * @brief Defines a pointer to a function with no arguments and no return value.
 *
 * This is used to reference global constructors during initialization.
 */
typedef void (*constructor)();

/**
 * @brief External declaration for the start and end of the constructors section.
 *
 * These symbols are defined by the linker and mark the range of global constructors to call during initialization.
 */
extern "C" constructor start_ctors;
extern "C" constructor end_ctors;

/**
 * @brief Calls all global constructors in the range defined by `start_ctors` and `end_ctors`.
 *
 * This function is called during kernel initialization to ensure all static/global objects are properly constructed.
 */
extern "C" void callConstructors() {
    for (constructor* i = &start_ctors; i != &end_ctors; i++) {
        (*i) (); // Call each constructor in the range.
    }
}

#endif // KERNEL_H

  • typedef void (*constructor)();: Defines a type called constructor as a pointer to a function that takes no arguments and returns nothing.

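  • extern "C" constructor start_ctors; and extern "C" constructor end_ctors;: These reference symbols that our linker script (linker.ld) defines. They mark the beginning and end of the .init_array section, which holds pointers to our global C++ constructors, so the code takes their addresses (&start_ctors, &end_ctors) rather than reading their values. The extern "C" keeps the names unmangled so they match the symbols in the linker script.

  • callConstructors(): This function loops through the array of constructor pointers and calls each one. This is essential for proper C++ runtime setup in our bare-metal environment. Its extern "C" linkage lets loader.asm call it by name. Because the function is defined in the header, kernel.h should be included from only one .cpp file; otherwise the linker reports a duplicate definition.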
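The same loop can be tried on the host, where no linker script supplies start_ctors and end_ctors. In the sketch below an ordinary array stands in for .init_array. The names ctorA, ctorB, ctorC, fakeInitArray, callRange, and runFakeConstructors are all invented for this illustration.

```cpp
#include <cassert>

typedef void (*constructor)();

// Hypothetical stand-ins for global constructors; each records that it ran.
static int g_order[3] = {0, 0, 0};
static int g_count = 0;
static void ctorA() { g_order[g_count++] = 1; }
static void ctorB() { g_order[g_count++] = 2; }
static void ctorC() { g_order[g_count++] = 3; }

// On the host, a plain array plays the role of the .init_array section.
static constructor fakeInitArray[] = { ctorA, ctorB, ctorC };

// Same loop shape as callConstructors(), but over an explicit [first, last) range.
static void callRange(constructor* first, constructor* last) {
    for (constructor* i = first; i != last; ++i) {
        (*i)();  // call each function pointer in turn
    }
}

// Runs the fake table and reports how many constructors were called.
static int runFakeConstructors() {
    g_count = 0;
    callRange(fakeInitArray, fakeInitArray + 3);
    return g_count;
}
```

In the kernel, &start_ctors and &end_ctors take the place of the two array pointers, and the entries are written by the compiler rather than by hand.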

kernel.cpp

#include <kernel.h>

extern "C" void kernelMain(void* multiboot_structure, uint32_t magicnumber) {
    // This is the main entry point of our C++ kernel.
    // At this stage, the CPU has transitioned to protected mode,
    // the stack is set up, and C++ global constructors have been called.
    //
    // 'multiboot_structure' (EBX) points to the Multiboot information structure,
    // which contains details about memory, modules, etc., provided by GRUB.
    // 'magicnumber' (EAX) is the Multiboot magic number, confirming a valid boot.

    // For now, we simply enter an infinite loop.
    // Future tutorials will replace this with actual kernel logic.
    while (1);
}

  • extern "C" void kernelMain(void* multiboot_structure, uint32_t magicnumber): This is our actual C++ kernel entry point. The extern "C" is crucial to prevent C++ name mangling so our assembly code in loader.asm can call it directly. The parameters correspond to the values GRUB put in the ebx and eax registers, which we pushed onto the stack in loader.asm.

  • while (1);: For now, our kernel just enters an infinite loop. This prevents the CPU from executing random memory and lets us verify that control successfully transferred to our kernel. We'll replace this with real OS functionality in later tutorials.
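Our kernelMain ignores both parameters for now, but they can already be put to use. The sketch below shows one way a kernel might validate the magic number and read the basic memory fields that the MEMINFO flag asks for. It is a host-testable illustration, not code from this series: MultibootInfoHead covers only the first three fields of the Multiboot information structure, and the names MultibootInfoHead and reportedMemoryKb are made up here.

```cpp
#include <cassert>
#include <cstdint>

// Value a Multiboot bootloader leaves in EAX (not the 0x1BADB002 header magic).
constexpr std::uint32_t MULTIBOOT_BOOTLOADER_MAGIC = 0x2BADB002u;

// Hypothetical partial view: only the first fields of the information structure.
struct MultibootInfoHead {
    std::uint32_t flags;      // bit 0 set => mem_lower and mem_upper are valid
    std::uint32_t mem_lower;  // KB of memory starting at address 0
    std::uint32_t mem_upper;  // KB of memory starting at 1 MB
};

// Returns the reported memory in KB, or 0 if the handoff does not look valid.
inline std::uint32_t reportedMemoryKb(const void* multiboot_structure,
                                      std::uint32_t magicnumber) {
    if (magicnumber != MULTIBOOT_BOOTLOADER_MAGIC || multiboot_structure == nullptr) {
        return 0u;  // not loaded by a Multiboot-compliant bootloader
    }
    const MultibootInfoHead* info =
        static_cast<const MultibootInfoHead*>(multiboot_structure);
    if ((info->flags & 1u) == 0u) {
        return 0u;  // the bootloader did not fill in the memory fields
    }
    return info->mem_lower + info->mem_upper;
}
```

Checking the magic number first is cheap insurance: if the kernel is ever started some other way, it will not go on to read a structure that isn't there.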

The Build System (Makefile)

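The Makefile automates the entire process of compiling, linking, and packaging our OS components into a bootable ISO image. When you copy it, make sure every recipe line (the indented commands under each rule) starts with a tab character; make rejects recipes indented with spaces.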

GPP_PARAMS = -m32 -g -ffreestanding -Iinclude -fno-use-cxa-atexit -nostdlib -fno-builtin -fno-rtti -fno-exceptions -fno-common
ASM_PARAMS = --32 -g
ASM_NASM_PARAMS = -f elf32
objects = asm/loader.o \
          kernel.o
LD_PARAMS = -melf_i386

# Compiling C++ files inside the main directory
%.o: %.cpp
    g++ $(GPP_PARAMS) -o $@ -c $<

# Compiling NASM assembly files
asm/%.o: asm/%.asm
    nasm $(ASM_NASM_PARAMS) -o $@ $<

# Linking the kernel binary
kernel.bin: linker.ld $(objects)
    ld $(LD_PARAMS) -T $< -o $@ $(objects)

# Install the kernel binary
install: kernel.bin
    sudo cp kernel.bin /boot/kernel.bin

# Clean rule: removes object files and the final binary
clean:
    rm -f $(objects) kernel.bin

runq:
    qemu-system-i386 -cdrom kernel.iso -boot d  -vga std -serial stdio -m 1G -d int,cpu_reset -D ./log.txt

run:
    make clean
    make
    make iso
    qemu-system-i386 -cdrom kernel.iso -boot d  -vga std -serial stdio -m 1G -d int,cpu_reset -D ./log.txt

runvb: kernel.iso
    (killall VirtualBox && sleep 1) || true
    VirtualBox --startvm 'My Operating System' &

iso: kernel.bin
    mkdir iso
    mkdir iso/boot
    mkdir iso/boot/grub
    mkdir iso/boot/font
    cp kernel.bin iso/boot/kernel.bin
    echo 'set timeout=0'                   > iso/boot/grub/grub.cfg
    echo 'set default=0'                  >> iso/boot/grub/grub.cfg
    echo 'terminal_output gfxterm'        >> iso/boot/grub/grub.cfg
    echo ''                               >> iso/boot/grub/grub.cfg
    echo 'menuentry "My Operating System" {' >> iso/boot/grub/grub.cfg
    echo '  multiboot /boot/kernel.bin'    >> iso/boot/grub/grub.cfg
    echo '  boot'                        >> iso/boot/grub/grub.cfg
    echo '}'                              >> iso/boot/grub/grub.cfg
    grub-mkrescue --output=kernel.iso --modules="video gfxterm video_bochs video_cirrus" iso
    rm -rf iso

Understanding the Makefile:

1. Compiler/Assembler/Linker Parameters:

  • GPP_PARAMS: Flags for g++ (GCC C++ compiler).

    • -m32: Compiles for 32-bit architecture.

    • -g: Includes debugging information.

    • -ffreestanding: Tells the compiler we're in a freestanding environment (no standard library).

    • -Iinclude: Adds the include directory to the search path.

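    • -fno-use-cxa-atexit: Stops the compiler from registering destructors of global objects through __cxa_atexit, a runtime function we don't provide.

    • -nostdlib: Keeps the standard startup files and libraries out of the link.

    • -fno-builtin: Prevents the compiler from treating functions such as memcpy or strlen as built-ins with known behavior.

    • -fno-rtti, -fno-exceptions: Disable run-time type information and exception handling, both of which need runtime support that isn't available in our bare-metal OS.

    • -fno-common: Places uninitialized globals directly in .bss instead of emitting them as common symbols.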

  • ASM_NASM_PARAMS: Flags for nasm.

    • -f elf32: Outputs 32-bit ELF object files.

  • LD_PARAMS: Flags for ld (linker).

    • -melf_i386: Specifies 32-bit ELF output format for i386.

2. Compilation Rules:

  • %.o: %.cpp: Rule to compile C++ source files into object files.

  • asm/%.o: asm/%.asm: Rule to compile NASM assembly files into object files.

3. kernel.bin Rule:

  • Links all object files using our linker script to create the final kernel binary.

4. install Rule:

  • Copies kernel.bin to /boot/kernel.bin (useful if you want to install on a real Linux system).

5. clean Rule:

  • Removes the object files and kernel.bin. It does not remove kernel.iso or log.txt.

6. runq Rule:

  • Runs the already-built kernel.iso in QEMU with various debugging options.

7. run Rule:

  • A convenience rule that cleans, builds, creates the ISO, and runs QEMU. This is your main command for testing.

8. runvb Rule:

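  • Restarts VirtualBox and boots a virtual machine named 'My Operating System', which must already exist and have kernel.iso attached (less important for this series). The Makefile has no rule that builds kernel.iso by name, so run make iso first.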

9. iso Rule:

  • This is the most complex rule - it creates the bootable ISO image.

    • Creates the necessary directory structure for the ISO.

    • Copies our kernel binary into the structure.

    • Creates the GRUB configuration file (grub.cfg) that tells GRUB how to boot our kernel.

    • Uses grub-mkrescue to create the final bootable ISO.

    • Cleans up the temporary directory.

Custom Header Files

Since we're working in a freestanding environment without the standard library, we need to define our own basic types and utilities. These header files in our include/ directory provide the fundamental definitions we need.

stdint.h

This header defines integer types with specific bit widths, ensuring consistency across different platforms.

#ifndef _STDINT_H
#define _STDINT_H

// Exact-width integer types
typedef signed char         int8_t;
typedef unsigned char       uint8_t;

typedef short int           int16_t;
typedef unsigned short int  uint16_t;

typedef int                 int32_t;
typedef unsigned int        uint32_t;

typedef long long int       int64_t;
typedef unsigned long long int uint64_t;

// Least-width integer types
typedef int8_t              int_least8_t;
typedef uint8_t             uint_least8_t;

typedef int16_t             int_least16_t;
typedef uint16_t            uint_least16_t;

typedef int32_t             int_least32_t;
typedef uint32_t            uint_least32_t;

typedef int64_t             int_least64_t;
typedef uint64_t            uint_least64_t;

// Fastest minimum-width integer types
typedef int8_t              int_fast8_t;
typedef uint8_t             uint_fast8_t;

typedef int16_t             int_fast16_t;
typedef uint16_t            uint_fast16_t;

typedef int32_t             int_fast32_t;
typedef uint32_t            uint_fast32_t;

typedef int64_t             int_fast64_t;
typedef uint64_t            uint_fast64_t;

// Pointer-sized integer types
typedef int32_t             intptr_t;
typedef uint32_t            uintptr_t;

// Maximum-width integer types
typedef int64_t             intmax_t;
typedef uint64_t            uintmax_t;

// Limits of exact-width integer types
#define INT8_MIN            (-128)
#define INT8_MAX            (127)
#define UINT8_MAX           (255)

#define INT16_MIN           (-32768)
#define INT16_MAX           (32767)
#define UINT16_MAX          (65535)

#define INT32_MIN           (-2147483647 - 1)
#define INT32_MAX           (2147483647)
#define UINT32_MAX          (4294967295U)

#define INT64_MIN           (-9223372036854775807LL - 1)
#define INT64_MAX           (9223372036854775807LL)
#define UINT64_MAX          (18446744073709551615ULL)

#endif // _STDINT_H
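These typedefs only have the promised widths if the compiler's short, int, and long long have the expected sizes, which holds for g++ with -m32 but is worth checking at compile time. Here is a minimal sketch of such a check. The hx namespace and the helpers bitsOf and bareLiteralIsWider are hypothetical and exist only so the example compiles on a normal host next to its own standard headers. The pointer-sized types are left out because they are only correct for a 32-bit target.

```cpp
#include <cassert>
#include <climits>

// Hypothetical namespace so these names do not clash with the host's own <cstdint>.
namespace hx {
typedef signed char            int8_t;
typedef unsigned char          uint8_t;
typedef short int              int16_t;
typedef unsigned short int     uint16_t;
typedef int                    int32_t;
typedef unsigned int           uint32_t;
typedef long long int          int64_t;
typedef unsigned long long int uint64_t;
}  // namespace hx

// Width of T in bits on the compiler being used.
template <typename T>
constexpr int bitsOf() { return static_cast<int>(sizeof(T) * CHAR_BIT); }

static_assert(bitsOf<hx::int8_t>() == 8 && bitsOf<hx::uint8_t>() == 8, "8-bit types");
static_assert(bitsOf<hx::int16_t>() == 16 && bitsOf<hx::uint16_t>() == 16, "16-bit types");
static_assert(bitsOf<hx::int32_t>() == 32 && bitsOf<hx::uint32_t>() == 32, "32-bit types");
static_assert(bitsOf<hx::int64_t>() == 64 && bitsOf<hx::uint64_t>() == 64, "64-bit types");

// Writing a signed minimum as (-MAX - 1) keeps the expression in the intended type.
static_assert(sizeof(-2147483647 - 1) == sizeof(hx::int32_t), "stays a 32-bit int");

// With GCC and Clang, the bare literal 2147483648 does not fit in int,
// so -2147483648 is an expression of a wider type.
constexpr bool bareLiteralIsWider() {
    return sizeof(-2147483648) > sizeof(hx::int32_t);
}
```

If any static_assert fires on your toolchain, the typedef on that line needs a different underlying type.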

types.h

#ifndef TYPES_H
#define TYPES_H

#include <stdint.h> // For standard fixed-width integer types
#include <stddef.h> // For size_t, NULL, etc.

typedef uint8_t byte;
typedef uint16_t word;
typedef uint32_t dword;

#endif // TYPES_H

Building and Running Your First OS

Now that we understand all the components, let's build and run our initial kernel.

1. Organize Your Files:

Make sure your files are structured correctly in your hashx86 project directory as shown in the folder structure above.

2. Build and Run:

Open your WSL Ubuntu terminal, navigate to your hashx86 directory, and run:

make run

This command will:

  • Clean any previous build files.

  • Compile loader.asm and kernel.cpp.

  • Link everything together using linker.ld to create kernel.bin.

  • Create the bootable kernel.iso image.

  • Launch QEMU to boot from the ISO.

Expected Outcome:

You should see a QEMU window open. It might show GRUB boot information or just a black screen. If the QEMU window appears and stays open without immediately closing, rebooting in a loop, or showing error messages, that strongly suggests:

  1. GRUB successfully loaded your kernel.bin.

  2. Control was properly transferred from GRUB to loader.asm.

  3. loader.asm correctly set up the stack, called the C++ constructors, and called kernelMain.

  4. kernelMain executed and entered its while(1); loop.

A stable QEMU window (whether showing GRUB text or a black screen) indicates that your fundamental boot process is working. Since the kernel prints nothing yet, you can also look at log.txt, which the -d int,cpu_reset -D ./log.txt options write: repeated CPU resets there would point to a crash during boot.


Conclusion

You've successfully compiled, linked, and executed your first basic operating system kernel. This tutorial covered the essential role of the Multiboot standard, the assembly-level entry stub that GRUB hands control to, how the linker script controls memory layout, and the initial C++ kernel entry point.

In the next tutorial, we'll break the silence of that black screen and implement VGA Text Output, allowing our OS to display messages directly on the screen.