02 - Bootloader Essentials and Kernel Entry¶
Now that we have our development environment set up from the previous tutorial, it's time to dive into the core of our operating system: the bootloader and kernel entry point. In this tutorial, we'll examine the source files (loader.asm, kernel.h, kernel.cpp), the linker script (linker.ld), and the build system (Makefile) to understand how these components work together to bring our OS to life.
Our main goal here is to successfully transition control from the system firmware and GRUB (the bootloader) to our custom kernel, creating a solid foundation for everything we'll build on top of it.
Folder structure¶
hashx86/
├── kernel.cpp
├── linker.ld
├── Makefile
├── include/
│   ├── kernel.h
│   ├── stdint.h
│   └── types.h
└── asm/
    └── loader.asm
The Multiboot Standard and loader.asm¶
The Multiboot Standard is a specification that defines how a bootloader like GRUB should load an operating system kernel. By following this standard, our kernel can be loaded by any compatible bootloader, and we get important information about the system's current state.
Our loader.asm file contains the very first code that runs after GRUB hands control over to our kernel.
; Refer to https://www.gnu.org/software/grub/manual/multiboot/multiboot.html#Header-layout for more info.
; ---------------------------------------------------------------------------------------------- ;
;                                     Multiboot header (VGA)                                     ;
; ---------------------------------------------------------------------------------------------- ;
FMBALIGN equ 1<<0                         ; Align loaded modules on page boundaries
MEMINFO  equ 1<<1                         ; Provide memory map
VIDINFO  equ 0<<2                         ; Video mode bit (bit 2), left clear here

FLAGS    equ FMBALIGN | MEMINFO | VIDINFO ; This is the multiboot 'flags' field
MAGIC    equ 0x1BADB002                   ; Magic number
CHECKSUM equ -(MAGIC + FLAGS)             ; Checksum: MAGIC + FLAGS + CHECKSUM must be 0

section .multiboot
align 4
    dd MAGIC                              ; Store the magic number
    dd FLAGS                              ; Store the flags value
    dd CHECKSUM                           ; Store the checksum value
; ---------------------------------------------------------------------------------------------- ;

; Define the text section where executable code is placed
section .text
extern kernelMain                         ; C++ kernel entry point (kernelMain)
extern callConstructors                   ; Runs the C++ global/static constructors
global loader                             ; Make the loader symbol visible to the linker

; The loader function is the entry point executed after the bootloader loads the kernel
loader:
    mov esp, kernel_stack                 ; Point ESP at the top (highest address) of the kernel stack
    push eax                              ; 2nd argument of kernelMain: Multiboot magic number
    push ebx                              ; 1st argument of kernelMain: Multiboot info structure address
    call callConstructors                 ; Run constructors; EAX is already saved, so it may be clobbered
    call kernelMain                       ; Call the kernel's main entry function

; Infinite loop to halt the CPU if kernelMain ever returns
_stop:
    cli                                   ; Disable maskable interrupts
    hlt                                   ; Halt the CPU
    jmp _stop                             ; Halt again if the CPU wakes up

; Define the BSS section, which holds uninitialized data
section .bss
    resb 8*1024                           ; Reserve 8 KB of space for the kernel's stack
kernel_stack:                             ; Label for the top of the kernel stack
Breaking Down loader.asm:¶
1. Multiboot Header (section .multiboot):¶
The header must lie entirely within the first 8 KB of our kernel image; otherwise GRUB will not recognize the file as a valid Multiboot kernel.
- MAGIC (0x1BADB002): A special value that tells GRUB this is a Multiboot kernel.
- FLAGS: A bitmask that tells GRUB what features we need.
  - FMBALIGN (1<<0): Asks GRUB to align all boot modules on page boundaries.
  - MEMINFO (1<<1): Requests memory map information from GRUB.
  - VIDINFO (0<<2): Bit 2 is used to request video mode information. Writing it as 0<<2 leaves the bit clear, so we're not asking for a specific mode right now.
- CHECKSUM: Calculated as -(MAGIC + FLAGS). GRUB requires that MAGIC + FLAGS + CHECKSUM = 0 for the header to be valid. The sum is taken as a 32-bit unsigned value, so overflow wraps around.
- align 4: Makes sure the header is aligned to a 4-byte boundary, which is required by the standard.
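The checksum rule is plain 32-bit wraparound arithmetic, which a short C++ sketch can check at compile time. The names kMagic, kFlags, multibootChecksum, and headerIsValid are illustrative and do not appear in the kernel sources; only the constant values come from loader.asm.

```cpp
#include <cstdint>

// Values mirror the constants in loader.asm (names here are illustrative).
constexpr std::uint32_t kMagic = 0x1BADB002u;
constexpr std::uint32_t kFlags = (1u << 0) | (1u << 1) | (0u << 2);

// The checksum is the 32-bit two's-complement negation of (magic + flags).
constexpr std::uint32_t multibootChecksum(std::uint32_t magic, std::uint32_t flags) {
    return static_cast<std::uint32_t>(0u - (magic + flags));
}

// The validity rule: the three header fields must sum to zero modulo 2^32.
constexpr bool headerIsValid(std::uint32_t magic, std::uint32_t flags,
                             std::uint32_t checksum) {
    return static_cast<std::uint32_t>(magic + flags + checksum) == 0u;
}

static_assert(multibootChecksum(kMagic, kFlags) == 0xE4524FFBu,
              "checksum for MAGIC = 0x1BADB002 and FLAGS = 3");
static_assert(headerIsValid(kMagic, kFlags, multibootChecksum(kMagic, kFlags)),
              "magic + flags + checksum must wrap to zero");
```

With FLAGS equal to 3, the value stored by dd CHECKSUM is therefore 0xE4524FFB.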
2. Code Section (section .text):¶
- extern kernelMain, extern callConstructors: These tell the assembler that the functions are defined somewhere else (in our C++ code) and will be resolved at link time.
- global loader: Makes the loader symbol visible to the linker, which uses it as our kernel's entry point.
- loader function:
  - mov esp, kernel_stack: Points the stack pointer at our allocated kernel stack. This is crucial because the Multiboot standard leaves ESP undefined on entry, so we have no valid stack until we set one up.
  - push eax, push ebx: GRUB passes important information in these registers: eax contains the Multiboot magic number, and ebx has the address of the Multiboot information structure. We push these onto the stack to pass them as arguments to kernelMain.
  - call callConstructors: Calls a C++ function that runs the constructors of global and static objects. On bare metal there is no C++ runtime to do this for us.
  - call kernelMain: Transfers control to our main C++ kernel function.
- _stop loop:
  - cli: Disables maskable hardware interrupts.
  - hlt: Puts the CPU in a low-power halt state until an interrupt occurs (which we've just disabled, apart from non-maskable ones).
  - jmp _stop: Creates an infinite loop that halts the CPU again if it ever wakes up, which keeps it from executing random memory if kernelMain returns.
3. Uninitialized Data Section (section .bss):¶
- resb 8*1024: Reserves 8 KB of uninitialized memory for our kernel stack.
- kernel_stack: A label pointing to the end (highest address) of this reserved space. This is where we initialize the stack pointer esp, since stacks grow downwards on x86.
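To make the downward-growing stack concrete, here is a small hosted C++ model of what the stack holds when kernelMain starts running. ModelStack and stackAtKernelMainEntry are hypothetical helpers written for this illustration, and 0xDEADC0DE is only a placeholder for the return address that call stores.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of the kernel stack; it is not part of the kernel itself.
struct ModelStack {
    static constexpr std::size_t kWords = 8 * 1024 / 4;  // 8 KB, as reserved by resb 8*1024
    std::uint32_t words[kWords] = {};
    std::size_t top = kWords;  // one past the last word, like the kernel_stack label

    // A push moves the stack pointer down first, then stores the value.
    void push(std::uint32_t value) {
        assert(top > 0 && "model stack overflow");
        words[--top] = value;
    }

    // Reads the word at [esp + 4 * offsetWords].
    std::uint32_t at(std::size_t offsetWords) const {
        assert(top + offsetWords < kWords);
        return words[top + offsetWords];
    }
};

// Reproduces the stack contents at the moment kernelMain starts executing.
inline ModelStack stackAtKernelMainEntry(std::uint32_t eax, std::uint32_t ebx) {
    ModelStack s;
    s.push(eax);          // magicnumber: second parameter, pushed first
    s.push(ebx);          // multiboot_structure: first parameter, pushed last
    s.push(0xDEADC0DEu);  // placeholder for the return address stored by `call`
    return s;
}
```

In this model the first parameter sits at [esp + 4] and the second at [esp + 8], which is where a cdecl function expects to find them.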
The Linker Script (linker.ld)¶
The linker script tells the linker (ld) exactly how to arrange the different sections of our compiled code into the final executable binary. It defines the memory layout of our kernel.
ENTRY(loader)
OUTPUT_FORMAT(elf32-i386)
OUTPUT_ARCH(i386:i386)
SECTIONS
{
    . = 0x0100000;
    __kernel_section_start = .;    /* Taken after the location counter is set, so it equals 1 MB */
    .text : ALIGN(4)
    {
        __kernel_text_section_start = .;
        *(.multiboot)              /* Must be in first 8KB of file */
        code = .; _code = .; __code = .;
        *(.text*)
        . = ALIGN(4096);
        __kernel_text_section_end = .;
    }
    .data : ALIGN(4)
    {
        __kernel_data_section_start = .;
        data = .; _data = .; __data = .;
        *(.data)
        . = ALIGN(4096);
        __kernel_data_section_end = .;
    }
    .init_array : ALIGN(4)
    {
        start_ctors = .;
        KEEP(*(.init_array))
        KEEP(*(SORT_BY_INIT_PRIORITY(.init_array.*)))
        end_ctors = .;
    }
    .rodata : ALIGN(4)
    {
        __kernel_rodata_section_start = .;
        *(.rodata)
        __kernel_rodata_section_end = .;
    }
    .bss : ALIGN(4)
    {
        __kernel_bss_section_start = .;
        bss = .; _bss = .; __bss = .;
        *(.bss)
        *(COMMON)
        . = ALIGN(4096);
        __kernel_bss_section_end = .;
    }
    /DISCARD/ :
    {
        *(.fini_array*)
        *(.comment)
    }
    end = .; _end = .; __end = .;
    __kernel_section_end = .;
}
Breaking Down linker.ld:¶
- ENTRY(loader): Tells the linker that the loader symbol from loader.asm is our entry point.
- OUTPUT_FORMAT(elf32-i386) & OUTPUT_ARCH(i386:i386): Specify that we want a 32-bit ELF binary for the i386 architecture.
- SECTIONS block: This is where we define how input sections from our object files get arranged in the final binary.
  - . = 0x0100000;: Sets our kernel to start at 1 MB (0x100000). This is standard practice since the lower 1 MB is typically used by the BIOS and firmware.
  - .text: Contains executable code.
    - *(.multiboot): Importantly, this places our .multiboot section from loader.asm at the very beginning, ensuring it's within the first 8 KB as required.
    - *(.text*): Includes all other executable code sections.
  - .data: Contains initialized global and static variables.
  - .init_array: This is crucial for C++ global constructors. The compiler places one function pointer per constructor in this section, and the script defines the start_ctors and end_ctors symbols that mark the beginning and end of that array.
  - .rodata: Contains read-only data like string literals.
  - .bss: Contains uninitialized global and static variables. The section takes up no space in the file; the linker records only its size, and the bootloader zeroes the memory when it loads the kernel.
  - /DISCARD/: Sections listed here are ignored and not included in the final binary, which helps reduce file size.
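The `. = ALIGN(4096);` statements in the script round the location counter up to the next page boundary. The arithmetic can be sketched in C++ as follows; alignUp is an illustrative helper, and the 0x1234-byte section size is a made-up example rather than a measurement of our kernel.

```cpp
#include <cstdint>

// Rounds addr up to the next multiple of alignment.
// The alignment must be a power of two for the bit mask to be correct.
constexpr std::uint32_t alignUp(std::uint32_t addr, std::uint32_t alignment) {
    return (addr + alignment - 1u) & ~(alignment - 1u);
}

constexpr std::uint32_t kLoadAddress = 0x00100000u;  // `. = 0x0100000;` in linker.ld
constexpr std::uint32_t kPageSize = 4096u;           // argument of ALIGN(4096)

// Hypothetical example: if .text held 0x1234 bytes, the section would end here.
static_assert(alignUp(kLoadAddress + 0x1234u, kPageSize) == 0x00102000u,
              "end of .text is rounded up to a page boundary");
// An address that is already aligned is left unchanged.
static_assert(alignUp(kLoadAddress, kPageSize) == kLoadAddress,
              "aligned addresses stay put");
```

Padding each section to a page boundary means the symbols ending in _section_end can later be used directly as page-aligned limits.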
The Kernel Entry (kernel.h and kernel.cpp)¶
These files define our initial C++ entry point and handle the execution of global constructors.
kernel.h¶
#ifndef KERNEL_H
#define KERNEL_H
#include <types.h> // Assumes this provides uint32_t etc.
/**
* @typedef constructor
* @brief Defines a pointer to a function with no arguments and no return value.
*
* This is used to reference global constructors during initialization.
*/
typedef void (*constructor)();
/**
* @brief External declaration for the start and end of the constructors section.
*
* These symbols are defined by the linker and mark the range of global constructors to call during initialization.
*/
extern "C" constructor start_ctors;
extern "C" constructor end_ctors;
/**
* @brief Calls all global constructors in the range defined by `start_ctors` and `end_ctors`.
*
* This function is called during kernel initialization to ensure all static/global objects are properly constructed.
*/
extern "C" void callConstructors() {
    for (constructor* i = &start_ctors; i != &end_ctors; i++) {
        (*i)();  // Call each constructor in the range.
    }
}
#endif // KERNEL_H
- typedef void (*constructor)();: Defines a type called constructor as a pointer to a function that takes no arguments and returns nothing.
- extern "C" constructor start_ctors; and extern "C" constructor end_ctors;: These reference symbols that our linker script (linker.ld) defines. They mark the beginning and end of the .init_array section, which contains pointers to our global C++ constructors. The extern "C" keeps the names unmangled so that they match the symbols in the linker script.
- callConstructors(): This function loops through the array of constructor pointers and calls each one. This is essential for proper C++ runtime setup in our bare-metal environment. Because the function is defined in the header, kernel.h should be included from only one .cpp file; otherwise the linker will report a duplicate definition.
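The loop can be tried out on an ordinary host by replacing the linker-provided symbols with a plain array. In this sketch ctorA, ctorB, g_table, g_initOrder, g_nextSlot, and runConstructors are stand-ins invented for the example; in the real kernel the table is the .init_array section and its entries are generated by the compiler.

```cpp
#include <cassert>

typedef void (*constructor)();  // same typedef as in kernel.h

// Hypothetical stand-ins for compiler-generated constructor functions.
// Each one records the order in which it ran.
int g_initOrder[2] = {0, 0};
int g_nextSlot = 0;
void ctorA() { assert(g_nextSlot < 2); g_initOrder[g_nextSlot++] = 1; }
void ctorB() { assert(g_nextSlot < 2); g_initOrder[g_nextSlot++] = 2; }

// Plays the role of .init_array: a contiguous array of function pointers.
constructor g_table[] = {ctorA, ctorB};

// Same loop as callConstructors(), but over an explicit [first, last) range
// instead of the linker-provided start_ctors and end_ctors symbols.
int runConstructors(constructor* first, constructor* last) {
    int called = 0;
    for (constructor* i = first; i != last; ++i) {
        (*i)();  // call through the function pointer
        ++called;
    }
    return called;
}
```

Calling runConstructors(g_table, g_table + 2) runs both functions in table order. An empty range, where first equals last, calls nothing, which is what happens in a kernel that has no global objects yet.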
kernel.cpp¶
#include <kernel.h>
extern "C" void kernelMain(void* multiboot_structure, uint32_t magicnumber) {
    // This is the main entry point of our C++ kernel.
    // At this stage, the CPU has transitioned to protected mode,
    // the stack is set up, and C++ global constructors have been called.
    //
    // 'multiboot_structure' (EBX) points to the Multiboot information structure,
    // which contains details about memory, modules, etc., provided by GRUB.
    // 'magicnumber' (EAX) is the Multiboot magic number, confirming a valid boot.

    // For now, we simply enter an infinite loop.
    // Future tutorials will replace this with actual kernel logic.
    while (1);
}
- extern "C" void kernelMain(void* multiboot_structure, uint32_t magicnumber): This is our actual C++ kernel entry point. The extern "C" is crucial to prevent C++ name mangling so our assembly code in loader.asm can call it directly. The parameters correspond to the values GRUB put in the ebx and eax registers, which we pushed onto the stack in loader.asm. Under the cdecl calling convention, arguments are pushed from right to left, which is why eax (the second parameter) is pushed before ebx (the first).
- while (1);: For now, our kernel just enters an infinite loop. This prevents the CPU from executing random memory and lets us verify that control successfully transferred to our kernel. We'll replace this with real OS functionality in later tutorials.
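A later kernelMain will probably want to check its two arguments before trusting them. The following is a minimal sketch, assuming the Multiboot 1 layout in which the information structure begins with the flags, mem_lower, and mem_upper fields. MultibootInfoPrefix, bootedByMultiboot, and hasBasicMemoryInfo are names made up for this example and are not part of the tutorial's sources.

```cpp
#include <cstdint>

// Value a Multiboot-compliant loader leaves in EAX. Note that it differs
// from the header magic 0x1BADB002 that we place in loader.asm.
constexpr std::uint32_t kBootloaderMagic = 0x2BADB002u;

// First three fields of the Multiboot information structure (hypothetical name).
struct MultibootInfoPrefix {
    std::uint32_t flags;      // bit 0 set => mem_lower and mem_upper are valid
    std::uint32_t mem_lower;  // kilobytes of memory below 1 MB
    std::uint32_t mem_upper;  // kilobytes of memory above 1 MB
};

// True if the magic number passed to kernelMain came from a Multiboot loader.
constexpr bool bootedByMultiboot(std::uint32_t magic) {
    return magic == kBootloaderMagic;
}

// True if the loader filled in the basic memory fields.
constexpr bool hasBasicMemoryInfo(const MultibootInfoPrefix& info) {
    return (info.flags & 1u) != 0u;
}

static_assert(bootedByMultiboot(0x2BADB002u), "loader magic is accepted");
static_assert(!bootedByMultiboot(0x1BADB002u), "header magic is not the loader magic");
```

Inside kernelMain, the multiboot_structure pointer would be cast to a pointer to such a structure only after bootedByMultiboot(magicnumber) has returned true.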
The Build System (Makefile)¶
The Makefile automates the entire process of compiling, linking, and packaging our OS components into a bootable ISO image.
GPP_PARAMS = -m32 -g -ffreestanding -Iinclude -fno-use-cxa-atexit -nostdlib -fno-builtin -fno-rtti -fno-exceptions -fno-common
ASM_PARAMS = --32 -g
ASM_NASM_PARAMS = -f elf32
objects = asm/loader.o \
          kernel.o
LD_PARAMS = -melf_i386

# These targets are commands, not files
.PHONY: install clean runq run runvb iso

# Compiling C++ files inside the main directory
%.o: %.cpp
	g++ $(GPP_PARAMS) -o $@ -c $<

# Compiling NASM assembly files
asm/%.o: asm/%.asm
	nasm $(ASM_NASM_PARAMS) -o $@ $<

# Linking the kernel binary
kernel.bin: linker.ld $(objects)
	ld $(LD_PARAMS) -T $< -o $@ $(objects)

# Install the kernel binary
install: kernel.bin
	sudo cp kernel.bin /boot/kernel.bin

# Clean rule: removes object files and the final binary
clean:
	rm -f $(objects) kernel.bin

runq:
	qemu-system-i386 -cdrom kernel.iso -boot d -vga std -serial stdio -m 1G -d int,cpu_reset -D ./log.txt

run:
	$(MAKE) clean
	$(MAKE)
	$(MAKE) iso
	qemu-system-i386 -cdrom kernel.iso -boot d -vga std -serial stdio -m 1G -d int,cpu_reset -D ./log.txt

runvb: iso
	(killall VirtualBox && sleep 1) || true
	VirtualBox --startvm 'My Operating System' &

iso: kernel.bin
	mkdir -p iso/boot/grub iso/boot/font
	cp kernel.bin iso/boot/kernel.bin
	echo 'set timeout=0' > iso/boot/grub/grub.cfg
	echo 'set default=0' >> iso/boot/grub/grub.cfg
	echo 'terminal_output gfxterm' >> iso/boot/grub/grub.cfg
	echo '' >> iso/boot/grub/grub.cfg
	echo 'menuentry "My Operating System" {' >> iso/boot/grub/grub.cfg
	echo ' multiboot /boot/kernel.bin' >> iso/boot/grub/grub.cfg
	echo ' boot' >> iso/boot/grub/grub.cfg
	echo '}' >> iso/boot/grub/grub.cfg
	grub-mkrescue --output=kernel.iso --modules="video gfxterm video_bochs video_cirrus" iso
	rm -rf iso
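Understanding the Makefile:¶
1. Compiler/Assembler/Linker Parameters:¶
- GPP_PARAMS: Flags for g++ (GCC C++ compiler).
  - -m32: Compiles for 32-bit architecture.
  - -g: Includes debugging information.
  - -ffreestanding: Tells the compiler we're in a freestanding environment, where the standard library may be absent.
  - -Iinclude: Adds the include directory to the header search path.
  - -fno-use-cxa-atexit: Stops the compiler from registering the destructors of global objects through __cxa_atexit, a runtime function we don't provide.
  - -nostdlib: Keeps the standard startup files and libraries out of the link.
  - -fno-builtin: Stops the compiler from treating functions such as memcpy as built-ins.
  - -fno-rtti, -fno-exceptions: Disable runtime type information and exceptions, which need runtime support that isn't available in our bare-metal OS.
  - -fno-common: Places uninitialized global variables in .bss rather than emitting them as common symbols.
- ASM_PARAMS: Flags for the GNU assembler. No rule in this Makefile uses them yet.
- ASM_NASM_PARAMS: Flags for nasm.
  - -f elf32: Outputs 32-bit ELF object files.
- LD_PARAMS: Flags for ld (linker).
  - -melf_i386: Specifies 32-bit ELF output format for i386.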
2. Compilation Rules:¶
- %.o: %.cpp: Rule to compile C++ source files into object files.
- asm/%.o: asm/%.asm: Rule to assemble NASM files into object files.
3. kernel.bin Rule:¶
- Links all object files using our linker script to create the final kernel binary.
4. install Rule:¶
- Copies kernel.bin to /boot/kernel.bin (useful if you want to install on a real Linux system).
5. clean Rule:¶
- Removes the object files and kernel.bin.
6. runq Rule:¶
- Runs the already-built kernel.iso in QEMU with 1 GB of RAM, the serial port attached to the terminal, and interrupt and CPU-reset logging written to log.txt.
7. run Rule:¶
- A convenience rule that cleans, builds, creates the ISO, and runs QEMU. This is your main command for testing.
8. runvb Rule:¶
- Attempts to run the ISO in VirtualBox (less important for this series). It expects a virtual machine named 'My Operating System' to exist already.
9. iso Rule:¶
This is the most complex rule. It creates the bootable ISO image in five steps:
- Creates the necessary directory structure for the ISO.
- Copies our kernel binary into the structure.
- Creates the GRUB configuration file (grub.cfg) that tells GRUB how to boot our kernel.
- Uses grub-mkrescue to create the final bootable ISO.
- Cleans up the temporary directory.
Custom Header Files¶
Since we're working in a freestanding environment without the standard library, we need to define our own basic types and utilities. These header files in our include/ directory provide the fundamental definitions we need.
stdint.h¶
This header defines integer types with specific bit widths, ensuring consistency across different platforms.
#ifndef _STDINT_H
#define _STDINT_H
// Exact-width integer types
typedef signed char int8_t;
typedef unsigned char uint8_t;
typedef short int int16_t;
typedef unsigned short int uint16_t;
typedef int int32_t;
typedef unsigned int uint32_t;
typedef long long int int64_t;
typedef unsigned long long int uint64_t;
// Least-width integer types
typedef int8_t int_least8_t;
typedef uint8_t uint_least8_t;
typedef int16_t int_least16_t;
typedef uint16_t uint_least16_t;
typedef int32_t int_least32_t;
typedef uint32_t uint_least32_t;
typedef int64_t int_least64_t;
typedef uint64_t uint_least64_t;
// Fastest minimum-width integer types
typedef int8_t int_fast8_t;
typedef uint8_t uint_fast8_t;
typedef int16_t int_fast16_t;
typedef uint16_t uint_fast16_t;
typedef int32_t int_fast32_t;
typedef uint32_t uint_fast32_t;
typedef int64_t int_fast64_t;
typedef uint64_t uint_fast64_t;
// Pointer-sized integer types
typedef int32_t intptr_t;
typedef uint32_t uintptr_t;
// Maximum-width integer types
typedef int64_t intmax_t;
typedef uint64_t uintmax_t;
// Limits of exact-width integer types
#define INT8_MIN (-128)
#define INT8_MAX (127)
#define UINT8_MAX (255)
#define INT16_MIN (-32768)
#define INT16_MAX (32767)
#define UINT16_MAX (65535)
#define INT32_MIN (-2147483647 - 1)              // 2147483648 alone does not fit in int
#define INT32_MAX (2147483647)
#define UINT32_MAX (4294967295U)
#define INT64_MIN (-9223372036854775807LL - 1)   // same reasoning for long long
#define INT64_MAX (9223372036854775807LL)
#define UINT64_MAX (18446744073709551615ULL)
#endif // _STDINT_H
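Hand-written typedefs like these are only right for one target, so it can help to let the compiler verify them. Below is a minimal sketch using static_assert; bitWidth is a helper invented for this example, and the checks assume 8-bit bytes, which holds on x86.

```cpp
#include <stdint.h>  // resolves to include/stdint.h when compiled with -Iinclude

// Width of a type in bits, assuming 8-bit bytes (true on x86).
template <typename T>
constexpr unsigned bitWidth() {
    return static_cast<unsigned>(sizeof(T)) * 8u;
}

// Each exact-width type must have exactly the advertised number of bits.
static_assert(bitWidth<int8_t>() == 8 && bitWidth<uint8_t>() == 8, "8-bit types");
static_assert(bitWidth<int16_t>() == 16 && bitWidth<uint16_t>() == 16, "16-bit types");
static_assert(bitWidth<int32_t>() == 32 && bitWidth<uint32_t>() == 32, "32-bit types");
static_assert(bitWidth<int64_t>() == 64 && bitWidth<uint64_t>() == 64, "64-bit types");

// uintptr_t must be able to hold any pointer on the target.
static_assert(sizeof(uintptr_t) == sizeof(void*), "uintptr_t must be pointer-sized");
```

If the header were ever compiled for a target where one of these assumptions fails (for example a 64-bit build, where pointers no longer fit in uint32_t), the build would stop with a readable message instead of producing a subtly broken kernel.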
types.h¶
#ifndef TYPES_H
#define TYPES_H
#include <stdint.h> // For standard fixed-width integer types
#include <stddef.h> // For size_t, NULL, etc. (supplied by the compiler, even in freestanding mode)
typedef uint8_t byte;
typedef uint16_t word;
typedef uint32_t dword;
#endif // TYPES_H
Building and Running Your First OS¶
Now that we understand all the components, let's build and run our initial kernel.
1. Organize Your Files:¶
Make sure your files are structured correctly in your hashx86 project directory as shown in the folder structure above.
2. Build and Run:¶
Open your WSL Ubuntu terminal, navigate to your hashx86 directory, and run:
make run
This command will:
- Clean any previous build files.
- Assemble loader.asm and compile kernel.cpp.
- Link everything together using linker.ld to create kernel.bin.
- Create the bootable kernel.iso image.
- Launch QEMU to boot from the ISO.
Expected Outcome:¶
You should see a QEMU window open. It might show GRUB boot information or just a black screen. If the QEMU window appears and stays open without immediately closing or showing error messages, that is a good sign that:
- GRUB successfully loaded your kernel.bin.
- Control was properly transferred from GRUB to loader.asm.
- loader.asm correctly set up the stack, called the C++ constructors, and called kernelMain.
- kernelMain executed and entered its while (1); loop.
The fact that the QEMU window is stable (whether showing GRUB text or a black screen) indicates that your fundamental boot process is working correctly.
Conclusion¶
You've successfully compiled, linked, and executed your first basic operating system kernel. This tutorial covered the essential role of the Multiboot standard, the assembly-level bootloader, how the linker controls memory layout, and the initial C++ kernel entry point.
In the next tutorial, we'll break the silence of that black screen and implement VGA Text Output, allowing our OS to display messages directly on the screen.