Sunday, April 27, 2008

Linux boot process parts.


from http://www.snow.nl/dist/htmlc/ch02.html.
The Linux boot process can be logically divided into six parts. They are as follows:
1. Kernel loader loading, setup, and execution (bootsect.s)
In this step the BIOS loads bootsect.s into memory. bootsect.s then sets up a few parameters and loads the rest of the kernel into memory.
2. Parameter setup and switch to 32-bit mode (setup.s)
After the kernel has been loaded, setup.s takes over. It sets up a temporary IDT and GDT (explained later on) and handles the switch to 32-bit mode.
Detailed information on the IDT, GDT and LDT can be found on sandpile.org - The world's leading source for pure technical x86 processor information. http://www.sandpile.org/
3. Kernel decompression (compressed/head.s)
The kernel is stored in a compressed format. This head.s (there is a second head.s, described next) decompresses the kernel.
4. Kernel setup (head.s)
After the kernel is decompressed, the second head.s takes over. The real GDT and IDT are created, as is a basic memory-paging table.
5. Kernel and memory initialization (main.c)
This step is the most complex. The kernel now has control; it sets up all remaining parameters and initializes everything that is left. Virtual memory is set up completely and the first processes are created.
6. Init process creation (main.c)
In the final step of booting, the init process is created.

Kernel Loader (linux/arch/i386/boot/bootsect.s)
When the computer is first turned on, BIOS loads the boot sector of the boot disk into memory at location 0x7C00. This first sector corresponds to the bootsect.s file. The BIOS will only copy 512 bytes, so the kernel loader must be small. The code that is loaded by the BIOS must be able to load the remaining portions of the operating system and pass control onto the next file.

The first thing bootsect.s does once it is loaded is move itself to segment 0x9000 (physical address 0x90000). This avoids possible conflicts in memory. The code then jumps to the new copy. After this, an area in memory is set aside (0x4000-12) for a new disk parameter table. So that more than one sector can be read from the disk at a time, the code looks for the largest number of sectors it can read in one operation. This speeds up disk reads when the rest of the kernel is loaded.

Before this is done, setup.s is loaded into memory directly above bootsect.s, at segment 0x9020 (physical address 0x90200). This allows setup.s to be jumped to after the kernel has been loaded. Now the disk parameter table is created. The code tries to read 36 sectors; if that fails it tries 18, then 15, and if all else fails it uses 9 as the default.
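The fallback order described above can be sketched in a few lines of Python. This is only an illustration of the probing logic, not the assembly in bootsect.s: the function name `probe_sectors_per_track` and the `can_read` callback are hypothetical stand-ins for the BIOS disk read.

```python
def probe_sectors_per_track(can_read, candidates=(36, 18, 15), default=9):
    """Return the first candidate count that can_read accepts, else default.

    can_read is a hypothetical callback standing in for a BIOS disk read;
    it returns True when a read of that many sectors succeeds.
    """
    for count in candidates:
        if can_read(count):
            return count
    return default


def fake_drive_18(count):
    # Hypothetical drive geometry: 18 sectors per track.
    return count <= 18


print(probe_sectors_per_track(fake_drive_18))  # 18
```

Trying the largest value first means the first success is also the best one, so no further probing is needed.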

If an error occurs at any point, little can be done. In most cases, bootsect.s will just keep retrying whatever it was doing when the error occurred. Usually this ends in an endless loop that can only be broken by rebooting by hand.

At last we are ready to copy the kernel into memory. bootsect.s goes into a loop that reads the first 508 KB from the disk and places it into memory starting at 0x10000. After the kernel is loaded into RAM, bootsect.s jumps to segment 0x9020, where setup.s was loaded.
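The addresses in this section are easier to follow once real-mode segment arithmetic is spelled out. The sketch below assumes that 0x9000 and 0x9020 are segment values (a real-mode physical address is the segment times 16 plus an offset) and that 0x7C00 and 0x10000 are physical addresses. The constant names are illustrative and do not come from the text above.

```python
def physical(segment, offset=0):
    """Real-mode address: segment shifted left by 4 bits, plus the offset."""
    return ((segment << 4) + offset) & 0xFFFFF  # 20-bit address space


# Illustrative names; the values are taken from the description above.
BOOTSEG = 0x07C0    # BIOS loads the boot sector at physical 0x7C00
INITSEG = 0x9000    # bootsect.s relocates itself here
SETUPSEG = 0x9020   # setup.s sits 512 bytes above bootsect.s
SYSSEG = 0x1000     # kernel image is loaded at physical 0x10000
KERNEL_MAX = 508 * 1024

kernel_end = physical(SYSSEG) + KERNEL_MAX

print(hex(physical(BOOTSEG)))   # 0x7c00
print(hex(physical(INITSEG)))   # 0x90000
print(hex(physical(SETUPSEG)))  # 0x90200
print(hex(kernel_end))          # 0x8f000
```

Under these assumptions a 508 KB image starting at 0x10000 ends at 0x8F000, just below the relocated bootsect.s at 0x90000, which is consistent with the size limit mentioned above.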

Parameter Setup (linux/arch/i386/boot/setup.s)
setup.s makes sure that all hardware information has been collected and gathers more information if necessary. It first verifies that it is loaded at 0x9020. After this is verified, setup.s does the following:
  1. Gets main memory size
  2. Sets keyboard repeat rate to the maximum
  3. Retrieves video card information
  4. Collects monitor information for the terminal to use
  5. Gets information about the first and possibly second hard drive using BIOS
  6. Looks to see if there is a mouse (a pointing device) attached to the system
All of the information that setup.s collects is stored for later use by device drivers and other areas of the system. Like bootsect.s, if an error occurs little can be done. Most errors are “handled” by an infinite loop that has to be reset manually.

The next step in the booting process needs virtual memory. On an x86, this is only available after switching from real mode to protected mode. After all information has been gathered by setup.s, it does a few more housekeeping chores to get ready for the switch to 32-bit mode.

First, all interrupts are disabled. Once the system is in 32-bit mode, no more BIOS calls can be made. The area of memory at 0x1000 is where the BIOS handlers were loaded when the system came up. We no longer need these, so to get the compressed kernel out of the way, setup.s moves the kernel from 0x10000 to 0x1000. This provides room for a temporary IDT (Interrupt Descriptor Table) and GDT (Global Descriptor Table). The GDT is set up only to cover the system in memory. Paging is disabled, so the described memory locations correspond to actual memory addresses. At this point, extended (or high) memory is enabled by switching on the A20 address line.

setup.s also resets any coprocessor that is present and reprograms the 8259 Programmable Interrupt Controller. All that remains is to set the protection-enable (PE) bit, which puts the processor in protected mode. After the switch has been made, setup.s lets processing continue at compressed/head.s to uncompress the kernel.
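The bit that performs the switch is, on x86, bit 0 (PE) of control register CR0. The following is a minimal model of that single bit operation, written as plain integer arithmetic. The function names and the starting CR0 value are hypothetical; real code does this with privileged instructions, not Python.

```python
CR0_PE = 1 << 0  # protection-enable flag: bit 0 of CR0


def enable_protection(cr0):
    """Return the CR0 value with the PE bit set and all other bits untouched."""
    return cr0 | CR0_PE


def in_protected_mode(cr0):
    """True when the PE bit is set in the given CR0 value."""
    return bool(cr0 & CR0_PE)


cr0 = 0x00000010              # hypothetical CR0 value while in real mode
cr0 = enable_protection(cr0)
print(hex(cr0), in_protected_mode(cr0))  # 0x11 True
```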

Kernel Decompression (linux/arch/i386/boot/compressed/head.s)
This first head.s uncompresses the kernel into memory. The kernel is gzip-compressed to make sure that it fits into the 508 KB that bootsect.s will load. When the kernel is compiled, bootsect.s, setup.s, and compressed/head.s are not compressed and are prepended to the compressed kernel. They are the only three files that must remain uncompressed.

head.s decompresses the kernel to address 0x100000. This corresponds to the 1 MB boundary in memory. Before it decompresses the kernel, head.s does a bit of error checking to ensure that enough high memory is available.

Right before the decompression is done, the flags register is reset and the area in memory where setup.s was is cleared. This puts the system in a known state. After the decompression, control is passed to the now decompressed head.s.

Kernel Setup (linux/arch/i386/kernel/head.s)
The second head.s is responsible for setting up the permanent IDT and GDT, as well as a basic paging table. Before anything is done, the flags register is again reset. The first page in the paging system is set up at 0x5000. This page is filled with the information gathered by setup.s, copied from its location at segment 0x9000.

Next the processor type is determined. For 586s (Pentium) and higher there is a processor instruction that returns the type of processor. Unfortunately, the 386 and 486 do not have this feature, so some tricks have to be employed. Each processor implements only certain flag bits, so by trying to write to them and reading them back you can determine the type of processor. A coprocessor, if present, is also detected.
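The flag trick can be modelled without touching real hardware. The sketch below simulates a flags register in which only some bits are writable and then applies the usual test: the AC flag (bit 18 of EFLAGS) cannot be changed on a 386, and the ID flag (bit 21) can only be changed on processors that support the CPUID instruction. The `FakeCpu` class and the function names are hypothetical and exist only for this illustration.

```python
EFLAGS_AC = 1 << 18  # alignment-check flag, not implemented on the 386
EFLAGS_ID = 1 << 21  # ID flag, writable only when CPUID is supported


class FakeCpu:
    """Hypothetical CPU model: only bits in writable_mask can be changed."""

    def __init__(self, writable_mask):
        self.writable_mask = writable_mask
        self.eflags = 0

    def read_eflags(self):
        return self.eflags

    def write_eflags(self, value):
        # Bits outside the mask silently keep their old value.
        keep = self.eflags & ~self.writable_mask
        self.eflags = keep | (value & self.writable_mask)


def can_toggle(cpu, bit):
    """Flip one flag bit, check whether the change stuck, then restore."""
    original = cpu.read_eflags()
    cpu.write_eflags(original ^ bit)
    changed = (cpu.read_eflags() ^ original) & bit
    cpu.write_eflags(original)
    return bool(changed)


def detect(cpu):
    if not can_toggle(cpu, EFLAGS_AC):
        return "386"
    if not can_toggle(cpu, EFLAGS_ID):
        return "486"
    return "cpuid-capable"  # real code would now execute CPUID


print(detect(FakeCpu(0)))                      # 386
print(detect(FakeCpu(EFLAGS_AC)))              # 486
print(detect(FakeCpu(EFLAGS_AC | EFLAGS_ID)))  # cpuid-capable
```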

After that, the IDT and GDT are set up, starting with the table for the IDT. Each interrupt gets an 8-byte descriptor. Each descriptor is initially set to ignore_int, which means that nothing will happen when the interrupt is called. All that ignore_int does is save the registers, print “unknown interrupts”, and then restore the registers.

Each IDT descriptor is divided into four two-byte sections. The top four bytes are called the WW, while the bottom four are the CW. The WW contains a two-byte offset, a P-flag set to 1, and a Descriptor Privilege Level. The CW has a selector and an offset. In total the IDT can contain up to 256 entries.
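To make the layout concrete, here is a sketch that packs one 8-byte descriptor using the standard x86 32-bit interrupt gate format: the low four bytes hold the selector and the low half of the handler offset, the high four bytes hold the high half of the offset, the P flag, and the DPL. The handler address and selector below are made-up example values, and `pack_interrupt_gate` is a hypothetical helper, not a kernel function.

```python
import struct

GATE_INTERRUPT_32 = 0xE  # type field of a 32-bit interrupt gate


def pack_interrupt_gate(handler, selector, dpl=0, present=True):
    """Build one 8-byte IDT entry as little-endian bytes."""
    # Low dword: selector in the upper half, offset bits 0..15 in the lower.
    low = ((selector & 0xFFFF) << 16) | (handler & 0xFFFF)
    # High dword: offset bits 16..31, then P (bit 15), DPL (bits 13..14),
    # and the gate type (bits 8..11).
    flags = (int(present) << 15) | ((dpl & 0x3) << 13) | (GATE_INTERRUPT_32 << 8)
    high = (handler & 0xFFFF0000) | flags
    return struct.pack("<II", low, high)


# Hypothetical handler address and code-segment selector.
entry = pack_interrupt_gate(handler=0xC0105678, selector=0x10)
low, high = struct.unpack("<II", entry)
print(len(entry), hex(low), hex(high))  # 8 0x105678 0xc0108e00

# A full table of 256 identical entries occupies 2048 bytes.
idt = entry * 256
print(len(idt))  # 2048
```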

At this point the code sets up memory paging. In the x86 architecture, virtual memory uses three structures to establish an address: a Page Directory, a Page Table, and a Page Frame. The Page Directory is a table of all of the pages and what processes they match to. The Page Directory contains an index into the Page Table. The Page Table maps the virtual address to the beginning of a physical page in memory. The Page Frame and an offset use the beginning address of the physical page and can retrieve an actual location in memory. The three structures are set up by head.s. They make it so that the first 4 MB of memory is in the Page Directory. The kernel's virtual address is set to 0xC0000000, the start of the last gigabyte of the 4 GB address space.

Each memory address in an x86 has three parts. The first is the index into the Page Directory. The result of this index is the start of a specific Page Table. The second part of the 32-bit address is an offset into the Page Table. The Page Table has a 32-bit entry that corresponds to that offset. The top 20 bits are used to get an actual physical address. The lower 12 bits are used for administrative purposes. The physical address corresponds to the start of a physical page. The third part of the 32-bit address is an offset within this page, equal to a real memory location.
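The three-part split can be written out directly. On a 32-bit x86 with 4 KB pages, the top 10 bits of an address index the Page Directory, the next 10 bits index the Page Table, and the low 12 bits are the offset within the page. The sketch below uses Python dictionaries as stand-ins for the hardware tables; `translate` and the example mapping are assumptions made for illustration only.

```python
PAGE_PRESENT = 1 << 0  # low 12 bits of an entry are flags; bit 0 is "present"


def split_address(addr):
    """Split a 32-bit address into (directory index, table index, offset)."""
    return (addr >> 22) & 0x3FF, (addr >> 12) & 0x3FF, addr & 0xFFF


def translate(addr, page_directory):
    """Walk a dict-based directory -> table -> frame and return the result."""
    dir_index, table_index, offset = split_address(addr)
    entry = page_directory[dir_index][table_index]
    if not entry & PAGE_PRESENT:
        raise LookupError("page not present")
    return (entry & 0xFFFFF000) | offset  # top 20 bits give the page start


# One page table covering the first 4 MB of physical memory (1024 x 4 KB).
first_4mb = {i: (i << 12) | PAGE_PRESENT for i in range(1024)}

# Map it both at address 0 and at 0xC0000000 (directory slot 768).
page_directory = {0: first_4mb, 768: first_4mb}

print(split_address(0xC0101234))                   # (768, 257, 564)
print(hex(translate(0xC0101234, page_directory)))  # 0x101234
```

One page table holds 1024 entries of 4 KB each, which is why a single table is enough to cover the first 4 MB.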

Almost everything is set up at this point. Control is now passed to the kernel's main function in main.c.

Init process creation (linux/init/main.c)
After all of the init functions have been called, main.c tries to start the init process. It tries three different copies of init in order: if the first doesn't work, it tries the second, and if that one doesn't work either, it goes to the third. Here are the file names for the three init versions:

/etc/init
/bin/init
/sbin/init

If none of these three inits works, the system goes into single-user mode. init is needed to log in multiple users and to manage many other tasks. If it fails, single-user mode creates a shell and the system goes from there.
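The try-in-order behaviour is a simple fallback chain. Here is a minimal sketch, assuming a check for "exists and is executable" in place of actually executing the program, and assuming /bin/sh as the single-user shell (the text above does not name the shell). The names `choose_init` and `is_executable` are hypothetical.

```python
import os

INIT_CANDIDATES = ("/etc/init", "/bin/init", "/sbin/init")
FALLBACK_SHELL = "/bin/sh"  # assumption: the shell path is not given above


def is_executable(path):
    """Stand-in for 'the kernel managed to start this program'."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


def choose_init(works=is_executable, candidates=INIT_CANDIDATES,
                fallback=FALLBACK_SHELL):
    """Return the first candidate that works, else the fallback shell."""
    for path in candidates:
        if works(path):
            return path
    return fallback


# Simulated systems, using injected predicates instead of the real filesystem.
print(choose_init(works=lambda p: p == "/sbin/init"))  # /sbin/init
print(choose_init(works=lambda p: False))              # /bin/sh
```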

tools/build builds the boot image zImage from {bootsect, setup, compressed/vmlinux.out}, or bzImage from {bbootsect, bsetup, compressed/bvmlinux.out}.