Minix Master Boot Record code reading

Background:
 
Minix is an operating system developed in Vrije Universiteit (in Amsterdam, the Netherlands). It is used for educational purposes, and used in the famous book "Operating Systems: Design and Implementation" by Professor Andrew S. Tanenbaum.
 
Master Boot Record is the very first sector in contemporary hard disks. Although it will be replaced by Guid Partition Table and related architecture defined in Intel Unified Extensible Firmware Interface (UEFI) due to the fact that it cannot handle hard disks larger than 2TB, it is still valuable for educational purposes. It contains machine code that reads the partition table, and loads the boot sector of the active partition into memory, and jumps to it to boot the operating system residing on the active partition.
 
For your reference, I wrote this article Partition Tables and the Master Boot Record when I was in university. After years, I found it was not accurate in some places, such as CHS/LBA information, so I put some explanation here:
 
There are four partition table entries in the MBR. The first one begins at offset 0x1BE in the MBR. There are 16 bytes in the entry. The first byte is the active partition flag. From the second byte, three bytes represent the starting CHS (Cylinder, Head, Sector) for the partition. The fifth byte (offset 4) is the system type of the partition, marking any FAT/FAT32/Linux/NTFS/etc. (also differing between <= 8G and > 8G situation). From the sixth byte (offset 5), three bytes represent the last sector's CHS. From the 9th byte (offset 8), four bytes represent the relative sector number (sector number relative to sector containing the partition table), and this value can also be regarded as the LBA sector number (LBA number starts from 0). From the 13th byte (offset 12), four bytes represent the number of sectors in the partition. Due to Intel x86 uses little-endian representation of integers, the four bytes of sector also follows that, so for example 0x12345678 would be 78 56 34 12.
 
The three CHS bytes are actually in the order H, S, C. In details, the first byte is the head number (0 to 255; while DOS supports only 0 to 254). The second byte is a composite: its high 2 bits are the high 2 bits of the cylinder number (0 to 1023). Its low 6 bits are the sector number (1 to 63). The third byte is the low 8 bits of the cylinder number. For example FE BF 8C means head 254, sector 63, cylinder 652 (remember where its two high bits come from?).
 
Code reading:
 
;       masterboot 2.0 - Master boot block code         Author: Kees J. Bot
;
;   Annotated by Robbie (Decheng) Fan 2006, 2011
;
; This code may be placed in the first sector (the boot sector) of a floppy,
; hard disk or hard disk primary partition (Robbie: most often it is placed in
; the first sector of the hard disk, as the master boot record (MBR)).  There
; it will perform the following actions at boot time:
;
; - If the booted device is a hard disk and one of the partitions is active
;   then the active partition is booted.
;
; - Otherwise the next floppy or hard disk device is booted, trying them one
;   by one.
;
; To make things a little clearer, the boot path might be:
;       /dev/fd0        - Floppy disk containing data, tries fd1 then d0
;       [/dev/fd1]      - Drive empty
;       /dev/c0d0       - Master boot block, selects active partition 2
;       /dev/c0d0p2     - Submaster, selects active subpartition 0
;       /dev/c0d0p2s0   - Minix bootblock, reads Boot Monitor /boot
;       Minix           - Started by /boot from a kernel image in /minix
 
        ; 0x0000:LOADOFF is where this code is loaded
        ; Robbie: 0x7C00 is the address where BIOS loads the boot sector or MBR
        ; upon booting the computer; it is a standard of IBM PC
        LOADOFF    =    0x7C00
        ; BUFFER: First free memory
        ; Robbie: 0x0600 is a typical address where the MBR backs itself up;
        ; the MBR created by MS-DOS FDISK utility also uses this address
        BUFFER     =    0x0600
        ; PART_TABLE: Location of partition table within this code
        ; Robbie: it is 0x1BE, the offset of the partition table within the MBR
        PART_TABLE =       446
        PENTRYSIZE =        16  ; Size of one partition table entry
        ; Location of the AA55 magic number
        ; Robbie: As on Intel platform the number is represented in
        ; little-endian byte order, AA55 is put in two consecutive bytes 55 AA
        MAGIC      =       510
 
        ; <ibm/partition>.h:
        bootind    =         0
        sysind     =         4
        lowsec     =         8
 
 
.text
 
; Find active (sub)partition, load its first sector, run it.
 
; Robbie: starting copying the MBR to 0000:0600 {
master:
        xor     ax, ax
        mov     ds, ax
        mov     es, ax
        ; Robbie: need to disable interrupts when setting the stack segment
        ; (ss) and the stack pointer (sp)
        cli
        mov     ss, ax                  ; ds = es = ss = Vector segment
        mov     sp, #LOADOFF
        sti
 
; Copy this code to safety, then jump to it.
        mov     si, sp                  ; si = start of this code
        push    si                      ; Also where we'll return to eventually
        mov     di, #BUFFER             ; Buffer area
        mov     cx, #512/2              ; One sector
        ; Robbie: "cld" (makes the direction forward) is necessary
        cld
        rep     movs
; Robbie: }
        ; Robbie Mosaic: jmpf means "jump far", so it requires two short
        ; integers
        jmpf    BUFFER+migrate, 0       ; To safety
migrate:
 
; Find the active partition
findactive:
        ; Robbie Mosaic: let (dl and dl) affect the psw; dl is set initially by
        ; BIOS (what's the value? 0x00 for floppy, 0x80 for C: hard disk)
        testb   dl, dl
        ; Robbie Mosaic: jump if not signed (positive) ((dl & 0x80) == 0); this
        ; means it is not a hard disk, such as a floppy disk
        jns     nextdisk                ; No bootable partitions on floppies
        mov     si, #BUFFER+PART_TABLE
find:   cmpb    sysind(si), #0          ; Partition type, nonzero when in use
        jz      nextpart
        testb   bootind(si), #0x80      ; Active partition flag in bit 7
        jz      nextpart                ; It's not active
loadpart:
        ; Robbie Mosaic: note that a following STC instruction sets the carry
        ; flag so that the next jc statement will work
        call    load                    ; Load partition bootstrap
        ; Robbie: if CF is set then it's error
        jc      error1                  ; Not supposed to fail
bootstrap:
        ret                             ; Jump to the master bootstrap
                                        ; Robbie Mosaic: this ret jumps to
                                        ; #LOADOFF, which is the value last
                                        ; pushed into the stack.
nextpart:
        ; Robbie: Increment and jump back to `find'
        add     si, #PENTRYSIZE
        cmp     si, #BUFFER+PART_TABLE+4*PENTRYSIZE
        jb      find
; No active partition, tell 'em
        call    print
        .ascii  "No active partition\0"
        jmp     reboot
 
; There are no active partitions on this drive, try the next drive.
nextdisk:
        incb    dl                      ; Increment dl for the next drive
        testb   dl, dl
        js      nexthd                  ; Hard disk if negative
        ; Robbie: excerpt from RBIL
        ; INT 11 - BIOS - GET EQUIPMENT LIST
        ; Return: (E)AX = BIOS equipment list word (see #00226,#03215 at INT
        ;     4B"Tandy")
        ; Note: since older BIOSes do not know of the existence of EAX, the
        ;     high word of EAX should be cleared before this call if any of the
        ;     high bits will be tested
        ; Bit(s) Description     (Table 00226)
        ; 7-6    number of floppies installed less 1 (if bit 0 set)
        int     0x11                    ; Get equipment configuration
        shl     ax, #1                  ; Highest floppy drive # in bits 6-7
        shl     ax, #1                  ; Now in bits 0-1 of ah
        andb    ah, #0x03               ; Extract bits
        ; Robbie: check DL <= AH is because AH is "the number less 1"
        cmpb    dl, ah                  ; Must be dl <= ah for drive to exist
        ja      nextdisk                ; Otherwise try disk 0 eventually
        call    load0                   ; Read the next floppy bootstrap
        jc      nextdisk                ; It failed, next disk please
        ret                             ; Jump to the next master bootstrap
nexthd: call    load0                   ; Read the hard disk bootstrap
error1: jc      error                   ; No disk?
        ret
 
 
; Load sector 0 from the current device.  It's either a floppy bootstrap or
; a hard disk master bootstrap.
load0:
; Robbie: Here a trick is used: for floppy disk, we just load sector 0, so just
; let lowsec(si) point to a DWORD of 0
        mov     si, #BUFFER+zero-lowsec ; si = where lowsec(si) is zero
        ;jmp    load
 
; Load sector lowsec(si) from the current device.  The obvious head, sector,
; and cylinder numbers are ignored in favour of the more trustworthy absolute
; start of partition.
load:
        ; Robbie: exceprt from RBIL (Ralf Brown's interrupt list)
        ; INT 13 - DISK - GET DRIVE PARAMETERS (PC,XT286,CONV,PS,ESDI,SCSI)
        ;   AH = 08h
        ;   DL = drive (bit 7 set for hard disk)
        ;   ES:DI = 0000h:0000h to guard against BIOS bugs
        ; Return: CF set on error
        ;   AH = status (07h) (see #00234)
        ;   CF clear if successful
        ;   AH = 00h
        ;   AL = 00h on at least some BIOSes
        ;   BL = drive type (AT/PS2 floppies only) (see #00242)
        ;   CH = low eight bits of maximum cylinder number
        ;   CL = maximum sector number (bits 5-0)
        ;        high two bits of maximum cylinder number (bits 7-6)
        ;   DH = maximum head number
        ;   DL = number of drives
        ;   ES:DI -> drive parameter table (floppies only)
        ; When the situation is C: hard disk, then DL would be 0x80; in any
        ; case, this int 0x13 call gets the drive parameters
        mov     di, #3          ; Three retries for floppy spinup
retry:
        ; Robbie: this save of DX will be used before calling INT 13H
        push    dx              ; Save drive code
        push    es
        push    di              ; Next call destroys es and di
        movb    ah, #0x08       ; Code for drive parameters
        int     0x13
        pop     di
        pop     es
        ; Robbie: CL is naturally the sector count, because it is 1-based
        andb    cl, #0x3F       ; cl = max sector number (1-origin)
        ; Robbie: DH is made the head count, that's why the "+ 1"
        incb    dh              ; dh = 1 + max head number (0-origin)
        movb    al, cl          ; al = cl = sectors per track
        ; Robbie Mosaic: this mulb means multiplies al with dh, and then stores
        ; the result in ax
        mulb    dh              ; dh = heads, ax = heads * sectors
        mov     bx, ax          ; bx = sectors per cylinder = heads * sectors
        ; Robbie: here the relative sector number is loaded (0-based, maximum
        ; 2^32, resulting in 2 TB maximum size); this can be used with LBA hard
        ; disk sector addressing
        mov     ax, lowsec+0(si)
        mov     dx, lowsec+2(si); dx:ax = sector within drive
        ; Robbie: this code considers the possibility to use the sector number
        ; rather than the CHS address, so it is 2TB-ready
        cmp     dx, #[1024*255*63-255]>>16  ; Near 8G limit?
        ; Robbie Mosaic: jae (jump if above or equal) is for unsigned
        ; comparison while jge is for signed comparison. I guess that jae
        ; checks the carrier flag while jge checks the sign flag.
        jae     bigdisk
        ; Robbie: DIV does unsigned integer division. Beforehand, DX:AX is the
        ; dividend; afterwards, AX stores the quotient, and DX stores the
        ; remainder (refer to i386 programmer's guide)
        div     bx              ; ax = cylinder, dx = sector within cylinder
        xchg    ax, dx          ; ax = sector within cylinder, dx = cylinder
        movb    ch, dl          ; ch = low 8 bits of cylinder
        ; Robbie: DIVB does unsigned integer division for byte; AH:AL stores
        ; the dividend, and AL will be the quotient, while AH will be the
        ; remainder; previously CL was set to the sector count; Now the target
        ; situation is: CL contains the high two bits of cylinder (bits 8..9,
        ; because maximum cylinder number is 1024) and the sector number (bits
        ; 0..5, maximum 63 (note the maximum for the HDD should be based on the
        ; disk parameter), 1-based), CH contains the low 8 bits of cylinder, DL
        ; contains the disk number (80 for C:) and DH contains the header
        ; number (maximum 255, minimum 0)
        divb    cl              ; al = head, ah = sector (0-origin)
        xorb    dl, dl          ; About to shift bits 8-9 of cylinder into dl
        ; Robbie: Question: why is it two SHR with 1 bit rather than one SHR
        ; with 2 bits? Because although 80386 supports an arbitrary immediate
        ; value, 8086 doesn't
        shr     dx, #1
        shr     dx, #1          ; dl[6..7] = high cylinder
        orb     dl, ah          ; dl[0..5] = sector (0-origin)
        movb    cl, dl          ; cl[0..5] = sector, cl[6..7] = high cyl
        ; Robbie: remember the sector number we got from relative sector number
        ; is 0-based, so now it's the time to increment it; meanwhile, the
        ; minix master boot code disregards the CHS value in the partition
        ; table, and it is mentioned on Wikipedia that Unix (including Linux,
        ; etc.) don't regard CHS value
        incb    cl              ; cl[0..5] = sector (1-origin)
        pop     dx              ; Restore drive code in dl
        movb    dh, al          ; dh = al = head
        ; Robbie: LOADOFF (0x7c00) will be where the boot sector of the
        ; partition will be loaded
        mov     bx, #LOADOFF    ; es:bx = where sector is loaded
        ; Robbie: AH = 02 (read), AL = 01 (one sector)
        mov     ax, #0x0201     ; Code for read, just one sector
        ; Robbie: excerpt from RBIL
        ; INT 13 - DISK - READ SECTOR(S) INTO MEMORY
        ;         AH = 02h
        ;         AL = number of sectors to read (must be nonzero)
        ;         CH = low eight bits of cylinder number
        ;         CL = sector number 1-63 (bits 0-5)
        ;              high two bits of cylinder (bits 6-7, hard disk only)
        ;         DH = head number
        ;         DL = drive number (bit 7 set for hard disk)
        ;         ES:BX -> data buffer
        ; Return: CF set on error
        ;             if AH = 11h (corrected ECC error), AL = burst length
        ;         CF clear if successful
        ;         AH = status (see #00234)
        ;         AL = number of sectors transferred (only valid if CF set for
        ;              some BIOSes)
        int     0x13            ; Call the BIOS for a read
        jmp     rdeval          ; Evaluate read result
bigdisk:
        mov     bx, dx          ; bx:ax = dx:ax = sector to read
        ; Robbie: remember DX was pushed near label "retry"
        pop     dx              ; Restore drive code in dl
        push    si              ; Save si
        mov     si, #BUFFER+ext_rw ; si = extended read/write parameter packet
        mov     8(si), ax       ; Starting block number = bx:ax
        mov     10(si), bx
        ; Robbie: This should be the so called extended int 0x13 functionality;
        ; it is supported by new BIOSes; to detect whether this feature exists,
        ; use INT 13 AH=41 and it will return a set of flags showing which
        ; extended functions are supported
        ;
        ; excerpt from RBIL
        ; INT 13 - IBM/MS INT 13 Extensions - EXTENDED READ
        ;         AH = 42h
        ;         DL = drive number
        ;         DS:SI -> disk address packet (see #00272)
        ; Return: CF clear if successful
        ;             AH = 00h
        ;         CF set on error
        ;             AH = error code (see #00234)
        ;             disk address packet's block count field set to number of
        ;             blocks successfully transferred
        ; Format of disk address packet:
        ; Offset  Size    Description     (Table 00272)
        ;  00h    BYTE    size of packet (10h or 18h)
        ;  01h    BYTE    reserved (0)
        ;  02h    WORD    number of blocks to transfer (max 007Fh for Phoenix
        ;                 EDD)
        ;  04h    DWORD   -> transfer buffer
        ;  08h    QWORD   starting absolute block number
        ;                 (for non-LBA devices, compute as
        ;                 (Cylinder*NumHeads + SelectedHead) * SectorPerTrack +
        ;                 SelectedSector - 1
        ;  10h    QWORD   (EDD-3.0, optional) 64-bit flat address of transfer
        ;                 buffer; used if DWORD at 04h is FFFFh:FFFFh
        movb    ah, #0x42       ; Extended read
        int     0x13
        pop     si              ; Restore si to point to partition entry
        ;jmp    rdeval
rdeval:
        ; Robbie: JNC = jump if not CF
        jnc     rdok            ; Read succeeded
        ; Robbie: refer to RBIL Table 00234 for return code in AH; quit without
        ; retry when disk time out for floppy
        cmpb    ah, #0x80       ; Disk timed out?  (Floppy drive empty)
        je      rdbad
        dec     di
        jl      rdbad           ; Retry count expired
        xorb    ah, ah
        int     0x13            ; Reset
        jnc     retry           ; Try again
        ; Robbie Mosaic: set the carry flag to tell the caller the failure of
        ; the execution; see label "load" - as I examined, only that code will
        ; call here
rdbad:  stc                     ; Set carry flag
        ret
rdok:   cmp     LOADOFF+MAGIC, #0xAA55
        jne     nosig           ; Error if signature wrong
        ; Robbie Mosaic: now return to 0000:#LOADOFF, to execute the boot
        ; sector code
        ret                     ; Return with carry still clear
nosig:  call    print
        .ascii  "Not bootable\0"
        jmp     reboot
 
; A read error occurred, complain and hang
error:
        ; Robbie: here the code needs to print the error number of the read
        ; error returned from INT 13H
        mov     si, #LOADOFF+errno+1
prnum:  movb    al, ah          ; Error number in ah
        andb    al, #0x0F       ; Low 4 bits
        ; Robbie: ASCII 'A' = 0x41, '0' = 0x30, so 'A' - ('9' + 1) = 0x41 -
        ; 0x3A = 7
        cmpb    al, #10         ; A-F?
        ; Robbie Mosaic: jb and ja means jump if below and jump if above. The
        ; difference between ja, jb and jg, jl is that ja and jb are for
        ; unsigned comparison. That is, ja and jb check the carry flag while jg
        ; and jl check the sign flag.
        jb      digit           ; 0-9;
        ; Robbie Mosaic: in ASCII, ':' is next to '9'. Remember that millennium
        ; bug in `winfile'?
        addb    al, #7          ; 'A' - ':'
digit:  addb    (si), al        ; Modify '0' in string
        dec     si
        movb    cl, #4          ; Next 4 bits
        shrb    ah, cl
        jnz     prnum           ; Again if digit > 0
        call    print
        .ascii  "Read error "
errno:  .ascii  "00\0"
        ;jmp    reboot
 
reboot:
        call    print
        .ascii  ".  Hit any key to reboot.\0"
        xorb    ah, ah          ; Wait for keypress
        int     0x16
        call    print
        .ascii  "\r\n\0"
        ; Robbie Mosaic: int 0x19 is the rebooting routine (without power-on
        ; self-test)
        int     0x19
 
; Print a message.
print:  pop     si              ; si = String following 'call print'
        ; Robbie Mosaic: the `call print' statement directs the CPU to put the
        ; code address (IP) into the stack (near call so no CS value pushed).
        ; According to this `print' routine, the string must begin immediately
        ; after the `call print' statement whose address is popped into si, and
        ; the return address is immediately after the string (not using the
        ; `ret' command).
prnext: lodsb                   ; al = *si++ is char to be printed
        testb   al, al          ; Null marks end
        jz      prdone
        ; Robbie: excerpt from RBIL
        ; INT 10 - VIDEO - TELETYPE OUTPUT
        ;   AH = 0Eh
        ;   AL = character to write
        ;   BH = page number
        ;   BL = foreground color (graphics modes only)
        ; Desc: display a character on the screen, advancing the cursor and
        ; scrolling the screen as necessary
        movb    ah, #0x0E       ; Print character in teletype mode
        mov     bx, #0x0001     ; Page 0, foreground color
        int     0x10
        jmp     prnext
        ; Robbie: now SI points to the byte after the string
prdone: jmp     (si)            ; Continue after the string
 
.data
 
; Extended read/write commands require a parameter packet.
ext_rw:
        ; Robbie Mosaic: "data1" means 1 byte; "data2" means 2 bytes
        .data1  0x10            ; Length of extended r/w packet
        .data1  0               ; Reserved
        .data2  1               ; Blocks to transfer (just one)
        .data2  LOADOFF         ; Buffer address offset
        .data2  0               ; Buffer address segment
        .data4  0               ; Starting block number low 32 bits (tbfi)
zero:   .data4  0               ; Starting block number high 32 bits
 
Return to MS-DOS for Beginners