Protected Mode Basics
I remember when I was first learning protected mode. I had barely taught myself assembly language, and I got this crazy idea that I wanted to teach myself protected mode. I went out and purchased an 80286 assembly language book that included some protected mode examples, and I was off to learn. Within a few hours, I realized that the book I had purchased didn't have any usable examples, since the examples in the book were intended to be programmed in EPROM CHIPS. So I hit the bulletin boards in search of something I could use as a guiding example.
The only example I found was so poorly documented and convoluted with task switching that even now, many years later, I haven't figured it out. So with my IBM Technical Reference Manual and my 80286 book, I sat down and tried to figure out protected mode. After spending forty hours in three days of trying, I finally copied some source code out of the IBM Technical Reference Manual, and I was able to enter protected mode and then return to DOS.
Since that time, I have learned much about protected mode and how the CPU handles it internally. I discovered that the CPU has a set of hidden registers that are inaccessible to applications. I also learned how these registers get loaded, their role in memory management, and most importantly, their exact contents. Even though these registers are inaccessible, an understanding of the role they play in memory management can be applied to applications programming. Applying this knowledge can result in applications that use less data, less code, and execute faster.
ENTERING PROTECTED MODE
Our goal is to enter protected mode, then leave it and return to DOS. The '286 has no internal mechanism to exit protected mode: once you are in protected mode, you are there to stay. IBM recognized this, and implemented a hardware solution that takes the '286 out of protected mode by resetting the CPU. Since the power-on state of the '286 is real mode, simply resetting the CPU returns it to real mode. But this introduces a slight problem, as the CPU won't continue executing where it left off. At reset, the CPU starts executing at the top of memory, in the BIOS. Without a protocol to tell the BIOS that we reset the CPU for the purpose of exiting protected mode, the BIOS would have no way to return control to the user program. IBM implemented a very simple protocol: the program writes a code to CMOS RAM (CMOS), and the BIOS checks this code to decide what to do. Immediately after the BIOS starts executing from the reset vector, it checks the code in CMOS to determine whether the CPU was reset for the purpose of exiting protected mode. Depending on the code in CMOS, the BIOS can return control to the user program, which then continues executing.
Resetting the CPU isn't without its ramifications; all the CPU registers are destroyed, and the interrupt mask in the Programmable Interrupt Controller (PIC) is sometimes re-programmed by the BIOS (depending on the shutdown type). Therefore, it is the program's responsibility to save the PIC mask, stack pointer, and return address before entering protected mode. The PIC mask and stack pointer must be stored in the user's data segment, but the return address must be stored at a fixed location defined in the BIOS data segment -- at 40:67h.
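The bookkeeping described so far can be modeled in a few lines of C. This is only a sketch under stated assumptions: the Machine structure and the functions port_out, prepare_for_reset, and bios_reset_entry are hypothetical names that stand in for real port I/O and for the BIOS reset path. The CMOS ports 70h/71h, the shutdown byte at CMOS index 0Fh, and the shutdown code 0Ah (jump through the pointer at 40:67h) follow the usual AT convention rather than anything stated in the text above.

```c
#include <stdint.h>

#define CMOS_INDEX_PORT 0x70   /* selects a CMOS register */
#define CMOS_DATA_PORT  0x71   /* accesses the selected register */
#define CMOS_SHUTDOWN   0x0F   /* shutdown status byte (assumed AT convention) */
#define SHUTDOWN_JMP    0x0A   /* assumed code: far jump via 40:67h */

/* Hypothetical model of the pieces of the machine that matter here. */
typedef struct {
    uint8_t bios_data[256];    /* BIOS data segment 40h, offsets 00h..FFh */
    uint8_t cmos[64];          /* CMOS RAM */
    uint8_t cmos_index;        /* register last selected through port 70h */
} Machine;

/* Stand-in for an OUT instruction to one of the two CMOS ports. */
void port_out(Machine *m, uint16_t port, uint8_t value)
{
    if (port == CMOS_INDEX_PORT)
        m->cmos_index = value & 0x3F;
    else if (port == CMOS_DATA_PORT)
        m->cmos[m->cmos_index] = value;
}

/* What the program does before entering protected mode. */
void prepare_for_reset(Machine *m, uint16_t seg, uint16_t off)
{
    /* Far pointer at 40:67h: offset word first, then segment word. */
    m->bios_data[0x67] = (uint8_t)(off & 0xFF);
    m->bios_data[0x68] = (uint8_t)(off >> 8);
    m->bios_data[0x69] = (uint8_t)(seg & 0xFF);
    m->bios_data[0x6A] = (uint8_t)(seg >> 8);

    port_out(m, CMOS_INDEX_PORT, CMOS_SHUTDOWN);
    port_out(m, CMOS_DATA_PORT, SHUTDOWN_JMP);
}

/* What the BIOS does at the reset vector. Returns 1 and fills in seg:off
   if control should go back to the user program, 0 for a normal boot. */
int bios_reset_entry(Machine *m, uint16_t *seg, uint16_t *off)
{
    uint8_t code = m->cmos[CMOS_SHUTDOWN];

    m->cmos[CMOS_SHUTDOWN] = 0;    /* cleared so later resets boot normally */
    if (code != SHUTDOWN_JMP)
        return 0;

    *off = (uint16_t)(m->bios_data[0x67] | (m->bios_data[0x68] << 8));
    *seg = (uint16_t)(m->bios_data[0x69] | (m->bios_data[0x6A] << 8));
    return 1;
}
```

Calling prepare_for_reset and then bios_reset_entry on the same Machine yields the saved seg:off once; a second call to bios_reset_entry reports a normal boot, because the code has been cleared.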
Next, we set the code in CMOS that tells BIOS we will exit protected mode and return to the user's program. This is done simply by writing to the two CMOS I/O ports: a register index to port 70h, then the code itself to port 71h. After the CPU gets reset and BIOS checks the CMOS code, BIOS clears the CMOS code, so subsequent resets won't cause unexpected results. After setting the code in CMOS, the program must build the GDT. (See the appropriate Intel programmer's reference manual for a description of the GDT.) The limit and access rights may be filled in by the assembler at build time, as these values are static. But the base addresses of each segment aren't known until run-time; therefore the program must fill them in the GDT. Our program will build a GDT containing the code, data, and stack segments addressed by our program. One last GDT entry will point to 1M for illustrative purposes.
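The split between static and run-time fields can be sketched in C. This is an illustration, not the article's listing: the names Descriptor286, gdt, set_base, and fill_gdt_bases are hypothetical, the access bytes 9Ah (code) and 92h (data) are common choices assumed here, and the null descriptor in slot 0 reflects the CPU's requirement that the first GDT entry be unused.

```c
#include <stdint.h>

/* 8-byte segment descriptor as laid out on the '286. */
typedef struct {
    uint16_t limit;      /* segment size minus 1 */
    uint16_t base_lo;    /* base address bits 0..15 */
    uint8_t  base_hi;    /* base address bits 16..23 */
    uint8_t  access;     /* access rights byte */
    uint16_t reserved;   /* must be 0 on the '286 */
} Descriptor286;

enum { GDT_NULL, GDT_CODE, GDT_DATA, GDT_STACK, GDT_1M, GDT_ENTRIES };

#define ACCESS_CODE 0x9A  /* assumed: present, ring 0, readable code */
#define ACCESS_DATA 0x92  /* assumed: present, ring 0, writable data */

/* Limits and access rights are static, so they are filled in at build
   time. Only the 1M entry also has a base known in advance (100000h). */
Descriptor286 gdt[GDT_ENTRIES] = {
    [GDT_NULL]  = { 0,      0, 0,    0,           0 },
    [GDT_CODE]  = { 0xFFFF, 0, 0,    ACCESS_CODE, 0 },
    [GDT_DATA]  = { 0xFFFF, 0, 0,    ACCESS_DATA, 0 },
    [GDT_STACK] = { 0xFFFF, 0, 0,    ACCESS_DATA, 0 },
    [GDT_1M]    = { 0xFFFF, 0, 0x10, ACCESS_DATA, 0 },
};

/* Store a 24-bit linear base address into a descriptor. */
void set_base(Descriptor286 *d, uint32_t linear)
{
    d->base_lo = (uint16_t)(linear & 0xFFFF);
    d->base_hi = (uint8_t)((linear >> 16) & 0xFF);
}

/* Run-time step: convert the real-mode segment values the program was
   loaded with (segment * 16) into descriptor base addresses. */
void fill_gdt_bases(uint16_t cs, uint16_t ds, uint16_t ss)
{
    set_base(&gdt[GDT_CODE],  (uint32_t)cs << 4);
    set_base(&gdt[GDT_DATA],  (uint32_t)ds << 4);
    set_base(&gdt[GDT_STACK], (uint32_t)ss << 4);
}
```

For a program loaded with CS=1234h, the code descriptor ends up with base 012340h, split as base_lo=2340h and base_hi=01h.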
Accessing memory at 1M isn't as simple as creating a GDT entry and using it. The 8086 has the potential to address 64k (minus 16 bytes) beyond the maximum addressability of 1M -- all it lacks is a 21st address line. The 8086 only has 20 address lines (A00..A19), and any attempt to address beyond 1M wraps around to 0 because of the absence of A20. The '286 has 24 bits of addressability (A00..A23) and doesn't behave like the 8086 in this respect. Any attempt to address beyond 1M (FFFF:0010 - FFFF:FFFF) will happily assert A20, and not wrap back to 0. Any program that relies on the memory wrapping "feature" of the 8086 will fail to run properly. As a solution to this compatibility problem, IBM decided to AND the A20 output of the CPU with a programmable output pin on some chip in the computer. The output of the AND gate is connected to the address bus, so A20 on the address bus is asserted only when both the CPU's A20 and the programmable source are high. The keyboard controller was chosen as this programmable source because it had some available pins that can be held high, low, or toggled under program control. When this pin is programmed high, the output of the AND gate is high whenever the CPU asserts A20. When the pin is low, A20 is always low on the address bus -- regardless of the state of the CPU's A20. Thus by inhibiting A20 from being asserted on the address bus, '286-class machines can emulate the memory wrapping attributes of their 8086 predecessors.
Notice that only A20 is gated to the address bus. Therefore, without enabling the input to the A20 gate, the CPU can address only every even megabyte of memory: 0-1M, 2-3M, 4-5M, etc. In fact, duplicates of these memory blocks appear at 1-2M, 3-4M, 5-6M, etc. as a result of holding A20 low on the address bus. To enable the full 24 bits of addressability, a command must be sent to the keyboard controller (KBC). The KBC then drives its output pin high, as input to the A20 gate. Once this is done, memory no longer wraps, and we can address the full 16M of memory on the '286, or all 4G on 80386-class machines. All that remains in order to enter protected mode is changing the CPU state to protected mode and jumping to clear the prefetch queue (not necessary on the Pentium).
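The effect of the A20 gate on addresses can be checked with a little arithmetic. The following C sketch models the AND gate described above; the function names real_mode_address and gate_a20 are made up for this illustration, and the model covers only the address arithmetic, not the keyboard controller command that drives the pin.

```c
#include <stdint.h>

/* Address the CPU computes for a real-mode segment:offset pair.
   The largest value, FFFF:FFFF, is 10FFEFh, which needs A20. */
uint32_t real_mode_address(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}

/* Address that reaches the bus. Only bit 20 passes through the AND gate;
   when the keyboard controller pin is low, bit 20 is forced to 0 and all
   other address bits are unaffected. */
uint32_t gate_a20(uint32_t cpu_address, int kbc_pin_high)
{
    if (!kbc_pin_high)
        cpu_address &= ~((uint32_t)1 << 20);
    return cpu_address;
}
```

With the pin low, FFFF:0010 lands on address 0 just as it would on an 8086, and an access to 300000h (3M) lands on 200000h (2M), which is the aliasing of odd megabytes onto even ones. With the pin high, FFFF:0010 reaches 100000h.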
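The following table summarizes the steps required to enter (with the intention of leaving) protected mode on the '286:
1. Save the PIC mask in the program's data segment.
2. Save SS:SP in the program's data segment.
3. Save the return address at 40:67h.
4. Set the shutdown code in CMOS.
5. Build the GDT.
6. Enable A20 on the address bus.
7. Change the CPU state to protected mode.
8. Jump to clear the prefetch queue.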
Steps 1-6 can be done in any order.
Far fewer steps are required to enter protected mode on the '386 and '486, as the '386 can exit protected mode without resetting the CPU. For compatibility purposes, all '386 BIOSes recognize the CPU shutdown protocol defined on '286-class machines, but following this protocol isn't necessary. To exit protected mode on a '386, the program simply clears a bit in a CPU control register. There is no need to save the PIC mask, SS:SP, or a return address, or to set a CMOS code. The requisite steps for entering protected mode on a '386 simply become:
1. Build the GDT.
2. Enable A20 on the address bus.
3. Change the CPU state to protected mode.
4. Jump to clear the prefetch queue.
Of these requisite steps, building the GDT is the only step that may differ. In the '386 the base address is expanded to 32 bits, the limit is expanded to 20 bits, and two more control attribute bits are present. Listing 1 contains all the auxiliary subroutines needed to enter protected mode.
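The '386 extensions all fit in the last two bytes of the descriptor, which the '286 requires to be zero. The C sketch below shows one way to pack the fields; Descriptor386 and make_descriptor386 are hypothetical names, and the two extra attribute bits are taken here to be the granularity bit (G) and the default operand size bit (D) from Intel's '386 documentation.

```c
#include <stdint.h>

/* 8-byte segment descriptor as laid out on the '386. The first six bytes
   match the '286 format; the last two were reserved on the '286. */
typedef struct {
    uint16_t limit_lo;        /* limit bits 0..15 */
    uint16_t base_lo;         /* base bits 0..15 */
    uint8_t  base_mid;        /* base bits 16..23 */
    uint8_t  access;          /* access rights byte */
    uint8_t  flags_limit_hi;  /* bit 7 = G, bit 6 = D, bits 0..3 = limit 16..19 */
    uint8_t  base_hi;         /* base bits 24..31 */
} Descriptor386;

/* Pack a 32-bit base, a 20-bit limit, and the attribute bits. */
Descriptor386 make_descriptor386(uint32_t base, uint32_t limit,
                                 uint8_t access, int granular, int default32)
{
    Descriptor386 d;

    d.limit_lo = (uint16_t)(limit & 0xFFFF);
    d.base_lo  = (uint16_t)(base & 0xFFFF);
    d.base_mid = (uint8_t)((base >> 16) & 0xFF);
    d.access   = access;
    d.flags_limit_hi = (uint8_t)(((limit >> 16) & 0x0F)
                               | (granular  ? 0x80 : 0)
                               | (default32 ? 0x40 : 0));
    d.base_hi  = (uint8_t)(base >> 24);
    return d;
}
```

A descriptor built with a base below 16M, a limit of FFFFh or less, and both attribute bits clear leaves the last two bytes zero, so it is bit-for-bit a valid '286 descriptor. That is why the same GDT-building code can often serve both CPUs.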
EXITING PROTECTED MODE
Like entering protected mode, exiting it differs between the '286 and 80386-class machines. The '386 simply clears a bit in the CPU control register CR0, while the '286 must reset the CPU. Resetting the CPU isn't without its costs, as many hundreds -- if not thousands -- of clock cycles pass in the time it takes to reset the CPU and return control to the user program. The original method employed by IBM used the keyboard controller, with another of its output pins connected to the CPU RESET line. By issuing the proper command, the KBC would toggle the RESET line on the CPU. This method works, but it is very slow. Many new generation '286 chip sets have a "FAST RESET" feature. These chip sets toggle the RESET line in response to a simple write to an I/O port. When available, FAST RESET is the preferred method. But there is a third, obscure, but efficient method for resetting the CPU without using the KBC or FAST RESET. This method is elegant, faster than using the KBC, and works on the '386 WITHOUT resetting the CPU! It is truly the most elegant, comprehensive way to exit protected mode, since it works on both the '286 and the '386 -- in the most efficient way possible for each CPU. Listing 2 provides the code necessary to use the KBC and this elegant technique.
Using the KBC to reset the CPU is a straightforward technique, but the elegant technique requires some explanation. Recall from our discussion of interrupts that the CPU checks the interrupt number (multiplied by 8) against the limit field in the interrupt descriptor table register (IDTR). If this test passes, the next phase of interrupt processing begins. If the test fails, the CPU raises a fault of its own (in protected mode, a general protection fault, INT 13), and if that fault cannot be serviced either, the result is a DOUBLE FAULT (INT 08). For example, suppose the limit field in the IDTR=80h: our IDT will service 16 interrupts, 00-15. If interrupt 16 or above were generated, the CPU would fault at the inception of the interrupt calling sequence, before any handler for that interrupt runs. Now, suppose the limit field in the IDTR=0, thus inhibiting all interrupts from being serviced. Any interrupt would fault, that fault could not be serviced either, and the CPU would DOUBLE FAULT. But the DOUBLE FAULT itself would also fault, because the limit is less than 40h and the INT 08 descriptor lies beyond it. This ultimately causes a TRIPLE FAULT, and the CPU enters a shutdown cycle. The shutdown cycle itself doesn't reset the CPU, as a shutdown cycle is considered a BUS cycle. External hardware is attached to the CPU to recognize the shutdown cycle. When a shutdown cycle is observed, the external hardware toggles the RESET input of the CPU. Therefore, all we need to do to cause the RESET is set the IDTR.LIMIT=0, then generate an interrupt. For elegance, we don't just INT the CPU; we generate an invalid opcode. Our opcode is carefully chosen: it doesn't exist on the '286, but does exist on the '386. The elegance of the algorithm lies in the opcode chosen for this purpose: MOV CR0,EAX. This generates the desired invalid opcode exception on the '286, but is the first instruction in a sequence to exit protected mode on the '386. Thus the '286 gets RESET, and the '386 falls through and exits protected mode gracefully.
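The limit test that drives this technique is simple enough to model. The C sketch below is a simplified model, not CPU-accurate: the names idt_entry_in_limit, deliver, and the Outcome values are hypothetical, and it ignores every check other than the limit (gate type, present bit, and so on). It assumes, following Intel's protected-mode documentation, that a vector beyond the limit first raises a general protection fault (vector 13), and that the double fault (vector 8) follows only if that fault cannot be delivered either.

```c
#include <stdint.h>

enum { VEC_DOUBLE_FAULT = 8, VEC_GENERAL_PROTECTION = 13 };

typedef enum {
    HANDLED,                  /* the requested vector has a descriptor */
    HANDLED_AS_GP,            /* delivered as a general protection fault */
    HANDLED_AS_DOUBLE_FAULT,  /* delivered as a double fault */
    SHUTDOWN                  /* nothing deliverable: shutdown cycle */
} Outcome;

/* Each protected-mode IDT entry is 8 bytes, and the limit is the offset
   of the last valid byte, so entry n needs bytes n*8 .. n*8+7. */
int idt_entry_in_limit(uint16_t limit, uint8_t vector)
{
    return (uint32_t)vector * 8 + 7 <= limit;
}

/* Follow the escalation chain for one interrupt or exception. */
Outcome deliver(uint16_t limit, uint8_t vector)
{
    if (idt_entry_in_limit(limit, vector))
        return HANDLED;
    if (idt_entry_in_limit(limit, VEC_GENERAL_PROTECTION))
        return HANDLED_AS_GP;
    if (idt_entry_in_limit(limit, VEC_DOUBLE_FAULT))
        return HANDLED_AS_DOUBLE_FAULT;
    return SHUTDOWN;
}
```

In this model, deliver(0, 6) returns SHUTDOWN: with a limit of 0 not even the invalid opcode exception (vector 6) has a reachable descriptor, and neither does anything it could escalate to. That is the situation the '286 exit path sets up on purpose.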
Exiting protected mode on the '286 and the '386 closely resembles reversing the steps for entering protected mode. On the '286, you must:
And on the '386, the steps are simply:
(Listing 3 includes the subroutines needed to restore the machine state after exiting protected mode).
Notice that exiting protected mode on the '386 requires loading the segment registers twice. The segment registers are loaded the first time, while still in protected mode, to ensure that real-mode compatible values are stored in the hidden descriptor cache registers -- as the descriptor cache registers "honor" the access attributes and segment size limit from protected mode, even when loaded in real mode. The segment registers are loaded the second time, in real mode, to define them with real-mode segment values.
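A toy model makes the reason for the double load easier to see. In the C sketch below, all names (DescriptorCache, SegmentRegister, load_protected, load_real) are hypothetical, the GDT is represented as an array of already-decoded entries for brevity, and the real-mode load is assumed to update only the base, as described above.

```c
#include <stdint.h>

/* Hidden part of a segment register. */
typedef struct {
    uint32_t base;
    uint32_t limit;
    uint8_t  access;
} DescriptorCache;

/* Visible selector plus the hidden descriptor cache. */
typedef struct {
    uint16_t        selector;
    DescriptorCache cache;
} SegmentRegister;

/* Protected-mode load: the whole cache is refilled from the GDT entry
   the selector indexes (selector / 8). */
void load_protected(SegmentRegister *r, uint16_t selector,
                    const DescriptorCache *gdt)
{
    r->selector = selector;
    r->cache = gdt[selector >> 3];
}

/* Real-mode load: only the base changes (segment * 16). The limit and
   access rights left over from protected mode stay in force. */
void load_real(SegmentRegister *r, uint16_t segment)
{
    r->selector = segment;
    r->cache.base = (uint32_t)segment << 4;
}
```

If a register last held a protected-mode descriptor with a 1k limit, a real-mode load alone leaves that 1k limit in place. Loading a descriptor with a 64k limit and writable-data access rights first, and only then the real-mode segment value, leaves the register in a state that behaves like ordinary real mode.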
Now that we have all the tools and theory necessary to enter and exit protected mode, we can apply this knowledge to write a program that enters protected mode, moves a block of data from extended memory, and exits protected mode -- returning to DOS. Listing 4 shows a program that consists of these basic steps and can be used to move a 1k block of data from 1M to our program's data segment.
Applications programming for real mode and applications programming for protected mode aren't that different. Both modes use memory segmentation, interrupts, and device drivers to support the hardware. Whether in real mode or protected mode, a set of user-inaccessible registers -- called descriptor cache registers -- plays a major role in memory segmentation and memory management. The descriptor cache registers contain information defining the segment base address, segment size limit, and segment access attributes, and are used for all memory references -- regardless of the values in the segment registers.
Entering and exiting protected mode requires nothing more than following the mechanics necessary for the proper mode transition: entering protected mode requires saving the machine state that needs to be restored upon exiting protected mode. The mechanics of entering real mode depend on the type of the CPU: the '286 requires a reset to enter real mode, and the '386 can enter real mode under program control. By applying our knowledge of how the CPU internally operates, we can write source code that exits protected mode in the manner best suited, and most elegant, for the given CPU.