Introduction to Protected Mode Programming

A Brief History of the 80x86

In 1969 Intel was approached by a (now defunct) Japanese corporation to build a custom circuit for a new calculator. Intel designer Ted Hoff proposed that a programmable, general-purpose computing circuit be built instead, and the 4004, released in 1971, was born. The 4040 and the 8008 chips soon followed, but they lacked many characteristics of microprocessors as we know them today. In 1974 Intel introduced the 8080, which was used in such systems as the Altair and the IMSAI. Soon after that Motorola introduced the 6800 and MOS Technology came out with the 6502. Two of the 8080 designers left Intel for Zilog Corporation, which came out with the Z80 (which was compatible with the 8080, but was twice as fast and had an expanded instruction set).

The 8080 was an 8-bit machine. It had a single accumulator (the A register) and six secondary registers (B, C, D, E, H, and L). These six registers could be used in 8-bit arithmetic operations or combined as pairs (BC, DE, or HL) to hold 16-bit memory addresses. A 16-bit address is able to access only 64KB of memory (2^16 bytes). In 1978 Intel moved to a 16-bit architecture with the 8086. Unfortunately, programs on the 8080 wouldn't run on the 8086 (we miss the retc (return on carry) instruction). However, every new generation of processor since then has been able to run software written for the previous generation.

The 8086 introduced segmentation to the microprocessor world. A segment is a block of memory beginning at a fixed address that is determined by the value in the appropriate segment register (the register value multiplied by 16). This was probably the most despised feature of the 8086 because of the restrictions it imposed: each segment was only 64K in length. However, using segmentation, software could expand the amount of memory the chip could address. The 8086 provides four segment registers that can point anywhere in the 1MB address space. Those four are CS (code segment), DS (data segment), SS (stack segment), and ES (extra segment).

In 1982 Intel introduced the 80286. The 286 supported two modes: real mode (RM) and protected mode (PM). Real mode, which emulates the 8086, is the default mode. In protected mode, the 286 placed a new interpretation on the contents of the segment registers that control how memory is accessed. PM allowed memory from 1MB to 16MB to be physically addressable. Because support for protected mode was lacking (and because it was a real pain to program for on the 286), many programs didn't take advantage of PM.

With the advent of the 80386 chip, most of the shortcomings of the previous processors were fixed. It was a true 32-bit processor, with 32-bit addressing. However, in order to maintain compatibility, the 386+ processors boot up in real mode, use 16-bit registers and the 16-bit segmentation scheme, and are subject to the 1MB memory limitation. But the 386 can also be switched into protected mode. In PM, each segment is marked by a bit that designates whether the segment is a PM segment containing 16-bit 80286 code or a 32-bit PM segment.

Addressing Differences

16-bit Real Mode

16-bit real mode pointers can be either 16 bits (an offset alone) or 32 bits (a 16-bit segment plus a 16-bit offset). When coding in pure assembly, memory allocation is done in paragraph chunks (16 bytes at a time). Because segments fall on paragraph boundaries, it is enough to return a segment value. Offsets simply start at 0 into the segment (i.e., if ES is the allocated memory segment, ES:[0] is the first byte of the memory). Since there is no memory protection, the only way to generate a protection fault is to use 32-bit addressing with an offset beyond the 64K segment limit. Other languages use a 16-bit segment and 16-bit offset. 1 megabyte is addressable, but all the memory above 640K is taken up by the system (screen memory, BIOS area, etc.).
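The segment and paragraph arithmetic can be sketched in C. This is only an illustration of the address calculation; the function names rm_linear and paragraphs_for are made up for this example and do not come from any compiler or library.

```c
#include <stdint.h>

/* Real-mode address translation: linear = segment * 16 + offset.
   (Hypothetical helper name, for illustration only.) */
uint32_t rm_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Number of 16-byte paragraphs needed to hold `bytes` bytes,
   rounded up to the next whole paragraph. */
uint32_t paragraphs_for(uint32_t bytes)
{
    return bytes / 16u + (bytes % 16u != 0u);
}
```

For example, rm_linear(0xA000, 0) gives 0xA0000, the start of VGA screen memory, and a 100-byte request needs paragraphs_for(100) = 7 paragraphs.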

32-bit Real Mode (unreal mode)

32-bit real mode was first introduced to the general public by a demo group back in early 1992 (I'm not sure which group released it first). Origin's Ultima VII utilized this mode. This mode requires that the machine be dropped into protected mode, the segment limits set to 4 gigabytes, and then the machine popped back to real mode without a CPU reset. All of the normal real-mode functions work fine until another program goes into protected mode. Since EMM386 does this to access extended memory, EMM386 must be disabled. 32-bit addressing is allowed in this mode, so most of the time segment registers are set to 0 and only the 32-bit offset is used. The only way to generate a protection fault is to write to a memory address above the memory installed or above the segment limits (which are almost always set to 4 gigabytes). Normal BIOS calls can still be executed and most software will work in this mode. The most memory addressable is 4 gigabytes in this mode.

16-bit Protected Mode

16-bit Protected Mode is (AFAIK) exclusive to Borland Pascal 7.0. The segment limits are set to 64K and the compiler will only do 16-bit addressing. Most BIOS interrupts work, but some special care has to be taken in order to do some real-mode specific things. Any BIOS interrupt that accepts values in segment registers has to be called with a Real-Mode callback. This is the easiest protected mode to program for, since very few modifications have to be made to 16-bit RM programs (making for easy ports of applications). Pointers are usually 32 bits: a 16-bit selector and a 16-bit offset. The most memory that can be allocated is 16 megabytes.

32-bit Protected Mode

32-bit protected mode is almost a standard now with C and C++ compilers. Watcom C, GNU C and Borland C are all now 32-bit compilers. For the most part the segment registers are set to the base of memory and only the 32-bit offset is used. Therefore, pointers are 32 bits. Programming in 32-bit PM is very difficult, as most of the BIOS calls don't work directly (a real-mode callback must be used). The most memory addressable is 4 gigabytes.

48-bit Protected Mode

48-bit protected mode is also (AFAIK) exclusive to Borland Pascal 7.0. While the compiler does not support 48-bit addressing, it is possible to use 32-bit offsets with the selectors. A special unit (called NewFrontier) has to be used in order to allocate the 48-bit pointers (16-bit selector, 32-bit offset). The same BIOS problems in 16-bit protected mode apply to 48-bit protected mode as well. This mode allows a maximum allocation of 64 terabytes of memory (the maximum virtual address space supported by the Intel processors).
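The 64 terabyte figure can be reproduced with a little arithmetic. The sketch below assumes the standard 386 selector layout (a 13-bit descriptor index plus one bit choosing between the GDT and the LDT) and a 4 gigabyte maximum segment size. The function name max_virtual_bytes is made up for this example, and the result is an upper bound, since a few table entries (such as the null descriptor) cannot describe usable segments.

```c
#include <stdint.h>

/* Upper bound on the 386 virtual address space, in bytes.
   (Hypothetical helper name, for illustration only.) */
uint64_t max_virtual_bytes(void)
{
    const uint64_t descriptors_per_table = 1ull << 13; /* 13-bit selector index   */
    const uint64_t tables                = 2;          /* GDT and LDT             */
    const uint64_t max_segment_bytes     = 1ull << 32; /* 4 GB per 32-bit segment */

    return descriptors_per_table * tables * max_segment_bytes;
}
```

The product is 2^46 bytes, and 2^46 / 2^40 = 64 terabytes.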

Protected Mode in Borland Pascal 7.0

Borland Pascal 7.0 is a 16-bit protected mode compiler and it allows programmers to use up to 16 MB of memory. The only drawback is that it uses 286 code and only allows for 16-bit addressing. However, with some work, it is possible to do 32-bit addressing, and even 48-bit addressing.

The main difference between coding RM applications and PM applications is the way that one accesses memory. In RM, one could put any value into a segment register and use that as a base address (or as a temporary variable). In PM, this isn't possible, since there is no direct correlation between what is in the segment register and the memory it accesses. Instead, the value in the segment register (called a selector) is really an offset into one of two tables (the LDT and GDT, Local Descriptor Table and Global Descriptor Table) that hold the real memory address. If you do try to load an invalid value into a segment register, the program will produce a General Protection Fault (GPF).
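To make the "offset into a table" idea concrete, here is a small C sketch that pulls a selector apart. It assumes the standard 286/386 selector layout: bits 3 to 15 are the table index, bit 2 picks the GDT or LDT, and bits 0 to 1 are the requested privilege level. The type and function names are made up for this example.

```c
#include <stdint.h>

/* Fields packed into a 16-bit protected-mode selector.
   (Hypothetical type name, for illustration only.) */
typedef struct {
    uint16_t index;    /* entry number within the descriptor table */
    int      uses_ldt; /* 0 = GDT, 1 = LDT                         */
    uint8_t  rpl;      /* requested privilege level, 0..3          */
} SelectorFields;

SelectorFields decode_selector(uint16_t selector)
{
    SelectorFields f;
    f.index    = (uint16_t)(selector >> 3);
    f.uses_ldt = (selector >> 2) & 1;
    f.rpl      = (uint8_t)(selector & 3u);
    return f;
}

/* Byte offset of the descriptor within its table. Each descriptor
   is 8 bytes, so masking off the low three bits is the same as
   computing index * 8. */
uint32_t descriptor_offset(uint16_t selector)
{
    return (uint32_t)(selector & 0xFFF8u);
}
```

A selector such as 00A7h therefore means entry 20 of the LDT at privilege level 3, which lives 160 bytes into the table. This is also why an arbitrary value loaded into a segment register usually faults: it rarely names a valid table entry.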

Borland Pascal's DPMI server automatically deals with allocating selectors, which means that you, as a programmer, don't need to worry about the memory you allocate. What you do need to worry about is whether you have any absolute memory addresses hard-wired into your programs. For example, instead of loading 0A000h into a segment register, use the BP-defined variable SegA000.

Self Modifying Code (SMC)

Self-modifying code is generally frowned upon as a bad coding practice. Unless you absolutely (and I mean absolutely) need to use it, I recommend that you do not. In Real Mode, SMC is not difficult, since one can write to the code segment. In Protected Mode, however, each selector has a flag that designates read/write or read-only, and code selectors are always flagged as read-only. A GPF occurs if you attempt to write to a code selector. If necessary, you can write SMC through an alias: a writable data selector that covers the same memory as the code selector.
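The read-only flag lives in the access byte of each descriptor. The C sketch below assumes the standard 286/386 access-byte layout (bit 4 marks a code or data segment, bit 3 marks executable, bit 1 means readable for code and writable for data). The constant and function names are made up for this example, and under DPMI it is the host, not your program, that actually builds the alias descriptor.

```c
#include <stdint.h>

/* Access-byte bits of a code/data segment descriptor.
   (Hypothetical constant names, for illustration only.) */
#define ACC_RW         0x02u /* code: readable, data: writable           */
#define ACC_DC         0x04u /* code: conforming, data: expand-down      */
#define ACC_EXECUTABLE 0x08u /* 1 = code segment, 0 = data segment       */
#define ACC_CODE_DATA  0x10u /* 1 = code/data, 0 = system descriptor     */

/* A segment can be written through only if it is a data segment
   with the writable bit set. Code segments never qualify. */
int segment_is_writable(uint8_t access)
{
    return (access & ACC_CODE_DATA) != 0u
        && (access & ACC_EXECUTABLE) == 0u
        && (access & ACC_RW) != 0u;
}

/* Derive the access byte for a data alias of a code segment:
   drop the executable and conforming bits, set the writable bit.
   Base and limit (not shown) stay the same as the code segment's. */
uint8_t data_alias_access(uint8_t code_access)
{
    return (uint8_t)((code_access & ~(ACC_EXECUTABLE | ACC_DC)) | ACC_RW);
}
```

With a typical code access byte of 9Ah, segment_is_writable returns 0, while the alias byte 92h produced by data_alias_access is writable.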

Other Problems

Most of the interrupt calls are understood and dealt with by the DPMI server. Some, such as VESA information calls, are not. The reason they are not handled is that the Real Mode segment value of a memory address must be passed as a parameter in a segment register. Since the RM value is not necessarily a valid selector, the call will most likely crash. Actually, just loading the segment register will produce a GPF, and the interrupt call will never occur. Once the Real Mode callback is written, another problem occurs. The VESA information call returns RM pointers. One must convert those pointers to PM pointers before dereferencing them. There is a DPMI interrupt call to do this, which will be discussed later.
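Before any DPMI call can help, the RM pointer has to be split into its parts. The C sketch below assumes the usual layout of a real-mode far pointer stored as a 32-bit value (offset in the low word, segment in the high word). The function names are made up for this example, and it only computes the linear address that the RM pointer refers to; mapping that address onto a selector is still the DPMI server's job.

```c
#include <stdint.h>

/* Split a packed real-mode far pointer (segment:offset) into parts.
   (Hypothetical helper names, for illustration only.) */
uint16_t far_ptr_segment(uint32_t far_ptr)
{
    return (uint16_t)(far_ptr >> 16);
}

uint16_t far_ptr_offset(uint32_t far_ptr)
{
    return (uint16_t)(far_ptr & 0xFFFFu);
}

/* Linear address the real-mode pointer refers to: segment * 16 + offset. */
uint32_t far_ptr_linear(uint32_t far_ptr)
{
    return ((uint32_t)far_ptr_segment(far_ptr) << 4) + far_ptr_offset(far_ptr);
}
```

A returned pointer of C0000123h, for instance, is segment C000h and offset 0123h, which is linear address C0123h.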