ICE Mode and the Pentium Processor

By Robert R. Collins


In my previous two columns, I’ve discussed various aspects of in-circuit emulators (ICEs). The first article of the series gave a broad overview of the history of various Intel ICE implementations (see DDJ, July 1997, or http://www.x86.org/ddj/Jul97), starting with my own experiences. I explained the differences between ICEs and software-based debuggers. I concluded with a discussion of how the various Intel x86 ICE implementations have evolved over the years.

In the second article, I discussed the evolution of the microprocessor itself and how it has been designed with in-circuit emulation in mind (see DDJ, September 1997, or http://www.x86.org/ddj/Sep97). Intel CPUs from the 80186 to 80486 contained functions that aided in many ICE functions such as code tracing and halting microprocessor on specific break-point events. I explained how these features worked and how they were implemented in these older CPUs. Finally, I began a discussion of ICE mode, a special CPU operating mode available only to special in-circuit emulator hardware. As I pointed out, early implementations of ICE mode differ greatly from the current implementations in the Pentium and Pentium Pro families. Intel decided to scrap its prior approach in favor of an entirely different approach. Because of these vast differences, I wanted to dedicate an entire article to discussing the ICE features on these contemporary processors.

Inside the Pentium Processor

During the design of the Pentium, Intel decided to take an entirely different approach to in-circuit emulation. Intel was already planning to include some debugging facilities on this processor by virtue of the IEEE-standard test access port (TAP) and Boundary Scan Architectures. The TAP is a small set of pins that aren’t used for any microprocessor input, output, address, or data functions. Instead, they consist of an input and output data pin (TDI and TDO, respectively), clock pin (TCK), mode pin (TMS), and reset pin (TRST#). TAP instructions and data are input in a serial manner on single input and output, pins, which are sampled on the clock (TCK) boundaries.

The TAP has a small instruction set, requiring only a few bits to define all instructions in the instruction set. Even so, the instruction register is much larger than necessary. This allowed Intel to extend the functions of the TAP to define special-purpose TAP instruction codes. Some of these special instructions are used to implement the Boundary Scan Architecture.

The Boundary Scan Architecture (BSA) can be used to program the microprocessor to drive all of its various pins to any user-defined state, or to simply tri-state them. When these pins are driven, BSA lets board manufacturers test the circuit paths and chipset logic of their motherboards. Chipsets may be tested to en- sure proper response to input stimuli from the microprocessor. This allows motherboards to be tested for circuit correctness.

Another benefit of BSA is the ability to read the internal state of the microprocessor. Using BSA, it is possible to direct the microprocessor to put the contents of a standard microprocessor register (such as EAX) into a special output register. The output register is just a temporary holding register. Its contents are then "scanned" out of the microprocessor – one bit at a time – on the TAP TDO pin. The BSA isn’t limited to scanning the various microprocessor registers: It can be used to read the various interface signals between each unit of the microprocessor. For example, BSA could be implemented to read the interface signals at the connection between the prefetch unit and the decode unit, or to read the signals between the decode unit and the execution unit. In fact, this is exactly how most microprocessor manufacturers are using the TAP today – to implement some way to "scan" the various registers and electrical signals from within the microprocessor, and make them readable outside of the microprocessor. Intel isn’t the only company to do this, but it figured out a novel way to capitalize on this idea.

Intel decided that this technique could be used to implement its next version of ICE mode. Intel further extended the TAP instruction set to include some high-level instructions for its new ICE mode, known internally as "probe mode." These instructions are used to enter and leave probe mode, build and submit a "probe mode instruction," and access the probe mode data register.

While in probe mode, the Pentium can examine and modify the internal and external state of a computer system. Memory and I/O space may be examined and modified. All internal CPU registers may also be examined and modified – including control registers (CRn), debug registers (DRn), and model-specific registers (MSRs).

During probe mode, the normal execution of instructions is interrupted, and the Pentium enters a dormant state. While previous x86 processors with embedded ICE support were in ICE mode, unbeknownst to the target system, they were still executing x86 ICE program instructions. In these processors, ICE mode was an alternate operating mode of the CPU, with its own dedicated program and memory space – exactly like System Management Mode (SMM) is to the Pentium. But unlike these previous x86 processors, Pentium probe mode is truly a static state whereby prefetch and decode does not occur at any level, for any purpose. Probe mode instructions exist to examine and modify registers, memory, and I/O space. These probe mode instructions are fed directly into the Pentium’s execution unit(s), thereby bypassing the prefetch and decode stages altogether. These probe mode instructions represent pre-decoded instructions that must be in the same format as the Pentium’s internal micro-word format.

Probe Mode Implementation

Probe mode is implemented via exten- sions to the boundary-scan instruction set, two pins, and three probe mode registers. The boundary-scan extensions that support probe mode include the instructions Begin Probe Mode, End Probe Mode, Build Probe Instruction, Execute Probe Instruction, and Access Probe Data Register. The probe mode boundary-scan instructions are marked as Private Instructions in the Pentium boundary-scan instruction set summary (see the Pentium Processor Family Developer’s Manual, Volume 1, Chapter 11). The two pins defined to support probe mode are R/S# and PRDY. These pins are described (somewhat incompletely) in the Pentium data sheet. The registers used to support probe mode are the Probe Instruction Register (PIR), Probe Data Register (PDR), and the Probe Mode Control Register (PMCR).

When in probe mode, the processor is in a dormant state. Prefetch and decode do not occur. Any exceptions, NMI, or external interrupts that are pending or may become pending are not serviced until termination of probe mode. Snoops, cache line fills, and writebacks may occur during probe mode, since probe mode may perform memory-based operations with the cache enabled.

Entering and Exiting Probe Mode

Probe mode is entered by three possible methods. First, the processor may receive a Begin Probe Mode instruction from the boundary scan instruction set. The processor immediately halts execution at the next instruction boundary, and asserts PRDY. Once PRDY is asserted, the processor is ready to receive probe mode instructions, via the boundary-scan mechanism. To exit probe mode, you must execute the boundary-scan instruction Exit Probe Mode. Otherwise external hardware must force R/S# from high to low, then to high again. It is this low-to-high transition that forces the Pentium to exit probe mode.

Second, probe mode may be entered when external hardware asserts R/S#. The processor will respond by asserting PRDY when it is ready to accept probe mode instructions. To exit probe mode, external hardware must force R/S# back to high. The boundary-scan instruction End Probe Mode will not work to exit probe mode for this entrance method. Since R/S# was asserted to enter probe mode, forcing it high is not only a sufficient means to exit, it is the only means to exit (see Pentium Processor Family Developer’s Manual, Volume 1, Chapter 31).

The Creation of Appendix


About now, you're wondering: What do a discussion of in-circuit emulation and Appendix H have in common? The connection is stronger than you might think.

Long before Intel created the infamous Appendix H, it had released a set of confidential documents, called "gold books," to its most trusted customers. These gold books contained pre-production information about its upcoming Pentium Processor, which was still called the "P5." The name "Pentium" hadn't yet been invented. This gold book was entitled the P5 External Design Specification (rev 2), and contained details of its new P5 processor. The book contained the average run-of-the-mill stuff like new instructions and new features. It even contained all of the information that was later removed and put in Appendix H. But, most importantly, this gold book contained the complete description of Intel's probe mode -- the complete description of how the new in-circuit emulators would work.

Glasnost

As you can imagine, I was elated. I immediately called my Intel representative to congratulate him on Intel's new Glasnost policy. Being a proponent of easy access to information, he was just as elated that Intel had finally decided to come out of the closet. A few months later, Intel released a new version of the P5 External Design Specification (rev 3). In this version of the document, Intel completed the description of probe mode. (It was vaguely outlined in the previous revision of this manual.) This document contained a complete desciption of Intel's internal microword format. Still, the most exciting aspect of this document, was that Intel was planning to publicly release all of this information in its various Pentium manuals.

But Glasnost didn't last very long. In fact, it didn't even last until the market introduction of the Pnetium. Intel abruptly recalled all of those gold books, and issued a new set. The new set was divided into two documents: The P5 External Design Specification (rev 4), and the Supplement to the Pentium Processor User's Manual; hence, Appendix H was born.

Like all revision before it, revision 4 represented a draft version of what the public would actually see in various Pentium manuals. Revision 4 didn't contain any explanation for the new Pentium features, such as 4-MB paging (see DDJ, May 1996, or http://www.x86.org/ddj/May96), Virtual Mode Extensions (see http://www.x86.org/articles/vme1)or Model-Specific Register (see http://www.sandpile.org) All of these explanations were moved into the new Supplement manual. But missing from all of these new manuals was the description of probe mode.

When the Pentium manuals finally became public, it became obvious that the descriptions of these advanced features had been hastily removed. The Pentium manual contained nearly 50 references to VME, MSRs, Performance Monitor Counters, 4-MB paging, and even 2-MB paging (a feature removed before the Pentium reached production; see DDJ, July 1996, or http://www.x86.org/ddj/Jul96) These references gave programmers enough information to reverse engineer these features.

Reverse Engineering

For the next three years, Intel maintained its policy of keeping this information secret -- only available by signing a 15-year nondisclosure agreement. Thanks to a community of Internet hackers and web sites devoted to publishing their efforts, all Appendix H features were ultimately reverse engineered, and made public -- without Intel's help. The futility of keeping this information secret has forced Intel to publish it themselves. Between today's Pentium and Pentium Pro manuals, a near-complete description of the Appendix H features can be obtained. There is just once exception. When the hackers attacked the Performance Monitor Counters, some counters were nevery successfully reverse engineered. Hence, their list of counters had a few holes in it. Intel's published list is a perfect one-to-one match of the list that was reverse engineered. This would indicate that Intel's renewed Glasnost wasn't the result of a benevolent dictator. Instead, it appears that Intel only published what it was forced to publish by the efforts of the hacker community.

A much more humerous version of the "Creation" story, may be found by reading The Story of the CISC King and Appendix H.

Third, the Pentium may enter probe mode whenever a debug exception occurs. For this to occur, the Probe Mode Control Register (PMCR) must be set to allow a debug exception to cause entry into probe mode. When the PMCR is set in such a manner, any debug exception which occurs will cause the Pentium to enter probe mode. (See http://www.x86.org/articles/pmcr/ for details of the Probe Mode Contro1 Register.) Normally, debug exceptions cause the debug exception handler to be invoked. However, when the PMCR is set appropriately, the Pentium enters probe mode instead of invoking the microprocessor debug exception handler. The entire range of debug exceptions are capable of triggering an entrance to probe mode: A debug register breakpoint is detected; a single-step trap occurs; a task switch occurs into a TSS whose T-bit is set; DR7.GD=1 and there was an attempt to access one of the debug registers; or a debug exception instruction was executed – ICEBP. (Refer to http;//www.x86.org/secrets/opcodes/ICEBP.html for details of the undocumented ICEBP instruction.) When PMCR[0]=1, the occurrence of any of these conditions will cause the Pentium to enter probe mode. Once one of these conditions occurs, the Pentium will immediately enter probe mode, assert PRDY, and then is ready to accept probe mode instructions. To exit probe mode, execute the boundary-scan instruction End Probe Mode, or external hardware must force R/S# from high to low, then to high again.

Probe Mode Instructions

Probe instructions are composed in the Probe Instruction Register (PIR). The PIR has full control over both the Pentium execution units (U-pipe and V-pipe) and the FPU core. The PIR format is logically split between the U-pipe and V-pipe, or the FPU pipe and V-pipe. Therefore, two microcoded instructions may be submitted simultaneously using the PIR. Composing probe mode instructions may require writing the PIR multiple times. If such is the case, the PIR is updated after each Build Probe Instruction boundary-scan instruction is executed. Once the microinstruction for each pipe is completely composed; issuing the Execute Probe Instruction will cause execution of the probe mode instruction.

Conclusion

Pentium probe mode is highly dependent on the dual pipe architecture of the Pentium. Probe mode provides a non-intrusive method to read and write any aspect of the microprocessor state, memory space, or I/O space. When in probe mode, the Pentium remains in a dormant state, waiting to accept probe mode instructions until it is instructed to exit probe mode. Resumption from probe mode is equally non-intrusive, as no states of the processor have changed unless ordered to do so by probe mode instructions. Probe mode instructions are composed and fed directly into the U-pipe, V-pipe, and FPU of the Pentium. Instructions exist to examine and modify any internal register, including MSRs. There are no protection checks against probe mode instruction operands, and the results of submitting errant operands is indeterminate at this time. Probe mode provides an ideal means to implement a hardware-based debugger, due to its non-intrusive nature.

As you can see, probe mode represents a radical departure from Intel’s previous ICE mode implementations. When the previous microprocessors were in ICE mode, they were actually in an alternate (undocumented) operating state of the CPU. This alternate state still performed prefetches and executions, but used undocumented pins on the CPU, which made the CPU appear to be in a dormant state. The Pentium ICE mode is truly a dormant state. Prefetches and executions don’t occur on any level. Instead, microinstructions are fed directly into the U-pipe, V-pipe, and FPU, thereby bypassing the decode unit of the microprocessor. Obviously, Intel has continued to evolve ICE mode on its microprocessors. Through trial and error and considerable experience and expertise, Intel has come up with an ICE implementation that is simple and powerful, and therefore profoundly elegant.


Back to Dr. Dobb's Undocumented Corner home page