White Paper: AGP - The new interface

Graphics adapters that make good use of the AGP interface can dramatically improve performance

During 3D operations such as texture mapping, a large amount of memory is required to store all the textures that may be needed. To reduce the amount of memory required in the graphics subsystem, designers have long wanted to use the system memory instead. The problem is that the PCI bus doesn't offer enough bandwidth for fast transfers between the system memory and the graphics accelerator. The AGP (Accelerated Graphics Port) was created to solve this problem. The AGP is basically an accelerated PCI bus with particular bus mastering capabilities. Textures may now be stored in the central memory, which reduces the overall cost of the graphics subsystem. Being a modified PCI bus, the AGP uses the standard PCI signals but also implements some additional pins. There are several levels of AGP:

AGP 66: This is very similar to the PCI 32-bit, 66 MHz, 3.3 V level, and is implemented in the new HP Kayak XA PC Workstation.

Basic AGP: This is the AGP 66 level plus three new signals. (Basic AGP graphics engines were not in the price range of the HP Kayak XA PC Workstation at development time.)

Full AGP: This is the Basic AGP plus a separate address bus (seven new signals). This allows faster transfers as addresses and data are no longer multiplexed. Full AGP is implemented in the new HP Kayak XU PC Workstation.

Double clocking: This is the Full AGP plus two new signals, and it runs at 133 MHz in AGP master mode. The AGP master mode is not supported in Windows 95 or in Windows NT 4.0.

The AGP master mode

The feature that makes the AGP so powerful is its master mode capability. In standard PCI transfers, the bus master, which can be the host processor, owns the bus until the completion of the transaction. This means that the transaction initiator must wait until its request is served.

As explained earlier, the graphics engine accesses the system memory to fetch the textures stored there. In doing so it may encounter long latencies, as many other agents, including the host processor, are constantly accessing the central memory.

To avoid this waiting time, the AGP master mode was created. In this mode, the graphics engine issues a request to the AGP bridge (the chipset), then releases the bus and goes back to its current task. The chipset is then in charge of finding a time slot to transfer the data from memory. In this way, the CPU is relieved of heavy programmed I/O transfers and is also disturbed less often during its normal communication with the memory.

This can dramatically improve 3D graphics performance, particularly when large textures are used. As soon as the graphics engine knows that it will need a texture that is stored in central memory, it issues a request to the chipset and then continues with its current job. It doesn't wait for an immediate answer. At the same time, the CPU is not disturbed by the transfer of the texture to the graphics subsystem, because the chipset tries to read data from memory when the CPU is not accessing it. This results in a major time optimisation for both the CPU and the graphics subsystem.
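The benefit of issuing a request and carrying on can be illustrated with a toy timing model. This is only a sketch under simplifying assumptions, not a description of the actual AGP protocol: the function names, the cycle counts, and the idea that the chipset delivers textures strictly one after another are all hypothetical.

```python
def blocking_time(n_textures, latency, work):
    """PCI-style model: the engine waits for each texture, then processes it."""
    return n_textures * (latency + work)


def split_transaction_time(n_textures, latency, work):
    """Master-mode model (assumed): all requests are issued up front.

    The chipset delivers one texture every `latency` cycles while the
    engine keeps working on textures that have already arrived.
    """
    arrival = 0  # cycle at which the most recent texture arrived
    done = 0     # cycle at which the engine finished its last texture
    for _ in range(n_textures):
        arrival += latency
        # Start as soon as both the data and the engine are available.
        done = max(done, arrival) + work
    return done


# Illustrative numbers only: 4 textures, 10 cycles of memory latency,
# 15 cycles of processing per texture.
waiting = blocking_time(4, 10, 15)             # 100 cycles
overlapped = split_transaction_time(4, 10, 15)  # 70 cycles
```

In this model only the first memory latency is exposed; later fetches overlap with processing, which is the effect the paragraph above describes.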

The GART and operating system support

The system memory is organised in pages of 4 KB. Some of them are allocated to the graphics subsystem for storing textures. However, to simplify implementation, the graphics engine addresses its allocated pages of memory through a "linear" (or "logical") addressing mode. This means that it issues logical addresses to the chipset that are not the actual addresses of the requested data. The chipset uses a translation table, known as the GART (Graphics Address Remapping Table), to convert the logical addresses into real addresses.
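The translation step can be sketched as a simple page-table lookup. This is a minimal illustration, assuming one table entry per 4 KB page and a contiguous logical range starting at a base address; the class name, the base address, and the physical page addresses are hypothetical and do not reflect any particular chipset.

```python
PAGE_SIZE = 4096  # 4 KB pages, as described in the text


class Gart:
    """Toy translation table: logical page number -> physical page base."""

    def __init__(self, logical_base):
        self.logical_base = logical_base  # start of the linear range (assumed)
        self.table = []                   # index = logical page number

    def map_page(self, physical_base):
        """Register one physical page; return its logical address."""
        if physical_base % PAGE_SIZE != 0:
            raise ValueError("physical page must be 4 KB aligned")
        self.table.append(physical_base)
        return self.logical_base + (len(self.table) - 1) * PAGE_SIZE

    def translate(self, logical_address):
        """Convert a logical address into the real memory address."""
        relative = logical_address - self.logical_base
        if relative < 0:
            raise ValueError("address below the logical range")
        page, offset = divmod(relative, PAGE_SIZE)
        if page >= len(self.table):
            raise ValueError("address not mapped in the GART")
        return self.table[page] + offset


# Three scattered physical pages appear contiguous to the graphics engine.
gart = Gart(logical_base=0xE0000000)
for phys in (0x0012F000, 0x00047000, 0x003A2000):
    gart.map_page(phys)

real = gart.translate(0xE0001010)  # second page, offset 0x10 -> 0x47010
```

The graphics engine only ever sees the contiguous logical range; the scattered physical pages stay hidden behind the table.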

The translation information in the GART needs to be maintained by the operating system. However, Windows 95 and Windows NT 4.0 don't manage the GART at all. This means that the host processor must itself drive all transfers to the graphics subsystem through programmed I/O. The first GART driver will be included in DirectX 5.0. "Memphis" and Windows NT 5.0 will also support the GART.

Performance impact of the AGP

Under Windows 95 and Windows NT 4.0, the AGP offers a bandwidth twice as large as that of the standard PCI bus. Transfers to the graphics subsystem are consequently faster. The user will notice the difference between an AGP and a PCI graphics adapter, particularly when large images are loaded into the frame buffer (that is, displayed on the screen). This is the case, for example, when new windows are opened. The difference is even more apparent when PCI transfers occur while the frame buffer is being accessed. This happens regularly when applications are loaded, when new documents are opened, or when video clips are played back.
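The factor of two follows from the clock rates. As a rough check, theoretical peak bandwidth can be computed as bus width times clock rate. The sketch below assumes a 32-bit data path in every case and a 33 MHz clock for standard PCI (the PCI clock is not stated in this paper); real sustained throughput is lower than these peak figures.

```python
def peak_bandwidth_mb_s(bus_width_bits, clock_mhz):
    """Theoretical peak: one full-width transfer per clock, in MB/s."""
    return bus_width_bits / 8 * clock_mhz


PCI = peak_bandwidth_mb_s(32, 33)          # 132.0 MB/s (33 MHz is assumed)
AGP_66 = peak_bandwidth_mb_s(32, 66)       # 264.0 MB/s
AGP_DOUBLE = peak_bandwidth_mb_s(32, 133)  # 532.0 MB/s, double clocking

ratio = AGP_66 / PCI                       # 2.0
```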

AGP and Pentium II enabling software-only MPEG2 decoding

The combination of the Pentium II processor and the AGP bus enables software-only MPEG2 decoding. MPEG2 is a new standard for video compression that achieves both very high compression ratios and great video quality. MPEG2 is the compression type used to store movies on a DVD. (A DVD, or Digital Video Disk, is simply a new type of CD-ROM that features a higher storage capacity.) However, decoding an MPEG2 file requires a lot of processing power and also a large bandwidth to transfer the decoded video to the graphics subsystem. Until now, a dedicated and expensive board was needed to correctly play an MPEG2 file from a DVD. As the processor was not able to decode MPEG2 files quickly enough, a dedicated Digital Signal Processor was required to perform the MPEG2 decoding. This is the so-called hardware-assisted DVD solution.

Today, the Pentium II has enough processing power to decode MPEG2 well. However, the limiting factor for good video quality now becomes the PCI bus. When trying to play a DVD movie on a Pentium II without the AGP bus implemented, the frame rate is poor. With the AGP, however, up to 25 frames per second can be displayed, which is real TV quality.

Compiled by Ajith Ram

(c) 1997 Hewlett-Packard
