White Paper: A new era in desktop 3D graphics

3DNow! technology holds out the promise of enhancing desktop 3D graphics without requiring changes to the underlying operating system

3DNow! is a significant extension to the x86 architecture that underpins today's personal computers. 3DNow! technology is a set of instructions that relieves the traditional processing bottlenecks in floating-point-intensive and multimedia applications. With 3DNow! technology, hardware and software applications can deliver more powerful solutions and create a more entertaining and productive PC platform. Examples of the improvements that 3DNow! technology enables are faster frame rates on high-resolution scenes, much better physical modeling of real-world environments, sharper and more detailed 3D imaging, smoother video playback, and near theatre-quality audio.

3DNow! was defined and implemented in collaboration with independent software developers, including operating system designers, application developers and graphics vendors. It is compatible with existing x86 software and requires no operating system support, so 3DNow! applications work with all existing operating systems.

The 3DNow! technology instructions are intended to open a major processing bottleneck in 3D graphics applications: floating-point operations. Today's 3D applications are limited because even the most advanced x86 processors have only one floating-point execution unit. The front end of a typical 3D graphics software pipeline performs object physics, geometry transformations, clipping and lighting calculations. These computations are very floating-point intensive and often limit the features and functionality of a 3D application.

The performance of the 3DNow! instructions comes from their single instruction, multiple data (SIMD) implementation. With SIMD, each instruction operates on two single-precision floating-point operands. In addition, the microarchitecture of the AMD-K6-2 and K6-3 processors can execute up to two 3DNow! instructions per clock through two register execution pipelines, which allows for a total of four floating-point operations per clock. Finally, because the 3DNow! instructions use the same floating-point registers as the MMX technology instructions, task switching between MMX and 3DNow! operations is eliminated.
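The arithmetic behind the four-operations-per-clock figure can be sketched in C. This is only a scalar model, not real 3DNow! code: the type packed2f and the functions packed_add and peak_flops_per_clock are hypothetical names introduced here for illustration.

```c
/* Hypothetical model of one 64-bit 3DNow! operand: two packed
   single-precision values. Not an actual hardware register type. */
typedef struct {
    float lo;  /* first packed element  */
    float hi;  /* second packed element */
} packed2f;

/* Scalar emulation of a packed add: one "instruction" performs
   two independent single-precision additions. */
packed2f packed_add(packed2f a, packed2f b)
{
    packed2f r;
    r.lo = a.lo + b.lo;
    r.hi = a.hi + b.hi;
    return r;
}

/* Peak operations per clock =
   elements per instruction x instructions issued per clock. */
int peak_flops_per_clock(int elements_per_instruction,
                         int instructions_per_clock)
{
    return elements_per_instruction * instructions_per_clock;
}
```

With two elements per instruction and two instructions per clock, peak_flops_per_clock(2, 2) gives the figure of four quoted above. This is a peak rate; sustained throughput depends on how well the code keeps both pipelines busy.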

The 3DNow! technology instruction set contains 21 instructions that support SIMD floating-point operations and includes SIMD integer operations, data prefetching and faster MMX-to-floating-point switching. To improve MPEG decoding, the 3DNow! instructions include a specific SIMD integer instruction created to facilitate pixel-motion compensation.

Because media-based software typically operates on large data sets, the processor often needs to wait for this data to be transferred from main memory. The extra time involved in retrieving this data can be avoided by using the new 3DNow! instruction called PREFETCH. This instruction can ensure that data is in the level 1 cache when it is needed. To reduce the time it takes to switch between MMX and x87 code, the 3DNow! instructions include the fast entry/exit multimedia state (FEMMS) instruction, which eliminates much of the overhead involved with the switch. The addition of 3DNow! technology expands the capabilities of the AMD-K6 family of processors and enables a new generation of enriched user applications.
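The idea of requesting data before it is needed can be illustrated in C. The sketch below does not emit the 3DNow! PREFETCH instruction directly. It uses the GCC/Clang builtin __builtin_prefetch as a stand-in, and the compiler decides which machine instruction, if any, to generate. The function name sum_with_prefetch and the distance PREFETCH_AHEAD are assumptions made for this example.

```c
#include <stddef.h>

/* Hypothetical prefetch distance, in elements. A suitable value depends
   on the cache line size and memory latency of the target system. */
#define PREFETCH_AHEAD 16

/* Sum an array while hinting that data a few iterations ahead
   will be needed soon. The hint never changes the result. */
float sum_with_prefetch(const float *data, size_t n)
{
    float total = 0.0f;
    size_t i;

    for (i = 0; i < n; i++) {
#if defined(__GNUC__) || defined(__clang__)
        /* Arguments: address, 0 = read access, 3 = keep in all cache levels. */
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&data[i + PREFETCH_AHEAD], 0, 3);
#endif
        total += data[i];
    }
    return total;
}
```

A prefetch is only a hint, so the function returns the same sum with or without it. Whether it helps in practice has to be measured on the target processor.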

To properly identify and use the 3DNow! instructions, the application program must determine if the processor supports them. The CPUID instruction gives programmers the ability to determine the presence of 3DNow! technology on a processor. Software applications must first test to see if the CPUID instruction is supported.

The presence of the CPUID instruction is indicated by the ID bit (21) in the EFLAGS register. If this bit is writeable, the CPUID instruction is supported. Once the software has identified the processor's support for CPUID, it must test for extended functions by executing extended function 0 (EAX=8000_0000h). The EAX register returns the largest extended function input value defined for the CPUID instruction on the processor. If the value is greater than 8000_0000h, extended functions are supported.

The next step is for the programmer to determine if the 3DNow! instructions are supported. Extended function 8000_0001h of the CPUID instruction provides this information by returning the extended feature bits in the EDX register. If bit 31 in the EDX register is set to 1, 3DNow! instructions are supported. The AMD-K6-2 processor supports all of the above features. Performing these checks in sequence forms the basis of a CPU detection software routine.
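The detection sequence described above can be sketched in C as follows. This is a minimal sketch, not AMD's reference routine. It assumes a GCC or Clang compiler, whose cpuid.h header provides __get_cpuid_max and __get_cpuid; on 32-bit x86, __get_cpuid_max also performs the EFLAGS ID-bit test. The names ext_functions_supported, has_3dnow_from_regs and cpu_has_3dnow are hypothetical. The decision logic is kept separate from the hardware query so that it can be checked on any machine.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>          /* GCC/Clang helper header (assumed available) */
#define HAVE_CPUID_H 1
#endif

#define CPUID_EXT_BASE     0x80000000u  /* extended function 0         */
#define CPUID_EXT_FEATURES 0x80000001u  /* extended feature bits       */
#define EDX_3DNOW_BIT      (1u << 31)   /* EDX bit 31 = 3DNow! support */

/* Extended functions exist if the largest extended function value
   returned in EAX is greater than 8000_0000h. */
bool ext_functions_supported(uint32_t max_ext)
{
    return max_ext > CPUID_EXT_BASE;
}

/* Pure decision logic: takes register values already read from CPUID. */
bool has_3dnow_from_regs(uint32_t max_ext, uint32_t ext_edx)
{
    return ext_functions_supported(max_ext) &&
           (ext_edx & EDX_3DNOW_BIT) != 0;
}

/* Query the running processor. Reports false on non-x86 builds. */
bool cpu_has_3dnow(void)
{
#ifdef HAVE_CPUID_H
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    /* Returns 0 if CPUID itself is unavailable. */
    unsigned int max_ext = __get_cpuid_max(CPUID_EXT_BASE, NULL);
    if (!ext_functions_supported(max_ext))
        return false;

    if (!__get_cpuid(CPUID_EXT_FEATURES, &eax, &ebx, &ecx, &edx))
        return false;

    return has_3dnow_from_regs(max_ext, edx);
#else
    return false;
#endif
}
```

An application would call cpu_has_3dnow() once at start-up and select a 3DNow! or generic code path accordingly. Many later processors do not implement 3DNow!, so the result must be checked rather than assumed.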

The complete multimedia units in the AMD-K6-2 processor combine the existing MMX instructions with the new 3DNow! instructions. In addition, by merging 3DNow! with MMX, it becomes possible to write x86 programs containing integer, MMX and floating-point graphics instructions with no performance penalty for switching between the multimedia (integer) and 3DNow! (floating-point) units.

The AMD-K6-2 processor implements eight 64-bit 3DNow!/MMX registers. These registers are mapped onto the floating-point registers. The 3DNow! and MMX instructions refer to these registers as mm0 to mm7. Mapping the new 3DNow!/MMX registers onto the floating-point register stack preserves backwards compatibility for the register saving that must occur as a result of task switching.

Aliasing the 3DNow!/MMX registers onto the floating-point register stack provides a safe method to introduce 3DNow! and MMX technology, because it does not require modifications to existing operating systems. Instead, new 3DNow! and MMX technology applications are supported through device drivers, 3DNow! and MMX libraries, or Dynamic Link Library (DLL) files.

Current operating systems have support for floating-point operations and the floating-point register state. Using the floating-point registers for 3DNow! and MMX code is a convenient way of implementing non-intrusive support for 3DNow! and MMX instructions. Every time the processor executes a 3DNow! or MMX instruction, all the floating-point register tag bits are set to zero (00b=valid), except for the FEMMS and EMMS instructions, which set all tag bits to one (11b=empty).

3DNow! technology uses a packed data format. The data is packed in a single, 64-bit 3DNow!/MMX register or a quadword memory operand.
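The packed format and the tag-bit behaviour can be modelled in plain C. The sketch below assumes IEEE 754 single-precision floats and places the first element in bits 31:0 and the second in bits 63:32, which is the layout given in AMD's 3DNow! documentation. All function names (float_bits, bits_float, pack2f, unpack_elem, tag_word_fill) are hypothetical and introduced only for this illustration.

```c
#include <stdint.h>
#include <string.h>

/* Copy a float's bit pattern into a 32-bit integer (no numeric conversion). */
uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Inverse of float_bits. */
float bits_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Pack two single-precision values into one 64-bit quadword:
   elem0 in bits 31:0, elem1 in bits 63:32. */
uint64_t pack2f(float elem0, float elem1)
{
    return ((uint64_t)float_bits(elem1) << 32) | float_bits(elem0);
}

/* Extract element 0 or 1 from a packed quadword. */
float unpack_elem(uint64_t q, int index)
{
    return bits_float((uint32_t)(q >> (32 * index)));
}

/* Two-bit tag values from the text: 00b = valid, 11b = empty. */
enum { TAG_VALID = 0x0, TAG_EMPTY = 0x3 };

/* Build a 16-bit tag word with the same tag in all eight registers:
   all-valid after a 3DNow!/MMX instruction, all-empty after FEMMS/EMMS. */
uint16_t tag_word_fill(unsigned tag)
{
    uint16_t w = 0;
    int r;

    for (r = 0; r < 8; r++)
        w = (uint16_t)(w | ((tag & 0x3u) << (2 * r)));
    return w;
}
```

For example, pack2f(1.0f, 2.0f) yields 0x400000003F800000, since 1.0 and 2.0 are encoded as 3F800000h and 40000000h in single precision.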

The format of 3DNow! instruction encoding is based on the conventional x86 modR/M instruction format and is similar to the format used by MMX instructions.

Compiled by Ajith Ram

(c)AMD UK 1998
