
Radeon X1800 XT

The ATI RADEON X1800 is the world's first commercially produced graphics processor using the 0.09 micron process technology at TSMC.

ATI had little room to maneuver with names, and only at the high end: only the name RADEON X900 remained vacant, while almost all other options were already taken because, unlike NVIDIA, ATI Technologies used three-digit numbers in the product names of the RADEON X line. The way out turned out to be simple and, at the same time, quite elegant: 1000 was added to the numerical designations of the new products. Thus, the new ATI graphics processors were named RADEON X1800, RADEON X1600 and RADEON X1300. In our opinion, this is a rather successful move: it leaves plenty of room for further maneuvers with names and also signals that we are looking at a new generation of architecture.

This time it's really true: ATI and NVIDIA have switched roles. If the NVIDIA G70 is nothing but a significantly improved NV40 (evolution), then the RADEON X1000 is indeed a completely new architecture that has little in common with previous ATI architectures (revolution).
Moreover, the top model in the family, the RADEON X1800 (R520) chip, turned out to be more complex than the NVIDIA G70: 320 against 302 million transistors! At the same time, the RADEON X1600 (RV530), aimed at the mid-range segment of the market, consists of 157 million transistors, while the RADEON X1300 (RV515) became, according to the developers, the first entry-level chip with about 100 million transistors inside.

The reason for the complexity of the architecture was a whole set of innovations in the chip, including features such as:
Shader Model 3.0 support;
Upgraded shader processors with a special block to execute branch instructions;
New memory controller;
Upgraded cache system;
Upgraded system of internal connections of different blocks of the chip.

The different RADEON X1000 models now differ in more than just the number of pixel and vertex processors, which makes it possible to achieve an optimal price/performance ratio. As usual, the lower-performing versions of the new GPU have names starting with RV.

The RADEON X1000 family will be represented on the market by the following models of video adapters:
RADEON X1800 XT (R520, 625/1500 MHz, 16pp, 8vp, 256-bit, 256/512 MB);
RADEON X1800 XL (R520, 500/1000 MHz, 16pp, 8vp, 256-bit, 256 MB);
RADEON X1600 XT (RV530, 590/1380 MHz, 12pp, 5vp, 128-bit, 128/256 MB);
RADEON X1600 PRO (RV530, 500/780 MHz, 12pp, 5vp, 128-bit, 128/256 MB);
RADEON X1300 PRO (RV515, 600/800 MHz, 4pp, 2vp, 128-bit, 256 MB);
RADEON X1300 (RV515, 450/500 MHz, 4pp, 2vp, 128-bit, 128/256 MB);
RADEON X1300 HyperMemory (RV515, 450/1000 MHz, 4pp, 2vp, 128-bit, 32 MB, up to 128 MB HyperMemory).

Pixel processors

Since ATI has paid great attention to distributing work between the different execution units, the new RADEON X1000 architecture has become truly multi-threaded and has even received a special name - Ultra-Threaded Architecture. The analogy with Intel Hyper-Threading is quite appropriate here, since the goals of these technologies are similar: the most efficient use of the available processor power and the greatest possible reduction in idle time of the execution units.

The RADEON X1000 (R5xx) architecture has similarities both with the RADEON 9000 (R3xx) and RADEON X800 (R4xx) architectures and with the completely new architecture used in the Xbox 360 GPU. However, the new ATI processors also contain a number of unique features that have no analogues in other chips.

In particular, RADEON X1000 chips have a built-in intelligent switch - a special unit called the Ultra-Threading Dispatch Processor, which is responsible for optimal load distribution between quads of pixel processors (each quad consists of four pixel processors, each of which is able to process a shader for a 2x2 pixel block per clock), as well as between texture units. The Ultra-Threading Dispatch Processor splits the pixel processing workload associated with the same pixel shader into small groups, or threads, of 4x4 pixels.
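The splitting of a pixel workload into 4x4 threads can be pictured with a minimal Python sketch. The function name `split_into_threads` and the tuple layout are hypothetical and chosen only for illustration; the text does not describe how the hardware handles regions whose size is not a multiple of four, so the clipping of edge tiles here is an assumption.

```python
def split_into_threads(width, height, tile=4):
    """Cover a width x height pixel region with tile x tile blocks.

    Returns a list of (x, y, w, h) tuples. Blocks on the right and
    bottom edges are clipped to the region (an assumption of this sketch).
    """
    threads = []
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            w = min(tile, width - x)
            h = min(tile, height - y)
            threads.append((x, y, w, h))
    return threads


if __name__ == "__main__":
    threads = split_into_threads(10, 6)
    print(len(threads), "threads")
    print("first:", threads[0], "last:", threads[-1])
```

Each tuple then stands for one unit of work that a dispatcher could hand to a quad of pixel processors.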

The Ultra-Threading Dispatch Processor recognizes when any of the pixel processors inside the quads are idle and instantly assigns them new tasks. When a shader needs data that has not yet arrived in order to continue, the arbitration processor suspends that thread until the data is received, freeing the arithmetic logic units (ALUs) for other threads and masking the latency of, for example, texture fetches from both cache and memory. According to ATI, this organization of work allows the pixel processors to reach 90% utilization in any shader.
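The effect of suspending waiting threads can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not ATI's scheduling algorithm: a single ALU is shared by several threads, each thread alternates a few cycles of arithmetic with a texture fetch, and the cycle counts (4 ALU cycles, 12 cycles of fetch latency) are invented for the example.

```python
def simulate(num_threads, alu_cycles=4, fetch_latency=12, rounds=3):
    """Return the utilisation of one ALU shared by num_threads threads.

    Each thread repeats `rounds` times: alu_cycles of arithmetic, then a
    texture fetch during which it is suspended for fetch_latency cycles.
    All figures are illustrative assumptions.
    """
    rounds_left = [rounds] * num_threads
    alu_left = [alu_cycles] * num_threads
    ready_at = [0] * num_threads       # cycle at which a thread may run again
    cycle = 0
    busy = 0
    while any(r > 0 for r in rounds_left):
        # The dispatcher picks the first thread that has work and is not waiting.
        for t in range(num_threads):
            if rounds_left[t] > 0 and ready_at[t] <= cycle:
                alu_left[t] -= 1
                busy += 1
                if alu_left[t] == 0:
                    # Arithmetic done: issue the fetch and suspend the thread.
                    rounds_left[t] -= 1
                    alu_left[t] = alu_cycles
                    ready_at[t] = cycle + 1 + fetch_latency
                break
        cycle += 1                     # the ALU idles if no thread was ready
    return busy / cycle


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(f"{n} thread(s): ALU utilisation {simulate(n):.2f}")
```

With a single thread the ALU sits idle during every fetch; as more threads become available to switch to, the idle gaps are filled, which is the latency masking the paragraph describes.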

Since fast switching between threads requires saving the intermediate results of each one, ATI uses special registers for this - the General Purpose Register Array - with a high-speed connection to the pixel processors, something we have already seen in previous graphics processors. It is not yet clear how many registers the RADEON X1800, X1600 and X1300 have or how sensitive the new chips are to the complexity of pixel shaders.

According to the Shader Model 3.0 standard, loops, branches and subroutines are fully supported by the new ATI solutions, and the use of flow control allows them to execute shaders of almost unlimited length. All calculations are performed by RADEON X1000 family processors in 128-bit FP format, which virtually eliminates the possibility of error accumulation and, as a result, deterioration of image quality.

The number of simultaneously executing code threads was increased, and the size of each, on the contrary, was reduced to 4x4 pixels, which made it possible to achieve greater efficiency when using dynamic branching, the principle of which is well illustrated by the following diagram:

[Diagram: dynamic branching efficiency at different thread sizes]

The advantage of the ATI approach is obvious: with a larger thread size, the efficiency of dynamic branching drops significantly, and at 64x64 pixels its use becomes unjustified. The top member of the family, the RADEON X1800 (R520), is capable of executing up to 512 threads of shader code simultaneously, while the less powerful models are limited to 128 threads.
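Why thread size matters for dynamic branching can be shown with a small Python sketch. It assumes a simplified model that the text does not spell out: all pixels of a thread run in lockstep, so a thread whose pixels disagree on the branch condition must execute both paths. The 256x256 region and the circular branch condition are invented test data.

```python
def divergent_pixel_fraction(size, tile, condition):
    """Fraction of pixels lying in tiles whose pixels take different branches."""
    divergent = 0
    for ty in range(0, size, tile):
        for tx in range(0, size, tile):
            y_end = min(ty + tile, size)
            x_end = min(tx + tile, size)
            outcomes = {condition(x, y)
                        for y in range(ty, y_end)
                        for x in range(tx, x_end)}
            if len(outcomes) > 1:      # both branch paths must be executed
                divergent += (y_end - ty) * (x_end - tx)
    return divergent / (size * size)


def inside_circle(x, y, cx=128, cy=128, r=80):
    """Hypothetical per-pixel branch condition: is the pixel inside a circle?"""
    return (x - cx) ** 2 + (y - cy) ** 2 <= r * r


if __name__ == "__main__":
    for tile in (4, 16, 64):
        frac = divergent_pixel_fraction(256, tile, inside_circle)
        print(f"{tile}x{tile} threads: {frac:.1%} of pixels pay for both branches")
```

In this model only tiles that straddle the circle's edge are penalised, so small tiles confine the cost to a thin band of pixels, while large tiles drag most of the region into executing both paths.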

Vertex Processors

The design of the RADEON X1000 vertex processors is very similar to that of the NVIDIA GeForce 7: each processor consists of two units, vector and scalar. The difference is that both ALUs in the G70 vertex processor are 32-bit, while the vector ALU in the corresponding RADEON X1000 processor is 128 bits wide. This advantage makes it possible to use the graphics chip to emulate central processors.

The new vertex processors can execute 2 instructions per clock, and the shader length can reach 1024 instructions in the normal case and be almost infinite when using flow control. Of course, the RADEON X1000 vertex processors fully comply with the Shader Model 3.0 specifications.

Memory controller

The memory controller included in the new ATI GPUs has been completely redesigned. Now the RADEON X1800 internal memory bus has a ring topology and consists of two 256-bit opposing ring buses, while the RADEON X1600 ring topology consists of a pair of opposing 128-bit buses.

Ring buses that run around the entire die make it possible to simplify and optimize the routing of conductors inside it, connecting the components by the shortest possible path. This solution, coupled with the use of a switch during memory write operations, minimizes delays and signal distortion. Thanks to the Ring Bus technology, the RADEON X1800/X1600 can easily use even the highest-frequency memory, such as GDDR4, which with a traditional architecture could lead to unstable operation due to interference caused by non-optimal routing of the corresponding conductors inside the GPU.

The memory is connected to the buses through so-called "ring stops" (Ring Stop). There are four such stops, each with two 32-bit wide memory access channels. For comparison, in the RADEON X850 the memory was connected to the controller via four 64-bit channels. Each Ring Stop, on the memory controller's instructions, passes data to the client that requested it.

The principle of operation of the Ring Bus memory subsystem is quite simple. The client sends a request for data to the memory controller, which is located in the middle of the chip. The memory controller ranks the requests according to a certain algorithm and gives priority to the one that affects performance the most. It then sends the corresponding request to the memory chips, and the returned data travels over the Ring Bus to the Ring Stop nearest the client, which passes it on to the client. For optimal memory access, the so-called Write Crossbar Switch is located around the controller itself and allows requests to be distributed evenly.
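How two opposing rings shorten the path to a ring stop can be sketched as follows. The text gives the number of stops (four) and says the two rings run in opposite directions; it does not say how a direction is chosen, so the shortest-path rule below, the stop numbering and the function name `route` are assumptions made purely for illustration.

```python
NUM_STOPS = 4  # four ring stops, as described in the text


def route(src, dst, num_stops=NUM_STOPS):
    """Pick the ring with fewer hops from stop `src` to stop `dst`.

    Returns (direction, hops), where direction is "cw" or "ccw".
    Ties go to the clockwise ring (an arbitrary choice in this sketch).
    """
    cw_hops = (dst - src) % num_stops    # travelling on the clockwise ring
    ccw_hops = (src - dst) % num_stops   # travelling on the opposing ring
    if cw_hops <= ccw_hops:
        return ("cw", cw_hops)
    return ("ccw", ccw_hops)


if __name__ == "__main__":
    for dst in range(NUM_STOPS):
        print(f"stop 0 -> stop {dst}:", route(0, dst))
    worst = max(route(s, d)[1] for s in range(NUM_STOPS) for d in range(NUM_STOPS))
    print("worst case with two rings:", worst, "hops")
```

With a single one-directional ring the worst case between four stops would be three hops; a second ring running the other way cuts that to two in this model.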

Improvements have also been made to HyperZ technology - now a more advanced algorithm is used when determining invisible areas to be clipped. It increased the culling efficiency of hidden surfaces by 50% compared to the RADEON X850.

HDR

The new generation of ATI graphics processors has received full support for high dynamic range display modes, collectively known as HDR.

While developing the new architecture, ATI Technologies tried to take into account all the shortcomings, and the RADEON X1000 graphics processors received the widest possibilities for working with HDR, with support for various formats, including custom (non-standard) ones. In addition, the RADEON X1000 is the first to be able to use HDR simultaneously with full-screen anti-aliasing. Compared to NVIDIA GeForce 6/7, this is a huge step forward, but will the performance of the new GPUs be enough to ensure comfortable gaming in these modes? Only test results can answer that question. At least it is now clear why the R520 graphics processor, the top model in the new ATI family, turned out to be more complex than the NVIDIA G70: none of the above architectural innovations came for free, and each demanded its share of transistors on the chip.

The ATI RADEON X1800 was the world's first commercially produced graphics processor made on the 0.09-micron process technology at TSMC. This chip is also by far the most complex in the 3D industry: it consists of 320 million transistors, slightly more than its closest competitor, the NVIDIA G70. Although the complexity of the RADEON X1800 is quite high, the finer process technology made it possible to operate at frequencies of up to 625 MHz, which was previously unattainable.

Although the RADEON X1800 has twice as many transistors as its predecessor, the RADEON X800, the number of pixel processors has not been increased, and there are still 16 of them in the new chip. Instead of increasing their number, ATI equipped the new GPU with a special block called the Ultra-Threading Dispatch Processor, which is responsible for distributing the load between the pixel processors and increasing their efficiency. ATI stated that this approach allows achieving 90% efficiency in the execution of any pixel shader.

 

Specifications: ATI Radeon X1800 XT

Name                               Radeon X1800 XT
Core                               R520
Process technology (µm)            0.09
Transistors (million)              321
Core frequency (MHz)               625
Memory frequency (MHz, DDR)        750 (1500)
Memory type and bus width          GDDR3, 256-bit
Memory bandwidth (GB/s)            48
Pixel pipelines                    16
TMUs per pipeline                  1
Textures per clock                 16
Textures per pass                  16
Vertex pipelines                   8
Pixel Shaders                      3.0
Vertex Shaders                     3.0
Fill rate (Mpix/s)                 10000
Fill rate (Mtex/s)                 10000
DirectX                            9.0c
Anti-aliasing (max)                MS 6x
Anisotropic filtering (max)        16x Quality
Memory (MB)                        256/512
Interface                          PCI-E
RAMDAC (MHz)                       2 x 400
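The bandwidth and fill-rate figures in the specifications follow from the clock speeds, bus width and pipeline count listed alongside them. A short Python check, using only numbers from the table and the usual peak-rate formulas (the function names are hypothetical):

```python
def memory_bandwidth_gb_s(bus_bits, effective_mhz):
    """Peak bandwidth: bytes per transfer times transfers per second."""
    bytes_per_transfer = bus_bits / 8
    return bytes_per_transfer * effective_mhz * 1e6 / 1e9


def fill_rate_mpix_s(core_mhz, pixel_pipelines):
    """Peak pixel fill rate: one pixel per pipeline per clock."""
    return core_mhz * pixel_pipelines


def fill_rate_mtex_s(core_mhz, pixel_pipelines, tmus_per_pipeline):
    """Peak texel fill rate: one texel per TMU per clock."""
    return core_mhz * pixel_pipelines * tmus_per_pipeline


if __name__ == "__main__":
    print("Bandwidth, GB/s:  ", memory_bandwidth_gb_s(256, 1500))
    print("Fill rate, Mpix/s:", fill_rate_mpix_s(625, 16))
    print("Fill rate, Mtex/s:", fill_rate_mtex_s(625, 16, 1))
```

These are theoretical peaks: 256 bits at an effective 1500 MHz gives 48 GB/s, and 16 pipelines at 625 MHz give 10000 Mpix/s, matching the table.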

Although the RADEON X1800 family as a whole deserved high praise for its performance and feature set, it still arrived almost a quarter later than the GeForce 7800 GTX, which has similar features and performance.

F.E.A.R.
