GeForce RTX 4070 Ti

The GeForce RTX 4070 Ti is based on the AD104 GPU. It features 7680 CUDA cores delivering 40 shader teraflops at FP32 precision for graphics rendering, 240 fourth-generation Tensor Cores delivering 641 trillion sparse-matrix operations per second for AI and DLSS processing, 60 third-generation RT cores delivering 93 RT-TFLOPS for ray-traced graphics acceleration, and 12 GB of GDDR6X memory. Like all GeForce RTX 40 series GPUs, the RTX 4070 Ti includes Ada innovations such as Shader Execution Reordering (SER), a new optical flow engine, new RT cores, and DLSS 3.
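
The 40 teraflops figure can be roughly reproduced from the core count. The sketch below assumes that each CUDA core completes one fused multiply-add per clock (counted as 2 floating-point operations) and that the GPU runs at a 2.61 GHz boost clock. Neither assumption is stated in the text above, and the function and constant names are purely illustrative.

```python
def peak_fp32_tflops(cuda_cores: int, boost_clock_ghz: float,
                     flops_per_core_per_clock: int = 2) -> float:
    """Theoretical peak FP32 throughput in TFLOPS.

    cores * FLOPs-per-clock * GHz gives GFLOPS; divide by 1000 for TFLOPS.
    """
    gflops = cuda_cores * flops_per_core_per_clock * boost_clock_ghz
    return gflops / 1000.0


RTX_4070_TI_CORES = 7680   # from the text above
ASSUMED_BOOST_GHZ = 2.61   # assumption: reference boost clock, not given in the text

estimate = peak_fp32_tflops(RTX_4070_TI_CORES, ASSUMED_BOOST_GHZ)
print(f"Estimated peak FP32 throughput: {estimate:.1f} TFLOPS")
```

Under these assumptions the estimate lands at roughly 40 TFLOPS. It is a theoretical peak, not a measured result.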

NVIDIA Ada Architecture

The NVIDIA Ada architecture is a major generational leap in performance. The RTX 4070 Ti is manufactured on TSMC's custom 4N process and contains 35.8 billion transistors and 7680 CUDA cores. Its improvements include hardware-accelerated ray tracing, fourth-generation Tensor Cores for better AI performance, eighth-generation encoders with AV1 encoding and decoding support, and DLSS enhancements that deliver high frame rates both in competitive games and at ultra settings with ray tracing enabled.

NVIDIA Ada Streaming Multiprocessor

RTX graphics cards have three main processor types: programmable general-purpose CUDA cores, which run shaders and CUDA applications; RT cores, which accelerate ray intersection tests against triangles and bounding volumes (Ada's RT cores double the ray-triangle intersection rate); and Tensor Cores, a dedicated pipeline for artificial intelligence processing.

Ada improves on all three RTX processors

Programmable Shaders: 40 shader teraflops compared to 21.7 teraflops on the RTX 3070 Ti. The Ada shader processor includes an important new technology called Shader Execution Reordering (SER), which reorders work on the fly and provides up to a 2x speedup for ray tracing shaders. SER is as big an innovation for GPUs as out-of-order execution once was for CPUs.

Gen 4 Tensor Cores: The new Tensor Core in Ada includes the NVIDIA Hopper FP8 Transformer Engine delivering up to 641 FP8 precision tensor teraflops on sparse matrices in the RTX 4070 Ti for training and AI inference, up from 174 tensor teraflops on sparse matrices in the RTX 3070 Ti. Compared to FP16, FP8 reduces memory requirements by half and doubles AI performance.

Gen 3 RT Cores: The new Opacity Micromap Engine on average doubles the speed of intersection tests for alpha-tested surfaces (those that use a texture transparency test) when developers use this feature, and the new Micro-Mesh Engine increases geometric detail without the usual cost of building and storing the BVH. The RTX 4070 Ti's intersection-test throughput is 93 RT-TFLOPS, compared to 42.5 RT-TFLOPS for the RTX 3070 Ti.
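
To put the three comparisons side by side, the following minimal sketch computes the generational ratio for each processor type using only the figures quoted above. The dictionary layout and function name are illustrative choices.

```python
# (RTX 4070 Ti, RTX 3070 Ti) peak figures as quoted in the text above.
SPECS = {
    "shader TFLOPS (FP32)": (40.0, 21.7),
    "tensor TFLOPS (sparse)": (641.0, 174.0),
    "RT-TFLOPS": (93.0, 42.5),
}


def generational_speedup(new: float, old: float) -> float:
    """Ratio of the newer card's peak figure to the older card's."""
    return new / old


speedups = {
    name: generational_speedup(new, old) for name, (new, old) in SPECS.items()
}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x")
```

With these numbers the ratios come out to about 1.8x for shaders, 3.7x for Tensor Cores, and 2.2x for RT cores. They are peak theoretical figures, so they do not translate directly into frame rates.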

4th Generation Tensor Cores

Tensor Cores are high-performance compute cores specialized for the matrix multiply-and-accumulate operations used in artificial intelligence and high-performance computing. They greatly accelerate the matrix calculations that are critical for training multilayer neural networks and for running inference on trained networks. One inference example is NVIDIA DLSS 3 for gamers, where a separate neural network generates high-quality frames on the Tensor Cores. More than 250 games already support DLSS, and in them gamers can double performance with one click. Many creative apps have also begun using AI features to help artists create content faster and at higher quality: today, more than 110 popular creative applications use Tensor Core and RT core acceleration on RTX graphics cards. NVIDIA's own applications, such as Broadcast and Canvas, offer tools for noise removal, virtual backgrounds, and many other AI-powered effects for video streaming and conferencing.
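
The matrix multiply-and-add operation that Tensor Cores accelerate can be written as D = A x B + C. The plain-Python sketch below shows only the arithmetic; real Tensor Cores process small matrix tiles in hardware at reduced precision, and the function name here is illustrative rather than part of any NVIDIA API.

```python
from typing import List

Matrix = List[List[float]]


def matmul_accumulate(a: Matrix, b: Matrix, c: Matrix) -> Matrix:
    """Return D = A x B + C for row-major nested-list matrices."""
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("A's column count must match B's row count")
    if len(c) != rows or any(len(row) != cols for row in c):
        raise ValueError("C must have the same shape as A x B")

    d = [row[:] for row in c]  # start from a copy of the accumulator C
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                d[i][j] += a[i][k] * b[k][j]  # one multiply-add per step
    return d


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
c = [[1.0, 1.0], [1.0, 1.0]]
result = matmul_accumulate(a, b, c)
print(result)  # [[20.0, 23.0], [44.0, 51.0]]
```

Each inner-loop step is a single multiply-add. A neural network layer performs a very large number of them, which is why dedicated hardware for this one pattern pays off.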

The fourth-generation Ada Tensor Core builds on the Ampere generation, which added support for many new data types and introduced structured sparsity acceleration, doubling throughput over the earlier Turing Tensor Cores. Ada Tensor Cores add support for the FP8 data format, first introduced in the NVIDIA Hopper GPU architecture. Compared to FP16, FP8 halves storage requirements and doubles AI performance. With the FP8 format and the sparsity feature, the GeForce RTX 4070 Ti delivers 641 TFLOPS for AI workloads.
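
The storage saving follows directly from the element size: FP16 uses 2 bytes per value and FP8 uses 1. The sketch below estimates the memory needed for a model's weights alone. The 3-billion-parameter model is a hypothetical example, not a figure from the text, and activations and other runtime buffers are ignored.

```python
BYTES_PER_ELEMENT = {"fp32": 4, "fp16": 2, "fp8": 1}


def weight_storage_gib(num_parameters: int, dtype: str) -> float:
    """Memory needed to store the weights alone, in GiB."""
    return num_parameters * BYTES_PER_ELEMENT[dtype] / 2**30


HYPOTHETICAL_PARAMS = 3_000_000_000  # assumption: example model size

fp16_gib = weight_storage_gib(HYPOTHETICAL_PARAMS, "fp16")
fp8_gib = weight_storage_gib(HYPOTHETICAL_PARAMS, "fp8")
print(f"FP16: {fp16_gib:.2f} GiB, FP8: {fp8_gib:.2f} GiB")
```

For any parameter count the FP8 footprint is half the FP16 one, which leaves more of the card's 12 GB free for everything else.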

3rd Generation RT Cores

Ada's third-generation RT cores are dedicated hardware blocks that accelerate BVH (bounding volume hierarchy) traversal and ray-triangle intersection calculations, both of which are critical to ray tracing performance. The RT cores in RTX graphics cards operate independently: they perform all BVH traversal and intersection calculations themselves, offloading the streaming multiprocessors (SMs) and their CUDA cores and freeing them for other tasks such as pixel shading, vertex shading, and general-purpose computation.
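
For a sense of the work being offloaded, here is a software version of a single ray-triangle intersection test. It uses the well-known Möller-Trumbore algorithm as an illustration; the text does not say which method the RT cores implement internally, so this is a sketch of the kind of calculation involved, not a description of the hardware.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_triangle_intersect(origin: Vec3, direction: Vec3,
                           v0: Vec3, v1: Vec3, v2: Vec3,
                           eps: float = 1e-9) -> Optional[float]:
    """Return the distance t along the ray to the hit point, or None on a miss."""
    edge1, edge2 = sub(v1, v0), sub(v2, v0)
    pvec = cross(direction, edge2)
    det = dot(edge1, pvec)
    if abs(det) < eps:
        return None  # ray is parallel to the triangle's plane
    inv_det = 1.0 / det
    tvec = sub(origin, v0)
    u = dot(tvec, pvec) * inv_det  # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    qvec = cross(tvec, edge1)
    v = dot(direction, qvec) * inv_det  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(edge2, qvec) * inv_det
    return t if t > eps else None  # ignore hits behind the ray origin


triangle = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
hit = ray_triangle_intersect((0.25, 0.25, 1.0), (0.0, 0.0, -1.0), *triangle)
miss = ray_triangle_intersect((2.0, 2.0, 1.0), (0.0, 0.0, -1.0), *triangle)
print(hit, miss)  # 1.0 None
```

A traced scene runs tests like this against many triangles for every ray, after the BVH has narrowed down the candidates, so moving them into fixed-function hardware frees the CUDA cores for shading.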

Ada's RT cores test ray-triangle intersections twice as fast as those in NVIDIA Ampere GPUs, allowing developers to add more detail to their virtual worlds. They also include a new Opacity Micromap Engine that speeds up ray tracing of alpha-tested geometry by up to 2x, which helps with resource-intensive scenes full of vegetation and particle effects. In addition, the new RT cores include a Displaced Micro-Mesh Engine, which generates micro-meshes on the fly to create additional geometry.
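
The idea behind an opacity micromap can be sketched in a few lines. The model below is deliberately simplified and rests on assumptions: it supposes that each alpha-tested triangle is split into micro-triangles that are pre-classified as opaque, transparent, or unknown, and that only "unknown" hits need a shader to sample the alpha texture. All names and the example data are hypothetical and do not reflect the actual hardware format.

```python
from enum import Enum
from typing import Callable, Iterable, Tuple


class OpacityState(Enum):
    TRANSPARENT = 0  # ray passes through, no shader needed
    OPAQUE = 1       # hit is accepted, no shader needed
    UNKNOWN = 2      # mixed coverage, fall back to the alpha test


def resolve_hit(state: OpacityState,
                alpha_test: Callable[[], bool]) -> Tuple[bool, bool]:
    """Return (hit_accepted, shader_invoked) for one candidate hit."""
    if state is OpacityState.OPAQUE:
        return True, False
    if state is OpacityState.TRANSPARENT:
        return False, False
    return alpha_test(), True  # only this path runs shader code


def count_shader_calls(states: Iterable[OpacityState],
                       alpha_test: Callable[[], bool]) -> int:
    """Number of candidate hits that still require the alpha-test shader."""
    return sum(1 for state in states if resolve_hit(state, alpha_test)[1])


# Hypothetical candidate hits on a leaf texture: mostly clear-cut cases.
leaf_hits = ([OpacityState.OPAQUE] * 6
             + [OpacityState.TRANSPARENT] * 8
             + [OpacityState.UNKNOWN] * 2)

with_micromap = count_shader_calls(leaf_hits, lambda: True)
without_micromap = len(leaf_hits)  # every hit would run the alpha test
print(with_micromap, without_micromap)  # 2 16
```

In this toy example only 2 of 16 candidate hits still need the shader. The actual saving depends on how much of the geometry falls into the "unknown" category.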

All of these ray tracing enhancements give the Ada architecture a big head start. As new games that use Ada technologies appear, RTX 40 series graphics cards should pull further ahead of the previous-generation RTX 30 series. One example is the recent Portal with RTX remaster, built with RTX Remix, in which NVIDIA uses new Ada features such as the Opacity Micromap (OMM) Engine and SER (both of which, incidentally, can be disabled in the settings). Together they allow the RTX 4090 to be up to 3 times faster than the RTX 3080 Ti without DLSS, and with DLSS 3 frame generation the advantage can reach 5x.