PhysX, CUDA and 32X CSAA in games


System requirements recommended by the gameGPU website for optimal use of nVidia PhysX technology in games:

Operating system: Windows Vista®, Windows 7
Processor: Core 2 Duo E6750 2.66 GHz or AMD Phenom II X2 540; in the absence of an Nvidia accelerator, Core i7 930 2.8 GHz or AMD Phenom II X6 1090T
Memory: 4 GB
Video card: DirectX® 9 compatible video adapter with 1024 MB RAM, GeForce GTX 465 or above
DirectX: DirectX® 9-11

System requirements recommended by the gameGPU website for optimal use of CUDA technology in games:

Operating system: Windows Vista®, Windows 7
Processor: Core 2 Duo E6750 2.66 GHz or AMD Phenom II X2 540
Memory: 4 GB
Video card: DirectX® 9 compatible video adapter with 1280 MB RAM, GeForce GTX 470
DirectX: DirectX® 10-11

System requirements recommended by the gameGPU website for optimal use of 32X CSAA technology in games:

Operating system: Windows Vista®, Windows 7
Processor: Core 2 Duo E6750 2.66 GHz or AMD Phenom II X2 540
Memory: 4 GB
Video card: DirectX® 11 compatible video adapter with 1024 MB RAM, GeForce GTX 460; when used together with CUDA (Just Cause 2), 1280 MB RAM, GeForce GTX 570
DirectX: DirectX® 11

In our previous review, Fermi vs Cayman in DirectX 11 modes and more, we compared Nvidia and AMD cards in today's most resource-intensive games. Tests with PhysX and CUDA enabled were left behind the scenes, although both could have reduced the performance of the tested video cards significantly further. We decided to correct this omission and to add to this “bouquet” another “gluttonous” innovation that appeared with the release of the GF100, namely 32X CSAA. Of course, nothing compares to the performance loss from activating 3D Vision mode, but we are preparing a separate review for that…

Description of nVidia PhysX technology

In August 2009, Game Developer, an English-language magazine dedicated to computer game development, published an article about modern game engines and their use. According to the magazine, the most popular physics engine among developers is nVidia PhysX, with 26.8% of the market. In second place is Havok, with 22.7%. Third place belongs to the Bullet Physics Library (10.3%), and fourth place to the Open Dynamics Engine (4.1%).


PhysX (pronounced [‘fiziks]) is proprietary middleware: a cross-platform physics engine for simulating a range of physical phenomena, together with a software development kit (SDK) based on it. It was originally developed by Ageia for its PhysX physics processor (before this processor appeared, the engine was called NovodeX). After nVidia acquired Ageia, the engine became nVidia's property, and nVidia continues to develop it. nVidia adapted the engine to accelerate physics calculations on its graphics chips with the CUDA architecture. PhysX can also perform its calculations on a conventional processor. PhysX is currently available on the following platforms: Windows, Linux, Mac OS X, Wii, PlayStation 3 and Xbox 360 (hardware acceleration is possible only on Windows). The engine is used in many games and is actively offered for licensing to everyone.

The PhysX SDK spares game developers from writing custom code to handle the complex physics interactions in modern PC games. On July 20, 2005, Sony licensed the PhysX SDK for use on its seventh-generation video game console, the PlayStation 3. The PhysX SDK can be used on Linux as well as Microsoft Windows, but support for the PhysX processor is currently available only on Windows.


Unlike most other physics engines, which are shipped and installed with the game, the PhysX SDK must be installed separately, as a driver of its own. If a PhysX board is installed in the computer, the PhysX SDK driver uses its resources at run time. If no PhysX hardware is present, the computations are handed to the central processor.

The PhysX SDK physics engine consists of three main components for processing physics:
rigid body processing;
cloth processing;
fluid processing.

On February 13, 2008, NVIDIA acquired Ageia, making the PhysX SDK the property of NVIDIA. PhysX SDK support has been integrated into the CUDA framework, for which many Linux drivers already exist. This eliminates the need for a dedicated PhysX physics processor. PhysX SDK support is available on all Nvidia video cards starting with the GeForce 8 series. The engine is now known as the NVIDIA PhysX SDK.


In March 2008, Nvidia announced that it would make the PhysX SDK an open standard available to everyone. On July 24, 2008, it was announced that Nvidia would release a WHQL-certified ForceWare driver with support for physics acceleration on August 5, 2008. Since the cancellation of Havok FX, the PhysX SDK has been the only physics engine that supports hardware acceleration. Although the PhysX SDK is designed for computer games, it can be used in other applications as well.

On August 15, 2008, NVIDIA released ForceWare driver 177.83, which enabled PhysX support in the 8, 9, and 200 series graphics cards. This immediately expanded the user base to more than 70 million people worldwide.

On December 5, 2008, NVIDIA released the PhysX pack 2 software package, which expands the list of games with support for advanced physics. The package is free and takes up 3.5 GB. It includes the logic puzzle Crazy Machines 2, the particle-processing demo Dark Basic Fluids Demo, the soft-body demo Dark Basic PhysX Soft Body Demo, and two new levels for the multiplayer game Warmonger.

On December 15, 2008, AMD Director of Technical Marketing Godfrey Cheng stated that the NVIDIA PhysX physics engine is doomed to die, like any proprietary technology.

On December 22, 2008, news broke that computer game publisher THQ and NVIDIA had signed agreements to use NVIDIA PhysX technology in computer games to be published by THQ.


On March 17, 2009, NVIDIA published a press release announcing an agreement with the Japanese multinational corporation Sony to provide PhysX tools and related software to game developers for the PlayStation 3 console. As a result, all registered developers who hold an official license and the right to create games for the PS3 will get free access to the full set of NVIDIA PhysX tools, including libraries, header files, help files, documentation, etc.

On March 20, 2009, NVIDIA confirmed that PhysX tools would be made available free of charge to all registered Nintendo Wii game developers.

On March 26, 2009, news broke that Apple’s online App Store was selling several iPhone games with PhysX support: Big Fun Racing, Space Race and Debris.


APEX is a high-level add-on that, according to NVIDIA, should simplify the implementation of PhysX in game projects and speed up the development process. APEX allows artists and designers to create physical effects with minimal programming input. Instead of the low-level PhysX API, the developer is provided with a set of tools for creating specific physics effects based on ready-made APEX modules. The use of these modules is ensured by the integration of the APEX framework into game engines.

Description of CUDA technology 

CUDA (Compute Unified Device Architecture) is a software and hardware architecture for performing computations on NVIDIA graphics processors that support GPGPU (general-purpose computing on video cards). The CUDA architecture first appeared on the market with the release of NVIDIA's eighth-generation chip, the G80, and is present in all subsequent series of graphics chips used in the GeForce, Quadro and Tesla accelerator families.


The CUDA SDK lets programmers implement algorithms that run on NVIDIA GPUs in a special simplified dialect of the C programming language and call special functions from the text of a C program. CUDA gives the developer direct control over access to the graphics accelerator's instruction set and over its memory, and makes it possible to organize complex parallel computations on it.

The initial version of the CUDA SDK was introduced on February 15, 2007. The CUDA API is based on the C language with some limitations. To compile code written in this dialect, the CUDA SDK includes nvcc, Nvidia's own command-line C compiler. The nvcc compiler is based on the open-source Open64 compiler and translates host code (the main, controlling code) and device code (the code that runs on the hardware), stored in files with the .cu extension, into object files suitable for building the final program or library in any programming environment, such as NetBeans.


CUDA uses a grid memory model, a clustered thread model and SIMD instructions. It is used mainly for high-performance graphics computing and for developing NVIDIA-compatible graphics APIs. It can interoperate with applications that use OpenGL and Microsoft Direct3D 9, and versions exist for Linux, Mac OS X and Windows.
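The grid-and-thread organization mentioned above can be illustrated with a small sketch. It is plain Python rather than CUDA, so it only emulates how each thread in a one-dimensional launch derives its global index from its block and thread coordinates. The function names `global_thread_index` and `launch` are made up for this illustration and are not part of the CUDA SDK.

```python
def global_thread_index(block_idx, block_dim, thread_idx):
    """Mirror the usual CUDA expression blockIdx.x * blockDim.x + threadIdx.x."""
    return block_idx * block_dim + thread_idx


def launch(grid_dim, block_dim, data):
    """Emulate a 1D kernel launch that doubles every element of `data`.

    On a GPU the threads run in parallel; here they run one after another.
    """
    out = list(data)
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            i = global_thread_index(block_idx, block_dim, thread_idx)
            if i < len(out):  # guard: the grid may be larger than the data
                out[i] = 2 * out[i]
    return out


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
block_dim = 4  # threads per block (illustrative value)
grid_dim = (len(data) + block_dim - 1) // block_dim  # round up to cover all data
result = launch(grid_dim, block_dim, data)
```

The bounds check matters because the number of launched threads (here 3 blocks of 4) is rounded up and can exceed the number of elements.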

The first series of hardware to support the CUDA SDK, the G8x, had a 32-bit single-precision vector processor that used the CUDA SDK as its API (CUDA supports the C double type, but on this hardware its precision is reduced to 32-bit floating point). The later GT200 processors support 64-bit precision (SFU only), but performance is significantly worse than at 32-bit precision, because each streaming multiprocessor has only 2 SFUs against 8 scalar processors. The GPU implements multithreading in hardware, which makes it possible to use all of the GPU's resources. This opens up the prospect of moving the functions of a physics accelerator onto the graphics accelerator (nVidia PhysX is one implementation). It also opens up wide possibilities for using graphics hardware for complex non-graphical calculations, for example in computational biology and other branches of science.
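What the reduction of double to 32-bit floating point costs in practice can be seen in a short sketch. It is plain Python, not GPU code: the hypothetical helper `demote_to_float32` rounds a 64-bit value to the nearest 32-bit float using the standard `struct` module, which is only an approximation of what happens to a double on single-precision hardware.

```python
import struct


def demote_to_float32(x):
    """Round a 64-bit Python float to the nearest 32-bit float and back."""
    return struct.unpack("f", struct.pack("f", x))[0]


value = 0.1
demoted = demote_to_float32(value)
error = abs(demoted - value)  # rounding error introduced by the demotion

# Integers above 2**24 can no longer be represented exactly in 32 bits.
big = 16777217.0  # 2**24 + 1
big_demoted = demote_to_float32(big)
```

Values such as 0.5 survive unchanged, while 0.1 picks up an error on the order of 1e-9 and 2**24 + 1 collapses to 2**24.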


Compared to the traditional approach of doing general-purpose computing through graphics APIs, the CUDA architecture has the following advantages:
- The CUDA Application Programming Interface (CUDA API) is based on the standard C programming language with some limitations. According to the developers, this should make the CUDA architecture simpler and smoother to learn[2]
- The 16 KB of memory shared between threads can serve as a user-managed cache with higher bandwidth than fetches from regular textures
- More efficient transfers between CPU memory and video memory
- Full hardware support for integer and bitwise operations
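To give a feel for the 16 KB shared-memory figure in the list above, here is a rough sketch in plain Python. It emulates processing a large array in tiles small enough to fit such a cache; the names `TILE` and `tiled_sum` are invented for the example, and a real kernel would normally reserve only part of shared memory per block.

```python
SHARED_BYTES = 16 * 1024            # 16 KB of shared memory, as stated above
FLOAT_BYTES = 4                     # size of one 32-bit float
TILE = SHARED_BYTES // FLOAT_BYTES  # how many floats fit in one tile


def tiled_sum(values, tile=TILE):
    """Sum `values` tile by tile, staging each tile in a small local buffer.

    The local list stands in for the user-managed cache; this is an
    emulation of the access pattern, not actual GPU code.
    """
    total = 0.0
    for start in range(0, len(values), tile):
        shared = values[start:start + tile]  # load one tile into the "cache"
        total += sum(shared)                 # work on the cached copy
    return total


total = tiled_sum(list(range(10000)))
```

The point of the pattern is that each element is fetched from slow memory once and then reused from the fast buffer.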

As of December 2010, the CUDA programming model is taught at 350 universities around the world. In Russia, CUDA courses are given at Moscow, St. Petersburg, Kazan, Novosibirsk and Perm State Universities, the Dubna International University of Nature, Society and Man, the Joint Institute for Nuclear Research, the Moscow Institute of Electronic Technology, Ivanovo State Energy University, the V. G. Shukhov BSTU, Bauman MSTU, the Mendeleev Russian Chemical Technology University, the Russian Scientific Center “Kurchatov Institute”, the Interregional Supercomputer Center of the Russian Academy of Sciences, and the Taganrog Technological Institute (TTI SFU). In addition, it was announced in December 2009 that the first Russian scientific and educational center for parallel computing had begun operating in the city of Dubna; its tasks include training and consulting on solving complex computing problems on GPUs.

In Ukraine, courses on CUDA are taught at the Kiev Institute of System Analysis.

Description of 32x CSAA technology

Starting with the GF100 series, Nvidia graphics cards support 32x CSAA (coverage sampling anti-aliasing), in which coverage information is separated from the color/z/stencil data. This reduces the bandwidth and memory footprint of anti-aliasing compared to MSAA.
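The memory saving can be sketched with back-of-the-envelope arithmetic. The per-sample sizes below are assumptions chosen for illustration (8 bytes of color plus z/stencil per full sample, 1 bit per coverage-only sample), not figures from Nvidia, and the 32-sample MSAA mode is purely hypothetical. The 8 + 24 split follows the description of the GF100 mode later in this article.

```python
def bytes_per_pixel(full_samples, coverage_samples,
                    bytes_per_full_sample=8, bits_per_coverage_sample=1):
    """Estimate anti-aliasing storage per pixel.

    Assumed, not measured: each full sample stores color + z/stencil in
    8 bytes, each coverage-only sample stores a single bit.
    """
    full = full_samples * bytes_per_full_sample
    coverage = coverage_samples * bits_per_coverage_sample / 8
    return full + coverage


msaa_32 = bytes_per_pixel(32, 0)   # hypothetical 32-sample MSAA
csaa_32 = bytes_per_pixel(8, 24)   # 8 full samples + 24 coverage samples
```

Under these assumptions the coverage samples add only 3 bytes per pixel on top of the 8 full samples, which is where the saving comes from.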


Because of the way it works, CSAA on the G80 and GT200 could only smooth the edges of polygons. The algorithm did that job efficiently, but was helpless against other types of artifacts. The main problems arose with various kinds of translucent surfaces, such as textures of fences, leaves and grass. Such textures are a compromise and a simplification: modeling these objects in full geometry would require an incredible amount of geometric resources, and the result would not justify the cost.

Since such textures cover a large area and contain no geometry inside, previous implementations of CSAA could not anti-alias their contents. This drawback is not unique to CSAA: even for MSAA, a full solution arrived only in DX10 with the introduction of alpha-to-coverage, with which the GPU in effect creates several levels of transparency around the “fake” texture geometry so that it blends better with its surroundings.


Although this is a very powerful technique, its implementation on the G80 and GT200 left much to be desired. The chips simply did not have the performance to create the required number of transparency levels. Paired with expensive MSAA, even in 8xQ mode, the GPUs could create only up to 9 levels, which was nowhere near enough for a smooth gradient. As a result, although alpha-to-coverage in theory completely solved the problem of smoothing translucent surfaces, its practical value was questionable. To maintain a normal frame rate, one had to use reduced quality levels of this type of AA, which did not work very well. The only way to get rid of the unpleasant aliasing on such surfaces completely was to enable Transparency Super-Sample Anti-Aliasing, but with it the speed in most games dropped dramatically, especially since textures of this kind, where they are used, take up a large part of the screen.


For the GF100, NVIDIA made two CSAA enhancements to address the problem described above. First, the maximum number of samples processed by this algorithm was increased from 16 to 24. Second, CSAA can now work with translucent surfaces. With these capabilities, when working together with MSAA, anti-aliasing can take into account up to 32 samples per pixel. In other words, up to 33 levels of transparency can be achieved, which, although not ideal, gives a much more detailed and higher-quality gradient than before.
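The relation between sample count and transparency levels (9 levels for 8 samples, 33 for 32) follows from simple counting: with N samples, anywhere from 0 to N of them can be marked as covered. A minimal sketch of this quantization, with the hypothetical helpers `transparency_levels` and `covered_samples`, might look like this. It ignores the dithering patterns a real GPU applies.

```python
def transparency_levels(samples):
    """With N samples, 0..N of them can be covered, giving N + 1 levels."""
    return samples + 1


def covered_samples(alpha, samples):
    """Quantize an alpha value in [0, 1] to a number of covered samples."""
    alpha = min(max(alpha, 0.0), 1.0)  # clamp out-of-range input
    return round(alpha * samples)


# Count the distinct coverage values an alpha ramp can produce.
ramp = [a / 1000 for a in range(1001)]
levels_8 = len({covered_samples(a, 8) for a in ramp})
levels_32 = len({covered_samples(a, 32) for a in ramp})
```

The same alpha ramp therefore resolves into far finer steps at 32 samples than at 8, which is why the gradient on fences and foliage looks smoother.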

GPU test

All video cards were tested at the maximum graphics quality settings in the most resource-intensive part of each game, with frame rates recorded by Fraps. This time all cards were tested on a platform with a Core™ i7 processor.
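Since the conclusions below rely on average and minimum FPS, here is a small sketch of how such figures can be derived from a list of frame durations. The function `fps_summary` and the sample numbers are made up for illustration and are not measured data; taking the slowest single frame as the minimum is a simplification and may differ from how Fraps computes its own minimum.

```python
def fps_summary(frame_times_ms):
    """Return (average_fps, minimum_fps) for frame durations in milliseconds."""
    total_seconds = sum(frame_times_ms) / 1000.0
    average_fps = len(frame_times_ms) / total_seconds  # frames per elapsed second
    minimum_fps = 1000.0 / max(frame_times_ms)         # rate of the slowest frame
    return average_fps, minimum_fps


frame_times_ms = [20.0, 25.0, 40.0, 20.0, 20.0]  # made-up sample, not a test result
average_fps, minimum_fps = fps_summary(frame_times_ms)
```

Note that the average is computed from total time rather than by averaging per-frame rates, so slow frames weigh in proportion to how long they actually lasted.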

Hardware configuration

Processors

Intel® Core™ [email protected] GHz cache L3 8 MB

Motherboards

MSI X58 Pro, socket LGA1366 (product provided by MSI)

Memory

GOODRAM PLAY 1600 MHz (8-8-8-24) (product provided by GOODRAM)

Video Cards

GeForce 8800 GTS 640 MB
EN 8800GT 512 MB (product provided by Asus)
GeForce 8800 GTS 512 MB
GeForce 8800 GTX 768 MB
EN 8800ULTRA 768 MB (product provided by Asus)
GeForce 9600 GT 512 MB 
GeForce 9800 GT 512 MB
GeForce 9800 GTX 512 MB
 GeForce GT 240 1024 MB
GeForce GTS 250 1024 MB
GeForce GTX 260 896 MB
GeForce GTX 260 core 216
GeForce GTX 280 1024 MB
GeForce GTX 275 896 MB
GeForce GTX 285 1024 MB
GeForce GT 430 1024 MB
GeForce GTX 460 768 MB
GeForce GTX 460 1024 MB
GeForce GTX 465 1024 MB
Zotac GeForce GTX 470 1280 MB (product provided by Zotac)
GeForce GTX 480 1536 MB
GeForce GTX 580 1536 MB (product provided by Zotac)

Hard disks

3×2 RAID0 Western Digital Caviar WD2500GL 250 GB, 7200 rpm, SATA 3 Gbit/s

Power Supplies

SeaSonic S12D 850 Silver 850 W (product provided by SeaSonic)

System software and drivers 

Operating system

Microsoft Windows 7 Ultimate Edition x64

Tested Applications

Battlefield Bad Company 2
Tom Clancy’s HAWX 2 Benchmark
Just Cause 2
Lost Planet 2 Benchmark DirectX 11
Mafia II
Cryostasis Tech Demo
Sacred 2
Batman Arkham Asylum

Platform Driver

Intel INF Chipset Update Utility 9.1.0.1012

Graphics driver

Nvidia GeForce/ION Driver Release 266.35

All our video cards were tested at maximum quality settings at a resolution of 1920×1080, which is now common in monitors with a diagonal of 22″ and larger. Because of the nature of the tests, only Nvidia video cards took part in this review. We tested CUDA in the only gaming application that uses the technology, namely Just Cause 2. 32x CSAA was tested in the games whose settings allow it to be activated: Battlefield Bad Company 2, Tom Clancy’s HAWX 2, Just Cause 2 and Lost Planet 2 Benchmark DirectX 11. Finally, nVidia PhysX was tested in the applications most relevant to it: Mafia II, Cryostasis Tech Demo, Sacred 2 and Batman Arkham Asylum.

Just Cause 2 CUDA test


Just Cause 2 is the sequel to the computer game Just Cause and tells the story of the new adventures of CIA agent Rico Rodriguez. Shootouts, chases, parachute jumps, theft of cars and military equipment: the player can send the main character into the most dangerous operations. The player has an impressive arsenal of small arms at his disposal, and the main character can perform the most spectacular stunts, including hijacking a helicopter in flight. Compared to the first part, the game has far more weapons and equipment. But the main attraction is the rich selection of vehicles. Military and civilian vehicles are easy to upgrade and tune, with everything needed provided, including more than two thousand different parts. The game is built on the completely new Avalanche Engine 2.0, uses NVIDIA CUDA technology, and works only on PCs with DirectX 10 or later installed; accordingly, Just Cause 2 will not run on Windows XP. Havok Physics is responsible for the physics in the game.

Testing at maximum quality settings 1920×1080 CUDA test


NVIDIA CUDA is responsible for simulating the water surface, which gives the most natural-looking water of any game today. The anti-aliasing mode in Just Cause 2 was set to 16xQ in order to squeeze everything out of previous generations of Nvidia video adapters. In terms of average FPS, the minimum that can be recommended is the GeForce GTX 470, and the recommended cards are the GeForce GTX 480 and GeForce GTX 570.

Testing at maximum quality settings 1920×1080 CUDA 32x CSAA test


When 32x anti-aliasing is activated, the picture becomes more realistic, but the load on the video cards increases accordingly. As before, the GeForce GTX 470 remains the recommended minimum, and only the GeForce GTX 580 provides fully comfortable performance.

Battlefield Bad Company 2 32x CSAA test


The game uses the Frostbite engine, which makes the surrounding world fully destructible. An explosion can bring down a house, walls crumble after several bullet hits, and tanks break through fences. This seriously changes the tactical side of the game: you can no longer hide from bullets and explosions in a building; you can enter a house not through the door but by blowing up the wall with an under-barrel grenade launcher; and instead of destroying a transmitter located inside a house, you can simply collapse the building with a tank or explosives. On Microsoft Windows, the game engine supports rendering with DirectX 9, DirectX 10, DirectX 10.1 and, starting from version 1.5, DirectX 11.

Testing at maximum quality settings 1920×1080


In Battlefield Bad Company 2, activating 32x anti-aliasing causes no difficulties for the video cards, and all users with accelerators at the level of the GeForce GTX 460 and higher will be able to play comfortably…

Lost Planet 2 Benchmark DirectX 11 32x CSAA test


Even before the game's release on PC, Capcom offered a preview of its new hit Lost Planet 2 by presenting a DirectX 11 benchmark of the same name. The benchmark is based on the Lost Planet 2 game engine, which runs in DirectX 9 and DirectX 11 modes. The results were taken from the most resource-intensive test, “B”.

Testing at maximum quality settings 1920×1080


Lost Planet 2 Benchmark DirectX 11 put significant pressure on the Nvidia video cards. Based on average FPS, we can recommend video cards from the GeForce GTX 460 level and higher, while on minimum FPS only one representative of the Fermi family remains: the GeForce GTX 580.

Tom Clancy’s HAWX 2 Benchmark 32x CSAA test


In HAWX 2, DirectX 11 support not only improves performance thanks to compute shaders, but also visually improves the game. Activating tessellation helps create more realistic terrain.

Testing at maximum quality settings 1920×1080


The only card that failed in Tom Clancy’s HAWX 2 Benchmark with 32x anti-aliasing was the GeForce GT 430; all other representatives of the new Nvidia line experienced no significant problems.

Cryostasis Tech Demo nVidia PhysX test


1968, the Arctic Circle. A dead expanse of ice fields. The drifting station “Polyus 21” has just been left by its last occupant, meteorologist Alexander Nesterov. He has received an urgent telegram from the mainland and must now leave the endless Arctic on a comfortable ship that will pick him up at the appointed place and hour. However, instead of a warm welcome, a real nightmare awaits the scientist: by chance, he will soon find himself on board the nuclear icebreaker “North Wind”, which was lost in the ice many years ago. Cryostasis TechDemo, like the game itself, takes full advantage of GPU-accelerated physics effects.

Testing at maximum quality settings 1920×1080


Cryostasis was one of the first games to make full use of nVidia PhysX technology and in its day was a real “terror” for Nvidia video cards. Today, the recommended minimum for it is the GeForce GTX 260 or GeForce GTS 450. The optimal choice is a video card from the GeForce GTX 460 level and higher.

Sacred 2 nVidia PhysX test


Sacred 2: Fallen Angel is a fantasy role-playing computer game released in 2008 by Ascaron, a sequel to the 2004 game Sacred. Starting with version 2.40, the game supports physics acceleration using nVidia PhysX. An add-on, Ice & Blood, was released on October 2, 2009; it adds more than 30 hours of additional game time, a new class (Dracomage), quests, items, and two new areas: the Bloody Forest and the Crystal Plains.

Testing at maximum quality settings 1920×1080


In Sacred 2 we see a picture almost identical to Cryostasis. The recommended minimum is the GeForce GTX 260 or GeForce GTS 450, and for minimum FPS we recommend the GeForce GTX 260 core 216 and higher.

Batman Arkham Asylum nVidia PhysX test


Batman: Arkham Asylum is an action-adventure computer game for Microsoft Windows, PlayStation 3 and Xbox 360, created by Rocksteady Studios and published in 2009 by Eidos Interactive in collaboration with Warner Bros. Interactive Entertainment and DC Comics. It is based on the Batman comic book series and relies heavily on the Nvidia PhysX API.

Testing at maximum quality settings 1920×1080


Batman Arkham Asylum was in its day the most prominent showcase of Nvidia's proprietary technologies: Nvidia PhysX, 3D Vision and 16xQ anti-aliasing. Even today it is a “tough nut to crack” for video cards. The minimum for gaming is the GeForce GTX 260 or higher, and for comfortable performance we recommend the GeForce GTX 465 or higher.

Mafia II nVidia APEX test


Nvidia showed great interest in the game, and as a result features such as PhysX, 3D Vision and Surround gaming found their way into it. While this in itself is not a bad thing, it can be assumed that the game was probably optimized for Nvidia graphics cards. The most exciting of the three Nvidia-exclusive features in the game is PhysX, which adds a new layer of realism to the action. Mafia II features Nvidia APEX, a scalable dynamics framework that gives artists easy-to-use tools for taking advantage of PhysX. APEX supports a range of modules, including destruction, vegetation, particles, clothing and turbulence.

Testing at maximum quality settings 1920×1080


In Mafia II, almost all Nvidia video cards show low performance. The GeForce GTX 460 and higher should be considered the minimum, and video cards from the level of the GeForce GTX 480 and GeForce GTX 570 and higher are recommended.

CUDA is already a fully accessible technology that almost any programmer with knowledge of C can use. Today, thanks to nVidia's efforts, it has found its feet and reduces working time for millions of people around the world. For us gamers this technology is also of enormous importance, greatly increasing the authenticity and beauty of the game world, as we were able to see for ourselves in Just Cause 2.

32X CSAA was released along with the new generation of video cards in the spring of last year. The technology has already become reasonably widespread and is used in many modern, graphically advanced games. With this option activated, the picture becomes extremely clear and approaches the quality of a 3D image. Let’s hope that 32X CSAA will become more widespread and will be used in almost all upcoming gaming applications.

nVidia PhysX is one of the most common Nvidia technologies and is widely used in modern games. Its dominance of the physics acceleration market speaks for itself, and we hope that the company will make every effort to improve its product. The only depressing thing is the platform's narrow focus: the wide capabilities of nVidia PhysX can be appreciated only by owners of Nvidia cards. But let’s hope that the situation will change soon and that it will at least become possible to use an Nvidia card as a physics accelerator paired with an AMD card, officially and without any patches…
