GPU Rendering Overview
GPU rendering allows V-Ray RT to perform the raytracing calculations on the GPUs installed in the system, rather than the CPU. Since GPUs are specifically designed for massively parallel calculations, they can speed up the rendering process by an order of magnitude.
To enable GPU rendering, select the OpenCL (single kernel) or CUDA (single kernel) value for the Engine type parameter in the V-Ray RT settings.
V-Ray RT for GPU has two back-ends (or engines): one based on OpenCL (see the references section below for more information on OpenCL) and one based on the nVidia CUDA platform.
The OpenCL engine should be able to run on any OpenCL-compatible hardware. In practice, whether it does depends on the manufacturer and the driver version. Below are the results of the tests we have performed (up to June 2015):
- nVidia GPUs - V-Ray RT OpenCL works properly on Fermi-, Kepler- and Maxwell-based cards with the latest drivers (353.06). Cards with an architecture older than Fermi do NOT work with OpenCL.
- AMD GPUs - V-Ray RT OpenCL works only with the latest drivers (14.502), V-Ray 3.00.01 and an AMD GPU with the GCN architecture. It does not work in any other configuration.
- Intel GPUs - V-Ray RT OpenCL does NOT work on Intel GPUs.
- CPU (using the Intel or AMD OpenCL runtime) - V-Ray RT OpenCL works on all V-Ray versions with the latest Intel/AMD runtimes.
The CUDA engine is supported only in 64-bit builds of V-Ray RT, on Fermi-, Kepler- and Maxwell-based nVidia cards. It is recommended to use the CUDA engine on nVidia GPUs.
Rendering on multiple GPUs is supported, and by default V-Ray RT for GPU will use all available OpenCL/CUDA devices. See the sections below for how to choose the devices that V-Ray RT GPU runs on.
V-Ray RT for GPU has been tested on a number of graphics cards including:
nVidia GeForce 980
nVidia GeForce 970
nVidia GeForce 960
nVidia GeForce 750ti
nVidia GeForce 750m
nVidia GeForce 680 GTX
nVidia GeForce 580 GTX
nVidia GeForce 590 GTX
nVidia GeForce 570
nVidia GeForce 480 GTX
nVidia Titan X
nVidia Titan Z
nVidia Titan Black
nVidia Tesla K40
nVidia Tesla K80
nVidia Tesla C2050
nVidia Quadro M6000
nVidia Quadro K6000
nVidia Quadro K5200
nVidia Quadro K4200
nVidia Quadro K4000
nVidia Quadro K2200
nVidia Quadro K2000
nVidia Quadro 2000M
If V-Ray RT for GPU cannot find a supported OpenCL/CUDA device on the system, it will silently fall back to CPU code. To see if the V-Ray render server is really rendering on the GPU, check out its console output.
Note: As of V-Ray 3.20.02, RT GPU can render when run through Remote Desktop with nVidia drivers after 353.06 on Quadro and newer GeForce cards (GTX 980, GTX TITAN X).
In general, for portability reasons, OpenCL code is compiled at run-time (much like OpenGL shaders written in GLSL and DirectX shaders written in HLSL), as opposed to CUDA code, which is precompiled and stored in a binary format inside the program executable. This allows the OpenCL code to be portable and best optimized for the particular hardware on which it runs. The downside is that the compilation may take a while, depending on the OpenCL code complexity, the number of OpenCL devices in the system, and the OpenCL compiler and driver versions. Luckily, the binary version of the compiled OpenCL code can be cached and then reloaded much faster later on.
The first time you install V-Ray RT GPU and perform a GPU rendering, V-Ray will compile the OpenCL code for your hardware. This may take anywhere from 30 seconds to several minutes, depending on the number of graphics cards and driver version. In the V-Ray RT render server console window you will see something like this:
[2010/Sep/6|21:05:44] Running RTEngine
[2010/Sep/6|21:05:44] Initializing OpenCL renderer (single kernel version)...
[2010/Sep/6|21:05:44] Number of OpenCL devices found: 1
[2010/Sep/6|21:05:44] OpenCL device list:
[2010/Sep/6|21:05:44] Device 0: GeForce GTX 480
[2010/Sep/6|21:05:44] VRAY_OPENCL_DEVICES environment variable not specified; using all available devices
[2010/Sep/6|21:05:44] cl_nv_compiler_options supported!
[2010/Sep/6|21:05:44] Building OpenCL trace program...
[2010/Sep/6|21:06:34] OpenCL program built in 49.156 s
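When you need to check many render servers, it can help to read this console output with a small script instead of by eye. The following is only a minimal sketch: it assumes the console text has been saved or captured as a string and that the messages keep the wording of the sample above. The names summarize_log and SAMPLE_LOG are hypothetical and are not part of V-Ray.

```python
import re

# Sample lines copied from the console output shown above.
SAMPLE_LOG = """\
[2010/Sep/6|21:05:44] Initializing OpenCL renderer (single kernel version)...
[2010/Sep/6|21:05:44] Number of OpenCL devices found: 1
[2010/Sep/6|21:05:44] OpenCL device list:
[2010/Sep/6|21:05:44] Device 0: GeForce GTX 480
[2010/Sep/6|21:05:44] Building OpenCL trace program...
[2010/Sep/6|21:06:34] OpenCL program built in 49.156 s
"""

# Splits "[timestamp] message"; the timestamp itself contains a "|".
LINE_RE = re.compile(r"^\[([^\]]+)\]\s*(.*)$")
COUNT_RE = re.compile(r"^Number of OpenCL devices found: (\d+)$")
DEVICE_RE = re.compile(r"^Device (\d+): (.+)$")
BUILT_RE = re.compile(r"^OpenCL program built in ([0-9.]+) s$")


def summarize_log(text):
    """Return device count, device names and kernel build time found in text."""
    summary = {"device_count": None, "devices": {}, "build_seconds": None}
    for raw in text.splitlines():
        line = LINE_RE.match(raw.strip())
        if line is None:
            continue  # Not a timestamped log line.
        message = line.group(2).strip()

        count = COUNT_RE.match(message)
        if count:
            summary["device_count"] = int(count.group(1))
            continue

        device = DEVICE_RE.match(message)
        if device:
            summary["devices"][int(device.group(1))] = device.group(2)
            continue

        built = BUILT_RE.match(message)
        if built:
            summary["build_seconds"] = float(built.group(1))
    return summary


if __name__ == "__main__":
    print(summarize_log(SAMPLE_LOG))
```

If the summary contains no device entries, that is a hint to look at the full console output by hand, since this sketch does not know what V-Ray prints when it falls back to CPU code.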
The resulting compiled binary code is cached to disk in the temporary folder for the current user. On subsequent runs, the compilation phase is skipped and the code is loaded directly from the disk:
[2010/Sep/6|21:46:54] Building OpenCL trace program...
[2010/Sep/6|21:46:54] OpenCL program built in 0.016 s
Such run-time compilation is not required by the CUDA engine and it will start rendering right away.
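The compile-once, load-from-disk behaviour described above for the OpenCL engine follows a general caching pattern, which can be sketched as below. This is not V-Ray's actual implementation. The cache key (device name, driver version and kernel source), the file naming and the fake_build function are all assumptions made for illustration; a real program would call the OpenCL compiler where fake_build is used.

```python
import hashlib
import os
import tempfile


def cache_path(source, device_name, driver_version, cache_dir):
    """Build a cache file path that changes whenever any input changes."""
    key = "\n".join([device_name, driver_version, source]).encode("utf-8")
    return os.path.join(cache_dir, hashlib.sha256(key).hexdigest() + ".bin")


def load_or_build(source, device_name, driver_version, build, cache_dir=None):
    """Return (binary, was_cached); build(source) runs only on a cache miss."""
    cache_dir = cache_dir or tempfile.gettempdir()
    path = cache_path(source, device_name, driver_version, cache_dir)
    if os.path.exists(path):
        # Cache hit: skip the slow compilation and read the stored binary.
        with open(path, "rb") as f:
            return f.read(), True
    # Cache miss: compile, then store the result for later runs.
    binary = build(source)
    with open(path, "wb") as f:
        f.write(binary)
    return binary, False


def fake_build(source):
    """Hypothetical stand-in for a slow OpenCL compiler call."""
    return source.encode("utf-8")[::-1]


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()
    for attempt in (1, 2):
        _, cached = load_or_build("kernel code", "GeForce GTX 480", "353.06",
                                  fake_build, demo_dir)
        print("run", attempt, "loaded from cache:", cached)
```

Including the driver version in the key means that a driver update leads to a fresh compilation instead of loading a binary built for the old driver. Whether V-Ray keys its cache this way is not stated here.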
You may not want to use all available OpenCL/CUDA devices for rendering, especially if you have multiple GPUs and you want to leave one of them free for working on the user interface. To do this, you can use the supplied GUI tool, which you can find in Start Menu > Programs > Chaos Group > V-Ray RT Adv for 3ds Max > Select OpenCL devices for V-Ray RT.
After changing this option, you need to restart the V-Ray RT render server (if it is running) for the changes to take effect. If the V-Ray RT render server is running as a Windows service, you may need to stop it from the Services applet in the Control Panel.
Note that the tool determines the devices to use for both CUDA and OpenCL rendering.
Balancing the GPU Load
If you have only one GPU on your system, you may find that the user interface becomes sluggish and unresponsive while V-Ray RT is rendering on the GPU. To alleviate this problem, reduce the Rays per pixel and/or the Ray bundle size parameters in the Performance section of the V-Ray RT renderer settings in the 3ds Max Render Setup dialog. For example, you can try values like 128/8 or 128/4. This will break up the data passed to the GPU into smaller chunks, so that the user interface requests can be processed faster. Note however, that this will reduce the rendering speed. Turn on the statistics display to check the difference in render speed and to find the optimal settings for your system.
On the GPU, V-Ray uses a simplified version of the V-Ray renderer, which supports only a sub-set of all features of the CPU code. The features listed below are supported; anything else will likely not work.
Triangle meshes and VRayProxy objects are supported. Instancing is also supported - see the Instancing and Forest Pro support page.
Note that even for supported lights, only a sub-set of the light parameters are implemented.
- Standard lights: omni, spot, directional;
- Photometric lights with web profiles;
- VRayLight: rectangle lights with textures, sphere and mesh lights without textures; dome light with textures (for image based lighting or IBL);
- VRayIES lights with web profiles.
- VRayMtl material: diffuse color, (glossy) reflections, refractions, opacity, bump mapping, Fresnel reflections;
- Multi/sub-object material;
- VRayLightMtl material (without the direct illumination options);
- VRayBlendMtl material (without the additive mode option);
Bitmap textures, the Falloff map, and the VRayHDRI and VRaySky maps are supported. Other procedural textures (Checker, Noise etc.) are supported by baking them, provided that they have the Explicit UVW mapping type. If the Resize textures for GPU option is turned on, then all textures uploaded to the GPU are resampled to the resolution specified by the GPU texture size parameter in the V-Ray RT settings in the Render Setup dialog. The Mix and ColorCorrection textures are also supported.
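To get a feeling for how the GPU texture size parameter affects memory use, a rough back-of-the-envelope estimate can be scripted. The sketch below rests on assumptions that are not confirmed here: every texture is resampled to a square of GPU texture size by GPU texture size texels and is stored uncompressed with 4 channels of 1 byte each. The function names are hypothetical, and real memory use in V-Ray may differ.

```python
def texture_bytes(size, channels=4, bytes_per_channel=1):
    """Bytes for one uncompressed square texture of size x size texels."""
    return size * size * channels * bytes_per_channel


def total_texture_mib(texture_count, gpu_texture_size):
    """Rough total in MiB if every texture is resampled to the same size."""
    total = texture_count * texture_bytes(gpu_texture_size)
    return total / (1024 * 1024)


if __name__ == "__main__":
    # Compare a few candidate values of the GPU texture size parameter
    # for a hypothetical scene with 100 textures.
    for size in (256, 512, 1024):
        print(size, total_texture_mib(100, size), "MiB")
```

Under these assumptions, halving the texture size cuts the estimated texture memory to a quarter, which is why lowering this parameter is a quick way to free VRAM.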
- Standard perspective views; depth of field is not supported for these views;
- VRayPhysicalCamera with support for depth of field and bokeh effects;
- Stereoscopic rendering is fully supported on the GPU.
Only the background texture from the Environment dialog is supported and used for background, GI, reflections and refractions. Only spherical, mirror ball and angular environment mapping types are supported.
Motion blur is supported, provided that the Motion blur option in V-Ray RT is enabled and motion blur is enabled in the production renderer or through a physical camera.
Below is a list of some common OpenCL errors that you may get in the V-Ray RT render server console:
Error -4 at line XXX, in file ./src/ocl_tracedevice.cpp !!!
This error means that there is not enough VRAM on the GPU to complete the rendering. You can try one or more of the following to fix the error:
- Reduce the "GPU texture size" parameter in the V-Ray RT settings;
- Reduce the "Ray bundle size" and/or "Rays per pixel" parameters in the V-Ray RT settings;
- Reduce the amount of geometry in the scene.
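The number in such a message appears to be a raw OpenCL status code: in the Khronos OpenCL header, -4 is CL_MEM_OBJECT_ALLOCATION_FAILURE, which matches the out-of-memory explanation above. Assuming that other errors are reported in the same format, a small lookup like the sketch below can translate them. The function name explain_error is hypothetical, and only a few codes from the header are listed.

```python
import re

# A few status codes as defined in the Khronos OpenCL header (CL/cl.h).
OPENCL_ERRORS = {
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}

# Assumed message format, taken from the example line above.
ERROR_RE = re.compile(r"Error (-?\d+) at line (\S+), in file (\S+)")


def explain_error(line):
    """Return (code, name) for a console error line, or None if no match."""
    match = ERROR_RE.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    return code, OPENCL_ERRORS.get(code, "unknown OpenCL status code")


if __name__ == "__main__":
    sample = "Error -4 at line XXX, in file ./src/ocl_tracedevice.cpp !!!"
    print(explain_error(sample))
```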
References
- The official OpenCL web site: http://www.khronos.org/opencl/
- The nVidia CUDA developer zone: https://developer.nvidia.com/category/zone/cuda-zone