This page provides information on setting up V-Ray RT.

 


 

Overview


GPU rendering allows V-Ray RT to perform the raytracing calculations on the GPUs installed in the system, rather than the CPU. Since GPUs are specifically designed for massively parallel calculations, they can speed up the rendering process by an order of magnitude. 

To enable GPU rendering, select the CUDA or OpenCL value for the Engine type parameter in the V-Ray RT settings.

For answers to common questions about V-Ray RT, see the V-Ray RT FAQ page.

 

Supported Hardware and Drivers


V-Ray RT for GPU has two back-ends (or engines): one based on OpenCL (see the References section below for more information on OpenCL) and one based on the NVIDIA CUDA platform.
In principle, the OpenCL engine should run on any OpenCL-compatible hardware. In practice, whether it does depends on the manufacturer and the driver version. The tests we have performed and their results are listed below:

  • NVIDIA GPUs: For NVIDIA GPUs, always use CUDA, since it runs faster and supports more features. V-Ray RT OpenCL does not work on NVIDIA hardware.
  • AMD GPUs: RT OpenCL on AMD works only on AMD GCN 1.2 (or newer) GPUs with driver 16 (or newer) using V-Ray 3.40.01 or newer. Polaris architecture or later is recommended.
  • Intel GPUs: RT OpenCL does not work on Intel GPUs.
  • CPU (using the Intel or AMD OpenCL runtime): OpenCL allows the CPU to be used for calculations together with the GPUs. For that, you need to install an OpenCL runtime for CPUs (Intel OpenCL Runtime or AMD OpenCL Runtime). V-Ray RT OpenCL works on all V-Ray versions with the latest Intel/AMD runtimes on Windows.
  • Hybrid Rendering (running CUDA on GPU and CPU): Starting with V-Ray 3.6, V-Ray RT GPU CUDA rendering can be performed on CPUs and NVIDIA GPUs at the same time. Using the Select Devices for V-Ray GPU rendering tool you can enable your CPUs as CUDA devices and allow the CUDA code to combine your CPUs and GPUs to utilize all available resources.
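
The OpenCL rules in the list above can be written down as a small check. This is only a sketch that encodes the statements on this page; the `Device` class, its field names, and the `opencl_engine_usable` function are hypothetical and are not part of V-Ray or of any V-Ray API.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical device description; the field names are illustrative only."""
    vendor: str                          # "nvidia", "amd", "intel" (GPUs) or "cpu"
    gcn_version: tuple = (0, 0)          # AMD GPUs only, e.g. (1, 2) for GCN 1.2
    driver_major: int = 0                # AMD GPUs only, e.g. 16
    cpu_runtime_installed: bool = False  # CPU only: Intel/AMD OpenCL runtime present


def opencl_engine_usable(device, vray_version):
    """Return True if the list above says RT OpenCL works on this device.

    vray_version is a (major, minor, patch) tuple, e.g. (3, 40, 1) for 3.40.01.
    """
    if device.vendor in ("nvidia", "intel"):
        # NVIDIA GPUs should use the CUDA engine; Intel GPUs are not supported.
        return False
    if device.vendor == "amd":
        # GCN 1.2 or newer, driver 16 or newer, V-Ray 3.40.01 or newer.
        return (device.gcn_version >= (1, 2)
                and device.driver_major >= 16
                and vray_version >= (3, 40, 1))
    if device.vendor == "cpu":
        # Requires an OpenCL runtime for CPUs to be installed.
        return device.cpu_runtime_installed
    return False


if __name__ == "__main__":
    print(opencl_engine_usable(Device("amd", (1, 2), 16), (3, 40, 1)))
```

Passing the V-Ray version as a tuple avoids ambiguity between forms such as "3.6" and "3.60" when comparing version numbers.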

 

The CUDA engine is supported only in 64-bit builds of V-Ray RT and only for Fermi-, Kepler-, Maxwell-, and Pascal-based NVIDIA cards. For NVIDIA GPUs, always use the CUDA engine.
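
NVIDIA identifies its architectures by CUDA compute capability, and the four families named above correspond to major versions 2 (Fermi), 3 (Kepler), 5 (Maxwell), and 6 (Pascal). The sketch below uses that public numbering to express the support rule. The function names are hypothetical, and architectures not listed on this page are treated as unknown rather than as supported or unsupported.

```python
# Compute capability major version -> architecture family named on this page.
SUPPORTED_ARCHITECTURES = {2: "Fermi", 3: "Kepler", 5: "Maxwell", 6: "Pascal"}


def cuda_architecture(compute_capability_major):
    """Return the architecture name, or None if this page does not list it."""
    return SUPPORTED_ARCHITECTURES.get(compute_capability_major)


def cuda_engine_supported(compute_capability_major, is_64bit_build):
    """Hypothetical check: 64-bit build and an architecture listed on this page."""
    return is_64bit_build and cuda_architecture(compute_capability_major) is not None


if __name__ == "__main__":
    for major in (1, 2, 3, 5, 6):
        print(major, cuda_architecture(major), cuda_engine_supported(major, True))
```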

Rendering on multiple GPUs is supported, and by default V-Ray RT for GPU uses all available OpenCL/CUDA devices. See the sections below for how to choose the devices that V-Ray RT GPU runs on.
V-Ray RT for GPU has been tested on a number of graphics cards including:

Titan X Pascal, GeForce 980, GeForce 970, GeForce 960, GeForce 750 Ti, GeForce 750M, GTX 1080, GTX 1070, GTX 1060, GeForce 680 GTX, GeForce 580 GTX, GeForce 590 GTX, GeForce 570, GeForce 480 GTX, Titan X, Titan Z, Titan Black, Tesla K40, Tesla K80, Tesla C2050, Tesla M60, Quadro GP100, Quadro P6000, Quadro P5000, Quadro P4000, Quadro M6000, Quadro M5000, Quadro M4000, Quadro K5200, Quadro K6000, Quadro K4200, Quadro K4000, Quadro K2000, Quadro 2000M, AMD RX 480, AMD R9 Fury X, AMD R9 Fury Nano, AMD Fury R9.

If V-Ray RT for GPU cannot find a supported OpenCL/CUDA device on the system, it may silently fall back to CPU code. To confirm that the V-Ray render server is actually rendering on the GPU, check its console output.

 

Driver versions for V-Ray RT GPU CUDA

Although V-Ray GPU 3.6 should work fine with the latest NVIDIA drivers, we recommend using driver version 376 if possible. For a comprehensive list of supported drivers for older V-Ray versions, please contact Chaos Group Support.

NVLINK and V-Ray GPU

As of V-Ray 3.6, NVLINK can be used on supported hardware. All NVLINK devices must be set to TCC mode. Note also that, to prevent performance loss, not all data is shared between devices.

OpenCL Run-Time Compilation

In general, for portability reasons, OpenCL code is compiled at run-time when a program runs (much like OpenGL shaders written in GLSL and DirectX shaders written in HLSL), as opposed to CUDA code, which is precompiled in advance and stored in a binary format inside the program executable. This allows the OpenCL code to be portable and best optimized for the particular hardware on which it runs. The downside is that the compilation may take a while, depending on the OpenCL code complexity, the number of OpenCL devices in the system, and the OpenCL compiler and driver versions. Luckily, the binary version of the compiled OpenCL code can be cached and then reloaded much faster later on.

The first time you perform a GPU rendering after installing V-Ray RT GPU, V-Ray compiles the OpenCL code for your hardware. This may take anywhere from 30 seconds to several minutes, depending on the number of graphics cards and the driver version, and may need a significant amount of CPU memory. The compilation is reported in the V-Ray RT render log window.

 

 

 

The resulting compiled binary code is cached to disk in the temporary folder for the current user. On subsequent runs, the compilation phase is skipped and the code is loaded directly from disk.




The CUDA engine does not require this run-time compilation and starts rendering almost immediately.
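
The compile-once, then-load-from-cache behaviour described for the OpenCL engine in this section follows a common pattern, sketched below in generic form. This is not V-Ray's implementation: the function names, the cache folder name, and the choice of cache key (kernel source, device name, and driver version) are assumptions made for illustration, and `compile_fn` stands in for whatever actually invokes the OpenCL compiler.

```python
import hashlib
import os
import tempfile


def cache_key(source, device_name, driver_version):
    """Hash everything that can change the compiled binary (an assumption)."""
    digest = hashlib.sha256()
    for part in (source, device_name, driver_version):
        digest.update(part.encode("utf-8"))
        digest.update(b"\0")  # separator so adjacent fields cannot run together
    return digest.hexdigest()


def load_or_compile(source, device_name, driver_version, compile_fn, cache_dir=None):
    """Return (binary, cache_hit). compile_fn(source) -> bytes is the slow step."""
    if cache_dir is None:
        # Hypothetical folder name inside the current user's temporary folder.
        cache_dir = os.path.join(tempfile.gettempdir(), "kernel_cache_demo")
    os.makedirs(cache_dir, exist_ok=True)
    key = cache_key(source, device_name, driver_version)
    path = os.path.join(cache_dir, key + ".bin")

    if os.path.exists(path):
        # Subsequent runs: skip compilation and load the cached binary.
        with open(path, "rb") as cached:
            return cached.read(), True

    # First run for this source/device/driver combination: compile and store.
    binary = compile_fn(source)
    with open(path, "wb") as out:
        out.write(binary)
    return binary, False
```

Including the device name and driver version in the key means that a driver update or a different graphics card leads to a fresh compilation instead of reusing a binary built for another configuration.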


Choosing Which Devices to Use for Rendering


You may not want to use all available OpenCL/CUDA devices for rendering, especially if you have multiple GPUs and want to leave one of them free for the user interface. You may also want to combine your CPU and GPU (see the Hybrid Rendering section below). In either case, use the control in the V-Ray RT GPU settings in 3ds Max to specify which devices to use for RT GPU rendering.

Alternatively, you can use the supplied GUI tool, which you can find in Start Menu > Programs > Chaos Group > V-Ray RT Adv for 3ds Max > Select OpenCL devices for V-Ray RT.


 

 

After changing this option, you need to restart the V-Ray RT render server (if it is running) for the changes to take effect. If the V-Ray RT render server is running as a Windows service, you may need to stop it from the Services applet in the Control Panel.
Note that the tool determines the devices to use for both CUDA and OpenCL rendering.

 

Balancing the GPU Load


If you have only one GPU on your system, you may find that the user interface becomes sluggish and unresponsive while V-Ray RT is rendering on the GPU. To alleviate this problem, reduce the Rays per pixel and/or the Ray bundle size parameters in the Performance section of the V-Ray RT renderer settings in the 3ds Max Render Setup dialog. For example, you can try values like 64/1 or 32/1. This will break up the data passed to the GPU into smaller chunks so that the user interface requests can be processed faster. Note, however, that this will reduce the rendering speed. Turn on the statistics display to check the difference in render speed and to find the optimal settings for your system.
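
The trade-off described above can be illustrated with a toy model. It assumes, purely for illustration, that every chunk of work sent to the GPU carries a fixed overhead and that a user interface request has to wait for the chunk currently being processed. The function name and the cost constants are made up; this is not a model of V-Ray's scheduler or of its actual parameters.

```python
import math


def chunking_tradeoff(total_rays, bundle_size, time_per_ray=1.0, overhead_per_bundle=50.0):
    """Return (total_render_time, worst_case_ui_wait) in arbitrary time units."""
    bundles = math.ceil(total_rays / bundle_size)
    # Every bundle pays the fixed overhead once, so more bundles cost more in total.
    total_time = total_rays * time_per_ray + bundles * overhead_per_bundle
    # A UI request waits at most for one bundle to finish.
    worst_ui_wait = min(bundle_size, total_rays) * time_per_ray + overhead_per_bundle
    return total_time, worst_ui_wait


if __name__ == "__main__":
    for size in (256, 64, 32):
        print(size, chunking_tradeoff(1024, size))
```

In this model, smaller bundles shorten the worst-case wait for the user interface but increase the total render time, which matches the behaviour described in the paragraph above.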

Hybrid Rendering with CPUs and the CUDA Engine


Starting with V-Ray 3.60, V-Ray RT GPU can perform hybrid rendering with the CUDA engine, using both the CPU and NVIDIA GPUs. V-Ray can now execute the CUDA source on the CPU, as though the CPU were another CUDA device. To enable the hybrid rendering mode, enable the C++/CPU device in the list of CUDA devices.

The hybrid rendering mode does not require any special drivers. Furthermore, you can use the CPU as a CUDA device even if you do not have an NVIDIA GPU or NVIDIA drivers installed, which means this mode can be used on computers that have no GPU at all. The hybrid render engine running on a CPU supports the same features as the regular V-Ray RT GPU CUDA engine.


References


The official OpenCL website: http://www.khronos.org/opencl/

The NVIDIA CUDA developer zone: https://developer.nvidia.com/category/zone/cuda-zone