This page provides information on setting up V-Ray RT.




GPU rendering allows V-Ray RT to perform the raytracing calculations on the GPUs installed in the system, rather than the CPU. Since GPUs are specifically designed for massively parallel calculations, they can speed up the rendering process by an order of magnitude. 

To enable GPU rendering, select the CUDA or OpenCL value for the Engine type parameter in the V-Ray RT settings.

Note: For answers to common questions about V-Ray RT, see the V-Ray RT FAQ page.


Supported Hardware and Drivers

V-Ray RT for GPU has two back-ends (or engines): one based on OpenCL (see the references section below for more information on OpenCL) and one based on the NVIDIA CUDA platform.
In principle, the OpenCL engine can run on any OpenCL-compatible hardware. In practice, this depends on the manufacturer and the driver version. The results of the tests we have performed are listed below:

  • NVIDIA GPUs: On NVIDIA GPUs, always prefer CUDA, since it runs faster and supports more features. That said, V-Ray RT OpenCL works properly on Fermi-, Kepler-, Maxwell-, and Pascal-based cards with the latest drivers. Cards with an architecture older than Fermi are not supported.
  • AMD GPUs: RT OpenCL works only on AMD GCN 1.1 (or newer) GPUs with driver 15 (or newer) and V-Ray 3.40.01 (or newer). The Polaris architecture or later is recommended.
  • Intel GPUs: RT OpenCL does NOT work on Intel GPUs.
  • CPU (using the Intel or AMD OpenCL runtime): OpenCL allows the CPU to be used for calculations together with the GPUs. For that, you need to install an OpenCL runtime for CPUs (Intel OpenCL Runtime or AMD OpenCL Runtime). V-Ray RT OpenCL works on all V-Ray versions with the latest Intel/AMD runtimes on Windows.
  • Combining CPU and GPUs: V-Ray GPU only needs an OpenCL-capable device, which can be a GPU or a CPU (with the appropriate runtime). Using OpenCL Device Select, you can tell V-Ray to use any number of devices in any combination.


The CUDA engine is supported only in 64-bit builds of V-Ray RT, on Fermi-, Kepler-, Maxwell-, and Pascal-based NVIDIA cards.
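
The support rules above can be summarized as a small lookup. The following Python sketch is illustrative only: the function recommended_engine and all of its parameters are hypothetical names that are not part of V-Ray, and it encodes only the rules stated on this page. Any architecture not named on this page is therefore reported as unsupported, which may not match the behavior of newer V-Ray versions.

```python
# Hypothetical helper, not part of V-Ray: encodes the support rules on this page.
NVIDIA_ARCHS = {"fermi", "kepler", "maxwell", "pascal"}


def parse_version(text):
    """Turn a dotted version such as '3.40.01' into (3, 40, 1)."""
    return tuple(int(part) for part in text.split("."))


def recommended_engine(vendor, arch="", is_64bit=True,
                       gcn=(0, 0), driver_major=0, vray_version="0"):
    """Return 'CUDA', 'OpenCL', or None (no supported GPU engine)."""
    vendor = vendor.lower()
    if vendor == "nvidia":
        if arch.lower() not in NVIDIA_ARCHS:
            return None  # older than Fermi, or not listed on this page
        # CUDA is preferred but is only available in 64-bit builds.
        return "CUDA" if is_64bit else "OpenCL"
    if vendor == "amd":
        supported = (gcn >= (1, 1)
                     and driver_major >= 15
                     and parse_version(vray_version) >= (3, 40, 1))
        return "OpenCL" if supported else None
    return None  # Intel GPUs and unknown vendors
```

For example, recommended_engine("NVIDIA", arch="Pascal") returns "CUDA", while recommended_engine("Intel") returns None.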

It is highly recommended to use the CUDA engine on NVIDIA GPUs.
Rendering on multiple GPUs is supported, and by default V-Ray RT for GPU uses all available OpenCL/CUDA devices. See the section below on how to choose which devices V-Ray RT GPU runs on.
V-Ray RT for GPU has been tested on a number of graphics cards including:

Titan X Pascal, GeForce 980, GeForce 970, GeForce 960, GeForce 750ti, GeForce 750m, GTX 1080, GTX 1070, GeForce 680 GTX, GeForce 580 GTX, GeForce 590 GTX, GeForce 570, GeForce 480 GTX, Titan X, Titan Z, Titan Black, Tesla K40, Tesla K80, Tesla C2050, Tesla M60, Quadro P6000, Quadro P5000, Quadro M6000, Quadro M5000, Quadro M4000, Quadro K5200, Quadro K6000, Quadro K4200, Quadro K4000, Quadro K2000, Quadro 2000M, AMD R9 Fury X, AMD R9 Fury Nano, AMD Fury R9 390X, AMD R9 380X, AMD W9100.

If V-Ray RT for GPU cannot find a supported OpenCL/CUDA device on the system, it may silently fall back to CPU code. To verify that the V-Ray render server is actually rendering on the GPU, check its console output.

Driver versions for V-Ray RT GPU CUDA

In V-Ray 3.50 and newer, RT GPU works well with the latest NVIDIA drivers. For a comprehensive list of supported drivers for older V-Ray versions, please contact


OpenCL Run-Time Compilation

In general, for portability reasons, OpenCL code is compiled at run time (much like OpenGL shaders written in GLSL and DirectX shaders written in HLSL), as opposed to CUDA code, which is precompiled and stored in binary format inside the program executable. This allows the OpenCL code to be portable and optimized for the particular hardware on which it runs. The downside is that the compilation may take a while, depending on the complexity of the OpenCL code, the number of OpenCL devices in the system, and the OpenCL compiler and driver versions. Fortunately, the compiled binary can be cached and reloaded much faster later on.
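
The compile-once, reload-later pattern described above can be sketched in a few lines of Python. This is a generic illustration and not V-Ray's actual caching code: the names cache_key and load_or_compile, the cache folder name, and the choice of what goes into the cache key (source text, device name, driver version) are all assumptions made for the example, and compile_fn stands in for whatever real compiler call would be used.

```python
import hashlib
import tempfile
from pathlib import Path


def cache_key(source, device_name, driver_version):
    """Hash everything assumed to affect the compiled binary."""
    digest = hashlib.sha256()
    for part in (source, device_name, driver_version):
        digest.update(part.encode("utf-8"))
        digest.update(b"\0")  # separator so adjacent fields cannot run together
    return digest.hexdigest()


def load_or_compile(source, device_name, driver_version, compile_fn, cache_dir=None):
    """Return (binary, was_cached). compile_fn is a stand-in for a real compiler."""
    if cache_dir is None:
        # Hypothetical location: a subfolder of the user's temporary folder.
        cache_dir = Path(tempfile.gettempdir()) / "kernel_cache_demo"
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    path = cache_dir / (cache_key(source, device_name, driver_version) + ".bin")
    if path.exists():
        return path.read_bytes(), True   # fast path: skip compilation

    binary = compile_fn(source)          # slow path: compile once
    path.write_bytes(binary)             # cache for subsequent runs
    return binary, False
```

Because the key in this sketch includes the driver version, a driver update leads to a one-time recompilation rather than reuse of a stale binary.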

The first time you install V-Ray RT GPU and perform a GPU rendering, V-Ray will compile the OpenCL code for your hardware. This may take anywhere from 30 seconds to several minutes, depending on the number of graphics cards and driver version, and might need a significant amount of CPU memory. In the V-Ray RT render log window you will see something like this:




The resulting compiled binary code is cached to disk in the temporary folder for the current user. On subsequent runs, the compilation phase is skipped and the code is loaded directly from the disk:

Such complex run-time compilation is not required by the CUDA engine and it will start rendering almost right away.

Choosing Which Devices to Use for Rendering



You may not want to use all available OpenCL/CUDA devices for rendering, especially if you have multiple GPUs and want to leave one of them free for the user interface. To do this, use the control in the V-Ray RT GPU settings in 3ds Max, which lets you turn individual GPUs off for RT GPU rendering.

Alternatively, you can use the supplied GUI tool, which you can find in Start Menu > Programs > Chaos Group > V-Ray RT Adv for 3ds Max > Select OpenCL devices for V-Ray RT:
After changing this option, you need to restart the V-Ray RT render server (if it is running) for the changes to take effect. If the V-Ray RT render server is running as a Windows service, you may need to stop it from the Services applet in the Control Panel.
Note that the tool determines the devices to use for both CUDA and OpenCL rendering.


Balancing the GPU Load

If you have only one GPU in your system, you may find that the user interface becomes sluggish and unresponsive while V-Ray RT is rendering on the GPU. To alleviate this problem, reduce the Rays per pixel and/or the Ray bundle size parameters in the Performance section of the V-Ray RT renderer settings in the 3ds Max Render Setup dialog. For example, you can try values like 64/1 or 32/1. This breaks up the data passed to the GPU into smaller chunks so that user interface requests can be processed sooner. Note, however, that this also reduces the rendering speed. Turn on the statistics display to compare render speeds and find the optimal settings for your system.
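
The trade-off described above can be illustrated with a toy model in Python. This is a rough sketch under stated assumptions, not a description of how V-Ray schedules work: it assumes every chunk sent to the GPU carries a fixed overhead and that the user interface can only be serviced between chunks. The function name chunking_tradeoff and all numbers are hypothetical.

```python
import math


def chunking_tradeoff(total_work, chunk_size, per_chunk_overhead, throughput):
    """Toy model: return (longest_ui_wait, total_render_time) in seconds.

    total_work and chunk_size are in arbitrary work units, throughput is in
    work units per second, per_chunk_overhead is a fixed cost per chunk.
    """
    chunks = math.ceil(total_work / chunk_size)
    # The UI is assumed to wait at most for one full chunk to finish.
    longest_ui_wait = per_chunk_overhead + min(chunk_size, total_work) / throughput
    # Every chunk pays the fixed overhead once.
    total_render_time = chunks * per_chunk_overhead + total_work / throughput
    return longest_ui_wait, total_render_time


# Made-up numbers, for illustration only.
large = chunking_tradeoff(1_000_000, 256_000, 0.001, 1_000_000)
small = chunking_tradeoff(1_000_000, 32_000, 0.001, 1_000_000)
print("large chunks:", large)
print("small chunks:", small)
```

In this model, smaller chunks shorten the longest wait for the user interface but increase the total render time, which is why the optimal values are best found by measuring on your own system.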


References

The official OpenCL website:

The NVIDIA CUDA developer zone: