CUDA Ray Tracing 2x Faster Than RTX: My CUDA Ray Tracing Journey

I think the title of this should be changed; at the moment it's clickbait. It should be something like:

CUDA Ray Tracing 2x Faster Than RTX when Rendering Spheres

As far as I can see, this renderer can't do anything else except spheres (and maybe planes).

It's no bad achievement to beat a general purpose production renderer at one specific thing, but a renderer that can only do spheres is just a hyper-optimized toy, and here it's being presented as far more than that.

a day ago | esperent

I’m guessing it’s because they’re using all the computing power the GPU has to offer in CUDA mode, as opposed to sharing the GPU with other functions (when in RTX).

2 days ago | sheepscreek

More likely it's because the scene they're using is completely unrepresentative of what people are interested in: almost no triangles, primarily procedural nodes (for spheres), and in general a fairly simple scene.
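To give a sense of why spheres are such a friendly case, here is a rough sketch of an analytic ray-sphere test in plain C++. It is my own illustration, not code from the article: the names `Vec3` and `hit_sphere` are made up, and the ray direction is assumed to be normalized. The whole test is a couple of dot products and one square root, with no mesh data to fetch.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical minimal vector type, for illustration only.
struct Vec3 { float x, y, z; };

inline Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Solves |origin + t * dir - center|^2 = radius^2 for the nearest t >= 0.
// Assumes dir is normalized, so the quadratic's leading coefficient is 1.
// Returns true and writes the hit distance to t_out on a hit.
inline bool hit_sphere(Vec3 origin, Vec3 dir, Vec3 center, float radius, float& t_out) {
    assert(radius > 0.0f);
    Vec3 oc = origin - center;
    float b = dot(oc, dir);                   // half of the linear coefficient
    float c = dot(oc, oc) - radius * radius;  // constant term
    float disc = b * b - c;                   // quarter of the discriminant
    if (disc < 0.0f) return false;            // ray misses the sphere
    float s = std::sqrt(disc);
    float t = -b - s;                         // near root
    if (t < 0.0f) t = -b + s;                 // origin inside the sphere: use far root
    if (t < 0.0f) return false;               // sphere is entirely behind the ray
    t_out = t;
    return true;
}
```

A triangle-heavy scene has to go through an acceleration structure and many per-triangle tests instead, which is where a benchmark like this stops being representative.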

a day ago | atq2119

Yup, this is an "assume a spherical cow" situation: it's not dishonest, but you can't draw any real-world conclusions from the experiment unless you happen to be working in a very restricted space.

a day ago | colechristensen

In a real-world scenario, wouldn't you need to make the CUDA cores aware of the game geometry, adding more work on the CPU?

a day ago | ChocolateGod

Ideally you don't make the CUDA cores aware but rather the ray-tracing circuitry. RT cores are designed to perform ray-triangle intersections in a BVH. You get the teraflops and memory bandwidth (or more of it) if you fit the RT-core computing model.

And in most cases it's OK to spend time on one CPU function (creating and loading the BVH) against the hundreds of thousands of frames you'll be drawing on the GPU.
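For reference, a ray-triangle test looks roughly like the sketch below. This is a plain C++ version of the well-known Möller-Trumbore algorithm, written for illustration only. It makes no claim about how RT cores implement the test internally, and the names `Vec3` and `hit_triangle` are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical minimal vector type, for illustration only.
struct Vec3 { float x, y, z; };

inline Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Möller-Trumbore ray-triangle intersection (two-sided, no backface culling).
// Returns true and writes the hit distance to t_out if the ray
// origin + t * dir hits triangle (v0, v1, v2) at some t >= 0.
inline bool hit_triangle(Vec3 origin, Vec3 dir, Vec3 v0, Vec3 v1, Vec3 v2, float& t_out) {
    assert(dot(dir, dir) > 0.0f);            // a zero direction is not a ray
    const float eps = 1e-7f;
    Vec3 e1 = v1 - v0;
    Vec3 e2 = v2 - v0;
    Vec3 p = cross(dir, e2);
    float det = dot(e1, p);
    if (std::fabs(det) < eps) return false;  // ray is parallel to the triangle plane
    float inv = 1.0f / det;
    Vec3 s = origin - v0;
    float u = dot(s, p) * inv;               // first barycentric coordinate
    if (u < 0.0f || u > 1.0f) return false;
    Vec3 q = cross(s, e1);
    float v = dot(dir, q) * inv;             // second barycentric coordinate
    if (v < 0.0f || u + v > 1.0f) return false;
    float t = dot(e2, q) * inv;              // distance along the ray
    if (t < 0.0f) return false;              // triangle is behind the origin
    t_out = t;
    return true;
}
```

In a real renderer this test only runs on the handful of triangles that survive BVH traversal, which is why building the BVH once on the CPU pays for itself.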

a day ago | touisteur

A whole lot of stuff is going on during gaming and graphics rendering with trick upon trick to squeeze out every last bit of performance. Unless you're an expert in a graphics rendering stack or a game engine it's hard to have these conversations in a meaningful way.

14 hours ago | colechristensen

wow, bypassing a rendering backend makes things go faster, what a surprise!

This only runs on NVIDIA; Vulkan is designed to be cross-compatible with not only GPUs, but operating systems as well. Vulkan is pretty direct compared to something like DX11 though, so I guess it is interesting to see a performance improvement nonetheless.

a day ago | kachapopopow

> FMA performance here is a non-issue, I'm not just flexing—I'm showing off my CUDA prowess. But hey, got to demonstrate I know my hardware!

This article is pretty embarrassing, and as others have noted, very misleading due to the RTX units hardly being used.

a day ago | pixelpoet

> __restrict__ Pointers

Ahh, my favorite nitpick: C++ not having sane default aliasing rules spills over into CUDA-land.
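For anyone who hasn't run into it, here is a small sketch of what the qualifier buys you, in plain C++ rather than CUDA. It assumes a compiler that accepts the non-standard `__restrict__` spelling (GCC, Clang, and nvcc do; MSVC spells it `__restrict`), and the function name `saxpy` is just an example, not the article's kernel.

```cpp
#include <cassert>
#include <cstddef>

// Computes y[i] = a * x[i] + y[i] for i in [0, n).
// __restrict__ is a promise from the caller that x and y never overlap.
// Without it the compiler has to assume a store to y[i] might change a
// later x[j], which can block vectorization or force runtime overlap checks.
// Breaking the promise (passing overlapping ranges) is undefined behavior.
inline void saxpy(std::size_t n, float a,
                  const float* __restrict__ x,
                  float* __restrict__ y) {
    assert(n == 0 || (x != nullptr && y != nullptr));
    for (std::size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The annotation changes nothing about the result for non-overlapping inputs; it only widens what the optimizer is allowed to do.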

a day ago | kookamamie

It's hard to have them when one of the original goals was being mostly copy-paste compatible with C89.

10 hours ago | pjmlp

Yes, though C has restrict in the language now, whereas C++ does not.

10 hours ago | kookamamie

Because no one has ever bothered to create a WG21 paper proposal to include it.