RTX device: path-tracing correctness improvements #293

Merged
jeffamstutz merged 20 commits into NVIDIA:next_release from tarcila:rtx-shading-fixes
May 13, 2026

@tarcila tarcila commented May 13, 2026

Quality improvements:

  • HDRI color parameter is now respected.
  • Clearcoat reflects HDRIs/sky on indirect bounces.
  • Rough dielectrics no longer go black on indirect bounces.
  • No more dark triangular bands at grazing light on smooth meshes.
  • Shading terminator no longer reveals per-triangle facets.
  • AO and ambient lighting are properly uniform (sampler bias fixed).
  • Direct and indirect diffuse lighting now agree.

Normal mapping fixes:

  • Tangents continuous across triangle edges; UV mirror seams blend correctly.
  • Normal-mapped lighting no longer mirrored along the bitangent.
  • glTF tangent handedness (.w) honored.

Stability improvements:

  • Fewer NaN/black-pixel artifacts.
  • No crash during face-varying validation on triangle-soup geometry (no index buffer).
  • Tangent buffer leak fixed.
  • Cutout transparency restored in the Interactive renderer.

Performance improvements:

  • Denoise-prepare dispatch overhead halved (two kernels fused into one).
  • Geometry edits no longer rebuild volume BVHs or light lists.

tarcila added 20 commits May 12, 2026 20:51

Split the global `lastBLASChange` into per-domain stamps so a geometry
edit no longer invalidates volume BVHs or per-group light index lists,
and vice versa.

  lastSurfaceBLASChange -> bumped by Geometry, Surface
  lastVolumeBLASChange  -> bumped by Volume, SpatialField
  lastLightSetChange    -> bumped on Group commits (light array rebind)
  lastTLASChange        -> unchanged (Instance)

Group::markFinalized bumps all three since any of its arrays may have
been rebound. Each Group::rebuild* gates on its own stamp; World fans
the trigger across the three for the BLAS rebuild path.
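
A minimal C++ sketch of the per-domain gating described above. The three stamp names come from the commit message; `ChangeStamps`, `GroupBuildState`, the `clock` counter and all method names are hypothetical stand-ins, not the device's actual classes.

```cpp
#include <cassert>
#include <cstdint>

using TimeStamp = std::uint64_t;

// Stamp names follow the commit message; everything else is hypothetical.
struct ChangeStamps {
  TimeStamp clock = 0;
  TimeStamp lastSurfaceBLASChange = 0; // bumped by Geometry, Surface
  TimeStamp lastVolumeBLASChange = 0;  // bumped by Volume, SpatialField
  TimeStamp lastLightSetChange = 0;    // bumped on Group commits

  void geometryCommitted() { lastSurfaceBLASChange = ++clock; }
  void volumeCommitted() { lastVolumeBLASChange = ++clock; }
  void groupFinalized() {
    // Any of the group's arrays may have been rebound: bump all three.
    ++clock;
    lastSurfaceBLASChange = lastVolumeBLASChange = lastLightSetChange = clock;
  }
};

struct GroupBuildState {
  TimeStamp surfaceBuilt = 0, volumeBuilt = 0, lightsBuilt = 0;
  int surfaceRebuilds = 0, volumeRebuilds = 0, lightRebuilds = 0;

  // Each rebuild is gated on its own stamp only.
  void update(const ChangeStamps &s) {
    assert(surfaceBuilt <= s.clock); // built stamps never run ahead of the clock
    if (surfaceBuilt < s.lastSurfaceBLASChange) {
      ++surfaceRebuilds;
      surfaceBuilt = s.lastSurfaceBLASChange;
    }
    if (volumeBuilt < s.lastVolumeBLASChange) {
      ++volumeRebuilds;
      volumeBuilt = s.lastVolumeBLASChange;
    }
    if (lightsBuilt < s.lastLightSetChange) {
      ++lightRebuilds;
      lightsBuilt = s.lastLightSetChange;
    }
  }
};
```

With this layout, a `geometryCommitted()` followed by `update()` increments only `surfaceRebuilds`; the volume and light counters stay where they were.
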
The faceVarying.normal/tangent count check dereferenced m_index
unconditionally, crashing on triangle-soup geometry (no index buffer).
The computed-tangent Array1D was built without a deleter, so the
cudaMalloc'd memory leaked when the array was released.
Also remove the unused bitangent parameter on computeTangents calls.

The bitangent formula was (t.x*e1 - s.x*e2)/det, sign-reversed vs.
the standard (s.x*e2 - t.x*e1)/det used by MikkTSpace. Normal-mapped
lighting on generated tangents was mirrored along the bitangent axis.
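
For reference, a small C++ sketch of the per-triangle solve with the corrected sign. It assumes `s` and `t` are the UV deltas of the two edges (s = uv1 - uv0, t = uv2 - uv0), which is an interpretation of the commit's notation; the `Vec2`/`Vec3` types and the name `triangleTangentFrame` are hypothetical.

```cpp
#include <cassert>

struct Vec2 { float x, y; };
struct Vec3 { float x, y, z; };
struct TangentFrame { Vec3 T, B; };

inline Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
// a*u + b*v
inline Vec3 combine(float a, Vec3 u, float b, Vec3 v) {
  return {a * u.x + b * v.x, a * u.y + b * v.y, a * u.z + b * v.z};
}

// Unnormalized tangent and bitangent of one triangle, solved from
//   e1 = s.x*T + s.y*B,  e2 = t.x*T + t.y*B.
inline TangentFrame triangleTangentFrame(Vec3 p0, Vec3 p1, Vec3 p2,
                                         Vec2 uv0, Vec2 uv1, Vec2 uv2) {
  const Vec3 e1 = sub(p1, p0), e2 = sub(p2, p0);
  const Vec2 s = {uv1.x - uv0.x, uv1.y - uv0.y};
  const Vec2 t = {uv2.x - uv0.x, uv2.y - uv0.y};
  const float det = s.x * t.y - t.x * s.y;
  assert(det != 0.0f); // degenerate UVs need a separate fallback path
  const float inv = 1.0f / det;
  TangentFrame f;
  f.T = combine(t.y * inv, e1, -s.y * inv, e2); // (t.y*e1 - s.y*e2)/det
  f.B = combine(s.x * inv, e2, -t.x * inv, e1); // (s.x*e2 - t.x*e1)/det
  return f;
}
```

On a unit right triangle with identity UVs this yields T = (1,0,0) and B = (0,1,0); the sign-reversed formula would give B = (0,-1,0).
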
Degenerate triangles or NaN/zero input normals produced fixed
(1,0,0)/(0,1,0) tangents that conflicted with the actual surface
normal. Add safeNormalize, fall back to a Pixar orthonormal basis
built from the geometric normal, and factor the per-corner
Gram-Schmidt + handedness into a shared orthogonalizeTangent helper.
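
A hedged C++ sketch of how these helpers could fit together. `safeNormalize` and `orthogonalizeTangent` are named in the commit, but the signatures below are assumptions, and `fallbackTangent` is a hypothetical name for the first axis of the branchless orthonormal basis published by Duff et al. (Pixar).

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };

inline Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 cross(Vec3 a, Vec3 b) {
  return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Normalize v, or return `fallback` when v is zero, NaN or infinite.
inline Vec3 safeNormalize(Vec3 v, Vec3 fallback) {
  const float len2 = dot(v, v);
  if (!(len2 > 0.0f) || !std::isfinite(len2)) return fallback;
  return v * (1.0f / std::sqrt(len2));
}

// First axis of the branchless orthonormal basis around a unit normal n.
inline Vec3 fallbackTangent(Vec3 n) {
  const float sign = std::copysign(1.0f, n.z);
  const float a = -1.0f / (sign + n.z);
  const float b = n.x * n.y * a;
  return {1.0f + sign * n.x * n.x * a, sign * b, -sign * n.x};
}

// Gram-Schmidt t against n, then store handedness in w (+1 or -1).
inline Vec4 orthogonalizeTangent(Vec3 n, Vec3 t, Vec3 b) {
  assert(std::fabs(dot(n, n) - 1.0f) < 1e-3f); // expects a unit normal
  const Vec3 tOrtho = safeNormalize(t - n * dot(n, t), fallbackTangent(n));
  const float w = dot(cross(n, tOrtho), b) < 0.0f ? -1.0f : 1.0f;
  return {tOrtho.x, tOrtho.y, tOrtho.z, w};
}
```

A zero, NaN, or normal-parallel input tangent all take the fallback branch, so the returned tangent is always unit length and perpendicular to `n`.
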
Per-triangle T at a shared corner depended on each face's own UV
gradient, so orthogonalizing against the same vertex normal still
produced different tangents per face — the shading frame jumped at
every triangle edge.

Replace with two passes: pass 1 atomically accumulates angle-weighted
T/B/N into per-vertex slots (MikkTSpace's averaging scheme); pass 2
normalizes and orthogonalizes. Output moves from face-varying
(3*numTri) to per-vertex 'vertex.tangent'. UV mirror seams must be
vertex-split — same constraint as MikkTSpace defaults.
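
The two passes can be sketched on the CPU as follows. This is a simplified, serial illustration: it accumulates only T (the commit accumulates T/B/N with atomics on the GPU), takes precomputed per-triangle tangents and vertex normals as inputs, and omits handedness. `vertexTangents` and `cornerAngle` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };
inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 normalize(Vec3 v) { return v * (1.0f / std::sqrt(dot(v, v))); }

// Interior angle of the triangle at `corner`, used as the averaging weight.
inline float cornerAngle(Vec3 corner, Vec3 next, Vec3 prev) {
  const float c = dot(normalize(next - corner), normalize(prev - corner));
  return std::acos(std::fmin(1.0f, std::fmax(-1.0f, c)));
}

std::vector<Vec3> vertexTangents(const std::vector<Vec3> &positions,
                                 const std::vector<Vec3> &normals,
                                 const std::vector<unsigned> &indices,
                                 const std::vector<Vec3> &triTangents) {
  assert(indices.size() == 3 * triTangents.size());
  assert(normals.size() == positions.size());
  std::vector<Vec3> accum(positions.size(), Vec3{0.0f, 0.0f, 0.0f});
  // Pass 1: accumulate angle-weighted face tangents into per-vertex slots.
  for (std::size_t t = 0; t < triTangents.size(); ++t) {
    for (int c = 0; c < 3; ++c) {
      const unsigned cur = indices[3 * t + c];
      const unsigned nxt = indices[3 * t + (c + 1) % 3];
      const unsigned prv = indices[3 * t + (c + 2) % 3];
      const float w = cornerAngle(positions[cur], positions[nxt], positions[prv]);
      accum[cur] = accum[cur] + triTangents[t] * w;
    }
  }
  // Pass 2: Gram-Schmidt against the vertex normal, then normalize.
  // (Unreferenced or degenerate vertices would need the fallback basis.)
  for (std::size_t v = 0; v < accum.size(); ++v) {
    const Vec3 n = normals[v];
    accum[v] = normalize(accum[v] - n * dot(n, accum[v]));
  }
  return accum;
}
```

Because every face touching a vertex reads the same averaged slot, the tangent no longer changes when crossing a triangle edge, which is the property the commit is after.
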
Tangent fetch was discarding .w (vec3 cast on a vec4 array) and reading
handedness from vertex 0 alone. Read the vec4 properly, barycentric-interpolate
the sign across the corners, then quantize to ±1 for the basis flip.

Numerical fixes:
- Rewrite ggxD denom as alpha2*x + (1-x) instead of x*(alpha2-1) + 1.
  The textbook form cancels catastrophically when alpha2 < eps(1) and
  collapses to zero at x=1, producing NaN throughput.
- Negate NdotL early-outs so a NaN takes the rejection path instead of
  slipping through (NaN compares false to <=, > etc.).
- Fall back from Ns to Ng in Matte and PBR shading state when the
  normal length squared is non-positive (catches both NaN and zero).
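
A small C++ illustration of the first two items, assuming `x` stands for NdotH squared and `alpha2` for the squared roughness. The function names are hypothetical; only the two denominator expressions and the negated comparison come from the commit message.

```cpp
#include <cassert>
#include <cmath>

// Textbook GGX denominator term: alpha2 - 1 rounds to -1 for tiny alpha2,
// so the result is exactly 0 at x = 1.
inline float ggxDenomTextbook(float x, float alpha2) {
  return x * (alpha2 - 1.0f) + 1.0f;
}

// Algebraically identical, but keeps the alpha2 contribution at x = 1.
inline float ggxDenomStable(float x, float alpha2) {
  return alpha2 * x + (1.0f - x);
}

// GGX normal distribution using the stable denominator.
inline float ggxD(float NdotH, float alpha2) {
  assert(alpha2 > 0.0f);
  const float x = NdotH * NdotH;
  const float d = ggxDenomStable(x, alpha2);
  return alpha2 / (3.14159265358979f * d * d);
}

// `NdotL <= 0` is false for NaN, so a NaN is NOT rejected.
inline bool rejectsNaive(float NdotL) { return NdotL <= 0.0f; }

// `!(NdotL > 0)` is true for NaN, so a NaN takes the rejection path.
inline bool rejectsNegated(float NdotL) { return !(NdotL > 0.0f); }
```

With `alpha2 = 1e-9f` and `x = 1`, the textbook form evaluates to 0 in single precision while the rewritten form evaluates to `alpha2`, so `ggxD` stays finite. This assumes IEEE float semantics (no fast-math).
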

Mirror-seam tangents:
- Compute per-vertex bitangents B_i = t_i.w * cross(N_i, T_i) and
  barycentric-blend B and T independently, instead of blending t.w
  signs and applying once at the hit. Matches glTF Sample Renderer /
  PBRT / Filament; avoids carving seam edges into the tangent frame.
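
A sketch of the per-vertex bitangent blend, under the assumption that tangents are stored as vec4 with handedness in `.w` (as in glTF). `blendTangentFrame` and the types are hypothetical names; the result is left unnormalized, so a caller would still normalize and re-orthogonalize against the shading normal.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };
struct TangentFrame { Vec3 T, B; };

inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
inline Vec3 cross(Vec3 a, Vec3 b) {
  return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Build B_i = t_i.w * cross(N_i, T_i) at each corner, then blend T and B
// independently with the hit's barycentric weights.
inline TangentFrame blendTangentFrame(const Vec3 n[3], const Vec4 t[3],
                                      const float bary[3]) {
  assert(std::fabs(bary[0] + bary[1] + bary[2] - 1.0f) < 1e-4f);
  TangentFrame f{{0.0f, 0.0f, 0.0f}, {0.0f, 0.0f, 0.0f}};
  for (int i = 0; i < 3; ++i) {
    const Vec3 Ti{t[i].x, t[i].y, t[i].z};
    const Vec3 Bi = cross(n[i], Ti) * t[i].w;
    f.T = f.T + Ti * bary[i];
    f.B = f.B + Bi * bary[i];
  }
  return f;
}
```

Unlike quantizing a blended sign at the hit, the bitangent here varies continuously across the triangle instead of flipping along a hard line.
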
The lightDotNg gate and the ambient hemisphere normal both used Ng,
which carved per-triangle facet shapes into the lit/unlit boundary at
grazing angles on smooth-normal meshes. Switch both to Ns so the
terminator follows the shading surface; the material's own NdotL guard
still rejects light from below.

randomDir was normalizing a cube-uniform vector, which clusters samples
toward cube corners, not uniform on the sphere. Replace with the
analytic cosTheta = 1 - 2u mapping. sampleHemisphere had Malley's
method inverted (z = u, r = sqrt(1-sqrt(u))) instead of (r = sqrt(u),
z = sqrt(1-r^2)), biasing AO and diffuse estimates toward grazing.
Fold cos(theta)/pi into the ambient LightSample pdf so it matches the
corrected density.
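
The corrected mappings, written out as a standalone C++ sketch. The function names are hypothetical and the inputs `u1`, `u2` are assumed to be uniform random numbers in [0, 1); the formulas are the standard analytic sphere mapping and Malley's method referenced above.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

constexpr float kPi = 3.14159265358979f;

// Uniform direction on the unit sphere: cosTheta = 1 - 2u.
inline Vec3 uniformSphere(float u1, float u2) {
  assert(u1 >= 0.0f && u1 <= 1.0f);
  const float z = 1.0f - 2.0f * u1;
  const float r = std::sqrt(std::fmax(0.0f, 1.0f - z * z));
  const float phi = 2.0f * kPi * u2;
  return {r * std::cos(phi), r * std::sin(phi), z};
}

// Malley's method: sample the unit disk with r = sqrt(u), then project up
// with z = sqrt(1 - r^2). The result is cosine-distributed around +z.
inline Vec3 cosineHemisphere(float u1, float u2) {
  assert(u1 >= 0.0f && u1 <= 1.0f);
  const float r = std::sqrt(u1);
  const float phi = 2.0f * kPi * u2;
  const float z = std::sqrt(std::fmax(0.0f, 1.0f - u1));
  return {r * std::cos(phi), r * std::sin(phi), z};
}

// Density of cosineHemisphere with respect to solid angle.
inline float cosineHemispherePdf(float cosTheta) {
  return cosTheta > 0.0f ? cosTheta / kPi : 0.0f;
}
```

A light sample drawn with `cosineHemisphere` should report `cosineHemispherePdf(z)` as its pdf, which is the cos(theta)/pi factor mentioned above.
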
For opaque non-metals reflectProb=1 always, so the indirect bounce was
purely glossy GGX and multi-bounce Lambertian light was missing entirely
— rough dielectrics rendered black on indirect bounces. Add a third
cosine-weighted diffuse lobe with importance proxy
(1-F)*(1-metallic)*(1-transmission)*luminance(baseColor); existing
reflect/transmit weights now divide by their (now <1) lobe-pick
probability. Sampled around Ns to match shadeSurface's diffuseBRDF axis.
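
A rough C++ sketch of three-lobe selection. Only the diffuse proxy is taken from the commit message; the reflect and transmit weights below (`F` and `(1-F)*(1-metallic)*transmission`) are placeholder assumptions, and `lobeWeights`/`pickLobe` are hypothetical names.

```cpp
#include <cassert>

enum class Lobe { Reflect, Transmit, Diffuse };
struct LobeWeights { float reflect, transmit, diffuse; };
struct LobePick { Lobe lobe; float prob; };

// F: Fresnel reflectance, baseLum: luminance(baseColor).
inline LobeWeights lobeWeights(float F, float metallic, float transmission,
                               float baseLum) {
  const float dielectric = (1.0f - F) * (1.0f - metallic);
  return {F,                                           // assumed reflect proxy
          dielectric * transmission,                   // assumed transmit proxy
          dielectric * (1.0f - transmission) * baseLum}; // proxy from the commit
}

// Pick one lobe with probability proportional to its weight. The caller
// divides the sampled lobe's contribution by `prob` to stay unbiased, and
// should terminate the path if `prob` is 0.
inline LobePick pickLobe(const LobeWeights &w, float u) {
  const float sum = w.reflect + w.transmit + w.diffuse;
  assert(sum > 0.0f);
  const float pr = w.reflect / sum;
  const float pt = w.transmit / sum;
  if (u < pr) return {Lobe::Reflect, pr};
  if (u < pr + pt) return {Lobe::Transmit, pt};
  return {Lobe::Diffuse, w.diffuse / sum};
}
```

For an opaque dielectric the diffuse lobe now receives most of the samples, which is what restores multi-bounce Lambertian light; a pure metal still resolves to the reflect lobe with probability 1.
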
Two sequential kernels did identical resolveSample work for color and
for the albedo/normal guides. Merge into a single kernel; null-pointer
checks gate the guide writes when the guides are absent. Halves the
prepare-denoise dispatch overhead per frame.
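
As a CPU analogy of the fused kernel (the real code is a GPU kernel), the loop below resolves each pixel once and gates the guide writes on null pointers. The `Accum` layout, the `invSampleCount` scaling, and the name `prepareDenoise` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical per-pixel accumulation record.
struct Accum {
  float color[3];
  float albedo[3];
  float normal[3];
};

// One pass over the pixels: color is always written, the albedo/normal
// guides only when their output buffers are present.
inline void prepareDenoise(const Accum *accum, std::size_t numPixels,
                           float invSampleCount, float *color, float *albedo,
                           float *normal) {
  assert(accum != nullptr && color != nullptr);
  for (std::size_t p = 0; p < numPixels; ++p) {
    for (int c = 0; c < 3; ++c) {
      color[3 * p + c] = accum[p].color[c] * invSampleCount;
      if (albedo) albedo[3 * p + c] = accum[p].albedo[c] * invSampleCount;
      if (normal) normal[3 * p + c] = accum[p].normal[c] * invSampleCount;
    }
  }
}
```
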
The original `compMax(vec4(abs(P), compMax(abs(dir))*t))` was redundant:
for normalized rays `hitpoint = origin + dir*t`, so `|hitpoint|_inf`
already bounds the magnitude that drives ULP-scale self-intersection
offsets at the surface. Drop the dir/t inputs and use `|P|_inf` alone.

The sole caller (`populateSurfaceHit` in populateHit.h) only runs from
__closesthit__/__anyhit__ programs, where `optixGetRayTmax()` is the
finite hit distance, so there was no functional bug; this is a cleanup.

Smooth-normal triangles produce dark, triangle-shaped bands at grazing
light: the planar hit point lies below the smooth surface implied by
per-vertex normals, so direct-light shadow rays start inside that
implied curvature and self-occlude on the tessellation.

Add a shadingHitpoint() helper (Hanika RTGII ch. 4: signed distance from
each vertex tangent plane, projected back along the vertex normal,
barycentric-blended) and call it at shadow/AO ray origins only.
Continuation rays keep the unmodified facet hitpoint — transmission
needs it: the smoothed point can sit far enough above the facet that
the -Ng*epsilon offset still leaves the origin outside the volume,
blocking the refracted path from reaching the back wall.
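
A C++ sketch of the offset described above, following the construction in Hanika's Ray Tracing Gems II chapter as I understand it. The signature is an assumption (the commit names only `shadingHitpoint()`), as are the `Vec3` helpers; `bary[i]` is the barycentric weight of vertex `v[i]`.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
inline Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 normalize(Vec3 v) { return v * (1.0f / std::sqrt(dot(v, v))); }

// P: hit point on the flat triangle. v/n: vertex positions and unit vertex
// normals. Returns P lifted onto the surface implied by the vertex normals.
inline Vec3 shadingHitpoint(Vec3 P, const Vec3 v[3], const Vec3 n[3],
                            const float bary[3]) {
  assert(std::fabs(bary[0] + bary[1] + bary[2] - 1.0f) < 1e-4f);
  Vec3 offset{0.0f, 0.0f, 0.0f};
  for (int i = 0; i < 3; ++i) {
    Vec3 d = P - v[i];
    // Signed distance of P from the vertex tangent plane; only points
    // below the plane (negative distance) are pushed back up.
    const float below = std::fmin(0.0f, dot(d, n[i]));
    d = d - n[i] * below;
    offset = offset + d * bary[i];
  }
  return P + offset;
}
```

On a triangle whose vertex normals all equal the face normal the function returns `P` unchanged; with normals that fan outward (a convex patch) the result sits slightly above the facet. Per the commit, only shadow and AO ray origins would use this point.
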
Both shadeSurface and nextRay evaluated (1-F) at the GGX half-vector,
but the diffuse direction is independent of H so the weight didn't
match between NEE and the bounce. Evaluate the diffuse Fresnel at NdotV
(Frostbite/Disney) so the two estimators agree at any roughness.

Also factor the Fresnel+iridescence block into evalFresnelWithIridescence
to remove the duplicated block.
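
To illustrate the NdotV choice, here is a tiny C++ sketch using the Schlick approximation. Whether the device uses Schlick is an assumption, and `diffuseFresnelWeight` is a hypothetical name; the point is only that the weight takes no half-vector, so direct lighting and the sampled bounce can call the same function with the same argument.

```cpp
#include <cassert>
#include <cmath>

// Schlick's Fresnel approximation for reflectance f0 at normal incidence.
inline float schlickFresnel(float f0, float cosTheta) {
  assert(f0 >= 0.0f && f0 <= 1.0f);
  const float c = std::fmin(1.0f, std::fmax(0.0f, cosTheta));
  const float m = 1.0f - c;
  return f0 + (1.0f - f0) * m * m * m * m * m;
}

// Energy left for the diffuse lobe, evaluated at the view angle only.
inline float diffuseFresnelWeight(float f0, float NdotV) {
  return 1.0f - schlickFresnel(f0, NdotV);
}
```
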
The init() function had two post-hoc dot-product guards to swap a
NaN/zero shading normal for the geometric one. Doing it inside
sampleNormalMap means every caller gets a usable normal back without
the cleanup pass downstream.

The field stores the refraction ratio (n1/n2 from the incident side),
pre-inverted on front-facing hits so it can feed glm::refract directly.
Calling it 'ior' invited reading it as the material's IOR. Move the
'pre-inverted' note onto the struct field so the contract lives with
the data.
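
A short C++ sketch of that contract, assuming the medium outside the surface has an index of 1. `refractionRatio` is a hypothetical helper, and `refract` below is a local reimplementation of the GLSL/glm formula so the example stays self-contained.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
inline Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// n1/n2 seen from the incident side: 1/ior when entering through a
// front face, ior when leaving through a back face (outside index = 1).
inline float refractionRatio(float materialIor, bool frontFacing) {
  assert(materialIor > 0.0f);
  return frontFacing ? 1.0f / materialIor : materialIor;
}

// GLSL-style refract: I is the unit incident direction, N the unit normal
// facing against I. Returns the zero vector on total internal reflection.
inline Vec3 refract(Vec3 I, Vec3 N, float eta) {
  const float d = dot(N, I);
  const float k = 1.0f - eta * eta * (1.0f - d * d);
  if (k < 0.0f) return {0.0f, 0.0f, 0.0f};
  return I * eta - N * (eta * d + std::sqrt(k));
}
```

Storing the ratio rather than the raw IOR means the value can be passed to `refract` as-is on either side of the surface, which is why naming the field after the material's IOR was misleading.
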
Clearcoat was only evaluated in NEE, so smooth clearcoats over matte
bases never picked up HDRI/sky reflections on the bounce path.

@tarcila tarcila requested a review from jeffamstutz May 13, 2026 13:24

@jeffamstutz jeffamstutz left a comment

LGTM, thanks!

@jeffamstutz jeffamstutz merged commit 73b8f14 into NVIDIA:next_release May 13, 2026
8 checks passed
@tarcila tarcila deleted the rtx-shading-fixes branch May 13, 2026 13:53