RefracGS introduces a novel framework for high-fidelity Novel View Synthesis (NVS) through non-planar refractive surfaces by coupling a Neural Height Field for water geometry with 3D Gaussian Splatting for the underwater scene. It achieves state-of-the-art visual quality and surface reconstruction while maintaining real-time rendering at 200+ FPS.
TL;DR
RefracGS is the first framework to achieve real-time (200+ FPS) novel view synthesis and accurate surface reconstruction through wavy water. By combining a Neural Height Field with 3D Gaussian Ray Tracing, it overcomes the extreme geometric distortions that typically break traditional 3DGS and NeRF models.
The Problem: The "Straight-Line" Assumption vs. Snell's Law
In standard computer vision reconstruction, we assume light travels in straight lines. However, when looking into a pool of water, light bends at the interface (refraction) according to Snell's Law.
If you ignore this, a single underwater point appears at different "virtual" locations from different camera angles. For 3D Gaussian Splatting (3DGS), this leads to a "cloud of floaters" and blurred textures as the optimizer tries to reconcile inconsistent viewpoints. Previous fixes like NeRFrac used NeRF-based volumetric rendering, which is slow (minutes per frame) and often fails to capture sharp surface geometry.
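For reference, here is Snell's Law in vector form as a minimal NumPy sketch (a textbook formula, not code from the paper); `eta` is the ratio of refractive indices across the interface:

```python
import numpy as np

def refract(d, n, eta):
    """Refract unit direction d at a surface with unit normal n.

    eta = n1 / n2 (e.g. 1.0 / 1.33 for air -> water). Returns the refracted
    unit direction, or None on total internal reflection.
    """
    cos_i = -np.dot(d, n)                  # cosine of the incidence angle
    sin2_t = eta**2 * (1.0 - cos_i**2)     # Snell: sin(t) = eta * sin(i)
    if sin2_t > 1.0:
        return None                        # total internal reflection
    cos_t = np.sqrt(1.0 - sin2_t)
    return eta * d + (eta * cos_i - cos_t) * n

# A ray entering water from above bends toward the normal:
d = np.array([0.5, 0.0, -np.sqrt(0.75)])   # unit incident direction (30 deg)
n = np.array([0.0, 0.0, 1.0])              # flat-water normal
print(refract(d, n, eta=1.0 / 1.33))
```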
Methodology: Decoupling Surface from Scene
The core innovation of RefracGS is the Water Height Map. Instead of treating the water as a view-dependent effect, the authors model it as a physical boundary.
1. Neural Height Field & Recursive Tracing
The water surface is represented as a height function \( \mathcal{H}(x, y) \), parameterized by an MLP. To keep rendering fast, the authors use Recursive Subdivision Tracing (see the sketch after this list):
- They start with a coarse mesh to identify ray-surface intersection candidates.
- They recursively subdivide the mesh into smaller triangles, querying the MLP at each step for precise heights.
- This provides the efficiency of mesh intersections with the smoothness of implicit fields.
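A minimal NumPy sketch of this scheme (the MLP is replaced by a toy analytic wave, and the midpoint-split recursion is a generic choice, not necessarily the paper's exact subdivision rule):

```python
import numpy as np

def height(xy):
    """Toy analytic wave standing in for the height-field MLP H(x, y)."""
    return 0.05 * np.sin(4.0 * xy[..., 0]) * np.cos(4.0 * xy[..., 1])

def ray_triangle(o, d, v0, v1, v2, eps=1e-9):
    """Moller-Trumbore ray-triangle test; returns distance t, or None on a miss."""
    e1, e2 = v1 - v0, v2 - v0
    p = np.cross(d, e2)
    det = e1 @ p
    if abs(det) < eps:
        return None
    inv = 1.0 / det
    s = o - v0
    u = (s @ p) * inv
    if u < 0.0 or u > 1.0:
        return None
    q = np.cross(s, e1)
    v = (d @ q) * inv
    if v < 0.0 or u + v > 1.0:
        return None
    t = (e2 @ q) * inv
    return t if t > eps else None

def intersect_recursive(o, d, tri_xy, depth=0, max_depth=6):
    """Refine a coarse ray-surface hit by midpoint-splitting the triangle's
    xy footprint and re-querying the height field at the new vertices."""
    verts = [np.append(xy, height(xy)) for xy in tri_xy]
    t = ray_triangle(o, d, *verts)
    if t is None:
        return None                      # ray misses this patch
    if depth == max_depth:
        return o + t * d                 # converged hit point
    a, b, c = tri_xy
    m_ab, m_bc, m_ca = (a + b) / 2, (b + c) / 2, (c + a) / 2
    for sub in ([a, m_ab, m_ca], [m_ab, b, m_bc],
                [m_ca, m_bc, c], [m_ab, m_bc, m_ca]):
        hit = intersect_recursive(o, d, sub, depth + 1, max_depth)
        if hit is not None:
            return hit
    return None

# One camera ray against a single coarse patch covering the scene:
o = np.array([0.0, 0.0, 1.0])
d = np.array([0.3, 0.2, -1.0]); d /= np.linalg.norm(d)
coarse = [np.array([-2.0, -2.0]), np.array([4.0, -2.0]), np.array([-2.0, 4.0])]
print(intersect_recursive(o, d, coarse))
```

Each subdivision halves the triangle's edge length, so the approximation error shrinks geometrically while the MLP is only queried at the vertices actually needed.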

2. Refraction-Aware Gaussian Ray Tracing
Unlike standard 3DGS, which projects (splats) particles onto the image plane, RefracGS uses ray tracing.
- Forward Pass: A ray is cast from the camera, hits the height field, bends according to the local normal using Snell's Law, and then samples 3D Gaussians underwater.
- Differentiable Pipeline: Crucially, the gradient of the loss doesn't just update the Gaussian positions; it flows back through the refraction calculation to "nudge" the water-surface MLP into the correct shape (see the sketch below).
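A toy end-to-end PyTorch sketch of that gradient path (the MLP, the hit point, and the loss are all placeholders, not the paper's implementation); the point is only that `loss.backward()` reaches the height-field weights through the refraction:

```python
import torch
import torch.nn.functional as F

# Tiny stand-in for the paper's height-field MLP H(x, y).
height_mlp = torch.nn.Sequential(
    torch.nn.Linear(2, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))

def refract(d, n, eta):
    """Vector-form Snell's law (eta = n1 / n2); assumes no total internal reflection."""
    cos_i = -(d * n).sum(-1, keepdim=True)
    sin2_t = eta**2 * (1.0 - cos_i**2)
    cos_t = torch.sqrt(torch.clamp(1.0 - sin2_t, min=0.0))
    return eta * d + (eta * cos_i - cos_t) * n

# Footprint of the ray-surface hit point (found by the tracer in the real system).
xy = torch.tensor([[0.3, -0.1]], requires_grad=True)
h = height_mlp(xy).squeeze(-1)

# Surface normal of z = H(x, y) is (-dH/dx, -dH/dy, 1), obtained via autograd.
# create_graph=True keeps the normal differentiable w.r.t. the MLP weights.
(grad_xy,) = torch.autograd.grad(h.sum(), xy, create_graph=True)
normal = F.normalize(
    torch.cat([-grad_xy, torch.ones_like(h).unsqueeze(-1)], dim=-1), dim=-1)

# The incident ray bends at the surface; the bent ray then samples the Gaussians.
d_in = F.normalize(torch.tensor([[0.2, 0.1, -1.0]]), dim=-1)
d_out = refract(d_in, normal, eta=1.0 / 1.33)

# Placeholder for "render Gaussians along d_out and compare to the photo".
loss = (d_out.sum() - 0.5) ** 2
loss.backward()

# The photometric loss reaches the water-surface MLP through the refraction:
print(height_mlp[0].weight.grad is not None)   # True
```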
Experimental Breakthroughs
The authors tested RefracGS on both real-world pools and a new, more difficult dataset with steep camera angles.
- Visual Fidelity: While baselines like 3DGS produce artifacts, RefracGS recovers sharp underwater textures.
- Speed: It trains in ~10-15 minutes (vs. 2.5 hours for NeRFrac) and renders at 200+ FPS.
- Geometric Accuracy: On the RefracGS dataset, the surface reconstruction error was reduced by 97% compared to the best prior methods.

Beyond View Synthesis: Water Removal and Editing
Because the water surface is explicitly decoupled, RefracGS allows for "digital de-watering." You can simply turn off the refraction physics after training to see the underwater scene as if the water were gone. Furthermore, you can replace the original wavy water with a different surface (like a triangular wave) without retraining the underlying scene.
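For instance (the names below are hypothetical, not the paper's API), any callable height field can be swapped in for the trained MLP at render time:

```python
import numpy as np

def triangular_wave(xy, amplitude=0.05, period=0.4):
    """An analytic triangular-wave height field; any such callable can replace
    the trained MLP in the intersection/refraction code sketched above."""
    phase = np.mod(xy[..., 0], period)
    return (2.0 * amplitude / period) * np.abs(phase - period / 2.0)

# Hypothetical render-time switches (illustrative only):
#   surface = trained_height_mlp   -> reproduce the captured wavy water
#   surface = triangular_wave      -> re-render the scene under a new wave shape
#   surface = None                 -> skip refraction: "digital de-watering"
```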

Critical Insights & Future Work
The success of RefracGS shows that explicit physical modeling is often superior to "black-box" neural approximations for complex optical phenomena.
Limitations: The current model does not account for caustics (concentrated light patterns on the bottom) or volumetric scattering (murky water). However, the modular nature of the Gaussian ray tracer means these radiometric effects can be added as secondary components in future iterations.
Conclusion
RefracGS bridges the gap between physically accurate ray tracing and high-speed Gaussian Splatting. It is an important step forward for underwater robotics, environmental monitoring, and AR applications that require seeing through dynamic refractive interfaces.
