CoDa-4DGS: Dynamic Gaussian Splatting with Context and Deformation Awareness for Autonomous Driving

*Equal contribution
1Fraunhofer IVI, 2TU Munich, 3TU Delft, 4TH Ingolstadt

Our CoDa-4DGS incorporates both context awareness and deformation awareness to effectively compensate for deformable Gaussians in 4D. This results in more accurate dynamic scene rendering and enables a range of downstream applications, such as scene segmentation, instance segmentation, 4D reconstruction, novel view synthesis, and scene synthesis. Note that we use Principal Component Analysis (PCA) to visualize the splatted context awareness for each camera next to its RGB rendered results.

Abstract

Dynamic scene rendering opens new avenues in autonomous driving by enabling closed-loop simulations with photorealistic data, which is crucial for validating end-to-end algorithms. However, the complex and highly dynamic nature of traffic environments presents significant challenges in accurately rendering these scenes. In this paper, we introduce a novel 4D Gaussian Splatting (4DGS) approach, which incorporates context and temporal deformation awareness to improve dynamic scene rendering. Specifically, we employ a 2D semantic segmentation foundation model to self-supervise the 4D semantic features of Gaussians, ensuring meaningful contextual embedding. Simultaneously, we track the temporal deformation of each Gaussian across adjacent frames. By aggregating and encoding both semantic and temporal deformation features, each Gaussian is equipped with cues for potential deformation compensation within 3D space, facilitating a more precise representation of dynamic scenes. Experimental results show that our method improves 4DGS's ability to capture fine details in dynamic scene rendering for autonomous driving and outperforms other self-supervised methods in 4D reconstruction and novel view synthesis. Furthermore, CoDa-4DGS deforms semantic features with each Gaussian, enabling broader applications.

Framework

System overview of CoDa-4DGS. Vanilla 4DGS encodes and decodes temporal deformation using HexPlane encoding. Building on this, our CoDa-4DGS embeds temporal information and aggregates it with context and temporal deformation awareness through a Deformation Compensation Network (DCN). This network encodes the deformation adjustments needed to compensate for the original temporal deformation, ultimately producing an enhanced set of 4D Gaussians.
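The aggregation step described above can be illustrated with a minimal sketch. The snippet below is a toy NumPy illustration, not the paper's implementation: per-Gaussian context (semantic) features and temporal-deformation features are concatenated and passed through a small MLP standing in for the Deformation Compensation Network (DCN), which predicts a residual position adjustment added on top of the HexPlane-predicted deformation. All dimensions, weights, and the `mlp` helper are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, w1, b1, w2, b2):
    # Two-layer MLP with ReLU, standing in for the DCN (hypothetical).
    h = np.maximum(x @ w1 + b1, 0.0)
    return h @ w2 + b2

N = 8        # number of Gaussians (toy size)
D_SEM = 16   # hypothetical context/semantic feature dimension
D_DEF = 6    # hypothetical temporal-deformation feature dimension

sem_feat = rng.normal(size=(N, D_SEM))   # context awareness per Gaussian
def_feat = rng.normal(size=(N, D_DEF))   # temporal deformation per Gaussian

# Aggregate context and deformation awareness for each Gaussian.
agg = np.concatenate([sem_feat, def_feat], axis=1)

# Randomly initialized toy DCN weights.
H = 32
w1 = rng.normal(size=(D_SEM + D_DEF, H)) * 0.1
b1 = np.zeros(H)
w2 = rng.normal(size=(H, 3)) * 0.1
b2 = np.zeros(3)

# Predict a per-Gaussian compensation for the 3D position.
delta_mu = mlp(agg, w1, b1, w2, b2)

# Apply it as a residual on top of the HexPlane-deformed positions.
base_mu = rng.normal(size=(N, 3))        # positions after temporal deformation
compensated_mu = base_mu + delta_mu      # enhanced 4D Gaussian positions
print(compensated_mu.shape)              # (8, 3)
```

In practice the compensation would cover all deformable Gaussian attributes (position, rotation, scale), but the residual pattern shown here is the core idea.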

4D RGB and Semantic Reconstruction

Example of Novel View Synthesis

Example of Scene Editing

BibTeX

@inproceedings{song2025coda,
  title={{CoDa-4DGS}: Dynamic Gaussian Splatting with Context and Deformation Awareness for Autonomous Driving},
  author={Song, Rui and Liang, Chenwei and Xia, Yan and Zimmer, Walter and Cao, Hu and Caesar, Holger and Festag, Andreas and Knoll, Alois},
  publisher={IEEE/CVF},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2025}
}