[ICLR 2026] Hyper Diffusion Planner: Scaling Diffusion Models for Real-World Autonomous Driving
Abstract

This paper introduces Hyper Diffusion Planner (HDP), a large-scale end-to-end autonomous driving (E2E AD) framework that utilizes a diffusion-based decoder for trajectory planning. Evaluated via 200 km of real-world road testing, HDP achieves a 10x performance improvement over baseline diffusion planners by optimizing loss space, trajectory representation, and data scaling.

TL;DR

The Hyper Diffusion Planner (HDP) moves diffusion-based planning for End-to-End (E2E) Autonomous Driving from "simulation-only" to "real-road-ready." By systematically optimizing the diffusion loss space and introducing a mathematically grounded Hybrid Loss (coupling velocity and waypoints), the researchers report a 10x performance boost over baseline diffusion planners in 200 km of closed-loop real-world testing.

Problem & Motivation: The Gap Between Math and Asphalt

While Diffusion Models are the "SOTA" for image generation and robotic manipulation, applying them to autonomous driving (AD) reveals three critical "pain points":

  1. Jitter vs. Geometry: Models supervised on waypoints capture the path well but produce jerky, undrivable velocity profiles.
  2. Mode Collapse: On small datasets (like 100k frames), diffusion planners often fail to show their famous multi-modality, behaving like simple regression models.
  3. Safety Gap: Imitation learning blindly copies human behavior, including mistakes, and lacks a mechanism to prioritize "not crashing" in long-tail scenarios.

Methodology: The Core Innovations

1. Re-thinking the Loss Space

Most diffusion models predict the noise (ε). However, HDP finds that in AD, trajectories live on a low-dimensional manifold. Predicting the clean data (x₀) directly leads to faster convergence and eliminates high-frequency artifacts common in ε-prediction.
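
To make the two training targets concrete, here is a minimal NumPy sketch of noise prediction versus clean-data prediction. It assumes the standard DDPM forward process with a cumulative noise-schedule value `alpha_bar_t`; the summary does not state which schedule or notation HDP uses, so the function names and parameters below are illustrative, not the authors' implementation.

```python
import numpy as np

def forward_noise(x0, alpha_bar_t, rng):
    """Corrupt a clean trajectory x0 (assumed DDPM-style forward process)."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps
    return x_t, eps

def eps_to_x0(x_t, eps_hat, alpha_bar_t):
    """Recover the clean-data estimate implied by a noise prediction."""
    return (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_hat) / np.sqrt(alpha_bar_t)

def eps_prediction_loss(eps_hat, eps):
    """Noise-prediction objective: regress the injected noise."""
    return float(np.mean((eps_hat - eps) ** 2))

def x0_prediction_loss(x0_hat, x0):
    """Clean-data objective: regress the trajectory itself."""
    return float(np.mean((x0_hat - x0) ** 2))

# Hypothetical example: an 8-step trajectory of (x, y) points.
rng = np.random.default_rng(0)
x0 = rng.standard_normal((8, 2))
x_t, eps = forward_noise(x0, alpha_bar_t=0.5, rng=rng)
x0_from_eps = eps_to_x0(x_t, eps, alpha_bar_t=0.5)
```

The two parameterizations carry the same information, but dividing by `sqrt(alpha_bar_t)` in `eps_to_x0` amplifies any error in the noise estimate when `alpha_bar_t` is small, which is one plausible reading of why regressing the trajectory directly yields smoother outputs.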

2. The Hybrid Loss (Velocity + Waypoints)

To solve the "jitters," the authors have the model predict velocity but supervise it on both velocity and waypoints. They mathematically prove that this formulation, termed a P-norm Score Matching loss, is unbiased and preserves the data distribution while ensuring both global geometric accuracy and local kinematic smoothness.
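
The idea of predicting velocity while supervising both velocity and waypoints can be sketched as follows. This is a simplified squared-error version under stated assumptions: waypoints are obtained by integrating velocity with a fixed time step `dt`, and the two terms are mixed by a hypothetical `waypoint_weight`. It is not the paper's P-norm Score Matching loss, whose exact form the summary does not give.

```python
import numpy as np

def integrate_velocity(velocity, dt, start=None):
    """Turn per-step velocities of shape (T, 2) into waypoints of shape (T, 2)."""
    if start is None:
        start = np.zeros(velocity.shape[-1])
    return start + np.cumsum(velocity * dt, axis=0)

def hybrid_loss(v_hat, v_true, dt, waypoint_weight=1.0):
    """Supervise the predicted velocity and the waypoints it implies."""
    w_hat = integrate_velocity(v_hat, dt)
    w_true = integrate_velocity(v_true, dt)
    velocity_term = np.mean((v_hat - v_true) ** 2)   # local smoothness
    waypoint_term = np.mean((w_hat - w_true) ** 2)   # global geometry
    return float(velocity_term + waypoint_weight * waypoint_term)

# Hypothetical example: constant 1 m/s in x and y, with a small velocity bias.
v_true = np.ones((4, 2))
v_biased = v_true + 0.1
loss_exact = hybrid_loss(v_true, v_true, dt=0.5)
loss_biased = hybrid_loss(v_biased, v_true, dt=0.5)
```

In this example a constant velocity bias of 0.1 contributes 0.01 to the velocity term but 0.01875 to the waypoint term, because the error accumulates along the integrated path. That accumulation is what lets the waypoint term penalize geometric drift that a velocity-only loss would barely notice.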

Fig 1: The HDP architecture, featuring a Perception Backbone and a Transformer-based Diffusion Decoder.

3. Safety-Aware RL Post-Training

To refine the model without expensive online real-vehicle RL, HDP uses a "pseudo-closed-loop" simulation and applies Reward-Weighted Regression. This "up-weights" safe trajectories in the training data, aligning the model with safety constraints without requiring complex gradient backpropagation through the denoising chain.
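
Reward-Weighted Regression can be illustrated with a short sketch. It assumes the common exponential weighting, where each sample's weight is proportional to `exp(reward / temperature)`; the summary does not specify HDP's reward definition or weighting scheme, so `rewards`, `temperature`, and the function names here are hypothetical.

```python
import numpy as np

def rwr_weights(rewards, temperature=1.0):
    """Normalized exponential weights; higher reward means larger weight."""
    rewards = np.asarray(rewards, dtype=float)
    # Subtract the max before exponentiating for numerical stability.
    logits = (rewards - rewards.max()) / temperature
    weights = np.exp(logits)
    return weights / weights.sum()

def reward_weighted_loss(per_sample_loss, rewards, temperature=1.0):
    """Weighted average of the ordinary per-sample training losses."""
    weights = rwr_weights(rewards, temperature)
    return float(np.sum(weights * np.asarray(per_sample_loss, dtype=float)))

# Hypothetical example: three trajectories with increasing safety reward.
weights = rwr_weights([0.0, 1.0, 2.0])
equal_reward_loss = reward_weighted_loss([1.0, 3.0], rewards=[0.5, 0.5])
```

Because the rewards only rescale the usual per-sample diffusion loss, no gradient has to flow through the sampling procedure. With equal rewards the objective reduces to the plain mean loss, and lowering `temperature` concentrates the weight on the highest-reward trajectories.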

Experiments & Results: The Power of Scaling

The most striking result is the emergence of multi-modal behavior with data scaling. While benchmarks like NAVSIM suggest diffusion planners don't show multimodality, HDP shows this is a data volume issue.

  • Scaling Multi-modality: Diversified behaviors only emerge after crossing the ~10M frame threshold.
  • Real-Vehicle Performance: Scaling from 10M to 70M frames improved success rates by over 20%.

Table 1: Step-by-step performance gains from Base Model to HDP-RL.

In 200 km of urban testing, HDP handled complex "Navigational Lane Changes" and "VRU Yielding" with human-like smoothness, which was previously a major weakness for E2E learning models.

Fig 2: Snapshots of HDP successfully performing unprotected turns and yielding to cross traffic.

Critical Analysis & Conclusion

Takeaway: HDP demonstrates that successful E2E AD doesn't require complex, hand-crafted heuristics (like anchor trajectories). Instead, it requires a theoretically sound loss function and significant data scale.

Limitations:

  • The current RL reward focuses primarily on safety, which can sometimes lead to overly "conservative" driving (e.g., waiting too long at intersections).
  • Future work needs to balance safety with traffic efficiency to make the agent more assertive in dense traffic.

Future Outlook: HDP sets a new baseline for "Generalizable AD." By showing that diffusion models scale as well as LLMs, it opens the door for Large Foundation Models in the physical world.
