EGLOCE: Training-Free Energy-Guided Latent Optimization for Concept Erasure


[CVPR 2025] EGLOCE: Redefining Concept Erasure via Inference-Time Latent Optimization

Abstract

EGLOCE is a training-free framework for concept erasure in text-to-image diffusion models. It introduces an inference-time latent optimization strategy using a dual-energy guidance mechanism (repulsion and retention) to remove undesired content like nudity or copyrighted styles without modifying model weights.

TL;DR

EGLOCE (Energy-Guided Latent Optimization for Concept Erasure) is a training-free, "plug-and-play" framework for making text-to-image models safer. It optimizes the latent during sampling with two energy functions, Repulsion (to move away from unwanted concepts) and Retention (to keep the original prompt's soul), and reports state-of-the-art erasure without ever touching the model's weights.

The "Safety vs. Fidelity" Dilemma

As diffusion models like Stable Diffusion become ubiquitous, "Concept Erasure" (removing nudity, violence, or copyrighted artist styles) has moved from a moral preference to a legal necessity.

Current solutions are split into two camps:

Training-based: Effective but rigid. You have to retrain for every new concept, and often the model "forgets" how to draw other things correctly.
Training-free (Inference-time): Dynamic but weak. Techniques like Negative Guidance often leave "shadows" of the concept behind or degrade the image into a chromatic mess.

The authors of EGLOCE identify the core missing piece: explicit energy minimization. Instead of just nudging the model, why not treat safety as a mathematical landscape where the unsafe regions are "mountains" to be avoided?

Methodology: The Repel & Retain Strategy

EGLOCE operates on the "Energy-Based Selection" principle. During each denoising step, it takes the current noisy latent $z_t$, estimates the clean version $x_0$ using Tweedie’s formula, and then performs a mini-optimization.
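For a standard DDPM-style noise schedule, the Tweedie estimate has a familiar closed form. The sketch below assumes a noise-prediction model and that parameterization; the names `tweedie_x0` and `alpha_bar_t` are hypothetical, and plain Python lists stand in for real latent tensors. It is an illustration, not code from EGLOCE.

```python
import math

def tweedie_x0(z_t, eps_pred, alpha_bar_t):
    """Estimate the clean sample from a noisy latent (DDPM parameterization).

    Assumes z_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps,
    so x0_hat = (z_t - sqrt(1 - alpha_bar_t) * eps_pred) / sqrt(alpha_bar_t).
    """
    if not 0.0 < alpha_bar_t <= 1.0:
        raise ValueError("alpha_bar_t must be in (0, 1]")
    s_signal = math.sqrt(alpha_bar_t)
    s_noise = math.sqrt(1.0 - alpha_bar_t)
    return [(z - s_noise * e) / s_signal for z, e in zip(z_t, eps_pred)]

# Toy check: noise a known x0, then recover it using the true noise.
x0 = [0.5, -1.0, 2.0]
eps = [0.1, 0.3, -0.2]
alpha_bar = 0.64
z_t = [math.sqrt(alpha_bar) * x + math.sqrt(1 - alpha_bar) * e
       for x, e in zip(x0, eps)]
x0_hat = tweedie_x0(z_t, eps, alpha_bar)
```

In a real sampler `eps_pred` comes from the denoising network, so `x0_hat` is only an approximation that sharpens as the timestep approaches zero.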

1. The Dual-Energy Framework

Repulsion Energy ($E_{rep}$): This calculates the CLIP similarity between the generated image and the target concept (e.g., "nudity"). The gradient of this energy pushes the image away from that concept.
Retention Energy ($E_{ret}$): Since pushing away might make the model forget the rest of the prompt (e.g., "a woman in a park"), this term ensures the latent stays anchored to the original user intent.
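One way to read the two terms is as a single scalar objective over CLIP embeddings. The sketch below is a toy under stated assumptions: short hand-written vectors stand in for CLIP image and text embeddings, and the combined form `E_rep + lam * E_ret` with weight `lam` is an illustrative choice rather than the paper's exact formulation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def repulsion_energy(image_emb, concept_emb):
    """High when the image resembles the concept to erase (assumed form)."""
    return cosine(image_emb, concept_emb)

def retention_energy(image_emb, prompt_emb):
    """High when the image drifts away from the user's prompt (assumed form)."""
    return 1.0 - cosine(image_emb, prompt_emb)

def total_energy(image_emb, concept_emb, prompt_emb, lam=1.0):
    """Lower is better: far from the concept, close to the prompt."""
    return (repulsion_energy(image_emb, concept_emb)
            + lam * retention_energy(image_emb, prompt_emb))

# Hypothetical 3-d stand-ins for CLIP embeddings.
concept = [1.0, 0.0, 0.0]       # e.g. "nudity"
prompt = [0.0, 1.0, 0.0]        # e.g. "a woman in a park"
unsafe_image = [0.9, 0.4, 0.0]  # mostly aligned with the concept
safe_image = [0.1, 0.9, 0.0]    # mostly aligned with the prompt

e_unsafe = total_energy(unsafe_image, concept, prompt)
e_safe = total_energy(safe_image, concept, prompt)
```

With these made-up vectors the prompt-aligned image gets the lower energy, which is the direction the guidance gradient is meant to push the latent.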

2. Iterative Refinement

Unlike previous methods that take one step and move on, EGLOCE uses Fixed-Point Iteration. It repeats the optimization $K$ times (typically $K = 3$) per timestep. This ensures the latent finds a stable, safe path through the complex, non-convex CLIP manifold.
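The per-timestep loop can be pictured as $K$ small gradient steps on a combined energy. Everything in this sketch is a simplification: the "latent" is a 3-d vector fed directly to the energy (no decoder or CLIP encoder), the gradient is taken by finite differences, and `step_size`, `lam`, and the helper names are hypothetical. The exact update EGLOCE uses may differ.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def energy(x, concept, prompt, lam=1.0):
    """Assumed combined energy: repulsion from the concept plus prompt retention."""
    return cosine(x, concept) + lam * (1.0 - cosine(x, prompt))

def numerical_grad(f, x, h=1e-5):
    """Central finite-difference gradient of f at x."""
    grad = []
    for i in range(len(x)):
        up = x[:i] + [x[i] + h] + x[i + 1:]
        down = x[:i] + [x[i] - h] + x[i + 1:]
        grad.append((f(up) - f(down)) / (2 * h))
    return grad

def refine(x, concept, prompt, k=3, step_size=0.5):
    """Run k gradient-descent steps; return the final point and the energy after each step."""
    f = lambda v: energy(v, concept, prompt)
    history = [f(x)]
    for _ in range(k):
        g = numerical_grad(f, x)
        x = [xi - step_size * gi for xi, gi in zip(x, g)]
        history.append(f(x))
    return x, history

concept = [1.0, 0.0, 0.0]   # hypothetical embedding of the concept to erase
prompt = [0.0, 1.0, 0.0]    # hypothetical embedding of the user prompt
x_start = [0.9, 0.4, 0.0]   # starts close to the concept
x_refined, energies = refine(x_start, concept, prompt, k=3)
```

On this toy landscape each of the three steps lowers the energy and reduces similarity to the concept. A real CLIP landscape is non-convex, so step size and $K$ would need tuning.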

Figure: Overall framework architecture. The EGLOCE workflow, showing how repulsion and retention energies steer the denoising trajectory.

Experiments: Superior Erasure & Better Quality

The most impressive part of EGLOCE is its synergy. It doesn't just replace older methods; it enhances them. When added to existing baselines like SAFREE or ESD, it dramatically lowers the success rate of "jailbreak" prompts (Adversarial Attacks).

Key Results:

Nudity Removal: Consistently lowers Attack Success Rate (ASR) across benchmarks like I2P and Ring-A-Bell.
Image Fidelity: Interestingly, because the "Retention" energy keeps the latent aligned with the prompt, the FID scores actually improve. The images look cleaner and have fewer structural artifacts (like extra limbs) compared to the base models.
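For readers unfamiliar with the metric, Attack Success Rate is commonly computed as the fraction of adversarial prompts whose generated image is still flagged as containing the target concept. The sketch below shows only that arithmetic. The detector outputs are invented for illustration and are not results from the paper; the function name is hypothetical.

```python
def attack_success_rate(flags):
    """Fraction of prompts whose generated image was flagged as containing the concept."""
    if not flags:
        raise ValueError("need at least one result")
    return sum(1 for f in flags if f) / len(flags)

# Hypothetical detector outputs for 8 adversarial prompts (True = concept detected).
baseline_flags = [True, True, False, True, False, True, True, False]
guarded_flags = [False, True, False, False, False, False, True, False]

asr_baseline = attack_success_rate(baseline_flags)  # 5 of 8 flagged
asr_guarded = attack_success_rate(guarded_flags)    # 2 of 8 flagged
```

A lower value means fewer jailbreak prompts got through, so "lowers ASR" in the results above is the desired direction.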

Figure: Qualitative results of nudity erasure. A qualitative comparison showing EGLOCE successfully adding clothing or changing context where other methods failed.

Critical Insight: The "Adversarial" Catch

While EGLOCE is a massive step forward, the authors honestly note a limitation: CLIP is not a perfect judge. Because the optimization is so powerful, the latent might find a way to "trick" the CLIP encoder into thinking the concept is gone by adding imperceptible noise (an adversarial perturbation), while a human can still see the target concept.

This suggests that the next frontier in AI safety isn't just better optimization, but robust perceptual energy functions that can't be "cheated."

Conclusion

EGLOCE proves that you don't need to rebuild the engine to make the car safer. By treating the sampling process as a steered optimization problem, we can achieve high-fidelity, safe image generation on the fly. It is a vital tool for any developer looking to deploy generative models in compliant, real-world environments.
