[NeurIPS 2025] A-MAC: Why Selective Forgetting is the Secret to Super-Intelligent LLM Agents
Abstract

The paper introduces Adaptive Memory Admission Control (A-MAC), a structured framework for LLM agent long-term memory management. By decomposing memory value into five interpretable factors and utilizing a hybrid rule-based/LLM scoring mechanism, it achieves a new SOTA F1 score of 0.583 on the LoCoMo benchmark.

TL;DR

As LLM agents move toward multi-session, long-term interactions, the "store everything" approach breaks down. Adaptive Memory Admission Control (A-MAC) introduces a "gatekeeper" for agent memory. By scoring candidate facts on five dimensions (Utility, Confidence, Novelty, Recency, and Type Prior), A-MAC achieves a 0.583 F1 score on the LoCoMo benchmark and cuts latency by 31% compared with fully LLM-driven systems.

The Problem: The "Digital Hoarding" Crisis in AI

Current LLM agents tend to fall into one of two failure modes:

  1. Heuristic Over-simplification: Systems like MemGPT use simple recency or "importance" scores. They are fast but often admit hallucinations or miss subtle context.
  2. LLM Over-reliance: SOTA methods like A-mem ask an LLM to judge every single piece of information. This is slow and expensive, and it creates a "black box" where you can't explain why the agent remembered a specific (potentially wrong) fact.

Without structured admission control, agents succumb to memory bloat, which increases retrieval latency and causes "knowledge pollution", where outdated or fabricated facts interfere with current tasks.

Methodology: The Five Pillars of Memory Value

A-MAC transforms memory admission into a mathematical decision problem. Every candidate memory is scored using a weighted linear combination of the five factors described below.
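
A weighted linear combination over the five factors in this section can be written in the following generic form. The symbol names (U, C, N, R, T for the factor scores, w for the weights, and θ for the threshold) are illustrative assumptions and may not match the paper's notation. The thresholded admit rule is likewise an assumed reading of "admission control".

```latex
S(m) = w_U\,U(m) + w_C\,C(m) + w_N\,N(m) + w_R\,R(m) + w_T\,T(m),
\qquad \text{admit } m \iff S(m) \ge \theta
```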

The genius of A-MAC lies in its Hybrid Architecture: it uses the expensive LLM only where it's needed (Utility) and relies on fast rule-based and embedding-based computations for the rest.

  1. Future Utility: An LLM rates whether the information is actionable or represents a persistent user preference.
  2. Factual Confidence: Uses ROUGE-L to check that the memory is actually grounded in the transcript (killing hallucinations at the gate).
  3. Semantic Novelty: Uses Sentence-BERT to make sure the same fact isn't stored twice.
  4. Temporal Recency: Applies an exponential decay to prioritize recent context.
  5. Content Type Prior: A rule-based classifier that prioritizes "Identity/Preferences" over "Transient States" (e.g., "I am vegan" is more important than "I am hungry right now").
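
The five factors above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the LLM utility rating is passed in as a plain number, Sentence-BERT is swapped for a bag-of-words cosine similarity so the example has no dependencies, the ROUGE-L check is reduced to an LCS-based precision, and every weight, threshold, half-life, and keyword rule is hypothetical.

```python
import math
from collections import Counter

def lcs_len(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def confidence(candidate, transcript):
    """ROUGE-L-style precision: share of candidate tokens found, in order, in the transcript."""
    c, t = candidate.lower().split(), transcript.lower().split()
    return lcs_len(c, t) / len(c) if c else 0.0

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def novelty(candidate, stored):
    """1 minus the highest similarity to any stored memory (stand-in for Sentence-BERT)."""
    vec = Counter(candidate.lower().split())
    return 1.0 - max((cosine(vec, Counter(s.lower().split())) for s in stored), default=0.0)

def recency(age_hours, half_life_hours=24.0):
    """Exponential decay; the 24-hour half-life is an assumed value."""
    return 0.5 ** (age_hours / half_life_hours)

# Hypothetical type priors and keyword rules, for illustration only.
TYPE_PRIOR = {"identity": 1.0, "transient": 0.2, "other": 0.5}

def type_prior(candidate):
    text = candidate.lower()
    if "right now" in text or "currently" in text:
        return TYPE_PRIOR["transient"]
    if text.startswith(("i am", "i prefer", "my ")):
        return TYPE_PRIOR["identity"]
    return TYPE_PRIOR["other"]

# Assumed weights and threshold; the paper's values are not given in this summary.
WEIGHTS = {"utility": 0.3, "confidence": 0.2, "novelty": 0.2, "recency": 0.1, "type_prior": 0.2}
THRESHOLD = 0.6

def score_factors(candidate, transcript, stored, age_hours, llm_utility):
    """Compute all five factor scores; llm_utility stands in for the single LLM call."""
    return {
        "utility": llm_utility,
        "confidence": confidence(candidate, transcript),
        "novelty": novelty(candidate, stored),
        "recency": recency(age_hours),
        "type_prior": type_prior(candidate),
    }

def admit(factors):
    """Return (decision, total score) for the weighted linear combination."""
    total = sum(WEIGHTS[name] * value for name, value in factors.items())
    return total >= THRESHOLD, total

transcript = "by the way i am vegan and i am hungry right now"
print(admit(score_factors("I am vegan", transcript, [], 0.0, llm_utility=0.9)))
print(admit(score_factors("I am hungry right now", transcript, ["I am vegan"], 0.0, llm_utility=0.2)))
```

Only `llm_utility` would require a model call in this sketch; the other four scores are cheap to compute, which is the point of the hybrid design.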

Figure: A-MAC overview architecture

Performance: Precision Meets Efficiency

In the world of memory, Precision is king. If your agent remembers 1,000 things but 900 are useless, your retrieval will fail. A-MAC achieves the highest precision (0.417) among all LLM-based methods.

Key Experimental Results:

  • Accuracy: F1 score of 0.583 vs. A-mem's 0.541.
  • Speed: 2,644 ms per candidate (31% faster than A-mem).
  • Ablation Insight: The "Type Prior" is the MVP. Without categorizing information types, performance tanks by nearly 20%.
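
The F1 and precision figures quoted in this section are linked by the standard definition F1 = 2PR / (P + R). A small sketch, assuming both numbers come from the same evaluation run (recall is not stated here), solves for the recall they would imply:

```python
def implied_recall(f1, precision):
    """Solve F1 = 2PR / (P + R) for R, given F1 and precision P."""
    denom = 2 * precision - f1
    if denom <= 0:
        raise ValueError("no valid recall exists for this F1/precision pair")
    return f1 * precision / denom

recall = implied_recall(f1=0.583, precision=0.417)
print(f"implied recall: {recall:.3f}")
```

Under that assumption the pair implies a recall of roughly 0.97. Treat this as a consistency check on the reported numbers, not as a figure reported by the authors.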

Figure: Precision-recall tradeoff comparison

Critical Insight: The "Why" Matters

A-MAC's linear model isn't just a math trick; it's an auditable policy. Unlike "Agentic Memory" (A-mem), which hides its logic in a prompt, A-MAC lets a developer look at the weights and per-factor scores and see exactly why a memory was rejected.

If an agent is being too "forgetful" about user preferences, you simply tune the corresponding weight. This level of transparency is a critical requirement for production-grade AI agents in enterprise settings (like Workday AI, where this research originated).
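
The auditing idea can be made concrete with a short sketch. The weights, threshold, and factor scores below are hypothetical, and raising the type-prior weight is only one assumed way to make an agent retain more preference-like facts. The helper `explain` is an illustrative name, not part of any published API.

```python
# Hypothetical weights and threshold, not values from the paper.
WEIGHTS = {"utility": 0.3, "confidence": 0.2, "novelty": 0.2, "recency": 0.1, "type_prior": 0.2}
THRESHOLD = 0.72

def explain(factors, weights=WEIGHTS, threshold=THRESHOLD):
    """Return the decision together with each factor's contribution to the score."""
    contributions = {name: weights[name] * value for name, value in factors.items()}
    total = sum(contributions.values())
    return {"admitted": total >= threshold, "score": total, "contributions": contributions}

# Made-up factor scores for a preference-like memory with a modest utility rating.
factors = {"utility": 0.4, "confidence": 0.9, "novelty": 0.8, "recency": 0.5, "type_prior": 1.0}

before = explain(factors)

# Raise the type-prior weight, then renormalize so the weights still sum to 1.
tuned = dict(WEIGHTS, type_prior=0.4)
scale = sum(tuned.values())
tuned = {name: value / scale for name, value in tuned.items()}

after = explain(factors, weights=tuned)

for label, report in (("before", before), ("after", after)):
    print(label, report["admitted"], round(report["score"], 3))
    for name, value in sorted(report["contributions"].items(), key=lambda kv: -kv[1]):
        print(f"  {name:<11}{value:.3f}")
```

Because the score is a plain sum, the printed contributions are the full explanation of the decision: nothing else feeds into it.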

Conclusion & Future Outlook

A-MAC proves that we don't need more LLM calls to make agents smarter; we need smarter structures around the LLM. By treating memory as a controlled resource rather than an infinite bucket, A-MAC paves the way for agents that can interact with us for years without losing their focus—or their minds.

Future Work: The authors note that while personal narratives are handled well, "Professional" domains (technical projects) remain challenging. The next frontier is likely "Domain-Specific Type Priors" that understand the nuance of legal, medical, or engineering data.
