Memory Transfer Learning: The Secret to Cross-Domain Coding Mastery
Abstract

This paper introduces Memory Transfer Learning (MTL) for coding agents, a paradigm that enables the sharing of experience across heterogeneous task domains. By utilizing a unified memory pool, the authors achieve a 3.7% average performance gain across 6 benchmarks, reaching new SOTA levels for self-evolving agents in complex coding environments.

TL;DR

Researchers from KAIST and NYU show how AI agents can "remember" lessons from one programming domain and apply them to completely different ones. By shifting focus from raw code traces to high-level Insights, their Memory Transfer Learning (MTL) framework boosts performance by up to 8.3%, suggesting that for AI agents, process matters more than syntax.

Motivation: Why Coding Agents Stay Stuck in Silos

In the current landscape of AI software engineering, we have "specialists." An agent might be great at competitive programming (LiveCodeBench) but fail miserably when asked to fix a real-world GitHub bug (SWE-bench).

The problem isn't a lack of data; it's a Memory Wall. Most agents only learn from their own successes and failures within a single domain. They fail to realize that whether they are writing a sorting algorithm or fixing a Django backend, the "Meta-Knowledge"—things like "always verify with a dry run" or "check file headers before editing"—remains identical.

Methodology: The Abstraction Spectrum

The authors investigated which type of memory transfers best. They categorized memory into four levels of abstraction:

  1. Trajectory (Low Abstraction): Raw command-level logs.
  2. Workflow: Extracted sequences of successful actions.
  3. Summary: Natural language descriptions of why a task succeeded/failed.
  4. Insight (High Abstraction): Task-agnostic principles for future problem-solving.

Figure 1 (Memory Representations): From concrete traces to abstract insights, the hierarchy of memory formats used in the study.
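
To make the four levels concrete, here is a minimal sketch of how such memories might be stored and ranked. It is not the authors' implementation: the `Abstraction` enum, the `MemoryEntry` fields, the domain labels, and the `most_abstract` helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum


class Abstraction(IntEnum):
    """The four memory formats, ordered from least to most abstract."""
    TRAJECTORY = 1  # raw command-level logs
    WORKFLOW = 2    # extracted sequences of successful actions
    SUMMARY = 3     # natural-language account of why a task succeeded or failed
    INSIGHT = 4     # task-agnostic principle for future problem-solving


@dataclass(frozen=True)
class MemoryEntry:
    source_domain: str   # hypothetical label for where the memory was collected
    level: Abstraction
    content: str


def most_abstract(entries):
    """Return the entries sorted so the most abstract formats come first."""
    return sorted(entries, key=lambda e: e.level, reverse=True)


# Illustrative entries only; none of this text comes from the paper's data.
pool = [
    MemoryEntry("competitive-programming", Abstraction.TRAJECTORY, "python solve.py < input.txt"),
    MemoryEntry("repo-bugfix", Abstraction.INSIGHT, "Check file headers before editing."),
    MemoryEntry("ml-experiments", Abstraction.SUMMARY, "Run failed because the seed was not fixed."),
]
ranked = most_abstract(pool)
print([e.level.name for e in ranked])  # ['INSIGHT', 'SUMMARY', 'TRAJECTORY']
```

Using an ordered enum keeps the "abstraction spectrum" explicit, so a retrieval step could filter or weight memories by level.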

The "Insight" Advantage

The study found that transferability rises with abstraction. Low-level Trajectories often caused "Negative Transfer" because the agent would blindly copy commands from the source task (for example, R-language syntax into a C++ environment). Insights, by contrast, act as behavioral guardrails.
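
As a rough illustration of insight transfer through a unified memory pool, the sketch below retrieves the most relevant insights from any domain and prepends them to a new task prompt. The summary does not say how the paper retrieves memories, so the word-overlap (Jaccard) scoring here is an assumed stand-in for a real retriever, and `retrieve`, `build_prompt`, and the pool contents are hypothetical.

```python
import re


def tokens(text):
    """Lowercase word set; an assumed stand-in for a real embedding model."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Set-overlap similarity in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve(pool, task, k=2):
    """Return the k insights most similar to the task, regardless of domain."""
    task_tokens = tokens(task)
    ranked = sorted(
        pool,
        key=lambda item: jaccard(task_tokens, tokens(item["insight"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(task, memories):
    """Prepend the retrieved insights to the task description."""
    lines = ["Lessons from earlier tasks:"]
    lines += [f"- ({m['domain']}) {m['insight']}" for m in memories]
    lines += ["", f"Task: {task}"]
    return "\n".join(lines)


# Hypothetical pool mixing insights from three different domains.
pool = [
    {"domain": "competitive-programming",
     "insight": "Verify the solution with a dry run on a small input before submitting."},
    {"domain": "repo-bugfix",
     "insight": "Check file headers before editing a file."},
    {"domain": "ml-experiments",
     "insight": "Edit small and test immediately after each change."},
]

task = "Fix the failing test by editing the file that parses headers."
top = retrieve(pool, task, k=2)
print(build_prompt(task, top))
```

Because the stored items are short, task-agnostic principles and not command logs, nothing in the prompt invites the agent to copy syntax from a different language or environment.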

Experimental Battleground: Scaling Beyond Benchmarks

The team tested MTL across 6 heterogeneous benchmarks using models like GPT-5-mini and DeepSeek V3.2.

| Benchmark | Improvement (MTL-Insight) | Key Takeaway |
| :--- | :--- | :--- |
| SWEBench-Verified | +4.0% | Better repository navigation. |
| MLGym-Bench | +8.3% | Improved experimental discipline. |
| ReplicationBench | +7.8% | Stronger scientific reasoning. |

Figure 2 (Performance across models): Performance gains across various LLMs. Note the consistent improvement when using high-level Insight transfer.

Deep Dive: How Memory Actually Helps

Interestingly, only 5.5% of the performance gain came from transferring actual algorithms. The vast majority of the "win" came from Meta-Knowledge, such as:

  • Iterative Workflow Discipline: "Edit small, test immediately."
  • Anti-Pattern Avoidance: "Don't blindly overwrite files without checking dependencies."
  • Environment Adaptation: "How to handle bash vs. sh idiosyncrasies."

Figure 3 (Visualization of Abstraction): t-SNE visualization showing that Task and Trajectory memories are clustered (siloed), while Insight memories are intermingled (generalized) across all domains.

Conclusion: A New Blueprint for Self-Evolving Agents

The findings establish three critical design principles:

  1. Abstraction is King: To transfer knowledge, you must strip away the task-specific "noise."
  2. Cross-Model Synergy: Memory can be transferred from stronger models (like GPT-5) to weaker ones, serving as a form of "on-the-fly" distillation.
  3. Scale is Data-Dependent: The effectiveness of MTL scales linearly with the size of the cross-domain memory pool.

The Future: Instead of training massive, domain-specific coding models, we should focus on building tiny, agile agents that share a massive, global "Insight Pool." This paper shows that in the world of AI coding, wisdom is universal.
