OpenSeeker is the first fully open-source frontier-level search agent developed by an academic team. It uses two novel synthesis techniques—Fact-Grounded QA Synthesis and Denoised Trajectory Synthesis—to achieve SOTA performance on benchmarks like BrowseComp and WideSearch using only 11.7k training samples and a single SFT run.
TL;DR
Search agents have traditionally been a "closed-door game" played by industrial giants like OpenAI and Google. OpenSeeker, a breakthrough from Shanghai Jiao Tong University, changes this by fully open-sourcing the training data and model weights of a 30B agent that rivals proprietary models. Using just 11.7k high-fidelity synthetic samples, it outperforms models like Tongyi DeepResearch that rely on far more complex RL-based pipelines.
Background: The Industrial Monopoly on "Deep Research"
In the race for autonomous web intelligence, a massive gap has formed between proprietary "Deep Research" models and the open-source community. While architectural details are often shared, the high-quality trajectory data remains a corporate secret. Prior open-source attempts often hit a performance ceiling because their training data lacked the structural complexity to force "multi-hop" reasoning—agents would simply shortcut to answers using parametric memory or simple keyword search.
Methodology: Engineering the "Hardest" Problems
The core philosophy of OpenSeeker is that a model is only as good as the puzzles it is forced to solve. The team introduced two surgical innovations to the data synthesis pipeline:
1. Fact-Grounded & Controllable QA Synthesis
Instead of asking an LLM to "dream up" a hard question, OpenSeeker reverse-engineers the web graph.
- Topological Expansion: They start with a seed webpage and expand to connected nodes via hyperlinks.
- Entity Obfuscation: To prevent the agent from "cheating" with a single Google search, they replace specific entities with vague descriptions (e.g., "the winner of the 2024 award" instead of the person's name). This mandates a multi-step navigation path through the graph to resolve entities before answering.
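The two steps above can be sketched as a small pipeline. This is a minimal illustration, not the authors' actual code: the web graph is mocked as an adjacency dict, and the helper names (`topological_expand`, `obfuscate`, `synthesize_qa`) and the fact table are hypothetical stand-ins for the real crawl-and-synthesize machinery.

```python
# Hypothetical sketch of fact-grounded QA synthesis.
# In the real pipeline, nodes are fetched pages and edges are hyperlinks;
# here the graph and facts are hard-coded toy data.
import random

WEB_GRAPH = {
    "seed_page": ["award_page", "bio_page"],
    "award_page": ["winner_page"],
    "bio_page": [],
    "winner_page": [],
}

# page -> (concrete entity, description usable as an obfuscated reference)
FACTS = {
    "winner_page": ("Jane Doe", "won the 2024 award"),
}

def topological_expand(graph, seed, depth=2):
    """Collect pages reachable from the seed within `depth` hyperlink hops."""
    frontier, visited = [seed], {seed}
    for _ in range(depth):
        nxt = []
        for page in frontier:
            for link in graph.get(page, []):
                if link not in visited:
                    visited.add(link)
                    nxt.append(link)
        frontier = nxt
    return visited

def obfuscate(entity, description):
    """Replace a concrete entity with a vague description that must be
    resolved by navigating the graph, not by one keyword search."""
    return f"the person who {description}"

def synthesize_qa(graph, facts, seed):
    """Pick a fact on the expanded subgraph and phrase the question so the
    target entity never appears verbatim."""
    pages = topological_expand(graph, seed)
    page = random.choice([p for p in pages if p in facts])
    entity, description = facts[page]
    question = f"Which university did {obfuscate(entity, description)} attend?"
    return question, entity

question, answer = synthesize_qa(WEB_GRAPH, FACTS, "seed_page")
print(question)
```

Note that the generated question mentions only "the person who won the 2024 award": answering it requires first resolving that description to "Jane Doe" via the graph, which is exactly the multi-hop behavior the obfuscation is meant to force.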

2. Denoised Trajectory Synthesis (Asymmetric Training)
Raw web data is noisy. To create clean "golden trajectories," the pipeline splits the roles of teacher and student:
- The Teacher: During data generation, a teacher model sees a summarized, denoised version of previous steps. This allows the teacher to plan perfectly without getting distracted by HTML boilerplate.
- The Student: During SFT, the OpenSeeker model is trained to predict the Teacher's perfect actions but is given the raw, noisy tool output. This forces the model to internalize the ability to "see through the noise."
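The teacher/student asymmetry can be made concrete with a toy data-construction sketch. Everything here is illustrative, assuming a recorded rollout of (raw observation, teacher action) pairs; `summarize` is a stand-in for the real denoising step, and the field names are hypothetical.

```python
# Hypothetical sketch of denoised (asymmetric) trajectory synthesis.
def summarize(raw_observation):
    """Toy denoiser: strip markup-like noise and keep the first sentence.
    The real pipeline would use an LLM-based summarizer."""
    text = raw_observation.replace("<div>", "").replace("</div>", "")
    return text.split(".")[0].strip() + "."

def build_sft_examples(steps):
    """Pair teacher actions (planned from the denoised view) with the raw,
    noisy observations the student is actually trained on."""
    examples = []
    for raw_obs, teacher_action in steps:
        examples.append({
            "teacher_view": summarize(raw_obs),  # clean context the teacher saw
            "student_input": raw_obs,            # raw noise the student must read
            "target": teacher_action,            # teacher action the student imitates
        })
    return examples

steps = [
    ("<div>Result 1: 2024 award winner announced. ads ads</div>",
     "open(result_1)"),
    ("<div>Jane Doe accepted the award at MIT. footer links</div>",
     "answer('MIT')"),
]
for ex in build_sft_examples(steps):
    print(ex["target"])
```

The key design choice is the mismatch: the supervision target comes from the clean `teacher_view`, but the student's input keeps the boilerplate, so imitating the teacher forces the model to learn the denoising implicitly.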

Experiments: Quality Over Quantity
The results are a testament to the power of data engineering. Traditionally, SFT requires hundreds of thousands of samples to change model behavior significantly. OpenSeeker achieves SOTA with a mere 11,700 samples.
- Performance: On BrowseComp-ZH, OpenSeeker scores 48.4, beating Tongyi DeepResearch (46.7).
- Complexity: Analysis of the trajectories shows OpenSeeker-v1 data averages 46 tool calls per task, nearly double the complexity of standard benchmarks like BrowseComp.

Critical Analysis & Conclusion
Takeaway
OpenSeeker proves that you don't need a multi-million dollar RL budget to build a frontier agent; you need an intelligent data engine. By reverse-engineering the web graph to generate tasks, the authors have provided a scalable "curriculum" that could, in principle, extend to much larger models.
Limitations
Currently, the model has only been trained for a single run. The authors note that no heuristic filtering or hyperparameter tuning was performed, suggesting that the current performance is likely a lower bound of what this methodology can achieve.
Outlook
By open-sourcing the data, OpenSeeker democratizes research into long-horizon planning. Future work will likely look at integrating more diverse tools (beyond web search) and refining the "Student-Teacher" denoising gap to handle even more unstructured environments like PDF analysis or code repositories.
