Mozi: Governed Autonomy for Drug Discovery LLM Agents

[Nature/Bioinformatics] Mozi: Bridging Generative AI and Deterministic Rigor for Governed Drug Discovery

Abstract

This paper introduces Mozi, a dual-layer LLM agent architecture for autonomous drug discovery that integrates governed multi-agent orchestration with structured, stateful "Skill Graphs." By bridging the Model Context Protocol (MCP) with domain-specific pipelines, Mozi achieves SOTA performance on the PharmaBench benchmark and successfully executes end-to-end therapeutic design tasks.

TL;DR

Mozi is a new agentic framework that transforms LLMs from "fragile conversationalists" into reliable "co-scientists." It introduces a Dual-Layer Architecture that separates high-level strategic reasoning (Control Plane) from the rigorous, state-dependent execution of drug discovery pipelines (Workflow Plane). By enforcing hard-coded tool governance and stateful skill graphs, Mozi eliminates the "hallucination drift" common in long-horizon scientific tasks.

The Problem: Why LLM Agents Fail in the Lab

Current AI agents suffer from two critical bottlenecks in pharmaceutical research:

Unconstrained Tool Governance: General agents often invoke expensive computational tools with invalid parameters or without proper clearance.
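Long-Horizon Reliability: In a pipeline spanning Target Identification to Lead Optimization, errors compound multiplicatively. Even a 5% error rate per step leaves a ten-step pipeline fully correct only about 60% of the time (0.95^10 ≈ 0.60), and an early mistake can render the final candidate scientifically invalid.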
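
To make the compounding argument concrete, here is a minimal sketch. It assumes every step succeeds independently with the same probability, which is a simplification rather than anything the paper states, and the function name is invented for illustration.

```python
def pipeline_success_probability(per_step_success: float, n_steps: int) -> float:
    """Probability that all steps succeed, assuming independent steps."""
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be between 0 and 1")
    if n_steps < 0:
        raise ValueError("n_steps must be non-negative")
    return per_step_success ** n_steps


if __name__ == "__main__":
    # A 5% error rate per step corresponds to a 0.95 success probability.
    for n in (1, 5, 10, 20):
        print(f"{n:>2} steps: {pipeline_success_probability(0.95, n):.3f}")
```

Under these assumptions, reliability falls below 60% by the tenth step, which is why per-step validation matters more as pipelines get longer.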

Existing SOTA systems such as Biomni often function as "islands of intelligence": they handle single tasks well but lack the interoperability and auditability required for regulated drug R&D.

Methodology: Governed Autonomy via Dual-Layer Design

The core innovation of Mozi lies in its separation of Logic and Integrity.

Layer A: The Control Plane (The Brain)

Instead of a simple ReAct loop, Layer A implements a Supervisor-Worker hierarchy. The Supervisor manages a "minimal planning" strategy, while specialized Workers (Research vs. Computation) are isolated via Role-Based Access Control (RBAC). This prevents a "Research Worker" from accidentally triggering a 10-hour GPU-intensive docking simulation.
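
The role isolation described above can be sketched as a permission check that runs before any tool call. This is an illustrative sketch only: the role names, tool names, and `invoke_tool` helper are hypothetical and are not taken from Mozi's codebase.

```python
from enum import Enum


class Role(Enum):
    RESEARCH = "research"
    COMPUTATION = "computation"


class ToolAccessDenied(PermissionError):
    """Raised when a worker role is not cleared for a tool."""


# Hypothetical permission table: which roles may call which tool.
TOOL_PERMISSIONS = {
    "literature_search": {Role.RESEARCH, Role.COMPUTATION},
    "run_docking": {Role.COMPUTATION},  # expensive GPU job
}

# Stand-in tool implementations so the sketch runs end to end.
TOOL_REGISTRY = {
    "literature_search": lambda query: f"results for {query!r}",
    "run_docking": lambda ligand: f"docking job queued for {ligand}",
}


def invoke_tool(role: Role, tool_name: str, **kwargs):
    """Check the caller's role before dispatching to the tool."""
    allowed = TOOL_PERMISSIONS.get(tool_name)
    if allowed is None:
        raise KeyError(f"unknown tool: {tool_name}")
    if role not in allowed:
        raise ToolAccessDenied(f"{role.value} worker may not call {tool_name}")
    return TOOL_REGISTRY[tool_name](**kwargs)


if __name__ == "__main__":
    print(invoke_tool(Role.RESEARCH, "literature_search", query="LRRK2"))
    try:
        invoke_tool(Role.RESEARCH, "run_docking", ligand="ligand-001")
    except ToolAccessDenied as err:
        print("blocked:", err)
```

Placing the check in the dispatcher, rather than in the prompt, means a worker cannot talk its way past it: the refusal is enforced in code.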

Layer B: The Workflow Plane (The Skeleton)

Scientific protocols are materialized as Composable Skill Graphs. These are not just sequences of tools; they are stateful Directed Acyclic Graphs (DAGs) that enforce:

Data Contracts: Ensuring a protein structure is "cleaned" (via PDBFixer) before it ever touches a docking engine.
HITL Checkpoints: Strategic pauses where human experts must validate a target or a scaffold before the agent proceeds.
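
The two guarantees listed above can be sketched with a small DAG executor. This is a hedged illustration, not Mozi's implementation: the `Skill` class, the `execute` function, and the three placeholder skills are hypothetical, and the skills return stand-in values instead of calling PDBFixer or a docking engine. Ordering uses `graphlib.TopologicalSorter` from the Python standard library (3.9+), which also rejects cyclic graphs.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter
from typing import Callable


class ContractViolation(RuntimeError):
    """Raised when a skill's required inputs are missing from the state."""


@dataclass(frozen=True)
class Skill:
    name: str
    run: Callable[[dict], dict]        # reads shared state, returns new keys
    requires: frozenset = frozenset()  # data contract: keys that must exist
    depends_on: tuple = ()             # upstream skills (edges of the DAG)
    needs_approval: bool = False       # HITL checkpoint before this skill runs


def execute(skills, approve, state=None):
    """Run skills in dependency order, enforcing contracts and checkpoints."""
    state = dict(state or {})
    by_name = {s.name: s for s in skills}
    graph = {s.name: set(s.depends_on) for s in skills}
    for name in TopologicalSorter(graph).static_order():
        skill = by_name[name]
        missing = skill.requires - set(state)
        if missing:
            raise ContractViolation(f"{name} is missing inputs: {sorted(missing)}")
        if skill.needs_approval and not approve(name, state):
            state["halted_at"] = name  # human declined; stop before running
            return state
        state.update(skill.run(state))
    return state


# Placeholder pipeline: docking cannot run without a cleaned structure.
PIPELINE = [
    Skill("fetch_structure", lambda s: {"raw_structure": "placeholder"}),
    Skill("clean_structure", lambda s: {"cleaned_structure": "placeholder"},
          requires=frozenset({"raw_structure"}), depends_on=("fetch_structure",)),
    Skill("dock", lambda s: {"docking_done": True},
          requires=frozenset({"cleaned_structure"}),
          depends_on=("clean_structure",), needs_approval=True),
]

if __name__ == "__main__":
    print(execute(PIPELINE, approve=lambda name, state: True))
    print(execute(PIPELINE, approve=lambda name, state: False))
```

In this sketch the `approve` callback stands in for the human expert. Returning `False` halts the run before the docking step and records where it stopped, so the state can be inspected and the run resumed later.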

Figure 1: The Workflow Plane (Layer B) captures the canonical small-molecule discovery pipeline.

Experiments: SOTA on PharmaBench

The researchers introduced PharmaBench, a benchmark of 88 complex tasks. Mozi outperformed previous baselines by significant margins:

Quantitative Success: On regression tasks (ADMET, DTI), Mozi demonstrated superior tool selection and parameter precision.
Qualitative Mastery: In a 28-task expert-level "Human-Last Exam" subset, Mozi powered by DeepSeek-V3.2 surpassed even proprietary models like Gemini-2.5-Pro.

Table 1: Mozi vs. Biomni, accuracy gains across MCQ, Classification, and Regression tasks.

Case Study: Parkinson’s Disease & LRRK2

In a real-world stress test, Mozi was tasked with finding inhibitors for the LRRK2 kinase.

Target ID: It autonomously selected the 8TXZ cryo-EM structure.
Screening: It screened 377,760 compounds using LigUnity.
Corrective Evolution: When early leads showed hERG toxicity (potential heart risk), the Lead Optimization module autonomously navigated the chemical space to find a safer scaffold.
Result: The final candidate achieved a docking score of -8.924 kcal/mol, comparable to the Phase-II clinical drug DNL-201.

Critical Analysis & Conclusion

Mozi represents a shift from "AI agents as assistants" to "AI agents as infrastructure." By using the Model Context Protocol (MCP), it federates a universe of tools (UniProt, PDB, AutoDock) into a unified fabric.

Limitations: The system still relies on in silico surrogate models. While the AI predicts a -8.9 kcal/mol binding, the "physiological reality" still requires wet-lab validation. Future work must integrate Uncertainty Quantification (UQ) to tell the human expert exactly how "confident" the AI is in its toxicity filters.
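
One common way to attach a confidence signal to a predictor is to measure disagreement across an ensemble of models. The sketch below illustrates that general idea only; the source does not say Mozi uses it, and the function name, threshold, and example scores are invented for illustration.

```python
from statistics import mean, stdev


def summarize_ensemble(predictions, max_std):
    """Summarize ensemble predictions and flag high disagreement.

    predictions: scores for one compound from several models (hypothetical).
    max_std: spread above which a human expert should review the result.
    """
    if len(predictions) < 2:
        raise ValueError("need at least two predictions to estimate spread")
    mu = mean(predictions)
    sigma = stdev(predictions)  # sample standard deviation
    return {"mean": mu, "std": sigma, "needs_review": sigma > max_std}


if __name__ == "__main__":
    # Made-up toxicity scores from three hypothetical models.
    print(summarize_ensemble([0.12, 0.10, 0.11], max_std=0.05))
    print(summarize_ensemble([0.10, 0.90, 0.50], max_std=0.05))
```

A flag like `needs_review` would pair naturally with the HITL checkpoints described earlier, routing only the uncertain predictions to a human.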

Takeaway: Mozi makes the case that scientific AI needs better governance, not just more parameters.
