GitHub - mala-lab/VerifyMAS: Official implementation of "VerifyMAS: Hypothesis Verification for Failure Attribution in LLM Multi-Agent Systems"

VerifyMAS: Hypothesis Verification for Failure Attribution in LLM Multi-Agent Systems

Overview

We propose VerifyMAS, a hypothesis verification framework for agent failure attribution. Instead of directly predicting faulty agents and error types, VerifyMAS formulates and verifies failure hypotheses against full trajectories. This verification-based approach decomposes attribution into trajectory-level error validation and fine-grained agent localization, providing an error-first attribution approach that captures global failure patterns while substantially reducing the search space. We further introduce a hypothesis-based data construction strategy grounded in a structured error taxonomy and fine-tune a specialized LLM verifier model for trajectory-level failure verification and agent attribution. Experiments on Aegis-Bench and Who&When show that VerifyMAS consistently improves diverse backbone models, including open-source Qwen and API-based GPT models, outperforming prior methods without sacrificing inference efficiency for long multi-agent trajectories.
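Since the code has not been released yet, the following is only a minimal sketch of the error-first, two-stage idea described above, not the authors' implementation. It first verifies one hypothesis per error type against the full trajectory, then localizes agents only for the error types that were confirmed. Everything named here is an assumption made for illustration: the `Step` type, the `attribute_failure` function, the toy taxonomy entries (`tool_misuse`, `hallucination`), and the `keyword_verifier`, which stands in for the fine-tuned LLM verifier. Checking agent-level hypotheses against only that agent's steps is also a simplification chosen for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    """One turn of a multi-agent trajectory (hypothetical structure)."""
    agent: str
    content: str


# Hypothetical verifier interface: (hypothesis text, steps) -> does it hold?
Verifier = Callable[[str, List[Step]], bool]

# Toy taxonomy; these error types are placeholders, not the paper's taxonomy.
TOY_TAXONOMY: Dict[str, str] = {
    "tool_misuse": "a tool was called with invalid or empty arguments",
    "hallucination": "an agent asserted unsupported facts",
}


def keyword_verifier(hypothesis: str, steps: List[Step]) -> bool:
    """Toy stand-in for an LLM verifier: the hypothesis holds if any step
    carries a "[tag]" marker for the error type quoted in the hypothesis."""
    for tag in TOY_TAXONOMY:
        if f"'{tag}'" in hypothesis:
            return any(f"[{tag}]" in s.content for s in steps)
    return False


def attribute_failure(
    trajectory: List[Step],
    taxonomy: Dict[str, str],
    verify: Verifier,
) -> Dict[str, List[str]]:
    """Map each confirmed error type to the agents it is attributed to."""
    attribution: Dict[str, List[str]] = {}
    for error_type, description in taxonomy.items():
        # Stage 1: trajectory-level validation of the error hypothesis.
        hypothesis = f"The trajectory contains the error '{error_type}': {description}"
        if not verify(hypothesis, trajectory):
            continue  # rejected hypotheses never reach agent localization
        # Stage 2: agent localization, restricted to confirmed error types.
        agents: List[str] = []
        for agent in dict.fromkeys(s.agent for s in trajectory):
            agent_steps = [s for s in trajectory if s.agent == agent]
            agent_hypothesis = (
                f"Agent '{agent}' committed the error '{error_type}': {description}"
            )
            if verify(agent_hypothesis, agent_steps):
                agents.append(agent)
        attribution[error_type] = agents
    return attribution


trajectory = [
    Step("planner", "Split the task into search and summarize."),
    Step("searcher", "Called search with an empty query [tool_misuse]"),
    Step("writer", "Wrote a summary from the empty result list."),
]
result = attribute_failure(trajectory, TOY_TAXONOMY, keyword_verifier)
print(result)
```

In this sketch the search space shrinks because agent-level checks run only for error types that pass trajectory-level verification; a real verifier would replace `keyword_verifier` with a model call.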

Main Results

Repository Structure

The code will be uploaded soon.

📖 Citation

If you find this work useful, please cite our paper:

@article{qiao2026verifymas,
  title={VerifyMAS: Hypothesis Verification for Failure Attribution in LLM Multi-Agent Systems},
  author={Qiao, Hezhe and Tong, Hanghang and Lim, Ee-Peng and Liu, Bing and Pang, Guansong},
  journal={arXiv preprint arXiv:2605.17467},
  year={2026}
}
