Submit Your Results
Evaluate your AI agent on ClawArena and add it to the public leaderboard.
1. Install ClawArena
Clone the repository and run the setup script. Python 3.10+ required.
git clone https://github.com/aiming-lab/ClawArena.git
cd ClawArena
bash scripts/setup.sh
2. Run Evaluation
Use the CLI to run inference with your agent framework, then score and generate a report.
# Run inference for a single framework
clawarena infer \
  --data data/clawarena/tests.json \
  --framework openclaw \
  --out results/

# Score results
clawarena score --infer-dir results/

# Generate report
clawarena report --score-dir results/ --out report/

# Or run the full pipeline at once
clawarena run \
  --data data/clawarena/tests.json \
  --frameworks openclaw,claude-code \
  --out output/
3. Submit Results
Open an issue or pull request on the ClawArena repository with your results. We review submissions within 48 hours.
# Fork and clone
git clone https://github.com/aiming-lab/ClawArena.git
cd ClawArena

# Add your results to a new branch
git checkout -b results/my-framework
cp -r path/to/results/ results/my-framework/

# Open PR
gh pr create \
  --title "Add [YourFramework + Model] results" \
  --body "Framework: ...\nModel: ...\nOverall: 0.XXX"
Supported Frameworks
ClawArena supports a set of agent frameworks out of the box, including openclaw and claude-code (as used in the commands above). New frameworks can be added via the plugin system.
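The plugin interface itself is not documented on this page. As a rough illustration only, a framework adapter might follow a registry pattern like the sketch below; every name here (`register_framework`, `FrameworkAdapter`, `MyFrameworkAdapter`) is hypothetical and not the actual ClawArena API — check the repository for the real plugin contract.

```python
# Hypothetical plugin sketch — the real ClawArena plugin API may differ.
from abc import ABC, abstractmethod

FRAMEWORK_REGISTRY: dict[str, type] = {}

def register_framework(name: str):
    """Class decorator that registers an adapter under a CLI-facing name."""
    def wrap(cls):
        FRAMEWORK_REGISTRY[name] = cls
        return cls
    return wrap

class FrameworkAdapter(ABC):
    @abstractmethod
    def run(self, test_case: dict) -> dict:
        """Run one test case and return the agent's raw output."""

@register_framework("my-framework")
class MyFrameworkAdapter(FrameworkAdapter):
    def run(self, test_case: dict) -> dict:
        # Call your agent here and return its response for scoring.
        return {"id": test_case["id"], "answer": "..."}
```

Under such a scheme, `clawarena infer --framework my-framework` would resolve your adapter by its registered name.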
Submission Requirements
- ✓ Results must be generated using the ClawArena CLI pipeline
- ✓ Agent must be publicly available or described in a preprint/paper
- ✓ Include the framework name, model name/version, and provider
- ✓ Both multi_choice and exec_check scores should be reported when applicable
- ✓ Results are verified by the ClawArena team before appearing on the leaderboard
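Before opening a PR, a quick local sanity check that both score types are present can save a review round-trip. The sketch below assumes a results layout of a `scores.json` containing `multi_choice` and `exec_check` keys — that layout is an assumption, so adapt it to whatever the `clawarena score` step actually emits.

```python
# Sanity-check sketch — the scores.json layout here is assumed,
# not the documented ClawArena output format.
import json
from pathlib import Path

# Score types named in the submission requirements.
REQUIRED_KEYS = {"multi_choice", "exec_check"}

def missing_scores(scores: dict) -> list[str]:
    """Return the required score keys missing from a scores dict."""
    return sorted(REQUIRED_KEYS - scores.keys())

def check_results_dir(results_dir: str) -> list[str]:
    """Report missing score keys for an assumed results/<framework>/scores.json."""
    path = Path(results_dir) / "scores.json"
    if not path.exists():
        return sorted(REQUIRED_KEYS)
    return missing_scores(json.loads(path.read_text()))
```

If `missing_scores` returns a non-empty list, include a note in your PR explaining why that score type is not applicable to your agent.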