CRUST-bench Leaderboard

CRUST-bench is a benchmark that measures model performance on the C-to-Rust translation task.


Please see our blog post for a more detailed description.

Comparison of build and test success rates (%) across different repair strategies.
Model                                                     Pass@1 Compiler repair     Test repair
                                                   Build    Test   Build    Test   Build    Test
o3-2025-04-16                                         35      19      68      31      68      31
o1-preview-2024-09-12                                 32      15      69      28      54      37
claude-3.7-sonnet-20250219                            26      13      54      23      49      32
claude-3.5-sonnet-20240620                            26      11      49      21      38      24
o1-mini-2024-09-12                                    19       9      47      16      27      21
gpt-4o                                                18       7      52      18      42      22
gemini-1.5-pro                                        11       3      35      11      30      14
arcee-ai/Virtuoso-Medium-v2                            2       2      21       6      10       6
Qwen/QwQ-32B-Preview                                   1       0       1       0       1       0
Adapted SWE-agent (claude-3-7-sonnet-20250219)        41      32