CRUST-bench is a benchmark that measures model performance on the C-to-Rust translation task.
Please see our blog post for a more detailed description.
Model | Pass@1 (Build) | Pass@1 (Test) | Compiler repair (Build) | Compiler repair (Test) | Test repair (Build) | Test repair (Test) |
---|---|---|---|---|---|---|
o3-2025-04-16 | 35 | 19 | 68 | 31 | 68 | 31 |
o1-preview-2024-09-12 | 32 | 15 | 69 | 28 | 54 | 37 |
claude-3.7-sonnet-20250219 | 26 | 13 | 54 | 23 | 49 | 32 |
claude-3.5-sonnet-20240620 | 26 | 11 | 49 | 21 | 38 | 24 |
o1-mini-2024-09-12 | 19 | 9 | 47 | 16 | 27 | 21 |
gpt-4o | 18 | 7 | 52 | 18 | 42 | 22 |
gemini-1.5-pro | 11 | 3 | 35 | 11 | 30 | 14 |
arcee-ai/Virtuoso-Medium-v2 | 2 | 2 | 21 | 6 | 10 | 6 |
Qwen/QwQ-32B-Preview | 1 | 0 | 1 | 0 | 1 | 0 |
Adapted SWE-agent (claude-3-7-sonnet-20250219) | 41 | 32 | – | – | – | – |