sec-shell-exec

security · deterministic-tests · seed tier 3 · published

Best result per model

#  Model             Score  Tests  Run
1  phi-4-mini        1.000  2/2    Q6_K · 24 GB · runner verified
2  qwen3-coder       1.000  2/2    UD-Q4_K_XL · 24 GB · runner verified
3  qwen3-coder-next  1.000  2/2    UD-Q4_K_XL · 24 GB · runner verified

3 models attempted.