SUDO Dataset

Red Teaming Benchmark

Performance Over Time

Showing 1 result | Metric: Attack Success Rate

Top Performing Models

| Rank | Model | Paper | Attack Success Rate | Date | Code |
|------|-------|-------|---------------------|------|------|
| 1 | SUDO | sudo rm -rf agentic_security | 0.00 | 2025-03-26 | AIM-Intelligence/SUDO |

All Papers (1)