OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language World Models Paper • 2604.10866 • Published 3 days ago • 32
RADAR: Robust AI-Text Detection via Adversarial Learning Paper • 2307.03838 • Published Jul 7, 2023 • 1
Gradient Cuff: Detecting Jailbreak Attacks on Large Language Models by Exploring Refusal Loss Landscapes Paper • 2403.00867 • Published Mar 1, 2024
Token Highlighter: Inspecting and Mitigating Jailbreak Prompts for Large Language Models Paper • 2412.18171 • Published Dec 24, 2024
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm Paper • 2511.04570 • Published Nov 6, 2025 • 242