Defending MLLMs from Implicit Jailbreak Attacks

A new class of attacks in which the text and the image each look safe on their own, but their combination carries malicious meaning

22 November 2025

External Data Extraction Attacks against RAG

The paper studies a new class of data extraction attacks against retrieval-augmented generation (RAG) systems

14 November 2025

Fine-Tuning Jailbreaks

The paper discusses vulnerabilities in fine-tuning systems for large language models under conditions close to real-world deployment

10 November 2025