OpenRT - An Open Framework for Red Teaming Multimodal LLMs
OpenRT is a modular and extensible environment for systematic safety evaluation of large language models
Note on the document Small Language Model for AI Agents HandBook
The authors present a new attack called Doublespeak: a simple in-context attack that hijacks the model’s internal representations
A new framework for creating compact models for finding vulnerabilities in C/C++ code
A new attack that makes it possible to determine the topic of an LLM query from encrypted traffic
How LLM selection affects agent security
How future devices with built-in LLMs will become a security problem, because attackers will be able to live off the LLM (LOLLM)
A guide from IBM to designing secure enterprise AI agents using MCP, with verification from Anthropic
A new class of attacks in which the text and the image each look safe on their own, but their combination carries a malicious meaning
Model pruning can be used by an attacker