Architecting secure enterprise AI agents with MCP

Introduction

A guide from IBM, with verification from Anthropic, to designing secure enterprise AI agents using MCP (Model Context Protocol).

It defines what AI agents are: programs that perceive context, plan, use tools, and act to achieve goals. Unlike traditional applications, they are adaptive, probabilistic, and trainable.

It discusses paradigm shifts such as:

From deterministic to probabilistic
From static to adaptive
From code-first to evaluation-first

Agentic Enterprise

This section describes how enterprises move from a traditional IT model to a new paradigm: agentic architecture, in which AI agents become active participants in business processes rather than just auxiliary tools.

IBM argues that deploying such agents requires rethinking organizational, technical, and governance processes so that AI acts within corporate norms: safely, predictably, and controllably.

An agentic enterprise is not simply the adoption of new technologies, but an architectural and cultural transformation where AI agents become “digital employees”.

To do this, an enterprise must:

create a unified agent development lifecycle (ADLC);
implement security and observability processes for agents as for any other software;
integrate agents into existing DevSecOps and CI/CD chains;
implement architectural principles such as hybrid design, governability, isolation, and compliance.

In practice, these principles are realized through hybrid architectures, sandbox isolation, and contextual access control.

The Agent Development Lifecycle (ADLC)

The ADLC is an extended DevSecOps cycle for agents that adds two internal loops:

Experimentation between Build and Test. This makes it possible to improve agent quality;
Real-time optimization (Runtime Loop), which improves quality and reduces costs.

ADLC phases:

Plan - task and KPI definition;
Code & Build - designing prompts, memory, and tools;
Test & Release - testing and certification;
Deploy - secure deployment;
Monitor & Optimize - observation and improvements;
Operate - operation and audit.
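
The phase order and the experimentation loop can be sketched as a small transition function. This is only an illustration: `Phase` and `next_phase` are hypothetical names, and sending a failed Test & Release back to Code & Build is an assumption about how the loop between Build and Test is wired. The runtime loop is not modeled here.

```python
from enum import Enum


class Phase(Enum):
    PLAN = "Plan"
    CODE_BUILD = "Code & Build"
    TEST_RELEASE = "Test & Release"
    DEPLOY = "Deploy"
    MONITOR_OPTIMIZE = "Monitor & Optimize"
    OPERATE = "Operate"


# Enum members iterate in definition order, which matches the ADLC phase list.
ORDER = list(Phase)


def next_phase(current: Phase, tests_passed: bool = True) -> Phase:
    """Return the phase that follows `current`.

    Assumption: a failed Test & Release returns the agent to Code & Build,
    which models the experimentation loop between Build and Test.
    """
    if current is Phase.TEST_RELEASE and not tests_passed:
        return Phase.CODE_BUILD
    index = ORDER.index(current)
    # Assumption: the last phase stays in place; the runtime loop is not modeled.
    return ORDER[min(index + 1, len(ORDER) - 1)]
```

An agent that fails certification therefore cycles between Code & Build and Test & Release until `tests_passed` is true, and only then moves on to Deploy.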

Enterprise Considerations for Building AI Agents

This section explains which factors and conditions enterprises need to consider before creating and deploying AI agents. IBM emphasizes that agentic architecture is not a universal solution: not every task requires agents, and successful deployment requires balancing value, risk, and operational readiness.

When to use agents: IBM recommends starting not with the technology but with the business problem, because not every problem requires an “agentic” approach, and sometimes classic automation, RAG, or simply a prompt interface is sufficient.

Key criteria:

Clearly defined task domain - the agent must solve a specific, measurable business problem;
Contextual decision-making - an agent is needed if the decision depends on context and data;
Need for autonomous actions - when the agent must perform operations, not just provide answers;
Multi-step tasks - an agent is effective for action chains: collection -> analysis -> execution -> verification;
Benefit from adaptivity - the agent should improve with experience, not operate by rigid rules.

Three areas of the most successful agentic solutions are highlighted:

Customer Support & Service
Document-heavy Processes (document workflows, compliance, analysis)
Knowledge Work & Development Augmentation (specialist assistance)

Strategic factors affecting successful agent deployment are defined:

Security & Risk Management
Compliance & Auditability
Business Value Realization
Observability & Operations
Governance & Lifecycle Management

Agent Observability and Operations

This section describes how organizations should observe, manage, and optimize the operation of agentic AI systems in production. It combines two disciplines: agent observability and agent operations.

Agent Observability

Agent observability means obtaining transparency and controllability in agent operation across the entire lifecycle. IBM formulates three key observability principles:

Measure Everything - measure not only technical indicators, but also semantic, behavioral, ethical, and business outcomes;
Observe Early - observability must be built in during development;
Close the Loop - observation must not only record events, but also automatically influence agent improvement.

One of IBM’s key innovations is full tracing of the agent’s reasoning process, which makes it possible to:

understand why the agent made a particular decision,
reproduce actions during an audit,
evaluate reasoning logic and safety.

IBM proposes storing reasoning in structured form (JSON), indicating reasoning steps, tool calls, intermediate states, data sources, and environment context (time, user, access policy).
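
A structured trace of this kind might look like the following sketch. The field names (`trace_id`, `thought`, `result_summary`, and so on) are hypothetical and chosen for illustration; the source only lists the categories of information to record, not a schema.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass
class ToolCall:
    tool: str
    arguments: dict
    result_summary: str


@dataclass
class ReasoningStep:
    index: int
    thought: str
    tool_calls: list[ToolCall] = field(default_factory=list)
    intermediate_state: dict = field(default_factory=dict)
    data_sources: list[str] = field(default_factory=list)


@dataclass
class ReasoningTrace:
    trace_id: str
    user: str            # environment context: who triggered the run
    access_policy: str   # environment context: policy in force
    timestamp: str       # environment context: when the run started
    steps: list[ReasoningStep] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recurses into nested dataclasses and lists.
        return json.dumps(asdict(self), indent=2)


trace = ReasoningTrace(
    trace_id="trace-001",
    user="analyst@example.com",
    access_policy="read-only",
    timestamp=datetime.now(timezone.utc).isoformat(),
)
trace.steps.append(
    ReasoningStep(
        index=0,
        thought="Look up the invoice before answering.",
        tool_calls=[ToolCall("invoice_lookup", {"id": "INV-1"}, "found")],
        data_sources=["erp"],
    )
)
print(trace.to_json())
```

Keeping each step as a separate record is what makes it possible to replay a decision during an audit or to evaluate one reasoning step in isolation.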

Agent Operations

This subsection extends classic DevOps to managing the behavior, reliability, and quality of live agents. IBM defines AgentOps as a set of processes:

agent version management (Model Registry + Policy Registry);
secure deployment and rollback;
continuous reasoning monitoring;
adaptive optimization and self-correction.

AgentOps includes these principles:

Safe Autonomy - permitted autonomy with control.
Continuous Evaluation - constant behavior evaluation.
Observability by Default - reasoning logging is always enabled.
Human-in-the-loop - ability for manual intervention.
Accountability - every agent has an owner and identity.

In agentic systems, the key question changes from “does the system work?” to “does it work correctly?”, because an agent may function technically correctly while still producing wrong or risky decisions.

Agent Security

IBM highlights security as one of the critically important aspects when designing and operating enterprise agents. Unlike traditional applications, agentic architectures:

operate in nondeterministic environments (behavior is not always repeatable);
interact with external tools through protocols such as MCP;
have autonomy and memory, meaning they can make decisions that sometimes go beyond expectations.

Because of this, standard information-security and DevSecOps approaches are insufficient, and an extended, “agent-aware” approach is required.

Key Threats

Uncontrolled access and privilege escalation
The agent can independently raise its access level, bypass approvals, and exceed permissions. This creates gaps in accountability and risks compromising critical systems.
Data leaks and prompt exploitation
Because of the stochastic nature of LLMs, an agent can:
- accidentally disclose confidential information in responses;
- be vulnerable to prompt injection.
Autonomous attacks and their amplification
Compromised agents can:
- coordinate attacks with each other;
- act faster than humans can respond;
- use legitimate tools for malicious actions.
Agentic drift and policy non-compliance
Over time, agents can “drift”: their behavior and goals change without any formal change to the code, yet in ways that violate policy, standards, or regulations. This makes continuous compliance monitoring mandatory.

Security Solution Framework

IBM proposes a holistic framework model with four areas, each addressing a specific business problem:

Agent Identity & Access
- Assign unique digital identifiers to each agent.
- Apply context-dependent and temporary access rights (Just-in-Time access).
- Maintain continuous audit trails of all actions.
Goal: provide full accountability and traceability of agent actions.
Agent & Data Protection
- Use MCP gateways to filter prompts, prevent injections, and control data flows.
- Track anomalous behavior, such as unusual data requests.
- Isolate agents and environments (sandboxing).
Goal: prevent uncontrolled data propagation and malicious operations.
Autonomous Agent Defense
- Implement active threat-hunting mechanisms: monitor agents that detect deviations in the behavior of other agents.
- Apply AI models for automatic attack recognition (for example, injections, goal substitution, memory poisoning).
- Provide rapid containment when threats are detected.
Security Risk & Compliance
- Include agentic systems in corporate risk-management policies.
- Continuously monitor configurations and access patterns.
- Check compliance with regulations and standards (HIPAA, GDPR, ISO, SOC).
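
The Agent Identity & Access ideas (unique identifiers, Just-in-Time access, continuous audit trails) can be illustrated with a minimal sketch. `AccessBroker`, `AccessGrant`, and their methods are hypothetical names, and the in-memory audit list stands in for whatever audit store an enterprise would actually use.

```python
import time
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AccessGrant:
    agent_id: str
    tool: str
    expires_at: float


class AccessBroker:
    """Issues short-lived, per-tool grants and records every decision."""

    def __init__(self) -> None:
        self.audit_log: list[dict] = []

    def register_agent(self) -> str:
        # Each agent receives its own unique identifier.
        return f"agent-{uuid.uuid4()}"

    def grant(self, agent_id: str, tool: str, ttl_seconds: float,
              now: Optional[float] = None) -> AccessGrant:
        now = time.time() if now is None else now
        self._audit(agent_id, tool, "granted", now)
        return AccessGrant(agent_id, tool, now + ttl_seconds)

    def is_allowed(self, grant: AccessGrant, agent_id: str, tool: str,
                   now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        allowed = (
            grant.agent_id == agent_id
            and grant.tool == tool
            and now < grant.expires_at  # the grant expires on its own
        )
        self._audit(agent_id, tool, "allowed" if allowed else "denied", now)
        return allowed

    def _audit(self, agent_id: str, tool: str, outcome: str, at: float) -> None:
        self.audit_log.append(
            {"agent": agent_id, "tool": tool, "outcome": outcome, "at": at}
        )
```

Because a grant is bound to one agent, one tool, and one time window, a compromised agent cannot reuse it for a different tool or after it has expired, and every attempt leaves an audit entry.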

Risk Management & Compliance

Extended requirements for enterprise environments:

Add agent components to the software supply chain: include an SBOM (Software Bill of Materials) for agents, tools, and prompts;
Sign and verify artifacts (signatures, versions, hashes) before deployment;
Scan MCP server and plugin dependencies;
Introduce least-privilege permissions by default for tools;
Conduct continuous audits for transparency, fairness, and safety.
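
As a rough sketch of the "hash, sign, verify before deployment" requirement, the following uses SHA-256 digests as a very small SBOM and an HMAC as the signature. This is an assumption made for brevity: HMAC is a shared-key stand-in, and a real pipeline would more likely use asymmetric signatures and a standard SBOM format. All function names are hypothetical.

```python
import hashlib
import hmac


def digest(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()


def build_sbom(artifacts: dict[str, bytes]) -> dict[str, str]:
    """Map each artifact name (agent code, tool, prompt) to its SHA-256 digest."""
    return {name: digest(content) for name, content in sorted(artifacts.items())}


def sign_sbom(sbom: dict[str, str], key: bytes) -> str:
    # Canonical, order-independent payload so the signature is stable.
    payload = "\n".join(f"{name}:{sha}" for name, sha in sorted(sbom.items()))
    return hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_before_deploy(artifacts: dict[str, bytes], sbom: dict[str, str],
                         signature: str, key: bytes) -> bool:
    """Reject the release if the SBOM signature or any artifact digest differs."""
    if not hmac.compare_digest(sign_sbom(sbom, key), signature):
        return False
    return build_sbom(artifacts) == sbom
```

A prompt that is edited after signing changes its digest, so `verify_before_deploy` returns `False` and the deployment can be blocked.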

Governance: Test, Certify & Catalog

This section describes how to formalize AI agent lifecycle governance: from development and testing to certification, deployment, and subsequent control. In other words, this is a corporate trust system: who can launch, change, and use what in the agentic-solution ecosystem, and how. IBM emphasizes that without formalized governance and certification, it is impossible to scale agentic systems safely in an enterprise environment.

Governed Catalog

The catalog is a centralized registry of all agents, tools, models, prompts, and their relationships. It provides transparency, control, and audit, like a service catalog in DevSecOps, but for agentic systems.

It records:

Registration - agent purpose, owner, environment (dev, stage, prod), data classification boundaries.
Capabilities - list of tools, resources, and prompts the agent works with.
Risk Posture - description of the threat model, acceptable risk level, and applied protections.
Policies:
- Authority boundaries - clear autonomy limits: what the agent can do itself and what requires human approval.
- Data handling - rules for handling data: classification, masking, minimization, storage, consent.
- Auditability - requirements for tracing and storing logs: who did what, when, and why.
Evidence - links to evaluation reports (evals), red-team tests, approvals, and audit artifacts.
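
A catalog record with an authority-boundary check could be sketched as below. The `CatalogEntry` fields and the three outcomes of `decide` are hypothetical; denying anything that is not listed is an assumption in line with least privilege, not something the source prescribes.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    # Registration
    name: str
    owner: str
    environment: str          # "dev", "stage", or "prod"
    data_classification: str
    # Capabilities
    capabilities: list[str] = field(default_factory=list)
    # Policies: authority boundaries
    autonomous_actions: set[str] = field(default_factory=set)
    approval_required: set[str] = field(default_factory=set)
    # Evidence
    evidence_links: list[str] = field(default_factory=list)

    def decide(self, action: str) -> str:
        """Return how an action should be handled under the authority boundaries."""
        if action in self.autonomous_actions:
            return "allow"
        if action in self.approval_required:
            return "needs_human_approval"
        # Assumption: anything not listed is denied by default.
        return "deny"


entry = CatalogEntry(
    name="invoice-agent",
    owner="finance-platform",
    environment="prod",
    data_classification="confidential",
    autonomous_actions={"read_invoice"},
    approval_required={"issue_refund"},
)
print(entry.decide("issue_refund"))
```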

Certification Workflow

This process formalizes the agent’s transition from development to production. It includes multi-stage validation and checks for quality, security, and compliance:

Pre-release Checks

Quality, security, and policy compliance checks.
Conducting red-teaming (attack simulation).
Confirming that all required approvals have been coordinated.

Promotion Gates

Feature flags and rollback mechanisms must be present.
Deployment plan and kill switch for problem cases.
Creating a change ticket and release documentation.

Runtime Attestations

Signing and verifying artifacts (prompts, tools, code, models).
SBOM availability: a complete list of dependencies and components.
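
The three groups of checks can be expressed as a single gate function that reports which requirements are still unmet. This is a sketch under assumptions: `ReleaseCandidate`, its fields, and the gate names are hypothetical, and each check is reduced to a boolean for clarity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReleaseCandidate:
    quality_checks_passed: bool
    red_team_completed: bool
    approvals: set[str]
    has_feature_flag: bool
    has_rollback_plan: bool
    has_kill_switch: bool
    change_ticket: Optional[str]
    artifacts_signed: bool
    sbom_present: bool


def failed_gates(rc: ReleaseCandidate, required_approvals: set[str]) -> list[str]:
    """Return the names of gates that block promotion; an empty list means go."""
    checks = {
        # Pre-release checks
        "quality": rc.quality_checks_passed,
        "red_team": rc.red_team_completed,
        "approvals": required_approvals <= rc.approvals,
        # Promotion gates
        "feature_flag": rc.has_feature_flag,
        "rollback": rc.has_rollback_plan,
        "kill_switch": rc.has_kill_switch,
        "change_ticket": rc.change_ticket is not None,
        # Runtime attestations
        "signed_artifacts": rc.artifacts_signed,
        "sbom": rc.sbom_present,
    }
    return [name for name, ok in checks.items() if not ok]
```

Returning the list of failed gates, rather than a single boolean, gives the release owner a concrete to-do list and leaves a record of why a promotion was refused.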

Experimentation Tracking & Lineage

IBM considers lineage tracing a mandatory part of governance, because it ensures reproducibility of agent behavior and transparency of decisions. The approach is similar to MLOps, but applied at the level of agentic systems. Experiment tracking includes:

Run metadata: date, dataset (or its hash/version), prompt version, model, tools, configuration, code commit ID, eval-suite version.
Lineage Graph: Connects experiments, candidates, and releases. Shows how and why one agent variant became the “champion”.
Replayability: Ability to partially reproduce an experiment using saved trace IDs and seeds.
Governance Link: All candidates and results (evals, reports, metrics) are attached to the agent card in the catalog.
Reproducible Manifest: A signed manifest that fixes the versions of all components (agent, prompts, model, datasets, tools).
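
A minimal sketch of run metadata with a manifest digest and a seed is shown below. The manifest keys and function names are hypothetical, and the "evaluation" is a seeded random number that stands in for a real eval suite. A seed only fixes the randomness under the experimenter's control, which is one reason reproduction is described as partial.

```python
import hashlib
import json
import random


def manifest_digest(manifest: dict) -> str:
    """Hash a canonical JSON form so key order does not change the digest."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def run_experiment(manifest: dict, seed: int) -> dict:
    rng = random.Random(seed)        # isolated generator, reproducible per seed
    score = round(rng.random(), 6)   # stand-in for a real eval-suite result
    return {
        "manifest_sha256": manifest_digest(manifest),
        "seed": seed,
        "score": score,
    }


manifest = {
    "agent": "invoice-agent",
    "prompt_version": "v3",
    "model": "example-model",
    "tools": ["invoice_lookup"],
    "dataset_hash": "abc123",
    "eval_suite": "1.2.0",
}
first = run_experiment(manifest, seed=42)
replay = run_experiment(manifest, seed=42)
print(first == replay)
```

Any change to a pinned component, such as the prompt version, produces a different `manifest_sha256`, so results from different configurations cannot be confused in the lineage graph.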

Versioning & Lifecycle Management

This section describes how to maintain controlled agent evolution.

Core principles:

Semantic Versioning - separate versions for the agent, tools, and prompts. Additive changes are allowed; critical changes require separate review.
Provenance & SBOM - for each version, a Software Bill of Materials is created, including source code (commit), versions of tools and models, prompt hashes, dependencies, and datasets. Everything is signed and stored with the release.
Release Notes and Impact Levels - each release is classified and has its own notifications and checks.
Deprecation Policy - notifications about version deprecation with timelines and dual-run mode.
Champion-Challenger Evaluation - new versions are compared with current ones on real data.
Retirement - the process of deactivating an agent while preserving all data, artifacts, and compliance evidence.
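
Two of these principles lend themselves to a short sketch: deciding when a version change needs separate review, and comparing a challenger with the current champion. Both rules here are assumptions for illustration: a major-version bump is treated as the critical change, and the challenger must beat the champion's mean score by a hypothetical `min_gain` margin.

```python
from statistics import mean


def parse_version(version: str) -> tuple[int, int, int]:
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def needs_separate_review(current: str, proposed: str) -> bool:
    """Assumption: a major-version bump marks a critical change."""
    return parse_version(proposed)[0] > parse_version(current)[0]


def pick_champion(champion_scores: list[float], challenger_scores: list[float],
                  min_gain: float = 0.02) -> str:
    """Promote the challenger only if its mean score improves by `min_gain`."""
    gain = mean(challenger_scores) - mean(champion_scores)
    return "challenger" if gain >= min_gain else "champion"
```

Requiring a minimum gain avoids replacing a proven version because of noise in the evaluation data.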

MCP Servers Lifecycle: Enterprise Guide & Best Practices

This section describes how to design, deploy, and manage MCP (Model Context Protocol) servers, the key components through which AI agents safely interact with enterprise systems and perform actions. The section covers these topics:

MCP Concept

MCP is a protocol that standardizes agent access to tools, resources, and prompts. It provides security, compatibility, and scalability.

Architecture and the MCP Gateway Pattern

It is recommended to use a centralized gateway (MCP Gateway) as a single place for:

authentication and authorization;
routing, quotas, and policies;
audit and logging;
environment separation (dev/stage/prod).
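
The gateway's responsibilities can be sketched in plain Python as follows. This is not the API of any MCP SDK: `McpGateway` and its methods are hypothetical, tools are ordinary callables, and tokens are simple strings, so the example shows only the order of checks (authentication, authorization, quota, routing, audit).

```python
from collections import defaultdict
from typing import Any, Callable


class McpGateway:
    def __init__(self, environment: str, quota_per_agent: int) -> None:
        self.environment = environment            # one gateway per dev/stage/prod
        self.quota_per_agent = quota_per_agent
        self.tokens: dict[str, str] = {}          # token -> agent id
        self.permissions: dict[str, set[str]] = {}
        self.routes: dict[str, Callable[[dict], Any]] = {}
        self.calls: dict[str, int] = defaultdict(int)
        self.audit_log: list[tuple[str, str, str, str]] = []

    def register_agent(self, token: str, agent_id: str, tools: set[str]) -> None:
        self.tokens[token] = agent_id
        self.permissions[agent_id] = set(tools)

    def register_tool(self, name: str, handler: Callable[[dict], Any]) -> None:
        self.routes[name] = handler

    def call(self, token: str, tool: str, arguments: dict) -> dict:
        agent = self.tokens.get(token)
        if agent is None:
            return self._deny("unknown", tool, "unauthenticated")
        if tool not in self.permissions.get(agent, set()):
            return self._deny(agent, tool, "unauthorized")
        if self.calls[agent] >= self.quota_per_agent:
            return self._deny(agent, tool, "quota_exceeded")
        if tool not in self.routes:
            return self._deny(agent, tool, "no_route")
        self.calls[agent] += 1
        self.audit_log.append((self.environment, agent, tool, "ok"))
        return {"ok": True, "result": self.routes[tool](arguments)}

    def _deny(self, agent: str, tool: str, reason: str) -> dict:
        # Denials are audited too, so refused attempts remain visible.
        self.audit_log.append((self.environment, agent, tool, reason))
        return {"ok": False, "error": reason}
```

Placing every check in one component means individual tool servers do not each need to reimplement authentication, quotas, and logging.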

Security and Isolation

Least privilege and strict authentication (OAuth, mTLS);
Validation and sanitization of all inputs/outputs;
Containerization and sandboxing of plugins;
Storing secrets only in secret managers.

Reliability and Scaling Practices

Rate limiting, health checks, circuit breakers;
Asynchronous and idempotent operations;
Schema versioning and backward compatibility.
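
Of these practices, the circuit breaker is the least self-explanatory, so here is a minimal sketch. The class and its parameters (`failure_threshold`, `reset_after`) are hypothetical, and the half-open behavior, in which one trial call is allowed after the cool-down, is one common variant rather than a prescribed design.

```python
import time
from typing import Any, Callable, Optional


class CircuitBreaker:
    def __init__(self, failure_threshold: int, reset_after: float) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after      # cool-down in seconds
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, func: Callable[..., Any], *args: Any,
             now: Optional[float] = None) -> Any:
        now = time.monotonic() if now is None else now
        if self.opened_at is not None:
            if now - self.opened_at < self.reset_after:
                # Open: fail fast instead of hitting the unhealthy backend.
                raise RuntimeError("circuit open")
            # Half-open: allow one trial call; a single failure reopens.
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = func(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = now
            raise
        self.failures = 0   # any success closes the circuit fully
        return result
```

Wrapping calls to a tool backend in such a breaker keeps an agent from repeatedly retrying a failing system, which matters when agents act faster than operators can react.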

Governance, Compliance, and Observability

Centralized policies (policy-as-code);
Structured audits of “who/what/when/why”;
SBOM, container signing, supply-chain control.

Testing and Certification

Security tests, fuzzing, load and chaos tests;
Checking tool contracts and model compatibility.

Containerization and CI/CD Practices

Minimal non-root images, health probes, manifests;
Automatic scanning, signing, and deployment with gates.

Reference Architecture & Enterprise Requirements for an Agentic AI Platform

IBM describes a reference architecture for building an enterprise platform that supports the lifecycle of agentic systems (ADLC), from build and testing to operation, monitoring, and governance. This is the basis for creating secure, governable, and scalable enterprise agents integrated with corporate data, processes, and policies.

Four Key Architecture Phases

Build - continuous integration, testing, synthetic data, red-teaming, built-in security and quality checks.
Deploy - deployment of models and agents with orchestration, policies, guardrails, and secure access to data through AI Gateways and MCP servers.
Monitor & Optimize - observation, telemetry, drift detection, performance and cost optimization; detection of anomalies and shadow agents.
Manage - compliance validation, certification, audit, risk management, policy updates, and deactivation of outdated agents.

Two Fundamental Pillars

Governed Catalog - centralized registry of approved agents, models, prompts, and tools with policies, versions, and compliance artifacts.
Security & Governance Layer - a unified system of identity, access policies, audit, and certification integrated into every ADLC stage.

Non-Functional Requirements Summary

Architecture and integration:

Agent and tool catalogs;
MCP Gateway for routing and policies;
Model Gateway for unified access to LLMs;
Horizontal and federated scaling.

Build-time security:

RBAC control for developers;
Data security;
Access logging;
Build-environment observability;
Supply-chain control.

Runtime security:

Agent identities;
OAuth authentication;
Rights delegation;
BYOK encryption;
Strict isolation;
Protection of prompts and artifacts;
Audit and incident response.

Observability:

Full telemetry (metrics, events, logs, traces);
Integration with the enterprise observability stack;
Token and cost accounting.
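
Token and cost accounting can be as simple as a per-agent ledger. In this sketch the class name and the per-1,000-token prices are hypothetical inputs, not real pricing.

```python
from collections import defaultdict


class UsageLedger:
    def __init__(self, price_per_1k_input: float, price_per_1k_output: float) -> None:
        # Prices are illustrative parameters, not real model pricing.
        self.price_in = price_per_1k_input
        self.price_out = price_per_1k_output
        self.tokens = defaultdict(lambda: {"input": 0, "output": 0})

    def record(self, agent_id: str, input_tokens: int, output_tokens: int) -> None:
        self.tokens[agent_id]["input"] += input_tokens
        self.tokens[agent_id]["output"] += output_tokens

    def cost(self, agent_id: str) -> float:
        usage = self.tokens[agent_id]
        return (
            usage["input"] / 1000 * self.price_in
            + usage["output"] / 1000 * self.price_out
        )
```

Attributing usage to an agent identifier, rather than to a shared API key, is what allows cost to be reported per agent and per owner.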

Governance & Compliance:

Compliance with standards (ISO, SOC, GDPR, HIPAA);
Drift detection;
Secure catalogs;
Integration with GRC systems.

Resilience & Ethics:

Self-healing;
Fault tolerance;
Cost control;
Metrics.

Deployment & Portability:

Support for environments ranging from isolated (air-gapped) to cloud;
Portability;
Versioning of models and tools.

Functional Requirements Summary

Memory & State:

Short- and long-term memory;
Context storage;
Integration with vector/graph databases;
PII handling rules.

Planning & Execution:

Task decomposition;
Secure tool orchestration;
Asynchronicity;
Human-in-the-loop for critical actions.

Interoperability:

MCP protocol support;
OpenAI-compatible APIs, plugins, and tool marketplace;
BYO models and agents.

Knowledge Management:

RAG mechanisms;
Artifact storage (reports, visualizations);
Large-scale data processing.

Human-Agent Collaboration:

Transparent and explainable decisions;
Tracing reasoning chains.

Performance & Evaluation:

Behavior logging;
Self-evaluation (self-eval);
Red-teaming;
Champion-challenger comparison;
CI/CD integration.

Future Autonomy:

Multi-agent interactions;
Self-learning;
Event-driven response;
Secure kill switches.

IBM’s reference agentic platform is a multi-layer ecosystem that provides security, observability, governance, and compliance at every stage of the agent lifecycle. It combines DevSecOps practices with AI governance principles so that enterprises can scale agentic systems safely, transparently, and controllably.