CoPhish — Using Microsoft Copilot Studio as a Wrapper: Security Implications for MCPs and AI Agents

CoPhish Security Analysis for Agents and MCPs

Datadog Security Labs' recent publication on CoPhish — exploiting Microsoft Copilot Studio wrappers to bypass safeguards and trigger unintended actions — is a timely reminder that AI-powered automation introduces novel attack surfaces. For teams running Managed Cloud Provider (MCP) environments with agent-based workflows, AI assistants, or brokered automations, the same vulnerability patterns apply with even higher stakes.

This post summarizes Datadog's findings and translates them into practical security design and mitigation strategies for MCP operators and agent architects.

What Datadog Found: CoPhish in Brief

Datadog's research demonstrated that Copilot Studio wrappers—lightweight LLM-driven interfaces around cloud APIs and tools—can be manipulated through carefully crafted prompts and chained operations. Key findings:

Prompt injection attacks: Attacker-controlled inputs can override intended behavior and bypass safety guardrails.
Tool chaining vulnerabilities: Sequences of API calls or commands without strict boundaries allow exfiltration of secrets, data, or privilege escalation.
Assumption failures: Trust in downstream tools (connectors, APIs, cloud services) becomes exploitable if not properly isolated.
Cross-tenant risks: In multi-tenant environments, a compromised agent can leak or modify data for other customers.

📊 Deep Dive: Read the full technical analysis from Datadog Security Labs

Why This Matters for MCPs and Agent-Based Architectures

MCPs and similar managed cloud operations often rely on background automation, brokered assistants, or orchestration agents to:

Execute workflows on behalf of customers (deployments, policy updates, compliance checks).
Hold or access credentials to customer cloud resources (service principals, API keys, tokens).
Make decisions with elevated privileges (approvals, deletions, cross-tenant operations).

This combination creates a significantly larger attack surface than a single Copilot Studio instance. If an attacker can influence the agent's input—via a compromised webhook, malformed API payload, or prompt injection in a free-text field—they could trigger privileged actions affecting many customers.

Why MCP Servers & Copilot Agents Need to Be Secure

These systems have elevated access and capabilities that create serious security risks if compromised:

1. Privileged Access & Execution

MCP servers and agents can access your entire digital environment and execute commands with your permissions:

Data access: Files, code, credentials, emails, documents, company data, environment variables
System control: Execute shell commands, modify files, install packages, run scripts, access terminal
Your permissions: If you have admin/root access, so does the agent—no additional authentication required

Risk: Compromised agents can delete critical files, install malware, create backdoors, mine cryptocurrency, or exfiltrate all sensitive data to external servers.

2. External Services & Supply Chain

Agents connect to critical services and rely on third-party dependencies:

Services: Cloud providers (AWS/Azure/GCP), databases, payment processors, communication tools, version control systems
Dependencies: npm packages, Python libraries, third-party APIs, Docker images—any could contain malware

Risk: Unauthorized cloud usage and unexpected bills, production database breaches, financial fraud, leaked communications, or a single compromised dependency that exposes everything the agent can reach.

3. Attack Vectors

Multiple ways attackers can exploit AI agents:

Prompt Injection: Hidden malicious instructions in uploaded files (PDFs, documents) that manipulate the agent into executing attacker commands instead of legitimate requests.

Silent Exfiltration: Agents process large amounts of data automatically and make numerous API calls without user review, so malicious actions can be disguised as legitimate operations. You won't notice until it's too late.

Cross-Tenant Leakage: In multi-tenant systems, one user's sensitive data or prompts can leak to another user, causing regulatory violations (GDPR, HIPAA) and lawsuits.

Security Best Practices

Access Control

Least Privilege: Grant only the minimum necessary permissions (read-only file access, allowlisted APIs, limited environment variables)
Authentication: Verify user identity with token-based auth, validate sessions, check permissions per operation, use short-lived tokens
Network Isolation: Restrict access to only necessary endpoints, use allowlists, block internal networks, implement firewall rules
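
As a rough illustration of the network isolation point, the sketch below checks an outbound URL against a host allowlist and flags internal addresses before an agent is allowed to call it. The host name `api.example.com` and the function names are placeholders invented for this example, not part of any product. A real deployment would also enforce the same rule at the firewall or proxy and re-check the address after DNS resolution.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical allowlist: the only hosts this agent may call.
ALLOWED_HOSTS = {"api.example.com"}


def is_internal_address(address: str) -> bool:
    """Return True for private, loopback, or link-local IP addresses."""
    ip = ipaddress.ip_address(address)
    return ip.is_private or ip.is_loopback or ip.is_link_local


def is_allowed_url(url: str) -> bool:
    """Allow only HTTPS requests to allowlisted host names."""
    parts = urlsplit(url)
    if parts.scheme != "https" or parts.hostname is None:
        return False
    try:
        ipaddress.ip_address(parts.hostname)
    except ValueError:
        # Not a literal IP address, so compare against the allowlist.
        return parts.hostname in ALLOWED_HOSTS
    # Literal IP addresses are never on the allowlist in this sketch.
    return False
```

Rejecting literal IP addresses outright keeps requests to cloud metadata endpoints such as 169.254.169.254 from slipping past a host name check.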

Input & Output Protection

Validation: Enforce input length limits, filter system commands, detect SQL injection, reject suspicious patterns
Rate Limiting: Prevent abuse with request throttling (e.g., 100 requests/minute per user/IP)
Audit Logging: Log all sensitive operations with timestamps, actions, files accessed, user IDs, and IP addresses
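
The rate limiting item can be sketched as a small in-memory sliding-window limiter. This is a minimal sketch under assumptions: the class name and the key format are made up for illustration, and a production service would more likely keep the counters in a shared store so that every instance sees the same counts.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window_seconds` for each key."""

    def __init__(self, limit: int = 100, window_seconds: float = 60.0):
        self.limit = limit
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

A caller would build the key from whatever identifies the requester, for example a user ID combined with the source IP, and reject the request when `allow` returns `False`.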

Secrets & Dependencies

Secrets Management: Never hardcode credentials—use secure vaults, environment variables with access controls, platform secret managers
Dependency Scanning: Regularly update packages, scan for vulnerabilities, review third-party code, use trusted sources

Monitoring

Regular Audits: Review access logs, test security controls, update dependencies, review permissions periodically

Practical Mitigations: Start Here

1. Zero-Trust Input Validation

Treat every input as untrusted, regardless of source:

Normalize and validate against strict schemas before any downstream action.
Reject unexpected input formats, encodings, or structures.
Log all validation failures for audit and threat detection.
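
The three steps above (normalize, validate against a strict schema, log failures) might look like the following in Python. The schema, field names, and allowed actions are invented for this example; the point is only that anything not explicitly expected is rejected before a downstream action runs.

```python
import json
import logging
import unicodedata

logger = logging.getLogger("agent.validation")

# Hypothetical schema: field name -> (expected type, maximum length).
TASK_SCHEMA = {
    "action": (str, 32),
    "tenant_id": (str, 64),
    "resource": (str, 256),
}
ALLOWED_ACTIONS = {"read_policy", "run_compliance_check"}


class ValidationError(ValueError):
    pass


def _reject(reason: str) -> ValidationError:
    # Every rejection is logged so it can feed audit and threat detection.
    logger.warning("validation failure: %s", reason)
    return ValidationError(reason)


def validate_task(raw: bytes) -> dict:
    """Parse, normalize, and validate a task payload."""
    try:
        payload = json.loads(raw.decode("utf-8", errors="strict"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise _reject("payload is not valid UTF-8 JSON") from exc

    if not isinstance(payload, dict) or set(payload) != set(TASK_SCHEMA):
        raise _reject("unexpected fields or structure")

    clean = {}
    for field, (expected_type, max_len) in TASK_SCHEMA.items():
        value = payload[field]
        if not isinstance(value, expected_type):
            raise _reject(f"{field} has the wrong type")
        value = unicodedata.normalize("NFKC", value).strip()
        if not value or len(value) > max_len:
            raise _reject(f"{field} is empty or too long")
        clean[field] = value

    if clean["action"] not in ALLOWED_ACTIONS:
        raise _reject("action is not on the allowlist")
    return clean
```

Only the dictionary returned by `validate_task` should be passed on; the raw payload is discarded.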

2. Least-Privilege Credentials

Minimize what each agent can do:

Use short-lived tokens (minutes to hours, not days or months).
Narrow scopes: prefer role-based access with minimal permissions per operation.
Customer-isolated service principals: each agent instance should have separate credentials per customer, not shared keys.
Rotate credentials regularly and monitor for unusual patterns.
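
To make the credential rules concrete, here is a minimal sketch of a short-lived, customer-bound, scope-limited token and the check an agent would run before each operation. The `ScopedToken` type and both functions are hypothetical; in practice the token would be issued by your identity platform rather than generated in application code.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ScopedToken:
    value: str
    customer_id: str
    scopes: frozenset
    expires_at: float


def issue_token(customer_id: str, scopes: Iterable[str],
                ttl_seconds: float = 900.0,
                now: Optional[float] = None) -> ScopedToken:
    """Issue a token bound to one customer, valid for minutes rather than days."""
    now = time.time() if now is None else now
    return ScopedToken(
        value=secrets.token_urlsafe(32),
        customer_id=customer_id,
        scopes=frozenset(scopes),
        expires_at=now + ttl_seconds,
    )


def authorize(token: ScopedToken, customer_id: str, scope: str,
              now: Optional[float] = None) -> bool:
    """Check expiry, customer binding, and scope for a single operation."""
    now = time.time() if now is None else now
    return (
        now < token.expires_at
        and token.customer_id == customer_id
        and scope in token.scopes
    )
```

Because the customer ID is part of the check, a token issued for one customer cannot authorize an action against another, even if the scope matches.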

3. Isolation for Tool Chains

Separate concerns and limit agent autonomy:

Keep LLM prompts, connectors, and action executors in separate trust domains.
Use explicit, structured commands (e.g., JSON task definitions) instead of free-text prompts for sensitive operations.
Prefer functionally-limited adapter layers over monolithic automation flows.
Require deterministic approval gates for high-impact operations (deletions, cross-tenant actions, credential access).
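
One way to picture structured commands plus a deterministic approval gate: the LLM may only propose a typed task, and a separate executor decides whether it runs. The action names, the `Task` type, and the example handlers below are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

# Hypothetical set of actions that always require a recorded approver.
HIGH_IMPACT_ACTIONS = {"delete_resource", "cross_tenant_copy", "read_credential"}


@dataclass(frozen=True)
class Task:
    action: str
    tenant_id: str
    params: dict = field(default_factory=dict)
    approved_by: Optional[str] = None


def execute(task: Task, handlers: Dict[str, Callable[[Task], str]]) -> str:
    """Run a task only if it maps to a known handler and passes the gate."""
    handler = handlers.get(task.action)
    if handler is None:
        raise PermissionError(f"unknown action: {task.action}")
    if task.action in HIGH_IMPACT_ACTIONS and not task.approved_by:
        raise PermissionError(f"{task.action} requires approval")
    return handler(task)


# Example handlers; each one is a narrow adapter, not a general-purpose tool.
EXAMPLE_HANDLERS = {
    "list_policies": lambda t: f"policies for {t.tenant_id}",
    "delete_resource": lambda t: f"deleted {t.params['name']}",
}
```

The gate is plain code with no model in the loop, so a prompt cannot talk its way past it. Free-text output from the model never reaches a handler directly.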

4. Observability and Guardrails

Detect and prevent abuse in real time:

Log all agent actions with context: requester identity, input payload, decision rationale, outcome.
Rate-limit chained operations (e.g., max 5 sequential API calls per workflow).
Set deterministic approval queues for sensitive flows, with human review where practical.
Alert on anomalies: unusual action sequences, out-of-band API calls, rapid credential usage.
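
The cap on chained operations can be enforced with a per-workflow budget that also writes an audit line for every call. This is a sketch under assumptions: the class and logger names are invented, and the limit of 5 simply mirrors the example figure above.

```python
import logging

logger = logging.getLogger("agent.audit")


class ChainLimitExceeded(RuntimeError):
    pass


class WorkflowBudget:
    """Track tool calls made within one workflow and stop runaway chains."""

    def __init__(self, workflow_id: str, requester: str, max_calls: int = 5):
        self.workflow_id = workflow_id
        self.requester = requester
        self.max_calls = max_calls
        self.calls = []

    def record(self, tool_name: str) -> None:
        if len(self.calls) >= self.max_calls:
            logger.warning(
                "chain limit hit: workflow=%s requester=%s blocked=%s",
                self.workflow_id, self.requester, tool_name,
            )
            raise ChainLimitExceeded(
                f"workflow {self.workflow_id} exceeded {self.max_calls} calls"
            )
        self.calls.append(tool_name)
        logger.info(
            "tool call: workflow=%s requester=%s tool=%s step=%d",
            self.workflow_id, self.requester, tool_name, len(self.calls),
        )
```

The executor calls `record` before every tool invocation; the warning emitted when the limit is hit is a natural signal to route into alerting.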

5. Input Provenance and Audit

Trace every action back to its origin:

Attach metadata to every input: source system, requester identity, request ID, integrity signature.
Maintain immutable audit logs separate from operational logs.
Use signing or HMAC to verify that inputs haven't been tampered with in transit.
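
A minimal sketch of the provenance envelope using Python's standard `hmac` module follows. The envelope field names and the two helper functions are illustrative choices, and the shared key is assumed to come from a secret manager rather than from source code.

```python
import hashlib
import hmac
import json
import uuid
from typing import Optional


def _canonical_bytes(envelope: dict) -> bytes:
    # Stable serialization so signer and verifier hash identical bytes.
    return json.dumps(envelope, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_envelope(source: str, requester: str, body: dict, key: bytes,
                  request_id: Optional[str] = None) -> dict:
    """Wrap a payload with provenance metadata and an HMAC-SHA256 signature."""
    envelope = {
        "source": source,
        "requester": requester,
        "request_id": request_id or str(uuid.uuid4()),
        "body": body,
    }
    signature = hmac.new(key, _canonical_bytes(envelope), hashlib.sha256).hexdigest()
    return {**envelope, "signature": signature}


def verify_envelope(signed: dict, key: bytes) -> bool:
    """Recompute the signature and compare it in constant time."""
    unsigned = {k: v for k, v in signed.items() if k != "signature"}
    expected = hmac.new(key, _canonical_bytes(unsigned), hashlib.sha256).hexdigest()
    provided = str(signed.get("signature", ""))
    return hmac.compare_digest(expected.encode("utf-8"), provided.encode("utf-8"))
```

The receiving agent should call `verify_envelope` before validation or execution and write the `request_id` into its audit log, so that every action can be traced back to a signed input.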

Bottom Line

MCP servers and Copilot agents often hold privileged access to your data and systems. Without proper security controls, a compromised agent can lead to stolen data, compromised systems, leaked credentials, and legal and financial consequences for your organization.

Security isn't optional—it's fundamental to building trustworthy AI systems.

Concluding Recommendations

CoPhish is a timely wake-up call for the AI and automation community. The attack surface introduced by LLM-driven wrappers and agent autonomy is real and expanding. For MCPs, the stakes are higher because trusted agents act across multiple customers and manage sensitive credentials.

The path forward:

Adopt zero-trust principles for agent inputs and credentials.
Design with human oversight in mind—approvals for sensitive actions.
Implement crisp, well-defined contracts between agents and downstream systems.
Make observability and audit logging a first-class requirement.
Test continuously for prompt injection, privilege escalation, and cross-tenant leaks.

If you're operating an MCP or building agent-based automations, now is the time to conduct a security audit of your architecture. A small investment in hardening and observability today will save you from a costly breach tomorrow.

About the Author

Harri Jaakkonen is a Cloud Security Engineer with 30 years of experience designing and implementing Zero Trust security architectures on Microsoft Azure. He's mentored over 2,500 professionals on their certification journey and is passionate about making cybersecurity education accessible to everyone.
