SOC 2 for AI Agents: What It Means and Why It Matters


SOC 2 (System and Organization Controls 2) is the de facto compliance standard for SaaS companies. Developed by the AICPA, it evaluates an organization's controls across five Trust Services Criteria: Security, Availability, Processing Integrity, Confidentiality, and Privacy. If your enterprise customers ask for a SOC 2 report — and they will — you need one.

But SOC 2 was designed in 2010 for cloud services operated by humans. AI agents introduce challenges that the original framework did not anticipate: autonomous decision-making, model drift, prompt injection, capability scope management, and trust scoring. This article maps each Trust Services Criterion to the realities of AI agent deployment and explains what auditors are starting to look for.

The Five Criteria, Mapped to AI Agents

1. Security (CC6, CC7, CC8)

For traditional SaaS: access controls, encryption, vulnerability management. For AI agents: all of the above plus agent identity verification (does each agent have a unique, verifiable identity?), capability scope enforcement (can the agent only access what it is authorized to access?), and inter-agent authentication (when agents communicate, how are they authenticated?). Shulam addresses this through the AAIN (Autonomous Agent Identification Number) system and cryptographic capability attestation.
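To make "capability scope enforcement" concrete, here is a minimal sketch of a deny-by-default check performed before an agent acts. The names AgentIdentity, ScopeViolation, and authorize, the identifier "agent-001", and the action strings are all hypothetical; this does not show the AAIN format or how Shulam implements attestation.

```python
from dataclasses import dataclass


class ScopeViolation(Exception):
    """Raised when an agent attempts an action outside its declared scope."""


@dataclass(frozen=True)
class AgentIdentity:
    agent_id: str               # hypothetical identifier, not a real AAIN
    allowed_actions: frozenset  # the agent's declared capability scope


def authorize(agent: AgentIdentity, action: str) -> None:
    """Deny by default: only explicitly declared actions are allowed."""
    if action not in agent.allowed_actions:
        raise ScopeViolation(f"{agent.agent_id} is not authorized for {action!r}")


billing_agent = AgentIdentity(
    "agent-001", frozenset({"invoice:read", "invoice:create"})
)

authorize(billing_agent, "invoice:read")  # in scope, returns without error

try:
    authorize(billing_agent, "payroll:read")  # out of scope
except ScopeViolation as exc:
    denied = str(exc)
```

The point for an auditor is that the check runs in code on every request, so an out-of-scope action fails even if the agent's instructions or prompt say otherwise.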

2. Availability (A1)

For traditional SaaS: uptime SLAs and disaster recovery. For AI agents: uptime plus graceful degradation behavior. When an AI agent goes offline, what happens to in-flight transactions? Does the system escalate to a human? Queue the work? Fail silently? Auditors want to see that agent unavailability does not create data loss, orphaned transactions, or compliance gaps. Shulam's trust score includes an uptime reliability factor (10% weight) that tracks exactly this.
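The three outcomes named above (queue, escalate, fail silently) can be sketched as a small dispatcher that never takes the third path. This is an illustrative sketch only: the Dispatcher class, the max_queue limit, and the status strings are assumptions, not part of any real product API.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Dispatcher:
    agent_online: bool = True
    max_queue: int = 2  # hypothetical backlog limit before a human is involved
    pending: deque = field(default_factory=deque)
    escalated: list = field(default_factory=list)

    def submit(self, txn_id: str) -> str:
        """Every transaction gets an explicit outcome; nothing is dropped."""
        if self.agent_online:
            return "processed"
        if len(self.pending) < self.max_queue:
            self.pending.append(txn_id)
            return "queued"
        self.escalated.append(txn_id)
        return "escalated"

    def recover(self) -> list:
        """Mark the agent online and return queued work in arrival order."""
        self.agent_online = True
        drained = list(self.pending)
        self.pending.clear()
        return drained


dispatcher = Dispatcher(agent_online=False)
outcomes = [dispatcher.submit(t) for t in ("t1", "t2", "t3")]
replayed = dispatcher.recover()
```

Because each submission returns a status and lands in exactly one place, the organization can show an auditor where every in-flight transaction went during an outage.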

3. Processing Integrity (PI1)

For traditional SaaS: data processing is complete, accurate, and authorized. For AI agents: this is where task accuracy enters the picture. Processing integrity for an AI agent means its outputs are correct (accuracy), its decisions are consistent (behavioral stability), and its actions are within authorized scope (capability governance). This criterion is the most natural mapping to trust scoring — every factor in the trust score directly supports processing integrity evidence.
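As a rough illustration of how such factors could combine into one number, the sketch below computes a weighted average. Only the 10% uptime weight comes from this article; the other factor names and weights are invented for the example and should not be read as Shulam's actual formula.

```python
import math

WEIGHTS = {
    "task_accuracy": 0.40,         # hypothetical weight
    "behavioral_stability": 0.25,  # hypothetical weight
    "scope_compliance": 0.25,      # hypothetical weight
    "uptime_reliability": 0.10,    # the 10% weight cited in this article
}
assert math.isclose(sum(WEIGHTS.values()), 1.0)


def trust_score(factors: dict) -> float:
    """Weighted average of factor values, each expected in the range [0, 1]."""
    if set(factors) != set(WEIGHTS):
        raise ValueError("factors must match the weight table exactly")
    for name, value in factors.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return sum(WEIGHTS[name] * factors[name] for name in WEIGHTS)


score = trust_score({
    "task_accuracy": 0.9,
    "behavioral_stability": 0.8,
    "scope_compliance": 1.0,
    "uptime_reliability": 0.5,
})
```

With these example inputs the score works out to about 0.86. Keeping the per-factor values alongside the total is what lets the score serve as processing integrity evidence rather than an opaque number.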

4. Confidentiality (C1)

For traditional SaaS: data classification and protection. For AI agents: data classification plus prompt-level data leakage prevention. AI agents process unstructured data (natural language queries, documents, conversations) that may contain confidential information not explicitly classified as such. Auditors are increasingly asking: does the agent have controls to prevent confidential data from leaking through model outputs, logs, or inter-agent communications?
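One simple form of such a control is an output filter that redacts recognizable patterns before text reaches a log or another agent. The sketch below is deliberately minimal and assumes only two patterns (email addresses and US-style SSNs); the PATTERNS table and redact function are hypothetical, and pattern matching alone will miss confidential content that has no fixed format.

```python
import re

# Hypothetical pattern table; a real deployment would need far broader coverage.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text: str) -> tuple[str, list[str]]:
    """Replace matches with a placeholder and report which labels were found."""
    findings = []
    for label, pattern in PATTERNS.items():
        text, count = pattern.subn(f"[{label} REDACTED]", text)
        if count:
            findings.append(label)
    return text, findings


clean, found = redact("Contact jane@example.com, SSN 123-45-6789")
```

Returning the list of findings as well as the cleaned text means each redaction can itself be recorded as evidence that the control fired.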

5. Privacy (P1-P8)

For traditional SaaS: data collection, use, retention, and disposal per privacy policy. For AI agents: all of the above plus training data provenance (was the model trained on data the organization had the right to use?), inference-time PII handling (does the agent store, log, or transmit PII during operation?), and right-to-deletion compliance (can the agent "forget" a specific user's data on request?). These are hard problems that most AI deployments have not solved. Shulam's BARUCH receipts hash PII rather than storing it, providing a privacy-preserving audit trail.
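The idea of recording a hash instead of the PII itself can be sketched with Python's standard hmac module. This is an assumption about the general technique, not a description of the BARUCH receipt format: the pseudonymize function, the receipt fields, and the demo key are all hypothetical. A keyed hash is used here because an unkeyed hash of a low-entropy value such as an email address can be reversed by guessing.

```python
import hashlib
import hmac


def pseudonymize(value: str, key: bytes) -> str:
    """Return a keyed SHA-256 digest of the normalized value."""
    normalized = value.strip().lower()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


# Demo key only; a real key would live in a secrets manager.
key = b"demo-key-do-not-use-in-production"

receipt = {
    "event": "invoice_created",
    "user_ref": pseudonymize("Jane@Example.com ", key),  # no raw PII stored
}
```

The same input and key always produce the same reference, so records for one user can still be correlated during an audit without the audit trail containing the address itself.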

What Auditors Are Looking For in 2026

We have spoken with 14 SOC 2 auditing firms over the past six months. A consensus on AI-specific controls is emerging around five areas:

  • Agent inventory and identity. Can the organization produce a complete list of all AI agents in production, with unique identifiers, declared capability scopes, and current authority levels? This is the AI equivalent of an asset inventory — and most organizations cannot do it today. AAIN registration provides this automatically.
  • Capability scope enforcement. Is there a technical control (not just a policy) that prevents agents from accessing data or performing actions outside their declared scope? Auditors want to see enforcement at the platform level, not just documentation that says "the agent should not do X."
  • Graduated autonomy documentation. What authority level does each agent have? Who approved it? What was the evidence basis for the approval? This maps directly to Shulam's trust ladder: every promotion is recorded with the trust score at the time of approval, the approving operator, and the factor breakdown.
  • Continuous monitoring evidence. How does the organization monitor agent behavior between audits? Point-in-time screenshots of dashboards are not sufficient. Auditors want tamper-evident records of continuous monitoring — which is exactly what BARUCH receipts provide.
  • Incident response for AI-specific risks. What happens when an agent produces an incorrect output that affects a customer? When an agent accesses data outside its scope? When a model update degrades performance? The incident response plan should cover AI-specific scenarios, not just traditional infrastructure incidents.
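The first and third areas above (agent inventory and graduated autonomy documentation) amount to a data model an auditor can query. The sketch below shows one possible shape for it. Every class, field name, and value is hypothetical and chosen for illustration; it is not the AAIN registry schema.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class Promotion:
    new_level: int
    approved_by: str
    trust_score: float       # score at the time of approval
    factor_breakdown: dict   # evidence basis for the approval
    approved_at: str


@dataclass
class AgentRecord:
    agent_id: str
    owner: str
    scope: tuple
    authority_level: int = 0
    promotions: list = field(default_factory=list)


class AgentInventory:
    def __init__(self):
        self._agents = {}

    def register(self, record: AgentRecord) -> None:
        """Reject duplicate identifiers so every agent stays uniquely named."""
        if record.agent_id in self._agents:
            raise ValueError(f"duplicate agent id: {record.agent_id}")
        self._agents[record.agent_id] = record

    def promote(self, agent_id, new_level, approved_by, trust_score, factor_breakdown):
        """Raise authority and keep who approved it and on what evidence."""
        record = self._agents[agent_id]
        record.promotions.append(Promotion(
            new_level, approved_by, trust_score, factor_breakdown,
            datetime.now(timezone.utc).isoformat(),
        ))
        record.authority_level = new_level

    def export(self) -> list:
        """Full inventory as plain dicts, sorted by agent id."""
        return [asdict(r) for r in sorted(self._agents.values(), key=lambda r: r.agent_id)]


inventory = AgentInventory()
inventory.register(AgentRecord("agent-001", "finance-ops", ("invoice:read",)))
inventory.promote("agent-001", 2, "operator@example.com", 0.86, {"task_accuracy": 0.9})
report = inventory.export()
```

An export like this answers the auditor's first question (what agents exist and what may they do) and the third (who approved each authority level, and why) from a single source.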

Preparing for Your First AI-Inclusive SOC 2 Audit

If you are deploying AI agents and expect a SOC 2 audit, start with these four actions:

  • Register every agent with a unique identity. No anonymous agents in production. Every agent needs an identifier, an owner, a capability scope, and an authority level.
  • Implement continuous compliance monitoring. Real-time screening of every transaction, not periodic batch reviews. The auditor will ask how quickly you detect a compliance violation — "at the next quarterly review" is not an acceptable answer.
  • Generate tamper-evident audit evidence. Logs are necessary but not sufficient. Auditors increasingly want cryptographic proof that compliance records have not been modified. BARUCH receipts, or an equivalent system, can supply that proof.
  • Document your graduated autonomy model. Show the auditor how agents earn authority, what controls exist at each level, and how demotions are handled. A trust score system with historical records is the strongest evidence you can provide.
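To show what "tamper-evident" can mean in practice, here is a minimal hash-chain sketch in which each record commits to the one before it. This is a generic technique offered as an assumption about how such evidence could be built, not the BARUCH receipt design; the append and verify functions and the record fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, payload: dict) -> str:
    """Hash the payload together with the previous entry's hash."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(chain: list, payload: dict) -> None:
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "prev": prev_hash,
        "payload": payload,
        "hash": _digest(prev_hash, payload),
    })


def verify(chain: list) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append(log, {"agent": "agent-001", "check": "scope", "result": "pass"})
append(log, {"agent": "agent-001", "check": "screening", "result": "pass"})
intact = verify(log)

log[0]["payload"]["result"] = "fail"  # simulate an after-the-fact edit
tampered_ok = verify(log)
```

A chain like this only detects edits if the latest hash is also stored somewhere the log's owner cannot rewrite, such as with a third party; otherwise the whole chain could be regenerated after a change.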

For a deeper comparison of building these controls in-house versus using Shulam, see our Build vs. Buy analysis. For security-specific documentation, visit the Security page.

Get SOC 2 Ready for AI

Shulam provides the agent identity, continuous monitoring, and tamper-evident audit trails your SOC 2 auditor will ask for.
