SPICS Lab

Secure AI Agent Harness

Figure: High-Level Architecture of Secure AI Agent Harness

  AI agents are rapidly moving beyond simple question answering. They can now retrieve documents, call tools, access databases, write code, communicate with other agents, and make decisions across complex workflows. As these agents become more capable, they are also becoming part of the computing system itself. This raises an important question: How can we safely control AI agents that can observe, decide, act, and communicate?

  Our lab studies Secure AI Agent Harnesses: system-level security layers that manage and constrain AI-agent behavior. A useful analogy is to think of an AI agent as a new kind of CPU. The agent reasons and decides what to do next, while databases, documents, tools, APIs, memory, and other agents act like memory and I/O devices. In this analogy, the harness plays the role of an operating system: it mediates access, enforces permissions, isolates risky behavior, monitors execution, and records security-relevant events.
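  To make the operating-system analogy concrete, the following is a minimal Python sketch of a harness that sits between an agent and its tools, checks permissions, and records every attempt. All names here (`Harness`, `register_tool`, `grant`, `call_tool`, and the `read_doc` / `delete_doc` tools) are hypothetical and chosen only for illustration; they are not an API of our lab or of any existing agent framework.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Set


@dataclass
class Harness:
    """Hypothetical reference monitor: every tool call goes through call_tool()."""
    tools: Dict[str, Callable[..., Any]] = field(default_factory=dict)
    grants: Dict[str, Set[str]] = field(default_factory=dict)  # agent id -> allowed tool names
    audit_log: List[dict] = field(default_factory=list)

    def register_tool(self, name: str, fn: Callable[..., Any]) -> None:
        self.tools[name] = fn

    def grant(self, agent_id: str, tool_name: str) -> None:
        self.grants.setdefault(agent_id, set()).add(tool_name)

    def call_tool(self, agent_id: str, tool_name: str, **kwargs: Any) -> Any:
        # Mediate access: the tool must exist and the agent must hold a grant for it.
        allowed = (
            tool_name in self.tools
            and tool_name in self.grants.get(agent_id, set())
        )
        # Record every attempt, whether it is allowed or denied.
        self.audit_log.append({"agent": agent_id, "tool": tool_name, "allowed": allowed})
        if not allowed:
            raise PermissionError(f"{agent_id} may not call {tool_name}")
        return self.tools[tool_name](**kwargs)


# Illustrative setup: two toy tools, but the agent is granted only one of them.
harness = Harness()
harness.register_tool("read_doc", lambda path: f"contents of {path}")
harness.register_tool("delete_doc", lambda path: f"deleted {path}")
harness.grant("agent-1", "read_doc")

result = harness.call_tool("agent-1", "read_doc", path="notes.txt")

try:
    harness.call_tool("agent-1", "delete_doc", path="notes.txt")
    denied = False
except PermissionError:
    denied = True
```

  The point of the sketch is the single choke point: the agent never holds a direct reference to a tool, so permission checks and logging cannot be bypassed by the agent's own reasoning. A real harness would also need isolation and resource limits, which this example does not attempt.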

  Without such a harness, AI agents may become unsafe system components. They may access sensitive data unnecessarily, follow malicious instructions hidden in retrieved documents, call dangerous tools, leak private context, consume excessive resources, or communicate insecurely with other agents. Therefore, securing AI-agent systems requires more than making the underlying model “safer.” We need a practical security architecture that controls the entire agent workflow.

Core Research Themes

  Our lab explores Secure AI Agent Harnesses through three closely connected security tracks.

  Together, these three tracks form an OS-like security architecture for AI agents. The harness should determine what the agent can see, which tools it can call, what data it can modify, how it communicates with other agents, when human approval is required, and how suspicious behavior should be logged, blocked, or escalated.
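  The decisions listed above (allow, block, or escalate to a human) can be sketched as a small default-deny policy check. This is only an illustration under assumed names: `Decision`, `Request`, `POLICY`, `decide`, and the resource strings such as `db:customers` are hypothetical and do not describe a specific system.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    REQUIRE_APPROVAL = "require_approval"  # escalate to a human before acting
    BLOCK = "block"


@dataclass(frozen=True)
class Request:
    agent_id: str
    action: str    # e.g. "read", "write", "send"
    resource: str  # e.g. "db:customers", "agent:planner"


# Hypothetical policy table: (action, resource prefix, decision).
# Rules are checked in order and the first match wins.
POLICY = [
    ("read", "docs:public/", Decision.ALLOW),
    ("read", "db:customers", Decision.REQUIRE_APPROVAL),
    ("write", "db:", Decision.REQUIRE_APPROVAL),
    ("send", "agent:", Decision.ALLOW),
]


def decide(req: Request) -> Decision:
    """Return the first matching decision; anything unmatched is blocked."""
    for action, prefix, decision in POLICY:
        if req.action == action and req.resource.startswith(prefix):
            return decision
    return Decision.BLOCK  # default deny


public_read = decide(Request("agent-1", "read", "docs:public/faq.md"))
db_write = decide(Request("agent-1", "write", "db:orders"))
db_delete = decide(Request("agent-1", "delete", "db:orders"))
```

  Here `public_read` is allowed, `db_write` requires human approval, and `db_delete` is blocked because no rule mentions the `delete` action. Default deny is a deliberate choice in this sketch: an action the policy author did not anticipate is refused rather than silently permitted.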


Key Sub-Topics & Keywords

Our research topics include, but are not limited to:

  1. OS-like runtime security and policy enforcement for AI agents
  2. Privacy-preserving retrieval, memory, logging, and tool use
  3. TEE, sandboxing, isolation, and capability control for agent workflows
  4. Prompt injection, jailbreaking, RAG poisoning, and malicious-context defense
  5. Monitoring, auditing, and human-in-the-loop control of AI-agent behavior
  6. Secure agent-to-agent communication and MCP-style tool interactions

Student Note: If you are interested in operating systems, systems security, AI agents, privacy, or real-world security problems, this field may be a good fit for you. You will study how AI agents interact with data, tools, memory, and other systems, and how to design a harness that keeps those interactions safe. In short, this research asks how we can build the “operating system” that future AI agents need before they can be trusted with important tasks.
