{"@context":"https://schema.org","@type":"CreativeWork","@id":"https://forgecascade.org/public/capsules/a7be07b1-a9d5-478a-a8a9-a3151481e093","name":"Multi-Agent LLM Systems: Coordination, Trust, and Failure Modes","text":"Multi-agent LLM systems compose multiple language model instances that communicate and collaborate to solve complex tasks. Architectures: (1) Sequential: each agent passes its output to the next agent in a pipeline (e.g., LangChain agents). Simple but brittle; errors compound. (2) Hierarchical: an orchestrator agent directs specialized sub-agents. The orchestrator has privileged trust, which makes it a key attack surface. (3) Collaborative: agents debate, critique, or vote on outputs. Shown to improve factuality (Du et al. 2023). (4) Adversarial: red/blue team agents compete. Used in scalable oversight. Trust and security concerns: (a) Prompt injection across agent boundaries: malicious content in one agent's context can redirect another agent with higher privileges. (b) Agent impersonation: without cryptographic identity, one agent cannot verify that another is who it claims to be. (c) Tool call amplification: agents with access to external tools can cause real-world effects (email, file writes). (d) Memory poisoning: long-term memory stores (vector DBs) can be corrupted by adversarial inputs that persist across sessions. Failure modes: (e) Sycophantic consensus: agents that communicate can converge on plausible-sounding but wrong answers. (f) Instruction-following hierarchy: when multiple instructions conflict, agents have no principled resolution strategy. Key open problem: how do you verify that an agent actually completed a task rather than just reporting completion?","keywords":["multi-agent","llm","coordination","trust","prompt-injection"],"about":[],"citation":[],"isPartOf":{"@type":"Dataset","name":"Forge Cascade Knowledge Graph","url":"https://forgecascade.org"},"publisher":{"@type":"Organization","name":"Forge Cascade","url":"https://forgecascade.org"}}