Large language model agents don't just talk; they collaborate, delegate, and act. That orchestration layer opens a new attack surface: multi-agent prompt injection. In this fast-paced SecTor session, you'll watch a red-team walkthrough that starts with harvesting hidden system prompts, then escalates through mirrored-pattern injections that subvert individual agents, corrupt the planner, and co-opt tool calls. We'll dissect both direct and "second-hand" (indirect) attacks that propagate across agent boundaries, chaining seemingly innocuous instructions into a full mission-level takeover.
Defenders aren't powerless, but every control has a price. We map mitigations, from agent-scoped content sanitization to policy-enforced orchestrators and high-fidelity telemetry, against their engineering effort and real-world efficacy. You'll leave with a pragmatic checklist for building observability without violating user privacy, plus concrete design patterns to harden your own LLM ecosystems before attackers weaponize them for you.
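To make the idea of a policy-enforced orchestrator concrete, here is a minimal sketch in Python. It is not taken from the talk: the agent names, tool names, the `ALLOWED_TOOLS` and `SIDE_EFFECT_TOOLS` tables, and the `authorize` function are all hypothetical, and the rule that side-effecting tools never run on untrusted input is an assumed policy chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical policy tables (assumptions, not from the talk).
ALLOWED_TOOLS = {
    "planner": {"search"},
    "mailer": {"send_email"},
}
SIDE_EFFECT_TOOLS = {"send_email"}


@dataclass(frozen=True)
class ToolCall:
    agent: str            # agent requesting the call
    tool: str             # tool being requested
    from_untrusted: bool  # True if the request derives from retrieved or
                          # agent-to-agent content rather than the user


def authorize(call: ToolCall) -> bool:
    """Return True only if the (assumed) policy permits this tool call."""
    # Each agent may only call tools on its own allowlist.
    if call.tool not in ALLOWED_TOOLS.get(call.agent, set()):
        return False
    # Assumed rule: side-effecting tools never run on untrusted input.
    if call.from_untrusted and call.tool in SIDE_EFFECT_TOOLS:
        return False
    return True
```

The point of the sketch is that the check runs in the orchestrator, outside any model's context, so an injected instruction cannot talk its way past it. The cost is the engineering effort of tracking where each request came from.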
By: Jeremy Richards | AI Red Team, ServiceNow
https://ift.tt/UxT0IGy
source https://www.youtube.com/watch?v=D4a8Udi2j-M