Scammers hide instructions in websites to hijack AI agents

Attackers are planting hidden instructions inside web pages to hijack AI agents, tricking them into sending money to criminals, according to researchers at Zscaler ThreatLabz. The technique, called indirect prompt injection, works like phishing aimed at software rather than people: instead of fooling a human, the booby trapped page fools the AI assistant that is reading it.

How the attack works

An AI agent that browses the web to finish a task, say installing a software library or checking a crypto balance, ingests whatever a page contains. Zscaler found fraudulent sites that bury attacker instructions in parts of the page a human never sees: text hidden with CSS positioned far off screen, and structured JSON-LD metadata that AI agents tend to treat as trustworthy. The visible page looks like ordinary developer documentation, while the machine readable layer quietly tells the agent what to do.

Two live campaigns

In the first campaign, a site posing as documentation for a Python package called requests-secure-v2 was pushed up search rankings through SEO poisoning so an AI agent would find it. Hidden instructions framed a small payment as a routine step to obtain an API key, complete with a Stripe checkout link and code to send roughly 0.0012 ETH to an attacker controlled Ethereum wallet. An agent trying to resolve a fake licensing error could be steered into paying. Zscaler linked ten related GitHub repositories under the account Open-Agent-Utilities that point at similar sites. The second campaign used a typosquatted domain, debank[.]auction, impersonating the DeBank decentralized finance portfolio tracker to feed AI agents malicious instructions.

Why it matters

When an AI agent misclassifies a malicious site as legitimate, it does not just risk one bad action; it can contaminate the agent's wider context and poison downstream retrieval augmented generation (RAG) workflows. Testing across 26 large language models, Zscaler found 4 failed to act appropriately against the first campaign and 2 misjudged the second, showing the exposure is measurable rather than theoretical. It is the same class of problem seen when fake game webpages trick AI browser agents into leaking credentials, and it remains the top entry in the OWASP Top 10 for LLM applications.

What you should do

Teams deploying autonomous or web browsing AI agents should treat any content an agent retrieves as untrusted input, strip or isolate hidden and structured page data before it reaches the model, require human confirmation for actions that move money, and constrain what an agent can do without approval. Selected indicators (defanged) include the domain debank[.]auction and repositories under hxxps://github[.]com/Open-Agent-Utilities.

Full detail is in the original report from Zscaler ThreatLabz researchers Ashwathi Sasi and Kartik Dixit.

This briefing is provided by IntelFusions for informational and defensive purposes only. It is based on sources assessed to be reliable at the time of writing, and analytic judgments carry the confidence levels indicated. Indicators of compromise are defanged; re-arm them only in controlled environments. IntelFusions is not affiliated with the organizations named and makes no warranty as to completeness or accuracy.