Hackers Talked Meta's AI Helper Into Handing Over Instagram Accounts

Meta has rushed out an emergency fix after attackers spent several months simply talking its new AI support chatbot into giving away Instagram accounts they did not own. According to the original report from Malwarebytes Labs, the attackers opened ordinary help chats, claimed to be locked out of someone else's account, and asked the bot to change the recovery email on file. The bot complied, and a one-time login code went straight to the attacker. No malware, no exploit code, just a conversation.

Over a single weekend the trick was used to hijack and briefly deface several high-profile accounts, including a now-dormant Obama White House handle, beauty retailer Sephora, and a senior US Space Force official. Some of the hijacked accounts were defaced with pro-Iranian imagery. A well-known security researcher and former Meta employee was also among those hit.

Why this matters

This is one of the first real-world cases of an autonomous AI agent being socially engineered into carrying out account takeovers at scale. Meta wired its support assistant into the systems that manage accounts, giving it permission to change emails and reset passwords, but it did not teach the bot to confirm it was actually talking to the rightful owner. That gap is a textbook "confused deputy" problem, a flaw class recognized since the 1980s in which a trusted helper with broad permissions is tricked by an outsider into misusing them on the outsider's behalf. As more companies route account recovery through AI agents that hold real permissions, the same pattern is likely to recur. For background on this growing risk, see our coverage of why autonomous AI agents now outnumber human employees by huge margins.
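To make the confused-deputy gap concrete, here is a minimal Python sketch. It does not reflect Meta's actual systems, which are not public; `Session`, `NotAuthorized`, `change_recovery_email_unsafe`, and `change_recovery_email_guarded` are hypothetical names invented for illustration. The first function acts on the agent's own broad authority, while the second ties the action to the identity the requester has actually proven.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Session:
    """Hypothetical record of who the support chat has verified."""
    # None means the person in the chat has not proven ownership of any account.
    authenticated_account: Optional[str]


class NotAuthorized(Exception):
    """Raised when the requester has not proven ownership of the target."""


def change_recovery_email_unsafe(accounts: dict, target: str, new_email: str) -> None:
    # Confused deputy: the agent uses its own permissions and never asks
    # on whose behalf the change is being requested.
    accounts[target] = new_email


def change_recovery_email_guarded(
    accounts: dict, session: Session, target: str, new_email: str
) -> None:
    # Authority is derived from the requester's verified identity,
    # not from the agent's standing permissions.
    if session.authenticated_account != target:
        raise NotAuthorized(f"requester has not proven ownership of {target!r}")
    accounts[target] = new_email
```

In the guarded version, a chat participant who merely claims to be locked out carries a session with no verified account, so the request fails no matter how persuasive the conversation is.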

How the attack works

The mechanics were simple. Attackers first worked out roughly where a target lived using publicly available information, then connected through a VPN in the matching region so the login did not trip fraud alarms. They started a normal password reset, opened a support chat, and instructed the assistant to change the email address tied to the account. Where Meta's enhanced identity checks did kick in, attackers defeated them by generating video deepfakes of the targets, built from photos harvested off the victims' own Instagram pages. Deepfake-fueled fraud has been climbing fast, as our report on deepfake-as-a-service driving $200 million in losses documented.

The motive is usually money. Hijackers extort businesses that depend on their accounts for marketing, and short or otherwise desirable legacy usernames sell for thousands of dollars on underground markets. Meta has not said how many accounts were affected; a spokesperson stated only that the issue was resolved and impacted accounts were being secured.

What you should do

Malwarebytes reports that accounts with multi-factor authentication enabled (a second login step beyond the password) defeated the takeover entirely, even when the second factor was an SMS code. Turning on two-factor authentication in the Meta Accounts Center is the single most effective step for users, and an authenticator app is preferable to SMS. Reviewing the email address and phone number on file, and treating any unexpected "your recovery email was changed" notice as an emergency, also helps.
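One reason an authenticator app holds up well is that its codes are computed on the device from a shared secret and the current time, so no code is ever sent to an email address or phone number that an attacker could redirect. The sketch below implements the standard time-based one-time password algorithm (TOTP, RFC 6238) using only the Python standard library. It illustrates the general technique, not Instagram's specific implementation, and the function name `totp` is our own.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 TOTP code with HMAC-SHA1."""
    # The moving factor is the number of whole time steps since the Unix epoch.
    counter = unix_time // step
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): the low nibble of the last byte picks an offset.
    offset = mac[-1] & 0x0F
    binary = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(binary % 10 ** digits).zfill(digits)


# RFC 6238 Appendix B test vector: ASCII secret, time = 59 seconds, 8 digits.
print(totp(b"12345678901234567890", 59, digits=8))  # prints 94287082
```

Because both sides derive the code independently, changing the recovery email on an account does not hand the attacker the second factor.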

The fix may not be the end of it. Malwarebytes notes follow-on activity already circulating in which an Android emulator runs a modified Instagram client to feed the assistant prompts containing hidden characters designed to manipulate it. That is the same prompt-manipulation problem, known as prompt injection, that OWASP ranks as the top risk for LLM applications. We assess with high confidence that strong identity verification, least-privilege scoping of agent permissions, and human review of sensitive actions are the durable controls as this class of abuse spreads.
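As a rough illustration of screening for hidden characters, the Python sketch below flags code points that normally render invisibly before a message reaches an assistant. The report does not say which characters were used, so the choice of Unicode categories here is our assumption, and the function names are hypothetical. A filter like this is a partial measure at best and does not replace the identity and permission controls above.

```python
import unicodedata

# Ordinary whitespace control characters that legitimate messages contain.
ALLOWED_CONTROLS = {"\n", "\r", "\t"}


def is_hidden(ch: str) -> bool:
    """True for format (Cf) and control (Cc) characters that render invisibly."""
    if ch in ALLOWED_CONTROLS:
        return False
    # Cf covers zero-width spaces and joiners, bidi overrides, and Unicode tag
    # characters; Cc covers the remaining control characters.
    return unicodedata.category(ch) in {"Cf", "Cc"}


def find_hidden_characters(text: str) -> list:
    """Return (index, code point) pairs for every hidden character found."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(text) if is_hidden(ch)]


def strip_hidden_characters(text: str) -> str:
    """Return the text with hidden characters removed."""
    return "".join(ch for ch in text if not is_hidden(ch))


message = "reset\u200bpassword"
print(find_hidden_characters(message))   # prints [(5, 'U+200B')]
print(strip_hidden_characters(message))  # prints resetpassword
```

Note that some of these characters have legitimate uses, such as the zero-width joiner in emoji sequences, so a production filter would more plausibly log and flag such input for review than silently rewrite it.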

Indicators and TTPs (defanged):

This briefing is provided by IntelFusions for informational and defensive purposes only. It is based on sources assessed to be reliable at the time of writing, and analytic judgments carry the confidence levels indicated. Indicators of compromise are defanged; re-arm them only in controlled environments. IntelFusions is not affiliated with the organizations named and makes no warranty as to completeness or accuracy.
