Cybercriminals have begun exploiting AI-powered email security systems using a technique called indirect prompt injection, according to new research from Sublime Security. Attackers embed hidden text in phishing emails using zero-font HTML or color-matching techniques that remain invisible to human recipients but are fully processed by machine learning models. This hidden content, often copied from legitimate sources like brand newsletters or published fiction, dilutes the malicious signals in the email and tricks AI filters into classifying dangerous messages as safe.
The technique works by stuffing phishing emails with carefully selected benign content that influences how AI models assess the message. In one documented campaign, attackers cloned Adidas newsletter content from archival sites like milled.com and emailinspire.com, hiding this text within what appeared to be a standard cloud storage phishing scam. The goal was to make the AI security system associate the email with a high-reputation brand rather than flagging the malicious link. A second campaign used fake health insurance emails that embedded fictional stories from goodnovel.com, apparently hoping the AI would mistake the message for legitimate creative content from platforms like Substack.
The hidden text is inserted using two primary methods. Zero-font HTML sets the font size to 0pt, rendering the text invisible to human readers while leaving it fully readable by AI scanners. Color-matching sets the text color to the same hexadecimal value as the email background, achieving the same effect. When the hidden content is substantial enough, it can drown out the malicious signals the model would otherwise use to flag the message, causing misclassification.
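The gap between what a human sees and what a model reads can be illustrated with a short sketch. The email body, URLs, and padding text below are invented placeholders, not samples from the actual campaigns; the extractor mimics a naive pipeline that feeds all text nodes, hidden or not, to a classifier.

```python
# Illustrative sketch (placeholder content, not a real campaign sample):
# zero-font and color-matched spans hide padding text from human readers,
# while a text extractor feeding an ML filter still sees it.
from html.parser import HTMLParser

EMAIL_HTML = """
<html><body style="background-color:#ffffff">
  <p>Your shared document is ready.
     <a href="http://example.invalid/doc">Open file</a></p>
  <!-- zero-font padding: rendered at 0pt, invisible on screen -->
  <span style="font-size:0pt">Shop the new running collection today</span>
  <!-- color-matched padding: white text on the white background -->
  <span style="color:#ffffff">Chapter one: the rain had not stopped</span>
</body></html>
"""

class TextExtractor(HTMLParser):
    """Collect every text node, the way a naive pipeline might."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)

extractor = TextExtractor()
extractor.feed(EMAIL_HTML)
machine_view = " ".join(extractor.chunks)
print(machine_view)  # hidden padding appears alongside the phishing lure
```

A renderer shows only the lure and the link, but the extracted text is dominated by the benign padding, which is exactly the signal dilution the researchers describe.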
While these attacks currently represent less than 1% of observed email traffic, researchers warn the threat will grow as organizations deploy more autonomous AI systems. The risk becomes particularly acute with emerging agentic mailbox technologies, where AI assistants take actions on behalf of users. In such scenarios, a model following hidden malicious instructions could have serious consequences beyond simple misclassification.
Sublime Security researchers emphasize that current AI security models need fundamental improvements to understand full message context rather than relying on surface-level analysis of links or keywords. The research demonstrates that while the cybersecurity community has theorized about these attacks, threat actors are now actively deploying them in real-world campaigns. Organizations using AI-powered email security should review their detection capabilities and consider implementing additional layers of analysis that account for hidden or obfuscated content.
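One such additional layer can be sketched as a pre-filter that discards invisibly styled text before classification. The heuristics below (zero font size, or text color equal to a declared white background) are an illustrative assumption, not Sublime Security's actual detection logic, and the sample message is invented.

```python
# Assumed pre-filter sketch: strip invisibly styled spans before
# handing the remaining text to a classifier. Not a production detector.
import re
from html.parser import HTMLParser

class VisibleTextExtractor(HTMLParser):
    """Collect only text that would actually render visibly."""
    VOID = {"br", "hr", "img", "meta", "input", "link"}  # tags with no close

    def __init__(self, background="#ffffff"):
        super().__init__()
        self.background = background.lower()
        self.hidden_depth = 0  # >0 while inside a hidden subtree
        self.chunks = []

    def _is_hidden(self, style):
        style = style.lower().replace(" ", "")
        if re.search(r"font-size:0(\.0+)?(pt|px|em)?(;|$)", style):
            return True  # zero-font text
        m = re.search(r"(?<!-)color:(#[0-9a-f]{6})", style)
        return bool(m and m.group(1) == self.background)  # color-matched

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID:
            return
        style = dict(attrs).get("style") or ""
        if self.hidden_depth or self._is_hidden(style):
            self.hidden_depth += 1

    def handle_endtag(self, tag):
        if tag not in self.VOID and self.hidden_depth:
            self.hidden_depth -= 1

    def handle_data(self, data):
        if not self.hidden_depth and data.strip():
            self.chunks.append(data.strip())

sample = (
    '<p>Your invoice is attached. '
    '<span style="font-size:0pt">harmless newsletter padding</span>'
    '<span style="color: #FFFFFF">more hidden filler</span></p>'
)
ex = VisibleTextExtractor()
ex.feed(sample)
visible = " ".join(ex.chunks)
print(visible)  # only "Your invoice is attached." survives
```

This assumes well-nested markup and covers only two hiding tricks; real attackers also use CSS classes, `display:none`, and off-screen positioning, so a robust filter would need full style resolution rather than inline-attribute matching.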
Source: https://hackread.com/scammers-text-bypass-ai-email-filters-phishing-scams/