Researchers have identified PromptSpy, the first known Android malware to use Google's Gemini AI to maintain persistence on infected devices. The malware sends Gemini a description of the current screen layout and receives real-time instructions for navigating system interfaces, which it uses to prevent its own closure or removal.
Cybersecurity experts at ESET recently uncovered a sophisticated Android threat named PromptSpy that integrates Google’s Gemini AI directly into its operational flow. This malware is designed for extensive surveillance, including the ability to record screens, capture login credentials, and gather detailed device information. By embedding an AI model as an automation assistant, the attackers have moved beyond static scripts to a more dynamic form of mobile infection.
The malware functions by sending a natural language prompt and an XML layout of the victim's screen to the Gemini API. This data provides the AI with a comprehensive map of every button, text field, and UI element currently visible. The AI then processes this context and returns specific JSON-formatted instructions, such as precise coordinates for a screen tap, allowing the malware to navigate the device’s settings autonomously.
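To illustrate how much a screen layout dump reveals, the following minimal sketch parses a small XML hierarchy and lists each element's class, label, and on-screen bounds. It is an analyst-side illustration only. The sample XML, the attribute names (modeled on the format produced by Android's `uiautomator dump` tool), and the function name `summarize_layout` are assumptions, not details taken from the PromptSpy sample.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample in the style of a `uiautomator dump` hierarchy.
# The real layout data sent by the malware is not reproduced here.
SAMPLE_LAYOUT = """<hierarchy rotation="0">
  <node class="android.widget.FrameLayout" text="" clickable="false" bounds="[0,0][1080,2400]">
    <node class="android.widget.TextView" text="Recent apps" clickable="false" bounds="[48,120][400,200]"/>
    <node class="android.widget.Button" text="Clear all" clickable="true" bounds="[100,1800][500,1920]"/>
    <node class="android.widget.EditText" text="" clickable="true" bounds="[100,600][980,720]"/>
  </node>
</hierarchy>"""

# Bounds are encoded as "[left,top][right,bottom]" in pixel coordinates.
BOUNDS_RE = re.compile(r"\[(\d+),(\d+)\]\[(\d+),(\d+)\]")


def summarize_layout(xml_text):
    """Return one dict per UI node with its class, text, clickability, and bounds."""
    root = ET.fromstring(xml_text)
    elements = []
    for node in root.iter("node"):
        match = BOUNDS_RE.fullmatch(node.get("bounds", ""))
        if match is None:
            continue  # skip nodes without well-formed bounds
        left, top, right, bottom = map(int, match.groups())
        elements.append({
            "class": node.get("class", ""),
            "text": node.get("text", ""),
            "clickable": node.get("clickable") == "true",
            "bounds": (left, top, right, bottom),
        })
    return elements


if __name__ == "__main__":
    for element in summarize_layout(SAMPLE_LAYOUT):
        print(element)
```

Even this tiny dump exposes every label and the exact pixel rectangle of each control, which is why a layout of this kind gives a language model enough context to describe where a given button sits on an unfamiliar device.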
This AI-driven approach specifically targets the recent apps list to ensure the malicious application remains pinned and active. By using Gemini to generate step-by-step navigation instructions, the malware can adapt to various Android versions and manufacturer-specific layouts that might otherwise break traditional automated scripts. This flexibility allows the threat to remain functional across a significantly wider range of mobile devices.
PromptSpy relies heavily on Android’s accessibility services to execute the actions suggested by the AI without any user interaction. It also employs invisible overlays to block attempts at uninstallation, effectively locking the user out of security settings. These permissions, combined with the AI’s guidance, facilitate the deployment of a VNC module that grants the attackers full remote access to the compromised smartphone.
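Because the attack depends on an enabled accessibility service, one practical defensive check is to review which services are switched on. The sketch below is a hedged example rather than a complete detection: it parses the colon-separated value that `adb shell settings get secure enabled_accessibility_services` prints and reports any package that is not on an allowlist. The function names, the allowlist, and the sample package `com.example.unknown` are hypothetical.

```python
def parse_enabled_services(setting_value):
    """Split the colon-separated setting into (package, service_class) pairs."""
    value = (setting_value or "").strip()
    # The setting is empty or the literal string "null" when no service is enabled.
    if not value or value == "null":
        return []
    services = []
    for component in value.split(":"):
        # Each entry is a component name of the form "package/class".
        package, _, service_class = component.partition("/")
        if package:
            services.append((package, service_class))
    return services


def unexpected_services(setting_value, allowed_packages):
    """Return enabled services whose package is not in the allowlist."""
    return [
        (package, service_class)
        for package, service_class in parse_enabled_services(setting_value)
        if package not in allowed_packages
    ]


if __name__ == "__main__":
    # Hypothetical output copied from the adb command above.
    sample = (
        "com.google.android.marvin.talkback/"
        "com.google.android.marvin.talkback.TalkBackService:"
        "com.example.unknown/.HelperService"
    )
    allowed = {"com.google.android.marvin.talkback"}  # assumption: only TalkBack is expected
    for package, service_class in unexpected_services(sample, allowed):
        print(f"Review accessibility service: {package}{service_class}")
```

An unexpected entry is not proof of infection, since many legitimate apps request accessibility access, but it is a reasonable starting point for a manual review.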
Data exfiltration and command execution are managed through a hard-coded server, which supplies the malware with the necessary API keys and specific triggers. Through this connection, the attackers can request screenshots, intercept PINs, and record pattern unlocks in real time. This marks a significant evolution in mobile threats: generative AI is no longer just a tool for writing code but an active participant in the execution of a cyberattack.
Source: PromptSpy Android Malware Exploits Gemini AI For Recent-Apps Persistence


