Generative AI - An Attacker's View

How Generative AI is Used by Hackers and How to Protect Against Attacks

Tom Taylor-MacLean

10/05/2024

 

Introduction

Recently, S4S held an event they call Club X [1]. They describe themselves as “the only security conference dedicated to collaboration and community” and bring in an impressive audience. I spoke on one of the S4S panel talks, hosted by Mark Walmsley, the CISO at Freshfields. The constraints of the panel format meant there wasn’t enough time to express all my thoughts and research on this topic, so I have organised and expanded that material into this blog post, where we delve into the use of GenAI (Generative Artificial Intelligence) by threat actors and what we can do to defend ourselves.

With thanks to each of WithSecure, S4S and my fellow panel members, Jack Chapman of Egress, Martin Davies of Digital Realty, and James Duggan of Synack, for the research time, the discussion and the ideas.

 

 

How is Generative AI Used by Hackers?

Before we begin, we need to define GenAI so that we are on the same page. After looking at how attacks can be assisted through the use of GenAI, we will consider whether this is a prevalent issue now or a problem for the future.

 

What is Generative AI?

GenAI is a type of Artificial Intelligence distinguished from other types of AI by its ability to create content. This may be text, images, video, audio or other media. A GenAI model is given an input, or prompt, and produces the requested content in a form that is immediately consumable by humans. While many of us may be familiar with GenAI through established Large Language Models (LLMs) such as ChatGPT, Gemini, etc., these are just one variant of GenAI. Other types exist where, for example, the input or output could be a document, or media such as an audio clip.

 

What Sort of Attacks are We Dealing With?

Recon

Do you remember all of the information you’ve put onto social media? Do you know what data about your personal life is on the Internet for all to find? Perhaps an award you won at school is listed in a public document somewhere, or your membership of a sports club. Consider the dark web and whether PII (Personally Identifiable Information) such as your address or phone number might be there. Equally, leaked credentials or financial information, such as usernames and passwords or credit card numbers, may be there too. While previously it could take hours for a sufficiently motivated attacker to gather the relevant information, a GenAI model could do this in minutes, providing a concise summary of your known attributes. This could form the basis for targeted spear phishing attacks. Similar issues apply to organisations, where a wide range of disparate information is often available online. This gives a malicious actor intelligence on which areas of the business may be more worthwhile to investigate.

But recon is not necessarily only conducted before an attack is started. Once an attacker gains access to a network, they may try to find out more about their target from their new, internal perspective. While many companies train their employees not to put firm proprietary data or confidential information into GenAI systems, an attacker would have no such qualms. Imagine a malicious actor getting access to your internal knowledge base and bulk exporting pages. Using Retrieval-Augmented Generation (RAG), they could supply this data to a pre-trained model as additional context at query time, rather than retraining it, and then ask the model to summarise the known gaps in security at the organisation. They would improve their understanding of the organisation, but may make themselves easier to detect, due to the large quantity of data exfiltration required for this to succeed.
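To make the distinction between retraining and RAG concrete, the sketch below shows the generic RAG pattern: retrieve the documents most relevant to a question, then include them in the prompt. It is a minimal, hypothetical illustration only; scikit-learn’s TF-IDF stands in for an embedding model, ask_llm() is a placeholder for whichever chat model is used, and the sample documents are invented. The same pattern underpins legitimate internal assistants.

```python
# Minimal RAG sketch: retrieval plus prompt assembly, no model retraining.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Invented example documents standing in for exported knowledge-base pages.
documents = [
    "IT wiki: remote access over the VPN requires MFA for all staff.",
    "Facilities: office badge access is managed by the operations team.",
    "Finance: payment runs above a set threshold need two approvers.",
]

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Rank documents by similarity to the query and keep the top k.
    vectoriser = TfidfVectorizer().fit(docs + [query])
    doc_vectors = vectoriser.transform(docs)
    query_vector = vectoriser.transform([query])
    scores = cosine_similarity(query_vector, doc_vectors)[0]
    ranked = sorted(zip(scores, docs), reverse=True)
    return [doc for _, doc in ranked[:k]]

def ask_llm(prompt: str) -> str:
    # Placeholder: swap in a call to whichever hosted or local chat model you use.
    raise NotImplementedError

query = "How is remote access set up for staff?"
context = "\n".join(retrieve(query, documents))
prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"
# answer = ask_llm(prompt)   # the model answers from the retrieved context
```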

Social Engineering

One of the more prevalent offensive uses of GenAI is to trick an organisation’s employee into performing some action which benefits the attacker. This takes the form of phishing, vishing or the use of deepfakes. Generic phishing emails can often be relatively straightforward to spot if one takes the time to analyse each email individually. Usually, spelling or grammatical errors can be found, or perhaps the email purports to be from a bank with which you do not have an account. Where GenAI comes in is tailoring these messages in targeted campaigns to make them relevant to the intended recipient. Prompting a model for email content that is professionally written in English and purports to come from a website in which the recipient has a hobbyist interest will produce text that could pass as genuine. With the additional advantage of GenAI’s speed, attackers can target people more efficiently. The technology continues to improve, so even if models currently create sections of text which do not flow properly, these flaws may soon be ironed out by technology companies. All of this leads to an email, perhaps containing a malicious file or a link to an attacker-controlled website, which is likely to fool a wider audience.

While phishing is the most well-known vector, deepfakes are likely to become more common as initial methods of contact. With deepfakes, most people might imagine a video showing a politician or a famous actor saying something that the real person did not. Creating a deepfake requires significant computing power, especially if an attacker plans to make the deepfake perform in real time. This was previously a barrier to entry, but cloud providers now let users rent their computing capacity and even provide base AI models to build on. Other barriers might include the requirement for a large training corpus of video or photo footage, plus audio, from the target person. However, Microsoft have recently announced VASA-1 [2], which can create a deepfake from a single portrait photo and some speech audio. This is possible after only approximately 10 years of significant work on video-focused GenAI.

Another method for tricking users is voice cloning software. This is similar to generating deepfake video, but computationally less expensive and with a wider range of applications. A fake phone call can be made to a target to try to coerce them into performing some action, and combining this with the ability to spoof a phone number makes it a powerful tool. In episode 144 of Darknet Diaries [3], Rachel Tobac talks about her experience of using this sort of voice cloning software to target people on her assignments. At the moment, the main issue with this approach is avoiding delays between the end of the target person’s sentence and the start of the cloned voice’s reply. However, given the speed at which the technology is evolving, we can expect this to be overcome soon. If a chatbot-style GenAI were combined with voice cloning technology, ultimately allowing the system to autonomously maintain a conversation with its cloned voice, this would become particularly powerful. A new release by OpenAI [4] brings this closer to reality.

Malicious Code Generation

On episode 359 of the Smashing Security podcast [5], Allan Liska, the “Ransomware Sommelier”, talked about his experience of using GPT-4, one of OpenAI’s models, to build functional ransomware. Although protections were in place to try to prevent a malicious actor from creating ransomware, Allan described how he used some bypass techniques to get GPT-4 to create the ransomware in pieces. All Allan then had to do was put these pieces together, before testing it to find that it worked as expected, encrypting all of his files.

This is not to say that anyone on the Internet could do the same as Allan did. Numerous examples are scattered across forums and social media where GenAI gives a user code that doesn’t do what they asked. The code it gives may be close to what was requested but need some final adjustment. For anything more than simple scripts, a low-skilled attacker in this position is unlikely to understand what changes are required, or how to make them. Without knowing why the code isn’t working, they would also struggle to know what additional prompts to give so that the GenAI model fixes the problem.

It seems that to make best use of GenAI for creating malicious code, one needs to be proficient oneself. Rather than using GenAI as a mechanism to create “point-and-click” style code, a high-skilled attacker would use it as a time saver. Rather than writing an exploit from scratch, they could use GenAI to provide an outline, then finalise and test the code themselves. The UK’s NCSC (National Cyber Security Centre) agrees on this point, stating that for effective current use of GenAI in offensive operations, both excellent training data and expert knowledge are required [6].

 

Is This Happening Now?

Current use of GenAI by hackers seems to be limited to the earlier stages of attacks. The NCSC expect that advanced use cases, such as facilitating lateral movement or privilege escalation, will not be realised before 2025. However, examples are already appearing where attackers are embracing the use of GenAI.

In January 2024, an organisation in Hong Kong lost over £20m ($25m) when an employee received a request, apparently from the firm’s CFO, to make transactions on behalf of the firm. According to news reports [7], this directive was shortly followed by a call at which the CFO and other employees of the company appeared to be present. Unknown to the employee, all of these other attendees were deepfakes. The transactions were carried out, and the deception only came to light when the employee contacted company headquarters about them.

Dark Reading [8] reported on an arXiv pre-print [9] in which GenAI models were given security advisories and access to some basic tools, such as a web browser and a terminal, and were asked to exploit the issues described. 15 security advisories were used, of which over 10 were released in 2024. The researchers then gave 10 GenAI models the task of exploiting the issues. 9 of the 10 models could not successfully exploit even a single issue; however, GPT-4 managed to exploit 13 of the 15 under specific circumstances. The article further points out that this costs less than a human security expert and that the effort is more scalable.

Some of the biggest players in the industry continue to track the use of GenAI in offensive security operations. Their findings can be read in Threat Intelligence reports such as Mandiant’s M-Trends [10], Microsoft’s Digital Defence Reports [11] and, of course, WithSecure’s Threat Highlight Reports [12].

 

 

Protecting Ourselves

As with most problems in cybersecurity, there is no single solution which will take care of the issue. We will talk about a multifaceted approach to help provide defence in depth against GenAI-assisted attacks, including issues of governance and trust.

 

General Protections

At present, the best way to tell whether you are talking to a deepfake is to get them to do something a GenAI model would be unlikely to handle. However, you may not feel comfortable asking your CFO to “cha-cha real smooth” once they’ve asked you to perform an important transaction on behalf of the organisation. Defending against this may require a shift in cultural norms, particularly in societies where employees are conditioned not to question more senior people. People should be encouraged to raise questions where they have doubts, or to debate points more thoroughly; otherwise, their organisation is more likely to be affected by this style of attack.

A second check through a separate means of communication may suffice. For example, if you received the instructions through a video call, double-check the transaction details in person or on a phone call. This may involve additional training for users and should be started as soon as possible, to allow time for this training content to take root in people’s minds.

Social media and information about yourself in other online locations should be monitored and, where possible, personal information removed. All of this information builds up a picture for GenAI-powered reconnaissance and allows an attacker to tailor content that would be of personal interest to you. Of course, if someone is the target of a nation state, reducing their social media presence is unlikely to present an insurmountable obstacle to the attacker. If someone really wants video or audio footage of them, there are ample opportunities at the local coffee shop, or walking through the street. But for a typical person trying to protect against a remote attacker trying their luck, this may prove a useful defence.

 

Technical Measures

GenAI’s use should not distract us from our conventional controls and defences. While GenAI is supplementing attacks and increasing the speed at which these can occur, currently we have not seen any AI-driven attacks which have performed better than human operators. As such, we should continue to implement:

  • Basic cyber hygiene, such as MFA (Multi Factor Authentication) and segmentation of assets;
  • Policies and procedures to contain and mitigate attacks;
  • Detection and response processes and systems;
  • Patch management programs;
  • Regular security testing to verify that existing controls operate correctly.

While training users is a useful addition to these elements, it should not be fully relied upon.

We should mention that many providers of GenAI systems are attempting to build safety into their models. According to OpenAI’s website [13], “GPT-4 is 82% less likely to respond to requests for disallowed content … than GPT-3.5”. Google Gemini [14] falls under Google’s AI responsibility principles [15], which state that “We will design our AI systems to be appropriately cautious” without delving into much more detail about what this entails. However, open-source models or less well-known variants of GenAI may not have these protections or safeguards built in, so attackers could start training these models themselves.

In the previously mentioned Darknet Diaries episode, Daniel Miessler talks about how to prove authenticity in a world where voice cloning and deepfakes are abundant, and a short discussion takes place on how to distinguish between real and fake content. Donato Capitella of WithSecure has been thinking about this problem too and imagines that widespread adoption of some form of cryptographic signing may be necessary to mark authentic content. For example, consider if you saw a video of your bank’s CEO explaining that the company was in financial difficulty and close to collapse. It would be useful to be able to tell (especially for a news organisation or trading floor) whether the message was real or had been generated for some ulterior motive.
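As a minimal sketch of what such signing could look like, the example below uses an Ed25519 key pair from the Python cryptography library to sign a piece of published content and verify it later. The filename and key handling are placeholders, and this is an illustration of the general idea only; real content-provenance schemes (such as C2PA) embed signed manifests in the media itself and rely on a trusted way of distributing the publisher’s public key.

```python
# Illustrative only: detached Ed25519 signature over published content.
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

# The publisher (e.g. the bank's press office) holds the private key.
private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()   # distributed via a trusted channel

# Sign the video file (in practice you would sign a hash or a manifest).
with open("ceo_statement.mp4", "rb") as f:
    content = f.read()
signature = private_key.sign(content)

# A news organisation or trading floor can verify the content before acting on it.
try:
    public_key.verify(signature, content)
    print("Signature valid: content matches what the publisher signed")
except InvalidSignature:
    print("Signature invalid: content may be forged or tampered with")
```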

 

Incident Response

Incident response in relation to GenAI-driven attacks is not much different from current practice. If we first consider the use of AI-generated malicious code, it is likely that the model will have taken code snippets from around the Internet, or from its other training data, and combined them into something functional. This may sound familiar to many of you! Discerning whether a person or a GenAI model stitched the code together would be difficult.

Sora is an AI model from OpenAI which can create video from text prompts. The team behind Sora are releasing tools such as a “detection classifier that can tell when a video was generated by Sora” [16]. If a video was used as part of the initial attack and was saved onto an organisation’s systems, it may be possible to use tools such as these to judge whether the video was genuine or generated. This would, of course, be a lower-priority task, more relevant to reconstructing the lifecycle of the attack than to forming part of the immediate response. That said, if an organisation ran this type of detection classifier on all video calls, it may be in a stronger position to detect imminent threats. If recordings are not saved, IR teams are in a similar position to one where an attack started over the phone on a non-recorded line.

Organisations should consider their position if an employee has carried out an action which they claim was conducted at the behest of the “CEO” or “CFO”, with the employee now thinking that they must have been dealing with a deepfake. If no evidence is available to prove this either way, it would be hard to know whether this was a genuine outside attack or an insider threat with a good cover story. Procedures should be put in place so that the organisation is never left in this position, perhaps following in the footsteps of financial institutions. This could include the requirement to retain recordings of all phone or video calls and chat logs relating to any monetary transaction exceeding a specified amount. Similarly, changes to high-risk or high-value assets should require multiple layers of approval. While this may not necessarily stop the attack from going ahead, it at least provides an audit trail and governance structures to follow.
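As a rough sketch of how such a control might look, the example below requires a configurable number of independent approvers for transactions above a threshold and keeps a timestamped audit trail of every step. The threshold, approver count, names and logic are illustrative assumptions, not a prescription for any particular organisation.

```python
# Illustrative multi-approval check with an audit trail; values are hypothetical.
from dataclasses import dataclass, field
from datetime import datetime, timezone

APPROVAL_THRESHOLD_GBP = 50_000   # transactions above this need extra approvals
REQUIRED_APPROVERS = 2            # distinct approvers, excluding the requester

@dataclass
class TransactionRequest:
    requester: str
    amount_gbp: float
    approvals: set[str] = field(default_factory=set)
    audit_log: list[str] = field(default_factory=list)

    def record(self, event: str) -> None:
        # Every step is timestamped so there is an audit trail to review later.
        self.audit_log.append(f"{datetime.now(timezone.utc).isoformat()} {event}")

    def approve(self, approver: str) -> None:
        if approver == self.requester:
            raise ValueError("Requester cannot approve their own transaction")
        self.approvals.add(approver)
        self.record(f"approved by {approver}")

    def may_execute(self) -> bool:
        # Small transactions proceed; large ones need enough independent approvals.
        if self.amount_gbp < APPROVAL_THRESHOLD_GBP:
            return True
        return len(self.approvals) >= REQUIRED_APPROVERS

# Usage: a large transfer requested after a video call cannot proceed
# until two other people have independently signed off.
req = TransactionRequest(requester="alice", amount_gbp=200_000)
req.record("requested following video call with 'CFO'")
req.approve("bob")
print(req.may_execute())   # False: still needs a second approver
req.approve("carol")
print(req.may_execute())   # True
```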

 

 

Summary

GenAI remains in its infancy. It continues to mature, and as it does so, we can expect a greater level of sophistication in attacks which integrate GenAI. The current threat from its use is the speed at which elements of an attack can be generated. We need to ensure that we can respond to this new pace with effective defensive mechanisms, which may include the use of GenAI where appropriate. Attacks on organisations can be seen today where GenAI forms part of the technology used for analysis or execution. These will only become more prevalent, and organisations should prepare themselves accordingly.

 

 
