Caught somewhere in time: Hunting for timer-queue timers
By William Burgess on 10 October, 2022
Hiding injected code is an enduring problem for attackers. Staying resident in memory increases the risk of detection through techniques like memory scanning and so it is desirable for an attacker to hide their implant when not actively performing a task.
More recent techniques have leveraged the Windows thread pool API to line up a sequence of callbacks (via timer-queue timers) to obfuscate injected code, sleep for a period, and then restore its execution. A good open-source example of this technique is Ekko Sleep Obfuscation which uses the CreateTimerQueueTimer API. This technique in turn appears to be inspired by Nighthawk / FOLIAGE.
Furthermore, Windows does not appear to provide any native telemetry sources for observing malicious timer-queue timers, making it extremely difficult for blue teamers to catch adversaries using these techniques. Therefore, it provides a promising TTP for threat actors to utilize in order to stay hidden. This blog attempts to address this gap and will demonstrate a PoC tool to enumerate timer-queue timers in memory.
The aim of this research was to identify a potential signature which could be used to enumerate timer-queue timers in memory. The approach taken was to understand how the Ekko PoC works, understand how the functions it uses work under the hood (e.g., CreateTimerQueueTimer), and attempt to identify some “signaturable” behaviour.
At a high level, Ekko sets up a sequence of callbacks that fire one after the other and perform a basic obfuscate-sleep-deobfuscate pattern. Specifically, it marks its own memory as read-write, uses SystemFunction032 to encrypt its contents, and then sleeps for a set duration. Once the sleep timer expires, it decrypts itself, marks itself as executable again, and continues execution.
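SystemFunction032 (exported by Advapi32) performs RC4, which is symmetric: running it a second time with the same key restores the original bytes. That is why the same routine can serve as both the "obfuscate" and the "deobfuscate" step. The Python sketch below is a plain RC4 implementation that illustrates this property only; the key and buffer are made up for the example, and it is not Ekko's code.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4: the same call both encrypts and decrypts."""
    # Key-scheduling algorithm (KSA)
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    # Pseudo-random generation algorithm (PRGA), XORed over the data
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


key = b"0123456789abcdef"                      # hypothetical 16-byte key
image = b"MZ\x90\x00 pretend implant image"    # hypothetical buffer
masked = rc4(key, image)      # what a memory scanner would see during the sleep
restored = rc4(key, masked)   # what the "decrypt" timer puts back
```

While the implant sleeps, only `masked` is resident, which is what defeats signature-based memory scanning of the image itself.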
From the Ekko PoC, it is clear you need to first call CreateTimerQueue to obtain a “handle” to a timer queue. If we look at this function, it is apparent that the handle is not actually a fully-fledged handle (and hence does not correspond to a kernel object) but rather a pointer to a block of memory allocated on the heap (i.e., the value returned by RtlAllocateHeap below):
Therefore, timer-queue timers are actually fairly “lightweight objects” and appear to be predominantly implemented in user mode. In fact, they are managed by a worker thread in ntdll as part of the thread pool implementation. This has an important implication in that the sleep/wait initiated by Ekko is performed by a worker thread. Furthermore, the callbacks (e.g. VirtualProtect) will also be executed by a worker thread, as shown in the call stack below:
0:003> knf
 #   Memory  Child-SP          RetAddr           Call Site
00           00000013`920ff5c8 00007ffd`996f4dd6 ntdll!NtProtectVirtualMemory
01         8 00000013`920ff5d0 00007ffd`9bee6119 KERNELBASE!VirtualProtect+0x36
02        50 00000013`920ff620 00007ffd`9bee1609 ntdll!RtlpTpTimerCallback+0x79
03        50 00000013`920ff670 00007ffd`9bec315a ntdll!TppTimerpExecuteCallback+0xa9
04        50 00000013`920ff6c0 00007ffd`9ba67034 ntdll!TppWorkerThread+0x68a
05       300 00000013`920ff9c0 00007ffd`9bec2651 KERNEL32!BaseThreadInitThunk+0x14
06        30 00000013`920ff9f0 00000000`00000000 ntdll!RtlUserThreadStart+0x21
This gives the technique a minimal footprint from a detection perspective as it avoids typical indicators of injected code (e.g., there are no threads created which point to unbacked regions).
Additionally, it is important to stress that timer-queue timers are distinct from fully fledged kernel timer objects, which are accessed via NtCreateTimer, etc. For a version of Ekko which uses waitable timers (i.e., kernel timers) see: https://github.com/Idov31/Cronos.
Once a handle has been acquired, an attacker can now call CreateTimerQueueTimer multiple times to queue up a series of callbacks which will perform the familiar obfuscate-sleep-deobfuscate pattern. The function definition for CreateTimerQueueTimer is shown below:
BOOL CreateTimerQueueTimer(
  [out]          PHANDLE             phNewTimer,
  [in, optional] HANDLE              TimerQueue,
  [in]           WAITORTIMERCALLBACK Callback,
  [in, optional] PVOID               Parameter,
  [in]           DWORD               DueTime,
  [in]           DWORD               Period,
  [in]           ULONG               Flags
);
The key parameters to note are the Callback and Parameter arguments, which are the function and argument to be executed when the timer expires. Note the “handle” acquired via CreateTimerQueue previously is also passed (TimerQueue).
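To make the role of DueTime concrete, here is a toy Python model of a timer queue in which one-shot timers fire strictly in DueTime order, regardless of the order in which they were created. The step names and the 100 ms spacing are illustrative assumptions loosely modelled on the Ekko PoC rather than values taken from this post, and the real queue is of course managed by ntdll's thread pool, not by a Python heap.

```python
import heapq


class ToyTimerQueue:
    """Toy stand-in for a timer queue: one-shot timers fire in DueTime order."""

    def __init__(self):
        self._timers = []
        self._seq = 0  # tie-breaker so equal due times keep creation order

    def create_timer(self, callback, parameter, due_time_ms):
        # Mirrors the (Callback, Parameter, DueTime) triple of CreateTimerQueueTimer.
        heapq.heappush(self._timers, (due_time_ms, self._seq, callback, parameter))
        self._seq += 1

    def run(self):
        # Pop timers by ascending due time and invoke Callback(Parameter).
        while self._timers:
            _due, _seq, callback, parameter = heapq.heappop(self._timers)
            callback(parameter)


# Hypothetical step labels for the obfuscate-sleep-deobfuscate chain.
STEPS = [
    "VirtualProtect(read-write)",
    "SystemFunction032(encrypt)",
    "sleep",
    "SystemFunction032(decrypt)",
    "VirtualProtect(executable)",
    "signal completion",
]

fired = []
queue = ToyTimerQueue()
# Queue the timers in reverse to show that DueTime, not creation order, decides.
for n, step in reversed(list(enumerate(STEPS, start=1))):
    queue.create_timer(fired.append, step, due_time_ms=100 * n)
queue.run()
```

After `run()`, `fired` lists the steps in their intended order, which is the property an attacker relies on when chaining callbacks this way.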
If we look at the implementation of CreateTimerQueueTimer, we can see that it is essentially a wrapper around RtlCreateTimer. This is also demonstrated by setting a breakpoint in windbg on ntdll!RtlCreateTimer as shown in the call stack below:
0:000> knf
 #   Memory  Child-SP          RetAddr           Call Site
00           000000f4`838fcbe0 00007ffd`996fec3e ntdll!RtlCreateTimer+0x190
01        c0 000000f4`838fcca0 00007ff7`00886430 KERNELBASE!CreateTimerQueueTimer+0x5e
02        50 000000f4`838fccf0 00007ff7`008820a2 ekko!EkkoObf+0x2e0 [C:\Users\wb\source\repos\ekko\ekko\ekko.c @ 48]
03      2a80 000000f4`838ff770 00007ff7`00882a69 ekko!main+0x32 [C:\Users\wb\source\repos\ekko\ekko\main.c @ 11]
Therefore, the next step is to dig into the implementation of RtlCreateTimer, of which a key section is shown below:
This small code stub enables us to draw two initial conclusions:
1) RtlCreateTimer calls RtlAllocateHeap itself on line 45 with an allocation size of 0x60 bytes.
2) As part of the timer initialisation, RtlCreateTimer writes the arguments passed in the call to CreateTimerQueueTimer to this block of memory (e.g. the callback is written on line 57, the parameter on line 58 etc.).
Therefore, we know that the timer callback and parameter can be found in heap memory; however, so far there is not much to go on in terms of developing a signature that we can use to actively hunt for timer-queue timers in memory.
If we continue to examine RtlCreateTimer, we can observe that later on in its execution it calls the function TpAllocTimer, to which it passes a pointer to the previous heap allocation of 0x60 bytes. TpAllocTimer in turn allocates more heap memory (0x168 bytes) as shown in the image below:
This new block of heap memory is initialised in a similar fashion via calling a chain of successive functions (TpAllocTimer -> TppInitializeTimer -> TppCleanupGroupAddMember). At this point, following the exact implementation of each of these functions is complicated, but the TL;DR is that during the call to TpAllocTimer, a repetitive and predictable sequence of pointers is written to this second block of heap memory.
Crucially, this second block also contains a pointer to the initial allocation performed by RtlCreateTimer that contains the callback and parameter. Therefore, from a single call of CreateTimerQueueTimer we can expect to find two “linked” heap allocations.
This can be demonstrated via running the Ekko PoC under windbg. We can start by enumerating the process heaps and examining the objects within them:
0:000> !heap -a
Index   Address  Name      Debugging options enabled
  1:   177a8c60000
    Segment at 00000177a8c60000 to 00000177a8d5f000 (0001a000 bytes committed)
  2:   177a8b20000
    Segment at 00000177a8b20000 to 00000177a8b30000 (00001000 bytes committed)
  3:   177a8e90000
    Segment at 00000177a8e90000 to 00000177a8e9f000 (00007000 bytes committed)
0:000> !heap -a 177a8c60000
Index   Address  Name      Debugging options enabled
  1:   177a8c60000
    Segment at 00000177a8c60000 to 00000177a8d5f000 (0001a000 bytes committed)
    Flags:                00000002
    ForceFlags:           00000000
    Granularity:          16 bytes
    Segment Reserve:      00100000
    Segment Commit:       00002000
    DeCommit Block Thres: 00000400
    DeCommit Total Thres: 00001000
    Total Free Size:      00000210
    Max. Allocation Size: 00007ffffffdefff
    Lock Variable at:     00000177a8c602c0
    Next TagIndex:        0000
    Maximum TagIndex:     0000
    Tag Entries:          00000000
    PsuedoTag Entries:    00000000
    Virtual Alloc List:   177a8c60110
    Uncommitted ranges:   177a8c600f0
            177a8c7a000: 000e5000  (937984 bytes)
    FreeList[ 00 ] at 00000177a8c60150: 00000177a8c78aa0 . 00000177a8c6f1a0
        00000177a8c6f190: 00020 . 00020  - free
        [...]
    Segment00 at a8c60000:
        Flags:           00000000
        Base:            177a8c60000
        First Entry:     a8c60740
        Last Entry:      177a8d5f000
        Total Pages:     000000ff
        Total UnCommit:  000000e5
        Largest UnCommit:00000000
        UnCommitted Ranges: (1)
    Heap entries for Segment00 in Heap 00000177a8c60000
                 address: psize . size  flags   state (requested size)
        […]
        00000177a8c73330: 00170 . 00070  - busy (60)
        […]
        00000177a8c75f40: 003a0 . 00170  - busy (168)
We are interested in objects allocated on the heap with a size of either 0x60 (first allocation) or 0x168 bytes (second allocation). For brevity, two such objects are shown in the output above, which correspond to two linked heap allocations performed by a single CreateTimerQueueTimer call.
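When working from a saved `!heap -a` listing, picking out these candidates by eye gets tedious. The following Python sketch filters busy entries by requested size; the regular expression is an assumption based on the entry format shown in the output above, and one of the sample lines is invented to show a non-matching size. Size alone is a weak signal, so this only narrows the search.

```python
import re

# Matches entry lines such as: 00000177a8c73330: 00170 . 00070  - busy (60)
ENTRY_RE = re.compile(
    r"^(?P<addr>[0-9a-f]+):\s+(?P<psize>[0-9a-f]+)\s+\.\s+(?P<size>[0-9a-f]+)"
    r"\s+-\s+busy\s+\((?P<req>[0-9a-f]+)\)"
)

# Requested sizes observed in this post for the two linked allocations.
TIMER_SIZES = {0x60: "RtlCreateTimer", 0x168: "TpAllocTimer"}


def candidate_entries(listing: str):
    """Return (entry address, requested size, likely allocator) for matching entries."""
    hits = []
    for line in listing.splitlines():
        match = ENTRY_RE.match(line.strip())
        if not match:
            continue  # free entries and headers are ignored
        requested = int(match.group("req"), 16)
        if requested in TIMER_SIZES:
            hits.append((int(match.group("addr"), 16), requested, TIMER_SIZES[requested]))
    return hits


listing = """\
00000177a8c6f190: 00020 . 00020  - free
00000177a8c73330: 00170 . 00070  - busy (60)
00000177a8c733a0: 00070 . 00040  - busy (30)
00000177a8c75f40: 003a0 . 00170  - busy (168)
"""
# Note: the "(30)" line is hypothetical, added only as a non-matching example.
candidates = candidate_entries(listing)
```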
If we examine the memory at the latter heap entry (0x00000177a8c75f40), we will discover the second heap allocation performed by TpAllocTimer:
0:000> dps 00000177a8c75f40 L20
00000177`a8c75f40  00000000`00000000
00000177`a8c75f48  0800bac1`24bf2411
00000177`a8c75f50  00000000`00000001
00000177`a8c75f58  00007ffd`9bf8c1e8 ntdll!TppTimerpCleanupGroupMemberVFuncs
00000177`a8c75f60  00000000`00000000
00000177`a8c75f68  00000000`00000000
00000177`a8c75f70  00007ffd`9be78820 ntdll!RtlpTpTimerFinalizationCallback
00000177`a8c75f78  00000177`a8c75f78
00000177`a8c75f80  00000177`a8c75f78
00000177`a8c75f88  00000000`00000000
00000177`a8c75f90  00000000`00000000
00000177`a8c75f98  00000000`00000000
00000177`a8c75fa0  00007ffd`9bee60a0 ntdll!RtlpTpTimerCallback
00000177`a8c75fa8  00000177`a8c73340 (Pointer to first RtlCreateTimer heap allocation)
00000177`a8c75fb0  00000000`00000000
00000177`a8c75fb8  00000000`00000000
00000177`a8c75fc0  00000000`00000000
00000177`a8c75fc8  00000000`00000000
00000177`a8c75fd0  00000000`00000000
00000177`a8c75fd8  00000000`00000000
00000177`a8c75fe0  00000177`a8c74da0
00000177`a8c75fe8  00000177`a8c76158
00000177`a8c75ff0  00000177`a8c73268
00000177`a8c75ff8  00000000`00000002
00000177`a8c76000  00007ffd`9be79ee0 ntdll!RtlCreateTimer+0x190
00000177`a8c76008  00000000`00000000
00000177`a8c76010  00000000`00000001
00000177`a8c76018  00007ffd`9bf8c138 ntdll!TppTimerpTaskVFuncs
00000177`a8c76020  00000001`00000000
The crucial point to stress here is the pattern of repeatable/predictable pointers, which starts with ntdll!TppTimerpCleanupGroupMemberVFuncs and ends with ntdll!TppTimerpTaskVFuncs.
Furthermore, I mentioned previously that a pointer to the first (“linked”) heap allocation was also written to this block of memory. This can be found at 0x00000177a8c75fa8 (so immediately after ntdll!RtlpTpTimerCallback in memory). To confirm this, we can compare it to the output of “!heap -a” above and see that 0x00000177a8c73340 is located within the first heap entry of 0x60 bytes (which starts at 0x00000177a8c73330).
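The address arithmetic behind that comparison is small enough to check directly. This Python sketch assumes a 0x10-byte heap entry header in front of each user allocation, which is consistent with the listing above (an entry size of 0x70 for a requested size of 0x60); the helper names are hypothetical.

```python
HEAP_ENTRY_HEADER = 0x10  # assumed header size; matches 0x70 - 0x60 in the listing


def user_pointer(entry_addr: int) -> int:
    """Address handed back to the caller for a given heap entry."""
    return entry_addr + HEAP_ENTRY_HEADER


def entry_contains(entry_addr: int, entry_size: int, ptr: int) -> bool:
    """True if ptr falls inside the heap entry [entry_addr, entry_addr + entry_size)."""
    return entry_addr <= ptr < entry_addr + entry_size


first_entry, first_size = 0x177A8C73330, 0x70     # the "busy (60)" entry
second_entry, second_size = 0x177A8C75F40, 0x170  # the "busy (168)" entry
link = 0x177A8C73340                              # value stored at 0x177a8c75fa8

link_is_first_allocation = (
    entry_contains(first_entry, first_size, link)
    and user_pointer(first_entry) == link
)
```

In other words, the pointer stored after ntdll!RtlpTpTimerCallback is not merely inside the first heap entry; under the header-size assumption it is exactly the address that RtlAllocateHeap would have returned for it.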
We can therefore proceed to dump the memory at this location:
0:000> dps 00000177`a8c73340
00000177`a8c73340  00000177`a8c6ee10
00000177`a8c73348  00000177`a8c733b0
00000177`a8c73350  00000000`00000000
00000177`a8c73358  dddddddd`00000020
00000177`a8c73360  00007ffd`9bf0d590 ntdll!NtContinue (Callback)
00000177`a8c73368  00000062`23cfd6e0 (Parameter)
00000177`a8c73370  dddddddd`00000000
00000177`a8c73378  00000177`a8c63600
00000177`a8c73380  00000177`a8c75f50
00000177`a8c73388  00000000`00000000
00000177`a8c73390  00000000`00000000
00000177`a8c73398  00000001`dddddd00
00000177`a8c733a0  dddddddd`dddddddd
00000177`a8c733a8  1000bafc`34bf2401
00000177`a8c733b0  00000177`a8c73340
00000177`a8c733b8  00000177`a8c76230
Voila! We have found the callback (which points to ntdll!NtContinue). As NtContinue expects a CONTEXT structure, we can assume the next pointer (0x0000006223cfd6e0) is a valid context structure:
0:000> dt CONTEXT 00000062`23cfd6e0
ekko!CONTEXT
   +0x000 P1Home           : 0
   +0x008 P2Home           : 0
   +0x010 P3Home           : 0
   +0x018 P4Home           : 0
   +0x020 P5Home           : 0
   +0x028 P6Home           : 0
   +0x030 ContextFlags     : 0x10000f
   +0x034 MxCsr            : 0x1f80
   +0x038 SegCs            : 0x33
   +0x03a SegDs            : 0x2b
   +0x03c SegEs            : 0x2b
   +0x03e SegFs            : 0x53
   +0x040 SegGs            : 0x2b
   +0x042 SegSs            : 0x2b
   +0x044 EFlags           : 0x202
   +0x048 Dr0              : 0
   +0x050 Dr1              : 0
   +0x058 Dr2              : 0
   +0x060 Dr3              : 0
   +0x068 Dr6              : 0
   +0x070 Dr7              : 0
   +0x078 Rax              : 0x00007ffd`9ba746c0
   +0x080 Rcx              : 0x00007ff7`0e2c0000
   +0x088 Rdx              : 0x28000
   +0x090 Rbx              : 0x00000177`a8c6ee10
   +0x098 Rsp              : 0x00000062`23fff8c8
   +0x0a0 Rbp              : 0x00000062`23fffae8
   +0x0a8 Rsi              : 0x00000062`23b5c000
   +0x0b0 Rdi              : 0x7ffe0386
   +0x0b8 R8               : 4
   +0x0c0 R9               : 0x00000062`23cff514
   +0x0c8 R10              : 0x00000177`a8c63060
   +0x0d0 R11              : 0x00000177`a8c63080
   +0x0d8 R12              : 0
   +0x0e0 R13              : 0
   +0x0e8 R14              : 0
   +0x0f0 R15              : 0x22c
   +0x0f8 Rip              : 0x00007ffd`9ba6bc70
0:000> uf 0x00007ffd`9ba6bc70
KERNEL32!VirtualProtect:
00007ffd`9ba6bc70 48ff25d15b0600  jmp     qword ptr [KERNEL32!QuirkIsEnabledForPackage2Worker+0x11578 (00007ffd`9bad1848)]  Branch
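In this case, we can see that the parameter is a CONTEXT structure in which Rip points at VirtualProtect. Following the x64 calling convention, Rcx (the first argument) holds the base of the target region, 0x00007ff7`0e2c0000, Rdx (the second argument) holds its size, 0x28000 bytes, and R8 (the third argument) is 4, i.e., PAGE_READWRITE. In other words, this is the timer that marks the implant's memory as read-write (i.e., it is this timer in memory: https://github.com/Cracked5pider/Ekko/blob/main/Src/Ekko.c#L97).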
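The same decoding can be done programmatically once the raw CONTEXT bytes have been read from the target. The Python sketch below uses the field offsets shown in the `dt CONTEXT` output above and the Microsoft x64 convention of passing the first four arguments in Rcx, Rdx, R8 and R9. It assumes Rip has already been resolved to VirtualProtect by other means, and it rebuilds the buffer from the dumped values rather than reading a live process.

```python
import struct

CONTEXT_SIZE = 0x4D0  # size of the x64 CONTEXT structure
# Field offsets as shown by `dt CONTEXT` in the debugger output.
OFFSETS = {"Rcx": 0x80, "Rdx": 0x88, "R8": 0xB8, "R9": 0xC0, "Rip": 0xF8}
PROTECT_NAMES = {
    0x02: "PAGE_READONLY",
    0x04: "PAGE_READWRITE",
    0x20: "PAGE_EXECUTE_READ",
    0x40: "PAGE_EXECUTE_READWRITE",
}


def decode_virtualprotect(ctx: bytes) -> dict:
    """Interpret a CONTEXT whose Rip is VirtualProtect as its four arguments."""
    regs = {name: struct.unpack_from("<Q", ctx, off)[0] for name, off in OFFSETS.items()}
    protect = regs["R8"] & 0xFFFFFFFF  # flNewProtect is a DWORD
    return {
        "rip": regs["Rip"],
        "lpAddress": regs["Rcx"],
        "dwSize": regs["Rdx"],
        "flNewProtect": PROTECT_NAMES.get(protect, hex(protect)),
        "lpflOldProtect": regs["R9"],
    }


# Rebuild the dumped CONTEXT (only the registers of interest are filled in).
ctx = bytearray(CONTEXT_SIZE)
dumped = {
    "Rcx": 0x00007FF70E2C0000,
    "Rdx": 0x28000,
    "R8": 4,
    "R9": 0x0000006223CFF514,
    "Rip": 0x00007FFD9BA6BC70,
}
for name, value in dumped.items():
    struct.pack_into("<Q", ctx, OFFSETS[name], value)

call = decode_virtualprotect(bytes(ctx))
```

A queued NtContinue whose CONTEXT resolves to a protection change over an image-sized region is a far stronger signal than the presence of a timer alone.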
Putting this together, timer-queue timers can be enumerated in memory by:
- Obtaining a handle to every process and locating all heap memory regions via the PEB.
- Scanning heap memory for our specific pattern of thread pool pointers (ntdll!TppTimerpCleanupGroupMemberVFuncs, ntdll!RtlpTpTimerFinalizationCallback, ntdll!RtlpTpTimerCallback and ntdll!TppTimerpTaskVFuncs).
- If we find the pattern, retrieve the initial allocation performed by RtlCreateTimer and obtain the target callback and parameter.
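The last two steps can be sketched in Python against a synthetic heap region. This is a minimal illustration, not the logic of the released PoC: the relative offsets are derived from the single dump shown in this post and may differ between Windows builds, the ntdll addresses are the ones from that dump (a real scanner would resolve them on the system being scanned), and reading the target's memory is left out entirely.

```python
import struct

# ntdll addresses as they appear in the dump in this post (assumed, per-system values).
SYMBOLS = {
    "TppTimerpCleanupGroupMemberVFuncs": 0x7FFD9BF8C1E8,
    "RtlpTpTimerFinalizationCallback": 0x7FFD9BE78820,
    "RtlpTpTimerCallback": 0x7FFD9BEE60A0,
    "TppTimerpTaskVFuncs": 0x7FFD9BF8C138,
}
# Offsets relative to the first pattern pointer, derived from the dps output.
PATTERN = (
    (0x00, SYMBOLS["TppTimerpCleanupGroupMemberVFuncs"]),
    (0x18, SYMBOLS["RtlpTpTimerFinalizationCallback"]),
    (0x48, SYMBOLS["RtlpTpTimerCallback"]),
    (0xC0, SYMBOLS["TppTimerpTaskVFuncs"]),
)
PATTERN_SPAN = 0xC8      # bytes covered by the pattern, including its last pointer
LINK_OFFSET = 0x50       # pointer to the RtlCreateTimer allocation
CALLBACK_OFFSET = 0x20   # within the RtlCreateTimer allocation
PARAMETER_OFFSET = 0x28


def read_qword(mem: bytes, base: int, addr: int) -> int:
    return struct.unpack_from("<Q", mem, addr - base)[0]


def scan_region(mem: bytes, base: int):
    """Return (pattern address, callback, parameter) for each timer found in mem."""
    end = base + len(mem)
    hits = []
    for addr in range(base, end - PATTERN_SPAN + 1, 8):
        if not all(read_qword(mem, base, addr + off) == val for off, val in PATTERN):
            continue
        link = read_qword(mem, base, addr + LINK_OFFSET)
        if not (base <= link and link + PARAMETER_OFFSET + 8 <= end):
            continue  # linked block is outside this region; a real scanner would read it separately
        hits.append((
            addr,
            read_qword(mem, base, link + CALLBACK_OFFSET),
            read_qword(mem, base, link + PARAMETER_OFFSET),
        ))
    return hits


# Synthetic region rebuilt from the two dumps in this post.
BASE = 0x177A8C73000
region = bytearray(0x3100)
for addr, value in [
    (0x177A8C75F58, SYMBOLS["TppTimerpCleanupGroupMemberVFuncs"]),
    (0x177A8C75F70, SYMBOLS["RtlpTpTimerFinalizationCallback"]),
    (0x177A8C75FA0, SYMBOLS["RtlpTpTimerCallback"]),
    (0x177A8C75FA8, 0x177A8C73340),   # link to the first allocation
    (0x177A8C76018, SYMBOLS["TppTimerpTaskVFuncs"]),
    (0x177A8C73360, 0x7FFD9BF0D590),  # callback: ntdll!NtContinue
    (0x177A8C73368, 0x6223CFD6E0),    # parameter: pointer to CONTEXT
]:
    struct.pack_into("<Q", region, addr - BASE, value)

hits = scan_region(bytes(region), BASE)
```

Matching on four pointers at fixed relative offsets keeps the signature strict, at the cost of being tied to a particular layout of the thread pool structures.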
A few caveats apply to this approach. Firstly, I initially explored enumerating heaps via CreateToolhelp32Snapshot; however, I found it quite flaky, as it would often fail to retrieve a snapshot (especially when the target process was slow to break into under a debugger, as Ekko often is). Hence, I decided to manually locate heap memory in the target process.
Secondly, an obvious race condition exists in that the heap is constantly in flux while we are reading and scanning memory. This could result in failing to spot a malicious timer which is partially scrubbed. A potential fix to this would be to suspend all threads in the process before scanning, but this is obviously more intrusive. In any case, one thing to bear in mind is that a few timers must always be present while the implant is sleeping, as they are needed to fire when the sleep duration ends. Hence, while the beacon may only be visible in memory for a brief period, some timers must always be present for the whole sleep duration and hence our chances of spotting them are considerably higher.
Thirdly, there may be a better and more reliable/performant way to do this. One sensible alternative would be to approach the problem from the other side; that is, to trace the TppWorkerThread which will execute the timer callbacks. I did start investigating this and could see, for instance, that the PEB contains a linked list which is referenced by TppWorkerThread (TppWorkerpListLock // TppWorkerpList; https://www.geoffchappell.com/studies/windows/km/ntoskrnl/inc/api/pebteb/peb/index.htm). This felt more complicated from a quick analysis and, having found something that worked, I did not pursue it any further; nevertheless, it may yet provide an easier way of enumerating timer-queue timers.
Lastly, this technique relies on undocumented functionality, so it may break in future Windows releases. At the time of publishing, it has been tested on Windows 10 1607 and Windows 10 21H2.
PoC || GTFO
The PoC memory scanner can be found here: https://github.com/WithSecureLabs/TickTock
Below is a demonstration of the results of scanning for timer-queue timers while Ekko is running:
Additionally, in terms of false positives, on a basic build of Windows 10 21H2 there appeared to be only a few common timer-queue timers lurking in memory. These belonged to winlogon, dllhost (WORK_QUEUE::OnThreadSentinelTimer), and taskhost (COSTimerQueueEntry::Completion_). Hence, finding multiple (unique) timer-queue timers in memory already seems highly anomalous.
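One way that observation could be turned into a first-pass triage rule is to count unique timers per process and surface only the processes above a threshold. The Python sketch below is an assumption about how scanner output might be post-processed, not a feature of the released tool; the threshold, the process names for the benign entries, and the sample findings are all illustrative.

```python
from collections import defaultdict


def triage(findings, threshold=3):
    """Map process name -> count of unique (callback, parameter) timers, if >= threshold."""
    unique = defaultdict(set)
    for process, callback, parameter in findings:
        unique[process].add((callback, parameter))
    return {proc: len(timers) for proc, timers in unique.items() if len(timers) >= threshold}


# Hypothetical scanner output: (process, resolved callback, parameter address).
findings = [
    ("winlogon.exe", "unresolved", 0x1000),
    ("dllhost.exe", "WORK_QUEUE::OnThreadSentinelTimer", 0x2000),
    ("taskhost.exe", "COSTimerQueueEntry::Completion_", 0x3000),
]
# Six chained NtContinue timers, each with its own CONTEXT parameter.
findings += [("ekko.exe", "ntdll!NtContinue", 0x6223CFD6E0 + 0x500 * n) for n in range(6)]

suspicious = triage(findings)
```

A count-based rule like this would need tuning against a real baseline, and a callback of ntdll!NtContinue is arguably worth flagging on its own regardless of count.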
As a final note, the PoC released with this blog makes no effort to remove false positives. As with all detection logic, the real test is how a specific detection approach is tailored to an environment and how common FPs are removed, which is outside the scope of this research.