Caught somewhere in time: Hunting for timer-queue timers

By William Burgess on 10 October, 2022

Hiding injected code is an enduring problem for attackers. Staying resident in memory increases the risk of detection through techniques like memory scanning and so it is desirable for an attacker to hide their implant when not actively performing a task.  
More recent techniques have leveraged the Windows thread pool API to line up a sequence of callbacks (via timer-queue timers) to obfuscate injected code, sleep for a period, and then restore its execution. A good open-source example of this technique is Ekko Sleep Obfuscation which uses the CreateTimerQueueTimer API. This technique in turn appears to be inspired by Nighthawk / FOLIAGE.  
Furthermore, Windows does not appear to provide any native telemetry sources for observing malicious timer-queue timers, making it extremely difficult for blue teamers to catch adversaries using these techniques. Therefore, it provides a promising TTP for threat actors to utilize in order to stay hidden. This blog attempts to address this gap and will demonstrate a PoC tool to enumerate timer-queue timers in memory.

Technical Walkthrough

The aim of this research was to identify a potential signature which could be used to enumerate timer-queue timers in memory. The approach taken was to understand how the Ekko PoC works, understand how the functions it uses work under the hood (e.g., CreateTimerQueueTimer), and attempt to identify some “signaturable” behaviour. 
At a high level, Ekko sets up a sequence of callbacks that fire one after the other and perform a basic obfuscate-sleep-deobfuscate pattern. Specifically, it marks its own memory as readable and writable, uses SystemFunction032 to encrypt its contents, and then sleeps for a set duration. Once the sleep timer expires, it decrypts itself, marks itself as executable again and continues execution. 
From the Ekko PoC, it is clear you need to first call CreateTimerQueue to obtain a “handle” to a timer queue. If we look at this function, it is apparent that the handle is not actually a fully-fledged handle (and hence does not correspond to a kernel object) but rather a pointer to a block of memory allocated on the heap (i.e., the value returned by RtlAllocateHeap below):

Therefore, timer-queue timers are actually fairly “lightweight objects” and appear to be predominantly implemented in user mode. In fact, they are managed by a worker thread in ntdll as part of the thread pool implementation. This has an important implication in that the sleep/wait initiated by Ekko is performed by a worker thread. Furthermore, the callbacks (e.g. VirtualProtect) will also be executed by a worker thread, as shown in the call stack below:

0:003> knf 
#   Memory  Child-SP          RetAddr               Call Site 
00           00000013`920ff5c8 00007ffd`996f4dd6     ntdll!NtProtectVirtualMemory 
01         8 00000013`920ff5d0 00007ffd`9bee6119     KERNELBASE!VirtualProtect+0x36 
02        50 00000013`920ff620 00007ffd`9bee1609     ntdll!RtlpTpTimerCallback+0x79 
03        50 00000013`920ff670 00007ffd`9bec315a     ntdll!TppTimerpExecuteCallback+0xa9 
04        50 00000013`920ff6c0 00007ffd`9ba67034     ntdll!TppWorkerThread+0x68a 
05       300 00000013`920ff9c0 00007ffd`9bec2651     KERNEL32!BaseThreadInitThunk+0x14 
06        30 00000013`920ff9f0 00000000`00000000     ntdll!RtlUserThreadStart+0x21

This gives the technique a minimal footprint from a detection perspective as it avoids typical indicators of injected code (e.g., there are no threads created which point to unbacked regions). 
Additionally, it is important to stress that timer-queue timers are distinct from the fully fledged kernel timer objects that are accessed via NtCreateTimer, etc. For a version of Ekko which uses waitable timers (i.e., kernel timers) see: 
Once a handle has been acquired, an attacker can now call CreateTimerQueueTimer multiple times to queue up a series of callbacks which will perform the familiar obfuscate-sleep-deobfuscate pattern. The function definition for CreateTimerQueueTimer is shown below: 

BOOL CreateTimerQueueTimer( 
  [out]          PHANDLE             phNewTimer, 
  [in, optional] HANDLE              TimerQueue, 
  [in]           WAITORTIMERCALLBACK Callback, 
  [in, optional] PVOID               Parameter, 
  [in]           DWORD               DueTime, 
  [in]           DWORD               Period, 
  [in]           ULONG               Flags 
);

The key parameters to note are the Callback and Parameter arguments, which are the function and argument to be executed when the timer expires. Note the “handle” acquired via CreateTimerQueue previously is also passed (TimerQueue). 
If we look at the implementation of CreateTimerQueueTimer, we can see that it is essentially a wrapper around RtlCreateTimer. This is also demonstrated by setting a breakpoint in windbg on ntdll!RtlCreateTimer as shown in the call stack below:

0:000> knf 
 #   Memory  Child-SP          RetAddr               Call Site 
00           000000f4`838fcbe0 00007ffd`996fec3e     ntdll!RtlCreateTimer+0x190 
01        c0 000000f4`838fcca0 00007ff7`00886430     KERNELBASE!CreateTimerQueueTimer+0x5e 
02        50 000000f4`838fccf0 00007ff7`008820a2     ekko!EkkoObf+0x2e0 [C:\Users\wb\source\repos\ekko\ekko\ekko.c @ 48]  
03      2a80 000000f4`838ff770 00007ff7`00882a69     ekko!main+0x32 [C:\Users\wb\source\repos\ekko\ekko\main.c @ 11]

Therefore, the next step is to dig into the implementation of RtlCreateTimer, of which a key section is shown below: 

This small code stub enables us to draw two initial conclusions: 
1) RtlCreateTimer calls RtlAllocateHeap itself on line 45 with an allocation size of 0x60 bytes.

2) As part of the timer initialisation, RtlCreateTimer writes the arguments passed in the call to CreateTimerQueueTimer to this block of memory (e.g. the callback is written on line 57, the parameter on line 58 etc.). 
Therefore, we know that the timer callback and parameter can be found in heap memory; however, so far there is not much to go on in terms of developing a signature that we can use to actively hunt for timer-queue timers in memory. 
If we continue to examine RtlCreateTimer, we can observe that later on in its execution it calls the function TpAllocTimer, to which it passes a pointer to the previous heap allocation of 0x60 bytes. TpAllocTimer in turn allocates more heap memory (0x168 bytes) as shown in the image below:

This new block of heap memory is initialised in a similar fashion via a chain of successive function calls (TpAllocTimer -> TppInitializeTimer -> TppCleanupGroupAddMember). At this point, following the exact implementation of each of these functions is complicated, but the TL;DR is that during the call to TpAllocTimer, a repetitive and predictable sequence of pointers is written to this second block of heap memory.  
Crucially, this second block also contains a pointer to the initial allocation performed by RtlCreateTimer that contains the callback and parameter. Therefore, from a single call of CreateTimerQueueTimer we can expect to find two “linked” heap allocations. 
This can be demonstrated via running the Ekko PoC under windbg. We can start by enumerating the process heaps and examining the objects within them:

0:000> !heap -a 
Index   Address  Name      Debugging options enabled 
  1:   177a8c60000  
    Segment at 00000177a8c60000 to 00000177a8d5f000 (0001a000 bytes committed) 
  2:   177a8b20000  
    Segment at 00000177a8b20000 to 00000177a8b30000 (00001000 bytes committed) 
  3:   177a8e90000  
    Segment at 00000177a8e90000 to 00000177a8e9f000 (00007000 bytes committed) 

0:000> !heap -a 177a8c60000 
Index   Address  Name      Debugging options enabled 
  1:   177a8c60000  
    Segment at 00000177a8c60000 to 00000177a8d5f000 (0001a000 bytes committed) 
    Flags:                00000002 
    ForceFlags:           00000000 
    Granularity:          16 bytes 
    Segment Reserve:      00100000 
    Segment Commit:       00002000 
    DeCommit Block Thres: 00000400 
    DeCommit Total Thres: 00001000 
    Total Free Size:      00000210 
    Max. Allocation Size: 00007ffffffdefff 
    Lock Variable at:     00000177a8c602c0 
    Next TagIndex:        0000 
    Maximum TagIndex:     0000 
    Tag Entries:          00000000 
    PsuedoTag Entries:    00000000 
    Virtual Alloc List:   177a8c60110 
    Uncommitted ranges:   177a8c600f0 
            177a8c7a000: 000e5000  (937984 bytes) 
    FreeList[ 00 ] at 00000177a8c60150: 00000177a8c78aa0 . 00000177a8c6f1a0   
        00000177a8c6f190: 00020 . 00020 [100] - free 

    Segment00 at a8c60000: 
        Flags:           00000000 
        Base:            177a8c60000 
        First Entry:     a8c60740 
        Last Entry:      177a8d5f000 
        Total Pages:     000000ff 
        Total UnCommit:  000000e5 
        Largest UnCommit:00000000 
        UnCommitted Ranges: (1) 

    Heap entries for Segment00 in Heap 00000177a8c60000 
                 address: psize . size  flags   state (requested size) 
                 00000177a8c73330: 00170 . 00070 [101] - busy (60)    
                 00000177a8c75f40: 003a0 . 00170 [101] - busy (168)

We are interested in objects allocated on the heap with a size of either 0x60 (first allocation) or 0x168 bytes (second allocation). For brevity, two such objects are shown in the output above, which correspond to two linked heap allocations performed by a single CreateTimerQueueTimer call. 
If we examine the memory at the latter heap entry (0x00000177a8c75f40), we will discover the second heap allocation performed by TpAllocTimer:

0:000> dps 00000177a8c75f40 L20 
00000177`a8c75f40  00000000`00000000 
00000177`a8c75f48  0800bac1`24bf2411 
00000177`a8c75f50  00000000`00000001 
00000177`a8c75f58  00007ffd`9bf8c1e8 ntdll!TppTimerpCleanupGroupMemberVFuncs 
00000177`a8c75f60  00000000`00000000 
00000177`a8c75f68  00000000`00000000 
00000177`a8c75f70  00007ffd`9be78820 ntdll!RtlpTpTimerFinalizationCallback 
00000177`a8c75f78  00000177`a8c75f78 
00000177`a8c75f80  00000177`a8c75f78 
00000177`a8c75f88  00000000`00000000 
00000177`a8c75f90  00000000`00000000 
00000177`a8c75f98  00000000`00000000 
00000177`a8c75fa0  00007ffd`9bee60a0 ntdll!RtlpTpTimerCallback 
00000177`a8c75fa8  00000177`a8c73340 Pointer to first RtlCreateTimer heap allocation 
00000177`a8c75fb0  00000000`00000000 
00000177`a8c75fb8  00000000`00000000 
00000177`a8c75fc0  00000000`00000000 
00000177`a8c75fc8  00000000`00000000 
00000177`a8c75fd0  00000000`00000000 
00000177`a8c75fd8  00000000`00000000 
00000177`a8c75fe0  00000177`a8c74da0 
00000177`a8c75fe8  00000177`a8c76158 
00000177`a8c75ff0  00000177`a8c73268 
00000177`a8c75ff8  00000000`00000002 
00000177`a8c76000  00007ffd`9be79ee0 ntdll!RtlCreateTimer+0x190 
00000177`a8c76008  00000000`00000000 
00000177`a8c76010  00000000`00000001 
00000177`a8c76018  00007ffd`9bf8c138 ntdll!TppTimerpTaskVFuncs 
00000177`a8c76020  00000001`00000000

The crucial point to stress here is the pattern of repeatable/predictable pointers, which starts with ntdll!TppTimerpCleanupGroupMemberVFuncs and ends with ntdll!TppTimerpTaskVFuncs.  
Furthermore, I mentioned previously that a pointer to the first (“linked”) heap allocation was also written to this block of memory. This can be found at 0x00000177a8c75fa8 (so immediately after ntdll!RtlpTpTimerCallback in memory). To confirm this, we can compare it to the output of “!heap -a” above and see that 0x00000177a8c73340 is located within the first heap entry of 0x60 bytes (which starts at 0x00000177a8c73330). 
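The containment check described above can be written down as a small sketch. The 0x10-byte entry header is what the two dumps show on this x64 build (entry 0x...73330 holds user data at 0x...73340), and the type and function names are hypothetical helpers of mine rather than anything from ntdll:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: size of the heap entry header that precedes user data,
   as observed in the x64 dumps above. */
#define HEAP_ENTRY_HEADER_SIZE 0x10u

/* One row of the "!heap -a" entry listing (hypothetical helper type). */
typedef struct {
    uint64_t entry; /* address of the heap entry header */
    uint64_t size;  /* block size in bytes, header included */
} heap_entry_t;

/* Address that the allocation call returned for this entry. */
uint64_t user_pointer(const heap_entry_t *e)
{
    assert(e != NULL);
    return e->entry + HEAP_ENTRY_HEADER_SIZE;
}

/* True when ptr falls inside [entry, entry + size). */
bool entry_contains(const heap_entry_t *e, uint64_t ptr)
{
    assert(e != NULL);
    return ptr >= e->entry && ptr - e->entry < e->size;
}
```

With the values from the listing above, the entry at 0x00000177a8c73330 with size 0x70 contains 0x00000177a8c73340, which is also exactly the user pointer for that entry.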
We can therefore proceed to dump the memory at this location:

0:000> dps 00000177`a8c73340 
00000177`a8c73340  00000177`a8c6ee10 
00000177`a8c73348  00000177`a8c733b0 
00000177`a8c73350  00000000`00000000 
00000177`a8c73358  dddddddd`00000020 
00000177`a8c73360  00007ffd`9bf0d590  ntdll!NtContinue (Callback) 
00000177`a8c73368  00000062`23cfd6e0  (Parameter)  
00000177`a8c73370  dddddddd`00000000 
00000177`a8c73378  00000177`a8c63600 
00000177`a8c73380  00000177`a8c75f50 
00000177`a8c73388  00000000`00000000 
00000177`a8c73390  00000000`00000000 
00000177`a8c73398  00000001`dddddd00 
00000177`a8c733a0  dddddddd`dddddddd 
00000177`a8c733a8  1000bafc`34bf2401 
00000177`a8c733b0  00000177`a8c73340 
00000177`a8c733b8  00000177`a8c76230 

Voila! We have found the callback (which points to ntdll!NtContinue). As NtContinue expects a CONTEXT structure, we can assume the next pointer (0x0000006223cfd6e0) is a valid context structure:

0:000> dt CONTEXT 00000062`23cfd6e0 
   +0x000 P1Home           : 0 
   +0x008 P2Home           : 0 
   +0x010 P3Home           : 0 
   +0x018 P4Home           : 0 
   +0x020 P5Home           : 0 
   +0x028 P6Home           : 0 
   +0x030 ContextFlags     : 0x10000f 
   +0x034 MxCsr            : 0x1f80 
   +0x038 SegCs            : 0x33 
   +0x03a SegDs            : 0x2b 
   +0x03c SegEs            : 0x2b 
   +0x03e SegFs            : 0x53 
   +0x040 SegGs            : 0x2b 
   +0x042 SegSs            : 0x2b 
   +0x044 EFlags           : 0x202 
   +0x048 Dr0              : 0 
   +0x050 Dr1              : 0 
   +0x058 Dr2              : 0 
   +0x060 Dr3              : 0 
   +0x068 Dr6              : 0 
   +0x070 Dr7              : 0 
   +0x078 Rax              : 0x00007ffd`9ba746c0 
   +0x080 Rcx              : 0x00007ff7`0e2c0000 
   +0x088 Rdx              : 0x28000 
   +0x090 Rbx              : 0x00000177`a8c6ee10 
   +0x098 Rsp              : 0x00000062`23fff8c8 
   +0x0a0 Rbp              : 0x00000062`23fffae8 
   +0x0a8 Rsi              : 0x00000062`23b5c000 
   +0x0b0 Rdi              : 0x7ffe0386 
   +0x0b8 R8               : 4 
   +0x0c0 R9               : 0x00000062`23cff514 
   +0x0c8 R10              : 0x00000177`a8c63060 
   +0x0d0 R11              : 0x00000177`a8c63080 
   +0x0d8 R12              : 0 
   +0x0e0 R13              : 0 
   +0x0e8 R14              : 0 
   +0x0f0 R15              : 0x22c 
   +0x0f8 Rip              : 0x00007ffd`9ba6bc70 
0:000> uf 0x00007ffd`9ba6bc70 
00007ffd`9ba6bc70 48ff25d15b0600  jmp     qword ptr [KERNEL32!QuirkIsEnabledForPackage2Worker+0x11578 (00007ffd`9bad1848)]  Branch

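In this case, we can see that the parameter is a CONTEXT structure in which Rip points at VirtualProtect and which sets the 0x28000-byte memory region (the second argument, in Rdx) located at 0x00007ff7`0e2c0000 (the first argument, in Rcx) to PAGE_READWRITE (the third argument, in R8, is 4). In other words, this is the timer in memory that marks the implant as readable and writable before it is encrypted.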
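The manual reading of the CONTEXT structure can be sketched in code. The field offsets below are copied from the "dt CONTEXT" output above, while the type and function names are hypothetical. The sketch only pulls out Rip, Rsp and the four register arguments of the x64 calling convention from a raw copy of the structure:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field offsets taken from the "dt CONTEXT" output above (x64). */
enum {
    CTX_OFF_RCX  = 0x080, /* 1st argument */
    CTX_OFF_RDX  = 0x088, /* 2nd argument */
    CTX_OFF_RSP  = 0x098,
    CTX_OFF_R8   = 0x0b8, /* 3rd argument */
    CTX_OFF_R9   = 0x0c0, /* 4th argument */
    CTX_OFF_RIP  = 0x0f8,
    CTX_MIN_SIZE = 0x100  /* bytes needed to reach the end of Rip */
};

/* The call that NtContinue would resume into (hypothetical helper type). */
typedef struct {
    uint64_t rip;
    uint64_t rsp;
    uint64_t args[4]; /* Rcx, Rdx, R8, R9 */
} context_call_t;

/* Reads a little-endian 64-bit value regardless of host byte order. */
static uint64_t read_u64_le(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Returns 0 on success, -1 if the buffer is too short to hold Rip. */
int decode_context_call(const uint8_t *ctx, size_t len, context_call_t *out)
{
    assert(ctx != NULL && out != NULL);
    if (len < (size_t)CTX_MIN_SIZE)
        return -1;
    out->rip     = read_u64_le(ctx + CTX_OFF_RIP);
    out->rsp     = read_u64_le(ctx + CTX_OFF_RSP);
    out->args[0] = read_u64_le(ctx + CTX_OFF_RCX);
    out->args[1] = read_u64_le(ctx + CTX_OFF_RDX);
    out->args[2] = read_u64_le(ctx + CTX_OFF_R8);
    out->args[3] = read_u64_le(ctx + CTX_OFF_R9);
    return 0;
}
```

For a VirtualProtect call, args[0] is the target address, args[1] the region size and args[2] the new protection, so a value of 4 in args[2] corresponds to PAGE_READWRITE.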


We can now automate the approach taken above by:
  1. Obtaining a handle to every process and locating all heap memory regions via the PEB.
  2. Scanning heap memory for our specific pattern of thread pool pointers (ntdll!TppTimerpCleanupGroupMemberVFuncs, ntdll!RtlpTpTimerFinalizationCallback, ntdll!RtlpTpTimerCallback and ntdll!TppTimerpTaskVFuncs).
  3. Retrieving the initial allocation performed by RtlCreateTimer for each match and reading the target callback and parameter from it.
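Steps 2 and 3 can be sketched as follows, working on a local copy of heap memory viewed as 8-byte slots. This is a minimal illustration, not the released PoC: the slot distances are read off the single dump above and may differ on other Windows builds, the four ntdll addresses are plain inputs that a real scanner would have to resolve itself, and every name here is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The four ntdll addresses that make up the signature (caller supplied). */
typedef struct {
    uint64_t cleanup_vfuncs;  /* ntdll!TppTimerpCleanupGroupMemberVFuncs */
    uint64_t finalization_cb; /* ntdll!RtlpTpTimerFinalizationCallback */
    uint64_t timer_cb;        /* ntdll!RtlpTpTimerCallback */
    uint64_t task_vfuncs;     /* ntdll!TppTimerpTaskVFuncs */
} tp_signature_t;

/* ASSUMPTION: distances measured in the dump above, not documented. */
enum {
    SLOT_FINALIZATION = 3,    /* +0x18 from the first pointer */
    SLOT_TIMER_CB     = 9,    /* +0x48 */
    SLOT_LINKED_ALLOC = 10,   /* +0x50, right after RtlpTpTimerCallback */
    SLOT_TASK_VFUNCS  = 24,   /* +0xc0 */
    OFF_CALLBACK      = 0x20, /* byte offsets in the 0x60 allocation */
    OFF_PARAMETER     = 0x28
};

typedef struct {
    uint64_t callback;
    uint64_t parameter;
} timer_target_t;

/* Step 2: slot index of the first match at or after start, or -1. */
ptrdiff_t find_timer(const uint64_t *slots, size_t count, size_t start,
                     const tp_signature_t *sig)
{
    assert(slots != NULL && sig != NULL);
    for (size_t i = start; i + SLOT_TASK_VFUNCS < count; i++) {
        if (slots[i] == sig->cleanup_vfuncs &&
            slots[i + SLOT_FINALIZATION] == sig->finalization_cb &&
            slots[i + SLOT_TIMER_CB] == sig->timer_cb &&
            slots[i + SLOT_TASK_VFUNCS] == sig->task_vfuncs)
            return (ptrdiff_t)i;
    }
    return -1;
}

/* Address of the linked RtlCreateTimer allocation for a match. */
uint64_t linked_allocation(const uint64_t *slots, ptrdiff_t match)
{
    assert(slots != NULL && match >= 0);
    return slots[match + SLOT_LINKED_ALLOC];
}

/* Step 3: callback and parameter from a copy of the 0x60 allocation. */
timer_target_t read_target(const uint64_t *alloc)
{
    assert(alloc != NULL);
    timer_target_t t = { alloc[OFF_CALLBACK / 8], alloc[OFF_PARAMETER / 8] };
    return t;
}
```

Scanning in 8-byte slots relies on the copied region starting at an 8-byte aligned address in the target. To find every timer in a region, find_timer can be called again with start set to one past the previous match.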


Firstly, I initially explored enumerating heaps via CreateToolhelp32Snapshot. However, I found it quite flaky, as it would often fail to retrieve a snapshot (especially when the target process was slow to break into under a debugger, as Ekko often is). Hence, I decided to manually locate heap memory in the target process. 
Secondly, an obvious race condition exists in that the heap is constantly in flux while we are reading and scanning memory. This could result in failing to spot a malicious timer which is partially scrubbed. A potential fix to this would be to suspend all threads in the process before scanning, but this is obviously more intrusive. In any case, one thing to bear in mind is that a few timers must always be present while the implant is sleeping, as they are needed to fire when the sleep duration ends. Hence, while the beacon may only be visible in memory for a brief period, some timers must always be present for the whole sleep duration and hence our chances of spotting them are considerably higher. 
Thirdly, there may be a better and more reliable/performant way to do this. Another sensible approach would be to tackle the problem from the other side; that is, to trace the TppWorkerThread which will execute the timer callbacks. I did start investigating this and could see, for instance, that the PEB contains a linked list which is referenced by TppWorkerThread (TppWorkerpListLock // TppWorkerpList). However, this felt more complicated from a quick analysis and, having found something that worked, I did not pursue it any further. It may yet provide an easier way of enumerating timer-queue timers. 
Lastly, this technique relies on undocumented functionality, so it may break in future versions of Windows. At the time of publishing, it has been tested on Windows 10 1607 and Windows 10 21H2.


The PoC memory scanner can be found here: 
Below is a demonstration of the results of scanning for timer-queue timers while Ekko is running:

Additionally, in terms of false positives, on a basic build of Windows 10 21H2 there appeared to be only a few common timer-queue timers lurking in memory. These belonged to winlogon, dllhost (WORK_QUEUE::OnThreadSentinelTimer) and taskhost (COSTimerQueueEntry::Completion_). Hence, finding multiple (unique) timer-queue timers in memory already seems highly anomalous. 
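As a rough illustration of that last observation, the sketch below counts the unique timers found in one process and flags the process when the count exceeds a threshold. This is a hypothetical triage rule of mine, not part of the released PoC. Uniqueness is judged on the (callback, parameter) pair, since Ekko-style timers share one callback but carry different parameters, and the threshold is an assumption that would need tuning per environment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One timer recovered by a scan (hypothetical helper type). */
typedef struct {
    uint64_t callback;
    uint64_t parameter;
} found_timer_t;

/* Counts distinct (callback, parameter) pairs. Quadratic, which is fine
   for the handful of timers a process normally holds. */
size_t count_unique_timers(const found_timer_t *timers, size_t count)
{
    assert(timers != NULL || count == 0);
    size_t unique = 0;
    for (size_t i = 0; i < count; i++) {
        bool seen = false;
        for (size_t j = 0; j < i && !seen; j++) {
            seen = timers[j].callback == timers[i].callback &&
                   timers[j].parameter == timers[i].parameter;
        }
        if (!seen)
            unique++;
    }
    return unique;
}

/* HYPOTHETICAL rule: true when the process holds more unique timers
   than the caller-chosen threshold. */
bool has_unusual_timer_count(const found_timer_t *timers, size_t count,
                             size_t threshold)
{
    return count_unique_timers(timers, count) > threshold;
}
```

A rule like this only ranks processes for a closer look; known-good timers such as those listed above would still need to be excluded separately.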
As a final note, the PoC released with this blog makes no effort to remove false positives. As with all detection logic, the real test is how a specific detection approach is tailored to an environment and how common FPs are removed, which is outside the scope of this research.