EthicalHackers
Independent pentester and hacker for companies
EDR evasion through transpilation and virtualization
There was a time when antivirus evasion was easy. There was even a time, around 2015/2016, when it was trivial, and several open-source “silver bullets” existed that could evade defenses almost at will. From reflectively embedding payloads in memory, to shellcode packers, to PE encryption wrappers, the means of achieving stealth were as numerous as they were accessible.
In my experience however, not only has this not been the case for some years, it is also worsening: evasion techniques are scarcer and more technology-dependent. Tooling used to be a cat-and-mouse race where the mouse often had the upper hand, but now the tables have turned. In the rare cases where a somewhat universal evasion technique is found, it usually becomes obsolete within months. This creates a vicious circle: attackers are less likely to share their tradecraft so as not to lose months of work, which means defenders have fewer techniques to optimize against, which means techniques become obsolete even faster, which means attackers are even more likely to keep their tradecraft to themselves, and so on.
It has come to the point that, I feel, as an attacker, you either develop your own tooling (and keep it to yourself), or you use existing tools and commit to the painful, time-consuming task of customizing them until their signatures and event traces are sufficiently distinct (a process you must repeat for every single one of your tools). Either way, the task is complex. The time you must sink into developing your own tools is significant, and I believe this problem is shared by most pentesters and red teams worldwide.
It’s with all that in mind that I stumbled across a blog post from Fox-IT a year ago, called “Red Teaming in the age of EDR: Evasion of Endpoint Detection Through Malware Virtualisation“, written by Boudewijn Meijer and Rick Veldhoven.
In this article, I’ll give a quick breakdown of the current state of detection mechanisms as I understand it, explain how this approach can help us bypass them, and offer a glimpse into my own implementation.
Evolution of antivirus
Historically, antivirus engines were mostly glorified pattern-searching engines. Given enough bytes in common with a previously discovered virus, a file was deemed malicious. There were two characteristics that attackers abused to evade those kinds of engines.
First, this byte-sequence comparison meant that, if I managed to produce a PE with the same functionalities but different byte sequences, the antivirus wouldn’t catch the payload, even though the end result is the exact same. This was done in several ways. Manually editing the source code was, of course, one way of doing it. But the easy way out was to modify the whole payload at once through encoding (e.g., shikata_ga_nai), encryption (e.g., Veil-Evasion), or polymorphism (e.g., well, also shikata_ga_nai, but for its decoding stub).
Secondly, this file-centered paradigm meant that, if we somehow managed to execute our payload without it being an actual file, the antivirus would be completely blind. This was mostly done by creating small launchers that fetched the actual payload through an HTTP request (or any other channel) and launched it reflectively, meaning the bulk of what got executed never touched the disk to begin with. This discovery, tremendously popularized by PowerSploit and Empire, meant that a complete antivirus bypass could be as easy as writing a single PowerShell line. This was the golden age of antivirus evasion.
However, as time went by, techniques to catch both approaches were either invented or improved with new optics to guide their judgment. We can classify those detection methods into two broad categories: pre-execution heuristics and runtime monitoring. Here’s a basic rundown of the main (though not all) detections that those categories encompass:
Pre-execution heuristics
Entropy analysis
Entropy is one of the most effective ways of catching most encryption-based wrappers. Indeed, truly random data should be somewhat rare inside executable files. Instructions are not random, strings are not random, and resource files of most kinds should not be random (compressed data like images or archives being the notable exception here). Thus, higher-than-average entropy is considered a decent indicator that a given file is malicious.
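To make this concrete, here is a minimal sketch (not taken from any product) of the Shannon entropy computation an engine might run per section; values close to 8 bits per byte suggest encrypted or compressed data:

```rust
/// Shannon entropy of a byte buffer, in bits per byte (0.0 to 8.0).
fn shannon_entropy(data: &[u8]) -> f64 {
    let mut counts = [0usize; 256];
    for &b in data {
        counts[b as usize] += 1;
    }
    let len = data.len() as f64;
    counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / len;
            -p * p.log2()
        })
        .sum()
}

fn main() {
    let uniform: Vec<u8> = (0..=255).collect(); // every byte value once -> 8.0 bits/byte
    let constant = vec![0x90u8; 256];           // a single repeated value -> 0.0 bits/byte
    assert!((shannon_entropy(&uniform) - 8.0).abs() < 1e-9);
    assert_eq!(shannon_entropy(&constant), 0.0);
}
```

A scanner would compare the per-section result against a threshold; typical compiled `.text` sections sit well below the maximum.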
Import Address Table Analysis
The IAT contains the external functions (resolved by the loader) the executable might use at runtime. As such, an executable referencing well-known functions often seen in malicious code (such as VirtualAlloc, VirtualProtect, etc.) is another mark that the PE might be malicious.
Pattern matching
Of course, the historical way of catching payloads still exists: section data such as strings or sequences of instructions found in previously identified malware are matched against analyzed files, in order to determine whether or not they are malicious.
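As a toy illustration, a signature scanner of this kind boils down to a byte comparison with optional wildcards (the signature bytes below are made up):

```rust
// A signature is a sequence of expected bytes; `None` acts as a wildcard.
fn matches_at(data: &[u8], offset: usize, sig: &[Option<u8>]) -> bool {
    data.len() >= offset + sig.len()
        && sig
            .iter()
            .enumerate()
            .all(|(i, &s)| s.map_or(true, |b| data[offset + i] == b))
}

/// Returns the offset of the first match, if any.
fn scan(data: &[u8], sig: &[Option<u8>]) -> Option<usize> {
    (0..=data.len().saturating_sub(sig.len())).find(|&off| matches_at(data, off, sig))
}

fn main() {
    // made-up signature: FC 48 ?? E4 (?? is a wildcard byte)
    let sig = [Some(0xFCu8), Some(0x48), None, Some(0xE4)];
    let benign = [0x00u8, 0x01, 0x02, 0x03];
    let flagged = [0x90u8, 0xFC, 0x48, 0x83, 0xE4, 0xF0];
    assert_eq!(scan(&benign, &sig), None);
    assert_eq!(scan(&flagged, &sig), Some(1));
}
```

Real engines use far faster multi-pattern algorithms, but the evasion consequence is the same: change the bytes, and the signature no longer fires.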
Runtime monitoring
Userland API hooking
Userland API hooking intercepts calls to sensitive Windows APIs within user-mode processes. EDRs commonly monitor functions related to memory allocation, code injection, and process creation (e.g., VirtualAlloc, WriteProcessMemory, CreateRemoteThread). By capturing these calls and their arguments, the agent is able to flag suspicious behavior post obfuscation.
Event Tracing for Windows
ETW works on a provider-controller-consumer basis. Components of the operating system, ranging from user-mode applications to the kernel, provide events. Providers can be enabled or disabled through controllers. Security products can start a session with the adequate providers through their controller, use their consumer agent to access events, and take action based on them.
Kernel level callback routines
EDRs often include a kernel-level agent (that is, a driver) that registers callbacks on process and object notifications. Those notifications occur when key objects are created or modified. This allows kernel-level monitoring of process spawns and suspicious handle access, among other things. The monitoring of process spawning is, I believe, what triggered the switch in popular C2s from fork&run to in-agent execution.
Memory scanning
Agents can inspect process memory for indicators such as executable pages without a file backing, regions marked RWX, byte patterns resembling known shellcode, etc. Memory scanning is often triggered when a suspicious event is identified through another sensor.
It’s also important to note that while runtime sensors collect events, unless one specific event is known with certainty to be malicious, events are correlated with each other to classify a process as benign or not. This correlation can be done either through human-written rules or through ML heuristics. Moreover, many of the sensors we just described only raise the payload’s “suspicious” score; only when that score exceeds a particular threshold is the payload actually deemed malicious.
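As a toy illustration of that threshold model (event names and weights entirely made up), individual events only accumulate suspicion, and the verdict falls when the sum crosses a limit:

```rust
// Hypothetical sensor events; weights are illustrative, not from any real EDR.
#[derive(Clone, Copy)]
enum Event {
    UnbackedExecPage,
    RwxRegion,
    RemoteThread,
    FileIo,
}

fn weight(e: Event) -> u32 {
    match e {
        Event::UnbackedExecPage => 40,
        Event::RwxRegion => 30,
        Event::RemoteThread => 50,
        Event::FileIo => 0, // benign events contribute nothing
    }
}

/// The process is flagged only once accumulated suspicion crosses the threshold.
fn is_malicious(events: &[Event], threshold: u32) -> bool {
    events.iter().map(|&e| weight(e)).sum::<u32>() >= threshold
}

fn main() {
    assert!(!is_malicious(&[Event::FileIo, Event::RwxRegion], 100));
    assert!(is_malicious(
        &[Event::UnbackedExecPage, Event::RwxRegion, Event::RemoteThread],
        100
    ));
}
```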
While, as I said, this list of detection mechanisms is not exhaustive, it provides a basic checklist of what we want our evasion tools to bypass.
Rundown of the approach
Before describing the approach itself, let’s draw some parallels with known systems that work in analogous ways.
In Java or .NET, for example, source code is compiled into an intermediate representation (Java bytecode for Java, CIL for .NET), which is then executed in a runtime environment (the Java Virtual Machine for Java, the Common Language Runtime for .NET). This managed runtime has multiple responsibilities, the main one being turning the intermediate language into native code, but also, for example, managing memory through garbage collection, ensuring thread synchronization, etc.
In the approach Boudewijn Meijer and Rick Veldhoven described, instead of turning source code into intermediate representation, an executable is transpiled into an intermediate representation, which is then executed in a runtime environment.
The transpiler’s responsibilities are:
- Transform assembly instructions into encrypted instructions
- Do it in a manner that allows the managed runtime to decrypt instructions one at a time
- Do it in a manner that does not raise entropy
While Boudewijn Meijer and Rick Veldhoven do not do this, in my project I also transpiled other sections of the executable, mainly .rdata (containing read-only data, such as strings) and .data (mainly containing global variables), into encrypted blobs.
The intermediate file is then executed in the runtime environment, which has a few responsibilities:
- Executing the intermediate file
- Ensuring that cleartext incriminating data remains in memory for the shortest possible time
- Confusing event-driven analysis
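The second responsibility can be sketched as a decrypt-dispatch-re-encrypt loop. This is toy code with assumed shapes (4-byte instructions, single-byte XOR), not the real runtime, but it shows the invariant: at most one cleartext instruction exists in memory at a time.

```rust
fn xor_in_place(buf: &mut [u8], key: u8) {
    for b in buf.iter_mut() {
        *b ^= key;
    }
}

// "Executes" a program of encrypted 4-byte toy instructions by summing opcodes.
fn run(encrypted_program: &mut [[u8; 4]], key: u8) -> u32 {
    let mut acc = 0u32;
    for instr in encrypted_program.iter_mut() {
        xor_in_place(instr, key); // decrypt a single instruction
        acc += instr[0] as u32;   // dispatch it (here: just read the opcode)
        xor_in_place(instr, key); // immediately re-encrypt it
    }
    acc
}

fn main() {
    let key = 0xAA;
    let mut program = [[1u8 ^ key, 0, 0, 0], [2u8 ^ key, 0, 0, 0]];
    assert_eq!(run(&mut program, key), 3);
    assert_eq!(program[0][0], 1 ^ key); // the program is encrypted again afterwards
}
```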

Pre-execution evasion
This execution model intrinsically displays several interesting properties.
First, the intermediate files generated (hereafter called pexe files) are, to my knowledge, completely immune to static analysis, since they are encrypted. Entropy analysis is not effective here, as the algorithm used for encryption does not raise entropy (the goal is obfuscation, not confidentiality).
Another way of catching encrypted instructions is to wait for them to either be decrypted in memory or reach a sensor post-decryption (e.g., going through a hooked VirtualAlloc function). Both methods are ineffective here: instructions are decrypted, executed, and re-encrypted one at a time. Unlike standard shellcode execution, there is never a window where a large decrypted payload exists in memory. Detection would require flagging a single instruction at the precise moment it is decrypted.
Just like that, we have pretty good defenses against static pattern matching, memory analysis, and entropy analysis, at least from the pexe’s perspective. The runtime environment itself could also be detected, which we will address later.
Currently the runtime environment has no linker nor loader. Instead, functions are imported reflectively within the payload. As a result, the runtime environment does not import suspicious functions directly, rendering Import Address Table analysis ineffective.
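A common companion to reflective resolution, though not necessarily what this runtime uses, is referencing APIs by a hash of their name, so that neither the IAT nor the string table betrays them. A ROR13-style hash is the classic variant:

```rust
// Rotate-right-and-add hash over the API name's bytes (sketch of the general
// idea; constants and scheme vary between implementations).
fn api_hash(name: &str) -> u32 {
    name.bytes()
        .fold(0u32, |h, b| h.rotate_right(13).wrapping_add(b as u32))
}

fn main() {
    // The binary stores only the hash; at runtime, export tables are walked and
    // each exported name is hashed until a match is found.
    let h = api_hash("VirtualAlloc");
    assert_eq!(h, api_hash("VirtualAlloc")); // deterministic
    assert_ne!(h, api_hash("VirtualFree"));  // distinguishes APIs
}
```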
What about event analysis based on what sensors detect?
While this approach does not directly make your suspicious events disappear, it does provide a means to complicate heuristic and event-tree-based detection. The architecture of this technique allows us to run multiple runtime engines from the same thread: one executing our malicious payload, the other pouring out legitimate events at the same time, or in between suspicious calls.
From the outside, all events appear tied to a single thread. An EDR attempting to reconstruct a timeline will see legitimate API usage surrounding or overlapping with malicious activity, making it harder to separate intent from noise. In practice, this disrupts correlation: the same thread may appear to allocate memory, free it, perform harmless file I/O, then suddenly inject code, but without a clear causal chain, thus obscuring the malicious pattern.
In short, this design does not remove visibility, but it corrupts context. Security tools still see events, yet the interleaving of benign and malicious actions makes it far harder to assemble a conclusive picture.
It will not, however, make individually incriminating events appear legitimate. Drowning the creation of an LSASS handle in legitimate events won’t help, since that alone is usually sufficient proof of malicious activity.
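The interleaving idea can be sketched as two engines stepped alternately from one thread. The event names below are placeholders for what sensors would record, not real API traces:

```rust
struct Engine {
    events: &'static [&'static str],
    pc: usize,
}

impl Engine {
    fn step(&mut self) -> Option<&'static str> {
        let e = self.events.get(self.pc).copied();
        self.pc += 1;
        e
    }
}

// One thread alternates one step of each engine; the resulting per-thread
// timeline mixes benign noise with the payload's calls.
fn interleave(noise: &mut Engine, payload: &mut Engine) -> Vec<&'static str> {
    let mut timeline = Vec::new();
    loop {
        match (noise.step(), payload.step()) {
            (None, None) => break,
            (n, p) => {
                timeline.extend(n); // Option<T> yields 0 or 1 items
                timeline.extend(p);
            }
        }
    }
    timeline
}

fn main() {
    let mut noise = Engine { events: &["ReadFile", "RegQueryValue", "CloseHandle"], pc: 0 };
    let mut payload = Engine { events: &["VirtualAlloc", "WriteMemory", "CreateThread"], pc: 0 };
    assert_eq!(
        interleave(&mut noise, &mut payload),
        ["ReadFile", "VirtualAlloc", "RegQueryValue", "WriteMemory", "CloseHandle", "CreateThread"]
    );
}
```

From a sensor’s point of view, no event in that timeline is removed; only the causal grouping that correlation rules rely on is blurred.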

Transpilation process
The transpilation is actually quite straightforward. A PE is made of several sections, the one we’re most interested in being the .text section, which contains the encoded assembly instructions executed by the CPU. Going from an assembly instruction to a Phantomerie instruction (pinstruction for short) is the main objective of transpilation. This transformation operates as follows:
First, the instruction is decoded using the iced Rust library. From that decoded assembly instruction, we re-encode it into a specific format.
pub struct Instruction {
    pub opcode: u8,
    pub left_operand_type: u8,
    pub right_operand_type: u8,
    // each can be a RegisterOperand, MemoryOperand, ImmediateOperand, or NoneOperand
    pub left_operand: u64,
    pub right_operand: u64,
}
The opcode property represents the operation specified by an instruction. In our runtime environment, which reimplements basic assembly operations, the opcode’s byte value is mapped to the corresponding instruction that should be executed.
Most assembly operations come with one operand, two operands, or none. These are encoded in the left_operand and right_operand fields. As in assembly, they can be either immediate values or indirect values, i.e., values pulled from a register or memory. This is encoded in the left_operand_type and right_operand_type fields. In our case, the registers and memory are virtual and maintained by the runtime environment. We’ll delve later into how virtual memory and virtual registers are implemented by the runtime. For now, all there is to know is that operands are encoded in an 8-byte structure that closely follows the way they are encoded in assembly.
pub struct RegisterOperand {
    /// The index of the register.
    pub name: Registers,
    /// Specifies the chunk of the register to start at (e.g., low byte, high byte, word).
    pub chunk: u8,
    /// The size of the operand in bits (e.g., 8, 16, 32, or 64).
    pub size: u16,
    /// Reserved space to align the struct to 64 bits.
    pub padding: u32,
}

pub struct MemoryOperand {
    /// The effective size pointed to by the operand in bits (e.g., 8, 16, 32, or 64).
    pub size: u8,
    /// The index of the base register. This is the starting address for the calculation.
    pub base: u8,
    /// The index of the register used for scaled indexing.
    pub index: u8,
    /// A multiplier for the index register (valid values are 1, 2, 4, or 8).
    pub scale: u8,
    /// A constant value added to the calculated address.
    pub displacement: i32,
}

pub struct ImmediateOperand {
    pub value: Value,
}
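As an illustration of how a MemoryOperand could resolve against virtual registers, here is a sketch following standard x86 addressing (base + index × scale + displacement); the runtime’s actual resolution code is my assumption, not shown in the article:

```rust
#[derive(Default)]
struct VirtualRegisters {
    regs: [u64; 16], // virtual general-purpose registers held by the runtime
}

struct MemoryOperand {
    size: u8,
    base: u8,
    index: u8,
    scale: u8,
    displacement: i32,
}

/// Computes the virtual address the operand points to: base + index*scale + disp.
fn effective_address(op: &MemoryOperand, vregs: &VirtualRegisters) -> u64 {
    let base = vregs.regs[op.base as usize];
    let scaled = vregs.regs[op.index as usize].wrapping_mul(op.scale as u64);
    base.wrapping_add(scaled)
        .wrapping_add(op.displacement as i64 as u64)
}

fn main() {
    let mut vregs = VirtualRegisters::default();
    vregs.regs[0] = 0x1000; // virtual rax
    vregs.regs[1] = 0x10;   // virtual rcx
    // operand for [rax + rcx*4 - 8], 64-bit access
    let op = MemoryOperand { size: 64, base: 0, index: 1, scale: 4, displacement: -8 };
    assert_eq!(effective_address(&op, &vregs), 0x1038);
}
```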
This data format, which is exactly the format from the original article, allows us to represent most assembly instructions, but not all. For example, one form of the IMUL operation uses three operands, and some operations work differently depending on whether a prefix is present, such as the REP prefix, which indicates that the current instruction has to be repeated until the counter register reaches 0.
For those instructions, we added a “reserved” field, so additional data can be encoded when the standard fields are not sufficient.
pub struct Instruction {
    pub opcode: u8,
    pub left_operand_type: u8,
    pub right_operand_type: u8,
    pub left_operand: u64,
    pub right_operand: u64,
    // for now, only used for IMUL and the REP prefix
    pub reserved: u32,
}
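As a hedged example of how the reserved field might carry a third operand (the exact layout here is an assumption on my part, not taken from the project), here is a toy handler for the three-operand IMUL form:

```rust
struct Instruction {
    opcode: u8,
    left_operand_type: u8,
    right_operand_type: u8,
    left_operand: u64,
    right_operand: u64,
    reserved: u32,
}

// Assumed layout for `imul dst, src, imm`: left/right operands hold virtual
// register indices, and the immediate multiplier is packed into `reserved`.
fn exec_imul3(instr: &Instruction, regs: &mut [i64; 16]) {
    regs[instr.left_operand as usize] =
        regs[instr.right_operand as usize].wrapping_mul(instr.reserved as i64);
}

fn main() {
    let mut regs = [0i64; 16];
    regs[1] = 7; // virtual rcx
    // imul rax, rcx, 8
    let instr = Instruction {
        opcode: 0x01, // made-up opcode value for IMUL in this sketch
        left_operand_type: 0,
        right_operand_type: 0,
        left_operand: 0,  // virtual rax
        right_operand: 1, // virtual rcx
        reserved: 8,
    };
    exec_imul3(&instr, &mut regs);
    assert_eq!(regs[0], 56);
}
```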
Once a pinstruction is encoded, it is encrypted using a XOR-based stream cipher. The aim is not actual confidentiality, but the ability to change the signature of a pexe by simply modifying the bytes used as the key.
use std::num::Wrapping;

use crate::{ encryption_key::{ ENCRYPTION_SEED, LCG_CONSTANT_1, LCG_CONSTANT_2 }, Instruction };

pub struct SimpleStreamCipher {
    state: Wrapping<u32>,
}

impl SimpleStreamCipher {
    pub fn new(seed: u32) -> Self {
        SimpleStreamCipher {
            state: Wrapping(seed),
        }
    }

    pub fn next(&mut self) -> u8 {
        // Simple Linear Congruential Generator (LCG) - not cryptographically secure, but we don't care
        self.state = self.state * Wrapping(LCG_CONSTANT_1) + Wrapping(LCG_CONSTANT_2);
        (self.state.0 & 0xff) as u8
    }

    // Encrypt/decrypt data by XORing it with the generated keystream
    pub fn apply_keystream(&mut self, data: &mut [u8]) {
        for byte in data.iter_mut() {
            *byte ^= self.next();
        }
    }
}

pub fn encrypt_decrypt_instruction(instr: &mut Instruction) {
    let instr_bytes = unsafe {
        std::slice::from_raw_parts_mut(
            instr as *mut Instruction as *mut u8,
            std::mem::size_of::<Instruction>()
        )
    };
    let mut cipher = SimpleStreamCipher::new(ENCRYPTION_SEED);
    cipher.apply_keystream(instr_bytes);
}
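A standalone roundtrip demo of the same cipher, with made-up LCG constants standing in for the real ones in encryption_key.rs: XOR is an involution, so applying the keystream twice from the same seed restores the original bytes, which is what lets the runtime re-encrypt instructions after executing them.

```rust
use std::num::Wrapping;

pub struct SimpleStreamCipher {
    state: Wrapping<u32>,
}

impl SimpleStreamCipher {
    pub fn new(seed: u32) -> Self {
        SimpleStreamCipher { state: Wrapping(seed) }
    }

    pub fn next(&mut self) -> u8 {
        // glibc-style LCG constants, purely illustrative
        self.state = self.state * Wrapping(1103515245) + Wrapping(12345);
        (self.state.0 & 0xff) as u8
    }

    pub fn apply_keystream(&mut self, data: &mut [u8]) {
        for byte in data.iter_mut() {
            *byte ^= self.next();
        }
    }
}

fn main() {
    let mut data = *b"mov rax, 5";
    SimpleStreamCipher::new(0xdead_beef).apply_keystream(&mut data);
    assert_ne!(&data, b"mov rax, 5"); // obfuscated in place
    SimpleStreamCipher::new(0xdead_beef).apply_keystream(&mut data);
    assert_eq!(&data, b"mov rax, 5"); // restored with the same seed
}
```

Changing the seed (or the constants) changes every ciphertext byte, and therefore the pexe’s signature, without touching the tool’s logic.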

Instructions from PEs are not the only thing that gets encoded. Some headers are also stored in the resulting pexe, so the runtime environment knows how to deserialize and/or execute the file.
pub struct PhantomerieHeaders {
    pub entry_point: u64,
    pub phantomerie_headers_size: u8,
    pub instructions_section_size: u64,
    pub instructions_number: u32,
    pub arguments_section_size: u32,
    pub arguments_number: u16,
    pub rdata_size: u32,
}
Finally, we decided to also put other sections (for now, .rdata and .data) inside the transpiled pexe. This is the first real departure from the original implementation described by Boudewijn Meijer and Rick Veldhoven. The goal was to support more than PIC or stringless PEs, even though I don’t know how useful this really is in practice.
The sections are imported as is from the PE, and the only difference is that they are encrypted using the same algorithm as the one used for instructions.

pub struct Phantomexe {
    pub headers: PhantomerieHeaders,
    pub arguments: Vec<Argument>,
    pub instructions: Vec<Instruction>,
    pub rdata: Section,
    pub data: Section,
}