Alongside the disassembled instructions in the dashboard, Pwndbg can also display annotations: text that contains relevant information about the execution of the instruction. For example, on the x86 MOV instruction, we can display the concrete value that gets placed into the destination register. Likewise, we can indicate the results of mathematical operations and memory accesses. The annotation always depends on the exact instruction being annotated, so we handle it on a case-by-case basis.
The main hurdle in providing annotations is determining what each instruction does, identifying the relevant CPU registers and memory that are accessed, and then resolving the concrete values of the operands. We call the process of determining this information "enhancement", as we enhance the information provided natively by GDB.
The Capstone Engine disassembly framework is used to statically determine information about instructions and their operands. Take the x86 instruction sub rax, rdx. Given the raw bytes of the machine instruction, Capstone creates an object whose API exposes, among many other things, the names of the operands and the fact that they are both 8-byte-wide registers. It provides all the information necessary to describe each operand. It also reports the general 'group' that an instruction belongs to, such as whether it's a JUMP-like instruction, a RET, or a CALL. These groups are architecture-agnostic.
However, the Capstone Engine doesn't fill in concrete values that those registers take on. It has no way of knowing the value in rdx, nor can it actually read from memory.
To determine the actual values that the operands take on, and to determine the results of executing an instruction, we use the Unicorn Engine, a CPU emulator framework. The emulator has its own internal CPU register set and memory pages that mirror those of the host process, and it can execute instructions to mutate its internal state. Note that the Unicorn Engine cannot execute syscalls, since it has no knowledge of a kernel.
We can single-step the emulator, that is, tell it to execute the instruction at its program counter. After doing so, we can inspect the state of the emulator by reading its registers and memory. The Unicorn Engine itself doesn't expose information about what each instruction is doing (is it an add, mov, or push? which registers and memory locations does it read from and write to?), which is why we use the Capstone Engine to determine this information statically.
Using what we know about the instruction from the Capstone Engine, such as that it was a sub instruction and that rax was written to, we query the emulator after stepping to determine the results of the instruction.
We also read the program counter from the emulator to resolve jumps, so that we can display the instructions that will actually be executed rather than the instructions that follow consecutively in memory.
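The difference between following memory order and following the emulator's program counter can be illustrated with a small toy model. This is only a sketch: the program table, addresses, and function names below are made up for illustration, and no real Capstone or Unicorn calls are involved.

```python
# Toy program: address -> (size in bytes, text, jump target or None).
# Addresses, sizes, and the layout are made up for illustration.
PROGRAM = {
    0x1000: (2, "xor eax, eax", None),
    0x1002: (2, "jmp 0x1008", 0x1008),
    0x1004: (1, "nop", None),
    0x1005: (1, "nop", None),
    0x1006: (2, "xor ecx, ecx", None),
    0x1008: (3, "mov rsi, rax", None),
}


def linear_walk(start, count):
    """Follow instructions as they are laid out consecutively in memory."""
    shown, address = [], start
    for _ in range(count):
        size, text, _target = PROGRAM[address]
        shown.append(text)
        address += size
    return shown


def emulated_walk(start, count):
    """Follow the program counter an emulator would report after each step."""
    shown, pc = [], start
    for _ in range(count):
        size, text, target = PROGRAM[pc]
        shown.append(text)
        # After stepping, the emulator's PC is the jump target if one was taken.
        pc = target if target is not None else pc + size
    return shown


linear = linear_walk(0x1000, 3)
emulated = emulated_walk(0x1000, 3)
print(linear)    # ends with the nop that the jump skips over
print(emulated)  # ends with the instruction at the jump target
```

Only an unconditional jump is modeled here; a conditional branch would additionally need the flags state, which is exactly the kind of information the emulator supplies.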
Every time the inferior process stops (and when the disasm context section is displayed), we display the next handful of assembly instructions in the dashboard so the user can understand where the process is headed. The exact number is determined by the context-disasm-lines setting.
We enhance the instruction at the current program counter, as well as all the future instructions that are displayed. The end result of enhancement is a list of PwndbgInstruction objects, each encapsulating relevant information about the instruction's execution.
When the process stops, we instantiate the emulator from scratch. We copy all the registers from the host process into the emulator. For performance purposes, we register a handler to the Unicorn Engine to lazily map memory pages from the host to the emulator when they are accessed (a page fault from within the emulator), instead of immediately copying all the memory from the host to the emulator.
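The lazy mapping idea can be sketched in plain Python. This is a stand-in, not the actual Unicorn hook mechanism: the LazyMemory class, the read_host_page callback, and the 4 KiB page size are assumptions made for the example.

```python
PAGE_SIZE = 0x1000  # assumed page size for this sketch


class LazyMemory:
    """Emulator-side memory that copies a page from the host on first access."""

    def __init__(self, read_host_page):
        # Hypothetical callback: page base address -> PAGE_SIZE bytes from the host.
        self._read_host_page = read_host_page
        self.pages = {}

    def _page_for(self, address):
        base = address & ~(PAGE_SIZE - 1)
        if base not in self.pages:
            # The "page fault" path: map the page only now that it is needed.
            self.pages[base] = bytearray(self._read_host_page(base))
        return base, self.pages[base]

    def read(self, address, size):
        out = bytearray()
        while size > 0:
            base, page = self._page_for(address)
            offset = address - base
            chunk = page[offset:offset + size]  # clipped at the page boundary
            out += chunk
            address += len(chunk)
            size -= len(chunk)
        return bytes(out)


host_reads = []


def read_host_page(base):
    """Fake host: every byte of a page is the low byte of its page number."""
    host_reads.append(base)
    return bytes([(base >> 12) & 0xFF]) * PAGE_SIZE


mem = LazyMemory(read_host_page)
data = mem.read(0x400FFE, 4)      # straddles two pages, so two pages get mapped
again = mem.read(0x400010, 8)     # already mapped, so the host is not read again
print([hex(b) for b in host_reads], data)
```

The point of the design is that a handful of emulated instructions usually touch only a few pages, so the cost of copying memory is paid per page actually accessed rather than for the whole address space.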
The enhancement is broken into several steps:
First, we resolve the values of all the operands of the instruction before stepping the emulator. This means we read values from registers and dereference memory depending on the operand type. This gives us the values of operands before the instruction executes.
Then, we step the emulator, executing a single instruction.
We resolve the values of all operands again, giving us the after_value of each operand.
Then, we enhance the "condition" field of PwndbgInstructions, where we determine if the instruction is conditional (conditional branch or conditional mov are common) and if the action is taken.
We then determine the next and target fields of PwndbgInstructions. next is the address that the program counter will take on after using the GDB command nexti, and target indicates the target address of branch/jump/PC-changing instructions.
With all this information determined, we now effectively have a big switch statement, matching on the instruction type, where we set the annotation string value, which is the text that will be printed alongside the instruction in question.
We go through this enhancement process for the instruction at the program counter and then for the ensuing handful of instructions that are shown in the dashboard.
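The ordering of the steps above can be sketched as follows. This is a simplified illustration, not the actual Pwndbg code: the Operand and Insn classes are hypothetical stand-ins for the real PwndbgInstruction, the toy step function only understands register-to-register mov and sub, the condition step is omitted, and the annotation format is only illustrative.

```python
from __future__ import annotations

from dataclasses import dataclass

MASK64 = (1 << 64) - 1


@dataclass
class Operand:
    reg: str
    before_value: int | None = None
    after_value: int | None = None


@dataclass
class Insn:  # hypothetical, much smaller than the real PwndbgInstruction
    address: int
    size: int
    mnemonic: str
    operands: list[Operand]
    next: int | None = None
    annotation: str | None = None


def toy_step(regs, insn):
    """Stand-in for single-stepping the emulator."""
    dst, src = (op.reg for op in insn.operands)
    if insn.mnemonic == "mov":
        regs[dst] = regs[src]
    elif insn.mnemonic == "sub":
        regs[dst] = (regs[dst] - regs[src]) & MASK64
    else:
        raise NotImplementedError(insn.mnemonic)
    regs["rip"] = insn.address + insn.size


def annotate_mov(insn):
    dst, _src = insn.operands
    return f"{dst.reg.upper()} => {dst.after_value:#x}"


def annotate_sub(insn):
    dst, src = insn.operands
    return (f"{dst.reg.upper()} => {dst.after_value:#x} "
            f"({dst.before_value:#x} - {src.before_value:#x})")


# The "big switch statement": one handler per instruction type.
ANNOTATORS = {"mov": annotate_mov, "sub": annotate_sub}


def enhance(insn, regs):
    for op in insn.operands:              # operand values before execution
        op.before_value = regs[op.reg]
    toy_step(regs, insn)                  # execute exactly one instruction
    for op in insn.operands:              # operand values after execution
        op.after_value = regs[op.reg]
    insn.next = regs["rip"]               # simplification: ignores calls and branches
    handler = ANNOTATORS.get(insn.mnemonic)
    if handler is not None:
        insn.annotation = handler(insn)


regs = {"rip": 0x1000, "rax": 10, "rdx": 3, "rsi": 0}
program = [
    Insn(0x1000, 3, "sub", [Operand("rax"), Operand("rdx")]),
    Insn(0x1003, 3, "mov", [Operand("rsi"), Operand("rax")]),
]
for insn in program:
    enhance(insn, regs)
    print(f"{insn.address:#x}  {insn.mnemonic:<4} {insn.annotation}")
```

Because the same register dictionary is carried from one instruction to the next, the second instruction sees the result of the first, which mirrors how future instructions are enhanced against emulated rather than host state.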
When to use emulation / reasoning about process state
In general, the code is organized so that as many features as possible work even without emulation. If information can be determined statically, we try to expose it that way rather than relying on emulation, so that annotations can be displayed even when the Unicorn Engine is disabled. For example, say we come to a stop and are faced with enhancing the following three instructions in the dashboard:
1. lea rax, [rip+0xd55]
2. > mov rsi, rax    # The host process program counter is here
3. mov rax, rsi
Instruction 1, the lea instruction, is already in the past - we pull our enhanced PwndbgInstruction for it from a cache.
Instruction 2, the first mov instruction, is where the host process program counter is. If we did stepi in GDB, this instruction would be executed. In this case, there are two ways we can determine the value that gets written to rsi.
After stepping the emulator, read from the emulator's rsi register.
Given the context of the instruction, we know the value in rsi will come from rax. We can just read the rax register from the host. This avoids emulation.
The decision on which option to take is implemented in the annotation handler for the specific instruction. When possible, we have a preference for the second option, because it makes the annotations work even when emulation is off.
The second option is available in this case because we can reason about the process state at the time this instruction would execute. The instruction is about to be executed (Program PC == instruction.address), so we can safely read rax from the host, knowing that the value we get is the true value it will have when the instruction executes. It must be: there are no instructions in between that could have mutated rax.
However, this is not the case when enhancing instruction 3 while we are paused at instruction 2. That instruction is in the future, and without emulation we cannot safely reason about the operands in question. It reads from rsi, which might be mutated from the value that rsi currently has in the stopped process (and in this case, we happen to know that it will be mutated). We must use emulation to determine the before_value of rsi in this case, and can't just read from the host process's register set. This principle applies in general: future instructions must be emulated to be fully annotated. When emulation is disabled, the annotations are less detailed, since we can't fully reason about process state for future instructions.
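The reasoning above can be condensed into a small decision function. This is a sketch under assumptions: the function name, the register dictionaries, and the concrete addresses (the lea placed at 0x1000) are invented for the example and are not Pwndbg's actual API.

```python
def source_value(insn_address, src_reg, host_regs, emu_regs=None):
    """Return the value a source register will hold when the instruction runs.

    emu_regs, if given, is the emulator state just before this instruction.
    """
    # The instruction is about to execute: the host registers are exact.
    if insn_address == host_regs["rip"]:
        return host_regs[src_reg]
    # A future instruction: only an emulator stepped up to it knows the value.
    if emu_regs is not None:
        return emu_regs[src_reg]
    # A future instruction without emulation: we cannot safely tell.
    return None


# Paused at instruction 2 (mov rsi, rax), after the lea has already executed.
host = {"rip": 0x1007, "rax": 0x1D5C, "rsi": 0x0}
# Emulator state after stepping over instruction 2.
emu = {"rip": 0x100A, "rax": 0x1D5C, "rsi": 0x1D5C}

at_pc = source_value(0x1007, "rax", host)            # instruction 2, no emulator
future_blind = source_value(0x100A, "rsi", host)     # instruction 3, no emulator
future_emu = source_value(0x100A, "rsi", host, emu)  # instruction 3, emulated
print(hex(at_pc), future_blind, hex(future_emu))
```

Reading rsi straight from the host for instruction 3 would have produced 0x0, a plausible-looking but wrong annotation, which is why the sketch returns None instead of guessing.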
It is possible for the emulator to fail to execute an instruction, either due to restrictions in the engine itself or because the instruction segfaults inside the emulator and cannot continue. If the Unicorn Engine fails, there is no real way to recover. When this happens, we simply stop emulating for the current step, and we try again the next time the process stops, when we instantiate the emulator from scratch again.
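A minimal sketch of this "give up until the next stop" behavior follows. The EmulationError exception, the enhance_all function, and the flaky_step callback are hypothetical names used only for illustration.

```python
class EmulationError(Exception):
    """Raised by the (hypothetical) emulator wrapper when a step fails."""


def enhance_all(instructions, emulator_step):
    """Return (instruction, was_emulated) pairs for one stop of the process."""
    emulating = emulator_step is not None
    results = []
    for insn in instructions:
        emulated = False
        if emulating:
            try:
                emulator_step(insn)
                emulated = True
            except EmulationError:
                # No recovery: the remaining instructions of this stop are
                # enhanced without emulation.
                emulating = False
        results.append((insn, emulated))
    return results


def flaky_step(insn):
    # Mirrors the limitation mentioned earlier: syscalls cannot be emulated.
    if insn == "syscall":
        raise EmulationError(insn)


first_stop = enhance_all(["mov", "syscall", "mov"], flaky_step)
# The next stop starts over with a fresh emulator, so emulation works again.
second_stop = enhance_all(["mov", "add"], flaky_step)
print(first_stop)
print(second_stop)
```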
When we are stepping through the emulator, we want to remember the annotations of the past couple of instructions. We don't want to nexti and suddenly have the annotation of the previously executed instruction deleted. At the same time, we never want stale annotations that might result from coming back to a point in the program that we have stepped through before, such as the middle of a loop reached via a breakpoint.
New annotations are only created when the process stops, and we create annotations for the next handful of instructions to be executed. If we continue in GDB and stop at a breakpoint, we don't want annotations to appear behind the PC that are from a previous time we were near the location in question. To avoid stale annotations while still remembering them when stepping, we have a simple caching method:
While we are doing our enhancement, we create a list containing the addresses of the future instructions that are displayed.
For example, say we have the following instructions with the first number being the memory address:
0x555555556259 <main+553> lea rax, [rsp + 0x90]
diff --git a/dev/feed_json_updated.json b/dev/feed_json_updated.json
index c6a0d1638..6aafae31d 100644
--- a/dev/feed_json_updated.json
+++ b/dev/feed_json_updated.json
@@ -1 +1 @@
-{"version": "https://jsonfeed.org/version/1", "title": "pwndbg Blog", "home_page_url": "https://pwndbg.re/pwndbg/latest/", "feed_url": "https://pwndbg.re/pwndbg/latest/feed_json_updated.json", "description": "pwndbg (/pa\u028an\u02c8di\u02ccb\u028c\u0261/) is a GDB plug-in that makes debugging with GDB suck less, with a focus on features needed by low-level software developers, hardware hackers, reverse-engineers and exploit developers.", "icon": "https://pwndbg.re/pwndbg/assets/favicon.ico", "authors": [], "language": "en", "items": [{"id": "https://pwndbg.re/pwndbg/latest/blog/2022/08/21/pwndbg-coding-sprints-report/", "url": "https://pwndbg.re/pwndbg/latest/blog/2022/08/21/pwndbg-coding-sprints-report/", "title": "Pwndbg coding sprints report", "content_html": "Report of the two coding sprints with Pwndbg\n", "image": null, "date_modified": "2025-10-09T22:12:47+00:00", "authors": [{"name": "Disconnect3d"}], "tags": []}]}
\ No newline at end of file
+{"version": "https://jsonfeed.org/version/1", "title": "pwndbg Blog", "home_page_url": "https://pwndbg.re/pwndbg/latest/", "feed_url": "https://pwndbg.re/pwndbg/latest/feed_json_updated.json", "description": "pwndbg (/pa\u028an\u02c8di\u02ccb\u028c\u0261/) is a GDB plug-in that makes debugging with GDB suck less, with a focus on features needed by low-level software developers, hardware hackers, reverse-engineers and exploit developers.", "icon": "https://pwndbg.re/pwndbg/assets/favicon.ico", "authors": [], "language": "en", "items": [{"id": "https://pwndbg.re/pwndbg/latest/blog/2022/08/21/pwndbg-coding-sprints-report/", "url": "https://pwndbg.re/pwndbg/latest/blog/2022/08/21/pwndbg-coding-sprints-report/", "title": "Pwndbg coding sprints report", "content_html": "Report of the two coding sprints with Pwndbg\n", "image": null, "date_modified": "2025-10-11T18:40:07+00:00", "authors": [{"name": "Disconnect3d"}], "tags": []}]}
\ No newline at end of file
diff --git a/dev/feed_rss_created.xml b/dev/feed_rss_created.xml
index c83c6f5aa..e55cdaea5 100644
--- a/dev/feed_rss_created.xml
+++ b/dev/feed_rss_created.xml
@@ -1 +1 @@
-pwndbg Blogpwndbg (/paʊnˈdiˌbʌɡ/) is a GDB plug-in that makes debugging with GDB suck less, with a focus on features needed by low-level software developers, hardware hackers, reverse-engineers and exploit developers.https://pwndbg.re/pwndbg/latest/https://github.com/pwndbg/pwndbg/enThu, 09 Oct 2025 22:14:34 -0000Thu, 09 Oct 2025 22:14:34 -00001440MkDocs RSS plugin - v1.17.3https://pwndbg.re/pwndbg/assets/favicon.icopwndbg Bloghttps://pwndbg.re/pwndbg/latest/ Pwndbg coding sprints reportDisconnect3dReport of the two coding sprints with Pwndbghttps://pwndbg.re/pwndbg/latest/blog/2022/08/21/pwndbg-coding-sprints-report/ Sun, 21 Aug 2022 00:00:00 +0000pwndbg Bloghttps://pwndbg.re/pwndbg/latest/blog/2022/08/21/pwndbg-coding-sprints-report/
\ No newline at end of file
+pwndbg Blogpwndbg (/paʊnˈdiˌbʌɡ/) is a GDB plug-in that makes debugging with GDB suck less, with a focus on features needed by low-level software developers, hardware hackers, reverse-engineers and exploit developers.https://pwndbg.re/pwndbg/latest/https://github.com/pwndbg/pwndbg/enSat, 11 Oct 2025 18:47:15 -0000Sat, 11 Oct 2025 18:47:15 -00001440MkDocs RSS plugin - v1.17.3https://pwndbg.re/pwndbg/assets/favicon.icopwndbg Bloghttps://pwndbg.re/pwndbg/latest/ Pwndbg coding sprints reportDisconnect3dReport of the two coding sprints with Pwndbghttps://pwndbg.re/pwndbg/latest/blog/2022/08/21/pwndbg-coding-sprints-report/ Sun, 21 Aug 2022 00:00:00 +0000pwndbg Bloghttps://pwndbg.re/pwndbg/latest/blog/2022/08/21/pwndbg-coding-sprints-report/
\ No newline at end of file
diff --git a/dev/feed_rss_updated.xml b/dev/feed_rss_updated.xml
index ddbd09caa..7976f0ede 100644
--- a/dev/feed_rss_updated.xml
+++ b/dev/feed_rss_updated.xml
@@ -1 +1 @@
-pwndbg Blogpwndbg (/paʊnˈdiˌbʌɡ/) is a GDB plug-in that makes debugging with GDB suck less, with a focus on features needed by low-level software developers, hardware hackers, reverse-engineers and exploit developers.https://pwndbg.re/pwndbg/latest/https://github.com/pwndbg/pwndbg/enThu, 09 Oct 2025 22:14:34 -0000Thu, 09 Oct 2025 22:14:34 -00001440MkDocs RSS plugin - v1.17.3https://pwndbg.re/pwndbg/assets/favicon.icopwndbg Bloghttps://pwndbg.re/pwndbg/latest/ Pwndbg coding sprints reportDisconnect3dReport of the two coding sprints with Pwndbghttps://pwndbg.re/pwndbg/latest/blog/2022/08/21/pwndbg-coding-sprints-report/ Thu, 09 Oct 2025 22:12:47 +0000pwndbg Bloghttps://pwndbg.re/pwndbg/latest/blog/2022/08/21/pwndbg-coding-sprints-report/
\ No newline at end of file
+pwndbg Blogpwndbg (/paʊnˈdiˌbʌɡ/) is a GDB plug-in that makes debugging with GDB suck less, with a focus on features needed by low-level software developers, hardware hackers, reverse-engineers and exploit developers.https://pwndbg.re/pwndbg/latest/https://github.com/pwndbg/pwndbg/enSat, 11 Oct 2025 18:47:15 -0000Sat, 11 Oct 2025 18:47:15 -00001440MkDocs RSS plugin - v1.17.3https://pwndbg.re/pwndbg/assets/favicon.icopwndbg Bloghttps://pwndbg.re/pwndbg/latest/ Pwndbg coding sprints reportDisconnect3dReport of the two coding sprints with Pwndbghttps://pwndbg.re/pwndbg/latest/blog/2022/08/21/pwndbg-coding-sprints-report/ Sat, 11 Oct 2025 18:40:07 +0000pwndbg Bloghttps://pwndbg.re/pwndbg/latest/blog/2022/08/21/pwndbg-coding-sprints-report/
\ No newline at end of file
diff --git a/dev/objects.inv b/dev/objects.inv
index d4b443ab7..b77efa9de 100644
Binary files a/dev/objects.inv and b/dev/objects.inv differ
diff --git a/dev/reference/pwndbg/aglib/disasm/aarch64/index.html b/dev/reference/pwndbg/aglib/disasm/aarch64/index.html
index 9c84dea64..764ecd377 100644
--- a/dev/reference/pwndbg/aglib/disasm/aarch64/index.html
+++ b/dev/reference/pwndbg/aglib/disasm/aarch64/index.html
@@ -1,4 +1,4 @@
- aarch64 - Documentation
Mapping of Capstone register id to integer value. During enhancement, we might manually determine that an instruction writes some value to a register, and this is stored here.
True under the following conditions:
- If it's an unconditional jump, we know the target of the jump
- If it's a conditional jump, we know the target of the branch and know whether or not we take it
Otherwise, false