* Add gdb_version to mock gdblib
* Re-enable unit tests
* Only collect unit test coverage if --cov is passed
* Source venv before running tests in github action
* Add venv path to PATH in Dockerfile
* Only check for "/ls" in `which` test
* Fix i386-32 syscall name printing
pwndbg-git from AUR shows hexadecimal constants in masm syntax
(e.g. 80h) for some reason (as if the option CS_OPT_SYNTAX_MASM was set).
This commit makes syscall name printing work regardless of hex syntax.
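A minimal sketch of syntax-independent constant parsing, assuming immediates arrive as operand strings such as `0x80` or `80h`. The helper names `parse_immediate` and `is_i386_syscall` are hypothetical and are not taken from the pwndbg source.

```python
def parse_immediate(text: str) -> int:
    """Parse an immediate printed as '0x80', '80h', '0ffh' or plain decimal."""
    text = text.strip().lower()
    if text.startswith("0x"):
        return int(text, 16)
    if text.endswith("h"):
        # MASM-style suffix: the digits before the 'h' are hexadecimal.
        return int(text[:-1], 16)
    return int(text, 10)


# Interrupt vector used for i386 system calls.
I386_SYSCALL_INTERRUPT = 0x80


def is_i386_syscall(mnemonic: str, op_str: str) -> bool:
    """True for `int 0x80` no matter how the disassembler spells the constant."""
    return mnemonic == "int" and parse_immediate(op_str) == I386_SYSCALL_INTERRUPT
```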
* riscv: Fix AssertionError on "jalr ra, ra, 0x252"
When the PC was on this instruction, the pwndbg context would not be
printed due to this AssertionError.
* riscv: Fix AssertionError on "c.jalr a5"
According to the specification, "C.JALR expands to jalr x1, 0(rs1)".
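As a rough illustration of the semantics involved (not the code of this change): `jalr` computes its target from the old value of `rs1` before the link register is written, which matters when `rd` and `rs1` are the same register, as in `jalr ra, ra, 0x252`. The function names below are hypothetical.

```python
from typing import Tuple


def jalr(pc: int, rs1_value: int, imm: int = 0,
         insn_size: int = 4, xlen: int = 64) -> Tuple[int, int]:
    """Return (target, link) for `jalr rd, imm(rs1)`.

    The target uses the value rs1 held *before* the instruction executes,
    with the lowest bit cleared; the link value is the next instruction.
    """
    mask = (1 << xlen) - 1
    target = (rs1_value + imm) & mask & ~1
    link = (pc + insn_size) & mask
    return target, link


def c_jalr(pc: int, rs1_value: int, xlen: int = 64) -> Tuple[int, int]:
    """`c.jalr rs1` expands to `jalr x1, 0(rs1)`; the encoding is 2 bytes."""
    return jalr(pc, rs1_value, imm=0, insn_size=2, xlen=xlen)
```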
* Modify python test scripts to work from nix
* Update utils.py
* address review feedback
---------
Co-authored-by: Disconnect3d <dominik.b.czarnota@gmail.com>
* Only look for readable address in retaddr command
* Rename stack.py to retaddr.py
* Add pwndbg.gdblib.stack.callstack and use it in retaddr
* Add callstack gdb test
* Add QEMU callstack test
* V1 - annotations for values of registers and memory to display result of instructions.
* Emulator telescope(), more x86 instructions
* Emulation change - keep track of before & after instruction execution. Telescope format correctly, read size taken into account
* Add config options to configure emulation and annotations, vmovaps alignment warning, string length in disasm telescope, cache previously annotated instructions
* Create PwndbgInstruction type for better typing and easier future development
* More consistent spacing, options to disable annotations, ADD instruction shows operands
* Rebase from dev
* Correctly go to .next address in disasm view (was incorrectly going to call targets)
* Precompute register str to reduce code duplication
* Correct telescope memory read width, bring target printing back to previous behavior when symbol can be resolved
* More consistent looking annotation spacing/padding, fixed edge case bugs with cached instructions
* Even cleaner padding
* Additional comments and debugging, ironed out last bugs
* debugging tight loops
* Cache fixed - nearpc only annotates what can be statically resolved
* lint and show instructions that cannot dereference
* Reapply btrace crash fix after rebase
* Less code duplication, implement XCHG and AND instructions, moved more methods from x86 subclass to superclass
* aarch64 set flags register in Unicorn correctly
* fix
* Don't recreate emulator regname->const map every time the emulator is instantiated
* Use emulation to set .next within enhancement
* Improve ret instruction target address setting
* Green checkmarks for jumps on all architectures
* Fixed .next and .target setting
* All architectures now have correct logic for determining .next and .target. Green checkmarks for taken conditional branches now appear for all architectures, added logic to determine the type of branch being taken, and simplified printing by replacing symbol_addr with new field 'target_string'
* Instruction debug print fix
* Correct jump instruction checking
* Fixed target_string resolution
* Fix conditional jump check, also make default target resolution better
* target_const determined more accurately
* reverse iteration order of last change
* Pwndbg.condition is better typed and more correct; manual determinations of the condition now override the emulator's (which can be incorrect in some cases). Uncovered why MIPS sometimes takes incorrect jumps in the emulator (delay slot)
* MIPS annotations work really well now. Jumps are correctly predicted (with green checkmarks). Implemented manual condition() function for MIPS. Only highlight the correct instance of the instruction at PC when there are multiple in view (tight loops). Allow manual .condition to override emulator in determining .next.
* Additional debug info on instructions
* Print arch in instruction
* aarch64 branch fix
* aarch64 branch fix (real)
* lint
* Final changes - fixing .size error
* lint
* Add dev_dump_instruction command, add default memory read in resolve_used_value, update comments and remove .size from enhancedoperand as it only exists on x86
* More uniform spacing on annotations
* Various comments converted to docstrings, aarch64 enhancer created, post-rebase
* import aarch64
* Aarch64 mov, ldr, add, sub
* adrp
* ADR
* lint
* Fun with git rebase
* lint
* lint again after re-installing dependencies
* New caching strategy implemented to ensure no state caching when jumping large distances. Handled edge cases of user manually setting a register or memory, 'set regname=2'
* lint
* Fixed two regressions (nearpc shouldn't take jumps, even ones we know statically, and replace all constants in the assembly with symbols). Tweak tests to reflect new annotations
* lint
* one last test
* Fix chain format dereferencing for non-singleton lists, now correctly dereferences and displays chains for future instructions when not emulating (dereference until pointer goes to writable memory)
* Add jumps-only setting to emulation (on, off, jumps-only), fixes to chain dereferencing and enhancing
* Properly dereference memory before and after execution of instructions, adding a new before_value_resolved field (same for after). This also reduces code duplication.
* Debogusify the format()/telescoping dereferencing logic
* lint
* post-rebase fixes
* Fix case that breaks a test - don't attempt to read larger than ptrsize such as in SIMD instruction memory reads
* Typo in emulate setting
* Developer docs for annotations
* Fix case where emulator attempts to read and unpack very large, 16 byte+ wide values while telescoping
* Add a helper command to find valid one_gadget for current context
* Refactor the function for getting section address
* Rename the command to onegadget for more convenient typing
* Make the output format cleaner
* Add a simple cache mechanism for the one_gadget output
* Update the warning message
* Use MD5 instead of BLAKE2 for computing the file hash
I thought that BLAKE2 was faster than MD5, but that doesn't seem to be the case here (probably because of Python's implementation!?)
Here's the script I used for benchmarking:
```python
import hashlib
import timeit


def compute_file_hash_1() -> str:
    h = hashlib.blake2b()
    with open("/lib/x86_64-linux-gnu/libc.so.6", "rb") as f:
        h.update(f.read())
    return h.hexdigest()


def compute_file_hash_2() -> str:
    h = hashlib.md5()
    with open("/lib/x86_64-linux-gnu/libc.so.6", "rb") as f:
        h.update(f.read())
    return h.hexdigest()


print(timeit.timeit(compute_file_hash_1, number=1000))
print(timeit.timeit(compute_file_hash_2, number=1000))
```
I executed the above script on various machines, and the results seem to show that MD5 outperforms BLAKE2 in this scenario. (On my x86 VM running through QEMU on my M1 MacBook, BLAKE2 even takes almost twice as long as MD5.)
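For context, a cache keyed by the file hash could look roughly like the sketch below. The names `file_md5` and `cached_output`, and the one-file-per-hash layout, are assumptions made for illustration and may differ from what the `onegadget` command actually does.

```python
import hashlib
import os
from typing import Callable


def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large libraries are not read in one go."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def cached_output(path: str, cache_dir: str, compute: Callable[[str], str]) -> str:
    """Return compute(path), reusing a cached result while the file is unchanged."""
    os.makedirs(cache_dir, exist_ok=True)
    cache_file = os.path.join(cache_dir, file_md5(path))
    if os.path.exists(cache_file):
        with open(cache_file) as f:
            return f.read()
    output = compute(path)
    with open(cache_file, "w") as f:
        f.write(output)
    return output
```

Because the cache key is the content hash, replacing the library with a different build produces a new key and the expensive tool is run again.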
* Add the tests for `onegadget` command
* Fix lint issue
* Try to cover more code
* Fix lint issue
* Fix illogical tests
* Rename one_gadget to onegadget
* Use `pwndbg.lib.tempfile.cachedir` for `onegadget`
* Call `pwndbg.lib.tempfile.cachedir` only once
* Add support for breaking on UAF
* Small fixes and documentation
* Add a command to enable and disable tracking, better diagnostics
* Add initial support for calloc and realloc
* Better safeguard against matching ld.so malloc
* Small fixes
* Better interface for managing the heap tracker. More terse and information dense diagnostics
* Add warning and fix lints
* Update poetry lock
Hopefully fixes #1947 by fetching stacks only when they are used instead
of doing it on each stop event. It will also first try to compute the stacks
dictionary based on vmmap and fall back to exploring stacks if
vmmap is not present.
* [WIP] Port gdb-tests from bash to python
* Use threads instead of processes
* Port gdb tests to python
* Linting
* Fix coverage "again"
* Remove bash tests
---------
Co-authored-by: intrigus <abc123zeus@live.de>
Previously test scripts would just indiscriminately kill all qemu
processes on the system. This would kill other debug sessions I had
running. These changes make the test scripts record the qemu pids they
run and only kill those.
The old scripts would also not allow you to specify a gdb port, so
if you were already running a debug session with port 1234, the tests
would fail. This update allows you to pass --gdb-port=NNNN to use a
non-default port. You can pass -Q to preserve failing qemu instances.
The scripts now also will show qemu errors to console, and will warn
the user if there is a qemu port conflict.
Also update gdb-pt-dump submodule as it has been updated recently to not
throw an exception when multiple qemu processes are running. The
exception thrown in the event of a failure also changed, so
this has also been updated on the pwndbg side.
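The idea of killing only the processes a test run started can be sketched in Python as follows. `ProcessTracker` is a hypothetical helper written for this illustration; it is not the code of the test scripts.

```python
import subprocess
from typing import List, Sequence


class ProcessTracker:
    """Remember every process we spawn so cleanup never touches anything else."""

    def __init__(self) -> None:
        self._procs: List[subprocess.Popen] = []

    def spawn(self, argv: Sequence[str]) -> subprocess.Popen:
        proc = subprocess.Popen(list(argv))
        self._procs.append(proc)
        return proc

    def pids(self) -> List[int]:
        return [p.pid for p in self._procs]

    def kill_all(self) -> None:
        """Kill only the recorded processes that are still running."""
        for proc in self._procs:
            if proc.poll() is None:
                proc.kill()
            proc.wait()
        self._procs.clear()
```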
* get_one_instruction: clear "cont" cache on mem/reg changed
Fixes #1818.
Note that this makes a substantial change: it changes all caches that
are refreshed on `gdb.ContinueEvent` to also be cleared on memory/regs
changed.
This change is needed so that the `get_one_instruction` function which
uses this cache will get its cache cleared when user invokes a command
that changes memory or registers.
While this may sound like too big a change (we are changing the whole "cont"
cache to be cleared on two additional events), this should not be an
issue. This is because:
1. We should notice it if we start clearing an important cache too often
2. The "cont" cache is currently only used by `get_one_instruction`.
Point 2 also raises a question: when should one use the "cont" vs the "start"
cache? It is not so clear to me right now.
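A simplified, standalone sketch of a cache that is cleared on several events. The event names and the `fire` helper are illustrative stand-ins for GDB's event hooks, and the decorator here only mimics the idea of pwndbg's cache module rather than reproducing it.

```python
from functools import wraps

# Events that clear each named cache. The event names are illustrative
# stand-ins for GDB's continue / memory-changed / register-changed events.
CLEAR_ON = {"cont": {"continue", "memory_changed", "register_changed"}}
_caches = {name: [] for name in CLEAR_ON}


def cache_until(name):
    """Memoize a function until one of the events for `name` fires."""
    def decorator(func):
        cache = {}
        _caches[name].append(cache)

        @wraps(func)
        def wrapper(*args):
            if args not in cache:
                cache[args] = func(*args)
            return cache[args]

        return wrapper
    return decorator


def fire(event):
    """Clear every cache registered for a group that lists `event`."""
    for name, events in CLEAR_ON.items():
        if event in events:
            for cache in _caches[name]:
                cache.clear()


memory = {0x1000: "mov eax, 1"}  # toy stand-in for inferior memory


@cache_until("cont")
def get_one_instruction(address):
    return memory[address]
```

Without the extra events, patching memory at the current address would keep returning the stale instruction until the next continue.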
* Add test for issue #1818
* Clear caches on MemoryChanged events from gdblib.write
Regarding the last part:
Interestingly implementing tests here uncovered another bug: the gdblib.memory.write(..) or rather the gdb.selected_inferior().write_memory(...) API used there does not trigger a gdb.MemoryChanged event. As a result, we never cleared certain caches that should have been cleared when the user used that API.
I have added two tests here, one changes the instructions at $RIP to nops via gdblib.memory.write(..) and another via executing the patch $rip nop;nop;nop;nop;nop command. As a result, we test both scenarios: 1) when we depend on memory changed event being fired via GDB to clear caches; and 2) when we depend on gdblib.memory.write(..) to clear the caches.
This PR also makes a fix to the gdblib.memory.write(..) to actually clear caches that depend on (or rather: are hooked to in order to be cleared) memory changed events.
* Fix glibc-fastbin-bug option of find_fake_fast
Using the find_fake_fast option --glibc-fastbin-bug always resulted in an error, at least on 64-bit platforms.
This was because the option caused only 4 bytes to be read for the size, but then that gets passed to unpack() which expects 8 bytes.
Closes #1773
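The mismatch can be reproduced outside of GDB with a short sketch. The helper names are hypothetical; `int.from_bytes` is shown only as one width-agnostic way to decode the value and is not necessarily how the fix was implemented.

```python
import struct


def unpack_u64(data: bytes) -> int:
    """Fixed-width unpack: raises struct.error unless exactly 8 bytes are given."""
    return struct.unpack("<Q", data)[0]


def unpack_any_width(data: bytes) -> int:
    """Decode a little-endian unsigned integer of whatever width was read."""
    return int.from_bytes(data, "little")
```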
* Address review comment
* Update arch.py
* Update pwndbg/commands/heap.py
* Fix lint
* Update arch.py
* Update arch.py
---------
Co-authored-by: Disconnect3d <dominik.b.czarnota@gmail.com>
This commit adds a command that traverses the linked list beginning at a given
element, dumping its contents and the contents of all the elements that come
after it in the list. Traversal is configurable and can handle multiple types
of chains.
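The traversal itself can be sketched independently of GDB. Here `read_pointer` is a hypothetical callback standing in for a memory read, and the cycle check and `max_count` limit are assumptions about what a configurable traversal might offer.

```python
from typing import Callable, Iterator, Optional


def walk_list(head: int, read_pointer: Callable[[int], int], next_offset: int,
              sentinel: int = 0, max_count: Optional[int] = None) -> Iterator[int]:
    """Yield the address of each element, following the `next` field.

    Stops at the sentinel, when a cycle is detected, or after max_count items.
    """
    seen = set()
    addr = head
    while addr != sentinel and addr not in seen:
        if max_count is not None and len(seen) >= max_count:
            break
        seen.add(addr)
        yield addr
        addr = read_pointer(addr + next_offset)
```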
This commit adds the `break-if-taken` and `break-if-not-taken` commands,
which attach breakpoints to branch instructions that will stop the
inferior if said branch is taken or is not taken, respectively. It adds
an extra class, `pwndbg.gdblib.bpoint.Breakpoint`, which clears caches
before calling `stop()`, allowing for the use of register values inside
that function in breakpoint classes that derive from it. Additionally,
checking of whether the conditions for a branch to be taken have been
fulfilled is done through `DisassemblyAssistant.condition()`.
* Add `stepuntilasm` command
This commit adds a `stepuntilasm` command that, given a mnemonic and,
optionally, a set of operands, will step until an instruction that
matches both is found. Matching is string-based, as the user will likely
want to spell out the asm directive they want as text, and interpreting
assembly language conventions for all of the platforms pwndbg supports
is probably outside the scope of this change.
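String-based matching of this kind might look like the sketch below. Ignoring case and whitespace differences is an assumption made for this example, and `instruction_matches` is a hypothetical name.

```python
from typing import Optional


def _normalize(text: str) -> str:
    """Lowercase and drop all whitespace so 'RAX,  1' equals 'rax, 1'."""
    return "".join(text.lower().split())


def instruction_matches(mnemonic: str, op_str: str,
                        want_mnemonic: str, want_op_str: Optional[str] = None) -> bool:
    """Compare a disassembled instruction against the user's textual request."""
    if _normalize(mnemonic) != _normalize(want_mnemonic):
        return False
    if want_op_str is None:
        return True  # only the mnemonic was requested
    return _normalize(op_str) == _normalize(want_op_str)
```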
* next.py: small code cleanup
* next.py: fix bug introduced in previous commit
op.str -> op_str
* Update next.py
* Update next.py
* Update next.py
---------
Co-authored-by: Disconnect3d <dominik.b.czarnota@gmail.com>
* Only run arch for testing
* Remove outdated arch repo
* Actually build the docker image
* Do not include site packages in sys.path
* Ignore `.relr.dyn` section; skip lines w/o spaces
Newer binaries can contain a `.relr.dyn` section to compress `R_X86_64_RELATIVE` relocation entries.
These binaries can be found on Arch Linux and also on Debian 12, for example.
`readelf` prints the content of the section similarly to this:
```
Relocation section '.relr.dyn' at offset 0x25220 contains 35 entries:
1198 offsets
00000000001ce8d0
00000000001ce8e0
```
Compared to `00000000001d2000 0000000000000025 R_X86_64_IRELATIVE 9f330` for
`.rela.plt`.
Pwndbg now chokes on the new format because it expects a space separator where there is none.
It might be that this is actually an upstream problem with binutils, because llvm-readelf prints this:
```
Relocation section '.relr.dyn' at offset 0x25220 contains 1198 entries:
Offset Info Type Symbol's Value Symbol's Name
00000000001ce8d0 0000000000000008 R_X86_64_RELATIVE
00000000001ce8e0 0000000000000008 R_X86_64_RELATIVE
```
Nevertheless, we aren't actually interested in `R_X86_64_RELATIVE` relocations so I guess it's fine to
just skip all lines that contain no spaces at all.
`.relr.dyn` can only contain `R_X86_64_RELATIVE` relocations as far as I understand
https://maskray.me/blog/2021-10-30-relative-relocations-and-relr
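A rough sketch of a tolerant line parser for this kind of output. `parse_relocation_lines` is a hypothetical helper, and the extra checks beyond "skip lines without spaces" are additions for this example rather than part of the change.

```python
from typing import Iterable, Iterator, Tuple


def parse_relocation_lines(lines: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Yield (offset, relocation type) pairs, skipping lines we cannot use."""
    for line in lines:
        line = line.strip()
        if " " not in line:
            continue  # empty lines and bare `.relr.dyn` offsets
        fields = line.split()
        if len(fields) < 3:
            continue  # e.g. the "1198 offsets" summary line
        try:
            offset = int(fields[0], 16)
        except ValueError:
            continue  # section banners and column headers
        yield offset, fields[2]
```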
* Accept Full RELRO in test
Archlinux has libc and ld with Full RELRO.
We now just accept Partial and Full RELRO.
* Do not copy binaries from host to docker
The `Dockerfile` copies the whole pwndbg folder to the image.
If we have built binaries on the host before, these binaries will contain references to
the host system and will be *copied* to the image.
If we now run `context code` (inside docker) to have a look at the source code this will
fail, because we will try to refer to a path on the host system.
* Do not use loop index after loop
Do not use loop index after the loop. The tests assumed that the loop in line 186
would run at least once, thereby *resetting* `i` to zero. If we never enter the
loop, `i` will *continue* to have the value it had at the end of line 172.
This will cause the test to fail in mysterious ways because `i` is now not reset
to zero but still has the value `31` for example.
The solution is to never use `i` outside of a loop.
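The pitfall is easy to reproduce in isolation. The function below is a contrived illustration, not the test code in question.

```python
def index_after_loops(first_len: int, second_len: int) -> int:
    """Return the loop variable after two loops that share the name `i`."""
    i = 0
    for i in range(first_len):
        pass
    for i in range(second_len):
        pass  # never runs when second_len == 0, so `i` is not reset
    return i
```

With `first_len=32` and `second_len=0` the function returns the stale value 31; as soon as the second loop runs at least once, `i` starts again from 0.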
* Re-enable archlinux and temporarily disabled ones
* Fix coverage combine toml issue
This commit should fix this issue:
```
Run coverage combine
coverage combine
coverage xml
shell: /usr/bin/bash -e {0}
Can't read 'pyproject.toml' without TOML support. Install with [toml] extra
Error: Process completed with exit code 1.
```
* setup.sh: cleanup the --user flag since we use venv now
Cleans up the --user flag from setup.sh since it is unused after we changed setup.sh to install Python dependencies in a virtual environment
* Remove --user flag from CI workflows
* Fix codecov problem
We need to run the python `coverage` library to collect coverage.
However, gdb was failing to find it.
Recently, pwndbg moved to using venvs. When pwndbg is initialized
it sets up the venv "manually", that is, no "source .venv/bin/activate"
is needed. When we run gdb tests, we pass the `gdbinit.py` of pwndbg as a
command to gdb to be executed like this:
`gdb --silent --nx --nh -ex 'py import coverage;coverage.process_startup()' --command PATH_TO_gdbinit.py`
The problem is that *order* matters. This means that *first* coverage
is imported (by `-ex py ...`) and only *then* the init script is executed.
When `coverage` is first imported, its library search path only looks
in system libraries of python, and not the venv that gdbinit.py would load.
So we would try to import an old version of coverage and fail.
One solution would be to move around the commands, but this would be an
ugly hack IMHO. **Instead**, we should just tell gdb that this is an **init**
command that has to be executed before other commands.
Previously, the order did not matter. All of pwndbg's dependencies were
installed directly as system libraries to python. So the library search path
was the same before and after loading `gdbinit.py`.
---------
Co-authored-by: disconnect3d <dominik.b.czarnota@gmail.com>
Co-authored-by: intrigus <abc123zeus@live.de>
* Refactor the `got` command to support more use cases
- Create functions to parse the information about loaded shared object libraries from `info sharedlibrary`
- Make the got command show the entries of other libraries loaded in memory
- Make the got command show more relocation types: in addition to `JUMP_SLOT`, it now also supports `IRELATIVE` and `GLOB_DAT` relocations.
* Update tests for the `got` command
* Update pwndbg/commands/got.py
* Update pwndbg/commands/got.py
* Update pwndbg/commands/got.py
* Update pwndbg/commands/got.py
* Update pwndbg/commands/got.py
* Update pwndbg/commands/got.py
* Update pwndbg/commands/got.py
* Update pwndbg/commands/got.py
* Update the comment
https://github.com/pwndbg/pwndbg/pull/1771#discussion_r1251054080
* Update the tests
* Add some hints for the qemu users
---------
Co-authored-by: Disconnect3d <dominik.b.czarnota@gmail.com>
* Improve RISCV support
This is a resurrection of #829
Co-authored-by: Tobias Faller <faller@endiio.com>
* Silence bogus vermin warning
* Fix relative backwards jump calculations
The target address wouldn't be truncated to the pointer size.
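A small sketch of the truncation, assuming the branch offset may arrive as an unsigned two's-complement value. `branch_target` is a hypothetical name.

```python
def branch_target(pc: int, offset: int, ptr_bits: int) -> int:
    """Add a (possibly wrapped) offset to pc and truncate to the pointer size."""
    return (pc + offset) & ((1 << ptr_bits) - 1)
```

For a backwards jump of 16 bytes from `0x10000` on a 64-bit target, the unsigned offset is `2**64 - 16`; without the mask the sum would exceed 64 bits instead of yielding `0xfff0`.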
* Add basic qemu-user test
* Run qemu-user tests in CI
* Make shfmt happy
* Fix pwntools < 4.11.0 support
* Support RISCV32 for pwntools < 4.11.0 as well
---------
Co-authored-by: Tobias Faller <faller@endiio.com>
* fix: remove minor bugs and complete address translation
* feat: add 5lvl paging
* feat: add address translation tests
* fix: remove unnecessary comments
* fix: add references for magic values
* fix: add X86_FEATURE_LA57 reference
* fix: move x86 specific functions to x86_64Ops
* fix: extend tests and remove faulty code
* fix: only test address translation for lowmem
* fix: adjust arch_ops test to pytest
* fix: add reference for memory models in linux
* fix: do not memoize staticmethods
* Fix and test ctx disasm when disassembly-flavor changes
* New lib/cache.py: make caching great again
This commit fixes bugs with old caching (memoize.py) and makes it more
readable.
See also https://github.com/pwndbg/pwndbg/issues/1453
* Update pwndbg/lib/cache.py
Co-authored-by: Gulshan Singh <gsingh2011@gmail.com>
* lib.cache: address PR comments and add debug mode
* Fix lint
* Remove leftover memoize usages
* Add cache benchmark
* fix lint
---------
Co-authored-by: Gulshan Singh <gsingh2011@gmail.com>
* feature: Add `killthreads` command (closes #1580)
This command allows the user to quickly kill multiple threads by
specifying their ids as arguments to this command. It also supports
the `--all` flag, which will kill every thread except the currently
selected one. This is useful for use with the `checkpoint` command.
The killing is done by calling `pthread_exit(0)`.
* fix: try fixing building test binaries by enabling -lpthread
* fix: remove error message check when calling pthread_exit
Removed the message check, because the error messages differ between
versions of GDB.
* fix: Improve UX of the killthreads command
Add an extended description of the command, some validation on the thread IDs
and suppress GDB output.
* fix: lint
* fix: put the multiline help text in the correct place
* tests: fix test failing due to a race condition when running in parallel to other tests
Replaced asserts with loops which wait for a condition to be met, so that the tests don't fail due to scheduling issues.
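A polling helper of the kind described could be as small as the sketch below. `wait_until` and its default timeout and interval are assumptions for illustration and are not taken from the test suite.

```python
import time
from typing import Callable


def wait_until(predicate: Callable[[], bool],
               timeout: float = 5.0, interval: float = 0.05) -> bool:
    """Poll `predicate` until it is true or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return predicate()  # one last check at the deadline
```

A test would then assert on the return value, for example `assert wait_until(lambda: thread_count() == 3)`, where `thread_count` is whatever the test uses to query the inferior.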
* tests: add more fixes for race conditions in test_killthreads
* fix: lint
* Update pwndbg/commands/killthreads.py
* tests: Wait for exactly three threads
---------
Co-authored-by: Disconnect3d <dominik.b.czarnota@gmail.com>
* Changing the arguments to vis_heap_chunks to be clearer
1. --native to --beyond_top
2. --display_all to --no_truncate
* Add print all chunks to vis_heap_chunks
* Preventing the use of the all_chunks argument together with the count argument in vis_heap_chunks
* Use linting for heap.py
* Fix test_vis_heap_chunks.py
According to cdd71a1d82 --display_all/-d moved to --no_truncate/-n
---------
Co-authored-by: Nerya Zadkani <nerya@tokagroup.com>
This commit adds a fix and tests for #1600 and #752.
* https://github.com/pwndbg/pwndbg/issues/1600
* https://github.com/pwndbg/pwndbg/issues/752
Generally, for an example like this:
```cpp
struct A {
    void foo(int, int) { };
};
int main() {
    A a;
    a.foo(1, 1);
}
```
The output for `info symbol <address of A::foo>` returns:
```
'A::foo(int, int) [clone.isra.0] + 3 in section .text of /root/pwndbg/tests/gdb-tests/tests/binaries/a.out\n'
```
We then used this code to parse this:
```py
# Expected format looks like this:
# main in section .text of /bin/bash
# main + 3 in section .text of /bin/bash
# system + 1 in section .text of /lib/x86_64-linux-gnu/libc.so.6
# No symbol matches system-1.
a, b, c, _ = result.split(maxsplit=3)
if b == "+":
    return "%s+%s" % (a, c)
if b == "in":
    return a
return ""
```
The `result.split(maxsplit=3)` here split the string into:
```py
['A::foo(int,',
 'int)',
 '[clone.isra.0]',
 '+ 3 in section .text of /root/pwndbg/tests/gdb-tests/tests/binaries/a.out\n']
```
And since `b` was not `"+"` or `"in"` we eventually returned an empty
string instead of the `A::foo(int, int)` which would be expected here.
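One way to avoid depending on whitespace inside the symbol name is to split on the fixed ` in section ` marker first. This is a sketch of that idea under the formats listed above, not necessarily the fix that was merged; note that it keeps any `[clone ...]` suffix as part of the name.

```python
import re

# "<name> + <decimal offset>" at the end of the part before " in section ".
_OFFSET_RE = re.compile(r"^(?P<name>.*\S)\s*\+\s*(?P<offset>\d+)$")


def parse_info_symbol(result: str) -> str:
    """Extract 'symbol' or 'symbol+offset' from `info symbol` output."""
    result = result.strip()
    if result.startswith("No symbol matches"):
        return ""
    head, sep, _ = result.partition(" in section ")
    if not sep:
        return ""  # unexpected format
    match = _OFFSET_RE.match(head)
    if match:
        return "%s+%s" % (match.group("name"), match.group("offset"))
    return head
```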
* Fix the bug when using LD_PRELOAD to load libc
The heap heuristics will try to find `libc.so.6` in the output of `info sharedlibrary`, but if we load libc with `LD_PRELOAD`, the filename of the libc might not be `libc.so.6`.
* Add test for `glibc.get_libc_filename_from_info_sharedlibrary`
* Refactor `pwndbg.glibc`
- Add type hints
- Use `info sharedlibrary` to find libc
- Update the regex of libc filename
- Rename `get_data_address()` to `get_data_section_address()`
* Add a function to dump libc ELF file's .data section
* Use the new methods to find `main_arena` and `mp_`
With ELF of libc, we can use the default value of `main_arena` and `mp_` to find their address
* Drop some unreliable methods for the heap heuristics
* Update the tests for the heap heuristics
* Show `main_arena` address in the `arenas` command output
* Make the heap heuristics support statically linked targets
* Drop some deprecated TLS functions and refactor the command
- Drop some deprecated TLS functions for the deprecated heap heuristics
- Don't call `pthread_self()` in the `tls` command without `-p` option
- Show the page of TLS in the `tls` command output
* Update the hint for the heap heuristics for multi-threaded
* Fix the wrong usage of the exception
* Fix the outdated description
* Return the default global_max_fast when we cannot find the address
* Enhance the output of `arena` and `mp`
- Show the address of the arena we print in the output of `arena` command if we didn't specify the address by ourselves.
- Avoid a bug where the `arena` command might raise an error if thread_arena has not been allocated yet.
- Show the address of `mp_` in the output of the `mp` command
* Remove wrong hint
* Support using brute-force to find the address of main_arena
If the user allows, brute-force the left and right sides of the TLS address to find the closest possible value to the TLS address.
* Refactor the code about thread_arena and add the new brute-force strategy
In the .got section, brute-force search for possible TLS-reference values to find possible thread_arena locations
* Add tests for thread_arena and global_max_fast
- Check if we can get default global_max_fast
- Check if we can use brute-force to find thread_arena
* Update the output of `arenas`
* Add the test for the `tls` command
Add two tests for the `tls` command:
```
test_tls_address_and_command[x86-64] PASSED
test_tls_address_and_command[i386] PASSED
```
* Update and refactor the heuristics for `thread_arena` and `tcache`
- We provide an option for users to brute force `tcache` like what we did for `thread_arena`
- Cache `thread_arena` even when we are single-threaded
- Refactor the code for `thread_arena`, to make it work for `tcache` as well
- Update the tests for `tcache`
- Remove some redundant hint
* Fix the wrong cache mechanism
Cache the address of the arena instead of the instance of `Arena`, because `Arena` will cache the value of the field, resulting in getting the old value the next time the same property is used
* Update the description of some configs about heap heuristics
* Handling the case when tcache is NULL
* Handling the case when thread_arena is NULL
* Fix a bug that occurred when the TLS address could not be found
* Fix #1550
* Show tid only if no address is specified
* Update pwndbg/commands/__init__.py
* Update pwndbg/commands/heap.py
* Update pwndbg/commands/heap.py
* Update pwndbg/commands/heap.py
* Update pwndbg/commands/heap.py
* Update pwndbg/commands/heap.py
* Update pwndbg/commands/heap.py
Co-authored-by: Disconnect3d <dominik.b.czarnota@gmail.com>
* Fix lint
* Move some code into `pwndbg.gdblib.elf`
---------
Co-authored-by: Disconnect3d <dominik.b.czarnota@gmail.com>
* Fix plt and gotplt commands
* Add plt gotplt commands tests
* Fix got and plt commands and test them
* Revert accidental change
* Extend system path
* Hopefully fix PATH problems once and for all?
* fix import
* remove redundant part
* Enhance the checks before accessing the memory
- Use `pwndbg.gdblib.memory.peek()` instead of `pwndbg.gdblib.vmmap.find()` to check if the address is valid
- Directly access the memory when searching the `main_arena` in memory and catch the exception
* Make finding `main_arena` in memory more efficient and reliable
We only try the address that is aligned to `pwndbg.gdblib.arch.ptrsize`
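Restricting a scan to pointer-aligned addresses can be sketched like this. `aligned_candidates` is a hypothetical helper, and requiring a full pointer to fit before `end` is an assumption of this example.

```python
def aligned_candidates(start: int, end: int, ptrsize: int) -> range:
    """Addresses in [start, end) that are ptrsize-aligned and fit a full pointer."""
    first = (start + ptrsize - 1) // ptrsize * ptrsize  # round up to alignment
    return range(first, end - ptrsize + 1, ptrsize)
```

On a 64-bit target this cuts the number of probed addresses by a factor of eight compared with trying every byte offset.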
* Avoid unnecessary memory accessing if possible
- Before we used `pwndbg.gdblib.memory.peek()` to check if an address is readable for GDB, we used `pwndbg.gdblib.vmmap.find()` to make sure that this address is in one of the pages, since accessing memory for embedded targets might be slow and expensive
- Create a new function: `is_readable_address` for `pwndbg.gdblib.memory`
* Fix wrong test for `main_arena`
The heap object should be reset before testing the multi-threaded condition
* Add the test to make sure the heap heuristics won't be affected by the vmmap result
Previously, we used `pwndbg.gdblib.vmmap.find()` to check whether the address is valid or not, but this might be a false positive for the address in the `[vsyscall]` page or in the page with a range from 0~0xffffffffffffffff (e.g. qemu-user).
This commit aims to include this scenario during the tests, to make sure the heap heuristics won't be affected by this.
* Use `gdb.MemoryError` instead of `Exception`
* Refactor TLS module
- Replace unreliable `__errno_location()` trick with `pthread_self()` to acquire TLS address
- Consolidate heap heuristics checks about TLS within the `pwndbg.gdblib.tls` module for better organization
* Bug fix for the `errno` command
Calling `__errno_location()` without locking the scheduler can cause another thread to inadvertently continue execution
* Refactor code about heap heuristics of thread-local variables
- Replace some checks with some functions in `pwndbg.gdblib.tls`
- Try to find tcache with `mp_.sbrk_base + 0x10` if the target is single-threaded
* Add tests for heap heuristics with multi-threaded
* Refactor scheduler-locking related functions
- Move these functions into `pwndbg.gdblib.scheduler`
- Fetch the parameter value once (https://github.com/pwndbg/pwndbg/pull/1536#discussion_r1082549746)
* Avoid bug caused by GLIBC_TUNABLES
See https://github.com/pwndbg/pwndbg/pull/1536#discussion_r1083202815
* Add note about `set scheduler-locking on`
* Add comment for `lock_scheduler`
Co-authored-by: Disconnect3d <dominik.b.czarnota@gmail.com>
* Update DEVELOPING.md
Co-authored-by: Disconnect3d <dominik.b.czarnota@gmail.com>
- Update the docs of the config: `kernel-vmmap`, `hexdump-group-use-big-endian`, `kernel_vmmap_via_pt`, and `resolve-heap-via-heuristic`
- Update the output of `get_show_string()` to display: ``See `help set <config>` for more information`` in the end of the output of `show <config>`
- Modify `get_set_string()` to match GDB's builtin behaviour
- Make `gcc-compiler-path`'s and `cymbol-editor`'s `set_show_doc` first strings to lowercase
- Change `gcc-compiler-path` and `cymbol-editor` to `gdb.PARAM_OPTIONAL_FILENAME`
- Add resolve_heap_via_heuristic as a gdb.PARAM_ENUM config with options:
- auto: pwndbg will try to use heuristics if debug symbols are missing
- force: pwndbg will always try to use heuristics, even if debug symbols are available
- never: pwndbg will never use heuristics to resolve the heap
- Move some hints to `resolve_heap_via_heuristic`'s `help_docstring`
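The three options map onto a simple decision, sketched here with a hypothetical `should_use_heuristics` function; the real setting is wired through pwndbg's config system.

```python
def should_use_heuristics(mode: str, has_debug_symbols: bool) -> bool:
    """Decide whether to resolve the heap via heuristics for a given mode."""
    if mode == "force":
        return True  # always use heuristics, even with symbols
    if mode == "never":
        return False
    if mode == "auto":
        return not has_debug_symbols  # only when symbols are missing
    raise ValueError("unknown mode: %r" % mode)
```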
* Fix Arch CI: install missing netcat (#1400)
The Arch Linux test_command_procinfo was failing because netcat was not
installed on the Arch build. This commit fixes it by:
1) installing gnu-netcat for arch linux setup-dev.sh
2) asserting that nc is available in the test itself, to prevent similar
regressions from happening on future/newer images
* Fix Arch CI: the load binary tests (#1400)
Before this commit we asserted whether the loaded binary in the tests reported
finding or not finding debug symbols, but this is irrelevant for the thing
we want to test there, which is Pwndbg loading. What ultimately matters is
whether Pwndbg got loaded and didn't raise an exception.
This commit fixes those tests so they should now work also on ArchLinux
CI and on all CI builds.
Additionally, it removes the `compile_binary` test utility function
which was redundant as we compile all test binaries via a makefile.
* fix lint
* cleanup tests/binaries/div_zero_binary
The cymbol command did not work on old GDB versions like 8.2 because
they require the ADDR argument to be passed into the `add-symbol-file`
command unlike newer GDB versions in which the argument is optional.
This can be seen below.
```
pwndbg> help add-symbol-file
Load symbols from FILE, assuming FILE has been dynamically loaded.
Usage: add-symbol-file FILE ADDR [-readnow | -readnever | -s SECT-NAME SECT-ADDR]...
ADDR is the starting address of the file's text.
Each '-s' argument provides a section name and address, and
should be specified if the data and bss segments are not contiguous
with the text. SECT-NAME is a section name to be loaded at SECT-ADDR.
The '-readnow' option will cause GDB to read the entire symbol file
immediately. This makes the command slower, but may make future operations
faster.
The '-readnever' option will prevent GDB from reading the symbol file's
symbolic debug information.
pwndbg> version
Gdb: 8.1.1
Python: 3.6.9 (default, Jun 29 2022, 11:45:57) [GCC 8.4.0]
Pwndbg: 1.1.1 build: c5d8800
Capstone: 4.0.1024
Unicorn: 2.0.7
```
vs
```
pwndbg> help add-symbol-file
Load symbols from FILE, assuming FILE has been dynamically loaded.
Usage: add-symbol-file FILE [-readnow | -readnever] [-o OFF] [ADDR] [-s SECT-NAME SECT-ADDR]...
ADDR is the starting address of the file's text.
Each '-s' argument provides a section name and address, and
should be specified if the data and bss segments are not contiguous
with the text. SECT-NAME is a section name to be loaded at SECT-ADDR.
OFF is an optional offset which is added to the default load addresses
of all sections for which no other address was specified.
The '-readnow' option will cause GDB to read the entire symbol file
immediately. This makes the command slower, but may make future operations
faster.
The '-readnever' option will prevent GDB from reading the symbol file's
symbolic debug information.
pwndbg> version
Gdb: 12.1
Python: 3.10.6 (main, Nov 2 2022, 18:53:38) [GCC 11.3.0]
Pwndbg: 1.1.1 build: c5d8800
Capstone: 4.0.1024
Unicorn: 2.0.0
pwndbg>
```
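Both usage strings above accept the `FILE ADDR` form, so one way to stay compatible with old and new GDB versions is to always pass an explicit address. The helper below is only a sketch of that idea, not the actual fix: the function name and the default address of 0 are assumptions made for illustration.

```python
def build_add_symbol_file_cmd(path, addr=0):
    """Build an `add-symbol-file` command that always includes ADDR.

    Hypothetical helper: old GDB versions require ADDR and newer ones
    treat it as optional, so passing it unconditionally fits both.
    The default address of 0 is an assumption for illustration only.
    """
    return "add-symbol-file {} {:#x}".format(path, addr)


print(build_add_symbol_file_cmd("/tmp/custom_structs.o"))
```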
When we optimized test runs with GNU parallel execution, we broke the
--pdb flag. This commit fixes it and sets the SERIAL flag so that tests
are run one by one when --pdb is passed.
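The rule described above (a debugger session forces serial execution) can be sketched in Python as follows. This is an illustration only: the real test launcher is a shell script, and the `--serial` option name and `parse_test_args` function are assumptions.

```python
import argparse


def parse_test_args(argv):
    """Parse launcher flags; --pdb implies serial execution.

    Hypothetical sketch: an interactive debugger cannot be shared
    between parallel workers, so --pdb turns serial mode on.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--pdb", action="store_true")
    parser.add_argument("--serial", action="store_true")
    args = parser.parse_args(argv)
    if args.pdb:
        args.serial = True
    return args


print(parse_test_args(["--pdb"]).serial)
```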
* fix shlint
* Fix crash when unable to get ehdr and fix vmmap coredump test
This commit fixes two issues and tests them.
1. It changes the reads in `get_ehdr` to partial reads so that inability
to read the `vmmap.start` address there will not crash Pwndbg with
`gdb.error` but instead we will simply return `None` as expected from
this function. This crash could happen on Debian 10 (GDB 8.2.1) and
Ubuntu 18.04 (not sure which GDB) when you did:
- gdb ./binary-that-crashes
- `run`
- `generate-core-file /tmp/core`
- `file` - to unload the binary
- `core-file /tmp/core` - to load the generated core
At this point I think we may have preserved the old vmmap info and used
it in `get_ehdr`, which then crashed? I am not sure, but this fix
here works.
To test this behavior properly I also added the `unload_file`
parametrization to the
`test_command_vmmap_on_coredump_on_crash_simple_binary` test.
2. We fix the vmmap coredump test case when `info proc mappings` returns nothing on core
dumps on old GDBs. In such a case we are missing the vmmap info about
the binary mapping, so now we properly remove it in the test.
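The partial-read idea from the first fix can be sketched without GDB. Everything here is a stand-in: `read`, `ReadError` and `find_elf_header` are hypothetical names, and memory is modeled as a dict from start address to bytes rather than a real inferior.

```python
ELF_MAGIC = b"\x7fELF"


class ReadError(Exception):
    """Stand-in for the error raised on an unreadable address."""


def read(memory, addr, size, partial=False):
    """Read `size` bytes at `addr` from a dict of start address -> bytes.

    With partial=True, return whatever could be read (possibly b"")
    instead of raising.
    """
    for start, data in memory.items():
        if start <= addr < start + len(data):
            chunk = data[addr - start:addr - start + size]
            if len(chunk) == size or partial:
                return chunk
            break
    if partial:
        return b""
    raise ReadError("cannot read %#x" % addr)


def find_elf_header(memory, addr):
    """Return `addr` if an ELF magic is readable there, else None."""
    if read(memory, addr, len(ELF_MAGIC), partial=True) != ELF_MAGIC:
        return None
    return addr
```

With a partial read, an unreadable start address simply produces `None` from the caller's point of view instead of an exception.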
This fixes the weird error that appeared on debian10 CI:
```
root@98cc3841eab9:/pwndbg/tests/gdb-tests/tests/binaries# ld -Ttext 0x400000 -o memory.out memory.o
ld: section .note.gnu.property LMA [00000000004000e8,0000000000400107] overlaps section .text LMA [0000000000400000,00000000004001a4]
```
It turned out that the .note.gnu.property address was chosen to be the
same as our hardcoded .text address and so we got into this issue.
This PR hardcodes the gnu section address.
* Fix tests reporting in parallel execution
Fix issue where parallel test execution was unable to track failed tests and inform about their number.
* Fix logic in tests.sh
* Add get_sbrk_heap_region() method
* Use SIZE_BITS in Chunk.real_size()
* Add non_contiguous property to Arena class
* Improve Heap class
* More accurate arena detection
* Integrate Heap class into Chunk class
* Don't parse bins when no arena in find_fake_fast
* Add active_heap property to Arena class
* Add more functionality to heap classes
* next_chunk method for Chunk class
* prev property & __str__ method for Heap class
* heaps property for Arena class
* arenas command updated to reflect changes to Arena class
* Use deepcopy() in get_region() to avoid changing vmmap command output
* Import fiddling to deal with unrelated bug
* Attempt at integration with heap commands
With debug syms looks good, still issues to iron out with heuristics
* Remove redundant heap functions
* Remove redundant functions from tests
* Add system_mem property to Arena class
* thread_arena returns main_arena if single thread
* Fix some issues for GDB < 9.x
* GDB < 9.x doesn't have `gdb.lookup_static_symbol`
* GDB < 9.x doesn't have `gdb.PARAM_ZUINTEGER_UNLIMITED`
* Better error handling for the heap commands
* Inform users to `set exception-* on` when they encounter an error while using heap commands
* Bug fix for heap region finding of `HeuristicHeap`
* Before this commit, `get_heap_boundaries()` of `HeuristicHeap` always returned the page named `[heap]`. This does not work for multithreaded cases, nor when the heap region of the main thread is not named `[heap]` (e.g., when using QEMU, the heap region is sometimes named something like `[anon_deadbeaf]`)
* Fallback to `gdb.lookup_symbol` if we do not have `gdb.lookup_static_symbol`
* Add more features for `pwndbg.gdblib.config`
* Support all parameter classes
* Use `get_show_string` to render better output when using `show <param>`
* Show more information when using `help set <param>` and `help show <param>` if we create a config with `help_docstring` parameter.
Some examples of the updates included in this commit:
1. `gdb.PARAM_AUTO_BOOLEAN` with `help_docstring`
In Python script:
```
pwndbg.gdblib.config.add_param(
"test",
None,
"test",
"on == AAAA\noff == BBBB\nauto == CCCC",
gdb.PARAM_AUTO_BOOLEAN,
scope="test",
)
```
In GDB:
```
pwndbg> show test
The current value of 'test' is 'auto'
pwndbg> set test on
Set test to 'on'
pwndbg> set test off
Set test to 'off'
pwndbg> set test auto_with_typo
"on", "off" or "auto" expected.
pwndbg> show test
The current value of 'test' is 'off'
pwndbg> set test auto
Set test to 'auto'
pwndbg> show test
The current value of 'test' is 'auto'
pwndbg> help show test
Show test
on == AAAA
off == BBBB
auto == CCCC
pwndbg> help set test
Set test
on == AAAA
off == BBBB
auto == CCCC
```
2. `gdb.PARAM_ENUM` with `help_docstring`
In Python script:
```
pwndbg.gdblib.config.add_param(
"test",
"A",
"test",
"A == AAAA\nB == BBBB\nC == CCCC",
gdb.PARAM_ENUM,
["A", "B", "C"],
scope="test",
)
```
In GDB:
```
pwndbg> show test
The current value of 'test' is 'A'
pwndbg> set test B
Set test to 'B'
pwndbg> set test C
Set test to 'C'
pwndbg> set test D
Undefined item: "D".
pwndbg> show test
The current value of 'test' is 'C'
pwndbg> help show test
Show test
A == AAAA
B == BBBB
C == CCCC
pwndbg> help set test
Set test
A == AAAA
B == BBBB
C == CCCC
```
* Update the tests for gdblib parameter
* Use auto boolean for `safe-linking`
* Fix some comments
* Pass `help_docstring` directly
* Force callers of `add_param` to use keyword arguments
* Create `add_heap_param()` to avoid setting the scope of param every time
* Add a header to the vmmap table
A simple header has been added to the output of vmmap which helps new users identify the columns.
* fix: lint
* fix: failing test
Adjust the length of expected vmmaps
* fix: tests again
* Fix parameter default values
Before this commit the created gdb.Parameter default values were not set
properly. Now, we set the object's .value field properly with the
provided default value.
* fix issue with set/show docstring
* fix lint
* fix lint
* fix lint
* fix parameter further...
* fix flake8 lint
* Increase CI timeout to 20 minutes
* Fixes: set context-sections '' and add more opts to set empty sections
The `validate_context_sections` function started to receive a string of
`"''"` after the changes in eabab31. Before those changes, it always
received an empty string (`""`).
I am not sure why this behavior changed in that commit, but the current
behavior resembles the native GDB behavior more. We can see this here on
a GDB native parameter:
```
(gdb) set exec-wrapper ''
(gdb) show exec-wrapper
The wrapper for running programs is "''".
```
And so we will keep this native behavior for our config variables for
now. But since this changed, I want to keep the old behavior of
`set context-sections ''` working, and so this commit restores it.
Additionally, we now also allow setting an empty context via multiple
values: an empty string, empty single or double quotes, and
strings like `-` or `none`.
...and this commit comes with tests for this behavior so it will be
harder to reintroduce such issues :)
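A minimal sketch of the accepted "empty" values described above. This is not the actual `validate_context_sections` implementation: the function name `parse_context_sections` and the list of valid section names are assumptions for illustration.

```python
# Values that all mean "no context sections" (per the description above).
EMPTY_VALUES = ("", "''", '""', "-", "none")

# Assumed section names, for illustration only.
VALID_SECTIONS = ("regs", "disasm", "code", "stack", "backtrace")


def parse_context_sections(value):
    """Return the list of requested sections, or [] for an empty setting."""
    value = value.strip()
    if value in EMPTY_VALUES:
        return []
    sections = value.split()
    for section in sections:
        if section not in VALID_SECTIONS:
            raise ValueError("invalid section: %s" % section)
    return sections


print(parse_context_sections("''"))
```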
* added Bin classes from old PR #1063 back
* added Bin classes from pr #1063
* added more properties to Arena class
* integrated Bin classes with the malloc_chunk command
* integrated Bin classes with vis and try_free. passed all heap tests
* very small change
* fixed lint
* fixed lint
* fixed lint..
* finally fixed lint
* Delete .err.txt
Co-authored-by: Gulshan Singh <gsingh2011@gmail.com>
Co-authored-by: Tingfeng Yu <tingfeng.yu@anu.edu.au>
* fix: make mprotect command truly multi-arch
Added register saving based on reg_sets defined for each processor architecture. Additionally, shellcraft is used to generate the arch-specific shellcode.
Unfortunately this command is not currently tested on platforms other than x86_64.
* Update pwndbg/commands/mprotect.py
Co-authored-by: Disconnect3d <dominik.b.czarnota@gmail.com>
* mprotect: Add parsing, alignment to the addr argument
This change makes sure that the addr argument is parsed as a GDB expression (so you can use registers for example) and aligns it to the nearest page boundary.
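The alignment step can be sketched as below. Two assumptions are made here: a 4 KiB page size, and rounding down to the start of the containing page (mprotect requires a page-aligned address). The function name is hypothetical.

```python
PAGE_SIZE = 0x1000  # assumed 4 KiB pages


def page_align_down(addr, page_size=PAGE_SIZE):
    """Round `addr` down to the start of its page (page_size is a power of two)."""
    return addr & ~(page_size - 1)


print(hex(page_align_down(0x401234)))
```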
* mprotect: Clean up register saving, print the result
Cleaned up saving of registers and added printing of the results, as per disconnect's suggestions.
* Simplify the test for mprotect
Simplify the code and remove the useless binary
* Update tests/test_mprotect.py
Co-authored-by: Disconnect3d <dominik.b.czarnota@gmail.com>
* Add reset_on_thread decorator
* Apply reset_on_thread to Heap.multithreaded
* Add multithreaded malloc_chunk tests
* Clarify comment in C source
* Clarify expected thread number with assert in test
* Fix #1256: fixes next cmds hanging on segfaults
Before this commit the next/step commands like `nextret`, `stepret`,
`nextsyscall`, `nextproginstr` etc. would hang if they approached a
segfault. This commit fixes it by checking for ANY signals by executing
GDB's `info prog` command and parsing its output.
* fix lint
* Move symbol.py to gdblib
* Renamed private methods
* Renamed pwndbg.symbol to pwndbg.gdblib.symbol
* Cleanup symbol.py
* Fix lint issues
* Handle tls error on symbol lookup
* Fix merge conflicts
* Remove old way of looking up symbols
This commit adds a test for the context disasm showing file paths of
file descriptors in syscalls like read() or close().
It also fixes a small issue when Pwndbg is run with PWNDBG_DISABLE_COLORS=1.
This issue was that executing:
```
pi '{a:2}'.format(a=pwndbg.color.context.prefix(pwndbg.config.code_prefix))
```
failed when Pwndbg was run with disabled colors. It failed because our
generate color functions in pwndbg/color/* ended up not processing the
input argument -- which here is a Pwndbg config Parameter object -- so
that we got a very non-obvious exception:
```
Exception occurred: context: unsupported format string passed to Parameter.__format__ (<class 'TypeError'>)
```
This issue could hypothetically also exist if our config value were
empty, I think. The fix in this commit, where we apply str(x) to
the color function argument, should fix this issue in all cases.
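The failure mode is easy to reproduce outside of GDB: an object that does not define `__format__` rejects any non-empty format spec. The `Parameter` class and `colorize` function below are hypothetical stand-ins, not Pwndbg's real classes; they only sketch why converting with `str()` first avoids the error.

```python
class Parameter:
    """Hypothetical stand-in for a config parameter object."""

    def __init__(self, value):
        self.value = value

    def __str__(self):
        return str(self.value)


def colorize(x, enabled=True):
    """Always convert the argument to str, even when colors are disabled."""
    text = str(x)
    if not enabled:
        return text
    return "\x1b[31m" + text + "\x1b[0m"


prefix = Parameter(">")

try:
    "{a:2}".format(a=prefix)  # object.__format__ rejects the "2" spec
except TypeError as exc:
    print("TypeError:", exc)

# Works because colorize() returns a plain str in both modes.
print(repr("{a:2}".format(a=colorize(prefix, enabled=False))))
```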
Turns out the mprotect command didn't ever work, as it was amd64 only, but used x86 syscall numbers to call mprotect. I have refactored the command to use shellcraft to generate the shellcode that calls mprotect. I have also unit-tested this command.
* Improve vmmap on coredump files
With this commit we now recognize coredumps better and also finally have
a simple test for vmmap commands on:
- a running binary
- on a loaded coredump file with loaded binary
- on a loaded coredump file without a loaded binary
We also stop saving vmmaps for `maintenance info sections` sections
which have a start address of 0x0. While there could potentially be a
coredump file from a binary with start=0x0, this should work in most
cases.
We could in theory do slightly better: we could take the vmmap at 0 and
try to read memory from it. However, I am not sure if it is a good idea
to try such a memory read?
* remove unused import
* add missing crash_simple.asm
* fix vmmap coredump test on different ubuntu mem layouts
* use /proc/$pid/maps for vmmap tests
* fix formatting
* fix import
* fix test
* fix test
* fix test
* fix lint
* fix test
* fix test
* fix test
* fix test
* fix lint
* another fixup for ubuntu 22.04
* another fixup for ubuntu 22.04
* lint
* Add a regression test for find_fake_fast
The test program creates a fake chunk size field in its .data section
with a set NON_MAIN_ARENA flag. The Python test runs the find_fake_fast
command on an address succeeding the fake chunk. A gdb.MemoryError
indicates regression - issue #1142
* Make linter happy
* fix #1111 errno command edge case
This commit fixes the case where the errno command caused a binary to
segfault when the `__errno_location` symbol was defined but its .plt.got
entry was not filled yet by the dynamic loader (ld.so), so e.g. when the
glibc library was not loaded yet.
In such a case, our call to the `__errno_location` function
triggered a jump to an unmapped address. Now, we dereference that
.plt.got symbol and see if it lives in mapped memory.
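The check described above can be sketched with plain data structures. The names `is_mapped` and `can_call_errno_location` are hypothetical, memory is modeled as a dict from address to pointer value, and pages are `(start, end)` tuples; the real command works on the debugged process.

```python
def is_mapped(addr, pages):
    """True if `addr` falls inside any (start, end) page range."""
    return any(start <= addr < end for start, end in pages)


def can_call_errno_location(got_slot_addr, memory, pages):
    """Dereference the GOT slot and only allow the call if the target is mapped."""
    target = memory.get(got_slot_addr)
    return target is not None and is_mapped(target, pages)


pages = [(0x7F0000, 0x7F1000)]
print(can_call_errno_location(0x601018, {0x601018: 0x7F0100}, pages))
```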
* add tip about errno command
* errno: fix case when __errno_location@got.plt is missing
* fix lint
* fix sh lint
* fix errno test
This should fix things like:
> tests/test_heap.py::test_try_free_invalid_next_size_fast Dwarf Error: DW_FORM_strx1 found in non-DWO CU [in module /pwndbg/tests/binaries/heap_bugs.out]
* Make ZIGPATH configurable and provide defaults
Mostly fixes docker/docker-compose environment where building zig into
$pwd/.zig doesn't work well because it is later overwritten by mounting
the volume in /pwndbg.
With current approach during the docker build zig is put in /opt/zig
instead, and when you run it without docker it's possible to configure a
different path (with sane defaults)
* remove Makefile
* add ZIGPATH to tests.sh for CI
* move ZIGPATH setting before make in tests
* tools: change zig to install from a tarball
Migrate away from snap; we now install from a checksummed tarball
* fix: add sudo
* fix: install zig to .zig in PWD
Co-authored-by: Albert Koczy <albert.koczy@asseco.pl>
* Add Bins classes and refactor allocator methods to return them
* Refactor bins() and related commands
* Refactor malloc_chunk
* Use chunk_size_nomask in top_chunk()
* Refactor vis_heap_chunks
* Rename read_chunk to read_chunk_from_gdb and move to ptmalloc.py
* Add get_first_chunk_in_heap and use it in heap and vis_heap_chunks commands
* Move some methods from DebugSymsHeap to Heap base class
* Strip type hints from heap.py and ptmalloc.py
* Set heap_region before using it
* Fix test_heap_bins test
* Fix try_free
When we moved to argparse command parsing we introduced `gdb_sloppy_parse` which wasn't perfect: e.g. `gdb.parse_and_eval("__libc_start_main")` would return a `gdb.Value()` whose `.type.name` was `long long`.
As a result, when code that used `gdb_sloppy_parse` then cast the result with `int(gdb_value)`, it crashed because for some reason GDB errored.
This commit fixes the related issues by adding the `AddressExpr` and `HexOrAddressExpr` functions.
It also adds tests for some of the windbg compatibility commands and fixes some nifty details here and there.
Those lines are redundant in our case: pwndbg is not imported or launched directly.
Also, the coding lines were relevant in Py2 but are not really needed in Py3.
* Tests launcher: show passed and failed count
* Build nearpc, emulate, u, pdisass test binaries
* Add tests for emulate, nearpc, pdisass, u
* Refactored disasm and emulator
* Fix nearpc following jumps w/o emulation
* Prevent tests from calling start_binary twice
* Add test for emulate_disasm_loop
* Fix isort
* Add nasm to travis install
* Add --eval-command quit to tests invocation
This should prevent travis from staying in gdb/stalled build when something fails in a weird way (like a file is missing)
```
[+] Building 'emulate_disasm.o'
make: nasm: Command not found
make: *** [emulate_disasm.o] Error 127
gdbinit.py: No such file or directory.
pytests_collect.py: No such file or directory.
No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.
Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#Build-times-out-because-no-output-was-received
```
* Add test binaries
* it would be cool to have tests that run within GDB so that we don't have to parse GDB output and deal with weird problems
* we can't run all tests in one GDB session as `file x; entry; <some pwndbg command>; file y; entry; <some pwndbg command>;` may have different results - it seems either us or GDB fails to clean up everything properly
* Add prototype of unit tests for pwndbg
* Add test for pwndbg [filter]
* Fix isort, e2e tests, add pytest requirement
* Add comment about not handling exceptions for unittests
* Fixes after rebase
* Fix test_loads_without_crashing
* e2e tests: no colors & loading pwndbg tests
* Fix isort
* Add example of no file loaded test
* Move tests to unit_tests, add binary, add memory tests
* Isort fixes
* Move from e2e/unit tests to tests
* Add info about tests to DEVELOPING.md
* Fix tests
* review fixes
* commands filtering test: check for contents, not for equality
* Add tests launcher bash script
* Change tests launcher name from unittests to pytests
* Cleanup; better test file paths
* Add theme param to disable colors
* Better test_loads
* Skip some tests locally that can run on travis
* Fix test_loads according to travis
* Fix travis tests
* Make chain.get() check vmmap first in bare metal mode
Make chain.get() limit dereferencing to the known pages in
bare metal mode.
When the MMU is not enabled, all addresses are valid and every
value is a valid physical address. Values would therefore be
dereferenced even when they are not used as addresses and, in
most cases, are actually data. E.g. 0x1 usually means the value 1,
not the address 0x1.
Also, for issue #371, some addresses may be MMIO registers.
Read operations on these addresses would break the state.
It is better to limit the dereferenced address range. This patch
will hopefully fix that as well.
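A minimal sketch of a pointer chain that only dereferences values inside known pages. This is not the real `chain.get()`: the `chain` function, the dict-based memory model and the `(start, end)` page tuples are assumptions for illustration.

```python
def chain(addr, memory, pages, limit=5):
    """Follow pointers from `addr`, stopping at values outside known pages.

    memory: dict of address -> stored pointer value (stand-in for reads)
    pages:  list of (start, end) ranges that are known to be mapped
    """
    result = [addr]
    for _ in range(limit):
        if not any(start <= addr < end for start, end in pages):
            break  # not in a known page: treat as data, never read it
        if addr not in memory:
            break
        addr = memory[addr]
        if addr in result:
            break  # avoid looping on self-referencing pointers
        result.append(addr)
    return result


memory = {0x1000: 0x2000, 0x2000: 0x1, 0x1: 0xDEAD}
pages = [(0x1000, 0x3000)]
print([hex(a) for a in chain(0x1000, memory, pages)])
```

Here the value 0x1 ends the chain: it is outside every known page, so it is shown as data and never read.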
* Add custom vmmap add/del API in vmmap.py
In some cases, e.g. bare metal, the page information cannot be
detected automatically. Also, most pwndbg features, such as
highlighting, rely on page information.
Users may want to create page information manually and maintain it
themselves.
This commit adds Python APIs to manually add/del page information
and they are isolated.
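A sketch of what such a manually maintained page list could look like, kept sorted with `bisect`. The class and method names are hypothetical and do not reflect the actual API in vmmap.py; pages are modeled as `(start, size, name)` tuples.

```python
import bisect


class CustomPages:
    """Manually maintained page list, sorted by start address."""

    def __init__(self):
        self._starts = []  # sorted start addresses, parallel to _pages
        self._pages = []   # (start, size, name) tuples

    def add(self, start, size, name=""):
        """Insert a page while keeping the list sorted."""
        i = bisect.bisect_left(self._starts, start)
        self._starts.insert(i, start)
        self._pages.insert(i, (start, size, name))

    def delete(self, start):
        """Remove the page starting at `start`; return True if found."""
        i = bisect.bisect_left(self._starts, start)
        if i < len(self._starts) and self._starts[i] == start:
            del self._starts[i]
            del self._pages[i]
            return True
        return False

    def find(self, addr):
        """Return the page containing `addr`, or None."""
        i = bisect.bisect_right(self._starts, addr) - 1
        if i >= 0:
            start, size, _name = self._pages[i]
            if addr < start + size:
                return self._pages[i]
        return None


pages = CustomPages()
pages.add(0x20000000, 0x1000, "sram")
pages.add(0x08000000, 0x2000, "flash")
print(pages.find(0x08000010))
```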
* Fix stack page detection in bare metal mode
We can not detect the stack page size in bare metal mode by
1. finding the ELF location after the stack page
2. page fault
A simple workaround is to return the current $sp page
and assume it is the stack page.
* Add vmmap control command to add/del customized vmmap
In some cases, e.g. bare metal, the page information cannot be
detected automatically. Also, most pwndbg features, such as
highlighting, rely on page information.
Users may want to create page information manually and maintain it
themselves.
I added a few commands that let users add/del pages and load page
information from ELF sections.
* Fix the command count in the auto test to pass CI
* Add warning message
* Fix descriptions
* Fix cache issue and use bisect in insert API
* Keep LinuxOnly in find_elf_magic
* remove XXX
Adds a `$rebase(offset)` gdbfunction that can be used to set a breakpoint
at an offset from the program image base.
Also slightly changed the pwndbg banner displayed at startup.
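The arithmetic behind `$rebase(offset)` is simply base plus offset. The standalone function below is a sketch with an explicit `image_base` argument (an assumption for illustration; the gdbfunction itself takes only the offset and looks up the base of the loaded program).

```python
def rebase(offset, image_base):
    """Translate a file-relative offset into a runtime address."""
    return image_base + offset


# Example with an assumed PIE load address.
print(hex(rebase(0x1234, 0x555555554000)))
```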