* Don't catch the error of the heap commands for developers
* Use `pwndbg.config` and re-raise the error
See https://github.com/pwndbg/pwndbg/pull/1270#discussion_r992209956
* Update pwndbg/commands/__init__.py
Co-authored-by: Disconnect3d <dominik.b.czarnota@gmail.com>
Before this commit we passed `pwndbg.gdblib.arch.current` as `arch=...`
keyword argument to pwnlib functions like `asm` and `disasm`.
Since pwnlib has a concept of "context" that holds variables like
currently set architecture or number of bits, this commit starts using
those for the `patch` command implementation as we started to set pwnlib
context recently in 9e84c18c44
* Fix #1256: fix next cmds hanging on segfaults
Before this commit the next/step commands like `nextret`, `stepret`,
`nextsyscall`, `nextproginstr` etc. would hang if they approach a
segfault. This commit fixes it by executing GDB's `info prog` command
and parsing its output to check for ANY signal.
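A minimal sketch of the signal check described above. The helper names and the sample output strings are assumptions for illustration: in Pwndbg the text would come from running `info prog` inside GDB, and the exact wording may differ between GDB versions.

```python
def stopped_on_signal(info_prog_output: str) -> bool:
    """Return True if the `info prog` text reports a stop caused by a signal.

    Hypothetical helper: it only looks for the phrase "stopped at signal",
    which is an assumption about GDB's wording.
    """
    return "stopped at signal" in info_prog_output


# Sample texts are assumed, not captured from a real session.
SEGFAULT = "Program stopped at 0x401000.\nIt stopped at signal SIGSEGV, Segmentation fault.\n"
STEPPED = "Program stopped at 0x401004.\nIt stopped after being stepped.\n"


def step_until(reached_target, stops):
    """Walk over (info_prog_text, pc) stops, bailing out as soon as a signal is seen."""
    for text, pc in stops:
        if stopped_on_signal(text):
            return None  # hand control back to the user instead of stepping forever
        if reached_target(pc):
            return pc
    return None
```

The point of the early return is that a stepping loop which only waits for its own target condition never terminates once the inferior keeps stopping on the same signal.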
* fix lint
This commit allows for setting the selected thread's registers by using
the pwndbg.gdblib.regs.<register-name> = <new-value> expressions. Before
this commit invoking such Python code would set the internal Pwndbg
register value, but not really the inferior value. This could lead to
weird issues when the displayed context shows the new register value but
e.g. `info reg rax` displays the old value.
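A rough sketch of the write-through idea, not the actual Pwndbg code. `FakeInferior` and `Regs` are hypothetical stand-ins: the real module would have to forward the write to GDB (for example with a `set $<register> = <value>` command, which is an assumption about the mechanism).

```python
class FakeInferior:
    """Stand-in for the debugged process; real code would talk to GDB."""

    def __init__(self):
        self.registers = {"rax": 0, "rbx": 0}

    def write_register(self, name, value):
        self.registers[name] = value

    def read_register(self, name):
        return self.registers[name]


class Regs:
    """Attribute-style register access that writes through to the inferior."""

    def __init__(self, inferior):
        # Bypass our own __setattr__ for the internal field.
        object.__setattr__(self, "_inferior", inferior)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, i.e. for registers.
        try:
            return self._inferior.read_register(name)
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        if name not in self._inferior.registers:
            raise AttributeError(f"unknown register: {name}")
        # Update the inferior itself, not a local copy of the value.
        self._inferior.write_register(name, value)
```

With this shape, `regs.rax = 0x1234` changes what the inferior reports, so the displayed context and a later register read agree.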
* Move symbol.py to gdblib
* Renamed private methods
* Renamed pwndbg.symbol to pwndbg.gdblib.symbol
* Cleanup symbol.py
* Fix lint issues
* Handle tls error on symbol lookup
* Fix merge conflicts
* Remove old way of looking up symbols
* Enhance the error handling of the heap command
* Add a new method: `can_be_resolved()` to heap classes to check whether we can resolve the heap after the heap is initialized.
* Add a new function: `get_got_plt_address()` to `pwndbg.glibc`, by doing this, we can determine the location of the symbols without `_IO_list_all` by parsing the output of `info files`.
* Add a new subclass of Exception: `SymbolUnresolvableError` to handle the error when we can't resolve some symbols.
* If the GLIBC version was not set manually, we no longer show an unnecessary Python error; instead, we tell the user how to set it.
* If we didn't have enough information to resolve some symbols, we show which symbol we lack and tell the user how to set it manually instead of just executing it and showing a Python error.
* Avoid getting the wrong heap config
* List the symbols manually instead of using `locals()`
* Avoid the extra function call by `can_be_resolved`
* Enhance the error handling when finding TLS (#1237)
* Enhance the error handling for more cases
* Add support to use `gdb.lookup_static_symbol` in `pwndbg.symbol`
* Enhance the strategy when handling the heap-related symbols
* Use `pwndbg.symbol.static_linkage_symbol_address()` to get the address of the symbol first
e.g. Let's say we have a file called `FILENAME.c`:
```
void *main_arena = 0xdeadbeaf;
int main(){
    free(malloc(0x20));
    return 0;
}
```
If we compiled it with `gcc FILENAME.c -g`, the old heap command will fail because it thinks `main_arena` is this 0xdeadbeaf `main_arena`, not the "real" `main_arena` in GLIBC.
With this commit, it should work without this issue.
* Revert "Enhance the error handling when finding TLS (#1237)"
This reverts commit 7d2d1ae6b6.
* Enhance the error handling when finding TLS (#1237)
* Catch the error when reading the address of the static linkage symbol
* Bug fix for `thread_cache` under heuristic mode
* Bug fix for `static_linkage_symbol_address()`
* If `gdb.lookup_static_symbol(symbol)` returned None, it caused an
error.
* Use new code after refactoring
* Fix #1197: don't display ctx on reg/mem changes
This commit fixes a bug where we displayed context on registers or memory changes made by the user, e.g. when the user executed one of:
```
set *rax=1
set *(int*)0x<some address> = 0x1234
set *(unsigned long long*)$rsp+4=0x44444444
```
It fixes it by just... setting a flag after the context is displayed for
the first time and resetting it on a continue GDB event.
There was a previous attempt to fix this bug in #1226 but it was rather
a hack than a proper fix. This current commit should be a proper fix :P.
Below is some more explanation of this bug.
The fact that we displayed ctx on regs/mem changes was a result of us clearing the cache of the `prompt_hook_on_stop` function:
```python
@pwndbg.lib.memoize.reset_on_stop
def prompt_hook_on_stop(*a):
    pwndbg.commands.context.context()
```
Where this function is called in `prompt_hook`, on each prompt display:
```python
def prompt_hook(*a):
    global cur
    new = (gdb.selected_inferior(), gdb.selected_thread())
    if cur != new:
        pwndbg.gdblib.events.after_reload(start=cur is None)
        cur = new
    if pwndbg.proc.alive and pwndbg.proc.thread_is_stopped:
        prompt_hook_on_stop(*a)
```
So, since we cleared this function cache on each register/memory change, it resulted in us displaying context on each prompt hook.
So how did we clear this function cache? Through the `memoize_on_stop` function:
```
@pwndbg.gdblib.events.stop
@pwndbg.gdblib.events.mem_changed
@pwndbg.gdblib.events.reg_changed
def memoize_on_stop():
    reset_on_stop._reset()
```
But why? We need this to make sure that all of the executed commands, when they read memory or registers, get proper new (not cached) values!
So it makes sense to keep resetting the stop caches on mem/reg changed events. Otherwise, we would use incorrect (old) values if the user set a register/memory and then used some commands like `context` or others that depend on register/memory state.
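The flag-based fix can be sketched roughly as follows. This is an illustrative model, not the real hook code: `ContextPrompt` and its method names are hypothetical, and the counter stands in for actually printing the context.

```python
class ContextPrompt:
    """Toy model: show the context once per stop, regardless of cache resets."""

    def __init__(self):
        self.context_shown = False
        self.displays = 0  # stands in for calls that print the context

    def on_prompt(self, stopped=True):
        """Called before every prompt; displays context only once per stop."""
        if stopped and not self.context_shown:
            self.displays += 1
            self.context_shown = True

    def on_mem_or_reg_changed(self):
        """Stop caches would be reset here; the flag is deliberately left alone."""

    def on_continue(self):
        """The inferior resumes, so the next stop should show the context again."""
        self.context_shown = False
```

Keeping the flag separate from the memoization caches is what lets the caches be reset on every register/memory change without re-triggering the display.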
* lint
Hopefully this will improve the discoverability/UX for users who are not
aware of those options.
This is what the new registers banner looks like:
```
[ REGISTERS / show-flags off / show-compact-regs off ]
```
Fwiw it is 54 chars long (without "---" before and after) so its length
should be fine.
This commit adds a test for the context disasm showing the file paths of
file descriptors in syscalls like read() or close().
It also fixes a small issue when Pwndbg is run with PWNDBG_DISABLE_COLORS=1
This issue was that executing:
```
pi '{a:2}'.format(a=pwndbg.color.context.prefix(pwndbg.config.code_prefix))
```
Failed when Pwndbg was run with disabled colors. It failed because our
generate color functions in pwndbg/color/* ended up not processing the
input argument -- which here is a Pwndbg config Parameter object -- so
that we got a very non-obvious exception:
```
Exception occurred: context: unsupported format string passed to Parameter.__format__ (<class 'TypeError'>)
```
This issue could hypothetically also occur if our config value were
empty, I think. The fix in this commit, where we call str(x) on the
color function argument, should resolve this issue in all cases.
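The failure mode can be reproduced without GDB. In this sketch `Parameter` is a hypothetical stand-in for the config object (it defines `__str__` but no `__format__`), and the two `prefix_*` functions are assumed simplifications of a color function with colors disabled.

```python
class Parameter:
    """Hypothetical config parameter: printable, but with no __format__ support."""

    def __init__(self, value):
        self.value = value

    def __str__(self):
        return str(self.value)


def prefix_without_fix(x):
    # Colors disabled: the argument is passed through untouched.
    return x


def prefix_with_fix(x):
    # Colors disabled, but the argument is converted to str first.
    return str(x)


param = Parameter(">")

try:
    "{a:2}".format(a=prefix_without_fix(param))
    failed = False
except TypeError:
    # object.__format__ rejects any non-empty format spec such as "2".
    failed = True

# A plain str accepts the width spec and is padded to two characters.
padded = "{a:2}".format(a=prefix_with_fix(param))
```

Converting at the boundary of the color function means callers can keep using format specs no matter which object type the config hands back.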
This commit enhances the heap commands UX for statically linked binaries
and removes typeinfo module bloat.
The typeinfo module had this typeinfo.load function that was looking up a given type.
If it didn't find the type, it fell back to compiling many system
headers in order to add a symbol for a given type into the program. This was
supposed to be used for missing glibc malloc symbols like malloc_chunk.
However, this could not work for that exact use case: the struct
malloc_chunk was never defined in a header file and was always defined in
malloc.c or another .c file in the glibc sources.
Another place the typeinfo.load logic of compiling headers was/is used
is the `dt` command, which is a windbg alias for getting struct layout
type information, e.g.:
```
pwndbg> dt 'struct malloc_chunk'
struct malloc_chunk
+0x0000 mchunk_prev_size : size_t
+0x0008 mchunk_size : size_t
+0x0010 fd : struct malloc_chunk *
+0x0018 bk : struct malloc_chunk *
+0x0020 fd_nextsize : struct malloc_chunk *
+0x0028 bk_nextsize : struct malloc_chunk *
pwndbg>
```
However, the whole big issue with typeinfo.load compilation of headers
was that most of the time it didn't work because e.g. some headers
defined in other paths were missing or that two different headers used
the same struct/function name and the compilation failed.
Since this logic almost never gave good results, I am removing it.
Regarding UX for statically linked binaries: we use `info dll` command
to see if a binary is statically linked. While this method is not
robust, as it may give us wrong results if the statically linked binary
used `dlopen(...)`, it is probably good enough.
Now, if a heap-related command is executed on a statically linked binary, it
informs the user and switches to resolving libc heap symbols via
heuristics. It also tells the user that they have to set the glibc
version and re-run the command.
This commit tries to fix the issue of our `set context-clear-screen on`
option resetting the scrollback buffer on some terminals like
gnome-terminal (fwiw it did not happen on terminator or on tmux).
It also adds info to tips about that option.
Turns out the mprotect command didn't ever work, as it was amd64 only, but used x86 syscall numbers to call mprotect. I have refactored the command to use shellcraft to generate the shellcode that calls mprotect. I have also unit-tested this command.
Fixes the `nextproginstr` command and adds two simple tests for it.
The command had the following two issues:
1) It assumed that the program vmmap was always the first vmmap with
proc.exe objfile name -- this assumption has two flaws. First, newer
linkers will create the first memory page for the binary file as
read-only. This is because the ELF header content does not need to be
executable, although it was mapped executable by old linkers or Linux distributions.
As an example, see those vmmap from a simple hello world binary compiled
on Ubuntu 18.04 vs Ubuntu 22.04:
Ubuntu 18.04:
```
pwndbg> vmmap
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
0x555555554000 0x555555555000 r-xp 1000 0 /home/dc/a.out
0x555555754000 0x555555755000 r--p 1000 0 /home/dc/a.out
0x555555755000 0x555555756000 rw-p 1000 1000 /home/dc/a.out
[...]
```
Ubuntu 22.04:
```
pwndbg> vmmap
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
0x555555554000 0x555555555000 r--p 1000 0 /home/user/a.out
0x555555555000 0x555555556000 r-xp 1000 1000 /home/user/a.out
0x555555556000 0x555555557000 r--p 1000 2000 /home/user/a.out
0x555555557000 0x555555558000 r--p 1000 2000 /home/user/a.out
0x555555558000 0x555555559000 rw-p 1000 3000 /home/user/a.out
```
So, before this commit on Ubuntu 22.04 we ended up taking the first
vmmap which was non-executable and we compared the program counter
register against it after each instruction step executed by the
nextproginstr command. As a result, we ended up never getting back to
the user and just finishing the debugged program this way!
Now, after this commit, we grab all of the executable pages (and only those)
for the binary that we debug and compare against them.
2) The second problem was that we printed out the current Pwndbg context
after executing nextproginstr successfully. This does not seem to make
much sense because the context should be printed by the prompt hook.
(Without removing this, we ended up printing the context twice)
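The page selection from issue 1) can be sketched like this. `Page`, `executable_pages` and `in_program_code` are hypothetical names, and the page list simply mirrors the Ubuntu 22.04 listing above.

```python
from dataclasses import dataclass


@dataclass
class Page:
    start: int
    end: int
    perms: str
    objfile: str

    def __contains__(self, addr):
        return self.start <= addr < self.end


def executable_pages(pages, exe):
    """Keep every page of the binary that is mapped executable."""
    return [p for p in pages if p.objfile == exe and "x" in p.perms]


def in_program_code(pc, pages, exe):
    """True if the program counter lies in any executable page of the binary."""
    return any(pc in p for p in executable_pages(pages, exe))


EXE = "/home/user/a.out"
PAGES = [
    Page(0x555555554000, 0x555555555000, "r--p", EXE),
    Page(0x555555555000, 0x555555556000, "r-xp", EXE),
    Page(0x555555556000, 0x555555557000, "r--p", EXE),
    Page(0x555555557000, 0x555555558000, "r--p", EXE),
    Page(0x555555558000, 0x555555559000, "rw-p", EXE),
]
```

Taking only the first page here would select the read-only ELF header mapping, which the program counter can never be inside of.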
* add patch command
This commit adds the `patch`, `patch_list` and `patch_revert` commands
and adds `pwntools==4.8.0` as a Pwndbg dependency.
The current implementation could be further improved by:
- adding tests :)
- maybe moving `patch_list` and `patch_revert` to `patch --list` and
`patch --revert` flags?
- better handling of incorrect args/pwnlib exceptions
* lint
* Improve vmmap on coredump files
With this commit we now recognize coredumps better and also finally have
a simple test for vmmap commands on:
- a running binary
- on a loaded coredump file with loaded binary
- on a loaded coredump file without a loaded binary
We also stop saving vmmaps for `maintenance info sections` sections
which have a start address of 0x0. While there could potentially be a
coredump file from a binary with start=0x0, this should work in most
cases.
We could in theory do slightly better: we could take the vmmap at 0 and
try to read memory from it. However, I am not sure if it is a good idea
to try such memory read?
* remove unused import
* add missing crash_simple.asm
* fix vmmap coredump test on different ubuntu mem layouts
* use /proc/$pid/maps for vmmap tests
* fix formatting
* fix import
* fix test
* fix test
* fix test
* fix lint
* fix test
* fix test
* fix test
* fix test
* fix lint
* another fixup for ubuntu 22.04
* another fixup for ubuntu 22.04
* lint
Call fetch_lazy() on the gdb.Value acquired in get_heap() and wrap it in
a try/except block. Return None if gdb.MemoryError is raised.
Let get_arena_for_chunk() handle None returned by get_heap().
Fixes #1142
* fix #1111 errno command edge case
This commit fixes the case where the errno command caused a binary to
segfault when the `__errno_location` symbol was defined but its .plt.got
entry was not filled yet by the dynamic loader (ld.so), so e.g. when the
glibc library was not loaded yet.
In such a case, us triggering a call to `__errno_location` function
triggered a jump to an unmapped address. Now, we dereference that
.plt.got symbol and see if it lives in mapped memory.
* add tip about errno command
* errno: fix case when __errno_location@got.plt is missing
* fix lint
* fix sh lint
* fix errno test
* Don't exclude pwndbg/lib in .gitignore
* Move which.py to lib/which.py
* move funcparser.py and functions.py to lib/
* moved version.py to lib/
* Move tips.py to lib/
* Update pwndbg/lib/version.py
Co-authored-by: Disconnect3d <dominik.b.czarnota@gmail.com>
* Fix some bugs of the aarch64 heuristic and a bug about tcache
* The order of the aarch64 assembly instructions can differ slightly, so I made the matching more general. Some bugs of the older version can be reproduced with the libc here (https://github.com/perfectblue/ctf-writeups/tree/master/2019/insomnihack-teaser-2019/nyanc/challenge)
* If we didn't find the correct tcache symbol address via heuristic, we will now use our fallback strategies for it.
* Refactor the code in a cleaner way
See https://github.com/pwndbg/pwndbg/pull/1029#discussion_r945934337
* Update the fallback solution of finding `main_arena`
* Since the arenas form a circular linked list, we can iterate over it to check whether the address we guessed is `main_arena` (https://github.com/pwndbg/pwndbg/pull/1029#discussion_r945335543)
* Update the boundaries of the address we might guess to avoid some unneeded tests
* Remove guard code for `mp_` before we test the fallback solution
Fix https://github.com/pwndbg/pwndbg/pull/1029#discussion_r945338469
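The circular-list check for a guessed `main_arena` address mentioned above could look roughly like this. It is a simplified sketch: `looks_like_main_arena`, `read_next` and the `max_arenas` bound are hypothetical, and the walk only confirms that the candidate sits on a closed ring of `next` pointers; the real heuristic presumably applies more constraints.

```python
def looks_like_main_arena(candidate, read_next, max_arenas=64):
    """Follow `next` pointers from `candidate`.

    Accept the candidate only if the walk returns to it within `max_arenas`
    hops. `read_next(addr)` must return the next arena address, or None when
    the memory cannot be read.
    """
    addr = candidate
    for _ in range(max_arenas):
        addr = read_next(addr)
        if addr is None:
            return False  # unreadable memory: not a valid arena chain
        if addr == candidate:
            return True  # the ring closed on the candidate
    return False  # too long or never closes: treat as a false positive


# Fake memory: three arenas in a ring, plus a lone self-linked arena.
NEXT_PTRS = {0x1000: 0x2000, 0x2000: 0x3000, 0x3000: 0x1000, 0x5000: 0x5000}
```

Bounding the walk matters because a wrong guess may point into unrelated data that forms a long or non-terminating chain.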
* Refactor TLS features and fix a bug about fsbase/gsbase
* Move TLS features into an external module, and now the user can use the `tls` command to get its address (https://github.com/pwndbg/pwndbg/pull/1029#discussion_r945336737)
* Avoid `ValueError: Bad register` when trying to access fsbase/gsbase if the current arch is i386
* Fix a bug about TLS finding for i386: `__errno_location` is not always in `libc.so.6`, sometimes it is in `libpthread-*.so` instead
* Update the comments about finding tcache
* Use `exit` event to avoid unnecessary reset
* Add a parameter for GLIBC version
* Update some strategies of heuristic
* Try to resolve heap via debug symbols even when using the heuristic
(By doing this, the binary compiled with `--static` flag can work with the heuristics by setting the GLIBC version manually)
* Try to avoid false positives when finding the symbol address and TLS base via heuristic
* Refactor some useless code
* Update the descriptions of the heap config
* Update the tips for the heap heuristics features
* Raise an error when the user sets the GLIBC version in the wrong format
* Use `reset_on_start` with `glibc._get_version`
See https://github.com/pwndbg/pwndbg/pull/1075#discussion_r957899458
* Remove some unnecessary information in the hint message
See https://github.com/pwndbg/pwndbg/pull/1075#discussion_r957900468
* Use black to fix the format
* Fix indent
* Use black to fix the format
* improve start and entry commands description
Now, those commands will display a proper description, explaining where
they actually stop and what else you can do (e.g. run the `starti` command
if you need to stop at the first instruction!).
* Update start.py
* ArgparsedCommand: fix `help cmd` and `cmd --help` behavior
Before this commit there was always a mismatch of what was displayed
when the user did `<command> --help` or `help <command>`.
With those changes, we fetch the help string from the argument parser
and render it as the command object's `self.__doc__`, so that it will be
displayed during `help <command>`.
Previously, we only displayed the command description during help.
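A small sketch of the idea, using only the standard library. This `ArgparsedCommand` is a stripped-down, hypothetical decorator that sets the docstring on the wrapped function; the real class wraps a GDB command object, and the `demo` parser is made up for the example.

```python
import argparse


class ArgparsedCommand:
    """Toy decorator: make the docstring match the parser's `--help` output."""

    def __init__(self, parser):
        self.parser = parser

    def __call__(self, function):
        # Render the full argparse help text and expose it as the docstring,
        # so that `help <command>` and `<command> --help` show the same thing.
        function.__doc__ = self.parser.format_help()
        return function


parser = argparse.ArgumentParser(prog="demo", description="Demo command.")
parser.add_argument("address", help="Address to inspect.")


@ArgparsedCommand(parser)
def demo(address):
    return address
```

Deriving the docstring from the parser keeps a single source of truth, so the two help paths cannot drift apart again.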
* fix the pwndbg [filter] command that was broken in previous commit
* add riscv:rv64 registers
based on https://github.com/pwndbg/pwndbg/pull/829 by Tobias Faller <faller@endiio.com>
* disassemble without capstone to support other architectures
* ignore gdb.error on context_backtrace
This commit cleans up the commands/__init__.py a bit by removing the
`QuietSloppyParsedCommand` that we do not use anymore.
The last command that used it was `brva` which was just an alias for
`breakrva`, so now we just set it as an alias using the
`ArgparsedCommand` as it should be done.
* Add Bins classes and refactor allocator methods to return them
* Refactor bins() and related commands
* Refactor malloc_chunk
* Use chunk_size_nomask in top_chunk()
* Refactor vis_heap_chunks
* Rename read_chunk to read_chunk_from_gdb and move to ptmalloc.py
* Add get_first_chunk_in_heap and use it in heap and vis_heap_chunks commands
* Move some methods from DebugSymsHeap to Heap base class
* Strip type hints from heap.py and ptmalloc.py
* Set heap_region before using it
* Fix test_heap_bins test
* Fix try_free
This commit improves the `search --next ...` speed by making it so that
only the saved addresses are checked. Previously, the command performed
a full search even in the presence of `--next` flag and only afterwards
filtered the results. That resulted in extremely slow execution e.g.
when debugging processes with gigabytes of allocated memory.
The commit also adds a `--trunc-out` argument which makes it so that
only 20 results are displayed. This is helpful when performing a
CheatEngine-style search when we try to locate a given field/value
address in memory by first finding its known value, then changing its
value in the program and then re-searching the space.
The `--trunc-out` argument could further be improved by enabling it by
default and making users aware that the results were truncated.
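The difference between a full search and a `--next` style re-check can be illustrated on a plain byte buffer. The function names are hypothetical and the buffer stands in for process memory; only the limit of 20 comes from the description above.

```python
def search(memory, needle):
    """Full scan: return every offset where `needle` occurs."""
    hits = []
    start = memory.find(needle)
    while start != -1:
        hits.append(start)
        start = memory.find(needle, start + 1)
    return hits


def search_next(memory, needle, saved):
    """--next style: only re-check the previously saved offsets."""
    n = len(needle)
    return [addr for addr in saved if memory[addr:addr + n] == needle]


def truncate(results, limit=20):
    """--trunc-out style: keep at most `limit` results for display."""
    return results[:limit]


# CheatEngine-style workflow: find the known value, change it, re-search.
mem = bytearray(b"\x05\x00\x05\x00\x05")
saved = search(mem, b"\x05")               # first pass scans everything
mem[2] = 6                                 # the program changes one of the values
narrowed = search_next(mem, b"\x06", saved)  # second pass touches 3 offsets only
```

The second pass costs time proportional to the number of saved hits rather than to the size of the searched memory.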
This PR removes ~40 commands that could be used to run shell programs.
I am removing this since GDB has the support for running shell programs
with either `shell <command...>` or `!<command...>` and so we do not
need this feature in Pwndbg anymore.
This feature also bloated Pwndbg a little bit and made more interesting
commands harder to find e.g. through the `pwndbg` command.
* Add support to use heap commands without debug symbols
* Fix a possible bug when getting heap boundaries via heuristic
See https://github.com/pwndbg/pwndbg/pull/1029#issuecomment-1189841299
* Fix typo causing issues in `c_malloc_par_2_25`
See https://github.com/pwndbg/pwndbg/pull/1029#issuecomment-1189841299
* Fix a bug for `tcache_perthread_struct` and refactor some code in `structs.py`
* The bug: `tcache_perthread_struct` for GLIBC < 2.30 is using `char` instead of `uint16_t` for `counts` field
* Fix some bugs about handling `thread_arena` and `tcache` in multithreaded programs
* Re-initialize the heap when the process stops or the file changes
By doing this, we can attach to another architecture in GDB without any bugs.
* Add guard code for unsupported architectures
* Support heuristic for arm and aarch64
Note: thread_arena and thread_cache for arm still do not work
* Update .pylintrc
* Ignore `import-error` error for `import gdb`
* Ignore `no-member` error for `pwndbg.typeinfo.*`, because most of its members are dynamically generated.
* Ignore `protected-access` warning for `_fields_`, `_type_`, `_length_`, because ctypes don't have other ways to access them.
* Refactor some code and comment to fit pep8 and lint check
* Add a feature to let users set symbol addresses manually
For example, by using `set main_arena 0xdeadbeaf`, pwndbg will try to find main_arena at 0xdeadbeaf when using heuristic
* Use `__errno_location` to find TLS base for arm
By doing this, we can get `thread_arena` and `tcache` address
* Block other thread before `__errno_location()`
* Fix a bug for arm32 and a typo-caused bug
* Some wrong field names inside `c_heap_info` may cause some bugs in the future if we want to access it
* The `pad` size of `heap_info` for arm32 is 0 bytes, only i386 uses 8 bytes, so I fixed it in a hard-coded way temporarily
* Fix #1044 related issues
* Refactor the code about heap related config
* Use `int(address_str, 0)` to auto determine the format (https://github.com/pwndbg/pwndbg/pull/1029#discussion_r939569382)
* Use `OnlyWithResolvedHeapSyms` instead of `OnlyWithLibcDebugSyms` (https://github.com/pwndbg/pwndbg/pull/1029#discussion_r939568687)
* Use `resolve-heap-via-heuristic` instead of `resolve-via-heuristic` (https://github.com/pwndbg/pwndbg/pull/1029#discussion_r939569076)
* Update the description of `resolve-heap-via-heuristic` config (https://github.com/pwndbg/pwndbg/pull/1029#discussion_r939569069)
* Move heap related config into `heap` scope, and add a new command, `heap_config`, to show the config in that scope (https://github.com/pwndbg/pwndbg/pull/1029#discussion_r939569260)
* Refactor code about the config of heap related symbols
* Fix the logic when thread_arena is not found
* Use errno trick as a fallback for i386 and x86-64
* Update pwndbg/heap/ptmalloc.py
Co-authored-by: Disconnect3d <dominik.b.czarnota@gmail.com>
* remove py2 coding notations from files
* remove six package use and replace with proper py3 code
* remove py2 futures use
* replace unicode string literals with string literals
* remove python2 urlparse import in favor of python3 urllib.parse
* keep ida_script in py2 version
* remove hashbang python lines as the files are never run directly
When we moved to argparse command parsing we introduced `gdb_sloppy_parse`, which wasn't perfect: e.g. `gdb.parse_and_eval("__libc_start_main")` would return a `gdb.Value()` whose `.type.name` was `long long`.
As a result, when code that used `gdb_sloppy_parse` then cast the result with `int(gdb_value)`, it crashed because for some reason GDB errored.
This commit fixes the issues related to it by adding `AddressExpr` and `HexOrAddressExpr` functions.
It also adds tests for some of the windbg compatibility commands and fixes some nifty details here and there.
Those lines are redundant in our case: pwndbg is not imported or launched directly.
Also, the coding lines were relevant in Py2 but are not really needed in Py3.
Those regs do not seem to work on i386: I can't do `i r dil` in i386 but I can do so in amd64 binaries.
Via https://www.tortall.net/projects/yasm/manual/html/arch-x86-registers.html :
```
The 64-bit x86 register set consists of 16 general purpose registers, only 8 of which are available in 16-bit and 32-bit mode. The core eight 16-bit registers are AX, BX, CX, DX, SI, DI, BP, and SP. The least significant 8 bits of the first four of these registers are accessible via the AL, BL, CL, and DL in all execution modes. In 64-bit mode, the least significant 8 bits of the other four of these registers are also accessible; these are named SIL, DIL, SPL, and BPL. The most significant 8 bits of the first four 16-bit registers are also available, although there are some restrictions on when they can be used in 64-bit mode; these are named AH, BH, CH, and DH.
```
and the table present there, it seems SIL, DIL, SPL and BPL are only available in 64-bit mode.
The typeinfo module used a static global tempdir location of /tmp/pwndbg
that an attacker may control and prepare symlinks of the predictable
files that are then written to.
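One common way to avoid a predictable shared temp path, shown here as a sketch and not necessarily how this commit fixed it, is to let `tempfile.mkdtemp` create a fresh private directory. The helper name and the `pwndbg-` prefix are assumptions for illustration.

```python
import os
import tempfile


def make_private_workdir(prefix="pwndbg-"):
    """Create a fresh directory with an unpredictable name.

    tempfile.mkdtemp creates the directory so that it is readable, writable
    and searchable only by the creating user, and fails rather than reusing
    an existing path, so pre-planted symlinks are not followed.
    """
    return tempfile.mkdtemp(prefix=prefix)


workdir = make_private_workdir()
header_path = os.path.join(workdir, "types.h")  # hypothetical file name
with open(header_path, "w") as f:
    f.write("/* generated */\n")
```

The caller is responsible for removing the directory afterwards, for example with `shutil.rmtree(workdir)`.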
TL;DR: With .splitlines() we split on universal newlines, which did not correspond to GDB's target code line splitting...
As a result we got `context code` to produce bogus out of sync lines that didn't correspond to GDB's `line` command.
See also https://docs.python.org/3/library/stdtypes.html#str.splitlines
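A short demonstration of the mismatch. The source string is made up, and splitting on `"\n"` only is shown as one possible alternative (an assumption, the fix itself is not quoted here). A form feed (`\x0c`) counts as a line boundary for `str.splitlines()` but not for a plain newline split.

```python
# A source file where the first line contains a form feed character.
source = "int a;\x0cint b;\nint c;\n"

# Universal splitting treats \x0c (and several other characters) as a line
# boundary, so it yields one more "line" than a newline-based count.
universal = source.splitlines()

# Splitting on "\n" only keeps the form feed inside the first line.
newline_only = source.split("\n")
```

Here `universal` has three entries while the newline-based view sees two real lines, so every line after the form feed would be off by one.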
Addresses #957 by enclosing anonymous map names printed by vmmap in square brackets.
Search still works & xinfo plays nice, but please let me know if you find anything this breaks.