This commit enhances the heap commands UX for statically linked binaries
and removes typeinfo module bloat.
The typeinfo module had a typeinfo.load function that looked up a given type.
If it didn't find the type, it fell back to compiling many system
headers in order to add a symbol for that type into the program. This was
supposed to be used for missing glibc malloc symbols like malloc_chunk.
However, it could not serve that exact purpose: struct malloc_chunk was never
defined in a header file and was always defined in malloc.c or another
.c file in the glibc sources.
Another place the typeinfo.load logic of compiling headers was/is used
is the `dt` command, which is a windbg alias for getting struct layout
type information, e.g.:
```
pwndbg> dt 'struct malloc_chunk'
struct malloc_chunk
+0x0000 mchunk_prev_size : size_t
+0x0008 mchunk_size : size_t
+0x0010 fd : struct malloc_chunk *
+0x0018 bk : struct malloc_chunk *
+0x0020 fd_nextsize : struct malloc_chunk *
+0x0028 bk_nextsize : struct malloc_chunk *
pwndbg>
```
However, the main issue with the header compilation in typeinfo.load
was that it failed most of the time, e.g. because some headers
located in other paths were missing, or because two different headers used
the same struct/function name and the compilation failed.
Since this logic almost never gave good results, I am removing it.
Regarding UX for statically linked binaries: we use the `info dll` command
to see if a binary is statically linked. While this method is not
robust, as it may give wrong results if the statically linked binary
used `dlopen(...)`, it is probably good enough.
Now, if a heap related command is executed on a statically linked binary, it
will inform the user and enable resolving of libc heap symbols via
heuristics. It also tells the user that they have to set the glibc
version and re-run the command.
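A minimal sketch of the static-linking check described above. It assumes the detection looks for GDB's "No shared libraries loaded at this time." message in the `info dll` output; the helper name and the sample outputs are illustrative and not taken from the Pwndbg sources.

```python
# Hypothetical helper: decide whether a binary looks statically linked
# from the text GDB prints for `info dll`.
NO_SHLIBS_MSG = "No shared libraries loaded at this time."


def looks_statically_linked(info_dll_output: str) -> bool:
    # GDB prints this message when no shared objects are loaded. A static
    # binary that later calls dlopen(...) would defeat this check.
    return NO_SHLIBS_MSG in info_dll_output


# Illustrative sample outputs (not captured from a real session).
static_out = "No shared libraries loaded at this time.\n"
dynamic_out = (
    "From                To                  Syms Read   Shared Object Library\n"
    "0x00007ffff7fd0100  0x00007ffff7ff2684  Yes         /lib64/ld-linux-x86-64.so.2\n"
)
print(looks_statically_linked(static_out), looks_statically_linked(dynamic_out))
```

Inside GDB, the input string would come from something like `gdb.execute("info dll", to_string=True)`.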
This commit tries to fix the issue of our `set context-clear-screen on`
option resetting the scrollback buffer on some terminals like
gnome-terminal (fwiw it did not happen on terminator or on tmux).
It also adds info to tips about that option.
It turns out the mprotect command never worked: it was amd64-only but used x86 syscall numbers to call mprotect. I have refactored the command to use shellcraft to generate the shellcode that calls mprotect, and I have also unit-tested it.
Fixes the `nextproginstr` command and adds two simple tests for it.
The command had the following two issues:
1) It assumed that the program vmmap was always the first vmmap with
proc.exe objfile name -- this assumption has two flaws. First, newer
linkers will create the first memory page for the binary file as
read-only. This is because the ELF header content does not need to be
executable, although it was mapped that way by old linkers or Linux distributions.
As an example, see the vmmap output for a simple hello world binary compiled
on Ubuntu 18.04 vs Ubuntu 22.04:
Ubuntu 18.04:
```
pwndbg> vmmap
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
0x555555554000 0x555555555000 r-xp 1000 0 /home/dc/a.out
0x555555754000 0x555555755000 r--p 1000 0 /home/dc/a.out
0x555555755000 0x555555756000 rw-p 1000 1000 /home/dc/a.out
[...]
```
Ubuntu 22.04:
```
pwndbg> vmmap
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
0x555555554000 0x555555555000 r--p 1000 0 /home/user/a.out
0x555555555000 0x555555556000 r-xp 1000 1000 /home/user/a.out
0x555555556000 0x555555557000 r--p 1000 2000 /home/user/a.out
0x555555557000 0x555555558000 r--p 1000 2000 /home/user/a.out
0x555555558000 0x555555559000 rw-p 1000 3000 /home/user/a.out
```
So, before this commit on Ubuntu 22.04 we ended up taking the first
vmmap, which was non-executable, and we compared the program counter
register against it after each instruction step executed by the
nextproginstr command. As a result, we never returned control to
the user and simply ran the debugged program to completion!
Now, after this commit, we grab all the executable pages (and only those) for
the binary that we debug and compare against them.
2) The second problem was that we printed out the current Pwndbg context
after executing nextproginstr successfully. This does not seem to make
much sense because the context should be printed by the prompt hook.
(Without removing this, we ended up printing the context twice.)
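The fix for the first issue can be sketched as follows. The `Page` type and both helper functions are hypothetical stand-ins, not Pwndbg's actual API; the mappings are taken from the Ubuntu 22.04 example above.

```python
from typing import List, NamedTuple


class Page(NamedTuple):
    # Simplified stand-in for a vmmap entry (hypothetical type).
    start: int
    end: int
    perms: str
    objfile: str


def executable_pages(pages: List[Page], exe: str) -> List[Page]:
    # Keep every executable mapping that belongs to the debugged binary,
    # instead of assuming the first mapping of the binary is executable.
    return [p for p in pages if p.objfile == exe and "x" in p.perms]


def pc_in_program(pc: int, pages: List[Page], exe: str) -> bool:
    return any(p.start <= pc < p.end for p in executable_pages(pages, exe))


exe = "/home/user/a.out"
pages = [
    Page(0x555555554000, 0x555555555000, "r--p", exe),
    Page(0x555555555000, 0x555555556000, "r-xp", exe),
    Page(0x555555556000, 0x555555557000, "r--p", exe),
    Page(0x555555557000, 0x555555558000, "r--p", exe),
    Page(0x555555558000, 0x555555559000, "rw-p", exe),
]
print(pc_in_program(0x555555555040, pages, exe))  # address in the r-xp page
print(pc_in_program(0x555555554040, pages, exe))  # address in the first r--p page
```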
* add patch command
This commit adds the `patch`, `patch_list` and `patch_revert` commands
and adds the `pwntools==4.8.0` as Pwndbg dependency.
The current implementation could be further improved by:
- adding tests :)
- maybe moving `patch_list` and `patch_revert` to `patch --list` and
`patch --revert` flags?
- better handling of incorrect args/pwnlib exceptions
* lint
* Improve vmmap on coredump files
With this commit we now recognize coredumps better and also finally have
a simple test for vmmap commands on:
- a running binary
- on a loaded coredump file with loaded binary
- on a loaded coredump file without a loaded binary
We also stop saving vmmaps for `maintenance info sections` sections
which have a start address of 0x0. While there could potentially be a
coredump file from a binary with start=0x0, this should work in most
cases.
We could in theory do slightly better: we could take the vmmap at 0 and
try to read memory from it. However, I am not sure if it is a good idea
to try such a memory read?
* remove unused import
* add missing crash_simple.asm
* fix vmmap coredump test on different ubuntu mem layouts
* use /proc/$pid/maps for vmmap tests
* fix formatting
* fix import
* fix test
* fix test
* fix test
* fix lint
* fix test
* fix test
* fix test
* fix test
* fix lint
* another fixup for ubuntu 22.04
* another fixup for ubuntu 22.04
* lint
Call fetch_lazy() on the gdb.Value acquired in get_heap() and wrap it in
a try/except block. Return None if gdb.MemoryError is raised.
Let get_arena_for_chunk() handle None returned by get_heap().
Fixes #1142
* fix #1111 errno command edge case
This commit fixes the case where the errno command caused a binary to
segfault when the `__errno_location` symbol was defined but its .plt.got
entry was not yet filled in by the dynamic loader (ld.so), e.g. when the
glibc library was not loaded yet.
In such a case, our call to the `__errno_location` function
triggered a jump to an unmapped address. Now, we dereference that
.plt.got symbol and see if it lives in mapped memory.
* add tip about errno command
* errno: fix case when __errno_location@got.plt is missing
* fix lint
* fix sh lint
* fix errno test
* Don't exclude pwndbg/lib in .gitignore
* Move which.py to lib/which.py
* move funcparser.py and functions.py to lib/
* moved version.py to lib/
* Move tips.py to lib/
* Update pwndbg/lib/version.py
Co-authored-by: Disconnect3d <dominik.b.czarnota@gmail.com>
* Fix some bugs of the aarch64 heuristic and a bug about tcache
* The order of the aarch64 assembly instructions might differ slightly, so I made the matching more general. Some bugs of the older version can be reproduced with the libc here (https://github.com/perfectblue/ctf-writeups/tree/master/2019/insomnihack-teaser-2019/nyanc/challenge)
* If we didn't find the correct tcache symbol address via heuristic, we will now use our fallback strategies for it.
* Refactor the code in a cleaner way
See https://github.com/pwndbg/pwndbg/pull/1029#discussion_r945934337
* Update the fallback solution of finding `main_arena`
* Since the arenas form a circular linked list, we can iterate over it to check whether the address we guessed is `main_arena` (https://github.com/pwndbg/pwndbg/pull/1029#discussion_r945335543)
* Update the boundaries of the address we might guess to avoid some unneeded tests
* Remove guard code for `mp_` before we test the fallback solution
Fix https://github.com/pwndbg/pwndbg/pull/1029#discussion_r945338469
* Refactor TLS features and fix a bug about fsbase/gsbase
* Move TLS features into an external module, and now the user can use the `tls` command to get its address (https://github.com/pwndbg/pwndbg/pull/1029#discussion_r945336737)
* Avoid `ValueError: Bad register` when trying to access fsbase/gsbase if the current arch is i386
* Fix a bug in TLS finding for i386: `__errno_location` is not always in `libc.so.6`, sometimes it is in `libpthread-*.so` instead
* Update the comments about finding tcache
* Use `exit` event to avoid unnecessary reset
* Add a parameter for GLIBC version
* Update some strategies of heuristic
* Try to resolve heap via debug symbols even when using the heuristic
(By doing this, the binary compiled with `--static` flag can work with the heuristics by setting the GLIBC version manually)
* Try to avoid false positives when finding the symbol address and TLS base via heuristic
* Refactor some useless code
* Update the descriptions of the heap config
* Update the tips for the heap heuristics features
* Raise error when user set the GLIBC version in the wrong format
* Use `reset_on_start` with `glibc._get_version`
See https://github.com/pwndbg/pwndbg/pull/1075#discussion_r957899458
* Remove some unnecessary information in the hint message
See https://github.com/pwndbg/pwndbg/pull/1075#discussion_r957900468
* Use black to fix the format
* Fix indent
* Use black to fix the format
* improve start and entry commands description
Now, those commands will display a proper description, describing when
they actually stop and what else you can do (e.g. run the `starti` command
if you need to stop at the first instruction!).
* Update start.py
* ArgparsedCommand: fix `help cmd` and `cmd --help` behavior
Before this commit there was always a mismatch of what was displayed
when the user did `<command> --help` or `help <command>`.
With those changes, we fetch the help string from the argument parser
and render it as the command object's `self.__doc__`, so that it will be
displayed during `help <command>`.
Previously, we only displayed the command description during help.
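A minimal sketch of the idea, using only `argparse`. The wrapper class is a hypothetical, heavily simplified stand-in for `ArgparsedCommand`, and the example parser is made up for illustration.

```python
import argparse


class ArgparsedCommandSketch:
    # Hypothetical stand-in for a command wrapper built around a parser.
    def __init__(self, parser: argparse.ArgumentParser):
        self.parser = parser
        # Render the full argparse help once and expose it as the docstring,
        # so `help <command>` and `<command> --help` show the same text.
        self.__doc__ = parser.format_help().strip()


# Made-up example parser.
parser = argparse.ArgumentParser(prog="telescope", description="Dump memory as pointers.")
parser.add_argument("address", nargs="?", help="Address to start from.")
parser.add_argument("count", nargs="?", type=int, default=10, help="Number of entries.")

cmd = ArgparsedCommandSketch(parser)
print(cmd.__doc__)
```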
* fix the pwndbg [filter] command that was broken in previous commit
* add riscv:rv64 registers
based on https://github.com/pwndbg/pwndbg/pull/829 by Tobias Faller <faller@endiio.com>
* disassemble without capstone to support other architectures
* ignore gdb.error on context_backtrace
This commit cleans up the commands/__init__.py a bit by removing the
`QuietSloppyParsedCommand` that we do not use anymore.
The last command that used it was `brva` which was just an alias for
`breakrva`, so now we just set it as an alias using the
`ArgparsedCommand` as it should be done.
* Add Bins classes and refactor allocator methods to return them
* Refactor bins() and related commands
* Refactor malloc_chunk
* Use chunk_size_nomask in top_chunk()
* Refactor vis_heap_chunks
* Rename read_chunk to read_chunk_from_gdb and move to ptmalloc.py
* Add get_first_chunk_in_heap and use it in heap and vis_heap_chunks commands
* Move some methods from DebugSymsHeap to Heap base class
* Strip type hints from heap.py and ptmalloc.py
* Set heap_region before using it
* Fix test_heap_bins test
* Fix try_free
This commit improves the `search --next ...` speed by making it so that
only the saved addresses are checked. Previously, the command performed
a full search even in the presence of `--next` flag and only afterwards
filtered the results. That resulted in extremely slow execution e.g.
when debugging processes with gigabytes of allocated memory.
The commit also adds a `--trunc-out` argument which makes it so that
only 20 results are displayed. This is helpful when performing a
CheatEngine-style search when we try to locate a given field/value
address in memory by first finding its known value, then changing its
value in the program and then re-searching the space.
The `--trunc-out` argument could be further improved by enabling it by
default and making users aware that the results were truncated.
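The difference between a full search and a `--next` search can be sketched with a toy memory model. The dictionary-based "memory" and all function names are assumptions made for illustration; the real command works on process memory.

```python
from typing import Dict, List


def full_search(memory: Dict[int, int], value: int) -> List[int]:
    # Initial search: scan every known address (slow on large address spaces).
    return [addr for addr, stored in memory.items() if stored == value]


def search_next(memory: Dict[int, int], saved: List[int], value: int) -> List[int]:
    # `--next`-style search: only re-check addresses saved by the last search.
    return [addr for addr in saved if memory.get(addr) == value]


def truncate(results: List[int], limit: int = 20) -> List[int]:
    # `--trunc-out`-style output limit.
    return results[:limit]


# Toy "memory": address -> value.
memory = {0x1000: 100, 0x1008: 100, 0x1010: 7, 0x1018: 100}
saved = full_search(memory, 100)

memory[0x1008] = 95  # the value of interest changes in the program
saved = search_next(memory, saved, 95)
print([hex(a) for a in truncate(saved)])
```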
This PR removes ~40 commands that could be used to run shell programs.
I am removing this since GDB has the support for running shell programs
with either `shell <command...>` or `<!command...>` and so we do not
need this feature in Pwndbg anymore.
This feature also bloated Pwndbg a little bit and made more interesting
commands harder to find e.g. through the `pwndbg` command.
* Add support to use heap commands without debug symbols
* Fix possible bugs when getting heap boundaries via heuristic
See https://github.com/pwndbg/pwndbg/pull/1029#issuecomment-1189841299
* Fix typo causing issues in `c_malloc_par_2_25`
See https://github.com/pwndbg/pwndbg/pull/1029#issuecomment-1189841299
* Fix a bug for `tcache_perthread_struct` and refactor some code in `structs.py`
* The bug: `tcache_perthread_struct` for GLIBC < 2.30 is using `char` instead of `uint16_t` for `counts` field
* Fix some bugs in handling `thread_arena` and `tcache` in multithreaded programs
* Re-initialize the heap when the process stops or the file changes
By doing this, we can attach to another architecture in GDB without any bugs.
* Add guard code for unsupported architectures
* Support heuristic for arm and aarch64
Note: thread_arena and thread_cache for arm still can not work
* Update .pylintrc
* Ignore `import-error` error for `import gdb`
* Ignore `no-member` error for `pwndbg.typeinfo.*`, because most of its members are dynamically generated.
* Ignore `protected-access` warning for `_fields_`, `_type_`, `_length_`, because ctypes don't have other ways to access them.
* Refactor some code and comment to fit pep8 and lint check
* Add a feature to enable users set symbol addresses manually
For example, by using `set main_arena 0xdeadbeaf`, pwndbg will try to find main_arena at 0xdeadbeaf when using heuristic
* Use `__errno_location` to find TLS base for arm
By doing this, we can get `thread_arena` and `tcache` address
* Block other thread before `__errno_location()`
* Fix a bug for arm32 and a typo-caused bug
* Some wrong field names inside `c_heap_info` may cause bugs in the future if we want to access them
* `pad` size of `heap_info` for arm32 is 0 bytes, only i386 uses 8 bytes, so I hard-coded it temporarily
* Fix #1044 related issues
* Refactor the code about heap related config
* Use `int(address_str, 0)` to auto determine the format (https://github.com/pwndbg/pwndbg/pull/1029#discussion_r939569382)
* Use `OnlyWithResolvedHeapSyms` instead of `OnlyWithLibcDebugSyms` (https://github.com/pwndbg/pwndbg/pull/1029#discussion_r939568687)
* Use `resolve-heap-via-heuristic` instead of `resolve-via-heuristic` (https://github.com/pwndbg/pwndbg/pull/1029#discussion_r939569076)
* Update the description of `resolve-heap-via-heuristic` config (https://github.com/pwndbg/pwndbg/pull/1029#discussion_r939569069)
* Move heap related config into `heap` scope, and add a new command, `heap_config`, to show the config in that scope (https://github.com/pwndbg/pwndbg/pull/1029#discussion_r939569260)
* Refactor code about the config of heap related symbols
* Fix the logic when thread_arena is not found
* Use errno trick as a fallback for i386 and x86-64
* Update pwndbg/heap/ptmalloc.py
Co-authored-by: Disconnect3d <dominik.b.czarnota@gmail.com>
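The `int(address_str, 0)` behavior mentioned above can be illustrated with a short sketch; the `parse_address` wrapper is a hypothetical name used only for this example.

```python
def parse_address(address_str: str) -> int:
    # Base 0 makes int() infer the base from the literal's prefix
    # (0x for hex, 0o for octal, 0b for binary, no prefix for decimal).
    return int(address_str, 0)


print(parse_address("0x7ffff7dd1b20"))
print(parse_address("1234"))
print(parse_address("0b101"))
```

Note that with base 0 a decimal string with a leading zero, such as "010", raises `ValueError`.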
* remove py2 coding notations from files
* remove six package use and replace with proper py3 code
* remove py2 futures use
* replace unicode string literals with string literals
* remove python2 urlparse import in favor of python3 urllib.parse
* keep ida_script in py2 version
* remove hashbang python lines as the files are never ran directly
When we moved to argparse command parsing we introduced `gdb_sloppy_parse`, which wasn't perfect: e.g. `gdb.parse_and_eval("__libc_start_main")` would return a `gdb.Value()` whose `.type.name` was `long long`.
As a result, when code that used `gdb_sloppy_parse` then cast the result with `int(gdb_value)`, it crashed because for some reason GDB errored.
This commit fixes the issues related to it by adding `AddressExpr` and `HexOrAddressExpr` functions.
It also adds tests for some of the windbg compatibility commands and fixes some nifty details here and there.
Those lines are redundant in our case: pwndbg is not imported or launched directly.
Also, the coding lines were relevant in Py2 but are not really needed in Py3.
Those regs do not seem to work on i386: I can't do `i r dil` in i386 but I can do so in amd64 binaries.
Via https://www.tortall.net/projects/yasm/manual/html/arch-x86-registers.html :
```
The 64-bit x86 register set consists of 16 general purpose registers, only 8 of which are available in 16-bit and 32-bit mode. The core eight 16-bit registers are AX, BX, CX, DX, SI, DI, BP, and SP. The least significant 8 bits of the first four of these registers are accessible via the AL, BL, CL, and DL in all execution modes. In 64-bit mode, the least significant 8 bits of the other four of these registers are also accessible; these are named SIL, DIL, SPL, and BPL. The most significant 8 bits of the first four 16-bit registers are also available, although there are some restrictions on when they can be used in 64-bit mode; these are named AH, BH, CH, and DH.
```
and the table present there, it seems SIL, DIL, SPL and BPL are only available in 64-bit mode.
The typeinfo module used a static global tempdir location of /tmp/pwndbg
that an attacker may control and prepare symlinks of the predictable
files that are then written to.
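One common mitigation for this class of issue is a per-session directory with an unpredictable name. The sketch below assumes `tempfile.mkdtemp` is an acceptable replacement for the fixed path; the prefix and the file name are hypothetical.

```python
import os
import tempfile

# mkdtemp() creates a fresh directory with an unpredictable name that is
# readable and writable only by the creating user, so another local user
# cannot prepare symlinks inside it ahead of time.
tempdir = tempfile.mkdtemp(prefix="pwndbg-")

# Hypothetical file name, for illustration only.
source_path = os.path.join(tempdir, "type.c")
with open(source_path, "w") as f:
    f.write("struct example { int x; };\n")

print(tempdir)
```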
TL;DR: With .splitlines() we split on universal newlines, which did not correspond to GDB's target code line splitting...
As a result, `context code` produced bogus, out-of-sync lines that didn't correspond to GDB's `line` command.
See also https://docs.python.org/3/library/stdtypes.html#str.splitlines
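A small illustration of the mismatch, assuming the source file contains a form feed character and that the consumer expects lines to be separated by `\n` only:

```python
# str.splitlines() treats a form feed (\x0c) as a line boundary, among
# several other characters, while split("\n") only splits on newlines.
source = "int a;\x0cint b;\nint c;\n"

print(source.splitlines())
print(source.split("\n"))
```

The first call yields three lines and the second yields two lines plus a trailing empty string, so line numbers computed from the two lists drift apart.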
Addresses #957 by enclosing anonymous map names printed by vmmap in square brackets.
Search still works & xinfo plays nice, but please let me know if you find anything this breaks.
Revert the change from 3e4ad60 and make the `pwndbg.proc.get_file` to strip the "target:" prefix instead of the `pwndbg.proc.exe`.
This way, we will prevent bugs when pwndbg code would use `pwndbg.proc.exe` on remote targets but not pass the returned path to `pwndbg.proc.get_file` to get the real remote file and instead use the local one (if it exists in the same path).
Additionally, we assert the `path` passed to `pwndbg.proc.get_file` so we prevent incorrect use of the function when an absolute path or not a remote path is passed to it.
Before this commit the `pwndbg.proc.exe` could return a "target:" prefix when `pwndbg.proc.exe` was executed on remote targets. This could be seen by:
1. Executing gdbserver in one terminal: gdbserver 127.0.0.1:1234 `which ps`
2. Executing `gdb -ex 'target remote :1234'` in another terminal and then invoking `pi pwndbg.proc.exe`.
This resulted in crashes of `checksec` (and some other commands) which used the `pwndbg.file.get_file` functionality, as it downloaded the remote file using the `gdb.execute("remote get %s %s")` command, passing it a path prefixed with `"target:"`, which this GDB command does not support.
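Stripping the prefix can be sketched as below; `strip_target_prefix` is a hypothetical helper and not the actual Pwndbg function.

```python
REMOTE_PREFIX = "target:"


def strip_target_prefix(path: str) -> str:
    # Hypothetical helper: drop GDB's "target:" marker so the remaining
    # path can be handed to commands that do not understand the prefix.
    if path.startswith(REMOTE_PREFIX):
        return path[len(REMOTE_PREFIX):]
    return path


print(strip_target_prefix("target:/usr/bin/ps"))
print(strip_target_prefix("/usr/bin/ps"))
```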
The `pwndbg.memoize.reset_on_new_base_address` decorator is super problematic: its start event was called before `pwndbg.arch.update` event because the pwndbg/memoize.py file is executed faster than the pwndbg/arch.py file. This happens even if we import pwndbg/arch.py as first import because it imports regs.py and events.py and those import memoize.py and so on.
TL;DR: The decorator was quite redundant and made too many calls in the end which caused some problems when executing:
1. In one console: qemu-x86_64 -g 1234 `which ps`
2. In another, attaching to this via `gdb` -> `target remote :1234`
The `explore_registers` and `clear_explored_pages` functions are currently redundant as we create an empty memory page on QEMU targets.
Also, these functions are not so useful to be called automagically on real remote and embedded targets where we may not have the memory pages information (as it may be too slow to explore stuff via remote gdbstub on a real embedded target).
E.g. for calls like this:
```
► 0x555555554a57 <main+296> call ioctl@plt <ioctl@plt>
fd: 0x3
request: 0xae01
vararg: 0x0
```
We want to display the `fd` as: `fd: 0x3 (/some/path)` fetched from
`readlink -f /proc/$PID/fd/$FD`.
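A rough sketch of the lookup for a local Linux process. Both function names are hypothetical, and `os.readlink` is used here as a stand-in for `readlink -f`, so it only resolves one level of the link.

```python
import os
from typing import Optional


def fd_path(pid: int, fd: int) -> Optional[str]:
    # Resolve /proc/<pid>/fd/<fd>; only meaningful on Linux for a local
    # process. Returns None if the link cannot be read.
    try:
        return os.readlink(f"/proc/{pid}/fd/{fd}")
    except OSError:
        return None


def format_fd_arg(pid: int, fd: int) -> str:
    # Append the resolved path in parentheses when it is available.
    path = fd_path(pid, fd)
    return f"fd: {fd:#x}" + (f" ({path})" if path else "")


print(format_fd_arg(os.getpid(), 0))
```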
The ASLR command did not work properly on QEMU kernel targets: it read /proc/sys/kernel/randomize_va_space and then /proc/<pid>/personality on local filesystem which was wrong, and returned that it couldn't read personality.
This commit makes it so that:
- the `pwndbg.file.get_file` will print a warning if it returns a local path on remote targets
- the `check_aslr` was refactored: we don't run this on `new_objfile` or cache its result; the `pwndbg.vmmap.aslr` was also removed as it was never used
- the `pwndbg.vmmap.check_aslr` and `aslr` command will now return info if we couldn't detect ASLR on QEMU targets
There is a bug where the `pwndbg.auxv.get()` and `pwndbg.vmmap.get()` caches are not reset when the binary is restarted. This causes an error when `disable-randomization` is set to off and the binary is restarted.
TL;DR to reproduce:
1. Run `gdb /bin/ls`
2. Invoke `entry`
3. Invoke `set disable-randomization off`
4. Invoke `starti` or `entry`
Or it can be seen here:
```
dc@dc:~$ gdb /bin/ls -q
pwndbg: loaded 195 commands. Type pwndbg [filter] for a list.
pwndbg: created $rebase, $ida gdb functions (can be used with print/break)
Reading symbols from /bin/ls...
(No debugging symbols found in /bin/ls)
pwndbg> set context-sections ''
Sections set to be empty. FYI valid values are: regs, disasm, args, code, stack, backtrace, expressions, ghidra
Set which context sections are displayed (controls order) to ''
pwndbg> entry
Temporary breakpoint 1 at 0x55555555a810
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Temporary breakpoint 1, 0x000055555555a810 in ?? ()
pwndbg> set exception-verbose on
Set whether to print a full stacktrace for exceptions raised in Pwndbg commands to True
pwndbg> set disable-randomization off
pwndbg> starti
Starting program: /usr/bin/ls
Traceback (most recent call last):
File "/home/dc/src/pwndbg/pwndbg/events.py", line 165, in caller
func()
File "/home/dc/src/pwndbg/pwndbg/memoize.py", line 194, in __reset_on_base
base = pwndbg.elf.exe().address if pwndbg.elf.exe() else None
File "/home/dc/src/pwndbg/pwndbg/proc.py", line 71, in wrapper
return func(*a, **kw)
File "/home/dc/src/pwndbg/pwndbg/memoize.py", line 46, in __call__
value = self.func(*args, **kwargs)
File "/home/dc/src/pwndbg/pwndbg/elf.py", line 182, in exe
return load(e)
File "/home/dc/src/pwndbg/pwndbg/elf.py", line 220, in load
return get_ehdr(pointer)[1]
File "/home/dc/src/pwndbg/pwndbg/elf.py", line 241, in get_ehdr
if pwndbg.memory.read(vmmap.start, 4) == b'\x7fELF':
File "/home/dc/src/pwndbg/pwndbg/memory.py", line 40, in read
result = gdb.selected_inferior().read_memory(addr, count)
gdb.MemoryError: Cannot access memory at address 0x555555558000
```
This commit fixes the above problem by making sure those function caches are cleared on binary start.
Before this commit the `pwndbg.elf.get_ehdr(pointer)` function searched for the ELF header by iterating through memory pages (with a step of -4kB) until the ELF magic is found (b'\x7fELF' value).
This was most likely redundant, and this commit optimizes the logic to look at the beginning of the page and, if it doesn't have the ELF magic, to look at the first page of the given objfile. Actually, there may have been one compelling argument for the old approach: bare metal debugging, where we don't have all vmmap info. However, this situation is likely broken anyway, so let's skip doing more work than needed.
Additionally, we refactor the `pwndbg.stack.find_upper_stack_boundary` function to use the `pwndbg.memory.find_upper_boundary` function instead of the `pwndbg.elf.find_elf_magic` function, as the last one was removed in this commit. (NOTE: this actually fixes a potential bug that we could have incorrectly set the upper stack boundary if it contained an ELF magic on the beginning of one of its 4kB pages...)
Before this commit, when we did `gdb /bin/ls` and then `entry`, we could see a "Cannot find ELF base!" warning. This occurred because the `pwndbg.arch.ptrmask` was incorrectly set and then `find_elf_magic` used the `pwndbg.arch.ptrmask`, which was 0xffFFffFF, and made an early return. As a result of this early return, the "Cannot find ELF base!" warning was emitted.
The reason for the incorrect early arch detection is that we used the `gdb.newest_frame().architecture().name()` API while the binary was not running, in here:
```python
@pwndbg.events.start
@pwndbg.events.stop
@pwndbg.events.new_objfile
def update():
    m = sys.modules[__name__]
    try:
        m.current = fix_arch(gdb.newest_frame().architecture().name())
    except Exception:
        return
    m.ptrsize = pwndbg.typeinfo.ptrsize
    m.ptrmask = (1 << 8*pwndbg.typeinfo.ptrsize)-1
```
And if the binary was not running, we just returned early and did not set `pwndbg.arch.ptrsize` and `pwndbg.arch.ptrmask` at all, leaving them at their default values.
Now, those values were often eventually fixed, but this occurred by chance as the `new_objfile` was executed!
Anyway, starting from this commit, we will detect the arch from `gdb.newest_frame().architecture().name()` only if the process is running and if it is not, we will fallback to the `show architecture` GDB command and parse it, hoping we detect the arch properly. In case we don't, or, we don't support the given arch, we raise a `RuntimeError` currently.
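Parsing the `show architecture` output could look roughly like the sketch below. The function name is hypothetical, and tolerating double quotes around the name is an assumption about other GDB versions that is not verified here.

```python
import re
from typing import Optional


def arch_from_show_architecture(output: str) -> Optional[str]:
    # Pull the architecture name out of the "(currently ...)" part.
    # Optional double quotes are tolerated in case a GDB version quotes
    # the name (an assumption). Returns None if nothing matches.
    match = re.search(r'\(currently "?([^")]+)"?\)', output)
    return match.group(1) if match else None


line = "The target architecture is set automatically (currently i386:x86-64)"
print(arch_from_show_architecture(line))
```

A caller would still need to handle the `None` case, for example when the architecture was set explicitly and the message has a different shape.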
Also, as a side note: the `find_elf_magic` from `elf.py` can be optimized so that, instead of doing 4kB steps over pages, it just looks at the beginning of a page. If this doesn't work, we are most likely on a target that may not have an ELF magic/header at all, so we shouldn't bother about it. This fix is done in the next commit.
This commit fixes an exception when pwndbg is sourced after gdb is
attached to a process.
The log below is with `set exception-verbose on` and `set exception-debugger
on` hardcoded into the sources. We can see that:
1. We get an immediate exception
2. The `pwndbg.arch.current` is set incorrectly to `i386`
3. The real arch is `i386:x86-64`
```
$ gdb -q -p $(pidof a.out) ./a.out
Reading symbols from ./a.out...
(No debugging symbols found in ./a.out)
Attaching to program: /home/dc/example/a.out, process 415704
Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...
Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/libc-2.31.so...
Reading symbols from /lib64/ld-linux-x86-64.so.2...
(No debugging symbols found in /lib64/ld-linux-x86-64.so.2)
0x00007f4bb94e7142 in __GI___libc_read (fd=0, buf=0x55ee576fb6b0, nbytes=1024)
at ../sysdeps/unix/sysv/linux/read.c:26
26 ../sysdeps/unix/sysv/linux/read.c: No such file or directory.
(gdb) source /home/dc/tools/pwndbg/gdbinit.py
pwndbg: loaded 195 commands. Type pwndbg [filter] for a list.
pwndbg: created $rebase, $ida gdb functions (can be used with print/break)
Traceback (most recent call last):
File "/home/dc/tools/pwndbg/pwndbg/commands/__init__.py", line 130, in __call__
return self.function(*args, **kwargs)
File "/home/dc/tools/pwndbg/pwndbg/commands/__init__.py", line 221, in _OnlyWhenRunning
return function(*a, **kw)
File "/home/dc/tools/pwndbg/pwndbg/commands/context.py", line 269, in context
result[target].extend(func(target=out,
File "/home/dc/tools/pwndbg/pwndbg/commands/context.py", line 350, in context_regs
regs = get_regs()
File "/home/dc/tools/pwndbg/pwndbg/commands/context.py", line 399, in get_regs
m = ' ' * len(change_marker) if reg not in changed else C.register_changed(change_marker)
TypeError: argument of type 'NoneType' is not iterable
If that is an issue, you can report it on https://github.com/pwndbg/pwndbg/issues
(Please don't forget to search if it hasn't been reported before)
To generate the report and open a browser, you may run `bugreport --run-browser`
PS: Pull requests are welcome
> /home/dc/tools/pwndbg/pwndbg/commands/context.py(399)get_regs()
-> m = ' ' * len(change_marker) if reg not in changed else C.register_changed(change_marker)
(Pdb) print(changed)
None
(Pdb) print(reg)
eax
(Pdb) print(pwndbg.arch.current)
i386
(Pdb) print(gdb.execute('show arch', to_string=True))
The target architecture is set automatically (currently i386:x86-64)
(Pdb)
```
Because of the previous commit to this file that removed `from queue import *`, the `address_queue` on line 96 would fail by throwing an exception when running `leakfind`.
This commit adds back the required `import queue` and fixes the reference to `Queue` on line 96.
This is the case:
```
pwndbg> show print elements
Limit on string chars or array elements to print is 200.
warning: (Internal error: pc 0x7ffff49ef495 in read in CU, but not in symtab.)
warning: (Internal error: pc 0x7ffff49ef495 in read in CU, but not in symtab.)
warning: (Internal error: pc 0x7ffff49ef495 in read in CU, but not in symtab.)
warning: (Internal error: pc 0x7ffff49ef495 in read in CU, but not in symtab.)
Exception occurred: Error: invalid literal for int() with base 10: 'symtab.)' (<class 'ValueError'>)
For more info invoke `set exception-verbose on` and rerun the command
or debug it by yourself with `set exception-debugger on`
Python Exception <class 'ValueError'> invalid literal for int() with base 10: 'symtab.)':
```
If we call `message = message.split()[-1]`, we get `symtab.)`.
Then `length = int(message)` raises an exception.
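A more defensive way to extract the limit is sketched below. It is an illustration rather than the actual fix, the function name is hypothetical, and the handling of an "unlimited" value is an assumption about GDB's wording.

```python
import re
from typing import Optional


def parse_print_elements(output: str) -> Optional[int]:
    # Look for the limit on the one line that reports it, ignoring any
    # warnings GDB may have appended to the captured output.
    match = re.search(r"elements to print is (\d+|unlimited)\.", output)
    if match is None or match.group(1) == "unlimited":
        return None
    return int(match.group(1))


output = (
    "Limit on string chars or array elements to print is 200.\n"
    "warning: (Internal error: pc 0x7ffff49ef495 in read in CU, but not in symtab.)\n"
)
print(output.split()[-1])            # the token the old code tried to convert
print(parse_print_elements(output))  # the limit itself
```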
In case the max steps are reached and the loop finished the current skip
buffer remains filled and not unrolled when the last lines are all
skipped values. Fix this by calling the collapse function and
potentially unroll the buffer in case it contains any values.
Fixes #907
The length is enough as the register column is joined with whitespaces
around it. Hence we can simply drop the increment and just use the raw
length to get rid of the double whitespace.
Skipped lines create cognitive load as it takes a bit to figure out how
many lines are actually collapsed. Instead we just create a label
showing the count of omitted lines.
Buffer all repeating lines and check the minimum value when to start
marking them with skip lines. In case the minimum value is not hit, just
unroll the buffer.
To stop skipping any lines, there is the existing bool config
telescope-skip-repeat-val so we avoid adding special values to minimum
like -1 and keep the setting separated.
Fixes #803
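The buffering approach described above can be sketched as follows. The function name, the label text and the default minimum are assumptions made for illustration, not the values used by the telescope command.

```python
from typing import List, Optional


def collapse_repeats(lines: List[str], minimum: int = 3) -> List[str]:
    result: List[str] = []
    skipped: List[str] = []  # buffer of lines equal to the last emitted one
    last: Optional[str] = None

    def flush() -> None:
        # Emit one label if enough lines were buffered, else unroll them.
        if not skipped:
            return
        if len(skipped) >= minimum:
            result.append(f"... skipped {len(skipped)} lines")
        else:
            result.extend(skipped)
        skipped.clear()

    for line in lines:
        if line == last:
            skipped.append(line)
            continue
        flush()
        result.append(line)
        last = line

    # Flush once more so trailing repeats are not lost when the loop ends.
    flush()
    return result


print(collapse_repeats(["a", "0", "0", "0", "0", "b", "b", "c"]))
```

The final `flush()` call covers the case where the last lines are all repeated values.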
Currently this function is only used for the backtrace context and does
not prefix the frame pointers in hex form, which can be annoying if the
value is copied to be inspected or otherwise processed.
This can be a useful command to quickly execute some radare2 operations
in various positions in mid of a debugging session without the need to
shell out and temporarily transfer process control to radare2.
Exception driven code flow for expected code paths is not great for
readability and suffers some performance degeneration that can be
circumvented with conditional checks.
Use exceptions exclusively for fatal failure handling and either return
a simple string from the decompile function or throw an exception.
If we are trying to decompile a running binary which is a PIE, we need
to make sure to pass the appropriate base address to radare2 to be used
when loading a new binary.
Furthermore set io.cache to fix relocations in disassembly and avoid a
warning from the r2pipe.
As the source code and the decompiled sources are essentially the same
thing, let's just reuse the existing code prefix marker to indicate the
current line instead of using a hardcoded plain string.
A comment compatible marker is used before the syntax highlighter to
avoid any highlight and parsing confusion which is later replaced by the
colorized variant of the prefix marker before returning the results.
Furthermore we only replace the amount of indented spaces that is
required to fill the space for the code prefix marker.
The logic was reversed leading to not showing ghidra context if the
source could be found. Instead, we continue with ghidra decompilation
if we can't find the file.
Splitting the logic into ghidra related functionality, context
processing and plain command invocation makes the code better structured
and the individual files smaller.
* feature(radare2): add alias radare2 to r2 command
* feature(radare2): add argument to set base when loading for PIE
Depending on the use case, one may want to have either the same
addresses for PIE as in gdb or just use the non rebased plain addresses
without taking the current memory mapping into account.
* fix(radare2): fix relocations in disassembly warning by enabling io.cache
Commit 8f33ec4 made `pwndbg.symbol.address` discard the addresses
of symbols that are not mapped.
Unfortunately this broke pwndbg's `start`.
GDB's `start` puts a temporary breakpoint on `main` and pwndbg's `start` does
the same, but the first time GDB is asked for the address of `main` it
returns an offset, because the symbol is not mapped yet.
The offset is then discarded and pwndbg doesn't set the breakpoint when
it should.
This PR fixes pwndbg's `start` by allowing `pwndbg.symbol.address` to
return offsets instead of addresses: GDB will resolve the correct
address when it builds the breakpoint and pwndbg's `start` will behave
like GDB's `start`.
* Added option to compact the register list when displayed in context view.
* Disabled compact register list as default setting.
* Moved calculate_padding method outside the compact_regs function and renamed it to calculate_padding_to_align
* Replaced custom get_text_length function with existing pwndbg.color.strip function
* Moved width calculation in context_regs to top so that other function calls can reuse the calculated width value
* Clarified show-compact-regs-space configuration option description
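The alignment logic behind the compact register list can be sketched like this. It is a simplified illustration, not the actual implementation: `strip_colors` stands in for the existing color-stripping function, and `compact_lines` with its `min_width` and `space` parameters is a hypothetical layout helper.

```python
import re

ANSI_RE = re.compile(r"\x1b\[[0-9;]*m")


def strip_colors(text):
    """Remove ANSI color sequences so only visible characters are counted."""
    return ANSI_RE.sub("", text)


def calculate_padding_to_align(length, align):
    """Spaces needed to pad `length` up to the next multiple of `align`."""
    return 0 if length % align == 0 else align - (length % align)


def compact_lines(entries, width, min_width=20, space=2):
    """Pack entries into rows whose columns start at multiples of min_width."""
    lines, line, line_len = [], "", 0
    for entry in entries:
        entry_len = len(strip_colors(entry))
        if not line:
            line, line_len = entry, entry_len
            continue
        pad = calculate_padding_to_align(line_len, min_width)
        if pad < space:  # keep at least `space` blanks between entries
            pad += min_width
        if line_len + pad + entry_len > width:
            lines.append(line)  # entry does not fit: start a new row
            line, line_len = entry, entry_len
        else:
            line += " " * pad + entry
            line_len += pad + entry_len
    if line:
        lines.append(line)
    return lines
```

Computing the available width once and passing it in, as the list above describes for `context_regs`, lets every helper work from the same value.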
This commit fixes the issue described in https://github.com/pwndbg/pwndbg/issues/772#issuecomment-652260420
tl;dr: when we displayed a known constant call instruction like
`call qword ptr [rip + 0x1c7d2]` that called a known symbol, we only
displayed the target address instead of the symbol.
Now, we will display only the symbol.
Note that there are still cases when we can display both. This can
happen for example for a `ret` instruction (even without emulation).
This commit extends a comment around that code to give such example.
* vmmap command: show offset for single addresses
Changes vmmap display so when it is invoked with a single address, it
will also display its offset.
Before this commit:
```
pwndbg> vmmap $rsp
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
0x7ffffffde000 0x7ffffffff000 rw-p 21000 0 [stack]
```
After this commit:
```
pwndbg> vmmap $rsp
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
0x7ffffffde000 0x7ffffffff000 rw-p 21000 0 [stack] +0x1fb60
```
* Remove ugly hack :)
I believe the `arch_to_regs[pwndbg.arch.current][item]` expression is dead code.
I stumbled upon this while debugging another issue:
* The `arch_to_regs` is a dict mapping str -> RegisterSet objects
* So `arch_to_regs[pwndbg.arch.current]` gets a `RegisterSet`
* Now, the `RegisterSet` doesn't support subscripting (it has no `__getitem__` magic method)
This can also be seen below:
```
>>> pwndbg.regs.arch_to_regs
{'i386': <pwndbg.regs.RegisterSet object at 0x7f931020b048>, 'x86-64': <pwndbg.regs.RegisterSet object at 0x7f931020b080>, 'mips': <pwndbg.regs.RegisterSet object at 0x7f9310212f98>, 'sparc': <pwndbg.regs.RegisterSet object at 0x7f9310212b38>, 'arm': <pwndbg.regs.RegisterSet object at 0x7f930ee7a6a0>, 'armcm': <pwndbg.regs.RegisterSet object at 0x7f930ee7aba8>, 'aarch64': <pwndbg.regs.RegisterSet object at 0x7f931020b0b8>, 'powerpc': <pwndbg.regs.RegisterSet object at 0x7f9310212518>}
>>> pwndbg.arch.current
'x86-64'
>>> pwndbg.regs.arch_to_regs[pwndbg.arch.current]
<pwndbg.regs.RegisterSet object at 0x7f931020b080>
>>> pwndbg.regs.arch_to_regs[pwndbg.arch.current]['$rax']
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'RegisterSet' object is not subscriptable
>>> pwndbg.regs.arch_to_regs[pwndbg.arch.current][0]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'RegisterSet' object does not support indexing
>>>
```
When pyelftools lacked a PT_* name, invoking the `xinfo` command on a library with such a program header crashed the command, because we assumed the ptype would always be a string.
Instead, since pyelftools didn't have the name, the type of this program header is an int.
So this commit takes this into account and omits those program headers that we weren't able to name due to lack of info in pyelftools.
This solution is not ideal as we might miss something, but given that we only care about specific program header names, I don't think it will be a big deal for us.
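The filtering idea can be sketched as below. Plain dicts stand in for the program headers here, and `named_program_headers` is a hypothetical helper name, so this shows only the type check and not the actual `xinfo` code.

```python
def named_program_headers(headers):
    """Keep only headers whose p_type was resolved to a 'PT_*' string.

    An unnamed type shows up as a raw int, so those entries are skipped
    instead of being treated as strings.
    """
    return [h for h in headers if isinstance(h["p_type"], str)]


headers = [
    {"p_type": "PT_LOAD", "p_vaddr": 0x0},
    {"p_type": 0x6474E553, "p_vaddr": 0x338},  # type without a known name
    {"p_type": "PT_DYNAMIC", "p_vaddr": 0x2DE8},
]
for header in named_program_headers(headers):
    print(header["p_type"], hex(header["p_vaddr"]))
```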
This commit fixes the issue described in #749.
During disasm output, we enhance the display to show additional information of the instructions.
When a future instruction executes a branch instruction (jmp/call), we fetch the next instruction based on the jmp/call target, as long as we can calculate it statically.
If we can calculate it statically, we will then display the target of the jmp/call as the next instruction, for example:
```
> 0x5555555545fe <main+4> jmp main+4 <0x5555555545fe>
v
> 0x5555555545fe <main+4> jmp main+4 <0x5555555545fe>
```
The issue is that we mark both instructions as "current", highlighting both of them, which makes it a bit ambiguous "where we are".
While this view is _kinda valid_ as the PC is really the same, we want to mark/highlight only the first instruction we are on, as it is the one that is being executed right now and the program might go some other path in the future.
This commit fixes this display by simply making it so that the `nearpc` function/command used to display disasm shows the marker only once, the first time it shows the current PC instruction.
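The "mark only once" behaviour can be sketched as follows. This is an illustration only: `mark_current`, the `(address, text)` tuples, and the `>` marker are assumptions, not the real `nearpc` data structures.

```python
def mark_current(instructions, pc, marker=">"):
    """Render (address, text) pairs, marking only the first one at `pc`."""
    lines = []
    marked = False
    for address, text in instructions:
        if address == pc and not marked:
            prefix = marker
            marked = True  # later instructions at the same PC stay unmarked
        else:
            prefix = " " * len(marker)
        lines.append("%s %#x %s" % (prefix, address, text))
    return lines


# A jump to itself, as in the example above: the same address appears twice.
insns = [(0x5555555545FE, "jmp main+4"), (0x5555555545FE, "jmp main+4")]
for line in mark_current(insns, 0x5555555545FE):
    print(line)
```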