diff options
author | Martin Liska <mliska@suse.cz> | 2022-08-03 11:28:10 +0200 |
---|---|---|
committer | Martin Liska <mliska@suse.cz> | 2022-08-08 13:51:14 +0200 |
commit | e441b55e94c905c6ee4417be3e5d88021d9c5aa6 (patch) | |
tree | 4b397c69f3aabe80666a08208d05b6485c0c2977 /.editorconfig | |
parent | 06ce017c7d73f90db1c99f9e6a56c486b0313864 (diff) |
add splay tree for info_ptr -> CU mappinglinaro-local/ci/tcwg_kernel/gnu-master-arm-stable-defconfiglinaro-local/ci/tcwg_gnu_native_build/master-aarch64linaro-local/ci/tcwg_gcc_check_bootstrap/master-aarch64-check_bootstrap_ltolinaro-local/ci/tcwg_gcc_bootstrap/master-arm-bootstrap_ubsanlinaro-local/ci/tcwg_gcc_bootstrap/master-arm-bootstrap_ltolinaro-local/ci/tcwg_bmk_gnu_apm/gnu-master-arm-spec2k6-Os_LTO
While using perf top for MozillaThunderbird I noticed quite some slow
dissably call with source code involved. E.g.
time ./objdump --start-address=0x0000000004e0dcd0 --stop-address=0x0000000004e0df8b -l -d --no-show-raw-insn -S -C /usr/lib64/thunderbird/libxul.so
took 2.071s and I noticed quite some time is spent in
find_abstract_instance:
33.46% objdump objdump [.] find_abstract_instance
18.22% objdump objdump [.] arange_add
13.77% objdump objdump [.] read_attribute_value
4.82% objdump objdump [.] comp_unit_maybe_decode_line_info
3.10% objdump libc.so.6 [.] __memset_avx2_unaligned_erms
where linked list of CU is iterated when searing for where info_ptr
belongs to:
: 3452 for (u = unit->prev_unit; u != NULL; u = u->prev_unit)
0.00 : 4c61f7: mov 0x10(%rbx),%rax
0.00 : 4c61fb: test %rax,%rax
0.00 : 4c61fe: je 4c6215 <find_abstract_instance+0x365>
: 3453 if (info_ptr >= u->info_ptr_unit && info_ptr < u->end_ptr)
0.00 : 4c6200: cmp 0x60(%rax),%rdx
83.20 : 4c6204: jb 4c620c <find_abstract_instance+0x35c>
0.00 : 4c6206: cmp 0x78(%rax),%rdx
6.89 : 4c620a: jb 4c6270 <find_abstract_instance+0x3c0>
: 3452 for (u = unit->prev_unit; u != NULL; u = u->prev_unit)
0.00 : 4c620c: mov 0x10(%rax),%rax
7.90 : 4c6210: test %rax,%rax
0.00 : 4c6213: jne 4c6200 <find_abstract_instance+0x350>
The following scan can be replaced with search in a splay tree and with
that I can get to 1.5s and there are other symbols where the difference
is even bigger.
bfd/ChangeLog:
PR 29081
* dwarf2.c (struct addr_range): New.
(addr_range_intersects): Likewise.
(splay_tree_compare_addr_range): Likewise.
(splay_tree_free_addr_range): Likewise.
(struct dwarf2_debug_file): Add comp_unit_tree.
(find_abstract_instance): Use the splay tree when searching
for a info_ptr.
(stash_comp_unit): Insert to the splay tree.
(_bfd_dwarf2_cleanup_debug_info): Clean up the splay tree.
Diffstat (limited to '.editorconfig')
0 files changed, 0 insertions, 0 deletions