Looking at yourself


Linux introspection tales


Arnaldo Carvalho de Melo
acme@redhat.com

What is this about?



  • Type information
  • Introspection
  • Adaptation

shrinking sockets



  • Linux 2.4
  • struct sock
  • Big union
  • All protocols
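
A minimal sketch of why one big union is costly: every socket pays for the largest protocol, whichever protocol it actually uses. The struct names and sizes below are invented for illustration and are not the Linux 2.4 definitions.

```c
#include <stddef.h>

/* Hypothetical per-protocol private areas; sizes are made up. */
struct tcp_private { char state[400]; };
struct udp_private { char state[40]; };
struct raw_private { char state[16]; };

/* One socket type for all protocols: the union is as big as its largest member. */
struct big_sock {
	int family;
	union {
		struct tcp_private tcp;
		struct udp_private udp;
		struct raw_private raw;
	} proto;
};

/* Bytes of the union a UDP socket carries around without using them. */
size_t wasted_for_udp(void)
{
	/* sizeof does not evaluate its operand, so the null pointer is never dereferenced */
	return sizeof(((struct big_sock *)0)->proto) - sizeof(struct udp_private);
}
```

With these made-up sizes a UDP socket wastes 360 bytes of the union.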

shrunk socket



  • Reduce CPU cache utilization
  • Reorder struct fields
  • To better pack
  • Demote some
  • Remove alignment holes
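
The effect of reordering can be sketched with two hypothetical structs that hold the same fields. The field names are invented; the point is only that sorting the larger member first lets the small ones share one slot.

```c
#include <stddef.h>

/* Fields in "as written" order: each char is followed by padding. */
struct before {
	char flag;   /* 1 byte, then a hole up to the alignment of 'count' */
	long count;
	char state;  /* 1 byte, then tail padding */
};

/* Same fields, largest first: the two chars sit next to each other. */
struct after {
	long count;
	char flag;
	char state;
};

/* Bytes saved per object by the reordering. */
size_t bytes_saved(void)
{
	return sizeof(struct before) - sizeof(struct after);
}
```

On a typical x86-64 Linux build this is 24 versus 16 bytes; other ABIs give other numbers, but the reordered struct is never larger.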

socket hierarchy



  • struct sock
  • struct tcp_sock
  • struct tcp6_sock
  • struct udp_sock
  • And all the others
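
A hierarchy like this is usually expressed in C by embedding the base struct as the first member. The sketch below uses heavily simplified stand-ins, not the kernel definitions, and the fields are hypothetical.

```c
#include <stddef.h>

/* Simplified stand-in for the common base. */
struct sock {
	int family;
	int state;
};

/* Protocol-specific sockets embed the base as their FIRST member. */
struct tcp_sock {
	struct sock  sk;
	unsigned int snd_cwnd;   /* hypothetical TCP-only field */
};

struct udp_sock {
	struct sock sk;
	int         pending;     /* hypothetical UDP-only field */
};

/* Because 'sk' is at offset 0, a pointer to it is also a pointer to the container. */
struct tcp_sock *tcp_sk(struct sock *sk)
{
	return (struct tcp_sock *)sk;
}
```

Each protocol now allocates only what it needs, while generic code keeps working on `struct sock *`.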

Tedious



  • Manual
  • Can we automate this?
  • gdb knows about types, how?

DWARF



  • Debugging With Attributed Record Formats
  • Executable and Linkable Format's friend
  • More recently we got ORC too

pahole



  • Read DWARF
  • Rebuild source from type info
  • Augmented with alignment info
  • Showing the holes
  • Cacheline boundaries
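
The hole-finding step can be sketched as a walk over member offsets and sizes. This is an illustration of the idea only, not pahole's code: the member table is filled from `offsetof()` here, whereas pahole gets the same numbers from the type information. The 64-byte cacheline is an assumption.

```c
#include <stddef.h>
#include <stdio.h>

struct member {
	const char *name;
	size_t      offset;
	size_t      size;
};

/* Hypothetical struct to inspect. */
struct sample {
	char a;
	long b;
	char c;
};

#define MEMBER(type, field) \
	{ #field, offsetof(type, field), sizeof(((type *)0)->field) }

const struct member sample_members[] = {
	MEMBER(struct sample, a),
	MEMBER(struct sample, b),
	MEMBER(struct sample, c),
};

/* Sum the gaps between consecutive members, printing each one. */
size_t sum_holes(const struct member *m, size_t n)
{
	size_t holes = 0, end = 0;

	for (size_t i = 0; i < n; i++) {
		if (m[i].offset > end) {
			printf("\t/* XXX %zu bytes hole, try to pack */\n",
			       m[i].offset - end);
			holes += m[i].offset - end;
		}
		end = m[i].offset + m[i].size;
	}
	return holes;
}

/* Number of cachelines a struct of this size touches. */
size_t cachelines(size_t struct_size, size_t line_size)
{
	return (struct_size + line_size - 1) / line_size;
}
```

Tail padding after the last member is not counted as a hole in this sketch.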

Example


$ pahole -C list_head ~/git/build/v5.18-rc6+/vmlinux
struct list_head {
	struct list_head *         next;                 /*     0     8 */
	struct list_head *         prev;                 /*     8     8 */

	/* size: 16, cachelines: 1, members: 2 */
	/* last cacheline: 16 bytes */
};
$
					

CERN ATLAS Migration



  • 32-bit to 64-bit
  • C++
  • long/pointer: 32 to 64-bit
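
Why such a migration opens new holes can be sketched with a tiny layout calculator. It assumes natural alignment (each scalar member aligned to its own size), which is a simplification of real ABIs, and models a hypothetical struct `{ int id; void *next; int flags; }`.

```c
#include <stddef.h>

/* Member sizes of the hypothetical struct under each data model. */
const size_t ilp32_sizes[] = { 4, 4, 4 };  /* 32-bit: pointer is 4 bytes */
const size_t lp64_sizes[]  = { 4, 8, 4 };  /* 64-bit: pointer is 8 bytes */

/*
 * Struct size for scalar members laid out in order, assuming each
 * member is aligned to its own size (sizes must be >= 1).
 */
size_t layout_size(const size_t *sizes, size_t n)
{
	size_t off = 0, max_align = 1;

	for (size_t i = 0; i < n; i++) {
		size_t align = sizes[i];

		off = (off + align - 1) / align * align;  /* pad up to alignment */
		off += sizes[i];
		if (align > max_align)
			max_align = align;
	}
	/* tail padding so that arrays of the struct stay aligned */
	return (off + max_align - 1) / max_align * max_align;
}
```

The pointer grows by 4 bytes, but the struct grows from 12 to 24 because a hole appears before the pointer and padding appears at the tail.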

pahole --reorganize



  • To eliminate holes
  • Demotes bitfields
  • show-reorg-steps
  • __aligned__ attribute
  • false sharing
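
The link between alignment attributes and false sharing can be sketched as follows. A reorganizer has to respect explicit alignment, because it is often there on purpose to keep hot fields on separate cachelines. The 64-byte cacheline and the counter names are assumptions.

```c
#include <stdalign.h>
#include <stddef.h>

#define CACHELINE 64  /* assumption: 64-byte cachelines */

/* Both counters share a cacheline: two CPUs updating them will contend. */
struct counters_shared {
	unsigned long rx;
	unsigned long tx;
};

/* Explicit alignment puts each counter on its own cacheline. */
struct counters_split {
	alignas(CACHELINE) unsigned long rx;
	alignas(CACHELINE) unsigned long tx;
};

/* Cacheline index of a member, relative to the start of the struct. */
size_t line_of(size_t offset)
{
	return offset / CACHELINE;
}
```

The split version is deliberately larger; the "hole" after `rx` must not be packed away.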

pahole --reorganize 2



  • DWARF now has attribute align
  • Use that in the reorg algo

CTF



  • DTrace
  • Solaris
  • In the kernel image
  • Introspection
  • SparcLinux

Multi-format



  • Type info agnostic
  • DWARF and CTF
  • CTF reader
  • CTF encoder

Conversion



  • From DWARF to CTF
  • For testing

DWARF problems



  • kernel community problems with unwinding
  • object files: Compile Units
  • All types represented per CU
  • debuginfo files are big
  • hundreds of megabytes

BPF needs type info



  • To pretty print maps
  • Later: CO-RE

BTF



  • BPF Type Format
  • Reused the CTF infra in pahole
  • First BTF producer: clang bpf target
  • First BTF consumer: Linux kernel

BTF dedup



  • Looks at all objects
  • Removes duplicates
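
A toy version of the idea, assuming types are flat records that can be compared field by field. Real type information forms a graph of references between types, so the actual deduplication algorithm is considerably more involved; the record layout and names here are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flat type record, as each compile unit would emit it. */
struct type_rec {
	int         kind;
	const char *name;
	unsigned    size;
};

static int same_type(const struct type_rec *a, const struct type_rec *b)
{
	return a->kind == b->kind && a->size == b->size &&
	       strcmp(a->name, b->name) == 0;
}

/*
 * Copy each distinct record from in[0..n) into out[] once.
 * remap[i] receives the new id of in[i], so references can be rewritten.
 * Returns the number of unique records. Quadratic, for clarity only.
 */
size_t dedup_types(const struct type_rec *in, size_t n,
		   struct type_rec *out, size_t *remap)
{
	size_t nr = 0;

	for (size_t i = 0; i < n; i++) {
		size_t j;

		for (j = 0; j < nr; j++)
			if (same_type(&in[i], &out[j]))
				break;
		if (j == nr)
			out[nr++] = in[i];
		remap[i] = j;
	}
	return nr;
}
```

The remap table is what lets every user of a duplicate be pointed at the single surviving copy.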


					

BTF reader



  • For testing
  • /sys/kernel/btf/vmlinux
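
A minimal sketch of the first step a reader takes: validating the header at the start of a raw BTF blob. The struct mirrors the header layout from the kernel UAPI but is redefined under a different name so the example does not depend on kernel headers; the `_sketch` names and the function are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BTF_MAGIC_SKETCH 0xeB9F

/* Mirrors the on-disk BTF header; section offsets are relative to its end. */
struct btf_header_sketch {
	uint16_t magic;
	uint8_t  version;
	uint8_t  flags;
	uint32_t hdr_len;
	uint32_t type_off;
	uint32_t type_len;
	uint32_t str_off;
	uint32_t str_len;
};

/* Returns 0 if buf starts with a plausible native-endian BTF header, -1 otherwise. */
int parse_btf_header(const void *buf, size_t len, struct btf_header_sketch *hdr)
{
	if (len < sizeof(*hdr))
		return -1;
	memcpy(hdr, buf, sizeof(*hdr));
	if (hdr->magic != BTF_MAGIC_SKETCH)
		return -1;  /* not BTF, or written with the other endianness */
	if (hdr->hdr_len < sizeof(*hdr))
		return -1;
	/* both sections must lie inside the buffer */
	if ((uint64_t)hdr->hdr_len + hdr->type_off + hdr->type_len > len)
		return -1;
	if ((uint64_t)hdr->hdr_len + hdr->str_off + hdr->str_len > len)
		return -1;
	return 0;
}
```

To try it on a live system, read /sys/kernel/btf/vmlinux into a buffer and pass that buffer and its length to the function.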
  • Wow, that is fast!

pahole + btf examples



  • fill

Where is it?



  • libbpf loads it
  • kernel verifies
  • Gets associated to the prog/map fd
  • Tools can retrieve it

bpftool btf


  • Generates vmlinux.h
  • With all kernel types

$ bpftool btf dump file /sys/kernel/btf/vmlinux format c

pahole --compile



  • Reconstruct compilable source code
  • Like bpftool

program lines



  • .BTF_ext ELF section
  • perf annotate
  • BPF source code

BPF annotate example



  • Get screenshots from the Lisbon LPC presentation

BTF tags



  • clang generates new DWARF tags
  • pahole converts to BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG
  • The kernel BPF verifier uses them
  • __rcu, __percpu, etc.
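
How such a tag gets attached in source can be sketched with the `btf_type_tag` attribute, guarded so that the code still builds with compilers that lack it. The macro names `TYPE_TAG` and `RCU_TAG` and the structs are hypothetical; the kernel wires its own annotations in a similar way.

```c
#include <stddef.h>

#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* Expands to the tag attribute where supported, to nothing elsewhere. */
#if __has_attribute(btf_type_tag)
# define TYPE_TAG(value) __attribute__((btf_type_tag(#value)))
#else
# define TYPE_TAG(value)
#endif

#define RCU_TAG TYPE_TAG(rcu)

struct config {
	int level;
};

struct holder {
	/* The tag is recorded in the debug info for the pointed-to type. */
	struct config RCU_TAG *cfg;
};

/* The tag does not change how the pointer is used in C. */
int config_level(const struct holder *h)
{
	return h->cfg ? h->cfg->level : -1;
}
```

The tag has no runtime effect; it only travels through the type information so that a consumer such as a verifier can act on it.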

New BPF/BTF features



  • pahole as an enabler
  • Eventually compiler will produce BTF
  • Convenient when developing new features

Rust



  • reordering of struct fields
  • pahole --reorganize done by rust
  • rejected by kernel BTF verifier
  • kernel build: pahole --lang_exclude rust
  • pahole should put fields in order instead
  • when generating BTF for rust kernel objects
  • DWARF compile_unit tag has DW_AT_language (and DW_AT_producer)

DWARF langs


#include <dwarf.h>	/* DW_LANG_* constants */

/* Maps DWARF language codes to lowercase names; elided entries are marked SNIP. */
static const char *languages[] = {
  [DW_LANG_Ada83]          = "ada83",
  /* SNIP */
  [DW_LANG_C11]            = "c11",
  [DW_LANG_C89]            = "c89",
  [DW_LANG_C99]            = "c99",
  [DW_LANG_C]              = "c",
  [DW_LANG_Cobol74]        = "cobol74",
  /* SNIP */
  [DW_LANG_C_plus_plus_14] = "c++14",
  [DW_LANG_C_plus_plus]    = "c++",
  [DW_LANG_D]              = "d",
  [DW_LANG_Dylan]          = "dylan",
  [DW_LANG_Fortran03]      = "fortran03",
  /* SNIP */
  [DW_LANG_PLI]            = "pli",
  [DW_LANG_Python]         = "python",
  [DW_LANG_RenderScript]   = "renderscript",
  [DW_LANG_Rust]           = "rust",
};
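
A table like this makes a language filter straightforward. The sketch below is not pahole's implementation: it carries a small subset of the DWARF language codes under local `LANG_*` names so that it builds without dwarf.h, and the comma-separated exclude list is an assumption for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Subset of the DWARF language codes, local names to avoid dwarf.h. */
enum {
	LANG_C89         = 0x01,
	LANG_C           = 0x02,
	LANG_C_plus_plus = 0x04,
	LANG_C99         = 0x0c,
	LANG_Rust        = 0x1c,
	LANG_C11         = 0x1d,
};

static const char *lang_names[] = {
	[LANG_C89]         = "c89",
	[LANG_C]           = "c",
	[LANG_C_plus_plus] = "c++",
	[LANG_C99]         = "c99",
	[LANG_Rust]        = "rust",
	[LANG_C11]         = "c11",
};

/* Name for a language code, or NULL if it is not in the table. */
const char *lang_name(unsigned lang)
{
	if (lang >= sizeof(lang_names) / sizeof(lang_names[0]))
		return NULL;
	return lang_names[lang];
}

/* 1 if the compile unit's language appears in the comma-separated list. */
int lang_excluded(unsigned lang, const char *exclude_csv)
{
	const char *name = lang_name(lang);
	size_t len;

	if (!name || !exclude_csv)
		return 0;
	len = strlen(name);
	for (const char *p = exclude_csv; *p; ) {
		const char *comma = strchr(p, ',');
		size_t tok = comma ? (size_t)(comma - p) : strlen(p);

		if (tok == len && strncmp(p, name, len) == 0)
			return 1;
		p = comma ? comma + 1 : p + tok;
	}
	return 0;
}
```

A BTF encoder would call something like `lang_excluded()` once per compile unit and skip the whole unit on a match.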
					

BTFgen



  • For older kernels
  • Generates the kernel BTF needed by a prog
  • Modified libbpf looks at it as the kernel BTF
  • Does its CO-RE work
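
The core of that CO-RE work can be sketched as a lookup: instead of trusting the field offset baked in at build time, the loader finds the field by name in the target kernel's type information and patches the program. The tables, offsets and function name below are hypothetical; a trimmed-down type description only needs the entries a given program asks for.

```c
#include <stddef.h>
#include <string.h>

struct field_desc {
	const char *name;
	size_t      offset;
};

/* Layout the program was compiled against (made-up offsets). */
const struct field_desc build_kernel[] = {
	{ "state", 0 }, { "pid", 8 }, { "comm", 16 },
};

/* Layout on the kernel where the program is loaded (made-up offsets). */
const struct field_desc target_kernel[] = {
	{ "state", 0 }, { "flags", 8 }, { "pid", 16 }, { "comm", 24 },
};

/* Offset of 'name' in the given layout, or -1 if the field is missing. */
long relocate_field(const struct field_desc *layout, size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(layout[i].name, name) == 0)
			return (long)layout[i].offset;
	return -1;
}
```

Here an access to `pid` compiled as "offset 8" would be rewritten to "offset 16" at load time.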

Tetragon



  • BTF required
  • N-level filtering