Uninitialized AVTextFormatDataDump causes UB in avtext_print_data()

Olivier Laflamme

When ffprobe is invoked with -show_data but without -data_dump_format, the local variable data_dump_format_id in main() is never assigned. The uninitialized value propagates into AVTextFormatOptions.data_dump_format and eventually reaches a switch statement in avtext_print_data(). The indeterminate enum value falls into the default: arm, which calls av_unreachable("Invalid data dump type"). In release builds, av_unreachable() compiles to __builtin_unreachable(), so reaching it is undefined behavior. In assert-enabled builds, it aborts. Either way, ffprobe crashes.

Any invocation of ffprobe -show_data -show_streams or ffprobe -show_data -show_packets without an explicit -data_dump_format xxd triggers the bug. Automated media pipelines (transcoding backends, CDN ingest, forensics tooling) that shell out to ffprobe -show_data to inspect uploads will crash on every file processed.

Root cause

The commit 75f5d79f6a introduces -data_dump_format as a new CLI option with a corresponding data_dump_format_id local variable, but never initializes it. The code has two paths:

  1. data_dump_format provided (e.g. "xxd"): the string pointer is non-NULL, the if (data_dump_format) block executes, and data_dump_format_id is assigned AV_TEXTFORMAT_DATADUMP_XXD (value 0). The switch in avtext_print_data() hits case 0, calls print_data_xxd(), clean exit.
  2. data_dump_format omitted (default usage): the string pointer is NULL, the if block is skipped, data_dump_format_id is never written. Stack garbage is copied into AVTextFormatOptions.data_dump_format.
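
The two paths can be sketched in isolation. This is a minimal model, not the ffprobe source: DataDump and resolve_data_dump_format() are hypothetical stand-ins, and only the XXD value of 0 is taken from the report. The sketch writes the default before the NULL check, so the omitted-option path never leaves the output unassigned.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for AVTextFormatDataDump; only XXD = 0 comes from the report. */
typedef enum DataDump {
    DATADUMP_XXD = 0
} DataDump;

/*
 * Hypothetical helper modeling the option handling in main().
 * Returns 0 on success, -1 if the format name is not recognized.
 */
static int resolve_data_dump_format(const char *name, DataDump *out)
{
    /* Path 2 (option omitted): the default is written unconditionally,
     * so *out never carries an indeterminate value. */
    *out = DATADUMP_XXD;

    if (name) {
        /* Path 1 (option provided): map the string to an enum value. */
        if (strcmp(name, "xxd") != 0)
            return -1;
        *out = DATADUMP_XXD;
    }
    return 0;
}
```

In the buggy version, the equivalent of the first assignment is missing, so a NULL name leaves the caller's variable holding whatever was on the stack.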

The call chain from the uninitialized value to the crash:

main (ffprobe.c:3377)           – uninitialized data_dump_format_id propagated into opts
  probe_file (ffprobe.c:2607)
    read_packets (ffprobe.c:1729)
      read_interval_packets (ffprobe.c:1291)
        avtext_print_data (avtextformat.c:554) – switch on garbage value hits default:
          av_unreachable() – UB (__builtin_unreachable) or abort
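
The last two frames can be modeled on their own. The sketch below is an assumption-laden illustration, not the FFmpeg code: model_print_data() and MODEL_DUMP_XXD are hypothetical names, the function takes a plain int to stand for an arbitrary bit pattern, and the default arm returns an error code where the real code calls av_unreachable(), so the sketch stays well-defined for any input.

```c
#include <stdio.h>

/* Stand-in for the single known dump type (value 0 in the report). */
enum { MODEL_DUMP_XXD = 0 };

/*
 * Hypothetical model of the dispatch in avtext_print_data().
 * raw_format plays the role of the value copied out of the options struct.
 * Returns 0 when a known format is handled, -1 for anything else.
 */
static int model_print_data(int raw_format)
{
    switch (raw_format) {
    case MODEL_DUMP_XXD:
        /* The real code calls print_data_xxd() here. */
        return 0;
    default:
        /* The real code calls av_unreachable() here; any value other
         * than a known format lands in this arm. */
        fprintf(stderr, "invalid data dump type: %d\n", raw_format);
        return -1;
    }
}
```

Any value other than 0, which is what leftover stack contents will usually be, selects the default arm. That is why the crash depends only on the option being omitted and not on the input file.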

MSan confirms the use-of-uninitialized-value at avtext_print_data, and the symbolized stack matches exactly.

==155062==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x561c2127c270  (/home/FFmpeg/ffprobe+0x323270) (BuildId: b756367ea3007dc6b9f624f950562076fdf5fda5)
    #1 0x561c212a082e  (/home/FFmpeg/ffprobe+0x34782e) (BuildId: b756367ea3007dc6b9f624f950562076fdf5fda5)
    #2 0x561c2129c873  (/home/FFmpeg/ffprobe+0x343873) (BuildId: b756367ea3007dc6b9f624f950562076fdf5fda5)
    #3 0x561c2129a638  (/home/FFmpeg/ffprobe+0x341638) (BuildId: b756367ea3007dc6b9f624f950562076fdf5fda5)
    #4 0x561c21297917  (/home/FFmpeg/ffprobe+0x33e917) (BuildId: b756367ea3007dc6b9f624f950562076fdf5fda5)
    #5 0x7f38e28461c9 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #6 0x7f38e284628a in __libc_start_main csu/../csu/libc-start.c:360:3
    #7 0x561c211e1494  (/home/FFmpeg/ffprobe+0x288494) (BuildId: b756367ea3007dc6b9f624f950562076fdf5fda5)
MemorySanitizer: use-of-uninitialized-value (/home/FFmpeg/ffprobe+0x323270) (BuildId: b756367ea3007dc6b9f624f950562076fdf5fda5)
Exiting

for addr in 0x323270 0x34782e 0x343873 0x341638 0x33e917; do addr2line -e ./ffprobe_g $addr -f -p; done
avtext_print_data at /home/FFmpeg/fftools/textformat/avtextformat.c:554
read_interval_packets at /home/FFmpeg/fftools/ffprobe.c:1291
read_packets at /home/FFmpeg/fftools/ffprobe.c:1729
probe_file at /home/FFmpeg/fftools/ffprobe.c:2607
main at /home/FFmpeg/fftools/ffprobe.c:3377

To reproduce, generate a test media file with ./ffmpeg -y -f lavfi -i "anullsrc=r=8000:cl=mono" -t 0.1 -f wav ~/test.wav, then run ffprobe on it with -show_data and without -data_dump_format. This takes the uninitialized path and triggers the bug.


With -data_dump_format xxd explicitly set, the same invocation exits cleanly with code 0.

The fix is to initialize the variable with an explicit default: AVTextFormatDataDump data_dump_format_id = AV_TEXTFORMAT_DATADUMP_XXD; The change was pushed in https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/22484/commits