zdb: SIGSEGV in dsl_deadlist_close while verifying pool checksum #16959

Open
zzy9001 opened this issue Jan 17, 2025 · 0 comments
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

zzy9001 commented Jan 17, 2025

System information

Type                  | Version/Name
Distribution Name     | CachyOS
Distribution Version  | rolling
Kernel Version        | 6.6.69-1-cachyos-lts
Architecture          | x86_64
OpenZFS Version       | zfs-2.3.0-1; zfs-kmod-2.2.7-1

Describe the problem you're observing

After several unexpected system shutdowns, the ZFS pool appeared to function normally once the system was restarted. As a precaution, I ran zdb -cc to verify data integrity, but the command crashes with SIGSEGV while traversing blocks for checksum verification. The crash occurs in deadlist processing (dsl_deadlist_close). Before the crash, zdb reported multiple "error 52" read failures on various blocks.
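
The "error 52" above is a raw errno value. As a rough aid for reading it, the sketch below looks up the symbolic name of an errno on the current platform. On Linux, 52 is EBADE, which OpenZFS reuses as ECKSUM (checksum error). The `ZFS_ALIASES` table and the `describe_errno` helper are illustrative names made up for this example, not something zdb exposes.

```python
import errno
import os

# Assumption for illustration: on Linux, OpenZFS defines ECKSUM as EBADE,
# so an EBADE reported by zdb is read here as a checksum failure.
ZFS_ALIASES = {"EBADE": "ECKSUM (checksum mismatch)"}


def describe_errno(code: int) -> str:
    """Return a readable description of a raw errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    text = os.strerror(code)
    alias = ZFS_ALIASES.get(name)
    suffix = f", treated by ZFS as {alias}" if alias else ""
    return f"{code} = {name} ({text}){suffix}"


if __name__ == "__main__":
    print(describe_errno(52))
```

The symbolic name is platform dependent, so the result is only meaningful when run on the same kind of system that produced the log.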

Describe how to reproduce the problem

  1. Run the following command on the affected pool:
     sudo zdb -cc zpcachyos
  2. The command initially proceeds with block traversal and verification.
  3. Multiple "error 52" read errors appear.
  4. The process eventually crashes with SIGSEGV while processing a deadlist.
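
For anyone trying to reproduce this, the steps above can be wrapped in a small script that captures the full output and records whether the process died from a signal. This is a minimal sketch; `run_and_report` is a hypothetical helper name, and it relies on the POSIX convention that `subprocess` reports death by signal as a negative return code.

```python
import signal
import subprocess
import sys


def run_and_report(cmd: list[str]) -> tuple[int, str]:
    """Run cmd, capture stdout and stderr together, and note a fatal signal."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep errors interleaved with progress output
        text=True,
        errors="replace",
    )
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child was killed by a signal.
        sig = signal.Signals(-proc.returncode).name
        print(f"terminated by {sig}", file=sys.stderr)
    return proc.returncode, proc.stdout
```

Calling `run_and_report(["sudo", "zdb", "-cc", "zpcachyos"])` would run the command from step 1 and return the exit status together with the captured text, which can then be saved to a file and attached to the report. The output is buffered in memory until the command exits, which may take hours on a pool of this size.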

Include any warning/errors/backtraces from the system logs

Traversing all blocks to verify checksums and verify nothing leaked ...
loading concrete vdev 0, metaslab 231 of 232 ...
6.44G completed ( 103MB/s) estimated time remaining: 3hr 10min 51sec
zdb_blkptr_cb: Got error 52 reading <529, 18472, 0, 15>  -- skipping
zdb_blkptr_cb: Got error 52 reading <529, 979002, 0, 2>  -- skipping
zdb_blkptr_cb: Got error 52 reading <529, 980200, 0, 0>  -- skipping
zdb_blkptr_cb: Got error 52 reading <529, 2973302, 0, 6>  -- skipping
zdb_blkptr_cb: Got error 52 reading <529, 6645040, 0, 0>  -- skipping
zdb_blkptr_cb: Got error 52 reading <529, 7932296, 0, 7>  -- skipping
zdb_blkptr_cb: Got error 52 reading <529, 7932296, 0, 4>  -- skipping
zdb_blkptr_cb: Got error 52 reading <529, 7932296, 0, 0>  -- skipping
zdb_blkptr_cb: Got error 52 reading <529, 7946710, 0, 0>  -- skipping
zdb_blkptr_cb: Got error 52 reading <529, 7946684, 0, 0>  -- skipping
769G completed ( 125MB/s) estimated time remaining: 0hr 53min 09sec
zdb_blkptr_cb: Got error 52 reading <771, 0, -1, 0>  -- skipping
774G completed ( 125MB/s) estimated time remaining: 0hr 52min 35sec
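
To see which objects the read errors cluster on, the `zdb_blkptr_cb` lines above can be grouped with a short script. This is a sketch under assumptions: the four numbers in angle brackets are taken to be objset, object, level and block id, and `group_read_errors` is a hypothetical helper name. The block id is kept as text because its numeric base is not confirmed here.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   zdb_blkptr_cb: Got error 52 reading <529, 18472, 0, 15>  -- skipping
# Field meaning (objset, object, level, block id) is an assumption.
ERROR_LINE = re.compile(
    r"zdb_blkptr_cb: Got error (\d+) reading "
    r"<(-?\d+), (-?\d+), (-?\d+), ([0-9a-fA-F]+)>"
)


def group_read_errors(log_text: str) -> dict[tuple[int, int], list[tuple[int, str]]]:
    """Map (objset, object) to the (level, blkid) pairs that failed to read."""
    grouped = defaultdict(list)
    for match in ERROR_LINE.finditer(log_text):
        _err, objset, obj, level, blkid = match.groups()
        grouped[(int(objset), int(obj))].append((int(level), blkid))
    return dict(grouped)
```

Applied to the output above, this would show that almost all failures fall in objset 529, with three of them on object 7932296, and a single failure on <771, 0, -1, 0>.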

Call trace:
/usr/lib/libzpool.so.6(libspl_backtrace+0x38) [0x77be8c409db8]
zdb(+0xc50c) [0x555e20d8450c]
/usr/lib/libc.so.6(+0x42150) [0x77be8b8f4150]
/usr/lib/libc.so.6(pthread_mutex_lock+0x4) [0x77be8b954c74]
/usr/lib/libzpool.so.6(dsl_deadlist_close+0x229) [0x77be8c1ac6f9]
/usr/lib/libzpool.so.6(dsl_dir_hold_obj+0x86c) [0x77be8c1bda4c]
/usr/lib/libzpool.so.6(dsl_dataset_hold_obj+0x3bb) [0x77be8c19971b]
/usr/lib/libzpool.so.6(traverse_pool+0x200) [0x77be8c1731e0]
zdb(+0x240c3) [0x555e20d9c0c3]
zdb(+0xa812) [0x555e20d82812]
/usr/lib/libc.so.6(+0x2618e) [0x77be8b8d818e]
/usr/lib/libc.so.6(__libc_start_main+0x8a) [0x77be8b8d824a]
zdb(+0xb635) [0x555e20d83635]

fish: Job 1, 'sudo zdb -cc zpcachyos' terminated by signal SIGSEGV (Address boundary error)
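
The call trace shows the fault inside pthread_mutex_lock, called from dsl_deadlist_close, which was reached through dsl_dir_hold_obj, dsl_dataset_hold_obj and traverse_pool. To make the frames easier to hand to a debugger or symbolizer, a sketch like the following splits each line into module, symbol, offset and address. The `Frame` type and `parse_backtrace` function are hypothetical names, and the pattern only covers the line format seen in this trace.

```python
import re
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    module: str
    symbol: Optional[str]  # None when only an offset into the module is printed
    offset: int
    address: int


# Matches lines such as:
#   /usr/lib/libzpool.so.6(dsl_deadlist_close+0x229) [0x77be8c1ac6f9]
#   zdb(+0xc50c) [0x555e20d8450c]
FRAME_LINE = re.compile(
    r"^(?P<module>\S+?)\((?P<symbol>[^+()]*)\+0x(?P<offset>[0-9a-f]+)\)"
    r"\s+\[0x(?P<address>[0-9a-f]+)\]$"
)


def parse_backtrace(text: str) -> list[Frame]:
    """Extract stack frames from backtrace text, skipping unrelated lines."""
    frames = []
    for line in text.splitlines():
        m = FRAME_LINE.match(line.strip())
        if m:
            frames.append(Frame(
                m["module"],
                m["symbol"] or None,
                int(m["offset"], 16),
                int(m["address"], 16),
            ))
    return frames
```

Frames whose symbol is None, such as the `zdb(+0x240c3)` entries, would need debug symbols for the zdb binary to be resolved to function names.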

$ zpool status -v
  pool: zpcachyos
 state: ONLINE
  scan: scrub repaired 0B in 00:10:56 with 0 errors on Wed Jan 15 09:31:10 2025
config:

        NAME                                    STATE     READ WRITE CKSUM
        zpcachyos                               ONLINE       0     0     0
          a5d06edf-0256-4f19-9065-00613a102ba5  ONLINE       0     0     0

errors: No known data errors

zzy9001 added the Type: Defect Incorrect behavior (e.g. crash, hang) label Jan 17, 2025