panic: VERIFY0(spa_open(dsname, &spa, FTAG)) failed (0 == 2) #16988

Open
asomers opened this issue Jan 24, 2025 · 4 comments
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

Comments

@asomers
Contributor

asomers commented Jan 24, 2025

System information

| Type | Version/Name |
| --- | --- |
| Distribution Name | FreeBSD |
| Distribution Version | FreeBSD |
| Kernel Version | 15.0-CURRENT |
| Architecture | amd64 |
| OpenZFS Version | zfs-2.3.99-158-FreeBSD_gfae4c664a zfs-kmod-2.3.99-158-FreeBSD_gfae4c664a |

Describe the problem you're observing

I can produce this panic semi-reliably by using the ZFS test suite (on 3 out of 4 attempts). I am running FreeBSD 15.0-CURRENT with KMSAN enabled, plus a change that allows me to use the original raidz math routines rather than the SIMD ones (though this panic doesn't look raidz related).

Describe how to reproduce the problem

Run cli_root/zpool_destroy/zpool_destroy_test:zpool_destroy_004_pos in FreeBSD's ZFS test suite.

Include any warning/errors/backtraces from the system logs

panic: VERIFY0(spa_open(dsname, &spa, FTAG)) failed (0 == 2)

cpuid = 6
time = 1737674404
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x99/frame 0xfffffe0077abe260
vpanic() at vpanic+0x6f8/frame 0xfffffe0077abe3f0
spl_panic() at spl_panic+0x1c9/frame 0xfffffe0077abe500
dsl_prop_set_hasrecvd_impl() at dsl_prop_set_hasrecvd_impl+0x39e/frame 0xfffffe0077abe590
dsl_prop_set_hasrecvd() at dsl_prop_set_hasrecvd+0x1e1/frame 0xfffffe0077abe5f0
zfs_ioc_set_prop() at zfs_ioc_set_prop+0x2ba/frame 0xfffffe0077abe690
zfsdev_ioctl_common() at zfsdev_ioctl_common+0x1de4/frame 0xfffffe0077abe7f0
zfsdev_ioctl() at zfsdev_ioctl+0x2e6/frame 0xfffffe0077abe880
devfs_ioctl() at devfs_ioctl+0x3ed/frame 0xfffffe0077abe940
VOP_IOCTL_APV() at VOP_IOCTL_APV+0xfe/frame 0xfffffe0077abe9b0
vn_ioctl() at vn_ioctl+0x79a/frame 0xfffffe0077abeaa0
devfs_ioctl_f() at devfs_ioctl_f+0x165/frame 0xfffffe0077abeb50
kern_ioctl() at kern_ioctl+0xc84/frame 0xfffffe0077abec50
sys_ioctl() at sys_ioctl+0x57f/frame 0xfffffe0077abed60
amd64_syscall() at amd64_syscall+0x722/frame 0xfffffe0077abef30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0077abef30
--- syscall (54, FreeBSD ELF64, ioctl), rip = 0x241067d34cca, rsp = 0x241059d9b3d8, rbp = 0x241059d9b440 ---
KDB: enter: panic
```
#0 __curthread () at /usr/home/somers/src/freebsd.org/src/sys/amd64/include/pcpu_aux.h:57
        td =
#1 doadump (textdump=textdump@entry=0) at /usr/home/somers/src/freebsd.org/src/sys/kern/kern_shutdown.c:404
        error = 0 coredump =
#2 0xffffffff80a51fed in db_dump (dummy=, dummy2=, dummy3=, dummy4=) at /usr/home/somers/src/freebsd.org/src/sys/ddb/db_command.c:596
        error =
#3 0xffffffff80a515c1 in db_command (last_cmdp=, cmd_table=, dopager=true) at /usr/home/somers/src/freebsd.org/src/sys/ddb/db_command.c:508
        modif = "\000\000ޅ\377\377\377\377\000\000\000\200\000\376\377\377\000\240\000\206\377\377\377\377\000\000\000\000\000\000\000\000\000\017\000\000\000\000\000\000\270\377\311\303\000\376\377\377\b", '\000' , "\376\377\377\270\377\311\303\000\376\377\377\254\213<\203\377\377\377\377", '\000' , "`ܫw\000\376\377\377" addr = -2093184084 count = -1 cmd = 0xffffffff85c1a978 have_addr = t = result =
#4 0xffffffff80a506b0 in db_command_loop () at /usr/home/somers/src/freebsd.org/src/sys/ddb/db_command.c:555
        No locals.
#5 0xffffffff80a6257f in db_trap (type=type@entry=3, code=code@entry=0) at /usr/home/somers/src/freebsd.org/src/sys/ddb/db_main.c:267
        jb = {{_jb = {0, -2197015503704, -2197015503504, 0, -2195738460232, -2197015503696, 0, -2136595556, 0, -2050389008, -2056009088, -2047184032}}} bkpt = false watchpt = false prev_jb = 0x0 why = 0xffffffff855a7985 "panic"
#6 0xffffffff833cc1be in kdb_trap (type=3, code=code@entry=0, tf=tf@entry=0xfffffe0077abe140) at /usr/home/somers/src/freebsd.org/src/sys/kern/subr_kdb.c:790
        __pc = 0x0 __pc = 0x0 other_cpus = {__bits = {0, 18446741877971087368, 0, 0, 12, 0, 0, 32, 305034528032, 18446741876694049144, 0, 0, 0, 0, 4294967295, 18446741876694049136}} be = 0xffffffff85c1b510 intr = 2 did_stop_cpus = handled =
#7 0xffffffff84ff4444 in trap (frame=0xfffffe0077abe140) at /usr/home/somers/src/freebsd.org/src/sys/amd64/amd64/trap.c:608
        __pc = 0x0 __pc = 0x0 __pc = 0x0 ksi = {ksi_link = {tqe_next = 0xffffffff80a62c66 , tqe_prev = 0x0}, ksi_info = {si_signo = 93560188, si_errno = 93560188, si_code = 93560188, si_pid = 93560188, si_uid = 93560188, si_status = 93560188, si_addr = 0xf39c0ad0b4d7af6b, si_value = {sival_int = -1010175992, sival_ptr = 0xfffffe00c3c9f008, sigval_int = -1010175992, sigval_ptr = 0xfffffe00c3c9f008}, _reason = {_fault = { _trapno = 0}, _timer = {_timerid = 0, _overrun = 0}, _mesgq = { _mqd = 0}, _poll = {_band = 0}, _capsicum = {_syscall = 0}, __spare__ = {__spare1__ = 0, __spare2__ = {0, 0, 2007752616, -512, 2007752736, -512, 2007752912}}}}, ksi_flags = -2092782669, ksi_sigq = 0x3000000018} signo = -1 ucode = -1010176000 td = p = dr6 = 0 type = 3 addr = pf = i =
#8
        No locals.
#9 kdb_enter (why=0xffffffff855a7985 "panic", msg=0xffffffff855a7985 "panic") at /usr/home/somers/src/freebsd.org/src/sys/kern/subr_kdb.c:556
        No locals.
#10 0xffffffff8320e658 in vpanic ( fmt=fmt@entry=0xffffffff87c0c2f3 "VERIFY0(spa_open(dsname, &spa, FTAG)) failed (0 == %lld)\n", ap=ap@entry=0xfffffe0077abe490) at /usr/home/somers/src/freebsd.org/src/sys/kern/kern_shutdown.c:967
        buf = "VERIFY0(spa_open(dsname, &spa, FTAG)) failed (0 == 2)\n", '\000' __pc = 0x0 __pc = 0x0 __pc = 0x0 other_cpus = td = 0xfffffe00bfd2d000 bootopt = newpanic =
#11 0xffffffff86f30009 in spl_panic (file=, func=, line=2007754000, fmt=0xffffffff87c0c2f3 "VERIFY0(spa_open(dsname, &spa, FTAG)) failed (0 == %lld)\n") at /usr/home/somers/src/freebsd.org/src/sys/contrib/openzfs/module/os/freebsd/spl/spl_misc.c:99
        ap = {{gp_offset = 40, fp_offset = 48, overflow_arg_area = 0xfffffe0077abe510, reg_save_area = 0xfffffe0077abe460}}
#12 0xffffffff873ac52e in dsl_prop_set_hasrecvd_impl ( dsname=dsname@entry=0xfffffe00c376f000 "testpool1.2123/d1@snap1", source=source@entry=ZPROP_SRC_LOCAL) at /usr/home/somers/src/freebsd.org/src/sys/contrib/openzfs/module/zfs/dsl_prop.c:1271
        _verify0_right = 0 spa = 0x0 error = 0 version =
#13 0xffffffff873ac131 in dsl_prop_set_hasrecvd ( dsname=dsname@entry=0xfffffe00c376f000 "testpool1.2123/d1@snap1") at /usr/home/somers/src/freebsd.org/src/sys/contrib/openzfs/module/zfs/dsl_prop.c:1289
        error = 0
#14 0xffffffff878206fa in zfs_ioc_set_prop (zc=zc@entry=0xfffffe00c376f000) at /usr/home/somers/src/freebsd.org/src/sys/contrib/openzfs/module/zfs/zfs_ioctl.c:2938
        origprops = 0xfffffe0078860c20 nvl = 0xfffffe00d3247c20 received = 1 source = ZPROP_SRC_RECEIVED error = errors =
#15 0xffffffff877f2444 in zfsdev_ioctl_common (vecnum=vecnum@entry=22, zc=zc@entry=0xfffffe00c376f000, flag=flag@entry=0) at /usr/home/somers/src/freebsd.org/src/sys/contrib/openzfs/module/zfs/zfs_ioctl.c:8098
        innvl = 0xfffffe00c7040a40 saved_poolname = 0xfffffe00d3841e60 "testpool1.2123" saved_poolname_len = 24 cmd = 22 error = start_time = -2020851486796 vec = 0xffffffff87cbf140 max_nvlist_src_size = cookie = 0
#16 0xffffffff86f55e06 in zfsdev_ioctl (dev=, zcmd=3222821398, arg=, flag=, td=) at /usr/home/somers/src/freebsd.org/src/sys/contrib/openzfs/module/os/freebsd/zfs/kmod_core.c:167
        len = 24 vecnum = 22 error = 0 zcl = 0x0 uaddr = 0x241059d9b580 zc = 0xfffffe00c376f000 rc = zp =
#17 0xffffffff8297579d in devfs_ioctl (ap=ap@entry=0xfffffe0077abea00) at /usr/home/somers/src/freebsd.org/src/sys/fs/devfs/devfs_vnops.c:950
        dev = 0xfffffe00c1c32400 ref = 1 vp = 0xfffffe00c39496e0 com = 3222821398 td = 0xfffffe00bfd2d000 dsw = error = fgn = p = i = sess = vpold =
#18 0xffffffff85487e1e in VOP_IOCTL_APV ( vop=vop@entry=0xffffffff85cb29b8 , a=a@entry=0xfffffe0077abea00) at vnode_if.c:1229
        rc =
#19 0xffffffff838282fa in VOP_IOCTL (vp=, command=, data=, fflag=, cred=, td=) at ./vnode_if.h:637
        a =
#20 0xffffffff838282fa in vn_ioctl (fp=, com=3222821398, data=0xfffffe0077abecb0, active_cred=0xfffffe00bee31c00, td=0xfffffe00bfd2d000)
        size = 0 vp = 0xfffffe00c39496e0 error = bmarg =
#21 0xffffffff82977f15 in devfs_ioctl_f (fp=fp@entry=0xfffffe00bfc3feb0, com=com@entry=3222821398, data=data@entry=0xfffffe0077abecb0, cred=cred@entry=0xfffffe00bee31c00, td=td@entry=0xfffffe00bfd2d000) at /usr/home/somers/src/freebsd.org/src/sys/fs/devfs/devfs_vnops.c:881
        fpop = 0x0 error =
#22 0xffffffff834f9da4 in fo_ioctl (fp=0xfffffe00bfc3feb0, com=3222821398, data=0xfffffe0077abecb0, active_cred=0xfffffe00bee31c00, td=0xfffffe00bfd2d000) at /usr/home/somers/src/freebsd.org/src/sys/sys/file.h:371
        No locals.
#23 kern_ioctl (td=td@entry=0xfffffe00bfd2d000, fd=fd@entry=3, com=com@entry=3222821398, data=data@entry=0xfffffe0077abecb0 "\017") at /usr/home/somers/src/freebsd.org/src/sys/kern/sys_generic.c:805
        tmp = 0 fdp = 0xfffffe0079b33430 locked = 0 fp = 0xfffffe00bfc3feb0 error =
#24 0xffffffff834f8f0f in sys_ioctl (td=td@entry=0xfffffe00bfd2d000, uap=uap@entry=0xfffffe00bfd2d400) at /usr/home/somers/src/freebsd.org/src/sys/kern/sys_generic.c:713
        smalldata = "\017\000\000\000\020$\000\000\200\265\331Y\020$\000\000\260\021", '\000' , "`\355\253w\000\376\377\377\212\226\377\204\377\377\377\377\000\000\000\000\000\000\000\000@\357\253w\000\376\377\377\000\000\000\000\000\000\000\000]!\000\205\000\000\000\000\000\320ҿ\000\376\377\377\b\360\311\303\000\376\377\377\000\000\000\000\000\376\377\377\360\323ҿ\000\376\377\377\000\320ҿ", '\000' arg = -512 com = size = 24 data = 0xfffffe0077abecb0 "\017" error =
#25 0xffffffff84ffa7e2 in syscallenter (td=0xfffffe00bfd2d000) at /usr/home/somers/src/freebsd.org/src/sys/amd64/amd64/../../kern/subr_syscall.c:191
        se = 0xffffffff85cc64f0 p = 0xfffffe00c841b580 sa = 0xfffffe00bfd2d3f0 error = sy_thr_static = true traced = _audit_entered =
#26 amd64_syscall (td=0xfffffe00bfd2d000, traced=0) at /usr/home/somers/src/freebsd.org/src/sys/amd64/amd64/trap.c:1201
        ksi = {ksi_link = {tqe_next = 0xfffffe0077abee60, tqe_prev = 0xfffffe00c841b6c0}, ksi_info = { si_signo = -1076703224, si_errno = -512, si_code = -2038926336, si_pid = 0, si_uid = 0, si_status = 0, si_addr = 0x0, si_value = { sival_int = 0, sival_ptr = 0x0, sigval_int = 0, sigval_ptr = 0x0}, _reason = {_fault = {_trapno = -1}, _timer = { _timerid = -1, _overrun = 0}, _mesgq = {_mqd = -1}, _poll = { _band = 4294967295}, _capsicum = {_syscall = -1}, __spare__ = { __spare1__ = 4294967295, __spare2__ = {0, 0, 0, 0, 8, 0, -2038926312}}}}, ksi_flags = 0, ksi_sigq = 0xffffffff8678740c}
#27
        No locals.
#28 0x0000241067d34cca in ?? ()
```
@asomers asomers added the Type: Defect Incorrect behavior (e.g. crash, hang) label Jan 24, 2025
@amotin
Member

amotin commented Jan 24, 2025

It seems FreeBSD for some reason includes two copies of ZTS, and zpool_destroy_004_pos is not present in upstream OpenZFS and its copy. Would be nice to remove duplication and upstream what is not upstreamed.

As far as I can see, the test tries to destroy the pool during a receive, and the code that sets received properties can't handle the error if the pool was destroyed in the meantime. It seems the assertion that tripped should be replaced with proper error handling, since this is obviously not an impossible error.
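A minimal, self-contained sketch of that idea follows. It is not the OpenZFS code: `mock_spa_t`, `mock_spa_open()`, `mock_spa_close()`, `pool_destroyed`, and `set_hasrecvd_checked()` are hypothetical stand-ins invented for illustration. The only facts taken from the report are that `spa_open()` returned 2 (which is `ENOENT` on FreeBSD) and that the caller asserted on it instead of returning it.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's pool handle; not an OpenZFS type. */
typedef struct mock_spa {
	unsigned long version;
} mock_spa_t;

static mock_spa_t live_pool = { 5000 };

/* Hypothetical flag simulating "zpool destroy" winning the race. */
static int pool_destroyed = 0;

/* Mock open: fails with ENOENT once the pool is gone. */
int
mock_spa_open(const char *dsname, mock_spa_t **spap)
{
	(void) dsname;
	if (pool_destroyed)
		return (ENOENT);
	*spap = &live_pool;
	return (0);
}

/* Mock close: nothing to release in this sketch. */
void
mock_spa_close(mock_spa_t *spa)
{
	(void) spa;
}

/*
 * Instead of asserting that the open succeeded, hand the error back
 * to the caller so the ioctl can fail cleanly.
 */
int
set_hasrecvd_checked(const char *dsname)
{
	mock_spa_t *spa = NULL;
	int error;

	error = mock_spa_open(dsname, &spa);
	if (error != 0)
		return (error);

	/* ... the real work on the open pool would go here ... */

	mock_spa_close(spa);
	return (0);
}
```

With `pool_destroyed` set to 1, `set_hasrecvd_checked("testpool1.2123/d1@snap1")` returns `ENOENT` to its caller rather than aborting, which is the behavior the comment above argues for.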

@asomers
Contributor Author

asomers commented Jan 24, 2025

> It seems FreeBSD for some reason includes two copies of ZTS, and zpool_destroy_004_pos is not present in upstream OpenZFS and its copy. Would be nice to remove duplication and upstream what is not upstreamed.

Do you mean two copies of the test suite? That's because we originally ported it from Solaris to FreeBSD when I worked for Spectra Logic. See FreeBSD commit 2fae26bd8b752cfae083962a152f4b1ee54ada17. But later the ZoL people independently ported it, apparently ignorant of our earlier work. So now there are two test suites, and they are difficult to reconcile. This particular test case is one that I wrote.

> As I can see, the test tries to destroy the pool during receive. And setting received properties can't handle the error if the pool was destroyed meanwhile. It seems the assertion where it has tripped should be replaced with proper error handling, since obviously it is not an impossible error.

+1

@amotin
Member

amotin commented Jan 24, 2025

> Do you mean two copies of the test suite?

Yes: tests/sys/cddl/zfs/tests/cli_root/zpool_destroy and sys/contrib/openzfs/tests/zfs-tests/tests/functional/cli_root/zpool_destroy. I guess the second is not hooked up to the FreeBSD test suite, while the first is not hooked up to OpenZFS CI. They should be unified.

> This particular test case is one that I wrote.

It would be good to port it, and whatever else exists only in FreeBSD, to OpenZFS upstream. That way it would be executed regularly by OpenZFS CI on both FreeBSD and Linux, and the issue would have been spotted long ago.

@asomers
Contributor Author

asomers commented Jan 24, 2025

> It would be good to port it and whatever else might be in FreeBSD only to OpenZFS upstream. That way it would regularly be executed by OpenZFS CI on both FreeBSD and Linux, and the issue would have been spotted long ago.

Agreed. But it's hard, especially because ZoL made two important design decisions differently than I did:

  • ZoL uses per-directory setup/teardown, rather than per-testcase.
  • ZoL runs the tests unprivileged by default and raises privileges when needed. FreeBSD's test suite does it the other way.

I put a lot of hours into that test suite at Spectra Logic. And frankly, the fact that ZoL completely overlooked all of my effort without a second thought is very dispiriting to me. That's why I'm not investing any more of my volunteer hours into the test suite.
