zpool status -e doesn't show drive being rebuilt #16979

Open
cmharr opened this issue Jan 22, 2025 · 0 comments
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)


cmharr commented Jan 22, 2025

System information

| Type | Version/Name |
| --- | --- |
| Distribution Name | TOSS / RHEL |
| Distribution Version | 4.8-6 / 8.10 |
| Kernel Version | 4.18.0-553.27.1.1toss.t4.x86_64 |
| Architecture | x86_64 |
| OpenZFS Version | 2.2.4 |

Describe the problem you're observing

zpool status -e does not display drives that are being resilvered.
A drive being resilvered shows as ONLINE even though it is not yet fully rebuilt, so it is omitted from the output. The -e option to zpool status should show such drives.

Describe how to reproduce the problem

In a pool with a drive that is resilvering, run zpool status -e; the drive in question is not shown. In the following example, drive U0 is being resilvered in a 53-drive dRAID. Both the in-use distributed spare and U0 are hidden from view.

 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue Jan 21 06:44:45 2025
	1.38P / 1.38P scanned, 144T / 177T issued at 1.29G/s
	43.4T resilvered, 81.21% done, 07:21:15 to go
config:

	NAME                   STATE     READ WRITE CKSUM
	merced216              DEGRADED     0     0     0
	  draid2:11d:53c:1s-0  DEGRADED     0     0     0
	    spare-0            DEGRADED     0 1.68K     0
	      replacing-0      DEGRADED     0 1.68K     0
	        U0/old         FAULTED      0     0     0  too many errors
	special	
	spares
	  draid2-0-0           INUSE     currently in use

errors: No known data errors

Allowing vdevs with vs->vs_scan_processed > 0 to be shown makes every drive participating in the resilver visible. (Vdevs with non-zero vs->vs_aux values should also be shown.)

diff --git a/cmd/zpool/zpool_main.c b/cmd/zpool/zpool_main.c
index 506427a10..55a51beee 100644
--- a/cmd/zpool/zpool_main.c
+++ b/cmd/zpool/zpool_main.c
@@ -2921,6 +2921,8 @@ print_status_config(zpool_handle_t *zhp, status_cbdata_t *cb, const char *name,
         * can be pruned if all of their leaves are healthy.
         */
        if (cb->cb_print_unhealthy && depth > 0 &&
+               vs->vs_aux == 0 &&
+               vs->vs_scan_processed == 0 &&
            for_each_vdev_in_nvlist(nv, vdev_health_check_cb, cb) == 0) {
                return;
        }
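
To make the effect of the two added conditions easier to follow, here is a minimal standalone model of the patched check. It is a sketch, not code from OpenZFS: model_stat_t and should_prune() are hypothetical names, the struct keeps only the two vdev_stat_t fields the patch reads, and the unhealthy_leaves argument stands in for the return value of for_each_vdev_in_nvlist(nv, vdev_health_check_cb, cb).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two vdev_stat_t fields the patch reads. */
typedef struct {
	uint64_t vs_aux;            /* auxiliary state; 0 means none */
	uint64_t vs_scan_processed; /* bytes processed on this vdev by the scan */
} model_stat_t;

/*
 * Models the patched condition in print_status_config(): returns true
 * when the vdev would be pruned from "zpool status -e" output.
 * unhealthy_leaves is an assumed stand-in for the result of the
 * for_each_vdev_in_nvlist() health check (0 means every leaf is healthy).
 */
static bool
should_prune(bool print_unhealthy, int depth, const model_stat_t *vs,
    int unhealthy_leaves)
{
	return (print_unhealthy && depth > 0 &&
	    vs->vs_aux == 0 &&
	    vs->vs_scan_processed == 0 &&
	    unhealthy_leaves == 0);
}
```

In this model, a healthy vdev with no aux state and no scan activity is still pruned, while any vdev with a non-zero vs_scan_processed or vs_aux is kept.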

With the above patch, we now see the following with zpool status -e:

  pool: merced216
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue Jan 21 06:44:45 2025
	1.38P / 1.38P scanned, 144T / 177T issued at 1.29G/s
	43.5T resilvered, 81.36% done, 07:17:56 to go
config:

	NAME                   STATE     READ WRITE CKSUM
	merced216              DEGRADED     0     0     0
	  draid2:11d:53c:1s-0  DEGRADED     0     0     0
	    spare-0            DEGRADED     0 1.68K     0
	      replacing-0      DEGRADED     0 1.68K     0
	        U0/old         FAULTED      0     0     0  too many errors
	        U0             ONLINE       0     0     0  (resilvering)
	      draid2-0-0       ONLINE       0     0     0  (resilvering)
	    U2                 ONLINE       0     0     0  (resilvering)
	    U4                 ONLINE       0     0     0  (resilvering)
	    U6                 ONLINE       0     0     0  (resilvering)
	    U8                 ONLINE       0     0     0  (resilvering)
	    U10                ONLINE       0     0     0  (resilvering)
	    U12                ONLINE       0     0     0  (resilvering)
	    U14                ONLINE       0     0     0  (resilvering)
	    U16                ONLINE       0     0     0  (resilvering)
	    U18                ONLINE       0     0     0  (resilvering)
	    U20                ONLINE       0     0     0  (resilvering)
	    U22                ONLINE       0     0     0  (resilvering)
	    U24                ONLINE       0     0     0  (resilvering)
	    U26                ONLINE       0     0     0  (resilvering)
	    U28                ONLINE       0     0     0  (resilvering)
	    U30                ONLINE       0     0     0  (resilvering)
	    U32                ONLINE       0     0     0  (resilvering)
	    U34                ONLINE       0     0     0  (resilvering)
	    U36                ONLINE       0     0     0  (resilvering)
	    U38                ONLINE       0     0     0  (resilvering)
	    U40                ONLINE       0     0     0  (resilvering)
	    U42                ONLINE       0     0     0  (resilvering)
	    U44                ONLINE       0     0     0  (resilvering)
	    U46                ONLINE       0     0     0  (resilvering)
	    U48                ONLINE       0     0     0  (resilvering)
	    U50                ONLINE       0     0     0  (resilvering)
	    U52                ONLINE       0     0     0  (resilvering)
	    U54                ONLINE       0     0     0  (resilvering)
	    U56                ONLINE       0     0     0  (resilvering)
	    U58                ONLINE       0     0     0  (resilvering)
	    U60                ONLINE       0     0     0  (resilvering)
	    U62                ONLINE       0     0     0  (resilvering)
	    U64                ONLINE       0     0     0  (resilvering)
	    U66                ONLINE       0     0     0  (resilvering)
	    U68                ONLINE       0     0     0  (resilvering)
	    U70                ONLINE       0     0     0  (resilvering)
	    U72                ONLINE       0     0     0  (resilvering)
	    U74                ONLINE       0     0     0  (resilvering)
	    U76                ONLINE       0     0     0  (resilvering)
	    U78                ONLINE       0     0     0  (resilvering)
	    U80                ONLINE       0     0     0  (resilvering)
	    U82                ONLINE       0     0     0  (resilvering)
	    U84                ONLINE       0     0     0  (resilvering)
	    U86                ONLINE       0     0     0  (resilvering)
	    U88                ONLINE       0     0     0  (resilvering)
	    U90                ONLINE       0     0     0  (resilvering)
	    U92                ONLINE       0     0     0  (resilvering)
	    U94                ONLINE       0     0     0  (resilvering)
	    U96                ONLINE       0     0     0  (resilvering)
	    U98                ONLINE       0     0     0  (resilvering)
	    U100               ONLINE       0     0     0  (resilvering)
	    U102               ONLINE       0     0     0  (resilvering)
	    U104               ONLINE       0     0     0  (resilvering)
	special	
	spares
	  draid2-0-0           INUSE     currently in use

errors: No known data errors

This is an improvement over a plain zpool status: the full pool has 106 vdevs (split into two dRAIDs), and only the resilvering vdevs are shown. However, I don't believe this is ready for merging, because I would prefer not to see the other drives participating in the resilver (U2-U104), only the drive that is primarily being rebuilt and the in-use spare that now accompanies it.
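
One possible way to narrow the output to just the rebuild targets is sketched below. This is an assumption for discussion, not something this issue proposes or tests: is_rebuild_target() is a hypothetical helper, and the rule it encodes (keep a scanning leaf only when its parent vdev type is "replacing" or "spare") is inferred from the tree shown above, where U0 sits under replacing-0 and draid2-0-0 sits under spare-0, while U2-U104 are direct children of the dRAID vdev.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of OpenZFS. Returns true when a leaf
 * should be shown as a rebuild target: it has scan activity and its
 * parent is a "replacing" or "spare" vdev. parent_type is assumed to be
 * the parent's vdev type string, or NULL for a top-level vdev.
 */
static bool
is_rebuild_target(const char *parent_type, uint64_t scan_processed)
{
	if (parent_type == NULL || scan_processed == 0)
		return (false);
	return (strcmp(parent_type, "replacing") == 0 ||
	    strcmp(parent_type, "spare") == 0);
}
```

Under this rule the sibling drives that only supply data for the resilver would stay hidden, since their parent is the dRAID vdev itself. Whether parent type is the right discriminator in every pool layout is untested.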

Include any warning/errors/backtraces from the system logs

N/A

@cmharr cmharr added the Type: Defect Incorrect behavior (e.g. crash, hang) label Jan 22, 2025