This document is a WORK IN PROGRESS.
This is just a quick personal cheat sheet: treat its contents with caution!
ZFS¶
ZFS is a combined file system and logical volume manager (designed by Sun Microsystems). ZFS is scalable, and includes extensive protection against data corruption, support for high storage capacities, efficient data compression, integration of the concepts of file system and volume management, snapshots and CoW clones, continuous integrity checking and automatic repair, RAID-Z, native NFSv4 ACLs, and can be very precisely configured.
Reference(s)
- ⭐️ https://openzfs.github.io/openzfs-docs/Getting%20Started/index.html
- ⭐️ https://web.archive.org/web/20231110075538/https://blog.mikesulsenti.com/zfs-cheat-sheet-and-guide/
- https://web.archive.org/web/20241009080703/https://www.dlford.io/linux-zfs-raid-disk-replacement-procedure/
- https://wiki.gentoo.org/wiki/ZFS
- https://wiki.archlinux.org/index.php/ZFS
- https://en.wikipedia.org/wiki/ZFS
Install¶
TODO
Config¶
TODO
Use¶
TODO
The corresponding commands are sketched just after this list.
- Show disk space utilization info:
- Show ZFS storage pool debugging and consistency information:
- Show all properties for $POOLNAME or $DATASET_NAME:
- Check zpool status of all pools with extra verbose information:
- Check zpool status of a specific pool $POOLNAME with extra verbose information:
- Show verbose information about pools' filesystem statistics:
- Show verbose IO statistics for all pools:
- Show verbose IO statistics for a specific pool $POOLNAME:
- Show useful and advanced information on how ZFS's ARC cache is being used:
- Show the serial number of a disk (e.g. /dev/sdx):
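A sketch of the corresponding commands, assuming the standard OpenZFS userland plus smartmontools for the last one (adapt $POOLNAME and $DATASET_NAME to your setup):
$ zfs list                                # disk space utilization info
$ zdb                                     # storage pool debugging and consistency information
$ zfs get all $POOLNAME                   # all properties (works for $DATASET_NAME too)
$ zpool status -v                         # status of all pools, verbose
$ zpool status -v $POOLNAME               # status of a specific pool, verbose
$ zpool list -v                           # pools' filesystem statistics, verbose
$ zpool iostat -v                         # IO statistics for all pools, verbose
$ zpool iostat -v $POOLNAME               # IO statistics for a specific pool, verbose
$ arc_summary                             # how the ARC cache is being used
$ smartctl -i /dev/sdx | grep -i serial   # serial number of a disk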
How to replace a disk¶
- Get the name of the disk you want to replace:
$ zpool status -v $POOLNAME
pool: <POOLNAME>
state: ONLINE
status: One or more devices are configured to use a non-native block size.
Expect reduced performance.
action: Replace affected devices with devices that support the
configured block size, or migrate data to a properly configured
pool.
scan: resilvered 186G in 1 days 17:17:59 with 0 errors on Wed Sep 11 04:10:14 2024
config:
NAME STATE READ WRITE CKSUM
<POOLNAME> ONLINE 0 0 0
raidz2-0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:3:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:4:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:5:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:6:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:7:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:8:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:9:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:10:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:11:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:12:0 ONLINE 0 0 0 block size: 512B configured, 4096B native
pci-0000:03:00.0-scsi-0:0:13:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:14:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:15:0 ONLINE 0 0 0
errors: No known data errors
E.g. let's say pci-0000:03:00.0-scsi-0:0:12:0 has to be replaced.
- Find the path to that disk:
$ zdb | grep "pci-0000:03:00.0-scsi-0:0:12:0"
path: '/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:12:0-part1'
phys_path: 'pci-0000:03:00.0-scsi-0:0:12:0'
- Put the disk offline:
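For example, reusing the example device from above (a sketch; substitute your own pool and vdev names):
$ zpool offline $POOLNAME pci-0000:03:00.0-scsi-0:0:12:0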
Note that if you put the wrong disk offline, you can take it back online with:
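$ zpool online $POOLNAME pci-0000:03:00.0-scsi-0:0:12:0 # same example device as above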
- Make sure that the last command has been correctly issued before moving on:
$ zpool status -v $POOLNAME
pool: <POOLNAME>
state: DEGRADED
status: One or more devices has been taken offline by the administrator.
Sufficient replicas exist for the pool to continue functioning in a
degraded state.
action: Online the device using 'zpool online' or replace the device with
'zpool replace'.
scan: resilvered 186G in 1 days 17:17:59 with 0 errors on Wed Sep 11 04:10:14 2024
config:
NAME STATE READ WRITE CKSUM
<POOLNAME> DEGRADED 0 0 0
raidz2-0 DEGRADED 0 0 0
pci-0000:03:00.0-scsi-0:0:3:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:4:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:5:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:6:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:7:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:8:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:9:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:10:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:11:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:12:0 OFFLINE 0 0 0 block size: 512B configured, 4096B native
pci-0000:03:00.0-scsi-0:0:13:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:14:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:15:0 ONLINE 0 0 0
errors: No known data errors
- Locate the disk to remove, e.g. by blinking its LED with ledctl:
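For instance, assuming the ledmon package provides ledctl and using the example device path from above:
$ ledctl locate=/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:12:0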
- Physically remove the disk, and replace it with a new one.
- You can stop the LED blinking, e.g. with ledctl:
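$ ledctl locate_off=/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:12:0 # same assumptions as above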
- Tell ZFS to replace/resilver the disk:
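Since the new disk sits in the same slot (and therefore gets the same by-path name), a single-argument replace should be enough; a sketch, so double-check the device name on your system:
$ zpool replace $POOLNAME pci-0000:03:00.0-scsi-0:0:12:0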
- Now we wait! You can keep an eye on the status with the command below:
$ watch zpool status -v $POOLNAME
pool: <POOLNAME>
state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Wed Oct 9 09:47:36 2024
590G scanned at 18.4G/s, 7.84M issued at 251K/s, 2.46T total
0B resilvered, 0.00% done, no estimated completion time
config:
NAME STATE READ WRITE CKSUM
<POOLNAME> DEGRADED 0 0 0
raidz2-0 DEGRADED 0 0 0
pci-0000:03:00.0-scsi-0:0:3:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:4:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:5:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:6:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:7:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:8:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:9:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:10:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:11:0 ONLINE 0 0 0
replacing-9 DEGRADED 0 0 0
old OFFLINE 0 0 0 block size: 512B configured, 4096B native
pci-0000:03:00.0-scsi-0:0:12:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:13:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:14:0 ONLINE 0 0 0
pci-0000:03:00.0-scsi-0:0:15:0 ONLINE 0 0 0
errors: No known data errors
sanoid¶
Sanoid is a policy-driven snapshot management tool for ZFS filesystems.
- How to install (see https://repology.org/project/sanoid/versions):
TODO
Install with AUR:
$ mkdir -p ~/apps/aur-apps
$ cd ~/apps/aur-apps
$ git clone https://aur.archlinux.org/sanoid.git
$ cd sanoid
$ makepkg -is # --syncdeps to auto-install deps, --install to install after building
For Artix users: if you are not using systemd, you might have to translate the sanoid systemd services and timer yourself (TODO).
- How to use / configure:
- After installing sanoid, you will find a simple example configuration file in /etc/sanoid/sanoid.conf or in /usr/share/doc/sanoid/examples/sanoid.conf. You will also find a more detailed example in /etc/sanoid/sanoid.defaults.conf or /usr/share/sanoid/sanoid.defaults.conf, which you can consult for all possible options for the template part of the configuration file.
- Now let's list the available pool(s) for ZFS snapshots:
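For example:
$ zpool list # or: zfs list, to see the datasets inside the pools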
- Let's say you want to snapshot the production_data pool. In order to do so, you can create your own configuration file in /etc/sanoid/sanoid.conf, e.g.:
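A minimal sketch matching the retention described below (the keys and their meanings are documented in sanoid.defaults.conf; tune the values to your needs):
[production_data]
use_template = production
recursive = yes

[template_production]
frequently = 0
hourly = 0
daily = 0
weekly = 1
monthly = 1
yearly = 1
autosnap = yes
autoprune = yes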
This configuration will keep a weekly, a monthly and a yearly snapshot.
You will have to wait for 15 minutes (maximum) before the sanoid.timer applies your configuration and starts to take snapshots.
- You can list your snapshots like so:
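$ zfs list -t snapshot # optionally scope it, e.g.: zfs list -t snapshot -r production_data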
- ℹ️ Note about systemd: by default, sanoid will work with systemd, i.e. after installation, a systemd timer and two systemd services will be up and running: the sanoid.timer timer unit runs sanoid-prune.service followed by sanoid.service every 15 minutes. To edit any of the command-line options, you can edit these service files.
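You can inspect and tweak them with the usual systemd tooling, e.g.:
$ systemctl list-timers sanoid.timer
$ systemctl edit --full sanoid.service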
- ℹ️ Note about snapshots location: the snapshots themselves live in the ZFS pool (list them with zfs list -t snapshot); what sanoid keeps in its cache directory /var/cache/sanoid is only cached snapshot metadata.
RAIDZ expansion¶
TODO
If this cheat sheet has been useful to you, then please consider leaving a star here.