This document is a WORK IN PROGRESS.
This is just a quick personal cheat sheet: treat its contents with caution!


ZFS

ZFS is a combined file system and logical volume manager (designed by Sun Microsystems). It is scalable and offers extensive protection against data corruption, support for high storage capacities, efficient data compression, integration of the concepts of file system and volume management, snapshots and copy-on-write (CoW) clones, continuous integrity checking with automatic repair, RAID-Z, and native NFSv4 ACLs, and it can be configured very precisely.


Install

TODO
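
Until this section is written, a minimal sketch for a couple of common distributions (the package names are assumptions based on the usual repositories; check your distribution's documentation, since ZFS often ships as an out-of-tree kernel module):

    # apt install zfsutils-linux   # Debian/Ubuntu

    # apk add zfs                  # Alpine (plus the zfs module package matching your kernel)

    # xbps-install -S zfs          # Void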


Config

TODO
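
Until this section is written, a minimal sketch of creating and tuning a pool (the pool name tank, the raidz2 layout and the /dev/disk/by-id/... names are hypothetical; pick an ashift matching your disks' physical sector size):

    $ zpool create -o ashift=12 tank raidz2 \
          /dev/disk/by-id/diskA /dev/disk/by-id/diskB \
          /dev/disk/by-id/diskC /dev/disk/by-id/diskD

    $ zfs set compression=lz4 tank   # enable cheap transparent compression pool-wide

    $ zfs create tank/data           # datasets inherit properties from their parent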


Use

TODO

  • Show disk space utilization info:

    $ zfs list
    

  • Show ZFS storage pool debugging and consistency information:

    $ zdb
    

  • Show all properties for $POOLNAME or $DATASET_NAME:

    $ zfs get all $POOLNAME
    $ zfs get all $DATASET_NAME
    

  • Check zpool status of all pools with extra verbose information:

    $ zpool status -v
    

  • Check zpool status of specific pool $POOLNAME with extra verbose information:

    $ zpool status -v $POOLNAME
    

  • Show verbose pool capacity and usage statistics, broken down per vdev:

    $ zpool list -v
    

  • Show verbose IO statistics for all pools:

    $ zpool iostat -v
    

  • Show verbose IO statistics for a specific pool $POOLNAME:

    $ zpool iostat -v $POOLNAME
    

  • Show useful and advanced information on how ZFS's ARC cache is being used:

    $ arcstat
    
    $ arc_summary
    

  • Show the serial number of a disk (e.g. /dev/sdx):

    $ smartctl -a /dev/sdx | grep Serial
    

How to replace a disk

  • Get the name of the disk you want to replace:
$ zpool status -v $POOLNAME

    pool: <POOLNAME>
   state: ONLINE
  status: One or more devices are configured to use a non-native block size.
          Expect reduced performance.
  action: Replace affected devices with devices that support the
          configured block size, or migrate data to a properly configured
          pool.
    scan: resilvered 186G in 1 days 17:17:59 with 0 errors on Wed Sep 11 04:10:14 2024
  config:

          NAME                                STATE     READ WRITE CKSUM
          <POOLNAME>                          ONLINE       0     0     0
            raidz2-0                          ONLINE       0     0     0
              pci-0000:03:00.0-scsi-0:0:3:0   ONLINE       0     0     0
              pci-0000:03:00.0-scsi-0:0:4:0   ONLINE       0     0     0
              pci-0000:03:00.0-scsi-0:0:5:0   ONLINE       0     0     0
              pci-0000:03:00.0-scsi-0:0:6:0   ONLINE       0     0     0
              pci-0000:03:00.0-scsi-0:0:7:0   ONLINE       0     0     0
              pci-0000:03:00.0-scsi-0:0:8:0   ONLINE       0     0     0
              pci-0000:03:00.0-scsi-0:0:9:0   ONLINE       0     0     0
              pci-0000:03:00.0-scsi-0:0:10:0  ONLINE       0     0     0
              pci-0000:03:00.0-scsi-0:0:11:0  ONLINE       0     0     0
              pci-0000:03:00.0-scsi-0:0:12:0  ONLINE       0     0     0  block size: 512B configured, 4096B native
              pci-0000:03:00.0-scsi-0:0:13:0  ONLINE       0     0     0
              pci-0000:03:00.0-scsi-0:0:14:0  ONLINE       0     0     0
              pci-0000:03:00.0-scsi-0:0:15:0  ONLINE       0     0     0

  errors: No known data errors

E.g. let's say pci-0000:03:00.0-scsi-0:0:12:0 has to be replaced.

  • Find the path to that disk:
$ zdb | grep "pci-0000:03:00.0-scsi-0:0:12:0"

  path: '/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:12:0-part1'
  phys_path: 'pci-0000:03:00.0-scsi-0:0:12:0'
  • Put the disk offline:
$ zpool offline $POOLNAME /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:12:0

Note that if you put the wrong disk offline, you can bring it back online with:

$ zpool online $POOLNAME /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:12:0

  • Make sure that the last command took effect before moving on:
$ zpool status -v $POOLNAME

  pool: <POOLNAME>
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: resilvered 186G in 1 days 17:17:59 with 0 errors on Wed Sep 11 04:10:14 2024
config:

        NAME                                STATE     READ WRITE CKSUM
        <POOLNAME>                          DEGRADED     0     0     0
          raidz2-0                          DEGRADED     0     0     0
            pci-0000:03:00.0-scsi-0:0:3:0   ONLINE       0     0     0
            pci-0000:03:00.0-scsi-0:0:4:0   ONLINE       0     0     0
            pci-0000:03:00.0-scsi-0:0:5:0   ONLINE       0     0     0
            pci-0000:03:00.0-scsi-0:0:6:0   ONLINE       0     0     0
            pci-0000:03:00.0-scsi-0:0:7:0   ONLINE       0     0     0
            pci-0000:03:00.0-scsi-0:0:8:0   ONLINE       0     0     0
            pci-0000:03:00.0-scsi-0:0:9:0   ONLINE       0     0     0
            pci-0000:03:00.0-scsi-0:0:10:0  ONLINE       0     0     0
            pci-0000:03:00.0-scsi-0:0:11:0  ONLINE       0     0     0
            pci-0000:03:00.0-scsi-0:0:12:0  OFFLINE      0     0     0  block size: 512B configured, 4096B native
            pci-0000:03:00.0-scsi-0:0:13:0  ONLINE       0     0     0
            pci-0000:03:00.0-scsi-0:0:14:0  ONLINE       0     0     0
            pci-0000:03:00.0-scsi-0:0:15:0  ONLINE       0     0     0

errors: No known data errors
  • Locate the disk to remove, e.g. by blinking its LED with ledctl:
$ ledctl locate=/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:12:0

  • Physically remove the disk, and replace it with a new one.

  • You can stop the LED blinking, e.g. with ledctl:
$ ledctl off=/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:12:0

  • Tell ZFS to replace/resilver the disk:
$ zpool replace $POOLNAME /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:12:0

  • Now we wait! You can keep an eye on the status with the below command:
$ watch zpool status -v $POOLNAME

    pool: <POOLNAME>
   state: DEGRADED
  status: One or more devices is currently being resilvered.  The pool will
          continue to function, possibly in a degraded state.
  action: Wait for the resilver to complete.
    scan: resilver in progress since Wed Oct  9 09:47:36 2024
          590G scanned at 18.4G/s, 7.84M issued at 251K/s, 2.46T total
          0B resilvered, 0.00% done, no estimated completion time
  config:

          NAME                                  STATE     READ WRITE CKSUM
          <POOLNAME>                            DEGRADED     0     0     0
            raidz2-0                            DEGRADED     0     0     0
              pci-0000:03:00.0-scsi-0:0:3:0     ONLINE       0     0     0
              pci-0000:03:00.0-scsi-0:0:4:0     ONLINE       0     0     0
              pci-0000:03:00.0-scsi-0:0:5:0     ONLINE       0     0     0
              pci-0000:03:00.0-scsi-0:0:6:0     ONLINE       0     0     0
              pci-0000:03:00.0-scsi-0:0:7:0     ONLINE       0     0     0
              pci-0000:03:00.0-scsi-0:0:8:0     ONLINE       0     0     0
              pci-0000:03:00.0-scsi-0:0:9:0     ONLINE       0     0     0
              pci-0000:03:00.0-scsi-0:0:10:0    ONLINE       0     0     0
              pci-0000:03:00.0-scsi-0:0:11:0    ONLINE       0     0     0
              replacing-9                       DEGRADED     0     0     0
                old                             OFFLINE      0     0     0  block size: 512B configured, 4096B native
                pci-0000:03:00.0-scsi-0:0:12:0  ONLINE       0     0     0
              pci-0000:03:00.0-scsi-0:0:13:0    ONLINE       0     0     0
              pci-0000:03:00.0-scsi-0:0:14:0    ONLINE       0     0     0
              pci-0000:03:00.0-scsi-0:0:15:0    ONLINE       0     0     0

  errors: No known data errors

sanoid

Sanoid is a policy-driven snapshot management tool for ZFS filesystems.

  • How to install (see https://repology.org/project/sanoid/versions):

    # apk add sanoid
    
    # apt install sanoid
    

    TODO

    TODO

    # nix-env -iA nixos.sanoid
    
    # nix-env -iA nixpkgs.sanoid
    

    Install with AUR:

    $ mkdir -p ~/apps/aur-apps
    $ cd ~/apps/aur-apps
    $ git clone https://aur.archlinux.org/sanoid.git
    $ cd sanoid
    $ makepkg -is # --syncdeps to auto-install deps, --install to install after building
    

    For Artix users

    If you are not using systemd, you might have to translate the sanoid systemd services and timer into your init system's equivalents yourself (see the cron sketch after this list).

    TODO

    # xbps-install -S sanoid
    

    TODO
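
    On a system without systemd (see the Artix note above), one possible approach is to run sanoid from cron instead; this is a sketch (the binary path may differ on your system), relying on sanoid's --cron flag, which takes the configured snapshots and prunes expired ones in one go:

    # crontab -e
    */15 * * * * /usr/sbin/sanoid --cron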

  • How to use / configure:

    A more detailed example can be found in /etc/sanoid/sanoid.defaults.conf or /usr/share/sanoid/sanoid.defaults.conf; consult it to see all possible options for the template part of the configuration file.

    • Now let's list the available pool(s) for ZFS snapshots:

      $ zfs list
      
          NAME                USED  AVAIL     REFER  MOUNTPOINT
          production_data    1.11T  11.1T     1.00T  /prod
          archiving_data     3.33T  33.3T     3.00T  /arch
          test_data             1G     1T        1G  /test
          ...
      
    • Let's say you want to snapshot the production_data pool. To do so, create your own configuration file in /etc/sanoid/sanoid.conf, e.g.:

      [production_data]
          use_template = production
      
      [template_production]
          frequently = 0
          hourly = 0
          daily = 0
          weekly = 1
          monthly = 1
          yearly = 1
          autosnap = yes
          autoprune = yes
      

    This configuration will keep one weekly, one monthly and one yearly snapshot.

    You will have to wait up to 15 minutes before sanoid.timer applies your configuration and starts taking snapshots (or trigger a run by hand, as sketched after this list).

    • You can list your snapshots like so:

      $ zfs list -t snapshot
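
    If you prefer not to wait for the timer, a run can also be triggered by hand; a sketch using sanoid's own flags (run as root):

      # sanoid --take-snapshots --verbose   # take whatever snapshots the policy calls for right now
      # sanoid --prune-snapshots --verbose  # remove snapshots that have expired under the policy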
      
  • ℹ️ Note about systemd: by default, sanoid integrates with systemd, i.e. after installation, a systemd timer and two systemd services will be up and running:

    $ systemctl status sanoid.service
      ...
    
    $ systemctl status sanoid-prune.service
      ...
    
    $ systemctl status sanoid.timer
      ...
    

The sanoid.timer timer unit runs sanoid-prune.service followed by sanoid.service every 15 minutes. To edit any of the command-line options, you can edit these service files.
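
For example, with standard systemd tooling (a sketch; these are not sanoid-specific commands) you can create a drop-in override for a service and check when the timer will fire next:

    # systemctl edit sanoid.service       # opens an override file where you can adjust the unit, e.g. its ExecStart line
    $ systemctl list-timers sanoid.timer  # shows the last and next activation of the timer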

  • ℹ️ Note about snapshot storage: by default sanoid uses /var/cache/sanoid as its cache directory; the snapshots themselves are regular ZFS snapshots stored inside the pool, not files under that path.
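
    Each mounted dataset also exposes its snapshots through a hidden .zfs/snapshot directory under its mountpoint (visibility is controlled by the snapdir property). For example, using the /prod mountpoint from the zfs list output above:

    $ ls /prod/.zfs/snapshot/
    $ zfs get snapdir production_data    # 'hidden' by default; set to 'visible' to make .zfs show up in directory listings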

RAIDZ expansion

TODO
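
In the meantime, a rough sketch: with OpenZFS 2.3+ (raidz_expansion feature flag enabled on the pool), a raidz vdev can be grown by attaching one new disk at a time to the existing vdev, after which the data is reflowed across all member disks. Using the raidz2-0 vdev from the example above and a hypothetical new device:

    $ zpool attach $POOLNAME raidz2-0 /dev/disk/by-path/<new-disk>
    $ zpool status -v $POOLNAME    # shows the expansion/reflow progress until it completes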


If this cheat sheet has been useful to you, then please consider leaving a star here.