The Ultimate Guide to ZFS on Linux
For decades, the standard Linux storage stack involved multiple layers: mdadm for RAID, LVM for volume management, and ext4 or xfs for the filesystem.
ZFS changes everything. Originally developed by Sun Microsystems, ZFS combines the roles of a volume manager and a file system into a single, cohesive tool. It is designed from the ground up for massive storage capacities and end-to-end data integrity.
Core Concepts of ZFS
1. Copy-On-Write (CoW)
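Traditional file systems overwrite data in place. If a power failure occurs halfway through writing a block, that block can be left half-written. ZFS uses Copy-On-Write instead. When you modify a file, ZFS writes the new data to a free block on the disk and leaves the old block untouched. Only once the write is complete does it switch the block pointers over to the new data, in a single atomic step. An interrupted write therefore leaves the previous version in place, and the on-disk filesystem is always in a consistent state.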
2. End-to-End Checksumming
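Bit rot is real. Cosmic rays, degrading magnetic platters, or faulty cables can flip bits over time. ZFS calculates a checksum for every block of data (fletcher4 by default; cryptographic hashes such as SHA-256 are optional) and stores it in the parent block pointer, away from the data it protects. When you read a file, ZFS verifies the checksum. If it detects corruption and the pool has redundancy (a mirror or RAID-Z vdev), ZFS automatically repairs the bad block from a good copy on the fly. Without redundancy it can still detect the corruption, but it cannot repair it.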
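The verify-on-read idea can be imitated in userspace. The sketch below is only an illustration, not how ZFS is implemented: it uses sha256sum on a temporary file as a stand-in for a per-block checksum, and the function names (checksum, verify, demo) are made up for this example.

```shell
#!/bin/sh
# Userspace illustration only: ZFS does this per block, inside the filesystem.
checksum() { sha256sum "$1" | cut -d' ' -f1; }

# verify FILE EXPECTED: succeed only if the data still matches its stored checksum.
verify() { [ "$(checksum "$1")" = "$2" ]; }

demo() {
    block=$(mktemp)
    printf 'important database page' > "$block"
    stored=$(checksum "$block")      # checksum recorded at write time

    verify "$block" "$stored" && echo "first read: checksum matches"

    # Simulate bit rot by overwriting a single byte in place.
    printf 'X' | dd of="$block" bs=1 seek=3 conv=notrunc 2>/dev/null

    if verify "$block" "$stored"; then
        echo "second read: checksum matches"
    else
        echo "second read: corruption detected"
    fi
    rm -f "$block"
}

demo
```

The second read fails verification because one byte changed. In ZFS, that failure is what triggers a repair from a redundant copy.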
3. Vdevs and Pools
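- Vdev (Virtual Device): A group of physical disks arranged for redundancy or capacity (e.g., a mirror or a RAID-Z group).
- Zpool: The highest-level structure. A pool is built from one or more vdevs, and data is striped across them, so losing an entire vdev means losing the pool. You create filesystems on top of the pool.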
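To get a feel for what a vdev layout costs in capacity, here is a rough back-of-the-envelope sketch. RAID-Z1, RAID-Z2, and RAID-Z3 reserve one, two, and three disks' worth of parity, while a mirror stores a full copy on every disk. The function name usable_capacity is hypothetical, and the arithmetic ignores metadata, padding, and reserved space, so real numbers will be somewhat lower.

```shell
#!/bin/sh
# usable_capacity LAYOUT DISKS SIZE_PER_DISK
# Prints the approximate usable capacity of one vdev, in the same unit as
# SIZE_PER_DISK. Assumption: equal-sized disks, no metadata or padding overhead.
usable_capacity() {
    layout=$1 disks=$2 size=$3
    case "$layout" in
        mirror) echo "$size" ;;                     # every disk holds a full copy
        raidz1) echo $(( (disks - 1) * size )) ;;   # one disk's worth of parity
        raidz2) echo $(( (disks - 2) * size )) ;;   # two disks' worth of parity
        raidz3) echo $(( (disks - 3) * size )) ;;   # three disks' worth of parity
        *) echo "unknown layout: $layout" >&2; return 1 ;;
    esac
}

usable_capacity raidz1 3 4   # three 4 TB disks in RAID-Z1: prints 8
usable_capacity mirror 2 4   # two 4 TB disks mirrored: prints 4
```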
Setting Up a Zpool
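Let’s assume you have three empty disks: /dev/sdb, /dev/sdc, and /dev/sdd. (On a real system, prefer the persistent names under /dev/disk/by-id/, since /dev/sdX letters can change between boots.) We can create a RAID-Z1 pool (similar to RAID 5, tolerating one disk failure):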
sudo zpool create mypool raidz1 /dev/sdb /dev/sdc /dev/sdd
That single command creates the RAID array, initializes the filesystem, and mounts it at /mypool.
You can check the health of your pool:
zpool status
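For unattended monitoring, zpool status -x is handy: it only reports pools that have problems and prints "all pools are healthy" otherwise. The following is a minimal sketch of a check you might run from cron. The function names pool_is_healthy and report are hypothetical, and how you deliver the warning (mail, chat, and so on) is left open.

```shell
#!/bin/sh
# pool_is_healthy STATUS_TEXT
# Returns 0 when the text is what `zpool status -x` prints for a healthy system.
pool_is_healthy() {
    [ "$1" = "all pools are healthy" ]
}

# report STATUS_TEXT: print a short summary, including details on failure.
report() {
    if pool_is_healthy "$1"; then
        echo "OK: all pools are healthy"
    else
        echo "WARNING: zpool reports a problem"
        echo "$1"
    fi
}

# Only query ZFS when the tool is actually installed.
if command -v zpool >/dev/null 2>&1; then
    report "$(zpool status -x)"
fi
```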
ZFS Datasets and Properties
Where you would otherwise create top-level directories, in ZFS you create datasets. Each dataset behaves like its own filesystem and can have its own properties, like compression, quotas, and mount points.
Create a dataset for a database:
sudo zfs create mypool/database
Enable LZ4 compression on that dataset (LZ4 is so fast it often improves disk performance):
sudo zfs set compression=lz4 mypool/database
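Once some data has been written, the read-only compressratio property reports how well compression is working, as a value such as 2.00x. As a small sketch, the helper below (a hypothetical name, space_saved_percent) turns that ratio into a percentage of space saved; it assumes the value has the usual trailing "x".

```shell
#!/bin/sh
# space_saved_percent RATIO
# Converts a compressratio value such as "2.00x" into the percentage saved.
space_saved_percent() {
    ratio=${1%x}    # strip the trailing "x"
    awk -v r="$ratio" 'BEGIN { printf "%.0f\n", (1 - 1 / r) * 100 }'
}

space_saved_percent 2.00x    # prints 50
space_saved_percent 4.00x    # prints 75

# On a host with the dataset from this guide, feed it the live value.
if command -v zfs >/dev/null 2>&1; then
    ratio=$(zfs get -H -o value compressratio mypool/database) &&
        space_saved_percent "$ratio"
fi
```

Keep in mind that compression only applies to data written after the property was set; existing blocks are not rewritten.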
Set a 50 GiB quota (ZFS reads the G suffix as a power-of-two unit):
sudo zfs set quota=50G mypool/database
The Magic of Snapshots
Because of Copy-On-Write, taking a snapshot of a ZFS filesystem is instantaneous and takes 0 bytes of extra space initially. It simply freezes the data pointers at that moment in time.
Take a snapshot:
sudo zfs snapshot mypool/database@backup-monday
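Hand-picked names like backup-monday get reused quickly, so scheduled snapshots are usually named after the date. Here is one possible sketch. The auto- prefix and the snapshot_name helper are assumptions for illustration, and the script only prints the command it would run.

```shell
#!/bin/sh
# snapshot_name DATASET [DATE]
# Builds "dataset@auto-YYYY-MM-DD"; DATE defaults to today.
snapshot_name() {
    dataset=$1
    day=${2:-$(date +%Y-%m-%d)}
    echo "${dataset}@auto-${day}"
}

name=$(snapshot_name mypool/database)
echo "would run: zfs snapshot $name"

# On a host with the pool from this guide:
#   sudo zfs snapshot "$name"
#   zfs list -t snapshot -r mypool/database    # list the dataset's snapshots
```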
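If you accidentally delete a table in your database, you can roll the entire filesystem back to that snapshot almost instantly. Be aware that a rollback discards every change made after the snapshot, and that by default it only works for the most recent snapshot; going further back requires the -r flag, which destroys the newer snapshots along the way: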
sudo zfs rollback mypool/database@backup-monday
Alternatively, you can browse the snapshot as a hidden read-only directory under the dataset's mount point, here /mypool/database/.zfs/snapshot/backup-monday/, to retrieve individual files without rolling anything back.
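Retrieving a single file is then an ordinary copy, because the .zfs directory sits at the root of the dataset's mount point. A minimal sketch, with a hypothetical helper restore_file and a made-up file name in the usage comment:

```shell
#!/bin/sh
# restore_file SNAPSHOT_DIR LIVE_DIR RELATIVE_PATH
# Copies one file from a read-only snapshot directory back into the live
# dataset, preserving mode and timestamps. Overwrites the live copy if present.
restore_file() {
    src="$1/$3"
    dst="$2/$3"
    [ -f "$src" ] || { echo "not in snapshot: $3" >&2; return 1; }
    mkdir -p "$(dirname "$dst")"
    cp -p "$src" "$dst"
}

# Example for the dataset used in this guide (conf/my.cnf is a made-up path):
#   restore_file /mypool/database/.zfs/snapshot/backup-monday /mypool/database conf/my.cnf
```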
ZFS Send and Receive
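ZFS allows you to serialize a snapshot into a data stream and send it over SSH to another server. The first transfer, shown here, sends the complete snapshot. Later transfers can use zfs send -i to send only the blocks that changed between two snapshots, which makes this a very efficient way to back up massive amounts of data. The receiving user needs permission to run zfs receive on the backup server.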
sudo zfs send mypool/database@backup-monday | ssh backup-server zfs receive backuppool/database
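For follow-up backups, an incremental stream between two snapshots is what keeps transfers small. The sketch below only prints the pipeline rather than running it. The helper name incremental_send_cmd and the second snapshot, backup-tuesday, are assumptions for illustration, and the receiving side must already hold the older snapshot.

```shell
#!/bin/sh
# incremental_send_cmd DATASET OLD_SNAP NEW_SNAP HOST TARGET
# Prints (does not run) the pipeline for an incremental replication:
# only blocks that changed between OLD_SNAP and NEW_SNAP are sent.
incremental_send_cmd() {
    echo "zfs send -i $1@$2 $1@$3 | ssh $4 zfs receive $5"
}

incremental_send_cmd mypool/database backup-monday backup-tuesday \
    backup-server backuppool/database
```

Review the printed command, then run it with the privileges your setup requires.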
Conclusion
While ZFS is more memory-hungry than ext4 (due to its adaptive replacement cache, or ARC), the features it provides, including instant snapshots, transparent compression, and checksummed data integrity, make it the premier choice for NAS devices, database servers, and virtualization hosts on Linux.