In this post I won't go into detail about what ZFS is or how it works; I will only describe the problem I had and how I solved it. Maybe someone else will run into a similar problem, and I want to save them the stress, sleepless nights and nerves. The solution looks very simple in hindsight, but at the time I was stressed because I could not find a way to recover 10 years' worth of data.
I have two TrueNAS servers: the main NAS, which runs on a bare-metal machine, and a backup NAS, which runs as a virtual machine on the VMware ESXi hypervisor.
The main TrueNAS server was causing me some problems, so I had to shut it down. During the repair I inadvertently formatted both disks and lost two pools with my data. I wasn't too worried because I knew I had a backup server with all my data. Once I had repaired the main TrueNAS server and put it back into operation, I enabled a Replication task to replicate the pools from the backup TrueNAS server. Everything was fine until the power went out in the middle of copying the data. The power has gone out before, but whenever it came back, everything turned on and returned to normal.
Yes, yes, I don't have a UPS because it's currently out of my budget, but after this experience I definitely have to buy a quality UPS.
This time, however, things did not return to normal. When the backup TrueNAS server came back up, its pool was offline. I tried to disconnect the pool and re-import it, but I got the error “I/O error importing pool”, as shown in the picture below.
I tried to import the pool through the shell and got the same error.
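For readers who have not imported a pool from the shell before, here is a minimal sketch of what such an attempt can look like. The function name list_and_import, the ZPOOL variable and the pool name are my own placeholders for illustration, not something taken from TrueNAS.

```shell
#!/bin/sh
# Sketch only. ZPOOL can be overridden, e.g. to test without a real pool.
ZPOOL="${ZPOOL:-zpool}"

list_and_import() {
    pool="$1"

    # With no pool name, "zpool import" only lists pools available for import.
    $ZPOOL import

    # A plain import by name; on a damaged pool this is where the error appears.
    if ! $ZPOOL import "$pool"; then
        echo "import of $pool failed" >&2
        return 1
    fi
}
```

It would be called as list_and_import your_pool_name, run as root. In my case a plain import like this is the step that failed with the I/O error.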
After several days of painstaking reading and learning about ZFS on the Internet and writing on forums, I came across this command:
zpool import -fFX your_pool_name
On the OpenZFS GitHub page I found what these options do:
-f = Forces import, even if the pool appears to be potentially active.
-F = Recovery mode for a non-importable pool. Attempt to return the pool to an importable state by discarding the last few transactions. Not all damaged pools can be recovered by using this option. If successful, the data from the discarded transactions is irretrievably lost. This option is ignored if the pool is importable or already imported.
-X = Used with the -F recovery option. Determines whether extreme measures to find a valid txg should take place. This allows the pool to be rolled back to a txg which is no longer guaranteed to be consistent. Pools imported at an inconsistent txg may contain uncorrectable checksum errors. For more details about pool recovery mode, see the -F option, above. WARNING: This option can be extremely hazardous to the health of your pool and should only be used as a last resort.
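Since -X is described as a last resort, it may be worth trying the gentler options first. The sketch below shows one possible order, not an official procedure: a forced read-only import, then -F alone, and only then a reminder of the -fFX command, which it never runs on its own. The function name try_import and the ZPOOL variable are assumptions of mine for illustration.

```shell
#!/bin/sh
# Sketch only: try gentler import options before resorting to -X.
# ZPOOL can be overridden, which allows a dry test without a real pool.
ZPOOL="${ZPOOL:-zpool}"

try_import() {
    pool="$1"

    # Step 1: forced read-only import, so nothing is written to the pool.
    if $ZPOOL import -f -o readonly=on "$pool"; then
        echo "imported read-only"
        return 0
    fi

    # Step 2: recovery mode, which discards the last few transactions.
    if $ZPOOL import -f -F "$pool"; then
        echo "imported with -F"
        return 0
    fi

    # Step 3: -X is deliberately not run automatically; only print the command.
    echo "last resort: $ZPOOL import -fFX $pool"
    return 1
}
```

The OpenZFS manual also documents an -n option for use with -F, which reports whether recovery would be possible without actually performing it. I cannot say whether the gentler steps would have worked on my pool, because I only found the -fFX command.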
Running zpool import -fFX took more than 6 hours for me (the time depends on the size of your pool). It is important not to interrupt the process; you must let the command finish. When it's done, you still won't be able to see your pool. You must restart your TrueNAS server and then import the pool through the GUI or the shell.
Then just create an SMB share and you can access your data.
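Because a pool imported with -X may contain uncorrectable checksum errors, it seems sensible to check it once it is imported normally again. This is a minimal sketch of such a check; the function name check_pool and the ZPOOL variable are hypothetical, and it is an extra step, not part of what I originally did.

```shell
#!/bin/sh
# Sketch only. ZPOOL can be overridden, e.g. to test without a real pool.
ZPOOL="${ZPOOL:-zpool}"

check_pool() {
    pool="$1"

    # Show pool health; -v also lists files with known data errors.
    $ZPOOL status -v "$pool" || return 1

    # Start a scrub, which re-reads the data and verifies its checksums.
    $ZPOOL scrub "$pool"
}
```

The scrub runs in the background, so running zpool status on the pool again later shows its progress and any errors it found.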
Conclusions for me and for you:
- Use the 3-2-1 backup method: 3 copies of your data, on 2 different types of media, with 1 copy kept off-site as a remote backup.
- Use a quality UPS if possible, but if you are on a tight budget like me, even a cheap UPS will help. Something is better than nothing.
Check the video below for the practical solution.