Guidelines for using hard drives in an environment subject to abrupt power failures

Considerable work in recent years has gone into making the QNX 4 filesystem design more resistant to power failures (through the costly use of synchronised writes to enforce ordering), but the filesystem wasn't designed to handle corrupted sectors (either real bad blocks or virtual ECC ones).

During a power failure, this type of corruption occurs only when a lack of atomic sector writes causes an ECC error to manifest as a bad block. On hard drives that retain valid sector contents, the filesystems use ordered writes whenever a multi-sector metadata update is required. If power is lost partway through (so that only some of the writes were made), they err on the side of lost resources rather than filesystem corruption.

For example, when growing a file in the QNX 4 filesystem, the blocks are first marked as used in the bitmap (a write to the bitmap) and then assigned to the file (a write to the inode). If a power failure occurs between the two writes, some blocks may be marked as used in the bitmap without belonging to any file. With the alternative ordering, the file would be using blocks that the bitmap still shows as free; those blocks could then be allocated to another file, resulting in cross-linked files.
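The effect of the write ordering described above can be illustrated with a toy model. This is only a sketch and not QNX 4 code: the bitmap is modelled as a set of block numbers, the inode as a list of blocks, and the names grow_file and classify are hypothetical helpers invented for this illustration. Power loss is simulated by letting only the first few metadata writes reach the disk.

```python
# Toy model only, not QNX 4 code. The "bitmap" is a set of used block
# numbers and the "inode" is the list of blocks owned by one file.
# grow_file and classify are hypothetical names used for illustration.

def grow_file(order, crash_after):
    """Apply the first `crash_after` metadata writes in the given order.

    order: sequence of "bitmap" and "inode" naming the write sequence.
    crash_after: number of writes that reach the disk before power is lost.
    Returns the resulting on-disk state as (bitmap, inode_blocks).
    """
    bitmap = {0, 1}          # blocks already marked as used
    inode_blocks = [0, 1]    # blocks already owned by the file
    new_block = 2            # block being added to the file

    for step in order[:crash_after]:
        if step == "bitmap":
            bitmap.add(new_block)
        elif step == "inode":
            inode_blocks.append(new_block)
    return bitmap, inode_blocks


def classify(bitmap, inode_blocks):
    """Describe the on-disk state found after the simulated power loss."""
    owned = set(inode_blocks)
    if owned - bitmap:
        return "cross-link risk"   # file uses a block the bitmap shows as free
    if bitmap - owned:
        return "leaked block"      # block marked used but owned by no file
    return "consistent"


safe = ("bitmap", "inode")      # ordering described in the text
unsafe = ("inode", "bitmap")    # the alternative ordering

for writes_done in range(3):
    print(writes_done,
          classify(*grow_file(safe, writes_done)),
          "|",
          classify(*grow_file(unsafe, writes_done)))
```

With the bitmap-first ordering, a power loss after the first write leaves a block that is marked as used but owned by no file, which wastes space without damaging any file. With the inode-first ordering, the same interruption leaves the file owning a block that the bitmap still reports as free.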

Our recent investigations do show that physical I/O errors (ECC errors) with loss of data are possible. It's very difficult for the current QNX 4 filesystem, which overwrites metadata in place, to prevent this situation.

Based on our investigation of multiple examples of corrupted filesystems, we have compiled this document with recommendations that should help limit catastrophic damage to the hard disk (i.e., the disk can no longer be mounted).

Implementing some of the strategies described below, which limit catastrophic filesystem corruption, together with the associated recovery procedures, should be less risky to projects than introducing radical filesystem changes.