tl;dr
A hard drive is a relatively fragile data store. Once the first indicators of a failure show up, the drive might die suddenly. This blog post therefore discusses the gradual acquisition of evidence from a failing drive by combining the open source tools ddrescue and partclone. To spare the mechanics of the drive and acquire the most critical data first, partclone is used to create a so-called domain file, which records the used blocks of the file system.
# Record actually used blocks in domainfile
partclone.[fstype] -s /dev/sdXY -D --offset_domain=$OFF_IN_BYTES \
    -o domainfile
This is the basis for ddrescue's recovery process, which uses a block-size-changing algorithm and, for the moment, covers only these areas.
# Restrict rescue area via domainfile
ddrescue --domain-log domainfile /dev/sdX out.img mapfile
Afterward, additional runs can be conducted to acquire the remaining sectors.
Background
Since HDDs are rather sensitive mechanical components, it is not uncommon for them to exhibit read errors after a certain amount of usage, either due to wear and tear of the magnetic platters or as a consequence of shock events that cause mechanical damage inside the drive. So-called head crashes, which most commonly occur when the HDD is dropped during regular operation, might be lethal for an HDD and would require a complete dismantling of the drive in a specialized laboratory. Grinding sounds are typical for such a scenario and require an immediate stop of operation. Minor shock events, and/or shocks that occur while the actuator arm is in its "parking position", might not lead to great physical damage, but can still result in mechanical malfunction and read/write errors. This regularly leads to clicking, knocking, or ticking noises, which stem from abnormal behavior of the disk's read/write head as it repeatedly tries to read a sector.
If a hard disk makes noise, data loss is likely to occur soon or has already happened. Grinding or screeching noises should be taken as a signal to power down the device immediately and hand it over to a specialized data recovery laboratory to secure the remaining evidence. Given only minor clicking or knocking noises, one might try to recover the data with the help of specialized software as soon as possible, as discussed in this blog post.
Acquisition of data from erroneous drives
Standard approach with GNU ddrescue
GNU ddrescue is the go-to tool for data recovery tasks with open source tooling [1]. It maximizes the amount of recovered data by reading the unproblematic sectors first and postponing areas with read errors to later stages, keeping track of all visited sectors in a so-called mapfile. ddrescue has an excellent and exhaustive manual to consult [2]. As a first glimpse, ddrescue's procedure, which employs a block-size-changing algorithm, can be summarized as follows: by default, its operation is divided into four phases, of which the first and the last can be split into several passes. Each phase consults and updates the mapfile to keep track of the status of each sector (or area).
- Copying: Read non-tried parts, forward and backward, with increasing granularity in each pass. Blocks that could not be read are recorded as non-trimmed in the mapfile.
- Trimming: Blocks that were marked as non-trimmed are trimmed in this phase. This means reading forward from the block's leading edge, sector by sector, until a read error is encountered, then reading backward from the edge at the block's end until a sector read fails. The sectors in between are recorded as non-scraped in the mapfile.
- Scraping: In this phase, non-scraped blocks are scraped forward sector by sector, and unreadable sectors are marked as bad.
- Retrying: Lastly, the bad sectors can be retried n times, with the read direction reversed on each try. This phase is disabled by default and can be enabled via the parameter --retry-passes=n.
Unreadable sectors are filled with zeros in the resulting image (or device).
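The phases above leave their traces in the mapfile as status characters: per the ddrescue manual, ? marks non-tried, * non-trimmed, / non-scraped, - bad-sector, and + finished blocks. As a minimal sketch, the following shell function sums up the bytes per status to show how far a run has progressed. The name mapfile_summary is made up for illustration, and the function assumes the usual layout in which the first non-comment line holds the current position and all following lines are pos/size/status triples.

```shell
#!/bin/sh
# Hypothetical helper (not part of ddrescue): sum the block sizes in a
# ddrescue mapfile per status character.
mapfile_summary() {
    # Drop comments and blank lines, then skip the first remaining line,
    # which holds the current position and status rather than a block.
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1" | tail -n +2 | {
        non_tried=0 non_trimmed=0 non_scraped=0 bad=0 finished=0
        while read -r pos size status; do
            # $size is a hex literal such as 0x00010000, which shell
            # arithmetic parses directly.
            case $status in
                '?') non_tried=$(( $non_tried + $size )) ;;
                '*') non_trimmed=$(( $non_trimmed + $size )) ;;
                '/') non_scraped=$(( $non_scraped + $size )) ;;
                '-') bad=$(( $bad + $size )) ;;
                '+') finished=$(( $finished + $size )) ;;
            esac
        done
        echo "non-tried $non_tried"
        echo "non-trimmed $non_trimmed"
        echo "non-scraped $non_scraped"
        echo "bad-sector $bad"
        echo "finished $finished"
    }
}
```

Calling mapfile_summary mapfile between runs prints one line per status with the byte count, so a shrinking non-tried or non-trimmed figure indicates that the copying or trimming phase is still making progress.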
Using ddrescue with its sane default settings is as simple as running
ddrescue /dev/sdX out.img mapfile
To activate direct disk access and bypass kernel caching, use -d/--idirect and set the sector size via -b/--sector-size. An indicator for kernel caching is that the positions and sizes in the mapfile are always a multiple of the sector size [3].
# Check the disk's sector size
SECTOR_IN_BYTES=$(cat /sys/block/sdX/queue/physical_block_size)
# Run ddrescue with direct disk access
ddrescue -d -b $SECTOR_IN_BYTES /dev/sdX out.img mapfile
Gradual approach by combining partclone and ddrescue
While the straightforward sector-by-sector copying of a failing HDD with ddrescue often yields good results, it might be very slow. Acquiring evidence after the damage has occurred is a race against the clock, because the probability of an ultimate drive failure increases with every rotation of the platter. One might therefore want to ensure that critical data gets acquired first by determining the blocks actually used by the filesystem and prioritizing those [4]. To accomplish this, partclone, an open source tool for cloning partitions, comes into play alongside ddrescue.
partclone "provide[s] utilities to backup used blocks" and supports most of the widespread filesystems, like ext{2,3,4}, btrfs, xfs, NTFS, FAT, exFAT and even Apple's HFS+ [5]. One of its features is the ability to list "all used blocks as domain files", so that "it could make ddrescue smarter and faster when dumping a partition" [4]. partclone operates similarly to ddru_ntfsbitmap from ddrutility, which extracts the bitmap file from an NTFS partition and creates a domain file [6], but it works with other filesystems as well: it inspects their block allocation structures to determine the used blocks and stores those in the aforementioned domain mapfile [7]. The term rescue domain describes the "[b]lock or set of blocks to be acted upon" [8]. When --domain-mapfile=file is specified (older ddrescue releases call this option --domain-log), ddrescue is restricted to the areas that are marked with a + in that file [9].
Generating a domain mapfile
To generate a domain file, simply use partclone with the -D flag and specify the resulting domain file via -o:
partclone.[fstype] -s /dev/sdXY -D -o sdXY.mapfile
If you want to run ddrescue on the whole disk and not just on the partition, in order to image the entire disk iteratively, it is necessary to use --offset_domain=N, which specifies the offset in bytes to the start of the partition. This offset is added to all position values in the resulting domain mapfile. To create such a file, use the following commands:
# Retrieve the offset in sectors from the disk's partition table
# (mmls zero-pads the value; strip the padding so that the shell does
# not parse it as octal)
OFF_IN_SECTORS=$(mmls /dev/sdX | \
    awk '{ if ($2 == "001") { sub(/^0+/, "", $3); print $3 } }')
# Retrieve sector size
SECTOR_IN_BYTES=$(mmls /dev/sdX | grep -P 'in\s\d*\-byte sectors' | \
    grep -oP '\d*')
# Calculate offset
OFF_IN_BYTES=$((OFF_IN_SECTORS * SECTOR_IN_BYTES))
# Create domain file
partclone.[fstype] -s /dev/sdXY -D --offset_domain=$OFF_IN_BYTES \
    -o domainfile
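The offset calculation can also be wrapped in a small function that reads mmls output from standard input, which makes it possible to try it on saved output before touching the failing disk. This is only a sketch: the function name partition_offset_bytes is hypothetical, and it assumes the usual mmls layout with a "Units are in N-byte sectors" line and the slot in the second column. mmls pads the start sector with leading zeros, so the function strips them before doing arithmetic, because the shell would otherwise read the value as octal.

```shell
#!/bin/sh
# Hypothetical helper: print the byte offset of the partition in the given
# slot. Reads mmls output on stdin; $1 is the slot as shown in the second
# column of the mmls listing (for example 001).
partition_offset_bytes() {
    slot=$1
    mmls_out=$(cat)
    # Sector size from the "Units are in 512-byte sectors" header line.
    sector=$(printf '%s\n' "$mmls_out" |
        sed -n 's/^Units are in \([0-9][0-9]*\)-byte sectors.*$/\1/p')
    # Start sector of the first row whose slot column matches; appending ""
    # forces a string comparison.
    start=$(printf '%s\n' "$mmls_out" |
        awk -v slot="$slot" '($2 "") == (slot "") { print $3; exit }')
    [ -n "$sector" ] && [ -n "$start" ] || return 1
    # Strip zero padding so the value is not interpreted as octal.
    start=$(printf '%s\n' "$start" | sed 's/^0*//')
    echo $(( ${start:-0} * $sector ))
}
```

For example, OFF_IN_BYTES=$(mmls /dev/sdX | partition_offset_bytes 001) yields the value to pass to --offset_domain; the function returns a non-zero status if the slot or the sector size cannot be found.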
The resulting domain file looks as follows:
cat domainfile
# Domain logfile created by unset_name v0.3.13
# Source: /dev/sdXY
# Offset: 0x3E900000
# current_pos  current_status
0xF4240000     ?
#      pos        size  status
0x3E900000  0x02135000  +
0x40A35000  0x05ECB000  ?
0x46900000  0x02204000  +
0x48B04000  0x005FC000  ?
<snip>
The offset at the top denotes the beginning of the file system. The current_pos value corresponds to the last sector used by the file system. Used areas are marked with a + and unused areas with a ? [7].
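To read such a listing, it helps to check the arithmetic once: each block ends where the next one starts, and the offset can be converted back to a sector number. The following sketch does this with plain shell arithmetic, which understands the hexadecimal notation used in the file. The helper names are made up for illustration, and the sector size of 512 bytes is an assumption about the disk.

```shell
#!/bin/sh
# Hypothetical helpers for reading a domain file by hand.

# End position (exclusive) of a block, printed in the listing's hex notation.
block_end() {
    printf '0x%08X\n' $(( $1 + $2 ))
}

# Convert a byte position to a sector number for a given sector size.
bytes_to_sectors() {
    echo $(( $1 / $2 ))
}

# The first used block ends where the following unused block begins.
block_end 0x3E900000 0x02135000      # prints 0x40A35000
# Partition offset expressed in 512-byte sectors (assumed sector size).
bytes_to_sectors 0x3E900000 512      # prints 2050048
```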
Acquiring the used blocks with ddrescue
To acquire only those areas with ddrescue that are actually used by the file system and have, therefore, been marked with a + in the domain file, use the following command.
# Clone only blocks, which are actually used (of part Y)
ddrescue --domain-log domainfile /dev/sdX out.img mapfile
# Check if acquisition was successful
fsstat -o $OFF_IN_SECTORS out.img
Since you already know the offset, you may skip cloning the partition table on the first run. After the above command has completed, the mission-critical file system blocks should have been acquired, as far as they were readable. This can be double-checked by diffing the domain file and the mapfile: diff -y domainfile mapfile
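A side-by-side diff has to be read by eye. As a rough alternative, the comparison can be scripted: the sketch below checks that every block marked + in the domain file lies entirely within finished (+) areas of the rescue mapfile. All function names are hypothetical. The sketch assumes that both files use the pos/size/status layout shown above, that the first non-comment line is the current-position line, and that blocks are sorted by position. It rereads the mapfile for every used block, so it is meant for illustration rather than for very large mapfiles.

```shell
#!/bin/sh
# Hypothetical helpers; not part of ddrescue or partclone.

# Print the "pos size status" lines of a mapfile or domain file: drop
# comments and blank lines, then skip the current-position line.
map_blocks() {
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1" | tail -n +2
}

# Succeed if the byte range [$1, $2) lies entirely within finished ('+')
# blocks of mapfile $3. Relies on the blocks being sorted by position.
range_finished() {
    rf_start=$1 rf_end=$2
    map_blocks "$3" | {
        need=$rf_start    # first byte not yet shown to be finished
        while read -r pos size status; do
            [ "$status" = "+" ] || continue
            b_start=$(( $pos )) b_end=$(( $pos + $size ))
            # A finished block containing the needed byte extends coverage.
            if [ "$b_start" -le "$need" ] && [ "$b_end" -gt "$need" ]; then
                need=$b_end
            fi
        done
        [ "$need" -ge "$rf_end" ]
    }
}

# Succeed if every used ('+') block of domain file $1 is finished in the
# rescue mapfile $2; print the blocks that are not.
domain_covered() {
    dc_map=$2
    map_blocks "$1" | {
        missing=0
        while read -r d_pos d_size d_status; do
            [ "$d_status" = "+" ] || continue
            d_start=$(( $d_pos )) d_end=$(( $d_pos + $d_size ))
            if ! range_finished "$d_start" "$d_end" "$dc_map"; then
                echo "not fully rescued: $d_pos $d_size"
                missing=1
            fi
        done
        [ "$missing" -eq 0 ]
    }
}
```

Running domain_covered domainfile mapfile exits with status 0 if all used blocks were rescued; otherwise it lists the affected domain blocks, which are candidates for another ddrescue run with retries.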
Acquiring the remaining blocks with ddrescue
The additional sectors of the disk, which might contain previously deleted data, can then be imaged in a subsequent, lengthy run without having to fear a definitive drive failure too much. To do this, simply supply ddrescue with the mapfile from the previous run, this time without restricting the rescue domain, so that it adds the remaining blocks, which were so far either zero-filled or omitted entirely:
# Clone remaining blocks
ddrescue /dev/sdX out.img mapfile
# Check result by inspecting the partition table
mmls out.img
After the completion of this procedure, which is a fragile process on its own, some kind of integrity protection should be employed, even though the source medium itself could not be hashed. For example, this could be done by hashing the artifacts and signing the file that contains the hashes.
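One possible way to do this is sketched below with sha256sum from GNU coreutils. The function names and the manifest name hashes.sha256 are made up for illustration, and the signing step is shown only as a comment because it requires a configured GnuPG key.

```shell
#!/bin/sh
# Hypothetical helper: write SHA-256 hashes of the given files to a manifest.
# Usage: hash_artifacts MANIFEST FILE...
hash_artifacts() {
    manifest=$1
    shift
    sha256sum -- "$@" > "$manifest"
}

# Re-check the files listed in a manifest; succeeds only if all hashes match.
verify_artifacts() {
    sha256sum --check --quiet -- "$1"
}

# Example (commented out; needs the real artifacts and a GnuPG key):
#   hash_artifacts hashes.sha256 out.img mapfile domainfile
#   gpg --armor --detach-sign hashes.sha256   # writes hashes.sha256.asc
```

Because the manifest records the file names as they were passed in, verify_artifacts has to be run from the same directory, unless absolute paths were used.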
Summary
This blog post discussed the usage of ddrescue as well as the gradual imaging of damaged drives. To acquire mission-critical data first and in a timely manner, partclone was used to determine the blocks that are actually used by the file system residing on the partition in question. This information was recorded in a so-called domain file and fed to ddrescue via the command line parameter --domain-log, so that the tool limits its operation to the blocks specified there. Afterward, another lengthy run could be initiated to image the remaining sectors.
Footnotes:
[1] Install it via sudo apt install gddrescue