Logical imaging with AFF4-L

<2021-08-03>

tl;dr

AFF4-L containers are an efficient and forensically sound way of storing selectively imaged evidence. Two open-source implementations exist for this task: pyaff4 and c-aff4. To image a directory with pyaff4's aff4.py, run:

python3 aff4.py --verbose --recursive --create-logical \
$(date +%Y-%m-%d)_EVIDENCE_1.aff4 /path/to/dir

If you have to acquire logical files on Windows systems, c-aff4 can be used with PowerShell to conveniently recurse a relevant directory:

Get-ChildItem -Recurse C:\Users\sysprog\Desktop\ `
 | ForEach {$_.FullName} | .\aff4imager.exe --input '@' `
 --output .\$(Get-Date -UFormat '+%Y-%m-%dT%H%M%S')_EVIDENCE_2.aff4

Although those two tools are helpful, it has to be noted that open-source tooling for working with AFF4-L containers does not get the attention it deserves.

Motivation

Numerous factors call for logical rather than physical acquisitions. Data is constantly moving into the cloud, encryption is becoming more widespread, and disk capacity keeps growing along with the amount of data users store. As a result, physical imaging in every case or incident is either inefficient or outright impossible because of high storage requirements, time criticality, and data availability. Furthermore, full disk images are often neither needed nor useful for answering the investigative questions. A selective seizure of the relevant data therefore makes far more sense. Determining which evidence is relevant is another story, but where that is possible, selective imaging is likely the preferred solution because of its efficiency. This raises the question of how to acquire logical files in a forensically sound manner and how to store them for later analysis.

Central aspects when acquiring logical files are the preservation of metadata and timestamps as well as the assurance of the integrity of the acquired data. There are several options for doing so, which differ in complexity and suitability. In the simplest case, think of creating a timestamp-preserving TAR archive with tar --create --atime-preserve=system --file ... ¹, but this can be done much better, as the following section explains.
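
To make these two requirements concrete, here is a minimal Python sketch, not taken from any AFF4 tool, that records what a logical acquisition has to preserve for a single file: its timestamps and its cryptographic hashes. The function name describe_file and the dictionary keys are made up for illustration.

```python
import hashlib
import os
from datetime import datetime, timezone


def describe_file(path, chunk_size=1024 * 1024):
    """Collect timestamps and hashes of one file (hypothetical helper)."""
    # Read the timestamps first: hashing the content below may update atime.
    st = os.stat(path)

    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as fh:
        # Hash in chunks so that large files do not have to fit into memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)

    def iso(ts):
        return datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()

    return {
        "path": os.path.abspath(path),
        "size": st.st_size,
        "last_accessed": iso(st.st_atime),
        "last_written": iso(st.st_mtime),
        # Inode change time on Unix; creation time on Windows.
        "record_changed": iso(st.st_ctime),
        "md5": md5.hexdigest(),
        "sha1": sha1.hexdigest(),
    }
```

A record like this, taken before and after the acquisition, is one simple way to argue that neither the content nor the timestamps were altered along the way.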

Logical containers with AFF4-L

In the past, the widely used tool FTK Imager created AD1 containers during logical acquisitions ². This aging format, however, is poorly supported by open-source tools and even by X-Ways. There was no widely adopted standard for interoperability, and the solutions for storing logical acquisitions were far from ideal because they preserved less metadata than a forensic analyst requires ³.

AFF4-L, developed by Bradley Schatz ⁴, comes to the rescue. It aims to provide an open standard that is human-interpretable with Zip tooling (like 7zip), offers efficient access to many and large logical files, and stores arbitrary metadata ⁵. Two years after its presentation at DFRWS 2019, the container format is being adopted only slowly and by few commercial tools, e.g., Axiom ⁶, while Autopsy still lacks support for it.

Since this blog post is geared toward the practical work of seizing digital evidence, I'll skip the internals of this forensic container format, the academic thoughts behind it, and the engineering for its realization. So let's have a look at the tools for performing an actual acquisition.

Open source tooling for working with AFF4-L

Currently, there exist two open-source implementations: pyaff4 and c-aff4. The former is the initial imaging/container-handling software, written in Python and provided with all publications on AFF4, while the latter is a C/C++ implementation by Mike Cohen.

Pyaff4

To get pyaff4 up and running, grab the code from GitHub and install it via pip:

# Get the code and install the pyaff4 lib
git clone https://github.com/aff4/pyaff4.git
cd pyaff4
pip3 install .

# Check the help page
python3 aff4.py -h

Then you can perform logical acquisitions via the CLI switch -c or --create-logical. To "image" single files, use the following commands:

python3 aff4.py --verbose -c evidence.aff4 /path/to/target

# Now append via -a/--append
python3 aff4.py --verbose -c -a evidence.aff4 /path/to/target

To acquire a directory-structure recursively, use the CLI-switch -r (--recursive):

python3 aff4.py --verbose -r -c evidence.aff4 /path/to/target/dir
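
If the acquisition is scripted, the command line can be assembled in Python. The following is only a sketch: it assumes that aff4.py lies in the current working directory and it reuses the date-based naming scheme $(date +%Y-%m-%d)_EVIDENCE_1.aff4 from the tl;dr. The helper names build_acquire_command and acquire are hypothetical.

```python
import datetime
import subprocess
import sys


def build_acquire_command(target_dir, label="EVIDENCE_1", script="aff4.py"):
    """Return (argv, container) for a recursive logical acquisition."""
    # Same naming scheme as $(date +%Y-%m-%d)_EVIDENCE_1.aff4
    container = f"{datetime.date.today():%Y-%m-%d}_{label}.aff4"
    argv = [
        sys.executable, script,
        "--verbose", "--recursive", "--create-logical",
        container, target_dir,
    ]
    return argv, container


def acquire(target_dir, **kwargs):
    """Run the acquisition; raises CalledProcessError if aff4.py fails."""
    argv, container = build_acquire_command(target_dir, **kwargs)
    subprocess.run(argv, check=True)
    return container
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.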

Afterward, you can verify and view the gathered data with aff4.py via --verify and --list or, alternatively, inspect the container with a common Zip program (7zip is used here):

# Verify the container's contents
python3 aff4.py --verify evidence.aff4

# List the container contents
python3 aff4.py -l evidence.aff4

# View AFF4-L contents with a Zip program (e.g. `apt install p7zip-full`)
7z l evidence.aff4
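
Since the container can be listed with ordinary Zip tooling, the same check can presumably be scripted with Python's standard zipfile module. This is a sketch under that assumption and has not been tested against every AFF4-L producer; list_container_members is a made-up helper name.

```python
import zipfile


def list_container_members(container_path):
    """Return (name, size) pairs for every Zip member of the container."""
    with zipfile.ZipFile(container_path) as zf:
        return [(info.filename, info.file_size) for info in zf.infolist()]
```

Member names are returned as stored in the archive, so they need not match the original paths literally.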

For a thorough inspection of the content and the metadata use:

python3 aff4.py -m evidence.aff4 | less

This prints output on stdout similar to the following text, which describes a single file with its UUID, filename, hashes, timestamps, etc.:

@prefix aff4: <http://aff4.org/Schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<aff4://5fef26cc-ca70-4a85-ab9d-8200cc9c6e3d/path/to/target/file> a aff4:FileImage,
	aff4:Image,
	aff4:ZipSegment ;
    aff4:birthTime "2021-07-21T18:00:50.176313+02:00"^^xsd:dateTime ;
    aff4:hash "df4d6b951343f74e2dce8b279582e880"^^aff4:MD5,
	"ac28e46bd2de6868bdd86f20006c9e360fc4ed52"^^aff4:SHA1 ;
    aff4:lastAccessed "2021-08-03T09:26:31.013917+02:00"^^xsd:dateTime ;
    aff4:lastWritten "2021-07-21T18:00:36+02:00"^^xsd:dateTime ;
    aff4:originalFileName "/path/to/target/file"^^xsd:string ;
    aff4:recordChanged "2021-07-21T18:11:30.107518+02:00"^^xsd:dateTime ;
    aff4:size 478720 .
<snip>
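
The recorded hashes can be used to double-check an extracted file. The sketch below pulls the MD5 and SHA1 literals out of such a metadata dump with a regular expression and compares them with freshly computed digests. It assumes that the text passed in describes exactly one file; for anything more serious, a proper RDF parser would be the more robust choice. The function names are hypothetical.

```python
import hashlib
import re

# Matches literals such as "df4d..."^^aff4:MD5 in the Turtle output.
HASH_LITERAL = re.compile(r'"([0-9a-fA-F]+)"\^\^aff4:(MD5|SHA1)')


def hashes_from_metadata(turtle_text):
    """Map the hash type ('MD5', 'SHA1') to its lowercase hex digest."""
    return {kind: digest.lower()
            for digest, kind in HASH_LITERAL.findall(turtle_text)}


def matches_metadata(file_path, recorded):
    """Recompute every recorded hash type for file_path and compare."""
    algorithms = {"MD5": hashlib.md5, "SHA1": hashlib.sha1}
    with open(file_path, "rb") as fh:
        data = fh.read()  # fine for a sketch; hash in chunks for large files
    # An empty mapping proves nothing, so treat it as a failed check.
    return bool(recorded) and all(
        algorithms[kind](data).hexdigest() == digest
        for kind, digest in recorded.items()
    )
```

A typical use would be to feed the output of the -m switch for one file into hashes_from_metadata and then call matches_metadata on the extracted copy.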

Finally, you can extract single files from the container with -x (short for --extract) or extract all files to a specified folder (-f) with -X (--extract-all):

# Extract all files to dir
python3 aff4.py -X -f ~/CASE_123/dump evidence.aff4

Note: Always use absolute pathnames, because .. gets resolved during the extraction.

So, in summary, this tool comes in handy if you want to selectively image data from media that is connected to your forensic workstation (with a write blocker, of course). Another use case might be mounting an SMB share in the target environment, which could be a compromised enterprise network, a suspect's workplace, etc., to extract case-relevant data selectively. A third use case could be to surgically acquire data from virtual machine images of servers, which can be mounted read-only with qemu-nbd and then selectively imaged with pyaff4 for later analysis.

However, if you have to or want to perform live response directly on a system under investigation, pyaff4 might not be the best tool. Even if you package it as a PE file (with PyInstaller or the like), it seems to be very bloated and leaves a big footprint; c-aff4 is therefore worth considering as an alternative.

c-aff4

c-aff4 "implements a C/C++ library for creating, reading and manipulating AFF4 images" ⁷. It provides the canonical aff4imager, which is a general purpose standalone imaging tool. This binary is useful to perform acquisitions during live responses and triage analyses.

Compilation

It is most convenient to rely on the provided Docker setup to compile c-aff4 from source for Windows:

# Get the source
git clone https://github.com/Velocidex/c-aff4.git
cd c-aff4

# Compile it for windows
docker-compose up

# Inspect the results
ls -l build/win32/*

After running the build process, you'll find a self-contained PE file named aff4imager.exe inside the directory ./build/win32/bin/, which can become an integral part of your live response toolkit. Consider renaming it and modifying its signature to defeat anti-forensic measures; for inspiration on this, see my toolkit-obfuscator repository ⁸.

Instructions on the build process for Linux and macOS can be found in the above-mentioned repository.

Usage

The 32-bit PE file aff4imager.exe can be used conveniently on probably any Windows machine you'll come across. If PowerShell is available, which should be the case for most machines running in a production environment, it becomes very easy to image a specific directory recursively by running the following command ⁹:

Get-ChildItem -Recurse C:\Users\sysprog\Desktop\ `
 | ForEach {$_.FullName} | .\aff4imager.exe --input '@' `
 --output .\$(Get-Date -UFormat '+%Y-%m-%dT%H%M%S')_EVIDENCE_2.aff4

Using the @-sign as the input filename (after --input) makes aff4imager read the list of files to acquire from stdin and place them in the resulting container, which is specified by --output. Note that if you want to append to an AFF4-L container, you must specify the container twice, once with -o and once as a positional argument.
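
Where PowerShell is not an option but a Python interpreter is, the same stdin mechanism could be driven from a script. The sketch below is an assumption-laden illustration, not part of c-aff4: it walks a directory, emits regular files only (unlike Get-ChildItem, which also emits directories), and pipes the list to aff4imager using the --input '@' and --output switches shown above. The helper names and the default imager path are hypothetical.

```python
import datetime
import os
import subprocess


def iter_files(root):
    """Yield the absolute path of every regular file below root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            yield os.path.abspath(os.path.join(dirpath, name))


def container_name(label="EVIDENCE_2", now=None):
    """Timestamped name in the style of Get-Date -UFormat '+%Y-%m-%dT%H%M%S'."""
    now = now or datetime.datetime.now()
    return f"{now:%Y-%m-%dT%H%M%S}_{label}.aff4"


def acquire(root, imager=r".\aff4imager.exe"):
    """Feed the file list to aff4imager on stdin, one path per line."""
    file_list = "\n".join(iter_files(root)) + "\n"
    output = container_name()
    subprocess.run(
        [imager, "--input", "@", "--output", output],
        input=file_list, text=True, check=True,
    )
    return output
```

Calling acquire(r"C:\Users\sysprog\Desktop") would then mirror the PowerShell pipeline, with the caveat that the encoding of the piped path list may need attention for non-ASCII filenames.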

To list the acquired files stored in the container and the corresponding metadata, you may want to execute the following commands:

# List files in container, --list
.\aff4imager.exe -l .\container.aff4

# View metadata, --view
.\aff4imager.exe --view .\container.aff4

For the full documentation on aff4imager.exe's usage, refer to the official documentation at http://docs.aff4.org/en/latest/.

Verdict

Logical imaging is nowadays often the preferred way of acquiring digital evidence. AFF4-L seems to be the best container format defined as an open standard. pyaff4 and aff4imager (provided by c-aff4) are two open-source tools to acquire files logically "into" AFF4-L containers. The present blog post gives an introduction to their usage. Unfortunately, the AFF4-L container format and the corresponding imaging tools do not seem to get the momentum and attention from the open-source community that they deserve, which might change as the format becomes more popular.

Footnotes:

¹

See https://manpages.debian.org/stretch/tar/tar.1.en.html

²

See https://support.accessdata.com/hc/en-us/articles/202930279-Best-Practice-Loose-Files-to-AD1?mobile_site=true

³

See B. Schatz's presentation at DFRWS 2019, https://dfrws.org/wp-content/uploads/2019/06/2019_USA_pres-aff4-l_a_scalable_open_logical_evidence_container.pdf, p. 14

⁴

Schatz, B. L. (2019). AFF4-L: a scalable open logical evidence container. Digital Investigation, 29, S143-S149. https://www.sciencedirect.com/science/article/pii/S1742287619301653

⁵

Ibid, p. 15

⁶

See https://www.forensicfocus.com/news/aff4-l-support-portable-case-updates-and-more-in-magnet-axiom-4-5-magnet-axiom-cyber-4-5/

⁷

See https://github.com/Velocidex/c-aff4

⁸

See https://github.com/jgru/toolkit-obfuscator

⁹

The one-liner version is: Get-ChildItem -Recurse C:\Users\sysprog\Desktop\ | ForEach {$_.FullName} | .\aff4imager.exe -i '@' -o .\$(Get-Date -UFormat '+%Y-%m-%dT%H%M%S')_EVIDENCE_2.aff4