tl;dr
Using AFF4-L containers is an efficient and forensically sound way of storing selectively imaged evidence. Two open-source implementations exist for this task: c-aff4 and pyaff4. To image a directory with aff4.py, run:
python3 aff4.py --verbose --recursive --create-logical \
    $(date +%Y-%m-%d)_EVIDENCE_1.aff4 /path/to/dir
If you have to acquire logical files on Windows systems, c-aff4 can be used with PowerShell to conveniently recurse a relevant directory:
Get-ChildItem -Recurse C:\Users\sysprog\Desktop\ `
  | ForEach {$_.FullName} | .\aff4imager.exe --input '@' `
  --output .\$(Get-Date -UFormat '+%Y-%m-%dT%H%M%S')_EVIDENCE_2.aff4
Although these two tools are helpful, open-source tools for working with AFF4-L containers do not get the attention they deserve.
Motivation
Numerous factors call for logical instead of physical acquisition. Data is constantly moving into the cloud, encryption is becoming more widespread, and disk capacity keeps growing along with the amount of data users store. As a result, performing physical imaging in every case or incident is either inefficient or no longer possible because of high storage requirements, time criticality, and data availability. Furthermore, full disk images are often neither needed nor useful for answering the investigative questions. A selective seizure of the relevant data therefore makes far more sense. Determining the relevant evidence is another story, but where it is possible, selective imaging may be the preferred solution because of its efficiency. This raises the question of how to acquire logical files in a forensically sound manner and how to store them for later analysis.
Central aspects of acquiring logical files are the preservation of metadata and timestamps and the assurance of the integrity of the acquired data. There are several options for this, which differ in complexity and suitability. In the simplest case, think of creating a timestamp-preserving TAR archive with tar --create --atime-preserve=system --file ... 1, but this can be done far better, as the following section explains.
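Whatever container is used, the two requirements above come down to recording timestamps and a digest for every acquired file. The following is a minimal Python sketch of that idea, not part of any AFF4 tool; the function names `build_manifest` and `sha256_of` and the record layout are assumptions made for illustration only.

```python
import hashlib
import os
from datetime import datetime, timezone


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def iso_utc(timestamp):
    """Render a POSIX timestamp as an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()


def build_manifest(root):
    """Collect size, timestamps, and SHA-256 for every file below root."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Read the timestamps first: hashing opens the file and may
            # update its access time on some systems.
            info = os.stat(path)
            records.append({
                "path": path,
                "size": info.st_size,
                "modified": iso_utc(info.st_mtime),
                "accessed": iso_utc(info.st_atime),
                # st_ctime is the metadata change time on Unix and the
                # creation time on Windows.
                "changed": iso_utc(info.st_ctime),
                "sha256": sha256_of(path),
            })
    return records
```

Such a manifest only documents the state of the files; it does not store them, which is what a container format is for.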
Logical containers with AFF4-L
In the past, the widely used tool FTK Imager created AD1 containers during logical acquisitions 2. This aged format, however, is poorly supported by open-source tools and even by X-Ways. There was no widely adopted standard for interoperability, and the solutions for storing logical acquisitions were far from ideal because they preserved less metadata than a forensic analyst requires 3.
AFF4-L, developed by Bradley Schatz 4, comes to the rescue. It aims to provide an open standard with human interpretability using Zip tooling (like 7zip), efficient access to many and large logical files, and the storage of arbitrary metadata 5. Two years after its presentation at DFRWS 2019, this container format is being adopted only very slowly by a few commercial tools, e.g., Axiom 6, while Autopsy still lacks support for it.
Since this blog post is geared toward the practical work of seizing digital evidence, I'll skip the internals of this forensic container format, the academic thoughts behind it, and the engineering for its realization. So let's have a look at the tools for performing an actual acquisition.
Open source tooling for working with AFF4-L
Currently, there exist two open-source implementations: pyaff4 and c-aff4. The first is the initial imaging and container-handling software, written in Python and provided with all publications on AFF4, while the latter is a C/C++ implementation by Mike Cohen.
Pyaff4
To get pyaff4 up and running, grab the code from GitHub and install it via pip:
# Install pyaff4 lib
git clone https://github.com/aff4/pyaff4.git
# Check the help page
python3 aff4.py -h
Then you can perform logical acquisitions via the CLI switch -c or --create-logical. To "image" single files, use the following commands:
python3 aff4.py -v -c evidence.aff4 /path/to/target
# Now append via -a/--append
python3 aff4.py -v -c -a evidence.aff4 /path/to/target
To acquire a directory structure recursively, use the CLI switch -r (--recursive):
python3 aff4.py -v -r -c evidence.aff4 /path/to/target/dir
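If acquisitions are scripted, the date-stamped container name from the tl;dr can be generated in Python before calling aff4.py. The following is a rough sketch under stated assumptions: the helper names `build_acquisition_command` and `run_acquisition` are hypothetical, aff4.py is assumed to be in the current working directory, and only the long options already shown in this post are used.

```python
import subprocess
from datetime import date


def build_acquisition_command(target_dir, label, script="aff4.py", today=None):
    """Return the argv list for a recursive logical acquisition with aff4.py."""
    today = today or date.today()
    # Same naming scheme as $(date +%Y-%m-%d)_EVIDENCE_1.aff4 in the tl;dr.
    container = f"{today:%Y-%m-%d}_{label}.aff4"
    return [
        "python3", script,
        "--verbose", "--recursive", "--create-logical",
        container, target_dir,
    ]


def run_acquisition(target_dir, label):
    """Run the acquisition and raise if aff4.py exits with a non-zero status."""
    subprocess.run(build_acquisition_command(target_dir, label), check=True)
```

Passing the arguments as a list avoids shell quoting problems with target paths that contain spaces.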
Afterward, you can verify and view the gathered data with aff4.py via --verify and --list or, alternatively, inspect the container with a common Zip program (here 7zip is used):
# Verify the container's contents
python3 aff4.py -v evidence.aff4
# List the container contents
python3 aff4.py -l evidence.aff4
# View AFF4-L contents with a Zip program (e.g. `apt install p7zip-full`)
7z l evidence.aff4
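Because the container can be listed with ordinary Zip tooling, the same listing should also be possible from Python's standard library. This is a minimal sketch, assuming the container opens with `zipfile` just as it does with 7z; the function name `list_container_members` is illustrative, and the sketch reads only the Zip directory, not the AFF4 metadata.

```python
import zipfile


def list_container_members(container_path):
    """Return (name, uncompressed size) for each Zip member of the container."""
    with zipfile.ZipFile(container_path) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]


def print_container_listing(container_path):
    """Print one line per member, roughly like the output of `7z l`."""
    for name, size in list_container_members(container_path):
        print(f"{size:>12}  {name}")
```

For anything beyond a quick look, aff4.py remains the better choice, since it interprets the stored metadata and verifies the hashes.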
For a thorough inspection of the content and the metadata use:
python3 aff4.py -m evidence.aff4 | less
This prints output to stdout like the following, which describes a single file with its UUID, filename, hashes, timestamps, etc.:
@prefix aff4: <http://aff4.org/Schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xml: <http://www.w3.org/XML/1998/namespace> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<aff4://5fef26cc-ca70-4a85-ab9d-8200cc9c6e3d/path/to/target/file> a aff4:FileImage,
        aff4:Image,
        aff4:ZipSegment ;
    aff4:birthTime "2021-07-21T18:00:50.176313+02:00"^^xsd:dateTime ;
    aff4:hash "df4d6b951343f74e2dce8b279582e880"^^aff4:MD5,
        "ac28e46bd2de6868bdd86f20006c9e360fc4ed52"^^aff4:SHA1 ;
    aff4:lastAccessed "2021-08-03T09:26:31.013917+02:00"^^xsd:dateTime ;
    aff4:lastWritten "2021-07-21T18:00:36+02:00"^^xsd:dateTime ;
    aff4:originalFileName "/path/to/target/file"^^xsd:string ;
    aff4:recordChanged "2021-07-21T18:11:30.107518+02:00"^^xsd:dateTime ;
    aff4:size 478720 .

<snip>
Afterward, you can extract single files from the container with -x (short for --extract) or extract all files to a specified folder (-f) with -X (--extract-all):
# Extract all files to dir
python3 aff4.py -X -f ~/CASE_123/dump evidence.aff4
Note: Always use absolute pathnames, because .. gets resolved during the extraction.
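After extraction, a file can be cross-checked by hand against the aff4:hash values shown in the metadata output. The sketch below is one possible way to do this in Python and is independent of pyaff4; the function names are hypothetical, and the recorded MD5 and SHA1 values are assumed to be copied manually from the metadata.

```python
import hashlib


def file_digests(path, chunk_size=1024 * 1024):
    """Compute the MD5 and SHA1 of a file in a single pass over its content."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()


def matches_recorded_hashes(path, recorded_md5, recorded_sha1):
    """Return True only if both digests equal the values from the metadata."""
    md5_hex, sha1_hex = file_digests(path)
    return (md5_hex == recorded_md5.strip().lower()
            and sha1_hex == recorded_sha1.strip().lower())
```

For example, `matches_recorded_hashes("/abs/path/to/dump/file", "df4d6b951343f74e2dce8b279582e880", "ac28e46bd2de6868bdd86f20006c9e360fc4ed52")` would compare an extracted copy against the values from the metadata excerpt above; the dump path here is a placeholder.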
So, in summary, this tool comes in handy if you want to selectively image data from media that is connected to your forensic workstation (with a write blocker, of course). Another use case might be mounting an SMB share in the target environment, which could be a compromised enterprise network, a suspect's workplace, etc., to extract case-relevant data selectively. A third use case could be to surgically acquire data from virtual machine images of servers, which can be mounted read-only with qemu-nbd and then selectively imaged with the help of pyaff4 for later analysis.
However, if you have to or want to perform live response directly on a system under investigation, pyaff4 might not be the best tool. Even if you package it as a PE file (with PyInstaller or the like), it seems very bloated and has a big footprint; c-aff4 is therefore worth considering as an alternative.
c-aff4
c-aff4 "implements a C/C++ library for creating, reading and manipulating AFF4 images" 7. It provides the canonical aff4imager, which is a general-purpose standalone imaging tool. This binary is useful for performing acquisitions during live responses and triage analyses.
Compilation
It is most convenient to rely on the provided Docker setup to compile c-aff4 from source for Windows:
# Get the source
git clone https://github.com/Velocidex/c-aff4.git
cd c-aff4
# Compile it for Windows
docker-compose up
# Inspect the results
ls -l build/win32/*
After running the build process, you'll find a self-contained PE file named aff4imager.exe inside the directory ./build/win32/bin/, which can become an integral part of your live response toolkit. Consider renaming it and modifying its signature to defeat anti-forensic measures; for inspiration on this, see my toolkit-obfuscator repository 8.
Instructions on the build process for Linux and macOS can be found in the above-mentioned repository.
Usage
The 32-bit PE file aff4imager.exe can be used conveniently on probably any Windows machine you'll come across.
If PowerShell is available, which should be the case for most machines running in a production environment, it becomes very easy to image a specific directory recursively by running the following command 9:
Get-ChildItem -Recurse C:\Users\sysprog\Desktop\ `
  | ForEach {$_.FullName} | .\aff4imager.exe --input '@' `
  --output .\$(Get-Date -UFormat '+%Y-%m-%dT%H%M%S')_EVIDENCE_2.aff4
Using the @ sign as the input filename (after --input) makes aff4imager read the list of files to acquire from stdin and place those in the resulting container, specified by --output. Note that if you want to append to an AFF4-L container, you must specify the container twice, once with -o and once as a positional argument.
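Where PowerShell is not an option but a Python interpreter is, the same stdin mechanism can be driven from a short script. The following is only a sketch under assumptions: the helper names are hypothetical, aff4imager is assumed to accept one path per line on stdin (mirroring what the PowerShell pipeline delivers), and, unlike Get-ChildItem -Recurse, only regular files are listed, not directories.

```python
import os
import subprocess


def collect_files(root):
    """Return the absolute paths of all files below root, sorted."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            paths.append(os.path.abspath(os.path.join(dirpath, name)))
    return sorted(paths)


def image_files(root, output, imager="aff4imager.exe"):
    """Feed the file list to aff4imager via stdin, using '@' as input name."""
    listing = "\n".join(collect_files(root)) + "\n"
    subprocess.run(
        [imager, "--input", "@", "--output", output],
        input=listing,
        text=True,
        check=True,
    )
```

The `imager` parameter should point at the renamed binary if you have obfuscated your toolkit as suggested above.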
To list the acquired files stored in the container and the corresponding metadata, you may want to execute the following commands:
# List files in container, --list
.\aff4imager.exe -l .\container.aff4
# View metadata, --view
.\aff4imager.exe -v .\container.aff4
For the full documentation on the usage of aff4imager.exe, refer to the official documentation at http://docs.aff4.org/en/latest/.
Verdict
Logical imaging is nowadays often the preferred way of acquiring digital evidence. AFF4-L seems to be the best container format defined as an open standard. pyaff4 and aff4imager (provided by c-aff4) are two open-source tools to acquire files logically "into" AFF4-L containers. The present blog post gives an introduction to their usage. Unfortunately, the AFF4-L container format and the corresponding imaging tools do not seem to get the momentum and attention from the open-source community that they deserve, which might change as their popularity grows.
Footnotes:
See B. Schatz's presentation at DFRWS 2019, https://dfrws.org/wp-content/uploads/2019/06/2019_USA_pres-aff4-l_a_scalable_open_logical_evidence_container.pdf, p. 14
Schatz, B. L. (2019). AFF4-L: a scalable open logical evidence container. Digital Investigation, 29, S143-S149. https://www.sciencedirect.com/science/article/pii/S1742287619301653
Ibid, p. 15
The one-liner version is: Get-ChildItem -Recurse C:\Users\sysprog\Desktop\ | ForEach {$_.FullName} | .\aff4imager.exe -i '@' -o .\$(Get-Date -UFormat '+%Y-%m-%dT%H%M%S')_EVIDENCE_2.aff4