The importance of knowing 'where' in digital forensic analysis

With so many devices, file systems, operating systems, user artifacts, application artifacts, and more, keeping up with relevant knowledge is a real struggle. As examiners, we tend to gravitate toward actions that make our lives easier and seek deeper knowledge only when a case requires it. A formidable case backlog, which constantly demands our attention, often makes matters worse. Despite both of these factors, we cannot shake the duty to know.

Digital forensics presents a key challenge: knowing where. We can investigate our lives away on digital evidence, but if we cannot clearly articulate the foundations of what we have done, our findings may be misunderstood or even dismissed at court. Throughout my 18 years in the field of digital forensics, I gave testimony in criminal investigations many times in state and federal courts. The importance of knowing was reinforced consistently throughout.

Knowing Where

One of our primary goals is to seek the evidence necessary to reveal elements of crimes or other aspects of the investigations we conduct. Forensic tools do an excellent job of yielding this information during evidence processing procedures. We may also simply navigate the file system and bookmark data pertinent to our case. Depending on how the information is documented and/or displayed, we need to know the source of these findings. Understanding the source of the data is vital for several reasons.

Where was the evidence recovered?

If we cannot state the origin of the evidence, how can we successfully lay the foundation for its admission before the trier of fact? For example, Figure 1 shows parsed Microsoft Internet Explorer browser history artifacts within OpenText™ EnCase™ Forensic:

Figure 1 – Parsed Internet history records in WebCacheV01.dat

The Name column gives the source, WebCacheV01.dat, indicating where the values shown in the other columns were parsed from. Those values confirm the related storage location in the folder structure and also indicate how the data is stored, in this case in an Extensible Storage Engine (ESE) database. This, in turn, might require us to present the parsed data in a specific way or perform additional validation checks. When dealing with data stored in a WebCacheV01.dat file, we need to ensure access counts and timestamps have been decoded correctly, which is covered in detail in our Internet-Based Investigations course. Simply submitting a parsed URL as an evidential value of note falls far short of identifying its provenance and thereby its true significance.
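As a small illustration of confirming how data is stored before trusting a parser's output, the sketch below checks whether a file's header carries the ESE database signature. It assumes the commonly documented layout in which the 4-byte value 0x89ABCDEF is stored little-endian at byte offset 4 of the header; the function names are hypothetical, and this is not a description of how EnCase Forensic identifies the file.

```python
import struct

# Assumption: ESE database signature, stored little-endian at header offset 4.
ESE_SIGNATURE = 0x89ABCDEF


def looks_like_ese(header: bytes) -> bool:
    """Return True if the header bytes carry the assumed ESE signature."""
    if len(header) < 8:
        return False
    (signature,) = struct.unpack_from("<I", header, 4)
    return signature == ESE_SIGNATURE


def file_looks_like_ese(path: str) -> bool:
    """Read the first 8 bytes of a file (hypothetical helper) and test them."""
    with open(path, "rb") as fh:
        return looks_like_ese(fh.read(8))
```

A check like this only confirms the container type. It says nothing about whether the access counts and timestamps inside have been decoded correctly, which still needs separate validation.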

Recently, I have heard of case after case being subjected to scrutiny or further investigation because the examiner was unable to articulate the source of the information. As we progress deeper into this field, the question of where evidential data originates should be asked as a matter of course. Consider:

How else can we validate that data or investigate it in more depth if we don’t understand its source?
Where would we look if evidence processing yielded no results?
How can we verify other tools if we are unsure where the data they yield originates?
How can we be sure that tools are doing a thorough job of parsing the areas where relevant data may be stored?

As digital forensic investigators, the burden is on us to know where the evidential data we produce originates.

Let’s look at another important example. While conducting an additional search of our evidence, we might perform a raw keyword search. This technique can help us locate evidence in more obscure areas. As demonstrated in Figure 2, keyword search hits in EnCase Forensic can be viewed easily by clicking on the links in the Hits column shown under the Keyword Hits tab. This causes EnCase Forensic to load the responsive data, which can then be viewed in detail in the Text tab of the View pane (shown at the bottom left):

Figure 2 – Reviewing raw keyword search hits

This screenshot shows how, without changing any views, we can discern the Unicode-encoded data of note located in the WebCacheV01.dat file stored in the Windows user profile for the John Bender user. The data is at file offset 33173042 and is 18 bytes long. Right-clicking on the hit and selecting the Go To File option shows where the data is sourced. This is demonstrated by Figure 3, which shows the previously mentioned WebCacheV01.dat file as a file on disk.
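To make the idea of a raw keyword hit concrete, here is a minimal sketch of searching a block of bytes for a keyword and reporting each hit as a file offset and length. It assumes that "Unicode" here means UTF-16LE, as is typical on Windows, in which case an 18-byte hit corresponds to a 9-character term. The keyword and sample buffer are hypothetical, and this is not how EnCase Forensic implements its search.

```python
def find_keyword_hits(data: bytes, keyword: str, encoding: str = "utf-16-le"):
    """Yield (offset, length) for every occurrence of keyword in data."""
    needle = keyword.encode(encoding)
    start = 0
    while True:
        offset = data.find(needle, start)
        if offset == -1:
            return
        yield offset, len(needle)
        start = offset + 1  # continue just past the start of this hit


# Hypothetical sample: 10 padding bytes, a 9-character keyword, trailing bytes.
sample = b"\x00" * 10 + "breakfast".encode("utf-16-le") + b"\xff" * 4
hits = list(find_keyword_hits(sample, "breakfast"))

# Re-read each hit from its offset to confirm what is actually stored there.
decoded = [sample[off:off + length].decode("utf-16-le") for off, length in hits]
```

Reporting the offset and length alongside the decoded text lets another examiner, or another tool, go to the same bytes and confirm the hit independently.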

Figure 3 – WebCacheV01.dat location

Although not visible in this screenshot, EnCase Forensic would also have displayed the physical and logical sector number, cluster number, sector and file offset, length, and folder path. One might be tempted to ask, “why do I need to know this?” or even assert, “I don’t need to know this,” but where relevant data originates and how it is stored always matter. Once we take into account different sector sizes, fix-up codes, file-system data-storage mechanisms such as resident and non-resident data, and RAM and file slack, the where will often tell a clearer, more accurate story about evidentiary data and its true significance. Simply finding and reporting data often lacks the depth needed to see the whole picture.
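The relationship between a file offset, a sector, and a cluster is simple arithmetic, sketched below for the offset mentioned earlier. The 512-byte sector and 4,096-byte cluster sizes are assumptions for illustration only, and the names are hypothetical. The results are relative to the start of the file; mapping them to physical sectors on the volume also requires the file's cluster allocation, which is not modeled here.

```python
from typing import NamedTuple


class OffsetBreakdown(NamedTuple):
    cluster_index: int      # zero-based cluster within the file
    sector_in_cluster: int  # zero-based sector within that cluster
    sector_index: int       # zero-based sector within the file
    sector_offset: int      # byte offset within that sector


def break_down(file_offset: int,
               sector_size: int = 512,      # assumed sector size
               cluster_size: int = 4096     # assumed cluster size
               ) -> OffsetBreakdown:
    """Split a file-relative byte offset into cluster and sector positions."""
    sector_index, sector_offset = divmod(file_offset, sector_size)
    cluster_index, offset_in_cluster = divmod(file_offset, cluster_size)
    return OffsetBreakdown(cluster_index,
                           offset_in_cluster // sector_size,
                           sector_index,
                           sector_offset)


result = break_down(33173042)
# result.cluster_index == 8098, result.sector_in_cluster == 7,
# result.sector_index == 64791, result.sector_offset == 50
```

Changing the assumed sector size to 4,096 bytes gives different sector numbers for the same data, which is one reason the storage parameters need to be known before sector-level values are reported.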

The last example I would like to cover relates to Microsoft® Windows Registry artifacts. The Ares Peer-to-Peer Registry data shown in Figure 4 has been parsed, de-obfuscated, and bookmarked using the Ares and Lime Pro Registry Report EnScript:

Figure 4 – Parsed Ares peer-to-peer software Registry artifacts

The output of the script shows the source of the information: an NTUSER.dat Registry hive file. The file path can be checked in the same manner as already mentioned to determine which user profile the data is associated with. The evidential potential of the search terms recovered by the script is obvious, but what about the time shown for the last search?

As forensic examiners, we might anticipate Registry dates and times to be stored in line with how file systems store them. However, data stored in a Registry hive is not stored as files and folders. It is stored as a hierarchy of keys and values. If a key contains multiple values and one is updated, it is the last-written timestamp of the key that is updated, because Registry values do not have timestamps of their own. The time shown therefore tells us when the key was last written, not necessarily when any particular value was written. So, having identified where the evidence is located and the way in which data is stored, we can determine the rules in effect and interpret and explain the data accordingly. Detailed Registry analysis is documented in our Advanced Analysis of Windows Artifacts course.
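A key's last-written time is stored as a Windows FILETIME: a 64-bit little-endian count of 100-nanosecond intervals since 1 January 1601 (UTC). The sketch below shows one way to decode such a value for validation purposes. The function names are hypothetical, and locating the timestamp within the hive's key record is assumed to have been done already.

```python
import struct
from datetime import datetime, timedelta, timezone

# FILETIME counts 100-nanosecond intervals from this epoch (UTC).
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime: int) -> datetime:
    """Convert a FILETIME integer to a timezone-aware UTC datetime."""
    # Integer division drops the sub-microsecond remainder.
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def filetime_from_bytes(raw: bytes) -> datetime:
    """Decode 8 raw little-endian bytes as a FILETIME."""
    (value,) = struct.unpack("<Q", raw)
    return filetime_to_datetime(value)


# Well-known check value: the Unix epoch expressed as a FILETIME.
unix_epoch = filetime_to_datetime(116444736000000000)
```

Because the timestamp belongs to the key, a decoded value like this should be reported as the key's last-written time and interpreted alongside every value the key holds.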

Knowing where is inescapable when conducting a thorough investigative analysis. We, as examiners, must understand our evidential data, where it resides, and how successful or accurate our tools have been in providing it. Knowing where provides additional context to the way in which data is stored. This information allows us to better understand what the data indicates, where else we might look for relevant data, and what storage-mechanism rules are in effect. Knowing where helps us to identify many other aspects pertinent to our investigation.

For more information, take a look at our Security solutions, register for our EnCase Training, or talk to one of our expert trainers.

Note: although the above content outlines functionality for EnCase Forensic, the functionality is similar in OpenText™ EnCase™ Endpoint Investigator.

OpenText

