Hi again,
In line with the previous post, we’re going to talk again about forensics scripting using Perl and SQLite. However, this time we are going to use a new example: the Google Chrome history processor.
I’ve uploaded the script to my recently created Google Code site, so everyone can easily get the source and play with it. I like this example because, in comparison to the last example we talked about (the Mozilla Firefox downloads processor) it brings more functionality and includes extra features that may be useful for your investigative processes.
The new script uses Spreadsheet::WriteExcelXM to perform Excel XML outputs than can be easily loaded with Excel or Libreoffice. It also generates a double comma separated plain text file, which is used as the source to build the XML report. Some people will find useful the plain text one, some others prefer to load the XML in their favourite calc software. It’s really up to you. Double comma is used to prevent distortions with single commas in the URL title, but feel free to modify and use your favourite delimiter.
One interesting feature of Chrome’s history is the storage of transition types. Our script automatically transforms the values of the different transition values stored in the visits table inside the history SQLite file. These values have to be logically AND with something called CORE MASK, initially set to 0xFF, to obtain a number between 0 an 10 that describes the transition type. The original source code of Chrome is self explanatory, so check the SVN trunk for further details: http://src.chromium.org/svn/trunk/src/content/common/page_transition_types.h. The transformation is coded below:
my $core_mask = 0xff;
$transition = $core_mask & @row[10];
switch ($transition) {
case 0 { $transition_text = "LINK: User reached page by clicking a link on another page"}
case 1 { $transition_text = "TYPED: User typed page in URL bar"}
case 2 { $transition_text = "AUTO BOOKMARK: User got to this page through a suggestion in the UI"}
case 3 { $transition_text = "AUTO SUBFRAME: Content automatically loaded in a non-toplevel frame"}
case 4 { $transition_text = "MANUAL SUBFRAME: Subframe navigation explicitly requested by the user"}
case 5 { $transition_text = "GENERATED: User typed page in the URL bar and selected an entry that did not look like a URL"}
case 6 { $transition_text = "START PAGE: Page was specified in the command line or is the start page"}
case 7 { $transition_text = "FORM SUBMIT: User filled out values in a form and submitted it"}
case 8 { $transition_text = "RELOAD: User reloaded the page"}
case 9 { $transition_text = "KEYWORD: URL generated from a replaceable keyword other than the default search provider"}
case 10 { $transition_text = "KEYWORD GENERATED: Visit was generated by a keyword "}
else { print "Unable to understand the transition value. Check, something is horribly wrong here :)" }
}
The script, explained
Core operation in the script consists of a SQL SELECT statement that will retrieve data from two tables inside the history file: urls and visits. Of course, as we want correlated information and not scattered data, we will grab the information only when the id value in the urls table matches the url value in the visits table, which is the numerical value that matches the id in the urls table for each visit event.
SELECT urls.id, urls.url, urls.title, urls.visit_count, urls.typed_count, urls.last_visit_time, urls.hidden, urls.favicon_id, visits.visit_time, visits.from_visit, visits.transition, visits.segment_id, visits.is_indexed FROM urls, visits WHERE urls.id = visits.url
Once this execution is prepared, we can loop and generate the text file, transforming the timestamps into human readable values, and turning the transition values into transition events, as explained before. Each event is explained in the code. This text file will then be converted into an XML with the following fields:
ID: It’s a primary key for cross table references. Stored in the urls table
URL: The stored visited URL
TITLE: Page title
VISIT COUNT: Total visit count for the given URL
TYPED COUNT: Number of times the URL was typed
LAST VISIT TIME: Timestamp of the last visit, as stored in the urls table
VISIT TIME: Timestamp of the visit, as stored in the visits table
IS HIDDEN: Indicates if the URL is displayed by the autocomplete function. 0 means it will be displayed, 1 means it will remain hidden
FAVICON ID: Used to reference the favicon table
FROM VISIT: Used to reference the URL the visit came from. Will show 0 if no reference is in place
TRANSITION: Shows the transition type
SEGMENT ID: Stores the segments value, used in cross table reference
IS INDEXED: Determines if there’s indexed data for this visit. 1 there is, 0 there isn’t
I’ve uploaded an example of both outputs here: plain text report and XML report.
Hope this may be useful for you :D
With regards,
Thank you for the script .. just a small correction…
$visittime = gmtime((@row[8]-11644473600000000)/1000000);
$lastvisittime = gmtime((@row[5]-11644473600000000)/1000000..
or localtime, if you wish…