Search This Blog

Wednesday, June 19, 2013

SQLite on the Case

It is very common in SQLite databases for integers to represent a deeper meaning than their numeric value. We usually refer to this as a flag. Take the iOS call_history.db for example: the call table has a column literally called flags and the integers in that column represent the type of call.

Integer Interpretation

4

Incoming

5

Outgoing

8

Blocked

16

Facetime

Note More values are possible than those in the table above. The data above is accurate for iOS 6, but not complete. Other values integers are possible that represent a network error of some time, and while it is possible to determine whether the call was incoming, Facetime, etc. when the error occurred, this exceeds the scope of this post.

Just as common as integer flags is the utter lack of an explanation for them in the database itself. Take a look at the table schemas:

CREATE TABLE _SqliteDatabaseProperties (key TEXT, value TEXT, UNIQUE(key));
CREATE TABLE call (ROWID INTEGER PRIMARY KEY AUTOINCREMENT, address TEXT, date INTEGER, duration INTEGER, flags INTEGER, id INTEGER, name TEXT, country_code TEXT, network_code TEXT, read INTEGER, assisted INTEGER, face_time_data BLOB, originalAddress TEXT);
CREATE TABLE data (ROWID INTEGER PRIMARY KEY AUTOINCREMENT, pdp_ip INTEGER, bytes_rcvd REAL, bytes_sent REAL, bytes_last_rcvd REAL, bytes_last_sent REAL, bytes_lifetime_rcvd REAL, bytes_lifetime_sent REAL);

The tables are essentially stand alone—the data table is not joined with the call table to give meaning to the flags column integer. So, how do we determine the meaning? Why, by testing, of course. If you thought use another tool and see what it says then go outside and drag your knuckles on the concrete for a bit. It’s okay, I’ll wait….

The best resource for checking flag values is the device’s user interface: get a test phone, export the database, and compare the records shown through the device interface to the values in the table. Yes, you have to do some work, but that is as it should be. At a minimum, you need to check if your tool or manual query is corret, right? By the way, you might be wondering how the UI knows the interpretation: its in the application programming, but you don’t get to see that because its in binary form.

So, let’s go forward with the assumption we have checked the UI for this database or a same generation database and are confident of the interpretation. Can we craft a SQL statement that can perform the interpretation for us? You bet!

The Case Expression

The SQLite CASE expression is the IF/THEN of SQL queries. The basic form of the expression is as follows (credit: sqlite.org):

  • CASE x WHEN w1 THEN r1 WHEN w2 THEN r2 ELSE r3 END

  • CASE WHEN x=w1 THEN r1 WHEN x=w2 THEN r2 ELSE r3 END

Think of the x value in the expressions above as the column name. Rather than try to explain further with hypothetical values, lets turn back to the call_history.db flags. We could use the following excerpt in our query:

case flags
    when 4 then "Incoming"
    when 5 then "Outgoing"
    when 8 then "Blocked"
    when 16 then "Facetime"
    else "Dropped"
end as flags
Note We use the as flags alias at the end of the case statement above to name the column "flags." Without the alias, the column would take the name of the entire case statement.

The case statement above essentially reads: for the flags column, interpret 4 as Incoming, 5 as Outgoing, 8 as Blocked, 16 as Facetime, and anything else as Dropped. If we were interested in the producing output from the call_history.db that included the rowid, phone number, date, call duration, and call type, we could use the following query:

select
        rowid,
        address,
        datetime(date,'unixepoch','localtime') as date,
        duration,
        case flags
                when 4 then "Incoming"
                when 5 then "Outgoing"
                when 8 then "Blocked"
                when 16 then "Facetime"
                else "Dropped"
        end as flags
from call

Our output would look like the following:

ROWID address date duration flags

1948

8885551212

2013-02-30 19:46:41

15

Outgoing

1949

8881215555

2013-02-30 21:05:11

67

Incoming

1950

8881215555

2013-02-01 00:17:05

113

Incoming

1951

8885551212

2013-02-01 10:32:32

54

Outgoing

1952

8885551212

2013-02-01 10:37:25

34

Outgoing

But if you act now, there’s more…

It’s very nice that we can interpret these flags using a SQL statement. This means we don’t have to write a program, use an in line editor, or write a spreadsheet rule to do it. One stop shopping, just what every busy forensicator wants! But, I like to see the original value, too, just in case it turns out my interpretation was wrrr…, uh, less correct than I would like.

Enter SQL operators. Operators are syntax (usually symbols, but sometimes words) that cause the SQL engine to perform a specific task on the data such as concatenation or filtering (e.g., equals, greater than, less than). The double pipes || operator, which is the focus of this section, concatenates values. With it we can make it clear that the duration is in seconds, and we can append the append the flag integer to its English equivalent.

select
        rowid,
        address,
        datetime(date,'unixepoch','localtime') as date,
        duration || " sec" as duration,
        case flags
                when 4 then "Incoming"
                when 5 then "Outgoing"
                when 8 then "blocked"
                when 16 then "Facetime"
                else "Dropped"
        end || " (" || flags || ")" as flags
from call

Now our output could would look like the following:

ROWID address date duration flags

1948

8885551212

2013-02-30 19:46:41

15 sec

Outgoing (5)

1949

8881215555

2013-02-30 21:05:11

67 sec

Incoming (4)

1950

8881215555

2013-02-01 00:17:05

113 sec

Incoming (4)

1951

8885551212

2013-02-01 10:32:32

54 sec

Outgoing (5)

1952

8885551212

2013-02-01 10:37:25

34 sec

Outgoing (5)

Note It’s very true that we could have just labeled our duration column Duration(sec), but I wanted to illustrate another variation of the use concatenation in a query.

The second-to-last line in the query, end || " (" || flags || ")" as flags, adds the integer value of flags, encapsulated in parenthesis, to the end of the interpreted value. Thus, the complete case statement can be read _print the translation of the flag, followed by a space, open parenthesis, the flag integer, and finally a closed parenthesis.

I hope this gives you some ideas on how to produce output with more clarity and analytical value.


Monday, June 17, 2013

TextMe App: Lesson Learned from Unusual Tables

I recently had the opportunity to help a colleague with an iPhone database that was not supported by his automated tools. The application was the TextMe application, and predictably, the texting app stored its chat in a SQLite database. What made the database interesting was the fact that there was no immediately obvious way to identify to whom a message was sent.

Let me illustrate: A quick scan of the database reveals some of the capabilities of the application: texting (ZMESSAGE table) and calling (ZCALL, ZVOICEMAIL)

SQLite Command Line Interface
$ sqlite3 TextMe2.sqlite .tables
ZATTACHMENT     ZDISCUSSION     ZMESSAGEHEIGHT  Z_METADATA
ZCALL           ZFAVORITE       ZVOICEMAIL      Z_PRIMARYKEY
ZCONTACT        ZMESSAGE        Z_3DISCUSSIONS
$

The subject of this investigation was the text messages, so I needed to see how the table was constructed.

SQLite Command Line Interface
$ sqlite3 TextMe2.sqlite ".schema zmessage"
CREATE TABLE ZMESSAGE ( Z_PK INTEGER PRIMARY KEY, Z_ENT INTEGER,
Z_OPT INTEGER, ZSTATUS INTEGER, ZCALL INTEGER, ZDISCUSSION INTEGER,
ZHEIGHT INTEGER, ZSENDER INTEGER, Z3_SENDER INTEGER, ZTIMESTAMP
TIMESTAMP, ZBODY VARCHAR, ZGUID VARCHAR, ZLOCATION VARCHAR, ZREMOTEID
VARCHAR );
...
$

The CREATE TABLE statement shows us there are 14 fields in the table, and a majority are integers: Z_PK, Z_ENT, Z_OPT, ZSTATUS, ZCALL, ZDISCUSSION, ZHEIGHT, ZSENDER, Z3_SENDER, ZTIMESTAMP, ZBODY, ZGUID, ZLOCATION, ZREMOTEID. But, don’t fall into the trap that the declared type of each column (e.g. INTEGER, TIMESTAMP, VARCHAR)actually constrain the data to those types, they don’t. Treat the type-name as informative only, but verify the data before forming any conclusions.

Inspecting the columns, we see some of obvious value in a forensic examination:

  • Z_PK (which is the primary key, an auto incrementing integer who’s chief value is assisting us in identifying if messages have been deleted from the database)

  • ZSTATUS

  • ZSENDER

  • ZTIMESTAMP

  • ZBODY

Others might be grabbing your attention, but I’m going to keep this discussion focused on these columns. Some sample data is in order:

SQLite Command Line Interface
$ sqlite3 -header TextMe2.sqlite "select z_pk, zstatus, zsender,
ztimestamp, zbody from zmessage limit 5;"
Z_PK|ZSTATUS|ZSENDER|ZTIMESTAMP|ZBODY
2|2|2|386362581|Hi!
3|4|10|386362603|hey, what are you doing?
4|3|2|386362630|I'm checking out this new app
5|3|2|386362634|It might be a challenge to decode
6|3|2|386362644|But I'll figure it out...

We see we have a couple of interpretation issues, here: the status is an integer that needs to be interpreted, and is the sender. The date is some form of epoch time, and my eyes tell me its likely Mac Absolute Time. I expect to find the interpretations in other tables in the database. But what jumps off the screen at me is that there is no obvious answer the following question: To whom is the sender sending the message? The time stamp gives us a sequence of messages, but how do we know that sender "2" is sending messages to sender "10"? Couldn’t sender "2" be sending his message to, say, sender "5" and in the midst, receives a message from sender "10"? Absolutely!

So, how to we rectify this issue? Well, I sort of mischievously left off the zdiscussion column in my query. I did this to steer the conversation and simulate what can happen when an investigator encounters a new database for the first time: overlook an important column. If we include the column, we see something interesting:

SQLite Command Line Interface
$ sqlite3 -header TextMe2.sqlite "select z_pk, zstatus, zdiscussion,
zsender, ztimestamp, zbody from zmessage limit 5;"
Z_PK|ZSTATUS|ZDISCUSSION|ZSENDER|ZTIMESTAMP|ZBODY
2|2|2|2|386362581|Hi!
3|4|2|10|386362603|hey, what are you doing?
4|3|2|2|386362630|I'm checking out this new app
5|3|2|2|386362634|It might be a challenge to decode
6|3|2|2|386362644|But I'll figure it out...

Now we see that the conversation is all part of the same discussion. And if we studied the whole database, we’d see example of where more than one conversation was occurring a the same time, and by sorting on the discussion field, we make sense of those conversations. But date stamp alone does not clue us in.

This might not seem like a big deal, but most messaging databases I have encountered have the remote party in the message record for both sent and received messages. This works well and leads to easy interpretation, e.g., "sent to Joe" and "received from Joe". But this database without properly understanding the discussion column, is the equivalent of "sent by Joe" and "sent by Jane", leading to the question "to whom?"

Rather than breakdown the rest of the analysis, I’m just going to share the query I used to complete the analysis:

select
  m.z_pk as "Key",
  datetime(ztimestamp + 978307200, 'unixepoch', 'localtime') as "Date",
  case
    when m.z_ent then (select z_name from z_primarykey natural join zmessage)
    else "Unknown(" || m.z_ent || ")" end
    as "Type",
  case zstatus
    when 2 then "Sent"
    when 3 then "Delivered"
    when 4 then "Received"
    else "Unknown(" || zstatus || ")" end
    as "Status",
  zdiscussion as "DiscussionID",
  zusername as "Contact",
  zbody as "Message"
from zmessage as m, zcontact as c
on m.zsender = c.z_pk
order by discussionid asc, date asc;

By way of brief description:

  • the AS statements, such as that seen in m.z_pk as "Key", create aliases, effectively renaming the columns (or in the FROM statement, the tables) to make the output more informative.

  • The first CASE statement queries the z_primarykey to interpret the Z_ENT integer into its textual value.

  • The second CASE statement interprets the ZSTATUS flag into its English equivalent, which was not stored in the database, but determined by studying the application user-interface. This begs the question, then where does the application get the textual interpretation? Probably within its code.

  • The FROM statement queries two tables, zmessage and zcontact, to interpret the ZSENDER integer. But wait, you say, there is no ZSENDER in the select statement! (see next bullet point…)

  • The ON statement is the SQLite equivalent of an SQL INNER JOIN, which is an intersection of the two tables, i.e., it select rows from both tables where the columns match. In this case, the columns to be matched are ZSENDER from the ZMESSAGE table and Z_PK from the ZCONTACT table. The effect is that the SELECT statement knows which ZUSERNAME from ZCONTACT to return based on the ZSENDER value in the ZMESSAGE table.

  • The ORDER BY statement sorts the output first by the ZDISCUSSION column, then by the ZDATESTAMP, both in ascending order. Note that the column alias names are used.

I hope this gives some insight into the workings of TextMe2.sqlite databases and helps you in the future with the analysis of never-before-seen SQLite databases.


Sunday, June 16, 2013

Why I do What I do: Thanks Dad!

This post has both nothing and everything to do with why and how I do data forensics. I hope you'll take a moment to read it. 


I am a tenacious do-it-your-selfer in many areas of my life, including home repair, veterinary medicine (just ask my poor dogs), and data forensics. Some of it I even do reasonably well once in a while (even a broken clock is right twice a day, right?). There has been a strong influence in my life that lead me inevitably to this station in life: my father.

My father is a retired firefighter--scratch that--fireman.  Some would say he was tough on me while I was growing up.  I'm sure I've said that myself more than once.  But it would be more accurate to say that from an early age, my father instilled in me a strong sense of right-and-wrong and as self-reliance.  No excuses, no B.S.  He never failed to try to teach me, never missed an opportunity to impart wisdom that came from much experience.  He saw some of the worst life had to offer and lost friends in the service of his city, and I think he was hell bent on helping me avoid the pitfalls of life.  Unfortunately, my career choices betray how closely I listened to the "avoiding harm" lessons: I became a soldier and then a police officer!

Our relationship during my teen years was rocky at times, with frequent arguments over differences of opinion on important life matters (so important, that I can't remember a single one of them).  I didn't know it then, but we argued because we were so alike.  We have a lot of the same qualities: strong wills, the desire to be right (not for the sake of being better than others, but for the sake of not being correct in what we believe, do, and say), and the wish to pass what we know onto others. It was really just our point of view that differed.

My father doesn't just talk-the-talk, though.  He, above all, walks-the-walk.  I admire him greatly in this, and I strive to be more like him in this way.  He put his family first: I can remember how happy it made me when he bought a HiFi stereo system with all the bells and whistles, not because I had a new stereo to play with (oh no, DO NOT touch the equalizer settings!), but because he finally, after something like 15 years of my life, spent money on himself for something non-essential!  It made him happy, and that made me happy, too (and eventually, I was permitted to touch the equalizer... once I was schooled in the proper shape of sound).

So, on this Father's Day, I choose to write a non-technical post, but an important one because it acknowledges the source of my beliefs:

  • Thanks Dad, for making me care about right and wrong: I may take a little too long to reach a conclusion in a case, but I'm not likely to state something untrue because I have checked and double-checked the facts to the best of my ability.  
  • Thanks Dad, for giving me and insatiable curiosity for the world around me, so that I can now study a file or file system and come to understand how they work (or hours, days, and sometimes weeks trying!).
  • Thanks Dad, for teaching me to do for myself, so that now I can write programs to solve forensics problems to get investigators and prosecutors the essential information they need.
  • Thanks Dad, for teaching me so I know how to teach others, so that all that I have learned I willingly pass on and hopefully advance my field, if only a little.
  • Thanks Dad, for helping me find perspective in what I do, not letting work step in-line ahead of my family (the real reason I am no longer a gunslinger).

Happy Father's Day.  I hope I grow up to be just like you.

Time Perspective

Time Perspective Telling time in forensic computing can be complicated. User interfaces hide the complexity, usually displaying time stamp...