TL;DR: Apple Notes has a few bespoke embedded objects which are messier than the Easy Embedded Objects previously explained. This post covers how to piece back together a more complex embedded object, the Apple Notes Table.
Background
If you haven’t read the previous post about Easy Embedded Objects, please go do that now. It will explain the background assumptions of this work and show how this works on some very straightforward objects. These are not those, so you’ll want to build on that knowledge.
Below are the steps to rebuild com.apple.notes.table objects. These are specific to Notes and get much more involved as a result, hence their own post. If you still don’t want to do this by hand, feel free to check out the Apple Cloud Notes Parser, which handles the below.
Warning: This gets fairly technical. I highly recommend not doing this by hand; instead, use this post to understand what is happening under the hood and to fact-check your tool output. I also highly recommend a good espresso (or two) before beginning. Andiamo!
com.apple.notes.table
The Type UTI of com.apple.notes.table represents a columned table embedded in the Note. This type took me the longest to figure out how to parse, and previous work by dunhamsteve was invaluable in sorting out what I was seeing, although it didn’t quite describe the right answer for current table types in iOS 13.
Table Structure
Before going into how to rebuild tables, I want to address why this is needed. Table textual content can be pulled out of the underlying protobuf fairly easily (after you figure out it is a protobuf and know where to fetch text from); the structure is the annoying part. Why worry about structure? Because until the user enters data into one of the table cells, Notes doesn’t appear to record even an empty string for it, which means if you just scan for strings, you might know text exists, but not where it goes. That also means that you can’t even figure out how large the table is based on how many cells have text, let alone where they go.
Why does that matter? Imagine a table where you only know the user entered three strings: “Things to do today”, “Things not to do today”, and “kill everybody.” Obviously this is an over-the-top example, but which column “kill everybody” falls in makes a difference. More realistically, you could imagine a table with names and money, where knowing which were debits, which were credits, and who did what would be important. Without rebuilding the table, you can’t do that reliably.
Why is this the case, when the other embedded objects and textual formatting in a Note are much simpler to understand and parse? Apple can’t just store a basic list of rows and columns because of the iCloud integration and the ability to share Notes. Two users can make edits to the same Note and Apple needs to be able to put those together, such as a user adding a column and another user adding a row at the same time. That seems trivial, but if you imagine a 2x2 table which has a row added at the start and a column added at the start, the question of where the original four cells move to when both users’ edits are combined quickly becomes nontrivial. To be able to tell what goes where when that happens, Apple breaks the table down into rows, columns, and then mappings of cells to those.
For example, imagine a table with two columns, Column1 and Column2, and two rows, Row1 and Row2. Inside are four cells: Cell1, which belongs to Column1 and Row1; Cell2, which belongs to Column2 and Row1; Cell3, which belongs to Column1 and Row2; and Cell4, which belongs to Column2 and Row2.
     | Column1 | Column2 |
Row1 | Cell1   | Cell2   |
Row2 | Cell3   | Cell4   |
Now if a user adds a new row, Row3, at the start of the table, there is no question what happens to each of the cells that exist. Why? Because they don’t belong to the new row that is on top. So we would have two new cells, Cell5, which belongs to Column1 and Row3, and Cell6, which belongs to Column2 and Row3. We haven’t changed the mappings of any other cells yet.
     | Column1 | Column2 |
Row3 | Cell5   | Cell6   |
Row1 | Cell1   | Cell2   |
Row2 | Cell3   | Cell4   |
If, at the same time, another user was deleting Column2, Apple would still be able to know exactly which cells to delete. Cell6, Cell2, and Cell4 all belong to Column2, so they would all die.
     | Column1 |
Row3 | Cell5   |
Row1 | Cell1   |
Row2 | Cell3   |
I don’t want to get too far into data types and theory. Hopefully this example gets across why Apple would make this much more complex than the other aspects of Notes and why you should care to understand it well enough to rebuild tables properly (or use a tool that does).
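To make the idea concrete, here is a minimal Python sketch of the toy example above. It is only an illustration of keying cells by the row and column they belong to rather than by position; the names (rows, columns, cells, add_row_at_start, delete_column) are made up for this sketch and say nothing about how Apple actually implements the merge.

```python
# Toy model: rows and columns are ordered lists of IDs, and every cell is
# keyed by the (row ID, column ID) pair it belongs to, never by position.
rows = ["Row1", "Row2"]
columns = ["Column1", "Column2"]
cells = {
    ("Row1", "Column1"): "Cell1",
    ("Row1", "Column2"): "Cell2",
    ("Row2", "Column1"): "Cell3",
    ("Row2", "Column2"): "Cell4",
}


def add_row_at_start(row_id, new_cells):
    """Insert a row on top; existing cells keep their (row, column) keys."""
    rows.insert(0, row_id)
    cells.update(new_cells)


def delete_column(column_id):
    """Drop a column and every cell that belongs to it."""
    columns.remove(column_id)
    for key in [k for k in cells if k[1] == column_id]:
        del cells[key]


def render():
    """Lay the cells out in the current row and column order."""
    return [[cells.get((r, c), "") for c in columns] for r in rows]


# One user adds Row3 at the top while another deletes Column2.
add_row_at_start("Row3", {("Row3", "Column1"): "Cell5",
                          ("Row3", "Column2"): "Cell6"})
delete_column("Column2")
grid = render()
```

Because no cell is addressed by position, the two edits can be applied in either order and the surviving cells land in the same place.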
Rebuilding Tables
The first step to putting a com.apple.notes.table back together is finding it. We’ll use this as an example Note:
As we know from the previous blog post, there’s an attachment in the middle of this Note, with a UUID of CD0CE698-2765-4C55-B53C-CB8E8C4C5609 and Type UTI of com.apple.notes.table. As before, we can pull the right information out of the ZICCLOUDSYNCINGOBJECT table using that UUID, but what we pull is where this gets really different.
The ZICCLOUDSYNCINGOBJECT.ZMERGEABLEDATA1 field holds the key for all of the com.apple.* types (yes, even the com.apple.drawing.2 type I hinted at in the last post). For com.apple.notes.table objects, this field holds a GZipped protobuf that is similar to, but different from, the overall Notes protobuf format. The good news is you already know the first two steps: this value needs to be gunzipped and parsed into its corresponding parts. The bad news is that after that, you get ~390 lines to read, about 40 times as much as the Note itself. Let’s look at this one, section by section, to understand what they do, with these caveats:
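As a rough sketch of those first two steps, the Python below fetches the blob and gunzips it. It assumes the attachment UUID lives in a ZIDENTIFIER column (that column name is my assumption here, not something stated above), and the helper names fetch_mergeable_data and gunzip_blob are hypothetical. It runs against a throwaway in-memory database with a fake payload, not a real NoteStore, and it stops before the protobuf parsing step.

```python
import gzip
import sqlite3

GZIP_MAGIC = b"\x1f\x8b"  # every gzip stream starts with these two bytes


def fetch_mergeable_data(conn, uuid):
    """Return the raw ZMERGEABLEDATA1 blob for one embedded object, or None."""
    # ASSUMPTION: ZIDENTIFIER is the column holding the attachment UUID.
    row = conn.execute(
        "SELECT ZMERGEABLEDATA1 FROM ZICCLOUDSYNCINGOBJECT WHERE ZIDENTIFIER = ?",
        (uuid,),
    ).fetchone()
    return row[0] if row else None


def gunzip_blob(blob):
    """Decompress the blob, refusing anything that is not gzip data."""
    if blob is None or blob[:2] != GZIP_MAGIC:
        raise ValueError("expected a gzip-compressed blob")
    return gzip.decompress(blob)


# Demo with a stand-in database and a fake payload.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ZICCLOUDSYNCINGOBJECT (ZIDENTIFIER TEXT, ZMERGEABLEDATA1 BLOB)"
)
conn.execute(
    "INSERT INTO ZICCLOUDSYNCINGOBJECT VALUES (?, ?)",
    ("CD0CE698-2765-4C55-B53C-CB8E8C4C5609", gzip.compress(b"fake protobuf bytes")),
)
raw = gunzip_blob(fetch_mergeable_data(conn, "CD0CE698-2765-4C55-B53C-CB8E8C4C5609"))
```

The bytes in raw are what would then be handed to a protobuf parser.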
- As with the previous post I’m editing the parts of the protobuf I display to remove a lot of unnecessary information. Because this gets so involved, though, you can download a copy for reference.
- I will generally be presenting these in the same order I needed to in the rebuild_table method of the AppleNotesEmbeddedTable object in Apple Cloud Notes Parser.
- My naming on the protobuf parsing is not necessarily the most clear. Please know those are wholly names I’ve chosen as I reversed the format, and I don’t believe they are part of an Apple standard somewhere. Cleaning them up is on the todo list.
- I fully expect the protobuf examples given to be referenced as you read the text. The only way this made sense to me was to follow along with my finger until I finally got to the end, so I’m trying to set up the same sort of situation for the reader.
Key Items
First we need to understand the key items. We will later use these to understand what type of map entry we are looking at in the protobuf.
Without getting into protobuf specifics, field 4 under the Mergeable Data Table Data message is a repeatable field. This means there can be any number of entries and, while the order matters for parsing, the order is not guaranteed to be the same from table to table. By that I mean that in this specific table, “crRows” was in position 3 (0-based ordering, of course), but in another table, it might swap places with “crColumns”. What you need to do with this section is build an Array that maps the position (e.g. 0) to the entry (e.g. “identity”), as you’ll refer to that later.
Type Items
Next we can do the same thing to the type items.
This looks a lot like before and we need to do the same thing: build an Array mapping the position to the entry. In this case, our 0-th entry would be “com.apple.CRDT.NSNumber”, our 1st would be “com.apple.CRDT.NSString”, and so on.
UUID Items
Next we repeat with the internal UUIDs. In our earlier example, these would be things like Row1, Column1, and Cell1, but in a nice binary format.
Yet again, we just need an Array and we will be storing the actual bytes above. Where needed in this post, I will refer to them using the hex representation given, not the string representation listed on the right.
At this point, we have arrays for the key items, type items, and UUIDs to refer to as we parse through the meat of the protobuf. We could say, for example, that key_items[0] is “identity”, type_items[0] is “com.apple.CRDT.NSNumber”, and uuid_items[0] is “EEFE10DA5A79432588BA6DCAE2E9B7EC”.
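A minimal sketch of building those three lookup arrays might look like the Python below. The table_data dict is a hypothetical stand-in for the parsed protobuf message, its field names are invented for this sketch, and the lists are truncated to the entries this post actually names (the key item at position 2 is a placeholder because it is not named here).

```python
# Hypothetical stand-in for the parsed "Mergeable Data Table Data" message.
# Only entries named in this post are filled in; None marks an unnamed one.
table_data = {
    "key_item": [
        "identity",                # 0
        "crTableColumnDirection",  # 1
        None,                      # 2 (not named in this post)
        "crRows",                  # 3
        "UUIDIndex",               # 4
        "crColumns",               # 5
        "cellColumns",             # 6
    ],
    "type_item": [
        "com.apple.CRDT.NSNumber",  # 0
        "com.apple.CRDT.NSString",  # 1
        "com.apple.CRDT.NSUUID",    # 2 (the real list continues past this)
    ],
    "uuid_item": [
        bytes.fromhex("EEFE10DA5A79432588BA6DCAE2E9B7EC"),  # 0
        bytes.fromhex("B945C2B235A94958AB9DBCD8E8867C30"),  # 1
        bytes.fromhex("BB3738D946074FAAA1B28C2B5437540F"),  # 2 (list continues)
    ],
}

# Position in each list is the only thing the rest of the protobuf refers to,
# so plain lists indexed by position are all that is needed.
key_items = list(table_data["key_item"])
type_items = list(table_data["type_item"])
# Keep the UUIDs as upper-case hex so they are easy to compare and print.
uuid_items = [raw.hex().upper() for raw in table_data["uuid_item"]]
```

Storing the UUIDs as hex strings is just a convenience for this sketch; keeping the raw bytes works equally well as long as comparisons are consistent.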
Table Objects
Finally, we need to add all of the Mergeable Data Table Object messages, which are repeatable field 3 in the Mergeable Data Table Data message, to another Array. Let’s call it table_objects. Let’s look at just the first one to understand what we’re dealing with.
In this case, the Mergeable Data Table Object has exactly one message under it, field 13, which is a Table Map. Table Maps always have an integer called Type in field 1 and then have a repeatable field 3 called Map Entry. We will be looking up the Type integer in our type_items Array and the Key in the Map Entry in our key_items Array. Make sense? It didn’t to me either at first, so let’s look deeper.
In this case, we have a Type of 9. If we refer back to our list of type_items, the bottom one would be index 9 (entry number 10): “com.apple.notes.ICTable”. This tells us this specific Table Map is an ICTable, great! Next we can look at each Map Entry to figure out what they are.
- The first has a Key of 0, which would be the first entry in our key_items, or “identity”. Its value is all 0’s, formatted as a UUID, which doesn’t appear too helpful.
- The second has a Key of 1, which would be “crTableColumnDirection”. That proves to be helpful should you deal with the large swaths of the world that don’t write left-to-right.
- The third has a Key of 3, which is “crRows”, which is how we identify all the rows in the table. Its value is an Object ID message which has only one field under it, an Object Index set to 3. Every time you see an Object Index you are going to use the resulting value to look up a Mergeable Data Table Object in the table_objects Array. In this case, whatever is in table_objects[3] is “crRows”.
- The fourth has a Key of 5, which is “crColumns” and how we identify all the columns in the table. Its value is the Object ID with Object Index of 10, so table_objects[10] is our “crColumns”.
- The fifth and final entry has a Key of 6, which is “cellColumns” and how we identify the mappings of cells to columns and rows. In this case, the object at table_objects[17] is what we want.
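The lookups in that list can be sketched in Python as below. The dict layout and field names (table_map, map_entry, object_index) are hypothetical stand-ins for the parsed protobuf, None marks list positions this post does not name, and the “identity” and “crTableColumnDirection” entries are left out because their values are not Object Indexes.

```python
# Lookup arrays for the example table; None marks entries not named here.
key_items = ["identity", "crTableColumnDirection", None, "crRows",
             "UUIDIndex", "crColumns", "cellColumns"]
type_items = ["com.apple.CRDT.NSNumber", "com.apple.CRDT.NSString",
              "com.apple.CRDT.NSUUID"] + [None] * 6 + ["com.apple.notes.ICTable"]

# Hypothetical parsed table objects; only the first one is filled in.
table_objects = [{} for _ in range(22)]
table_objects[0] = {
    "table_map": {
        "type": 9,
        "map_entry": [
            {"key": 3, "value": {"object_index": 3}},
            {"key": 5, "value": {"object_index": 10}},
            {"key": 6, "value": {"object_index": 17}},
        ],
    }
}


def find_table_root(objects):
    """Return the first Table Map whose Type resolves to ICTable."""
    for obj in objects:
        table_map = obj.get("table_map")
        if table_map and type_items[table_map["type"]] == "com.apple.notes.ICTable":
            return table_map
    return None


def resolve_entries(table_map):
    """Map each entry's key name to the Object Index it points at."""
    resolved = {}
    for entry in table_map["map_entry"]:
        name = key_items[entry["key"]]
        index = entry["value"].get("object_index")
        if index is not None:
            resolved[name] = index
    return resolved


pointers = resolve_entries(find_table_root(table_objects))
```

With the example values, pointers ends up naming the objects to visit next: table_objects[3] for the rows, table_objects[10] for the columns, and table_objects[17] for the cells.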
Brief Summary
That’s a lot to take in, so for a quick summary, at this point we have an Array of all the key items, an Array of all the type items, an Array of all the UUIDs, and an Array of all the table objects. We also know how to deal with a Map Entry message, by looking the key up in our key Array and potentially using an Object Index to go pull out the table object it is referring to. We essentially have all the information we need at this point; we just need to stitch it back together.
Identifying Rows and Columns
To put this back together I start by finding the “com.apple.notes.ICTable” (which we accidentally did by looking at the first item, but don’t assume it will always be there; loop over all items until you find the right Type based on the type_items Array). At this point I loop over each of the Map Entry messages under it, as we did in our example above. I check the Key of each of them and handle both the “crRows” and “crColumns” very similarly. Let’s look at the object that “crRows” pointed to, which was object 3 (but note that the initial ‘3’ in this section is referring to field 3 of a previous message; you’ll see that on all of these Mergeable Data Table Objects):
This table object has one field under it, 16, which is an Ordered Set. The only places these Ordered Sets come up are the “crRows” and “crColumns” (that I’ve seen), and they help us understand where the rows and columns go.
As you look under the Ordering message, you’ll see an Array with field 1 being a Note! We parsed a Note to find the UUID of this com.apple.notes.table, opened the table and got another Note. How crazy is that? Now, this isn’t a real Note; it just uses the same protobuf as you’d find in ZICNOTEDATA, including using the Unicode replacement character to identify the rows. We can tell from that there are three rows because there are three replacement characters.
Unlike a normal Note, however, we don’t see what to replace it with in the Note itself. For that information we look lower at the Attachments repeated field 2 and see three UUIDs with indexes. Index 0, for example, is “BB3738D946074FAAA1B28C2B5437540F”. That should look familiar, as it is one of the entries in our uuid_items Array from earlier. So this tells us that the first row is called “BB3738D946074FAAA1B28C2B5437540F”, the second is called “786888513C244539B82130BDF7D205B1”, and the third “E04653E26E744EDFAF537D96721FD74F”.
Here comes the really annoying part that stumped me for so long. How do you take the knowledge that Row1 is “BB3738D946074FAAA1B28C2B5437540F” and use that to actually display data in the right place? What you end up doing is keeping track of pointers back to the correct row index. What I mean by that is we look at “BB3738D946074FAAA1B28C2B5437540F” and we know that it is in position 2 (remember, 0-based indexing) of our uuid_items Array. So then we would record something like row_index[2] = 0. Meaning, if I ever look up the UUID that is in position 2 from a pointer in this protobuf, I need to know that gets spit out as the first row. We would also add in row_index[6] = 1 and row_index[4] = 2 based on the following Attachments. All of this gives us a way to go from the row’s UUID (or rather its index, since numbers are far nicer to type than a lot of hex) to where we need to spit it out on the screen.
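A minimal Python sketch of that bookkeeping follows. The attachment dicts and the build_position_index name are hypothetical stand-ins for the parsed Ordering message, and the uuid_items entries at positions 3 and 5 are placeholders because their values are not given in this post.

```python
# uuid_items as built earlier; "<not shown>" marks positions whose UUIDs
# are not given in this post.
uuid_items = [
    "EEFE10DA5A79432588BA6DCAE2E9B7EC",  # 0
    "B945C2B235A94958AB9DBCD8E8867C30",  # 1
    "BB3738D946074FAAA1B28C2B5437540F",  # 2
    "<not shown 3>",
    "E04653E26E744EDFAF537D96721FD74F",  # 4
    "<not shown 5>",
    "786888513C244539B82130BDF7D205B1",  # 6
]

# Hypothetical stand-in for the Attachments under the crRows Ordering.
ordering_attachments = [
    {"index": 0, "uuid": "BB3738D946074FAAA1B28C2B5437540F"},
    {"index": 1, "uuid": "786888513C244539B82130BDF7D205B1"},
    {"index": 2, "uuid": "E04653E26E744EDFAF537D96721FD74F"},
]


def build_position_index(attachments, uuids):
    """Map each UUID's position in uuid_items to its on-screen position."""
    position_of = {uuid: i for i, uuid in enumerate(uuids)}
    return {position_of[a["uuid"]]: a["index"] for a in attachments}


row_index = build_position_index(ordering_attachments, uuid_items)
```

The same helper would serve for “crColumns”, since its Ordering has the same shape.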
Sadly, that’s not enough, as you’ll never find pointers directly to any of these UUIDs. This is what drove me crazy, and the answer is in the next section of Ordering, the Contents in field 2. These Dictionary Elements all have a Key and a Value, both of which are Object Indexes. What it is saying is that something in the table object at the Key equals something in the table object at the Value. We already know we have to go find those objects, so here are objects 4 and 5, respectively:
Thankfully we see Table Maps and we already know how to deal with that. Table Object 4 is Type 2, which going back to way earlier we know is “com.apple.CRDT.NSUUID”. It has a Map Entry with Key of 4 (“UUIDIndex”) and a value that is the number 1. This means the item in our UUID item array in index 1 is what this entire object refers to: “B945C2B235A94958AB9DBCD8E8867C30”.
Table Object 5 is also Type 2 and also has a Map Entry with Key 4, but its value is 2. That means it refers to the UUID at index 2 in our uuid_items Array: “BB3738D946074FAAA1B28C2B5437540F”. Now we finally have something referencing one of our rows!
Taken all together, we would use that first Dictionary Element to insert another row into our pointers that says row_index[1] = 0. Meaning that if we get the UUID in index 1, that also refers to the first row (index 0). We would run through that whole process with the other two Dictionary Elements as well, to end up with this pseudocode for our row_index pointers:
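A minimal Python sketch of that aliasing step, using only the first Dictionary Element described above. The dict layout, the unsigned_integer field name, and the helper names (uuid_ref, uuid_index_of, add_aliases) are hypothetical; the other two Dictionary Elements are not shown, so the resulting map is deliberately incomplete.

```python
# Lookup arrays for the example table; None marks an entry not named here.
key_items = ["identity", "crTableColumnDirection", None, "crRows",
             "UUIDIndex", "crColumns", "cellColumns"]
type_items = ["com.apple.CRDT.NSNumber", "com.apple.CRDT.NSString",
              "com.apple.CRDT.NSUUID"]


def uuid_ref(uuid_index):
    """Build a stand-in NSUUID table object pointing at uuid_items[uuid_index]."""
    return {"table_map": {"type": 2, "map_entry": [
        {"key": 4, "value": {"unsigned_integer": uuid_index}}]}}


table_objects = [{} for _ in range(6)]
table_objects[4] = uuid_ref(1)  # refers to B945C2B2...
table_objects[5] = uuid_ref(2)  # refers to BB3738D9..., the first row

# First Dictionary Element under the crRows Ordering Contents.
dictionary_elements = [{"key": {"object_index": 4}, "value": {"object_index": 5}}]


def uuid_index_of(obj):
    """Return the UUIDIndex stored in an NSUUID table object."""
    table_map = obj["table_map"]
    if type_items[table_map["type"]] != "com.apple.CRDT.NSUUID":
        raise ValueError("not an NSUUID object")
    for entry in table_map["map_entry"]:
        if key_items[entry["key"]] == "UUIDIndex":
            return entry["value"]["unsigned_integer"]
    raise ValueError("no UUIDIndex entry")


def add_aliases(index_map, elements):
    """Point each Key UUID at the same position as its Value UUID."""
    for element in elements:
        alias = uuid_index_of(table_objects[element["key"]["object_index"]])
        target = uuid_index_of(table_objects[element["value"]["object_index"]])
        index_map[alias] = index_map[target]
    return index_map


row_index = {2: 0, 6: 1, 4: 2}  # from the Ordering Attachments
add_aliases(row_index, dictionary_elements)
```

After this runs, both UUID index 2 and UUID index 1 resolve to the first row.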
Brief Summary
That is a lot to work through and took a few hours of reading what others were saying and stepping through bit by bit to get it. To summarize again, the “crRows” and “crColumns” entries require you to note not just what the row and column UUIDs are, but also what their index is in the uuid_items Array you built and all of the other UUIDs that can point to the same place. For the sake of brevity, assume we’ve now done the same on the “crColumns” entry, which has exactly the same structure, and we have built these Hashes for ourselves telling us exactly which UUID indices map to which rows and columns for output.
Identifying Cells
With our knowledge of which UUIDs point to which rows and columns we are so close to being able to build this table. We can certainly at this point flesh out the size of the table (this is a 3x3). To finish this off, let’s look at our “cellColumns” object. Remember, we’d previously identified it was in index 17 of our table_objects Array.
After all the rest, this doesn’t look scary at all, but just wait. To understand this, you need to know that the “cellColumns” object is made up of a Dictionary which has a repeatable field Dictionary Element. Each of these Dictionary Elements represents a column. That seems odd because we said this is a 3x3 table which should have 3 columns. Remember the warning at the start of this post: Apple only remembers which cells actually have text. In this case, only one column is needed to track that because only one column had any cells with text in them.
For each and every column, then, we will have yet another Key-Value pair saying that something in the key Object Index is equal to something in the value Object Index. This should be old hat now, so we go grab objects 18 and 21 to see what’s in them:
Our Key in this case was 21, the second message above, which already looks familiar. We know the Type of 2 and Map Entry Key of 4 means we’re looking up a UUID, specifically index 11. Recall above we noted that UUID index 11 is one of the column UUIDs that points to the second column (index 1 in a 0-based world): column_index[11] = 1.
To know what we’re equating this column to, we have to look at object 18, the top one above. This is another Dictionary, but this time the Dictionary Elements listed are the rows representing each cell in that column with a value. In this case we see that a Key of 20 equals a Value of 19. Last time, I promise you, let’s go pull those objects:
Taking the Key of 20 first, the bottom entry is very obviously saying that the UUID in index 5 is our Key. Well, the UUID in index 5 is one of the ones we know to refer to the second row: row_index[5] = 1. This puts our target dead center of the 3x3 table because row_index[5] = 1 and column_index[11] = 1, but what is it?
Looking at the Value field of 19 we see… another Note! And this time, it’s a real one! We can quickly see that the text for this field is “3x3 middle” which seems ironic until you know this was all contrived to test a theory.
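The whole cell walk can be sketched in Python as below, using the object numbers from this example. The dict layout, the note_text field, and the helper names are hypothetical stand-ins for the parsed protobuf, and the Type and Key positions (2 and 4) are hard-coded from this particular table; real code should look them up in type_items and key_items.

```python
NSUUID_TYPE = 2     # position of "com.apple.CRDT.NSUUID" in this table's type_items
UUID_INDEX_KEY = 4  # position of "UUIDIndex" in this table's key_items


def uuid_ref(uuid_index):
    """Build a stand-in NSUUID table object pointing at uuid_items[uuid_index]."""
    return {"table_map": {"type": NSUUID_TYPE, "map_entry": [
        {"key": UUID_INDEX_KEY, "value": {"unsigned_integer": uuid_index}}]}}


def uuid_index_of(obj):
    """Return the UUIDIndex stored in an NSUUID table object."""
    table_map = obj["table_map"]
    if table_map["type"] != NSUUID_TYPE:
        raise ValueError("not an NSUUID object")
    for entry in table_map["map_entry"]:
        if entry["key"] == UUID_INDEX_KEY:
            return entry["value"]["unsigned_integer"]
    raise ValueError("no UUIDIndex entry")


def element(key, value):
    """Build a stand-in Dictionary Element of two Object Indexes."""
    return {"key": {"object_index": key}, "value": {"object_index": value}}


table_objects = [{} for _ in range(22)]
table_objects[17] = {"dictionary": {"element": [element(21, 18)]}}  # cellColumns
table_objects[18] = {"dictionary": {"element": [element(20, 19)]}}  # rows in that column
table_objects[19] = {"note": {"note_text": "3x3 middle"}}           # the cell's Note
table_objects[20] = uuid_ref(5)                                     # a row UUID
table_objects[21] = uuid_ref(11)                                    # a column UUID

# Only the translations this example needs.
row_index = {5: 1}
column_index = {11: 1}


def walk_cells(cell_columns):
    """Yield (row position, column position, text) for every stored cell."""
    cells = []
    for column in cell_columns["dictionary"]["element"]:
        col_uuid = uuid_index_of(table_objects[column["key"]["object_index"]])
        rows_obj = table_objects[column["value"]["object_index"]]
        for row in rows_obj["dictionary"]["element"]:
            row_uuid = uuid_index_of(table_objects[row["key"]["object_index"]])
            note = table_objects[row["value"]["object_index"]]["note"]
            cells.append((row_index[row_uuid], column_index[col_uuid],
                          note["note_text"]))
    return cells


cells = walk_cells(table_objects[17])
```

The outer loop runs once per column that has any text and the inner loop once per populated cell in that column, which is why empty cells never show up.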
With that, we now know how to properly display the Note given at the start:
Table title
|  |            |  |
|  | 3x3 middle |  |
|  |            |  |
After the table
Stitching it All Together
Larger tables obviously have more rows and columns, so you’ll have to do a lot more object lookups. They’ll certainly have a lot more cells to look up, but as you repeat that last step over all the cells, looking up each row and column in your lookup table, it all will fall into place. If you follow these steps, you’ll be able to pull out things many others wouldn’t even know existed. I am linking to the actual methods that do each of the below steps in case reading code is more how you learn.
- Build your key items
- Build your type items
- Build your UUID items
- Use the above to identify your row translations
- Use the above to identify your column translations
- Use all of the above to parse your cells by…
  - Looping over each column Dictionary Element…
    - Then looping over each row Dictionary Element…
      - Then putting specific Note text into a specific cell
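As a closing sketch, the Python below shows the last bullet: once the translations exist, placing text is just indexing into an empty grid. The row_index values come from the walkthrough; column_index only has one entry given in this post (11 maps to 1), so the other two column entries are made-up placeholders, and build_grid is a hypothetical helper name.

```python
# UUID index -> output position. row_index uses the values worked out above.
row_index = {2: 0, 6: 1, 4: 2, 1: 0, 5: 1}
# Only 11 -> 1 comes from this post; 9 and 13 are made-up placeholders.
column_index = {9: 0, 11: 1, 13: 2}

# Cells as (row UUID index, column UUID index, text), as found under cellColumns.
raw_cells = [(5, 11, "3x3 middle")]


def build_grid(rows, columns, cells):
    """Size the grid from the translations, then drop each cell's text in."""
    n_rows = max(rows.values()) + 1
    n_cols = max(columns.values()) + 1
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for row_uuid, col_uuid, text in cells:
        grid[rows[row_uuid]][columns[col_uuid]] = text
    return grid


grid = build_grid(row_index, column_index, raw_cells)
lines = [" | ".join(row) for row in grid]  # simple text rendering, one line per row
```

Sizing the grid from the translations rather than from the cells is the point of the exercise: the cells alone would suggest a 1x1 table.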
Conclusion
Thanks for sticking with me through all that. I struggled to find the right ways to explain it and hope this was at least somewhat clearer than busting into that protobuf yourself. I hope it is useful knowledge for the forensic examiner who needs to understand how to view this data but can’t actually load the Notes database onto their phone, because there is no way any of this information could be accidentally found or properly understood in context. If nothing else, this will be useful for me in 6 months when I’m trying to remember exactly why I did what I did.