TL;DR: Apple iCloud Notes are stored as GZIP’d protobufs. This updated program will decompress them for you and help you display them closer to the original, with their original formatting and images.
Background
Two years ago, after going through SANS FOR585, I put out a small Perl script to better parse the “new” version of Apple Notes, which gzipped its contents instead of storing them as plaintext. One of the requests I have heard for the script since then was to pair the images included in Notes since iOS 9 with the note itself. In reviewing the code and digging into the issue, I learned a lot more about how Apple Notes works and how the versions differ in function and storage, and came up with a more fully-featured program to handle this better. More of those details will come in later blog posts; this one focuses on the new version of Apple Notes Parser.
Apple Notes Data Formats After iOS 9
After iOS 9 released the iCloud version of Apple Notes, the formatting changed significantly. Whereas in iOS 8 and below the note was stored as plaintext, in iOS 9 and beyond the note was stored as a protocol buffer (protobuf) that was gzipped and put into the database as a blob. My previous work would guess at the location and length of the plaintext in that protobuf, without handling it as a protobuf, because, frankly, I had not recognized it as such until I read this post on Mac4n6. After reading that, it is clear Apple Notes uses protobufs in a few places in its database, most notably to store the note’s data and to store embedded object data, but more on that in future articles.
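To make the "gzipped blob" idea concrete, here is a minimal sketch of the decompression step using only Ruby's standard library. The helper names (gzipped?, gunzip_note, gzip_bytes) are made up for this illustration and are not taken from the parser itself; the sketch assumes you already have the blob bytes in hand, for example from the note data column.

```ruby
require 'stringio'
require 'zlib'

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = "\x1f\x8b".b

# True if the blob looks like gzip data (checks the magic bytes only).
def gzipped?(blob)
  blob.b.start_with?(GZIP_MAGIC)
end

# Returns the decompressed bytes, or nil if the blob is not valid gzip.
def gunzip_note(blob)
  return nil unless gzipped?(blob)

  Zlib::GzipReader.new(StringIO.new(blob)).read
rescue Zlib::Error
  nil
end

# Demo helper (hypothetical): gzip a string so there is something to decompress.
def gzip_bytes(data)
  io = StringIO.new(String.new)
  writer = Zlib::GzipWriter.new(io)
  writer.write(data)
  writer.finish # finishes the stream without closing the StringIO
  io.string
end

blob = gzip_bytes('Hello, Notes')
puts gunzip_note(blob)
```

What comes out of a real note is still a protobuf, not plain text, so this is only the first half of the job.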
Apple Notes Parser Updates
Language
The new version of the program has a lot of changes under the hood and is definitely a breaking change from its predecessor due to the change in languages. As I looked to parse the protobuf, I came up against the issue that Perl is not an officially supported language by Google. While there is a Perl module that adds this functionality (Google::ProtocolBuffers), it hasn’t been updated in a few years and didn’t work on current examples. Since Ruby is an officially supported language, is a language I enjoy, and would scale better for future additions, I opted to port the code over to that.
While re-writing the parser into a Ruby program, I made it fully object oriented. This makes it much easier to fix and extend for the future (and I’d seriously suggest those who just know how to write a good enough script to get the job done look into some programming theory to see how much time you can save in the long run by designing it right from the start, not that I’ll claim this is). This also means that others who want to interact with notes can use the base classes as a starting point to do something different.
For example, the AppleNote class represents a note and holds all the necessary information to interact with that note. AppleNote, AppleNotesAccount, AppleNotesFolder, and AppleNotesEmbeddedObject control the CSV output, each using a function named to_csv. AppleNote’s is:
If you thought there were too many columns and wanted to get rid of the encryption information in favor of just knowing how many embedded objects there were, you could change it to this (note the change on the last line):
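As a rough illustration of that kind of change (this is not the project's actual listing; the class names and attributes below are hypothetical stand-ins), a to_csv method that returns one row per note could be tweaked like so:

```ruby
# Hypothetical stand-in for a note class. The attribute names are
# illustrative assumptions, not the real AppleNote fields.
class SketchNote
  attr_reader :note_id, :title, :plaintext, :is_encrypted, :embedded_objects

  def initialize(note_id:, title:, plaintext:, is_encrypted: false, embedded_objects: [])
    @note_id = note_id
    @title = title
    @plaintext = plaintext
    @is_encrypted = is_encrypted
    @embedded_objects = embedded_objects
  end

  # One row per note, ending with the encryption flag.
  def to_csv
    [note_id, title, plaintext, is_encrypted]
  end
end

# Same row, but the last column is swapped for a count of embedded objects.
class SketchNoteWithCounts < SketchNote
  def to_csv
    [note_id, title, plaintext, embedded_objects.length]
  end
end

note = SketchNoteWithCounts.new(note_id: 1, title: 'Groceries',
                                plaintext: 'Milk', embedded_objects: [:photo])
p note.to_csv
```

Because each class owns its own row layout, a change like this stays in one method and the code that writes the CSV file does not need to know about it.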
Added Functionality
The new version preserves all of the old functionality. You can still run it with no commands to parse a NoteStore.sqlite that is in the same folder. However, it also has some new functionality. If you point it at an iTunes backup folder, it will identify the NoteStore.sqlite (hashed to 4f98687d8ab0d6d1a371110e6b7300f6e465bef2) and parse that file. But it will also identify all the embedded images from the NoteStore.sqlite file that remain in the backup and pull those out as well for examination.
Now when you run the program, the following happens (this may change slightly over time, whatever is in the GitHub repo is definitive):
- An AppleBackup object is created based on command line arguments (currently either an iTunes backup folder, or just a NoteStore.sqlite file)
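For a sense of how a hashed filename can be turned into a path on disk, here is a small sketch. It assumes the common iTunes backup layouts, where a file sits either in a subfolder named after the first two characters of its hash or directly in the backup root; the function name find_hashed_file is made up for this example and is not the parser's own code.

```ruby
# Hash under which NoteStore.sqlite is stored in an iTunes backup (from the text above).
NOTESTORE_HASH = '4f98687d8ab0d6d1a371110e6b7300f6e465bef2'

# Returns the path of the hashed file inside the backup, or nil if absent.
# Assumption: the file is either at <root>/<first two hex chars>/<hash>
# or at <root>/<hash>.
def find_hashed_file(backup_root, file_hash)
  candidates = [
    File.join(backup_root, file_hash[0, 2], file_hash),
    File.join(backup_root, file_hash)
  ]
  candidates.find { |path| File.file?(path) }
end

# Usage: pass the backup root as the first command line argument.
if ARGV[0]
  path = find_hashed_file(ARGV[0], NOTESTORE_HASH)
  puts(path || 'NoteStore.sqlite not found in this backup')
end
```

The same lookup works for embedded images once their hashes are known, which in an iTunes backup come from Manifest.db.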
- The AppleBackup object creates an AppleNoteStore object that handles the NoteStore file(s)
- The AppleNoteStore object guesses which iOS version it came from based on the structure of the database it is pointing to.
- The AppleNoteStore object rips the accounts from the sqlite database, creating individual AppleNotesAccount objects for each.
- The AppleNoteStore object rips the folders from the sqlite database, creating individual AppleNotesFolder objects for each.
- The AppleNoteStore object rips the notes from the sqlite database, creating individual AppleNote objects for each.
- Each AppleNote object pulls information from both ZICCLOUDSYNCINGOBJECT and ZICNOTEDATA (and Z_11NOTES for iOS 11) to track most of the relevant information.
- If the note is encrypted, the encryption variables are added to the AppleNote object.
- Each non-encrypted AppleNote object then attempts to gunzip its compressed data.
- If successful, it then attempts to parse the protobuf that should be inside.
- If successful, it stores the plaintext in another column in the NoteStore’s ZICNOTEDATA table (ZPLAINTEXTDATA), and adds the plaintext to the AppleNote object.
- If successful, it then scans the plaintext and protobuf for embedded objects, and creates AppleNotesEmbeddedObject objects for each that it finds.
- At the end, the program creates an output directory that contains:
  - A csv folder for four CSV files summarizing the AppleNotesAccount, AppleNotesFolder, AppleNote, and AppleNotesEmbeddedObject objects.
  - A files directory if an iTunes backup was used, containing copies of the pictures that were embedded in the notes, following the file path they should have.
  - An html directory that contains an HTML representation of that AppleNoteStore (i.e. the Folders and Accounts, with Notes and content formatted as they were originally).
  - A copy of the NoteStore.sqlite file that was made before all this began, to leave the original intact.
  - A copy of the notes.sqlite file that was made before all this began, to leave the original intact.
  - A copy of the Manifest.db if an iTunes backup was used, to leave the original intact.
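The "if successful" chain in the steps above can be sketched as a series of stages that each bail out early. This is only an illustration: the real program parses the protobuf with Google's protobuf gem and a schema, whereas this sketch swaps in a schema-less walk over the protobuf wire format so that it runs on the standard library alone. All method names here are hypothetical.

```ruby
require 'stringio'
require 'zlib'

# Stage 1: gunzip the blob; nil means this stage failed.
def gunzip(blob)
  Zlib::GzipReader.new(StringIO.new(blob)).read
rescue Zlib::Error
  nil
end

# Reads one base-128 varint; nil if the input ends first.
def read_varint(io)
  result = 0
  shift = 0
  loop do
    byte = io.getbyte
    return nil if byte.nil?

    result |= (byte & 0x7f) << shift
    return result if (byte & 0x80).zero?

    shift += 7
  end
end

# Reads exactly `length` bytes; nil if fewer are available.
def read_exact(io, length)
  data = io.read(length)
  data if data && data.bytesize == length
end

# Stage 2: list the top-level fields of a protobuf message without a schema.
# Returns nil on truncated input or an unsupported wire type.
def top_level_fields(bytes)
  io = StringIO.new(bytes.b)
  fields = []
  until io.eof?
    key = read_varint(io)
    return nil if key.nil?

    wire_type = key & 0x7
    value =
      case wire_type
      when 0 then read_varint(io)     # varint
      when 1 then read_exact(io, 8)   # fixed 64-bit
      when 2                          # length-delimited (strings, nested messages)
        length = read_varint(io)
        length && read_exact(io, length)
      when 5 then read_exact(io, 4)   # fixed 32-bit
      end
    return nil if value.nil?

    fields << { number: key >> 3, wire_type: wire_type, value: value }
  end
  fields
end

# Runs the stages in order and reports how far the note got.
def process_note(blob, encrypted: false)
  return { status: :encrypted } if encrypted

  raw = gunzip(blob)
  return { status: :gunzip_failed } if raw.nil?

  fields = top_level_fields(raw)
  return { status: :protobuf_failed } if fields.nil?

  { status: :parsed, fields: fields }
end
```

Returning a status at each stage keeps a bad note from stopping the whole run, and it leaves a record of which step a given note failed on.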
For embedded objects, the program tries to represent them as faithfully as possible. All will have at least the object type (such as public.jpeg) and the object’s UUID, which can be looked up in the ZICNOTEDATA.ZIDENTIFIER column. Pictures will identify where they are on disk and tables will identify the text in each cell. In the HTML output, pictures refer to the thumbnail stored by Notes, although the program also copies out the full-size image, and tables are rendered as a table.
Requirements and Usage
This program now requires Ruby instead of Perl, which concerned me at first since I consider Perl to be fairly ubiquitous and Ruby not as much. However, this code doesn’t have many dependencies, the ones it does have are generally old and well-maintained gems, and the backwards compatibility of Ruby means this code will run on versions far older than the one I have pinned it to. Right now the program expects Ruby 2.3.0 or newer (~2015), but for those with an older version who want to decrement the version of Google Protobufs used, it should work back to 2.0 at least. Ruby 2.7 was just released and Google Protobufs is not yet compiled for it; until that occurs, I’ve pinned the program to require a version less than 2.7. The few gems that are required are all on the official Ruby gems repo and shouldn’t have any surprises.
Basic
To use the new version the same as the old, put a NoteStore.sqlite file into the root directory of the program and type rake into the command line. Although it doesn’t appear in the above, Rake is the Ruby version of the classic Make tool, which is basically what real programmers used before IDEs automated everything and we just docker’d our saltstack using a repo of someone else’s code (speaking in jest).
In this example, rake was expanded to run ruby on the notes_cloud_ripper.rb file, passing in the --file NoteStore.sqlite argument to identify the file to parse. If you’d like to get more particular with the arguments, you’ll want to directly invoke ruby and specify the arguments to use.
iTunes / Logical
In this example, ruby notes_cloud_ripper.rb was executed with the --itunes-dir argument pointing at an iTunes backup directory located at ~/phone_rips/iphone/notes_2019_12_05/device_id/. This would be the root of the iTunes backup, with Manifest.db present.
Physical
In this example, ruby notes_cloud_ripper.rb was executed with the argument to find a physical backup directory located at ~/phone_rips/iphone/iOS13_tar/ (how you obtain that physical backup is up to you). This would be the root of the physical backup, with the private directory under it (for the sake of hard drives, the important thing really is to export the /private directory).
Power to the Pictures!
To show the value in running this on real backups, below is an example of the files created from the iTunes example above. Because we used a full backup, the output directory now includes the pictures, thumbnails, and drawings that were embedded in the notes:
And this is a screenshot of one of the notes in that HTML export:
Conclusion
At the end of the day, I hope this update makes the Notes Parser more useful. While this was a majorly breaking change, it should still have the functionality from the previous version, but with added features to get after embedded objects better. Additional posts will help lay out what the Notes formats are and how they’ve changed over the years.