TL;DR: Week 5 of the #MagnetWeeklyCTF got a little sporty with the addition of a Linux image (yay) and Hadoop questions (oh no).
Review
Check out the week 1 blog post for how to get started on the Magnet Weekly CTF.
Get the challenge
The weekly challenge for week 5 was:
What is the original filename for block 1073741825?
Ok, seems straightforward enough: we just need to know how to map a file system’s block number to the actual filename.
Open the target file(s)
Magnet provided three disk images this week, all of which are in EnCase format (.E01). To open them up on a Linux machine, you need to install ewf-tools and use the ewfmount command. That will give you an image named ewf1, which you can then mount to get to your actual file system.
Note: I was doing this last minute and hit some odd permissions errors with ewfmount, as it mounted everything as if it were owned by root despite not being run by root. That was all cleared up by making sure I sudo’d everything. I also had to make sure I specified the offset within the file for the partition I was trying to mount. All three drives have their main partition starting 2048 sectors into the image, and those sectors are 512 bytes, so the offset is 512 x 2048 = 1048576. I also made sure to specify in the options (-o) that it was a loop device, read-only (ro), and not to try to recover anything (norecovery).
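Putting that together, the whole sequence looked roughly like this; the mount-point paths here are just examples, so adjust them to your own setup:

    # Expose the E01 as a raw image, then mount the main partition read-only
    sudo mkdir -p /mnt/ewf /mnt/image
    sudo ewfmount HDFS-Master.E01 /mnt/ewf
    sudo mount -o ro,loop,norecovery,offset=$((512*2048)) /mnt/ewf/ewf1 /mnt/image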
Learn some background
Having never run a Hadoop cluster myself, I did some brief googling to figure out what made Hadoop’s file system different from others I knew. Apache’s HDFS design document gives a great overview of the file system, and Stack Overflow comes in clutch as always with the command that should be the answer: hadoop fsck / -files -blocks | grep blk_1073741825. That is older syntax; it should be even easier with hdfs fsck -blockId blk_1073741825.
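Spelled out, those would be run on a live cluster something like this (both forms assume the Hadoop binaries are on your PATH, which, as you will see, I never quite managed here):

    # Older syntax: list every file with its blocks, then grep for ours
    hadoop fsck / -files -blocks | grep blk_1073741825
    # Newer syntax: query the block directly
    hdfs fsck -blockId blk_1073741825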
Derp for an hour or two
With the disks mounted, I tried a lot of different ways to get hdfs fsck to work, to no avail. This was a serious point of frustration, mainly because it was last minute, I didn’t have time to go study how to do it right, and I very much wanted sleep at that point. Finally, my brain was able to pull my head out of the frustration enough to try something else (which I should always do way sooner than I do).
Find the Fsimage file
I at least had enough understanding at this point to know I was looking for the fsimage files. They literally could not be found anywhere on the copies I mounted on my first machine, and in desperation I copied everything over to another testbed and tried again. Lo and behold, when I did a find | grep fsimage this time, there they were! I am fairly certain one of the ways I tried to get Hadoop running overwrote them initially, before I remembered the ro flag on mount. On the HDFS-Master.E01 file, it can be found in /usr/local/hadoop/hadoop2_data/hdfs/namenode/current/fsimage_*. With this open in a hex editor, you can clearly see the filename present, at which point I tried “AptSource” and got the flag.
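For the record, this is roughly what I ran against the mounted image once I was on the second testbed; xxd and less are just the viewers I happened to reach for:

    cd /mnt/image
    # Locate the NameNode fsimage files
    find . -iname 'fsimage*'
    # Dump the image and scan for readable filenames
    xxd usr/local/hadoop/hadoop2_data/hdfs/namenode/current/fsimage_0000000000000000024 | less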
Alternatives
Log Files
Early on, before I wanted to commit to installing Hadoop and while I still couldn’t get the images to mount, I tried my usual opening move: grep, to see what showed up. I saw an answer and tried it, but my lack of Hadoop knowledge meant I tried the wrong answer. The filename and block ID were both found in a log file; my issue was that I took the entire path given in the log file instead of only the name.
The relevant log lines imply that block blk_1073741825_1001 was allocated for “/text/AptSource._COPYING”. I figured that the ._COPYING suffix was likely garbage, but thought “/text/” was needed. As it turns out, the filename itself was “AptSource” and was right there in the clear. I had the right answer, but lacked the understanding to apply it.
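Something along these lines surfaces those log entries; the log directory is the stock $HADOOP_HOME/logs location, which is an assumption on my part:

    # Hunt for the block ID in the Hadoop logs on the mounted image
    grep -r 'blk_1073741825' /mnt/image/usr/local/hadoop/logs/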
File Size
Along the way, I also discovered that the size of the file is recorded in the log, and a find -size [size]c over the mounted image turned up the original source file. Unfortunately, that name is different and not the actual “original” filename that the question asks for.
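A sketch of that search, with the byte count from the log left as a placeholder:

    # size_bytes is the length reported for the block in the Hadoop log
    size_bytes=REPLACE_ME
    find /mnt/image -type f -size "${size_bytes}c" 2>/dev/null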
Looking at these files and the contents of the actual block found in usr/local/hadoop/hadoop2_data/hdfs/namenode/current/fsimage_0000000000000000024 showed that the original file was home/hadoop/temp/sources.list.
Console history
You can also find this file by examining the actions of the admin running the cluster. Looking at the user’s bash history gives insight into when the cluster was created and when this file was pushed in. You can even see that the admin made the same mistake I usually do: trying to move a file to a directory that doesn’t exist.
In this case, the admin pushed in the file sources.list, which we had already discovered to be the same size and content as our block in question.
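Pulling that history off the mounted image is quick; this assumes the admin account is the hadoop user, which matches the home/hadoop paths we have already seen:

    # Review the admin's shell history for cluster setup and file pushes
    cat /mnt/image/home/hadoop/.bash_history
    # Narrow it down to HDFS file operations
    grep -E 'hdfs dfs|hadoop fs' /mnt/image/home/hadoop/.bash_history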
Conclusion
I have a lot to learn over the next few weeks and, unfortunately, very little spare time to do so. I wasted a lot of time trying to get the assumed answer from the Internet working, instead of looking at the evidence in front of me and working with it. This is going to be rough, but I sure hope I can continue to scrape by with the usual CLI tools without actually getting Hadoop running.