If you missed any previous days, click here for all my content about that: Advent of Code, if you want to know why you should participate and try to code the solution for yourself, click here: Advent Of Code 2022 – 7 Reasons why you should participate. If you’re here to see the solution of Day 7, continue reading 😉
Table of Contents
- GitHub Repository
- Day 7 Puzzle
I will upload all of my solutions there – in the form of Python (or Scala alternatives) notebooks. You have to use your own input for the “input.txt” since everyone gets their own and everyone needs to provide a different solution based on their input.
Day 7 Puzzle
On day 7 of Advent of Code, we had to build a tree of a filesystem that we get knowledge of through recorded terminal commands, such as this:
$ cd / $ ls dir a 14848514 b.txt 8504156 c.dat dir d $ cd a $ ls dir e 29116 f 2557 g 62596 h.lst $ cd e $ ls 584 i $ cd .. $ cd .. $ cd d $ ls 4060174 j 8033020 d.log 5626152 d.ext 7214296 k
Which would generate the following tree:
- / (dir) - a (dir) - e (dir) - i (file, size=584) - f (file, size=29116) - g (file, size=2557) - h.lst (file, size=62596) - b.txt (file, size=14848514) - c.dat (file, size=8504156) - d (dir) - j (file, size=4060174) - d.log (file, size=8033020) - d.ext (file, size=5626152) - k (file, size=7214296)
Here we then had to calculate the size of the folders to then add up the ones with a total size with less than 100,000. Now you could do this in a shortcut-way without reconstructing the complete tree. However, I chose to train my data structures and build that tree from top to bottom – or rather, from root to leaves.
First off, I created a class for the entries in the tree. This would either be a file or a directory, indicated by a boolean value and if it’s a file, it has a size.
At the start the node does not have children or a parent. You can however add a child through the method
add_child which assigns the child a parent before adding it to the
Then I calculated the size recursively, always adding the size of the children to the current size until I arrive at a file (which by its nature does not have children), in which case I return just the file size.
I also included a print method for the tree. Frankly because debugging without being able to see the tree was nearly impossible.
class TreeNode: def __init__(self, is_dir: bool, name, size=None) -> None: self.is_dir = is_dir self.name = name self.size = size self.children =  self.parent = None def add_child(self, child): child.parent = self self.children.append(child) def get_size(self): if self.is_dir: total_size = 0 for child in self.children: total_size += child.get_size() return total_size else: return self.size def print_children(self, level): if self.is_dir: print('--' * level + self.name + ' (total=' + str(self.get_size()) +')') else: print('--' * level + self.name + ' (file=' + str(self.get_size()) +')') if len(self.children) > 0: for child in self.children: child.print_children(level+1)
I also wrote a class for the tree. You don’t have to do that, most of the methods are one liners and its main purpose is keeping track of the current directory/node. You could easily do that in your main loop.
class Tree: def __init__(self) -> None: self.root = TreeNode(is_dir=True, name="root") self.current = self.root def reset_to_root(self): self.current = self.root def go_up_one_level(self): self.current = self.current.parent def go_to_child(self, name): self.current = list(filter(lambda child: child.name == name, self.current.children)) def add_new_child(self, child): self.current.add_child(child)
Parsing the tree
There are a few different cases of lines:
- ‘$ cd /’ which returns us to the root directory
- ‘$ ls’ which starts the sequence of listing files and directories within the current ones
- here I use a while loop to catch all of those and add them as children
- ‘$ cd ..’ simply goes up to the parent directory/node of the current one
- ‘$ cd <name>’ which enters the given sub-directory
- side note: thankfully the (or my) input never tries to enter directories that you didn’t previously see
with open('example.txt', 'r') as f: lines = f.readlines() lines = [entry.strip() for entry in lines] tree = Tree() while len(lines) > 0: line = lines.pop(0) if line == '$ cd /': tree.reset_to_root() elif line == '$ ls': while len(lines)>0 and '$' not in lines: line = lines.pop(0) size, name = line.split(' ') if size.isdigit(): new_node = TreeNode(is_dir=False, name=name, size=int(size)) else: new_node = TreeNode(is_dir=True, name=name) tree.add_new_child(new_node) elif line == '$ cd ..': tree.go_up_one_level() elif '$ cd' in line: _, _, name = line.split(' ') tree.go_to_child(name)
Putting it together
This is the method (within the TreeNode class), that solves Part 1:
def find_subdirectories_part1(self): dir_sizes = 0 if self.is_dir: for child in self.children: if child.is_dir and child.get_size() <= 100000: dir_sizes += child.get_size() + child.find_subdirectories_part1() else: dir_sizes += child.find_subdirectories_part1() return dir_sizes
Here we recursively go through the tree building up a sum of file sizes as we go. When iterating through a nodes children, we only add the current directory size to our sum, if it’s below the 100,000 threshold. Otherwise we just look deeper in the tree.
# call on root node to solve tree.root.find_subdirectories_part1()
After building up a whole tree in Part 1, Part 2 is fairly quick after reading the assignment. Basically, we are looking for a directory that is at least as big as 30000000 – (70000000 – tree.root.get_size()).
To do that I created a similar method as in Part 1. This time, I’m not building a sum, but instead collecting all the possible directory sizes in an array.
# method in TreeNode class def find_subdirectories_part2(self, min_size): dir_sizes =  if self.is_dir: for child in self.children: if child.is_dir and child.get_size() >= min_size: dir_sizes += [child.get_size()] + child.find_subdirectories_part2(min_size) else: dir_sizes += child.find_subdirectories_part2(min_size) return dir_sizes
The instruction was the smallest sufficiently large directory, so finally I select the minimum of all directories found above.
total_space = 70000000 space_needed = 30000000 current_empty_space = 70000000 - tree.root.get_size() possible_dirs = tree.root.find_subdirectories_part2(space_needed - current_empty_space) min(possible_dirs)
I’m tired. I did this way too late.