How to solve Advent of Code 2022 – Day 7 with Python

If you missed any previous days, click here for all my content about that: Advent of Code, if you want to know why you should participate and try to code the solution for yourself, click here: Advent Of Code 2022 – 7 Reasons why you should participate. If you’re here to see the solution of Day 7, continue reading 😉

GitHub Repository

I will upload all of my solutions there – in the form of Python (or Scala alternatives) notebooks. You have to use your own input for the “input.txt” since everyone gets their own and everyone needs to provide a different solution based on their input.

Day 7 Puzzle

On day 7 of Advent of Code, we had to build a tree of a filesystem that we get knowledge of through recorded terminal commands, such as this:

$cd /$ ls
dir a
14848514 b.txt
8504156 c.dat
dir d
$cd a$ ls
dir e
29116 f
2557 g
62596 h.lst
$cd e$ ls
584 i
$cd ..$ cd ..
$cd d$ ls
4060174 j
8033020 d.log
5626152 d.ext
7214296 k

Which would generate the following tree:

- / (dir)
- a (dir)
- e (dir)
- i (file, size=584)
- f (file, size=29116)
- g (file, size=2557)
- h.lst (file, size=62596)
- b.txt (file, size=14848514)
- c.dat (file, size=8504156)
- d (dir)
- j (file, size=4060174)
- d.log (file, size=8033020)
- d.ext (file, size=5626152)
- k (file, size=7214296)

Part 1

Here we then had to calculate the size of the folders to then add up the ones with a total size with less than 100,000. Now you could do this in a shortcut-way without reconstructing the complete tree. However, I chose to train my data structures and build that tree from top to bottom – or rather, from root to leaves.

Tree Nodes

First off, I created a class for the entries in the tree. This would either be a file or a directory, indicated by a boolean value and if it’s a file, it has a size.

At the start the node does not have children or a parent. You can however add a child through the method add_child which assigns the child a parent before adding it to the children list.

Then I calculated the size recursively, always adding the size of the children to the current size until I arrive at a file (which by its nature does not have children), in which case I return just the file size.

I also included a print method for the tree. Frankly because debugging without being able to see the tree was nearly impossible.

class TreeNode:
def __init__(self, is_dir: bool, name, size=None) -> None:
self.is_dir = is_dir
self.name = name
self.size = size
self.children = []
self.parent = None

child.parent = self
self.children.append(child)

def get_size(self):
if self.is_dir:
total_size = 0
for child in self.children:
total_size += child.get_size()
else:
return self.size

def print_children(self, level):
if self.is_dir:
print('--' * level + self.name + ' (total=' + str(self.get_size()) +')')
else:
print('--' * level + self.name + ' (file=' + str(self.get_size()) +')')
if len(self.children) > 0:
for child in self.children:
child.print_children(level+1)

Tree

I also wrote a class for the tree. You don’t have to do that, most of the methods are one liners and its main purpose is keeping track of the current directory/node. You could easily do that in your main loop.

class Tree:
def __init__(self) -> None:
self.root = TreeNode(is_dir=True, name="root")
self.current = self.root

def reset_to_root(self):
self.current = self.root

def go_up_one_level(self):
self.current = self.current.parent

def go_to_child(self, name):
self.current = list(filter(lambda child: child.name == name, self.current.children))[0]

self.current.add_child(child)

Parsing the tree

There are a few different cases of lines:

• ‘$cd /’ which returns us to the root directory • ‘$ ls’ which starts the sequence of listing files and directories within the current ones
• here I use a while loop to catch all of those and add them as children
• ‘$cd ..’ simply goes up to the parent directory/node of the current one • ‘$ cd <name>’ which enters the given sub-directory
• side note: thankfully the (or my) input never tries to enter directories that you didn’t previously see
with open('example.txt', 'r') as f:
lines = [entry.strip() for entry in lines]

tree = Tree()

while len(lines) > 0:
line = lines.pop(0)
if line == '$cd /': tree.reset_to_root() elif line == '$ ls':
while len(lines)>0 and '$' not in lines[0]: line = lines.pop(0) size, name = line.split(' ') if size.isdigit(): new_node = TreeNode(is_dir=False, name=name, size=int(size)) else: new_node = TreeNode(is_dir=True, name=name) tree.add_new_child(new_node) elif line == '$ cd ..':
tree.go_up_one_level()
elif '\$ cd' in line:
_, _, name = line.split(' ')
tree.go_to_child(name)

Putting it together

This is the method (within the TreeNode class), that solves Part 1:

def find_subdirectories_part1(self):
dir_sizes = 0
if self.is_dir:
for child in self.children:
if child.is_dir and child.get_size() <= 100000:
dir_sizes += child.get_size() + child.find_subdirectories_part1()
else:
dir_sizes += child.find_subdirectories_part1()
return dir_sizes

Here we recursively go through the tree building up a sum of file sizes as we go. When iterating through a nodes children, we only add the current directory size to our sum, if it’s below the 100,000 threshold. Otherwise we just look deeper in the tree.

# call on root node to solve
tree.root.find_subdirectories_part1()

Part 2

After building up a whole tree in Part 1, Part 2 is fairly quick after reading the assignment. Basically, we are looking for a directory that is at least as big as 30000000 – (70000000 – tree.root.get_size()).

To do that I created a similar method as in Part 1. This time, I’m not building a sum, but instead collecting all the possible directory sizes in an array.

# method in TreeNode class
def find_subdirectories_part2(self, min_size):
dir_sizes = []
if self.is_dir:
for child in self.children:
if child.is_dir and child.get_size() >= min_size:
dir_sizes += [child.get_size()] + child.find_subdirectories_part2(min_size)
else:
dir_sizes += child.find_subdirectories_part2(min_size)
return dir_sizes

The instruction was the smallest sufficiently large directory, so finally I select the minimum of all directories found above.

total_space = 70000000
space_needed = 30000000
current_empty_space = 70000000 - tree.root.get_size()
possible_dirs = tree.root.find_subdirectories_part2(space_needed - current_empty_space)
min(possible_dirs)

Conclusion

I’m tired. I did this way too late.

Consent Management Platform by Real Cookie Banner