Thursday 3 December 2020

Regex (Advent of Code)

This blog describes the solution to this puzzle https://adventofcode.com/2020/day/2, that came out the same day as this blog is written. Although it is very easy to solve this puzzle with python or even with a google spreadsheet, in this blog, I will solve it in the esoteric language Retina. For those of you who never heard of this language, the linked page gives the following explanation:

Retina is a regex-based recreational programming language. Every program works by reading a (finite) string from standard input, transforming it via a series of regex operations (e.g. counting matches, filtering lines, and most of all substituting). Retina was built on top of .NET’s regex engine, but provides its own, more powerful substitution syntax.

As retina does not allow reading or writing files, the input from the puzzle will be provided via stdin.

The two solutions in runnable form can be found here: Part 1 and Part 2.

Part 1

For part one we have to count the lines of the form 1-3 a: abcde wherein the password (string at the end), the number of times the letter (before the colon) appear between “minimum” and “maximum” times (the indicated numbers).
The code does the following:

%(`

Do for every line

(\d+)\-(\d+) (.): (.+)

Select the minimum, maximum, letter and password. They are stored in $1 $2 $3 and $4
So, for example, if our input was 1-3 b: abbc the variables would respectively be 1, 3, “b” and “abbc”

K`$1,$2:$4¶$3¶_

Build for the given variables the following string

K`<mimimum>,<maximum>:<password>
<letter>
_

So for our previous example, this string becomes

K`1,3:abbc
b
_

This is in itself a retina program, that takes the password, and replaces each occurrence of <letter> in the password with an underscore. <mimimum>,<maximum>: is prepended, but as they do not contain any letters, this piece is unaffected by the replace statement.

~`$-
_

Executes the code given by the struct. $- means the result of the last expression (so the code described above). As the code does not take any input, the underscore is ignored. The result is something of the form 1,3: a__c.

[a-z]*
<whitespace>

Delete all lowercase letters. The result is something of the form 1,3:__. It has the minimum and maximum, and contains a list of underscores with its length equal to the number of occurrences of <letter>

:(_*)$

This matches the underscores after the colon.

:$.($1)

Replaces the previous match with the length of the string, which is the number of underscores. The result will be something like 1,3:2

^(\d+),(\d+):(\d+)$
$1;$3;$2

Extract the 3 numbers, and puts them in in a comma seperated list, with the mesured lenght in the middle.
In our example, this will be 1,2,3.
Note that if our measured length is between minimum and maximum, this list will be sorted.

N`\d+

Sort the list.

^
$-1x

Append the previous expression (the unsorted list) to the current expression. In our example, the value will now be 1,2,3x1,2,3

^(.*)x\1$
ok

This replaces the full string with ok if the expression before the ‘x’ is the same as the expression after the ‘x’. This only happens when our list was already sorted before the sort operation.

)*`

End the Per-line mode and concatenate all lines.
Now we only have to count all occurrences of ok, and we have our list. We do that by just typing ok

If you try it with your own puzzle input, make sure to not add a trailing enter, as the procedure above replaces an empty string by ok, so it will count empty lines.

part 2

For part one we have to count the lines of the form 1-3 a: abcde where either the first or third element of our password is the correct letter. Note that this is one indexed, contrary to the superior and more practical zero indexing. This we will have to keep in mind later.

%(`

Again we start with “for every line”

(\d+)\-(\d+) (.): (.+)

Same as above. Extract the variables.

K`a$4¶^(.){$1}(.).$*$$¶$$2¶K`a$4¶^(.){$2}(.).$*$$¶$$-1$3$$2

Build the following string

K`a<password>
^(.){<first index>}(.).*$
$2
K`a<password>
^(.){$2}(.).*$
$-1<letter>$2

This is in itself a retina program, that works as following:

K`a<password>

Create a string with the password. We prepend an “a” to make our indices work. As the following code assumes the password is zero-indexed, but it is one indexed, we can apply an extra character to the front of our password to shift all indices by one.

^(.){<first index>}(.).*$

Select the letter at the index of <first index>. The first part of the regex ^(.){<first index>} matches <first index> characters, so $2 will be matched with the first character after it, which is the <first index>-th character. .*$ matches the rest of the string, to make sure the full string is replaced.

$2

Replace the full string with $2, which was the needed character.
The next three lines of code do the same thing, except for the second number. And at the end instead of only selecting the needed character, you also add the previous one and the original letter.
Executing this code leaves us with three letters, and we have to check if one of the side letters matches the middle letter, but not both. This can be done by first deleting all triples, which is done by this piece of code:

m`(.)\1\1

And then count all doubles, stated by this piece:

m`(.)\1

This will give the correct results in every case.