I recently participated in the GSA ULTRA 2020 summer competition. I have been participating in GSA ULTRA competitions since 2018, but this was the first time that I seriously tried for the top prizes. Here I present my solutions, in what I consider to be more or less increasing order of difficulty.

As there were almost 20 problems in the competition, this is an extra long blog post. I tried to trim a lot of the statements and did not include the code so that the post is as concise as possible. The implementation is linked to at the start of every solution.


Annie and Betty play a variant of Rock, Paper, Scissors using special playing cards, each marked with either R, P or S. They start the game with a stack of n cards each. A stack of cards is represented as a string, where we consider the start of the string to be the bottom of the stack and the end of the string to be the top of the stack.

In each round they each show the top card from their stack:

  • The winning player puts both cards just played at the bottom of her stack (with the winning card at the very bottom)
  • In the case of a draw, each player puts the card she just played at the bottom of her own stack

As usual, R beats S, S beats P and P beats R; if both players have the same card, the round is a draw.

The game ends when either player runs out of cards. You should output the number of rounds until the game ends. It is guaranteed that the game will terminate in all cases.


  • a is a string representing Annie’s initial stack
  • b is a string representing Betty’s initial stack


  • $1 \leq \texttt{n} \leq 1,000$


This was probably the easiest of all the problems on the list, as it simply required simulating the process, for which I used 2 deques. You can view my code for details.
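A minimal sketch of the simulation (function and constant names are mine):

```python
from collections import deque

# winning (winner_card, loser_card) pairs
BEATS = {('R', 'S'), ('S', 'P'), ('P', 'R')}

def rounds_until_end(a, b):
    # start of string = bottom of stack, end of string = top
    A, B = deque(a), deque(b)
    rounds = 0
    while A and B:
        x, y = A.pop(), B.pop()   # top cards
        rounds += 1
        if x == y:                # draw: each card goes under its own stack
            A.appendleft(x)
            B.appendleft(y)
        elif (x, y) in BEATS:     # Annie wins: her card at the very bottom
            A.appendleft(y)
            A.appendleft(x)
        else:                     # Betty wins
            B.appendleft(x)
            B.appendleft(y)
    return rounds
```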

Hardcore Parkour

Can you help Peter Pouncer pick a perfect parkour path? Peter wants to run and jump his way through a series of platform columns in the shortest time possible. Each platform column consists of one or more platforms, each at a different height, specified as an integer number of metres. He will visit exactly one platform from each column, starting at the 0th column. Peter is free to choose which platform he starts on at the 0th column and which platform he jumps to at each subsequent column, so long as the next platform is not more than five metres higher or lower than the current platform.

The time it takes for Peter to visit each platform (including jumping time) is determined by his pace in “seconds per platform”, which is an integer between 1 (fastest) and 10 (slowest) inclusive. At pace p, it takes p seconds to visit a platform. Each time Peter jumps to a new platform, his pace will:

  • get faster by 1 if the new platform is lower and he isn’t already at the fastest pace of 1
  • get slower by 1 if the new platform is higher and he isn’t already at the slowest pace of 10
  • be unchanged otherwise

Peter’s initial pace is 5, so it will take 5 seconds to visit the 0th platform. If his first jump is to a higher platform, it will take 6 seconds to visit that platform, and so on.

You should output the total time in seconds for Peter to complete the course if he takes the best possible route. This includes the time to visit the 0th platform and the last platform.


  • ps is a list of tuples of integers, where each tuple represents a platform column, and each integer within a tuple represents the height, in metres, of a platform within that platform column


  • $1 \leq \texttt{len(ps)} \leq 10,000$
  • $1 \leq \texttt{sum(len(p) for p in ps)} \leq 200,000$


Passing all the test cases was pretty straightforward: a simple implementation worked. To get double points I used a number of optimization tricks, but none of them were particularly interesting. You can view the code here.
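The simple implementation amounts to a DP over (height, pace) states per column; a sketch along those lines (my reconstruction, without the double-points optimizations; names are mine):

```python
def parkour_time(ps):
    # state: (platform height, current pace) -> minimum total time so far
    INF = float("inf")
    cur = {(h, 5): 5 for h in ps[0]}  # initial pace 5, 5 seconds on column 0
    for col in ps[1:]:
        nxt = {}
        for (h, p), t in cur.items():
            for h2 in col:
                if abs(h2 - h) > 5:
                    continue  # jumps of more than five metres are not allowed
                p2 = min(10, p + 1) if h2 > h else (max(1, p - 1) if h2 < h else p)
                if t + p2 < nxt.get((h2, p2), INF):
                    nxt[(h2, p2)] = t + p2
        cur = nxt
    return min(cur.values())
```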

Jumping Jimmy

Jumping Jimmy is a flea. He has recorded the height of every jump he has performed.

Cynical Cedric the centipede is jealous because he cannot jump at all. So Cedric challenges Jimmy to answer queries on the highest jump he achieved during certain periods in his track record. Each query is a tuple of integers (low, high) and the answer to that query is the maximum height of the jumps with index between low (inclusive) and high (exclusive).

You have to help Jimmy by writing some code which answers these queries as quickly as possible. You should output the sum of the answers to the queries.


  • a is a list of all the jumps
    • each jump is an integer in the range [1, 100,000)
    • the length of a is in the range [1, 100,000)
  • qs is a list of tuples (low, high) where
    • $0 \leq \texttt{low} < \texttt{high} \leq \texttt{len(a)}$
    • the length of qs is in the range [1, 100,000)


This problem is fairly standard: just use any RMQ (range maximum query) data structure. Honestly, my code even overcomplicates the solution now that I look at it, but it does the job. You can view it here.
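For reference, here is one of the many possible RMQ structures, a sparse table (a sketch, not necessarily what my code does):

```python
def build_sparse(a):
    # sparse[k][i] = max of a[i : i + 2**k]
    sparse = [a[:]]
    k = 1
    while (1 << k) <= len(a):
        prev = sparse[-1]
        step = 1 << (k - 1)
        sparse.append([max(prev[i], prev[i + step])
                       for i in range(len(a) - (1 << k) + 1)])
        k += 1
    return sparse

def query(sparse, low, high):
    # max of a[low:high] in O(1): cover the range with two power-of-two blocks
    k = (high - low).bit_length() - 1
    return max(sparse[k][low], sparse[k][high - (1 << k)])

def answer(a, qs):
    sp = build_sparse(a)
    return sum(query(sp, lo, hi) for lo, hi in qs)
```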

Paper Cut

Polly runs a paper mill. Given integers a and b and a rectangle of paper measuring c * d, she wants to know whether she can cut it exactly into rectangles of size a * b with no spare paper left over. She uses a guillotine which cuts a rectangle into two smaller rectangles along a straight line orthogonal to the edges of the rectangle. For a given query (a, b, c, d), we say the value of that query is 1 if Polly can cut the paper as required and 0 otherwise.

Given a list of queries qs, return the sum of 2**i modulo (10**9+7) for queries i which have value 1.


qs, a list of tuples (a, b, c, d)


  • $1 \leq \texttt{len(qs)} \leq 1,000$
  • $1 \leq \texttt{a}, \texttt{b} < 1,000,000$
  • $1 \leq \texttt{c}, \texttt{d} < 1,000,000,000,000$



This was quite an easy problem that required a little bit of mathematical insight. Here are some observations:

  • We can divide a, b, c, d through by their common divisor without changing the result.

  • If a and b divide c and d in some order, we can clearly cut the paper

  • If $a*b$ does not divide $c*d$, we clearly cannot cut the paper

  • If none of the cases above cover the query, we can assume that $a, b$ are coprime (after dividing everything by the common divisor). Working through the maths shows that the query is satisfied iff one of the sides is divisible by $a*b$, and the other is of the form $x*a+y*b$ for natural numbers $x, y$.

With these observations we can calculate each query very quickly. Use logarithmic exponentiation to calculate the final sum.
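Putting the observations together, each query can be evaluated roughly like this (a sketch under my reading of the criterion; `representable` is the standard coprime coin-problem check, and the non-coprime early exit is my reasoning, not spelled out above):

```python
from math import gcd

MOD = 10**9 + 7

def representable(s, a, b):
    # can s be written as x*a + y*b with integers x, y >= 0 (a, b coprime)?
    x0 = s * pow(a, -1, b) % b   # smallest candidate x, if any
    return x0 * a <= s

def can_cut(a, b, c, d):
    g = gcd(gcd(a, b), gcd(c, d))
    a, b, c, d = a // g, b // g, c // g, d // g
    if (c % a == 0 and d % b == 0) or (c % b == 0 and d % a == 0):
        return True
    if (c * d) % (a * b) != 0:
        return False
    if gcd(a, b) != 1:
        # both sides of the paper would need the shared factor too,
        # which the reduction above has ruled out (my reasoning)
        return False
    ab = a * b
    return (c % ab == 0 and representable(d, a, b)) or \
           (d % ab == 0 and representable(c, a, b))

def solve(qs):
    return sum(pow(2, i, MOD) for i, q in enumerate(qs) if can_cut(*q)) % MOD
```

`pow(a, -1, b)` (Python 3.8+) gives the modular inverse, and the built-in three-argument `pow` handles the logarithmic exponentiation for the final sum.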

Two Quests

Barry Trotter, junior wizard, is applying for apprenticeships with two master wizards, Albus and Altrain. Each master wizard has set Barry a quest. Barry plans to complete both quests, travelling the minimum possible distance to do so. This will put him in a good position to negotiate the terms of his apprenticeship.

A quest consists of a sequence of tasks to undertake at locations along the line of the Great Valley. Each quest is described by a list of integers which indicates the locations of the tasks. Barry starts at location 0 and must travel from there to each task location in order. He can’t undertake any task of a quest until he has completed all preceding tasks in that quest but he can interleave tasks between the two quests to make the most efficient use of his travel time.

You should output the total distance travelled by Barry to complete both quests when taking the shortest possible route.


  • Two lists of integers, a and b, representing the two quests


  • The integers in the quests are in the range [0, 100,000)
  • The length of each quest is in the range [0, 2,000) and can be different for the two quests



A quick and simple quadratic approach works:

  • Let dp[i][j][q] be the minimum distance to complete the first i tasks from a, and first j tasks from b, the last task completed being from a if q = 0, and from b if q = 1.

  • The update for q = 0: dp[i][j][0] = min(dp[i-1][j][0] + abs(a[i]-a[i-1]), dp[i-1][j][1] + abs(b[j]-a[i]))

  • Analogous update for q = 1

  • Some care needs to be taken to correctly initialize the base cases; you can see the details in my code

  • The answer is then min(dp[len(a)][len(b)][0], dp[len(a)][len(b)][1])
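The recurrences above can be transcribed directly (a sketch with a virtual starting position of 0; names are mine):

```python
def two_quests(a, b):
    INF = float("inf")
    n, m = len(a), len(b)
    pa = lambda i: a[i - 1] if i else 0  # position after i tasks of quest a
    pb = lambda j: b[j - 1] if j else 0
    # dp[i][j][q]: min distance with i tasks of a and j of b done,
    # standing at a[i-1] if q == 0, at b[j-1] if q == 1 (or at 0 initially)
    dp = [[[INF, INF] for _ in range(m + 1)] for _ in range(n + 1)]
    dp[0][0] = [0, 0]
    for i in range(n + 1):
        for j in range(m + 1):
            for q in range(2):
                cur = dp[i][j][q]
                if cur == INF:
                    continue
                here = pa(i) if q == 0 else pb(j)
                if i < n and cur + abs(a[i] - here) < dp[i + 1][j][0]:
                    dp[i + 1][j][0] = cur + abs(a[i] - here)
                if j < m and cur + abs(b[j] - here) < dp[i][j + 1][1]:
                    dp[i][j + 1][1] = cur + abs(b[j] - here)
    return min(dp[n][m])
```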

Prime Feast

Patrick is a prime number eater. He likes to feast on a string s, consisting of the digits from '0' to '9' inclusive, by successively eating substrings of s subject to certain conditions on the values of those substrings parsed as integers. Eating a substring removes it from s.

He only eats prime numbers smaller than 10,000 and because smaller primes taste sweeter than larger primes, each prime he eats must be strictly smaller than the previous prime. Once there are no remaining primes to eat the feast is over. He takes care to maximize the sum of the primes he consumes in the feast.

Note that leading zeros have a bitter taste and Patrick will never eat them.

You must output the total of the primes eaten by Patrick. If there are no primes Patrick can eat then output 0.


  • s is a string consisting of the digits from 0 to 9.


  • $1 \leq \texttt{len(s)} \leq 100$



I thought this challenge might be interesting, but it had some really bad tests. My code works correctly most of the time, but it definitely does not handle all possible inputs.

My final algorithm is rather simple to understand:

  • Repeatedly find numbers that can be extracted from the string next, trying the largest values first, as those are likely to produce the largest answers

  • Identify situations early where we won’t be able to get a better result than we already have, using some combination of dynamic programming and knowing the value our parent is looking for

The first part is rather trivial. The second part is a lot trickier to get right, and I definitely did not (even though I think I know how I could). But it didn’t matter, as the tests passed anyway.

Basic idea

When calculating the solution to a subproblem, I pass the solver the number that I need to get in order to improve on the best global solution. I stop calculating once I can no longer find a better global solution. The solution is stored for future reference, in case we encounter the same string again.

There is a small issue with this solution: if we deduce that we can’t improve on the global solution now, it does not mean that we won’t be able to do so later, when we might need less to improve on the global solution. There isn’t a simple way to fix this (or at least I don’t know one), apart from using a different and slower pruning strategy. Luckily, unless the tests specifically target this solution, it is still very likely to pass, which is what ended up happening.

Getting double points

Funnily enough, my code wasn’t fast enough to get double points, even though I had already traded correctness for speed. The obvious thing to do was to see just how bad the tests were, and hence I tried pruning more cases by slapping some constants into an inequality. That worked, and I quickly doubled my points for the problem.

Moral of the problem: don’t assume the tests are good, sometimes a silly solution may just work well enough.


Mick can’t get no satisfaction… unless a certain logical proposition is satisfied.

In particular, given a proposition consisting of n variables A, B, C,..., the operators AND, OR and NOT, and parentheses, Mick would like to know for how many of the 2**n possible assignments of True or False to these variables the proposition is True.

The usual rules of precedence apply: parentheses have the highest precedence, followed by NOT, then AND, then OR.

You should output the number of possible variable assignments which make the proposition True, as an int.


  • n is an integer
  • proposition is a string


  • $1 \leq \texttt{n} \leq 26$. The variable names used in the proposition will be the first n uppercase letters of the English alphabet
  • $1 \leq \texttt{len(proposition)} \leq 1,000$



This challenge was a fun one, but not particularly difficult. It consisted of several parts:

  • Parsing the formula into an expression tree

  • Efficiently choosing the next variable to substitute

  • Simplifying the formula after substitution

  • Reusing answers using dynamic programming

Parsing and simplification

These are covered quite clearly in the solution code.
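For illustration, a recursive-descent parser respecting the precedence rules, plus a brute-force counter, might look like this (a sketch assuming space-separated tokens, not the competition code):

```python
from itertools import product

def parse(s):
    # precedence: parentheses > NOT > AND > OR (recursive descent)
    tokens = s.replace('(', ' ( ').replace(')', ' ) ').split()
    pos = 0
    def peek():
        return tokens[pos] if pos < len(tokens) else None
    def eat():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]
    def parse_or():
        node = parse_and()
        while peek() == 'OR':
            eat()
            node = ('OR', node, parse_and())
        return node
    def parse_and():
        node = parse_not()
        while peek() == 'AND':
            eat()
            node = ('AND', node, parse_not())
        return node
    def parse_not():
        if peek() == 'NOT':
            eat()
            return ('NOT', parse_not())
        if peek() == '(':
            eat()
            node = parse_or()
            eat()  # closing parenthesis
            return node
        return ('VAR', eat())
    return parse_or()

def count_true(n, proposition):
    # brute force over all 2**n assignments; fine only for small n
    tree = parse(proposition)
    def ev(node, env):
        op = node[0]
        if op == 'VAR':
            return env[node[1]]
        if op == 'NOT':
            return not ev(node[1], env)
        if op == 'AND':
            return ev(node[1], env) and ev(node[2], env)
        return ev(node[1], env) or ev(node[2], env)
    names = [chr(ord('A') + i) for i in range(n)]
    return sum(ev(tree, dict(zip(names, bits)))
               for bits in product([False, True], repeat=n))
```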

Choosing the next variable

The idea is to choose a variable to substitute that decreases the size of the formula, or the remaining number of variables in it, as much as possible. I chose what I think is a good and simple-to-calculate heuristic: the variable that appears the largest number of times.

Reusing answers

Every answer I calculate is also stored in a dictionary, so it can be reused in the future without recalculation. For a key I use the string representation of a formula, which I calculate during initialization. Additionally I store the number of variables still unassigned at the point of consideration of a formula. This is necessary, since the answer depends on the number of unassigned variables: in particular, it is proportional to $2^{\texttt{\#unassigned\_variables}}$.

Combining all these parts together produced my final solution, and it was enough to get double points for the task.


It’s a spacedog-eat-spacedog world! There are N spacedogs living in K-dimensional space. Each spacedog has an initial mass, given as an integer, and an initial position, given as a tuple of K integers.

At each time step, the smallest spacedog will fight whichever spacedog is closest to it (by the usual Euclidean distance), whereupon the larger spacedog will eat the smaller spacedog. This fight takes place at the midpoint between their positions at the start of the time step (with each coordinate rounded down to the nearest integer) where the larger spacedog remains at the end of the time step. The larger spacedog’s mass increases by the mass of the smaller spacedog at the end of the time step.

This process repeats until there is just one spacedog remaining. You should output the sum of the final coordinates of the last remaining spacedog.


  • masses is a list of integers
  • locations is a list of tuples, each of K integers


  • $1 \leq \texttt{N} \leq 10^4$
  • $1 \leq \texttt{K} \leq 5$

Inputs are chosen so that the smallest spacedog and its closest spacedog are guaranteed to be unique at each time step.



This question was not difficult conceptually: all we need is a data structure that allows us to add, remove, and find the closest neighbour efficiently. These are exactly the operations supported by a k-d tree! I couldn’t find a short, easy-to-understand implementation of a Python k-d tree online, so I made one myself.

To make my life easier, I made a few design choices:

  • All the points are kept in leaf nodes. This way the overall reasoning becomes simpler, as we only ever add or remove nodes that are leaves.

  • Each node keeps children that are bounded within a specific box, where the bounds are non-integer. This way there is no ambiguity about where a point is located, as it cannot lie on a boundary. We only store the coordinate at which we split the box for a node, and otherwise calculate the boxes dynamically for each function call.

The remainder of the solution is rather straightforward:

  • create a min priority queue where we order spacedogs by their weight

  • create a k-d tree which allows for efficient nearest neighbour queries

  • repeatedly remove the smallest spacedog from the queue, find its nearest neighbour and merge these together, updating the queue and the k-d tree appropriately

  • some care needs to be taken to ensure that the smallest spacedog hasn’t been merged already. For this I assign a unique index to each spacedog, and use a remaining dictionary to keep the indices of the remaining spacedogs. When a popped spacedog is not in remaining, we simply discard it.
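The main loop can be sketched as follows, with a brute-force nearest-neighbour search standing in for the k-d tree (names are mine):

```python
import heapq

def last_dog_coord_sum(masses, locations):
    remaining = {i: (m, tuple(p)) for i, (m, p) in enumerate(zip(masses, locations))}
    heap = [(m, i) for i, m in enumerate(masses)]
    heapq.heapify(heap)
    fresh = len(masses)  # next unique index for merged spacedogs
    while len(remaining) > 1:
        m, i = heapq.heappop(heap)
        if i not in remaining:
            continue                      # stale entry: already merged
        _, p = remaining.pop(i)
        # closest other spacedog by Euclidean distance (squared suffices)
        j = min(remaining,
                key=lambda k: sum((x - y) ** 2
                                  for x, y in zip(remaining[k][1], p)))
        mj, pj = remaining.pop(j)
        mid = tuple((x + y) // 2 for x, y in zip(p, pj))  # floored midpoint
        remaining[fresh] = (m + mj, mid)
        heapq.heappush(heap, (m + mj, fresh))
        fresh += 1
    (_, pos), = remaining.values()
    return sum(pos)
```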

Task Genie

Thomas has a number of tasks to complete. Each task has a time cost c, and a parent task p (except for the 0th task, which has no parent task and acts as the root of the tree of tasks). Thomas can only start a task after its parent task has been completed but otherwise there is no constraint on how many tasks Thomas can undertake simultaneously. The time it will take Thomas to complete all tasks is therefore the maximum, over all sequences of tasks from the root task to a leaf task, of the sum of the time costs of the tasks in such a sequence.

Fortunately, Thomas knows a genie who will grant him w wishes. Each wish has the effect of reducing the time cost of a task to 0.

You should output the minimum time for Thomas to complete all tasks, assuming he makes the best use of his wishes.


  • ts is a list of tasks represented as tuples of ints (p, c), where

    • p is the zero-based index in ts of the parent task
    • c is the time cost of the task

    For the 0th task in ts, p will be -1 to indicate that it has no parent task

  • w is the number of available wishes


  • $1 \leq \texttt{len(ts)} \leq 100,000$
  • $0 \leq \texttt{w} \leq 50,000$
  • $0 < \texttt{c} \leq 1,000$



Naive solution

There is a fairly straightforward solution, based on the following observations:

  • For each vertex v let cost[v][i] denote the cost of the subtree starting at v, given that we can remove i tasks in the subtree of v.

  • For leaves this is simple to calculate: cost[leaf][0] = c[leaf] and cost[leaf][i] = 0 for $i \ge 1$

  • Similarly for v which have just one child v': cost[v][i] = min(cost[v'][i-1], cost[v'][i] + c[v]), as we either use the ith zero on v itself, and i-1 zeros on subtree of v', or all i zeros on subtree of v'.

  • If there are at least 2 children, we can merge their costs: cost[merged][i] = min over all valid j of max(cost[c1][j], cost[c2][i-j]). This can be done in time $O(w \log w)$ using binary search.
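A sketch of this naive DP (using the plain quadratic merge rather than the binary-search version, for clarity; names are mine):

```python
import sys

def min_completion_time(ts, w):
    n = len(ts)
    children = [[] for _ in range(n)]
    for i, (p, c) in enumerate(ts):
        if p >= 0:
            children[p].append(i)
    sys.setrecursionlimit(300000)
    def cost(v):
        # cost(v)[i]: min time for the subtree of v with i wishes available
        c = ts[v][1]
        if not children[v]:
            return [c] + [0] * w
        merged = None
        for ch in children[v]:
            cc = cost(ch)
            if merged is None:
                merged = cc
            else:
                # best split of i wishes between the merged part and cc
                merged = [min(max(merged[j], cc[i - j]) for j in range(i + 1))
                          for i in range(w + 1)]
        # either spend a wish on v itself, or keep all i wishes below
        return [min(merged[i - 1] if i else float("inf"), merged[i] + c)
                for i in range(w + 1)]
    return cost(0)[w]
```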

Proper solution

A crucial thing to notice is that the above solution does a lot of unnecessary computation. We don’t actually need to calculate every cost[v][i] for every i. Instead we do the following:

  • At each vertex, maintain a priority queue with the time needed to complete the subtree of each of its children

  • When we want to apply a new genie wish:

    • we start from the root of the whole tree

    • at each node we can either apply the wish to the root of the current tree, or the subtree of the child that takes the longest time to complete currently

    • after considering both options, recalculate the best node cost according to a formula similar to the one in the previous solution, as well as the priority queues along the branches that we took

  • We apply the genie wish $w$ times, and the answer is the time needed to complete the whole tree at the end

More details can be found in the code itself.

Note: this solution is still quadratic in the worst case, and should fail tests where we end up with very long, linear-length branches. The tests were likely chosen randomly, and hence such an important edge case was missed, so this solution was good enough.


Explodium is an extremely powerful explosive, with just a few grams causing damage over a great distance. However, it is also extremely expensive! The Intergalactic Peace Force (IPF) is planning a bombing raid of targets on the one-dimensional surface of an alien planet and would like to minimise the amount of explodium they use.

Each target has a position and a strength (both integers). The IPF will drop explodium bombs, choosing for each bomb both a position and a “payload”, which is how many grams of explodium the bomb has. The bomb position and payload are floats.

The amount of damage inflicted on a target at position pos_target by a bomb at position pos_bomb with payload g is:

max(0, g - abs(pos_bomb - pos_target))

One target can receive damage from multiple bombs, in which case the damage adds up (e.g. a target receiving 3 units of damage from one bomb and 1.5 units of damage from another bomb would receive 4.5 units of damage in total). One bomb can cause damage to multiple targets, in which case the damage caused to each target is unaffected by the presence of the other targets.

A target will be destroyed if the total damage to that target is greater than or equal to the strength of the target. The bombing raid will be successful if all targets are destroyed.

You should output the minimum total payload required for a successful bombing raid, floored to an integer.


  • targets is a list of tuples, (position, strength)


  • $1 \leq \texttt{len(targets)} \leq 100,000$
  • $0 \leq \texttt{position} < 10,000,000$
  • $0 < \texttt{strength} < 10,000$



Inspecting the damage function

To visualize how bombs damage targets, imagine a bomb detonated at a with payload g. Then:

  • Draw a circle (or just an interval) at a of size g

  • A point falling inside the circle receives damage equal to its distance from the edge of the circle

  • A point outside the circle does not get damaged by this bomb

This way of thinking is useful when considering the joint effect of several bombs.

Using several bombs on the same target

Let’s say that we have 2 bombs that damage a particular target. Can we combine them into one, dealing at least as much damage everywhere with at most the same payload?

Yes we can! In particular, if the radii of the 2 bomb circles are $r_1$ and $r_2$, and the bombs are located at $A$ and $B$, we can replace them with a single bomb of radius $r_1 + r_2$, with a centre chosen as follows:

  • if we place the new bomb at $A$, its circle encloses the circle of the first bomb.

  • similarly, if we place the new bomb at $B$, its circle encloses the circle of the second bomb

  • since the original circles overlap, as we slide the new bomb’s centre from $A$ to $B$, at some point its circle will enclose both circles.

  • the damage is then at least as powerful at each point, and we have used the same amount of payload.

Overlapping bombs

So now we have reduced the optimal candidates to ones with at most a single bomb per target. What if 2 bombs still overlap? Once again, it makes sense to merge them, very similarly to before. As every target now gets destroyed by exactly 1 bomb, we can be slightly more efficient: replace the 2 overlapping circles with one that tightly encapsulates both of them. This produces a strictly better solution.

Non-overlapping bombs

Finally, notice that it does not make sense to combine non-overlapping bombs. If we did, we would need to encapsulate both circles produced by the bombs, which requires a larger payload than the combined payload of the 2 bombs.

Final solution

Using these observations it is quite clear that the following solution works:

  • Initially place a bomb on each target that has just enough power to destroy the target

  • While there are 2 overlapping bombs, combine them into a single bomb as described above.

This can be implemented in linear time (apart from the time to sort the points initially) if we consider the points from left to right. You can find more details in my code.
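The sweep can be sketched as follows: each target initially gets a circle of radius equal to its strength, and overlapping circles are merged into their bounding circle (names are mine):

```python
import math

def min_total_payload(targets):
    # circle for target (p, s): interval [p - s, p + s]; payload = radius
    intervals = sorted((p - s, p + s) for p, s in targets)
    stack = []
    for l, r in intervals:
        # merge with previous circles while they overlap
        while stack and stack[-1][1] > l:
            pl, pr = stack.pop()
            l, r = min(pl, l), max(pr, r)
        stack.append((l, r))
    # total payload = sum of radii, floored to an integer
    return math.floor(sum((r - l) / 2 for l, r in stack))
```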

Doubling Danny

Danny would like to be able to compute large powers of 2 so he knows what he is getting himself into. You will be given a list, a, of non-negative integers and should output the sum of 2**x for each x in a, reduced modulo a given prime, m.


  • a is a list of integers with $1 \leq \texttt{len(a)} \leq 5000$. Each x in a satisfies $0 \leq \texttt{x} \leq 2^{5000}$
  • the prime m, where $\texttt{m} \leq 2^{2203} - 1$



The obvious idea for the solution is to just use logarithmic exponentiation. The issue is that the numbers $a_i$ are absolutely massive, so every multiplication takes a significant amount of time, and we need to multiply a lot.

Another quick observation is that we can use Fermat’s Little Theorem to reduce the $a_i$ modulo $m-1$. However, this still doesn’t solve much, as though $a_i$ get approximately halved in binary length, they are still way too large.

Dynamic programming to the rescue

My implementation of the naive logarithmic exponentiation, one bit at a time, took around $2$ minutes to complete, which is significantly longer than the $50$ seconds I needed to pass the tests. But it wasn’t too far off: I only needed to speed up my solution by a small factor. Well, what if instead of constructing $2^{a_i}$ one bit of $a_i$ at a time, we constructed it $k$ bits at a time? Since the majority of the time is spent doing multiplications on very large numbers, this should reduce the number of multiplications proportionally to $k$, and hence improve execution speed!

So that’s what I did. I chose $k=8$ and divided bits of each $a_i$ into blocks of size $k$. For each block position (around $2000/8$ of them) and each possible bit arrangement ($2^8=256$ of them) we precompute the value that we need to multiply by if we encounter those particular bits in that position. Precomputation of these values can be done quite efficiently, with just a single multiplication per value.

The number of multiplications needed for the original solution was approximately $1000*5000$: we have $5000$ numbers of length around $2000$, and can expect around half of the bits to be set on average. In the improved version, we only need to multiply once every $8$ bits, which is an improvement by a factor of $4$. And precomputing the blocks is quite quick in comparison to the rest of the algorithm.
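A sketch of this block-based exponentiation ($k = 8$, with the Fermat reduction applied first; assumes m is an odd prime, and names are mine):

```python
def sum_pow2(a, m, k=8):
    # assumes m is an odd prime, so Fermat lets us reduce exponents mod m - 1
    exps = [x % (m - 1) for x in a]
    nbits = max(e.bit_length() for e in exps)
    nblocks = (nbits + k - 1) // k
    # tables[pos][j] == pow(2, j << (k * pos), m),
    # built with one multiplication per entry
    tables = []
    step = 2 % m                 # 2^(1 << (k * pos)), starting at pos = 0
    for _ in range(nblocks):
        row = [1] * (1 << k)
        for j in range(1, 1 << k):
            row[j] = row[j - 1] * step % m
        tables.append(row)
        for _ in range(k):       # advance step to the next block position
            step = step * step % m
    mask = (1 << k) - 1
    total = 0
    for e in exps:
        v = 1
        pos = 0
        while e:                 # one multiplication per k bits of e
            v = v * tables[pos][e & mask] % m
            e >>= k
            pos += 1
        total = (total + v) % m
    return total
```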


This solution was the final one I used, and it barely finished the tests in time. I did some further caching optimisations to reduce it to 45 seconds just in case, as the solutions were going to be rejudged after the competition, and called it a day.

Many other people managed to solve this challenge in almost no time, so I likely missed some trick to significantly speed up my solution. That being said, it is still a useful skill to make a hacky solution work by incrementally optimizing the bottleneck. I really enjoyed this challenge as it had a very simple statement, and optimizing operations on very large numbers is not something I get to do often.

Grand Hotel

Al O’Cator operates the best hotel in town, with n rooms numbered from 0 (inclusive) to n (exclusive), which are all initially unoccupied.

Al receives two kinds of requests:

  • A check-in request, which will include the number of rooms, m, required for a party of guests. Al will check those guests into m consecutive unoccupied rooms, marking them as occupied. Al will always check guests into the lowest-numbered rooms possible.
  • A check-out request, which will give the index, i, of an earlier check-in request. The index is zero-based and counts only check-in requests. The party from the ith check-in request will then leave the hotel and Al will mark their rooms as unoccupied.

It is guaranteed that Al will be able to satisfy all the requests.

You should output the sum, over all parties remaining at the hotel after all requests have been processed, of the check-in request index for the party times the lowest room number Al gave to the party.


  • n is an integer giving the number of rooms in the hotel
  • rs is a list of requests. Each request is either
    • a tuple ('I', m) signifying a check-in request for m rooms, or
    • a tuple ('O', i) signifying a check-out request for the party from the ith check-in request.


  • $1 \leq \texttt{n} \leq 100,000,000,000$
  • $1 \leq \texttt{len(rs)} \leq 1,000,000$



Naive solution

The obvious way to solve this challenge is to maintain a linked list of available and taken spaces, and do a straightforward simulation of the queries.

A check-in request can then be completed in linear time, whereas a check-out request can be completed in constant time (with the help of an auxiliary dictionary), for an overall quadratic complexity. This is too slow for $\texttt{len(rs)}$ up to a million.

Putting the linked list in a tree

The above solution doesn’t have to be completely scrapped; we just need a way to speed up the check-in requests. We can do that by wrapping the linked list in a balanced binary tree.

More specifically:

  • The linked list consists exactly of the leaf nodes in the usual order

  • Every leaf node stores its size and whether it is empty / which check-in resides in it

  • Every non-leaf node stores the largest empty space in its subtree

That is the essence of the data structure we need. We rebalance the tree after each query, so each query can be completed in logarithmic time. I chose an AVL tree as the underlying tree. Since I needed to update both the tree and the list during each query, the solution ended up being quite complicated, handling a lot of corner cases. You can find the details in the code.

Gone to Seed

Andy enters a tennis tournament for 2**k players. It is a knockout format with 2**(k-1) first-round matches, the winners of which play in 2**(k-2) second-round matches and so on until only two players remain to contest the final match (the kth round) to determine the winner of the tournament. The players are randomly seeded, i.e. the ordering of the players to determine which remaining players meet in each round is randomised uniformly.

Each player has a level of skill, which is an int between 1 and 20. In a match between players a and b with skills skill_a and skill_b respectively, player a wins with probability skill_a / (skill_a + skill_b) and player b wins with probability skill_b / (skill_a + skill_b).

You should calculate the probability that Andy wins the tournament as a fraction in lowest terms, then output the string concatenation of the numerator followed by the denominator.


  • skills is a dictionary mapping a player’s name to their skill level, e.g. Andy’s skill is keyed with the string 'Andy'


  • $1 \leq \texttt{k} \leq 4$, i.e. len(skills) is 2, 4, 8 or 16
  • $1 \leq \texttt{skill} \leq 20$ for each skill in skills



The first observation that we can make is that the number of players is extremely small. Up to 8 players we can pretty much brute force, so the only interesting case is when we have 16 players.
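For the small cases, the brute force can be as direct as averaging over all seedings with exact fractions (a sketch, fine up to 8 players; names are mine):

```python
from fractions import Fraction
from itertools import permutations

def bracket_win_probs(order, skills):
    # winner distribution of a knockout bracket with this fixed seeding
    if len(order) == 1:
        return {order[0]: Fraction(1)}
    half = len(order) // 2
    left = bracket_win_probs(order[:half], skills)
    right = bracket_win_probs(order[half:], skills)
    out = {}
    for p, pp in left.items():
        for q, pq in right.items():
            win = Fraction(skills[p], skills[p] + skills[q])
            out[p] = out.get(p, Fraction(0)) + pp * pq * win
            out[q] = out.get(q, Fraction(0)) + pp * pq * (1 - win)
    return out

def win_prob(name, skills):
    # uniform random seeding = average over all orderings of the players
    players = sorted(skills)
    perms = list(permutations(players))
    total = sum(bracket_win_probs(order, skills)[name] for order in perms)
    return total / len(perms)
```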

try #1

The obvious thing to do is to figure out the probability of Andy facing off against a specific player in each round by doing some dynamic programming. There is even a StackExchange question that has 2 answers with some promising formulas. Unfortunately, both of them are wrong. The probability of 2 players meeting in a specific round is quite difficult to calculate, as they cannot have played against the same opponents to get to that round. Very quickly this becomes equivalent to simply brute-forcing the solution.

try #2

The second option is to instead calculate the probability of a specific group of people getting through from one round to another. Then we can use conditional probabilities: the probability of a group of people getting to round $i$ is the probability of them getting to round $i-1$ times the probability of them getting from round $i-1$ to $i$. Hence the difficult part of the task becomes calculating the probability of the people getting from round $i$ to $i+1$, and particularly doing that for the first 2 rounds.

There are 12870 possible groups of 8 players in a 16 player tournament, and $2^8 = 256$ possible pairings once we know who goes through, for a total of around 3 million probabilities that we need to calculate. Keeping in mind that we are doing exact calculations with potentially large fractions, this will likely be too slow (although quite close to a solution that would suffice). We need some kind of additional optimization to reduce the number of calculations.

final optimization

The crucial observation is that we don’t need to consider each of the pairings individually. Let’s say that we have 8 players that are going to go through to the next round, and 8 players that will lose this round. Then we can partition the 8 losing players into 2 groups of 4: the first group will play against the 4 alphabetically smallest winners, the second group against the other winners. There are 70 such partitions of the 8 losing players, and if we precompute the values for 4 specific players to win against 4 other specific players, we can combine the answers in a single multiplication.

Similarly, we can reuse these precomputed values in the second round, when for each possible group of 8 players that went through we want to calculate the probability of each group of 4 players advancing to the 3rd round.

That’s the essence of the algorithm. With heavy caching I was able to reduce the total running time to around 35 seconds, which means that this dynamic programming optimization was indeed necessary: it reduced the workload by up to a factor of 4. You can find more details in my code.

Garden Path

Gertrude is laying a garden path with rectangular pavestones measuring 1x2 units. The path is width units wide and length units long. Certain square areas of the path will be occupied by plants. Gertrude wants to know how many different ways she can exactly cover the path (excluding the areas occupied by plants) using the pavestones.

For example, for the case width = 3, length = 4 and plants at coordinates [(0,0), (2,3)], there are five possible ways to pave the path, shown below.

You should output the number of ways Gertrude can pave the path mod 1,000,000,007.


  • width is the width of the path
  • length is the length of the path
  • plants is a list of 0-indexed coordinates of the plants


  • $1 \leq \texttt{width} \leq 6$
  • $1 \leq \texttt{length} \leq 1,000,000,000$
  • $0 \leq \texttt{len(plants)} \leq 1,000$
  • For each tuple (x, y) in plants:
    • $0 \leq \texttt{x} < \texttt{width}$
    • $0 \leq \texttt{y} < \texttt{length}$



Observations about input

  • The length of the garden is very large, so our algorithm will have to run in sublinear time in terms of length. This hints towards building up the garden path by repeatedly joining two pieces of size $2^n$.

  • The width, on the other hand, is extremely small. We can use the small width to describe the jaggedness of the sides of the garden path as we build it up. In particular, if we have built the path up to some length $k$, with possibly some tiles sticking out into position $k+1$, we can read the parts sticking out as a binary number: $1$ for sticking out and $0$ for not sticking out.
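As an illustration of this encoding (a minimal sketch, not the competition code), the function below enumerates, for a single column of height `width`, every outgoing jaggedness mask reachable from an incoming one. Bit $i$ set in `incoming` means cell $i$ is already covered by a domino sticking out of the previous column:

```python
def column_transitions(width, incoming):
    """All outgoing jaggedness masks after filling one column of the path."""
    results = []

    def fill(row, outgoing):
        if row == width:
            results.append(outgoing)
            return
        if incoming & (1 << row):
            fill(row + 1, outgoing)               # cell pre-filled: move on
        else:
            fill(row + 1, outgoing | (1 << row))  # domino sticking out right
            if row + 1 < width and not incoming & (1 << (row + 1)):
                fill(row + 2, outgoing)           # vertical domino in column

    fill(0, 0)
    return results
```

Iterating this column by column reproduces the classic domino-tiling counts (e.g. a $2 \times 10$ path has 89 tilings), which is a quick way to validate the transition table before adding the exponentiation machinery.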

Dynamic programming

Since we want to do binary exponentiation, we should precompute the number of ways to tile a garden of length $2^k$, for every possible jaggedness on either side. That is a total of around $30*36*36$ elements that we need to calculate, which is very much manageable (the exact details can be found in the code).

It is important to carefully define the length of a path. The length should include only one of the sticking-out parts: that way, when we join 2 paths over a sticking-out part, their lengths simply add up. I chose to treat the part sticking out on the right as belonging to the path, and the part sticking out on the left as belonging to the neighbouring piece on the left.

We also need to constrain the pieces that a path is allowed to place on the cells that are sticking out. In particular, we need to be able to tell which piece placed a vertical domino on the edge of two pieces (otherwise we might end up counting the same placement twice). A simple solution is to only place vertical dominos on the right side to avoid repetition.

Binary exponentiation over gaps

Now we are able to efficiently calculate the number of ways to cover the path of any size and any jaggedness, given no plants. How do we include plants in the solution?

  • Partition the path into intervals along its length, splitting at the positions of the plants.

  • For each gap perform binary exponentiation, since there are no plants in between.

  • At the end of each binary exponentiation make sure that none of the jaggedness values overlap with the plants. Remove those that do.

  • Finally add the plants to all the remaining jaggedness values: as far as the next interval is concerned, the plants are just part of the path calculated so far.
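The plant-free part of these steps can be sketched as ordinary matrix exponentiation over jaggedness masks. In this simplified illustration (my reconstruction, not the author's code), `T1[l][r]` is a hypothetical one-column transfer matrix standing in for the precomputed $2^k$ tables:

```python
MOD = 1_000_000_007

def join(A, B):
    """Join two path pieces: sum over the shared jaggedness mask at the seam."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for l in range(n):
        for m in range(n):
            if A[l][m]:
                for r in range(n):
                    C[l][r] = (C[l][r] + A[l][m] * B[m][r]) % MOD
    return C

def plant_free_gap(T1, length):
    """T1[l][r]: ways to lay one column going from mask l to mask r.
    Returns T1^length by binary exponentiation, i.e. a whole gap at once."""
    n = len(T1)
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    while length:
        if length & 1:
            result = join(result, T1)
        T1 = join(T1, T1)
        length >>= 1
    return result
```

For a width-2 path, entry $[0][0]$ of `plant_free_gap(T1, n)` recovers the familiar Fibonacci-like domino-tiling counts, which makes the doubling step easy to test in isolation.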


This was likely my favourite problem of the whole competition. It included a multitude of techniques and clever mathematical observations.

Snakes and Ladders

Bob makes board games for a living. Lately he has been focussed on Snakes and Ladders but he’s worried about how long the game will take to play. A game consists of an integer number of squares, N, a list of S snakes and a list of L ladders. Each snake or ladder is a tuple of ints (start, end). For snakes, end < start. For ladders, end > start. It is guaranteed that no square is the start or end of more than one snake or ladder, and the game begin square (0) and game end square (N-1) will not be the start or end of any snake or ladder. It is also guaranteed that the game can actually end: there are never six consecutive snake start squares.

Play for a single player proceeds as follows:

  • The player begins on the game begin square (0)
  • On each turn, the player rolls a normal die and moves forward the number of squares indicated by the roll:
    • If they land on the start square of a snake or ladder, they immediately move to the end square of that snake or ladder and this counts as the same turn
    • Landing on the end square of a snake or ladder has no special significance
    • If they land on the game end square (N-1) or get a roll which would exceed that square, the game is complete

You should output the expected number of turns it will take for a single player to complete the game, truncated (i.e. rounded down) to the nearest int.


  • N is an integer
  • snakes is a list of S tuples of two integers
  • ladders is a list of L tuples of two integers


  • $10 \leq \texttt{N} \leq 100,000$
  • $1 \leq \texttt{S} \leq 100$
  • $1 \leq \texttt{L} \leq 100$



Naively, we can look at the problem as a system of linear equations. More specifically:

  • Define m[i] as the square that you are sent to after landing on square i. So any start square of a snake or ladder gets mapped to the corresponding end square, and all other squares are mapped to themselves.

  • Additionally define m[N+i] = N-1 for i = 0, ..., 5 for convenience

  • For every square $i$ that is mapped to itself define a variable $v_i$ that is the number of expected throws until the game ends. Then we have:

    • $v_{N-1} = 0$

    • $v_{i} = 1 + \frac16\sum_{j=1}^6v_{m[i+j]}$ for other $i$, as we make one throw and land on one of the squares $m[i+1], …, m[i+6]$ with equal probability.

The best general linear-equation solvers run in worse than quadratic time in the number of variables, which is way too slow for the given constraints. We clearly need to exploit the structure and sparseness of the equations, and the small number of snakes and ladders, to solve the challenge.

Observation #1

There are 2 obvious ways to approach the problem: start from square $0$ and work upwards or start from square $N-1$ and work backwards. Clearly $N-1$ is the more useful choice, as we actually know the value of $v_{N-1}$.

If there were no snakes, we could simply calculate the value of $v_i$ once we know the values of $v_j$ for $j>i$. With snakes it becomes more complicated, as to calculate the node just below a snake start, we need to reference a node we haven’t calculated earlier: the snake end. Hence we need a more complicated representation of the value of a variable.

We maintain a representation of each variable, where each variable is only allowed to be the sum of a constant and a linear combination of snake ends below it. Maintaining this invariant will allow us to get a definitive answer for $v_0$, as there are no snake ends smaller than $0$.

How do we do this?

  • Denote each variable by a Counter object, in which we store the coefficients of the constant and the linear combination described above.

  • The Counter for $v_{N-1}$ is just an empty counter

  • The Counter for $v_{\texttt{snake\_end}}$ is $\{\texttt{snake\_end}: 1\}$ until we can properly calculate it.

  • Other Counters don’t need to be initialized

  • Choose some key for the constant value, like $-1$

  • Now calculate $v_i$ for i = N-1, ..., 0 according to the formula defined above

  • When we calculate $v_i$ for $i$ that is a snake end, we end up with an equation of the form $v_i = c*v_i + \texttt{remaining\_counter}$. We can move the terms around to get $v_i = \frac1{1-c}\texttt{remaining\_counter}$, and we can then use this to remove $v_i$ from any counters and maintain our invariant going forward.

Once we are done, we just return the value remaining in the Counter of $v_0$.
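The whole sweep can be condensed into a short sketch (my reconstruction from the steps above, not the author's code; it uses exact `Fraction` arithmetic and returns the exact expectation, to be truncated with `int()` for the final answer):

```python
from collections import Counter
from fractions import Fraction

def expected_turns(N, snakes, ladders):
    # m[i]: square you occupy after landing on i; overshoots count as N-1
    m = list(range(N)) + [N - 1] * 6
    for start, end in snakes + ladders:
        m[start] = end
    CONST = -1                           # key holding the constant term
    snake_ends = {end for _, end in snakes}
    v = {N - 1: Counter()}               # v[i]: {key: coefficient}
    for i in range(N - 2, -1, -1):
        if m[i] != i:
            continue                     # start squares are never occupied
        c = Counter({CONST: Fraction(1)})
        for j in range(1, 7):
            t = m[i + j]
            if t in v:                   # already computed: substitute it
                for k, x in v[t].items():
                    c[k] += Fraction(1, 6) * x
            else:                        # snake end below i: keep symbolic
                c[t] += Fraction(1, 6)
        if i in snake_ends:
            # v_i = coef*v_i + rest  =>  v_i = rest / (1 - coef)
            coef = c.pop(i, Fraction(0))
            c = Counter({k: x / (1 - coef) for k, x in c.items()})
            for d in v.values():         # eliminate v_i everywhere else
                if i in d:
                    xi = d.pop(i)
                    for k, x in c.items():
                        d[k] += xi * x
        v[i] = c
    return v[0][CONST]                   # exact expectation
```

On a plain board with no snakes or ladders this reduces to the closed form $v_0 = (7/6)^{N-2}$, which makes a handy correctness check.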

Observation #2

The solution above completed all tests in 11 seconds. To get double points I needed it to run in under 1 second, so I thought of the following additional optimization.

Clearly there is a very small number of snake and ladder squares in comparison to the normal squares. In particular, if we have the maximum possible $10^5$ variables, the average gap between “non-normal” squares is at least $250$. Therefore it would be nice if we could go through these gaps without considering every single element.

Let’s consider what happens if there are no snakes or ladders present. Our recurrent formula then becomes very simple: $v_i = 1 + \frac16\sum_{j=1}^6 v_{i+j}$. Using matrices we can then write:

$$ A \left(\begin{matrix} v_{i} \\ v_{i+1} \\ v_{i+2} \\ v_{i+3} \\ v_{i+4} \\ v_{i+5} \\ 1 \end{matrix}\right) = \left(\begin{matrix} v_{i-1} \\ v_{i} \\ v_{i+1} \\ v_{i+2} \\ v_{i+3} \\ v_{i+4} \\ 1 \end{matrix}\right) $$

where the matrix is:

$$ A = \left(\begin{matrix} \frac{1}{6} & \frac16 & \frac16 & \frac16 & \frac16 & \frac16 & 1 \\ 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ \end{matrix}\right) $$

In particular:

$$ A^k \left(\begin{matrix} v_{i} \\ v_{i+1} \\ v_{i+2} \\ v_{i+3} \\ v_{i+4} \\ v_{i+5} \\ 1 \end{matrix}\right) = \left(\begin{matrix} v_{i-k} \\ v_{i-k+1} \\ v_{i-k+2} \\ v_{i-k+3} \\ v_{i-k+4} \\ v_{i-k+5} \\ 1 \end{matrix}\right) $$

We can now use logarithmic exponentiation to multiply a logarithmic number of matrices, to get the result. Since the matrix has small dimensions, this is quite efficient.
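As a standalone illustration (a sketch with exact `Fraction` arithmetic, not the competition code), building $A$ and powering it lets us jump across a snake-free stretch in a logarithmic number of small matrix multiplications:

```python
from fractions import Fraction as F

# Build A exactly as above: one multiplication maps the state
# (v_i, ..., v_{i+5}, 1) to (v_{i-1}, ..., v_{i+4}, 1).
A = [[F(1, 6)] * 6 + [F(1)]]                                  # recurrence row
A += [[F(int(c == r)) for c in range(7)] for r in range(5)]   # shift rows
A += [[F(int(c == 6)) for c in range(7)]]                     # keep the 1

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(7)) for j in range(7)]
            for i in range(7)]

def matpow(M, e):
    R = [[F(int(i == j)) for j in range(7)] for i in range(7)]  # identity
    while e:
        if e & 1:
            R = matmul(R, M)
        M = matmul(M, M)
        e >>= 1
    return R

def advance(state, k):
    """Jump k squares down a snake-free stretch via A^k * state."""
    P = matpow(A, k)
    return [sum(P[i][j] * state[j] for j in range(7)) for i in range(7)]
```

Starting from the all-zero tail state $(v_{N-1}, \dots, 1)$ and advancing $N-1$ squares reproduces the plain-board answer $v_0 = (7/6)^{N-2}$ exactly.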

Now when we introduce snakes and ladders to the equation, a similar idea works, but now we do logarithmic exponentiation on the gaps between “non-normal” squares, maintaining the current state vector, and for the snake / ladder squares we do the following:

  • For snake / ladder starts at some square $i$, we replace $v_i$ with $v_{m[i]}$ in the state vector

  • For snake ends we do the same calculations that we did in the previous solution, and remove this variable from any counters that we are storing

  • For ladder ends we store the calculated variable as we are going to need it in the future when we encounter the respective ladder start.

This solution was enough to get me the double points that I was trying for. You can see the implementation details in my code.


Did I enjoy the competition?

Absolutely! It was incredibly fun, and I did a lot more than I initially expected. My initial impression was that I could do maybe half the problems if I spent a lot of time on them. I consider finishing the competition with only 2 unsolved problems to be a huge success, even if I didn’t get the big prizes (I did get one of the 100 pound spot prizes for solving one of the problems though!).

Would I do anything differently now?

I would definitely be more serious about going for the prizes at the beginning of the competition. By the end it was incredibly difficult, as I still had several questions to complete and less than a day to complete them.

Another thing that I wish I had done is double-check that I hadn’t missed any easy points. It turns out that I was rather close to getting double points for one of the questions, but had completely forgotten about that question’s existence. Doubling my points in that question would have allowed me to choose a different strategy at the end, and I could have been a lot closer to getting prizes!

Was it worth participating in?

As far as monetary rewards go, not really. 10 pounds per day is not a good payoff by any measure. But as far as programming competitions go, this one was about as worthwhile as it gets.

  • The competition was incredibly fun due to the ever-changing, dynamic scoreboards.

  • I learned a lot of Python optimization techniques, and the power of dynamic programming when trying to make certain problems fit in the required 50 seconds, or to get double points from a problem.

  • I got quite a bit more debugging experience, as some of the problems took me up to a whole day to implement correctly. In particular, I had to write quite a few test generators to test both correctness and speed.

  • I got a lot more confidence in myself and expect to do significantly better in the next GSA ULTRA competition.

  • Writing this blog post was also a fairly fun challenge, even though it took ages. I hope that writing such posts will improve my explanation skills in the long run.

Final thoughts

Hope I didn’t write this post for nothing. If you, the reader, are someone trying to learn competitive programming, I hope that you found my explanations useful. I really do think that many of these problems (especially the harder ones) required some cool and useful techniques, which is why I decided to do this writeup in the first place. Way too often I read competitive programming writeups that skip many of the details or the proofs, so I tried to make one that is understandable with only a small mathematical background. Let me know how I did!