Friday, 13 November 2020

Exponential Coins

There is a population of 1 million people, of which some are infected by a disease. Each two weeks the following happens:

  1. Every infectious person has 50% chance to be cured.
  2. Each infectious person infects 2 other healty people (if healty people still exist).

What is the change the disease will eventually be eradicated?

A more interesting (and at the time of the writing of this blog, actual) way to look at this problem other than the abstract notion of coins in a jar, we can model as a pandemic spreading through a population.

Suppose a jar that can hold a million coins, with a certain number of coins in it. You can do the following procedure:

  1. Take out all coins
  2. Flip all of them
  3. Count number of heads
  4. Put 3x the number of heads back into the jar.
    If the coins to not fit in the jar, we only leave a million, and toss the rest away

If we start with a jar with a single coin in it, what is the chance that we eventually the jar will be empty.

I would like to note that this thought experiment, even tough -as we will see- will have some properties in common with how a real pandemic spreads, is a vast oversimplification and can in no way to be used to model a real pandemic. The interesting quirk of probability that is displayed here most certainly does not work in real life.

Infinite case

First, let us take a look at the infinite case, so where there is no limit on the size of the jar.

Expected value

An observation that can be made is that the expected value of the number of coins gets multiplied with 1.5 after each step. This might seem obvious, as for example if we have exactly 10 coins, it is easy to see that we will have an expected value of 15 coins next step, but the truth is even stronger, as for example, if we would have 50% change to have 6 coins, and 50% to have 14 coins, the expected value is 0.5×1.5×6+0.5×1.5×140.5 \times 1.5 \times 6 + 0.5 \times 1.5 \times 14 which is still 15. It works in general for any given distribution, that after each step, the expected value will increase by 50%.
This means that the expected value will grow to infinity, and specifially, the expected value after ii steps is
Ei=(32)iE_i = \left(\frac{3}{2}\right)^i

Jar duplication

Imagine instead of a single jar, we have multiple jars that we apply the process to. So for each jar, we take out all the coins, flip them, count the heads, multiply by three, and put them back in the original jar. It is easy to see that this process is equivalent to combining all the different coins in a single jar, as the total number of coins added back in is equal to the total number of flipped heads.

This means that a jar with nn coins can be considered as nn jars, with each a single coin in it. The total jar will be empty only if every jar ends up empty, and all jars are independant.

So if the chance a jar with a single coin ends up empty is called pp, then the chance that a jar with nn coins ends up empty is pnp^n, as it is the same as nn independant jars ending up empty.

Chance

As pp is a global independant property, it is conserved with our porcedure. After one step of our porcedure, we have 50% chance to become empty (so a chance of 1) and 50% to end up with three coins. This means that pp follows the following equation:
p=12(1+p3)p = \frac{1}{2} \left(1+p^3 \right)

The equation has three solutions, 11, ϕ1\phi-1 and 1ϕ-1-\phi, with ϕ\phi the golden ratio. The last solution is smaller then zero, so that is impossible. It might be natural to scrap the solution of 11, and keep ϕ1\phi-1, and while it is correct, the mathematical reasons why this is actually are quite complex.
An important note to make is that most of the chance to empty the jar (or eradicate the disease) happens when the number of coins is low. For example, if there is one coin, there is already a 50% chance that the jar will end up empty after the first step, while the chances of ending up empty after any other number of steps is only about 23% chance of successfull eridication if it fails the first step.
Even worse, if we ever reach a million, the chance to ever empty the jar drops to (1ϕ)1000000(1-\phi)^{1000000}, which equals about 2×102089882 \times 10^{-208988}, so basically impossible.

Finite case

One might reasonably assume that, as most of the chances to empty the jar are contained in the low numbers, that adding a constraint to the maximum number of coins in the jar will only change the chance slightly. One would be wrong. To show this, let us look at a slight variant of the problem:

You have a jar that can hold a million coins. Each step you do the following procedure:

  1. Take out all coins
  2. Flip all coins
  3. If there is at least one head, refill the jar again up to a million coins.

It is easy to see that the jar will always contains more coins then in the original problem, so it is a stronger version, with a lower chance to eradicate.

Markov chain

The above problem can be represented by a Markov chain with two possible states: a million coins, or no coins. As the chance of going from a million coins to no coins is nonzero (But very small, about 103010310^{-30103}) and constant, the chance to eventually have an empty jar follows from the following equation:
(1p)=(11030103)(1p)(1-p) = (1-10^{-30103})(1-p)
1030103(1p)=010^{-30103}(1-p) = 0
And even tough that the coefficient of (1p)(1-p) is very small, it is nonzero, so pp has to be one. This means that even this extreme system, were a million coins are added, there is still a 100% chance (almost surely) To end up with zero coins. On the other hand, the expected number of steps has to be lower than 103010310^{30103}.

Infimum

There is a chance of 2ϕ2-\phi that the ininite system grows beyond one million (because it needs to not surely finish, and it had a chance of ϕ1\phi - 1 to not finish.) This means that this is also the chance that our finite system reaches a million. If it has reached a million, it has a chance of pp to get out of this state of a million. We can calculate p as follows:
pp is the chance that a million coins throw less then 333333 heads. This folows the binonial distribution, but is estimated really well in this region by the normal distribution with avarage 500000 and standard deviation of 500. The zscore of 333333 in this normal distribution is (333333-500000)/500= 333. The p value of the left tale is: 102408310^{-24083} which means that in 30% of the cases it takes at on avarage more than 102408310^{24083} steps, not including the steps to get to one million, and the steps to get back to zero afterwards, and the large chance that when you go below a million, you will probably immedaitly get stuck again. This means that the expected value has to be bigger than 102408210^{24082} steps.

Conclusion

The combination of this means that the expected number of steps falls between 102408310^{24083} and 103010310^{30103}. We could narrow this further down, but both numbers are so unphantomably huge, that there will be no point, just rest assured that even if there is certainty that the system will end up without any coins, it will take a very long time.