If you keep tabs on birthday notifications on Facebook, you’ll realize that there are several friends of yours who share a birthday. If so, what are the chances of two randomly chosen friends having the same birthday? This is a well-studied problem in statistics known as the Birthday Problem.

The premise of the Birthday Problem is simple. If N number of people are in a room, what is the probability of two of those people sharing a birthday? I decided to investigate. Straight to my Facebook account, I collected 628 friends’ birthdays, wrote a script and bingo, I had a graph.

Probability of Two People Sharing a Birthday

As shown on the graph above, there is a 0.5 probability of sharing a birthday in a group on 24 people and a 1.0 probability in a group of 70. The interesting bit of this problem is the fact that most people expect a group larger than 365 people to have a chance of sharing a birthday.

You can do this yourself. Go to Facebook and select 70 random friends. You’ll be guaranteed at least two of them share a birthday. This is not just fancy recreational mathematics, a lot of computing, engineering and scientific processes rely on this concept.

In medicine, the concept is used to calculate Class Phenotype Probability – the chance of two people sharing the same blood-type. This particularly important in calculating the likelihood of finding matches between donors and recipients in blood transfusion.

Since the US has had 44 presidents, according to my graph there's a probability of 0.89 that two of them will share a birthday. Guess what? Warren Harding (29th presidents) and James K. Polk (11th president) both have a birthday on November 2nd.

As I have pointed out earlier, in a group of 24 people, there’s a 50-50 chance of two having the same birthday. In a room of 75

there’s a 99.9% chance of two people matching.

In cryptoanalysis, a code-breaking technique known as the birthday attack utilizes the same concept to find chances of a pattern repeating itself in an encryption scheme. Using this probabilistic model, a programmer can store the results from previous attempts to decipher a hash string in memory in order to decrease overall processing time when attempting to crack a digital signature.

Each time you add one person to the set of people you are looking for duplicate birthdays in, you are looking for duplicates against all the people already in the set, not just one of them.

The same technique is used to look for conflicts in one-way functions. The attack looks for a shared hash value, while at the same time creating another set of other hash values. Using the birthday example, the search works by looking for people's birthdays that correspond to one date while creating a database of other people's birthdays that can then be used for future comparisons.

The comparison works by finding values that do not match those that you have already checked and remembering these so that the comparison does not have to double back to check against values already eliminated.

An attacker is able to compare the values they have against a set of password hashes that they know the passwords for. It may take a long time, but eventually the hacker will find a password that matches.

To protect yourself against the Birthday Attack, ensure the security of the transmission medium. This denies any potential attackers a point of reference that they can use to initiate a hash search. If they don't know how many characters the password has, it's harder for them to initiate a brute force attack.

To secure your communication, you can send your messages through an encrypted tunnel created through a secure shell (SSH) protocol, and it will be much harder to intercept.

Encrypt any messages that contain digital signatures. The recipient of any communication that contains a digital signature can have an asymmetric encryption key, and the sender can encrypt the message that contains the signature. In such a case, only the recipient with the key can decrypt and decode the message.

## Comments