Beginner’s Guide to Data-Based Thinking

“هتبدأ منين؟ دي البلد لازم تتفور كلها عشان نبدأ من جديد”

“Where to begin? The whole country needs to be destroyed so we can start anew.”

The general sentiment of hopelessness is an accepted fixture of any conversation surrounding development in Egypt, and a favorite of the older generation. At worst, it is used to deliberately dismiss the possibility of achieving progress; at best, it signals a transition from constructive discussions that inch towards improving the status quo to something only slightly less overwhelming, like what to do Thursday night or the performance of the national team (“so are we ever going to qualify for the Africa Cup?”).

It is only natural that huge problems tend to overwhelm us. As humans, we have a tendency to avoid trying when it appears that even the sum total of our most valiant efforts will be but a drop in an ocean of failure. Perhaps this explains why people who call out others for littering are often ridiculed (“Haha, as if this one is going to make a difference?”).

A recent experiment at the University of Pennsylvania illustrates this tendency with supreme clarity. Two groups of students were each given $5 to fill out a short survey, after which each group was shown a different flyer asking them to donate to a good cause through one of the world’s leading charities. Try and guess which one of the two flyers was more successful.

Excerpts from experiment

Perhaps predictably, the second excerpt led students to donate more than twice as much money as the first excerpt did ($2.83 versus $1.16).

In another example, scientists asked people to say how much they’d pay to save birds affected by the Exxon Valdez oil spill. They discovered that the number of birds affected mattered, but not in the way they had expected.

In fact, the amount of money people were willing to donate actually decreased if scientists said 200,000 birds were affected, as opposed to 20,000. It’s an “ooh that’s interesting” moment until you realize how absolutely insane this is. People’s willingness to help can be inversely related to the magnitude of the problem faced. The fact that the 200,000-bird scenario is a tragedy affecting TEN times as many animals (and probably needs significantly more resources) causes participants to donate LESS.

These examples illuminate a decent amount about human nature and drive home the point that as humans we are emotionally driven decision makers and that facts do little to convince people, something that advertisers and the Egyptian media figured out long ago (“I’m craving pizza!”).

More importantly they suggest that data-based thinking comes unnaturally to most people.

While only mildly tragic on the level of the individual, the repercussions are massive when policy makers on the national level adopt anecdotes, opinion, and conjecture of the most unscientific nature as the basis of policy making. Examples of arbitrary policy making in Egypt are too many and warrant a separate post.

For now, below is a quick beginner’s guide to sifting through the data pollution we’re exposed to on a daily basis.

1. Lazy Averages

The same way you can drown in a river that is on average 5cm deep, if certain parts of it are really deep, you and hundreds of others can live in inhumane conditions in a country where the national income per capita is about $50,000 per annum, if the distribution is unequal (take a bow United Arab Emirates).

Ways in which it messes up society: a single number should never be used to encapsulate ultimate truth, and those presenting it need to contextualize. Which would win a race, a supercar with an average top speed of 300 km/hr or a Vespa with an average top speed of 70 km/hr, assuming no headwind? You know only one metric but have no context to answer the question. Who is driving? Where are they driving? I’d bet on the Vespa if they’re racing through downtown Cairo. Unless the rules say you’re allowed to run over people, then I’m back on team supercar.

The previous example was obviously a simplification but single aggregated averages are used all the time to give flattering/unflattering accounts at will.

At one point, Egyptian Electricity executives dropped a statistic that people’s complaints about power cuts were overblown because on average over the course of a year a household would lose 30 mins of power a day.

That is hardly unbearable you spoiled citizens…damned Hollywood has you believing you deserve 24 hrs of power supply rather than 23.5 hrs.

Giving them the benefit of the doubt, let’s assume that this number is actually correct. 0.5 hrs/day x 365 days = 182.5 hrs of power loss annually. But given that these power cuts take place mainly during the two hottest months of the year, these 182.5 hours are distributed over 2 months, not 12.

So 182.5 hrs / 61 days = 2.99 hrs/day. At 3 hours of power loss per day it’s starting to get really annoying but still not unbearable, on average. Still, we’re assuming a nice distribution of equal 3 hour power cuts for two months, and the reality is likely to be far from that. Given that food is safe in fridges for only about 4 hours without power, we’re teetering awfully close to destructive levels of power cuts, all while making very conservative estimates.
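For anyone who wants to rerun the back-of-the-envelope math, here is a minimal Python sketch. It assumes a 365-day year and a 61-day hot season (two months). The constant and function names are made up for illustration, and the 4-hour fridge figure is simply the rule of thumb quoted above.

```python
# Claimed figure from the electricity executives: 0.5 hours of outage per day.
CLAIMED_AVG_HOURS_PER_DAY = 0.5
DAYS_PER_YEAR = 365        # assumption: a non-leap year
PEAK_SEASON_DAYS = 61      # assumption: the two hottest months
FRIDGE_SAFE_HOURS = 4      # rule of thumb quoted in the text


def concentrated_daily_outage(avg_hours_per_day, days_per_year, peak_days):
    """Pack a whole year's outage hours into the peak-season days only."""
    annual_hours = avg_hours_per_day * days_per_year
    return annual_hours, annual_hours / peak_days


annual_hours, peak_daily_hours = concentrated_daily_outage(
    CLAIMED_AVG_HOURS_PER_DAY, DAYS_PER_YEAR, PEAK_SEASON_DAYS
)
margin = FRIDGE_SAFE_HOURS - peak_daily_hours

print(f"Annual outage: {annual_hours:.1f} hours")
print(f"Daily outage over {PEAK_SEASON_DAYS} days: {peak_daily_hours:.2f} hours")
print(f"Margin before the fridge limit: {margin:.2f} hours")
```

Changing `PEAK_SEASON_DAYS` is a quick way to see how fast the "harmless" yearly average turns ugly once the cuts are squeezed into fewer days.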

The ridiculous part is that these are the numbers they gave us but they mean something entirely different when we take a minute to think about them.

Ways to overcome it: Make sure to ask for segmented data (What do the figures look like for the top 10%? Bottom 10%? etc.) or ask for visual representations of the dataset. Alternatively, ask for measures of dispersion such as the standard deviation or variance to get an idea of the spread.
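To see why a lone average hides so much, here is a small Python sketch using the standard `statistics` module. The ten incomes are invented purely for illustration (they are not real data for any country); they were picked so that the mean lands on 50, echoing the per-capita figure mentioned earlier.

```python
import statistics

# Hypothetical annual incomes, in thousands of dollars, for ten residents.
# Invented for illustration only.
incomes = [4, 5, 5, 6, 6, 7, 8, 9, 50, 400]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)
spread = statistics.stdev(incomes)  # sample standard deviation

# Segment the data: how many people actually earn less than "the average"?
below_mean = sum(1 for income in incomes if income < mean_income)

ordered = sorted(incomes)
bottom_half_mean = statistics.mean(ordered[: len(ordered) // 2])

print(f"mean: {mean_income}, median: {median_income}")
print(f"standard deviation: {spread:.1f}")
print(f"people below the mean: {below_mean} of {len(incomes)}")
print(f"mean of the bottom half: {bottom_half_mean}")
```

In this toy dataset the median sits far below the mean and the standard deviation is larger than the mean itself, which is exactly the kind of warning sign worth asking for before trusting a headline average.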

2. Anecdotal Evidence

An evil so big it probably deserves its own post.

Anecdotal evidence is the reason some people still think smoking is harmless because their grandmother (may God protect her) lived to 108 smoking 2 packs of Marlboro Reds a day and washing them down with a glass of Double Black.

In extreme cases where the sample size is 1, it is essentially equivalent to flipping a coin once, getting heads, and proclaiming that everyone who will ever flip a coin will get heads ad infinitum.
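The coin analogy is easy to try out. The sketch below is a rough simulation, assuming a fair coin and a fixed random seed so the run is repeatable; the function names are made up for illustration. It repeats the "flip and generalize" experiment many times at different sample sizes and reports how far apart the resulting estimates land.

```python
import random


def estimate_heads_rate(sample_size, rng):
    """Flip a fair coin `sample_size` times; return the observed share of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(sample_size))
    return heads / sample_size


def spread_of_estimates(sample_size, trials=200, seed=0):
    """Repeat the experiment `trials` times; return the lowest and highest estimate."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    estimates = [estimate_heads_rate(sample_size, rng) for _ in range(trials)]
    return min(estimates), max(estimates)


for n in (1, 10, 1000):
    low, high = spread_of_estimates(n)
    print(f"sample size {n:>4}: estimates range from {low:.2f} to {high:.2f}")
```

With a sample size of 1, every "study" concludes either 0% or 100% heads. Only as the sample grows do the estimates start to crowd around the true 50%.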

Sounds absurd when you put it that way but you’d be surprised how pervasive this line of thinking is in our daily lives. While they may be fun and entertaining around campfires, anecdotes are where truths go to die.

Even without sufficient evidence, we struggle to not have an opinion on how and why things happen. Notable examples include almost every world religion. The Sun moves across the sky, we have no idea how or why that happens so until anyone can disprove our theory, we’ll believe that a Scarab rolls the Sun like a flaming ball of dung, said the Ancient Egyptians.

Ways to overcome it: Make it a habit to look at the big picture and in the absence of a bigger set of examples, refrain from judgment. Look for numbers, controlled experiments, survey data, studies on a national scale, and reports to help build opinions.

Ways in which it messes up society: Sexism is not an issue in the workplace because Marissa Mayer is the CEO of Yahoo (or because my aunt holds a senior position at a multinational bank). Racism doesn’t hinder job prospects in America because Obama is the President of the United States. Racism is not real because my friend says that he’s never faced a difficulty because of his race.

3. Correlation without Causation

Fact: When ice cream sales increase so do forest fires.

No, you don’t set a tree on fire with every scoop at Stavolta, you’re just more likely to have one in the summer, which happens to be when fires tend to happen. Just because an increase in one coincides with an increase in the other, does not mean that there was any causation. The ambiguity surrounding this simple fact is the breeding ground of conspiracy theorists.
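Here is a rough Python sketch of the ice cream and forest fire story. Everything in it is a made-up model, not real data: temperature follows the seasons, and both ice cream sales and fires are generated from temperature plus random noise, with no link between the two. The helper names (`pearson`, `remove_trend`) are my own, written out by hand so the example has no dependencies.

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def remove_trend(ys, xs):
    """Subtract from ys its least-squares straight-line fit on xs."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = cov / sum((x - mx) ** 2 for x in xs)
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


rng = random.Random(42)  # fixed seed so the run is repeatable

# Made-up model: BOTH series depend on temperature, not on each other.
temperature = [25 - 15 * math.cos(2 * math.pi * day / 365) for day in range(365)]
ice_cream_sales = [5 * t + rng.gauss(0, 20) for t in temperature]
forest_fires = [1 + 0.2 * t + rng.gauss(0, 1) for t in temperature]

raw = pearson(ice_cream_sales, forest_fires)
adjusted = pearson(
    remove_trend(ice_cream_sales, temperature),
    remove_trend(forest_fires, temperature),
)

print(f"ice cream vs fires, raw: {raw:.2f}")
print(f"ice cream vs fires, temperature removed: {adjusted:.2f}")
```

The raw correlation should come out strongly positive, while the version with temperature stripped out should hover near zero. The scoops and the fires only move together because summer moves them both.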

Ways in which it messes up society: Below are photos of the same man in Afghanistan, in Bosnia and Herzegovina, in Sudan, in Israel and in two other conflict-ridden regions meeting important leaders.

Author Bernard Henri Levy

What did you say?

He was also in Egypt in January of 2011?

That’s suspicious…

Bernard in Egypt

Picture from a Chain Email that labelled this man as the leader behind the Arab Spring — the one man instigator.

The email suggests that this is the one-man instigator behind all the major revolutions that took place in the last two decades, leaving behind a trail of destruction and bringing death and despair to what were supposedly peaceful, utopian areas.


Alternatively, Bernard-Henri Levy is the sleek blazer-wearing public intellectual and author with over 20 published works on revolutions that his website and Wikipedia page claim him to be.

In this case nobody really knows, but it seems infinitely more likely that this is a man, like many others, who is attracted to revolutionary events rather than the one-man show who sets them into motion, ushering in decades of destruction and social strife. It’s not like the regions involved had a whole slew of the conditions that could lead to conflict, like poverty, massive unemployment amongst a burgeoning and young population, and heightened ethnic and social tensions.

In any case, the fact that foreigners were attracted to Tahrir Square during one of the most momentous events in recent Middle Eastern history should not be interpreted as evidence that foreigners caused the revolution. At least one person in government failed to realize that and capitalized on the xenophobic sentiment by launching the infamous “ohh rllly?” ad.

Ways to overcome it: Causality is much more complicated and much less linear than we assume, so the best advice I can give is to refrain from making causal statements a la “Bernard-Henri Levy is the man behind the world’s turmoil.”

Let’s say we have an apple tree. An apple ripens and it falls to the ground.

Why did the apple fall?

Some would say gravity, some might say the stem got weaker as the apple ripened, while others would say it’s the breeze. Each of these answers is incomplete: the apple would not have fallen without the combination of the gravitational pull, the weakness of the stem, and the added rustle from the wind. So how can we attribute the event to a single cause?

4. Bull---- Figures

95% of all statistics are made up on the spot, just like this one.

Like correcting grammar, challenging data is both a good habit and an excellent way to lose a lot of friends, quickly.

If the cab driver who took me home the other day is to be believed, 90% of Egyptians drive cars, and his recommendation was that I leave the country before the other 10% get cars and the city streets get clogged beyond repair. Partly because his belief in the number was so strong, I didn’t have it in me to challenge it. Moreover, the fact that he was downright terrified of the potential 10% increase made it seem almost cruel to inform him that the real number is closer to 13%, with much more room to increase than he thought.

Ways in which it messes up society:

Ways to overcome it:

It would probably be a combination of three things:

  1. Build up an internal database of value ranges to better contextualize data. The same way you know that a human body temperature of 48 degrees Celsius is absurd because you’re aware of what the standard range is, you should try and develop a working knowledge of ranges for all sorts of things. Next time you read a number-heavy article, don’t gloss over the numbers.
  2. Develop a habit of guesstimating. Make educated guesses and verify them against reality when you can. It’s like taking that game where you try and guess the bill when out with friends, and playing it all the time.
  3. Ask for sources and call people out on their facts and figures (numbers not body-shapes). How do you know that I didn’t lie to you about the 13% car ownership number?

Cover Image: Tahrir Square | Ramy Raoof