Open Data in South Africa and Beyond

Every day, every hour, every minute, every second of our lives, we are creating data. Consciously or unconsciously, everyone is creating tons of data just by being alive.

Unconsciously, as a citizen of South Africa, you are a data point that moves and evolves: consuming electricity at some rate, using water at another, and being transported from point A to point B by some form of transportation.

Eskom Open Data Image Credit: Open Data for Africa by AfDB

Consciously, the data collection is more direct: we occasionally get sick, visit the doctor and get scans. Our cars break down, and we get them checked out and fixed.

What happens to all this data?

Do you have access to it?

Do you have a right to it?

What possible things can be done with this data?

These are not easy questions to answer or get answers to. There is a movement, though, to make data open. Open Data, to put it simply, is a movement to make data available to everyone without any restrictions or fees.

I will separate the use of the data from access to it. A (relatively) simple argument to make is for data collected with public money (taxes, rates, etc.).

Public Data

So let's say your municipality gathers and stores data about the amount of water consumed by every household on a daily basis.

City Ranking on Piped Water Inside Dwelling Image Credit: Statistics South Africa

This data might be used for their planning or monitoring. There remains the question of other potential uses of that data by other organisations. To get this data, one would have to know about its existence and go through a process of requesting access to it, which normally means jumping through hoops and accepting limitations.

If the data was collected by an institution that is funded by the public (a government institution, paid for by taxes), then shouldn’t the data automatically be open to anyone?

This is one of the arguments for Open Data. A lot of research, in any country, is partly funded by public money and as such the argument goes that such research should always aim to make the data used in the research open and easily accessible once the research is concluded.

The typical flow of research is that the researcher, student or lab collects data to pursue some project. Once the project is concluded, that data is often just archived, rarely to be used again.

What if someone from a different field or subject area could actually use that data?

This could lead to new breakthroughs and understandings. Or someone else could use the insights gleaned from the data to develop an innovative new service. Archived, unused data could lead to better services for others. The same argument extends to private individuals and entities who have collected data, have kept it and, for a myriad of reasons, do not make it open to others even after the data's usefulness to them has dropped to zero.

Now imagine analysis of municipal water usage data leading to services that better predict when water problems might happen in different parts of the city, and that help avoid or minimise complete shutdowns.

Government Accountability

Another important reason for open data is to keep governments accountable. With government data available to be scrutinised and analysed by outsiders, the citizens of the country can know what the government is doing as well as ask the right questions.

Good examples of such services, made possible by open data, are:

The People's Assembly

The People's Assembly Image Credit: The People's Assembly

The People’s Assembly makes information about the South African parliament, which should be public by law, available to citizens.


Code4SA uses open data to create, visualise and promote data-driven journalism.

Open Africa

Open Africa, on the other hand, pushes for open data throughout the continent, allowing people and organisations to measure the pulse of the continent, country by country and data entry by data entry.


I would be doing the topic a disservice if I did not also cover the arguments against Open Data.

Researchers and organisations toil away collecting data to carry out studies or reach specific goals. Some of them may not want to share the data with everyone, and may instead want to control who accesses it, to ensure that whoever uses it does so with goals or principles similar to those of the people who originally gathered it.

I myself, as a researcher, have had numerous instances where I had to agree to use data I was given access to in one manner but not in another. Some public (government) datasets tend not to have such restrictions once a government department makes the data public.

Think about medical data collected from public health institutions: what would be the consequences of making (anonymised) patient data openly available without restrictions?

What about ethics?

What about respect for the individuals from whom the data was collected?

In this case one can more easily argue that the data should be made available only to those who work in public health, who have ethics in mind and whose goal is to improve the public health care system. The point of contention is misuse of the data, for example de-anonymising data to reveal individuals, or linking multiple datasets to reveal personal information about people.
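To make the linking risk concrete, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the toy records, the field names, the choice of postal code, birth year and sex as linking fields, and the `link_records` helper. It is not a description of any real dataset or of how any real attack was carried out. It only shows that records with the names removed can still be tied back to a person when a second, named dataset shares enough attributes.

```python
from collections import defaultdict

# Entirely made-up toy records for illustration; no real people or data.
# "Anonymised" health records: names removed, other attributes kept.
health_records = [
    {"postal_code": "0083", "birth_year": 1980, "sex": "F", "diagnosis": "asthma"},
    {"postal_code": "0083", "birth_year": 1975, "sex": "M", "diagnosis": "diabetes"},
    {"postal_code": "2001", "birth_year": 1990, "sex": "F", "diagnosis": "flu"},
    {"postal_code": "2001", "birth_year": 1990, "sex": "F", "diagnosis": "hypertension"},
]

# A second, separately published (hypothetical) dataset that includes names.
public_register = [
    {"name": "Person A", "postal_code": "0083", "birth_year": 1980, "sex": "F"},
    {"name": "Person B", "postal_code": "0083", "birth_year": 1975, "sex": "M"},
    {"name": "Person C", "postal_code": "2001", "birth_year": 1990, "sex": "F"},
]

# Attributes that appear in both datasets and can be used to join them.
LINKING_FIELDS = ("postal_code", "birth_year", "sex")


def linking_key(record):
    """Build a tuple of the shared attributes for one record."""
    return tuple(record[field] for field in LINKING_FIELDS)


def link_records(anonymised, named):
    """Return {name: diagnosis} for people who match exactly one record."""
    by_key = defaultdict(list)
    for record in anonymised:
        by_key[linking_key(record)].append(record)

    reidentified = {}
    for person in named:
        matches = by_key.get(linking_key(person), [])
        if len(matches) == 1:  # a unique match ties the record to the person
            reidentified[person["name"]] = matches[0]["diagnosis"]
    return reidentified


reidentified = link_records(health_records, public_register)
print(reidentified)
```

In this toy data, Person A and Person B are each matched to a single health record, while Person C is not, because two records share the same postal code, birth year and sex. That is the intuition behind releasing data only when every combination of such attributes is shared by several people.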

South Africa Ranked 1st in Africa for Open Data

To create a global ranking, we aggregated the sub-indexes of the Open Data Barometer. Comparing scores and ranks in the second edition with those in the first can help to identify countries making progress, and those where progress has stalled.

A large number of research projects that allow access to their data disallow its use for commercial purposes (by private businesses or private research institutions) or by non-research organisations. This, in my opinion, is understandable, though it is always going to lead to numerous debates. The data was collected in pursuit of understanding our world better; the original researchers were not collecting it so that another company could gain a commercial advantage. If a company seeks such data, I think it should either put its own resources into collecting, cleaning, preparing and analysing it (a large number of person-hours) or compensate the collectors fairly for their effort.

The topic of Open Data is a very interesting one, especially for the African continent as the continent tends to upend a lot of established technology trends.

Cover Image Credit: Department of Foreign Affairs and Trade

Vukosi Marivate

About Vukosi Marivate

Dr. Vukosi Marivate is a Data Hoarder. Writes in his personal capacity on interesting things we can learn from data. Works for CSIR, ex-intern at Google Inc. PhD, CompSci, Rutgers University (USA)