A room teeming with about thirty laptop screens, pairs of eyes glued to them, nimble fingers tapping away at the keyboards, fervent conversations punctuated by jargon such as feature extraction, training set, overfitting, pooling layers, Sobel filter, scikit-image and the like, a whiteboard full of equations and code snippets, and a heavy aroma of coffee. This might seem like the perfect picture of a San Francisco data science hackathon. Except, this scene took place about 11,000 miles away from San Francisco.
It happened in the tiny island nation of Mauritius, in the Indian Ocean about 1,200 miles off the southeast coast of Africa.
I was one of the data science instructors at a career development workshop for young Mauritian and South African students, who were taking their first stab at a real-life data science problem.
That trip halfway around the world took me about 30 hours of travel each way, but it was one of the most rewarding experiences of my life.
Modelled as a ten-day experiment in immersive learning with team building activities, these JEDI (Joint Exchange Development Initiative) workshops ought to be one of the best ways for young minds to learn the most important ingredients of a research career.
Astrophysics, Big Data, And the African Continent
Researchers in astronomy and astrophysics have been dealing with tremendous amounts of data since long before “Big Data” became a buzzword. For example, in my life as a cosmologist, I have worked on data from the Atacama Cosmology Telescope, the raw volume of which would put many medium-scale software companies to shame. With the advent of bigger telescopes, astronomical data sets are growing even larger.
The African continent is truly entering the astronomy big data regime with the Square Kilometre Array (SKA), a multi-country, multi-continent radio telescope with unprecedented sensitivity that will collect about an exabyte of data a day.
Recent technological advances in industry, such as MapReduce-style distributed machine learning with tools like Apache Spark, and GPU-based deep learning, are extremely relevant for the kind of science that will be done with the SKA data.
Young minds in the African countries participating in the SKA need to be made aware of these tools and techniques so that they can develop the skills to become tomorrow’s scientists. This is the motivation behind the SKA Africa Astronomy Training Platform (ATP), a series of training programs that allow for direct knowledge and skills transfer in radio astronomy, machine learning, and related fields. The Mauritius Machine Learning JEDI workshop last week was a part of this program.
The JEDI Way
This is a somewhat official description of the JEDI process:
The Joint Exchange Development Initiative (JEDI) is a concept to enhance development and education via direct transfer of skills and expertise in any specific field. It is an initiative to provide development via joint exchange among stakeholders. This is achieved by bringing stakeholders: students, post-docs and staffs together in an informal but intense research environment to tackle unsolved problems for e.g. in Astronomy.
In a more informal way, as one of the guiding members of JEDI, Professor Bruce Bassett puts it, these JEDI workshops are “an antidote to conference ennui”. To quote him further:
Boiled down to its essence the JEDI is an attempt to cure the usual conference woes: the one-way, top-down communication where sometimes 80% of the audience have their heads buried in their laptops because they are tired of poorly delivered talks and their bodies are aching after being in the same seat for hours on end with no movement except for the 30 minutes where they crowd around the coffee and biscuit table in a desperate search for relief and connection.
In contrast to the accurate picture of a conference setting that Bruce painted above, I found JEDI to be the perfect non-conference way to learn “on the job”.
Groups were formed, problems were assigned to each group, and from then on they operated as little collaborations with active sharing of knowledge. The two problems we tackled were the Kaggle problem on Diabetic Retinopathy and an astrophysical problem of detecting extended sources in radio images. Both were image processing problems and called for similar techniques.
It was interesting to see how each group quickly identified its members’ individual strengths and weaknesses and tried to work out how they could collaborate most effectively, which is the hallmark of any good collaboration.
We, the instructors, were there to give them guidance throughout the day, as well as some quick tutorials (Dr. Jasper Horrell gave a quick intro to deep learning, while I gave a tutorial on pandas, the Python data analysis library) to help them with the tools.
Every evening, they were also asked to present their day’s findings. Given that most of the participants had little or no exposure to machine learning prior to the workshop, I was impressed by how quickly the teams picked up the tools and made real dents in these hard problems. They would read papers, search for methods online, reuse open source code where possible instead of reinventing the wheel, and write their own where nothing was available. These are all indispensable skills for a research career.
Also, it was rewarding to see how they felt proud and empowered by their daily progress, and how eloquently they described their accomplishments at the end of each hard day.
They were voraciously learning while forging teamwork, writing serious code collaboratively, and also picking up clear presentation skills. This would never be possible in a conference setting.
The JEDI workshops are run as residency programs, where the participants live “on campus”, help each other out outside of work, and cook communally for the entire team.
Being a lifelong believer in the idea that nothing brings people closer than creating and sharing food, I found this one of the most remarkable features of the workshop. (For my part, I cooked a huge pot of chicken cacciatore.)
Formal barriers broke down over food and drinks, jokes were cracked, people skills were developed, and hobbies and extracurricular talents came out over the dinner table. A successful research career is often about forging human relationships with collaborators that last a lifetime.
Such friendships are often kindled outside of work.
None of this would have been possible without some key people with really big hearts. Prof. Bruce Bassett has been a major force behind the conceptualization of the JEDI format. His in-depth understanding of both the academic landscape and the world of big data and machine learning, and how Africa fits into it, has helped shape much of the program.
If Bruce is the philosophical driving force behind the JEDI program, the raw steam comes from Dr. Nadeem Oozeer, who is a Mauritian astrophysicist and a Commissioning Scientist at the Square Kilometre Array (SKA) in South Africa. He is also a Project Leader for the JEDI initiative for Africa. Besides these accolades, Nadeem is a super passionate guy, who strongly believes in the goals and ideals of the JEDI program, and embodies the excitement that stems from Africa being on the cusp of a new age of astrophysical big data research. Almost single-handedly, he made sure that every aspect of the workshop ran smoothly, from the logistics to continually providing motivation, guidance and camaraderie to the participants both within and outside of work.
His energy and dedication are truly commendable!
At the conclusion of the program, I was fortunate enough to meet some high-level delegates from the Mauritian government, including the Ministry of Education, and from the South African High Commission.
They warmly expressed their support for programs such as this, and I felt hopeful that future Mauritian JEDIs will be on a larger scale, and that I will be called back again by the young and budding astronomer/data scientist friends I made there!
Cover Image: Sudeep Das