New Data Sharing Initiative

Open access, curated data provides new resource at ICTP

When Nobel Laureate Abdus Salam founded ICTP in 1964, one of the main goals was to help scientists in developing countries expand their access to key resources. Access to journals, books, networks, conferences: all are necessary for scientists to do quality research. Now, more than fifty years since ICTP was founded, computing power and datasets are important additions to the list of key resources. That’s where ICTP scientist Lina Sitz saw a way for ICTP to provide an important new tool: open access curated data, a resource for scientists in places without large computational capacities. With a new data sharing initiative, ICTP is taking steps towards just that.

A lot of computational power is needed for modeling, analysis, and calibration of data in many fields, especially in physics. Powerful computational infrastructures like research computing clusters or supercomputers are less accessible in developing countries, meaning much more time or money is required to process data. In addition to computational power, some types of software used to clean or calibrate data are proprietary and expensive. Sharing existing data could help remove some of these barriers to research, and some scientists try to make their data and model outputs available already. But with no curation or publicizing, it can be difficult to find or use the data.


 Initiative head Lina Sitz with Robert Quick, project consultant

In 2016 Sitz collaborated with Robert Quick of Indiana University Bloomington, USA, on a small project to share the output data from RegCM-ES, the coupled regional climate model developed at ICTP. “With all the insights I got from that experience, and coming from Argentina myself, a developing country, I started thinking about how to connect the needs of scientists who have lower computational resources with the powerful tools that can result from sharing resources,” says Sitz. The long-term goal of the data sharing initiative is to store and curate data from all sections of ICTP as open source and easily accessible.

A pilot project of the ICTP data sharing initiative is driven by a team that includes Sandro Radicella, head of ICTP's Telecommunications/ICT for Development Laboratory (T/ICT4D), his colleagues in the lab Marco Zennaro, Luigi Ciraolo, Yenca Migoya-Orue', and Katy Alazo-Cuartas, and Clement Onime from ICTP's Information and Communication Technology Section (ICTS). Radicella's network includes a group of scientists from several developing countries who frequently use total electron content (TEC) data to investigate the state of Earth's ionosphere, the ionized part of the upper atmosphere. TEC data helps scientists study, among other questions, how ionization from solar radiation affects radio wave travel through the atmosphere, which communication and positioning systems rely on.

These researchers were asked if the availability of calibrated TEC data was important to them. Following their very positive answers, the T/ICT4D lab will provide calibrated TEC data from the past twenty years from around the globe through the data sharing initiative, plus a continuous flow of near-real-time data. The whole dataset is calibrated using the same technique, making it easy to compare between locations and regions.

Some of the scientists who will be beta testing the project with TEC data and who are quoted in this article

“It’s quite exciting, because it’s a grand solution,” says Babatunde Rabiu, a professor at the Centre for Atmospheric Research at Kogi State University in Anyigba, Nigeria, one of the beta version testers for the pilot project. “It’s a service that is revolutionizing the way we do our research. Bandwidth can be a problem, so gathering and downloading data and software, and then processing data, can sometimes take many days. Now with a click, what once took six months, you can now do in six minutes.”

That time is key for the research process. Sharon Aol, a PhD student at Mbarara University of Science and Technology in Mbarara, Uganda, knows what she’ll be doing with the time saved thanks to ICTP’s calibrated TEC data sharing. “If you are doing some long-term study, you need a lot of data, and it can take a few months to process that data. But if you can just click and get that data fully processed, you can spend more time reading papers and focusing on the analysis; to understand everything needs time. Research is all about finding out new things, not taking time to process data.”

“This new project is intended as a framework to help scientists, both local ICTP faculty and outside associated researchers, adopt recognized best practices and standards in data publishing,” says Sitz. The goal is for the data to all be FAIR: Findable, Accessible, Interoperable, and Reusable. “Scientists typically want to share their data but don’t want to take the time to make it FAIR, which is like making a library without any indexing. Having the data searchable, publicized, usable on different platforms, all of these are important.”

These are important ways to make the data as usable as possible. Sripathi Samireddipalle, an ICTP Associate and professor at the Indian Institute of Geomagnetism in Mumbai, India, says he and his students will be able to tackle more research questions thanks to the ease of using the calibrated TEC data available from ICTP. “This data access means I’ll be able to investigate open questions related to many space weather problems, like solar flares and coronal mass effects.”

For a data repository to be usable, however, indexing is key. “This project is about curating the data,” says Sitz. “That includes offering tools for ease of use, making user guides, maintaining the datasets, and taking care of regular updates.” Climate science, Sitz’s field, deals with big data on a regular basis: running global coupled climate models produces a huge amount of data that can be used for many different types of analysis. “The idea is that scientists can download only the data they really need for their research,” says Sitz. “If you’re studying malaria spread by looking at the health of a disease-carrying mosquito population, you may only need certain parameters like regional temperature and precipitation, not all the data that comes with a global climate model.”

The initiative was recently awarded a Research Data Alliance grant, and with further support from ICTP, the feedback from the ionospheric scientists will shape future practices. With luck, the data sharing initiative will turn into a key resource ICTP can offer scientists in developing countries. “It will stimulate the interest of scientists and bring in students to this field,” says Jacob Adeniyi, a professor of physics at Landmark University in Omu-Aran, Nigeria, who will be testing and giving feedback on the pilot project. “It will make doing research a lot easier.”


---- Kelsey Calhoun
