Goal

The project's outcome will be to apply Machine Learning (ML) and other mathematical methods to all federated data in order to generate evidence. This evidence will address the priorities and needs of the government of Rwanda: predicting and monitoring the burden of COVID-19 in the Rwandan community, in terms of hospital admissions and overall infection rates, and monitoring the impact of public health measures on the evolution of the pandemic in the country.
SARS-CoV-2/COVID-19 data has the potential to transform our understanding of the disease and advance science, and also to clarify outcomes in ways that enable efficient preventive or treatment measures. However, in Rwanda, as in other countries, this data is currently fragmented, incomplete and scattered across multiple institutions, including hospitals, clinics and testing sites, that have captured vast amounts of data on the disease. Analysing these fragmented COVID-19 datasets separately yields weak evidence. Pooling them into one single dataset is challenging, because they have different data structures and data owners may fear a breach of data privacy. An innovative approach is therefore needed to analyse all the data together.

The current project proposes to leverage Artificial Intelligence (AI) and other Data Science (DS) techniques to create a scalable framework for inventorying, harmonizing and federating the accumulated data from COVID-19 patients and converting it to a standardized data format, so that it can be used as part of wider studies on the disease.

Each dataset will be mapped to a common data model that is already used in other observational studies: the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM), maintained by the Observational Health Data Sciences and Informatics (OHDSI) community. The data will remain under the complete control of the original data owner, thereby ensuring that ethical and local data privacy rules are respected. The harmonized data will include not only individuals diagnosed with or serotyped for COVID-19 but also non-infected individuals, as the data will come from routine hospital electronic health records (EHRs) or testing databases with both positive and negative results. In the second stage, this project will collect new data longitudinally. These new longitudinal data will be enriched with patient-reported outcomes (PROs) and will follow the same standardized model by design. The surveys will be conducted through mobile application questionnaires, complemented by direct phone calls and face-to-face surveys. Anonymous geofencing data will be collected as well.
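
To make the mapping step concrete, the following is a minimal sketch of how one source record might be translated into a few columns of the OMOP CDM `person` table. The source field names (`patient_code`, `sex`, `date_of_birth`), the sample row and the helper `map_to_person` are hypothetical and chosen only for illustration; the gender concept IDs (8507 for male, 8532 for female, 0 for "no matching concept") follow the OMOP standard vocabulary. A real mapping would cover many more tables and columns.

```python
from dataclasses import dataclass
from datetime import date

# OMOP standard gender concepts; 0 is the OMOP convention for "no matching concept".
GENDER_CONCEPTS = {"M": 8507, "F": 8532}


@dataclass
class OmopPerson:
    """A small subset of the columns of the OMOP CDM `person` table."""
    person_id: int
    gender_concept_id: int
    year_of_birth: int
    person_source_value: str
    gender_source_value: str


def map_to_person(record: dict, person_id: int) -> OmopPerson:
    """Map one hypothetical source record to OMOP `person` columns."""
    raw_sex = str(record.get("sex", ""))
    sex_key = raw_sex.strip().upper()[:1]  # "female" -> "F", "m" -> "M"
    birth = date.fromisoformat(record["date_of_birth"])
    return OmopPerson(
        person_id=person_id,
        gender_concept_id=GENDER_CONCEPTS.get(sex_key, 0),
        year_of_birth=birth.year,
        person_source_value=str(record["patient_code"]),  # keeps the original identifier
        gender_source_value=raw_sex,                       # keeps the original coding
    )


# Invented example row, standing in for one hospital's export format.
hospital_row = {"patient_code": "H1-0042", "sex": "female", "date_of_birth": "1987-03-14"}
person = map_to_person(hospital_row, person_id=1)
```

Keeping the original values in the `*_source_value` columns alongside the standard concept IDs lets each data owner trace a harmonized row back to its source record.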

Ultimately, all federated data will be analysed with Machine Learning (ML) and other mathematical methods to generate evidence that serves the priorities of the government of Rwanda: predicting and monitoring the burden of COVID-19 in the community and assessing the impact of public health measures on the evolution of the pandemic. The proposed approach is scalable: new datasets can be added and existing ones updated, and all data will remain available for future use. The same approach is also applicable to other diseases and pandemics, such as Ebola virus disease and influenza.
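
The federated principle described above, in which data stays with its owner and only derived results are combined, can be illustrated with a minimal sketch. It assumes the simplest possible analysis, a pooled test positivity rate; the site names, the test results and the functions `summarize_locally` and `pooled_positivity` are invented for illustration and do not describe the project's actual analysis pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteSummary:
    """Aggregate counts that a site agrees to share; no patient-level rows."""
    site: str
    tests: int
    positives: int


def summarize_locally(site: str, results: list[bool]) -> SiteSummary:
    """Runs at the data owner's site: only counts leave the site."""
    return SiteSummary(site=site, tests=len(results), positives=sum(results))


def pooled_positivity(summaries: list[SiteSummary]) -> float:
    """Runs centrally: combines the shared counts into one positivity rate."""
    total_tests = sum(s.tests for s in summaries)
    if total_tests == 0:
        raise ValueError("no tests reported by any site")
    return sum(s.positives for s in summaries) / total_tests


# Invented results (True = positive test) for two hypothetical sites.
summaries = [
    summarize_locally("hospital_a", [True, False, False, True]),
    summarize_locally("testing_site_b", [False, False, False, False, True, False]),
]
rate = pooled_positivity(summaries)  # 3 positives out of 10 tests
```

Because every site already stores its data in the same standardized model, the same local function can run unchanged at each site, and a new site can join by contributing one more summary.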