Maelstrom Research develops a suite of integrated software applications to support data cataloguing, harmonization, integration and co-analysis. These applications are built by a team of software developers, epidemiologists and statisticians. They are modular, customizable and designed to protect participant privacy and data security. The software is open source under the GPLv3 licence and is freely available at Obiba.
For more information on Opal and Mica, please see: Doiron, Dany, et al. "Software Application Profile: Opal and Mica: open-source software solutions for epidemiological data management, harmonization and dissemination", International Journal of Epidemiology.
For more information on DataSHIELD, please see: Gaye, Amadou, et al. "DataSHIELD: taking the analysis to the data, not the data to the analysis", International Journal of Epidemiology (2014) 43 (6): 1929–1944.
Opal is a software application used to manage study data and includes a feature enabling data harmonization and data integration across studies. As such, Opal supports the development and implementation of processing algorithms required to transform study-specific data into a common harmonized format. Moreover, when connected to a Mica web interface, Opal allows users to seamlessly and securely search distributed datasets across several Opal instances. Learn more about Opal here.
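To illustrate what a processing algorithm of this kind does, here is a minimal sketch in plain Python. It does not use Opal's own scripting interface; the study records, variable names and coding schemes are all hypothetical and chosen only to show two differently coded sources being mapped onto one common format.

```python
# Hypothetical study-specific records: each study codes sex and height differently.
STUDY_A = [
    {"id": "A1", "sex": "M", "height_cm": 180.0},
    {"id": "A2", "sex": "F", "height_cm": 165.0},
]
STUDY_B = [
    {"id": "B1", "gender": 1, "height_in": 70.0},
    {"id": "B2", "gender": 2, "height_in": 64.0},
]


def harmonize_study_a(record):
    """Map a Study A record onto the (assumed) common format."""
    return {
        "id": record["id"],
        "sex": {"M": "male", "F": "female"}[record["sex"]],
        "height_m": round(record["height_cm"] / 100, 3),  # centimetres to metres
    }


def harmonize_study_b(record):
    """Map a Study B record onto the same common format."""
    return {
        "id": record["id"],
        "sex": {1: "male", 2: "female"}[record["gender"]],
        "height_m": round(record["height_in"] * 0.0254, 3),  # inches to metres
    }


# After harmonization, records from both studies share one schema.
harmonized = [harmonize_study_a(r) for r in STUDY_A] + [
    harmonize_study_b(r) for r in STUDY_B
]

for row in harmonized:
    print(row)
```

Keeping one small function per study makes each transformation easy to review and document separately, which is the point of writing harmonization rules as explicit algorithms.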
Mica is a software application developed to create web portals for individual epidemiological studies or for study consortia. Features supported by Mica include a standardized study catalogue, data dictionary browsers, online data access request forms, and communication tools (e.g. forums, events, news). This Maelstrom website is itself powered by Mica. When used in conjunction with the Opal software, Mica also allows authenticated users to perform distributed queries on the content of study databases hosted on remote servers and retrieve summary statistics. Learn more about Mica here.
DataSHIELD acts as an interface module between the Opal software application and the R statistical environment. Under DataSHIELD, a central analysis computer (i.e. the computer from which the analysis is carried out) coordinates an iterative sequence of parallelized analyses of the individual-level data held on multiple data computers (i.e. the secure servers where the study-specific harmonized individual-level data are stored). With this approach, individual participant data from contributing studies are held securely on geographically dispersed, study-based computers. The analysis computer sends analytical commands as blocks of code, requesting each data computer to undertake an analysis and return non-identifiable summary statistics (i.e. results, not data). All individual-level data stay at source, within the governance structure and control of the originating study. This is a flexible and efficient way to obtain the information needed for a multi-centre analysis, based on results from clean and harmonized datasets, while no identifiable or sensitive data are physically moved or even rendered temporarily visible outside the study in which they were collected. Learn more about DataSHIELD here.
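The "results, not data" pattern can be sketched as follows. This is an illustrative simulation in plain Python, not DataSHIELD's actual R interface: the class and function names, the sample values and the minimum-count threshold are all assumptions made for the example. Each data computer returns only aggregate statistics, and the analysis computer combines them over two rounds to obtain a pooled mean and variance.

```python
MIN_COUNT = 3  # hypothetical disclosure threshold, not a DataSHIELD setting


class DataComputer:
    """Simulates a study server; individual-level values never leave it."""

    def __init__(self, name, values):
        self.name = name
        self._values = list(values)  # individual-level data stay here

    def _check_disclosure(self):
        # Refuse to answer when a summary could identify individuals.
        if len(self._values) < MIN_COUNT:
            raise PermissionError(f"{self.name}: too few records to disclose")

    def count_and_sum(self):
        self._check_disclosure()
        return len(self._values), sum(self._values)

    def sum_squared_deviations(self, centre):
        self._check_disclosure()
        return sum((v - centre) ** 2 for v in self._values)


def pooled_mean_and_variance(data_computers):
    """Run on the analysis computer; it sees summaries only."""
    # Round 1: each data computer returns its count and sum.
    parts = [dc.count_and_sum() for dc in data_computers]
    total_n = sum(n for n, _ in parts)
    mean = sum(s for _, s in parts) / total_n
    # Round 2: send the pooled mean back and collect squared deviations.
    ssd = sum(dc.sum_squared_deviations(mean) for dc in data_computers)
    return mean, ssd / (total_n - 1)


studies = [
    DataComputer("study_1", [1.0, 2.0, 3.0]),
    DataComputer("study_2", [4.0, 5.0, 6.0, 7.0]),
]
mean, variance = pooled_mean_and_variance(studies)
print(mean, variance)
```

The second round depends on the result of the first, which is a small instance of the iterative sequence described above: the analysis computer alternates between sending a request and combining the returned summaries.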