This is a book about OHDSI, and is currently very much under development.

The book is written in R Markdown with bookdown. It is automatically rebuilt from source by Travis CI.

Goals of this book

This book aims to be a central knowledge repository for OHDSI, and focuses on describing the OHDSI community, data standards, and tools. It is intended for those new to OHDSI and veterans alike, and aims to be practical, providing the necessary theory followed by instructions on how to do things. After reading this book you will understand what OHDSI is, and how you can join the journey. You will learn what the common data model and standard vocabularies are, and how they can be used to standardize an observational healthcare database. You will learn that there are three main use cases for these data: characterization, population-level estimation, and patient-level prediction. You will learn that all three activities are supported by OHDSI’s open source tools, and how to use those tools. You will learn how to establish the quality of the generated evidence through data quality, clinical validity, software validity, and method validity. Lastly, you will learn how these tools can be used to execute studies in a distributed research network.

Structure of the book

This book is organized in five major sections:

  1. The OHDSI Community
  2. Uniform Data Representation
  3. Data Analytics
  4. Evidence Quality
  5. OHDSI Studies

Each section has multiple chapters, and each chapter aims to follow this main outline: Introduction, Theory, Practice, Summary, and Exercises.


Each chapter lists one or more chapter leads. These are the people who lead the writing of that chapter. However, many others have contributed to the book, whom we would like to acknowledge here:

Software versions

A large part of this book is about the open source software of OHDSI, and this software will evolve over time. Although the developers do their best to offer a consistent and stable experience to the users, improvements to the software will inevitably make some of the instructions in this book out of date. The online version of the book will be updated to reflect those changes, and new editions of the hard copy will be released periodically. For reference, these are the version numbers of the software used in this version of the book:

  • ACHILLES: version 1.6.6
  • ATLAS: version 2.7.2
  • EUNOMIA: version 1.0.0
  • Methods Library packages: see Table 0.1

Table 0.1: Versions of packages in the Methods Library used in this book.

Package                    Version
CaseControl                1.6.0
CaseCrossover              1.1.0
CohortMethod               3.1.0
Cyclops                    2.0.2
DatabaseConnector          2.4.1
EmpiricalCalibration       2.0.0
EvidenceSynthesis          0.0.4
FeatureExtraction          2.2.4
MethodEvaluation           1.1.0
ParallelLogger             1.1.0
PatientLevelPrediction     3.0.6
SelfControlledCaseSeries   1.4.0
SelfControlledCohort       1.5.0
SqlRender                  1.6.2
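
Because instructions in this book may drift from newer software releases, it can help to check how a local installation compares with the versions in Table 0.1. The following is a minimal sketch in base R, assuming the Methods Library packages are installed in the current R library. The helper `compareVersions` is a hypothetical name introduced here for illustration and is not part of any OHDSI package.

```r
# Versions of the Methods Library packages as listed in Table 0.1.
expected <- c(
  CaseControl = "1.6.0",
  CaseCrossover = "1.1.0",
  CohortMethod = "3.1.0",
  Cyclops = "2.0.2",
  DatabaseConnector = "2.4.1",
  EmpiricalCalibration = "2.0.0",
  EvidenceSynthesis = "0.0.4",
  FeatureExtraction = "2.2.4",
  MethodEvaluation = "1.1.0",
  ParallelLogger = "1.1.0",
  PatientLevelPrediction = "3.0.6",
  SelfControlledCaseSeries = "1.4.0",
  SelfControlledCohort = "1.5.0",
  SqlRender = "1.6.2"
)

# Hypothetical helper (not an OHDSI function): compares a named vector of
# expected versions against a named vector of installed versions.
compareVersions <- function(expected, installed) {
  status <- vapply(names(expected), function(pkg) {
    if (!pkg %in% names(installed)) {
      return("not installed")
    }
    # compareVersion() returns -1, 0, or 1.
    cmp <- utils::compareVersion(installed[[pkg]], expected[[pkg]])
    if (cmp == 0) "same" else if (cmp > 0) "newer" else "older"
  }, character(1))
  data.frame(
    package = names(expected),
    expected = unname(expected),
    installed = unname(installed[names(expected)]),
    status = unname(status),
    stringsAsFactors = FALSE
  )
}

# Named character vector of installed versions in the current R library.
installed <- utils::installed.packages()[, "Version"]
print(compareVersions(expected, installed))
```

Packages reported as "newer" or "older" may behave differently from what this book describes; those reported as "same" match the versions used here.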