Crowdsourcing genetic data. A look from the outside and inside.
By Yaniv Erlich, MyHeritage, Israel
Biography: Dr. Yaniv Erlich is the Chief Science Officer of MyHeritage.com and an Associate Professor of Computer Science and Computational Biology at Columbia University (on leave of absence). Prior to these positions, he was a Fellow at the Whitehead Institute, MIT. Dr. Erlich received his bachelor’s degree from Tel-Aviv University, Israel (2006) and a PhD from the Watson School of Biological Sciences at Cold Spring Harbor Laboratory (2010). His research focuses on computational human genetics. Dr. Erlich is a TEDMED speaker (2018) and the recipient of DARPA’s Young Faculty Award (2017), the Burroughs Wellcome Career Award (2013), the Harold M. Weintraub Award (2010), and the IEEE/ACM-CS HPC Award (2008). He was also selected as one of Genome Technology’s Tomorrow’s PIs (2010). He is currently working on statistical genetics at scale using direct-to-consumer genomics.
Precision medicine is a data-hungry endeavor. However, traditional cohort ascertainment strategies scale poorly and require substantial investments to obtain genomic data, conduct physical exams and lab tests, and assess family history. But are these really required in today’s world? In the last decade, the human population has produced zettabytes (10^21 bytes) of digital data. Here, I will present our successes in repurposing participants’ data for ultra-large-scale genetic studies. First, I will describe our long-term project to build a family tree of 13 million people by mining a genealogy-driven social network. Second, I will present DNA.Land, our website for crowdsourcing genetic data from direct-to-consumer participants. Third, I will describe MyHeritage Health, a novel product that empowers participants to learn about their genetic predispositions, and the challenges in building this product. Last, I will talk about the genetic privacy implications of this brave new world.