Frontiers of Big Data, AI, and Analytics

Online Seminar Series

 

Discussion Theme: From COVID-19 Testing to Election Prediction: How Small Are Our Big Data? 

 

Abstract

The term “Big Data” emphasizes data quantity, not quality. What is the effective sample size once we account for the deterioration in data quality caused by, for example, selection bias in COVID-19 testing or non-response bias in 2016 US election polling? This talk answers such questions using the data defect index (ddi) developed in Meng (2018), “Statistical paradises and paradoxes in big data (I): Law of large populations, big data paradox, and the 2016 US presidential election,” Annals of Applied Statistics, 685-726. It will also briefly discuss the application of the ddi to the 2020 US election, as reported in Isakov and Kuriwaki (2020), “Towards Principled Unskewing: Viewing 2020 Election Polls Through a Corrective Lens from 2016,” Harvard Data Science Review.
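
As a rough illustration of the question the abstract poses, the sketch below computes an approximate effective sample size from a data defect correlation. It assumes the approximation n_eff ≈ f / (1 - f) × 1 / ρ², where f = n / N is the fraction of the population captured and ρ is the correlation between being recorded and the recorded value. This form is a paraphrase of the approach in Meng (2018), not a formula stated in this announcement. The function name and all input numbers are hypothetical and chosen only for illustration.

```python
def effective_sample_size(n: int, population: int, rho: float) -> float:
    """Approximate size of a simple random sample with the same mean squared
    error as a biased sample of size n (assumed approximation, see lead-in).

    n          -- number of records in the "big" data set
    population -- size N of the target population
    rho        -- data defect correlation between inclusion and outcome
    """
    if not 0 < n < population:
        raise ValueError("n must be strictly between 0 and the population size")
    if rho == 0:
        raise ValueError("rho must be non-zero; a zero defect gives no finite bound here")
    f = n / population                 # sampling fraction f = n / N
    return (f / (1.0 - f)) / rho ** 2  # n_eff ~ f / (1 - f) * 1 / rho^2


# Hypothetical inputs: 2.3 million respondents out of 230 million people,
# with a seemingly tiny defect correlation of 0.005.
n_eff = effective_sample_size(n=2_300_000, population=230_000_000, rho=0.005)
print(round(n_eff))
```

Under these assumed inputs, millions of records carry roughly the information of a few hundred randomly sampled ones, which is the sense in which big data can be "small".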

Short biography

Xiao-Li Meng has served in various leadership positions at Harvard University, including Dean of the Graduate School of Arts and Sciences, Chair of the Department of Statistics, and founding Editor-in-Chief of the Harvard Data Science Review. Before joining Harvard University, Xiao-Li was a faculty member at the University of Chicago. His Ph.D. in Statistics is from Harvard University. Xiao-Li’s research interests cover a wide range of topics, including statistical theory and the principles of data science, philosophical and foundational issues in statistics, statistical computing and computational statistics, and signal extraction and uncertainty assessment.

About this event series: This event series aims to unleash ideas and insights for shaping a successful future for business and society. The first part (the speaker's talk) focuses on cutting-edge ideas; the second part (discussion and audience Q&A) explores their practical uses and implications, bridging the gap between new ideas and business and society.

Main audience

The main audience includes business professionals (including C-suite executives, directors, and managers) from small to large organizations, as well as government, regulatory, and not-for-profit organizations. Fluency with data is not required.

Although the focus is on transferring ideas to business and society, interested academics and students are also welcome to join this event series.