Lilach Goldshtein

AI Modeling in Healthcare: Addressing Biases in Cohort Selection in Observational Data
K-Health

Bio

Lilach Goldshtein is a Senior Data Scientist at K-Health. For the past three years, she has combined her expertise in data science and AI algorithms with her passion for the medical field to create and improve personalized, data-driven primary care applications.

Before joining K-Health, Lilach spent five years at Teva Pharmaceuticals, where she worked closely with biostatisticians to develop and improve methods for acquiring, processing, analyzing, and presenting clinical trial data.
Lilach holds an M.Sc. in Computer Science from IDC Herzliya and a B.Sc. in Biomedical Engineering from Tel-Aviv University, and has ten years of industry experience in medical data research.

Abstract

In the world of big data, selecting the “best” subset of data to learn from can be challenging, especially when the data collection process may introduce biases into some of the observations. This is certainly the case in medicine, where the data represent patients’ medical histories. Choosing your data correctly is key to improving the performance of even a baseline vanilla model and to defining the initial target audience of your data science product.

We will review methodologies for choosing inclusion/exclusion criteria (a standard practice in the design of clinical studies), assessing their impact and implications, and dealing with the selection bias that the selection process itself can induce.
