Aviv Ben Arie

Leveraging Domain Knowledge While Shifting Towards ML Centric Solutions

PayPal

Bio

Aviv is a Lead Data Scientist at PayPal and a certified pastry chef. Her data science team develops machine learning algorithms for fraud detection.

Aviv graduated from Tel Aviv University with a double bachelor’s degree in Computer Science and Life Science, with a specialization in Bioinformatics. Before PayPal, Aviv worked in cybersecurity at the Prime Minister’s Office, with a focus on protocol analysis.

Abstract

In the past, data-driven solutions relied heavily on domain knowledge and business understanding (e.g., rule-based decision engines). Today, however, it is easy to train complex models and achieve satisfactory results without being an expert in the business problem in question, which shifts the data science focus almost entirely to technical model optimization. Our experience shows that injecting domain knowledge into the different stages of the ML pipeline is critical to getting the most out of our models.

While feature engineering is the first option that comes to mind, architecture selection, loss function definition, and evaluation methodology are just as important. In this round table I will share the insights we have gained at PayPal over the years while balancing ML and domain expertise. We will talk about ways to incorporate prior knowledge into the different stages of the modelling life cycle, and introduce Snorkel, a weak supervision framework that supports this approach by design. As this challenge is industry-wide, we will also hear participants’ approaches to the issue and how other companies strike this balance in practice.
