Halyna Oliinyk

Tips and Tricks for Hardware Acceleration of TensorFlow-based ML Service

1touch.io

Bio

Halyna is head of the NLP department at 1touch.io, a platform for advanced data lifecycle management. She has extensive experience in the continuous delivery of end-to-end NLP solutions, mainly focused on multilingual analysis for high-load systems that helps to enhance, summarize, and highlight specific properties of text data.

Abstract

The talk will be dedicated to performance best practices for TensorFlow-based AI systems running on either GPU or CPU. It will cover a brief review of the best and worst recommendations given by TensorFlow developers, the most common performance mistakes and issues encountered while implementing complex deep learning algorithms and ways to avoid them, different approaches to installing TensorFlow for the best possible hardware acceleration, and a description of TensorFlow Lite use cases for speeding up computation.

There will also be a gentle overview of how TensorFlow works under the hood and of the best way to deliver a well-performing system to the client when this framework is the main machine learning tool. Special attention will be paid to all of these topics from the natural language processing point of view, specifically language modeling using BERT, ULMFiT, ELMo, and other SOTA approaches.
