ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation

Sharon Fogel

Amazon

Bio

Sharon is a PhD student in the Department of Electrical Engineering at Tel-Aviv University, under the supervision of Daniel Cohen-Or and Shai Avidan.

 

She is currently completing an internship with the Amazon AWS Computer Vision team. Prior to starting her PhD, Sharon served for six years in the Air Force as an Electronic Warfare researcher and team leader, after graduating with her first degree in Physics from the Talpiot Program.

Abstract

The performance of optical character recognition (OCR) systems has improved significantly in the deep learning era. This is especially true for handwritten text recognition (HTR), where each author has a unique style, unlike printed text, where the variation is smaller by design. That said, deep-learning-based HTR is limited, as in every other task, by the number of training examples. Gathering data is a challenging and costly task, and even more so is the labeling task that follows, which is our focus here. One possible approach to reducing the burden of data annotation is semi-supervised learning. Semi-supervised methods use some unlabeled samples in addition to labeled data to improve performance compared to fully supervised methods. Consequently, such methods may adapt to unseen images at test time.

We present ScrabbleGAN, a semi-supervised approach to synthesizing handwritten text images that are versatile both in style and lexicon. ScrabbleGAN relies on a novel generative model which can generate images of words of arbitrary length. We show how to operate our approach in a semi-supervised manner, enjoying the aforementioned benefits, such as a performance boost over state-of-the-art supervised HTR. Furthermore, our generator can manipulate the resulting text style. This allows us to change, for instance, whether the text is cursive, or how thin the pen stroke is.
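To make the core idea concrete, the PyTorch sketch below shows one way a generator can produce word images whose width grows with the number of characters: each character is embedded, combined with a shared style noise vector, turned into a small feature-map patch, and the patches are concatenated side by side before a shared convolutional decoder upsamples them, so overlapping receptive fields let neighbouring letters blend. This is an illustrative sketch under our own assumptions, not the authors' implementation; the class and parameter names (CharPatchGenerator, patch_w, base_ch, and so on) are invented for the example, and the full method additionally trains a discriminator and a handwriting recognizer, which are omitted here.

```python
# Minimal sketch of a variable-width word-image generator (not the paper's code).
import torch
import torch.nn as nn


class CharPatchGenerator(nn.Module):
    def __init__(self, vocab_size=80, z_dim=128, base_ch=256, patch_h=4, patch_w=4):
        super().__init__()
        # One learned "filter" (embedding) per character in the alphabet.
        self.char_emb = nn.Embedding(vocab_size, z_dim)
        # Project (character embedding, style noise) to a small spatial patch.
        self.fc = nn.Linear(2 * z_dim, base_ch * patch_h * patch_w)
        self.base_ch, self.patch_h, self.patch_w = base_ch, patch_h, patch_w
        # Shared upsampling decoder; overlapping receptive fields let adjacent
        # character patches interact, which is what allows cursive-like joins.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(base_ch, base_ch // 2, 4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(base_ch // 2, base_ch // 4, 4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(base_ch // 4, 1, 4, stride=2, padding=1),
            nn.Tanh(),
        )

    def forward(self, char_ids, z):
        # char_ids: (batch, word_len) integer character indices
        # z:        (batch, z_dim) style noise, shared by every character of a word
        b, n = char_ids.shape
        emb = self.char_emb(char_ids)                         # (b, n, z_dim)
        zrep = z.unsqueeze(1).expand(-1, n, -1)               # same style per char
        patches = self.fc(torch.cat([emb, zrep], dim=-1))     # (b, n, C*h*w)
        patches = patches.view(b, n, self.base_ch, self.patch_h, self.patch_w)
        # Concatenate the per-character patches along the width axis.
        canvas = torch.cat(list(patches.unbind(dim=1)), dim=-1)  # (b, C, h, n*w)
        return self.decoder(canvas)                           # width grows with n


if __name__ == "__main__":
    gen = CharPatchGenerator()
    z = torch.randn(2, 128)
    short = gen(torch.randint(0, 80, (2, 3)), z)  # 3-letter words
    long_ = gen(torch.randint(0, 80, (2, 8)), z)  # 8-letter words
    print(short.shape, long_.shape)               # fixed height, width scales with length
```

Because the style vector z is shared by all characters of a word, keeping z fixed while changing char_ids yields different words rendered in a consistent handwriting style, which is the kind of style control the abstract refers to.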
