"Getting More from Your Datasets: Data Augmentation, Annotation and Generative Techniques," a Presentation from Xperi


Peter Corcoran, co-founder of FotoNation (now a core business unit of Xperi) and lead principal investigator and director of C3Imaging (a research partnership between Xperi and the National University of Ireland, Galway), presents the "Getting More from Your Datasets: Data Augmentation, Annotation and Generative Techniques" tutorial at the May 2018 Embedded Vision Summit.

Deep learning for embedded vision requires large datasets. Indeed, the more varied the training data is, the more accurate the resulting trained network tends to be. But acquiring and accurately annotating datasets costs time and money. This talk shows how to get more out of existing datasets.

First, state-of-the-art data augmentation techniques are reviewed, and a new approach, smart augmentation, is explained. Next, GANs (generative adversarial networks) that learn the structure of an existing dataset are explained; several example use cases (such as creating a very large dataset of facial training data) show how GANs can generate new data corresponding to the original dataset.
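
To make the idea of data augmentation concrete, here is a minimal sketch of two conventional transforms, horizontal flipping and random cropping, applied to a toy image. It is not the smart augmentation approach described in the talk, and it is unrelated to GANs; the function names (`horizontal_flip`, `random_crop`, `augment`) and the list-of-rows image representation are assumptions made purely for illustration.

```python
import random


def horizontal_flip(image):
    """Mirror an image (a list of pixel rows) left to right."""
    return [row[::-1] for row in image]


def random_crop(image, crop_h, crop_w, rng):
    """Cut a crop_h x crop_w window from a random position in the image."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - crop_h)
    left = rng.randint(0, width - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


def augment(image, copies, crop_h, crop_w, seed=0):
    """Produce `copies` randomly flipped and cropped variants of one image."""
    rng = random.Random(seed)  # seeded so the variants are reproducible
    variants = []
    for _ in range(copies):
        # Flip roughly half of the variants before cropping.
        variant = horizontal_flip(image) if rng.random() < 0.5 else image
        variants.append(random_crop(variant, crop_h, crop_w, rng))
    return variants


# A tiny 4x4 grayscale "image" with distinct pixel values (hypothetical data).
image = [[r * 4 + c for c in range(4)] for r in range(4)]
variants = augment(image, copies=8, crop_h=3, crop_w=3)
```

Each source image yields several training samples that differ in framing and orientation, which is the basic way augmentation stretches an existing dataset. Real pipelines typically add photometric changes (brightness, noise) and operate on array libraries rather than nested lists.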

But building a dataset does not by itself represent the entirety of the challenge; data must also be annotated in a way that is meaningful for the training process. The presentation then gives an example of training a GAN from a dataset that incorporates annotations. This technique enables the generation of pre-annotated data, providing an exciting way to create large datasets at significantly reduced costs.