
MIT researchers claim augmentation technique can train GANs with less data

These images were generated by GANs trained on just 100 images each: 100 Obama portraits and 100 cat photos.



Researchers at MIT, Adobe Research, and Tsinghua University say they’ve developed a method — differentiable augmentation (DiffAugment) — that improves the efficiency of generative adversarial networks (GANs) by augmenting both real and fake data samples. In a preprint paper, they claim it effectively stabilizes the networks during training, enabling them to generate high-fidelity images using only 100 images without pretraining and to achieve state-of-the-art performance on popular benchmarks.

GANs (two-part AI models consisting of a generator that creates samples and a discriminator that tries to distinguish generated samples from real-world samples) have demonstrated impressive feats of media synthesis. Top-performing GANs can create realistic portraits of people who don’t exist, for instance, or snapshots of fictional apartment buildings. But that success has come at the cost of considerable computation and data: GANs rely on large quantities (tens of thousands) of diverse, high-quality training samples, and collecting such large-scale data sets can take months or years and carry substantial annotation costs, if it’s possible at all.
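
For readers unfamiliar with the two-part setup, here is a minimal, hedged sketch in PyTorch of the adversarial training loop described above. The tiny fully connected networks, the 32x32 image size, and the non-saturating logistic loss are illustrative assumptions, not the architectures or objectives used by BigGAN or StyleGAN2.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy generator and discriminator: G maps a 128-dim noise vector to a 3x32x32
# image, D maps an image to a single realism score (a logit).
G = nn.Sequential(nn.Linear(128, 3 * 32 * 32), nn.Tanh(), nn.Unflatten(1, (3, 32, 32)))
D = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

def gan_step(real):  # real: (batch, 3, 32, 32) tensor of training images
    z = torch.randn(real.size(0), 128)
    fake = G(z)

    # Discriminator update: score real images high, generated images low.
    d_loss = F.softplus(-D(real)).mean() + F.softplus(D(fake.detach())).mean()
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator update: produce images the discriminator scores as real.
    g_loss = F.softplus(-D(fake)).mean()
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```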

As alluded to earlier, the researchers’ technique applies augmentations to both the real images from the training data and the fake images produced by the generator. (If the method augmented only the real images, the GAN might learn a different data distribution than the original.) DiffAugment randomly shifts each image within a small range (padding the edges), masks it with a random square half the image size, and adjusts its brightness, saturation, and contrast. Crucially, every transform is differentiable, so gradients can pass through the augmentation back to the generator.
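
The mechanical point is that the same random, differentiable transform is applied to real and generated images alike, in both the discriminator and generator updates, so gradients can flow through the augmented fake images into the generator. The sketch below illustrates this idea with simplified brightness/contrast jitter and cutout transforms; it reuses the toy G, D, and optimizers from the sketch above, and the specific transforms and ranges are assumptions for illustration, not the paper’s exact augmentation policy.

```python
def diff_augment(x):  # x: (N, C, H, W) image batch, values roughly in [-1, 1]
    n, _, h, w = x.shape
    # Random brightness and contrast jitter (differentiable with respect to x).
    x = x + (torch.rand(n, 1, 1, 1, device=x.device) - 0.5)
    mean = x.mean(dim=[1, 2, 3], keepdim=True)
    x = (x - mean) * (torch.rand(n, 1, 1, 1, device=x.device) + 0.5) + mean
    # Cutout: zero out a random square half the image size (still differentiable in x).
    cy = torch.randint(h, (n,), device=x.device)
    cx = torch.randint(w, (n,), device=x.device)
    ys = torch.arange(h, device=x.device).view(1, h, 1)
    xs = torch.arange(w, device=x.device).view(1, 1, w)
    keep = ((ys - cy.view(n, 1, 1)).abs() >= h // 4) | ((xs - cx.view(n, 1, 1)).abs() >= w // 4)
    return x * keep.unsqueeze(1).to(x.dtype)

def gan_step_diffaug(real):
    z = torch.randn(real.size(0), 128)
    fake = G(z)
    # The discriminator only ever sees augmented images -- real and fake alike.
    d_loss = (F.softplus(-D(diff_augment(real))).mean()
              + F.softplus(D(diff_augment(fake.detach()))).mean())
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # The generator update also passes through the augmentation; because it is
    # differentiable, gradients flow back through the transformed G(z) into G.
    g_loss = F.softplus(-D(diff_augment(G(z)))).mean()
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```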


Above: With DiffAugment, GANs can generate high-fidelity images using only 100 Obama portraits, grumpy cat photos, or panda photos. The cats and dogs shown were generated from 160 and 389 training images, respectively.

In experiments on the open source ImageNet and CIFAR-100 corpora, the researchers applied DiffAugment to two leading GANs: DeepMind’s BigGAN and Nvidia’s StyleGAN2. With pretraining, they report that on CIFAR-100 their method improved all the baselines by a “considerable margin,” regardless of architecture, on the Fréchet Inception Distance (FID) metric, which compares images from the target distribution with images from the models under evaluation by feeding both through a pretrained object recognition system and measuring how similar the resulting feature distributions are. More impressively, without pretraining and using only 100 images, the GANs achieved results on par with existing transfer learning algorithms in several image categories (namely “panda” and “grumpy cat”).
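
As a rough illustration of how FID compares two image sets, the sketch below fits a Gaussian to each set’s classifier features and computes the Fréchet distance between them; lower scores mean the generated distribution is closer to the real one. The feature-extraction step with a pretrained Inception network is assumed and omitted here, so the function takes precomputed feature arrays.

```python
import numpy as np
from scipy import linalg

def frechet_distance(real_feats, fake_feats):
    # real_feats, fake_feats: (num_images, feature_dim) arrays of classifier activations.
    mu_r, mu_f = real_feats.mean(axis=0), fake_feats.mean(axis=0)
    cov_r = np.cov(real_feats, rowvar=False)
    cov_f = np.cov(fake_feats, rowvar=False)
    covmean, _ = linalg.sqrtm(cov_r @ cov_f, disp=False)  # matrix square root of the covariance product
    covmean = covmean.real  # discard small imaginary components from numerical error
    return float(np.sum((mu_r - mu_f) ** 2) + np.trace(cov_r + cov_f - 2 * covmean))
```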


“StyleGAN2’s performance drastically degrades given less training data. With DiffAugment, we are able to roughly match its FID and outperform its Inception Score (IS) using only 20% training data,” the coauthors wrote. “Extensive experiments consistently demonstrate its benefits with different network architectures, supervision settings, and objective functions. Our method is especially effective when limited data is available.”

The code and models are freely available on GitHub.
