Google researchers investigate how transfer learning works



Transfer learning’s ability to store knowledge gained while solving a problem and apply it to a related problem has attracted considerable attention. But despite recent breakthroughs, no one fully understands what enables a successful transfer and which parts of algorithms are responsible for it.

That’s why Google researchers sought to develop analysis techniques tailored to explainability challenges in transfer learning. In a new paper, they say their contributions help clear up a few of the mysteries around why machine learning models transfer successfully — or fail to.

During the first of several experiments in the study, the researchers sourced images from a medical imaging data set of chest X-rays (CheXpert) as well as sketches, clip art, and paintings from the open source DomainNet corpus. The team partitioned each image into equal-sized blocks and shuffled the blocks randomly, disrupting the images' visual features, and then compared where models trained from pretrained weights and models trained from scratch agreed and disagreed in their predictions.
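To make the block-shuffling setup concrete, here is a minimal NumPy sketch of the idea, assuming square blocks that evenly divide the image; the exact block sizes and shuffling procedure used in the paper may differ.

```python
import numpy as np

def shuffle_blocks(image: np.ndarray, block_size: int, seed: int = 0) -> np.ndarray:
    """Partition an image into equal-sized blocks and shuffle them randomly.

    Illustrative only: assumes the image height and width are divisible by
    block_size; the paper's exact preprocessing may differ.
    """
    h, w = image.shape[:2]
    rng = np.random.default_rng(seed)
    # Cut the image into a grid of (block_size x block_size) tiles.
    blocks = [
        image[i:i + block_size, j:j + block_size]
        for i in range(0, h, block_size)
        for j in range(0, w, block_size)
    ]
    # Randomly permute the tiles, destroying the image's visual structure
    # while leaving its low-level pixel statistics intact.
    order = rng.permutation(len(blocks))
    blocks = [blocks[k] for k in order]
    # Stitch the shuffled tiles back into a full image.
    cols = w // block_size
    rows = [np.concatenate(blocks[r * cols:(r + 1) * cols], axis=1)
            for r in range(h // block_size)]
    return np.concatenate(rows, axis=0)

# Example: shuffle a 32x32 RGB image in 8x8 blocks.
img = np.random.rand(32, 32, 3).astype(np.float32)
shuffled = shuffle_blocks(img, block_size=8)
```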

The researchers found the reuse of features — the individual measurable properties of a phenomenon being observed — is an important factor in successful transfers, but not the only one. Low-level statistics of the data that weren’t disturbed by things like shuffling the pixels also play a role. Moreover, any two instances of models trained from pretrained weights made similar mistakes, suggesting these models capture features in common.

Working from this knowledge, the researchers attempted to pinpoint where feature reuse occurs within models. They observed that features become more specialized in the deeper layers of a model, while feature reuse is more prevalent in layers closer to the input. (Deep learning models contain mathematical functions arranged in layers that transmit signals from input data.) The researchers also found it's possible to start fine-tuning a pretrained model on a target task from an earlier point in pretraining than originally assumed, without sacrificing accuracy.
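As a rough illustration of the pretrained-versus-scratch comparison the study relies on, here is a hedged PyTorch/torchvision sketch; the architecture, number of target classes, and optimizer settings are placeholders rather than the paper's actual configuration.

```python
import torch.nn as nn
import torch.optim as optim
from torchvision import models  # torchvision >= 0.13 for the `weights` argument

NUM_TARGET_CLASSES = 5  # hypothetical target task, e.g. a small diagnostic label set

# Transfer setup: start from ImageNet-pretrained weights and fine-tune every
# layer on the target task, reusing the general features learned in pretraining.
pretrained = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
pretrained.fc = nn.Linear(pretrained.fc.in_features, NUM_TARGET_CLASSES)

# Baseline: the same architecture initialized randomly and trained from scratch.
scratch = models.resnet18(weights=None)
scratch.fc = nn.Linear(scratch.fc.in_features, NUM_TARGET_CLASSES)

# Both models would then be trained with an identical procedure on the target
# data; comparing which held-out examples each gets right or wrong gives the
# agreement/disagreement analysis described earlier in the article.
optimizer = optim.SGD(pretrained.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()
```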

“Our observation of low-level data statistics improving training speed could lead to better network initialization methods,” the researchers wrote. “Using these findings to improve transfer learning is of interest for future work.”

A better understanding of transfer learning could yield substantial algorithmic performance gains. Google is using transfer learning in Google Translate so insights gleaned through training on high-resource languages — including French, German, and Spanish (which have billions of parallel examples) — can be applied to the translation of low-resource languages like Yoruba, Sindhi, and Hawaiian (which have only tens of thousands of examples). Another Google team has applied transfer learning techniques to enable robot control algorithms to learn how to manipulate objects faster with less data.
