AWS Sets the Stage for Generative AI Models & Clean Rooms

AWS’s Clean Rooms updates usher in a new era for generative AI models, fostering collaboration through automated governance layers.

December 8, 2023

At its re:Invent conference in Las Vegas on Nov. 30, Amazon Web Services (AWS) announced new capabilities that let customers share machine learning models or access a cloud-hosted industry model via AWS Clean Rooms.

AWS announced Clean Rooms one year ago and released it as generally available in March as a data collaboration initiative for industries, allowing organizations to share data through a governance layer. It’s now adding machine learning models to the mix with AWS Clean Rooms ML. Organizations can train a model with data that would normally be too sensitive to share with another organization, sanitizing it to prevent sensitive information from leaking while both parties benefit from the predictive insights. Clean Rooms ML is available as a preview release in the US East (Ohio, N. Virginia), US West (Oregon), Asia Pacific (Seoul, Singapore, Sydney, Tokyo) and Europe (Frankfurt, Ireland, London) regions.

Organizations are already keen to use generative AI models specific to their industry. In our Future of IT survey, we asked IT leaders about their interest in the different varieties of generative AI. Industry-specific models drew the most interest, along with text-based models such as ChatGPT (which has added multi-modal support since we conducted the survey). These types were followed by code-based models, interfaces, speech and audio, and then visual media. We received 518 responses from IT leaders worldwide in the survey, which ran from June to August 2023.

Adam Solomon, head of business development, AWS Data Collaboration Apps, teed up the general use case in a chalk talk at re:Invent. 

“I have data, you have data, we want to collaborate across our datasets and perform some type of analysis, but I don’t want to reveal the granular contents around your data to me,” he explains. “AWS Clean Rooms can help with such analysis.”

Going further, AWS announced plans to release a healthcare model, the first of many models to be supported in 2024. The strategy follows AWS’ approach to cloud services, in which industry-specific services are made available and often backed by a flagship customer in the category. Current customers of Clean Rooms ML include credit data firm Experian, The Weather Company, marketing platform Bridge, and consumer purchase insights provider Affinity Solutions.

IT Leaders Expect Industry-specific Models to Create More Business Value

According to the results of Info-Tech Research Group’s Future of IT survey, conducted online between June and August of 2023, IT leaders rated industry-specific generative AI models as the most interesting, tied with text-based models such as ChatGPT (the survey was run before ChatGPT was updated to include multi-modal support). The third-most interesting type of generative AI was code, followed by interfaces.

[Chart: IT leaders’ interest in generative AI model types, Future of IT survey]

Image source: Info-Tech Research Group.

The concept of shared data stores extends back to pre-digital eras when companies would keep physical records on-site and allow their review by regulators and partners for reasons including compliance and merger and acquisition considerations. Professionals had to sign non-disclosure agreements about using the data found therein. In the digital age, one primary use of shared data stores is to get better customer insights and create advertising markets. For example, since 2022, Google has offered its Publisher Advertiser Identity Reconciliation (PAIR) solution to reconcile first-party customer data between publishers and advertisers.

Solutions more akin to AWS Clean Rooms can be found in SoftwareReviews’ Analytical Data Store quadrant. For example, Snowflake is a dedicated cloud data platform used by customers pursuing a multi-cloud strategy and offers a Global Data Clean Room for its customers to collaborate. Cloudera’s open data lakehouse offers shared data between customers, using AWS as its backend system.

AWS Architecture to Facilitate Cross-organizational Data and Model Collaboration on Clean Rooms

[Diagram: AWS Clean Rooms architecture for cross-organizational data and model collaboration]
Image source: AWS

AWS says Clean Rooms offers multiple layers of privacy protection. It starts with isolating the data a company wants to share, then selecting the partners to collaborate with, determining what data those partners are allowed to analyze, and finally, deciding what protections are applied to the outputs. AWS encrypts all data stored in Clean Rooms but says it is ultimately up to customers not to include data from individuals who have not consented to sharing their data with a third-party provider.
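
To make those layers concrete, here is a minimal Python sketch of how such controls might fit together. It is a toy illustration only and does not use or represent the AWS Clean Rooms API: the `AnalysisRule` class, the `aggregate_count` function, the partner names, and the sample rows are all hypothetical. The sketch assumes a simple rule that lists approved partners, limits which columns they may group on, and suppresses small groups in the output.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AnalysisRule:
    """Hypothetical rule a data owner attaches to a shared table."""
    allowed_partners: frozenset   # who may run queries
    group_by_columns: frozenset   # columns partners may group on
    min_group_size: int           # groups smaller than this are suppressed


def aggregate_count(rows, rule, partner, group_by):
    """Count rows per group, enforcing the rule on the request and the output."""
    if partner not in rule.allowed_partners:
        raise PermissionError(f"{partner} is not an approved collaborator")
    if group_by not in rule.group_by_columns:
        raise PermissionError(f"grouping by {group_by!r} is not allowed")

    counts = defaultdict(int)
    for row in rows:
        counts[row[group_by]] += 1

    # Output protection: drop groups small enough to identify individuals.
    return {key: n for key, n in counts.items() if n >= rule.min_group_size}


# Isolated table: only the columns the owner chose to share (made-up data).
shared_rows = [
    {"region": "east", "segment": "travel"},
    {"region": "east", "segment": "travel"},
    {"region": "east", "segment": "retail"},
    {"region": "west", "segment": "travel"},
]

rule = AnalysisRule(
    allowed_partners=frozenset({"partner_a"}),
    group_by_columns=frozenset({"region"}),
    min_group_size=2,
)

result = aggregate_count(shared_rows, rule, "partner_a", "region")
print(result)  # {'east': 3}
```

In this sketch the partner never sees individual rows, only counts, and the single "west" record is withheld because it falls below the minimum group size. A real service would layer encryption, auditing, and richer query controls on top of this idea.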

Once the architecture layers are in place, the shared data is available through a persistent, subscription-based offering or via a consumption-based API. 

New Choices Await CIOs Charting Their AI Roadmap

Data is the fuel for machine learning models, including new generative AI models. Over the next year, organizations will need to create a data strategy to harvest their own proprietary data to customize and pre-train foundation models and seek other external data sources to complement that effort. Working with other organizations on creating a shared model for a specific purpose is one path to creating value. Consider what has already been done in air travel: an airline-owned data broker creates custom offers personalized to each customer, boosting sales for the airlines that share their shopping data.

Where to build these data alliances is the question. AWS offers a native way to connect all of its customers and is announcing its intent to take the lead in creating industry collaborations. That could give AWS a first-mover advantage if it pools the best data sets and trains the best industry models.

A downside for adopters will be vendor lock-in. You may be able to take your own data out of AWS, but you won’t be able to move the data or models held inside a Clean Rooms data store. Integration through a middleware layer may be possible, but it adds complexity. CIOs pursuing multi-cloud strategies will have to think carefully about where they want to source this capability and whether it should come from a multi-cloud gateway.

Sharing data directly should only be considered by organizations with mature data governance in place.


Brian Jackson

Research Director, Info-Tech Research Group

As a Research Director in the CIO practice, Brian focuses on emerging trends, executive leadership strategy, and digital strategy. After more than a decade as a technology and science journalist, Brian has his fingers on the pulse of leading-edge trends and organizational best practices towards innovation. Prior to joining Info-Tech Research Group, Brian was the Editorial Director at IT World Canada, responsible for the B2B media publisher’s editorial strategy and execution across all of its publications. A leading digital thinker at the firm, Brian led IT World Canada to become the most award-winning publisher in the B2B category at the Canadian Online Publishing Awards. In addition to delivering insightful reporting across three industry-leading websites, Brian also developed, launched, and grew the firm’s YouTube channel and podcasting capabilities. Brian started his career with Discovery Channel Interactive, where he helped pioneer Canada’s first broadband video player for the web. He developed a unique web-based Live Events series, offering video coverage of landmark science experiences including a Space Shuttle launch, a dinosaur bones dig in Alberta’s badlands, a concrete canoe race competition hosted by Survivorman, and FIRST’s educational robot battles. Brian holds a Bachelor of Journalism from Carleton University. He is regularly featured as a technology expert by broadcast media including CTV, CBC, and Global affiliates.