Opendoor on using data science to close real estate deals

Image Credit: Alexander Raths / Shutterstock


The real estate industry isn’t the first industry that usually comes to mind when discussing ways to apply machine learning algorithms. The seller wants to sell the property and the buyer wants to buy it — it is just a matter of closing the deal. The stumbling block is agreeing on the price for that deal. Accurately assessing property value is a complicated process, and one that requires a lot of different data sources and scalable pricing models. The buyer can’t just reference an itemized list of all the possible factors and their associated price values and sum up all the property’s features to calculate the total value.

An automated valuation model (AVM) is a machine learning model that estimates the value of a property, usually by comparing the property in question to similar nearby properties that have recently sold (“comps”). Real estate company Opendoor relies on its own AVM, the Opendoor Valuation Model, both to value homes and to look up information about the comps (for example, to understand the difference in value between a comp and the property in question). The company has invested heavily in data science almost since its founding, incorporating different data sources and refining its algorithms to improve the model’s accuracy.

In a conversation with VentureBeat, Opendoor’s Sam Stone described why the company built the Opendoor Valuation Model and how data science fits into the real estate industry. With the company’s plans to expand from 30 markets to 42 markets by the end of the year, and to add new types and price points of homes, data science is expected to remain a core part of the company’s strategy, according to Stone.

This interview has been edited for clarity.

VentureBeat: What was the problem Opendoor was having, and why did it decide that investing in data science in-house was the answer? What benefits did the company expect to gain with scalable pricing models and investment in data science?

Sam Stone: Since our founding, we’ve always done our data science in-house and leveraged both our own and third-party data for our models. We recognized that modernizing the outdated, manual process of pricing homes could benefit consumers in terms of price certainty and the ability to more quickly take advantage of the equity in their home.

For most people, their home is their largest financial asset, and they are highly attuned to its value. It’s critical that our algorithms incorporate all of the important features of a home. Since every home is unique and market conditions are constantly changing, pricing homes accurately requires constantly evolving solutions. That means we have to invest heavily in both our algorithms and our team of in-house pricing experts to make sure that the algorithms and experts work seamlessly together.

VentureBeat: What did Opendoor already have that made it feasible to build out the Opendoor Valuation Model rather than hiring the work out to another company?

Stone: Accurate and performant pricing systems are core to our business model. Our initial automated valuation model stems from lines of code our co-founder and CTO, Ian Wong, wrote back in 2014.

Since then we’ve made enormous investments in technology and data science. We’ve developed different types of machine learning models, which has included ingesting and testing new datasets. We’ve built out processes to hire, grow and retain top-notch machine learning engineers and data scientists. And, at the same time, we’ve invested heavily in expanding our expert insights by arming our pricing experts with customized tools to track local nuances across our markets.

It’s fair to say that pricing systems are core to our DNA as a company.

We’re always eager to learn from new datasets, new products and new vendors. But we’ve yet to see any third party that comes close to matching the overall accuracy, coverage, or functionality of our in-house suite of pricing systems.

VentureBeat: Tell me a bit about Opendoor Valuation Model. What kind of data science analysis and investment went into building this model?

Stone: Opendoor Valuation Model, or “OVM,” is a core piece of pricing infrastructure that feeds into many downstream pricing applications. This includes our home offers, how we value our portfolio and assess risk, and what decisions we’ll make when we resell a home.

One element of OVM is based on a set of structural insights about how buyers and sellers evaluate prices and decide on home purchase bids. They look at the prices of comparable homes in the neighborhood that sold recently (often referred to as “comps”) and adjust their home price up or down depending on how they think their home compares. But how do you decide what makes one home “better or worse” than another? It’s not a black-and-white equation; it is much more complex. Homes have unique features, ranging from the square footage and backyard space to the number of bathrooms and bedrooms, layout, natural light and much more.

OVM is fed by a multitude of other data sources, including property tax information, market trends, and many home- and neighborhood-specific signals.
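The comps-based reasoning Stone describes can be illustrated with a small sketch. This is not Opendoor's code: the `Home` and `Comp` types, the `ADJUSTMENTS` table, and every dollar figure are hypothetical, and fixed per-feature adjustments are exactly the kind of simplification the interview says real pricing has to go beyond.

```python
from dataclasses import dataclass


@dataclass
class Home:
    sqft: float
    bedrooms: int
    bathrooms: float


@dataclass
class Comp:
    home: Home
    sale_price: float


# Hypothetical dollar value of one extra unit of each feature.
# Illustrative numbers only; not taken from the interview.
ADJUSTMENTS = {"sqft": 150.0, "bedrooms": 10_000.0, "bathrooms": 7_500.0}


def adjusted_comp_price(subject: Home, comp: Comp) -> float:
    """Shift the comp's sale price by the feature gap to the subject home."""
    price = comp.sale_price
    for feature, dollars_per_unit in ADJUSTMENTS.items():
        gap = getattr(subject, feature) - getattr(comp.home, feature)
        price += gap * dollars_per_unit
    return price


def estimate_value(subject: Home, comps: list[Comp]) -> float:
    """Average the adjusted prices of all comps (equal weights)."""
    if not comps:
        raise ValueError("at least one comp is required")
    adjusted = [adjusted_comp_price(subject, comp) for comp in comps]
    return sum(adjusted) / len(adjusted)


if __name__ == "__main__":
    subject = Home(sqft=2000, bedrooms=3, bathrooms=2.0)
    comps = [
        Comp(Home(sqft=1900, bedrooms=3, bathrooms=2.0), 400_000.0),
        Comp(Home(sqft=2100, bedrooms=4, bathrooms=2.0), 450_000.0),
    ]
    print(estimate_value(subject, comps))
```

Here the smaller comp is adjusted upward and the larger one downward before averaging. Choosing which comps to use, how large each adjustment should be, and how much weight each comp deserves are the hard parts that this sketch leaves out.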

VentureBeat: What does OVM look like under the hood? What did you have to build in order to get this up and running?

Stone: When we started building OVM, we kept it straightforward, relying mainly on linear statistical models. Starting with relatively simple models forced us to focus on developing a deep understanding of buyers’ and sellers’ thought processes. We could verify and grow our data quality, rather than getting caught up in fancy math.
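Stone does not say which linear models were used. As one simple example of the category, the sketch below fits an ordinary least-squares line of sale price against square footage; the function names and the sample data are assumptions made purely for illustration.

```python
def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Closed-form least-squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope: float, intercept: float, x: float) -> float:
    """Evaluate the fitted line at x."""
    return slope * x + intercept


if __name__ == "__main__":
    # Made-up sales: square footage and sale price.
    sqft = [1000.0, 1500.0, 2000.0]
    price = [200_000.0, 300_000.0, 400_000.0]
    slope, intercept = fit_line(sqft, price)
    print(predict(slope, intercept, 1800.0))
```

A model this simple is easy to inspect: the slope reads directly as dollars per square foot, which makes data problems visible early.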

As we’ve come to understand the behavior of buyers and sellers better over the years, we’ve been able to move to more sophisticated models. OVM is now based on a neural network, specifically an architecture called a Siamese Network. We use this to embed buyers’ and sellers’ behaviors, including selecting comps, adjusting them and weighting them.
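The defining trait of a Siamese network is that one encoder, with shared weights, embeds both inputs so they can be compared in the same space. The sketch below shows that idea applied to weighting comps. It is not OVM: the encoder weights `W`, the feature layout, and the softmax weighting are all assumptions, and a real network would learn its weights from sales data rather than use fixed numbers.

```python
import math

# Hypothetical shared encoder weights (2 outputs x 3 inputs).
# Input features: [sqft in thousands, bedrooms, bathrooms].
W = [
    [0.6, 0.1, 0.0],
    [0.0, 0.2, 0.3],
]


def embed(features: list[float]) -> list[float]:
    """Shared encoder: one linear layer followed by tanh.

    The same function, with the same weights, is applied to the subject
    home and to every comp. That sharing is the "Siamese" part.
    """
    return [math.tanh(sum(w * x for w, x in zip(row, features))) for row in W]


def distance(a: list[float], b: list[float]) -> float:
    """Euclidean distance between two embeddings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def comp_weights(subject: list[float], comps: list[list[float]]) -> list[float]:
    """Softmax over negative embedding distances: closer comps weigh more."""
    subject_embedding = embed(subject)
    scores = [-distance(subject_embedding, embed(comp)) for comp in comps]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(score - top) for score in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_estimate(
    subject: list[float], comps: list[list[float]], prices: list[float]
) -> float:
    """Weighted average of comp sale prices using the embedding weights."""
    weights = comp_weights(subject, comps)
    return sum(w * p for w, p in zip(weights, prices))


if __name__ == "__main__":
    subject = [2.0, 3, 2]
    comps = [[2.0, 3, 2], [3.5, 5, 4]]
    prices = [400_000.0, 700_000.0]
    print(comp_weights(subject, comps))
    print(weighted_estimate(subject, comps, prices))
```

In this toy setup the comp identical to the subject home receives the larger weight, so the estimate is pulled toward its sale price.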

We’ve seen repeatedly that a “state of the art” machine learning model isn’t enough. The model’s architecture needs to reflect how buyers and sellers actually behave.

We have multiple teams, composed of both engineers and data scientists, who are constantly working on our OVM. These teams collaborate closely with operators, who have deep local expertise, often incorporating them into product sprints. The process of developing, QA’ing, and releasing our first neural-network-based version of OVM was a cross-team effort that took many months.

VentureBeat: What is the purpose of the human+machine learning feedback loop?

Stone: Our in-house pricing experts play a key role across our pricing decisions, working in conjunction with our algorithms. We rely on pricing experts at various stages:

  • Adding or verifying input data. For example, assessing the quality of appliances or finish levels, which are inputs that are important but hard to quantify algorithmically. Humans are much better at this.
  • Making intermediate decisions. For example, what features of the home might make it particularly hard to value?
  • Making user-facing decisions. For example, given a set of buyer offers on a home in our portfolio, which, if any, should we accept?

While we may do more or less automation on a particular area or task at a given point in time, we have always believed that in the long term, the best strategy is to marry pricing experts and algorithms. Algorithms help us better understand the strengths and weaknesses of expert insight, and vice versa.

VentureBeat: What would you do differently if you were building out OVM now, with the lessons learned from last time?

Stone: Ensuring high-quality input data, under all circumstances and for all fields, is always the top priority.

The model that is most accurate in a time of macroeconomic stability is not necessarily the model that is most accurate in a time of economic crisis, such as the financial crisis of 2007-2008 or the COVID-19 global pandemic. Sometimes it makes sense to invest in forecasting features that don’t help accuracy during “normal” times, but can help a lot in rare but highly uncertain times.

This past year has taught us that we can price homes using interior photos and videos shared by sellers. Prior to COVID-19, we would inspect home interiors in person. However, when the pandemic began, we stopped in-person interactions for safety reasons. As a result, we turned the interior assessment into a virtual one and learned that it’s actually much easier for sellers.
