Pinterest open-sources big data analytics tool Querybook

Pinterest headquarters in San Francisco.
Image Credit: Jordan Novet / VentureBeat


Pinterest today open-sourced Querybook, a data management solution for enterprise-scale remote engineering collaboration. The company says the tool, which it uses internally, can help engineers compose queries, create analyses, and collaborate with one another via a notebook interface.

Querybook started in 2017 as an intern project at Pinterest. The development team early on decided on a document-like interface where users could write queries and analyses in one place, with collocated metadata and the simplicity of a note-taking app. Released internally in March 2018, Querybook became the go-to solution for big data analytics at Pinterest. It now averages 500 daily active users and 7,000 daily query runs.

“With Querybook, Pinterest engineers have brought together the power of metadata with the simplicity of a note-taking app for a better querying interface, where teams can compose queries and write analyses all in one place,” a spokesperson told VentureBeat. “Querybook can be set up and deployed in minutes.”

Every query executed on Querybook gets analyzed to extract metadata like referenced tables and query runners. Querybook uses this information to automatically update its data schema and search ranking, as well as to show a table’s frequent users and query examples. The more queries in Querybook, the better documented the tables become.
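As a rough illustration of this idea (not Querybook's actual implementation), the sketch below pulls table names out of executed queries with a simple regular expression and keeps a running tally of who queried each table, along with the queries themselves as examples. The names `extract_tables` and `TableUsageIndex` are hypothetical, and a production system would more plausibly rely on a real SQL parser than on a regex.

```python
import re
from collections import Counter, defaultdict
from typing import Dict, List, Set

# Assumption: a table reference is an identifier that follows FROM or JOIN.
# This is a deliberate simplification; it ignores CTEs, quoted names, etc.
TABLE_PATTERN = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def extract_tables(query: str) -> Set[str]:
    """Return the lower-cased table names referenced in a SQL query."""
    return {name.lower() for name in TABLE_PATTERN.findall(query)}


class TableUsageIndex:
    """Hypothetical in-memory store of per-table users and example queries."""

    def __init__(self) -> None:
        self.users: Dict[str, Counter] = defaultdict(Counter)
        self.examples: Dict[str, List[str]] = defaultdict(list)

    def record(self, user: str, query: str) -> None:
        """Update the index with one executed query."""
        for table in extract_tables(query):
            self.users[table][user] += 1
            self.examples[table].append(query)

    def frequent_users(self, table: str, n: int = 3) -> List[str]:
        """Return up to n users who queried the table most often."""
        counts = self.users.get(table.lower(), Counter())
        return [user for user, _ in counts.most_common(n)]


if __name__ == "__main__":
    index = TableUsageIndex()
    index.record("alice", "SELECT * FROM pins p JOIN boards b ON p.board_id = b.id")
    index.record("bob", "select count(*) from pins")
    index.record("alice", "SELECT id FROM pins")
    print(index.frequent_users("pins"))
    print(index.examples["boards"])
```

Each recorded query adds to the per-table user counts and example list, which mirrors the article's point that table documentation improves as more queries are run.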

Querybook also features an admin interface that lets companies configure query engines, table metadata ingestion, and access permissions. From this interface, admins can make live Querybook changes without going through code or config files. Querybook also supports visualizations, including line, bar, stacked-area, pie, donut, scatter, and table charts.

“The common starting point for any analysis at Pinterest is an ad-hoc query that gets executed on the internal Hadoop or Presto cluster. To continuously make these improvements, especially in an increasingly remote environment, it’s more important than ever for teams to be able to compose queries, create analyses, and collaborate with one another,” Pinterest wrote in a blog post. “We built Querybook to provide a responsive and simple web user interface for such analysis so data scientists, product managers, and engineers can discover the right data, compose their queries, and share their findings.”

Pinterest previously open-sourced Teletraan, a tool that can deploy code onto virtual machines, such as those available from the public cloud provider Amazon Web Services. Prior to this, the company released Terrapin, software designed to more efficiently push data out of Hadoop, the open source big data framework, and make it available for other systems to use.
