Commentary
4/8/2019 09:30 AM
Q&A: Looker’s Nick Caldwell Discusses Data-Driven Workforces

New cloud infrastructure and connected data are changing how collaboration and data management are handled.

Connected data in the cloud is driving a sea change, says Nick Caldwell, chief product officer for Looker, a provider of business intelligence and big data analytics software. He spoke with InformationWeek about data engineering and data-driven workforces. He previously worked as vice president of engineering at Reddit and, before that, was general manager for Microsoft's Power BI business intelligence and analytics service.

He says his time at Reddit opened his eyes to new ways of approaching data challenges, using Amazon Web Services and Google BigQuery for a site whose users generate massive amounts of data. Now at Looker, he sees activity in the market around how people use data and how infrastructure is evolving, with an expectation that data will be integrated into the tools people use. This goes beyond data professionals to include factory workers, school teachers, and students. He sees this trend leading to growth in SaaS applications that sit atop large datasets.

How do trends in connected data and business intelligence affect cloud infrastructure?

“Modern cloud infrastructure, in massively parallel data warehouses, allows you to dump enormous amounts of data at low cost without losing any sort of performance. Increasingly, analytics are being baked directly into the data store. Google BigQuery has an extension called BigQuery ML (BQML), where you can dump your data in and run machine learning jobs, including TensorFlow models, directly within the database. The databases are cheaper, faster, and very, very powerful. That trend is something Looker has latched onto.
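As a rough illustration of the in-database training Caldwell describes, BigQuery ML models are created with SQL statements run inside the warehouse itself. The sketch below, in Python, renders such a statement; the dataset, table, and column names are hypothetical placeholders, not from the interview.

```python
# Sketch: training a model inside BigQuery with BigQuery ML (BQML).
# The dataset, table, and column names here are hypothetical placeholders.

def bqml_create_model(dataset: str, model_name: str,
                      label_col: str, feature_cols: list) -> str:
    """Render a BQML CREATE MODEL statement for a logistic regression."""
    cols = ", ".join(feature_cols + [f"{label_col} AS label"])
    return (
        f"CREATE OR REPLACE MODEL `{dataset}.{model_name}` "
        f"OPTIONS(model_type='logistic_reg') AS "
        f"SELECT {cols} FROM `{dataset}.training_data`"
    )

sql = bqml_create_model("analytics", "churn_model", "churned",
                        ["tenure_days", "plan_tier"])
print(sql)
```

In practice a string like this would be submitted through a client such as the google-cloud-bigquery library; the point is that training happens where the data already lives, with no export step.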

Image: Egor - AdobeStock

“It means that no matter how many of these new SaaS apps or data sources pop up, you can push them into one of these massively parallel data lakes. Then, rather than take the older-generation approach of ETL (extract, transform, load) jobs, data marts, and aggregate tables, you just push the data into one massively parallel warehouse and use a technique called schema on read.
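Schema on read, mentioned above, means storing records untyped and applying a schema only at query time, rather than transforming the data up front. A minimal, self-contained sketch in Python (the field names and records are hypothetical):

```python
import json

# Sketch of schema on read: raw records are stored as-is, and a schema
# is applied only when the data is queried. Field names are hypothetical.

RAW_EVENTS = [
    '{"user": "a1", "amount": "19.99", "ts": "2019-04-08"}',
    '{"user": "b2", "amount": "5.00", "ts": "2019-04-08"}',
]

# The schema lives with the query, not with the storage layer.
SCHEMA = {"user": str, "amount": float, "ts": str}

def read_with_schema(raw_lines, schema):
    """Parse and coerce raw records against the schema at read time."""
    for line in raw_lines:
        record = json.loads(line)
        yield {field: cast(record[field]) for field, cast in schema.items()}

# Amounts are coerced from strings to floats only when the query runs.
total = sum(row["amount"] for row in read_with_schema(RAW_EVENTS, SCHEMA))
print(total)
```

The design tradeoff is the one Caldwell implies: ingestion becomes trivial because nothing is transformed up front, and the interpretation of the data can change without reloading anything.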

What are the advantages of such consolidation of data?

“After you’ve pushed all the data into one spot, you can use a semantic layer that describes what you think the data should look like. Given that I have Marketo, Zendesk, and other software and tools in one spot, what are the tables I actually care about? What are the business metrics I care about? Tell the semantic layer how to compute and calculate those things. What Looker does is take that semantic description and convert it into SQL queries that are optimized for whatever the underlying data store is.

“If you’ve got all your data in BigQuery, Looker’s going to know how to take your semantic understanding of the data and convert it into the actual SQL queries to run against BigQuery in the most efficient way. This has a lot of advantages from an infrastructure perspective in terms of time to value. If I’m on a data engineering team that previously spent all its time maintaining complex ETLs from these different data stores, updating data marts, and responding to end users who want to ask new business questions, that’s a very costly cycle to iterate, because I have to change how the data is transformed at multiple steps.
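The semantic-layer idea described above can be sketched as a small compiler from metric definitions to SQL. This is loosely inspired by the concept, not Looker's actual LookML; the model, table, and metric names are hypothetical.

```python
# Sketch of a semantic layer compiled to SQL, loosely inspired by the
# approach described in the interview (not Looker's actual LookML).
# The table, dimension, and measure names are hypothetical.

SEMANTIC_MODEL = {
    "table": "warehouse.zendesk_tickets",
    "dimensions": {"region": "region"},
    "measures": {
        "ticket_count": "COUNT(*)",
        "avg_resolution_hours": "AVG(resolution_hours)",
    },
}

def compile_query(model, dimensions, measures):
    """Translate a semantic request into a concrete SQL query."""
    select = [model["dimensions"][d] for d in dimensions]
    select += [f'{model["measures"][m]} AS {m}' for m in measures]
    group_by = ", ".join(model["dimensions"][d] for d in dimensions)
    sql = f'SELECT {", ".join(select)} FROM {model["table"]}'
    if group_by:
        sql += f" GROUP BY {group_by}"
    return sql

print(compile_query(SEMANTIC_MODEL, ["region"], ["ticket_count"]))
```

The payoff is the one Caldwell describes next: when a business definition changes, only the semantic model is edited, and every generated query picks up the change.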

Nick Caldwell, Looker (Image: Joao-Pierre Ruth)

How can you simplify that process?

“With Looker, you just change the semantic layer. There is one place that multiple developers can edit at the same time, using tools that allow for large-scale collaboration. We have customers such as Square whose developers all work on the semantic model at the same time, then let other departments within Square use that model to deliver dashboards, build custom applications, and create other sorts of experiences.

“Looker is corralling this ever-exploding mountain of data and SaaS applications and putting a governed, well-understood API in front of that data, like a central layer that tells you ‘this is the truth.’ On top of that you can build different experiences, from exploratory dashboards to something like Deliveroo, which has an app that delivery drivers use. I saw a demo where the API was being used to optimize bid campaigns for marketing spend. You can use it in all sorts of different ways, but the fundamental thing is they are now all trusted, and there is one place where you can rapidly iterate on what the definition of data truth is.

How do you deal with friction that arises if customers are used to handling data a certain way?

“There’s friction because we’re a fundamentally different architecture. It is very different from how the majority of companies architect a data warehouse. Typically, when we go into an account, the customer has legacy systems on premises and is trying to figure out how to join the modern cloud revolution. In those cases, they quickly realize all of the trends that have been underway. They discover for themselves, ‘Given all of these new capabilities, maybe I don’t need to do things the old way.’ Then they look for a solution that works using the modern approach and hit upon Looker.

“It’s a different architecture and a different approach, built from the ground up with a cloud-first governance layer in mind. The previous generation was workbook chaos: just give everyone in the organization a Tableau workbook and let them edit however they want. You were giving up things to accommodate speed or convenience. With Looker, you still get to see all of the data; you still get the performance, because you’re using a modern data store, but you don’t get the chaos.”

Joao-Pierre S. Ruth has spent his career immersed in business and technology journalism, first covering local industries in New Jersey, later as the New York editor for Xconomy delving into the city's tech startup community, and then as a freelancer for such outlets as ...