Reusable Datasets: building blocks for business users

OVERVIEW

For 10 years, analysis in Mode always started with a SQL query. With mounting pressure to convert our less technical users into content creators by enabling true self-service, we found ourselves in a unique position: designing for business users, through the eyes of analysts.

DEFINING A DATASET

To help analysts quickly organize and deliver these reusable results to their stakeholders, we introduced the Data View Component, composed of three sections:

Data: the tabular result set
Fields: metadata on the columns themselves
Source: the underlying SQL code that was used to produce the results

While many competitive BI tools worked to obfuscate code at every opportunity, we felt it was critically important to allow users to peek behind the curtain and see how the sausage was made, while still making Datasets feel elevated and distinct from ad-hoc queries within Reports.

ENTRY POINTS

As foreshadowed in the intro, customers had been conditioned for roughly a decade to expect that clicking the global plus button in Mode would always land you in a blank query editor. This starting point could feel horrifyingly intimidating to a non-technical user, but delightfully efficient for, well, query writers.

With an opportunity to now decide whether a Report should start with a SQL query or leverage an existing Dataset, we had to strike a balance: the new entry point needed to feel approachable to our business user audience without adding too much friction for our power users. This admittedly took us a couple of iterations, but ultimately we found it best to focus on what the user was trying to accomplish, rather than organizing the entry points by the content they’d be creating as a result.

BEFORE

AFTER

DEPENDENCY MANAGEMENT

I believe we got two things right here, and one thing very wrong. In addition to unlocking code-free content creation, Datasets also promised a more performant and cost-effective experience. Run your Dataset once, use its cached results across dozens o’ Reports. We introduced a usage modal to help Dataset creators understand how and where a Dataset was being used…good. We also built in automatic fallback behavior so that if for whatever reason the query powering a Dataset failed, any Reports using the data would fall back to the last successful run…good! But the final decision in this camp, though well-intentioned, proved to hurt more than help.

I had great admiration for Figma’s approach to components, and the fact that they gave individual designers the control to consciously pull component library updates into individual projects. You were proactively notified of changes, but never forced to accept them. This made it extremely easy to avoid disruptions until you were ready.

In theory, it made sense to offer the same control to content creators in Mode. What if one of the fields you were leveraging to power a KPI on your very important dashboard was no longer in the latest Dataset version? It could be frustrating to have that change forced upon you. So instead, we required users to refresh their Reports in order to see the latest changes. In hindsight, I believe we over-indexed on updates that might include breaking changes, when in reality, the latest update typically just included more up-to-date rows, and it was frustrating to users to have to intentionally load them in every time.
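For the technically curious, the two behaviors described above (falling back to the last successful run, and only picking up a newer run when the user explicitly refreshes) can be sketched roughly as follows. This is a simplified illustration, not Mode's actual implementation: the Run, Dataset, and Report classes and every method name here are hypothetical, invented purely for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Run:
    """One execution of the query behind a Dataset (hypothetical model)."""
    version: int
    succeeded: bool
    rows: tuple = ()


@dataclass
class Dataset:
    runs: list = field(default_factory=list)

    def record_run(self, succeeded, rows=()):
        # Versions simply count runs, failed or not.
        run = Run(len(self.runs) + 1, succeeded, tuple(rows))
        self.runs.append(run)
        return run

    def last_successful_run(self) -> Optional[Run]:
        # Fallback behavior: skip failed runs and serve the most
        # recent run that actually produced results.
        for run in reversed(self.runs):
            if run.succeeded:
                return run
        return None


@dataclass
class Report:
    dataset: Dataset
    pinned_version: Optional[int] = None

    def update_available(self) -> bool:
        # The Report can notify its owner about a newer run
        # without being forced onto it.
        latest = self.dataset.last_successful_run()
        return latest is not None and latest.version != self.pinned_version

    def refresh(self) -> None:
        # Opt-in step: the user deliberately pulls in the latest run.
        latest = self.dataset.last_successful_run()
        if latest is not None:
            self.pinned_version = latest.version
```

In this sketch a failed run never changes what a Report shows, while a successful run only flips update_available() to true; nothing moves until refresh() is called. That second property is exactly the friction described above when the "update" is just fresher rows.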

Happy to report that later this year, we’ll be adding behavior to allow users to automatically refresh their Reports if any of the Datasets in use have updated data available!