How to Data Analytics (in a Start-up)

3 Lessons I Learned as the “Chief Data Analyst” of a Silicon Valley–funded Start-up

From 2015 till 2017 I helped grow HoloBuilder Inc., a start-up providing virtual reality solutions for the construction industry, as their VP of Customer & Data Analytics & Optimization, which roughly translates to “Chief Data Analyst”. The company is headquartered in San Francisco, while I was part of their R&D lab in Aachen, Germany. I was responsible for the whole data analytics* pipeline: from collecting data on the web platform using Google Analytics and our own trackers, to processing the data in Google BigQuery, to visualizing it using tools like Power BI and Klipfolio. During my time in Aachen I learned lots of valuable lessons. Here, I want to share with you the three most important ones that directly concern data analytics (please scroll down for a TL;DR).

[Infographic: How to Data Analytics]

1. Data Analytics ≈ UX Design

Data analytics is a lot like UX design. You have specific target audiences that expect to experience what you provide them with in the best possible way in terms of content, presentation, and possibly interactivity. For instance, providing data for C-level management and for potential investors are two completely different stories. While management requires low-level insights concerning the software itself, among other things, for VCs we usually prepared more high-level business metrics including projections and forecasts, due to the different requirements. Moreover, internal data would usually be provided through dynamic dashboards that could be adjusted and customized, while data for investors would rather be delivered in the form of PowerPoint slides that matched the layout of the pitch deck. Therefore, it is crucial to define a target audience (potentially even personas) and to elicit requirements from that audience at the beginning of every data science process. At HoloBuilder Inc., this lesson became especially clear because of the split between San Francisco and Germany and the fact that most of the (potential) VCs were based in Silicon Valley.

I am convinced that a data analyst without some proper UX skills — and, of course, adequate requirements and input — cannot be successful.

2. Ask the Right Questions, and Do So Early

This one goes hand in hand with requirements elicitation. Don’t provide analyses just for the sake of it!

This whole “let’s just analyze everything we can get” thing doesn’t work! It’s extremely important to define the questions you intend to answer beforehand. Tracking is cheap, so you can (and should!) track more than you need (at the moment). But the processing and visualization of data that nobody ever looks at eats up a whole lot of resources that would be required for the meaningful analysis and presentation of the few nuggets that are buried in your giant pile of big data. Also, having concrete questions in mind greatly helps with tailoring data structures more precisely to your specific needs. Of course, this doesn’t mean that your infrastructure doesn’t have to be flexible enough to quickly react to changing and new questions that need to be answered. In an optimal world, the data for answering new questions is already there and you “just” have to do the processing and visualization. In general: Expect surprise on-demand questions anytime! Therefore, anticipate and be prepared!

(While the questions that need to be answered can be seen as part of the requirements elicitation, I treat them separately here, because I give requirements a more technical connotation — e.g., “possibility to toggle between line/bar charts” or “include difference to previous period in %” — compared to key questions such as “Why do we lose users?”.)
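A technical requirement such as “include difference to previous period in %” boils down to a small piece of arithmetic. The following is only a minimal sketch of that calculation; the function name and the decision to return None when the previous period is zero are assumptions made for illustration, not something taken from our actual dashboards.

```python
from typing import Optional


def percent_change(current: float, previous: float) -> Optional[float]:
    """Return the change from `previous` to `current` in percent.

    Returns None if `previous` is zero, because the relative change
    is undefined in that case (assumed convention for this sketch).
    """
    if previous == 0:
        return None
    return (current - previous) / previous * 100.0


# Illustrative, made-up values: 120 sign-ups this week vs. 100 last week.
growth = percent_change(120, 100)
decline = percent_change(90, 120)
```

A key question like “Why do we lose users?” cannot be answered by such a function; it only covers the presentation side of the requirement.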

3. Data is Meaningless …

… unless you give it meaning by interpreting it. For this, it’s essential not to think in silos. A data analytics team has to cooperate closely with the UX team and (almost) all other teams in the company in order to find meaningful interpretations of, or reasons for, the collected data. Yet, this is still not the norm in industry. For instance, there is still the widely held misconception that A/B testing = usability testing.

To ensure meaningful data analytics, at HoloBuilder Inc., marketing manager Harry Handorf and I developed a boilerplate for a weekly KPI report that posed three crucial questions:

  1. Which data did we collect?
  2. What are the reasons the data looks like that?
  3. What actions must/should be taken based on the above?

That is, the first part delivered the hard facts; the second part explained these numbers (e.g., fewer sign-ups due to a change in the UI); and the third part presented concrete calls to action (e.g., undo the UI change). The report looked at those questions from the platform as well as the marketing perspective. Therefore, we had to collaborate extensively with software engineers, designers, UX people, marketing and sales to find meaningful answers. In line with the second lesson above, the basis of the report was always a set of higher-level questions defined beforehand, such as: “Does the new tutorial work?”, “How can we gain more customers?”, and “Have we reached our target growth?”. In general, the interpretation of data is based on the processed data and the questions to be answered, rather than on technical requirements (see infographic above).
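The three-part structure of the report can be sketched as a small data structure. This is not the boilerplate we actually used at HoloBuilder Inc.; the class name, field names and rendering format are hypothetical and only illustrate how each higher-level question is tied to facts, reasons and actions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class KpiReportEntry:
    """One higher-level question, answered in the three parts of the report."""

    question: str                                      # defined beforehand
    facts: List[str] = field(default_factory=list)     # 1. Which data did we collect?
    reasons: List[str] = field(default_factory=list)   # 2. Why does the data look like that?
    actions: List[str] = field(default_factory=list)   # 3. What must/should be done?

    def is_complete(self) -> bool:
        """An entry is only useful if all three parts are filled in."""
        return bool(self.facts and self.reasons and self.actions)

    def render(self) -> str:
        """Render the entry as plain text for the weekly report."""
        lines = [f"Question: {self.question}"]
        sections = [("Facts", self.facts), ("Reasons", self.reasons), ("Actions", self.actions)]
        for title, items in sections:
            lines.append(f"{title}:")
            lines.extend(f"  - {item}" for item in items)
        return "\n".join(lines)


# Example based on the illustration in the text (no real figures).
entry = KpiReportEntry(
    question="Have we reached our target growth?",
    facts=["Fewer sign-ups than in the previous week"],
    reasons=["Change in the UI"],
    actions=["Undo the UI change"],
)
report_text = entry.render()
```

The is_complete check reflects the main point of this lesson: numbers without reasons and actions should not count as a finished report entry.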

Again, because this is really important: Your data is worth nothing without proper interpretation and input from outside the data analytics department.

Finally, to conclude this article, I don’t want to withhold from you Harry’s take on the topic:

“You might have heard of the metaphor for life feeling like a tornado. It perfectly applies to working with data of a young business: it spins you around with all of its metrics, data points and things you COULD measure. It’s noisy and wild. A good data scientist figures out how to step out of it. But that does not mean getting out of the tornado completely, letting it do its thing and becoming a passive spectator. It means getting inward, to the eye. Where silence and clarity allow for a better picture of what’s going on around you, defining appropriate KPIs and asking the right, well thought-out questions.”
—Harry Handorf (tornado tamer)

TL;DR

  1. Data analytics is a lot like UX design! As a data analyst, you have to define target audiences and elicit requirements. Tailor content & presentation of your analyses to those.
  2. Define the questions to be answered beforehand, then process and interpret the data necessary to answer those questions. Don’t analyze everything you can just for the sake of it.
  3. Data is meaningless without interpretation. Extensively collaborate with other departments — especially UX — to ensure meaningful data analytics.

(This article has also been published in Startups.co on Medium.)

Footnotes

* What we did at HoloBuilder Inc. was clearly a mix of data analytics and data science. But since it was closer to the analytics part, I refer to it as data analytics in this article. In case you are interested in the specific differences between the two (and how difficult it is to tell them apart), I recommend reading the Wikipedia articles about data science and data analytics, as well as “Data Analytics vs Data Science: Two Separate, but Interconnected Disciplines” by Jerry A. Smith.

Acknowledgments

Special thanks go to Harry for proofreading the article & his valuable input.


What ’bout some fancy dashboards for ya? Power BI vs. Geckoboard

In my capacity as the chief data analyst of bitstars, it’s one of my key responsibilities to regularly compile all relevant figures concerning our web platform HoloBuilder. These figures are mostly intended for people who don’t have the time to dive deeply into some fancy but complicated statistics. Hence, from the user experience perspective it’s crucial to provide them in an easy-to-understand and pleasant-to-look-at form. A well-established way of doing so is to use data visualizations of different kinds, provided as dashboards for optimal accessibility. Since we are currently redesigning our internal process for providing figures and statistics, I’ve done some research on two potential software solutions that could be used for this.

Requirements

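Since we are talking about a solution for our internal process at bitstars, there is a set of company-specific requirements that any candidate software has to fulfill. In particular, these requirements are:

  1. Cumulative charts
  2. Google Analytics integration
  3. AdWords integration
  4. Facebook Ads integration
  5. MailChimp integration

Moreover, there are two nice-to-haves:

  1. Pipedrive integration
  2. AWS integration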

In the following, I investigate two possible solutions, Power BI and Geckoboard, and evaluate them against the above requirements.

Power BI

Power BI is a cloud-based business analytics service provided by Microsoft. It comes as part of the Office 365 suite, but can also be used standalone. There is an online as well as a desktop version, the latter of which has a significantly larger range of functions. Power BI distinguishes between dashboards (“[…] something you create or something a colleague creates and shares with you. It is a single canvas that contains one or more tiles.”) and reports (“one or more pages of visualizations (charts and graphs)”). Reports can be saved in the Power BI Desktop file format (.pbix); dashboards can be shared.

Power BI comes with a rather limited range of services that can be integrated, among which are Google Analytics (☑) and MailChimp (☑). AdWords (☑) statistics can be integrated via Google Analytics if your respective accounts are connected. However, integration for Facebook Ads (✖), Pipedrive (✖) and AWS (✖) is still missing. FB Ads integration has been requested, but is yet to be realized. There is also functionality to integrate data from Excel and CSV files (from your computer or OneDrive) or Azure SQL databases, among others, which enables you to import your own custom data.

The basic version of Power BI can be used for free, while Power BI Pro costs $9.99 per user per month.

How to create a cumulative chart in Power BI?

Cumulative charts are not a built-in functionality of Power BI, but can be easily realized using Data Analysis Expressions (DAX, ☑). That is, you have to create a new measure in your dataset. Assume, for instance, you want a cumulative chart of your sales (to be accumulated, Y axis) over time, which are only present in your dataset as the number of sales per date (X axis). The DAX formula for your new measure would be as follows:

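Cumulative Sales = CALCULATE(
  -- ";" is the argument separator in some locales; others use ","
  SUM('Your Dataset'[Sales]);  -- the value to accumulate
  FILTER(
    ALL('Your Dataset'[Date]);  -- lift the date filter of the current data point
    'Your Dataset'[Date] <= MAX('Your Dataset'[Date])  -- keep every date up to the current one
  )
)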

(found at http://www.daxpatterns.com/cumulative-total/). You can then simply add a chart visualizing your new measure (Y axis) per date (X axis) to your Power BI report to obtain your desired cumulative chart.
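To make clear what such a measure computes, here is a minimal Python sketch of the same idea: for every date, sum up the sales of all dates up to and including that date. It is an illustration only and has nothing to do with how Power BI evaluates DAX internally; the function name and the sample figures are made up.

```python
from datetime import date
from itertools import accumulate
from typing import Dict


def cumulative_sales(sales_per_date: Dict[date, int]) -> Dict[date, int]:
    """Map each date to the total of all sales up to and including that date."""
    dates = sorted(sales_per_date)  # the X axis, in chronological order
    running_totals = accumulate(sales_per_date[d] for d in dates)
    return dict(zip(dates, running_totals))


# Made-up sample data, deliberately unsorted.
daily = {
    date(2016, 1, 3): 2,
    date(2016, 1, 1): 5,
    date(2016, 1, 2): 3,
}
cumulative = cumulative_sales(daily)
# cumulative[date(2016, 1, 3)] is 10, i.e., 5 + 3 + 2
```

The condition “date <= current date” in the DAX formula plays the role of the running total here.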

Geckoboard

Geckoboard is a web platform for creating individual dashboards that show your business’s KPIs (key performance indicators), e.g., unique visits to your website, Facebook likes or sales per day. The platform has built-in support for the integration of a wide range of external data sources, including Google Analytics (☑), AdWords (☑), Facebook Ads (☑), MailChimp (☑), Pipedrive (☑), AWS (☑) and many more (in fact, way more than Power BI). Moreover, Geckoboard supports CSV and Google Sheets integration for your own custom data.

Like in Power BI, there is no built-in support for cumulative charts. However, since it is easily possible to create those in Google Sheets (see, e.g., this link), they can simply be imported and visualized in Geckoboard as well (☑). Of course, this means an additional intermediate step is required.
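If your custom data lives in a CSV file rather than in Google Sheets, the intermediate step could also be scripted. The following sketch adds a running-total column to CSV data before it is imported. The column names (date, sales) are assumptions, the rows are expected to be sorted by date already, and nothing here is specific to Geckoboard’s CSV import.

```python
import csv
import io


def add_cumulative_column(csv_text: str, value_column: str = "sales") -> str:
    """Return the CSV data with an extra running-total column appended."""
    reader = csv.DictReader(io.StringIO(csv_text))
    new_column = "cumulative_" + value_column
    fieldnames = list(reader.fieldnames or []) + [new_column]

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()

    total = 0
    for row in reader:  # rows are assumed to be in chronological order
        total += int(row[value_column])
        row[new_column] = total
        writer.writerow(row)
    return out.getvalue()


# Made-up sample data.
source = "date,sales\n2016-01-01,5\n2016-01-02,3\n2016-01-03,2\n"
result = add_cumulative_column(source)
```

The resulting data can then be charted like any other column, at the cost of regenerating the file whenever new data arrives.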

Geckoboard offers no free plan. Paid plans start from $49 per month for one user and two dashboards.

Conclusion

                               Power BI    Geckoboard
Cumulative charts              (☑)1        (☑)1
Google Analytics integration   ☑           ☑
AdWords integration            ☑           ☑
Facebook Ads integration       ✖           ☑
MailChimp integration          ☑           ☑
(Pipedrive integration)        ✖           ☑
(AWS integration)              ✖           ☑
overall rating                 ⭐⭐⭐         ⭐⭐⭐⭐

Both tools lack built-in functionality for cumulative charts, but provide means for importing your own custom data. When it comes to the integration of third-party services, Geckoboard supports a significantly larger range of data sources. Because of this, I give Power BI an overall rating of 3 out of 5 (⭐⭐⭐). Geckoboard is more expensive and cumulative charts require an additional intermediate step, but the overall package makes a better impression regarding what we need at bitstars, so it receives a rating of 4 out of 5 (⭐⭐⭐⭐).

To summarize, if you’re fine with Google Analytics stats and some custom data imported via Excel files or an Azure DB, go for Power BI. Yet, if you rely on the seamless integration of a wider range of external services, you’re clearly better off with Geckoboard—unless you wanna implement the integration of the different services’ APIs yourself in a DIY solution.

1 These are given in parentheses because an additional intermediate step is required.