DataHem: open source, serverless, real-time and end-2-end ML pipeline on Google Cloud Platform

I’m excited to say that the project I’ve been working on for the past year is now released as open source (MIT license). DataHem is a serverless, real-time, end-2-end ML pipeline built entirely on Google Cloud Platform services: App Engine, Pub/Sub, Dataflow, BigQuery, Cloud ML Engine, Deployment Manager, Cloud Build and Cloud Composer. When building ML/Data products, your most valuable asset is your data. Hence, the purpose of DataHem is to give you:...

June 1, 2018 · 2 min · Robert Sahlin

Clarifying Analytics Requests = 5 x Why + So What

Most analysts want to avoid spending a lot of time on ad hoc, vague and misguided analytics/data requests instead of focusing on hypothesis testing. But hey, you will not be able to avoid those requests entirely, so what can you do to clarify the requests you can’t hold at bay? I use two simple methods to clarify analytics requests. 5 x Why: 5 Whys is an iterative interrogative technique used to explore the cause-and-effect relationships underlying a particular problem, and I’ve found that it also works really well to clarify analytics requests....

January 23, 2018 · 3 min · Robert Sahlin

BigQuery Training Resources for Digital Analysts

In this post I’ve tried to collect different training resources that I’ve found useful myself, some for free and some for a fee. The focus is on using BigQuery for digital analytics. Are you one of the lucky digital analysts who work for an organisation with the 360 version of Google Analytics or Firebase Blaze, but have not started using BigQuery yet? Then don’t wait: enable the BigQuery Export (read this post if you operate in the EU) and learn how to use BigQuery....

December 15, 2017 · 2 min · Robert Sahlin

Flatten Firebase Properties and Parameters in BigQuery

At Google I/O in May 2017, Firebase announced Google Analytics for Firebase, a fantastic tool that automatically captures data on how people are using your iOS and Android app and lets you define your own custom app events. Like Google Analytics 360, it offers the ability to export raw data to Google BigQuery for custom analysis. There are a few posts on the Google Cloud Platform Blog and the Firebase Blog on how to query the Firebase dataset, but none of them gives much advice on how to analyze multiple properties and parameters at the same time....

December 8, 2017 · 2 min · Robert Sahlin

Google Analytics Custom Dimension Alias in BigQuery

Second to being able to export your Google Analytics data to Google BigQuery, the feature I value the most in the premium version of GA is that you are not limited to 20 custom dimensions but have 200 to play with! However, if you have many custom dimensions, it quickly becomes hard to remember which dimension each index represents, and the value isn’t always self-describing. Hence, being able to give the custom dimension a more descriptive identifier than an index could be useful....

December 7, 2017 · 2 min · Robert Sahlin