DataHem: open source, serverless, real-time and end-2-end ML pipeline on Google Cloud Platform

I’m excited to announce that the project I’ve been working on for the last year has now been released as open source (MIT license). DataHem is a serverless, real-time, end-2-end ML pipeline built entirely on Google Cloud Platform services: App Engine, Pub/Sub, Dataflow, BigQuery, Cloud ML Engine, Deployment Manager, Cloud Build and Cloud Composer. When building ML/data products, your most valuable asset is your data. Hence, the purpose of DataHem is to give you: - Full control and ownership of your data and data platform - Unsampled data - Data in real time - The ability to replay/reprocess your data an unlimited number of times - Data synergies, i. »

BigQuery Training Resources for Digital Analysts

In this post I’ve collected training resources that I’ve found useful myself, some free and some paid. The focus is on using BigQuery for digital analytics. Are you one of the lucky digital analysts who work for an organisation with the 360 version of Google Analytics or Firebase Blaze, but haven’t started using BigQuery yet? Then don’t wait: enable the BigQuery Export (read this post if you operate in the EU) and learn how to use BigQuery. »

Google Analytics Custom Dimension Alias in BigQuery

Second only to being able to export your Google Analytics data to Google BigQuery, the feature I value most in the premium version of GA is that you are not limited to 20 custom dimensions but have 200 to play with! However, if you have many custom dimensions, it quickly becomes hard to remember what each index represents, since the value isn’t always self-describing. Hence, being able to give a custom dimension a more descriptive identifier than an index could be useful. »

Flatten Google Analytics Custom Dimensions with a BigQuery UDF

Updated 2018-04-23 with a fourth alternative: Unnest. Are you one of the lucky digital analysts who have a Google Analytics Premium account? (If not, check out the open source solution DataHem to get premium features such as your GA data in BigQuery.) Then you know you can export your data to Google BigQuery and analyze it in an ad hoc, exploratory manner using SQL. One frequent use case for BigQuery is to analyze many custom dimensions at the same time. »