Schema evolution in streaming Dataflow jobs and BigQuery tables, part 2

In the previous post, I covered the protobuf (schema definition) part of the solution. This post will focus on how we create or patch BigQuery tables without interrupting real-time ingestion. 2. BigQuery: BigQuery is Google’s serverless data warehouse, and it is awesome (and I have experience with Hive, Presto, SparkSQL, Redshift, Microsoft PDW, …). It is a scalable data solution that helps companies store and query their data or apply machine learning models. »

Schema evolution in streaming Dataflow jobs and BigQuery tables, part 1

In the previous post, I gave an overview of MatHem’s streaming analytics platform DataHem. This post will focus on how we manage schema evolution without sacrificing real-time data or having downtime in our data ingestion. The streaming analytics platform is built entirely on Google Cloud Platform and uses services such as Dataflow, BigQuery and PubSub extensively. Another important component is protobuf schemas. 1. Protocol buffers: There are many different frameworks for serialization/deserialization of data. »

Get all unique Firebase Analytics events in BigQuery

As I mentioned in my earlier post about the drawbacks of the entity-attribute-value data model used in Firebase Analytics and Google Analytics app plus web, it is hard to know which events, associated attributes and data types are logged without proper documentation. Another way to get an overview is to query the table itself. Below is an example of how to do it. SELECT event_name, ARRAY_AGG(struct(name, value)) as attribute FROM( SELECT event_name, param. »

Why Google Analytics App + Web BigQuery Export Rocks and Sucks

Google recently released Google Analytics App + Web, which is essentially Firebase Analytics for the web (or Google Analytics version 2, if you will). This is exciting for many reasons; two of them are: (1) Google is finally moving away from a user-session-pageview model to one built on users and events, and (2) it also supports BigQuery export for standard users. That is awesome. These two are actually two of the primary reasons why I built datahem. »

Configure Firebase Analytics and Google Analytics app + web BigQuery export to EU region

In January 2019, when I tried to set up BigQuery export on our Firebase Analytics projects, I found out that I couldn’t choose a region for the export and that it defaults to the US. Since my employer is a European company, I prefer to store the data in the EU. This is the exact same issue that I had previously with GA360 BigQuery Export, and I thought I would try to solve it in a similar manner (that has become part of the GA360 BigQuery Export documentation for how to geolocate your data in the EU). »