Components Description
Explore Stages
Select a tab to see a detailed explanation here.
In this project I use different tools to simulate a full AppStore data pipeline: the journey the data takes, and how different technologies help at each step. My main goal is to showcase a variety of concepts and tools I have experience with.
Hover over the elements in the overview to see a short description, and click the tabs in the components section to explore each stage in detail.
Python script to synthesize user interactions
GCP Pub/Sub (alt. Kafka)
GCP Dataflow, Apache Beam (alt. Spark/Flink)
BigQuery, Google Cloud Storage+Parquet
BigQuery, Google Cloud Storage+Parquet
Metabase (alt. Power BI, Tableau)
BigQuery, Google Cloud Storage+Parquet
BigQueryML and Vertex AI
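As an illustration of the first stage (the Python script that synthesizes user interactions), here is a minimal sketch using only the standard library. The event types, field names, weights, and ID formats are assumptions made for this example, not the project's actual schema. Each event is a plain dictionary that could be serialized to JSON and handed to a message queue such as Pub/Sub.

```python
import json
import random
from datetime import datetime, timedelta, timezone

# Hypothetical event types and app catalog, chosen only for illustration.
EVENT_TYPES = ["view", "install", "purchase", "review", "uninstall"]
EVENT_WEIGHTS = [60, 20, 5, 10, 5]  # views are assumed to be the most common
APP_IDS = [f"app_{i:03d}" for i in range(1, 21)]


def synthesize_events(n, seed=0, start=None):
    """Return n synthetic interaction events with increasing timestamps."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    ts = start or datetime(2024, 1, 1, tzinfo=timezone.utc)
    events = []
    for i in range(n):
        # Advance the clock by a random gap so events arrive in order.
        ts += timedelta(seconds=rng.randint(1, 30))
        events.append({
            "event_id": f"evt_{i:06d}",
            "user_id": f"user_{rng.randint(1, 500):04d}",
            "app_id": rng.choice(APP_IDS),
            "event_type": rng.choices(EVENT_TYPES, weights=EVENT_WEIGHTS)[0],
            "timestamp": ts.isoformat(),
        })
    return events


def to_json_lines(events):
    """Serialize events as newline-delimited JSON, one event per line."""
    return "\n".join(json.dumps(e, sort_keys=True) for e in events)


if __name__ == "__main__":
    print(to_json_lines(synthesize_events(3, seed=42)))
```

Seeding the generator keeps test runs repeatable. Newline-delimited JSON is used here because it is a common input format for both streaming consumers and BigQuery load jobs, but any serialization would work.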