Discovery for a Marketing Data Mart

A Melbourne based social SaaS business required a solution to connect their marketing activity and financial outcomes. Their objective was to move from fragmented platform-siloed reporting to a centralised single source of truth.

The Problem: "Data Blind Spots"

The organisation was capturing high-quality data across GA4, HubSpot, Google Ads, and Chargebee, but lacked the infrastructure to connect them.

  • The Goal: Design a system capable of calculating LTV/CAC and Pipeline Velocity across multi-touch customer journeys.
  • The Complexity: Mapping the transition from anonymous web traffic to paid subscriptions; with data gaps as a result of cookie laws. 

Phase 1: Conceptual & Logical Data Modeling

The core of the design was a Relational Entity Model that defined how "Leads" in a CRM relate to "Invoices" in a billing system.

Key Design Element: A schema that mapped the "Golden Path" of a customer: WebEvent (Traffic) → Lead (MQL/SQL) → Opportunity (Sales) → Activation (Revenue)

LayerComponentsValue
Ingestion (dlt)APIs / CDCAutomated, reliable data movement.
Storage (BigQuery)Raw Data / Cleaned DataHigh-performance, scalable query engine.
Logic (dbt)Models, Staging, MartsClean, version-controlled metrics.
PresentationLooker StudioActionable insights for stakeholders.

This allowed for the future design and implementation of multi-touch attribution, where a "first click" from a Google Ad could be programmatically linked to a plan activation six months later.

Phase 2: The Data Stack

I evaluated two primary architectural paths to provide the organisation with a "Build vs. Buy" decision matrix:

Option A: The Cloud-Native Build

  • Philosophy: High control, lower long-term subscription costs with cost guradrails and governance.
  • Stack: BigQuery (Warehouse), dlt/dbt (open-source transformation), Cloud Run (Serverless execution).
  • Strategic Value: Ideal for organisations with internal engineering capacity looking to avoid "SaaS sprawl."

Option B: Managed SaaS Integration

  • Philosophy: Faster time-to-value, lower maintenance overhead.
  • Stack: Airbyte Cloud (Ingestion) and dbt Cloud (Transformation).
  • Strategic Value: Ideal for rapid deployment with minimal DevOps requirements.

Phase 3: CI/CD & Governance Framework

Beyond just "moving data," the design included a potential operational framework:

  • Infrastructure-as-Code: Utilising GitHub Actions and Cloud Build for automated deployment.
  • Data Observability: A logging and monitoring strategy using Google Cloud Monitoring to ensure data "freshness" and reliability.
  • Security: A centralised Secrets Management strategy to handle API keys for third-party platforms like HubSpot.

Conclusion: The Value Delivered

The project did not move to the implementation phase, but the design provided the organisation with:

  • A Technical Roadmap: A clear direction for future data engineering hires.
  • A Data Dictionary: Standardised definitions for common terms "Activation" and "MQL" across the business.
  • Cost Transparency: A full breakdown of projected cloud consumption and licensing costs.