For the complete documentation index, see llms.txt. This page is also available as Markdown.

Snapshots

Snapshots in dex allow you to track how data changes over time. They are especially useful for capturing Slowly Changing Dimensions (SCD), auditing historical records, and preserving point-in-time data for comparison or analysis.

While models define the current state of data, snapshots preserve historical versions by comparing incoming records with previously captured ones. When a change is detected, dex records a new version, allowing you to build a complete change history.

When to Use Snapshots

Use snapshots when you need to:

  • Track changes to customer, product, or order records over time

  • Build Slowly Changing Dimensions (Type 2)

  • Monitor key field changes in a source table

  • Create point-in-time views for analytics or regulatory compliance

  • Detect and store updates without overwriting previous values

How Snapshots Work

Each time a snapshot runs, dex:

  1. Queries the source data

  2. Compares each record against the previous version (based on unique keys and fields to track)

  3. Inserts a new row only if a tracked field has changed

  4. Adds metadata columns like dbt_valid_from, dbt_valid_to, and dbt_scd_id to enable temporal queries

This allows you to write queries like “what did this record look like last week?” or “how many customers changed plans in Q1?”

Defining a Snapshot

Snapshots are stored in the snapshots/ folder of your dex project and written using SQL.

Example

Explanation:

  • unique_key: Identifies each record (e.g., customer_id)

  • strategy: Defines how changes are detected

    • check: Compares specific columns for changes

    • timestamp: Compares a column value like updated_at

  • check_cols: Columns to monitor for changes

Materialization and Execution

Snapshots are materialized as tables in your warehouse and are updated incrementally over time. Each row represents a valid version of a record, along with timestamps that define its validity range.

Snapshots can be:

  • Run manually from the Develop tab

  • Triggered in Flows

Referencing Snapshots

You can query snapshots just like models:

This gives you the most recent version of each record. To analyze historical changes, you can use dbt_valid_from and dbt_valid_to for time-based filtering.

Snapshot Folder Structure

All snapshots should be stored in the /snapshots/ directory in your dex project repo.

Each file defines one snapshot and can be accompanied by documentation and tests.

Best Practices

  • Use snapshots only for datasets that are mutable (i.e., values can change over time)

  • Monitor the size of snapshot tables—over time, they can grow large

  • Partition or archive old snapshot data if needed

  • Track only the columns that matter to reduce noise and improve performance

  • Combine with tests to detect unexpected or invalid changes

Last updated

Was this helpful?