Snapshots
Snapshots in dex allow you to track how data changes over time. They are especially useful for capturing Slowly Changing Dimensions (SCD), auditing historical records, and preserving point-in-time data for comparison or analysis.
While models define the current state of data, snapshots preserve historical versions by comparing incoming records with previously captured ones. When a change is detected, dex records a new version, allowing you to build a complete change history.
When to Use Snapshots
Use snapshots when you need to:
Track changes to customer, product, or order records over time
Build Slowly Changing Dimensions (Type 2)
Monitor key field changes in a source table
Create point-in-time views for analytics or regulatory compliance
Detect and store updates without overwriting previous values
How Snapshots Work
Each time a snapshot runs, dex:
Queries the source data
Compares each record against the previous version (based on unique keys and fields to track)
Inserts a new row only if a tracked field has changed
Adds metadata columns like
dbt_valid_from
,dbt_valid_to
, anddbt_scd_id
to enable temporal queries
This allows you to write queries like “what did this record look like last week?” or “how many customers changed plans in Q1?”
Defining a Snapshot
Snapshots are stored in the snapshots/
folder of your dex project and written using SQL.
Example
{% snapshot customer_snapshot %}
{{
config(
target_schema='snapshots',
unique_key='customer_id',
strategy='check',
check_cols=['email', 'plan']
)
}}
select
customer_id,
email,
plan,
created_at
from {{ source('crm', 'customers') }}
{% endsnapshot %}
Explanation:
unique_key
: Identifies each record (e.g.,customer_id
)strategy
: Defines how changes are detectedcheck
: Compares specific columns for changestimestamp
: Compares a column value likeupdated_at
check_cols
: Columns to monitor for changes
Materialization and Execution
Snapshots are materialized as tables in your warehouse and are updated incrementally over time. Each row represents a valid version of a record, along with timestamps that define its validity range.
Snapshots can be:
Run manually from the Develop tab
Triggered in Flows
Referencing Snapshots
You can query snapshots just like models:
select *
from {{ ref('customer_snapshot') }}
where dbt_valid_to is null
This gives you the most recent version of each record. To analyze historical changes, you can use dbt_valid_from
and dbt_valid_to
for time-based filtering.
Snapshot Folder Structure
All snapshots should be stored in the /snapshots/
directory in your dex project repo.
snapshots/
└── customer_snapshot.sql
└── subscription_status.sql
Each file defines one snapshot and can be accompanied by documentation and tests.
Best Practices
Use snapshots only for datasets that are mutable (i.e., values can change over time)
Monitor the size of snapshot tables—over time, they can grow large
Partition or archive old snapshot data if needed
Track only the columns that matter to reduce noise and improve performance
Combine with tests to detect unexpected or invalid changes
Last updated
Was this helpful?