# Snapshots

Snapshots in dex allow you to track how data changes over time. They are especially useful for capturing Slowly Changing Dimensions (SCD), auditing historical records, and preserving point-in-time data for comparison or analysis.

While models define the current state of data, snapshots preserve historical versions by comparing incoming records with previously captured ones. When a change is detected, dex records a new version, allowing you to build a complete change history.

### When to Use Snapshots

Use snapshots when you need to:

* Track changes to customer, product, or order records over time
* Build Slowly Changing Dimensions (Type 2)
* Monitor key field changes in a source table
* Create point-in-time views for analytics or regulatory compliance
* Detect and store updates without overwriting previous values

### How Snapshots Work

Each time a snapshot runs, dex:

1. Queries the source data
2. Compares each record against the previous version (based on unique keys and fields to track)
3. Inserts a new row **only if** a tracked field has changed
4. Adds metadata columns like `dbt_valid_from`, `dbt_valid_to`, and `dbt_scd_id` to enable temporal queries

This allows you to write queries like “what did this record look like last week?” or “how many customers changed plans in Q1?”

### Defining a Snapshot

Snapshots are stored in the `snapshots/` folder of your dex project and written using SQL.

#### Example

```sql
{% snapshot customer_snapshot %}

{{
  config(
    target_schema='snapshots',
    unique_key='customer_id',
    strategy='check',
    check_cols=['email', 'plan']
  )
}}

select
  customer_id,
  email,
  plan,
  created_at
from {{ source('crm', 'customers') }}

{% endsnapshot %}
```

#### Explanation:

* `unique_key`: Identifies each record (e.g., `customer_id`)
* `strategy`: Defines how changes are detected
  * `check`: Compares specific columns for changes
  * `timestamp`: Compares a column value like `updated_at`
* `check_cols`: Columns to monitor for changes

### Materialization and Execution

Snapshots are materialized as tables in your warehouse and are updated incrementally over time. Each row represents a valid version of a record, along with timestamps that define its validity range.

Snapshots can be:

* Run manually from the Develop tab
* Triggered in Flows

### Referencing Snapshots

You can query snapshots just like models:

```sql
select *
from {{ ref('customer_snapshot') }}
where dbt_valid_to is null
```

This gives you the most recent version of each record. To analyze historical changes, you can use `dbt_valid_from` and `dbt_valid_to` for time-based filtering.

### Snapshot Folder Structure

All snapshots should be stored in the `/snapshots/` directory in your dex project repo.

```
snapshots/
  └── customer_snapshot.sql
  └── subscription_status.sql
```

Each file defines one snapshot and can be accompanied by documentation and tests.

### Best Practices

* Use snapshots only for datasets that are mutable (i.e., values can change over time)
* Monitor the size of snapshot tables—over time, they can grow large
* Partition or archive old snapshot data if needed
* Track only the columns that matter to reduce noise and improve performance
* Combine with tests to detect unexpected or invalid changes


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.dexlabs.io/lakehouse-platform/develop-with-dex/snapshots.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
