Develop with dex
In dex, developing data pipelines is built around structured, modular, and version-controlled workflows. Whether you're transforming raw data into clean, analytics-ready datasets or orchestrating complex dependencies across projects, dex offers a powerful developer experience designed for scalability and collaboration.
At the core of dex development is writing models that define transformations, supported by a set of tools and patterns for data quality, reusability, governance, and automation. These tools include tests, documentation, seeds, snapshots, macros, variables, and more, all working together in a unified development framework.
Development in dex happens through Git-based workflows. Each environment is tied to a Git branch, and all changes go through commit-based versioning. This ensures every deployment is traceable, reproducible, and aligned with your organization’s software development lifecycle.
Below is a summary of the key building blocks you’ll use when developing with dex:
Core Development Concepts
Sync Sources
Configure access to external datasets already present in your cloud storage. Automatically detect schema and register sources for development.
Accessing Data Sources
Explore table schema, data previews, logs, and lineage directly within dex. Easily reference external sources in your models using the source() function.
Core Execution Actions
Preview, Run, Test, and Build: the core actions that let you inspect, validate, and materialize models during development and execution.
Models
SQL or Python files that define how raw data should be transformed into clean, structured outputs. Each model is typically a SELECT statement saved as a .sql file.
Layers
Structured stages of transformation (e.g., raw, cleaned, trusted) that help organize data pipelines for clarity, scalability, and governance.
Tests
Assertions that validate data quality, such as uniqueness, non-nullness, referential integrity, or custom conditions. Tests help catch issues early in the pipeline.
Documentation
Human-readable descriptions added to models, columns, and resources. They enable automatic generation of a data catalog and lineage tools.
Seeds
Static data tables loaded from CSV or Parquet files. Useful for reference data like country codes, mappings, or product lists.
Snapshots
Track changes over time in slowly changing dimensions (SCDs). Snapshots capture changes in source records and let you diff them.
Jinja and Macros
Jinja is a templating language that allows dynamic SQL generation. Macros are reusable functions written in Jinja to standardize logic across models.
Project and Environment Variables
User-defined variables that parameterize model logic and configuration. They can vary by project or environment and support secure storage.
Packages
Collections of models, macros, and other assets that can be reused across projects. Packages enable modularity and sharing of business logic.
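As a rough illustration of how seeds, models, and tests relate to one another, the sketch below uses Python's standard sqlite3 module rather than dex itself. The table names (seed_countries, countries), the CSV contents, and every helper function are hypothetical, and nothing here reflects dex's actual commands or file layout. It only shows the general pattern: load static reference data, materialize a SELECT statement, then assert on the result.

```python
import csv
import io
import sqlite3

# Hypothetical seed data: a small static reference table, inlined as CSV text.
SEED_CSV = """code,name
DE,Germany
FR,France
JP,Japan
"""

# Hypothetical model: a SELECT statement that cleans the raw seed rows.
MODEL_SQL = """
SELECT UPPER(code) AS country_code, TRIM(name) AS country_name
FROM seed_countries
"""


def load_seed(conn, table, csv_text):
    """Load CSV text into a table, one TEXT column per header field."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    columns = ", ".join(f"{name} TEXT" for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f"CREATE TABLE {table} ({columns})")
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", data)


def build_model(conn, name, select_sql):
    """Materialize a SELECT statement as a table named after the model."""
    conn.execute(f"CREATE TABLE {name} AS {select_sql}")


def count_nulls(conn, table, column):
    """Not-null test: number of rows where the column is NULL."""
    query = f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL"
    return conn.execute(query).fetchone()[0]


def count_duplicates(conn, table, column):
    """Uniqueness test: number of values that appear more than once."""
    query = (
        f"SELECT COUNT(*) FROM ("
        f"SELECT {column} FROM {table} GROUP BY {column} HAVING COUNT(*) > 1)"
    )
    return conn.execute(query).fetchone()[0]


def run_pipeline():
    """Seed, build, and test in an in-memory database; return failure counts."""
    conn = sqlite3.connect(":memory:")
    load_seed(conn, "seed_countries", SEED_CSV)
    build_model(conn, "countries", MODEL_SQL)
    return {
        "not_null": count_nulls(conn, "countries", "country_code"),
        "unique": count_duplicates(conn, "countries", "country_code"),
    }


if __name__ == "__main__":
    print(run_pipeline())
```

Each test returns a count of offending rows, so a count of zero means the assertion passed. This mirrors the idea of catching data quality issues early, before downstream models consume the output.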
Explore each concept in the sections that follow to learn how dex empowers scalable, collaborative, and production-grade analytics engineering.
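The snapshot concept listed above can be pictured with a small Python sketch of Type 2 style change tracking. This is an assumption about the general technique, not a description of how dex implements snapshots: the SnapshotRow fields, the apply_snapshot function, and the integer timestamps are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SnapshotRow:
    key: str
    value: str
    valid_from: int
    valid_to: Optional[int] = None  # None marks the current version


def apply_snapshot(
    history: List[SnapshotRow], source: Dict[str, str], now: int
) -> List[SnapshotRow]:
    """Close out changed or deleted records and append new versions."""
    # Index the currently open version of each key.
    current = {row.key: row for row in history if row.valid_to is None}

    # Close versions whose value changed or whose key left the source.
    for key, row in current.items():
        if source.get(key) != row.value:
            row.valid_to = now

    # Open a new version for keys that are new or whose value changed.
    for key, value in source.items():
        old = current.get(key)
        if old is None or old.value != value:
            history.append(SnapshotRow(key, value, valid_from=now))
    return history


if __name__ == "__main__":
    history: List[SnapshotRow] = []
    apply_snapshot(history, {"c1": "bronze"}, now=1)
    apply_snapshot(history, {"c1": "gold", "c2": "bronze"}, now=2)
    for row in history:
        print(row)
```

After the second call, the original c1 record is closed with valid_to set to 2 and a new open record holds the updated value, so earlier states remain queryable instead of being overwritten.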