Develop with dex

Data pipeline development in dex is built around structured, modular, version-controlled workflows. Whether you're transforming raw data into clean, analytics-ready datasets or orchestrating dependencies across projects, dex is designed to support development that scales and that teams can share.

At the core of dex development are models, which define transformations. Around them sits a set of tools and patterns that support data quality, reusability, governance, and automation: tests, documentation, seeds, snapshots, macros, variables, and more, all working together in one development framework.

Development in dex happens through Git-based workflows. Each environment is tied to a Git branch, and all changes go through commit-based versioning. This ensures every deployment is traceable, reproducible, and aligned with your organization’s software development lifecycle.

Below is a summary of the key building blocks you’ll use when developing with dex:

Core Development Concepts

Sync Sources

Configure access to external datasets already present in your cloud storage. Automatically detect schema and register sources for development.

Accessing Data Sources

Explore table schema, data previews, logs, and lineage directly within dex. Easily reference external sources in your models using the source() function.

Core Execution Actions

Preview, Run, Test, and Build are the core actions for inspecting, validating, and materializing models during development and execution.

Models

SQL or Python files that define how raw data should be transformed into clean, structured outputs. A SQL model is typically a single SELECT statement saved as a .sql file.

Layers

Structured stages of transformation (e.g., raw, cleaned, trusted) that help organize data pipelines for clarity, scalability, and governance.

Tests

Assertions that validate data quality, such as uniqueness, non-nullness, referential integrity, or custom conditions. Tests help catch issues early in the pipeline.
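As a rough illustration of what these assertions check, the sketch below implements uniqueness, non-null, and referential-integrity checks in plain Python over lists of rows. It is not dex's test syntax, which this page does not show; the function names and the sample `orders` and `customers` rows are hypothetical. Each check returns the offending rows, so an empty list means the test passes.

```python
from typing import Any

Row = dict[str, Any]


def failing_not_null(rows: list[Row], column: str) -> list[Row]:
    """Rows where the column is missing or None."""
    return [r for r in rows if r.get(column) is None]


def failing_unique(rows: list[Row], column: str) -> list[Row]:
    """Rows whose non-null value in the column was already seen earlier."""
    seen: set[Any] = set()
    duplicates: list[Row] = []
    for r in rows:
        value = r.get(column)
        if value is None:
            continue  # nulls are the job of the not-null check
        if value in seen:
            duplicates.append(r)
        else:
            seen.add(value)
    return duplicates


def failing_relationship(
    child: list[Row], child_col: str, parent: list[Row], parent_col: str
) -> list[Row]:
    """Child rows whose key has no matching row in the parent table."""
    parent_keys = {r[parent_col] for r in parent}
    return [
        r for r in child
        if r.get(child_col) is not None and r[child_col] not in parent_keys
    ]


# Hypothetical sample data.
customers = [{"id": 1}, {"id": 2}]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 10, "customer_id": 2},     # duplicate order_id
    {"order_id": 11, "customer_id": 99},    # unknown customer
    {"order_id": None, "customer_id": 1},   # missing order_id
]
```

Returning the failing rows rather than a boolean makes it easier to see why a check failed, not just that it did.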

Documentation

Human-readable descriptions added to models, columns, and other resources. These descriptions feed the automatically generated data catalog and lineage views.

Seeds

Static data tables loaded from CSV or Parquet files. Useful for reference data like country codes, mappings, or product lists.

Snapshots

Track changes over time in slowly changing dimensions (SCD). Snapshots allow capturing and diffing changes in source records.
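The idea of capturing changes can be sketched as a type 2 slowly changing dimension update in plain Python. This page does not describe how dex stores snapshots, so the `valid_from` and `valid_to` column names, the `apply_snapshot` function, and the sample records are all assumptions made for illustration. Deleted source records are not handled in this sketch.

```python
from typing import Any

Row = dict[str, Any]
META = ("valid_from", "valid_to")  # hypothetical bookkeeping columns


def apply_snapshot(
    history: list[Row], current: list[Row], key: str, captured_at: str
) -> list[Row]:
    """Return a new history with changed records closed and re-opened."""
    result = [dict(r) for r in history]  # copy so the input is not mutated
    # The open version of each record is the one with no end date.
    open_rows = {r[key]: r for r in result if r["valid_to"] is None}

    for row in current:
        existing = open_rows.get(row[key])
        if existing is not None:
            tracked = {k: v for k, v in existing.items() if k not in META}
            if tracked == row:
                continue  # unchanged since the last snapshot
            existing["valid_to"] = captured_at  # close the old version
        # New record, or a new version of a changed record.
        result.append({**row, "valid_from": captured_at, "valid_to": None})
    return result


# Hypothetical usage: one order whose status changes between two runs.
h1 = apply_snapshot([], [{"id": 1, "status": "new"}], "id", "2024-01-01")
h2 = apply_snapshot(h1, [{"id": 1, "status": "shipped"}], "id", "2024-01-02")
```

After the second run, `h2` holds two versions of record 1: the original, closed at the second capture time, and the current one with an open `valid_to`.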

Jinja and Macros

Jinja is a templating language that allows dynamic SQL generation. Macros are reusable functions written in Jinja to standardize logic across models.
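To show why reusable SQL fragments help, here is a Python analogue of a macro. It deliberately uses ordinary Python functions rather than Jinja syntax, and the `cents_to_dollars` helper, the `orders` table, and the column names are hypothetical. The point is only that one definition of a conversion rule can be reused wherever SQL is generated.

```python
def cents_to_dollars(column: str, precision: int = 2) -> str:
    """Return a SQL expression that converts a cents column to dollars."""
    return f"ROUND({column} / 100.0, {precision})"


def render_model(amount_columns: list[str]) -> str:
    """Build a SELECT that applies the same conversion to every column."""
    select_list = ",\n    ".join(
        f"{cents_to_dollars(col)} AS {col}_usd" for col in amount_columns
    )
    return f"SELECT\n    order_id,\n    {select_list}\nFROM orders"


sql = render_model(["subtotal", "tax"])
print(sql)
```

If the rounding rule changes, only `cents_to_dollars` needs to be edited, and every generated query picks up the change.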

Project and Environment Variables

User-defined variables that allow parameterizing model logic and configuration. Can vary by project or environment and support secure storage.

Packages

Collections of models, macros, and other assets that can be reused across projects. Enables modularity and sharing of business logic.

Hooks and Operations

Custom scripts triggered before or after model execution. Common use cases include granting permissions, sending alerts, or refreshing external systems.

Tags

Lightweight metadata used to group or classify models. Useful for running subsets of models (e.g., nightly jobs or finance-related models).
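Selecting a subset of models by tag amounts to a simple filter. The sketch below shows the idea in plain Python; the model names, the tag values, and the `select_by_tag` function are hypothetical and do not represent dex's selection syntax.

```python
# Hypothetical mapping of model name to its set of tags.
models: dict[str, set[str]] = {
    "orders_clean": {"nightly"},
    "revenue_daily": {"nightly", "finance"},
    "invoices": {"finance"},
    "web_sessions": set(),
}


def select_by_tag(tagged: dict[str, set[str]], tag: str) -> list[str]:
    """Return the names of all models that carry the given tag, sorted."""
    return sorted(name for name, tags in tagged.items() if tag in tags)


nightly = select_by_tag(models, "nightly")
finance = select_by_tag(models, "finance")
```

Because a model can carry several tags, the same model (here `revenue_daily`) can belong to both the nightly and the finance subsets.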

Explore each concept in the sections that follow to learn how dex empowers scalable, collaborative, and production-grade analytics engineering.
