Core Concepts
dex is a unified data platform that brings together every layer of the modern data stack into a single, cohesive experience. It enables teams to ingest, transform, orchestrate, test, document, and govern their data—securely and at scale. With native Git integration and cloud-first architecture, dex helps organizations move faster while maintaining control and transparency across their entire data workflow.
A Unified Platform for the Entire Data Lifecycle
At its core, dex simplifies and automates every phase of the data journey:
Data Collection & Replication: Connect to SaaS tools, databases, and files using high-performance, proprietary connectors. Many sources support incremental sync and Change Data Capture (CDC).
Data Transformation: Build transformation logic in SQL or Python. All models are version-controlled and organized using dbt-like best practices.
Orchestration: Schedule, trigger, and monitor jobs with built-in orchestration—no external tools like Airflow required.
Data Testing: Add data quality checks and validations directly in your models. Fail fast and build confidence in your pipelines.
Governance & Lineage: View full DAGs, model-level metadata, schema changes, and logs. Easily understand how data flows across your systems.
Everything is backed by Git workflows, allowing safe collaboration, rollback, and reproducibility across environments.
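The incremental sync mentioned above can be illustrated with a small cursor-based sketch. It is not dex's connector code: the `Row` type, the `updated_at` field, and the `incremental_sync` function are hypothetical names chosen only to show the general idea of fetching rows changed since the last run.

```python
from dataclasses import dataclass


@dataclass
class Row:
    id: int
    updated_at: int  # assumed change marker, e.g. a Unix timestamp


def incremental_sync(source_rows, cursor):
    """Return the rows changed since `cursor` and the advanced cursor."""
    changed = [row for row in source_rows if row.updated_at > cursor]
    # If nothing changed, keep the old cursor so the next run starts there.
    new_cursor = max((row.updated_at for row in changed), default=cursor)
    return changed, new_cursor


source = [Row(1, 100), Row(2, 105), Row(3, 110)]
first_batch, cursor = incremental_sync(source, cursor=0)

# A new row arrives; only that row is picked up on the second run.
source.append(Row(4, 120))
second_batch, cursor = incremental_sync(source, cursor)
```

The first run copies everything and each later run moves only what changed, which is why incremental sync is cheaper than a full reload.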
Data Platform as a Fully Managed Service
dex is a fully managed SaaS platform. There is no hardware or software to install or maintain: configuration, scaling, and security patching are handled automatically by the platform.
Environments are configured through a simple UI, and infrastructure is provisioned and maintained by dex on your behalf. This means your team can stay focused on modeling, monitoring, and optimizing data rather than managing systems.
dex operates entirely in the cloud, running on a cloud-native stack designed for elasticity, speed, and isolation. All processing happens in secure, isolated compute environments provisioned for your organization.
Query Processing
dex uses the native processing engines of your cloud data warehouse. SQL transformations are compiled and executed directly in BigQuery or Athena. Python tasks run through serverless containers. This model ensures that workloads are processed close to where your data lives—maximizing speed, minimizing egress, and keeping costs under control.
dex also supports resource configuration like query retries, timeouts, concurrency, and compute profiles per project or environment.
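One way to picture per-project and per-environment resource settings is a set of defaults with environment-level overrides. The sketch below is an assumption for illustration only: `ResourceConfig`, its field names, and the override values are hypothetical and do not reflect dex's actual configuration schema.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ResourceConfig:
    max_retries: int = 2
    timeout_seconds: int = 900
    max_concurrency: int = 4
    compute_profile: str = "small"


# Project-wide defaults (hypothetical values).
project_defaults = ResourceConfig()

# Each environment overrides only the settings it needs to change.
environment_overrides = {
    "dev": {"max_concurrency": 2},
    "production": {"max_retries": 5, "compute_profile": "large"},
}


def resolve_config(environment: str) -> ResourceConfig:
    """Merge the project defaults with the overrides for one environment."""
    return replace(project_defaults, **environment_overrides.get(environment, {}))


prod = resolve_config("production")
```

Environments without an entry, such as a hypothetical `staging`, fall back to the project defaults.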
Cloud Services Layer
Beyond compute, dex also includes a robust cloud services layer that manages job execution, metadata, and automation logic. It coordinates activity across environments, ensuring safe deployment and stable orchestration.
Cloud services managed by dex include:
Authentication and permissions
Metadata tracking and lineage
Job dispatching and retry logic
Scheduling and alerting
Git-based version control and conflict resolution
These services are distributed, resilient, and scale automatically based on workload demand.
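The retry logic listed above is handled by the platform, but the general pattern is easy to sketch. The following is a generic retry loop with exponential backoff, not dex's dispatcher. `run_with_retries`, its parameters, and `flaky_job` are hypothetical names.

```python
import time


def run_with_retries(job, max_retries=3, base_delay=0.01, sleep=time.sleep):
    """Call `job` until it succeeds or `max_retries` retries are used up."""
    attempt = 0
    while True:
        try:
            return job()
        except Exception:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error
            sleep(base_delay * (2 ** attempt))  # wait 1x, 2x, 4x, ... base_delay
            attempt += 1


calls = {"count": 0}


def flaky_job():
    """Fail twice with a transient error, then succeed."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient failure")
    return "ok"


result = run_with_retries(flaky_job)
```

Doubling the delay between attempts gives a struggling downstream system time to recover instead of being hit again immediately.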
Supported Cloud Platforms
dex supports deployments across Amazon Web Services (AWS) and Google Cloud Platform (GCP). You can run different projects in different clouds depending on your infrastructure preferences. Cloud resources—including buckets, warehouses, and compute workers—are provisioned securely in your cloud account, or optionally managed by dex.
You can select preferred regions per environment, ensuring that your data remains compliant with geographic or regulatory requirements.
Common Use Cases
Build a unified data lakehouse
dex enables teams to ingest and model raw data from across business systems and store it in a centralized cloud warehouse. Whether you're using BigQuery, Athena, or Databricks, dex supports a lakehouse-style approach to building curated, production-grade datasets in a layered fashion (e.g., raw → staged → modeled → published).
This helps organizations break data silos, enforce governance, and maintain a single source of truth.
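The layered flow (raw → staged → modeled → published) amounts to a dependency graph in which each layer is built only after the layer it reads from. A minimal sketch using Python's standard `graphlib` module follows. The model names are hypothetical, and this is not how dex itself resolves build order.

```python
from graphlib import TopologicalSorter

# Each model maps to the set of models it depends on (hypothetical names).
dependencies = {
    "raw_orders": set(),
    "staged_orders": {"raw_orders"},
    "modeled_revenue": {"staged_orders"},
    "published_revenue_dashboard": {"modeled_revenue"},
}

# static_order() yields every model after all of its dependencies.
build_order = list(TopologicalSorter(dependencies).static_order())
print(build_order)
```

Because this example is a single chain, the build order is raw, then staged, then modeled, then published.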
ETL and data engineering
dex is built for modern ELT workflows. Write SELECT statements to define transformations, configure pipelines with minimal setup, and test your data as it moves across stages. With native Git integration, automated orchestration, and model-level testing, data teams can build and ship faster without sacrificing reliability.
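To make "a SELECT statement defines a transformation" concrete, the sketch below materializes a SELECT as a table and then runs two simple data tests against it. It uses SQLite from Python's standard library purely as a local stand-in. In dex the SQL would run in your warehouse, and the table names, the model, and the tests here are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_orders (id INTEGER, amount REAL, status TEXT)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [(1, 20.0, "paid"), (2, 35.5, "paid"), (3, 12.0, "refunded")],
)

# The model is just a SELECT; materializing it creates the output table.
model_sql = "SELECT id, amount FROM raw_orders WHERE status = 'paid'"
conn.execute(f"CREATE TABLE paid_orders AS {model_sql}")

# Data tests: each query counts violations, so zero means the test passes.
null_ids = conn.execute(
    "SELECT COUNT(*) FROM paid_orders WHERE id IS NULL"
).fetchone()[0]
duplicate_ids = conn.execute(
    "SELECT COUNT(*) FROM "
    "(SELECT id FROM paid_orders GROUP BY id HAVING COUNT(*) > 1)"
).fetchone()[0]

tests_passed = null_ids == 0 and duplicate_ids == 0
row_count = conn.execute("SELECT COUNT(*) FROM paid_orders").fetchone()[0]
```

Writing tests as queries that count violations keeps them in the same language as the model and makes failures easy to inspect.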
You can schedule daily batch jobs, run transformations on change events, or deploy Python pipelines with configurable Spark backends.
Process automation
dex can trigger data pipelines based on external events or schedules—powering everything from nightly reporting jobs to near real-time syncs with third-party tools. This enables teams to automate repetitive workflows such as syncing ad performance data, refreshing KPIs in dashboards, or delivering updated forecasts to stakeholders without manual intervention.
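Event-based triggering can be pictured as a registry that maps event names to the pipelines that should run when the event arrives. The sketch below is a generic illustration, not dex's trigger API. The `on_event` decorator, the `ads_export_ready` event, and the pipeline function are hypothetical.

```python
from collections import defaultdict

# Maps an event name to the pipelines registered for it.
triggers = defaultdict(list)


def on_event(event_name):
    """Decorator that registers a pipeline function for an event."""
    def register(pipeline):
        triggers[event_name].append(pipeline)
        return pipeline
    return register


@on_event("ads_export_ready")
def sync_ad_performance(payload):
    return f"synced {payload['rows']} ad rows"


def dispatch(event_name, payload):
    """Run every pipeline registered for the event and collect the results."""
    return [pipeline(payload) for pipeline in triggers.get(event_name, [])]


results = dispatch("ads_export_ready", {"rows": 120})
```

Events with no registered pipelines simply produce an empty result list, so unknown events are ignored rather than raising an error.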
Department-specific projects
Create isolated projects for different business units—marketing, operations, product, or finance. Each team gets its own set of environments, models, connections, and workspaces, while still operating under the same organizational umbrella.
This structure promotes autonomy and focus while preserving centralized governance.
Governance and observability
dex provides detailed metadata and lineage tracking out of the box. Every model, column, and transformation is versioned and documented. You can trace how data moves through your system, identify stale or broken pipelines, and investigate errors through rich logs and job histories.
dex helps teams meet regulatory requirements, standardize naming and schemas, and build confidence in the accuracy of business metrics.
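Lineage tracking makes it possible to ask which models are affected when an upstream model breaks or goes stale. A minimal sketch of that question as a breadth-first walk over a lineage graph is shown below. The graph contents and the `affected_models` function are hypothetical and do not represent dex's metadata model.

```python
from collections import deque

# Each model maps to the models that read from it (hypothetical lineage).
downstream = {
    "raw_orders": ["staged_orders"],
    "staged_orders": ["modeled_revenue", "modeled_customers"],
    "modeled_revenue": ["published_kpis"],
    "modeled_customers": [],
    "published_kpis": [],
}


def affected_models(start):
    """Return every model downstream of `start`, sorted by name."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for child in downstream.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


impacted = affected_models("staged_orders")
```

Here a failure in `staged_orders` would affect both modeled tables and the published KPIs, but not `raw_orders`.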
DevOps and CI/CD for data
Because everything in dex is Git-native, teams can adopt development best practices like pull requests, code reviews, branching strategies, and rollback. Each environment is tied to a Git branch, and all production automations run from committed code. You can safely promote changes from dev to staging to production with full control.
dex integrates with your GitHub or GitLab repository, making it easy to maintain reproducibility and auditability.
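The environment-to-branch mapping and the dev → staging → production promotion path described above can be sketched as plain data plus a lookup. The branch names and the `next_promotion` helper are assumptions for illustration. The actual branch names depend on your repository, and the merge itself would happen through a pull request.

```python
# Hypothetical mapping of environments to Git branches.
ENVIRONMENT_BRANCHES = {
    "dev": "develop",
    "staging": "staging",
    "production": "main",
}

# Changes move through environments in this order.
PROMOTION_ORDER = ["dev", "staging", "production"]


def next_promotion(environment):
    """Return (source_branch, target_branch) for the next promotion step,
    or None if the environment is already the last stage."""
    index = PROMOTION_ORDER.index(environment)
    if index == len(PROMOTION_ORDER) - 1:
        return None
    target = PROMOTION_ORDER[index + 1]
    return ENVIRONMENT_BRANCHES[environment], ENVIRONMENT_BRANCHES[target]


dev_step = next_promotion("dev")
```

Keeping the promotion path explicit means a change cannot reach production without first passing through the staging branch.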