# Layers

Modeling your data in layers is one of the most important best practices in building scalable, maintainable, and trustworthy data pipelines. By organizing models into distinct transformation stages, teams can more easily manage complexity, ensure data quality, and collaborate effectively across domains and departments.

Layering provides structure and clarity. It helps data teams isolate responsibilities (e.g. ingestion vs. business logic), trace data lineage, and deploy pipelines incrementally with confidence.

### Why Use Layers?

* **Modularity**: Each layer has a single, focused responsibility, which makes your models easier to understand and maintain.
* **Observability**: You can trace issues and errors to the stage of the pipeline where they originate (e.g. ingestion vs. logic).
* **Reusability**: Upstream layers can be shared across multiple use cases (e.g. cleaned tables reused in different marts).
* **Performance**: You can apply caching, optimization, or materialization strategies differently per layer.
* **Governance**: Different stakeholders can own different layers (e.g. Data Engineering owns bronze, Analytics owns gold).
* **Incremental delivery**: Promote data progressively through each layer, validating logic and tests as it moves.

### Common Layering Patterns

dex doesn’t enforce a single naming convention, but here are two of the most common and recommended layer structures:

#### Option 1: Bronze / Silver / Gold

| Layer    | Purpose                                                                       |
| -------- | ----------------------------------------------------------------------------- |
| `bronze` | Raw ingested data from external sources (minimal changes)                     |
| `silver` | Cleaned, typed, and conformed data (joins, filters, types)                    |
| `gold`   | Business-ready data marts used for reporting, dashboards, or machine learning |

**Folder structure:**

```plaintext
models/
  ├── 1.bronze/
  ├── 2.silver/
  └── 3.gold/
```
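
As a rough illustration of what "cleaned, typed, and conformed" can mean for a `silver` model, here is a minimal sketch. The model name `bronze__orders`, the file name, and the column names and types are assumptions made for this example, not part of dex itself.

```sql
-- 2.silver/silver__orders.sql (hypothetical file, model, and column names)
select
    cast(order_id as bigint)    as order_id,     -- enforce explicit types
    cast(customer_id as bigint) as customer_id,
    cast(order_date as date)    as order_date,
    lower(trim(order_status))   as order_status  -- conform free-text values
from {{ ref('bronze__orders') }}
where order_id is not null                       -- drop rows without a key
```

Keeping casts and value normalization in this layer means `gold` models can assume consistent types and values instead of repeating the same cleanup.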

#### Option 2: Raw / Cleaned / Trusted

| Layer     | Purpose                                                      |
| --------- | ------------------------------------------------------------ |
| `raw`     | Exact copy of source tables (e.g. staging from API/DB dumps) |
| `cleaned` | Standardized tables with logic applied (naming, filtering)   |
| `trusted` | Curated datasets with validated business logic               |

**Folder structure:**

```plaintext
models/
  ├── 1.raw/
  ├── 2.cleaned/
  └── 3.trusted/
```

### Naming Conventions

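To help with sorting and readability, it’s a good practice to prefix folders with numeric indicators that mirror the order in which data flows through the layers. The prefixes are a visual aid only: the dependencies between models are defined by `ref()`, not by folder names.

```plaintext
models/
  ├── 1.raw/
  ├── 2.cleaned/
  └── 3.trusted/
```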

This helps:

* Sort layers logically in code editors and interfaces
* Visually distinguish model responsibilities
* Avoid errors in large or fast-moving projects

### Layering in Practice

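```sql
-- 1.raw/raw__orders.sql: exact copy of the source table
select * from {{ source('shopify', 'orders') }}

-- 2.cleaned/cleaned__orders.sql: keep the needed columns, drop cancelled orders
-- (region is selected here so the trusted model below can group by it)
select order_id, customer_id, order_date, region
from {{ ref('raw__orders') }}
where order_status != 'cancelled'

-- 3.trusted/trusted__orders_by_region.sql: business-ready aggregate
select region, count(*) as order_count
from {{ ref('cleaned__orders') }}
group by region
```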
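
To check the hand-off between layers in the example above, one option is a SQL test that returns the offending rows. The sketch below assumes dbt-style singular tests, where a test fails if its query returns any rows. The file path and test name are hypothetical.

```sql
-- tests/assert_cleaned_orders_unique.sql (hypothetical path and name)
-- Assumption: the test fails if this query returns one or more rows.
select
    order_id,
    count(*) as occurrences
from {{ ref('cleaned__orders') }}
group by order_id
having count(*) > 1        -- duplicate keys
    or order_id is null    -- missing keys
```

Running a check like this before building the next layer keeps key problems from spreading into downstream aggregates.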

### Best Practices

* Treat each layer as a separate responsibility
* Use `ref()` to clearly define upstream dependencies
* Apply tests between layers (e.g. row counts, integrity checks)
* Document the purpose of each layer and model
* Materialize layers differently based on usage (e.g. views in raw, tables in gold)
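
The last point can be sketched as follows, assuming dex models accept a dbt-style `config()` call at the top of the file. That assumption, the file names, and the `trusted__orders_per_customer` model are illustrative only. Check how your project configures materializations before relying on this.

```sql
-- 1.raw/raw__orders.sql (hypothetical): cheap to build, always reflects the source
{{ config(materialized='view') }}

select * from {{ source('shopify', 'orders') }}

-- 3.trusted/trusted__orders_per_customer.sql (hypothetical): queried often, so stored as a table
{{ config(materialized='table') }}

select customer_id, count(*) as order_count
from {{ ref('cleaned__orders') }}
group by customer_id
```

Views keep upstream layers lightweight, while tables in the final layer trade storage for faster reads by dashboards and reports.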


