# Quickstart dex with GCP

<div align="center"><figure><img src="/files/kwQXmVTi8i7TIixHg4CB" alt="" width="188"><figcaption></figcaption></figure></div>

In this guide, you'll learn how to set up dex with Google Cloud and BigQuery. We'll walk you through:

* Creating a Google Cloud Project
* Assigning required roles and credentials
* Connecting dex to BigQuery
* Setting up GitHub or GitLab for version control
* Accessing sample data in a public dataset
* Writing your first SQL model
* Adding tests and documentation
* Automating your data pipelines

By the end, you’ll have a fully operational dex project running in your own cloud.

<details>

<summary>Step 1: Prerequisites</summary>

Before you begin, make sure you have:

* A **Google Cloud** account
* A **dex** account ([sign up here](https://dexlabs.io))
* A **GitHub or GitLab** account for code management

</details>

<details>

<summary>Step 2: Set Up your GCP Account</summary>

To complete the **"Connect to Your Cloud Account"** step in the dex platform, you'll need to configure a few resources in your GCP console. Once this is complete, you’ll receive a Service Account JSON key to use within dex.

### Create a GCP Project for dex

1. Open the Google Cloud Console.
2. Click **New Project**.
3. Fill in the required fields:
   * Name your project something like `your-org-lakehouse`
   * Choose a **Location**; check with your SRE or infrastructure team for the best cost/performance trade-off
4. Click **Create**

{% hint style="info" %}
Tip: In production, you may want separate projects for `dev` and `prod`. For this quickstart, one is enough.
{% endhint %}

{% embed url="<https://youtu.be/HSeTvUWjOoA>" %}
Creating a New Project on Google Cloud Console
{% endembed %}

### Create Cloud Credentials

dex needs specific GCP permissions to operate. We’ve made this easy with a setup script.

For reference, these are the permissions dex needs (generated by the setup script). You can read the full list of permissions and what they do [here](https://cloud.google.com/iam/docs/allow-policies).

{% code overflow="wrap" %}

```
storage.managedFolders.delete
storage.managedFolders.get
storage.managedFolders.list
storage.multipartUploads.abort
storage.multipartUploads.create
storage.multipartUploads.list
storage.multipartUploads.listParts
storage.objects.create
storage.objects.delete
storage.objects.get
storage.objects.list
storage.objects.restore
storage.objects.update
bigquery.datasets.create
bigquery.datasets.get
bigquery.datasets.getIamPolicy
bigquery.jobs.create
bigquery.models.getMetadata
bigquery.models.list
bigquery.routines.get
bigquery.routines.list
bigquery.tables.create
bigquery.tables.get
bigquery.tables.getData
bigquery.tables.getIamPolicy
bigquery.tables.list
bigquery.tables.update
bigquery.tables.updateData
dataplex.projects.search
resourcemanager.projects.get
```

{% endcode %}

### Set Up GCP Credentials with Cloud Shell

To configure access between dex and BigQuery, follow these steps using Google Cloud Shell:

1\. Open the [GCP Console](https://console.cloud.google.com/)

2\. In the top bar, click on the project selector and choose the project you created or reserved for dex.

3\. Launch Cloud Shell Editor: use the search bar at the top of the page to search for `Cloud Shell Editor` and click to open it.

4\. Inside the Cloud Shell Editor, click the **"Open Terminal"** button near the top of the screen.

5\. Paste the following command into the terminal to download the setup script:

{% code overflow="wrap" %}

```bash
wget https://raw.githubusercontent.com/dexlabsio/terraform-modules/refs/heads/main/gcp/cloud-shell/bigquery-service-account-setup.py
```

{% endcode %}

6\. Set your GCP project ID in the shell environment

```bash
export GOOGLE_CLOUD_PROJECT=$(gcloud config get-value project)
```

7\. Execute the setup script

```bash
python3 bigquery-service-account-setup.py
```

8\. Download your Service Account Key.

After running the script, it displays a **Service Account JSON key** on the screen.

{% hint style="info" %}
Download and store this key securely — you’ll need it when connecting dex to BigQuery.
{% endhint %}
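
Before pasting the key into dex, it can help to confirm that what you copied is complete. The sketch below is an optional sanity check, not part of the dex setup script: the function name `check_service_account_key` is hypothetical, and the field list is an assumption based on the fields a Google Cloud service account key file normally contains.

```python
import json

# Assumption: fields a Google Cloud service account key file normally contains.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(raw_json: str) -> list[str]:
    """Return a list of problems found in the key; an empty list means it looks usable."""
    try:
        key = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]

    problems = [f"missing field: {field}" for field in REQUIRED_FIELDS if not key.get(field)]
    # A key for another credential type (for example a user credential) is not what dex asks for.
    if key.get("type") and key["type"] != "service_account":
        problems.append("type is not 'service_account'")
    return problems
```

Call it with the text you copied from Cloud Shell, for example `check_service_account_key(open("key.json").read())`, where `key.json` is a placeholder file name. Never commit the key file to your Git repository.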

</details>

<details>

<summary>Step 3: Set up your GitHub/GitLab account</summary>

dex uses Git for version control, CI/CD, and collaboration. You can choose to set up with GitHub or GitLab. We'll guide you through the **GitHub setup** since most of our customers are familiar with it.

### Create a Personal GitHub Account

{% hint style="info" %}
Skip this if you already have a GitHub account.
{% endhint %}

1. Go to [GitHub](https://github.com)
2. Click **Sign Up**
3. Verify your email

### Create an Organization <a href="#create-an-organization" id="create-an-organization"></a>

1. Go to **Your Organizations** in the profile dropdown
2. Click **New Organization**
3. Choose a plan (**Free** or **Team**) and click **Create Organization**

### Create an Empty Repository <a href="#create-an-empty-repository" id="create-an-empty-repository"></a>

1. In your new organization, go to the **Repositories** tab
2. Click **New Repository**
3. Fill in:
   1. Repository name (e.g. `dex-analytics`)
   2. Optional description
   3. Set to **Private**
4. Do **not** add a README—repo must be empty

{% hint style="info" %}
**Important:** The repository must be **brand new** and **empty** (no commits or README).
{% endhint %}

### Create a Git Access Token <a href="#create-a-git-access-token" id="create-a-git-access-token"></a>

{% embed url="<https://youtu.be/FbjdGurClBo>" %}
How to Create a GitHub Access Token
{% endembed %}

1. Go to **Settings** > **Developer Settings** > **Personal Access Tokens**
2. Click **Generate new token (classic)**
3. Configure the token:
   * **Name**: `dex_lakehouse_access_token`
   * **Expiration**: Optional
   * **Scopes**: Select `repo`
4. Click **Generate Token**
5. **Copy and save the token**; you will enter it in the dex setup

{% hint style="info" %}
If you prefer to limit access to specific repositories, use a [fine-grained personal access token](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens#fine-grained-personal-access-tokens) instead of a classic token.
{% endhint %}

</details>

<details>

<summary>Step 4: Connect dex to GCP and Git provider</summary>

1. Log in to [dex](https://app.dexlabs.io/auth/signin?callbackUrl=%2F)
2. Change your temporary password, if asked
3. Select **Set Up with GCP**

</details>

<details>

<summary>Step 5: Set up your first Connection</summary>

In this example, we’ll connect to a sample Postgres database.

1. In the left-hand menu, go to **Connection > New Source**
2. Select **Postgres** from the connector catalog
3. Give your connection a name, like `Demo Postgres Database Connection`
4. Click **Next**, then enter the following credentials:
   1. **Host**: `dex-trial.db.aws.dexlabs.io`
   2. **Port**: `5432`
   3. **User**: `trial_user`
   4. **Database**: `postgres`
   5. **Password**: `%7R^RT4N#h#WdjpU2#or@W5`
   6. **SSH Tunnel**: Select **Don’t Use SSH Tunnel**
5. Click **Test** and then **Save**
6. Wait a few seconds while dex tests the connection
7. On the next screen, toggle all datapoints on
8. Click **Run** to execute a manual data replication
9. Click the **Runs** tab to check your replication status

</details>

<details>

<summary>Step 6: Build your first model</summary>

Now that we have raw data connected, we can start modeling it. In this example, we'll organize data into two layers: `raw` and `cleaned`. Most organizations build additional layers (like `trusted`, `analytics`, or `mart`) on top of these, but this will give us a solid foundation.

#### Create Your Model Folders

1. In the left-hand menu, go to **Develop**
2. Right-click the `Models` folder and create two subfolders:
   * `1.raw`
   * `2.cleaned`

#### Create Your First Model: `customers.sql`

1. Right-click on the `1.raw` folder and create a new file: `customers.sql`
2. Paste the following code into the file (update the `from` clause with your copied source):

```sql
    select
        customer_state as customer_state,
        customer_unique_id as customer_unique_id,
        regexp_extract(customer_id,r'^.{0,3}') as customer_id,
        customer_city as customer_city,
        customer_zip_code_prefix as customer_zip_code_prefix
    from 
        <your_copy_as_source_here>
```

3. If you encounter any errors related to the data source configuration, refer to [this section](/lakehouse-platform/develop-with-dex/accessing-data-sources.md#referencing-sources-in-models) in the documentation
4. Click **Save**
5. Click **Preview** and **Run** to see the results
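
The `regexp_extract(customer_id, r'^.{0,3}')` expression in `customers.sql` keeps at most the first three characters of `customer_id`. The sketch below reproduces that behavior in Python for illustration only. It assumes Python's `re` module treats this simple pattern the same way BigQuery does, and the sample IDs are made up.

```python
import re


def first_three_chars(customer_id: str) -> str:
    """Mimic regexp_extract(customer_id, r'^.{0,3}'): keep at most the first 3 characters."""
    match = re.match(r"^.{0,3}", customer_id)
    return match.group(0) if match else ""


# Hypothetical IDs, not taken from the sample database.
for sample in ("abcdef", "ab", ""):
    print(repr(sample), "->", repr(first_three_chars(sample)))
```

Values shorter than three characters pass through unchanged, because `{0,3}` allows anything from zero to three characters.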

#### Add a Second Model: `orders.sql`

1. Right-click `1.raw` again and add a new file: `orders.sql`
2. Paste in the following query:

   ```sql
   SELECT
       order_id,
       customer_id,
       order_status,
       order_purchase_timestamp,
       order_approved_at,
       order_delivered_carrier_date,
       order_delivered_customer_date,
       order_estimated_delivery_date
   FROM 
       <your_copy_as_source_here>
   ```
3. If you encounter any errors related to the data source configuration, refer to [this section](/lakehouse-platform/develop-with-dex/accessing-data-sources.md#referencing-sources-in-models) in the documentation
4. Click **Save**
5. Click **Preview** and **Run** to see the results

#### Add a Third Model: `order_payments.sql`

1. Right-click `1.raw` again and create: `order_payments.sql`
2. Paste in the following code:

   ```sql
       select
           payment_type as payment_type,
           payment_value as payment_value,
           payment_installments as payment_installments,
           payment_sequential as payment_sequential,
           order_id as order_id
       from 
           <your_copy_as_source_here>
   ```
3. If you encounter any errors related to the data source configuration, refer to [this section](/lakehouse-platform/develop-with-dex/accessing-data-sources.md#referencing-sources-in-models) in the documentation
4. Click **Save**
5. Click **Preview** and **Run** to validate the model

#### Create a Cleaned Model: `customer_orders.sql`

1. Right-click on `2.cleaned` and create a file called `customer_orders.sql`
2. Add the following code to join data across the three models:

```sql
{{
  config(
    tags=['finance']
  )
}}

with 

orders_info as (
    select
        order_id as order_id,
        customer_id as customer_id,
        order_purchase_timestamp as order_date,
        order_status as order_status
    from
        {{ ref('orders') }}
),

payments_info as (
    select
        payment_type as payment_type,
        payment_value as payment_value,
        payment_installments as payment_installments,
        order_id as order_id
    from 
        {{ ref('order_payments') }}
),

customer_info as (
    select
        customer_id as customer_id,
        customer_city as customer_city
    from
        {{ ref('customers') }}
)

select
    o.order_id,
    o.customer_id,
    ci.customer_city,
    py.payment_value,
    py.payment_type,
    py.payment_installments,
    o.order_date,
    o.order_status
from
    orders_info o
left join
    payments_info py on py.order_id = o.order_id
left join
    customer_info ci on ci.customer_id = o.customer_id
```

3. Click **Save**
4. Click **Preview** and **Run** to see the final output
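
If left joins are new to you, the sketch below shows the semantics the query relies on, using plain Python lists. It is an illustration only: the `left_join` helper and the sample rows are hypothetical, and the real join runs in BigQuery. Every row from the left side is kept, unmatched rows get empty values for the right-side columns, and a left row that matches several right rows appears once per match.

```python
def left_join(left, right, key):
    """Left-join two lists of dicts on `key`, following SQL LEFT JOIN semantics."""
    right_columns = sorted({col for row in right for col in row if col != key})
    by_key = {}
    for row in right:
        by_key.setdefault(row[key], []).append(row)

    joined = []
    for row in left:
        matches = by_key.get(row[key])
        if matches:
            # One output row per matching right-side row.
            for match in matches:
                joined.append({**row, **match})
        else:
            # No match: keep the left row and fill right-side columns with None.
            joined.append({**row, **{col: None for col in right_columns}})
    return joined


# Made-up rows for illustration.
orders = [
    {"order_id": "o1", "order_status": "delivered"},
    {"order_id": "o2", "order_status": "shipped"},
    {"order_id": "o3", "order_status": "canceled"},
]
payments = [
    {"order_id": "o1", "payment_value": 10.0},
    {"order_id": "o1", "payment_value": 5.0},
    {"order_id": "o2", "payment_value": 20.0},
]
rows = left_join(orders, payments, "order_id")
```

In this made-up data, order `o1` has two payments and therefore appears twice in `rows`, while `o3` has no payment and keeps a `None` payment value.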

{% hint style="info" %}
Read more about Models in the [Develop with dex](/lakehouse-platform/develop-with-dex.md) page
{% endhint %}

</details>

<details>

<summary>Step 7: Change the way your model is materialized</summary>

One of the most powerful features of dex is the ability to control how models are materialized in your warehouse—**without changing SQL**. With a single configuration value, you can switch models from **views** to **tables**, and vice versa.

This gives you the flexibility to optimize performance and cost, while keeping your modeling layer clean and focused on business logic.

By default, all models are materialized as **views**. You can override this at the **directory level**, so every model inside that folder uses a different materialization strategy.

#### Update Your Project Configuration

1. In your file explorer, open the `dbt_project.yml` file at the root of your project
2. Update the project `name` (line 5) to: `dex_lakehouse`
3. Configure your `1.raw` and `2.cleaned` models to be materialized as **tables** by adding this configuration under `models:` (line 28):

```yaml
...

# Configuring models
# Full documentation: https://dex-labs-1.gitbook.io/wip-dex-docs/project-settings-and-defaults

# In this example config, we tell dbt to build all models in the example/
# directory as views. These settings can be overridden in the individual model
# files using the `{{ config(...) }}` macro.
models:
  dex_lakehouse:
    # Config indicated by + and applies to all files under models/example/
    example:
      +materialized: view
    1.raw:
      +materialized: table
      +schema: raw
    2.cleaned:
      +materialized: table
      +schema: cleaned

```

4. Save the file
5. If you want to override materialization for a specific model, you can do it inline at the top of the file using the `config()` block:

```sql
{{
  config(
    materialized='view'
  )
}}
```
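
The order of precedence described in this step (inline `config()` first, then the folder setting, then the default of `view`) can be sketched as a small lookup. This is a simplified illustration, not how dex or dbt resolves configuration internally; the function name and the dictionary layout are assumptions, and it only looks at the top-level folder.

```python
def resolve_materialization(model_path, folder_config, inline_config=None, default="view"):
    """Pick the materialization for one model.

    Precedence, most specific first: inline config() > folder setting > default.
    """
    if inline_config and "materialized" in inline_config:
        return inline_config["materialized"]
    folder = model_path.split("/")[0]  # simplification: top-level folder only
    return folder_config.get(folder, {}).get("materialized", default)


# Mirrors the folder settings added to dbt_project.yml above.
folder_config = {
    "1.raw": {"materialized": "table"},
    "2.cleaned": {"materialized": "table"},
}

print(resolve_materialization("1.raw/customers.sql", folder_config))
print(resolve_materialization("2.cleaned/customer_orders.sql", folder_config,
                              inline_config={"materialized": "view"}))
```

The first call falls back to the folder setting, while the second is overridden by the inline block.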

{% hint style="info" %}
Read more about materialization in the [Materialization](/lakehouse-platform/develop-with-dex/materialization.md) page
{% endhint %}

</details>

<details>

<summary>Step 8: Add tests to your models</summary>

* **Right-click** the `2.cleaned` folder in the Explorer panel and select **New File**.
* Name the file: `customer_orders.yml`
* Paste the following content into the file:

```yaml
version: 2

models:
  - name: customer_orders
    description: "Joined dataset combining order, payment, and customer information."
    columns:
      - name: order_id
        description: "Unique identifier for each order"
        tests:
          - not_null
          - unique

```

#### What This Does

* **Model-level metadata**: You are describing the `customer_orders` model.
* **Column-level tests**:
  * `not_null`: Ensures every row has an `order_id`
  * `unique`: Ensures no duplicate `order_id` exists
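
To make the two checks concrete, here is a minimal Python sketch of what they look for. The real tests run as SQL queries against your warehouse; the helper names and sample rows below are hypothetical. Each function returns the offending values, so an empty result means the test passes.

```python
from collections import Counter


def not_null_failures(rows, column):
    """Rows where `column` is missing or null; an empty list means the test passes."""
    return [row for row in rows if row.get(column) is None]


def unique_failures(rows, column):
    """Non-null values of `column` that occur more than once; empty means the test passes."""
    counts = Counter(row[column] for row in rows if row.get(column) is not None)
    return sorted(value for value, n in counts.items() if n > 1)


# Made-up rows for illustration.
sample_rows = [
    {"order_id": "o1"},
    {"order_id": "o1"},
    {"order_id": None},
    {"order_id": "o2"},
]
null_rows = not_null_failures(sample_rows, "order_id")
duplicates = unique_failures(sample_rows, "order_id")
```

With these rows, both checks report a failure: one row has no `order_id`, and `o1` appears twice.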

Once you've saved the `.yml` file:

* Run the model again.
* The tests will automatically execute right after the model builds.
* If a test fails, the flow status will change to **Failed**, and a notification will be sent to the Notification Center for review.

{% hint style="info" %}
Read more about tests in the [Tests](/lakehouse-platform/develop-with-dex/tests.md) page
{% endhint %}

</details>

<details>

<summary>Step 9: Commit your changes</summary>

After building your first set of models in dex (`customers.sql`, `orders.sql`, `order_payments.sql`, and `customer_orders.sql`) and configuring metadata in `customer_orders.yml` and `dbt_project.yml`, it's time to commit those changes to version control.

dex provides an integrated Git workflow so you can track, manage, and push your changes directly from the platform.

#### 1. Open the Git Tab

Navigate to the **Git** tab from the left-hand **Develop** menu. dex will list all the files you've created or modified since your last commit. In this case, you should see the following:

* `models/1.raw/customers.sql`
* `models/1.raw/orders.sql`
* `models/1.raw/order_payments.sql`
* `models/2.cleaned/customer_orders.sql`
* `models/2.cleaned/customer_orders.yml`
* `dbt_project.yml`

#### 2. Stage Your Changes

Review the file list and click the checkbox next to each file you'd like to include in the commit. Click on any filename to open a diff view that highlights the changes made—ideal for validating updates before they are finalized.

Once you're ready, enter a **commit title** (e.g., `Initial modeling: customer orders pipeline`) and optionally include a **description** to give your team more context.

Click **Stage files to commit**, then **Commit** to save your changes locally.

#### 3. Push to Your Git Repository

With the changes committed, you’ll now want to push them to your remote Git repository. Click the **Push** button at the bottom of the Git panel.

dex will push to the branch configured for the current environment. For example, if you're in the `prod` environment, it will push to the `prod` branch in your GitHub or GitLab repository.

Once pushed, your changes become available to the rest of your team and will be picked up by any Flows, automations, or scheduled runs configured on that branch.

{% hint style="info" %}
Read more about GitHub/GitLab integration in the [Version Control with Git](/lakehouse-platform/version-control-with-git.md) page
{% endhint %}

</details>

<details>

<summary>Step 10: Automate your pipeline with Flow</summary>

Now that you've built and tested your data models, it's time to automate the entire workflow. In dex, this is done using **Flows**—orchestration pipelines that can run on a schedule or be triggered manually.

### Creating a Flow to Automate Your Workflow

#### 1. Navigate to the Flows menu

Open the **Flows** section from the left-hand navigation menu.

#### 2. Create a new flow

Click the **New Flow** button on the top right. Fill in the required fields:

* **Name**: Give your flow a meaningful name (e.g., `daily ecommerce pipeline`)
* **Description** (optional): Briefly describe what the flow will do

Then, configure the schedule:

* Set the run time to **Every day at 04:00 AM**
* Click **Continue** to proceed

#### 3. Add a Connection node

The first step in your pipeline is data ingestion. Create a Connection node and select the connector you configured in Step 5 of your setup. This ensures your source data is always up-to-date before the transformations run.

#### 4. Add a new Transformation node

Create a **Transformation** node by selecting:

* **Project**: your current working project
* **Environment**: the correct environment (e.g., `production` or `dev`)

In the **Include** field, type: `+customer_orders.sql`

Then press `Enter`.

{% hint style="info" %}
This argument tells dex to execute the `customer_orders.sql` model and all of its upstream dependencies. The `+` prefix automatically includes every model required for this one to work—no need to manually list them.
{% endhint %}
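
Conceptually, the `+` prefix walks the dependency graph upward from the selected model. The sketch below illustrates that idea with a plain dictionary; it is not dex's implementation, and the `parents` mapping is written by hand here to mirror the `ref()` calls in `customer_orders.sql`.

```python
def select_with_upstream(model, parents):
    """Return `model` plus every model it depends on, directly or indirectly."""
    selected = set()
    stack = [model]
    while stack:
        current = stack.pop()
        if current in selected:
            continue  # already visited via another path
        selected.add(current)
        stack.extend(parents.get(current, []))
    return selected


# Hand-written dependency map: each model lists the models it references.
parents = {
    "customer_orders": ["orders", "order_payments", "customers"],
    "orders": [],
    "order_payments": [],
    "customers": [],
}
selection = select_with_upstream("customer_orders", parents)
```

Here `selection` contains `customer_orders` and the three raw models it references, which is why you only need to name the final model in the **Include** field.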

#### 5. Save and Run

Click **Save** to store your Flow configuration.

Now, run it manually by using the **Run Now** option. This helps confirm your flow works as expected before it runs on schedule.

#### 6. Done!

Your ingestion and transformation flow is now live and will run automatically **every day at 04:00 AM**.

#### 7. Monitor Runs

You can view run history and inspect execution results by clicking the **Runs** tab inside your flow. This includes:

* Run status (e.g., succeeded or failed)
* Start and end time
* Task execution logs
* Troubleshooting details for each model

{% hint style="info" %}
Read more about Flows in the [Flows and Automation](/lakehouse-platform/flows-and-automation.md) page
{% endhint %}

</details>

### Congratulations! You’ve Completed Your First Data Journey

You’ve just built a fully operational data pipeline using dex—from ingestion to transformation to orchestration.

All data you’ve generated and automated is now stored in your own cloud environment—fully queryable and ready to be consumed by any BI tool, notebook, or data science workflow.

This is a solid foundation that mirrors real-world data engineering best practices. But this is just the beginning.

dex is built to grow with your complexity—whether that’s more data sources, advanced transformations, or multiple teams collaborating on analytics. The rest of our documentation will help you expand your capabilities, customize workflows, and unlock new use cases.

Happy building!

