# Sync Sources

Sync Sources let you access external datasets that already exist in your cloud storage, even if they were not ingested through native dex Connections.

This lets dex work with data that already lives in buckets, databases, or warehouses managed outside of dex’s ingestion engine.

### Why Use Sync Sources?

* Leverage datasets that were uploaded, replicated, or generated externally.
* Reference tables from external pipelines, data lakes, or manually loaded storage locations.
* Automatically configure syncs for datasets ingested using dex native connectors.

### How to Configure a Sync Source

<figure><img src="/files/RJxTT8161QTLjg399jhL" alt=""><figcaption><p>Configuring Sync Sources</p></figcaption></figure>

To configure a new sync source:

1. Navigate to the **Sync Sources** menu in the left sidebar.
2. Click **+ Add** to create a new Sync Source.
3. Provide the following information:
   * **Environment**: The environment (e.g. dev, staging, prod) in which this dataset exists.
   * **Dataset Name**: The exact name of the dataset in your cloud storage. Make sure it matches the naming in your warehouse or lake.
4. Click **Save**.

{% hint style="info" %}
dex will search for this dataset in the default project or bucket defined in your environment's configuration.
{% endhint %}

### Syncing the Source

Once the source is configured, **sync** it so that it becomes available during development:

1. Open the **Develop** menu in the sidebar.
2. Inside the **Explorer** tab, locate the **Sync Sources** button at the top.

<figure><img src="/files/9LEt6IZw2yy8Wf8kmcgg" alt=""><figcaption><p>Sync Sources button</p></figcaption></figure>

3. A panel will display all configured Sync Sources and their sync status:
   * 🟢 Green dot = Source is synced and ready to use.
   * 🟡 Yellow dot = Source is not yet synced.
4. Click any unsynced source to trigger a sync.

dex will automatically generate a `.yml` file describing the dataset schema, including:

* Metadata
* Identifiers
* List of tables

This YAML file is the final asset required before you can reference the source in your transformation models.

<figure><img src="/files/UzYKJTw5h4gtHz14EDRd" alt=""><figcaption><p>Example .yml file</p></figcaption></figure>

### Native Connectors Sync Automatically

If your dataset was ingested through a **dex native connector**, the Sync Source will be:

* Automatically created
* Automatically synced
* Instantly usable in your transformation layer

### Referencing a Source

Once the sync completes, you can use the `source()` function inside your SQL or Python models. For example, in a SQL model:

```sql
select * from {{ source('my_dataset', 'orders') }}
```
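As a slightly fuller sketch, a model might combine two tables from the same synced dataset. Only `my_dataset` and `orders` come from the example above; the `customers` table and every column name are hypothetical placeholders, so substitute the tables listed in your generated `.yml` file.

```sql
-- Hypothetical model: `customers` and all column names are placeholders.
-- Both tables are assumed to be listed in the synced source's .yml file.
with orders as (
    select * from {{ source('my_dataset', 'orders') }}
),

customers as (
    select * from {{ source('my_dataset', 'customers') }}
)

select
    o.order_id,
    o.order_date,
    c.customer_name
from orders as o
left join customers as c
    on o.customer_id = c.customer_id
```

Keeping each `source()` call in its own CTE at the top of the model makes it easy to see which external tables the model depends on.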

For more details on how dex connects to various databases, files, and APIs, continue reading [Accessing Data Sources](/lakehouse-platform/develop-with-dex/accessing-data-sources.md).

