# Sandboxes

> How Lightdash runs LLM-generated code safely in isolated sandboxes, and how to configure the sandbox provider on a self-hosted instance.

<Note>
  🛠 This page is for engineering teams self-hosting their own Lightdash instance. On Lightdash Cloud, sandboxes are fully managed for you — there's nothing to configure.
</Note>

## What sandboxes are for

Some Lightdash features use an AI agent (Claude Code) that **writes and runs code on
your behalf**. Today that's:

* **AI writeback** — the agent edits your dbt project (e.g. adds a metric or dimension),
  runs `lightdash compile` to validate it, and opens a pull request.
* **Data app generation** — the agent generates and builds a small web app from a prompt.

Running model-generated code directly on the Lightdash server would be unsafe: the code
is untrusted, can run arbitrary commands, and needs its own toolchain (git, dbt, the
Lightdash CLI, Node). Lightdash instead runs each agent inside a **sandbox** — an isolated,
disposable environment with a constrained network. The agent does its work there, Lightdash
collects the result (a PR, a built app), and the sandbox is torn down.

Sandboxes are also what make these features **fast** and **multi-turn**: a sandbox can be
suspended between turns and resumed later, so a conversation with the agent keeps its state
without holding a container open the whole time.

## Sandbox providers

The sandbox backend is pluggable. Lightdash talks to a provider-neutral interface, so the
same feature code runs on whichever backend your deployment is configured for. You select
the provider with the `SANDBOX_PROVIDER` environment variable.

| Provider                           | `SANDBOX_PROVIDER` | Use for                           | Isolation |
| ---------------------------------- | ------------------ | --------------------------------- | --------- |
| **E2B** (default)                  | `e2b`              | Production / managed              | microVM   |
| **AWS Lambda MicroVMs**            | `lambda-microvm`   | Production on AWS (self-hosted)   | microVM   |
| **Azure Container Apps Sandboxes** | `azure-sandboxes`  | Production on Azure (self-hosted) | microVM   |
| **Local Docker**                   | `docker`           | Local development only            | container |

**E2B**, **AWS Lambda MicroVMs**, and **Azure Container Apps Sandboxes** are all supported
production backends. E2B is the managed default; the AWS and Azure providers are for teams
who want sandboxes to run inside their own cloud account. More providers (Kubernetes, ECS)
are planned.

<Warning>
  The **local Docker provider is for development only**. It launches plain Docker containers
  via the Docker socket, which is root-equivalent on the host and provides no real isolation
  between the sandbox and your machine. It **refuses to start when `NODE_ENV=production`**.
  Do not use it for a production deployment.
</Warning>
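
A deploy script could mirror the rule in the warning above with a small guard that runs before Lightdash starts. This is a minimal sketch, not Lightdash's own check: the function name `docker_provider_allowed` is hypothetical, and it assumes `SANDBOX_PROVIDER` and `NODE_ENV` are already set in the shell that runs it.

```bash
# Hypothetical deploy-time guard; Lightdash performs its own check at startup.
docker_provider_allowed() {
  # Allowed unless the Docker provider is combined with NODE_ENV=production.
  [ "${SANDBOX_PROVIDER:-e2b}" != "docker" ] || [ "${NODE_ENV:-}" != "production" ]
}

if ! docker_provider_allowed; then
  echo "SANDBOX_PROVIDER=docker is for development only; use e2b, lambda-microvm, or azure-sandboxes" >&2
  exit 1
fi
```

Failing in the deploy script surfaces the misconfiguration before the backend is started, rather than at startup.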

## E2B (production default)

[E2B](https://e2b.dev) runs each sandbox as a Firecracker microVM in E2B's cloud. It's the
default — if you don't set `SANDBOX_PROVIDER`, Lightdash uses E2B.

To use it you need an E2B account and API key, and the agent needs an Anthropic API key:

```bash
SANDBOX_PROVIDER=e2b            # default, can be omitted
E2B_API_KEY=e2b_...            # from your E2B dashboard
ANTHROPIC_API_KEY=sk-ant-...   # the agent (Claude Code) runs inside the sandbox
```

The sandbox images are E2B *templates*. Lightdash uses separate templates for data apps and
for AI writeback so they can be pinned or rolled back independently. These default to the
published Lightdash templates and rarely need to be set:

```bash
E2B_TEMPLATE_NAME=lightdash/lightdash-data-app
E2B_TEMPLATE_TAG=<lightdash-version>             # defaults to your Lightdash version
E2B_AI_WRITEBACK_TEMPLATE_NAME=lightdash/lightdash-ai-writeback
E2B_AI_WRITEBACK_TEMPLATE_TAG=<lightdash-version>
```

## AWS Lambda MicroVMs (self-hosted production)

AWS Lambda MicroVMs run each sandbox as a Firecracker microVM **inside your own AWS
account**, so untrusted agent code and your repository contents never leave your
infrastructure. The microVMs have **no public IP** — your backend reaches each one
through an AWS-managed endpoint that requires a short-lived per-microVM token — and you
control their outbound network access (see [Networking and IAM](#networking-and-iam)).

This is the **recommended sandbox provider for customers deploying Lightdash on AWS** —
it keeps the sandbox boundary inside your existing AWS account and avoids sending agent
workloads or repository contents to a third-party service.

### Prerequisites

Provision these with your own IaC, in the **same AWS account and region your Lightdash
backend already runs in**:

* **Two MicroVM images** — one for data app generation and one for AI writeback (they
  bundle different toolchains). Build them from the Dockerfiles in the Lightdash repo
  (`sandboxes/data-apps/`, `sandboxes/ai-writeback/`, and the exec agent in
  `sandboxes/microvm-agent/`), push them to ECR, and register each as a Lambda MicroVM
  image on the AWS-managed `al2023` base — `ARM_64`, 4 GB memory, with the agent's
  `/ready` hook on port 8080. Each registration returns an **image ARN** for the config
  below.
* **Control-plane permissions on your backend's existing IAM role** — add `RunMicrovm`,
  `GetMicrovm`, `SuspendMicrovm`, `ResumeMicrovm`, `TerminateMicrovm`, and
  `CreateMicrovmAuthToken`.

<Note>
  Registering an image uses AWS's `create-microvm-image`, which needs a build role (trusting
  `lambda.amazonaws.com`) and an S3 location to stage the build context. Both are
  **build-time only**: the staging location is separate from the S3 bucket Lightdash already
  uses for results and snapshots, and the backend never touches either at runtime.
</Note>

### Configure the provider

```bash
SANDBOX_PROVIDER=lambda-microvm
ANTHROPIC_API_KEY=sk-ant-...   # the agent (Claude Code) runs inside the microVM

# Region the microVMs run in (defaults to eu-west-1, the EU launch region)
LAMBDA_MICROVM_REGION=eu-west-1

# The image ARNs from registering the two MicroVM images above. Required.
LAMBDA_MICROVM_DATA_APP_IMAGE_ARN=arn:aws:lambda:<region>:<account>:microvm-image/...
LAMBDA_MICROVM_AI_WRITEBACK_IMAGE_ARN=arn:aws:lambda:<region>:<account>:microvm-image/...
```

The backend uses its ambient AWS credentials (instance role / IRSA / standard SDK
credential chain) to call the Lambda MicroVMs control plane, so no access keys are
configured here.

### Networking and IAM

<Warning>
  **Configure the network connectors before going to production.** The AWS-managed
  defaults give the microVM open inbound and outbound access, so untrusted agent code
  can reach the public internet from inside your AWS account. We recommend setting
  `LAMBDA_MICROVM_EGRESS_CONNECTOR_ARN` to a VPC connector that has **no outbound
  access by default**, then allowing only the destinations the agent actually needs
  (your dbt repository host, the Anthropic / Bedrock API, your ECR registry).
</Warning>

Set the following variables to tighten the network boundary or to give the microVM an IAM role:

```bash
# IAM role the microVM assumes (e.g. to pull from a private ECR or reach AWS APIs)
LAMBDA_MICROVM_EXECUTION_ROLE_ARN=arn:aws:iam::...:role/lightdash-sandbox

# Network connectors. Default to AWS-managed open ingress/egress; point these at
# your own VPC connectors to constrain traffic. We strongly recommend an egress
# connector that denies outbound traffic by default.
LAMBDA_MICROVM_INGRESS_CONNECTOR_ARN=arn:aws:lambda:<region>:aws:network-connector:aws-network-connector:ALL_INGRESS
LAMBDA_MICROVM_EGRESS_CONNECTOR_ARN=arn:aws:lambda:<region>:aws:network-connector:aws-network-connector:INTERNET_EGRESS
```
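
Because the defaults are open, it can help to fail loudly when the egress connector has not been overridden. The check below is a sketch under stated assumptions: `egress_is_open` is a hypothetical helper, and it treats an unset variable or an ARN ending in `aws-network-connector:INTERNET_EGRESS` (the AWS-managed default shown above) as open. It does not inspect what a custom connector actually allows.

```bash
# Hypothetical check: succeeds when egress is still the AWS-managed open default.
egress_is_open() {
  case "${LAMBDA_MICROVM_EGRESS_CONNECTOR_ARN:-}" in
    ""|*:aws-network-connector:INTERNET_EGRESS) return 0 ;;  # unset falls back to the open default
    *) return 1 ;;                                           # a custom connector is configured
  esac
}

if [ "${SANDBOX_PROVIDER:-e2b}" = "lambda-microvm" ] && egress_is_open; then
  echo "warning: microVM egress is open to the internet; set LAMBDA_MICROVM_EGRESS_CONNECTOR_ARN" >&2
fi
```

A custom connector ARN only passes this check by name, so the deny-by-default behavior still needs to be verified on the connector itself.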

## Azure Container Apps Sandboxes (self-hosted production)

[Azure Container Apps Sandboxes](https://learn.microsoft.com/azure/container-apps/sandboxes-overview)
run each sandbox as an isolated, microVM-class environment **inside your own Azure
subscription**, so untrusted agent code and your repository contents never leave your
infrastructure. Sandboxes have **native suspend/resume** (a full memory + disk snapshot
with sub-second restore), which is what keeps multi-turn agent conversations fast.

This is the **recommended sandbox provider for customers deploying Lightdash on Azure** —
it keeps the sandbox boundary inside your existing Azure tenant and avoids sending agent
workloads or repository contents to a third-party service.

<Note>
  Azure Container Apps Sandboxes is currently an Azure **preview** feature. It requires a
  Microsoft Entra ID account (personal Microsoft accounts aren't supported), and its API
  surface may change while in preview.
</Note>

### Prerequisites

Provision these in the **same Azure subscription and region your Lightdash backend runs
in**, using the [`aca` CLI](https://aka.ms/aca/sandboxes/dev) or the
[Sandboxes portal](https://aka.ms/aca/sandboxes/portal):

* **A sandbox group per feature** — one for data app generation and one for AI writeback
  (they bundle different toolchains). A sandbox group (`Microsoft.App/SandboxGroups`) is the
  management boundary that holds a feature's sandboxes and disk image. Give each group a
  **Memory-mode auto-suspend** lifecycle policy so idle sandboxes snapshot and scale to zero.
* **A disk image per group** — build the two images from the Dockerfiles in the Lightdash
  repo (`sandboxes/data-apps/`, `sandboxes/ai-writeback/`), push them to a container registry
  (e.g. Azure Container Registry), and register each as a disk image in its sandbox group.
  Registration returns a disk image **ID** for the config below. (Unlike the AWS provider,
  there is no in-VM agent to build — Sandboxes expose a native command/file API.)
* **A workload identity with the data-plane role** — grant your backend's managed identity
  the **Container Apps SandboxGroup Data Owner** role on each sandbox group. Lightdash uses
  `DefaultAzureCredential` (workload identity on AKS, or the standard Azure credential chain)
  to authenticate — **no client secret is configured here**.
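
The role grant in the last prerequisite can be scripted with the Azure CLI's `az role assignment create`. The sketch below only prints the commands so they can be reviewed first. It rests on assumptions: the helper names are hypothetical, the scope follows the general ARM resource ID pattern for the `Microsoft.App/SandboxGroups` type named above (which may differ while the feature is in preview), and the identity's object ID is a value you supply.

```bash
# Hypothetical helper: build the ARM resource ID of a sandbox group (assumed shape).
# $1 = subscription ID, $2 = resource group, $3 = sandbox group name
sandbox_group_scope() {
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.App/SandboxGroups/%s\n' "$1" "$2" "$3"
}

# Print one role assignment command per sandbox group, without running it.
# $1 = object ID of the backend's managed identity
print_role_assignments() {
  local group
  for group in "$AZURE_SANDBOXES_DATA_APP_GROUP" "$AZURE_SANDBOXES_AI_WRITEBACK_GROUP"; do
    printf 'az role assignment create --assignee %s --role "Container Apps SandboxGroup Data Owner" --scope %s\n' \
      "$1" \
      "$(sandbox_group_scope "$AZURE_SANDBOXES_SUBSCRIPTION_ID" "$AZURE_SANDBOXES_RESOURCE_GROUP" "$group")"
  done
}
```

Calling `print_role_assignments <object-id>` with the `AZURE_SANDBOXES_*` variables set prints two commands, one per group, which you can check against the portal before running them.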

### Configure the provider

```bash
SANDBOX_PROVIDER=azure-sandboxes
ANTHROPIC_API_KEY=sk-ant-...   # the agent (Claude Code) runs inside the sandbox

# Where your sandbox groups live
AZURE_SANDBOXES_SUBSCRIPTION_ID=<subscription-id>
AZURE_SANDBOXES_RESOURCE_GROUP=<resource-group>
AZURE_SANDBOXES_REGION=eastus2

# Per-feature sandbox group + disk image ID (from the prerequisites above). Required.
AZURE_SANDBOXES_DATA_APP_GROUP=lightdash-data-app
AZURE_SANDBOXES_DATA_APP_DISK_IMAGE=<data-app-disk-image-id>
AZURE_SANDBOXES_AI_WRITEBACK_GROUP=lightdash-writeback
AZURE_SANDBOXES_AI_WRITEBACK_DISK_IMAGE=<writeback-disk-image-id>

# Optional — sandbox size (XS/S/M/L, default M)
AZURE_SANDBOXES_RESOURCE_TIER=M
```

Egress is locked down automatically. Each sandbox launches with a **default-deny egress
policy** that allows only the hosts the agent needs (the Anthropic API and your git host).
**Full traffic inspection** lets the platform apply that policy to all traffic, so untrusted
agent code can't reach any other destination: outbound requests to non-allowlisted hosts are
rejected, non-HTTP egress is blocked, and all other ports are closed.

## Local Docker provider (development)

For local development you can run sandboxes as plain Docker containers on your own machine
— no E2B account required. This is the recommended way to work on or try the AI features
locally.

The provider uses the same sandbox images that back the E2B templates, built as plain local
Docker images. It uses two **separate** images (one per toolchain), mirroring the two E2B templates:

| Image (default tag)            | Built from                | Used by             |
| ------------------------------ | ------------------------- | ------------------- |
| `lightdash-sandbox:local`      | `sandboxes/data-apps/`    | Data app generation |
| `lightdash-ai-writeback:local` | `sandboxes/ai-writeback/` | AI writeback        |

### Prerequisites

* Docker running locally, with the daemon reachable from the Lightdash backend.
* S3-compatible object storage configured (locally this is MinIO). Suspended-sandbox
  snapshots are tarred to object storage so a conversation survives the container being
  destroyed — see [external object storage](/self-host/customize-deployment/configure-lightdash-to-use-external-object-storage).
* An Anthropic API key (`ANTHROPIC_API_KEY`) for the agent.

### Setup

1. Build the local sandbox images (each builds from `sandboxes/<feature>/`):

   ```bash
   ./sandboxes/data-apps/build-local-image.sh        # -> lightdash-sandbox:local
   ./sandboxes/ai-writeback/build-local-image.sh     # -> lightdash-ai-writeback:local
   ```

   These are large (the writeback image bundles dbt, the Lightdash CLI and Claude Code) and
   only need rebuilding when the sandbox toolchain changes.

2. Point Lightdash at the Docker provider:

   ```bash
   SANDBOX_PROVIDER=docker
   # optional — these are the defaults:
   SANDBOX_DOCKER_IMAGE=lightdash-sandbox:local
   SANDBOX_AI_WRITEBACK_DOCKER_IMAGE=lightdash-ai-writeback:local
   ```

3. Restart the **backend and the scheduler** so both pick up the new environment. Data app
   generation runs in the scheduler worker, so a stale `SANDBOX_PROVIDER` there keeps it
   on E2B. (With PM2, a plain `restart` reuses the cached environment. To reload the env
   file, delete and start the processes again, or restart with `--update-env`.)

## Snapshot lifecycle

Every turn suspends its own sandbox, so in steady state nothing sits idle. Two timers
configure the cloud-side idle policy as a backstop — when to auto-suspend a sandbox left
running and when to auto-terminate a suspended one:

```bash
SANDBOX_IDLE_TIMEOUT_MS=1800000        # auto-suspend a running-but-idle sandbox (default 30 min)
SANDBOX_SNAPSHOT_RETENTION_MS=604800000 # auto-terminate a suspended microVM (default 7 days); also how long a thread stays resumable
```

`SANDBOX_IDLE_TIMEOUT_MS` feeds the auto-suspend policy on both the **Lambda MicroVMs** and
**Azure Sandboxes** providers (for Azure it sets each sandbox's Memory-mode auto-suspend
interval). `SANDBOX_SNAPSHOT_RETENTION_MS` is read only by Lambda MicroVMs — on Azure,
suspended-sandbox retention is governed by the sandbox group's own auto-delete policy.
**E2B** manages idle sandboxes itself, and the **Docker** dev provider has no idle handling.
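
Both timers take milliseconds, which are easy to mistype. The helpers below are a small sketch for deriving values from minutes and days using shell arithmetic. The function names are hypothetical and not part of Lightdash.

```bash
# Hypothetical helpers for deriving the millisecond values.
minutes_to_ms() { echo $(( $1 * 60 * 1000 )); }
days_to_ms()    { echo $(( $1 * 24 * 60 * 60 * 1000 )); }

# Reproduces the documented defaults.
echo "SANDBOX_IDLE_TIMEOUT_MS=$(minutes_to_ms 30)"        # prints SANDBOX_IDLE_TIMEOUT_MS=1800000
echo "SANDBOX_SNAPSHOT_RETENTION_MS=$(days_to_ms 7)"      # prints SANDBOX_SNAPSHOT_RETENTION_MS=604800000
```

For example, `minutes_to_ms 10` gives `600000` and `days_to_ms 2` gives `172800000`.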

## Environment variable reference

| Variable                                  | Default                                          | Description                                                                                                                                            |
| ----------------------------------------- | ------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `SANDBOX_PROVIDER`                        | `e2b`                                            | Sandbox backend: `e2b`, `lambda-microvm`, `azure-sandboxes`, or `docker`.                                                                              |
| `ANTHROPIC_API_KEY`                       | —                                                | API key for the Claude Code agent running inside the sandbox.                                                                                          |
| `E2B_API_KEY`                             | —                                                | E2B API key (required when `SANDBOX_PROVIDER=e2b`).                                                                                                    |
| `E2B_TEMPLATE_NAME`                       | `lightdash/lightdash-data-app`                   | E2B template for data app sandboxes.                                                                                                                   |
| `E2B_TEMPLATE_TAG`                        | Lightdash version                                | Tag of the data app template to launch.                                                                                                                |
| `E2B_AI_WRITEBACK_TEMPLATE_NAME`          | `lightdash/lightdash-ai-writeback`               | E2B template for writeback sandboxes.                                                                                                                  |
| `E2B_AI_WRITEBACK_TEMPLATE_TAG`           | Lightdash version                                | Tag of the writeback template to launch.                                                                                                               |
| `LAMBDA_MICROVM_REGION`                   | `eu-west-1`                                      | AWS region the microVMs run in (`lambda-microvm`).                                                                                                     |
| `LAMBDA_MICROVM_DATA_APP_IMAGE_ARN`       | —                                                | Image ARN for data app microVMs (required when `SANDBOX_PROVIDER=lambda-microvm`).                                                                     |
| `LAMBDA_MICROVM_AI_WRITEBACK_IMAGE_ARN`   | —                                                | Image ARN for writeback microVMs (required when `SANDBOX_PROVIDER=lambda-microvm`).                                                                    |
| `LAMBDA_MICROVM_EXECUTION_ROLE_ARN`       | —                                                | Optional IAM role the microVM assumes.                                                                                                                 |
| `LAMBDA_MICROVM_INGRESS_CONNECTOR_ARN`    | AWS-managed `ALL_INGRESS`                        | Optional ingress network connector.                                                                                                                    |
| `LAMBDA_MICROVM_EGRESS_CONNECTOR_ARN`     | AWS-managed `INTERNET_EGRESS`                    | Optional egress network connector.                                                                                                                     |
| `AZURE_SANDBOXES_SUBSCRIPTION_ID`         | —                                                | Azure subscription holding the sandbox groups (required when `SANDBOX_PROVIDER=azure-sandboxes`).                                                      |
| `AZURE_SANDBOXES_RESOURCE_GROUP`          | —                                                | Resource group holding the sandbox groups (required when `azure-sandboxes`).                                                                           |
| `AZURE_SANDBOXES_REGION`                  | `eastus2`                                        | Region the sandboxes run in (selects the data-plane endpoint).                                                                                         |
| `AZURE_SANDBOXES_DATA_APP_GROUP`          | —                                                | Sandbox group for data app sandboxes (required when `azure-sandboxes`).                                                                                |
| `AZURE_SANDBOXES_DATA_APP_DISK_IMAGE`     | —                                                | Disk image ID for data app sandboxes (required when `azure-sandboxes`).                                                                                |
| `AZURE_SANDBOXES_AI_WRITEBACK_GROUP`      | —                                                | Sandbox group for writeback sandboxes (required when `azure-sandboxes`).                                                                               |
| `AZURE_SANDBOXES_AI_WRITEBACK_DISK_IMAGE` | —                                                | Disk image ID for writeback sandboxes (required when `azure-sandboxes`).                                                                               |
| `AZURE_SANDBOXES_RESOURCE_TIER`           | `M`                                              | Sandbox size: `XS`, `S`, `M`, or `L`.                                                                                                                  |
| `AZURE_SANDBOXES_API_VERSION`             | `2026-02-01-preview`                             | Azure Sandboxes data-plane API version.                                                                                                                |
| `AZURE_SANDBOXES_TOKEN_SCOPE`             | `https://management.azuredevcompute.io/.default` | Entra token scope for the data plane.                                                                                                                  |
| `SANDBOX_DOCKER_IMAGE`                    | `lightdash-sandbox:local`                        | Local image for data app sandboxes (`docker` provider).                                                                                                |
| `SANDBOX_AI_WRITEBACK_DOCKER_IMAGE`       | `lightdash-ai-writeback:local`                   | Local image for writeback sandboxes (`docker` provider).                                                                                               |
| `SANDBOX_IDLE_TIMEOUT_MS`                 | `1800000` (30 min)                               | Auto-suspend a running-but-idle sandbox (`lambda-microvm` and `azure-sandboxes`). Ignored by `e2b`/`docker`.                                           |
| `SANDBOX_SNAPSHOT_RETENTION_MS`           | `604800000` (7 days)                             | Lambda MicroVMs idle policy: auto-terminate a suspended microVM (also how long a thread stays resumable). Ignored by `e2b`/`azure-sandboxes`/`docker`. |
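
The "required when" notes in the table above can be turned into a preflight check that runs before the backend and scheduler start. This is a sketch rather than anything Lightdash ships: `missing_sandbox_vars` is a hypothetical name, and treating `ANTHROPIC_API_KEY` as required for every provider is an assumption based on the per-provider sections.

```bash
# Hypothetical preflight helper; not part of Lightdash.
# Prints each required-but-unset variable for the selected provider and
# returns 1 if any are missing (2 for an unknown provider).
missing_sandbox_vars() {
  local provider required name value status
  provider="${SANDBOX_PROVIDER:-e2b}"   # e2b is the documented default
  required="ANTHROPIC_API_KEY"          # assumption: needed by every provider
  case "$provider" in
    e2b)
      required="$required E2B_API_KEY" ;;
    lambda-microvm)
      required="$required LAMBDA_MICROVM_DATA_APP_IMAGE_ARN LAMBDA_MICROVM_AI_WRITEBACK_IMAGE_ARN" ;;
    azure-sandboxes)
      required="$required AZURE_SANDBOXES_SUBSCRIPTION_ID AZURE_SANDBOXES_RESOURCE_GROUP"
      required="$required AZURE_SANDBOXES_DATA_APP_GROUP AZURE_SANDBOXES_DATA_APP_DISK_IMAGE"
      required="$required AZURE_SANDBOXES_AI_WRITEBACK_GROUP AZURE_SANDBOXES_AI_WRITEBACK_DISK_IMAGE" ;;
    docker)
      ;;   # the image variables have defaults, so nothing extra is required
    *)
      echo "unknown SANDBOX_PROVIDER: $provider" >&2
      return 2 ;;
  esac
  status=0
  for name in $required; do
    eval "value=\${$name:-}"   # indirect lookup of the variable named in $name
    if [ -z "$value" ]; then
      echo "$name"
      status=1
    fi
  done
  return "$status"
}
```

Run `missing_sandbox_vars` in the same environment as the backend and the scheduler. No output and exit status 0 mean every variable the sketch checks is set. It does not validate the values themselves.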
