Serverless deployment in Dagster Cloud#

This guide is applicable to Dagster Cloud.

Dagster Cloud Serverless is a fully managed version of Dagster Cloud, and is the easiest way to get started with Dagster. With Serverless, you can run your Dagster jobs without spinning up any infrastructure.


When to choose Serverless#

Serverless works best with workloads that primarily orchestrate other services or perform light computation. Most workloads fit into this category, especially those that orchestrate third-party SaaS products like cloud data warehouses and ETL tools.

If any of the following are applicable, you should select Hybrid deployment:

  • You require substantial computational resources. For example, training a large machine learning (ML) model in-process.
  • Your dataset is too large to fit in memory. For example, training a large machine learning (ML) model in-process on a terabyte of data.
  • You need to distribute computation across many nodes for a single run. Dagster Cloud runs currently execute on a single node with 4 CPUs.
  • You don't want to add Elementl as a data processor.

Limitations#

Serverless is currently in early access and is subject to the following limitations:

  • Maximum of 100 GB of bandwidth per day
  • Maximum of 4500 step-minutes per day
  • Runs receive 4 vCPU cores, 16 GB of RAM and 128 GB of ephemeral disk
  • Sensors receive 0.25 vCPU cores and 512 MB of RAM
  • The Serverless base image is debian:bullseye-slim. Custom Dockerfiles are not currently supported.
  • All Serverless jobs run in the United States

Enterprise customers may request a quota increase by contacting Sales.


Getting started with Serverless#

With GitHub#

If you are a GitHub user, our GitHub integration is the fastest way to get started. It uses a GitHub app and GitHub Actions to set up a repo containing skeleton code and configuration consistent with Dagster Cloud's best practices with a single click.

When you create a new Dagster Cloud organization, you'll be prompted to choose Serverless or Hybrid deployment. Once activated, our GitHub integration will scaffold a new git repo for you with Serverless and Branch Deployments already configured. Pushing to the main branch will deploy to your prod Serverless deployment. Pull requests will spin up ephemeral branch deployments using the Serverless agent.

Without GitHub (GitLab, BitBucket, or local development)#

If you don't want to use our GitHub integration, we offer a powerful CLI that you can use in another CI environment or on your local laptop.

First, create a new project with the Dagster open-source CLI.

pip install dagster
dagster project from-example \
  --name my-dagster-project \
  --example assets_modern_data_stack

Once scaffolded, add dagster-cloud as a dependency in your setup.py file.

Next, install the dagster-cloud CLI and log in to your org. Note: The CLI requires a recent version of Python 3 and Docker.

pip install dagster-cloud
dagster-cloud configure

You can also configure the dagster-cloud tool noninteractively; see the CLI docs for more information.

Finally, deploy your project with Dagster Cloud Serverless:

dagster-cloud serverless deploy \
  --location-name example \
  --package-name assets_modern_data_stack

Adding secrets#

Often you'll need to securely access secrets from your jobs. With Dagster Cloud Serverless, secrets are added as environment variables that can be read using APIs like StringSource.

  • If you're using our GitHub integration, you can pass in secrets as a JSON string input to the serverless deploy step in your repository's Github action script.
  • If you're using the CLI directly, you can pass KEY=VALUE pairs using the --env flag. Run dagster-cloud serverless deploy --help to learn more.

Transitioning to Hybrid#

If your organization begins to hit the limitations of Serverless, you should transition to a Hybrid deployment. Hybrid deployments allow you to run an agent in your own infrastructure and give you substantially more flexibility and control over the Dagster environment.

To switch to Hybrid, navigate to Status > Agents in your Dagster Cloud account. On this page, you can disable the Serverless agent on and view instructions for enabling Hybrid.


Security and data protection#

Unlike Hybrid, Serverless Deployments on Dagster Cloud require direct access to your data, secrets and source code.

  • Dagster Cloud Serverless does not provide persistent storage. Ephemeral storage is deleted when a run concludes.
  • Secrets and source code are built into the image directly. Images are stored in a per-customer container registry with restricted access.
  • User code is securely sandboxed using modern container sandboxing techniques.
  • All production access is governed by industry-standard best practices which are regularly audited.