dstack configuration
YAML dstack configurations
One of
Definitions
Credentials for pulling a private Docker image.
Attributes:
  username (str): The username
  password (str): The password or access token
The username
The password or access token
An enumeration.
Env represents a mapping of process environment variables, as in environ(7).
Environment variable values may be omitted; in that case, the :class:`EnvSentinel`
object is used as a placeholder.
To create an instance from a dict[str, str] or a list[str], use pydantic's
:meth:`BaseModel.parse_obj` method.
NB: this is NOT a CoreModel; pydantic-duality, which is used as the base for CoreModel, does not play well with custom root models.
An enumeration.
An enumeration.
The vendor of the GPU/accelerator, one of: nvidia, amd, google (alias: tpu), intel
The name of the GPU (e.g., A100 or H100)
The RAM size (e.g., 16GB). Can be set to a range (e.g. 16GB.., or 16GB..80GB)
The total RAM size (e.g., 32GB). Can be set to a range (e.g. 16GB.., or 16GB..80GB)
The minimum compute capability of the GPU (e.g., 7.5)
Disk size
The CPU requirements
{
"arch": null,
"count": {
"min": 2,
"max": null
}
}
The size of shared memory (e.g., 8GB). If you are using parallel communicating processes (e.g., dataloaders in PyTorch), you may need to configure this
The GPU requirements
{
"vendor": null,
"name": null,
"count": {
"min": 0,
"max": null
},
"memory": null,
"total_memory": null,
"compute_capability": null
}
The disk resources
{
"size": {
"min": 100.0,
"max": null
}
}
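Taken together, the defaults above correspond to a resources block like the following (a minimal sketch; the range syntax follows the descriptions above, and the values are illustrative):

```yaml
resources:
  cpu: 2..        # at least 2 CPUs (min: 2, max: unset)
  memory: 8GB..   # at least 8GB of RAM
  shm_size: 8GB   # optional; useful for parallel dataloaders
  gpu:
    vendor: nvidia
    name: A100
    count: 1
    memory: 40GB..
  disk: 100GB..
```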
The network volume name or the list of network volume names to mount. If a list is specified, one of the volumes in the list will be mounted. Specify volumes from different backends/regions to increase availability
The absolute container path to mount the volume at
The absolute path on the instance (host)
The absolute path in the container
Allow running without this volume in backends that do not support instance volumes
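The mount points described above can be sketched in YAML as follows (the short `host_path:container_path` instance-volume form and the `optional` key are assumptions based on the descriptions above):

```yaml
volumes:
  # Network volume: mounted at the given container path
  - name: my-volume
    path: /volume_data
  # Instance volume: host path mapped into the container
  - /mnt/cache:/root/.cache
  # Instance volume that may be skipped in unsupported backends
  - instance_path: /mnt/data
    path: /data
    optional: true
```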
An enumeration.
The path to the Git repo on the user's machine. Relative paths are resolved relative to the parent directory of the configuration file. Mutually exclusive with url
The Git repo URL. Mutually exclusive with local_path
The repo branch. Defaults to the active branch for local paths and the default branch for URLs
The commit hash
The repo path inside the run container. Relative paths are resolved relative to the working directory
The action to be taken if path exists and is not empty. One of: error, skip
The path on the user's machine. Relative paths are resolved relative to the parent directory of the configuration file
The path in the container. Relative paths are resolved relative to the working directory
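A sketch combining the repo and file mappings described above (the `repos` and `files` property names and the `local_path`/`url`/`path` keys are assumptions inferred from the field descriptions; the URL is illustrative):

```yaml
repos:
  - url: https://github.com/example/repo
    branch: main
    path: repo
files:
  - local_path: ~/.gitconfig
    path: /root/.gitconfig
```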
Attributes:
  AMDDEVCLOUD (BackendType): AMD Developer Cloud
  AWS (BackendType): Amazon Web Services
  AZURE (BackendType): Microsoft Azure
  CLOUDRIFT (BackendType): CloudRift
  CRUSOE (BackendType): Crusoe
  CUDO (BackendType): Cudo
  DATACRUNCH (BackendType): DataCrunch (for backward compatibility)
  DIGITALOCEAN (BackendType): DigitalOcean
  DSTACK (BackendType): dstack Sky
  GCP (BackendType): Google Cloud Platform
  HOTAISLE (BackendType): Hot Aisle
  KUBERNETES (BackendType): Kubernetes
  LAMBDA (BackendType): Lambda Cloud
  NEBIUS (BackendType): Nebius AI Cloud
  OCI (BackendType): Oracle Cloud Infrastructure
  RUNPOD (BackendType): Runpod Cloud
  TENSORDOCK (BackendType): TensorDock Marketplace
  VASTAI (BackendType): Vast.ai Marketplace
  VERDA (BackendType): Verda Cloud
  VULTR (BackendType): Vultr
An enumeration.
An enumeration.
The list of events that should be handled with retry. Supported events are no-capacity, interruption, error. Omit to retry on all events
The maximum period of retrying the run, e.g., 4h or 1d. For the no-capacity event, the period is measured as the run's age; for the interruption and error events, as the time since the last interruption or error.
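For instance, a retry policy covering only capacity shortages might look like the following (a sketch; the `on_events` and `duration` property names are assumptions inferred from the descriptions above):

```yaml
retry:
  on_events: [no-capacity]
  duration: 4h
```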
An enumeration.
Minimum required GPU utilization, in percent. If any GPU has utilization below the specified value during the whole time window, the run is terminated
The time window of metric samples taken into account when measuring utilization (e.g., 30m, 1h). Minimum is 5m
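A sketch of a utilization policy that terminates a run whose GPUs all stay under 30% for a full hour (the property names are assumptions inferred from the descriptions above):

```yaml
utilization_policy:
  min_gpu_utilization: 30
  time_window: 1h
```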
An enumeration.
An enumeration.
A cron expression or a list of cron expressions specifying the UTC time when the run needs to be started
Cross-project entity reference.
The entity name
The project name. If unspecified, refers to the current project
The IDE to pre-install. Supported values include vscode, cursor, and windsurf. Defaults to no IDE (SSH only)
The version of the IDE. For windsurf, the version is in the format version@commit
The shell commands to run on startup
[]
The maximum amount of time the dev environment can be inactive (e.g., 2h, 1d, etc). After it elapses, the dev environment is automatically stopped. Inactivity is defined as the absence of SSH connections to the dev environment, including VS Code connections, ssh <run name> shells, and attached dstack apply or dstack attach commands. Use off for unlimited duration. Can be updated in-place. Defaults to off
Port numbers/mapping to expose
[]
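Combining the dev-environment properties above, a minimal sketch (values are illustrative):

```yaml
type: dev-environment
name: my-ide
ide: vscode
init:
  - pip install -r requirements.txt
inactivity_duration: 2h
resources:
  gpu: 24GB
```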
The run name. If not specified, a random name is generated
The name of the Docker image to run
The user inside the container, user_name_or_id[:group_name_or_id] (e.g., ubuntu, 1000:1000). Defaults to the default user from the image
Run the container in privileged mode
The Docker entrypoint
The absolute path to the working directory inside the container. Defaults to the image's default working directory
Use image with NVIDIA CUDA Compiler (NVCC) included. Mutually exclusive with image and docker
Whether to clone and track only the current branch or all remote branches. Relevant only when using remote Git repos. Defaults to false for dev environments and to true for tasks and services
The shell used to run commands. Allowed values are sh, bash, or an absolute path, e.g., /usr/bin/zsh. Defaults to /bin/sh if the image is specified, /bin/bash otherwise
The resource requirements to run the configuration
{
"cpu": {
"min": 2,
"max": null
},
"memory": {
"min": 8.0,
"max": null
},
"shm_size": null,
"gpu": {
"vendor": null,
"name": null,
"count": {
"min": 0,
"max": null
},
"memory": null,
"total_memory": null,
"compute_capability": null
},
"disk": {
"size": {
"min": 100.0,
"max": null
}
}
}
The priority of the run, an integer between 0 and 100. dstack tries to provision runs with higher priority first. Defaults to 0
The volumes mount points
[]
Use Docker inside the container. Mutually exclusive with image, python, and nvcc. Overrides privileged
The local-to-container file path mappings
[]
[]
The backends to consider for provisioning (e.g., [aws, gcp])
The regions to consider for provisioning (e.g., [eu-west-1, us-west4, westeurope])
The availability zones to consider for provisioning (e.g., [eu-west-1a, us-west4-a])
The cloud-specific instance types to consider for provisioning (e.g., [p3.8xlarge, n1-standard-4])
The existing reservation to use for instance provisioning. Supports AWS Capacity Reservations, AWS Capacity Blocks, and GCP reservations
The policy for provisioning spot or on-demand instances: spot, on-demand, auto. Defaults to on-demand
The policy for resubmitting the run. Defaults to false
The maximum duration of a run (e.g., 2h, 1d, etc) in a running state, excluding provisioning and pulling. After it elapses, the run is automatically stopped. Use off for unlimited duration. Defaults to off
The maximum duration of a run's graceful stop. After it elapses, the run is force-stopped. This includes force-detaching volumes used by the run. Use off for unlimited duration. Defaults to 5m
The maximum instance price per hour, in dollars
The policy for using instances from fleets: reuse, reuse-or-create. Defaults to reuse-or-create
Time to wait before terminating idle instances. When the run reuses an existing fleet instance, the fleet's idle_duration applies. When the run provisions a new instance, the shorter of the fleet's and run's values is used. Defaults to 5m for runs and 3d for fleets. Use off for unlimited duration. Only applied for VM-based backends
The order in which master and workers jobs are started: any, master-first, workers-first. Defaults to any
The criteria determining when a multi-node run should be considered finished: all-done, master-done. Defaults to all-done
The fleets considered for reuse. For fleets owned by the current project, specify fleet names. For imported fleets, specify <project name>/<fleet name>
The custom tags to associate with the resource. The tags are also propagated to the underlying backend resources. If there is a conflict with backend-level tags, does not override them
Number of nodes
Port numbers/mapping to expose
[]
The shell commands to run
[]
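A minimal task sketch using the properties above (values are illustrative):

```yaml
type: task
name: train
nodes: 1
ports:
  - 6006
commands:
  - pip install -r requirements.txt
  - python train.py
```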
The run name. If not specified, a random name is generated
The name of the Docker image to run
The user inside the container, user_name_or_id[:group_name_or_id] (e.g., ubuntu, 1000:1000). Defaults to the default user from the image
Run the container in privileged mode
The Docker entrypoint
The absolute path to the working directory inside the container. Defaults to the image's default working directory
Use image with NVIDIA CUDA Compiler (NVCC) included. Mutually exclusive with image and docker
Whether to clone and track only the current branch or all remote branches. Relevant only when using remote Git repos. Defaults to false for dev environments and to true for tasks and services
The shell used to run commands. Allowed values are sh, bash, or an absolute path, e.g., /usr/bin/zsh. Defaults to /bin/sh if the image is specified, /bin/bash otherwise
The resource requirements to run the configuration
{
"cpu": {
"min": 2,
"max": null
},
"memory": {
"min": 8.0,
"max": null
},
"shm_size": null,
"gpu": {
"vendor": null,
"name": null,
"count": {
"min": 0,
"max": null
},
"memory": null,
"total_memory": null,
"compute_capability": null
},
"disk": {
"size": {
"min": 100.0,
"max": null
}
}
}
The priority of the run, an integer between 0 and 100. dstack tries to provision runs with higher priority first. Defaults to 0
The volumes mount points
[]
Use Docker inside the container. Mutually exclusive with image, python, and nvcc. Overrides privileged
The local-to-container file path mappings
[]
[]
The backends to consider for provisioning (e.g., [aws, gcp])
The regions to consider for provisioning (e.g., [eu-west-1, us-west4, westeurope])
The availability zones to consider for provisioning (e.g., [eu-west-1a, us-west4-a])
The cloud-specific instance types to consider for provisioning (e.g., [p3.8xlarge, n1-standard-4])
The existing reservation to use for instance provisioning. Supports AWS Capacity Reservations, AWS Capacity Blocks, and GCP reservations
The policy for provisioning spot or on-demand instances: spot, on-demand, auto. Defaults to on-demand
The policy for resubmitting the run. Defaults to false
The maximum duration of a run (e.g., 2h, 1d, etc) in a running state, excluding provisioning and pulling. After it elapses, the run is automatically stopped. Use off for unlimited duration. Defaults to off
The maximum duration of a run's graceful stop. After it elapses, the run is force-stopped. This includes force-detaching volumes used by the run. Use off for unlimited duration. Defaults to 5m
The maximum instance price per hour, in dollars
The policy for using instances from fleets: reuse, reuse-or-create. Defaults to reuse-or-create
Time to wait before terminating idle instances. When the run reuses an existing fleet instance, the fleet's idle_duration applies. When the run provisions a new instance, the shorter of the fleet's and run's values is used. Defaults to 5m for runs and 3d for fleets. Use off for unlimited duration. Only applied for VM-based backends
The order in which master and workers jobs are started: any, master-first, workers-first. Defaults to any
The criteria determining when a multi-node run should be considered finished: all-done, master-done. Defaults to all-done
The fleets considered for reuse. For fleets owned by the current project, specify fleet names. For imported fleets, specify <project name>/<fleet name>
The custom tags to associate with the resource. The tags are also propagated to the underlying backend resources. If there is a conflict with backend-level tags, does not override them
Mapping of the model for the OpenAI-compatible endpoint.
Attributes:
  type (str): The type of the model, e.g. "chat"
  name (str): The name of the model. This name will be used both to load model configuration from the HuggingFace Hub and in the OpenAI-compatible endpoint.
  format (str): The format of the model, e.g. "tgi" if the model is served with HuggingFace's Text Generation Inference.
  chat_template (Optional[str]): The custom prompt template for the model. If not specified, the default prompt template from the HuggingFace Hub configuration will be used.
  eos_token (Optional[str]): The custom end of sentence token. If not specified, the default end of sentence token from the HuggingFace Hub configuration will be used.
The name of the model
The serving format. Must be set to tgi
The type of the model
The custom prompt template for the model. If not specified, the default prompt template from the HuggingFace Hub configuration will be used
The custom end of sentence token. If not specified, the default end of sentence token from the HuggingFace Hub configuration will be used
Mapping of the model for the OpenAI-compatible endpoint.
Attributes:
type (str): The type of the model, e.g. "chat"
name (str): The name of the model. This name will be used both to load model configuration from the HuggingFace Hub and in the OpenAI-compatible endpoint.
format (str): The format of the model, i.e. "openai".
prefix (str): The base_url prefix: <http://hostname/{prefix}/chat/completions>. Defaults to /v1.
The name of the model
The serving format. Must be set to openai
The type of the model
The base_url prefix (after hostname)
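The full OpenAI-format model mapping can be written as follows (a sketch; the model name is illustrative):

```yaml
model:
  type: chat
  name: meta-llama/Meta-Llama-3.1-8B-Instruct
  format: openai
  prefix: /v1
```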
The target metric to track. Currently, the only supported value is rps (meaning requests per second)
The target value of the metric. The number of replicas is calculated based on this number and automatically adjusts (scales up or down) as this metric changes
The delay in seconds before scaling up
The delay in seconds before scaling down
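A sketch tying the scaling properties together (the `metric` and `target` keys follow the descriptions above; the `scale_up_delay` and `scale_down_delay` names are assumptions, with delays given in seconds):

```yaml
scaling:
  metric: rps
  target: 10
  scale_up_delay: 300
  scale_down_delay: 600
```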
Partitioning type
Name of the header to use for partitioning
Partitioning type
Max allowed number of requests per second. Requests are tracked at millisecond granularity. For example, rps: 10 means at most 1 request per 100ms
URL path prefix to which this limit is applied. If an incoming request matches several prefixes, the longest prefix is applied
The partitioning key. Each incoming request belongs to a partition and rate limits are applied per partition. Defaults to partitioning by client IP address
{
"type": "ip_address"
}
Max number of requests that can be passed to the service ahead of the rate limit
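Putting the rate-limit properties together, a sketch (the `key` and `burst` property names are assumptions inferred from the descriptions above; the header name is illustrative):

```yaml
rate_limits:
  - prefix: /v1/
    rps: 10    # at most 1 request per 100ms
    burst: 20
    key:
      type: header
      header: X-API-Key
```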
The name of the HTTP header
The value of the HTTP header
The probe type. Must be http
The URL to request. Defaults to /
The HTTP method to use for the probe (e.g., get, post, etc.). Defaults to get
The HTTP request body to send with the probe
Maximum amount of time the HTTP request is allowed to take. Defaults to 10s
Minimum amount of time between the end of one probe execution and the start of the next. Defaults to 15s
The number of consecutive successful probe executions required for the replica to be considered ready. Used during rolling deployments. Defaults to 1
If true, the probe will stop being executed as soon as it reaches the ready_after threshold of successful executions. Defaults to false
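An HTTP probe sketch based on the properties above (the `interval` key name for the time between executions is an assumption; values are illustrative):

```yaml
probes:
  - type: http
    url: /health
    method: get
    timeout: 10s
    interval: 15s
    ready_after: 3
```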
The number of replicas. Can be a number (e.g. 2) or a range (0..4 or 1..8). If it's a range, the scaling property is required
The name of the replica group. If not provided, defaults to '0', '1', etc. based on position.
The resource requirements for replicas in this group
{
"cpu": {
"min": 2,
"max": null
},
"memory": {
"min": 8.0,
"max": null
},
"shm_size": null,
"gpu": {
"vendor": null,
"name": null,
"count": {
"min": 0,
"max": null
},
"memory": null,
"total_memory": null,
"compute_capability": null
},
"disk": {
"size": {
"min": 100.0,
"max": null
}
}
}
The shell commands to run for replicas in this group
[]
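A replica-group sketch using the group properties above (the exact key names inside a group, in particular `count` for the per-group replica count, are assumptions inferred from the descriptions; values are illustrative):

```yaml
replicas:
  - name: small
    count: 1..2
    commands:
      - python serve.py --model small
    resources:
      gpu: 24GB
  - name: large
    count: 1
    commands:
      - python serve.py --model large
    resources:
      gpu: 80GB
```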
The router type
The routing policy. Options: random, round_robin, cache_aware, power_of_two
Enable PD disaggregation mode for the SGLang router
The port the application listens on
The name of the gateway. Specify boolean false to run without a gateway. Specify boolean true to run with the default gateway. Omit to run with the default gateway if there is one, or without a gateway otherwise
Strip the /proxy/services/<project name>/<run name>/ path prefix when forwarding requests to the service. Only takes effect when running the service without a gateway
Mapping of the model for the OpenAI-compatible endpoint provided by dstack. Can be a full model format definition or just a model name. If it's a name, the service is expected to expose an OpenAI-compatible API at the /v1 path
Enable HTTPS if running with a gateway. Set to auto to determine automatically based on gateway configuration. Defaults to true
Enable the authorization
The list of probes to determine service health. If model is set, defaults to a /v1/chat/completions probe. Set explicitly to override
The number of replicas or a list of replica groups. Can be an integer (e.g., 2), a range (e.g., 0..4), or a list of replica groups. Each replica group defines replicas with shared configuration (commands, resources, scaling). When replicas is a list of replica groups, top-level scaling, commands, and resources are not allowed and must be specified in each replica group instead.
Router configuration for the service. Requires a gateway with matching router enabled.
The shell commands to run
[]
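A minimal service sketch combining the properties above (the image command and model name are illustrative):

```yaml
type: service
name: llm-service
port: 8000
model: meta-llama/Meta-Llama-3.1-8B-Instruct
auth: true
replicas: 1..4
scaling:
  metric: rps
  target: 10
commands:
  - vllm serve meta-llama/Meta-Llama-3.1-8B-Instruct --port 8000
resources:
  gpu: 80GB
```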
The run name. If not specified, a random name is generated
The name of the Docker image to run
The user inside the container, user_name_or_id[:group_name_or_id] (e.g., ubuntu, 1000:1000). Defaults to the default user from the image
Run the container in privileged mode
The Docker entrypoint
The absolute path to the working directory inside the container. Defaults to the image's default working directory
Use image with NVIDIA CUDA Compiler (NVCC) included. Mutually exclusive with image and docker
Whether to clone and track only the current branch or all remote branches. Relevant only when using remote Git repos. Defaults to false for dev environments and to true for tasks and services
The shell used to run commands. Allowed values are sh, bash, or an absolute path, e.g., /usr/bin/zsh. Defaults to /bin/sh if the image is specified, /bin/bash otherwise
The resource requirements to run the configuration
{
"cpu": {
"min": 2,
"max": null
},
"memory": {
"min": 8.0,
"max": null
},
"shm_size": null,
"gpu": {
"vendor": null,
"name": null,
"count": {
"min": 0,
"max": null
},
"memory": null,
"total_memory": null,
"compute_capability": null
},
"disk": {
"size": {
"min": 100.0,
"max": null
}
}
}
The priority of the run, an integer between 0 and 100. dstack tries to provision runs with higher priority first. Defaults to 0
The volumes mount points
[]
Use Docker inside the container. Mutually exclusive with image, python, and nvcc. Overrides privileged
The local-to-container file path mappings
[]
[]
The backends to consider for provisioning (e.g., [aws, gcp])
The regions to consider for provisioning (e.g., [eu-west-1, us-west4, westeurope])
The availability zones to consider for provisioning (e.g., [eu-west-1a, us-west4-a])
The cloud-specific instance types to consider for provisioning (e.g., [p3.8xlarge, n1-standard-4])
The existing reservation to use for instance provisioning. Supports AWS Capacity Reservations, AWS Capacity Blocks, and GCP reservations
The policy for provisioning spot or on-demand instances: spot, on-demand, auto. Defaults to on-demand
The policy for resubmitting the run. Defaults to false
The maximum duration of a run (e.g., 2h, 1d, etc) in a running state, excluding provisioning and pulling. After it elapses, the run is automatically stopped. Use off for unlimited duration. Defaults to off
The maximum duration of a run's graceful stop. After it elapses, the run is force-stopped. This includes force-detaching volumes used by the run. Use off for unlimited duration. Defaults to 5m
The maximum instance price per hour, in dollars
The policy for using instances from fleets: reuse, reuse-or-create. Defaults to reuse-or-create
Time to wait before terminating idle instances. When the run reuses an existing fleet instance, the fleet's idle_duration applies. When the run provisions a new instance, the shorter of the fleet's and run's values is used. Defaults to 5m for runs and 3d for fleets. Use off for unlimited duration. Only applied for VM-based backends
The order in which master and workers jobs are started: any, master-first, workers-first. Defaults to any
The criteria determining when a multi-node run should be considered finished: all-done, master-done. Defaults to all-done
The fleets considered for reuse. For fleets owned by the current project, specify fleet names. For imported fleets, specify <project name>/<fleet name>
The custom tags to associate with the resource. The tags are also propagated to the underlying backend resources. If there is a conflict with backend-level tags, does not override them
The IP address or domain of the proxy host
The user to log in with on the proxy host
The private key to use for the proxy host
The SSH port of the proxy host
2 nested properties
The IP address or domain to connect to
The SSH port to connect to for this host
The user to log in with for this host
The private key to use for this host
The internal IP of the host used for communication inside the cluster. If not specified, dstack will use the IP address from network or from the first found internal network.
2 nested properties
The number of blocks to split the instance into, a number or auto. auto means as many as possible. The number of GPUs and CPUs must be divisible by the number of blocks. Defaults to the top-level blocks value.
The per host connection parameters: a hostname or an object that overrides default ssh parameters
The user to log in with on all hosts
The SSH port to connect to
The private key to use for all hosts
2 nested properties
The network address for cluster setup in the format <ip>/<netmask>. dstack will use IP addresses from this network for communication between hosts. If not specified, dstack will use IPs from the first found internal network.
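An SSH fleet sketch tying the host parameters above together (IP addresses and the key path are illustrative):

```yaml
type: fleet
name: my-ssh-fleet
ssh_config:
  user: ubuntu
  identity_file: ~/.ssh/id_rsa
  hosts:
    - 192.168.1.10
    # Per-host overrides of the default SSH parameters
    - hostname: 192.168.1.11
      internal_ip: 10.0.0.11
  network: 10.0.0.0/24
```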
The minimum number of instances to maintain in the fleet
The number of instances to provision on fleet apply; must satisfy min <= target <= max. Defaults to min
The maximum number of instances allowed in the fleet. Unlimited if not specified
An enumeration.
The fleet name
The number of instances in the cloud fleet
The existing reservation to use for instance provisioning. Supports AWS Capacity Reservations, AWS Capacity Blocks, and GCP reservations
The resource requirements
{
"cpu": {
"min": 2,
"max": null
},
"memory": {
"min": 8.0,
"max": null
},
"shm_size": null,
"gpu": {
"vendor": null,
"name": null,
"count": {
"min": 0,
"max": null
},
"memory": null,
"total_memory": null,
"compute_capability": null
},
"disk": {
"size": {
"min": 100.0,
"max": null
}
}
}
The number of blocks to split the instance into, a number or auto. auto means as many as possible. The number of GPUs and CPUs must be divisible by the number of blocks. Defaults to 1, i.e. do not split
The backends to consider for provisioning (e.g., [aws, gcp])
The regions to consider for provisioning (e.g., [eu-west-1, us-west4, westeurope])
The availability zones to consider for provisioning (e.g., [eu-west-1a, us-west4-a])
The cloud-specific instance types to consider for provisioning (e.g., [p3.8xlarge, n1-standard-4])
The policy for provisioning spot or on-demand instances: spot, on-demand, auto. Defaults to on-demand
The policy for provisioning retry. Defaults to false
The maximum instance price per hour, in dollars
Time to wait before terminating idle instances. Instances are not terminated if the fleet is already at nodes.min. Defaults to 5m for runs and 3d for fleets. Use off for unlimited duration
The custom tags to associate with the resource. The tags are also propagated to the underlying backend resources. If there is a conflict with backend-level tags, does not override them
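A cloud fleet sketch using the provisioning properties above (values are illustrative):

```yaml
type: fleet
name: my-fleet
nodes: 2
resources:
  gpu: 24GB
backends: [aws, gcp]
spot_policy: auto
idle_duration: 1h
```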
Gateway-level router configuration. Only type and policy apply here; pd_disaggregation is configured at the service level.
The router type enabled on this gateway.
The routing policy. Deprecated: prefer setting policy in the service's router config. Options: random, round_robin, cache_aware, power_of_two
Automatic certificates by Let's Encrypt
The ARN of the wildcard ACM certificate for the domain
Certificates by AWS Certificate Manager (ACM)
The gateway region
The gateway name
Make the gateway default
Backend-specific instance type to use for the gateway instance. Omit to use the backend's default, which is typically a small non-GPU instance
The router configuration for this gateway. E.g. { type: sglang, policy: round_robin }.
The gateway domain, e.g. example.com
Allocate public IP for the gateway
The SSL certificate configuration. Set to null to disable. Defaults to type: lets-encrypt
{
"type": "lets-encrypt"
}
The custom tags to associate with the gateway. The tags are also propagated to the underlying backend resources. If there is a conflict with backend-level tags, does not override them
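A gateway sketch based on the properties above (the backend, region, and domain are illustrative):

```yaml
type: gateway
name: example-gateway
backend: aws
region: eu-west-1
domain: example.com
certificate:
  type: lets-encrypt
```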
The volume region
The volume name
The volume availability zone
The volume size. Must be specified when creating new volumes
The volume ID. Must be specified when registering external volumes
Time to wait after the volume is no longer used by any job before deleting it. By default, the volume is kept indefinitely. Use the value 'off' or -1 to disable auto-cleanup.
The custom tags to associate with the volume. The tags are also propagated to the underlying backend resources. If there is a conflict with backend-level tags, does not override them
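A sketch for creating a new volume with the properties above (values are illustrative; per the descriptions, size is required when creating, while volume_id is used only when registering an existing external volume):

```yaml
type: volume
name: my-volume
backend: aws
region: eu-west-1
size: 100GB
```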