# Restate

Restate Server configuration file

| | |
|---|---|
| Type | object |
| File match | `**/restate.toml`, `**/restate-server.toml` |
| Schema URL | https://catalog.lintel.tools/schemas/schemastore/restate/latest.json |
| Source | https://docs.restate.dev/schemas/restate-server-configuration-schema.json |
## Validate with Lintel

```shell
npx @lintel/lintel check
```
Configuration for Restate server.
## Properties
Distributed tracing exporter filter.
Check the RUST_LOG documentation for more details on how to configure it.
Headers that should be applied to all outgoing requests (HTTP and Lambda).
Defaults to x-restate-cluster-name: <cluster name>.
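As an illustration, such a header map might look like this in restate.toml; the section and key names below are assumptions, not values confirmed by the schema:

```toml
# Section and key names are illustrative assumptions -- consult the
# schema for the exact property path.
[service-client]
# Applied to all outgoing HTTP and Lambda requests; replaces the
# default x-restate-cluster-name header.
request-headers = { "x-restate-cluster-name" = "my-cluster", "x-team" = "platform" }
```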
14 nested properties
Address that other nodes will use to connect to this service.
The full prefix that will be used to advertise this service publicly.
For example, if this is set to <https://my-host> then others will use this
as base URL to connect to this service.
If unset, the advertised address will be inferred from the public address of this node,
or from the value supplied in advertised-host, if set.
Optional advertised Admin API endpoint.
[Deprecated] Use advertised-address instead.
Hostname to advertise for this service
The combination of bind-ip and bind-port that will be used to bind
This takes precedence over bind-ip and bind-port.
Local interface IP address to listen on
Network port to listen on
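For instance, the bind and advertised settings described above might be combined as follows; the `[admin]` section name and the port value are assumptions:

```toml
[admin]
# Local interface and port to listen on.
bind-ip = "0.0.0.0"
bind-port = 9070
# Alternatively, bind-address takes precedence over bind-ip/bind-port:
# bind-address = "0.0.0.0:9070"
# What other nodes and clients use to reach this endpoint.
advertised-address = "https://restate-admin.example.com"
```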
Concurrency limit for the Admin APIs. Default is unlimited.
List of header names considered routing headers.
These will be used during deployment creation to distinguish between an already existing deployment and a new deployment.
Disable serving the Restate Web UI on the admin port. Default is false.
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Listen on unix-sockets, TCP sockets, or both.
The default is to listen on both.
3 nested properties
The degree of parallelism to use for query execution (Defaults to the number of available cores).
The path to spill to
Use random ports instead of the default port
Address that other nodes will use to connect to this service.
The full prefix that will be used to advertise this service publicly.
For example, if this is set to <https://my-host> then others will use this
as base URL to connect to this service.
If unset, the advertised address will be inferred from the public address of this node,
or from the value supplied in advertised-host, if set.
Hostname to advertise for this service
If true, this node is allowed to automatically provision itself as a new cluster. The node must have an admin role, and a new nodes configuration will be created that includes it.
auto-provision is allowed by default in development mode and is disabled when restate-server runs with the --production flag,
to prevent cluster nodes from forming their own clusters rather than a single cluster.
Use restatectl to provision the cluster/node if automatic provisioning is disabled.
This can also be explicitly disabled by setting this value to false.
Default: true
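A minimal sketch of disabling automatic provisioning for production clusters; the key name comes from the description above, while its top-level placement in the file is an assumption:

```toml
# Prevent this node from forming its own cluster; provision
# explicitly with restatectl instead.
auto-provision = false
```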
An external ID to apply to any AssumeRole operations taken by this client.
https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-user_externalid.html
Can be overridden by the AWS_EXTERNAL_ID environment variable.
Name of the AWS profile to select. Defaults to 'AWS_PROFILE' env var, or otherwise
the default profile.
The working directory which this Restate node should use for relative paths. The default is
restate-data under the current working directory.
10 nested properties
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
An enum with the list of supported loglet providers.
When enabled, automatic improvement periodically checks with the loglet provider whether the loglet configuration can be improved by performing a reconfiguration.
This allows the log to pick up replication property changes, apply better placement of replicas, or reconfigure for other reasons.
Configuration of local loglet provider
```json
{
  "rocksdb-disable-wal": false,
  "rocksdb-disable-wal-fsync": false,
  "rocksdb-log-keep-file-num": null,
  "rocksdb-log-level": null,
  "rocksdb-log-max-file-size": null,
  "rocksdb-memory-ratio": 0.5,
  "writer-batch-commit-count": 5000,
  "writer-batch-commit-duration": "0s"
}
```
Definition of a retry policy
8 nested properties
Definition of a retry policy
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Maximum number of inflight records sequencer can accept
Once this maximum is hit, the sequencer will induce back-pressure on clients. This controls the total number of records, regardless of how many batches they span.
Note that this will be increased to fit the biggest batch of records being enqueued.
Maximum number of records to prefetch from log servers
The number of records bifrost will attempt to prefetch from replicated loglet's log-servers for every loglet reader (e.g. partition processor). Note that this mainly impacts readers that are not co-located with the loglet sequencer (i.e. partition processor followers).
Trigger to prefetch more records
When read-ahead is used (readahead-records), this value (percentage in float) will determine when readers should trigger a prefetch for another batch to fill up the buffer. For instance, if this value is 0.3, then bifrost will trigger a prefetch when 30% or more of the read-ahead slots become available (e.g. partition processor consumed records and freed up enough slots).
The higher the value is, the longer bifrost will wait before it triggers the next fetch, potentially fetching more records as a result.
To illustrate, if readahead-records is set to 100 and readahead-trigger-ratio is 1.0, then bifrost will prefetch up to 100 records from log-servers and will not trigger the next prefetch until the consumer has consumed 100% of this buffer. This means that bifrost will read in batches, but will not do so while the consumer is still reading the previous batch.
Value must be between 0 and 1. It will be clamped at 1.0.
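A sketch combining the two read-ahead knobs named above; their section placement is an assumption:

```toml
# Hypothetical section placement for the replicated loglet options.
[bifrost.replicated-loglet]
# Prefetch up to 100 records per loglet reader...
readahead-records = 100
# ...and trigger the next prefetch once 30% of the buffer frees up.
readahead-trigger-ratio = 0.3
```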
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Definition of a retry policy
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
The combination of bind-ip and bind-port that will be used to bind
This takes precedence over bind-ip and bind-port.
Local interface IP address to listen on
Network port to listen on
A unique identifier for the cluster. All nodes in the same cluster should have the same value.
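For example (top-level placement assumed):

```toml
# Must be identical on every node of the cluster.
cluster-name = "my-cluster"
```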
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Number of partitions that will be provisioned during initial cluster provisioning. Partitions are the logical shards used to process messages.
Cannot be higher than 65535 (you should almost never need that many partitions anyway).
NOTE 1: This config entry only impacts the initial number of partitions; its value is ignored for already-provisioned nodes/clusters.
NOTE 2: This will be renamed to default-num-partitions as of v1.3+.
Default: 24
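As a sketch, using the post-rename key mentioned in NOTE 2 (the exact key name in your server version may differ):

```toml
# Only honored during initial cluster provisioning; ignored afterwards.
default-num-partitions = 24
```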
Configures the global default replication factor to be used by the system.
Note that this value only impacts the cluster initial provisioning and will not be respected after the cluster has been provisioned.
To update existing clusters use the restatectl utility.
5 nested properties
The factor to use to compute the next retry attempt. Default: 2.0.
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Size of the default thread pool used to perform internal tasks. If not set, it defaults to the number of CPU cores.
Disable prometheus metric recording and reporting. Default is false.
Restate uses Scarf to collect anonymous usage data to help us understand how the software is being used. You can set this flag to true to disable this collection. It can also be set with the environment variable DO_NOT_TRACK=1.
Use the new experimental kafka ingestion path which leverages batching for a faster kafka ingestion.
Set to true to enable the experimental ingestion mechanism.
The legacy path will be removed in v1.7.
Defaults to false in v1.6.
Use the new experimental batch ingestion path.
Set to true to enable the experimental ingestion mechanism.
The legacy path will be removed in v1.7.
Defaults to false in v1.6.
If set, the node insists on acquiring this node ID.
In addition to basic health/liveness information, the gossip protocol is used to exchange
extra information about the roles hosted by this node: for instance, which partitions are
currently running, their configuration versions, and the durable LSN of the corresponding
partition databases. This information is sent every Nth gossip message, and this setting
controls the frequency of that exchange. For instance, 10 means that every 10th gossip
message will contain this extra information.
Specifies how many gossip intervals of inactivity need to pass before considering a node as dead.
How many intervals need to pass without receiving any gossip messages before considering this node as potentially isolated/dead. This threshold is used in the case where the node can still send gossip messages but did not receive any. This can rarely happen in asymmetric network partitions.
In this case, the node will advertise itself as dead in the gossip messages it sends out.
Note: this threshold does not apply to a cluster that's configured with a single node.
On every gossip interval, how many peers each node attempts to gossip with. The default is optimized for small clusters (less than 5 nodes). On larger clusters, if gossip overhead is noticeable, consider reducing this value to 1.
Duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Configuration for the HTTP/2 keep-alive mechanism, using PING frames.
Please note: most gateways don't propagate the HTTP/2 keep-alive between downstream and upstream hosts. In those environments, you need to make sure the gateway can detect a broken connection to the upstream deployment(s).
2 nested properties
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
A URI, such as <http://127.0.0.1:10001>, of a server to which all invocations should be sent, with the Host header set to the deployment URI.
HTTPS proxy URIs are supported, but only HTTP endpoint traffic will be proxied currently.
Can be overridden by the HTTP_PROXY environment variable.
11 nested properties
Address that other nodes will use to connect to this service.
The full prefix that will be used to advertise this service publicly.
For example, if this is set to <https://my-host> then others will use this
as base URL to connect to this service.
If unset, the advertised address will be inferred from the public address of this node,
or from the value supplied in advertised-host, if set.
Hostname to advertise for this service
[Deprecated] Use advertised-address instead.
Ingress endpoint that the Web UI should use to interact with.
The combination of bind-ip and bind-port that will be used to bind
This takes precedence over bind-ip and bind-port.
Local interface IP address to listen on
Network port to listen on
Local concurrency limit to use to limit the amount of concurrent requests. If exceeded, the ingress will reply immediately with an appropriate status code. Default is unlimited.
Options for ingestion client
3 nested properties
Definition of a retry policy
Non-zero human-readable bytes
Non-zero human-readable bytes
[]
Listen on unix-sockets, TCP sockets, or both.
The default is to listen on both.
Use random ports instead of the default port
Sets the initial maximum of locally initiated (send) streams.
This value will be overwritten by the value included in the initial SETTINGS frame received from the peer as part of a [connection preface].
Default: None
NOTE: Setting this value to None (default) uses the default recommended value from the HTTP/2 specs.
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Listen on unix-sockets, TCP sockets, or both.
The default is to listen on both.
Setting the location allows Restate to form a tree-like cluster topology. The value is written in the format of "region[.zone]" to assign this node to a specific region, or to a zone within a region.
The value of region and zone is arbitrary but whitespace and . are disallowed.
NOTE: It's strongly recommended to not change the node's location string after
its initial registration. Changing the location may result in data loss or data
inconsistency if log-server is enabled on this node.
When this value is not set, the node is considered to be in the default location. The default location means that the node is not assigned to any specific region or zone.
Examples
- `us-west` -- the node is in the `us-west` region.
- `us-west.a1` -- the node is in the `us-west` region and in the `a1` zone.
- `` (empty) -- [default] the node is in the default location.
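Following the examples above, a node in zone a1 of the us-west region would be configured as:

```toml
# Format: "region[.zone]"; whitespace and "." are disallowed inside
# the region and zone values themselves.
location = "us-west.a1"
```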
Disable ANSI terminal codes for logs. This is useful when the log collector doesn't support processing ANSI terminal codes.
Log filter configuration. Can be overridden by the RUST_LOG environment variable.
Check the RUST_LOG documentation for more details on how to configure it.
Configuration is only used on nodes running with log-server role.
18 nested properties
The number of messages that can queue up on the input network stream while the request processor is busy.
Uncompressed block size
Default: 64KiB
If non-zero, we perform bigger reads when doing compaction. If you're running RocksDB on spinning disks, you should set this to at least 2MB. That way RocksDB's compaction is doing sequential instead of random reads.
Use O_DIRECT for writes in background flush and compactions.
Files will be opened in "direct I/O" mode which means that data r/w from the disk will not be cached or buffered. The hardware buffer of the devices may however still be used. Memory mapped files are not impacted by these parameters.
Disable rocksdb statistics collection
Default: False (statistics enabled)
The default depends on the different rocksdb use-cases at Restate.
Supports hot-reloading (Partial / Bifrost only)
Disable fsync of WAL on every batch
Number of info LOG files to keep
Default: 1
Verbosity of the LOG.
Default: "error"
Max size of info LOG file
Default: 64MB
Default: the number of CPU cores on this node.
The maximum number of subcompactions to run in parallel.
Setting this to 1 means no sub-compactions are allowed (i.e. only 1 thread will do the compaction).
Default is 0 which maps to floor(number of CPU cores / 2)
The memory budget for rocksdb memtables in bytes
If this value is set, it overrides the ratio defined in rocksdb-memory-ratio.
The memory budget for rocksdb memtables as ratio
This defines the total memory for rocksdb as a ratio of all memory available to the log-server.
(See rocksdb-total-memtables-ratio in common).
StatsLevel can be used to reduce statistics overhead by skipping certain types of stats in the stats collection process.
Default: "except-detailed-timers"
Trigger a commit when the batch size exceeds this threshold.
Set to 0 or 1 to commit the write batch on every command.
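The local loglet defaults shown earlier suggest a configuration like the following; the section name is an assumption:

```toml
# Hypothetical section name for the local loglet provider.
[bifrost.local]
# Commit once 5000 records are batched; 0 or 1 commits every command.
writer-batch-commit-count = 5000
writer-batch-commit-duration = "0s"
```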
Maximum journal retention duration that can be configured. When discovering a service deployment, or when modifying the journal retention using the Admin API, the given value will be clamped.
Unset means no limit.
Maximum number of attempts configurable in an invocation retry policy. When discovering a service deployment with configured retry policies, or when modifying the invocation retry policy using the Admin API, the given value will be clamped.
None means no limit, i.e. infinite retries are allowed.
The metadata client type to store metadata
5 nested properties
Definition of a retry policy
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Maximum size of network messages that metadata client can receive from a metadata server.
If unset, defaults to networking.message-size-limit. If set, it will be clamped at
the value of networking.message-size-limit since larger messages cannot be transmitted
over the cluster internal network.
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
20 nested properties
Auto join the metadata cluster when being started
Defines whether this node should auto join the metadata store cluster when being started for the first time.
The threshold for trimming the raft log. The log will be trimmed if the number of applied entries
exceeds this threshold. The default value is 1000.
The number of ticks before triggering an election
The number of ticks before triggering an election. The value must be larger than
raft_heartbeat_tick. It's recommended to set raft_election_tick = 10 * raft_heartbeat_tick.
Decrease this value if you want to react faster to failed leaders. Note that decreasing this
value too much can lead to cluster instabilities due to falsely detecting dead leaders.
The number of ticks before sending a heartbeat
A leader sends heartbeat messages to maintain its leadership every heartbeat ticks. Decrease this value to send heartbeats more often.
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Limit number of in-flight requests
Number of in-flight metadata store requests.
Uncompressed block size
Default: 64KiB
If non-zero, we perform bigger reads when doing compaction. If you're running RocksDB on spinning disks, you should set this to at least 2MB. That way RocksDB's compaction is doing sequential instead of random reads.
Use O_DIRECT for writes in background flush and compactions.
Files will be opened in "direct I/O" mode which means that data r/w from the disk will not be cached or buffered. The hardware buffer of the devices may however still be used. Memory mapped files are not impacted by these parameters.
Disable rocksdb statistics collection
Default: False (statistics enabled)
The default depends on the different rocksdb use-cases at Restate.
Supports hot-reloading (Partial / Bifrost only)
Number of info LOG files to keep
Default: 1
Verbosity of the LOG.
Default: "error"
Max size of info LOG file
Default: 64MB
Default: the number of CPU cores on this node.
The memory budget for rocksdb memtables in bytes
If this value is set, it overrides the ratio defined in rocksdb-memory-ratio.
The memory budget for rocksdb memtables as ratio
This defines the total memory for rocksdb as a ratio of all memory available to memtables
(See rocksdb-total-memtables-ratio in common).
StatsLevel can be used to reduce statistics overhead by skipping certain types of stats in the stats collection process.
Default: "except-detailed-timers"
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Definition of a retry policy
Common network configuration options for communicating with Restate cluster nodes. Note that similar keys are present in other config sections, such as in Service Client options.
9 nested properties
Definition of a retry policy
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Non-zero human-readable bytes
Disables Zstd compression for internal gRPC network connections
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
IP subnets, addresses, and domain names (e.g. localhost,restate.dev,127.0.0.1,::1,192.168.1.0/24) that should not be proxied by the http_proxy.
IP addresses must not have ports, and IPv6 addresses must not be wrapped in '[]'.
Subdomains are also matched. An entry "*" matches all hostnames.
Can be overridden by the NO_PROXY environment variable, which supports comma separated values.
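Using the example values from the description above (the key name and its placement are assumptions; the NO_PROXY environment variable is the documented override):

```toml
# IPs without ports, IPv6 without brackets, subnets in CIDR form.
no-proxy = ["localhost", "restate.dev", "127.0.0.1", "::1", "192.168.1.0/24"]
```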
Unique name for this node in the cluster. The name must not change unless the node is started with an empty local store. It defaults to the node's hostname.
Minimum request size to enable compression. The request size includes the total of the journal replay and its framing using the Restate service protocol, without accounting for the JSON envelope and the Base64 encoding.
Default: 4MB (the default AWS Lambda limit is 6MB; 4MB roughly accounts for the +33% of Base64 and the JSON envelope).
A path to a file, such as "/var/secrets/key.pem", which contains exactly one ed25519 private
key in PEM format. Such a file can be generated with openssl genpkey -algorithm ed25519.
If provided, this key will be used to attach JWTs to requests from this client which
SDKs may optionally verify, proving that the caller is a particular Restate instance.
This file is currently only read on client creation, but this may change in future. Parsed public keys will be logged at INFO level in the same format that SDKs expect.
The number of threads to reserve to Rocksdb background tasks. Defaults to the number of cores on the machine.
Uncompressed block size
Default: 64KiB
If non-zero, we perform bigger reads when doing compaction. If you're running RocksDB on spinning disks, you should set this to at least 2MB. That way RocksDB's compaction is doing sequential instead of random reads.
Use O_DIRECT for writes in background flush and compactions.
Files will be opened in "direct I/O" mode which means that data r/w from the disk will not be cached or buffered. The hardware buffer of the devices may however still be used. Memory mapped files are not impacted by these parameters.
Disable rocksdb statistics collection
Default: False (statistics enabled)
The default depends on the different rocksdb use-cases at Restate.
Supports hot-reloading (Partial / Bifrost only)
The number of threads to reserve to high priority Rocksdb background tasks.
Number of info LOG files to keep
Default: 1
Verbosity of the LOG.
Default: "error"
Max size of info LOG file
Default: 64MB
Default: the number of CPU cores on this node.
StatsLevel can be used to reduce statistics overhead by skipping certain types of stats in the stats collection process.
Default: "except-detailed-timers"
Non-zero human-readable bytes
The memory size used across all memtables (ratio between 0 to 1.0). This limits how much memory memtables can eat up from the value in rocksdb-total-memory-limit. When set to 0, memtables can take all available memory up to the value specified in rocksdb-total-memory-limit. This value will be sanitized to 1.0 if outside the valid bounds.
Defines the roles which this Restate node should run, by default the node starts with all roles.
```json
[
  "http-ingress",
  "admin",
  "worker",
  "log-server",
  "metadata-server"
]
```
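To run a node with a subset of the default roles, for example a dedicated worker (role names taken from the default list above):

```toml
# This node only serves ingress traffic and processes invocations;
# no admin, log-server, or metadata-server duties.
roles = ["worker", "http-ingress"]
```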
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Storage high priority thread pool
This configures the restate-managed storage thread pool for performing high-priority or latency-sensitive storage tasks when the IO operation cannot be performed on in-memory caches.
Storage low priority thread pool
This configures the restate-managed storage thread pool for performing low-priority or latency-insensitive storage tasks.
Address to bind for the tokio-console tracing subscriber. If unset and restate-server is
built with tokio-console support, it'll listen on 0.0.0.0:6669.
This is a shortcut to set both [Self::tracing_runtime_endpoint], and [Self::tracing_services_endpoint].
Specify the tracing endpoint to send runtime traces to. Traces will be exported using OTLP gRPC through opentelemetry_otlp.
To configure the sampling, please refer to the opentelemetry autoconfigure docs.
Proxy type to implement HashMap<HeaderName, HeaderValue> ser/de
Use it directly or with #[serde(with = "serde_with::As::<serde_with::FromInto<restate_serde_util::SerdeableHeaderMap>>")].
If set, an exporter will be configured to write traces to files using the Jaeger JSON format.
Each trace file will start with the trace prefix.
If unset, no traces will be written to file.
It can be used to export traces in a structured format without configuring a Jaeger agent.
To inspect the traces, open the Jaeger UI and use the Upload JSON feature to load and inspect them.
Overrides [Self::tracing_endpoint] for runtime traces
Specify the tracing endpoint to send runtime traces to. Traces will be exported using OTLP gRPC through opentelemetry_otlp.
To configure the sampling, please refer to the opentelemetry autoconfigure docs.
Overrides [Self::tracing_endpoint] for services traces
Specify the tracing endpoint to send services traces to. Traces will be exported using OTLP gRPC through opentelemetry_otlp.
To configure the sampling, please refer to the opentelemetry autoconfigure docs.
Use random ports instead of the default port
10 nested properties
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Every partition store is backed up by a durable log that is used to recover the state of the partition on restart or failover. The durability mode defines the criteria used to determine whether a partition is considered fully durable or not at a given point in the log history. Once a partition is fully durable, its backing log is allowed to be trimmed to the durability point.
This helps keep the log's disk usage under control, but it forces nodes that need to restore the state of the partition to fetch a snapshot of that partition covering the changes up to and including the "durability point".
Since v1.4.2 (not compatible with earlier versions)
9 nested properties
Duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Configures rate limiting for service actions at the node level. This throttling mechanism uses a token bucket algorithm to control the rate at which actions can be processed, helping to prevent resource exhaustion and maintain system stability under high load.
The throttling limit is shared across all partitions running on this node,
providing a global rate limit for the entire node rather than per-partition limits.
When unset, no throttling is applied.
Number of concurrent invocations that can be processed by the invoker.
Defines the threshold after which queued invocations will spill to disk at
the path defined in tmp-dir. In other words, this is the number of invocations
that can be kept in memory before spilling to disk. This is a per-partition limit.
Duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Configures throttling for service invocations at the node level. This throttling mechanism uses a token bucket algorithm to control the rate at which invocations can be processed, helping to prevent resource exhaustion and maintain system stability under high load.
The throttling limit is shared across all partitions running on this node,
providing a global rate limit for the entire node rather than per-partition limits.
When unset, no throttling is applied.
Maximum size of journal messages that can be received from a service. If a service sends a message larger than this limit, the invocation will fail.
If unset, defaults to networking.message-size-limit. If set, it will be clamped at
the value of networking.message-size-limit since larger messages cannot be transmitted
over the cluster internal network.
Temporary directory to use for the invoker temporary files. If empty, the system temporary directory will be used instead.
The maximum number of commands a partition processor will apply in a batch. The larger this value is, the higher the throughput and latency are.
This limit bounds the number of timers loaded in memory. If it is set and exceeded, the timers farther in the future will be spilled to disk.
Options for ingestion client
3 nested properties
Definition of a retry policy
Non-zero human-readable bytes
Non-zero human-readable bytes
Partition store object-store snapshotting settings. At a minimum, set destination to enable
manual snapshotting via restatectl. Additionally, snapshot-interval and
snapshot-interval-num-records can be used to configure automated periodic snapshots. For a
complete example, see Snapshots.
12 nested properties
Username for Minio, or consult the service documentation for other S3-compatible stores.
Allow plain HTTP to be used with the object store endpoint. Required when the endpoint URL isn't using HTTPS.
When you use Amazon S3, this is typically inferred from the region and there is no need to
set it. With other object stores, you will have to provide an appropriate HTTP(S) endpoint.
If not using HTTPS, also set aws-allow-http to true.
The AWS configuration profile to use for S3 object store destinations. If you use named profiles in your AWS configuration, you can replace all the other settings with a single profile reference. See the [AWS documentation on profiles](https://docs.aws.amazon.com/sdkref/latest/guide/file-format.html) for more.
AWS region to use with S3 object store destinations. This may be inferred from the
environment, for example the current region when running in EC2. Because of the
request signing algorithm this must have a value. For Minio, you can generally
set this to any string, such as us-east-1.
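For an S3-compatible store such as Minio, the endpoint, plain-HTTP, and region settings described above might combine as follows. The key names in this sketch are assumptions inferred from the property descriptions, not verified against the schema.

```toml
# Hypothetical Minio destination; section and key names are assumptions.
[worker.snapshots]
destination = "s3://snapshots"          # s3:// is currently the only supported scheme
aws-endpoint-url = "http://minio:9000"  # assumed key name for the endpoint
aws-allow-http = true                   # required because the endpoint is plain HTTP
aws-region = "us-east-1"                # any string generally works for Minio
```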
Password for Minio, or consult the service documentation for other S3-compatible stores.
This is only needed with short-term STS session credentials.
Base URL for cluster snapshots. Currently only supports the s3:// protocol scheme.
S3-compatible object stores must support ETag-based conditional writes.
Default: None
Definition of a retry policy
A time interval at which partition snapshots will be created. If
snapshot-interval-num-records is also set, it will be treated as an additional requirement
before a snapshot is taken. Use both time-based and record-based intervals to reduce the
number of snapshots created during times of low activity.
Snapshot intervals are calculated based on the wall clock timestamps reported by cluster nodes, assuming a basic level of clock synchronization within the cluster.
This setting does not influence explicitly requested snapshots triggered using restatectl.
Default: None - automatic snapshots are disabled
Number of log records that trigger a snapshot to be created.
As snapshots are created asynchronously, the actual number of new records that will trigger a snapshot will vary. The counter for the subsequent snapshot begins from the LSN at which the previous snapshot export was initiated.
This setting does not influence explicitly requested snapshots triggered using restatectl.
Default: None - automatic snapshots are disabled
14 nested properties
Uncompressed block size
Default: 64KiB
If non-zero, we perform bigger reads when doing compaction. If you're running RocksDB on spinning disks, you should set this to at least 2MB so that RocksDB's compaction performs sequential instead of random reads.
When set to true, disables RocksDB's CompactOnDeletionCollector for partition stores.
The collector automatically triggers compaction when SST files accumulate a high density
of tombstones (deletion markers), helping reclaim disk space after bulk deletions.
This helps control space amplification when invocation journal retention expires and the cleaner purges completed invocations.
Consider disabling this if you observe frequent unnecessary compactions triggered by the collector causing performance issues.
Use O_DIRECT for writes in background flush and compactions.
Files will be opened in "direct I/O" mode, which means that data read from or written to the disk will not be cached or buffered. The hardware buffers of the devices may, however, still be used. Memory-mapped files are not impacted by these parameters.
Disable rocksdb statistics collection
Default: False (statistics enabled)
The default depends on the different rocksdb use-cases at Restate.
Supports hot-reloading (Partial / Bifrost only)
Number of info LOG files to keep
Default: 1
Verbosity of the LOG.
Default: "error"
Max size of info LOG file
Default: 64MB
Default: the number of CPU cores on this node.
The memory budget for rocksdb memtables in bytes
The total is divided evenly across partitions. The server will rebalance the memory budget periodically depending on the number of running partitions on this node.
If this value is set, it overrides the ratio defined in rocksdb-memory-ratio.
The memory budget for rocksdb memtables as ratio
This defines the total memory for rocksdb as a ratio of all memory available to memtables
(See rocksdb-total-memtables-ratio in common). The budget is then divided evenly across
partitions.
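The two memtable budget options above are mutually overriding: an absolute byte budget takes precedence over the ratio. A sketch, with the key name for the absolute budget assumed:

```toml
# Either give rocksdb memtables a fixed byte budget...
rocksdb-memory-budget = "512 MB"   # assumed key name; overrides the ratio below
# ...or a ratio of the memory available to memtables, divided evenly
# across the partitions running on this node:
rocksdb-memory-ratio = 0.5
```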
StatsLevel can be used to reduce statistics overhead by skipping certain types of stats in the stats collection process.
Default: "except-detailed-timers"
Duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Definitions
Address that other nodes will use to connect to this service.
The full prefix that will be used to advertise this service publicly.
For example, if this is set to <https://my-host> then others will use this
as base URL to connect to this service.
If unset, the advertised address will be inferred from the public address of this node,
or the value supplied in advertised-host will be used if set.
Optional advertised Admin API endpoint.
[Deprecated] Use advertised-address instead.
Hostname to advertise for this service
The combination of bind-ip and bind-port that will be used to bind.
This takes precedence over bind-ip and bind-port.
Local interface IP address to listen on
Network port to listen on
Concurrency limit for the Admin APIs. Default is unlimited.
List of header names considered routing headers.
These will be used during deployment creation to distinguish between an already existing deployment and a new deployment.
Disable serving the Restate Web UI on the admin port. Default is false.
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Listen on unix-sockets, TCP sockets, or both.
The default is to listen on both.
3 nested properties
The degree of parallelism to use for query execution (Defaults to the number of available cores).
The path to spill to
Use random ports instead of the default port
An externally accessible URI address for admin-api-server. This can be set to unix:restate-data/admin.sock to advertise the automatically created unix-socket instead of using tcp if needed
"http//127.0.0.1:9070/""https://my-host/""unix:/data/restate-data/admin.sock"
An externally accessible URI address for http-ingress-server. This can be set to unix:restate-data/ingress.sock to advertise the automatically created unix-socket instead of using tcp if needed
"http//127.0.0.1:8080/""https://my-host/""unix:/data/restate-data/ingress.sock"
An externally accessible URI address for message-fabric-server. This can be set to unix:restate-data/fabric.sock to advertise the automatically created unix-socket instead of using tcp if needed
"http//127.0.0.1:5122/""https://my-host/""unix:/data/restate-data/fabric.sock"
An externally accessible URI address for tokio-console-server. This can be set to unix:restate-data/tokio.sock to advertise the automatically created unix-socket instead of using tcp if needed
"http//127.0.0.1:6669/""https://my-host/""unix:/data/restate-data/tokio.sock"
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
An enum with the list of supported loglet providers.
When enabled, automatic improvement periodically checks with the loglet provider whether the loglet configuration can be improved by performing a reconfiguration.
This allows the log to pick up replication property changes or apply better placement of replicas, among other improvements.
Configuration of local loglet provider
{
"rocksdb-disable-wal": false,
"rocksdb-disable-wal-fsync": false,
"rocksdb-log-keep-file-num": null,
"rocksdb-log-level": null,
"rocksdb-log-max-file-size": null,
"rocksdb-memory-ratio": 0.5,
"writer-batch-commit-count": 5000,
"writer-batch-commit-duration": "0s"
}
Definition of a retry policy
8 nested properties
Definition of a retry policy
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Maximum number of inflight records sequencer can accept
Once this maximum is hit, the sequencer will apply back pressure to clients. This controls the total number of records regardless of how many batches they span.
Note that this will be increased to fit the biggest batch of records being enqueued.
Maximum number of records to prefetch from log servers
The number of records bifrost will attempt to prefetch from replicated loglet's log-servers for every loglet reader (e.g. partition processor). Note that this mainly impacts readers that are not co-located with the loglet sequencer (i.e. partition processor followers).
Trigger to prefetch more records
When read-ahead is used (readahead-records), this value (percentage in float) will determine when readers should trigger a prefetch for another batch to fill up the buffer. For instance, if this value is 0.3, then bifrost will trigger a prefetch when 30% or more of the read-ahead slots become available (e.g. partition processor consumed records and freed up enough slots).
The higher the value is, the longer bifrost will wait before it triggers the next fetch, potentially fetching more records as a result.
To illustrate, if readahead-records is set to 100 and readahead-trigger-ratio is 1.0, then bifrost will prefetch up to 100 records from log-servers and will not trigger the next prefetch until the consumer has consumed 100% of this buffer. This means that bifrost will read in batches, but will not do so while the consumer is still reading the previous batch.
Value must be between 0 and 1. It will be clamped at 1.0.
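For example, with the settings below bifrost prefetches up to 100 records per reader and triggers the next prefetch as soon as 30 of those slots (30%) free up, overlapping fetching with consumption:

```toml
readahead-records = 100
readahead-trigger-ratio = 0.3
```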
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Definition of a retry policy
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
The local network address to bind on for admin-api-server. This service uses default port 9070 and will create a unix-socket file at the data directory under the name admin.sock
"0.0.0.0:9070""127.0.0.1:9070"
The local network address to bind on for http-ingress-server. This service uses default port 8080 and will create a unix-socket file at the data directory under the name ingress.sock
"0.0.0.0:8080""127.0.0.1:8080"
The local network address to bind on for message-fabric-server. This service uses default port 5122 and will create a unix-socket file at the data directory under the name fabric.sock
"0.0.0.0:5122""127.0.0.1:5122"
The local network address to bind on for tokio-console-server. This service uses default port 6669 and will create a unix-socket file at the data directory under the name tokio.sock
"0.0.0.0:6669""127.0.0.1:6669"
Duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
"10 hours""5 days""5d""1h 4m""P40D""0"
Configuration for the HTTP/2 keep-alive mechanism, using PING frames.
Please note: most gateways don't propagate the HTTP/2 keep-alive between downstream and upstream hosts. In those environments, you need to make sure the gateway can detect a broken connection to the upstream deployment(s).
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Human-readable bytes
Options for ingestion client
Definition of a retry policy
Non-zero human-readable bytes
Non-zero human-readable bytes
Address that other nodes will use to connect to this service.
The full prefix that will be used to advertise this service publicly.
For example, if this is set to <https://my-host> then others will use this
as base URL to connect to this service.
If unset, the advertised address will be inferred from the public address of this node,
or the value supplied in advertised-host will be used if set.
Hostname to advertise for this service
[Deprecated] Use advertised-address instead.
Ingress endpoint that the Web UI should use to interact with.
The combination of bind-ip and bind-port that will be used to bind.
This takes precedence over bind-ip and bind-port.
Local interface IP address to listen on
Network port to listen on
Local concurrency limit used to cap the number of concurrent requests. If exceeded, the ingress will reply immediately with an appropriate status code. Default is unlimited.
Options for ingestion client
3 nested properties
Definition of a retry policy
Non-zero human-readable bytes
Non-zero human-readable bytes
[]
Listen on unix-sockets, TCP sockets, or both.
The default is to listen on both.
Use random ports instead of the default port
The factor to use to compute the next retry attempt. Default: 2.0.
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Configures rate limiting for service actions at the node level. This throttling mechanism uses a token bucket algorithm to control the rate at which actions can be processed, helping to prevent resource exhaustion and maintain system stability under high load.
The throttling limit is shared across all partitions running on this node,
providing a global rate limit for the entire node rather than per-partition limits.
When unset, actions are processed without any throttling.
Number of concurrent invocations that can be processed by the invoker.
Defines the threshold after which queued invocations will spill to disk at
the path defined in tmp-dir. In other words, this is the number of invocations
that can be kept in memory before spilling to disk. This is a per-partition limit.
Duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Configures throttling for service invocations at the node level. This throttling mechanism uses a token bucket algorithm to control the rate at which invocations can be processed, helping to prevent resource exhaustion and maintain system stability under high load.
The throttling limit is shared across all partitions running on this node,
providing a global rate limit for the entire node rather than per-partition limits.
When unset, invocations are processed without any throttling.
Maximum size of journal messages that can be received from a service. If a service sends a message larger than this limit, the invocation will fail.
If unset, defaults to networking.message-size-limit. If set, it will be clamped at
the value of networking.message-size-limit since larger messages cannot be transmitted
over the cluster internal network.
Temporary directory to use for the invoker temporary files. If empty, the system temporary directory will be used instead.
Configuration options to connect to a Kafka cluster.
Initial list of brokers (host or host:port).
Cluster name (Used to identify subscriptions).
Configuration is only used on nodes running with log-server role.
The number of messages that can queue up on the input network stream while the request processor is busy.
Uncompressed block size
Default: 64KiB
If non-zero, we perform bigger reads when doing compaction. If you're running RocksDB on spinning disks, you should set this to at least 2MB so that RocksDB's compaction performs sequential instead of random reads.
Use O_DIRECT for writes in background flush and compactions.
Files will be opened in "direct I/O" mode, which means that data read from or written to the disk will not be cached or buffered. The hardware buffers of the devices may, however, still be used. Memory-mapped files are not impacted by these parameters.
Disable rocksdb statistics collection
Default: False (statistics enabled)
The default depends on the different rocksdb use-cases at Restate.
Supports hot-reloading (Partial / Bifrost only)
Disable fsync of WAL on every batch
Number of info LOG files to keep
Default: 1
Verbosity of the LOG.
Default: "error"
Max size of info LOG file
Default: 64MB
Default: the number of CPU cores on this node.
The maximum number of subcompactions to run in parallel.
Setting this to 1 means no sub-compactions are allowed (i.e. only 1 thread will do the compaction).
Default is 0 which maps to floor(number of CPU cores / 2)
The memory budget for rocksdb memtables in bytes
If this value is set, it overrides the ratio defined in rocksdb-memory-ratio.
The memory budget for rocksdb memtables as ratio
This defines the total memory for rocksdb as a ratio of all memory available to the log-server.
(See rocksdb-total-memtables-ratio in common).
StatsLevel can be used to reduce statistics overhead by skipping certain types of stats in the stats collection process.
Default: "except-detailed-timers"
Trigger a commit when the batch size exceeds this threshold.
Set to 0 or 1 to commit the write batch on every command.
The metadata client type to store metadata
Definition of a retry policy
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Maximum size of network messages that metadata client can receive from a metadata server.
If unset, defaults to networking.message-size-limit. If set, it will be clamped at
the value of networking.message-size-limit since larger messages cannot be transmitted
over the cluster internal network.
Auto join the metadata cluster when being started
Defines whether this node should auto join the metadata store cluster when being started for the first time.
The threshold for trimming the raft log. The log will be trimmed if the number of applied entries
exceeds this threshold. The default value is 1000.
The number of ticks before triggering an election
The number of ticks before triggering an election. The value must be larger than
raft_heartbeat_tick. It's recommended to set raft_election_tick = 10 * raft_heartbeat_tick.
Decrease this value if you want to react faster to failed leaders. Note, decreasing this
value too much can lead to cluster instabilities due to falsely detecting dead leaders.
The number of ticks before sending a heartbeat
A leader sends heartbeat messages every heartbeat tick to maintain its leadership. Decrease this value to send heartbeats more often.
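Following the recommended raft_election_tick = 10 * raft_heartbeat_tick ratio, a sketch (the key spelling follows the prose above; the actual schema may use kebab-case):

```toml
raft-heartbeat-tick = 2    # send heartbeats every 2 ticks
raft-election-tick = 20    # 10x the heartbeat tick, as recommended
```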
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Limit number of in-flight requests
Number of in-flight metadata store requests.
Uncompressed block size
Default: 64KiB
If non-zero, we perform bigger reads when doing compaction. If you're running RocksDB on spinning disks, you should set this to at least 2MB so that RocksDB's compaction performs sequential instead of random reads.
Use O_DIRECT for writes in background flush and compactions.
Files will be opened in "direct I/O" mode, which means that data read from or written to the disk will not be cached or buffered. The hardware buffers of the devices may, however, still be used. Memory-mapped files are not impacted by these parameters.
Disable rocksdb statistics collection
Default: False (statistics enabled)
The default depends on the different rocksdb use-cases at Restate.
Supports hot-reloading (Partial / Bifrost only)
Number of info LOG files to keep
Default: 1
Verbosity of the LOG.
Default: "error"
Max size of info LOG file
Default: 64MB
Default: the number of CPU cores on this node.
The memory budget for rocksdb memtables in bytes
If this value is set, it overrides the ratio defined in rocksdb-memory-ratio.
The memory budget for rocksdb memtables as ratio
This defines the total memory for rocksdb as a ratio of all memory available to memtables
(See rocksdb-total-memtables-ratio in common).
StatsLevel can be used to reduce statistics overhead by skipping certain types of stats in the stats collection process.
Default: "except-detailed-timers"
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Common network configuration options for communicating with Restate cluster nodes. Note that similar keys are present in other config sections, such as in Service Client options.
Definition of a retry policy
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Non-zero human-readable bytes
Disables Zstd compression for internal gRPC network connections
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
"10 hours""5 days""5d""1h 4m""P40D"
Non-zero human-readable bytes
An enum with the list of supported loglet providers.
The degree of parallelism to use for query execution (Defaults to the number of available cores).
The path to spill to
Definition of a retry policy
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Maximum number of inflight records sequencer can accept
Once this maximum is hit, the sequencer will apply back pressure to clients. This controls the total number of records regardless of how many batches they span.
Note that this will be increased to fit the biggest batch of records being enqueued.
Maximum number of records to prefetch from log servers
The number of records bifrost will attempt to prefetch from replicated loglet's log-servers for every loglet reader (e.g. partition processor). Note that this mainly impacts readers that are not co-located with the loglet sequencer (i.e. partition processor followers).
Trigger to prefetch more records
When read-ahead is used (readahead-records), this value (percentage in float) will determine when readers should trigger a prefetch for another batch to fill up the buffer. For instance, if this value is 0.3, then bifrost will trigger a prefetch when 30% or more of the read-ahead slots become available (e.g. partition processor consumed records and freed up enough slots).
The higher the value is, the longer bifrost will wait before it triggers the next fetch, potentially fetching more records as a result.
To illustrate, if readahead-records is set to 100 and readahead-trigger-ratio is 1.0, then bifrost will prefetch up to 100 records from log-servers and will not trigger the next prefetch until the consumer has consumed 100% of this buffer. This means that bifrost will read in batches, but will not do so while the consumer is still reading the previous batch.
Value must be between 0 and 1. It will be clamped at 1.0.
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Definition of a retry policy
Definition of a retry policy
Verbosity of the LOG.
Proxy type to implement HashMap<HeaderName, HeaderValue> ser/de
Use it directly or with #[serde(with = "serde_with::As::<serde_with::FromInto<restate_serde_util::SerdeableHeaderMap>>")].
Partition store object-store snapshotting settings. At a minimum, set destination to enable
manual snapshotting via restatectl. Additionally, snapshot-interval and
snapshot-interval-num-records can be used to configure automated periodic snapshots. For a
complete example, see Snapshots.
Username for Minio, or consult the service documentation for other S3-compatible stores.
Allow plain HTTP to be used with the object store endpoint. Required when the endpoint URL does not use HTTPS.
When you use Amazon S3, this is typically inferred from the region and there is no need to
set it. With other object stores, you will have to provide an appropriate HTTP(S) endpoint.
If not using HTTPS, also set aws-allow-http to true.
The AWS configuration profile to use for S3 object store destinations. If you use named profiles in your AWS configuration, you can replace all the other settings with a single profile reference. See the [AWS documentation on profiles](https://docs.aws.amazon.com/sdkref/latest/guide/file-format.html) for more.
AWS region to use with S3 object store destinations. This may be inferred from the
environment, for example the current region when running in EC2. Because of the
request signing algorithm this must have a value. For Minio, you can generally
set this to any string, such as us-east-1.
Password for Minio, or consult the service documentation for other S3-compatible stores.
This is only needed with short-term STS session credentials.
Base URL for cluster snapshots. Currently only supports the s3:// protocol scheme.
S3-compatible object stores must support ETag-based conditional writes.
Default: None
Definition of a retry policy
A time interval at which partition snapshots will be created. If
snapshot-interval-num-records is also set, it will be treated as an additional requirement
before a snapshot is taken. Use both time-based and record-based intervals to reduce the
number of snapshots created during times of low activity.
Snapshot intervals are calculated based on the wall clock timestamps reported by cluster nodes, assuming a basic level of clock synchronization within the cluster.
This setting does not influence explicitly requested snapshots triggered using restatectl.
Default: None - automatic snapshots are disabled
Number of log records that trigger a snapshot to be created.
As snapshots are created asynchronously, the actual number of new records that will trigger a snapshot will vary. The counter for the subsequent snapshot begins from the LSN at which the previous snapshot export was initiated.
This setting does not influence explicitly requested snapshots triggered using restatectl.
Default: None - automatic snapshots are disabled
Uncompressed block size
Default: 64KiB
If non-zero, we perform bigger reads when doing compaction. If you're running RocksDB on spinning disks, you should set this to at least 2MB so that RocksDB's compaction performs sequential instead of random reads.
When set to true, disables RocksDB's CompactOnDeletionCollector for partition stores.
The collector automatically triggers compaction when SST files accumulate a high density
of tombstones (deletion markers), helping reclaim disk space after bulk deletions.
This helps control space amplification when invocation journal retention expires and the cleaner purges completed invocations.
Consider disabling this if you observe frequent unnecessary compactions triggered by the collector causing performance issues.
Use O_DIRECT for writes in background flush and compactions.
Files will be opened in "direct I/O" mode, which means that data read from or written to the disk will not be cached or buffered. The hardware buffers of the devices may, however, still be used. Memory-mapped files are not impacted by these parameters.
Disable rocksdb statistics collection
Default: False (statistics enabled)
The default depends on the different rocksdb use-cases at Restate.
Supports hot-reloading (Partial / Bifrost only)
Number of info LOG files to keep
Default: 1
Verbosity of the LOG.
Default: "error"
Max size of info LOG file
Default: 64MB
Default: the number of CPU cores on this node.
The memory budget for rocksdb memtables in bytes
The total is divided evenly across partitions. The server will rebalance the memory budget periodically depending on the number of running partitions on this node.
If this value is set, it overrides the ratio defined in rocksdb-memory-ratio.
The memory budget for rocksdb memtables as ratio
This defines the total memory for rocksdb as a ratio of all memory available to memtables
(See rocksdb-total-memtables-ratio in common). The budget is then divided evenly across
partitions.
StatsLevel can be used to reduce statistics overhead by skipping certain types of stats in the stats collection process.
Default: "except-detailed-timers"
Throttling options per invoker.
The rate at which the tokens are replenished.
Syntax: <rate>/<unit>, where <unit> is s|sec|second, m|min|minute, or h|hr|hour.
The unit defaults to per second if not specified.
The maximum number of tokens the bucket can hold. Defaults to the rate value if not specified.
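A throttle might then be expressed as a rate plus an optional bucket size. The <rate>/<unit> syntax comes from the description above, but the key names in this sketch are assumptions:

```toml
# Hypothetical invoker throttling entry; key names are assumptions.
rate = "500/sec"   # replenish 500 tokens per second
burst = 1000       # max tokens the bucket can hold; defaults to the rate value
```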
Non-zero duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Every partition store is backed by a durable log that is used to recover the state of the partition on restart or failover. The durability mode defines the criteria used to determine whether a partition is considered fully durable at a given point in the log history. Once a partition is fully durable, its backing log is allowed to be trimmed to the durability point.
This helps keep the log's disk usage under control, but it forces nodes that need to restore the state of the partition to fetch a snapshot of that partition covering the changes up to and including the "durability point".
Since v1.4.2 (not compatible with earlier versions)
9 nested properties
Duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Configures rate limiting for service actions at the node level. This throttling mechanism uses a token bucket algorithm to control the rate at which actions can be processed, helping to prevent resource exhaustion and maintain system stability under high load.
The throttling limit is shared across all partitions running on this node,
providing a global rate limit for the entire node rather than per-partition limits.
When unset, actions are processed without any throttling.
Number of concurrent invocations that can be processed by the invoker.
Defines the threshold after which queued invocations will spill to disk at
the path defined in tmp-dir. In other words, this is the number of invocations
that can be kept in memory before spilling to disk. This is a per-partition limit.
Duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.
Configures throttling for service invocations at the node level. This throttling mechanism uses a token bucket algorithm to control the rate at which invocations can be processed, helping to prevent resource exhaustion and maintain system stability under high load.
The throttling limit is shared across all partitions running on this node,
providing a global rate limit for the entire node rather than per-partition limits.
When unset, no throttling is applied and invocations are processed as fast as resources allow.
Maximum size of journal messages that can be received from a service. If a service sends a message larger than this limit, the invocation will fail.
If unset, defaults to networking.message-size-limit. If set, it will be clamped at
the value of networking.message-size-limit since larger messages cannot be transmitted
over the cluster internal network.
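Because the invoker-level limit is clamped to the cluster-wide message size, a configuration like the following sketch (key names are illustrative) effectively caps journal messages at the smaller of the two values:

```toml
# Illustrative sketch; verify key names against the schema.
[networking]
message-size-limit = "16 MB"

[worker.invoker]
# Values above networking.message-size-limit are clamped down to it.
message-size-limit = "8 MB"
```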
Temporary directory to use for the invoker temporary files. If empty, the system temporary directory will be used instead.
The maximum number of commands a partition processor will apply in a batch. Larger values increase throughput at the cost of higher latency.
Bounds the number of timers kept in memory. When this limit is exceeded, the timers farthest in the future are spilled to disk.
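For example, to keep at most a few thousand timers resident (the key name below is an illustrative assumption):

```toml
# Illustrative sketch; verify the key name against the schema.
[worker]
# Timers beyond this count are spilled to disk, farthest-future first.
num-timers-in-memory-limit = 4096
```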
Options for the ingestion client
3 nested properties
Definition of a retry policy
Non-zero human-readable bytes
Non-zero human-readable bytes
Partition store object-store snapshotting settings. At a minimum, set destination to enable
manual snapshotting via restatectl. Additionally, snapshot-interval and
snapshot-interval-num-records can be used to configure automated periodic snapshots. For a
complete example, see Snapshots.
12 nested properties
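Based on the properties described above, a minimal snapshot configuration might look like the sketch below; the section name, bucket, and values are examples, so check the schema for exact names:

```toml
# Minimal snapshotting setup: `destination` alone enables manual snapshots
# via restatectl; the interval settings add automated periodic snapshots.
[worker.snapshots]
destination = "s3://my-bucket/cluster-snapshots"  # example bucket/prefix
snapshot-interval = "1h"
snapshot-interval-num-records = 10000
```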
Username for Minio, or consult the service documentation for other S3-compatible stores.
Allow plain HTTP to be used with the object store endpoint. Required when the endpoint URL does not use HTTPS.
When you use Amazon S3, this is typically inferred from the region and there is no need to
set it. With other object stores, you will have to provide an appropriate HTTP(S) endpoint.
If not using HTTPS, also set aws-allow-http to true.
The AWS configuration profile to use for S3 object store destinations. If you use named profiles in your AWS configuration, you can replace all the other settings with a single profile reference. See the [AWS documentation on profiles](https://docs.aws.amazon.com/sdkref/latest/guide/file-format.html) for more.
AWS region to use with S3 object store destinations. This may be inferred from the
environment, for example the current region when running in EC2. Because of the
request signing algorithm this must have a value. For Minio, you can generally
set this to any string, such as us-east-1.
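Putting the S3-compatible settings together for a Minio endpoint might look like this sketch; the endpoint, credentials, and section name are illustrative, so verify key names against the schema:

```toml
# Illustrative Minio example; verify key names against the schema.
[worker.snapshots]
destination = "s3://restate/snapshots"
aws-endpoint-url = "http://minio.local:9000"  # non-AWS endpoints must be set explicitly
aws-allow-http = true                         # required because the endpoint is plain HTTP
aws-region = "us-east-1"                      # Minio accepts any region string
```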
Password for Minio, or consult the service documentation for other S3-compatible stores.
This is only needed with short-term STS session credentials.
Base URL for cluster snapshots. Currently only supports the s3:// protocol scheme.
S3-compatible object stores must support ETag-based conditional writes.
Default: None
Definition of a retry policy
A time interval at which partition snapshots will be created. If
snapshot-interval-num-records is also set, it will be treated as an additional requirement
before a snapshot is taken. Use both time-based and record-based intervals to reduce the
number of snapshots created during times of low activity.
Snapshot intervals are calculated based on the wall clock timestamps reported by cluster nodes, assuming a basic level of clock synchronization within the cluster.
This setting does not influence explicitly requested snapshots triggered using restatectl.
Default: None - automatic snapshots are disabled
Number of log records that trigger a snapshot to be created.
As snapshots are created asynchronously, the actual number of new records that will trigger a snapshot will vary. The counter for the subsequent snapshot begins from the LSN at which the previous snapshot export was initiated.
This setting does not influence explicitly requested snapshots triggered using restatectl.
Default: None - automatic snapshots are disabled
14 nested properties
Uncompressed block size
Default: 64KiB
If non-zero, we perform bigger reads when doing compaction. If you're running RocksDB on spinning disks, you should set this to at least 2MB. That way RocksDB's compaction is doing sequential instead of random reads.
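For spinning disks, the recommendation above translates to something like the following sketch (section and key names are assumptions):

```toml
# Illustrative sketch; verify the key name against the schema.
[worker.storage]
rocksdb-compaction-readahead-size = "2 MB"  # makes compaction reads sequential
```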
When set to true, disables RocksDB's CompactOnDeletionCollector for partition stores.
The collector automatically triggers compaction when SST files accumulate a high density
of tombstones (deletion markers), helping reclaim disk space after bulk deletions.
This helps control space amplification when invocation journal retention expires and the cleaner purges completed invocations.
Consider disabling this if you observe frequent unnecessary compactions triggered by the collector causing performance issues.
Use O_DIRECT for writes in background flush and compactions.
Files will be opened in "direct I/O" mode, which means that data read from or written to the disk will not be cached or buffered. The hardware buffer of the devices may however still be used. Memory mapped files are not impacted by these parameters.
Disable rocksdb statistics collection
Default: False (statistics enabled)
The default depends on the different rocksdb use-cases at Restate.
Supports hot-reloading (Partial / Bifrost only)
Number of info LOG files to keep
Default: 1
Verbosity of the LOG.
Default: "error"
Max size of info LOG file
Default: 64MB
Default: the number of CPU cores on this node.
The memory budget for rocksdb memtables in bytes
The total is divided evenly across partitions. The server will rebalance the memory budget periodically depending on the number of running partitions on this node.
If this value is set, it overrides the ratio defined in rocksdb-memory-ratio.
The memory budget for rocksdb memtables as ratio
This defines the total memory for rocksdb as a ratio of all memory available to memtables
(See rocksdb-total-memtables-ratio in common). The budget is then divided evenly across
partitions.
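The two memtable budget options interact: the absolute budget, if set, takes precedence over the ratio. A sketch using the option names mentioned above (the section name is an assumption):

```toml
# Illustrative sketch; verify the section name against the schema.
[worker.storage]
# Absolute memtable budget, divided evenly across partitions on this node.
rocksdb-memory-budget = "512 MB"
# Ratio-based alternative; ignored when rocksdb-memory-budget is set.
# rocksdb-memory-ratio = 0.5
```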
StatsLevel can be used to reduce statistics overhead by skipping certain types of stats in the stats collection process.
Default: "except-detailed-timers"
Duration string in either jiff human friendly or ISO8601 format. Check https://docs.rs/jiff/latest/jiff/struct.Span.html#parsing-and-printing for more details.