# Kestra flow file

Kestra Flow definition file, see: kestra.io/docs/workflow-components/flow#flow-sample

|  |  |
|---|---|
| Type | io.kestra.core.models.flows.Flow |
| File match | `**/flows/*.yml` |
| Schema URL | https://catalog.lintel.tools/schemas/schemastore/kestra-flow-file/latest.json |
| Source | https://www.schemastore.org/kestra-0.19.0.json |

## Versions

## Validate with Lintel

```
npx @lintel/lintel check
```

## Definitions
Default value is: QUEUE
2 nested properties
Default value is: QUEUE
Default value is: false
Default value is: false
Output values make information about the execution of your Flow available and expose it for other Kestra flows to use. Output values are similar to return values in programming languages.
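As an illustrative sketch of such a flow-level output (the flow id, task, and output names below are assumptions, not part of the schema text):

```yaml
id: outputs_flow
namespace: company.team

tasks:
  - id: extract
    type: io.kestra.plugin.core.debug.Return
    format: "hello world"

outputs:
  - id: final_data
    type: STRING
    value: "{{ outputs.extract.value }}"
```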
Default value is: false
Cannot be of type ARRAY.
Default value is: true
Default value is: true
Default value is: true
Default value is: true
Default value is: true
DEPRECATED; use 'SELECT' instead.
Default value is: true
Default value is: .upl
Default value is: true
Default value is: true
Default value is: true
Default value is: true
Default value is: false
Cannot be of type ARRAY nor 'MULTISELECT'.
Default value is: STRING
Default value is: true
Default value is: true
Default value is: false
Default value is: true
Default value is: true
Default value is: true
Default value is: true
Default value is: true
Default value is: true
Default value is: RETRY_FAILED_TASK
Default value is: constant
Default value is: false
Default value is: RETRY_FAILED_TASK
Default value is: exponential
Default value is: false
Default value is: RETRY_FAILED_TASK
Default value is: random
Default value is: false
This task is deprecated, please use the io.kestra.plugin.scripts.shell.Script or io.kestra.plugin.scripts.shell.Commands task instead.

##### Examples

Single bash command.

```yaml
id: bash_single_command
namespace: company.team
tasks:
  - id: bash
    type: io.kestra.core.tasks.scripts.Bash
    commands:
      - 'echo "The current execution is : {{ execution.id }}"'
```

Bash commands that generate files in storage, accessible through outputs.

```yaml
id: bash_generate_files
namespace: company.team
tasks:
  - id: bash
    type: io.kestra.core.tasks.scripts.Bash
    outputFiles:
      - first
      - second
    commands:
      - echo "1" >> {{ outputFiles.first }}
      - echo "2" >> {{ outputFiles.second }}
```

Bash with some input files.

```yaml
id: bash_input_files
namespace: company.team
tasks:
  - id: bash
    type: io.kestra.core.tasks.scripts.Bash
    inputFiles:
      script.sh: |
        echo {{ workingDir }}
    commands:
      - /bin/bash script.sh
```

Bash with an input file from Kestra's local storage, created by a previous task.

```yaml
id: bash_use_input_files
namespace: company.team
tasks:
  - id: bash
    type: io.kestra.core.tasks.scripts.Bash
    inputFiles:
      data.csv: "{{ outputs.previousTaskId.uri }}"
    commands:
      - cat data.csv
```

Run a command on a Docker image.

```yaml
id: bash_run_php_code
namespace: company.team
tasks:
  - id: bash
    type: io.kestra.core.tasks.scripts.Bash
    runner: DOCKER
    dockerOptions:
      image: php
    commands:
      - php -r 'print(phpversion() . "\n");'
```

Execute cmd on Windows.

```yaml
id: bash_run_cmd_on_windows
namespace: company.team
tasks:
  - id: bash
    type: io.kestra.core.tasks.scripts.Bash
    commands:
      - 'echo "The current execution is : {{ execution.id }}"'
    exitOnFailed: false
    interpreter: cmd
    interpreterArgs:
      - /c
```

Set outputs from bash standard output.

```yaml
id: bash_set_outputs
namespace: company.team
tasks:
  - id: bash
    type: io.kestra.core.tasks.scripts.Bash
    commands:
      - echo '::{"outputs":{"test":"value","int":2,"bool":true,"float":3.65}}::'
```

Send a counter metric from bash standard output.

```yaml
id: bash_set_metrics
namespace: company.team
tasks:
  - id: bash
    type: io.kestra.core.tasks.scripts.Bash
    commands:
      - echo '::{"metrics":[{"name":"count","type":"counter","value":1,"tags":{"tag1":"i","tag2":"win"}}]}::'
```
The default command will be launched with /bin/sh -c "commands".
Default value is: false
Default value is: false
This tells bash that it should exit the script if any statement returns a non-true return value.
Setting this to true helps catch cases where a command fails and the script continues to run anyway.
Default value is: true
Use outputFiles instead.
Define the files as a map of a file name being the key, and the value being the file's content. Alternatively, configure the files as a JSON string with the same key/value structure as the map. In both cases, you can either specify the file's content inline, or reference a file from Kestra's internal storage by its URI, e.g. a file from an input, output of a previous task, or a Namespace File.
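A minimal sketch of the two map forms described above (the file names and the upstream task id are illustrative):

```yaml
inputFiles:
  # inline content
  script.sh: |
    echo "hello from an inline file"
  # reference to a file in Kestra's internal storage by its URI
  data.csv: "{{ outputs.previousTaskId.uri }}"
```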
Default value is: /bin/sh
Default value is: ["-c"]
Default value is: false
List of keys that will generate temporary directories.
This property can be used with a special variable named outputDirs.key.
If you add a file with ["myDir"], you can use the special var echo 1 >> {{ outputDirs.myDir }}/file1.txt and echo 2 >> {{ outputDirs.myDir }}/file2.txt, and both files will be uploaded to Kestra's internal storage. You can reference them in other tasks using {{ outputs.taskId.outputFiles['myDir/file1.txt'] }}.
List of keys that will generate temporary files.
This property can be used with a special variable named outputFiles.key.
If you add a file with ["first"], you can use the special var echo 1 >> {{ outputFiles.first }}, and in other tasks, you can reference it using {{ outputs.taskId.outputFiles.first }}.
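Putting the outputFiles and outputDirs properties described above together in one task (the flow, task, and key names are illustrative):

```yaml
id: bash_outputs_example
namespace: company.team
tasks:
  - id: bash
    type: io.kestra.core.tasks.scripts.Bash
    outputFiles:
      - first
    outputDirs:
      - myDir
    commands:
      - echo 1 >> {{ outputFiles.first }}
      - echo 2 >> {{ outputDirs.myDir }}/file1.txt
```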
Use outputFiles instead.
Default value is: PROCESS
Default value is: true
1 nested property

This task is deprecated, please use the io.kestra.plugin.scripts.node.Script or io.kestra.plugin.scripts.node.Commands task instead.

With the Node task, you can execute a full JavaScript script.
The task will create a temporary folder for each task run, and allows you to install npm packages defined in an optional package.json file.
By convention, you need to define at least a main.js file in inputFiles that will be the script used.
You can also add as many JavaScript files as you need in inputFiles.
The outputs & metrics from your Node.js script can be used by other tasks. To make things easy, we inject a node package directly into the working directory. Here is an example usage:

```javascript
const Kestra = require("./kestra");

Kestra.outputs({test: 'value', int: 2, bool: true, float: 3.65});
Kestra.counter('count', 1, {tag1: 'i', tag2: 'win'});
Kestra.timer('timer1', (callback) => { setTimeout(callback, 1000) }, {tag1: 'i', tag2: 'lost'});
Kestra.timer('timer2', 2.12, {tag1: 'i', tag2: 'destroy'});
```

##### Examples

Execute a Node.js script.

```yaml
inputFiles:
  main.js: |
    const Kestra = require("./kestra");
    const fs = require('fs')
    const result = fs.readFileSync(process.argv[2], "utf-8")
    console.log(JSON.parse(result).status)
    const axios = require('axios')
    axios.get('http://google.fr').then(d => { console.log(d.status); Kestra.outputs({'status': d.status, 'text': d.data})})
    console.log(require('./mymodule').value)
  data.json: |
    {"status": "OK"}
  mymodule.js: |
    module.exports.value = 'hello world'
  package.json: |
    {
      "name": "tmp",
      "version": "1.0.0",
      "description": "",
      "main": "index.js",
      "dependencies": {
        "axios": "^0.20.0"
      },
      "devDependencies": {},
      "scripts": {
        "test": "echo `Error: no test specified` && exit 1"
      },
      "author": "",
      "license": "ISC"
    }
args:
  - data.json
warningOnStdErr: false
```

Execute a Node.js script with an input file from Kestra's internal storage, created by a previous task.

```yaml
inputFiles:
  data.csv: "{{ outputs.previousTaskId.uri }}"
  main.js: |
    const fs = require('fs')
    const result = fs.readFileSync('data.csv', 'utf-8')
    console.log(result)
```
Default value is: false
Arguments list to pass to the main JavaScript script.
Default value is: false
This tells bash that it should exit the script if any statement returns a non-true return value.
Setting this to true helps catch cases where a command fails and the script continues to run anyway.
Default value is: true
Use outputFiles instead.
Define the files as a map of a file name being the key, and the value being the file's content. Alternatively, configure the files as a JSON string with the same key/value structure as the map. In both cases, you can either specify the file's content inline, or reference a file from Kestra's internal storage by its URI, e.g. a file from an input, output of a previous task, or a Namespace File.
Default value is: /bin/sh
Default value is: ["-c"]
Default value is: false
Set the node interpreter path to use.
Default value is: node
Set the npm binary path for node dependencies setup.
Default value is: npm
List of keys that will generate temporary directories.
This property can be used with a special variable named outputDirs.key.
If you add a file with ["myDir"], you can use the special var echo 1 >> {{ outputDirs.myDir }}/file1.txt and echo 2 >> {{ outputDirs.myDir }}/file2.txt, and both files will be uploaded to Kestra's internal storage. You can reference them in other tasks using {{ outputs.taskId.outputFiles['myDir/file1.txt'] }}.
List of keys that will generate temporary files.
This property can be used with a special variable named outputFiles.key.
If you add a file with ["first"], you can use the special var echo 1 >> {{ outputFiles.first }}, and in other tasks, you can reference it using {{ outputs.taskId.outputFiles.first }}.
Use outputFiles instead.
Default value is: PROCESS
Default value is: true
1 nested property

This task is deprecated, please use the io.kestra.plugin.scripts.python.Script or io.kestra.plugin.scripts.python.Commands task instead.

With the Python task, you can execute a full Python script.
The task will create a fresh virtualenv for each task run and allows you to install the Python packages defined in the requirements property.
By convention, you need to define at least a main.py file in inputFiles that will be the script used.
But you are also able to add as many scripts as you need in inputFiles.
You can also add a pip.conf in inputFiles to customize the pip download of dependencies (such as a private registry).
You can send outputs & metrics from your Python script that can be used by other tasks. To help, we inject a python package directly into the working directory. Here is an example usage:

```python
import time

from kestra import Kestra

Kestra.outputs({'test': 'value', 'int': 2, 'bool': True, 'float': 3.65})
Kestra.counter('count', 1, {'tag1': 'i', 'tag2': 'win'})
Kestra.timer('timer1', lambda: time.sleep(1), {'tag1': 'i', 'tag2': 'lost'})
Kestra.timer('timer2', 2.12, {'tag1': 'i', 'tag2': 'destroy'})
```

##### Examples

Execute a Python script.

```yaml
id: python_flow
namespace: company.team
tasks:
  - id: python
    type: io.kestra.core.tasks.scripts.Python
    inputFiles:
      data.json: |
        {"status": "OK"}
      main.py: |
        from kestra import Kestra
        import json
        import requests
        import sys
        result = json.loads(open(sys.argv[1]).read())
        print(f"python script {result['status']}")
        response = requests.get('http://google.com')
        print(response.status_code)
        Kestra.outputs({'status': response.status_code, 'text': response.text})
      pip.conf: |
        # some specific pip repository configuration
    args:
      - data.json
    requirements:
      - requests
```

Execute a Python script with an input file from Kestra's local storage, created by a previous task.

```yaml
id: python_flow
namespace: company.team
tasks:
  - id: python
    type: io.kestra.core.tasks.scripts.Python
    inputFiles:
      data.csv: "{{ outputs.previousTaskId.uri }}"
      main.py: |
        with open('data.csv', 'r') as f:
            print(f.read())
```
Default value is: false
Arguments list to pass to the main Python script.
The default command will be launched with ./bin/python main.py {{args}}.
Default value is: ["./bin/python main.py"]
Default value is: false
This tells bash that it should exit the script if any statement returns a non-true return value.
Setting this to true helps catch cases where a command fails and the script continues to run anyway.
Default value is: true
Use outputFiles instead.
Define the files as a map of a file name being the key, and the value being the file's content. Alternatively, configure the files as a JSON string with the same key/value structure as the map. In both cases, you can either specify the file's content inline, or reference a file from Kestra's internal storage by its URI, e.g. a file from an input, output of a previous task, or a Namespace File.
Default value is: /bin/sh
Default value is: ["-c"]
Default value is: false
List of keys that will generate temporary directories.
This property can be used with a special variable named outputDirs.key.
If you add a file with ["myDir"], you can use the special var echo 1 >> {{ outputDirs.myDir }}/file1.txt and echo 2 >> {{ outputDirs.myDir }}/file2.txt, and both files will be uploaded to Kestra's internal storage. You can reference them in other tasks using {{ outputs.taskId.outputFiles['myDir/file1.txt'] }}.
List of keys that will generate temporary files.
This property can be used with a special variable named outputFiles.key.
If you add a file with ["first"], you can use the special var echo 1 >> {{ outputFiles.first }}, and in other tasks, you can reference it using {{ outputs.taskId.outputFiles.first }}.
Use outputFiles instead.
Set the Python interpreter path to use.
Default value is: python
Python dependencies list to set up in the virtualenv, in the same format as requirements.txt.
Default value is: PROCESS
When a virtualenv is created, we will install the needed requirements. Disable it if all the requirements are already present on the file system.
If you disable the virtualenv creation, the requirements will be ignored.
Default value is: true
Default value is: true
1 nested property
##### Examples

```yaml
id: airbyte_reset
namespace: company.team
tasks:
  - id: reset
    type: io.kestra.plugin.airbyte.cloud.jobs.Reset
    token: <token>
    connectionId: e3b1ce92-547c-436f-b1e8-23b6936c12cd
```

Default value is: false
Default value is: false
Default value is: false
Default value is: 3600.000000000
Default value is: 1.000000000
Allowing capture of job status & logs.
Default value is: true
1 nested property
##### Examples

```yaml
id: airbyte_sync
namespace: company.team
tasks:
  - id: sync
    type: io.kestra.plugin.airbyte.cloud.jobs.Sync
    token: <token>
    connectionId: e3b1ce92-547c-436f-b1e8-23b6936c12cd
```

Default value is: false
Default value is: false
Default value is: false
Default value is: 3600.000000000
Default value is: 1.000000000
Allowing capture of job status & logs.
Default value is: true
1 nested property
##### Examples

```yaml
id: airbyte_check_status
namespace: company.team
tasks:
  - id: "check_status"
    type: "io.kestra.plugin.airbyte.connections.CheckStatus"
    url: http://localhost:8080
    jobId: 970
```

Default value is: false
Default value is: false
Default value is: 10.000000000
Default value is: false
Default value is: 3600.000000000
Default value is: 1.000000000
1 nested property
##### Examples

```yaml
id: airbyte_sync
namespace: company.team
tasks:
  - id: sync
    type: io.kestra.plugin.airbyte.connections.Sync
    url: http://localhost:8080
    connectionId: e3b1ce92-547c-436f-b1e8-23b6936c12cd
```

Default value is: false
Default value is: false
Default value is: true
Default value is: 10.000000000
Default value is: false
Default value is: 3600.000000000
Default value is: 1.000000000
Allowing capture of job status & logs.
Default value is: true
1 nested property
Launch a DAG run, optionally wait for its completion, and return the final state of the DAG run.

##### Examples

Trigger a DAG run with custom inputs, and authenticate with basic authentication.

```yaml
id: airflow
namespace: company.team
tasks:
  - id: run_dag
    type: io.kestra.plugin.airflow.dags.TriggerDagRun
    baseUrl: http://host.docker.internal:8080
    dagId: example_astronauts
    wait: true
    pollFrequency: PT1S
    options:
      basicAuthUser: "{{ secret('AIRFLOW_USERNAME') }}"
      basicAuthPassword: "{{ secret('AIRFLOW_PASSWORD') }}"
    body:
      conf:
        source: kestra
        namespace: "{{ flow.namespace }}"
        flow: "{{ flow.id }}"
        task: "{{ task.id }}"
        execution: "{{ execution.id }}"
```

Trigger a DAG run with custom inputs, and authenticate with a Bearer token.

```yaml
id: airflow_header_authorization
namespace: company.team
tasks:
  - id: run_dag
    type: io.kestra.plugin.airflow.dags.TriggerDagRun
    baseUrl: http://host.docker.internal:8080
    dagId: example_astronauts
    wait: true
    headers:
      authorization: "Bearer {{ secret('AIRFLOW_TOKEN') }}"
```

Default value is: false
Default value is: false
Default value is: false
Default value is: 3600.000000000
Default value is: 1.000000000
Default value is: false
Default value is: false
1 nested property
Requires maxDuration or maxRecords.

##### Examples

```yaml
id: amqp_consume
namespace: company.team
tasks:
  - id: consume
    type: io.kestra.plugin.amqp.Consume
    url: amqp://guest:guest@localhost:5672/my_vhost
    queue: kestramqp.queue
    maxRecords: 1000
```

Default value is: false
Default value is: Kestra
Default value is: false
Default value is: false
It is not a hard limit and is evaluated every second.
It is not a hard limit and is evaluated every second.
Default value is: STRING
1 nested property
Create a queue, including specified arguments.

##### Examples

```yaml
id: amqp_create_queue
namespace: company.team
tasks:
  - id: create_queue
    type: io.kestra.plugin.amqp.CreateQueue
    url: amqp://guest:guest@localhost:5672/my_vhost
    name: kestramqp.queue
```

Default value is: false
Default value is: false
Default value is: false
Default value is: true
Default value is: false
Default value is: false
1 nested property
##### Examples

```yaml
id: amqp_declare_exchange
namespace: company.team
tasks:
  - id: declare_exchange
    type: io.kestra.plugin.amqp.DeclareExchange
    url: amqp://guest:guest@localhost:5672/my_vhost
    name: kestramqp.exchange
```

Default value is: false
Default value is: false
Default value is: false
Default value is: true
Default value is: DIRECT
Default value is: false
Default value is: false
1 nested property
##### Examples

```yaml
id: amqp_publish
namespace: company.team
tasks:
  - id: publish
    type: io.kestra.plugin.amqp.Publish
    url: amqp://guest:guest@localhost:5672/my_vhost
    exchange: kestramqp.exchange
    from:
      - data: value-1
        headers:
          testHeader: KestraTest
        timestamp: '2023-01-09T08:46:33.103130753Z'
      - data: value-2
        timestamp: '2023-01-09T08:46:33.115456977Z'
    appId: unit-kestra
```

It can be a Kestra internal storage URI or a list.
Default value is: false
Default value is: false
Default value is: false
Default value is: STRING
1 nested property
##### Examples

```yaml
id: amqp_queue_bind
namespace: company.team
tasks:
  - id: queue_bind
    type: io.kestra.plugin.amqp.QueueBind
    url: amqp://guest:guest@localhost:5672/my_vhost
    exchange: kestramqp.exchange
    queue: kestramqp.queue
```

Default value is: false
Default value is: false
Default value is: false
1 nested property
If you would like to consume multiple messages processed within a given time frame and process them in batch, you can use the io.kestra.plugin.amqp.Trigger instead.

##### Examples

Consume a message from an AMQP queue in real time.

```yaml
id: amqp
namespace: company.team
tasks:
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "{{ trigger.data }}"
triggers:
  - id: realtime_trigger
    type: io.kestra.plugin.amqp.RealtimeTrigger
    url: amqp://guest:guest@localhost:5672/my_vhost
    queue: amqpTrigger.queue
```

Default value is: Kestra
Default value is: false
Default value is: false
Default value is: STRING
1 nested property
Note that you don't need an extra task to consume the message from the event trigger. The trigger will automatically consume messages, and you can retrieve their content in your flow using the {{ trigger.uri }} variable. If you would like to consume each message from an AMQP queue in real time and create one execution per message, you can use the io.kestra.plugin.amqp.RealtimeTrigger instead.

##### Examples

```yaml
id: amqp_trigger
namespace: company.team
triggers:
  - id: trigger
    type: io.kestra.plugin.amqp.Trigger
    url: amqp://guest:guest@localhost:5672/my_vhost
    maxRecords: 2
    queue: amqpTrigger.queue
```
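The consumed messages can then be read downstream through the trigger.uri variable mentioned above; a minimal sketch (the log task id is illustrative):

```yaml
tasks:
  - id: log_messages
    type: io.kestra.plugin.core.log.Log
    message: "Messages file: {{ trigger.uri }}"
```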
Default value is: Kestra
Default value is: false
The interval between two successive polls of the schedule; this helps avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the minimal interval should be at least PT30S. See ISO 8601 Durations for more information on available interval values.
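For example, a 30-second poll interval can be declared as an ISO 8601 duration (a sketch; the trigger shown is illustrative):

```yaml
triggers:
  - id: polling_trigger
    type: io.kestra.plugin.amqp.Trigger
    url: amqp://guest:guest@localhost:5672/my_vhost
    queue: amqpTrigger.queue
    maxRecords: 2
    interval: PT30S
```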
Default value is: 60.000000000
Default value is: false
It is not a hard limit and is evaluated every second.
It is not a hard limit and is evaluated every second.
Default value is: STRING
1 nested property
##### Examples

Execute a list of Ansible CLI commands to orchestrate an Ansible playbook stored in the Editor using Namespace Files.

```yaml
id: ansible
namespace: company.team
tasks:
  - id: ansible_task
    type: io.kestra.plugin.ansible.cli.AnsibleCLI
    inputFiles:
      inventory.ini: "{{ read('inventory.ini') }}"
      myplaybook.yml: "{{ read('myplaybook.yml') }}"
    docker:
      image: cytopia/ansible:latest-tools
    commands:
      - ansible-playbook -i inventory.ini myplaybook.yml
```

Execute a list of Ansible CLI commands to orchestrate an Ansible playbook defined inline in the flow definition.

```yaml
id: ansible
namespace: company.team
tasks:
  - id: ansible_task
    type: io.kestra.plugin.ansible.cli.AnsibleCLI
    inputFiles:
      inventory.ini: |
        localhost ansible_connection=local
      myplaybook.yml: |
        ---
        - hosts: localhost
          tasks:
            - name: Print Hello World
              debug:
                msg: "Hello, World!"
    docker:
      image: cytopia/ansible:latest-tools
    commands:
      - ansible-playbook -i inventory.ini myplaybook.yml
```

Default value is: false
Default value is: cytopia/ansible:latest-tools
Default value is: false
Default value is: false
Must be a list of glob expressions relative to the current working directory, for example: my-dir/**, my-dir/*/**, or my-dir/my-file.txt.
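A sketch of such a glob list (the paths are illustrative):

```yaml
outputFiles:
  - "my-dir/**"
  - "my-dir/my-file.txt"
```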
1 nested property
The query will wait for completion, except if fetchMode is set to NONE, and will output converted rows.
Row conversion is based on the types listed here.
Complex data types like array, map and struct will be converted to a string.

##### Examples

```yaml
id: aws_athena_query
namespace: company.team
tasks:
  - id: query
    type: io.kestra.plugin.aws.athena.Query
    accessKeyId: "<access-key>"
    secretKeyId: "<secret-key>"
    region: "eu-central-1"
    database: my_database
    outputLocation: s3://some-s3-bucket
    query: |
      select * from cloudfront_logs limit 10
```

The query results will be stored in this output location. Must be an existing S3 bucket.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is: false
Default value is: false
This property allows you to use a different S3-compatible storage backend.
FETCH_ONE outputs the first row, FETCH outputs all the rows, STORE stores all rows in a file, NONE does nothing; in that case, the task submits the query without waiting for its completion.
Default value is: STORE
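For instance, to return only the first result row rather than storing all rows in a file (a sketch based on the Query example above; fetchMode is the property described here):

```yaml
- id: query
  type: io.kestra.plugin.aws.athena.Query
  database: my_database
  outputLocation: s3://some-s3-bucket
  fetchMode: FETCH_ONE
  query: |
    select count(*) from cloudfront_logs
```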
Default value is: false
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is: true
The Amazon Resource Name (ARN) of the role to assume. If set, the task will use the StsAssumeRoleCredentialsProvider. If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
A unique identifier that might be required when you assume a role in another account. This property is only used when an stsRoleArn is defined.
The duration of the role session (default: 15 minutes, i.e., PT15M). This property is only used when an stsRoleArn is defined.
Default value is: 900.000000000
This property is only used when an stsRoleArn is defined.
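A sketch combining the STS role-assumption properties described above (the ARN and external ID are placeholders, and the property names other than stsRoleArn are assumptions inferred from the surrounding descriptions):

```yaml
stsRoleArn: "arn:aws:iam::123456789012:role/my-role"  # role to assume
stsRoleExternalId: "my-external-id"                   # only used with stsRoleArn
stsRoleSessionDuration: "PT15M"                       # role session duration (the default)
```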
1 nested property
##### Examples

Run a simple AWS CLI command and capture the output.

```yaml
id: aws_cli
namespace: company.team
tasks:
  - id: cli
    type: io.kestra.plugin.aws.cli.AwsCLI
    accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
    secretKeyId: "{{ secret('AWS_SECRET_ACCESS_KEY') }}"
    region: "us-east-1"
    commands:
      - aws sts get-caller-identity | tr -d '\n' | xargs -0 -I {} echo '::{"outputs":{}}::'
```

Create a simple S3 bucket.

```yaml
id: aws_cli
namespace: company.team
tasks:
  - id: cli
    type: io.kestra.plugin.aws.cli.AwsCLI
    accessKeyId: "<access-key>"
    secretKeyId: "<secret-key>"
    region: "eu-central-1"
    commands:
      - aws s3 mb s3://test-bucket
```

List all S3 buckets as the task's output.

```yaml
id: aws_cli
namespace: company.team
tasks:
  - id: cli
    type: io.kestra.plugin.aws.cli.AwsCLI
    accessKeyId: "<access-key>"
    secretKeyId: "<secret-key>"
    region: "eu-central-1"
    commands:
      - aws s3api list-buckets | tr -d '\n' | xargs -0 -I {} echo '::{"outputs":{}}::'
```

If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is: false
Default value is: amazon/aws-cli
Default value is: false
This property allows you to use a different S3-compatible storage backend.
Default value is: false
Must be a list of glob expressions relative to the current working directory, for example: my-dir/**, my-dir/*/**, or my-dir/my-file.txt.
Default value is: JSON
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
The Amazon Resource Name (ARN) of the role to assume. If set, the task will use the StsAssumeRoleCredentialsProvider. If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
A unique identifier that might be required when you assume a role in another account. This property is only used when an stsRoleArn is defined.
The duration of the role session (default: 15 minutes, i.e., PT15M). This property is only used when an stsRoleArn is defined.
Default value is: 900.000000000
This property is only used when an stsRoleArn is defined.
1 nested property
##### Examples

Delete an item by its key.

```yaml
id: aws_dynamodb_delete_item
namespace: company.team
tasks:
  - id: delete_item
    type: io.kestra.plugin.aws.dynamodb.DeleteItem
    accessKeyId: "<access-key>"
    secretKeyId: "<secret-key>"
    region: "eu-central-1"
    tableName: "persons"
    key:
      id: "1"
```

If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is: false
Default value is: false
This property allows you to use a different S3-compatible storage backend.
The DynamoDB item identifier.
Default value is: false
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
The Amazon Resource Name (ARN) of the role to assume. If set, the task will use the StsAssumeRoleCredentialsProvider. If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
A unique identifier that might be required when you assume a role in another account. This property is only used when an stsRoleArn is defined.
The duration of the role session (default: 15 minutes, i.e., PT15M). This property is only used when an stsRoleArn is defined.
Default value is: 900.000000000
This property is only used when an stsRoleArn is defined.
1 nested property
##### Examples

Get an item by its key.

```yaml
id: aws_dynamodb_get_item
namespace: company.team
tasks:
  - id: get_item
    type: io.kestra.plugin.aws.dynamodb.GetItem
    accessKeyId: "<access-key>"
    secretKeyId: "<secret-key>"
    region: "eu-central-1"
    tableName: "persons"
    key:
      id: "1"
```

If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is: false
Default value is: false
This property allows you to use a different S3-compatible storage backend.
The DynamoDB item identifier.
Default value is: false
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
The Amazon Resource Name (ARN) of the role to assume. If set, the task will use the StsAssumeRoleCredentialsProvider. If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
A unique identifier that might be required when you assume a role in another account. This property is only used when an stsRoleArn is defined.
The duration of the role session (default: 15 minutes, i.e., PT15M). This property is only used when an stsRoleArn is defined.
Default value is: 900.000000000
This property is only used when an stsRoleArn is defined.
1 nested property
##### Examples

Put an item in map form into a table.

```yaml
id: aws_dynamodb_put_item
namespace: company.team
tasks:
  - id: put_item
    type: io.kestra.plugin.aws.dynamodb.PutItem
    accessKeyId: "<access-key>"
    secretKeyId: "<secret-key>"
    region: "eu-central-1"
    tableName: "persons"
    item:
      id: 1
      firstname: "John"
      lastname: "Doe"
```

Put an item in JSON string form into a table.

```yaml
id: aws_dynamodb_put_item
namespace: company.team
tasks:
  - id: put_item
    type: io.kestra.plugin.aws.dynamodb.PutItem
    accessKeyId: "<access-key>"
    secretKeyId: "<secret-key>"
    region: "eu-central-1"
    tableName: "persons"
    item: "{{ outputs.task_id.data | json }}"
```

If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is: false
Default value is: false
This property allows you to use a different S3-compatible storage backend.
The item can be in the form of a JSON string, or a map.
Default value is: false
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
The Amazon Resource Name (ARN) of the role to assume. If set, the task will use the StsAssumeRoleCredentialsProvider. If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
A unique identifier that might be required when you assume a role in another account. This property is only used when an stsRoleArn is defined.
The duration of the role session (default: 15 minutes, i.e., PT15M). This property is only used when an stsRoleArn is defined.
Default value is: 900.000000000
This property is only used when an stsRoleArn is defined.
1 nested property
##### Examples

Query items from a table.

```yaml
id: aws_dynamo_db_query
namespace: company.team
tasks:
  - id: query
    type: io.kestra.plugin.aws.dynamodb.Query
    accessKeyId: "<access-key>"
    secretKeyId: "<secret-key>"
    region: "eu-central-1"
    tableName: "persons"
    keyConditionExpression: id = :id
    expressionAttributeValues:
      :id: "1"
```

Query items from a table with a filter expression.

```yaml
id: aws_dynamo_db_query
namespace: company.team
tasks:
  - id: query
    type: io.kestra.plugin.aws.dynamodb.Query
    accessKeyId: "<access-key>"
    secretKeyId: "<secret-key>"
    region: "eu-central-1"
    tableName: "persons"
    keyConditionExpression: id = :id
    expressionAttributeValues:
      :id: "1"
      :lastname: "Doe"
```

It's a map of string -> object.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is: false
Default value is: false
This property allows you to use a different S3-compatible storage backend.
FETCH_ONE outputs the first row, FETCH outputs all the rows, STORE stores all rows in a file, NONE does nothing.
Default value is: STORE
Query filter expression.
Default value is: false
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
The Amazon Resource Name (ARN) of the role to assume. If set, the task will use the StsAssumeRoleCredentialsProvider. If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
A unique identifier that might be required when you assume a role in another account. This property is only used when an stsRoleArn is defined.
The duration of the role session (default: 15 minutes, i.e., PT15M). This property is only used when an stsRoleArn is defined.
Default value is: 900.000000000
This property is only used when an stsRoleArn is defined.
1 nested property
Examples
Scan all items from a table.
id: aws_dynamo_db_scan
namespace: company.team
tasks:
- id: scan
type: io.kestra.plugin.aws.dynamodb.Scan
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
tableName: "persons"
Scan items from a table with a filter expression.
id: aws_dynamo_db_scan
namespace: company.team
tasks:
- id: scan
type: io.kestra.plugin.aws.dynamodb.Scan
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
tableName: "persons"
filterExpression: "lastname = :lastname"
expressionAttributeValues:
:lastname: "Doe"
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is : false
Default value is : false
This property allows you to use a different S3 compatible storage backend.
It's a map of string -> object.
FETCH_ONE outputs the first row, FETCH outputs all rows, STORE stores all rows in a file, and NONE does nothing.
Default value is : STORE
When used, the expressionAttributeValues property must also be provided.
Default value is : false
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
The Amazon Resource Name (ARN) of the role to assume. If set the task will use the StsAssumeRoleCredentialsProvider. If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
A unique identifier that might be required when you assume a role in another account. This property is only used when an stsRoleArn is defined.
The duration of the role session (default: 15 minutes, i.e., PT15M). This property is only used when an stsRoleArn is defined.
Default value is : 900.000000000
This property is only used when an stsRoleArn is defined.
1 nested properties
Examples
Retrieve the AWS ECR authorization token.
id: aws_ecr_get_auth_token
namespace: company.team
tasks:
- id: get_auth_token
type: io.kestra.plugin.aws.ecr.GetAuthToken
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is : false
Default value is : false
This property allows you to use a different S3 compatible storage backend.
Default value is : false
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
The Amazon Resource Name (ARN) of the role to assume. If set the task will use the StsAssumeRoleCredentialsProvider. If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
A unique identifier that might be required when you assume a role in another account. This property is only used when an stsRoleArn is defined.
The duration of the role session (default: 15 minutes, i.e., PT15M). This property is only used when an stsRoleArn is defined.
Default value is : 900.000000000
This property is only used when an stsRoleArn is defined.
1 nested properties
Examples
Send multiple custom events as maps to Amazon EventBridge so that they can be matched to rules.
id: aws_event_bridge_put_events
namespace: company.team
tasks:
- id: put_events
type: io.kestra.plugin.aws.eventbridge.PutEvents
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
entries:
- eventBusName: "events"
source: "Kestra"
detailType: "my_object"
detail:
message: "hello from EventBridge and Kestra"
Send multiple custom events as a JSON string to Amazon EventBridge so that they can be matched to rules.
id: aws_event_bridge_put_events
namespace: company.team
tasks:
- id: put_events
type: io.kestra.plugin.aws.eventbridge.PutEvents
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
entries:
- eventBusName: "events"
source: "Kestra"
detailType: "my_object"
detail: '{"message": "hello from EventBridge and Kestra"}'
resources:
- "arn:aws:iam::123456789012:user/johndoe"
A list of at least one EventBridge entry.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is : false
Default value is : false
This property allows you to use a different S3 compatible storage backend.
If true, the task will fail when any event fails to be sent.
Default value is : true
Default value is : false
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
The Amazon Resource Name (ARN) of the role to assume. If set the task will use the StsAssumeRoleCredentialsProvider. If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
A unique identifier that might be required when you assume a role in another account. This property is only used when an stsRoleArn is defined.
The duration of the role session (default: 15 minutes, i.e., PT15M). This property is only used when an stsRoleArn is defined.
Default value is : 900.000000000
This property is only used when an stsRoleArn is defined.
1 nested properties
Can be a JSON string, or a map.
AWS resources, identified by Amazon Resource Name (ARN), which the event primarily concerns. Any number, including zero, may be present.
Examples
Send multiple records as maps to Amazon Kinesis Data Streams. Check the following AWS API reference for the structure of the PutRecordsRequestEntry request payload.
id: aws_kinesis_put_records
namespace: company.team
tasks:
- id: put_records
type: io.kestra.plugin.aws.kinesis.PutRecords
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
streamName: "mystream"
records:
- data: "user sign-in event"
explicitHashKey: "optional hash value overriding the partition key"
partitionKey: "user1"
- data: "user sign-out event"
partitionKey: "user1"
Send multiple records from an internal storage ION file to Amazon Kinesis Data Streams.
id: aws_kinesis_put_records
namespace: company.team
tasks:
- id: put_records
type: io.kestra.plugin.aws.kinesis.PutRecords
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
streamName: "mystream"
records: kestra:///myfile.ion
A list of at least one record with a map including data and partitionKey properties (those two are required arguments). Check the PutRecordsRequestEntry API reference for a detailed description of required fields.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is : false
Default value is : false
This property allows you to use a different S3 compatible storage backend.
If true, the task will fail when any record fails to be sent.
Default value is : true
Default value is : false
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Make sure to set either streamName or streamArn. One of those must be provided.
Make sure to set either streamName or streamArn. One of those must be provided.
The Amazon Resource Name (ARN) of the role to assume. If set the task will use the StsAssumeRoleCredentialsProvider. If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
A unique identifier that might be required when you assume a role in another account. This property is only used when an stsRoleArn is defined.
The duration of the role session (default: 15 minutes, i.e., PT15M). This property is only used when an stsRoleArn is defined.
Default value is : 900.000000000
This property is only used when an stsRoleArn is defined.
1 nested properties
Examples
Invoke given Lambda function and wait for its completion.
id: aws_lambda_invoke
namespace: company.team
tasks:
- id: invoke
type: io.kestra.plugin.aws.lambda.Invoke
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
functionArn: "arn:aws:lambda:eu-central-1:123456789012:function:my-function"
Invoke given Lambda function with given payload parameters and wait for its completion. Payload is a map of items.
id: aws_lambda_invoke
namespace: company.team
tasks:
- id: invoke
type: io.kestra.plugin.aws.lambda.Invoke
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
functionArn: "arn:aws:lambda:eu-central-1:123456789012:function:my-function"
functionPayload:
id: 1
firstname: "John"
lastname: "Doe"
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is : false
Default value is : false
This property allows you to use a different S3 compatible storage backend.
Request payload. It's a map of string -> object.
Default value is : false
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
The Amazon Resource Name (ARN) of the role to assume. If set the task will use the StsAssumeRoleCredentialsProvider. If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
A unique identifier that might be required when you assume a role in another account. This property is only used when an stsRoleArn is defined.
The duration of the role session (default: 15 minutes, i.e., PT15M). This property is only used when an stsRoleArn is defined.
Default value is : 900.000000000
This property is only used when an stsRoleArn is defined.
1 nested properties
Examples
id: aws_s3_copy
namespace: company.team
tasks:
- id: copy
type: io.kestra.plugin.aws.s3.Copy
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
from:
bucket: "my-bucket"
key: "path/to/file"
to:
bucket: "my-bucket2"
key: "path/to/file2"
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is : false
Default value is : false
Default value is : false
This property allows you to use a different S3 compatible storage backend.
Default value is : false
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
The Amazon Resource Name (ARN) of the role to assume. If set the task will use the StsAssumeRoleCredentialsProvider. If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
A unique identifier that might be required when you assume a role in another account. This property is only used when an stsRoleArn is defined.
The duration of the role session (default: 15 minutes, i.e., PT15M). This property is only used when an stsRoleArn is defined.
Default value is : 900.000000000
This property is only used when an stsRoleArn is defined.
1 nested properties
Examples
Create a new bucket with some options.
id: aws_s3_create_bucket
namespace: company.team
tasks:
- id: create_bucket
type: io.kestra.plugin.aws.s3.CreateBucket
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
bucket: "my-bucket"
The S3 bucket name to create.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is : false
Default value is : false
This property allows you to use a different S3 compatible storage backend.
Allows grantee the read, write, read ACP, and write ACP permissions on the bucket.
Default value is : false
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
The Amazon Resource Name (ARN) of the role to assume. If set the task will use the StsAssumeRoleCredentialsProvider. If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
A unique identifier that might be required when you assume a role in another account. This property is only used when an stsRoleArn is defined.
The duration of the role session (default: 15 minutes, i.e., PT15M). This property is only used when an stsRoleArn is defined.
Default value is : 900.000000000
This property is only used when an stsRoleArn is defined.
1 nested properties
Examples
id: aws_s3_delete
namespace: company.team
tasks:
- id: delete
type: io.kestra.plugin.aws.s3.Delete
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
bucket: "my-bucket"
key: "path/to/file"
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is : false
Default value is : false
This property allows you to use a different S3 compatible storage backend.
Default value is : false
Required to permanently delete a versioned object if versioning is configured with MFA delete enabled.
Sets the value of the RequestPayer property for this object.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
The Amazon Resource Name (ARN) of the role to assume. If set the task will use the StsAssumeRoleCredentialsProvider. If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
A unique identifier that might be required when you assume a role in another account. This property is only used when an stsRoleArn is defined.
The duration of the role session (default: 15 minutes, i.e., PT15M). This property is only used when an stsRoleArn is defined.
Default value is : 900.000000000
This property is only used when an stsRoleArn is defined.
1 nested properties
Examples
id: aws_s3_delete_list
namespace: company.team
tasks:
- id: delete_list
type: io.kestra.plugin.aws.s3.DeleteList
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
bucket: "my-bucket"
prefix: "sub-dir"
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is : false
Default value is : false
This property allows you to use a different S3 compatible storage backend.
Default value is : false
If the bucket is owned by a different account, the request fails with the HTTP status code 403 Forbidden (access denied).
Default value is : BOTH
Default value is : false
Amazon S3 starts listing after this specified key. Marker can be any key in the bucket.
By default, the action returns up to 1,000 key names. The response might contain fewer keys but will never contain more.
Default value is : 1000
e.g.:
regExp: .* to match all files
regExp: .*2020-01-0.\\.csv to match files from January 01 to 09 ending with .csv
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
The Amazon Resource Name (ARN) of the role to assume. If set the task will use the StsAssumeRoleCredentialsProvider. If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
A unique identifier that might be required when you assume a role in another account. This property is only used when an stsRoleArn is defined.
The duration of the role session (default: 15 minutes, i.e., PT15M). This property is only used when an stsRoleArn is defined.
Default value is : 900.000000000
This property is only used when an stsRoleArn is defined.
1 nested properties
Examples
id: aws_s3_download
namespace: company.team
tasks:
- id: download
type: io.kestra.plugin.aws.s3.Download
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
bucket: "my-bucket"
key: "path/to/file"
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is : false
Default value is : false
Default value is : false
This property allows you to use a different S3 compatible storage backend.
Default value is : false
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
The Amazon Resource Name (ARN) of the role to assume. If set the task will use the StsAssumeRoleCredentialsProvider. If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
A unique identifier that might be required when you assume a role in another account. This property is only used when an stsRoleArn is defined.
The duration of the role session (default: 15 minutes, i.e., PT15M). This property is only used when an stsRoleArn is defined.
Default value is : 900.000000000
This property is only used when an stsRoleArn is defined.
1 nested properties
Examples
id: aws_s3_downloads
namespace: company.team
tasks:
- id: downloads
type: io.kestra.plugin.aws.s3.Downloads
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
bucket: "my-bucket"
prefix: "sub-dir"
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is : false
Default value is : false
Default value is : false
This property allows you to use a different S3 compatible storage backend.
If the bucket is owned by a different account, the request fails with the HTTP status code 403 Forbidden (access denied).
Default value is : BOTH
Default value is : false
Amazon S3 starts listing after this specified key. Marker can be any key in the bucket.
By default, the action returns up to 1,000 key names. The response might contain fewer keys but will never contain more.
Default value is : 1000
e.g.:
regExp: .* to match all files
regExp: .*2020-01-0.\\.csv to match files from January 01 to 09 ending with .csv
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
The Amazon Resource Name (ARN) of the role to assume. If set the task will use the StsAssumeRoleCredentialsProvider. If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
A unique identifier that might be required when you assume a role in another account. This property is only used when an stsRoleArn is defined.
The duration of the role session (default: 15 minutes, i.e., PT15M). This property is only used when an stsRoleArn is defined.
Default value is : 900.000000000
This property is only used when an stsRoleArn is defined.
1 nested properties
Examples
id: aws_s3_list
namespace: company.team
tasks:
- id: list
type: io.kestra.plugin.aws.s3.List
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
bucket: "my-bucket"
prefix: "sub-dir"
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is : false
Default value is : false
This property allows you to use a different S3 compatible storage backend.
If the bucket is owned by a different account, the request fails with the HTTP status code 403 Forbidden (access denied).
Default value is : BOTH
Default value is : false
Amazon S3 starts listing after this specified key. Marker can be any key in the bucket.
By default, the action returns up to 1,000 key names. The response might contain fewer keys but will never contain more.
Default value is : 1000
e.g.:
regExp: .* to match all files
regExp: .*2020-01-0.\\.csv to match files from January 01 to 09 ending with .csv
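Putting the listing properties together, a List task could look like the sketch below (the regExp property is shown above; the maxKeys property name is assumed from the "up to 1,000 key names" description):

```yaml
id: aws_s3_list_filtered
namespace: company.team
tasks:
  - id: list
    type: io.kestra.plugin.aws.s3.List
    region: "eu-central-1"
    bucket: "my-bucket"
    prefix: "sub-dir"
    maxKeys: 100         # return at most 100 key names per listing
    regExp: ".*\\.csv"   # keep only objects ending with .csv
```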
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
The Amazon Resource Name (ARN) of the role to assume. If set the task will use the StsAssumeRoleCredentialsProvider. If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
A unique identifier that might be required when you assume a role in another account. This property is only used when an stsRoleArn is defined.
The duration of the role session (default: 15 minutes, i.e., PT15M). This property is only used when an stsRoleArn is defined.
Default value is : 900.000000000
This property is only used when an stsRoleArn is defined.
1 nested properties
This trigger polls the S3 bucket at each interval. You can search for all files in a bucket or directory with from, or filter the files with a regExp. The detection is atomic: internally, we do a list and interact only with the files listed.
Once a file is detected, we download it into internal storage and process it with the declared action in order to move or delete the file from the bucket (to avoid double detection on the next poll).
Examples
Wait for a list of files on an S3 bucket and iterate through the files.
id: s3_listen
namespace: company.team
tasks:
- id: each
type: io.kestra.plugin.core.flow.EachSequential
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ taskrun.value }}"
value: "{{ trigger.objects | jq('.[].uri') }}"
triggers:
- id: watch
type: io.kestra.plugin.aws.s3.Trigger
interval: "PT5M"
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
bucket: "my-bucket"
prefix: "sub-dir"
action: MOVE
moveTo:
key: archive
bucket: "new-bucket"
Wait for a list of files on an S3 bucket and iterate through the files. Delete files manually after processing to prevent infinite triggering.
id: s3_listen
namespace: company.team
tasks:
- id: each
type: io.kestra.plugin.core.flow.EachSequential
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ taskrun.value }}"
- id: delete
type: io.kestra.plugin.aws.s3.Delete
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
bucket: "my-bucket"
key: "{{ taskrun.value }}"
value: "{{ trigger.objects | jq('.[].key') }}"
triggers:
- id: watch
type: io.kestra.plugin.aws.s3.Trigger
interval: "PT5M"
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
bucket: "my-bucket"
prefix: "sub-dir"
action: NONE
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is : false
This property allows you to use a different S3 compatible storage backend.
If the bucket is owned by a different account, the request fails with the HTTP status code 403 Forbidden (access denied).
Default value is : BOTH
The interval between two consecutive polls of the schedule; this can help avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the minimal interval should be at least PT30S. See ISO 8601 durations for more information on available interval values.
Default value is : 60.000000000
Default value is : false
Amazon S3 starts listing after this specified key. Marker can be any key in the bucket.
By default, the action returns up to 1,000 key names. The response might contain fewer keys but will never contain more.
Default value is : 1000
e.g.:
regExp: .* to match all files
regExp: .*2020-01-0.\\.csv to match files from January 01 to 09 ending with .csv
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
The Amazon Resource Name (ARN) of the role to assume. If set the task will use the StsAssumeRoleCredentialsProvider. If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
A unique identifier that might be required when you assume a role in another account. This property is only used when an stsRoleArn is defined.
The duration of the role session (default: 15 minutes, i.e., PT15M). This property is only used when an stsRoleArn is defined.
Default value is : 900.000000000
This property is only used when an stsRoleArn is defined.
1 nested properties
Examples
id: aws_s3_upload
namespace: company.team
inputs:
- id: myfile
type: FILE
tasks:
- id: upload
type: io.kestra.plugin.aws.s3.Upload
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
from: "{{ inputs.myfile }}"
bucket: "my-bucket"
key: "path/to/file"
Can be a single file, a list of files, or a JSON array.
A full key (with filename), or the directory path if from contains multiple files.
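For instance, when from is a list of files, key acts as the destination directory path. A hedged sketch (the input file URIs are placeholders):

```yaml
- id: upload_many
  type: io.kestra.plugin.aws.s3.Upload
  region: "eu-central-1"
  bucket: "my-bucket"
  key: "path/to/dir/"   # directory path, since several files are uploaded
  from:
    - "{{ inputs.first_file }}"
    - "{{ inputs.second_file }}"
```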
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is : false
Setting this header to true causes Amazon S3 to use an S3 Bucket Key for object encryption with SSE-KMS.
Must be used in pair with checksumAlgorithm to define the expected algorithm of these values.
Default value is : false
Specifies what decoding mechanisms must be applied to obtain the media type referenced by the Content-Type header field.
This parameter is useful when the size of the body cannot be determined automatically.
Default value is : false
This property allows you to use a different S3 compatible storage backend.
If the bucket is owned by a different account, the request fails with the HTTP status code 403 Forbidden (access denied).
Default value is : false
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
For example, AES256, aws:kms, aws:kms:dsse
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
The Amazon Resource Name (ARN) of the role to assume. If set the task will use the StsAssumeRoleCredentialsProvider. If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
A unique identifier that might be required when you assume a role in another account. This property is only used when an stsRoleArn is defined.
The duration of the role session (default: 15 minutes, i.e., PT15M). This property is only used when an stsRoleArn is defined.
Default value is : 900.000000000
This property is only used when an stsRoleArn is defined.
1 nested properties
Examples
id: aws_sns_publish
namespace: company.team
tasks:
- id: publish
type: io.kestra.plugin.aws.sns.Publish
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
topicArn: "arn:aws:sns:eu-central-1:000000000000:MessageTopic"
from:
- data: Hello World
- data: Hello Kestra
subject: Kestra
Can be an internal storage URI, a list of SNS messages, or a single SNS message.
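The three accepted forms of from could look as follows (a sketch; the internal storage URI is a placeholder):

```yaml
# single SNS message
from:
  data: Hello World

# list of SNS messages
from:
  - data: Hello World
  - data: Hello Kestra

# internal storage URI pointing to a file of messages
from: "{{ outputs.previous_task.uri }}"
```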
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is : false
Default value is : false
This property allows you to use a different S3 compatible storage backend.
Default value is : false
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
The Amazon Resource Name (ARN) of the role to assume. If set the task will use the StsAssumeRoleCredentialsProvider. If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
A unique identifier that might be required when you assume a role in another account. This property is only used when an stsRoleArn is defined.
The duration of the role session (default: 15 minutes, i.e., PT15M). This property is only used when an stsRoleArn is defined.
Default value is : 900.000000000
This property is only used when an stsRoleArn is defined.
1 nested properties
Requires maxDuration or maxRecords.
Examples
id: aws_sqs_consume
namespace: company.team
tasks:
- id: consume
type: io.kestra.plugin.aws.sqs.Consume
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
queueUrl: "https://sqs.eu-central-1.amazonaws.com/000000000000/test-queue"
maxRecords: 10
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is : false
Default value is : false
This property allows you to use a different S3 compatible storage backend.
Default value is : false
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is : STRING
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
The Amazon Resource Name (ARN) of the role to assume. If set the task will use the StsAssumeRoleCredentialsProvider. If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
A unique identifier that might be required when you assume a role in another account. This property is only used when an stsRoleArn is defined.
The duration of the role session (default: 15 minutes, i.e., PT15M). This property is only used when an stsRoleArn is defined.
Default value is : 900.000000000
This property is only used when an stsRoleArn is defined.
1 nested properties
Examples
id: aws_sqs_publish
namespace: company.team
tasks:
- id: publish
type: io.kestra.plugin.aws.sqs.Publish
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
queueUrl: "https://sqs.eu-central-1.amazonaws.com/000000000000/test-queue"
from:
- data: Hello World
- data: Hello Kestra
delaySeconds: 5
Can be an internal storage URI, a list of SQS messages, or a single SQS message.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is : false
Default value is : false
This property allows you to use a different S3 compatible storage backend.
Default value is : false
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
The Amazon Resource Name (ARN) of the role to assume. If set the task will use the StsAssumeRoleCredentialsProvider. If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
A unique identifier that might be required when you assume a role in another account. This property is only used when an stsRoleArn is defined.
The duration of the role session (default: 15 minutes, i.e., PT15M). This property is only used when an stsRoleArn is defined.
Default value is : 900.000000000
This property is only used when an stsRoleArn is defined.
1 nested properties
If you would like to consume multiple messages processed within a given time frame and process them in batch, you can use the io.kestra.plugin.aws.sqs.Trigger instead.
Examples
Consume a message from an SQS queue in real-time.
id: sqs
namespace: company.team
tasks:
- id: log
type: io.kestra.plugin.core.log.Log
message: "{{ trigger.data }}"
triggers:
- id: realtime_trigger
type: io.kestra.plugin.aws.sqs.RealtimeTrigger
accessKeyId: "access_key"
secretKeyId: "secret_key"
region: "eu-central-1"
queueUrl: https://sqs.eu-central-1.amazonaws.com/000000000000/test-queue
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is : 3
Default value is : false
This property allows you to use a different S3 compatible storage backend.
Default value is : false
Increasing this value can reduce the number of requests made to SQS. Amazon SQS never returns more messages than this value (however, fewer messages might be returned). Valid values: 1 to 10.
Default value is : 5
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is : STRING
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
The Amazon Resource Name (ARN) of the role to assume. If set the task will use the StsAssumeRoleCredentialsProvider. If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
A unique identifier that might be required when you assume a role in another account. This property is only used when an stsRoleArn is defined.
The duration of the role session (default: 15 minutes, i.e., PT15M). This property is only used when an stsRoleArn is defined.
Default value is : 900
This property is only used when an stsRoleArn is defined.
Default value is : 20
1 nested properties
Requires maxDuration or maxRecords.
Note that you don't need an extra task to consume the message from the event trigger. The trigger will automatically consume messages, and you can retrieve their content in your flow using the {{ trigger.uri }} variable. If you would like to consume each message from an SQS queue in real-time and create one execution per message, you can use the io.kestra.plugin.aws.sqs.RealtimeTrigger instead.

##### Examples

```yaml
id: sqs
namespace: company.team

tasks:
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "{{ trigger.data }}"

triggers:
  - id: trigger
    type: io.kestra.plugin.aws.sqs.Trigger
    accessKeyId: "<access-key>"
    secretKeyId: "<secret-key>"
    region: "eu-central-1"
    queueUrl: "https://sqs.eu-central-1.amazonaws.com/000000000000/test-queue"
    maxRecords: 10
```
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is : false
This property allows you to use a different S3 compatible storage backend.
The interval between two consecutive polls of the schedule; this can avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the minimal interval must be at least PT30S. See ISO 8601 durations for more information on available interval values.
Default value is : 60
Default value is : false
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
Default value is : STRING
If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
The Amazon Resource Name (ARN) of the role to assume. If set the task will use the StsAssumeRoleCredentialsProvider. If no credentials are defined, we will use the default credentials provider chain to fetch credentials.
A unique identifier that might be required when you assume a role in another account. This property is only used when an stsRoleArn is defined.
The duration of the role session (default: 15 minutes, i.e., PT15M). This property is only used when an stsRoleArn is defined.
Default value is : 900
This property is only used when an stsRoleArn is defined.
1 nested properties
##### Examples

```yaml
id: azure_batch_job_create
namespace: company.team

tasks:
  - id: create
    type: io.kestra.plugin.azure.batch.job.Create
    endpoint: https://***.francecentral.batch.azure.com
    account: <batch-account>
    accessKey: <access-key>
    poolId: <pool-id>
    job:
      id: <job-name>
    tasks:
      - id: env
        commands:
          - 'echo t1=$ENV_STRING'
        environments:
          ENV_STRING: "{{ inputs.first }}"
      - id: echo
        commands:
          - 'echo t2={{ inputs.second }} 1>&2'
      - id: for
        commands:
          - 'for i in $(seq 10); do echo t3=$i; done'
      - id: vars
        commands:
          - echo '::{"outputs":{"extract":"'$(cat files/in/in.txt)'"}::'
        resourceFiles:
          - httpUrl: https://unittestkt.blob.core.windows.net/tasks/***?sv=***&se=***&sr=***&sp=***&sig=***
            filePath: files/in/in.txt
      - id: output
        commands:
          - 'mkdir -p outs/child/sub'
          - 'echo 1 > outs/1.txt'
          - 'echo 2 > outs/child/2.txt'
          - 'echo 3 > outs/child/sub/3.txt'
        outputFiles:
          - outs/1.txt
        outputDirs:
          - outs/child
```
Use a container to start the task; the pool must use a microsoft-azure-batch publisher.

```yaml
id: azure_batch_job_create
namespace: company.team

tasks:
  - id: create
    type: io.kestra.plugin.azure.batch.job.Create
    endpoint: https://***.francecentral.batch.azure.com
    account: <batch-account>
    accessKey: <access-key>
    poolId: <pool-id>
    job:
      id: <job-name>
    tasks:
      - id: echo
        commands:
          - 'python --version'
        containerSettings:
          imageName: python
```
Default value is : false
Default value is : 1
Default value is : true
Default value is : false
Default value is : false
If null, there is no timeout and the task is delegated to Azure Batch.
Default value is : true
1 nested properties
If omitted, the default is "docker.io".
The ID can contain any combination of alphanumeric characters including hyphens and underscores, and cannot contain more than 64 characters. The ID is case-preserving and case-insensitive (that is, you may not have two IDs within an Account that differ only by case).
The display name need not be unique and can contain any Unicode characters up to a maximum length of 1024.
The value of maxParallelTasks must be -1 or greater than 0, if specified. If not specified, the default value is -1, which means there's no limit to the number of tasks that can be run at once. You can update a job's maxParallelTasks after it has been created using the update job API.
Priority values can range from -1000 to 1000, with -1000 being the lowest priority and 1000 being the highest priority. The default value is 0.
Both relative and absolute paths are supported. Relative paths are relative to the Task working directory. The following wildcards are supported: * matches 0 or more characters (for example, pattern abc* would match abc or abcdef), ** matches any directory, ? matches any single character, [abc] matches one character in the brackets, and [a-c] matches one character in the range. Brackets can include a negation to match any character not specified (for example, [!abc] matches any character but a, b, or c). If a file name starts with "." it is ignored by default but may be matched by specifying it explicitly (for example *.gif will not match .a.gif, but .*.gif will). A simple example: **\*.txt matches any file that does not start in '.' and ends with .txt in the Task working directory or any subdirectory. If the filename contains a wildcard character it can be escaped using brackets (for example, abc[*] would match a file named abc*). Note that both \ and / are treated as directory separators on Windows, but only / is on Linux. Environment variables (%var% on Windows or $var on Linux) are expanded prior to the pattern being applied.
If not using a managed identity, the URL must include a Shared Access Signature (SAS) granting write permissions to the container.
If filePattern refers to a specific file (i.e. contains no wildcards), then path is the name of the blob to which to upload that file. If filePattern contains one or more wildcards (and therefore may match multiple files), then path is the name of the blob virtual directory (which is prepended to each blob name) to which to upload the file(s). If omitted, file(s) are uploaded to the root of the container with a blob name matching their file name.
Default value is : taskcompletion
The autoStorageContainerName, storageContainerUrl and httpUrl properties are mutually exclusive, and one of them must be specified.
Only the blobs whose names begin with the specified prefix will be downloaded. The property is valid only when autoStorageContainerName or storageContainerUrl is used. This prefix can be a partial file name or a subdirectory. If a prefix is not specified, all the files in the container will be downloaded.
This property applies only to files being downloaded to Linux Compute Nodes. It will be ignored if it is specified for a resourceFile which will be downloaded to a Windows Compute Node. If this property is not specified for a Linux Compute Node, then a default value of 0770 is applied to the file.
If the httpUrl property is specified, the filePath is required and describes the path which the file will be downloaded to, including the file name. Otherwise, if the autoStorageContainerName or storageContainerUrl property is specified, filePath is optional and is the directory to download the files to. In the case where filePath is used as a directory, any directory structure already associated with the input data will be retained in full and appended to the specified filePath directory. The specified relative path cannot break out of the Task's working directory (for example by using ..).
The autoStorageContainerName, storageContainerUrl and httpUrl properties are mutually exclusive, and one of them must be specified. If the URL points to Azure Blob Storage, it must be readable from compute nodes. There are three ways to get such a URL for a blob in Azure storage: include a Shared Access Signature (SAS) granting read permissions on the blob, use a managed identity with read permission, or set the ACL for the blob or its container to allow public access.
The autoStorageContainerName, storageContainerUrl and httpUrl properties are mutually exclusive, and one of them must be specified. This URL must be readable and listable from compute nodes. There are three ways to get such a URL for a container in Azure storage: include a Shared Access Signature (SAS) granting read and list permissions on the container, use a managed identity with read and list permissions, or set the ACL for the container to allow public access.
For multi-instance Tasks, the command line is executed as the primary Task, after the primary Task and all subtasks have finished executing the coordination command line. The command line does not run under a shell, and therefore cannot take advantage of shell features such as environment variable expansion. If you want to take advantage of such features, you should invoke the shell in the command line, for example, using cmd /c MyCommand in Windows or /bin/sh -c MyCommand in Linux. If the command line refers to file paths, it should use a relative path (relative to the Task working directory), or use the Batch provided environment variable.
Command will be passed as /bin/sh -c "command" by default.
The ID can contain any combination of alphanumeric characters including hyphens and underscores, and cannot contain more than 64 characters. The ID is case-preserving and case-insensitive (that is, you may not have two IDs within a Job that differ only by case). If not provided, a random UUID will be generated.
The display name need not be unique and can contain any Unicode characters up to a maximum length of 1024.
Default value is : /bin/sh
Default value is : ["-c"]
List of keys that will generate temporary directories.
In the command, you can use a special variable named outputDirs.key.
If you add an entry with ["myDir"], you can use the special variable in commands such as echo 1 >> {{ outputDirs.myDir }}/file1.txt and echo 2 >> {{ outputDirs.myDir }}/file2.txt, and both files will be uploaded to the internal storage. You can then use them in other tasks using {{ outputs.taskId.files['myDir/file1.txt'] }}.
List of keys that will generate temporary files.
In the command, you can use a special variable named outputFiles.key.
If you add an entry with ["first"], you can use the special variable echo 1 >> {{ outputFiles.first }} in this task, and reference this file in other tasks using {{ outputs.taskId.outputFiles.first }}.
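A minimal sketch combining both special variables inside a Create job task (the job and task IDs are illustrative, and connection properties are omitted):

```yaml
- id: create
  type: io.kestra.plugin.azure.batch.job.Create
  # endpoint, account, accessKey, and poolId omitted for brevity
  job:
    id: example-job
  tasks:
    - id: produce
      outputFiles:
        - first
      outputDirs:
        - myDir
      commands:
        - 'echo 1 >> {{ outputFiles.first }}'
        - 'echo 2 >> {{ outputDirs.myDir }}/file2.txt'
```

Per the descriptions above, downstream tasks could then reference {{ outputs.create.outputFiles.first }} and {{ outputs.create.files['myDir/file2.txt'] }}.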
The default is 1. A Task can only be scheduled to run on a compute node if the node has enough free scheduling slots available. For multi-instance Tasks, this must be 1.
For multi-instance Tasks, the resource files will only be downloaded to the Compute Node on which the primary Task is executed. There is a maximum size for the list of resource files. When the max size is exceeded, the request will fail and the response error code will be RequestEntityTooLarge. If this occurs, the collection of ResourceFiles must be reduced in size. This can be achieved using .zip files, Application Packages, or Docker Containers.
For multi-instance Tasks, the files will only be uploaded from the Compute Node on which the primary Task is executed.
The Batch service retries a Task if its exit code is nonzero. Note that this value specifically controls the number of retries for the Task executable due to a nonzero exit code. The Batch service will try the Task once, and may then retry up to this limit. For example, if the maximum retry count is 3, Batch tries the Task up to 4 times (one initial try and 3 retries). If the maximum retry count is 0, the Batch service does not retry the Task after the first attempt. If the maximum retry count is -1, the Batch service retries the Task without limit.
If the Task does not complete within the time limit, the Batch service terminates it. If this is not specified, there is no time limit on how long the Task may run.
After this time, the Batch service may delete the Task directory and all its contents. The default is 7 days, i.e. the Task directory will be retained for 7 days unless the Compute Node is removed or the Job is deleted.
This is the full Image reference, as would be specified to docker pull. If no tag is provided as part of the Image name, the tag :latest is used as a default.
These additional options are supplied as arguments to the docker create command, in addition to those controlled by the Batch Service.
The default is taskWorkingDirectory. Possible values include: taskWorkingDirectory, containerImageDefault.
##### Examples

```yaml
id: azure_batch_pool_resize
namespace: company.team

tasks:
  - id: resize
    type: io.kestra.plugin.azure.batch.pool.Resize
    poolId: "<your-pool-id>"
    targetDedicatedNodes: "12"
```
Default value is : false
Default value is : false
Default value is : false
Default value is : 0
Default value is : 0
1 nested properties
##### Examples

List Azure Active Directory users for the currently authenticated tenant.

```yaml
id: azure_cli
namespace: company.team

tasks:
  - id: az_cli
    type: io.kestra.plugin.azure.cli.AzCLI
    username: "azure_app_id"
    password: "{{ secret('AZURE_SERVICE_PRINCIPAL_PASSWORD') }}"
    tenant: "{{ secret('AZURE_TENANT_ID') }}"
    commands:
      - az ad user list
```

List all successfully provisioned VMs using Service Principal authentication.

```yaml
id: azure_cli
namespace: company.team

tasks:
  - id: az_cli
    type: io.kestra.plugin.azure.cli.AzCLI
    username: "azure_app_id"
    password: "{{ secret('AZURE_SERVICE_PRINCIPAL_PASSWORD') }}"
    tenant: "{{ secret('AZURE_TENANT_ID') }}"
    servicePrincipal: true
    commands:
      - az vm list --query "[?provisioningState=='Succeeded']"
```

Run a command without authentication.

```yaml
id: azure_cli
namespace: company.team

tasks:
  - id: az_cli
    type: io.kestra.plugin.azure.cli.AzCLI
    commands:
      - az --help
```

List supported regions for the current Azure subscription.

```yaml
id: azure_cli
namespace: company.team

tasks:
  - id: list_locations
    type: io.kestra.plugin.azure.cli.AzCLI
    tenant: "{{ secret('AZURE_TENANT_ID') }}"
    username: "{{ secret('AZURE_SERVICE_PRINCIPAL_CLIENT_ID') }}"
    password: "{{ secret('AZURE_SERVICE_PRINCIPAL_PASSWORD') }}"
    servicePrincipal: true
    commands:
      - az account list-locations --query "[].{Region:name}" -o table
```
Default value is : false
Default value is : mcr.microsoft.com/azure-cli
Default value is : false
Default value is : false
Must be a list of glob expressions relative to the current working directory, for example: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
Default value is : false
1 nested properties
Launch an Azure Data Factory pipeline from Kestra. Data Factory contains a series of interconnected systems that provide a complete end-to-end platform for data engineers.

##### Examples

```yaml
id: azure_datafactory_create_run
namespace: company.team

tasks:
  - id: create_run
    type: io.kestra.plugin.azure.datafactory.CreateRun
    factoryName: exampleFactoryName
    pipelineName: examplePipeline
    resourceGroupName: exampleResourceGroup
    subscriptionId: 12345678-1234-1234-1234-12345678abc
    tenantId: "{{ secret('DATAFACTORY_TENANT_ID') }}"
    clientId: "{{ secret('DATAFACTORY_CLIENT_ID') }}"
    clientSecret: "{{ secret('DATAFACTORY_CLIENT_SECRET') }}"
```
Default value is : false
Client ID of the Azure service principal. If you don't have a service principal, refer to the Azure documentation on creating a service principal with the Azure CLI.
Default value is : ""
Service principal client secret. The tenantId, clientId and clientSecret of the service principal are required for this credential to acquire an access token.
Default value is : ""
Default value is : false
Default value is : false
Default value is : "{}"
Your stored PEM certificate.
The tenantId, clientId and clientCertificate of the service principal are required for this credential to acquire an access token.
Default value is : ""
1 nested properties
##### Examples

Consume data events from Azure EventHubs.

```yaml
id: azure_eventhubs_consume_data_events
namespace: company.team

tasks:
  - id: consume_from_eventhub
    type: io.kestra.plugin.azure.eventhubs.Consume
    eventHubName: my_eventhub
    namespace: my_eventhub_namespace
    connectionString: "{{ secret('EVENTHUBS_CONNECTION') }}"
    bodyDeserializer: JSON
    consumerGroup: "$Default"
    checkpointStoreProperties:
      containerName: kestra
      connectionString: "{{ secret('BLOB_CONNECTION') }}"
```
Default value is : false
Default value is : STRING
Configs in key/value pairs.
Default value is : {}
Azure Event Hubs Checkpoint Store can be used for storing checkpoints while processing events from Azure Event Hubs.
Default value is : {}
Default value is : 5
Default value is : 500
Default value is : $Default
Default value is : false
Configs in key/value pairs.
Default value is : false
Default value is : 50
Default value is : 10
Default value is : 5
Default value is : EARLIEST
This string should only be the query parameters (with or without a leading '?') and not a full URL.
1 nested properties
##### Examples

Publish a file as events into Azure EventHubs.

```yaml
id: azure_eventhubs_send_events
namespace: company.team

inputs:
  - id: file
    type: FILE
    description: a CSV file with columns id, username, tweet, and timestamp

tasks:
  - id: read_csv_file
    type: io.kestra.plugin.serdes.csv.CsvToIon
    from: "{{ inputs.file }}"

  - id: transform_row_to_json
    type: io.kestra.plugin.scripts.nashorn.FileTransform
    from: "{{ outputs.read_csv_file.uri }}"
    script: |
      var result = {
        "body": {
          "username": row.username,
          "tweet": row.tweet
        }
      };
      row = result

  - id: send_to_eventhub
    type: io.kestra.plugin.azure.eventhubs.Produce
    from: "{{ outputs.transform_row_to_json.uri }}"
    eventHubName: my_eventhub
    namespace: my_event_hub_namespace
    connectionString: "{{ secret('EVENTHUBS_CONNECTION') }}"
    maxBatchSizeInBytes: 4096
    maxEventsPerBatch: 100
    bodySerializer: "JSON"
    bodyContentType: application/json
    eventProperties:
      source: kestra
```
Can be an internal storage URI, a map (i.e. a list of key-value pairs) or a list of maps. The following keys are supported: from, contentType, properties.
Default value is : false
The MIME type describing the data contained in event body allowing consumers to make informed decisions for inspecting and processing the event.
Default value is : STRING
Configs in key/value pairs.
Default value is : {}
Default value is : 5
Default value is : 500
Default value is : false
The event properties which may be used for passing metadata associated with the event body during Event Hubs operations.
Default value is : {}
Default value is : false
Default value is : 1000
Events with the same partitionKey are hashed and sent to the same partition. The provided partitionKey will be used for all the events sent by the Produce task.
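As a sketch, the partitionKey described above can be set on the Produce task so that all events sent by the task hash to the same partition. The values are illustrative, and the from entries follow the from/contentType/properties keys listed for this property:

```yaml
- id: send_to_eventhub
  type: io.kestra.plugin.azure.eventhubs.Produce
  eventHubName: my_eventhub
  namespace: my_eventhub_namespace
  connectionString: "{{ secret('EVENTHUBS_CONNECTION') }}"
  partitionKey: "user-42"   # every event from this task lands on the same partition
  from:
    - from: "event-1"
    - from: "event-2"
```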
This string should only be the query parameters (with or without a leading '?') and not a full URL.
1 nested properties
If you would like to consume multiple messages processed within a given time frame and process them in batch, you can use the io.kestra.plugin.azure.eventhubs.Trigger instead.

##### Examples

Trigger flow based on events received from Azure Event Hubs in real-time.

```yaml
id: azure_eventhubs_realtime_trigger
namespace: company.team

tasks:
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: Hello there! I received {{ trigger.body }} from Azure EventHubs!

triggers:
  - id: read_from_eventhub
    type: io.kestra.plugin.azure.eventhubs.RealtimeTrigger
    eventHubName: my_eventhub
    namespace: my_eventhub_namespace
    connectionString: "{{ secret('EVENTHUBS_CONNECTION') }}"
    bodyDeserializer: JSON
    consumerGroup: "$Default"
    checkpointStoreProperties:
      containerName: kestra
      connectionString: "{{ secret('BLOB_CONNECTION') }}"
```
Default value is : STRING
Configs in key/value pairs.
Default value is : {}
Azure Event Hubs Checkpoint Store can be used for storing checkpoints while processing events from Azure Event Hubs.
Default value is : {}
Default value is : 5
Default value is : 500
Default value is : $Default
Default value is : false
Configs in key/value pairs.
Default value is : false
Default value is : EARLIEST
This string should only be the query parameters (with or without a leading '?') and not a full URL.
1 nested properties
If you would like to consume each message from Azure Event Hubs in real-time and create one execution per message, you can use the io.kestra.plugin.azure.eventhubs.RealtimeTrigger instead.

##### Examples

Trigger flow based on events received from Azure Event Hubs in batch.

```yaml
id: azure_eventhubs_trigger
namespace: company.team

tasks:
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: Hello there! I received {{ trigger.eventsCount }} from Azure EventHubs!

triggers:
  - id: read_from_eventhub
    type: io.kestra.plugin.azure.eventhubs.Trigger
    interval: PT30S
    eventHubName: my_eventhub
    namespace: my_eventhub_namespace
    connectionString: "{{ secret('EVENTHUBS_CONNECTION') }}"
    bodyDeserializer: JSON
    consumerGroup: "$Default"
    checkpointStoreProperties:
      containerName: kestra
      connectionString: "{{ secret('BLOB_CONNECTION') }}"
```
Default value is : STRING
Configs in key/value pairs.
Default value is : {}
Azure Event Hubs Checkpoint Store can be used for storing checkpoints while processing events from Azure Event Hubs.
Default value is : {}
Default value is : 5
Default value is : 500
Default value is : $Default
Default value is : false
Configs in key/value pairs.
The interval between two consecutive polls of the schedule; this can avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the minimal interval must be at least PT30S. See ISO 8601 durations for more information on available interval values.
Default value is : 60
Default value is : false
Default value is : 50
Default value is : 10
Default value is : 5
Default value is : EARLIEST
This string should only be the query parameters (with or without a leading '?') and not a full URL.
1 nested properties
##### Examples

```yaml
id: azure_storage_blob_copy
namespace: company.team

tasks:
  - id: copy
    type: io.kestra.plugin.azure.storage.blob.Copy
    from:
      container: "my-bucket"
      key: "path/to/file"
    to:
      container: "my-bucket2"
      key: "path/to/file2"
```
Default value is : false
Default value is : false
Default value is : false
Default value is : false
This string should only be the query parameters (with or without a leading '?') and not a full URL.
1 nested properties
##### Examples

```yaml
id: azure_storage_blob_delete
namespace: company.team

tasks:
  - id: delete
    type: io.kestra.plugin.azure.storage.blob.Delete
    endpoint: "https://yourblob.blob.core.windows.net"
    connectionString: "DefaultEndpointsProtocol=...=="
    container: "mydata"
    name: "myblob"
```
Default value is : false
Default value is : false
Default value is : false
This string should only be the query parameters (with or without a leading '?') and not a full URL.
1 nested properties
##### Examples

```yaml
id: azure_storage_blob_delete_list
namespace: company.team

tasks:
  - id: delete_list
    type: io.kestra.plugin.azure.storage.blob.DeleteList
    endpoint: "https://yourblob.blob.core.windows.net"
    connectionString: "DefaultEndpointsProtocol=...=="
    container: "mydata"
    prefix: "sub-dir"
    delimiter: "/"
```
Default value is : false
Default value is : false
Default value is : false
Default value is : FILES
Default value is : false
For example:
regExp: .* to match all files
regExp: .*2020-01-0.\\.csv to match files from 01 to 09 of January ending with .csv
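For instance, assuming the same placeholder endpoint and container as the earlier examples, the regExp above can narrow a DeleteList task to January 2020 CSV files only:

```yaml
- id: delete_list
  type: io.kestra.plugin.azure.storage.blob.DeleteList
  endpoint: "https://yourblob.blob.core.windows.net"
  connectionString: "DefaultEndpointsProtocol=...=="
  container: "mydata"
  regExp: ".*2020-01-0.\\.csv"   # e.g. matches report-2020-01-05.csv
```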
This string should only be the query parameters (with or without a leading '?') and not a full URL.
1 nested properties
##### Examples

```yaml
id: azure_storage_blob_download
namespace: company.team

tasks:
  - id: download
    type: io.kestra.plugin.azure.storage.blob.Download
    endpoint: "https://yourblob.blob.core.windows.net"
    connectionString: "DefaultEndpointsProtocol=...=="
    container: "mydata"
    name: "myblob"
```
Default value is : false
Default value is : false
Default value is : false
This string should only be the query parameters (with or without a leading '?') and not a full URL.
1 nested properties
##### Examples

```yaml
id: azure_storage_blob_downloads
namespace: company.team

tasks:
  - id: downloads
    type: io.kestra.plugin.azure.storage.blob.Downloads
    endpoint: "https://yourblob.blob.core.windows.net"
    connectionString: "DefaultEndpointsProtocol=...=="
    container: "mydata"
    prefix: "sub-dir"
    delimiter: "/"
```
Default value is : false
Default value is : false
Default value is : FILES
Default value is : false
For example:
regExp: .* to match all files
regExp: .*2020-01-0.\\.csv to match files from 01 to 09 of January ending with .csv
This string should only be the query parameters (with or without a leading '?') and not a full URL.
1 nested properties
##### Examples

```yaml
id: azure_storage_blob_list
namespace: company.team

tasks:
  - id: list
    type: io.kestra.plugin.azure.storage.blob.List
    endpoint: "https://yourblob.blob.core.windows.net"
    connectionString: "DefaultEndpointsProtocol=...=="
    container: "mydata"
    prefix: "sub-dir"
    delimiter: "/"
```
Default value is : false
Default value is : false
Default value is : FILES
Default value is : false
For example:
regExp: .* to match all files
regExp: .*2020-01-0.\\.csv to match files from 01 to 09 of January ending with .csv
This string should only be the query parameters (with or without a leading '?') and not a full URL.
1 nested properties
This trigger polls Azure Blob Storage at every interval. You can search for all files in a container or directory with from, or filter the files with a regExp. The detection is atomic: internally we list the files and interact only with the files listed.
Once a file is detected, we download it to internal storage and process it with the declared action in order to move or delete the file from the container (to avoid double detection on a new poll).

##### Examples

Wait for a list of files on an Azure Blob Storage bucket, and then iterate through the files.

```yaml
id: storage_listen
namespace: company.team

tasks:
  - id: each
    type: io.kestra.plugin.core.flow.EachSequential
    tasks:
      - id: return
        type: io.kestra.plugin.core.debug.Return
        format: "{{ taskrun.value }}"
    value: "{{ trigger.blobs | jq('.[].uri') }}"

triggers:
  - id: watch
    type: io.kestra.plugin.azure.storage.blob.Trigger
    interval: PT5M
    endpoint: "https://yourblob.blob.core.windows.net"
    connectionString: "DefaultEndpointsProtocol=...=="
    container: "mydata"
    prefix: "trigger/storage-listen"
    action: MOVE
    moveTo:
      container: mydata
      name: archive
```

Wait for a list of files on an Azure Blob Storage bucket and iterate through the files. Delete files manually after processing to prevent infinite triggering.

```yaml
id: storage_listen
namespace: company.team

tasks:
  - id: each
    type: io.kestra.plugin.core.flow.EachSequential
    tasks:
      - id: return
        type: io.kestra.plugin.core.debug.Return
        format: "{{ taskrun.value }}"
      - id: delete
        type: io.kestra.plugin.azure.storage.blob.Delete
        endpoint: "https://yourblob.blob.core.windows.net"
        connectionString: "DefaultEndpointsProtocol=...=="
        container: "mydata"
        name: "{{ taskrun.value }}"
    value: "{{ trigger.blobs | jq('.[].name') }}"

triggers:
  - id: watch
    type: io.kestra.plugin.azure.storage.blob.Trigger
    endpoint: "https://yourblob.blob.core.windows.net"
    connectionString: "DefaultEndpointsProtocol=...=="
    container: "mydata"
    prefix: "trigger/storage_listen"
    action: MOVE
    moveTo:
      container: mydata
      name: archive
```
Default value is : false
Default value is : FILES
The interval between two consecutive polls of the schedule; this can avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the minimal interval must be at least PT30S. See ISO 8601 durations for more information on available interval values.
Default value is : 60
Default value is : false
For example:
regExp: .* to match all files
regExp: .*2020-01-0.\\.csv to match files from 01 to 09 of January ending with .csv
This string should only be the query parameters (with or without a leading '?') and not a full URL.
1 nested properties
##### Examples

```yaml
id: azure_storage_blob_upload
namespace: company.team

inputs:
  - id: myfile
    type: FILE

tasks:
  - id: upload
    type: io.kestra.plugin.azure.storage.blob.Upload
    endpoint: "https://yourblob.blob.core.windows.net"
    connectionString: "DefaultEndpointsProtocol=...=="
    container: "mydata"
    from: "{{ inputs.myfile }}"
    name: "myblob"
```
The operation is allowed on a page blob in a premium Storage Account or a block blob in a blob Storage Account or GPV2 Account. A premium page blob's tier determines the allowed size, IOPS, and bandwidth of the blob. A block blob's tier determines the Hot/Cool/Archive storage type. This does not update the blob's etag.
Default value is : false
Default value is : false
2 nested properties
NOTE: Blob Versioning must be enabled on your storage account and the blob must be in a container with immutable storage with versioning enabled to call this API.
Default value is : false
This string should only be the query parameters (with or without a leading '?') and not a full URL.
1 nested properties
##### Examples

```yaml
id: azure_storage_table_bulk
namespace: company.team

tasks:
  - id: bulk
    type: io.kestra.plugin.azure.storage.table.Bulk
    endpoint: "https://yourstorageaccount.blob.core.windows.net"
    connectionString: "DefaultEndpointsProtocol=...=="
    table: "table_name"
    from:
      - partitionKey: "color"
        rowKey: "green"
        type: "UPSERT_MERGE"
        properties:
          "code": "00FF00"
```
Can be an internal storage URI or a list of maps in the format partitionKey, rowKey, type, properties, as shown in the example.
Default value is : false
Default value is : UPSERT_REPLACE
Default value is : false
Default value is : false
This string should only be the query parameters (with or without a leading '?') and not a full URL.
1 nested properties
##### Examples

```yaml
id: azure_storage_table_delete
namespace: company.team

tasks:
  - id: delete
    type: io.kestra.plugin.azure.storage.table.Delete
    endpoint: "https://yourstorageaccount.table.core.windows.net"
    connectionString: "DefaultEndpointsProtocol=...=="
    table: "table_name"
    partitionKey: "color"
    rowKey: "green"
```
Default value is : false
Default value is : false
Default value is : false
This string should only be the query parameters (with or without a leading '?') and not a full URL.
1 nested properties
##### Examples

```yaml
id: azure_storage_table_get
namespace: company.team

tasks:
  - id: get
    type: io.kestra.plugin.azure.storage.table.Get
    endpoint: "https://yourstorageaccount.table.core.windows.net"
    connectionString: "DefaultEndpointsProtocol=...=="
    table: "table_name"
    partitionKey: "color"
    rowKey: "green"
```
Default value is : false
Default value is : false
Default value is : false
This string should only be the query parameters (with or without a leading '?') and not a full URL.
1 nested properties
If the filter parameter in the options is set, only entities matching the filter will be returned.
If the select parameter is set, only the properties included in the select parameter will be returned for each entity.
If the top parameter is set, the maximum number of returned entities per page will be limited to that value.##### Examples
id: azure_storage_table_list
namespace: company.team
tasks:
- id: list
type: io.kestra.plugin.azure.storage.table.List
endpoint: "https://yourstorageaccount.table.core.windows.net"
connectionString: "DefaultEndpointsProtocol=...=="
table: "table_name"
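The filter, select, and top parameters described above are set through the task's nested options property. A minimal sketch, assuming the nested property is named options with filter, select, and top keys (the OData filter string and property names are illustrative):

```yaml
id: azure_storage_table_list_filtered
namespace: company.team
tasks:
  - id: list
    type: io.kestra.plugin.azure.storage.table.List
    endpoint: "https://yourstorageaccount.table.core.windows.net"
    connectionString: "DefaultEndpointsProtocol=...=="
    table: "table_name"
    options:
      filter: "PartitionKey eq 'color'"   # illustrative OData filter string
      select: "RowKey,code"               # only return these properties
      top: 100                            # at most 100 entities per page
```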
Default value is : false
Default value is : false
You can specify the filter using Filter Strings.
Default value is : false
This string should only be the query parameters (with or without a leading '?') and not a full URL.
1 nested properties
It must be the ZIP archive containing the secure bundle encoded in base64. Use it only when you are not using the proxy address.
Default value is : 9042
Examples
Send a CQL query to an Astra DB.
id: cassandra_astradb_query
namespace: company.team
tasks:
- id: query
type: io.kestra.plugin.cassandra.astradb.Query
session:
secureBundle: /path/to/secureBundle.zip
keyspace: astradb_keyspace
clientId: astradb_clientId
clientSecret: astradb_clientSecret
cql: SELECT * FROM CQL_TABLE
fetch: true
Default value is : false
Default value is : false
Default value is : false
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
Wait for a CQL query to return results, and then iterate through rows.
id: astra_trigger
namespace: company.team
tasks:
- id: each
type: io.kestra.core.tasks.flows.EachSequential
tasks:
- id: return
type: io.kestra.core.tasks.debugs.Return
format: "{{ json(taskrun.value) }}"
value: "{{ trigger.rows }}"
triggers:
- id: watch
type: io.kestra.plugin.cassandra.astradb.Trigger
interval: "PT5M"
session:
secureBundle: /path/to/secureBundle.zip
keyspace: astradb_keyspace
clientId: astradb_clientId
clientSecret: astradb_clientSecret
cql: "SELECT * FROM CQL_KEYSPACE.CQL_TABLE"
fetch: true
Default value is : false
Default value is : false
Default value is : false
The interval between two polls of the schedule; a longer interval avoids overloading the remote system with too many calls. For most triggers that depend on external systems, the minimal interval should be at least PT30S. See ISO 8601 durations for more information on available interval values.
Default value is : 60.000000000
Default value is : false
Default value is : false
1 nested properties
It will be sent in the STARTUP protocol message, under the key APPLICATION_NAME, for each new connection established by the driver. Currently, this information is used by Insights monitoring (if the target cluster does not support Insights, the entry will be ignored by the server).
Default value is : 9042
In the context of Cloud, this is the string representation of the host ID.
Examples
Send a CQL query to a Cassandra database.
id: cassandra_standard_query
namespace: company.team
tasks:
- id: query
type: io.kestra.plugin.cassandra.standard.Query
session:
endpoints:
- hostname: localhost
secureConnection:
truststorePath: path to .crt file
truststorePassword: truststore_password
keystorePath: path to .jks file
keystorePassword: keystore_password
username: cassandra_user
password: cassandra_passwd
cql: SELECT * FROM CQL_KEYSPACE.CQL_TABLE
fetch: true
Default value is : false
Default value is : false
Default value is : false
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
Wait for a CQL query to return results, and then iterate through rows.
id: cassandra_trigger
namespace: io.kestra.tests
tasks:
- id: each
type: io.kestra.core.tasks.flows.EachSequential
tasks:
- id: return
type: io.kestra.core.tasks.debugs.Return
format: "{{ json(taskrun.value) }}"
value: "{{ trigger.rows }}"
triggers:
- id: watch
type: io.kestra.plugin.cassandra.standard.Trigger
interval: "PT5M"
session:
endpoints:
- hostname: localhost
username: cassandra_user
password: cassandra_passwd
cql: "SELECT * FROM CQL_KEYSPACE.CQL_TABLE"
fetch: true
Default value is : false
Default value is : false
Default value is : false
The interval between two polls of the schedule; a longer interval avoids overloading the remote system with too many calls. For most triggers that depend on external systems, the minimal interval should be at least PT30S. See ISO 8601 durations for more information on available interval values.
Default value is : 60.000000000
Default value is : false
Default value is : false
1 nested properties
Examples
Run a CloudQuery sync from the CLI. You need an API key to download plugins. You can add the API key as an environment variable called CLOUDQUERY_API_KEY.
id: cloudquery_sync_cli
namespace: company.team
tasks:
- id: hn_to_duckdb
type: io.kestra.plugin.cloudquery.CloudQueryCLI
env:
CLOUDQUERY_API_KEY: "{{ secret('CLOUDQUERY_API_KEY') }}"
inputFiles:
config.yml: |
kind: source
spec:
name: hackernews
path: cloudquery/hackernews
version: v3.0.13
tables: ["*"]
backend_options:
table_name: cq_cursor
connection: "@@plugins.duckdb.connection"
destinations:
- "duckdb"
spec:
item_concurrency: 100
start_time: "{{ now() | dateAdd(-1, 'DAYS') }}"
---
kind: destination
spec:
name: duckdb
path: cloudquery/duckdb
version: v4.2.10
write_mode: overwrite-delete-stale
spec:
connection_string: hn.db
commands:
- cloudquery sync config.yml --log-console
Default value is : false
Default value is : ghcr.io/cloudquery/cloudquery:latest
Default value is : false
Default value is : false
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
1 nested properties
Examples
Start a CloudQuery sync based on a YAML configuration. You need an API key to download plugins. You can add the API key as an environment variable called CLOUDQUERY_API_KEY.
id: cloudquery_sync
namespace: company.team
tasks:
- id: hn_to_duckdb
type: io.kestra.plugin.cloudquery.Sync
env:
CLOUDQUERY_API_KEY: "{{ secret('CLOUDQUERY_API_KEY') }}"
incremental: false
configs:
- kind: source
spec:
name: hackernews
path: cloudquery/hackernews
version: v3.0.13
tables: ["*"]
destinations: ["duckdb"]
spec:
item_concurrency: 100
start_time: "{{ now() | dateAdd(-1, 'DAYS') }}"
- kind: destination
spec:
name: duckdb
path: cloudquery/duckdb
version: v4.2.10
write_mode: overwrite-delete-stale
spec:
connection_string: hn.db
Start a CloudQuery sync based on one or more configuration files.
id: cloudquery_sync
namespace: company.team
tasks:
- id: hn_to_duckdb
type: io.kestra.plugin.cloudquery.Sync
incremental: false
env:
AWS_ACCESS_KEY_ID: "{{ secret('AWS_ACCESS_KEY_ID') }}"
AWS_SECRET_ACCESS_KEY: "{{ secret('AWS_SECRET_ACCESS_KEY') }}"
AWS_DEFAULT_REGION: "{{ secret('AWS_DEFAULT_REGION') }}"
CLOUDQUERY_API_KEY: "{{ secret('CLOUDQUERY_API_KEY') }}"
PG_CONNECTION_STRING: "postgresql://postgres:{{ secret('DB_PASSWORD') }}@host.docker.internal:5432/demo?sslmode=disable"
configs:
- sources.yml
- destination.yml
A list of CloudQuery configurations or files containing CloudQuery configurations.
Default value is : false
Default value is : ghcr.io/cloudquery/cloudquery:latest
Default value is : false
Kestra can automatically add a backend option to your sources and store the incremental indexes in the KV Store. Use this boolean to activate this option.
Default value is : false
Default value is : false
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
1 nested properties
Examples
id: archive_compress
namespace: company.team
inputs:
- id: file
type: FILE
tasks:
- id: "archive_compress"
type: "io.kestra.plugin.compress.ArchiveCompress"
from:
myfile.txt: "{{ inputs.file }}"
algorithm: ZIP
id: archive_compress
namespace: company.team
tasks:
- id: products_download
type: io.kestra.plugin.core.http.Download
uri: "https://raw.githubusercontent.com/kestra-io/datasets/main/csv/products.csv"
- id: orders_download
type: io.kestra.plugin.core.http.Download
uri: "https://raw.githubusercontent.com/kestra-io/datasets/main/csv/orders.csv"
- id: archive_compress
type: "io.kestra.plugin.compress.ArchiveCompress"
from:
products.csv: "{{ outputs.products_download.uri }}"
orders.csv: "{{ outputs.orders_download.uri }}"
algorithm: TAR
compression: GZIP
The key must be a valid path in the archive and can contain / to represent the directory, the value must be a Kestra internal storage URI.
The value can also be a JSON containing multiple keys/values.
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
id: archive_decompress
namespace: company.team
inputs:
- id: file
description: Compressed file
type: FILE
tasks:
- id: archive_decompress
type: io.kestra.plugin.compress.ArchiveDecompress
from: "{{ inputs.file }}"
algorithm: ZIP
compression: GZIP
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
id: file_compress
namespace: company.team
inputs:
- id: file
description: File to be compressed
type: FILE
tasks:
- id: compress
type: io.kestra.plugin.compress.FileCompress
from: "{{ inputs.file }}"
compression: Z
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
id: file_decompress
namespace: company.team
inputs:
- id: file
description: File to be decompressed
type: FILE
tasks:
- id: decompress
type: io.kestra.plugin.compress.FileDecompress
from: "{{ inputs.file }}"
compression: Z
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
# This will evaluate to true when the trigger date falls after the `after` date.
- conditions:
- type: io.kestra.plugin.core.condition.DateTimeBetweenCondition
date: "{{ trigger.date }}"
after: "2024-01-01T08:30:00Z"
# This will evaluate to true when the trigger date falls between the `before` and `after` dates.
- conditions:
- type: io.kestra.plugin.core.condition.DateTimeBetweenCondition
date: "{{ trigger.date }}"
before: "2024-01-01T08:30:00Z"
after: "2024-12-31T23:30:00Z"
Must be a valid ISO 8601 datetime with the zone identifier (use 'Z' for the default zone identifier).
Must be a valid ISO 8601 datetime with the zone identifier (use 'Z' for the default zone identifier).
Can be any variable or any valid ISO 8601 datetime. By default, it will use the trigger date.
Default value is : "{{ trigger.date }}"
Examples
- conditions:
- type: io.kestra.plugin.core.condition.DayWeekCondition
dayOfWeek: "MONDAY"
Can be any variable or any valid ISO 8601 datetime. By default, it will use the trigger date.
Default value is : "{{ trigger.date }}"
Examples
- conditions:
- type: io.kestra.plugin.core.condition.DayWeekInMonthCondition
dayOfWeek: MONDAY
dayInMonth: FIRST
Can be any variable or any valid ISO 8601 datetime. By default, it will use the trigger date.
Default value is : "{{ trigger.date }}"
Examples
- conditions:
- type: io.kestra.plugin.core.condition.ExecutionFlowCondition
namespace: company.team
flowId: my-current-flow
Examples
- conditions:
- type: io.kestra.plugin.core.condition.ExecutionLabelsCondition
labels:
owner: john.doe
List of labels to match in the execution.
Examples
- conditions:
- type: io.kestra.plugin.core.condition.ExecutionNamespaceCondition
namespace: company.team
comparison: PREFIX
Only used when comparison is not set
Default value is : false
The condition returns false if the execution has no output. If the result is an empty string, a space, or false, the condition will also be considered false.
Examples
A condition that will return true for an output matching a specific value.
- conditions:
- type: io.kestra.plugin.core.condition.ExecutionOutputsCondition
expression: "{{ trigger.outputs.status_code == '200' }}"
Examples
- conditions:
- type: io.kestra.plugin.core.condition.ExecutionStatusCondition
in:
- SUCCESS
notIn:
- FAILED
If the result is an empty string, a string containing only spaces, or false, the condition will be considered false.
Examples
A condition that will return false for a missing variable.
- conditions:
- type: io.kestra.plugin.core.condition.ExpressionCondition
expression: "{{ unknown is defined }}"
Examples
- conditions:
- type: io.kestra.plugin.core.condition.FlowCondition
namespace: company.team
flowId: my-current-flow
Use io.kestra.plugin.core.condition.ExecutionNamespaceCondition instead.
Examples
- conditions:
- type: io.kestra.plugin.core.condition.FlowNamespaceCondition
namespace: io.kestra.tests
prefix: true
Default value is : false
Examples
- conditions:
- type: io.kestra.plugin.core.condition.HasRetryAttemptCondition
in:
- KILLED
Trigger when all the flows are successfully executed for the first time during the window duration.
Examples
A flow that waits for two flows to run successfully within a day.
triggers:
- id: multiple-listen-flow
type: io.kestra.plugin.core.trigger.Flow
conditions:
- type: io.kestra.plugin.core.condition.ExecutionStatusCondition
in:
- SUCCESS
- id: multiple
type: io.kestra.plugin.core.condition.MultipleCondition
window: P1D
windowAdvance: P0D
conditions:
flow-a:
type: io.kestra.plugin.core.condition.ExecutionFlowCondition
namespace: io.kestra.demo
flowId: multiplecondition-flow-a
flow-b:
type: io.kestra.plugin.core.condition.ExecutionFlowCondition
namespace: io.kestra.demo
flowId: multiplecondition-flow-b
The key must be unique for a trigger since it will be used to store the previous result.
See ISO 8601 durations for more information on available duration values. The start of the window is always based on midnight unless you set the windowAdvance parameter. For example, with a 10-minute (PT10M) window, the first window spans 00:00 to 00:10, and a new window starts every 10 minutes.
Allows specifying the start hour of the window. For example, with a 6-hour window (window=PT6H), the check is by default done between 00:00 and 06:00, 06:00 and 12:00, 12:00 and 18:00, and 18:00 and 00:00. To check the window between 03:00 and 09:00, 09:00 and 15:00, 15:00 and 21:00, and 21:00 and 03:00, shift the window by 3 hours by setting windowAdvance: PT3H.
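A sketch of the shifted 6-hour window using a Flow trigger with MultipleCondition (the trigger, flow, and namespace names are illustrative):

```yaml
triggers:
  - id: shifted_window
    type: io.kestra.plugin.core.trigger.Flow
    conditions:
      - id: multiple
        type: io.kestra.plugin.core.condition.MultipleCondition
        window: PT6H
        windowAdvance: PT3H   # windows become 03:00-09:00, 09:00-15:00, ...
        conditions:
          flow-a:
            type: io.kestra.plugin.core.condition.ExecutionFlowCondition
            namespace: company.team
            flowId: flow-a
```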
Examples
- conditions:
- type: io.kestra.plugin.core.condition.NotCondition
conditions:
- type: io.kestra.plugin.core.condition.DateBetweenCondition
after: "2013-09-08T16:19:12"
If any condition is true, it will prevent the event's execution.
Examples
- conditions:
- type: io.kestra.plugin.core.condition.OrCondition
conditions:
- type: io.kestra.plugin.core.condition.DayWeekCondition
dayOfWeek: "MONDAY"
- type: io.kestra.plugin.core.condition.DayWeekCondition
dayOfWeek: "SUNDAY"
If any condition is true, it will allow the event's execution.
Examples
Condition to allow events on public holidays.
- conditions:
- type: io.kestra.plugin.core.condition.PublicHolidayCondition
country: FR
Conditions to allow events on work days.
- conditions:
- type: io.kestra.plugin.core.condition.NotCondition
conditions:
- type: io.kestra.plugin.core.condition.PublicHolidayCondition
country: FR
- type: io.kestra.plugin.core.condition.WeekendCondition
It uses the Jollyday library for the public holiday calendar, which supports more than 70 countries.
Can be any variable or any valid ISO 8601 datetime. By default, it will use the trigger date.
Default value is : "{{ trigger.date }}"
It uses the Jollyday library for the public holiday calendar, which supports more than 70 countries.
Examples
- conditions:
- type: io.kestra.plugin.core.condition.TimeBetweenCondition
after: "16:19:12+02:00"
Must be a valid ISO 8601 time with offset.
Must be a valid ISO 8601 time with offset.
Can be any variable or any valid ISO 8601 time. By default, it will use the trigger date.
Default value is : "{{ trigger.date }}"
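A sketch of the TimeBetweenCondition with both bounds, restricting events to a daytime range (the times are illustrative):

```yaml
- conditions:
    - type: io.kestra.plugin.core.condition.TimeBetweenCondition
      after: "08:00:00+02:00"    # event time must be after 08:00
      before: "18:00:00+02:00"   # and before 18:00
```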
Examples
- conditions:
- type: io.kestra.plugin.core.condition.WeekendCondition
Can be any variable or any valid ISO 8601 datetime. By default, it will use the trigger date.
Default value is : "{{ trigger.date }}"
This task is deprecated; please use the io.kestra.plugin.core.log.Log task instead.
Examples
id: echo_flow
namespace: company.team
tasks:
- id: echo
type: io.kestra.plugin.core.debug.Echo
level: WARN
format: "{{ task.id }} > {{ taskrun.startDate }}"
Default value is : false
Default value is : false
Default value is : INFO
Default value is : false
1 nested properties
This task is mostly useful for troubleshooting.
It allows you to return some templated functions, inputs or outputs.
Examples
id: return_flow
namespace: company.team
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ task.id }} > {{ taskrun.startDate }}"
Default value is : false
Default value is : false
Default value is : false
1 nested properties
This can be used to send an alert if a condition about execution counts is met.
Examples
Send a Slack notification if there are no executions for a flow over the last 24 hours.
id: executions_count
namespace: company.team
tasks:
- id: counts
type: io.kestra.plugin.core.execution.Counts
expression: "{{ count == 0 }}"
flows:
- namespace: company.team
flowId: logs
startDate: "{{ now() | dateAdd(-1, 'DAYS') }}"
- id: each_parallel
type: io.kestra.plugin.core.flow.EachParallel
tasks:
- id: slack_incoming_webhook
type: io.kestra.plugin.notifications.slack.SlackIncomingWebhook
payload: |
{
"channel": "#run-channel",
"text": ":warning: Flow `{{ jq taskrun.value '.namespace' true }}`.`{{ jq taskrun.value '.flowId' true }}` has no execution for last 24h!"
}
url: "https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX"
value: "{{ jq outputs.counts.results '. | select(. != null) | .[]' }}"
triggers:
- id: schedule
type: io.kestra.plugin.core.trigger.Schedule
backfill: {}
cron: "0 4 * * *"
The expression must return true in order to keep the current line.
Some examples:
- {{ eq count 0 }}: no execution found
- {{ gte count 5 }}: more than 5 executions
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Used to fail the execution, for example, on a switch branch or based on a condition evaluated against the execution context.
Examples
Fail on a switch branch
id: fail_on_switch
namespace: company.team
inputs:
- id: param
type: STRING
required: true
tasks:
- id: switch
type: io.kestra.plugin.core.flow.Switch
value: "{{inputs.param}}"
cases:
case1:
- id: case1
type: io.kestra.plugin.core.log.Log
message: Case 1
case2:
- id: case2
type: io.kestra.plugin.core.log.Log
message: Case 2
notexist:
- id: fail
type: io.kestra.plugin.core.execution.Fail
default:
- id: default
type: io.kestra.plugin.core.log.Log
message: default
Fail on a condition
id: fail_on_condition
namespace: company.team
inputs:
- id: param
type: STRING
required: true
tasks:
- id: before
type: io.kestra.plugin.core.debug.Echo
format: I'm before the fail on condition
- id: fail
type: io.kestra.plugin.core.execution.Fail
condition: '{{ inputs.param == "fail" }}'
- id: after
type: io.kestra.plugin.core.debug.Echo
format: I'm after the fail on condition
Default value is : false
Boolean coercion allows 0, -0, and '' to coerce to false; all other values coerce to true.
Default value is : false
Default value is : Task failure
Default value is : false
1 nested properties
Examples
Add labels based on a webhook payload
id: webhook_based_labels
namespace: company.team
tasks:
- id: update_labels_with_map
type: io.kestra.plugin.core.execution.Labels
labels:
customerId: "{{ trigger.body.customerId }}"
- id: by_list
type: io.kestra.plugin.core.execution.Labels
labels:
- key: order_id
value: "{{ trigger.body.orderId }}"
- key: order_type
value: "{{ trigger.body.orderType }}"
triggers:
- id: webhook
key: order_webhook
type: io.kestra.plugin.core.trigger.Webhook
conditions:
- type: io.kestra.plugin.core.condition.ExpressionCondition
expression: "{{ trigger.body.customerId is defined and trigger.body.orderId is defined and trigger.body.orderType is defined }}"
The value should result in a list of labels or a labelKey:labelValue map
Default value is : false
Default value is : false
Default value is : false
1 nested properties
This task can be used to purge flow execution data for all flows, for a specific namespace, or for a specific flow.
Examples
Purge all flow execution data for flows that ended more than one month ago.
endDate: "{{ now() | dateAdd(-1, 'MONTHS') }}"
states:
- KILLED
- FAILED
- WARNING
- SUCCESS
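A complete flow using the properties above might look like the sketch below. The task type io.kestra.plugin.core.execution.PurgeExecutions and the daily schedule are assumptions; verify the type name against your Kestra version.

```yaml
id: purge_executions
namespace: company.team
tasks:
  - id: purge
    type: io.kestra.plugin.core.execution.PurgeExecutions   # assumed type name
    endDate: "{{ now() | dateAdd(-1, 'MONTHS') }}"
    states:
      - KILLED
      - FAILED
      - WARNING
      - SUCCESS
triggers:
  - id: daily
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 1 * * *"   # run the purge every day at 01:00
```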
All data of flows executed before this date will be purged.
Default value is : false
Default value is : false
You need to provide the namespace property if you want to purge a specific flow.
Default value is : false
If flowId isn't provided, this is a namespace prefix; otherwise, it is the namespace of the flow.
Default value is : true
This will only purge logs from executions, not from triggers, and it will do so execution by execution.
The io.kestra.plugin.core.log.PurgeLogs task is a better fit to purge logs as it will purge logs in bulk, and will also purge logs not tied to an execution like trigger logs.
Default value is : true
Default value is : true
Default value is : true
All data of flows executed after this date will be purged.
If not set, executions for any states will be purged.
1 nested properties
Examples
executionId: "{{ trigger.executionId }}"
Default value is : false
Default value is : false
If you explicitly define an executionId, Kestra will use that specific ID.
If the namespace and flowId properties are set, Kestra will look for a paused execution of the corresponding flow.
If executionId is not set, the task will use the ID of the current execution.
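A sketch of a flow that resumes paused executions, assuming the task type io.kestra.plugin.core.execution.Resume and a Flow trigger listening for PAUSED executions (both names should be verified against your Kestra version):

```yaml
id: resume_paused
namespace: company.team
tasks:
  - id: resume
    type: io.kestra.plugin.core.execution.Resume   # assumed type name
    executionId: "{{ trigger.executionId }}"       # ID provided by the Flow trigger
triggers:
  - id: paused_flow
    type: io.kestra.plugin.core.trigger.Flow
    conditions:
      - type: io.kestra.plugin.core.condition.ExecutionStatusCondition
        in:
          - PAUSED
```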
Default value is : false
1 nested properties
If any child task of the AllowFailure task fails, the flow will stop executing this block of tasks (i.e. the remaining tasks in the AllowFailure block will no longer be executed), but the execution of the tasks following the AllowFailure task will continue.
Examples
id: allow_failure
namespace: company.team
tasks:
- id: sequential
type: io.kestra.plugin.core.flow.AllowFailure
tasks:
- id: ko
type: io.kestra.plugin.scripts.shell.Commands
commands:
- 'exit 1'
- id: last
type: io.kestra.plugin.core.debug.Return
format: "{{ task.id }} > {{ taskrun.startDate }}"
Default value is : false
Default value is : false
Default value is : false
1 nested properties
List your tasks and their dependencies, and Kestra will figure out the execution sequence. Each task can only depend on other tasks from the DAG task. For technical reasons, low-code interaction via UI forms is disabled for now when using this task.
Examples
Run a series of tasks for which the execution order is defined by their upstream dependencies.
id: dag_flow
namespace: company.team
tasks:
- id: dag
type: io.kestra.plugin.core.flow.Dag
tasks:
- task:
id: task1
type: io.kestra.plugin.core.log.Log
message: task 1
- task:
id: task2
type: io.kestra.plugin.core.log.Log
message: task 2
dependsOn:
- task1
- task:
id: task3
type: io.kestra.plugin.core.log.Log
message: task 3
dependsOn:
- task1
- task:
id: task4
type: io.kestra.plugin.core.log.Log
message: task 4
dependsOn:
- task2
- task:
id: task5
type: io.kestra.plugin.core.log.Log
message: task 5
dependsOn:
- task4
- task3
Default value is : false
If the value is 0, no concurrency limit exists for the tasks in a DAG and all tasks that can run in parallel will start at the same time.
Default value is : 0
Default value is : false
Default value is : false
1 nested properties
This task is deprecated, please use the io.kestra.plugin.core.flow.ForEach task instead.
The list of tasks will be executed for each item in parallel. The value must be a valid JSON string representing an array, e.g. a list of strings ["value1", "value2"] or a list of dictionaries [{"key": "value1"}, {"key": "value2"}].
You can access the current iteration value using the variable {{ taskrun.value }}.
The task list will be executed in parallel for each item. For example, if you have a list with 3 elements and 2 tasks defined in the list of tasks, all 6 tasks will be computed in parallel without any order guarantee.
If you want to execute a group of sequential tasks for each value in parallel, you can wrap the list of tasks with the Sequential task.
If your list of values is large, you can limit the number of concurrent tasks using the concurrent property.
We highly recommend triggering a subflow for each value (e.g. using the ForEachItem task) instead of specifying many tasks wrapped in a Sequential task. This allows better scalability and modularity. Check the flow best practices documentation for more details.
Examples
id: each_parallel
namespace: company.team
tasks:
- id: each_parallel
type: io.kestra.plugin.core.flow.EachParallel
value: '["value 1", "value 2", "value 3"]'
tasks:
- id: each_value
type: io.kestra.plugin.core.debug.Return
format: "{{ task.id }} with current value '{{ taskrun.value }}'"
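The concurrent property mentioned above limits how many iterations run at once; a minimal sketch (the flow and task ids are illustrative):

```yaml
id: each_parallel_limited
namespace: company.team
tasks:
  - id: each
    type: io.kestra.plugin.core.flow.EachParallel
    value: '["v1", "v2", "v3", "v4"]'
    concurrent: 2   # at most two iterations run at the same time
    tasks:
      - id: log
        type: io.kestra.plugin.core.log.Log
        message: "{{ taskrun.value }}"
```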
Create a file for each value in parallel, then process all files in the next task. Note how the inputFiles property uses a jq expression with a map function to extract the paths of all files processed in parallel and pass them into the next task's working directory.
id: parallel_script
namespace: company.team
tasks:
- id: each
type: io.kestra.plugin.core.flow.EachParallel
value: "{{ range(1, 9) }}"
tasks:
- id: script
type: io.kestra.plugin.scripts.shell.Script
outputFiles:
- "out/*.txt"
script: |
mkdir out
echo "{{ taskrun.value }}" > out/file_{{ taskrun.value }}.txt
- id: process_all_files
type: io.kestra.plugin.scripts.shell.Script
inputFiles: "{{ outputs.script | jq('map(.outputFiles) | add') | first }}"
script: |
ls -h out/
Run a group of tasks for each value in parallel.
id: parallel_task_groups
namespace: company.team
tasks:
- id: for_each
type: io.kestra.plugin.core.flow.EachParallel
value: ["value 1", "value 2", "value 3"]
tasks:
- id: group
type: io.kestra.plugin.core.flow.Sequential
tasks:
- id: task1
type: io.kestra.plugin.scripts.shell.Commands
commands:
- echo "{{task.id}} > {{ parents[0].taskrun.value }}"
- sleep 1
- id: task2
type: io.kestra.plugin.scripts.shell.Commands
commands:
- echo "{{task.id}} > {{ parents[0].taskrun.value }}"
- sleep 1
The value can be passed as a string, a list of strings, or a list of objects.
Default value is : false
If the value is 0, no limit exists and all the tasks will start at the same time.
Default value is : 0
Default value is : false
Default value is : false
1 nested properties
This task is deprecated, please use the io.kestra.plugin.core.flow.ForEach task instead.
The list of tasks will be executed for each item sequentially. The value must be a valid JSON string representing an array, e.g. a list of strings ["value1", "value2"] or a list of dictionaries [{"key": "value1"}, {"key": "value2"}].
You can access the current iteration value using the variable {{ taskrun.value }}. The task list will be executed sequentially for each item.
We highly recommend triggering a subflow for each value. This allows much better scalability and modularity. Check the flow best practices documentation and the following Blueprint for more details.
Examples
The taskrun.value from the each_sequential task is available only to immediate child tasks such as the before_if and the if tasks. To access the taskrun value in child tasks of the if task (such as in the after_if task), you need to use the syntax {{ parent.taskrun.value }}, as this allows you to access the taskrun value of the parent task each_sequential.
id: loop_example
namespace: company.team
tasks:
- id: each_sequential
type: io.kestra.plugin.core.flow.EachSequential
value: ["value 1", "value 2", "value 3"]
tasks:
- id: before_if
type: io.kestra.plugin.core.debug.Return
format: 'Before if {{ taskrun.value }}'
- id: if
type: io.kestra.plugin.core.flow.If
condition: '{{ taskrun.value == "value 2" }}'
then:
- id: after_if
type: io.kestra.plugin.core.debug.Return
format: "After if {{ parent.taskrun.value }}"
This task shows that the value can be a bullet-style list. The task iterates over the list of values and executes the each_value child task for each value.
id: each_sequential_flow
namespace: company.team
tasks:
- id: each_sequential
type: io.kestra.plugin.core.flow.EachSequential
value:
- value 1
- value 2
- value 3
tasks:
- id: each_value
type: io.kestra.plugin.core.debug.Return
format: "{{ task.id }} with value '{{ taskrun.value }}'"
The value can be passed as a string, a list of strings, or a list of objects.
Default value is : false
Default value is : false
Default value is : false
1 nested properties
You can control how many task groups are executed concurrently by setting the concurrencyLimit property.
- If you set the concurrencyLimit property to 0, Kestra will execute all task groups concurrently for all values.
- If you set the concurrencyLimit property to 1, Kestra will execute each task group one after the other, starting with the task group for the first value in the list.
Regardless of the concurrencyLimit property, the tasks will run one after the other — to run those in parallel, wrap them in a Parallel task as shown in the last example below (see the flow parallel_tasks_example).
The values should be defined as a JSON string or an array, e.g. a list of string values ["value1", "value2"] or a list of key-value pairs [{"key": "value1"}, {"key": "value2"}].
You can access the current iteration value using the variable {{ taskrun.value }} or {{ parent.taskrun.value }} if you are in a nested child task.
If you need to execute more than 2-5 tasks for each value, we recommend triggering a subflow for each value for better performance and modularity. Check the flow best practices documentation for more details.
Examples
The {{ taskrun.value }} from the for_each task is available only to direct child tasks such as the before_if and the if tasks. To access the taskrun value of the parent task in a nested child task such as the after_if task, use {{ parent.taskrun.value }}.
id: for_loop_example
namespace: company.team
tasks:
- id: for_each
type: io.kestra.plugin.core.flow.ForEach
values: ["value 1", "value 2", "value 3"]
tasks:
- id: before_if
type: io.kestra.plugin.core.debug.Return
format: "Before if {{ taskrun.value }}"
- id: if
type: io.kestra.plugin.core.flow.If
condition: '{{ taskrun.value == "value 2" }}'
then:
- id: after_if
type: io.kestra.plugin.core.debug.Return
format: "After if {{ parent.taskrun.value }}"
This flow uses a YAML-style array for values. The task for_each iterates over a list of values and executes the return child task for each value. The concurrencyLimit property is set to 2, so at first the return task will run concurrently for the first two values in the list. The return task will run for the next two values only after the task runs for the first two values have completed.
id: for_each_value
namespace: company.team
tasks:
- id: for_each
type: io.kestra.plugin.core.flow.ForEach
values:
- value 1
- value 2
- value 3
- value 4
concurrencyLimit: 2
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ task.id }} with value {{ taskrun.value }}"
This example shows how to run tasks in parallel for each value in the list. All child tasks of the parallel task will run in parallel. However, due to the concurrencyLimit property set to 2, only two parallel task groups will run at any given time.
id: parallel_tasks_example
namespace: company.team
tasks:
- id: for_each
type: io.kestra.plugin.core.flow.ForEach
values: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
concurrencyLimit: 2
tasks:
- id: parallel
type: io.kestra.plugin.core.flow.Parallel
tasks:
- id: log
type: io.kestra.plugin.core.log.Log
message: Processing {{ parent.taskrun.value }}
- id: shell
type: io.kestra.plugin.scripts.shell.Commands
commands:
- sleep {{ parent.taskrun.value }}
The values can be passed as a string, a list of strings, or a list of objects.
Default value is : false
If you set the concurrencyLimit property to 0, Kestra will execute all task groups concurrently for all values (no limit).
If you set the concurrencyLimit property to 1, Kestra will execute each task group one after the other, so only one task group can be actively running at any time.
Default value is : 1
Default value is : false
Default value is : false
1 nested properties
The items value must be a Kestra internal storage URI, e.g. an output file from a previous task or an input of FILE type.
Two special variables are available to pass as inputs to the subflow:
- taskrun.items, which is the URI of the internal storage file containing the batch of items to process
- taskrun.iteration, which is the iteration or batch number

Examples
Execute a subflow for each batch of items. The subflow orders is called from the parent flow orders_parallel using the ForEachItem task in order to start one subflow execution for each batch of items.
id: orders
namespace: company.team
inputs:
- id: order
type: STRING
tasks:
- id: read_file
type: io.kestra.plugin.scripts.shell.Commands
runner: PROCESS
commands:
- cat "{{ inputs.order }}"
- id: read_file_content
type: io.kestra.plugin.core.log.Log
message: "{{ read(inputs.order) }}"
id: orders_parallel
namespace: company.team
tasks:
- id: extract
type: io.kestra.plugin.jdbc.duckdb.Query
sql: |
INSTALL httpfs;
LOAD httpfs;
SELECT *
FROM read_csv_auto('https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv', header=True);
store: true
- id: each
type: io.kestra.plugin.core.flow.ForEachItem
items: "{{ outputs.extract.uri }}"
batch:
rows: 1
namespace: company.team
flowId: orders
wait: true # wait for the subflow execution
transmitFailed: true # fail the task run if the subflow execution fails
inputs:
order: "{{ taskrun.items }}" # special variable that contains the items of the batch
Execute a subflow for each JSON item fetched from a REST API. The subflow mysubflow is called from the parent flow iterate_over_json using the ForEachItem task; this creates one subflow execution for each JSON object.
Note how we first need to convert the JSON array to JSON-L format using the JsonWriter task. This is because the items attribute of the ForEachItem task expects a file where each line represents a single item. Suitable file types include Amazon ION (commonly produced by Query tasks), newline-separated JSON files, or CSV files formatted with one row per line and without a header. For other formats, you can use the conversion tasks available in the io.kestra.plugin.serdes module.
In this example, the subflow mysubflow expects a JSON object as input. The JsonReader task first reads the JSON array from the REST API and converts it to ION. Then, the JsonWriter task converts that ION file to JSON-L format, suitable for the ForEachItem task.
id: mysubflow
namespace: company.team
inputs:
- id: json
type: JSON
tasks:
- id: debug
type: io.kestra.plugin.core.log.Log
message: "{{ inputs.json }}"
id: iterate_over_json
namespace: company.team
tasks:
- id: download
type: io.kestra.plugin.fs.http.Download
uri: "https://api.restful-api.dev/objects"
contentType: application/json
method: GET
failOnEmptyResponse: true
timeout: PT15S
- id: json_to_ion
type: io.kestra.plugin.serdes.json.JsonReader
from: "{{ outputs.download.uri }}"
newLine: false # regular json
- id: ion_to_jsonl
type: io.kestra.plugin.serdes.json.JsonWriter
from: "{{ outputs.json_to_ion.uri }}"
newLine: true # JSON-L
- id: for_each_item
type: io.kestra.plugin.core.flow.ForEachItem
items: "{{ outputs.ion_to_jsonl.uri }}"
batch:
rows: 1
namespace: company.team
flowId: mysubflow
wait: true
transmitFailed: true
inputs:
json: "{{ json(read(taskrun.items)) }}"
This example shows how to combine the EachSequential and ForEachItem tasks to process files from an S3 bucket. The EachSequential task iterates over the files from the S3 trigger, and the ForEachItem task is used to split each file into batches. The process_batch subflow is then called with the data input parameter set to the URI of the batch to process.
id: process_batch
namespace: company.team
inputs:
- id: data
type: FILE
tasks:
- id: debug
type: io.kestra.plugin.core.log.Log
message: "{{ read(inputs.data) }}"
id: process_files
namespace: company.team
tasks:
- id: loop_over_files
type: io.kestra.plugin.core.flow.EachSequential
value: "{{ trigger.objects | jq('.[].uri') }}"
tasks:
- id: subflow_per_batch
type: io.kestra.plugin.core.flow.ForEachItem
items: "{{ trigger.uris[parent.taskrun.value] }}"
batch:
rows: 1
flowId: process_batch
namespace: company.team
wait: true
transmitFailed: true
inputs:
data: "{{ taskrun.items }}"
triggers:
- id: s3
type: io.kestra.plugin.aws.s3.Trigger
interval: "PT1S"
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "us-east-1"
bucket: "my_bucket"
prefix: "sub-dir"
action: NONE
Default value is : false
Default value is : false
By default, labels are not passed to the subflow execution. If you set this option to true, the child flow execution will inherit all labels from the parent execution.
Default value is : false
Default value is : false
By default, the last, i.e. the most recent, revision of the subflow is executed.
Note that this option works only if wait is set to true.
Default value is : true
Default value is : true
1 nested properties
Can be provided as a string in the format "10MB" or "200KB", or as a number of bytes. This allows you to process large files: split them into smaller chunks by lines and process the chunks in parallel. For example, MySQL by default limits the query size to 16MB per query; trying to run a bulk insert with input data larger than 16MB will fail. Splitting the input data into smaller chunks is a common strategy to circumvent this limitation. By dividing a large dataset into chunks smaller than the max_allowed_packet size (e.g., 10MB), you can insert the data in multiple smaller queries. This approach not only avoids hitting the query size limit but can also be more efficient and manageable in terms of memory utilization, especially for very large datasets. In short, by splitting the file by bytes, you can bulk-insert smaller chunks of e.g. 10MB in parallel.
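As a sketch, the strategy above can be expressed by limiting the ForEachItem batch by size instead of row count (the extract task output and the bulk_insert subflow are assumed for illustration):

```yaml
id: batch_by_bytes
namespace: company.team

tasks:
  - id: each
    type: io.kestra.plugin.core.flow.ForEachItem
    items: "{{ outputs.extract.uri }}" # output of an assumed previous extract task
    batch:
      bytes: "10MB" # split the input file into chunks of at most 10MB each
    namespace: company.team
    flowId: bulk_insert # hypothetical subflow performing the bulk insert
    inputs:
      file: "{{ taskrun.items }}" # each subflow execution receives one chunk
```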
Default value is : 1
Default value is : \n
Allows branching the workflow based on context variables; for example, branch a flow based on the output of a previous task.

Examples
id: if
namespace: company.team
inputs:
- id: string
type: STRING
required: true
tasks:
- id: if
type: io.kestra.plugin.core.flow.If
condition: "{{ inputs.string == 'Condition' }}"
then:
- id: when_true
type: io.kestra.plugin.core.log.Log
message: "Condition was true"
else:
- id: when_false
type: io.kestra.plugin.core.log.Log
message: "Condition was false"
Default value is : false
Boolean coercion allows 0, -0, null and '' to evaluate to false; all other values will evaluate to true.
Default value is : false
Default value is : false
1 nested properties
This task runs all child tasks in parallel.

Examples
id: parallel
namespace: company.team
tasks:
- id: parallel
type: io.kestra.plugin.core.flow.Parallel
tasks:
- id: 1st
type: io.kestra.plugin.core.debug.Return
format: "{{ task.id }} > {{ taskrun.startDate }}"
- id: 2nd
type: io.kestra.plugin.core.debug.Return
format: "{{ task.id }} > {{ taskrun.id }}"
- id: last
type: io.kestra.plugin.core.debug.Return
format: "{{ task.id }} > {{ taskrun.startDate }}"
Default value is : false
If the value is 0, no limit exists and all tasks will start at the same time.
Default value is : 0
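A minimal sketch limiting the Parallel task to two running child tasks at a time; the property name concurrent is assumed from the default described above:

```yaml
id: parallel_limited
namespace: company.team

tasks:
  - id: parallel
    type: io.kestra.plugin.core.flow.Parallel
    concurrent: 2 # at most two child tasks run at the same time
    tasks:
      - id: task1
        type: io.kestra.plugin.core.debug.Return
        format: "{{ task.id }} > {{ taskrun.startDate }}"
      - id: task2
        type: io.kestra.plugin.core.debug.Return
        format: "{{ task.id }} > {{ taskrun.startDate }}"
      - id: task3
        type: io.kestra.plugin.core.debug.Return
        format: "{{ task.id }} > {{ taskrun.startDate }}"
```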
Default value is : false
Default value is : false
1 nested properties
Examples
Pause the execution and wait for a manual approval
id: human_in_the_loop
namespace: company.team
tasks:
- id: before_approval
type: io.kestra.plugin.core.debug.Return
format: Output data that needs to be validated by a human
- id: pause
type: io.kestra.plugin.core.flow.Pause
- id: run_post_approval
type: io.kestra.plugin.scripts.shell.Commands
runner: PROCESS
commands:
- echo "Manual approval received! Continuing the execution..."
- id: post_resume
type: io.kestra.plugin.core.debug.Return
format: "{{ task.id }} started on {{ taskrun.startDate }} after the Pause"
Vacation approval process pausing the execution for approval and waiting for input from a human to approve or reject the request.
id: vacation_approval_process
namespace: company.team
inputs:
- id: request.name
type: STRING
defaults: Rick Astley
- id: request.start_date
type: DATE
defaults: 2042-07-01
- id: request.end_date
type: DATE
defaults: 2042-07-07
- id: slack_webhook_uri
type: URI
defaults: https://reqres.in/api/slack
tasks:
- id: send_approval_request
type: io.kestra.plugin.notifications.slack.SlackIncomingWebhook
url: "{{ inputs.slack_webhook_uri }}"
payload: |
{
"channel": "#vacation",
"text": "Validate holiday request for {{ inputs.request.name }}. To approve the request, click on the `Resume` button here http://localhost:28080/ui/executions/{{flow.namespace}}/{{flow.id}}/{{execution.id}}"
}
- id: wait_for_approval
type: io.kestra.plugin.core.flow.Pause
onResume:
- id: approved
description: Whether to approve the request
type: BOOLEAN
defaults: true
- id: reason
description: Reason for approval or rejection
type: STRING
defaults: Well-deserved vacation
- id: approve
type: io.kestra.plugin.core.http.Request
uri: https://reqres.in/api/products
method: POST
contentType: application/json
body: "{{ inputs.request }}"
- id: log
type: io.kestra.plugin.core.log.Log
message: Status is {{ outputs.wait_for_approval.onResume.reason }}. Process finished with {{ outputs.approve.body }}
Default value is : false
The delay is a string in the ISO 8601 Duration format, e.g. PT1H for 1 hour, PT30M for 30 minutes, PT10S for 10 seconds, P1D for 1 day, etc. If no delay and no timeout are configured, the execution will never end until it's manually resumed from the UI or API.
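For example, a Pause that resumes automatically after 30 minutes can be sketched with the delay property described above:

```yaml
id: timed_pause
namespace: company.team

tasks:
  - id: pause
    type: io.kestra.plugin.core.flow.Pause
    delay: PT30M # ISO 8601 duration: resume automatically after 30 minutes
  - id: after_pause
    type: io.kestra.plugin.core.log.Log
    message: "Resumed after the delay"
```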
Default value is : false
Default value is : false
Before resuming the execution, the user will be prompted to fill in these inputs. The inputs can be used to pass additional data to the execution which is useful for human-in-the-loop scenarios. The onResume inputs work the same way as regular flow inputs — they can be of any type and can have default values. You can access those values in downstream tasks using the onResume output of the Pause task.
If no delay and no timeout are configured, the execution will never end until it's manually resumed from the UI or API.
1 nested properties
Used to visually group tasks.

Examples
id: sequential
namespace: company.team
tasks:
- id: sequential
type: io.kestra.plugin.core.flow.Sequential
tasks:
- id: first_task
type: io.kestra.plugin.core.debug.Return
format: "{{ task.id }} > {{ taskrun.startDate }}"
- id: second_task
type: io.kestra.plugin.core.debug.Return
format: "{{ task.id }} > {{ taskrun.id }}"
- id: last
type: io.kestra.plugin.core.debug.Return
format: "{{ task.id }} > {{ taskrun.startDate }}"
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
Run a subflow with custom inputs.
id: running_subflow
namespace: company.team
tasks:
- id: call_subflow
type: io.kestra.plugin.core.flow.Subflow
namespace: company.team
flowId: subflow
inputs:
user: "Rick Astley"
favorite_song: "Never Gonna Give You Up"
wait: true
transmitFailed: true
Default value is : false
Default value is : false
By default, labels are not passed to the subflow execution. If you set this option to true, the child flow execution will inherit all labels from the parent execution.
Default value is : false
Default value is : false
Allows specifying outputs as key-value pairs to extract any outputs from the subflow execution into the outputs of this task execution. This property is deprecated since v0.15.0; please use the outputs property on the subflow definition to define the output values available and exposed to this task execution.
By default, the last, i.e. the most recent, revision of the subflow is executed.
Note that this option works only if wait is set to true.
Default value is : true
Default value is : true
1 nested properties
This task runs a set of tasks based on a given value. The value is evaluated at runtime and compared to the list of cases. If the value matches a case, the corresponding tasks are executed. If the value does not match any case, the default tasks are executed.

Examples
id: switch
namespace: company.team
inputs:
- id: string
type: STRING
required: true
tasks:
- id: switch
type: io.kestra.plugin.core.flow.Switch
value: "{{ inputs.string }}"
cases:
FIRST:
- id: first
type: io.kestra.plugin.core.debug.Return
format: "{{ task.id }} > {{ taskrun.startDate }}"
SECOND:
- id: second
type: io.kestra.plugin.core.debug.Return
format: "{{ task.id }} > {{ taskrun.startDate }}"
THIRD:
- id: third
type: io.kestra.plugin.core.debug.Return
format: "{{ task.id }} > {{ taskrun.startDate }}"
defaults:
- id: default
type: io.kestra.plugin.core.debug.Return
format: "{{ task.id }} > {{ taskrun.startDate }}"
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
id: template
namespace: company.team
inputs:
- id: with_string
type: STRING
tasks:
- id: 1_return
type: io.kestra.plugin.core.debug.Return
format: "{{ task.id }} > {{ taskrun.startDate }}"
- id: 2_template
type: io.kestra.plugin.core.flow.Template
namespace: company.team
templateId: template
args:
my_forward: "{{ inputs.with_string }}"
- id: 3_end
type: io.kestra.plugin.core.debug.Return
format: "{{ task.id }} > {{ taskrun.startDate }}"
Default value is : false
You can provide a list of named arguments (like function arguments in a programming language) allowing you to rename the outputs of the current flow for this template. For example, if you declare the use of the template like this:
- id: 2-template
type: io.kestra.plugin.core.flow.Template
namespace: io.kestra.tests
templateId: template
args:
forward: "{{ output.task-id.uri }}"
You will be able to get this output on the template with {{ parent.outputs.args.forward }}.
Default value is : false
Default value is : false
1 nested properties
Use this task if your workflow requires blocking calls polling for a job to finish or for some external API to return a specific HTTP response.
You can access the outputs of the nested tasks in the condition property. The condition is evaluated after all nested task runs finish.
Examples
Run a task until it returns a specific value. Note how you don't need to take care of incrementing the iteration count. The task will loop and keep track of the iteration outputs behind the scenes — you only need to specify the exit condition for the loop.
id: wait_for
namespace: company.team
tasks:
- id: loop
type: io.kestra.plugin.core.flow.WaitFor
condition: "{{ outputs.return.value == '4' }}"
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ outputs.loop.iterationCount }}"
Boolean coercion allows 0, -0, null and '' to evaluate to false; all other values will evaluate to true.
Default value is : false
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Default value is : 1.000000000
Default value is : 3600.000000000
Default value is : 100
Tasks are stateless by default. Kestra will launch each task within a temporary working directory on a Worker. The WorkingDirectory task allows reusing the same file system working directory across multiple tasks so that multiple sequential tasks can use output files from previous tasks without having to use the outputs.taskId.outputName syntax. Note that the WorkingDirectory task only works with runnable tasks because those tasks are executed directly on the Worker. This means that using flowable tasks, such as the Parallel task, within the WorkingDirectory task will not work.

Examples
Clone a Git repository into the Working Directory and run a Python script in a Docker container.
id: git_python
namespace: company.team
tasks:
- id: wdir
type: io.kestra.plugin.core.flow.WorkingDirectory
tasks:
- id: clone_repository
type: io.kestra.plugin.git.Clone
url: https://github.com/kestra-io/examples
branch: main
- id: python
type: io.kestra.plugin.scripts.python.Commands
taskRunner:
type: io.kestra.plugin.scripts.runner.docker.Docker
containerImage: ghcr.io/kestra-io/pydata:latest
commands:
- python scripts/etl_script.py
Add input and output files within a Working Directory to use them in a Python script.
id: api_json_to_mongodb
namespace: company.team
tasks:
- id: wdir
type: io.kestra.plugin.core.flow.WorkingDirectory
outputFiles:
- output.json
inputFiles:
query.sql: |
SELECT sum(total) as total, avg(quantity) as avg_quantity
FROM sales;
tasks:
- id: inline_script
type: io.kestra.plugin.scripts.python.Script
taskRunner:
type: io.kestra.plugin.scripts.runner.docker.Docker
containerImage: python:3.11-slim
beforeCommands:
- pip install requests kestra > /dev/null
warningOnStdErr: false
script: |
import requests
import json
from kestra import Kestra
with open('query.sql', 'r') as input_file:
sql = input_file.read()
response = requests.get('https://api.github.com')
data = response.json()
with open('output.json', 'w') as output_file:
json.dump(data, output_file)
Kestra.outputs({'receivedSQL': sql, 'status': response.status_code})
- id: load_to_mongodb
type: io.kestra.plugin.mongodb.Load
connection:
uri: mongodb://host.docker.internal:27017/
database: local
collection: github
from: "{{ outputs.wdir.uris['output.json'] }}"
id: working_directory
namespace: company.team
tasks:
- id: working_directory
type: io.kestra.plugin.core.flow.WorkingDirectory
tasks:
- id: first
type: io.kestra.plugin.scripts.shell.Commands
commands:
- 'echo "{{ taskrun.id }}" > {{ workingDir }}/stay.txt'
- id: second
type: io.kestra.plugin.scripts.shell.Commands
commands:
- |
echo '::{"outputs": {"stay":"'$(cat {{ workingDir }}/stay.txt)'"}}::'
A working directory with a cache of the node_modules directory.
id: node_with_cache
namespace: company.team
tasks:
- id: working_dir
type: io.kestra.plugin.core.flow.WorkingDirectory
cache:
patterns:
- node_modules/**
ttl: PT1H
tasks:
- id: script
type: io.kestra.plugin.scripts.node.Script
beforeCommands:
- npm install colors
script: |
const colors = require("colors");
console.log(colors.red("Hello"));
Default value is : false
Default value is : false
Default value is : false
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
1 nested properties
For example, 'node_modules/**' will include all files of the node_modules directory including sub-directories.
This task connects to an HTTP server and copies a file to Kestra's internal storage.

Examples
Download a CSV file.
id: download
namespace: company.team
tasks:
- id: extract
type: io.kestra.plugin.core.http.Download
uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv
Default value is : false
Default value is : false
Default value is : true
Default value is : false
Default value is : GET
1 nested properties
Default value is : 0.0
Default value is : true
Default value is : 10485760
Default value is : DIRECT
Default value is : 300.000000000
Default value is : 10.000000000
Only applies if no trust store is configured. Note: This makes the SSL connection insecure and should only be used for testing. If you are using a self-signed certificate, set up a trust store instead.
This task makes an API call to a specified URL of an HTTP server and stores the response as output.
By default, the maximum length of the response is limited to 10MB, but it can be increased to at most 2GB by using the options.maxContentLength property.
Note that the response is added as output to the task. If you need to process large API payloads, we recommend using the Download task instead.

Examples
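As a sketch, the response size limit can be raised via the options.maxContentLength property mentioned above (the value is in bytes; the URL and the 100MB value are illustrative):

```yaml
id: large_response
namespace: company.team

tasks:
  - id: request
    type: io.kestra.plugin.core.http.Request
    uri: https://example.com/large-payload # illustrative URL
    method: GET
    options:
      maxContentLength: 104857600 # 100MB in bytes (default is 10485760, i.e. 10MB)
```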
Execute a Kestra flow via an HTTP POST request authenticated with basic auth. To pass a user input to the API call, we use the formData property. When using form data, make sure to set the contentType property to multipart/form-data.
id: api_call
namespace: company.team
tasks:
- id: basic_auth_api
type: io.kestra.plugin.core.http.Request
uri: http://host.docker.internal:8080/api/v1/executions/dev/inputs_demo
options:
basicAuthUser: admin
basicAuthPassword: admin
method: POST
contentType: multipart/form-data
formData:
user: John Doe
Execute a Kestra flow via an HTTP request authenticated with a Bearer auth token.
id: api_auth_call
namespace: company.team
tasks:
- id: auth_token_api
type: io.kestra.plugin.core.http.Request
uri: https://dummyjson.com/user/me
method: GET
headers:
authorization: 'Bearer <TOKEN>'
Make an HTTP GET request with a timeout. The timeout property specifies the maximum time allowed for the entire task to run, while the options.connectTimeout, options.readTimeout, options.connectionPoolIdleTimeout, and options.readIdleTimeout properties specify the time allowed for establishing a connection, reading data from the server, keeping an idle connection in the client's connection pool, and keeping a read connection idle before closing it, respectively.
id: timeout
namespace: company.team
tasks:
- id: http
type: io.kestra.plugin.core.http.Request
uri: https://reqres.in/api/long-request
timeout: PT10M # no default
method: GET
options:
connectTimeout: PT1M # no default
readTimeout: PT30S # 10 seconds by default
connectionPoolIdleTimeout: PT10S # 0 seconds by default
readIdleTimeout: PT10M # 300 seconds by default
Make an HTTP request and process its output. Given that we send a JSON payload in the request body, we need to use application/json as the content type.
id: http_post_request_example
namespace: company.team
inputs:
- id: payload
type: JSON
defaults: |
{"title": "Kestra Pen"}
tasks:
- id: send_data
type: io.kestra.plugin.core.http.Request
uri: https://dummyjson.com/products/add
method: POST
contentType: application/json
body: "{{ inputs.payload }}"
- id: print_status
type: io.kestra.plugin.core.log.Log
message: '{{ outputs.send_data.body }}'
Send an HTTP POST request to a webserver.
id: http_post_request_example
namespace: company.team
tasks:
- id: send_data
type: io.kestra.plugin.core.http.Request
uri: "https://server.com/login"
headers:
user-agent: "kestra-io"
method: "POST"
formData:
user: "user"
password: "pass"
Send a multipart HTTP POST request to a webserver.
id: http_post_multipart_example
namespace: company.team
inputs:
- id: file
type: FILE
tasks:
- id: send_data
type: io.kestra.plugin.core.http.Request
uri: "https://server.com/upload"
headers:
user-agent: "kestra-io"
method: "POST"
contentType: "multipart/form-data"
formData:
user: "{{ inputs.file }}"
Send a multipart HTTP POST request to a webserver and set a custom file name.
id: http_post_multipart_example
namespace: company.team
inputs:
- id: file
type: FILE
tasks:
- id: send_data
type: io.kestra.plugin.core.http.Request
uri: "https://server.com/upload"
headers:
user-agent: "kestra-io"
method: "POST"
contentType: "multipart/form-data"
formData:
user:
name: "my-file.txt"
content: "{{ inputs.file }}"
Default value is : false
Default value is : false
Default value is : false
If this property is set to true, this task will output the request body using the encryptedBody output property; otherwise, the request body will be stored in the body output property.
Default value is : false
Default value is : false
Default value is : GET
1 nested properties
Examples
Send a Slack alert if the price is below a certain threshold. The flow will be triggered every 30 seconds until the condition is met. Then, the stopAfter property will disable the trigger to avoid unnecessary API calls and alerts.
id: http_price_alert
namespace: company.team
tasks:
- id: send_slack_alert
type: io.kestra.plugin.notifications.slack.SlackIncomingWebhook
url: "{{ secret('SLACK_WEBHOOK') }}"
payload: |
{
"channel": "#price-alerts",
"text": "The price is now: {{ json(trigger.body).price }}"
}
triggers:
- id: http
type: io.kestra.plugin.core.http.Trigger
uri: https://fakestoreapi.com/products/1
responseCondition: "{{ json(response.body).price <= 110 }}"
interval: PT30S
stopAfter:
- SUCCESS
Trigger a flow if an HTTP endpoint returns a status code equal to 200.
id: http_trigger
namespace: company.team
tasks:
- id: log_response
type: io.kestra.plugin.core.log.Log
message: '{{ trigger.body }}'
triggers:
- id: http
type: io.kestra.plugin.core.http.Trigger
uri: https://api.chucknorris.io/jokes/random
responseCondition: "{{ response.statusCode == 200 }}"
stopAfter:
- SUCCESS
Default value is : false
When true, the encryptedBody output will be filled; otherwise, the body output will be filled.
Default value is : false
The interval between two consecutive polls, which helps avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the minimal interval should be at least PT30S. See ISO 8601 Durations for more information on available interval values.
Default value is : 60.000000000
Default value is : false
Default value is : GET
The condition will be evaluated after calling the HTTP endpoint; it can use the response itself to determine whether to start a flow or not. The following variables are available when evaluating the condition:
- response.statusCode: the response HTTP status code
- response.body: the response body as a string
- response.headers: the response headers
Boolean coercion allows 0, -0, null and '' to evaluate to false, all other values will evaluate to true.
The condition will be evaluated before any 'generic trigger conditions' that can be configured via the conditions property.
Default value is : "{{ response.statusCode < 400 }}"
1 nested properties
Examples
Delete a KV pair.
id: kv_store_delete
namespace: company.team
tasks:
- id: kv_delete
type: io.kestra.plugin.core.kv.Delete
key: my_variable
namespace: dev # the current namespace of the flow will be used by default
1 nested properties
Examples
Get the value for the my_variable key in the dev namespace and fail if it's not present.
id: kv_store_get
namespace: company.team
tasks:
- id: kv_get
type: io.kestra.plugin.core.kv.Get
key: my_variable
namespace: dev # the current namespace of the flow will be used by default
errorOnMissing: true
Default value is : false
Default value is : false
Default value is : false
Default value is : false
Default value is : "{{ flow.namespace }}"
1 nested properties
Examples
Get keys that are prefixed by my_var.
id: kv_store_getkeys
namespace: company.team
tasks:
- id: kv_getkeys
type: io.kestra.plugin.core.kv.GetKeys
prefix: my_var
namespace: dev # the current namespace of the flow will be used by default
Default value is : false
Default value is : false
Default value is : false
Default value is : "{{ flow.namespace }}"
1 nested properties
Examples
Set the task's uri output as the value for the orders_file key.
id: kv_store_set
namespace: company.team
tasks:
- id: http_download
type: io.kestra.plugin.core.http.Download
uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv
- id: kv_set
type: io.kestra.plugin.core.kv.Set
key: orders_file
value: "{{ outputs.http_download.uri }}"
kvType: STRING
Default value is : false
Default value is : false
Default value is : false
Default value is : "{{ flow.namespace }}"
Default value is : true
1 nested properties
This task is useful to automate moving logs between various systems and environments.

Examples
level: INFO
executionId: "{{ trigger.executionId }}"
level: WARN
executionId: "{{ execution.id }}"
tasksId:
- "previous_task_id"
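The fragments above can be assembled into a complete flow; this sketch assumes the Fetch task type io.kestra.plugin.core.log.Fetch and uses the current execution's ID:

```yaml
id: fetch_logs
namespace: company.team

tasks:
  - id: fetch
    type: io.kestra.plugin.core.log.Fetch
    level: WARN
    executionId: "{{ execution.id }}" # defaults to the current execution if not set
    tasksId:
      - "previous_task_id" # only fetch logs emitted by this task
```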
Default value is : false
Default value is : false
If not set, the task will use the ID of the current execution.
If set, it will try to locate the execution on the current flow unless the namespace and flowId properties are set.
Default value is : INFO
Default value is : false
1 nested properties
Examples
level: DEBUG
message: "{{ task.id }} > {{ taskrun.startDate }}"
Log one or more messages to the console.
id: hello_world
namespace: company.team
tasks:
- id: greeting
type: io.kestra.plugin.core.log.Log
message:
- Kestra team wishes you a great day 👋
- If you need some help, reach out via Slack
It can be a string or an array of strings.
Default value is : false
Default value is : false
Default value is : INFO
Default value is : false
1 nested properties
This task can be used to purge flow execution and trigger logs for all flows, for a specific namespace, or for a specific flow.

Examples

Purge all logs that have been created more than one month ago.
endDate: "{{ now() | dateAdd(-1, 'MONTHS') }}"
Purge all logs that have been created more than one month ago, but keep error logs.
endDate: "{{ now() | dateAdd(-1, 'MONTHS') }}"
logLevels:
- TRACE
- DEBUG
- INFO
- WARN
All logs before this date will be purged.
Default value is : false
Default value is : false
You need to provide the namespace property if you want to purge the logs of a specific flow.
If not set, logs of all levels will be purged.
Default value is : false
If flowId isn't provided, this is a namespace prefix; otherwise, it is the namespace of the flow.
All logs after this date will be purged.
1 nested properties
Examples
Delete namespace files that match a specific regex glob pattern.
id: delete_files
namespace: company.team
tasks:
- id: delete
type: io.kestra.plugin.core.namespace.DeleteFiles
namespace: tutorial
files:
- "**.upl"
Delete all namespace files from a specific namespace.
id: delete_all_files
namespace: company.team
tasks:
- id: delete
type: io.kestra.plugin.core.namespace.DeleteFiles
namespace: tutorial
files:
- "**"
String or a list of strings; each string can either be a regex glob pattern or a file path URI.
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Use a regex glob pattern or a file path to download files from your namespace files. This can be useful to share code between projects and teams when that code is located in different namespaces.

Examples
Download a namespace file.
id: download_file
namespace: company.team
tasks:
- id: download
type: io.kestra.plugin.core.namespace.DownloadFiles
namespace: tutorial
files:
- "**input.txt"
Download all namespace files from a specific namespace.
id: download_all_files
namespace: company.team
tasks:
- id: download
type: io.kestra.plugin.core.namespace.DownloadFiles
namespace: tutorial
files:
- "**"
String or a list of strings; each string can either be a regex glob pattern or a file path URI.
Default value is : false
Default value is : ""
Default value is : false
Default value is : false
1 nested properties
Use a regex glob pattern or a file path to upload files as Namespace Files. When using a map with the desired file name as key and file path as value, you can also rename or relocate files.

Examples
Upload files generated by a previous task using the filesMap property.
id: upload_files_from_git
namespace: company.team
tasks:
- id: download
type: io.kestra.plugin.core.http.Download
uri: https://github.com/kestra-io/scripts/archive/refs/heads/main.zip
- id: unzip
type: io.kestra.plugin.compress.ArchiveDecompress
from: "{{ outputs.download.uri }}"
algorithm: ZIP
- id: upload
type: io.kestra.plugin.core.namespace.UploadFiles
filesMap: "{{ outputs.unzip.files }}"
namespace: "{{ flow.namespace }}"
Upload a folder using a glob pattern. Note that the syntax requires a glob pattern inspired by Apache Ant patterns. Make sure that your pattern starts with glob:, followed by the pattern. For example, use glob:**/dbt/** to upload the entire dbt folder (with all files and subdirectories) regardless of that folder's location in the directory structure.
id: upload_dbt_project
namespace: company.team
tasks:
- id: wdir
type: io.kestra.plugin.core.flow.WorkingDirectory
tasks:
- id: git_clone
type: io.kestra.plugin.git.Clone
url: https://github.com/kestra-io/dbt-example
branch: master
- id: upload
type: io.kestra.plugin.core.namespace.UploadFiles
files:
- "glob:**/dbt/**"
namespace: "{{ flow.namespace }}"
Upload a specific file and rename it.
id: upload_a_file
namespace: company.team
tasks:
- id: download
type: io.kestra.plugin.core.http.Download
uri: https://github.com/kestra-io/scripts/archive/refs/heads/main.zip
- id: unzip
type: io.kestra.plugin.compress.ArchiveDecompress
from: "{{ outputs.download.uri }}"
algorithm: ZIP
- id: upload
type: io.kestra.plugin.core.namespace.UploadFiles
filesMap:
LICENCE: "{{ outputs.unzip.files['scripts-main/LICENSE'] }}"
namespace: "{{ flow.namespace }}"
Default value is : false
Can be one of the following options: OVERWRITE, ERROR or SKIP. Default is OVERWRITE.
Default value is : OVERWRITE
Required when providing a list of files.
Default value is : /
Default value is : false
This should be a list of regular expressions matching the Apache Ant patterns. It's primarily intended to be used with the WorkingDirectory task.
This should be a map of URIs, with the key being the filename that will be uploaded and the value being the URI. This one is intended to be used with output files of other tasks. Many Kestra tasks, including all Download tasks, output a map of files so that you can directly pass the output property to this task, e.g. outputFiles in the S3 Downloads task or files in the Archive Decompress task.
Default value is : false
1 nested properties
You can use this task to return some outputs and pass them to downstream tasks.
It's helpful for parsing and returning values from a task. You can then access these outputs in your downstream tasks
using the expression {{ outputs.mytask_id.values.my_output_name }} and you can see them in the Outputs tab.
Examples
id: outputs_flow
namespace: company.team
tasks:
- id: output_values
type: io.kestra.plugin.core.output.OutputValues
values:
taskrun_data: "{{ task.id }} > {{ taskrun.startDate }}"
execution_data: "{{ flow.id }} > {{ execution.startDate }}"
- id: log_values
type: io.kestra.plugin.core.log.Log
message: |
Got the following outputs from the previous task:
{{ outputs.output_values.values.taskrun_data }}
{{ outputs.output_values.values.execution_data }}
Default value is : false
Default value is : false
Default value is : false
1 nested properties
To access the task's working directory, use the {{workingDir}} Pebble expression or the WORKING_DIR environment variable. Input files and namespace files will be available in this directory.
To generate output files, you can either use the outputFiles task property and create a file with the same name in the task's working directory, or create any file in the output directory, which can be accessed via the {{outputDir}} Pebble expression or the OUTPUT_DIR environment variable.
Note that:
- This task runner is independent of any Operating System. You can use it equally on Linux, Mac or Windows without any additional configuration.
- When the Kestra Worker running this task is shut down, the process will be interrupted and re-created as soon as the worker is restarted.
Examples
Execute a Shell command.
id: new_shell
namespace: company.team
tasks:
- id: shell
type: io.kestra.plugin.scripts.shell.Commands
taskRunner:
type: io.kestra.plugin.core.runner.Process
commands:
- echo "Hello World"
Install custom Python packages before executing a Python script. Note how we use the --break-system-packages flag to avoid conflicts with the system packages. Make sure to use this flag if you see errors similar to error: externally-managed-environment.
id: before_commands_example
namespace: company.team
inputs:
- id: url
type: URI
defaults: https://jsonplaceholder.typicode.com/todos/1
tasks:
- id: transform
type: io.kestra.plugin.scripts.python.Script
taskRunner:
type: io.kestra.plugin.core.runner.Process
beforeCommands:
- pip install kestra requests --break-system-packages
script: |
import requests
from kestra import Kestra
url = "{{ inputs.url }}"
response = requests.get(url)
print('Status Code:', response.status_code)
Kestra.outputs(response.json())
Pass input files to the task, execute a Shell command, then retrieve output files.
id: new_shell_with_file
namespace: company.team
inputs:
- id: file
type: FILE
tasks:
- id: shell
type: io.kestra.plugin.scripts.shell.Commands
inputFiles:
data.txt: "{{inputs.file}}"
outputFiles:
- out.txt
taskRunner:
type: io.kestra.plugin.core.runner.Process
commands:
- cp {{workingDir}}/data.txt {{workingDir}}/out.txt
Examples
Delete the default state for the current flow.
id: delete_state
type: io.kestra.plugin.core.state.Delete
Delete the myState state for the current flow.
id: delete_state
type: io.kestra.plugin.core.state.Delete
name: myState
Default value is : false
Default value is : false
Default value is : false
Default value is : false
Default value is : default
By default, the state is isolated by namespace and flow. Setting this to true allows the state to be shared across flows within the same namespace.
Default value is : false
By default, the state is isolated by taskrun.value (during iteration with each). Setting this to false allows using the same state for every run of the iteration.
Default value is : true
1 nested properties
Examples
Get the default state file for the current flow.
id: get_state
type: io.kestra.plugin.core.state.Get
Get the myState state for the current flow.
id: get_state
type: io.kestra.plugin.core.state.Get
name: myState
Default value is : false
Default value is : false
Default value is : false
Default value is : false
Default value is : default
By default, the state is isolated by namespace and flow. Setting this to true allows the state to be shared across flows within the same namespace.
Default value is : false
By default, the state is isolated by taskrun.value (during iteration with each). Setting this to false allows using the same state for every run of the iteration.
Default value is : true
1 nested properties
Values will be merged:
- If you provide a new key, the new key will be added.
- If you provide an existing key, the previous value will be overwritten.
::alert{type="warning"} This method is not concurrency safe. If many executions of the same flow run concurrently, there is no guarantee of isolation of the value. The value can be overwritten by other executions. ::
Examples
Set the default state for the current flow.
id: set_state
type: io.kestra.plugin.core.state.Set
data:
'{{ inputs.store }}': '{{ outputs.download.md5 }}'
Set the myState state for the current flow.
id: set_state
type: io.kestra.plugin.core.state.Set
name: myState
data:
'{{ inputs.store }}': '{{ outputs.download.md5 }}'
Default value is : false
Default value is : false
Default value is : false
Default value is : default
By default, the state is isolated by namespace and flow. Setting this to true allows the state to be shared across flows within the same namespace.
Default value is : false
By default, the state is isolated by taskrun.value (during iteration with each). Setting this to false allows using the same state for every run of the iteration.
Default value is : true
1 nested properties
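The namespace-wide sharing behavior described above can be sketched as follows. Note that the boolean property name (namespace) is an assumption inferred from the description; verify it against your Kestra version before relying on it.

```yaml
# Sketch: share one state object across all flows of the namespace.
# The `namespace: true` property name is assumed, not confirmed by this page.
id: set_shared_state
type: io.kestra.plugin.core.state.Set
name: sharedState
namespace: true
data:
  last_success: "{{ execution.startDate }}"
```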
Examples
Concatenate two files with a custom separator.
files:
- "kestra://long/url/file1.txt"
- "kestra://long/url/file2.txt"
separator: "\n"
Concatenate files generated by an EachSequential task.
tasks:
- id: each
type: io.kestra.plugin.core.flow.EachSequential
tasks:
- id: start_api_call
type: io.kestra.plugin.scripts.shell.Commands
commands:
- echo {{ taskrun.value }} > {{ temp.generated }}
files:
- generated
value: '["value1", "value2", "value3"]'
- id: concat
type: io.kestra.plugin.core.storage.Concat
files:
- "{{ outputs.start_api_call.value1.files.generated }}"
- "{{ outputs.start_api_call.value2.files.generated }}"
- "{{ outputs.start_api_call.value3.files.generated }}"
Concatenate a dynamic number of files.
tasks:
- id: echo
type: io.kestra.plugin.scripts.shell.Commands
commands:
- echo "Hello John" > {{ outputDirs.output }}/1.txt
- echo "Hello Jane" > {{ outputDirs.output }}/2.txt
- echo "Hello Doe" > {{ outputDirs.output }}/3.txt
outputDirs:
- output
- id: concat
type: io.kestra.plugin.core.storage.Concat
files: "{{ outputs.echo.files | jq('.[]') }}"
Must be kestra:// storage URIs; can be a list of strings or a JSON string.
Default value is : false
Default value is : false
Default value is : .tmp
Default value is : false
1 nested properties
The Deduplicate task involves reading the input file twice, rather than loading the entire file into memory.
The first iteration is used to build a deduplication map in memory containing the last lines observed for each key.
The second iteration is used to rewrite the file without the duplicates. The task must be used with this in mind.
Examples
tasks:
- id: deduplicate
type: io.kestra.plugin.core.storage.DeduplicateItems
from: "{{ inputs.uri }}"
expr: "{{ key }}"
The 'pebble' expression can be used for constructing a composite key.
Must be a kestra:// internal storage URI.
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
uri: "kestra://long/url/file.txt"
Must be a kestra:// storage URI.
Default value is : false
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
tasks:
- id: filter
type: io.kestra.plugin.core.storage.FilterItems
from: "{{ inputs.file }}"
filterCondition: " {{ value == null }}"
filterType: EXCLUDE
errorOrNullBehavior: EXCLUDE
The 'pebble' expression should return a BOOLEAN value (i.e. true or false). The values 0, -0, and "" are interpreted as false. Otherwise, any non-empty value will be interpreted as true.
Must be a kestra:// internal storage URI.
Default value is : false
Default value is : false
Use FAIL to throw the exception and fail the task, INCLUDE to pass the item through, or EXCLUDE to drop the item.
Default value is : FAIL
Use INCLUDE to pass the item through, or EXCLUDE to drop the item.
Default value is : INCLUDE
Default value is : false
1 nested properties
This task was intended to be used along with the WorkingDirectory task to create temporary files. This task suffers from multiple limitations, e.g. it cannot be skipped, so setting disabled: true will have no effect. Overall, the WorkingDirectory task is more flexible and should be used instead of this task. This task will be removed in a future version of Kestra.
Examples
Output local files created in a Python task and load them to S3.
id: outputs_from_python_task
namespace: company.team
tasks:
- id: wdir
type: io.kestra.plugin.core.flow.WorkingDirectory
tasks:
- id: clone_repository
type: io.kestra.plugin.git.Clone
url: https://github.com/kestra-io/examples
branch: main
- id: git_python_scripts
type: io.kestra.plugin.scripts.python.Commands
warningOnStdErr: false
runner: DOCKER
docker:
image: ghcr.io/kestra-io/pydata:latest
beforeCommands:
- pip install faker > /dev/null
commands:
- python examples/scripts/etl_script.py
- python examples/scripts/generate_orders.py
- id: export_files
type: io.kestra.plugin.core.storage.LocalFiles
outputs:
- orders.csv
- "*.parquet"
- id: load_csv_to_s3
type: io.kestra.plugin.aws.s3.Upload
accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
secretKeyId: "{{ secret('AWS_SECRET_ACCESS_KEY') }}"
region: eu-central-1
bucket: kestraio
key: stage/orders.csv
from: "{{ outputs.export_files.outputFiles['orders.csv'] }}"
disabled: true
Create a local file that will be accessible to a bash task.
id: "local_files"
namespace: "io.kestra.tests"
tasks:
- id: working_dir
type: io.kestra.plugin.core.flow.WorkingDirectory
tasks:
- id: input_files
type: io.kestra.plugin.core.storage.LocalFiles
inputs:
hello.txt: "Hello World\n"
address.json: "{{ outputs.my_task_id.uri }}"
- id: bash
type: io.kestra.plugin.scripts.shell.Commands
commands:
- cat hello.txt
Send local files to Kestra's internal storage.
id: "local_files"
namespace: "io.kestra.tests"
tasks:
- id: working_dir
type: io.kestra.plugin.core.flow.WorkingDirectory
tasks:
- id: bash
type: io.kestra.plugin.scripts.shell.Commands
commands:
- mkdir -p sub/dir
- echo "Hello from Bash" >> sub/dir/bash1.txt
- echo "Hello from Bash" >> sub/dir/bash2.txt
- id: output_files
type: io.kestra.plugin.core.storage.LocalFiles
outputs:
- sub/**
Default value is : false
Default value is : false
Default value is : false
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
1 nested properties
This will delete all the generated files from a flow for the current execution. This will delete all files from:
- inputs
- outputs
- triggers
If the current execution doesn't have any generated files, the task will not fail.
Examples
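A minimal sketch of purging the current execution's files. The task type name (io.kestra.plugin.core.storage.PurgeCurrentExecutionFiles) is assumed from this section's context; confirm it against your Kestra version.

```yaml
# Sketch flow: download a file, then purge all files generated by this execution.
# The purge task type name below is assumed — verify it for your Kestra version.
id: purge_execution_files
namespace: company.team
tasks:
  - id: download
    type: io.kestra.plugin.core.http.Download
    uri: https://github.com/kestra-io/scripts/archive/refs/heads/main.zip
  - id: purge
    type: io.kestra.plugin.core.storage.PurgeCurrentExecutionFiles
```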
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
from: "kestra://long/url/file1.txt"
Default value is : false
Default value is : UTF-8
Default value is : false
Default value is : false
Default value is : |2+
1 nested properties
Examples
uri: "kestra://long/url/file.txt"
Must be a kestra:// storage URI.
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
Split a file by size.
from: "kestra://long/url/file1.txt"
bytes: 10MB
Split a file by rows count.
from: "kestra://long/url/file1.txt"
rows: 1000
Split a file in a defined number of partitions.
from: "kestra://long/url/file1.txt"
partitions: 8
Default value is : false
Can be provided as a string in the format "10MB" or "200KB", or as a number of bytes. This allows you to process large files by splitting them into smaller chunks by lines and processing them in parallel. For example, MySQL by default limits the size of a query to 16MB. Trying to use a bulk insert query with input data larger than 16MB will fail. Splitting the input data into smaller chunks is a common strategy to circumvent this limitation. By dividing a large dataset into chunks smaller than the max_allowed_packet size (e.g., 10MB), you can insert the data in multiple smaller queries. This approach not only helps to avoid hitting the query size limit but can also be more efficient and manageable in terms of memory utilization, especially for very large datasets. In short, by splitting the file by bytes, you can bulk-insert smaller chunks of e.g. 10MB in parallel to avoid this limitation.
Default value is : false
Default value is : false
Default value is : \n
1 nested properties
Examples
spec: |
type: io.kestra.plugin.fs.http.Download
{{ task.property }}: {{ task.value }}
Default value is : false
Default value is : false
Default value is : false
1 nested properties
You can trigger a flow as soon as another flow ends. This allows you to add implicit dependencies between multiple flows, which can often be managed by different teams. ::alert{type="warning"} If you don't provide any conditions, the flow will be triggered for EVERY execution of EVERY flow on your instance. ::
Examples
This flow will be triggered after each successful execution of the flow company.team.trigger_flow and forward the uri output of the my_task task.
id: trigger_flow_listener
namespace: company.team
inputs:
- id: from_parent
type: STRING
tasks:
- id: only_no_input
type: io.kestra.plugin.core.debug.Return
format: "v1: {{ trigger.executionId }}"
triggers:
- id: listen_flow
type: io.kestra.plugin.core.trigger.Flow
inputs:
from-parent: '{{ outputs.my_task.uri }}'
conditions:
- type: io.kestra.plugin.core.condition.ExecutionFlowCondition
namespace: company.team
flowId: trigger_flow
- type: io.kestra.plugin.core.condition.ExecutionStatusCondition
in:
- SUCCESS
Default value is : false
Fill the inputs of this flow based on the outputs of the current flow, allowing you to pass data or files to the triggered flow. ::alert{type="warning"} If you provide invalid inputs, the flow will not be created! Since no task has started, no reason is visible on the Execution UI, so you will need to go to the Logs tab on the UI to understand the error. ::
Default value is : false
By default, only executions in a terminal state will be evaluated.
If you use a condition of type ExecutionStatusCondition it will be evaluated after this list.
::alert{type="info"}
The trigger will be evaluated on each execution state change, this means that, for non-terminal state, they can be observed multiple times.
For example, if a flow has two Pause tasks, the execution will transition two times from PAUSED to RUNNING so these states will be observed two times.
::
::alert{type="warning"}
You cannot evaluate on the CREATED state.
::
Default value is : `- SUCCESS
- WARNING
- FAILED
- KILLED
- CANCELLED
- RETRIED`
Default value is : `- SUCCESS
- WARNING
- FAILED
- KILLED
- CANCELLED
- RETRIED`
[
"SUCCESS",
"WARNING",
"FAILED",
"KILLED",
"CANCELLED",
"RETRIED"
]
1 nested properties
You can add multiple schedule(s) to a flow.
The scheduler keeps track of the last scheduled date, allowing you to easily backfill missed executions.
Keep in mind that if you change the trigger ID, the scheduler will consider this as a new schedule, and will start creating new scheduled executions from the current date.
By default, Schedules will use UTC. If you need a different timezone, use the timezone property to update it.
Examples
Schedule a flow every 15 minutes.
id: scheduled_flow
namespace: company.team
tasks:
- id: sleep_randomly
type: io.kestra.plugin.scripts.shell.Commands
runner: PROCESS
commands:
- echo "{{ execution.startDate ?? trigger.date }}"
- sleep $((RANDOM % 60 + 1))
triggers:
- id: every_15_minutes
type: io.kestra.plugin.core.trigger.Schedule
cron: '*/15 * * * *'
Schedule a flow every hour using the cron nickname @hourly.
id: scheduled_flow
namespace: company.team
tasks:
- id: log_hello_world
type: io.kestra.plugin.core.log.Log
message: Hello World! 🚀
triggers:
- id: hourly
type: io.kestra.plugin.core.trigger.Schedule
cron: "@hourly"
Schedule a flow on the first Monday of the month at 11 AM.
id: scheduled_flow
namespace: company.team
tasks:
- id: log_hello_world
type: io.kestra.plugin.core.log.Log
message: Hello World! 🚀
triggers:
- id: schedule
type: io.kestra.plugin.core.trigger.Schedule
cron: "0 11 * * 1"
conditions:
- type: io.kestra.plugin.core.condition.DayWeekInMonthCondition
date: "{{ trigger.date }}"
dayOfWeek: "MONDAY"
dayInMonth: "FIRST"
Schedule a flow every day at 9:00 AM and pause a schedule trigger after a failed execution using the stopAfter property.
id: business_critical_flow
namespace: company.team
tasks:
- id: important_task
type: io.kestra.plugin.core.log.Log
message: "if this run fails, disable the schedule until the issue is fixed"
triggers:
- id: stop_after_failed
type: io.kestra.plugin.core.trigger.Schedule
cron: "0 9 * * *"
stopAfter:
- FAILED
A standard unix cron expression with 5 fields (minutes precision). Using withSeconds: true you can switch to 6 fields and a seconds precision.
Can also be a cron extension / nickname:
@yearly, @annually, @monthly, @weekly, @daily, @midnight, @hourly
This property is deprecated and will be removed in the future. Instead, you can now go to the Triggers tab and start a highly customizable backfill process directly from the UI. This will allow you to backfill missed scheduled executions by providing a specific date range and custom labels. Read more about it in the Backfill documentation.
Default value is : false
If the scheduled execution didn't start after this delay (e.g. due to infrastructure issues), the execution will be skipped.
Default value is : false
ALL will recover all missed schedules, LAST will only recover the last missed one, and NONE will not recover any missed schedule.
The default is ALL unless a different value is configured using the global plugin configuration.
List of schedule conditions in order to limit the schedule trigger date.
Default value is : Etc/UTC
By default, the cron expression has 5 fields; setting this property to true will allow a 6th field for seconds precision.
Default value is : false
1 nested properties
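The seconds-precision option described above (the withSeconds property, mentioned in this section) can be illustrated with a six-field cron expression:

```yaml
# Six-field cron (seconds precision) enabled via withSeconds.
triggers:
  - id: every_30_seconds
    type: io.kestra.plugin.core.trigger.Schedule
    withSeconds: true
    cron: "*/30 * * * * *"
```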
Default value is : false
Default value is : false
ALL will recover all missed schedules, LAST will only recover the last missed one, and NONE will not recover any missed schedule.
The default is ALL unless a different value is configured using the global plugin configuration.
Default value is : Etc/UTC
1 nested properties
Examples
Toggle a trigger on flow input.
id: trigger_toggle
namespace: company.team
inputs:
- id: toggle
type: BOOLEAN
defaults: true
tasks:
- id: if
type: io.kestra.plugin.core.flow.If
condition: "{{inputs.toggle}}"
then:
- id: enable
type: io.kestra.plugin.core.trigger.Toggle
trigger: schedule
enabled: true
else:
- id: disable
type: io.kestra.plugin.core.trigger.Toggle
trigger: schedule
enabled: false
- id: log
type: io.kestra.plugin.core.log.Log
message: Hello World
triggers:
- id: schedule
type: io.kestra.plugin.core.trigger.Schedule
cron: "* * * * *"
Default value is : false
Default value is : false
Default value is : false
If not set, the current flow identifier will be used.
Default value is : false
If not set, the current flow namespace will be used.
1 nested properties
Webhook trigger allows you to create a unique URL that you can use to trigger a Kestra flow execution based on events in another application such as GitHub or Amazon EventBridge. In order to use that URL, you have to add a secret key that will secure your webhook URL.
The URL follows this format: https://{your_hostname}/api/v1/executions/webhook/{namespace}/{flowId}/{key}. Replace the templated values according to your workflow setup.
The webhook URL accepts GET, POST and PUT requests.
You can access the request body and headers sent by another application using the following template variables:
{{ trigger.body }}{{ trigger.headers }}.
The webhook response will be one of the following HTTP status codes:
- 404 if the namespace, flow or webhook key is not found.
- 200 if the webhook triggers an execution.
- 204 if the webhook cannot trigger an execution due to a lack of matching event conditions sent by the other application.
A webhook trigger can have conditions, but it doesn't support conditions of type MultipleCondition.
Examples
Add a webhook trigger to the current flow with the key 4wjtkzwVGBM9yKnjm3yv8r; the webhook will be available at the URI /api/v1/executions/webhook/{namespace}/{flowId}/4wjtkzwVGBM9yKnjm3yv8r.
id: webhook_flow
namespace: company.team
tasks:
- id: log_hello_world
type: io.kestra.plugin.core.log.Log
message: Hello World! 🚀
triggers:
- id: webhook
type: io.kestra.plugin.core.trigger.Webhook
key: 4wjtkzwVGBM9yKnjm3yv8r
Add a trigger matching a specific webhook event condition. The flow will be executed only if the condition is met.
id: condition_based_webhook_flow
namespace: company.team
tasks:
- id: log_hello_world
type: io.kestra.plugin.core.log.Log
message: Hello World! 🚀
triggers:
- id: webhook
type: io.kestra.plugin.core.trigger.Webhook
key: 4wjtkzwVGBM9yKnjm3yv8r
conditions:
- type: io.kestra.plugin.core.condition.ExpressionCondition
expression: "{{ trigger.body.hello == 'world' }}"
The key is used for generating the URL of the webhook.
::alert{type="warning"} Make sure to keep the webhook key secure. It's the only security mechanism to protect your endpoint from bad actors, and must be considered as a secret. You can use a random key generator to create the key. ::
Default value is : false
Default value is : false
1 nested properties
Examples
Send a N1QL query to a Couchbase database.
connectionString: couchbase://localhost
username: couchbase_user
password: couchbase_passwd
query: SELECT * FROM `COUCHBASE_BUCKET`(.`COUCHBASE_SCOPE`.`COUCHBASE_COLLECTION`)
fetchType: FETCH
Default value is : false
Default value is : false
FETCH_ONE - output just the first row. FETCH - output all the rows. STORE - store all the rows in a file. NONE - do nothing.
Default value is : STORE
Default value is : false
See Couchbase documentation about Prepared Statements for query syntax. This should be supplied with a parameter map if using named parameters, or an array for positional ones.
1 nested properties
Examples
Wait for a N1QL query to return results, and then iterate through rows.
id: couchbase-trigger
namespace: company.team
tasks:
- id: each
type: io.kestra.plugin.core.flow.EachSequential
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ json(taskrun.value) }}"
value: "{{ trigger.rows }}"
triggers:
- id: watch
type: io.kestra.plugin.couchbase.Trigger
interval: "PT5M"
connectionString: couchbase://localhost
username: couchbase_user
password: couchbase_passwd
query: SELECT * FROM `COUCHBASE_BUCKET`(.`COUCHBASE_SCOPE`.`COUCHBASE_COLLECTION`)
fetchType: FETCH
Default value is : false
FETCH_ONE - output just the first row. FETCH - output all the rows. STORE - store all the rows in a file. NONE - do nothing.
Default value is : STORE
The interval between two consecutive polls of the schedule; this can avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the minimal interval should be at least PT30S. See ISO 8601 Durations for more information on available interval values.
Default value is : 60.000000000
Default value is : false
See Couchbase documentation about Prepared Statements for query syntax. This should be supplied with a parameter map if using named parameters, or an array for positional ones.
1 nested properties
Examples
Decrypt a file
id: crypto_decrypt
namespace: company.team
inputs:
- id: file
type: FILE
tasks:
- id: decrypt
type: io.kestra.plugin.crypto.openpgp.Decrypt
from: "{{ inputs.file }}"
privateKey: |
-----BEGIN PGP PRIVATE KEY BLOCK-----
privateKeyPassphrase: my-passphrase
Decrypt a file and verify signature
id: crypto_decrypt
namespace: company.team
inputs:
- id: file
type: FILE
tasks:
- id: decrypt
type: io.kestra.plugin.crypto.openpgp.Decrypt
from: "{{ inputs.file }}"
privateKey: |
-----BEGIN PGP PRIVATE KEY BLOCK-----
privateKeyPassphrase: my-passphrase
signUsersKey:
- |
-----BEGIN PGP PRIVATE KEY BLOCK-----
requiredSignerUsers:
- [email protected]
Default value is : false
Default value is : false
Default value is : false
Must be an ASCII key export created with gpg --export-secret-key -a
Must be an ASCII key export created with gpg --export -a
1 nested properties
Examples
Encrypt a file without signing it.
id: crypto_encrypt
namespace: company.team
inputs:
- id: file
type: FILE
tasks:
- id: encrypt
type: io.kestra.plugin.crypto.openpgp.Encrypt
from: "{{ inputs.file }}"
key: |
-----BEGIN PGP PUBLIC KEY BLOCK----- ...
recipients:
- [email protected]
Encrypt and sign a file.
id: crypto_encrypt
namespace: company.team
inputs:
- id: file
type: FILE
tasks:
- id: encrypt
type: io.kestra.plugin.crypto.openpgp.Encrypt
from: "{{ inputs.file }}"
key: |
-----BEGIN PGP PUBLIC KEY BLOCK----- ...
recipients:
- [email protected]
signPublicKey: |
-----BEGIN PGP PUBLIC KEY BLOCK----- ...
signPrivateKey: |
-----BEGIN PGP PRIVATE KEY BLOCK-----
signPassphrase: my-passphrase
signUser: [email protected]
Default value is : false
Default value is : false
Must be an ASCII key export created with gpg --export -a
Default value is : false
Must be an ASCII key export created with gpg --export -a
Must be an ASCII key export created with gpg --export -a
If you want to sign the file, you need to provide a privateKey
1 nested properties
Examples
Create a Databricks cluster with one worker.
id: databricks_create_cluster
namespace: company.team
tasks:
- id: create_cluster
type: io.kestra.plugin.databricks.cluster.CreateCluster
authentication:
token: <your-token>
host: <your-host>
clusterName: kestra-demo
nodeTypeId: n2-highmem-4
numWorkers: 1
sparkVersion: 13.0.x-scala2.12
Default value is : false
Default value is : false
Default value is : false
Use this property along with minWorkers to use autoscaling. Otherwise, set a fixed number of workers using numWorkers.
Use this property along with maxWorkers for autoscaling. Otherwise, set a fixed number of workers using numWorkers.
You must set this property unless you use the minWorkers and maxWorkers properties for autoscaling.
1 nested properties
Examples
Delete a Databricks cluster.
id: databricks_delete_cluster
namespace: company.team
tasks:
- id: delete_cluster
type: io.kestra.plugin.databricks.cluster.DeleteCluster
authentication:
token: <your-token>
host: <your-host>
clusterId: <your-cluster>
Default value is : false
Default value is : false
Default value is : false
1 nested properties
The file can be of any size. The task will download the file in chunks of 1MB.
Examples
Download a file from the Databricks File System.
id: databricks_dbfs_download
namespace: company.team
tasks:
- id: download_file
type: io.kestra.plugin.databricks.dbfs.Download
authentication:
token: <your-token>
host: <your-host>
from: /Share/myFile.txt
Default value is : false
Default value is : false
Default value is : false
1 nested properties
The file can be of any size. The task will upload the file in chunks of 1MB.
Examples
Upload a file to the Databricks File System.
id: databricks_dbfs_upload
namespace: company.team
inputs:
- id: file
type: FILE
description: File to be uploaded to DBFS
tasks:
- id: upload_file
type: io.kestra.plugin.databricks.dbfs.Upload
authentication:
token: <your-token>
host: <your-host>
from: "{{ inputs.file }}"
to: /Share/myFile.txt
Must be a file from Kestra internal storage.
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
Create a Databricks job, run it, and wait for completion for five minutes.
id: databricks_job_create
namespace: company.team
tasks:
- id: create_job
type: io.kestra.plugin.databricks.job.CreateJob
authentication:
token: <your-token>
host: <your-host>
jobTasks:
- existingClusterId: <your-cluster>
taskKey: taskKey
sparkPythonTask:
pythonFile: /Shared/hello.py
sparkPythonTaskSource: WORKSPACE
waitForCompletion: PT5M
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
Submit a Databricks run and wait up to 5 minutes for its completion.
id: databricks_job_submit_run
namespace: company.team
tasks:
- id: submit_run
type: io.kestra.plugin.databricks.job.SubmitRun
authentication:
token: <your-token>
host: <your-host>
runTasks:
- existingClusterId: <your-cluster>
taskKey: taskKey
sparkPythonTask:
pythonFile: /Shared/hello.py
sparkPythonTaskSource: WORKSPACE
waitForCompletion: PT5M
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Can be a map of string/string or a variable that binds to a JSON object.
Can be a map of string/string or a variable that binds to a JSON object.
Can be a list of strings or a variable that binds to a JSON array of strings.
Can be a list of strings or a variable that binds to a JSON array of strings.
Can be a list of strings or a variable that binds to a JSON array of strings.
Can be a list of strings or a variable that binds to a JSON array of strings.
Can be a map of string/string or a variable that binds to a JSON object.
See Retrieve the connection details in the Databricks documentation to discover how to retrieve the needed configuration properties. We're using the Databricks JDBC driver to execute a Query, see https://docs.databricks.com/integrations/jdbc-odbc-bi.html#jdbc-driver-capabilities for its capabilities.
Due to a current limitation of the JDBC driver with Java 21, Arrow is disabled and performance may be impacted; see here and here for Databricks' status on Java 21 support.
Examples
id: databricks_sql_query
namespace: company.team
tasks:
- id: sql_query
type: io.kestra.plugin.databricks.sql.Query
accessToken: <your-accessToken>
host: <your-host>
httpPath: <your-httpPath>
sql: SELECT 1
To retrieve the HTTP Path, go to your Databricks cluster, click on Advanced options then, click on JDBC/ODBC. See Retrieve the connection details for more details.
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
Compile and run a Dataform project from Git.
id: dataform
namespace: company.team
tasks:
- id: wdir
type: io.kestra.plugin.core.flow.WorkingDirectory
tasks:
- id: clone_repo
type: io.kestra.plugin.git.Clone
url: https://github.com/dataform-co/dataform-example-project-bigquery
- id: transform
type: io.kestra.plugin.dataform.cli.DataformCLI
beforeCommands:
- dataform compile
commands:
- dataform run --dry-run
Default value is : false
Default value is : dataformco/dataform:latest
Default value is : false
Default value is : false
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
1 nested properties
Examples
Invoke the dbt build command.
id: dbt_build
namespace: company.team
tasks:
- id: working_directory
type: io.kestra.plugin.core.flow.WorkingDirectory
tasks:
- id: clone_repository
type: io.kestra.plugin.git.Clone
url: https://github.com/kestra-io/dbt-demo
branch: main
- id: dbt_build
type: io.kestra.plugin.dbt.cli.Build
taskRunner:
type: io.kestra.plugin.scripts.runner.docker.Docker
dbtPath: /usr/local/bin/dbt
containerImage: ghcr.io/kestra-io/dbt-duckdb
profiles: |
jaffle_shop:
outputs:
dev:
type: duckdb
path: ':memory:'
extensions:
- parquet
target: dev
Default value is : false
Default value is : ghcr.io/kestra-io/dbt
Default value is : ./bin/dbt
Default value is : false
Default value is : false
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
If a profile.yml file already exists in the current working directory, it will be overwritten.
Default is the current working directory and its parents.
Deprecated, use 'taskRunner' instead.
1 nested properties
Examples
Invoke the dbt compile command.
id: dbt_compile
namespace: company.team
tasks:
- id: working_directory
type: io.kestra.plugin.core.flow.WorkingDirectory
tasks:
- id: clone_repository
type: io.kestra.plugin.git.Clone
url: https://github.com/kestra-io/dbt-demo
branch: main
- id: dbt_compile
type: io.kestra.plugin.dbt.cli.Compile
taskRunner:
type: io.kestra.plugin.scripts.runner.docker.Docker
dbtPath: /usr/local/bin/dbt
containerImage: ghcr.io/kestra-io/dbt-duckdb
profiles: |
jaffle_shop:
outputs:
dev:
type: duckdb
path: ':memory:'
extensions:
- parquet
target: dev
Default value is : false
Default value is : ghcr.io/kestra-io/dbt
Default value is : ./bin/dbt
Default value is : false
Default value is : false
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
If a profile.yml file already exists in the current working directory, it will be overwritten.
Default is the current working directory and its parents.
Deprecated, use 'taskRunner' instead.
1 nested properties
Examples
Launch a dbt build command on a sample dbt project hosted on GitHub.
id: dbt_build
namespace: company.team
tasks:
- id: dbt
type: io.kestra.plugin.core.flow.WorkingDirectory
tasks:
- id: cloneRepository
type: io.kestra.plugin.git.Clone
url: https://github.com/kestra-io/dbt-example
branch: main
- id: dbt-build
type: io.kestra.plugin.dbt.cli.DbtCLI
containerImage: ghcr.io/kestra-io/dbt-duckdb:latest
taskRunner:
type: io.kestra.plugin.scripts.runner.docker.Docker
commands:
- dbt build
profiles: |
my_dbt_project:
outputs:
dev:
type: duckdb
path: ":memory:"
target: dev
Install a custom dbt version and run dbt deps and dbt build commands. Note how you can also configure the memory limit for the Docker runner. This is useful when you see Zombie processes.
id: dbt_custom_dependencies
namespace: company.team
inputs:
- id: dbt_version
type: STRING
defaults: "dbt-duckdb==1.6.0"
tasks:
- id: git
type: io.kestra.plugin.core.flow.WorkingDirectory
tasks:
- id: clone_repository
type: io.kestra.plugin.git.Clone
url: https://github.com/kestra-io/dbt-example
branch: main
- id: dbt
type: io.kestra.plugin.dbt.cli.DbtCLI
taskRunner:
type: io.kestra.plugin.scripts.runner.docker.Docker
memory:
memory: 1GB
containerImage: python:3.11-slim
beforeCommands:
- pip install uv
- uv venv --quiet
- . .venv/bin/activate --quiet
- uv pip install --quiet {{ inputs.dbt_version }}
commands:
- dbt deps
- dbt build
profiles: |
my_dbt_project:
outputs:
dev:
type: duckdb
path: ":memory:"
fixed_retries: 1
threads: 16
timeout_seconds: 300
target: dev
Clone a Git repository and build dbt models. Note that, as the dbt project files are in a separate directory, you need to set the projectDir task property and use --project-dir in each dbt CLI command.
id: dwh_and_analytics
namespace: company.team
tasks:
- id: dbt
type: io.kestra.plugin.core.flow.WorkingDirectory
tasks:
- id: clone_repository
type: io.kestra.plugin.git.Clone
url: https://github.com/kestra-io/dbt-example
branch: master
- id: dbt_build
type: io.kestra.plugin.dbt.cli.DbtCLI
taskRunner:
type: io.kestra.plugin.scripts.runner.docker.Docker
containerImage: ghcr.io/kestra-io/dbt-duckdb:latest
commands:
- dbt deps --project-dir dbt --target prod
- dbt build --project-dir dbt --target prod
projectDir: dbt
profiles: |
my_dbt_project:
outputs:
dev:
type: duckdb
path: dbt.duckdb
extensions:
- parquet
fixed_retries: 1
threads: 16
timeout_seconds: 300
prod:
type: duckdb
path: dbt2.duckdb
extensions:
- parquet
fixed_retries: 1
threads: 16
timeout_seconds: 300
target: dev
Default value is : false
Default value is : ghcr.io/kestra-io/dbt
Default value is : false
If set to false, all commands will be executed one after the other. The final state of the task execution is determined by the last command. Note that this property may be ignored if an incompatible interpreter is specified.
You can also disable it if your interpreter does not support the set -e option.
Default value is : true
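The set -e behavior this option relies on can be illustrated with a minimal, generic POSIX shell snippet (not Kestra-specific):

```shell
# With `set -e`, the shell exits at the first failing command,
# so "step2" is never printed.
output=$(sh -c 'set -e; echo step1; false; echo step2' 2>&1) || true
echo "$output"   # prints: step1
```

Without set -e, the same script would print both step1 and step2, and the task's final state would reflect only the last command's exit code.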
Default value is : ["/bin/sh", "-c"]
Default value is : ["/bin/sh", "-c"]
Default value is : false
Required to use the {{ outputDir }} expression. Note that it could increase the starting time. Deprecated, use the outputFiles property instead.
Default value is : "false"
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
If a profile.yml file already exists in the current working directory, it will be overwritten.
To use it, also use this directory in the --project-dir flag on the dbt CLI commands.
Only used if the taskRunner property is not set.
Default value is : AUTO
Default value is : true
1 nested properties
Examples
Invoke the dbt deps command.
id: dbt_deps
namespace: company.team
tasks:
- id: working_directory
type: io.kestra.plugin.core.flow.WorkingDirectory
tasks:
- id: clone_repository
type: io.kestra.plugin.git.Clone
url: https://github.com/kestra-io/dbt-demo
branch: main
- id: dbt_deps
type: io.kestra.plugin.dbt.cli.Deps
taskRunner:
type: io.kestra.plugin.scripts.runner.docker.Docker
dbtPath: /usr/local/bin/dbt
containerImage: ghcr.io/kestra-io/dbt-duckdb
profiles: |
jaffle_shop:
outputs:
dev:
type: duckdb
path: ':memory:'
extensions:
- parquet
target: dev
Default value is : false
Default value is : ghcr.io/kestra-io/dbt
Default value is : ./bin/dbt
Default value is : false
Default value is : false
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
If a profile.yml file already exists in the current working directory, it will be overwritten.
Default is the current working directory and its parents.
Deprecated, use 'taskRunner' instead.
1 nested properties
Examples
Invoke the dbt source freshness command.
id: dbt_freshness
namespace: company.team
tasks:
- id: working_directory
type: io.kestra.plugin.core.flow.WorkingDirectory
tasks:
- id: clone_repository
type: io.kestra.plugin.git.Clone
url: https://github.com/kestra-io/dbt-demo
branch: main
- id: dbt_freshness
type: io.kestra.plugin.dbt.cli.Freshness
taskRunner:
type: io.kestra.plugin.scripts.runner.docker.Docker
dbtPath: /usr/local/bin/dbt
containerImage: ghcr.io/kestra-io/dbt-duckdb
profiles: |
jaffle_shop:
outputs:
dev:
type: duckdb
path: ':memory:'
extensions:
- parquet
target: dev
Default value is : false
Default value is : ghcr.io/kestra-io/dbt
Default value is : ./bin/dbt
Default value is : false
Default value is : false
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
If a profile.yml file already exists in the current working directory, it will be overwritten.
Default is the current working directory and its parents.
Deprecated, use 'taskRunner' instead.
1 nested properties
Examples
Invoke the dbt list command.
id: dbt_list
namespace: company.team
tasks:
- id: working_directory
type: io.kestra.plugin.core.flow.WorkingDirectory
tasks:
- id: clone_repository
type: io.kestra.plugin.git.Clone
url: https://github.com/kestra-io/dbt-demo
branch: main
- id: dbt_list
type: io.kestra.plugin.dbt.cli.List
taskRunner:
type: io.kestra.plugin.scripts.runner.docker.Docker
dbtPath: /usr/local/bin/dbt
containerImage: ghcr.io/kestra-io/dbt-duckdb
profiles: |
jaffle_shop:
outputs:
dev:
type: duckdb
path: ':memory:'
extensions:
- parquet
target: dev
Default value is : false
Default value is : ghcr.io/kestra-io/dbt
Default value is : ./bin/dbt
Default value is : false
Default value is : false
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
If a profile.yml file already exists in the current working directory, it will be overwritten.
Default is the current working directory and its parents.
Deprecated, use 'taskRunner' instead.
1 nested properties
Examples
Invoke the dbt run command.
id: dbt_run
namespace: company.team
tasks:
- id: working_directory
type: io.kestra.plugin.core.flow.WorkingDirectory
tasks:
- id: clone_repository
type: io.kestra.plugin.git.Clone
url: https://github.com/kestra-io/dbt-demo
branch: main
- id: dbt_run
type: io.kestra.plugin.dbt.cli.Run
taskRunner:
type: io.kestra.plugin.scripts.runner.docker.Docker
dbtPath: /usr/local/bin/dbt
containerImage: ghcr.io/kestra-io/dbt-duckdb
profiles: |
jaffle_shop:
outputs:
dev:
type: duckdb
path: ':memory:'
extensions:
- parquet
target: dev
Default value is : false
Default value is : ghcr.io/kestra-io/dbt
Default value is : ./bin/dbt
Default value is : false
Default value is : false
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
If a profile.yml file already exists in the current working directory, it will be overwritten.
Default is the current working directory and its parents.
Deprecated, use 'taskRunner' instead.
1 nested properties
Examples
Invoke the dbt seed command.
id: dbt_seed
namespace: company.team
tasks:
- id: working_directory
type: io.kestra.plugin.core.flow.WorkingDirectory
tasks:
- id: clone_repository
type: io.kestra.plugin.git.Clone
url: https://github.com/kestra-io/dbt-demo
branch: main
- id: dbt_seed
type: io.kestra.plugin.dbt.cli.Seed
taskRunner:
type: io.kestra.plugin.scripts.runner.docker.Docker
dbtPath: /usr/local/bin/dbt
containerImage: ghcr.io/kestra-io/dbt-duckdb
profiles: |
jaffle_shop:
outputs:
dev:
type: duckdb
path: ':memory:'
extensions:
- parquet
target: dev
Default value is : false
Default value is : ghcr.io/kestra-io/dbt
Default value is : ./bin/dbt
Default value is : false
Default value is : false
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
If a profile.yml file already exists in the current working directory, it will be overwritten.
Default is the current working directory and its parents.
Deprecated, use 'taskRunner' instead.
1 nested properties
Use it to install dbt requirements locally in a Python virtualenv if you don't want to use dbt via Docker.
In this case, you need to use a WorkingDirectory task together with this Setup task to set up dbt before using any of the dbt tasks.
Examples
Setup dbt by installing pip dependencies in a Python virtualenv and initializing the profile directory.
id: dbt_setup
namespace: company.team
tasks:
- id: working_directory
type: io.kestra.plugin.core.flow.WorkingDirectory
tasks:
- id: clone_repository
type: io.kestra.plugin.git.Clone
url: https://github.com/kestra-io/dbt-demo
branch: main
- id: dbt_setup
type: io.kestra.plugin.dbt.cli.Setup
requirements:
- dbt-duckdb
profiles:
jaffle_shop:
outputs:
dev:
type: duckdb
path: ':memory:'
extensions:
- parquet
target: dev
- id: dbt_build
type: io.kestra.plugin.dbt.cli.Build
List of Python dependencies to set up in the virtualenv, in the same format as requirements.txt. It must at least provide dbt.
Default value is : false
Default value is : python
Default value is : false
If set to false, all commands will be executed one after the other. The final state of the task execution is determined by the last command. Note that this property may be ignored if an incompatible interpreter is specified.
You can also disable it if your interpreter does not support the set -e option.
Default value is : true
You can define the files as a map or a JSON string. Each file can be defined inline or can reference a file from Kestra's internal storage.
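For illustration (assuming this describes the task's inputFiles property — a hypothetical mapping inferred from the description), the map form might look like:

```yaml
# Hypothetical sketch: inputFiles assumed as the property name;
# the first file is inline content, the second references internal storage.
inputFiles:
  profiles.yml: |
    jaffle_shop:
      target: dev
  data.csv: "{{ outputs.extract.uri }}"
```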
Default value is : ["/bin/sh", "-c"]
Default value is : ["/bin/sh", "-c"]
Default value is : false
Required to use the {{ outputDir }} expression. Note that it could increase the starting time. Deprecated, use the outputFiles property instead.
Default value is : "false"
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
Set the python interpreter path to use.
Default value is : python
Only used if the taskRunner property is not set.
Default value is : AUTO
Default value is : true
1 nested properties
Examples
Invoke the dbt snapshot command.
id: dbt_snapshot
namespace: company.team
tasks:
- id: working_directory
type: io.kestra.plugin.core.flow.WorkingDirectory
tasks:
- id: clone_repository
type: io.kestra.plugin.git.Clone
url: https://github.com/kestra-io/dbt-demo
branch: main
- id: dbt_snapshot
type: io.kestra.plugin.dbt.cli.Snapshot
taskRunner:
type: io.kestra.plugin.scripts.runner.docker.Docker
dbtPath: /usr/local/bin/dbt
containerImage: ghcr.io/kestra-io/dbt-duckdb
profiles: |
jaffle_shop:
outputs:
dev:
type: duckdb
path: ':memory:'
extensions:
- parquet
target: dev
Default value is : false
Default value is : ghcr.io/kestra-io/dbt
Default value is : ./bin/dbt
Default value is : false
Default value is : false
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
If a profile.yml file already exists in the current working directory, it will be overwritten.
Default is the current working directory and its parents.
Deprecated, use 'taskRunner' instead.
1 nested properties
Examples
Invoke the dbt test command.
id: dbt_test
namespace: company.team
tasks:
- id: wdir
type: io.kestra.plugin.core.flow.WorkingDirectory
tasks:
- id: clone_repository
type: io.kestra.plugin.git.Clone
url: https://github.com/kestra-io/dbt-example
branch: main
- id: dbt_test
type: io.kestra.plugin.dbt.cli.Test
taskRunner:
type: io.kestra.plugin.scripts.runner.docker.Docker
dbtPath: /usr/local/bin/dbt
containerImage: ghcr.io/kestra-io/dbt-duckdb
profiles: |
my_dbt_project:
outputs:
dev:
type: duckdb
path: ':memory:'
target: dev
Default value is : false
Default value is : ghcr.io/kestra-io/dbt
Default value is : ./bin/dbt
Default value is : false
Default value is : false
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
If a profile.yml file already exists in the current working directory, it will be overwritten.
Default is the current working directory and its parents.
Deprecated, use 'taskRunner' instead.
1 nested properties
Examples
id: dbt_check_status
namespace: company.team
tasks:
- id: check_status
type: io.kestra.plugin.dbt.cloud.CheckStatus
accountId: "dbt_account"
token: "dbt_token"
runId: "run_id"
Default value is : false
Default value is : https://cloud.getdbt.com
Default value is : false
Default value is : false
1 nested properties
Use this task to kick off a run for a job. When this endpoint returns a successful response, a new run will be enqueued for the account. If you activate the wait option, the task will wait for the job to finish and display all logs and dynamic tasks.
Examples
id: dbt_trigger_job_run
namespace: company.team
tasks:
- id: trigger_run
type: io.kestra.plugin.dbt.cloud.TriggerRun
accountId: "dbt_account"
token: "dbt_token"
jobId: "job_id"
Default value is : false
Default value is : https://cloud.getdbt.com
Default value is : Triggered by Kestra.
Default value is : false
Default value is : false
1 nested properties
Examples
snapshotMode: INITIAL
hostname: 127.0.0.1
port: "50000"
username: db2inst1
password: my_password
database: my_database
maxRecords: 100
Default value is : false
Possible settings are:
- ADD_FIELD: Add a deleted field as a boolean.
- NULL: Send a row with all values set to null.
- DROP: Don't send the deleted row.
Default value is : ADD_FIELD
Default value is : deleted
Default value is : false
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the includedColumns connector configuration property.
The connector captures changes in any database whose name is not in the excludedDatabases. Do not also set the includedDatabases connector configuration property.
The connector captures changes in any table not included in excludedTables. Each identifier is of the form databaseName.tableName. Do not also specify the includedTables connector configuration property.
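To sketch how these exclusion filters combine (property names taken from the descriptions above; database and table names are hypothetical), a configuration fragment might look like:

```yaml
# Sketch: exclude database 'test' and table 'my_database.audit_log' from capture;
# everything else remains captured
excludedDatabases:
  - test
excludedTables:
  - my_database.audit_log
```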
Possible settings are:
- RAW: Send raw data from Debezium.
- INLINE: Send a row like in the source with only data (removing after and before); all columns are present for each row.
- WRAP: Send a row like INLINE, but wrapped in a record field.
Default value is : INLINE
Ignore CREATE, ALTER, DROP and TRUNCATE operations.
Default value is : true
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the excludedColumns connector configuration property.
The connector does not capture changes in any database whose name is not in includedDatabases. By default, the connector captures changes in all databases. Do not also set the excludedDatabases connector configuration property.
The connector does not capture changes in any table not included in includedTables. Each identifier is of the form databaseName.tableName. By default, the connector captures changes in every non-system table in each database whose changes are being captured. Do not also specify the excludedTables connector configuration property.
Possible settings are:
- ADD_FIELD: Add key(s) merged with columns.
- DROP: Drop keys.
Default value is : ADD_FIELD
Default value is : false
It's not a hard limit and is evaluated every second. It is taken into account after the snapshot, if any.
It's not a hard limit and is evaluated every second.
It's not a hard limit and is evaluated every second. The properties 'maxRecord', 'maxDuration' and 'maxWait' are evaluated only after the snapshot is done.
Default value is : 3600.000000000
It's not a hard limit and is evaluated every second. It is taken into account after the snapshot, if any.
Default value is : 10.000000000
Possible settings are:
- ADD_FIELD: Add metadata in a column named metadata.
- DROP: Drop metadata.
Default value is : ADD_FIELD
Default value is : metadata
Any additional configuration properties that are valid for the current driver.
Possible settings are:
- ALWAYS: The connector performs a snapshot every time it starts.
- INITIAL: The connector runs a snapshot only when no offsets have been recorded for the logical server name.
- INITIAL_ONLY: The connector runs a snapshot only when no offsets have been recorded for the logical server name and then stops; i.e., it will not read change events from the binlog.
- WHEN_NEEDED: After the connector starts, it performs a snapshot only if it detects one of the following circumstances: 1. It cannot detect any topic offsets. 2. A previously recorded offset specifies a log position that is not available on the server.
- NO_DATA: The connector captures the structure of all relevant tables, performing all the steps described in INITIAL, except that it does not create READ events to represent the data set at the point of the connector’s start-up.
- RECOVERY: Set this option to restore a database schema history topic that is lost or corrupted. After a restart, the connector runs a snapshot that rebuilds the topic from the source tables.
Default value is : INITIAL
Possible settings are:
- TABLE: Split all rows by table, with outputs named database.table.
- DATABASE: Split all rows by database, with outputs named database.
- OFF: Do NOT split rows, resulting in a single data output.
Default value is : TABLE
Default value is : debezium-state
1 nested properties
If you would like to consume multiple messages processed within a given time frame and process them in batch, you can use the io.kestra.plugin.debezium.db2.Trigger instead.
Examples
Consume a message from a DB2 database via change data capture in real-time.
id: debezium-db2
namespace: company.team
tasks:
- id: send_data
type: io.kestra.plugin.core.log.Log
message: "{{ trigger.data }}"
triggers:
- id: realtime
type: io.kestra.plugin.debezium.db2.RealtimeTrigger
hostname: 127.0.0.1
port: 50000
username: db2inst1
password: my_password
database: my_database
Possible settings are:
- ADD_FIELD: Add a deleted field as a boolean.
- NULL: Send a row with all values set to null.
- DROP: Don't send the deleted row.
Default value is : ADD_FIELD
Default value is : deleted
Default value is : false
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the includedColumns connector configuration property.
The connector captures changes in any database whose name is not in the excludedDatabases. Do not also set the includedDatabases connector configuration property.
The connector captures changes in any table not included in excludedTables. Each identifier is of the form databaseName.tableName. Do not also specify the includedTables connector configuration property.
Possible settings are:
- RAW: Send raw data from Debezium.
- INLINE: Send a row like in the source with only data (removing after and before); all columns are present for each row.
- WRAP: Send a row like INLINE, but wrapped in a record field.
Default value is : INLINE
Ignore CREATE, ALTER, DROP and TRUNCATE operations.
Default value is : true
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the excludedColumns connector configuration property.
The connector does not capture changes in any database whose name is not in includedDatabases. By default, the connector captures changes in all databases. Do not also set the excludedDatabases connector configuration property.
The connector does not capture changes in any table not included in includedTables. Each identifier is of the form databaseName.tableName. By default, the connector captures changes in every non-system table in each database whose changes are being captured. Do not also specify the excludedTables connector configuration property.
Possible settings are:
- ADD_FIELD: Add key(s) merged with columns.
- DROP: Drop keys.
Default value is : ADD_FIELD
Default value is : false
Possible settings are:
- ADD_FIELD: Add metadata in a column named metadata.
- DROP: Drop metadata.
Default value is : ADD_FIELD
Default value is : metadata
Possible values are:
- ON_EACH_BATCH: after each batch of records consumed by this trigger, the offsets will be stored in the KV Store. This avoids any duplicated records being consumed but can be costly if many events are produced.
- ON_STOP: when this trigger is stopped or killed, the offsets will be stored in the KV Store. This avoids any unnecessary writes to the KV Store, but if the trigger is not stopped gracefully, the KV Store value may not be updated, leading to duplicate record consumption.
Default value is : ON_EACH_BATCH
Any additional configuration properties that are valid for the current driver.
Possible settings are:
- ALWAYS: The connector performs a snapshot every time it starts.
- INITIAL: The connector runs a snapshot only when no offsets have been recorded for the logical server name.
- INITIAL_ONLY: The connector runs a snapshot only when no offsets have been recorded for the logical server name and then stops; i.e., it will not read change events from the binlog.
- WHEN_NEEDED: After the connector starts, it performs a snapshot only if it detects one of the following circumstances: 1. It cannot detect any topic offsets. 2. A previously recorded offset specifies a log position that is not available on the server.
- NO_DATA: The connector captures the structure of all relevant tables, performing all the steps described in INITIAL, except that it does not create READ events to represent the data set at the point of the connector’s start-up.
- RECOVERY: Set this option to restore a database schema history topic that is lost or corrupted. After a restart, the connector runs a snapshot that rebuilds the topic from the source tables.
Default value is : INITIAL
Possible settings are:
- TABLE: Split all rows by table, with outputs named database.table.
- DATABASE: Split all rows by database, with outputs named database.
- OFF: Do NOT split rows, resulting in a single data output.
Default value is : TABLE
Default value is : debezium-state
1 nested properties
If you would like to consume each message from change data capture in real-time and create one execution per message, you can use the io.kestra.plugin.debezium.db2.RealtimeTrigger instead.
Examples
snapshotMode: INITIAL
hostname: 127.0.0.1
port: "50000"
username: db2inst1
password: my_password
database: my_database
maxRecords: 100
Possible settings are:
- ADD_FIELD: Add a deleted field as a boolean.
- NULL: Send a row with all values set to null.
- DROP: Don't send the deleted row.
Default value is : ADD_FIELD
Default value is : deleted
Default value is : false
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the includedColumns connector configuration property.
The connector captures changes in any database whose name is not in the excludedDatabases. Do not also set the includedDatabases connector configuration property.
The connector captures changes in any table not included in excludedTables. Each identifier is of the form databaseName.tableName. Do not also specify the includedTables connector configuration property.
Possible settings are:
- RAW: Send raw data from Debezium.
- INLINE: Send a row like in the source with only data (removing after and before); all columns are present for each row.
- WRAP: Send a row like INLINE, but wrapped in a record field.
Default value is : INLINE
Ignore CREATE, ALTER, DROP and TRUNCATE operations.
Default value is : true
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the excludedColumns connector configuration property.
The connector does not capture changes in any database whose name is not in includedDatabases. By default, the connector captures changes in all databases. Do not also set the excludedDatabases connector configuration property.
The connector does not capture changes in any table not included in includedTables. Each identifier is of the form databaseName.tableName. By default, the connector captures changes in every non-system table in each database whose changes are being captured. Do not also specify the excludedTables connector configuration property.
The interval between two consecutive polls; this can avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the minimal interval should be at least PT30S. See ISO 8601 durations for more information on available interval values.
Default value is : 60.000000000
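As a sketch, an ISO 8601 duration on the trigger (assuming the property is named interval, per the description above) looks like:

```yaml
# PT30S = poll every 30 seconds (ISO 8601 duration);
# the 60-second default corresponds to PT1M
interval: PT30S
```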
Possible settings are:
- ADD_FIELD: Add key(s) merged with columns.
- DROP: Drop keys.
Default value is : ADD_FIELD
Default value is : false
It's not a hard limit and is evaluated every second. It is taken into account after the snapshot, if any.
It's not a hard limit and is evaluated every second.
It's not a hard limit and is evaluated every second. The properties 'maxRecord', 'maxDuration' and 'maxWait' are evaluated only after the snapshot is done.
Default value is : 3600.000000000
It's not a hard limit and is evaluated every second. It is taken into account after the snapshot, if any.
Default value is : 10.000000000
Possible settings are:
- ADD_FIELD: Add metadata in a column named metadata.
- DROP: Drop metadata.
Default value is : ADD_FIELD
Default value is : metadata
Any additional configuration properties that are valid for the current driver.
Possible settings are:
- ALWAYS: The connector performs a snapshot every time it starts.
- INITIAL: The connector runs a snapshot only when no offsets have been recorded for the logical server name.
- INITIAL_ONLY: The connector runs a snapshot only when no offsets have been recorded for the logical server name and then stops; i.e., it will not read change events from the binlog.
- WHEN_NEEDED: After the connector starts, it performs a snapshot only if it detects one of the following circumstances: 1. It cannot detect any topic offsets. 2. A previously recorded offset specifies a log position that is not available on the server.
- NO_DATA: The connector captures the structure of all relevant tables, performing all the steps described in INITIAL, except that it does not create READ events to represent the data set at the point of the connector’s start-up.
- RECOVERY: Set this option to restore a database schema history topic that is lost or corrupted. After a restart, the connector runs a snapshot that rebuilds the topic from the source tables.
Default value is : INITIAL
Possible settings are:
- TABLE: Split all rows by table, with outputs named database.table.
- DATABASE: Split all rows by database, with outputs named database.
- OFF: Do NOT split rows, resulting in a single data output.
Default value is : TABLE
Default value is : debezium-state
1 nested properties
Examples
Replica set connection
snapshotMode: INITIAL
connectionString: mongodb://mongo_user:[email protected]:27017/?replicaSet=rs0
maxRecords: 100
Sharded connection
snapshotMode: INITIAL
connectionString: mongodb://mongo_user:[email protected]:27017,mongos1.example.com:27017/
maxRecords: 100
Replica set SRV connection
snapshotMode: INITIAL
connectionString: mongodb+srv://mongo_user:[email protected]/?replicaSet=rs0
maxRecords: 100
Sharded SRV connection
snapshotMode: INITIAL
connectionString: mongodb+srv://mongo_user:[email protected]/
maxRecords: 100
Default value is : false
Possible settings are:
- ADD_FIELD: Add a deleted field as a boolean.
- NULL: Send a row with all values set to null.
- DROP: Don't send the deleted row.
Default value is : ADD_FIELD
Default value is : deleted
Default value is : false
A list of regular expressions that match the collection namespaces.
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the includedColumns connector configuration property.
The connector captures changes in any database whose name is not in the excludedDatabases. Do not also set the includedDatabases connector configuration property.
The connector captures changes in any table not included in excludedTables. Each identifier is of the form databaseName.tableName. Do not also specify the includedTables connector configuration property.
Possible settings are:
- RAW: Send raw data from Debezium.
- INLINE: Send a row like in the source with only data (removing after and before); all columns are present for each row.
- WRAP: Send a row like INLINE, but wrapped in a record field.
Default value is : INLINE
Ignore CREATE, ALTER, DROP and TRUNCATE operations.
Default value is : true
A list of regular expressions that match the collection namespaces.
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the excludedColumns connector configuration property.
The connector does not capture changes in any database whose name is not in includedDatabases. By default, the connector captures changes in all databases. Do not also set the excludedDatabases connector configuration property.
The connector does not capture changes in any table not included in includedTables. Each identifier is of the form databaseName.tableName. By default, the connector captures changes in every non-system table in each database whose changes are being captured. Do not also specify the excludedTables connector configuration property.
Possible settings are:
- ADD_FIELD: Add key(s) merged with the columns.
- DROP: Drop keys.
Default value is : ADD_FIELD
Default value is : false
It's not a hard limit and is evaluated every second. It is taken into account after the snapshot, if any.
It's not a hard limit and is evaluated every second.
It's not a hard limit and is evaluated every second. The properties 'maxRecords', 'maxDuration' and 'maxWait' are evaluated only after the snapshot is done.
Default value is : 3600.000000000
It's not a hard limit and is evaluated every second. It is taken into account after the snapshot, if any.
Default value is : 10.000000000
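Taken together, these stop conditions can be combined on a capture task; whichever limit is reached first ends the capture. A minimal sketch, assuming a MongoDB Capture task type that mirrors the io.kestra.plugin.debezium.mongodb.* trigger types shown in this document, with placeholder credentials:

```yaml
id: debezium_mongodb_capture
namespace: company.team

tasks:
  - id: capture
    # Assumed task type (by analogy with the mongodb triggers in this document).
    type: io.kestra.plugin.debezium.mongodb.Capture
    # Placeholder connection string; replace with your own.
    connectionString: mongodb://user:pass@mongodb.example.com:27017/?replicaSet=rs0
    snapshotMode: INITIAL
    maxRecords: 100    # soft limit, checked every second, applied after the snapshot
    maxDuration: PT1H  # stop after one hour at most
    maxWait: PT10S     # stop once no new record arrives for 10 seconds
```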
Possible settings are:
- ADD_FIELD: Add metadata in a column named metadata.
- DROP: Drop metadata.
Default value is : ADD_FIELD
Default value is : metadata
Any additional configuration properties that are valid for the current driver.
Possible settings are:
- INITIAL: The connector runs a snapshot only when no offsets have been recorded for the logical server name.
- INITIAL_ONLY: The connector runs a snapshot only when no offsets have been recorded for the logical server name and then stops; i.e. it will not read change events from the binlog.
- NO_DATA: The connector captures the structure of all relevant tables, performing all the steps described in the default snapshot workflow, except that it does not create READ events to represent the data set at the point of the connector's start-up.
- WHEN_NEEDED: The connector runs a snapshot upon startup whenever it deems it necessary; that is, when no offsets are available, or when a previously recorded offset specifies a binlog location or GTID that is not available in the server.
Default value is : INITIAL
Possible settings are:
- TABLE: Split all rows by table, with outputs named database.table.
- DATABASE: Split all rows by database, with outputs named database.
- OFF: Do NOT split rows, resulting in a single data output.
Default value is : TABLE
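For example, to keep a single output rather than one per table, the split can be disabled. A hedged sketch: the splitTable property name and Capture task type are assumed for this plugin family, and the value is quoted so YAML does not read OFF as a boolean:

```yaml
tasks:
  - id: capture
    # Assumed task type and property name; a sketch, not a verified configuration.
    type: io.kestra.plugin.debezium.mongodb.Capture
    connectionString: mongodb://user:pass@mongodb.example.com:27017/?replicaSet=rs0
    splitTable: "OFF"  # quoted: plain OFF is a boolean in YAML 1.1
```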
Default value is : debezium-state
1 nested property
If you would like to consume multiple messages processed within a given time frame and process them in batch, you can use the io.kestra.plugin.debezium.mongodb.Trigger instead.
Examples
Sharded connection
id: debezium-mongodb
namespace: company.team
tasks:
- id: send_data
type: io.kestra.plugin.core.log.Log
message: "{{ trigger.data }}"
triggers:
- id: realtime
type: io.kestra.plugin.debezium.mongodb.RealtimeTrigger
snapshotMode: INITIAL
connectionString: mongodb://mongo_user:[email protected]:27017,mongos1.example.com:27017/
Replica set connection
id: debezium-mongodb
namespace: company.team
tasks:
- id: send_data
type: io.kestra.plugin.core.log.Log
message: "{{ trigger.data }}"
triggers:
- id: realtime
type: io.kestra.plugin.debezium.mongodb.RealtimeTrigger
snapshotMode: INITIAL
connectionString: mongodb://mongo_user:[email protected]:27017/?replicaSet=rs0
Possible settings are:
- ADD_FIELD: Add a deleted field as a boolean.
- NULL: Send a row with all values set to null.
- DROP: Don't send the deleted row.
Default value is : ADD_FIELD
Default value is : deleted
Default value is : false
A list of regular expressions that match the collection namespaces (for example,
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the includedColumns connector configuration property.
The connector captures changes in any database whose name is not in the excludedDatabases. Do not also set the includedDatabases connector configuration property.
The connector captures changes in any table not included in excludedTables. Each identifier is of the form databaseName.tableName. Do not also specify the includedTables connector configuration property.
Possible settings are:
- RAW: Send raw data from Debezium.
- INLINE: Send a row like in the source with only the data (after & before removed); all the columns will be present for each row.
- WRAP: Send a row like INLINE, but wrapped in a record field.
Default value is : INLINE
Ignore CREATE, ALTER, DROP and TRUNCATE operations.
Default value is : true
A list of regular expressions that match the collection namespaces (for example,
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the excludedColumns connector configuration property.
The connector does not capture changes in any database whose name is not in includedDatabases. By default, the connector captures changes in all databases. Do not also set the excludedDatabases connector configuration property.
The connector does not capture changes in any table not included in includedTables. Each identifier is of the form databaseName.tableName. By default, the connector captures changes in every non-system table in each database whose changes are being captured. Do not also specify the excludedTables connector configuration property.
Possible settings are:
- ADD_FIELD: Add key(s) merged with the columns.
- DROP: Drop keys.
Default value is : ADD_FIELD
Default value is : false
Possible settings are:
- ADD_FIELD: Add metadata in a column named metadata.
- DROP: Drop metadata.
Default value is : ADD_FIELD
Default value is : metadata
Possible values are:
- ON_EACH_BATCH: after each batch of records consumed by this trigger, the offsets will be stored in the KV Store. This avoids any duplicate records being consumed but can be costly if many events are produced.
- ON_STOP: when this trigger is stopped or killed, the offsets will be stored in the KV Store. This avoids any unnecessary writes to the KV Store, but if the trigger is not stopped gracefully, the KV Store value may not be updated, leading to duplicate records being consumed.
Default value is : ON_EACH_BATCH
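A trade-off sketch: if duplicate records after an ungraceful restart are acceptable and KV Store write volume matters more, the commit behavior can be switched to ON_STOP. The property name offsetsCommitMode is an assumption here, not confirmed by this schema excerpt:

```yaml
triggers:
  - id: realtime
    type: io.kestra.plugin.debezium.mongodb.RealtimeTrigger
    connectionString: mongodb://user:pass@mongodb.example.com:27017/?replicaSet=rs0
    # Hypothetical property name for the offsets-commit behavior described above.
    offsetsCommitMode: ON_STOP
```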
Any additional configuration properties that are valid for the current driver.
Possible settings are:
- INITIAL: The connector runs a snapshot only when no offsets have been recorded for the logical server name.
- INITIAL_ONLY: The connector runs a snapshot only when no offsets have been recorded for the logical server name and then stops; i.e. it will not read change events from the binlog.
- NO_DATA: The connector captures the structure of all relevant tables, performing all the steps described in the default snapshot workflow, except that it does not create READ events to represent the data set at the point of the connector's start-up.
- WHEN_NEEDED: The connector runs a snapshot upon startup whenever it deems it necessary; that is, when no offsets are available, or when a previously recorded offset specifies a binlog location or GTID that is not available in the server.
Default value is : INITIAL
Possible settings are:
- TABLE: Split all rows by table, with outputs named database.table.
- DATABASE: Split all rows by database, with outputs named database.
- OFF: Do NOT split rows, resulting in a single data output.
Default value is : TABLE
Default value is : debezium-state
1 nested property
If you would like to consume each message from change data capture in real-time and create one execution per message, you can use the io.kestra.plugin.debezium.mongodb.RealtimeTrigger instead.
Examples
Sharded connection
id: debezium-mongodb
namespace: company.team
tasks:
- id: send_data
type: io.kestra.plugin.core.log.Log
message: "{{ trigger.data }}"
triggers:
- id: trigger
type: io.kestra.plugin.debezium.mongodb.Trigger
snapshotMode: INITIAL
connectionString: mongodb://mongo_user:[email protected]:27017,mongos1.example.com:27017/
Replica set connection
id: debezium-mongodb
namespace: company.team
tasks:
- id: send_data
type: io.kestra.plugin.core.log.Log
message: "{{ trigger.data }}"
triggers:
- id: trigger
type: io.kestra.plugin.debezium.mongodb.Trigger
snapshotMode: INITIAL
connectionString: mongodb://mongo_user:[email protected]:27017/?replicaSet=rs0
Possible settings are:
- ADD_FIELD: Add a deleted field as a boolean.
- NULL: Send a row with all values set to null.
- DROP: Don't send the deleted row.
Default value is : ADD_FIELD
Default value is : deleted
Default value is : false
A list of regular expressions that match the collection namespaces (for example,
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the includedColumns connector configuration property.
The connector captures changes in any database whose name is not in the excludedDatabases. Do not also set the includedDatabases connector configuration property.
The connector captures changes in any table not included in excludedTables. Each identifier is of the form databaseName.tableName. Do not also specify the includedTables connector configuration property.
Possible settings are:
- RAW: Send raw data from Debezium.
- INLINE: Send a row like in the source with only the data (after & before removed); all the columns will be present for each row.
- WRAP: Send a row like INLINE, but wrapped in a record field.
Default value is : INLINE
Ignore CREATE, ALTER, DROP and TRUNCATE operations.
Default value is : true
A list of regular expressions that match the collection namespaces (for example,
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the excludedColumns connector configuration property.
The connector does not capture changes in any database whose name is not in includedDatabases. By default, the connector captures changes in all databases. Do not also set the excludedDatabases connector configuration property.
The connector does not capture changes in any table not included in includedTables. Each identifier is of the form databaseName.tableName. By default, the connector captures changes in every non-system table in each database whose changes are being captured. Do not also specify the excludedTables connector configuration property.
The interval between two successive polls of the schedule; this can avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the minimal interval should be at least PT30S. See ISO 8601 Durations for more information on available interval values.
Default value is : 60.000000000
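For a polling trigger, the interval is expressed as an ISO 8601 duration. A minimal sketch raising the poll interval from the 60-second default to five minutes, with a placeholder connection string:

```yaml
triggers:
  - id: trigger
    type: io.kestra.plugin.debezium.mongodb.Trigger
    interval: PT5M  # ISO 8601 duration: poll every 5 minutes instead of every 60 s
    snapshotMode: INITIAL
    connectionString: mongodb://user:pass@mongodb.example.com:27017/?replicaSet=rs0
```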
Possible settings are:
- ADD_FIELD: Add key(s) merged with the columns.
- DROP: Drop keys.
Default value is : ADD_FIELD
Default value is : false
Default value is : 10.000000000
Possible settings are:
- ADD_FIELD: Add metadata in a column named metadata.
- DROP: Drop metadata.
Default value is : ADD_FIELD
Default value is : metadata
Any additional configuration properties that are valid for the current driver.
Possible settings are:
- INITIAL: The connector runs a snapshot only when no offsets have been recorded for the logical server name.
- INITIAL_ONLY: The connector runs a snapshot only when no offsets have been recorded for the logical server name and then stops; i.e. it will not read change events from the binlog.
- WHEN_NEEDED: The connector runs a snapshot upon startup whenever it deems it necessary; that is, when no offsets are available, or when a previously recorded offset specifies a binlog location or GTID that is not available in the server.
Default value is : INITIAL
Possible settings are:
- TABLE: Split all rows by table, with outputs named database.table.
- DATABASE: Split all rows by database, with outputs named database.
- OFF: Do NOT split rows, resulting in a single data output.
Default value is : TABLE
Default value is : debezium-state
1 nested property
Examples
snapshotMode: NEVER
hostname: 127.0.0.1
port: "3306"
username: mysql_user
password: mysql_passwd
maxRecords: 100
This must be unique across all currently running database processes in the MySQL cluster. This connector joins the MySQL database cluster as another server (with this unique ID) so it can read the binlog. By default, a random number between 5400 and 6400 is generated, though the recommendation is to set a value explicitly.
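Following that recommendation, an explicit serverId can be set alongside the connection details. A sketch assuming a MySQL Capture task type analogous to the RealtimeTrigger shown later, with placeholder credentials:

```yaml
tasks:
  - id: capture
    # Assumed task type; sketch only.
    type: io.kestra.plugin.debezium.mysql.Capture
    hostname: 127.0.0.1
    port: "3306"
    username: mysql_user
    password: mysql_passwd
    serverId: 123456789  # explicit, unique within the MySQL cluster
    maxRecords: 100
```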
Default value is : false
Possible settings are:
- ADD_FIELD: Add a deleted field as a boolean.
- NULL: Send a row with all values set to null.
- DROP: Don't send the deleted row.
Default value is : ADD_FIELD
Default value is : deleted
Default value is : false
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the includedColumns connector configuration property.
The connector captures changes in any database whose name is not in the excludedDatabases. Do not also set the includedDatabases connector configuration property.
The connector captures changes in any table not included in excludedTables. Each identifier is of the form databaseName.tableName. Do not also specify the includedTables connector configuration property.
Possible settings are:
- RAW: Send raw data from Debezium.
- INLINE: Send a row like in the source with only the data (after & before removed); all the columns will be present for each row.
- WRAP: Send a row like INLINE, but wrapped in a record field.
Default value is : INLINE
Ignore CREATE, ALTER, DROP and TRUNCATE operations.
Default value is : true
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the excludedColumns connector configuration property.
The connector does not capture changes in any database whose name is not in includedDatabases. By default, the connector captures changes in all databases. Do not also set the excludedDatabases connector configuration property.
The connector does not capture changes in any table not included in includedTables. Each identifier is of the form databaseName.tableName. By default, the connector captures changes in every non-system table in each database whose changes are being captured. Do not also specify the excludedTables connector configuration property.
Possible settings are:
- ADD_FIELD: Add key(s) merged with the columns.
- DROP: Drop keys.
Default value is : ADD_FIELD
Default value is : false
It's not a hard limit and is evaluated every second. It is taken into account after the snapshot, if any.
It's not a hard limit and is evaluated every second.
It's not a hard limit and is evaluated every second. The properties 'maxRecords', 'maxDuration' and 'maxWait' are evaluated only after the snapshot is done.
Default value is : 3600.000000000
It's not a hard limit and is evaluated every second. It is taken into account after the snapshot, if any.
Default value is : 10.000000000
Possible settings are:
- ADD_FIELD: Add metadata in a column named metadata.
- DROP: Drop metadata.
Default value is : ADD_FIELD
Default value is : metadata
Any additional configuration properties that are valid for the current driver.
Possible settings are:
- INITIAL: The connector runs a snapshot only when no offsets have been recorded for the logical server name.
- INITIAL_ONLY: The connector runs a snapshot only when no offsets have been recorded for the logical server name and then stops; i.e. it will not read change events from the binlog.
- WHEN_NEEDED: The connector runs a snapshot upon startup whenever it deems it necessary; that is, when no offsets are available, or when a previously recorded offset specifies a binlog location or GTID that is not available in the server.
- NEVER: The connector never uses snapshots. Upon first startup with a logical server name, the connector reads from the beginning of the binlog. Configure this behavior with care; it is valid only when the binlog is guaranteed to contain the entire history of the database.
- SCHEMA_ONLY: The connector runs a snapshot of the schemas and not the data. This setting is useful when you do not need the topics to contain a consistent snapshot of the data but need them to have only the changes since the connector was started.
- SCHEMA_ONLY_RECOVERY: This is a recovery setting for a connector that has already been capturing changes. When you restart the connector, this setting enables recovery of a corrupted or lost database history topic. You might set it periodically to "clean up" a database history topic that has been growing unexpectedly. Database history topics require infinite retention.
Default value is : INITIAL
Possible settings are:
- TABLE: Split all rows by table, with outputs named database.table.
- DATABASE: Split all rows by database, with outputs named database.
- OFF: Do NOT split rows, resulting in a single data output.
Default value is : TABLE
Default value is : debezium-state
1 nested property
If you would like to consume multiple messages processed within a given time frame and process them in batch, you can use the io.kestra.plugin.debezium.mysql.Trigger instead.
Examples
Consume a message from a MySQL database via change data capture in real-time.
id: debezium-mysql
namespace: company.team
tasks:
- id: send_data
type: io.kestra.plugin.core.log.Log
message: "{{ trigger.data }}"
triggers:
- id: realtime
type: io.kestra.plugin.debezium.mysql.RealtimeTrigger
serverId: 123456789
hostname: 127.0.0.1
port: 63306
username: mysql_user
password: mysql_passwd
Possible settings are:
- ADD_FIELD: Add a deleted field as a boolean.
- NULL: Send a row with all values set to null.
- DROP: Don't send the deleted row.
Default value is : ADD_FIELD
Default value is : deleted
Default value is : false
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the includedColumns connector configuration property.
The connector captures changes in any database whose name is not in the excludedDatabases. Do not also set the includedDatabases connector configuration property.
The connector captures changes in any table not included in excludedTables. Each identifier is of the form databaseName.tableName. Do not also specify the includedTables connector configuration property.
Possible settings are:
- RAW: Send raw data from Debezium.
- INLINE: Send a row like in the source with only the data (after & before removed); all the columns will be present for each row.
- WRAP: Send a row like INLINE, but wrapped in a record field.
Default value is : INLINE
Ignore CREATE, ALTER, DROP and TRUNCATE operations.
Default value is : true
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the excludedColumns connector configuration property.
The connector does not capture changes in any database whose name is not in includedDatabases. By default, the connector captures changes in all databases. Do not also set the excludedDatabases connector configuration property.
The connector does not capture changes in any table not included in includedTables. Each identifier is of the form databaseName.tableName. By default, the connector captures changes in every non-system table in each database whose changes are being captured. Do not also specify the excludedTables connector configuration property.
Possible settings are:
- ADD_FIELD: Add key(s) merged with the columns.
- DROP: Drop keys.
Default value is : ADD_FIELD
Default value is : false
Possible settings are:
- ADD_FIELD: Add metadata in a column named metadata.
- DROP: Drop metadata.
Default value is : ADD_FIELD
Default value is : metadata
Possible values are:
- ON_EACH_BATCH: after each batch of records consumed by this trigger, the offsets will be stored in the KV Store. This avoids any duplicate records being consumed but can be costly if many events are produced.
- ON_STOP: when this trigger is stopped or killed, the offsets will be stored in the KV Store. This avoids any unnecessary writes to the KV Store, but if the trigger is not stopped gracefully, the KV Store value may not be updated, leading to duplicate records being consumed.
Default value is : ON_EACH_BATCH
Any additional configuration properties that are valid for the current driver.
This must be unique across all currently-running database processes in the MySQL cluster. This connector joins the MySQL database cluster as another server (with this unique ID) so it can read the binlog. By default, a random number between 5400 and 6400 is generated, though the recommendation is to explicitly set a value.
Possible settings are:
- INITIAL: The connector runs a snapshot only when no offsets have been recorded for the logical server name.
- INITIAL_ONLY: The connector runs a snapshot only when no offsets have been recorded for the logical server name and then stops; i.e. it will not read change events from the binlog.
- WHEN_NEEDED: The connector runs a snapshot upon startup whenever it deems it necessary; that is, when no offsets are available, or when a previously recorded offset specifies a binlog location or GTID that is not available in the server.
- NEVER: The connector never uses snapshots. Upon first startup with a logical server name, the connector reads from the beginning of the binlog. Configure this behavior with care; it is valid only when the binlog is guaranteed to contain the entire history of the database.
- SCHEMA_ONLY: The connector runs a snapshot of the schemas and not the data. This setting is useful when you do not need the topics to contain a consistent snapshot of the data but need them to have only the changes since the connector was started.
- SCHEMA_ONLY_RECOVERY: This is a recovery setting for a connector that has already been capturing changes. When you restart the connector, this setting enables recovery of a corrupted or lost database history topic. You might set it periodically to "clean up" a database history topic that has been growing unexpectedly. Database history topics require infinite retention.
Default value is : INITIAL
Possible settings are:
- TABLE: Split all rows by table, with outputs named database.table.
- DATABASE: Split all rows by database, with outputs named database.
- OFF: Do NOT split rows, resulting in a single data output.
Default value is : TABLE
Default value is : debezium-state
1 nested property
If you would like to consume each message from change data capture in real-time and create one execution per message, you can use the io.kestra.plugin.debezium.mysql.RealtimeTrigger instead.
Examples
snapshotMode: NEVER
hostname: 127.0.0.1
port: "3306"
username: mysql_user
password: mysql_passwd
maxRecords: 100
Possible settings are:
- ADD_FIELD: Add a deleted field as a boolean.
- NULL: Send a row with all values set to null.
- DROP: Don't send the deleted row.
Default value is : ADD_FIELD
Default value is : deleted
Default value is : false
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the includedColumns connector configuration property.
The connector captures changes in any database whose name is not in the excludedDatabases. Do not also set the includedDatabases connector configuration property.
The connector captures changes in any table not included in excludedTables. Each identifier is of the form databaseName.tableName. Do not also specify the includedTables connector configuration property.
Possible settings are:
- RAW: Send raw data from Debezium.
- INLINE: Send a row like in the source with only the data (after & before removed); all the columns will be present for each row.
- WRAP: Send a row like INLINE, but wrapped in a record field.
Default value is : INLINE
Ignore CREATE, ALTER, DROP and TRUNCATE operations.
Default value is : true
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the excludedColumns connector configuration property.
The connector does not capture changes in any database whose name is not in includedDatabases. By default, the connector captures changes in all databases. Do not also set the excludedDatabases connector configuration property.
The connector does not capture changes in any table not included in includedTables. Each identifier is of the form databaseName.tableName. By default, the connector captures changes in every non-system table in each database whose changes are being captured. Do not also specify the excludedTables connector configuration property.
The interval between two successive polls of the schedule; this can avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the minimal interval should be at least PT30S. See ISO 8601 Durations for more information on available interval values.
Default value is : 60.000000000
Possible settings are:
- ADD_FIELD: Add key(s) merged with the columns.
- DROP: Drop keys.
Default value is : ADD_FIELD
Default value is : false
It's not a hard limit and is evaluated every second. It is taken into account after the snapshot, if any.
It's not a hard limit and is evaluated every second.
It's not a hard limit and is evaluated every second. The properties 'maxRecords', 'maxDuration' and 'maxWait' are evaluated only after the snapshot is done.
Default value is : 3600.000000000
It's not a hard limit and is evaluated every second. It is taken into account after the snapshot, if any.
Default value is : 10.000000000
Possible settings are:
- ADD_FIELD: Add metadata in a column named metadata.
- DROP: Drop metadata.
Default value is : ADD_FIELD
Default value is : metadata
Any additional configuration properties that are valid for the current driver.
This must be unique across all currently-running database processes in the MySQL cluster. This connector joins the MySQL database cluster as another server (with this unique ID) so it can read the binlog. By default, a random number between 5400 and 6400 is generated, though the recommendation is to explicitly set a value.
Possible settings are:
- INITIAL: The connector runs a snapshot only when no offsets have been recorded for the logical server name.
- INITIAL_ONLY: The connector runs a snapshot only when no offsets have been recorded for the logical server name and then stops; i.e. it will not read change events from the binlog.
- WHEN_NEEDED: The connector runs a snapshot upon startup whenever it deems it necessary; that is, when no offsets are available, or when a previously recorded offset specifies a binlog location or GTID that is not available in the server.
- NEVER: The connector never uses snapshots. Upon first startup with a logical server name, the connector reads from the beginning of the binlog. Configure this behavior with care; it is valid only when the binlog is guaranteed to contain the entire history of the database.
- SCHEMA_ONLY: The connector runs a snapshot of the schemas and not the data. This setting is useful when you do not need the topics to contain a consistent snapshot of the data but need them to have only the changes since the connector was started.
- SCHEMA_ONLY_RECOVERY: This is a recovery setting for a connector that has already been capturing changes. When you restart the connector, this setting enables recovery of a corrupted or lost database history topic. You might set it periodically to "clean up" a database history topic that has been growing unexpectedly. Database history topics require infinite retention.
Default value is : INITIAL
Possible settings are:
- TABLE: Split all rows by table, with outputs named database.table.
- DATABASE: Split all rows by database, with outputs named database.
- OFF: Do NOT split rows, resulting in a single data output.
Default value is : TABLE
Default value is : debezium-state
1 nested property
Examples
Non-container database (non-CDB)
snapshotMode: INITIAL
hostname: 127.0.0.1
port: "1521"
username: c##dbzuser
password: dbz
sid: ORCLCDB
maxRecords: 100
Container database (CDB)
snapshotMode: INITIAL
hostname: 127.0.0.1
port: "1521"
username: c##dbzuser
password: dbz
sid: ORCLCDB
pluggableDatabase: ORCLPDB1
maxRecords: 100
Default value is : false
Possible settings are:
- ADD_FIELD: Add a deleted field as a boolean.
- NULL: Send a row with all values set to null.
- DROP: Don't send the deleted row.
Default value is : ADD_FIELD
Default value is : deleted
Default value is : false
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the includedColumns connector configuration property.
The connector captures changes in any database whose name is not in the excludedDatabases. Do not also set the includedDatabases connector configuration property.
The connector captures changes in any table not included in excludedTables. Each identifier is of the form databaseName.tableName. Do not also specify the includedTables connector configuration property.
Possible settings are:
- RAW: Send raw data from Debezium.
- INLINE: Send a row like in the source with only the data (after & before removed); all the columns will be present for each row.
- WRAP: Send a row like INLINE, but wrapped in a record field.
Default value is : INLINE
Ignore CREATE, ALTER, DROP and TRUNCATE operations.
Default value is : true
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the excludedColumns connector configuration property.
The connector does not capture changes in any database whose name is not in includedDatabases. By default, the connector captures changes in all databases. Do not also set the excludedDatabases connector configuration property.
The connector does not capture changes in any table not included in includedTables. Each identifier is of the form databaseName.tableName. By default, the connector captures changes in every non-system table in each database whose changes are being captured. Do not also specify the excludedTables connector configuration property.
Possible settings are:
- ADD_FIELD: Add key(s) merged with the columns.
- DROP: Drop keys.
Default value is : ADD_FIELD
Default value is : false
It's not a hard limit and is evaluated every second. It is taken into account after the snapshot, if any.
It's not a hard limit and is evaluated every second.
It's not a hard limit and is evaluated every second. The properties 'maxRecords', 'maxDuration' and 'maxWait' are evaluated only after the snapshot is done.
Default value is : 3600.000000000
It's not a hard limit and is evaluated every second. It is taken into account after the snapshot, if any.
Default value is : 10.000000000
Possible settings are:
- ADD_FIELD: Add metadata in a column named metadata.
- DROP: Drop metadata.
Default value is : ADD_FIELD
Default value is : metadata
For a non-container database (non-CDB) installation, do not specify the pluggableDatabase property.
Any additional configuration properties that are valid for the current driver.
Possible settings are:
- ALWAYS: The connector runs a snapshot on each connector start.
- INITIAL: The connector runs a snapshot only when no offsets have been recorded for the logical server name.
- INITIAL_ONLY: The connector runs a snapshot only when no offsets have been recorded for the logical server name and then stops; i.e. it will not read change events from the binlog.
- WHEN_NEEDED: The connector runs a snapshot upon startup whenever it deems it necessary; that is, when no offsets are available, or when a previously recorded offset specifies a binlog location or GTID that is not available in the server.
- NO_DATA: The connector runs a snapshot of the schemas and not the data. This setting is useful when you do not need the topics to contain a consistent snapshot of the data but need them to have only the changes since the connector was started.
- RECOVERY: This is a recovery setting for a connector that has already been capturing changes. When you restart the connector, this setting enables recovery of a corrupted or lost database history topic. You might set it periodically to "clean up" a database history topic that has been growing unexpectedly. Database history topics require infinite retention.
Default value is : INITIAL
Possible settings are:
- TABLE: Split all rows by table, with outputs named database.table.
- DATABASE: Split all rows by database, with outputs named database.
- OFF: Do NOT split rows, resulting in a single data output.
Default value is : TABLE
Default value is : debezium-state
1 nested property
If you would like to consume multiple messages processed within a given time frame and process them in batch, you can use the io.kestra.plugin.debezium.oracle.Trigger instead.
Examples
Consume a message from an Oracle database via change data capture in real-time.
id: debezium-oracle
namespace: company.team
tasks:
- id: send_data
type: io.kestra.plugin.core.log.Log
message: "{{ trigger.data }}"
triggers:
- id: realtime
type: io.kestra.plugin.debezium.oracle.RealtimeTrigger
hostname: 127.0.0.1
port: 1521
username: c##dbzuser
password: dbz
sid: ORCLCDB
Possible settings are:
- ADD_FIELD: Add a deleted field as a boolean.
- NULL: Send a row with all values set to null.
- DROP: Don't send deleted rows.
Default value is : ADD_FIELD
Default value is : deleted
Default value is : false
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the includedColumns connector configuration property.
The connector captures changes in any database whose name is not in the excludedDatabases. Do not also set the includedDatabases connector configuration property.
The connector captures changes in any table not included in excludedTables. Each identifier is of the form databaseName.tableName. Do not also specify the includedTables connector configuration property.
Possible settings are:
- RAW: Send raw data from Debezium.
- INLINE: Send a row like in the source with only the data (after & before removed); all columns are present for each row.
- WRAP: Send a row like INLINE, but wrapped in a record field.
Default value is : INLINE
Ignore CREATE, ALTER, DROP and TRUNCATE operations.
Default value is : true
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the excludedColumns connector configuration property.
The connector does not capture changes in any database whose name is not in includedDatabases. By default, the connector captures changes in all databases. Do not also set the excludedDatabases connector configuration property.
The connector does not capture changes in any table not included in includedTables. Each identifier is of the form databaseName.tableName. By default, the connector captures changes in every non-system table in each database whose changes are being captured. Do not also specify the excludedTables connector configuration property.
Possible settings are:
- ADD_FIELD: Add key(s) merged with the columns.
- DROP: Drop keys.
Default value is : ADD_FIELD
Default value is : false
Possible settings are:
- ADD_FIELD: Add metadata in a column named metadata.
- DROP: Drop metadata.
Default value is : ADD_FIELD
Default value is : metadata
Possible values are:
- ON_EACH_BATCH: after each batch of records consumed by this trigger, the offsets will be stored in the KV Store. This avoids any duplicated records being consumed but can be costly if many events are produced.
- ON_STOP: when this trigger is stopped or killed, the offsets will be stored in the KV Store. This avoids any unnecessary writes to the KV Store, but if the trigger is not stopped gracefully, the KV Store value may not be updated, leading to duplicated record consumption.
Default value is : ON_EACH_BATCH
For non-container database (non-CDB) installation, do not specify the pluggableDatabase property.
Any additional configuration properties that are valid for the current driver.
Possible settings are:
- ALWAYS: The connector runs a snapshot on each connector start.
- INITIAL: The connector runs a snapshot only when no offsets have been recorded for the logical server name.
- INITIAL_ONLY: The connector runs a snapshot only when no offsets have been recorded for the logical server name, and then stops; i.e. it will not read change events from the binlog.
- WHEN_NEEDED: The connector runs a snapshot upon startup whenever it deems it necessary; that is, when no offsets are available, or when a previously recorded offset specifies a binlog location or GTID that is not available in the server.
- NO_DATA: The connector runs a snapshot of the schemas but not the data. This setting is useful when you do not need the topics to contain a consistent snapshot of the data, but need them to have only the changes since the connector was started.
- RECOVERY: A recovery setting for a connector that has already been capturing changes. When you restart the connector, this setting enables recovery of a corrupted or lost database history topic. You might set it periodically to "clean up" a database history topic that has been growing unexpectedly. Database history topics require infinite retention.
Default value is : INITIAL
Possible settings are:
- TABLE: This will split all rows by table, with outputs named database.table.
- DATABASE: This will split all rows by database, with outputs named database.
- OFF: This will NOT split rows, resulting in a single data output.
Default value is : TABLE
Default value is : debezium-state
1 nested properties
If you would like to consume each message from change data capture in real-time and create one execution per message, you can use the io.kestra.plugin.debezium.oracle.RealtimeTrigger instead.
Examples
snapshotMode: INITIAL_ONLY
hostname: 127.0.0.1
port: "1521"
username: c##dbzuser
password: dbz
sid: ORCLCDB
maxRecords: 100
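The trigger properties above go inside a flow's triggers block. A minimal flow sketch might look like the following; the flow id, namespace, log task, and the trigger.uris output name are illustrative assumptions, not taken from this document:

```yaml
id: debezium_oracle_batch
namespace: company.team

tasks:
  - id: log_data
    type: io.kestra.plugin.core.log.Log
    message: "{{ trigger.uris }}" # assumed output name for the batch Trigger

triggers:
  - id: cdc_batch
    type: io.kestra.plugin.debezium.oracle.Trigger
    snapshotMode: INITIAL_ONLY
    hostname: 127.0.0.1
    port: "1521"
    username: c##dbzuser
    password: dbz
    sid: ORCLCDB
    maxRecords: 100
```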
Possible settings are:
- ADD_FIELD: Add a deleted field as a boolean.
- NULL: Send a row with all values set to null.
- DROP: Don't send deleted rows.
Default value is : ADD_FIELD
Default value is : deleted
Default value is : false
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the includedColumns connector configuration property.
The connector captures changes in any database whose name is not in the excludedDatabases. Do not also set the includedDatabases connector configuration property.
The connector captures changes in any table not included in excludedTables. Each identifier is of the form databaseName.tableName. Do not also specify the includedTables connector configuration property.
Possible settings are:
- RAW: Send raw data from Debezium.
- INLINE: Send a row like in the source with only the data (after & before removed); all columns are present for each row.
- WRAP: Send a row like INLINE, but wrapped in a record field.
Default value is : INLINE
Ignore CREATE, ALTER, DROP and TRUNCATE operations.
Default value is : true
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the excludedColumns connector configuration property.
The connector does not capture changes in any database whose name is not in includedDatabases. By default, the connector captures changes in all databases. Do not also set the excludedDatabases connector configuration property.
The connector does not capture changes in any table not included in includedTables. Each identifier is of the form databaseName.tableName. By default, the connector captures changes in every non-system table in each database whose changes are being captured. Do not also specify the excludedTables connector configuration property.
The interval between two consecutive polls of the schedule; this can avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the interval should be at least PT30S. See ISO 8601 durations for more information on available interval values.
Default value is : 60.000000000
Possible settings are:
- ADD_FIELD: Add key(s) merged with the columns.
- DROP: Drop keys.
Default value is : ADD_FIELD
Default value is : false
It's not a hard limit and is evaluated every second. It is taken into account after the snapshot, if any.
It's not a hard limit and is evaluated every second.
It's not a hard limit and is evaluated every second. The properties 'maxRecords', 'maxDuration' and 'maxWait' are evaluated only after the snapshot is done.
Default value is : 3600.000000000
It's not a hard limit and is evaluated every second. It is taken into account after the snapshot, if any.
Default value is : 10.000000000
Possible settings are:
- ADD_FIELD: Add metadata in a column named metadata.
- DROP: Drop metadata.
Default value is : ADD_FIELD
Default value is : metadata
For non-container database (non-CDB) installation, do not specify the pluggableDatabase property.
Any additional configuration properties that are valid for the current driver.
Possible settings are:
- ALWAYS: The connector runs a snapshot on each connector start.
- INITIAL: The connector runs a snapshot only when no offsets have been recorded for the logical server name.
- INITIAL_ONLY: The connector runs a snapshot only when no offsets have been recorded for the logical server name, and then stops; i.e. it will not read change events from the binlog.
- WHEN_NEEDED: The connector runs a snapshot upon startup whenever it deems it necessary; that is, when no offsets are available, or when a previously recorded offset specifies a binlog location or GTID that is not available in the server.
- NO_DATA: The connector runs a snapshot of the schemas but not the data. This setting is useful when you do not need the topics to contain a consistent snapshot of the data, but need them to have only the changes since the connector was started.
- RECOVERY: A recovery setting for a connector that has already been capturing changes. When you restart the connector, this setting enables recovery of a corrupted or lost database history topic. You might set it periodically to "clean up" a database history topic that has been growing unexpectedly. Database history topics require infinite retention.
Default value is : INITIAL
Possible settings are:
- TABLE: This will split all rows by table, with outputs named database.table.
- DATABASE: This will split all rows by database, with outputs named database.
- OFF: This will NOT split rows, resulting in a single data output.
Default value is : TABLE
Default value is : debezium-state
1 nested properties
Examples
hostname: 127.0.0.1
port: "5432"
username: psql_user
password: psql_passwd
maxRecords: 100
database: my_database
pluginName: PGOUTPUT
snapshotMode: ALWAYS
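These Capture properties belong on a task rather than a trigger. A minimal flow sketch follows; the flow id, namespace, and the outputs.capture.uris reference are illustrative assumptions:

```yaml
id: debezium_postgres_capture
namespace: company.team

tasks:
  - id: capture
    type: io.kestra.plugin.debezium.postgres.Capture
    hostname: 127.0.0.1
    port: "5432"
    username: psql_user
    password: psql_passwd
    maxRecords: 100
    database: my_database
    pluginName: PGOUTPUT
    snapshotMode: ALWAYS

  - id: log_uris
    type: io.kestra.plugin.core.log.Log
    message: "{{ outputs.capture.uris }}" # assumed output name
```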
Default value is : false
Possible settings are:
- ADD_FIELD: Add a deleted field as a boolean.
- NULL: Send a row with all values set to null.
- DROP: Don't send deleted rows.
Default value is : ADD_FIELD
Default value is : deleted
Default value is : false
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the includedColumns connector configuration property.
The connector captures changes in any database whose name is not in the excludedDatabases. Do not also set the includedDatabases connector configuration property.
The connector captures changes in any table not included in excludedTables. Each identifier is of the form databaseName.tableName. Do not also specify the includedTables connector configuration property.
Possible settings are:
- RAW: Send raw data from Debezium.
- INLINE: Send a row like in the source with only the data (after & before removed); all columns are present for each row.
- WRAP: Send a row like INLINE, but wrapped in a record field.
Default value is : INLINE
Ignore CREATE, ALTER, DROP and TRUNCATE operations.
Default value is : true
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the excludedColumns connector configuration property.
The connector does not capture changes in any database whose name is not in includedDatabases. By default, the connector captures changes in all databases. Do not also set the excludedDatabases connector configuration property.
The connector does not capture changes in any table not included in includedTables. Each identifier is of the form databaseName.tableName. By default, the connector captures changes in every non-system table in each database whose changes are being captured. Do not also specify the excludedTables connector configuration property.
Possible settings are:
- ADD_FIELD: Add key(s) merged with the columns.
- DROP: Drop keys.
Default value is : ADD_FIELD
Default value is : false
It's not a hard limit and is evaluated every second. It is taken into account after the snapshot, if any.
It's not a hard limit and is evaluated every second.
It's not a hard limit and is evaluated every second. The properties 'maxRecords', 'maxDuration' and 'maxWait' are evaluated only after the snapshot is done.
Default value is : 3600.000000000
It's not a hard limit and is evaluated every second. It is taken into account after the snapshot, if any.
Default value is : 10.000000000
Possible settings are:
- ADD_FIELD: Add metadata in a column named metadata.
- DROP: Drop metadata.
Default value is : ADD_FIELD
Default value is : metadata
If you are using a wal2json plug-in and transactions are very large, the JSON batch event that contains all transaction changes might not fit into the hard-coded memory buffer, which has a size of 1 GB. In such cases, switch to a streaming plug-in, by setting the plugin-name property to wal2json_streaming or wal2json_rds_streaming. With a streaming plug-in, PostgreSQL sends the connector a separate message for each change in a transaction.
Default value is : PGOUTPUT
Any additional configuration properties that are valid for the current driver.
This publication is created at start-up if it does not already exist and it includes all tables. Debezium then applies its own include/exclude list filtering, if configured, to limit the publication to change events for the specific tables of interest. The connector user must have superuser permissions to create this publication, so it is usually preferable to create the publication before starting the connector for the first time.
If the publication already exists, either for all tables or configured with a subset of tables, Debezium uses the publication as it is defined.
Default value is : kestra_publication
The server uses this slot to stream events to the Debezium connector that you are configuring. Slot names must conform to PostgreSQL replication slot naming rules, which state: "Each replication slot has a name, which can contain lower-case letters, numbers, and the underscore character."
Default value is : kestra
Possible settings are:
- INITIAL: The connector performs a snapshot only when no offsets have been recorded for the logical server name.
- ALWAYS: The connector performs a snapshot each time the connector starts.
- NEVER: The connector never performs snapshots. When a connector is configured this way, it behaves as follows at startup: if there is a previously stored LSN, the connector continues streaming changes from that position; if no LSN has been stored, the connector starts streaming changes from the point in time when the PostgreSQL logical replication slot was created on the server. The NEVER snapshot mode is useful only when you know all data of interest is still reflected in the WAL.
- INITIAL_ONLY: The connector performs an initial snapshot and then stops, without processing any subsequent changes.
Default value is : INITIAL
Possible settings are:
- TABLE: This will split all rows by table, with outputs named database.table.
- DATABASE: This will split all rows by database, with outputs named database.
- OFF: This will NOT split rows, resulting in a single data output.
Default value is : TABLE
Must be a PEM encoded certificate.
Must be a PEM encoded key.
Default value is : DISABLE
Must be a PEM encoded certificate.
Default value is : debezium-state
1 nested properties
If you would like to consume multiple messages processed within a given time frame and process them in batch, you can use the io.kestra.plugin.debezium.postgres.Trigger instead.
Examples
Consume a message from a PostgreSQL database via change data capture in real-time.
id: debezium-postgres
namespace: company.team
tasks:
- id: send_data
type: io.kestra.plugin.core.log.Log
message: "{{ trigger.data }}"
triggers:
- id: realtime
type: io.kestra.plugin.debezium.postgres.RealtimeTrigger
database: postgres
hostname: 127.0.0.1
port: 65432
username: postgres
password: pg_passwd
Possible settings are:
- ADD_FIELD: Add a deleted field as a boolean.
- NULL: Send a row with all values set to null.
- DROP: Don't send deleted rows.
Default value is : ADD_FIELD
Default value is : deleted
Default value is : false
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the includedColumns connector configuration property.
The connector captures changes in any database whose name is not in the excludedDatabases. Do not also set the includedDatabases connector configuration property.
The connector captures changes in any table not included in excludedTables. Each identifier is of the form databaseName.tableName. Do not also specify the includedTables connector configuration property.
Possible settings are:
- RAW: Send raw data from Debezium.
- INLINE: Send a row like in the source with only the data (after & before removed); all columns are present for each row.
- WRAP: Send a row like INLINE, but wrapped in a record field.
Default value is : INLINE
Ignore CREATE, ALTER, DROP and TRUNCATE operations.
Default value is : true
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the excludedColumns connector configuration property.
The connector does not capture changes in any database whose name is not in includedDatabases. By default, the connector captures changes in all databases. Do not also set the excludedDatabases connector configuration property.
The connector does not capture changes in any table not included in includedTables. Each identifier is of the form databaseName.tableName. By default, the connector captures changes in every non-system table in each database whose changes are being captured. Do not also specify the excludedTables connector configuration property.
Possible settings are:
- ADD_FIELD: Add key(s) merged with the columns.
- DROP: Drop keys.
Default value is : ADD_FIELD
Default value is : false
Possible settings are:
- ADD_FIELD: Add metadata in a column named metadata.
- DROP: Drop metadata.
Default value is : ADD_FIELD
Default value is : metadata
Possible values are:
- ON_EACH_BATCH: after each batch of records consumed by this trigger, the offsets will be stored in the KV Store. This avoids any duplicated records being consumed but can be costly if many events are produced.
- ON_STOP: when this trigger is stopped or killed, the offsets will be stored in the KV Store. This avoids any unnecessary writes to the KV Store, but if the trigger is not stopped gracefully, the KV Store value may not be updated, leading to duplicated record consumption.
Default value is : ON_EACH_BATCH
If you are using a wal2json plug-in and transactions are very large, the JSON batch event that contains all transaction changes might not fit into the hard-coded memory buffer, which has a size of 1 GB. In such cases, switch to a streaming plug-in, by setting the plugin-name property to wal2json_streaming or wal2json_rds_streaming. With a streaming plug-in, PostgreSQL sends the connector a separate message for each change in a transaction.
Default value is : PGOUTPUT
Any additional configuration properties that are valid for the current driver.
This publication is created at start-up if it does not already exist and it includes all tables. Debezium then applies its own include/exclude list filtering, if configured, to limit the publication to change events for the specific tables of interest. The connector user must have superuser permissions to create this publication, so it is usually preferable to create the publication before starting the connector for the first time.
If the publication already exists, either for all tables or configured with a subset of tables, Debezium uses the publication as it is defined.
Default value is : kestra_publication
The server uses this slot to stream events to the Debezium connector that you are configuring. Slot names must conform to PostgreSQL replication slot naming rules, which state: "Each replication slot has a name, which can contain lower-case letters, numbers, and the underscore character."
Default value is : kestra
Possible settings are:
- INITIAL: The connector performs a snapshot only when no offsets have been recorded for the logical server name.
- ALWAYS: The connector performs a snapshot each time the connector starts.
- NEVER: The connector never performs snapshots. When a connector is configured this way, it behaves as follows at startup: if there is a previously stored LSN, the connector continues streaming changes from that position; if no LSN has been stored, the connector starts streaming changes from the point in time when the PostgreSQL logical replication slot was created on the server. The NEVER snapshot mode is useful only when you know all data of interest is still reflected in the WAL.
- INITIAL_ONLY: The connector performs an initial snapshot and then stops, without processing any subsequent changes.
Default value is : INITIAL
Possible settings are:
- TABLE: This will split all rows by table, with outputs named database.table.
- DATABASE: This will split all rows by database, with outputs named database.
- OFF: This will NOT split rows, resulting in a single data output.
Default value is : TABLE
Must be a PEM encoded certificate.
Must be a PEM encoded key.
Default value is : DISABLE
Must be a PEM encoded certificate.
Default value is : debezium-state
1 nested properties
If you would like to consume each message from change data capture in real-time and create one execution per message, you can use the io.kestra.plugin.debezium.postgres.RealtimeTrigger instead.
Examples
hostname: 127.0.0.1
port: "5432"
username: postgres
password: psql_passwd
maxRecords: 100
database: my_database
pluginName: PGOUTPUT
snapshotMode: ALWAYS
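The batch Trigger properties above sit inside a flow's triggers block. A minimal flow sketch might look like the following; the flow id, namespace, log task, and the trigger.uris output name are illustrative assumptions:

```yaml
id: debezium_postgres_batch
namespace: company.team

tasks:
  - id: log_data
    type: io.kestra.plugin.core.log.Log
    message: "{{ trigger.uris }}" # assumed output name for the batch Trigger

triggers:
  - id: cdc_batch
    type: io.kestra.plugin.debezium.postgres.Trigger
    hostname: 127.0.0.1
    port: "5432"
    username: postgres
    password: psql_passwd
    maxRecords: 100
    database: my_database
    pluginName: PGOUTPUT
    snapshotMode: ALWAYS
```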
Possible settings are:
- ADD_FIELD: Add a deleted field as a boolean.
- NULL: Send a row with all values set to null.
- DROP: Don't send deleted rows.
Default value is : ADD_FIELD
Default value is : deleted
Default value is : false
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the includedColumns connector configuration property.
The connector captures changes in any database whose name is not in the excludedDatabases. Do not also set the includedDatabases connector configuration property.
The connector captures changes in any table not included in excludedTables. Each identifier is of the form databaseName.tableName. Do not also specify the includedTables connector configuration property.
Possible settings are:
- RAW: Send raw data from Debezium.
- INLINE: Send a row like in the source with only the data (after & before removed); all columns are present for each row.
- WRAP: Send a row like INLINE, but wrapped in a record field.
Default value is : INLINE
Ignore CREATE, ALTER, DROP and TRUNCATE operations.
Default value is : true
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the excludedColumns connector configuration property.
The connector does not capture changes in any database whose name is not in includedDatabases. By default, the connector captures changes in all databases. Do not also set the excludedDatabases connector configuration property.
The connector does not capture changes in any table not included in includedTables. Each identifier is of the form databaseName.tableName. By default, the connector captures changes in every non-system table in each database whose changes are being captured. Do not also specify the excludedTables connector configuration property.
The interval between two consecutive polls of the schedule; this can avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the interval should be at least PT30S. See ISO 8601 durations for more information on available interval values.
Default value is : 60.000000000
Possible settings are:
- ADD_FIELD: Add key(s) merged with the columns.
- DROP: Drop keys.
Default value is : ADD_FIELD
Default value is : false
It's not a hard limit and is evaluated every second. It is taken into account after the snapshot, if any.
It's not a hard limit and is evaluated every second.
It's not a hard limit and is evaluated every second. The properties 'maxRecords', 'maxDuration' and 'maxWait' are evaluated only after the snapshot is done.
Default value is : 3600.000000000
It's not a hard limit and is evaluated every second. It is taken into account after the snapshot, if any.
Default value is : 10.000000000
Possible settings are:
- ADD_FIELD: Add metadata in a column named metadata.
- DROP: Drop metadata.
Default value is : ADD_FIELD
Default value is : metadata
If you are using a wal2json plug-in and transactions are very large, the JSON batch event that contains all transaction changes might not fit into the hard-coded memory buffer, which has a size of 1 GB. In such cases, switch to a streaming plug-in, by setting the plugin-name property to wal2json_streaming or wal2json_rds_streaming. With a streaming plug-in, PostgreSQL sends the connector a separate message for each change in a transaction.
Default value is : PGOUTPUT
Any additional configuration properties that are valid for the current driver.
This publication is created at start-up if it does not already exist and it includes all tables. Debezium then applies its own include/exclude list filtering, if configured, to limit the publication to change events for the specific tables of interest. The connector user must have superuser permissions to create this publication, so it is usually preferable to create the publication before starting the connector for the first time.
If the publication already exists, either for all tables or configured with a subset of tables, Debezium uses the publication as it is defined.
Default value is : kestra_publication
The server uses this slot to stream events to the Debezium connector that you are configuring. Slot names must conform to PostgreSQL replication slot naming rules, which state: "Each replication slot has a name, which can contain lower-case letters, numbers, and the underscore character."
Default value is : kestra
Possible settings are:
- INITIAL: The connector performs a snapshot only when no offsets have been recorded for the logical server name.
- ALWAYS: The connector performs a snapshot each time the connector starts.
- NEVER: The connector never performs snapshots. When a connector is configured this way, it behaves as follows at startup: if there is a previously stored LSN, the connector continues streaming changes from that position; if no LSN has been stored, the connector starts streaming changes from the point in time when the PostgreSQL logical replication slot was created on the server. The NEVER snapshot mode is useful only when you know all data of interest is still reflected in the WAL.
- INITIAL_ONLY: The connector performs an initial snapshot and then stops, without processing any subsequent changes.
Default value is : INITIAL
Possible settings are:
- TABLE: This will split all rows by table, with outputs named database.table.
- DATABASE: This will split all rows by database, with outputs named database.
- OFF: This will NOT split rows, resulting in a single data output.
Default value is : TABLE
Must be a PEM encoded certificate.
Must be a PEM encoded key.
Default value is : DISABLE
Must be a PEM encoded certificate.
Default value is : debezium-state
1 nested properties
Examples
snapshotMode: INITIAL
hostname: 127.0.0.1
port: "1433"
username: sqlserver_user
password: sqlserver_passwd
maxRecords: 100
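These Capture properties belong on a task inside a flow. A minimal flow sketch follows; the flow id, namespace, and database value are illustrative assumptions (the snippet above omits the database property):

```yaml
id: debezium_sqlserver_capture
namespace: company.team

tasks:
  - id: capture
    type: io.kestra.plugin.debezium.sqlserver.Capture
    snapshotMode: INITIAL
    hostname: 127.0.0.1
    port: "1433"
    username: sqlserver_user
    password: sqlserver_passwd
    database: my_database # assumed; not shown in the snippet above
    maxRecords: 100
```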
Default value is : false
Possible settings are:
- ADD_FIELD: Add a deleted field as a boolean.
- NULL: Send a row with all values set to null.
- DROP: Don't send deleted rows.
Default value is : ADD_FIELD
Default value is : deleted
Default value is : false
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the includedColumns connector configuration property.
The connector captures changes in any database whose name is not in the excludedDatabases. Do not also set the includedDatabases connector configuration property.
The connector captures changes in any table not included in excludedTables. Each identifier is of the form databaseName.tableName. Do not also specify the includedTables connector configuration property.
Possible settings are:
- RAW: Send raw data from Debezium.
- INLINE: Send a row like in the source with only the data (after & before removed); all columns are present for each row.
- WRAP: Send a row like INLINE, but wrapped in a record field.
Default value is : INLINE
Ignore CREATE, ALTER, DROP and TRUNCATE operations.
Default value is : true
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the excludedColumns connector configuration property.
The connector does not capture changes in any database whose name is not in includedDatabases. By default, the connector captures changes in all databases. Do not also set the excludedDatabases connector configuration property.
The connector does not capture changes in any table not included in includedTables. Each identifier is of the form databaseName.tableName. By default, the connector captures changes in every non-system table in each database whose changes are being captured. Do not also specify the excludedTables connector configuration property.
Possible settings are:
- ADD_FIELD: Add key(s) merged with the columns.
- DROP: Drop keys.
Default value is : ADD_FIELD
Default value is : false
It's not a hard limit and is evaluated every second. It is taken into account after the snapshot, if any.
It's not a hard limit and is evaluated every second.
It's not a hard limit and is evaluated every second. The properties 'maxRecords', 'maxDuration' and 'maxWait' are evaluated only after the snapshot is done.
Default value is : 3600.000000000
It's not a hard limit and is evaluated every second. It is taken into account after the snapshot, if any.
Default value is : 10.000000000
Possible settings are:
- ADD_FIELD: Add metadata in a column named metadata.
- DROP: Drop metadata.
Default value is : ADD_FIELD
Default value is : metadata
Any additional configuration properties that are valid for the current driver.
Possible settings are:
- INITIAL: Takes a snapshot of the structure and data of captured tables; useful if topics should be populated with a complete representation of the data from the captured tables.
- INITIAL_ONLY: Takes a snapshot of structure and data like INITIAL, but does not transition into streaming changes once the snapshot has completed.
- SCHEMA_ONLY: Takes a snapshot of the structure of captured tables only; useful if only changes happening from now onward should be propagated to topics.
Default value is : INITIAL
Possible settings are:
- TABLE: This will split all rows by table, with outputs named database.table.
- DATABASE: This will split all rows by database, with outputs named database.
- OFF: This will NOT split rows, resulting in a single data output.
Default value is : TABLE
Default value is : debezium-state
1 nested properties
If you would like to consume multiple messages processed within a given time frame and process them in batch, you can use the io.kestra.plugin.debezium.sqlserver.Trigger instead.
Examples
Consume a message from a SQL Server database via change data capture in real-time.
id: debezium-sqlserver
namespace: company.team
tasks:
- id: send_data
type: io.kestra.plugin.core.log.Log
message: "{{ trigger.data }}"
triggers:
- id: realtime
type: io.kestra.plugin.debezium.sqlserver.RealtimeTrigger
hostname: 127.0.0.1
port: 61433
username: sa
password: password
database: deb
Possible settings are:
- ADD_FIELD: Add a deleted field as a boolean.
- NULL: Send a row with all values set to null.
- DROP: Don't send deleted rows.
Default value is : ADD_FIELD
Default value is : deleted
Default value is : false
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the includedColumns connector configuration property.
The connector captures changes in any database whose name is not in the excludedDatabases. Do not also set the includedDatabases connector configuration property.
The connector captures changes in any table not included in excludedTables. Each identifier is of the form databaseName.tableName. Do not also specify the includedTables connector configuration property.
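A sketch of the exclusion filters described above on the realtime trigger; the property values follow the databaseName.tableName form documented here, while the list-of-strings YAML shape is an assumption:

```yaml
triggers:
  - id: realtime
    type: io.kestra.plugin.debezium.sqlserver.RealtimeTrigger
    hostname: 127.0.0.1
    port: 61433
    username: sa
    password: password
    database: deb
    # Capture every database except this one
    excludedDatabases:
      - audit
    # Skip one table; identifiers are databaseName.tableName
    excludedTables:
      - deb.internal_log
```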
Possible settings are:
- RAW: Send raw data from Debezium.
- INLINE: Send a row like in the source with only the data (after & before removed); all columns are present for each row.
- WRAP: Send a row like INLINE, but wrapped in a record field.
Default value is : INLINE
Ignore CREATE, ALTER, DROP and TRUNCATE operations.
Default value is : true
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the excludedColumns connector configuration property.
The connector captures changes only in databases whose names are listed in includedDatabases. By default, the connector captures changes in all databases. Do not also set the excludedDatabases connector configuration property.
The connector captures changes only in tables listed in includedTables. Each identifier is of the form databaseName.tableName. By default, the connector captures changes in every non-system table in each database whose changes are being captured. Do not also specify the excludedTables connector configuration property.
Possible settings are:
- ADD_FIELD: Add the key(s), merged with the columns.
- DROP: Drop the keys.
Default value is : ADD_FIELD
Default value is : false
Possible settings are:
- ADD_FIELD: Add metadata in a column named metadata.
- DROP: Drop metadata.
Default value is : ADD_FIELD
Default value is : metadata
Possible values are:
- ON_EACH_BATCH: after each batch of records consumed by this trigger, the offsets will be stored in the KV Store. This avoids consuming duplicated records, but can be costly if many events are produced.
- ON_STOP: when this trigger is stopped or killed, the offsets will be stored in the KV Store. This avoids unnecessary writes to the KV Store, but if the trigger is not stopped gracefully, the KV Store value may not be updated, leading to duplicated record consumption.
Default value is : ON_EACH_BATCH
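A sketch of selecting the ON_STOP strategy on the realtime trigger. Only the two values above are documented here; the property name offsetsCommitMode is hypothetical:

```yaml
triggers:
  - id: realtime
    type: io.kestra.plugin.debezium.sqlserver.RealtimeTrigger
    hostname: 127.0.0.1
    port: 61433
    username: sa
    password: password
    database: deb
    # Hypothetical property name; commit offsets to the KV Store
    # only when the trigger stops (fewer writes, possible
    # duplicated records if not stopped gracefully)
    offsetsCommitMode: ON_STOP
```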
Any additional configuration properties that are valid for the current driver.
Possible settings are:
- INITIAL: Takes a snapshot of the structure and data of captured tables; useful if topics should be populated with a complete representation of the data from the captured tables.
- INITIAL_ONLY: Takes a snapshot of structure and data like INITIAL, but does not transition into streaming changes once the snapshot has completed.
- SCHEMA_ONLY: Takes a snapshot of the structure of captured tables only; useful if only changes happening from now onwards should be propagated to topics.
Default value is : INITIAL
Possible settings are:
- TABLE: Split all rows by table on output, with the name database.table.
- DATABASE: Split all rows by database on output, with the name database.
- OFF: Do not split rows, resulting in a single data output.
Default value is : TABLE
Default value is : debezium-state
1 nested properties
If you would like to consume each message from change data capture in real-time and create one execution per message, you can use the io.kestra.plugin.debezium.sqlserver.RealtimeTrigger instead.
Examples
snapshotMode: INITIAL
hostname: 127.0.0.1
port: "1433"
username: sqlserver_user
password: sqlserver_passwd
database: deb
maxRecords: 100
Possible settings are:
- ADD_FIELD: Add a deleted field as a boolean.
- NULL: Send a row with all values as null.
- DROP: Don't send the deleted row.
Default value is : ADD_FIELD
Default value is : deleted
Default value is : false
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the includedColumns connector configuration property.
The connector captures changes in any database whose name is not in the excludedDatabases. Do not also set the includedDatabases connector configuration property.
The connector captures changes in any table not included in excludedTables. Each identifier is of the form databaseName.tableName. Do not also specify the includedTables connector configuration property.
Possible settings are:
- RAW: Send raw data from Debezium.
- INLINE: Send a row like in the source with only the data (after & before removed); all columns are present for each row.
- WRAP: Send a row like INLINE, but wrapped in a record field.
Default value is : INLINE
Ignore CREATE, ALTER, DROP and TRUNCATE operations.
Default value is : true
Fully-qualified names for columns are of the form databaseName.tableName.columnName. Do not also specify the excludedColumns connector configuration property.
The connector captures changes only in databases whose names are listed in includedDatabases. By default, the connector captures changes in all databases. Do not also set the excludedDatabases connector configuration property.
The connector captures changes only in tables listed in includedTables. Each identifier is of the form databaseName.tableName. By default, the connector captures changes in every non-system table in each database whose changes are being captured. Do not also specify the excludedTables connector configuration property.
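A sketch of the inclusion filters on the batch trigger, extending the example settings shown above for this trigger; the list-of-strings YAML shape is an assumption:

```yaml
triggers:
  - id: cdc
    type: io.kestra.plugin.debezium.sqlserver.Trigger
    hostname: 127.0.0.1
    port: "1433"
    username: sqlserver_user
    password: sqlserver_passwd
    database: deb
    # Capture only this database...
    includedDatabases:
      - deb
    # ...and only these tables (databaseName.tableName)
    includedTables:
      - deb.orders
      - deb.customers
```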
The interval between two polls of the schedule; this can avoid overloading the remote system with too many calls. For most triggers that depend on external systems, a minimal interval of at least PT30S is required. See ISO 8601 durations for more information on available interval values.
Default value is : 60.000000000
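For example, the polling interval above can be raised with an ISO 8601 duration on the trigger (a sketch, reusing the example settings shown for this trigger):

```yaml
triggers:
  - id: cdc
    type: io.kestra.plugin.debezium.sqlserver.Trigger
    hostname: 127.0.0.1
    port: "1433"
    username: sqlserver_user
    password: sqlserver_passwd
    database: deb
    # ISO 8601 duration: poll at most once per minute
    # (must be at least PT30S for external systems)
    interval: PT1M
```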
Possible settings are:
- ADD_FIELD: Add the key(s), merged with the columns.
- DROP: Drop the keys.
Default value is : ADD_FIELD
Default value is : false
It's not a hard limit and is evaluated every second. It is taken into account after the snapshot, if any.
It's not a hard limit and is evaluated every second.
It's not a hard limit and is evaluated every second. The properties 'maxRecords', 'maxDuration' and 'maxWait' are evaluated only after the snapshot is done.
Default value is : 3600.000000000
It's not a hard limit and is evaluated every second. It is taken into account after the snapshot, if any.
Default value is : 10.000000000
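The three soft limits above can be combined; a sketch reusing the example settings shown for this trigger, and assuming maxDuration and maxWait accept ISO 8601 durations:

```yaml
triggers:
  - id: cdc
    type: io.kestra.plugin.debezium.sqlserver.Trigger
    hostname: 127.0.0.1
    port: "1433"
    username: sqlserver_user
    password: sqlserver_passwd
    database: deb
    # Soft limits, evaluated every second after the snapshot:
    maxRecords: 100      # stop after roughly 100 records...
    maxDuration: PT1H    # ...or after one hour...
    maxWait: PT10S       # ...or after 10s without new records
```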
Possible settings are:
- ADD_FIELD: Add metadata in a column named metadata.
- DROP: Drop metadata.
Default value is : ADD_FIELD
Default value is : metadata
Any additional configuration properties that are valid for the current driver.
Possible settings are:
- INITIAL: Takes a snapshot of the structure and data of captured tables; useful if topics should be populated with a complete representation of the data from the captured tables.
- INITIAL_ONLY: Takes a snapshot of structure and data like INITIAL, but does not transition into streaming changes once the snapshot has completed.
- SCHEMA_ONLY: Takes a snapshot of the structure of captured tables only; useful if only changes happening from now onwards should be propagated to topics.
Default value is : INITIAL
Possible settings are:
- TABLE: Split all rows by table on output, with the name database.table.
- DATABASE: Split all rows by database on output, with the name database.
- OFF: Do not split rows, resulting in a single data output.
Default value is : TABLE
Default value is : debezium-state
1 nested properties
Examples
Build and push a Docker image to a registry
id: docker_build
namespace: company.team
tasks:
- id: build
type: io.kestra.plugin.docker.Build
dockerfile: |
FROM ubuntu
ARG APT_PACKAGES=""
RUN apt-get update && apt-get install -y --no-install-recommends ${APT_PACKAGES};
platforms:
- linux/amd64
tags:
- private-registry.io/unit-test:latest
buildArgs:
APT_PACKAGES: curl
labels:
unit-test: "true"
credentials:
registry: <registry.url.com>
username: <your-user>
password: <your-password>
Default value is : false
Default value is : false
Default value is : false
Default value is : HTTPS
Default value is : true
Default value is : false
1 nested properties
Examples
Run the docker/whalesay container with the command 'cowsay hello'
id: docker_run
namespace: company.team
tasks:
- id: run
type: io.kestra.plugin.docker.Run
containerImage: docker/whalesay
commands:
- cowsay
- hello
Run the docker/whalesay container with no command
id: docker_run
namespace: company.team
tasks:
- id: run
type: io.kestra.plugin.docker.Run
containerImage: docker/whalesay
Default value is : false
Default value is : []
Default value is : []
[]
Docker configuration file that can set access credentials to private container registries. Usually located in ~/.docker/config.json.
Default value is : false
Default value is : false
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
The size must be greater than 0. If omitted, the system uses 64MB.
Must be a valid mount expression as a string, e.g. /home/user:/app.
Volume mounts are disabled by default for security reasons; you must enable them in the server configuration by setting kestra.tasks.scripts.docker.volume-enabled to true.
Default value is : true
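A sketch of mounting a host directory into the container, assuming the Run task accepts a volumes list of mount expressions and that the server option above has been enabled:

```yaml
tasks:
  - id: run
    type: io.kestra.plugin.docker.Run
    containerImage: docker/whalesay
    # host:container mount expression; requires
    # kestra.tasks.scripts.docker.volume-enabled: true
    # in the server configuration
    volumes:
      - /home/user:/app
```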
1 nested properties
Examples
id: elasticsearch_bulk_load
namespace: company.team
inputs:
- id: file
type: FILE
tasks:
- id: bulk_load
type: io.kestra.plugin.elasticsearch.Bulk
connection:
hosts:
- "http://localhost:9200"
from: "{{ inputs.file }}"
Default value is : false
Default value is : 1000
Default value is : false
Default value is : false
Use this value to hash the shard instead of the id.
1 nested properties
Must be a URI like https://elasticsearch.com:9200, with scheme and port.
Must be a string with the key and value separated by a colon, e.g. Authorization: Token XYZ.
For example, if this is set to /my/path, then any client request will become /my/path/ + endpoint.
In essence, every request's endpoint is prefixed by this pathPrefix.
The path prefix is useful when Elasticsearch is behind a proxy that provides a base path or requires all paths to start with '/'; it is not intended for other purposes and should not be supplied in other scenarios.
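A sketch of a connection combining the host, header, and path-prefix settings described above; treat the headers list shape as an assumption:

```yaml
tasks:
  - id: search
    type: io.kestra.plugin.elasticsearch.Search
    connection:
      hosts:
        - "https://elasticsearch.com:9200"
      # key and value separated by a colon
      headers:
        - "Authorization: Token XYZ"
      # every request endpoint becomes /my/path/<endpoint>,
      # e.g. when Elasticsearch sits behind a proxy
      pathPrefix: "/my/path"
    indexes:
      - "my_index"
```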
Use this if the server is using a self signed SSL certificate.
Examples
id: elasticsearch_get
namespace: company.team
tasks:
- id: get
type: io.kestra.plugin.elasticsearch.Get
connection:
hosts:
- "http://localhost:9200"
index: "my_index"
key: "my_id"
which will cause the get operation to only be performed if a matching version exists and no changes have happened on the doc since then.
Default value is : false
Default value is : false
Default value is : false
Use this value to hash the shard instead of the id.
1 nested properties
Examples
id: elasticsearch_load
namespace: company.team
inputs:
- id: file
type: FILE
tasks:
- id: load
type: io.kestra.plugin.elasticsearch.Load
connection:
hosts:
- "http://localhost:9200"
from: "{{ inputs.file }}"
index: "my_index"
Default value is : false
Default value is : 1000
Default value is : false
Default value is : false
Default value is : true
Use this value to hash the shard instead of the id.
1 nested properties
Examples
Put a document with a Map.
id: elasticsearch_put
namespace: company.team
tasks:
- id: put
type: io.kestra.plugin.elasticsearch.Put
connection:
hosts:
- "http://localhost:9200"
index: "my_index"
key: "my_id"
value:
name: "John Doe"
city: "Paris"
Put a document from a JSON string.
id: elasticsearch_put
namespace: company.team
inputs:
- id: value
type: JSON
defaults: {"name": "John Doe", "city": "Paris"}
tasks:
- id: put
type: io.kestra.plugin.elasticsearch.Put
connection:
hosts:
- "http://localhost:9200"
index: "my_index"
key: "my_id"
value: "{{ inputs.value }}"
Default value is : false
Default value is : JSON
Default value is : false
Default value is : false
an immediate refresh (IMMEDIATE), wait for a refresh (WAIT_UNTIL), or ignore refreshes entirely (NONE).
Default value is : NONE
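A sketch of a Put that waits for the index refresh before returning; only the three values above are documented here, and the property name refreshPolicy is an assumption:

```yaml
tasks:
  - id: put
    type: io.kestra.plugin.elasticsearch.Put
    connection:
      hosts:
        - "http://localhost:9200"
    index: "my_index"
    key: "my_id"
    value:
      name: "John Doe"
    # Assumed property name; block until the document is
    # visible to searches (WAIT_UNTIL) instead of the
    # default NONE
    refreshPolicy: WAIT_UNTIL
```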
Use this value to hash the shard instead of the id.
Can be a string, in which case the contentType will be used, or a raw Map.
1 nested properties
Examples
Inserting a document in an index using POST request.
id: elasticsearch_request
namespace: company.team
tasks:
- id: request_post
type: io.kestra.plugin.elasticsearch.Request
connection:
hosts:
- "http://localhost:9200"
method: "POST"
endpoint: "my_index/_doc/john"
body:
name: "john"
Searching for documents using GET request.
id: elasticsearch_request
namespace: company.team
tasks:
- id: request_get
type: io.kestra.plugin.elasticsearch.Request
connection:
hosts:
- "http://localhost:9200"
method: "GET"
endpoint: "my_index/_search"
parameters:
q: 'name:"John Doe"'
Deleting document using DELETE request.
id: elasticsearch_request
namespace: company.team
tasks:
- id: request_delete
type: io.kestra.plugin.elasticsearch.Request
connection:
hosts:
- "http://localhost:9200"
method: "DELETE"
endpoint: "my_index/_doc/<_id>"
Default value is : false
Can be a JSON string or a raw Map that will be converted to JSON.
Default value is : false
Default value is : false
Default value is : GET
Use this value to hash the shard instead of the id.
1 nested properties
Get all documents from a search request and store them as a Kestra internal storage file.
Examples
id: elasticsearch_scroll
namespace: company.team
tasks:
- id: scroll
type: io.kestra.plugin.elasticsearch.Scroll
connection:
hosts:
- "http://localhost:9200"
indexes:
- "my_index"
request:
query:
term:
name:
value: 'john'
Default value is : false
Default value is : JSON
Default value is : false
Defaults to all indices.
Default value is : false
Can be a JSON string, in which case the contentType will be used, or a raw Map.
Use this value to hash the shard instead of the id.
1 nested properties
Get all documents from a search request and store them as outputs.
Examples
id: elasticsearch_search
namespace: company.team
tasks:
- id: search
type: io.kestra.plugin.elasticsearch.Search
connection:
hosts:
- "http://localhost:9200"
indexes:
- "my_index"
request:
query:
term:
name:
value: 'john'
Default value is : false
Default value is : JSON
Default value is : false
FETCH_ONE outputs the first row, FETCH outputs all rows, STORE stores all rows in a file, NONE does nothing.
Default value is : FETCH
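For large result sets, the STORE mode above writes rows to a file instead of holding them in outputs; a sketch extending the Search example below, assuming the property is named fetchType:

```yaml
tasks:
  - id: search
    type: io.kestra.plugin.elasticsearch.Search
    connection:
      hosts:
        - "http://localhost:9200"
    indexes:
      - "my_index"
    request:
      query:
        term:
          name:
            value: 'john'
    # Assumed property name; STORE writes all rows to a file
    # rather than the default FETCH (all rows as outputs)
    fetchType: STORE
```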
Defaults to all indices.
Default value is : false
Can be a JSON string, in which case the contentType will be used, or a raw Map.
Use this value to hash the shard instead of the id.
1 nested properties
Examples
id: fivetran_sync
namespace: company.team
tasks:
- id: sync
type: io.kestra.plugin.fivetran.connectors.Sync
apiKey: "api_key"
apiSecret: "api_secret"
connectorId: "connector_id"
Default value is : false
Default value is : false
If force is true and the connector is currently syncing, it will stop the sync and re-run it. If force is false, the connector will sync only if it isn't currently syncing.
Default value is : false
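For example, the force behavior described above can be enabled on the Sync task from the example:

```yaml
tasks:
  - id: sync
    type: io.kestra.plugin.fivetran.connectors.Sync
    apiKey: "api_key"
    apiSecret: "api_secret"
    connectorId: "connector_id"
    # If a sync is already running, stop it and re-run;
    # with false (the default), sync only when idle
    force: true
```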
Default value is : false
Default value is : 3600.000000000
Allows capturing the job status & logs.
Default value is : true
1 nested properties
Examples
id: fs_ftp_delete
namespace: company.team
tasks:
- id: delete
type: io.kestra.plugin.fs.ftp.Delete
host: localhost
port: 21
username: foo
password: pass
uri: "/upload/dir1/file.txt"
Default value is : false
Default value is : false
Default value is : false
Default value is : false
Default value is : true
Default value is : 21
Default value is : true
Default value is : true
1 nested properties
Examples
id: fs_ftp_download
namespace: company.team
tasks:
- id: download
type: io.kestra.plugin.fs.ftp.Download
host: localhost
port: 21
username: foo
password: pass
from: "/in/file.txt"
Default value is : false
Default value is : false
Default value is : false
Default value is : true
Default value is : 21
Default value is : true
Default value is : true
1 nested properties
Examples
Download a list of files and move them to an archive folder
id: fs_ftp_downloads
namespace: company.team
tasks:
- id: downloads
type: io.kestra.plugin.fs.ftp.Downloads
host: localhost
port: 21
username: foo
password: pass
from: "/in/"
interval: PT10S
action: MOVE
moveDirectory: "/archive/"
Default value is : false
Default value is : false
Default value is : false
Default value is : true
Default value is : 21
Default value is : false
Default value is : true
Default value is : true
1 nested properties
Examples
id: fs_ftp_list
namespace: company.team
tasks:
- id: list
type: io.kestra.plugin.fs.ftp.List
host: localhost
port: 21
username: foo
password: pass
from: "/upload/dir1/"
regExp: ".*\/dir1\/.*.(yaml|yml)"
Default value is : false
Default value is : false
Default value is : false
Default value is : true
Default value is : 21
Default value is : false
Default value is : true
Default value is : true
1 nested properties
If the destination directory doesn't exist, it will be created.
Examples
id: fs_ftp_move
namespace: company.team
tasks:
- id: move
type: io.kestra.plugin.fs.ftp.Move
host: localhost
port: 21
username: foo
password: pass
from: "/upload/dir1/file.txt"
to: "/upload/dir2/file.txt"
The full destination path (optionally including the filename).
If it ends with a /, the destination is considered a directory and the source filename will be appended.
If the destination file exists, it is deleted first.
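For example, a trailing slash on the destination makes Move keep the original filename (a sketch based on the Move example above):

```yaml
tasks:
  - id: move
    type: io.kestra.plugin.fs.ftp.Move
    host: localhost
    port: 21
    username: foo
    password: pass
    from: "/upload/dir1/file.txt"
    # Trailing slash: treated as a directory, so the file
    # ends up at /upload/archive/file.txt
    to: "/upload/archive/"
```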
Default value is : false
Default value is : false
Default value is : false
Default value is : true
Default value is : 21
Default value is : true
Default value is : true
1 nested properties
Examples
Wait for one or more files in a given FTP server's directory and process each of these files sequentially.
id: ftp_trigger_flow
namespace: company.team
tasks:
- id: for_each_file
type: io.kestra.plugin.core.flow.EachSequential
value: "{{ trigger.files }}"
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ taskrun.value | jq('.path') }}"
triggers:
- id: watch
type: io.kestra.plugin.fs.ftp.Trigger
host: localhost
port: 21
username: foo
password: bar
from: "/in/"
interval: PT10S
action: MOVE
moveDirectory: "/archive/"
Wait for one or more files in a given FTP server's directory and process each of these files sequentially. Delete files manually after processing to prevent infinite triggering.
id: ftp_trigger_flow
namespace: company.team
tasks:
- id: for_each_file
type: io.kestra.plugin.core.flow.EachSequential
value: "{{ trigger.files }}"
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ taskrun.value | jq('.name') }}"
- id: delete
type: io.kestra.plugin.fs.ftp.Delete
host: localhost
port: 21
username: foo
password: bar
uri: "/in/{{ taskrun.value | jq('.name') }}"
triggers:
- id: watch
type: io.kestra.plugin.fs.ftp.Trigger
host: localhost
port: 21
username: foo
password: bar
from: "/in/"
interval: PT10S
action: NONE
Wait for one or more files in a given FTP server's directory and process each of these files sequentially. In this example, we restrict the trigger to only wait for CSV files in the mydir directory.
id: ftp_wait_for_csv_in_mydir
namespace: company.team
tasks:
- id: each
type: io.kestra.plugin.core.flow.EachSequential
value: "{{ trigger.files }}"
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ taskrun.value | jq('.path') }}"
triggers:
- id: watch
type: io.kestra.plugin.fs.ftp.Trigger
host: localhost
port: "21"
username: foo
password: bar
from: "mydir/"
regExp: ".*.csv"
action: MOVE
moveDirectory: "archive/"
interval: PTS
Default value is : false
Default value is : 60.000000000
Default value is : false
Default value is : true
Default value is : 21
Default value is : false
Default value is : true
Default value is : true
1 nested properties
Examples
id: fs_ftp_upload
namespace: company.team
inputs:
- id: file
type: FILE
tasks:
- id: upload
type: io.kestra.plugin.fs.ftp.Upload
host: localhost
port: 21
username: foo
password: pass
from: "{{ inputs.file }}"
to: "/upload/dir2/file.txt"
Default value is : false
Default value is : false
Default value is : false
Default value is : true
Default value is : 21
Default value is : true
Default value is : true
1 nested properties
Examples
id: fs_ftp_uploads
namespace: company.team
inputs:
- id: file1
type: FILE
- id: file2
type: FILE
tasks:
- id: uploads
type: io.kestra.plugin.fs.ftp.Uploads
host: localhost
port: 21
username: foo
password: pass
from:
- "{{ inputs.file1 }}"
- "{{ inputs.file2 }}"
to: "/upload/dir2"
Default value is : false
Default value is : false
Default value is : false
Default value is : true
Default value is : 21
Default value is : true
Default value is : true
1 nested properties
Examples
id: fs_ftps_delete
namespace: company.team
tasks:
- id: delete
type: io.kestra.plugin.fs.ftps.Delete
host: localhost
port: 990
username: foo
password: pass
uri: "/upload/dir1/file.txt"
Default value is : false
Default value is : P
Default value is : false
Default value is : false
Note: This makes the SSL connection insecure, and should only be used for testing.
Default value is : false
Default value is : EXPLICIT
Default value is : true
Default value is : 990
Default value is : true
Default value is : true
1 nested properties
Examples
id: fs_ftps_download
namespace: company.team
tasks:
- id: download
type: io.kestra.plugin.fs.ftps.Download
host: localhost
port: 990
username: foo
password: pass
from: "/in/file.txt"
Default value is : false
Default value is : P
Default value is : false
Note: This makes the SSL connection insecure, and should only be used for testing.
Default value is : false
Default value is : EXPLICIT
Default value is : true
Default value is : 990
Default value is : true
Default value is : true
1 nested properties
Examples
Download a list of files and move them to an archive folder
id: fs_ftps_downloads
namespace: company.team
tasks:
- id: downloads
type: io.kestra.plugin.fs.ftps.Downloads
host: localhost
port: 990
username: foo
password: pass
from: "/in/"
interval: PT10S
action: MOVE
moveDirectory: "/archive/"
Default value is : false
Default value is : P
Default value is : false
Note: This makes the SSL connection insecure, and should only be used for testing.
Default value is : false
Default value is : EXPLICIT
Default value is : true
Default value is : 990
Default value is : false
Default value is : true
Default value is : true
1 nested properties
Examples
id: fs_ftps_list
namespace: company.team
tasks:
- id: list
type: io.kestra.plugin.fs.ftps.List
host: localhost
port: 990
username: foo
password: pass
from: "/upload/dir1/"
regExp: ".*\/dir1\/.*.(yaml|yml)"
Default value is : false
Default value is : P
Default value is : false
Note: This makes the SSL connection insecure, and should only be used for testing.
Default value is : false
Default value is : EXPLICIT
Default value is : true
Default value is : 990
Default value is : false
Default value is : true
Default value is : true
1 nested properties
If the destination directory doesn't exist, it will be created.
Examples
id: fs_ftps_move
namespace: company.team
tasks:
- id: move
type: io.kestra.plugin.fs.ftps.Move
host: localhost
port: 990
username: foo
password: pass
from: "/upload/dir1/file.txt"
to: "/upload/dir2/file.txt"
The full destination path (optionally including the filename).
If it ends with a /, the destination is considered a directory and the source filename will be appended.
If the destination file exists, it is deleted first.
Default value is : false
Default value is : P
Default value is : false
Note: This makes the SSL connection insecure, and should only be used for testing.
Default value is : false
Default value is : EXPLICIT
Default value is : true
Default value is : 990
Default value is : true
Default value is : true
1 nested properties
Examples
Wait for one or more files in a given FTPS server's directory and process each of these files sequentially.
id: ftps_trigger_flow
namespace: company.team
tasks:
- id: for_each_file
type: io.kestra.plugin.core.flow.EachSequential
value: "{{ trigger.files }}"
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ taskrun.value | jq('.path') }}"
triggers:
- id: watch
type: io.kestra.plugin.fs.ftps.Trigger
host: localhost
port: 990
username: foo
password: bar
from: "/in/"
interval: PT10S
action: MOVE
moveDirectory: "/archive/"
Wait for one or more files in a given FTPS server's directory and process each of these files sequentially. In this example, we restrict the trigger to only wait for CSV files in the mydir directory.
id: ftp_wait_for_csv_in_mydir
namespace: company.team
tasks:
- id: each
type: io.kestra.plugin.core.flow.EachSequential
value: "{{ trigger.files }}"
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ taskrun.value | jq('.path') }}"
triggers:
- id: watch
type: io.kestra.plugin.fs.ftps.Trigger
host: localhost
port: "21"
username: foo
password: bar
from: "mydir/"
regExp: ".*.csv"
action: MOVE
moveDirectory: "archive/"
interval: PTS
Default value is : P
Default value is : false
Note: This makes the SSL connection insecure, and should only be used for testing.
Default value is : 60.000000000
Default value is : false
Default value is : EXPLICIT
Default value is : true
Default value is : 990
Default value is : false
Default value is : true
Default value is : true
1 nested properties
Examples
id: fs_ftps_upload
namespace: company.team
inputs:
- id: file
type: FILE
tasks:
- id: upload
type: io.kestra.plugin.fs.ftps.Upload
host: localhost
port: 990
username: foo
password: pass
from: "{{ inputs.file }}"
to: "/upload/dir2/file.txt"
Default value is : false
Default value is : P
Default value is : false
Note: This makes the SSL connection insecure, and should only be used for testing.
Default value is : false
Default value is : EXPLICIT
Default value is : true
Default value is : 990
Default value is : true
Default value is : true
1 nested properties
Examples
id: fs_ftps_uploads
namespace: company.team
inputs:
- id: file1
type: FILE
- id: file2
type: FILE
tasks:
- id: uploads
type: io.kestra.plugin.fs.ftps.Uploads
host: localhost
port: 990
username: foo
password: pass
from:
- "{{ inputs.file1 }}"
- "{{ inputs.file2 }}"
to: "/upload/dir2"
Default value is : false
Default value is : P
Default value is : false
Note: This makes the SSL connection insecure, and should only be used for testing.
Default value is : false
Default value is : EXPLICIT
Default value is : true
Default value is : 990
Default value is : true
Default value is : true
1 nested properties
Examples
id: fs_sftp_delete
namespace: company.team
tasks:
- id: delete
type: io.kestra.plugin.fs.sftp.Delete
host: localhost
port: "22"
username: foo
password: pass
uri: "/upload/dir1/file.txt"
Default value is : false
Default value is : false
Default value is : false
To generate a PEM format key from OpenSSH, use the following command: ssh-keygen -m PEM
Default value is : false
Default value is : 22
Default value is : true
1 nested properties
Examples
id: fs_sftp_download
namespace: company.team
tasks:
- id: download
type: io.kestra.plugin.fs.sftp.Download
host: localhost
port: "22"
username: foo
password: pass
from: "/in/file.txt"
Default value is : false
Default value is : false
To generate a PEM format key from OpenSSH, use the following command: ssh-keygen -m PEM
Default value is : false
Default value is : 22
Default value is : true
1 nested properties
Examples
Download a list of files and move them to an archive folder
id: fs_sftp_downloads
namespace: company.team
tasks:
- id: downloads
type: io.kestra.plugin.fs.sftp.Downloads
host: localhost
port: "22"
username: foo
password: pass
from: "/in/"
interval: PT10S
action: MOVE
moveDirectory: "/archive/"
Default value is : false
Default value is : false
To generate a PEM format key from OpenSSH, use the following command: ssh-keygen -m PEM
Default value is : false
Default value is : 22
Default value is : false
Default value is : true
1 nested properties
Examples
id: fs_sftp_list
namespace: company.team
tasks:
- id: list
type: io.kestra.plugin.fs.sftp.List
host: localhost
port: "22"
username: foo
password: pass
from: "/upload/dir1/"
regExp: ".*\/dir1\/.*.(yaml|yml)"
Default value is : false
Default value is : false
To generate a PEM format key from OpenSSH, use the following command: ssh-keygen -m PEM
Default value is : false
Default value is : 22
Default value is : false
Default value is : true
1 nested properties
If the destination directory doesn't exist, it will be created.
Examples
id: fs_sftp_move
namespace: company.team
tasks:
- id: move
type: io.kestra.plugin.fs.sftp.Move
host: localhost
port: "22"
username: foo
password: pass
from: "/upload/dir1/file.txt"
to: "/upload/dir2/file.txt"
The full destination path (optionally including the filename).
If it ends with a /, the destination is considered a directory and the source filename will be appended.
If the destination file exists, it is deleted first.
Default value is : false
Default value is : false
To generate a PEM format key from OpenSSH, use the following command: ssh-keygen -m PEM
Default value is : false
Default value is : 22
Default value is : true
1 nested properties
Examples
Wait for one or more files in a given SFTP server's directory and process each of these files sequentially.
id: sftp_trigger_flow
namespace: company.team
tasks:
- id: for_each_file
type: io.kestra.plugin.core.flow.EachSequential
value: "{{ trigger.files }}"
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ taskrun.value | jq('.path') }}"
triggers:
- id: watch
type: io.kestra.plugin.fs.sftp.Trigger
host: localhost
port: 6622
username: foo
password: bar
from: "/in/"
interval: PT10S
action: MOVE
moveDirectory: "/archive/"
Wait for one or more files in a given SFTP server's directory and process each of these files sequentially. Delete files manually after processing to prevent infinite triggering.
id: sftp_trigger_flow
namespace: company.team
tasks:
- id: for_each_file
type: io.kestra.plugin.core.flow.EachSequential
value: "{{ trigger.files | jq('.path') }}"
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ taskrun.value }}"
- id: delete
type: io.kestra.plugin.fs.sftp.Delete
host: localhost
port: 6622
username: foo
password: bar
uri: "/in/{{ taskrun.value }}"
triggers:
- id: watch
type: io.kestra.plugin.fs.sftp.Trigger
host: localhost
port: 6622
username: foo
password: bar
from: "/in/"
interval: PT10S
action: NONE
Wait for one or more files in a given SFTP server's directory and process each of these files sequentially. In this example, we restrict the trigger to only wait for CSV files in the mydir directory.
id: ftp_wait_for_csv_in_mydir
namespace: company.team
tasks:
- id: each
type: io.kestra.plugin.core.flow.EachSequential
value: "{{ trigger.files }}"
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ taskrun.value | jq('.path') }}"
triggers:
- id: watch
type: io.kestra.plugin.fs.sftp.Trigger
host: localhost
port: "6622"
username: foo
password: bar
from: "mydir/"
regExp: ".*.csv"
action: MOVE
moveDirectory: "archive/"
interval: PTS
Default value is : false
Default value is : 60.000000000
To generate a PEM format key from OpenSSH, use the following command: ssh-keygen -m PEM
Default value is : false
Default value is : 22
Default value is : false
Default value is : true
1 nested properties
Examples
id: fs_sftp_upload
namespace: company.team
inputs:
- id: file
type: FILE
tasks:
- id: upload
type: io.kestra.plugin.fs.sftp.Upload
host: localhost
port: "22"
username: foo
password: pass
from: "{{ inputs.file }}"
to: "/upload/dir2/file.txt"
Default value is : false
Default value is : false
To generate a PEM format key from OpenSSH, use the following command: ssh-keygen -m PEM
Default value is : false
Default value is : 22
Default value is : true
1 nested properties
Examples
id: fs_sftp_uploads
namespace: company.team
inputs:
- id: file1
type: FILE
- id: file2
type: FILE
tasks:
- id: uploads
type: io.kestra.plugin.fs.sftp.Uploads
host: localhost
port: "22"
username: foo
password: pass
from:
- "{{ inputs.file1 }}"
- "{{ inputs.file2 }}"
to: "/upload/dir2"
Default value is : false
Default value is : false
To generate a PEM format key from OpenSSH, use the following command: ssh-keygen -m PEM
Default value is : false
Default value is : 22
Default value is : true
1 nested properties
Examples
id: fs_smb_delete
namespace: company.team

tasks:
  - id: delete
    type: io.kestra.plugin.fs.smb.Delete
    host: localhost
    port: 445
    username: foo
    password: pass
    uri: "/my_share/dir1/file.txt"
Default value is : false
Default value is : false
Default value is : false
Default value is : false
Default value is : 445
1 nested properties
Examples
id: fs_smb_download
namespace: company.team

tasks:
  - id: download
    type: io.kestra.plugin.fs.smb.Download
    host: localhost
    port: 445
    username: foo
    password: pass
    from: "/my_share/file.txt"
Default value is : false
Default value is : false
Default value is : false
Default value is : 445
1 nested properties
Examples
Download files from my_share and move them to an archive_share.
id: fs_smb_downloads
namespace: company.team

tasks:
  - id: downloads
    type: io.kestra.plugin.fs.smb.Downloads
    host: localhost
    port: 445
    username: foo
    password: pass
    from: "/my_share/"
    interval: PT10S
    action: MOVE
    moveDirectory: "/archive_share/"
Default value is : false
Default value is : false
Default value is : false
Default value is : 445
Default value is : false
1 nested properties
Examples
id: fs_smb_list
namespace: company.team

tasks:
  - id: list
    type: io.kestra.plugin.fs.smb.List
    host: localhost
    port: 445
    username: foo
    password: pass
    from: "/my_share/dir1/"
    regExp: ".*\/dir1\/.*.(yaml|yml)"
Default value is : false
Default value is : false
Default value is : false
Default value is : 445
Default value is : false
1 nested properties
If the destination directory doesn't exist, it will be created.
Examples
id: fs_smb_move
namespace: company.team

tasks:
  - id: move
    type: io.kestra.plugin.fs.smb.Move
    host: localhost
    port: 445
    username: foo
    password: pass
    from: "/my_share/dir1/file.txt"
    to: "/my_share/dir2/file.txt"
The full destination path (optionally including a filename). If it ends with a /, the destination is treated as a directory and the source filename is appended. If the destination file already exists, it is deleted first.
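As a sketch of the trailing-slash behavior described above (host, credentials, and paths are illustrative):

```yaml
- id: move_to_dir
  type: io.kestra.plugin.fs.smb.Move
  host: localhost
  port: 445
  username: foo
  password: pass
  from: "/my_share/dir1/file.txt"
  # trailing slash: destination is treated as a directory,
  # so the file keeps its original name (file.txt)
  to: "/my_share/dir2/"
```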
Default value is : false
Default value is : false
Default value is : false
Default value is : 445
1 nested properties
Examples
Wait for one or more files in a given SMB server's directory and process each of these files sequentially. Then move them to another share which is used as an archive.
id: smb_trigger_flow
namespace: company.team

tasks:
  - id: for_each_file
    type: io.kestra.plugin.core.flow.EachSequential
    value: "{{ trigger.files }}"
    tasks:
      - id: return
        type: io.kestra.plugin.core.debug.Return
        format: "{{ taskrun.value | jq('.path') }}"

triggers:
  - id: watch
    type: io.kestra.plugin.fs.smb.Trigger
    host: localhost
    port: 445
    username: foo
    password: bar
    from: "/my_share/in/"
    interval: PT10S
    action: MOVE
    moveDirectory: "/archive_share/"
Wait for one or more files in a given SMB server's directory and process each of these files sequentially. With the trigger's action set to NONE, each file is deleted explicitly by a Delete task after processing.
id: smb_trigger_flow
namespace: company.team

tasks:
  - id: for_each_file
    type: io.kestra.plugin.core.flow.EachSequential
    value: "{{ trigger.files }}"
    tasks:
      - id: return
        type: io.kestra.plugin.core.debug.Return
        format: "{{ taskrun.value | jq('.path') }}"
      - id: delete
        type: io.kestra.plugin.fs.smb.Delete
        host: localhost
        port: 445
        username: foo
        password: bar
        uri: "/my_share/in/{{ taskrun.value | jq('.path') }}"

triggers:
  - id: watch
    type: io.kestra.plugin.fs.smb.Trigger
    host: localhost
    port: 445
    username: foo
    password: bar
    from: "/my_share/in/"
    interval: PT10S
    action: NONE
Wait for one or more files in a given SMB server's directory (composed of the share name followed by the directory path) and process each of these files sequentially. In this example, we restrict the trigger to only wait for CSV files in the mydir directory.
id: smb_wait_for_csv_in_my_share_my_dir
namespace: company.team

tasks:
  - id: each
    type: io.kestra.plugin.core.flow.EachSequential
    value: "{{ trigger.files }}"
    tasks:
      - id: return
        type: io.kestra.plugin.core.debug.Return
        format: "{{ taskrun.value | jq('.path') }}"

triggers:
  - id: watch
    type: io.kestra.plugin.fs.smb.Trigger
    host: localhost
    port: "445"
    username: foo
    password: bar
    from: "my_share/mydir/"
    regExp: ".*.csv"
    action: MOVE
    moveDirectory: "my_share/archivedir"
    interval: PT10S
Default value is : false
Default value is : 60.000000000
Default value is : false
Default value is : 445
Default value is : false
1 nested properties
Examples
id: fs_smb_upload
namespace: company.team

inputs:
  - id: file
    type: FILE

tasks:
  - id: upload
    type: io.kestra.plugin.fs.smb.Upload
    host: localhost
    port: 445
    username: foo
    password: pass
    from: "{{ inputs.file }}"
    to: "/my_share/dir2/file.txt"
Default value is : false
Default value is : false
Default value is : false
Default value is : 445
1 nested properties
Examples
id: fs_smb_uploads
namespace: company.team

inputs:
  - id: file1
    type: FILE
  - id: file2
    type: FILE

tasks:
  - id: uploads
    type: io.kestra.plugin.fs.smb.Uploads
    host: localhost
    port: 445
    username: foo
    password: pass
    from:
      - "{{ inputs.file1 }}"
      - "{{ inputs.file2 }}"
    to: "/my_share/dir2"
Default value is : false
Default value is : false
Default value is : false
Default value is : 445
1 nested properties
Examples
Run SSH command using password authentication
id: fs_ssh_command
namespace: company.team

tasks:
  - id: command
    type: io.kestra.plugin.fs.ssh.Command
    host: localhost
    port: "22"
    authMethod: PASSWORD
    username: foo
    password: pass
    commands: ['ls']
Run SSH command using public key authentication (must be an OpenSSH private key)
id: fs_ssh_command
namespace: company.team

tasks:
  - id: command
    type: io.kestra.plugin.fs.ssh.Command
    host: localhost
    port: "22"
    authMethod: PUBLIC_KEY
    username: root
    privateKey: "{{ secret('SSH_RSA_PRIVATE_KEY') }}"
    commands: ['touch kestra_was_here']
Run SSH command using the local OpenSSH configuration
id: ssh
namespace: company.team

tasks:
  - id: ssh
    type: io.kestra.plugin.fs.ssh.Command
    authMethod: OPEN_SSH
    useOpenSSHConfig: true
    host: localhost
    password: pass
    commands:
      - echo "Hello World"
Default value is : false
Default value is : PASSWORD
Default value is : false
Default value is : false
Default value is : false
Default value is : "~/.ssh/config"
Default value is : 22
Default value is : "no"
Default value is : true
1 nested properties
Default value is : false
Default value is : false
Default value is : false
Default value is : ["https://www.googleapis.com/auth/cloud-platform"]
1 nested properties
The value may be null.
If true, BigQuery treats missing trailing columns as null values. If false, records with missing trailing columns are treated as bad records, and if there are too many bad records, an invalid error is returned in the job result. By default, rows with missing trailing columns are considered bad records.
By default, quoted newlines are not allowed.
The supported values are UTF-8 or ISO-8859-1. The default value is UTF-8. BigQuery decodes the data after the raw, binary data has been split using the values set in the quote and fieldDelimiter properties.
BigQuery converts the string to ISO-8859-1 encoding, and then uses the first byte of the encoded string to split the data in its raw, binary state. BigQuery also supports the escape sequence "\t" to specify a tab separator. The default value is a comma (',').
BigQuery converts the string to ISO-8859-1 encoding, and then uses the first byte of the encoded string to split the data in its raw, binary state. The default value is a double-quote ('"'). If your data does not contain quoted sections, set the property value to an empty string. If your data contains quoted newline characters, you must also set the allowQuotedNewLines property to true.
The default value is 0. This property is useful if you have header rows in the file that should be skipped.
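Taken together, the CSV options described above might be set like this (a sketch; the values are illustrative):

```yaml
csvOptions:
  allowJaggedRows: true       # missing trailing columns become NULL instead of bad records
  allowQuotedNewLines: true   # needed when quoted fields contain newline characters
  encoding: UTF-8
  fieldDelimiter: ";"
  quote: "\""
  skipLeadingRows: 1          # skip a single header row
```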
Examples
id: gcp_bq_copy
namespace: company.team

tasks:
  - id: copy
    type: io.kestra.plugin.gcp.bigquery.Copy
    operationType: COPY
    sourceTables:
      - "my_project.my_dataset.my_table$20130908"
    destinationTable: "my_project.my_dataset.my_table"
If not provided, a new table is created.
- COPY: the source and destination tables have the same table type.
- SNAPSHOT: the source table type is TABLE and the destination table type is SNAPSHOT.
- RESTORE: the source table type is SNAPSHOT and the destination table type is TABLE.
- CLONE: the source and destination tables have the same table type, but only unique data is billed.
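For instance, a SNAPSHOT operation might look like the following sketch (table names are illustrative):

```yaml
- id: snapshot
  type: io.kestra.plugin.gcp.bigquery.Copy
  operationType: SNAPSHOT
  sourceTables:
    - "my_project.my_dataset.my_table"
  destinationTable: "my_project.my_dataset.my_table_snapshot"
```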
Can be table or partitions.
Default value is : false
Default value is : false
A valid query will mostly return an empty response with some processing statistics, while an invalid query will return the same error as it would if it were an actual run.
Default value is : false
If this time limit is exceeded, BigQuery may attempt to terminate the job.
You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters, can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter and each label in the list must have a different key.
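A sketch of labels following those rules (keys and values are illustrative):

```yaml
labels:
  team: data-eng   # lowercase letters, digits, underscores, dashes; starts with a letter
  env: prod
```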
This property is experimental and might be subject to change or removed.
See Dataset Location
Default value is : false
Message is tested as a substring of the full message, and is case insensitive.
Default value is : ["due to concurrent update", "Retrying the job may solve the problem"]
Default value is : ["rateLimitExceeded", "jobBackendError", "internalError", "jobInternalError"]
Default value is : ["https://www.googleapis.com/auth/cloud-platform"]
1 nested properties
Examples
id: gcp_bq_copy_partitions
namespace: company.team

tasks:
  - id: copy_partitions
    type: io.kestra.plugin.gcp.bigquery.CopyPartitions
    projectId: my-project
    dataset: my-dataset
    table: my-table
    destinationTable: my-dest-table
    partitionType: DAY
    from: "{{ now() | dateAdd(-30, 'DAYS') }}"
    to: "{{ now() | dateAdd(-7, 'DAYS') }}"
If the partition:
- is a numeric range, it must be a valid integer
- is a date, it must be a valid datetime, like {{ now() }}
If the partition:
- is a numeric range, it must be a valid integer
- is a date, it must be a valid datetime, like {{ now() }}
Default value is : false
If not provided, a new table is created.
Default value is : false
A valid query will mostly return an empty response with some processing statistics, while an invalid query will return the same error as it would if it were an actual run.
Default value is : false
If this time limit is exceeded, BigQuery may attempt to terminate the job.
You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters, can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter and each label in the list must have a different key.
This property is experimental and might be subject to change or removed.
See Dataset Location
Default value is : false
Message is tested as a substring of the full message, and is case insensitive.
Default value is : ["due to concurrent update", "Retrying the job may solve the problem"]
Default value is : ["rateLimitExceeded", "jobBackendError", "internalError", "jobInternalError"]
Default value is : ["https://www.googleapis.com/auth/cloud-platform"]
1 nested properties
Examples
Create a dataset if it does not exist
id: gcp_bq_create_dataset
namespace: company.team

tasks:
  - id: create_dataset
    type: io.kestra.plugin.gcp.bigquery.CreateDataset
    name: "my_dataset"
    location: "EU"
    ifExists: "SKIP"
Default value is : false
Once this property is set, all newly-created partitioned tables in the dataset will have an expirationMs property in the timePartitioning settings set to this value. Changing the value only affects new tables, not existing ones. The storage in a partition will have an expiration time of its partition time plus this value. Setting this property overrides the use of defaultTableExpirationMs for partitioned tables: only one of defaultTableExpirationMs and defaultPartitionExpirationMs will be used for any new partitioned table. If you provide an explicit timePartitioning.expirationMs when creating or updating a partitioned table, that value takes precedence over the default partition expiration time indicated by this property. The value may be null.
The minimum value is 3600000 milliseconds (one hour). Once this property is set, all newly-created tables in the dataset will have an expirationTime property set to the creation time plus the value in this property, and changing the value will only affect new tables, not existing ones. When the expirationTime for a given table is reached, that table will be deleted automatically. If a table's expirationTime is modified or removed before the table expires, or if you provide an explicit expirationTime when creating a table, that value takes precedence over the default expiration time indicated by this property. This property is experimental and might be subject to change or removed.
A user-friendly description for the dataset.
Default value is : false
Default value is : ERROR
This property is experimental and might be subject to change or removed. See Dataset Location
Default value is : false
Message is tested as a substring of the full message, and is case insensitive.
Default value is : ["due to concurrent update", "Retrying the job may solve the problem"]
Default value is : ["rateLimitExceeded", "jobBackendError", "internalError", "jobInternalError"]
Default value is : ["https://www.googleapis.com/auth/cloud-platform"]
1 nested properties
Examples
id: gcp_bq_create_table
namespace: company.team

tasks:
  - id: create_table
    type: io.kestra.plugin.gcp.bigquery.CreateTable
    projectId: my-project
    dataset: my-dataset
    table: my-table
    tableDefinition:
      type: TABLE
      schema:
        fields:
          - name: id
            type: INT64
          - name: name
            type: STRING
      standardTableDefinition:
        clustering:
          - id
          - name
    friendlyName: new_table
Default value is : false
Default value is : false
If not present, the table will persist indefinitely. Expired tables will be deleted and their storage reclaimed.
This property is experimental and might be subject to change or removed.
See Dataset Location
Default value is : false
Message is tested as a substring of the full message, and is case insensitive.
Default value is : ["due to concurrent update", "Retrying the job may solve the problem"]
Default value is : ["rateLimitExceeded", "jobBackendError", "internalError", "jobInternalError"]
Default value is : ["https://www.googleapis.com/auth/cloud-platform"]
1 nested properties
Examples
Delete a dataset.
id: gcp_bq_delete_dataset
namespace: company.team

tasks:
  - id: delete_dataset
    type: io.kestra.plugin.gcp.bigquery.DeleteDataset
    name: "my-dataset"
    deleteContents: true
Default value is : false
If not provided, attempting to delete a non-empty dataset will result in an exception being thrown.
Default value is : false
This property is experimental and might be subject to change or removed.
See Dataset Location
Default value is : false
Message is tested as a substring of the full message, and is case insensitive.
Default value is : ["due to concurrent update", "Retrying the job may solve the problem"]
Default value is : ["rateLimitExceeded", "jobBackendError", "internalError", "jobInternalError"]
Default value is : ["https://www.googleapis.com/auth/cloud-platform"]
1 nested properties
Examples
id: gcp_bq_delete_partitions
namespace: company.team

tasks:
  - id: delete_partitions
    type: io.kestra.plugin.gcp.bigquery.DeletePartitions
    projectId: my-project
    dataset: my-dataset
    table: my-table
    partitionType: DAY
    from: "{{ now() | dateAdd(-30, 'DAYS') }}"
    to: "{{ now() | dateAdd(-7, 'DAYS') }}"
If the partition:
- is a numeric range, it must be a valid integer
- is a date, it must be a valid datetime, like {{ now() }}
If the partition:
- is a numeric range, it must be a valid integer
- is a date, it must be a valid datetime, like {{ now() }}
Default value is : false
Default value is : false
This property is experimental and might be subject to change or removed.
See Dataset Location
Default value is : false
Message is tested as a substring of the full message, and is case insensitive.
Default value is : ["due to concurrent update", "Retrying the job may solve the problem"]
Default value is : ["rateLimitExceeded", "jobBackendError", "internalError", "jobInternalError"]
Default value is : ["https://www.googleapis.com/auth/cloud-platform"]
1 nested properties
Examples
Delete a partition
id: gcp_bq_delete_table
namespace: company.team

tasks:
  - id: delete_table
    type: io.kestra.plugin.gcp.bigquery.DeleteTable
    projectId: my-project
    dataset: my-dataset
    table: my-table$20130908
Default value is : false
Default value is : false
This property is experimental and might be subject to change or removed.
See Dataset Location
Default value is : false
Message is tested as a substring of the full message, and is case insensitive.
Default value is : ["due to concurrent update", "Retrying the job may solve the problem"]
Default value is : ["rateLimitExceeded", "jobBackendError", "internalError", "jobInternalError"]
Default value is : ["https://www.googleapis.com/auth/cloud-platform"]
1 nested properties
Examples
Extract a BigQuery table to a GCS bucket.
id: gcp_bq_extract_to_gcs
namespace: company.team

tasks:
  - id: extract_to_gcs
    type: io.kestra.plugin.gcp.bigquery.ExtractToGcs
    destinationUris:
      - "gs://bucket_name/filename.csv"
    sourceTable: "my_project.my_dataset.my_table"
    format: CSV
    fieldDelimiter: ';'
    printHeader: true
Default value is : false
Default value is : false
The labels associated with this job. You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters, can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter and each label in the list must have a different key.
This property is experimental and might be subject to change or removed.
See Dataset Location
Default value is : false
Message is tested as a substring of the full message, and is case insensitive.
Default value is : ["due to concurrent update", "Retrying the job may solve the problem"]
Default value is : ["rateLimitExceeded", "jobBackendError", "internalError", "jobInternalError"]
Default value is : ["https://www.googleapis.com/auth/cloud-platform"]
[Optional] If destinationFormat is set to "AVRO", this flag indicates whether to enable extracting applicable column types (such as TIMESTAMP) to their corresponding AVRO logical types (timestamp-micros), instead of only using their raw types (avro-long).
1 nested properties
Examples
Load a CSV file from an input file
id: gcp_bq_load
namespace: company.team

tasks:
  - id: load
    type: io.kestra.plugin.gcp.bigquery.Load
    from: "{{ inputs.file }}"
    destinationTable: "my_project.my_dataset.my_table"
    format: CSV
    csvOptions:
      fieldDelimiter: ";"
Default value is : false
If not provided, a new table is created.
Default value is : false
Default value is : true
If true, the extra values are ignored. If false, records with extra columns are treated as bad records, and if there are too many bad records, an invalid error is returned in the job result. By default unknown values are not allowed.
This property is experimental and might be subject to change or removed.
See Dataset Location
Default value is : false
If the number of bad records exceeds this value, an invalid error is returned in the job result. By default, no bad record is ignored.
Message is tested as a substring of the full message, and is case insensitive.
Default value is : ["due to concurrent update", "Retrying the job may solve the problem"]
Default value is : ["rateLimitExceeded", "jobBackendError", "internalError", "jobInternalError"]
The schema can be omitted if the destination table already exists, or if you're loading data from a Google Cloud Datastore backup (i.e. DATASTORE_BACKUP format option).
schema:
  fields:
    - name: colA
      type: STRING
    - name: colB
      type: NUMERIC
See type from StandardSQLTypeName
Schema update options are supported in two cases: when writeDisposition is WRITE_APPEND; when writeDisposition is WRITE_TRUNCATE and the destination table is a partition of a table, specified by partition decorators. For normal tables, WRITE_TRUNCATE will always overwrite the schema.
Default value is : ["https://www.googleapis.com/auth/cloud-platform"]
Default value is : DAY
1 nested properties
Examples
Load an Avro file from a GCS bucket
id: gcp_bq_load_from_gcs
namespace: company.team

tasks:
  - id: http_download
    type: io.kestra.plugin.core.http.Download
    uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv

  - id: csv_to_ion
    type: io.kestra.plugin.serdes.csv.CsvToIon
    from: "{{ outputs.http_download.uri }}"
    header: true

  - id: ion_to_avro
    type: io.kestra.plugin.serdes.avro.IonToAvro
    from: "{{ outputs.csv_to_ion.uri }}"
    schema: |
      {
        "type": "record",
        "name": "Order",
        "namespace": "com.example.order",
        "fields": [
          {"name": "order_id", "type": "int"},
          {"name": "customer_name", "type": "string"},
          {"name": "customer_email", "type": "string"},
          {"name": "product_id", "type": "int"},
          {"name": "price", "type": "double"},
          {"name": "quantity", "type": "int"},
          {"name": "total", "type": "double"}
        ]
      }

  - id: load_from_gcs
    type: io.kestra.plugin.gcp.bigquery.LoadFromGcs
    from:
      - "{{ outputs.ion_to_avro.uri }}"
    destinationTable: "my_project.my_dataset.my_table"
    format: AVRO
    avroOptions:
      useAvroLogicalTypes: true
Load a CSV file with a defined schema
id: gcp_bq_load_files_test
namespace: company.team

tasks:
  - id: load_files_test
    type: io.kestra.plugin.gcp.bigquery.LoadFromGcs
    destinationTable: "myDataset.myTable"
    ignoreUnknownValues: true
    schema:
      fields:
        - name: colA
          type: STRING
        - name: colB
          type: NUMERIC
        - name: colC
          type: STRING
    format: CSV
    csvOptions:
      allowJaggedRows: true
      encoding: UTF-8
      fieldDelimiter: ","
    from:
      - gs://myBucket/myFile.csv
Default value is : false
If not provided, a new table is created.
Default value is : false
The fully-qualified URIs that point to source data in Google Cloud Storage (e.g. gs://bucket/path). Each URI can contain one '*' wildcard character and it must come after the 'bucket' name.
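For instance, a single wildcard after the bucket name can match many objects (bucket and path are illustrative):

```yaml
from:
  - "gs://my_bucket/exports/part-*.avro"  # one '*' allowed, after the bucket name
```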
If true, the extra values are ignored. If false, records with extra columns are treated as bad records, and if there are too many bad records, an invalid error is returned in the job result. By default unknown values are not allowed.
This property is experimental and might be subject to change or removed.
See Dataset Location
Default value is : false
If the number of bad records exceeds this value, an invalid error is returned in the job result. By default, no bad record is ignored.
Message is tested as a substring of the full message, and is case insensitive.
Default value is : ["due to concurrent update", "Retrying the job may solve the problem"]
Default value is : ["rateLimitExceeded", "jobBackendError", "internalError", "jobInternalError"]
The schema can be omitted if the destination table already exists, or if you're loading data from a Google Cloud Datastore backup (i.e. DATASTORE_BACKUP format option).
schema:
  fields:
    - name: colA
      type: STRING
    - name: colB
      type: NUMERIC
See type from StandardSQLTypeName
Schema update options are supported in two cases: when writeDisposition is WRITE_APPEND; when writeDisposition is WRITE_TRUNCATE and the destination table is a partition of a table, specified by partition decorators. For normal tables, WRITE_TRUNCATE will always overwrite the schema.
Default value is : ["https://www.googleapis.com/auth/cloud-platform"]
Default value is : DAY
1 nested properties
Examples
Create a table with a custom query.
id: gcp_bq_query
namespace: company.team

tasks:
  - id: query
    type: io.kestra.plugin.gcp.bigquery.Query
    destinationTable: "my_project.my_dataset.my_table"
    writeDisposition: WRITE_APPEND
    sql: |
      SELECT
        "hello" as string,
        NULL AS `nullable`,
        1 as int,
        1.25 AS float,
        DATE("2008-12-25") AS date,
        DATETIME "2008-12-25 15:30:00.123456" AS datetime,
        TIME(DATETIME "2008-12-25 15:30:00.123456") AS time,
        TIMESTAMP("2008-12-25 15:30:00.123456") AS timestamp,
        ST_GEOGPOINT(50.6833, 2.9) AS geopoint,
        ARRAY(SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3) AS `array`,
        STRUCT(4 AS x, 0 AS y, ARRAY(SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3) AS z) AS `struct`
Execute a query and fetch results sets on another task.
id: gcp_bq_query
namespace: company.team

tasks:
  - id: fetch
    type: io.kestra.plugin.gcp.bigquery.Query
    fetch: true
    sql: |
      SELECT 1 as id, "John" as name
      UNION ALL
      SELECT 2 as id, "Doe" as name

  - id: use_fetched_data
    type: io.kestra.plugin.core.debug.Return
    format: |
      {% for row in outputs.fetch.rows %}
      id : {{ row.id }}, name: {{ row.name }}
      {% endfor %}
Default value is : false
If true, the query is allowed to create large results at a slight cost in performance. destinationTable must be provided.
This dataset is used for all unqualified table names used in the query.
If not provided, a new table is created.
Default value is : false
A valid query will mostly return an empty response with some processing statistics, while an invalid query will return the same error as it would if it were an actual run.
Default value is : false
Default value is : false
Default value is : false
If set to false, allowLargeResults must be true.
Default value is : true
If this time limit is exceeded, BigQuery may attempt to terminate the job.
You can use these to organize and group your jobs. Label keys and values can be no longer than 63 characters, can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. Label values are optional. Label keys must start with a letter and each label in the list must have a different key.
By default this property is set to false.
Default value is : false
This property is experimental and might be subject to change or removed.
See Dataset Location
Default value is : false
The maximum number of rows of data to return per page of results. Setting this flag to a small value such as 1000 and then paging through results might improve reliability when the query result set is large. In addition to this limit, responses are also limited to 10 MB. By default, there is no maximum row count, and only the byte limit applies.
Queries that have resource usage beyond this tier will fail (without incurring a charge). If unspecified, this will be set to your project default.
Queries that will have bytes billed beyond this limit will fail (without incurring a charge). If unspecified, this will be set to your project default.
Default value is : INTERACTIVE
Message is tested as a substring of the full message, and is case insensitive.
Default value is : ["due to concurrent update", "Retrying the job may solve the problem"]
Default value is : ["rateLimitExceeded", "jobBackendError", "internalError", "jobInternalError"]
Schema update options are supported in two cases:
- when writeDisposition is WRITE_APPEND
- when writeDisposition is WRITE_TRUNCATE and the destination table is a partition of a table, specified by partition decorators.
For normal tables, WRITE_TRUNCATE will always overwrite the schema.
Default value is : - https://www.googleapis.com/auth/cloud-platform
[
"https://www.googleapis.com/auth/cloud-platform"
]
Default value is : false
Default value is : DAY
A valid query will return a mostly empty response with some processing statistics, while an invalid query will return the same error it would if it wasn't a dry run.
Default value is : false
The query cache is a best-effort cache that is flushed whenever tables referenced by the query are modified. Moreover, the query cache is only available when destinationTable is not set.
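A minimal sketch of explicitly disabling the cache for a query. The useQueryCache property name is assumed from the underlying BigQuery API and should be verified against the task's reference:

```yaml
id: bq_no_cache_sketch
namespace: company.team
tasks:
  - id: query
    type: io.kestra.plugin.gcp.bigquery.Query
    sql: SELECT COUNT(*) FROM `my_project.my_dataset.my_table`
    # Force a fresh run instead of reusing cached results
    useQueryCache: false
```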
1 nested properties
Examples
id: gcp_bq_storage_write
namespace: company.team
tasks:
  - id: read_data
    type: io.kestra.plugin.core.http.Download
    uri: https://dummyjson.com/products/1
  - id: storage_write
    type: io.kestra.plugin.gcp.bigquery.StorageWrite
    from: "{{ outputs.read_data.uri }}"
    destinationTable: "my_project.my_dataset.my_table"
    writeStreamType: DEFAULT
The table must be created beforehand.
Default value is : false
Default value is : 1000
Default value is : false
This property is experimental and might be subject to change or removed.
See Dataset Location
Default value is : false
Default value is : - https://www.googleapis.com/auth/cloud-platform
[
"https://www.googleapis.com/auth/cloud-platform"
]
1 nested properties
Default value is : DEFAULT
Examples
id: gcp_bq_table_metadata
namespace: company.team
tasks:
  - id: table_metadata
    type: io.kestra.plugin.gcp.bigquery.TableMetadata
    projectId: my-project
    dataset: my-dataset
    table: my-table
Default value is : false
Default value is : false
If the policy is SKIP, the output will contain only null values; otherwise, an error is raised.
Default value is : ERROR
This property is experimental and might be subject to change or removed.
See Dataset Location
Default value is : false
Message is tested as a substring of the full message, and is case insensitive.
Default value is : `- due to concurrent update
- Retrying the job may solve the problem`
[
"due to concurrent update",
"Retrying the job may solve the problem"
]
Default value is : `- rateLimitExceeded
- jobBackendError
- internalError
- jobInternalError`
[
"rateLimitExceeded",
"jobBackendError",
"internalError",
"jobInternalError"
]
Default value is : - https://www.googleapis.com/auth/cloud-platform
[
"https://www.googleapis.com/auth/cloud-platform"
]
1 nested properties
Examples
Wait for a SQL query to return results and iterate through the rows.
id: bigquery-listen
namespace: company.team
tasks:
  - id: each
    type: io.kestra.plugin.core.flow.EachSequential
    tasks:
      - id: return
        type: io.kestra.plugin.core.debug.Return
        format: "{{ taskrun.value }}"
    value: "{{ trigger.rows }}"
triggers:
  - id: watch
    type: io.kestra.plugin.gcp.bigquery.Trigger
    interval: "PT5M"
    sql: "SELECT * FROM `myproject.mydataset.mytable`"
    fetch: true
Default value is : false
Default value is : false
Default value is : false
The interval between two consecutive polls of the schedule; this can avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the minimal interval should be at least PT30S. See ISO 8601 durations for more information on available interval values.
Default value is : 60.000000000
By default this property is set to false.
Default value is : false
Default value is : false
Default value is : - https://www.googleapis.com/auth/cloud-platform
[
"https://www.googleapis.com/auth/cloud-platform"
]
Default value is : false
1 nested properties
Examples
id: gcp_bq_update_dataset
namespace: company.team
tasks:
  - id: update_dataset
    type: io.kestra.plugin.gcp.bigquery.UpdateDataset
    name: "my_dataset"
    location: "EU"
    friendlyName: "new Friendly Name"
Default value is : false
Once this property is set, all newly created partitioned tables in the dataset will have an expirationMs property in their timePartitioning settings set to this value. Changing the value only affects new tables, not existing ones. The storage in a partition will have an expiration time of its partition time plus this value. Setting this property overrides the use of defaultTableExpirationMs for partitioned tables: only one of defaultTableExpirationMs and defaultPartitionExpirationMs will be used for any new partitioned table. If you provide an explicit timePartitioning.expirationMs when creating or updating a partitioned table, that value takes precedence over the default partition expiration time indicated by this property. The value may be null.
The minimum value is 3600000 milliseconds (one hour). Once this property is set, all newly created tables in the dataset will have an expirationTime property set to the creation time plus the value in this property, and changing the value will only affect new tables, not existing ones. When the expirationTime for a given table is reached, that table will be deleted automatically. If a table's expirationTime is modified or removed before the table expires, or if you provide an explicit expirationTime when creating a table, that value takes precedence over the default expiration time indicated by this property. This property is experimental and might be subject to change or removed.
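The expiration defaults above can be sketched in an UpdateDataset flow like this (the dataset name is hypothetical; the property names follow the descriptions above):

```yaml
id: bq_dataset_expiration_sketch
namespace: company.team
tasks:
  - id: set_default_expiration
    type: io.kestra.plugin.gcp.bigquery.UpdateDataset
    name: my_dataset
    # 48 hours in milliseconds; must be at least 3600000 ms (one hour)
    defaultTableExpirationMs: 172800000
```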
A user-friendly description for the dataset.
Default value is : false
This property is experimental and might be subject to change or removed. See Dataset Location
Default value is : false
Message is tested as a substring of the full message, and is case insensitive.
Default value is : `- due to concurrent update
- Retrying the job may solve the problem`
[
"due to concurrent update",
"Retrying the job may solve the problem"
]
Default value is : `- rateLimitExceeded
- jobBackendError
- internalError
- jobInternalError`
[
"rateLimitExceeded",
"jobBackendError",
"internalError",
"jobInternalError"
]
Default value is : - https://www.googleapis.com/auth/cloud-platform
[
"https://www.googleapis.com/auth/cloud-platform"
]
1 nested properties
Examples
id: gcp_bq_update_table
namespace: company.team
tasks:
  - id: update_table
    type: io.kestra.plugin.gcp.bigquery.UpdateTable
    projectId: my-project
    dataset: my-dataset
    table: my-table
    expirationDuration: PT48H
Default value is : false
Default value is : false
If not present, the table will persist indefinitely. Expired tables will be deleted and their storage reclaimed.
This property is experimental and might be subject to change or removed.
See Dataset Location
Default value is : false
Message is tested as a substring of the full message, and is case insensitive.
Default value is : `- due to concurrent update
- Retrying the job may solve the problem`
[
"due to concurrent update",
"Retrying the job may solve the problem"
]
Default value is : `- rateLimitExceeded
- jobBackendError
- internalError
- jobInternalError`
[
"rateLimitExceeded",
"jobBackendError",
"internalError",
"jobInternalError"
]
Default value is : - https://www.googleapis.com/auth/cloud-platform
[
"https://www.googleapis.com/auth/cloud-platform"
]
1 nested properties
For example, user email if the type is USER.
If true, the extra values are ignored. If false, records with extra columns are treated as bad records, and if there are too many bad records, an invalid error is returned in the job result.
If the number of bad records exceeds this value, an invalid error is returned in the job result.
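A sketch of how these tolerance settings might be combined in a BigQuery load task (the file input and table name are hypothetical; property names are assumed from the descriptions above and should be verified against the task reference):

```yaml
id: bq_load_tolerant_sketch
namespace: company.team
inputs:
  - id: file
    type: FILE
tasks:
  - id: load
    type: io.kestra.plugin.gcp.bigquery.Load
    from: "{{ inputs.file }}"
    destinationTable: my_project.my_dataset.my_table
    format: CSV
    # Ignore columns not present in the table schema
    ignoreUnknownValues: true
    # Fail the job once more than 10 records are invalid
    maxBadRecords: 10
```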
Each URI can contain one '*' wildcard character, which must come after the bucket's name. Size limits related to load jobs apply to external data sources, plus an additional limit of 10 GB maximum size across all URIs.
By default, Field.Mode.NULLABLE is used.
3 nested properties
If type is UserDefinedFunction.Type.INLINE, this method returns a code blob.
If type is UserDefinedFunction.Type.FROM_URI, the method returns a Google Cloud Storage URI (e.g. gs://bucket/path).
Examples
Create a cluster then list them using a service account.
id: gcp_g_cloud_cli
namespace: company.team
tasks:
  - id: g_cloud_cli
    type: io.kestra.plugin.gcp.cli.GCloudCLI
    projectId: my-gcp-project
    serviceAccount: "{{ secret('gcp-sa') }}"
    commands:
      - gcloud container clusters create simple-cluster --region=europe-west3
      - gcloud container clusters list
Create a GCS bucket.
id: gcp_g_cloud_cli
namespace: company.team
tasks:
  - id: g_cloud_cli
    type: io.kestra.plugin.gcp.cli.GCloudCLI
    projectId: my-gcp-project
    serviceAccount: "{{ secret('gcp-sa') }}"
    commands:
      - gcloud storage buckets create gs://my-bucket
Output the result of a command.
id: gcp_g_cloud_cli
namespace: company.team
tasks:
  - id: g_cloud_cli
    type: io.kestra.plugin.gcp.cli.GCloudCLI
    projectId: my-gcp-project
    serviceAccount: "{{ secret('gcp-sa') }}"
    commands:
      # Outputs as a flow output for UI display
      - gcloud pubsub topics list --format=json | tr -d '\n' | xargs -0 -I {} echo '::{"outputs":{"gcloud":{}}}::'
      # Outputs as a file, preferred way for large payloads
      - gcloud storage ls --json > storage.json
Default value is : false
Default value is : google/cloud-sdk
Default value is : false
Default value is : false
Must be a list of glob expressions relative to the current working directory, for example: my-dir/**, my-dir/*/**, or my-dir/my-file.txt.
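The glob-based output capture above can be sketched like this (the command and file paths are hypothetical), assuming the standard Kestra outputFiles property on script-style tasks:

```yaml
id: gcloud_output_files_sketch
namespace: company.team
tasks:
  - id: g_cloud_cli
    type: io.kestra.plugin.gcp.cli.GCloudCLI
    serviceAccount: "{{ secret('gcp-sa') }}"
    commands:
      - mkdir -p files
      - gcloud storage ls --json > files/storage.json
    # Glob expressions relative to the working directory
    outputFiles:
      - files/**
```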
1 nested properties
Example: projects/[project_id]/locations/[region]/services/[service_id]
If not specified, a default container image will be used.
Example: projects/[project_id]/regions/[region]/clusters/[cluster_name]
Examples
id: gcp_dataproc_py_spark_submit
namespace: company.team
tasks:
  - id: py_spark_submit
    type: io.kestra.plugin.gcp.dataproc.batches.PySparkSubmit
    mainPythonFileUri: 'gs://spark-jobs-kestra/pi.py'
    name: test-pyspark
    region: europe-west3
Hadoop Compatible File System (HCFS) URIs should be accessible from the cluster. Can be a GCS file with the gs:// prefix, an HDFS file on the cluster with the hdfs:// prefix, or a local file on the cluster with the file:// prefix
Default value is : false
Hadoop Compatible File System (HCFS) URIs should be accessible from the cluster. Can be a GCS file with the gs:// prefix, an HDFS file on the cluster with the hdfs:// prefix, or a local file on the cluster with the file:// prefix
Do not include arguments that can be set as batch properties, such as --conf, since a collision can occur that causes an incorrect batch submission.
Default value is : false
Hadoop Compatible File System (HCFS) URIs should be accessible from the cluster. Can be a GCS file with the gs:// prefix, an HDFS file on the cluster with the hdfs:// prefix, or a local file on the cluster with the file:// prefix
Hadoop Compatible File System (HCFS) URIs should be accessible from the cluster. Can be a GCS file with the gs:// prefix, an HDFS file on the cluster with the hdfs:// prefix, or a local file on the cluster with the file:// prefix
Default value is : false
Default value is : - https://www.googleapis.com/auth/cloud-platform
Default value is : - https://www.googleapis.com/auth/cloud-platform
[
"https://www.googleapis.com/auth/cloud-platform"
]
1 nested properties
Examples
id: gcp_dataproc_r_spark_submit
namespace: company.team
tasks:
  - id: r_spark_submit
    type: io.kestra.plugin.gcp.dataproc.batches.RSparkSubmit
    mainRFileUri: 'gs://spark-jobs-kestra/dataframe.r'
    name: test-rspark
    region: europe-west3
Hadoop Compatible File System (HCFS) URIs should be accessible from the cluster. Can be a GCS file with the gs:// prefix, an HDFS file on the cluster with the hdfs:// prefix, or a local file on the cluster with the file:// prefix
Default value is : false
Hadoop Compatible File System (HCFS) URIs should be accessible from the cluster. Can be a GCS file with the gs:// prefix, an HDFS file on the cluster with the hdfs:// prefix, or a local file on the cluster with the file:// prefix
Do not include arguments that can be set as batch properties, such as --conf, since a collision can occur that causes an incorrect batch submission.
Default value is : false
Hadoop Compatible File System (HCFS) URIs should be accessible from the cluster. Can be a GCS file with the gs:// prefix, an HDFS file on the cluster with the hdfs:// prefix, or a local file on the cluster with the file:// prefix
Hadoop Compatible File System (HCFS) URIs should be accessible from the cluster. Can be a GCS file with the gs:// prefix, an HDFS file on the cluster with the hdfs:// prefix, or a local file on the cluster with the file:// prefix
Default value is : false
Default value is : - https://www.googleapis.com/auth/cloud-platform
Default value is : - https://www.googleapis.com/auth/cloud-platform
[
"https://www.googleapis.com/auth/cloud-platform"
]
1 nested properties
Examples
id: gcp_dataproc_spark_sql_submit
namespace: company.team
tasks:
  - id: spark_sql_submit
    type: io.kestra.plugin.gcp.dataproc.batches.SparkSqlSubmit
    queryFileUri: 'gs://spark-jobs-kestra/foobar.py'
    name: test-sparksql
    region: europe-west3
Hadoop Compatible File System (HCFS) URIs should be accessible from the cluster. Can be a GCS file with the gs:// prefix, an HDFS file on the cluster with the hdfs:// prefix, or a local file on the cluster with the file:// prefix
Default value is : false
Hadoop Compatible File System (HCFS) URIs should be accessible from the cluster. Can be a GCS file with the gs:// prefix, an HDFS file on the cluster with the hdfs:// prefix, or a local file on the cluster with the file:// prefix
Do not include arguments that can be set as batch properties, such as --conf, since a collision can occur that causes an incorrect batch submission.
Default value is : false
Hadoop Compatible File System (HCFS) URIs should be accessible from the cluster. Can be a GCS file with the gs:// prefix, an HDFS file on the cluster with the hdfs:// prefix, or a local file on the cluster with the file:// prefix
Hadoop Compatible File System (HCFS) URIs should be accessible from the cluster. Can be a GCS file with the gs:// prefix, an HDFS file on the cluster with the hdfs:// prefix, or a local file on the cluster with the file:// prefix
Default value is : false
Default value is : - https://www.googleapis.com/auth/cloud-platform
Default value is : - https://www.googleapis.com/auth/cloud-platform
[
"https://www.googleapis.com/auth/cloud-platform"
]
1 nested properties
Examples
id: gcp_dataproc_spark_submit
namespace: company.team
tasks:
  - id: spark_submit
    type: io.kestra.plugin.gcp.dataproc.batches.SparkSubmit
    jarFileUris:
      - 'gs://spark-jobs-kestra/spark-examples.jar'
    mainClass: org.apache.spark.examples.SparkPi
    args:
      - 1000
    name: test-spark
    region: europe-west3
The jar file that contains the class must be in the classpath or specified in jarFileUris.
Default value is : false
Hadoop Compatible File System (HCFS) URIs should be accessible from the cluster. Can be a GCS file with the gs:// prefix, an HDFS file on the cluster with the hdfs:// prefix, or a local file on the cluster with the file:// prefix
Do not include arguments that can be set as batch properties, such as --conf, since a collision can occur that causes an incorrect batch submission.
Default value is : false
Hadoop Compatible File System (HCFS) URIs should be accessible from the cluster. Can be a GCS file with the gs:// prefix, an HDFS file on the cluster with the hdfs:// prefix, or a local file on the cluster with the file:// prefix
Hadoop Compatible File System (HCFS) URIs should be accessible from the cluster. Can be a GCS file with the gs:// prefix, an HDFS file on the cluster with the hdfs:// prefix, or a local file on the cluster with the file:// prefix
Default value is : false
Default value is : - https://www.googleapis.com/auth/cloud-platform
[
"https://www.googleapis.com/auth/cloud-platform"
]
1 nested properties
Examples
Creates a cluster in Google Cloud Dataproc.
id: gcp_dataproc_cluster_create
namespace: company.team
tasks:
  - id: cluster_create
    type: io.kestra.plugin.gcp.dataproc.clusters.Create
    clusterName: YOUR_CLUSTER_NAME
    region: YOUR_REGION
    zone: YOUR_ZONE
    masterMachineType: n1-standard-2
    workerMachineType: n1-standard-2
    workers: 2
    bucket: YOUR_BUCKET_NAME
Creates a cluster in Google Cloud Dataproc with specific disk size.
id: gcp_dataproc_cluster_create
namespace: company.team
tasks:
  - id: create_cluster_with_certain_disk_size
    type: io.kestra.plugin.gcp.dataproc.clusters.Create
    clusterName: YOUR_CLUSTER_NAME
    region: YOUR_REGION
    zone: YOUR_ZONE
    masterMachineType: n1-standard-2
    masterDiskSizeGB: 500
    workerMachineType: n1-standard-2
    workerDiskSizeGB: 200
    workers: 2
    bucket: YOUR_BUCKET_NAME
Default value is : false
Default value is : false
The Compute Engine image resource used for cluster instances.
Default value is : false
Default value is : - https://www.googleapis.com/auth/cloud-platform
[
"https://www.googleapis.com/auth/cloud-platform"
]
1 nested properties
Examples
Deletes a cluster from Google Cloud Dataproc.
id: gcp_dataproc_cluster_delete
namespace: company.team
tasks:
  - id: delete_cluster
    type: io.kestra.plugin.gcp.dataproc.clusters.Delete
    clusterName: YOUR_CLUSTER_NAME
    region: YOUR_REGION
Default value is : false
Default value is : false
Default value is : false
Default value is : - https://www.googleapis.com/auth/cloud-platform
[
"https://www.googleapis.com/auth/cloud-platform"
]
1 nested properties
Examples
id: gcp_firestore_delete
namespace: company.team
tasks:
  - id: delete
    type: io.kestra.plugin.gcp.firestore.Delete
    collection: "persons"
    childPath: "1"
The Firestore document child path.
Default value is : false
Default value is : false
Default value is : false
Default value is : - https://www.googleapis.com/auth/cloud-platform
[
"https://www.googleapis.com/auth/cloud-platform"
]
1 nested properties
Examples
Get a document from its path.
id: gcp_firestore_get
namespace: company.team
tasks:
  - id: get
    type: io.kestra.plugin.gcp.firestore.Get
    collection: "persons"
    childPath: "1"
Default value is : false
Default value is : false
Default value is : false
Default value is : - https://www.googleapis.com/auth/cloud-platform
[
"https://www.googleapis.com/auth/cloud-platform"
]
1 nested properties
Examples
id: gcp_firestore_query
namespace: company.team
tasks:
  - id: query
    type: io.kestra.plugin.gcp.firestore.Query
    collection: "persons"
    filters:
      - field: "lastname"
        value: "Doe"
Default value is : false
Default value is : false
FETCH_ONE outputs the first row, FETCH outputs all rows, STORE stores all rows in a file, and NONE does nothing.
Default value is : STORE
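A sketch of overriding the STORE default to return a single row directly in the task output (the collection and filter values are hypothetical):

```yaml
id: firestore_fetch_one_sketch
namespace: company.team
tasks:
  - id: query
    type: io.kestra.plugin.gcp.firestore.Query
    collection: persons
    # Return only the first matching row instead of storing all rows in a file
    fetchType: FETCH_ONE
    filters:
      - field: lastname
        value: "Doe"
```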
Default value is : false
Default value is : ASCENDING
Default value is : - https://www.googleapis.com/auth/cloud-platform
[
"https://www.googleapis.com/auth/cloud-platform"
]
1 nested properties
Field value for the filter. Only strings are supported at the moment.
Default value is : EQUAL_TO
Examples
Set a document from a map.
id: gcp_firestore_set
namespace: company.team
tasks:
  - id: set
    type: io.kestra.plugin.gcp.firestore.Set
    collection: "persons"
    document:
      firstname: "John"
      lastname: "Doe"
Set a document from a JSON string.
id: gcp_firestore_set
namespace: company.team
inputs:
  - id: json_string
    type: STRING
    default: '{"firstname": "John", "lastname": "Doe"}'
tasks:
  - id: set
    type: io.kestra.plugin.gcp.firestore.Set
    collection: "persons"
    document: "{{ inputs.json_string }}"
Default value is : false
Default value is : false
Can be a JSON string, or a map.
Default value is : false
Default value is : - https://www.googleapis.com/auth/cloud-platform
[
"https://www.googleapis.com/auth/cloud-platform"
]
1 nested properties
Examples
Concatenate files in a bucket
id: gcp_gcs_compose
namespace: company.team
tasks:
  - id: compose
    type: io.kestra.plugin.gcp.gcs.Compose
    list:
      from: "gs://my_bucket/dir/"
    to: "gs://my_bucket/destination/my-compose-file.txt"
Default value is : false
Default value is : false
Default value is : false
Default value is : false
Default value is : - https://www.googleapis.com/auth/cloud-platform
[
"https://www.googleapis.com/auth/cloud-platform"
]
1 nested properties
Default value is : BOTH
If DIRECTORY, only objects in the specified directory are listed; if RECURSIVE, objects in the specified directory are listed recursively. When using the RECURSIVE value, be careful to move your files to a location outside the from scope.
Default value is : DIRECTORY
Examples:
regExp: .* to match all files
regExp: .*2020-01-0.\\.csv to match files from January 01 to 09 ending with .csv
Copy a file between Kestra's internal storage and Google Cloud Storage.
Examples
Move a file between bucket paths
id: gcp_gcs_copy
namespace: company.team
inputs:
  - id: file
    type: FILE
tasks:
  - id: copy
    type: io.kestra.plugin.gcp.gcs.Copy
    from: "{{ inputs.file }}"
    delete: true
Default value is : false
Default value is : false
Default value is : false
Default value is : false
Default value is : - https://www.googleapis.com/auth/cloud-platform
[
"https://www.googleapis.com/auth/cloud-platform"
]
1 nested properties
Examples
Create a new bucket with some options
id: gcp_gcs_create_bucket
namespace: company.team
tasks:
  - id: create_bucket
    type: io.kestra.plugin.gcp.gcs.CreateBucket
    name: "my-bucket"
    versioningEnabled: true
    labels:
      my-label: my-value
Default value is : false
The access control configuration to apply to the bucket's blobs when no other configuration is specified. See About Access Control Lists.
Default value is : false
Default value is : ERROR
Behaves as the bucket's directory index where missing blobs are treated as potential directories.
This configuration is expressed as a number of lifecycle rules, each consisting of an action and a condition. See Object Lifecycle Management. Only the age condition is supported, and only the delete and SetStorageClass actions are supported.
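A sketch of a bucket with one lifecycle rule. The nesting of condition and action below is an assumption about the task's schema and should be checked against the lifecycleRules property reference:

```yaml
id: gcs_lifecycle_sketch
namespace: company.team
tasks:
  - id: create_bucket
    type: io.kestra.plugin.gcp.gcs.CreateBucket
    name: my-bucket
    # Delete objects older than 30 days (only the age condition is supported)
    lifecycleRules:
      - condition:
          age: 30
        action:
          type: DELETE
```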
Data for blobs in the bucket resides in physical storage within this region. A list of supported values is available here.
Default value is : false
Whether a user accessing the bucket or an object it contains should assume the transit costs related to the access.
If the policy is not locked, this value can be cleared, increased, or decreased. If the policy is locked, the retention period can only be increased.
Default value is : - https://www.googleapis.com/auth/cloud-platform
[
"https://www.googleapis.com/auth/cloud-platform"
]
This defines how blobs in the bucket are stored and determines the SLA and the cost of storage. A list of supported values is available here.
When set to true, versioning is fully enabled.
1 nested properties
Examples
Add role to a service account on a bucket
id: gcp_gcs_create_bucket_iam_policy
namespace: company.team
tasks:
  - id: create_bucket_iam_policy
    type: io.kestra.plugin.gcp.gcs.CreateBucketIamPolicy
    name: "my-bucket"
    member: "[email protected]"
    role: "roles/storage.admin"
Default value is : false
Default value is : false
Default value is : SKIP
Default value is : false
Default value is : - https://www.googleapis.com/auth/cloud-platform
[
"https://www.googleapis.com/auth/cloud-platform"
]
1 nested properties
Examples
id: gcp_gcs_delete
namespace: company.team
tasks:
  - id: delete
    type: io.kestra.plugin.gcp.gcs.Delete
    uri: "gs://my_bucket/dir/file.csv"
Default value is : false
Default value is : false
Default value is : false
Default value is : false
Default value is : - https://www.googleapis.com/auth/cloud-platform
[
"https://www.googleapis.com/auth/cloud-platform"
]
1 nested properties
Examples
Delete a bucket
id: gcp_gcs_delete_bucket
namespace: company.team
tasks:
  - id: delete_bucket
    type: io.kestra.plugin.gcp.gcs.DeleteBucket
    name: "my-bucket"
Default value is : false
Default value is : false
Default value is : false
Default value is : - https://www.googleapis.com/auth/cloud-platform
[
"https://www.googleapis.com/auth/cloud-platform"
]
1 nested properties
Examples
id: gcp_gcs_delete_list
namespace: company.team
tasks:
  - id: delete_list
    type: io.kestra.plugin.gcp.gcs.DeleteList
    from: "gs://my_bucket/dir/"
Default value is : false
Default value is : false
Default value is : false
If DIRECTORY, only objects in the specified directory are listed; if RECURSIVE, objects in the specified directory are listed recursively. When using the RECURSIVE value, be careful to move your files to a location outside the from scope.
Default value is : DIRECTORY
Default value is : false
Examples:
regExp: .* to match all files
regExp: .*2020-01-0.\\.csv to match files from January 01 to 09 ending with .csv
Default value is : - https://www.googleapis.com/auth/cloud-platform
[
"https://www.googleapis.com/auth/cloud-platform"
]
1 nested properties
Examples
id: gcp_gcs_download
namespace: company.team
tasks:
  - id: download
    type: io.kestra.plugin.gcp.gcs.Download
    from: "gs://my_bucket/dir/file.csv"
Default value is : false
Default value is : false
Default value is : false
Default value is : - https://www.googleapis.com/auth/cloud-platform
[
"https://www.googleapis.com/auth/cloud-platform"
]
1 nested properties
Examples
Download a list of files and move them to an archive folder
id: gcp_gcs_downloads
namespace: company.team
tasks:
  - id: downloads
    type: io.kestra.plugin.gcp.gcs.Downloads
    from: gs://my-bucket/kestra/files/
    action: MOVE
    moveDirectory: gs://my-bucket/kestra/archive/
Default value is : false
Default value is : false
If DIRECTORY, only objects in the specified directory are listed; if RECURSIVE, objects in the specified directory are listed recursively. When using the RECURSIVE value, be careful to move your files to a location outside the from scope.
Default value is : DIRECTORY
Default value is : false
Examples:
regExp: .* to match all files
regExp: .*2020-01-0.\\.csv to match files from January 01 to 09 ending with .csv
Default value is : - https://www.googleapis.com/auth/cloud-platform
[
"https://www.googleapis.com/auth/cloud-platform"
]
1 nested properties
Examples
List files in a bucket
id: gcp_gcs_list
namespace: company.team
tasks:
  - id: list
    type: io.kestra.plugin.gcp.gcs.List
    from: "gs://my_bucket/dir/"
Default value is : false
Default value is : false
Default value is : BOTH
If DIRECTORY, only objects in the specified directory are listed; if RECURSIVE, objects in the specified directory are listed recursively. When using the RECURSIVE value, be careful to move your files to a location outside the from scope.
Default value is : DIRECTORY
Default value is : false
Examples:
regExp: .* to match all files
regExp: .*2020-01-0.\\.csv to match files from January 01 to 09 ending with .csv
Default value is : - https://www.googleapis.com/auth/cloud-platform
[
"https://www.googleapis.com/auth/cloud-platform"
]
1 nested properties
This trigger will poll a GCS bucket at every interval. You can search for all files in a bucket or directory with from, or filter the files with a regExp. The detection is atomic: internally, we do a list and interact only with the files listed.
Once a file is detected, it is downloaded to internal storage and processed with the declared action in order to move or delete it from the bucket (to avoid double detection on a new poll).
Examples
Wait for a list of files on a GCS bucket, and iterate through the files.
id: gcs-listen
namespace: company.team
tasks:
  - id: each
    type: io.kestra.plugin.core.flow.EachSequential
    tasks:
      - id: return
        type: io.kestra.plugin.core.debug.Return
        format: "{{ taskrun.value }}"
    value: "{{ trigger.blobs | jq('.[].uri') }}"
triggers:
  - id: watch
    type: io.kestra.plugin.gcp.gcs.Trigger
    interval: "PT5M"
    from: gs://my-bucket/kestra/listen/
    action: MOVE
    moveDirectory: gs://my-bucket/kestra/archive/
Wait for a list of files on a GCS bucket and iterate through the files. Delete files manually after processing to prevent infinite triggering.
id: gcs-listen
namespace: company.team
tasks:
  - id: each
    type: io.kestra.plugin.core.flow.EachSequential
    tasks:
      - id: return
        type: io.kestra.plugin.core.debug.Return
        format: "{{ taskrun.value }}"
      - id: delete
        type: io.kestra.plugin.gcp.gcs.Delete
        uri: "{{ taskrun.value }}"
    value: "{{ trigger.blobs | jq('.[].uri') }}"
triggers:
  - id: watch
    type: io.kestra.plugin.gcp.gcs.Trigger
    interval: "PT5M"
    from: gs://my-bucket/kestra/listen/
    action: NONE
Default value is : false
The interval between two consecutive polls of the schedule; this can avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the minimal interval should be at least PT30S. See ISO 8601 durations for more information on available interval values.
Default value is : 60.000000000
If DIRECTORY, only objects in the specified directory are listed; if RECURSIVE, objects in the specified directory are listed recursively. When using the RECURSIVE value, be careful to move your files to a location outside the from scope.
Default value is : DIRECTORY
Default value is : false
Examples:
regExp: .* to match all files
regExp: .*2020-01-0.\\.csv to match files from January 01 to 09 ending with .csv
Default value is : - https://www.googleapis.com/auth/cloud-platform
[
"https://www.googleapis.com/auth/cloud-platform"
]
1 nested properties
Examples
Update some bucket labels
id: gcp_gcs_update_bucket
namespace: company.team
tasks:
  - id: update_bucket
    type: io.kestra.plugin.gcp.gcs.UpdateBucket
    name: "my-bucket"
    labels:
      my-label: my-value
Default value is : false
The access control configuration to apply to the bucket's blobs when no other configuration is specified. See About Access Control Lists.
Default value is : false
Behaves as the bucket's directory index where missing blobs are treated as potential directories.
This configuration is expressed as a number of lifecycle rules, each consisting of an action and a condition. See Object Lifecycle Management. Only the age condition is supported, and only the delete and SetStorageClass actions are supported.
Data for blobs in the bucket resides in physical storage within this region. A list of supported values is available here.
Default value is : false
Whether a user accessing the bucket or an object it contains should assume the transit costs related to the access.
If the policy is not locked, this value can be cleared, increased, or decreased. If the policy is locked, the retention period can only be increased.
Default value is : ["https://www.googleapis.com/auth/cloud-platform"]
This defines how blobs in the bucket are stored and determines the SLA and the cost of storage. A list of supported values is available here.
When set to true, versioning is fully enabled.
1 nested properties
Examples
id: gcp_gcs_upload
namespace: company.team
tasks:
  - id: upload
    type: io.kestra.plugin.gcp.gcs.Upload
    from: "{{ inputs.file }}"
    to: "gs://my_bucket/dir/file.csv"
Default value is : false
Default value is : false
Default value is : false
Default value is : ["https://www.googleapis.com/auth/cloud-platform"]
1 nested properties
Examples
Fetch a GKE cluster's metadata.
id: gcp_gke_cluster_metadata
namespace: company.team
tasks:
  - id: cluster_metadata
    type: io.kestra.plugin.gcp.gke.ClusterMetadata
    clusterProjectId: my-project-id
    clusterZone: europe-west1-c
    clusterId: my-cluster-id
Default value is : false
Default value is : false
Default value is : false
Default value is : ["https://www.googleapis.com/auth/cloud-platform"]
1 nested properties
Requires a maxDuration or a maxRecords.
Examples
id: gcp_pubsub_consume
namespace: company.team
tasks:
  - id: consume
    type: io.kestra.plugin.gcp.pubsub.Consume
    topic: topic-test
    maxRecords: 10
    projectId: "{{ secret('GCP_PROJECT_ID') }}"
    subscription: my-subscription
The Pub/Sub subscription. It will be created automatically if it doesn't exist and 'autoCreateSubscription' is enabled.
The Pub/Sub topic. It must be created before executing the task.
Default value is : false
Default value is : true
Default value is : false
Default value is : false
Default value is : ["https://www.googleapis.com/auth/cloud-platform"]
Default value is : STRING
1 nested properties
Examples
id: gcp_pubsub_publish
namespace: company.team
tasks:
  - id: publish
    type: io.kestra.plugin.gcp.pubsub.Publish
    topic: topic-test
    from:
      - data: "{{ 'base64-encoded-string-1' | base64encode }}"
        attributes:
          testAttribute: KestraTest
      - messageId: '1234'
      - orderingKey: 'foo'
      - data: "{{ 'base64-encoded-string-2' | base64encode }}"
      - attributes:
          testAttribute: KestraTest
Can be an internal storage URI, a list of Pub/Sub messages, or a single Pub/Sub message.
The Pub/Sub topic. It must be created before executing the task.
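Since from can also be an internal storage URI, here is a sketch of that variant (the upstream download task and its output name are illustrative; only the from property usage is taken from the description above):

```yaml
id: gcp_pubsub_publish_from_uri
namespace: company.team

tasks:
  # Hypothetical upstream task producing a file of messages in internal storage
  - id: extract
    type: io.kestra.plugin.core.http.Download
    uri: https://example.com/messages.ion

  - id: publish
    type: io.kestra.plugin.gcp.pubsub.Publish
    topic: topic-test
    # 'from' accepts an internal storage URI instead of an inline message list
    from: "{{ outputs.extract.uri }}"
```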
Default value is : false
Default value is : false
Default value is : false
Default value is : ["https://www.googleapis.com/auth/cloud-platform"]
Default value is : STRING
1 nested properties
If you would like to consume multiple messages processed within a given time frame and process them in batch, you can use the io.kestra.plugin.gcp.pubsub.Trigger instead.
Examples
Consume a message from a Pub/Sub topic in real-time.
id: realtime-pubsub
namespace: company.team
tasks:
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "Received: {{ trigger.data }}"
triggers:
  - id: trigger
    type: io.kestra.plugin.gcp.pubsub.RealtimeTrigger
    projectId: test-project-id
    topic: test-topic
    subscription: test-subscription
The Pub/Sub topic. It must be created before executing the task.
Default value is : true
Default value is : false
Default value is : 60.000000000
Default value is : false
Default value is : ["https://www.googleapis.com/auth/cloud-platform"]
Default value is : STRING
The Pub/Sub subscription. It will be created automatically if it doesn't exist and 'autoCreateSubscription' is enabled.
1 nested properties
If you would like to consume each message from a Pub/Sub topic in real-time and create one execution per message, you can use the io.kestra.plugin.gcp.pubsub.RealtimeTrigger instead.
Examples
id: gcp_trigger
namespace: company.team
tasks:
  - id: log
    type: io.kestra.plugin.core.log.Log
    message: "Received: {{ trigger.data }}"
triggers:
  - id: trigger
    type: io.kestra.plugin.gcp.pubsub.Trigger
    projectId: test-project-id
    subscription: test-subscription
    topic: test-topic
    maxRecords: 10
The Pub/Sub topic. It must be created before executing the task.
Default value is : true
Default value is : false
The interval between two consecutive polls of the schedule; this can avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the minimal interval should be at least PT30S. See ISO 8601 Durations for more information on available interval values.
Default value is : 60.000000000
Default value is : false
Default value is : ["https://www.googleapis.com/auth/cloud-platform"]
Default value is : STRING
The Pub/Sub subscription. It will be created automatically if it doesn't exist and 'autoCreateSubscription' is enabled.
1 nested properties
If it's a string, it can be a dynamic property; otherwise, it cannot.
Specify a lower value for shorter responses and a higher value for longer responses. A token may be smaller than a word. A token is approximately four characters. 100 tokens correspond to roughly 60-80 words.
Default value is : 128
Temperature controls the degree of randomness in token selection. Lower temperatures are good for prompts that require a more deterministic and less open-ended or creative response, while higher temperatures can lead to more diverse or creative results. A temperature of 0 is deterministic: the highest probability response is always selected. For most use cases, try starting with a temperature of 0.2.
Default value is : 0.2
A top-k of 1 means the selected token is the most probable among all tokens in the model's vocabulary (also called greedy decoding), while a top-k of 3 means that the next token is selected from among the 3 most probable tokens (using temperature). For each token selection step, the top K tokens with the highest probabilities are sampled. Then tokens are further filtered based on topP with the final token selected using temperature sampling. Specify a lower value for less random responses and a higher value for more random responses.
Default value is : 40
Tokens are selected from most K (see topK parameter) probable to least until the sum of their probabilities equals the top-p value. For example, if tokens A, B, and C have a probability of 0.3, 0.2, and 0.1 and the top-p value is 0.5, then the model will select either A or B as the next token (using temperature) and doesn't consider C. The default top-p value is 0.95. Specify a lower value for less random responses and a higher value for more random responses.
Default value is : 0.95
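A sketch showing how the generation settings described above might be set on the ChatCompletion task (whether they sit under a parameters block, and the exact property names, are assumptions to verify against the plugin schema):

```yaml
id: gcp_vertexai_chat_tuned
namespace: company.team

tasks:
  - id: chat
    type: io.kestra.plugin.gcp.vertexai.ChatCompletion
    region: us-central1
    projectId: my-project
    parameters:            # assumed nesting for the generation settings described above
      temperature: 0.2     # default; 0 would be fully deterministic
      maxOutputTokens: 128 # default; ~100 tokens is roughly 60-80 words
      topK: 40             # default
      topP: 0.95           # default
    messages:
      - author: user
        content: Please tell me a joke
```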
See Generative AI quickstart using the Vertex AI API for more information.
Examples
Chat completion using the Vertex AI Gemini API.
id: gcp_vertexai_chat_completion
namespace: company.team
tasks:
  - id: chat_completion
    type: io.kestra.plugin.gcp.vertexai.ChatCompletion
    region: us-central1
    projectId: my-project
    context: I love jokes that talk about sport
    messages:
      - author: user
        content: Please tell me a joke
Messages appear in chronological order: oldest first, newest last. When the history of messages causes the input to exceed the maximum length, the oldest messages are removed until the entire prompt is within the allowed limit.
Default value is : false
Default value is : false
Messages appear in chronological order: oldest first, newest last. When the history of messages causes the input to exceed the maximum length, the oldest messages are removed until the entire prompt is within the allowed limit.
Default value is : false
Default value is : ["https://www.googleapis.com/auth/cloud-platform"]
1 nested properties
Examples
id: gcp_vertexai_custom_job
namespace: company.team
tasks:
  - id: custom_job
    type: io.kestra.plugin.gcp.vertexai.CustomJob
    projectId: my-gcp-project
    region: europe-west1
    displayName: Start Custom Job
    spec:
      workerPoolSpecs:
        - containerSpec:
            imageUri: gcr.io/my-gcp-project/my-dir/my-image:latest
          machineSpec:
            machineType: n1-standard-4
          replicaCount: 1
Default value is : false
Default value is : true
Default value is : false
Default value is : false
Default value is : ["https://www.googleapis.com/auth/cloud-platform"]
Allows capturing the job status & logs.
Default value is : true
1 nested properties
See Overview of multimodal models for more information.
Examples
Text completion using the Vertex Gemini API
id: gcp_vertexai_multimodal_completion
namespace: company.team
tasks:
  - id: multimodal_completion
    type: io.kestra.plugin.gcp.vertexai.MultimodalCompletion
    region: us-central1
    projectId: my-project
    contents:
      - content: Please tell me a joke
Multimodal completion using the Vertex Gemini API
id: gcp_vertexai_multimodal_completion
namespace: company.team
inputs:
  - id: image
    type: FILE
tasks:
  - id: multimodal_completion
    type: io.kestra.plugin.gcp.vertexai.MultimodalCompletion
    region: us-central1
    projectId: my-project
    contents:
      - content: Can you describe this image?
      - mimeType: image/jpeg
        content: "{{ inputs.image }}"
Default value is : false
Default value is : false
Default value is : false
Default value is : ["https://www.googleapis.com/auth/cloud-platform"]
1 nested properties
If the content is not text, the mimeType property must be set.
See Generative AI quickstart using the Vertex AI API for more information.
Examples
Text completion using the Vertex AI Gemini API.
id: gcp_vertexai_text_completion
namespace: company.team
tasks:
  - id: text_completion
    type: io.kestra.plugin.gcp.vertexai.TextCompletion
    region: us-central1
    projectId: my-project
    prompt: Please tell me a joke
Default value is : false
Default value is : false
Default value is : false
Prompts can include preamble, questions, suggestions, instructions, or examples.
Default value is : ["https://www.googleapis.com/auth/cloud-platform"]
1 nested properties
Must be on Google Container Registry, for example: gcr.io/{{ project }}/{{ dir }}/{{ image }}:{{ tag }}
It overrides the entrypoint instruction in Dockerfile when provided.
Maximum limit is 100.
All worker pools except the first one are optional and can be skipped.
For example, projects/12345/global/networks/myVPC.
Format is of the form projects/{project}/global/networks/{network}. Where {project} is a project number, as in 12345, and {network} is a network name.
To specify this field, you must have already configured VPC Network Peering for Vertex AI.
If this field is left unspecified, the job is not peered with any network.
Users submitting jobs must have act-as permission on this run-as account.
If unspecified, the [Vertex AI Custom Code Service Agent](https://cloud.google.com/vertex-ai/docs/general/access-control#service-agents) for the CustomJob's project is used.
Will upload Tensorboard logs. Format: projects/{project}/locations/{location}/tensorboards/{tensorboard}
Default value is : 100
Default value is : PD_SSD
If the uri doesn't end with '/', a '/' will be automatically appended. The directory is created if it doesn't exist.
The maximum number of package URIs is 100.
Maximum limit is 100.
The maximum number of package URIs is 100.
This feature can be used by distributed training jobs that are not resilient to workers leaving and joining a job.
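A sketch combining several of the CustomJob spec properties described above (the exact nesting of diskSpec and baseOutputDirectory is an assumption to check against the plugin schema):

```yaml
id: gcp_vertexai_custom_job_options
namespace: company.team

tasks:
  - id: training_job
    type: io.kestra.plugin.gcp.vertexai.CustomJob
    projectId: my-gcp-project
    region: europe-west1
    displayName: Training With Options
    wait: true                        # capture job status & logs (default: true)
    spec:
      workerPoolSpecs:                # the first pool is required; up to 100 pools
        - containerSpec:
            imageUri: gcr.io/my-gcp-project/trainers/model:latest  # must be on Google Container Registry
          machineSpec:
            machineType: n1-standard-4
          diskSpec:
            bootDiskType: PD_SSD      # default
            bootDiskSizeGb: 100       # default
          replicaCount: 1
      baseOutputDirectory: gs://my_bucket/jobs/  # a trailing '/' is appended automatically if missing
```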
Examples
Clone a public GitHub repository.
id: git_clone
namespace: company.team
tasks:
  - id: clone
    type: io.kestra.plugin.git.Clone
    url: https://github.com/dbt-labs/jaffle_shop
    branch: main
Clone a private repository from an HTTP server such as a private GitHub repository using a personal access token.
id: git_clone
namespace: company.team
tasks:
  - id: clone
    type: io.kestra.plugin.git.Clone
    url: https://github.com/kestra-io/examples
    branch: main
    username: git_username
    password: your_personal_access_token
Clone a repository from an SSH server. If you want to clone the repository into a specific directory, you can configure the directory property as shown below.
id: git_clone
namespace: company.team
tasks:
  - id: clone
    type: io.kestra.plugin.git.Clone
    url: [email protected]:kestra-io/kestra.git
    directory: kestra
    privateKey: <keyfile_content>
    passphrase: <passphrase>
Clone a GitHub repository and run a Python ETL script. Note that the WorkingDirectory task is required so that the Python script shares the same local file system with the files cloned from GitHub in the previous task.
id: git_python
namespace: company.team
tasks:
  - id: file_system
    type: io.kestra.plugin.core.flow.WorkingDirectory
    tasks:
      - id: clone_repository
        type: io.kestra.plugin.git.Clone
        url: https://github.com/kestra-io/examples
        branch: main
      - id: python_etl
        type: io.kestra.plugin.scripts.python.Commands
        beforeCommands:
          - pip install requests pandas > /dev/null
        commands:
          - python examples/scripts/etl_script.py
Default value is : false
Default value is : 1
If the directory isn't set, the current directory will be used.
Default value is : false
Default value is : false
To generate an ECDSA PEM format key from OpenSSH, use the following command: ssh-keygen -t ecdsa -b 256 -m PEM. You can then set this property with your private key content and put your public key on Git.
1 nested properties
Replaced by PushFlows and PushNamespaceFiles for flow and namespace file push scenarios. You can add inputFiles to be committed and pushed. Furthermore, you can use this task in combination with the Clone task so that you can first clone the repository, then add or modify files and push to Git afterwards. Check the examples below as well as the Version Control with Git documentation for more information.
Examples
Push flows and namespace files to a Git repository every 15 minutes.
id: push_to_git
namespace: company.team
tasks:
  - id: commit_and_push
    type: io.kestra.plugin.git.Push
    namespaceFiles:
      enabled: true
    flows:
      enabled: true
    url: https://github.com/kestra-io/scripts
    branch: kestra
    username: git_username
    password: "{{ secret('GITHUB_ACCESS_TOKEN') }}"
    commitMessage: "add flows and scripts {{ now() }}"
triggers:
  - id: schedule_push
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "*/15 * * * *"
Clone the main branch, generate a file in a script, and then push that new file to Git. Since we're in a working directory with a .git directory, you don't need to specify the URL in the Push task. However, the Git credentials always need to be explicitly provided on both the Clone and Push tasks (unless using task defaults).
id: push_new_file_to_git
namespace: company.team
inputs:
  - id: commit_message
    type: STRING
    defaults: add a new file to Git
tasks:
  - id: wdir
    type: io.kestra.plugin.core.flow.WorkingDirectory
    tasks:
      - id: clone
        type: io.kestra.plugin.git.Clone
        branch: main
        url: https://github.com/kestra-io/scripts
      - id: generate_data
        type: io.kestra.plugin.scripts.python.Commands
        docker:
          image: ghcr.io/kestra-io/pydata:latest
        commands:
          - python generate_data/generate_orders.py
      - id: push
        type: io.kestra.plugin.git.Push
        username: git_username
        password: myPAT
        branch: feature_branch
        inputFiles:
          to_commit/avg_order.txt: "{{ outputs.generate_data.vars.average_order }}"
        addFilesPattern:
          - to_commit
        commitMessage: "{{ inputs.commit_message }}"
If the branch doesn't exist yet, it will be created.
A directory name (e.g. dir to add dir/file1 and dir/file2) can also be given to add all files in the directory, recursively. File globs (e.g. *.py) are not yet supported.
Default value is : ["."]
Default value is : false
If the directory isn't set, the current directory will be used.
Default value is : false
Default value is : false
To generate an ECDSA PEM format key from OpenSSH, use the following command: ssh-keygen -t ecdsa -b 256 -m PEM. You can then set this property with your private key content and put your public key on Git.
1 nested properties
Default value is : true
Default value is : true
Default value is : _flows
Using this task, you can push one or more flows from a given namespace (and optionally also child namespaces) to Git. Check the examples below to see how you can push all flows or only specific ones. You can also learn about Git integration in the Version Control with Git documentation.
Examples
Automatically push all saved flows from the dev namespace and all child namespaces to a Git repository every day at 5 p.m. Before pushing to Git, the task will adjust the flow's source code to match the targetNamespace to prepare the Git branch for merging to the production namespace.
id: push_to_git
namespace: company.team
tasks:
  - id: commit_and_push
    type: io.kestra.plugin.git.PushFlows
    sourceNamespace: dev # the namespace from which flows are pushed
    targetNamespace: prod # the target production namespace; if different than sourceNamespace, the sourceNamespace in the source code will be overwritten by the targetNamespace
    flows: "*" # optional list of glob patterns; by default, all flows are pushed
    includeChildNamespaces: true # optional boolean, false by default
    gitDirectory: _flows
    url: https://github.com/kestra-io/scripts # required string
    username: git_username # required string needed for Auth with Git
    password: "{{ secret('GITHUB_ACCESS_TOKEN') }}"
    branch: kestra # optional, uses "kestra" by default
    commitMessage: "add flows {{ now() }}" # optional string
    dryRun: true # if true, you'll see what files will be added, modified or deleted based on the state in Git without overwriting the files yet
triggers:
  - id: schedule_push
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 17 * * *" # release/push to Git every day at 5pm
Manually push a single flow to Git if the input push is set to true.
id: myflow
namespace: prod
inputs:
  - id: push
    type: BOOLEAN
    defaults: false
tasks:
  - id: if
    type: io.kestra.plugin.core.flow.If
    condition: "{{ inputs.push == true }}"
    then:
      - id: commit_and_push
        type: io.kestra.plugin.git.PushFlows
        sourceNamespace: prod # optional; if you prefer templating, you can use "{{ flow.namespace }}"
        targetNamespace: prod # optional; by default, set to the same namespace as defined in sourceNamespace
        flows: myflow # if you prefer templating, you can use "{{ flow.id }}"
        url: https://github.com/kestra-io/scripts
        username: git_username
        password: "{{ secret('GITHUB_ACCESS_TOKEN') }}"
        branch: kestra
        commitMessage: "add flow {{ flow.namespace ~ '.' ~ flow.id }}"
Default value is : false
If null, no author will be set on this commit.
If null, the username will be used instead.
Default value is : 'username'
If the branch doesn't exist yet, it will be created.
Default value is : kestra
Default value is : Add flows from sourceNamespace namespace
Default value is : false
Default value is : false
By default, all flows from the specified sourceNamespace will be pushed (and optionally adjusted to match the targetNamespace before pushing to Git).
If you want to push only the current flow, you can use the "{{flow.id}}" expression or specify the flow ID explicitly, e.g. myflow.
Given that this is a list of glob patterns, you can include as many flows as you wish, provided that the user is authorized to access that namespace.
Note that each glob pattern tries to match the file name OR the relative path starting from gitDirectory.
Default value is : '**'
If not set, flows will be pushed to a Git directory named _flows and will optionally also include subdirectories named after the child namespaces.
If you prefer, you can specify an arbitrary path, e.g., kestra/flows, allowing you to push flows to that specific Git directory.
If the includeChildNamespaces property is set to true, this task will also push all flows from child namespaces into their corresponding nested directories, e.g., flows from the child namespace called prod.marketing will be added to the marketing folder within the _flows folder.
Note that the targetNamespace (here prod) is specified in the flow code; therefore, kestra will not create the prod directory within _flows. You can use the PushFlows task to push flows from the sourceNamespace, and use SyncFlows to then sync PR-approved flows to the targetNamespace, including all child namespaces.
Default value is : _flows
By default, it’s false, so the task will push only flows from the explicitly declared namespace without pushing flows from child namespaces. If set to true, flows from child namespaces will be pushed to child directories in Git. See the example below for a practical explanation:
| Source namespace in the flow code | Git directory path | Synced to target namespace |
|---|---|---|
| namespace: dev | _flows/flow1.yml | namespace: prod |
| namespace: dev | _flows/flow2.yml | namespace: prod |
| namespace: dev.marketing | _flows/marketing/flow3.yml | namespace: prod.marketing |
| namespace: dev.marketing | _flows/marketing/flow4.yml | namespace: prod.marketing |
| namespace: dev.marketing.crm | _flows/marketing/crm/flow5.yml | namespace: prod.marketing.crm |
| namespace: dev.marketing.crm | _flows/marketing/crm/flow6.yml | namespace: prod.marketing.crm |
Default value is : false
Default value is : false
To generate an ECDSA PEM format key from OpenSSH, use the following command: ssh-keygen -t ecdsa -b 256 -m PEM. You can then set this property with your private key content and put your public key on Git.
Default value is : "{{ flow.namespace }}"
If set, the sourceNamespace will be overwritten to the targetNamespace in the flow source code to prepare your branch for merging into the production namespace.
1 nested properties
Using this task, you can push one or more Namespace Files from a given kestra namespace to Git. Check the Version Control with Git documentation for more details.
Examples
Push all saved Namespace Files from the dev namespace to a Git repository every 15 minutes.
id: push_to_git
namespace: company.team
tasks:
  - id: commit_and_push
    type: io.kestra.plugin.git.PushNamespaceFiles
    namespace: dev
    files: "*" # optional list of glob patterns; by default, all files are pushed
    gitDirectory: _files # optional path in Git where Namespace Files should be pushed
    url: https://github.com/kestra-io/scripts # required string
    username: git_username # required string needed for Auth with Git
    password: "{{ secret('GITHUB_ACCESS_TOKEN') }}"
    branch: dev # optional, uses "kestra" by default
    commitMessage: "add namespace files" # optional string
    dryRun: true # if true, you'll see what files will be added, modified or deleted based on the state in Git without overwriting the files yet
triggers:
  - id: schedule_push_to_git
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "*/15 * * * *"
Default value is : false
If null, no author will be set on this commit.
If null, the username will be used instead.
Default value is : 'username'
If the branch doesn’t exist yet, it will be created. If not set, the task will push the files to the kestra branch.
Default value is : kestra
Default value is : Add files from namespace namespace
Default value is : false
Default value is : false
By default, Kestra will push all Namespace Files from the specified namespace.
If you want to push only a specific file or directory e.g. myfile.py, you can set it explicitly using files: myfile.py.
Given that this is a glob pattern string (or a list of glob patterns), you can include as many files as you wish, provided that the user is authorized to access that namespace.
Note that each glob pattern tries to match the file name OR the relative path starting from gitDirectory.
Default value is : '**'
If not set, files will be pushed to a Git directory named _files. See the table below for an example mapping of Namespace Files to Git paths:
| Namespace File Path | Git directory path |
|---|---|
| scripts/app.py | _files/scripts/app.py |
| scripts/etl.py | _files/scripts/etl.py |
| queries/orders.sql | _files/queries/orders.sql |
| queries/customers.sql | _files/queries/customers.sql |
| requirements.txt | _files/requirements.txt |
Default value is : _files
Default value is : false
Default value is : "{{ flow.namespace }}"
To generate an ECDSA PEM format key from OpenSSH, use the following command: ssh-keygen -t ecdsa -b 256 -m PEM. You can then set this property with your private key content and put your public key on Git.
1 nested properties
Replaced by SyncFlows and SyncNamespaceFiles. Files located in gitDirectory will be synced with namespace files under namespaceFilesDirectory folder. Any file not present in the gitDirectory but present in namespaceFilesDirectory will be deleted from namespace files to ensure that Git remains a single source of truth for your workflow and application code. If you don't want some files from Git to be synced, you can add them to a .kestraignore file at the root of your gitDirectory folder — that file works the same way as .gitignore.
If there is a _flows folder under the gitDirectory folder, any file within it will be parsed and imported as a flow under the namespace declared in the task. It's important to keep in mind that all flows must be located within the same directory without any nested directories. If you want to deploy all flows to kestra from Git using the Git Sync pattern, you have to place all your flows in the _flows directory. Adding namespace folders will result in an error and that's expected. Flows are not equivalent to Namespace Files — while Namespace Files can be stored in arbitrarily nested folders stored in Internal Storage, Flows are just metadata. Flows are sent to Kestra's API and stored in the database backend. This is why they follow a different deployment pattern and cannot be stored in nested folders in Git.
Another important aspect is that the namespace defined in the flow code might get overwritten (!) if the namespace defined within Git doesn't match the namespace or a child namespace defined in the Git Sync task. All Git deployments, both the Git Sync and Kestra's CI/CD integrations, operate on a namespace level to ensure namespace-level governance of permissions, secrets, and to allow separation of resources. If you leverage multiple namespaces in a monorepo, you can create multiple flows, each using the Git Sync task to sync specific Git directories to the desired namespaces.
Examples
Synchronizes namespace files and flows based on the current state in a Git repository. This flow can run either on a schedule (using the Schedule trigger) or anytime you push a change to a given Git branch (using the Webhook trigger).
id: sync_from_git
namespace: company.team
tasks:
  - id: git
    type: io.kestra.plugin.git.Sync
    url: https://github.com/kestra-io/scripts
    branch: main
    username: git_username
    password: "{{ secret('GITHUB_ACCESS_TOKEN') }}"
    gitDirectory: your_git_dir # optional, otherwise all files
    namespaceFilesDirectory: your_namespace_files_location # optional, otherwise the namespace root directory
    dryRun: true # if true, print the output of what files will be added/modified or deleted without overwriting the files yet
triggers:
  - id: every_minute
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "*/1 * * * *"
Default value is : false
Default value is : false
Default value is : false
To generate an ECDSA PEM format key from OpenSSH, use the following command: ssh-keygen -t ecdsa -b 256 -m PEM. You can then set this property with your private key content and put your public key on Git.
1 nested properties
This task syncs flows from a given Git branch to a Kestra namespace. If the delete property is set to true, any flow available in kestra but not present in the gitDirectory will be deleted, considering Git as a single source of truth for your flows. Check the Version Control with Git documentation for more details.
Examples
Sync flows from a Git repository. This flow can run either on a schedule (using the Schedule trigger) or anytime you push a change to a given Git branch (using the Webhook trigger).
id: sync_flows_from_git
namespace: company.team
tasks:
  - id: git
    type: io.kestra.plugin.git.SyncFlows
    gitDirectory: flows # optional; set to _flows by default
    targetNamespace: git # required
    includeChildNamespaces: true # optional; by default, it's set to false to allow explicit definition
    delete: true # optional; by default, it's set to false to avoid destructive behavior
    url: https://github.com/kestra-io/flows # required
    branch: main
    username: git_username
    password: "{{ secret('GITHUB_ACCESS_TOKEN') }}"
    dryRun: true # if true, the task will only log which flows from Git will be added/modified or deleted in kestra without making any changes in kestra backend yet
triggers:
  - id: every_full_hour
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "0 * * * *"
If the top-level namespace specified in the flow source code is different than the targetNamespace, it will be overwritten by this target namespace. This facilitates moving between environments and projects. If includeChildNamespaces property is set to true, the top-level namespace in the source code will also be overwritten by the targetNamespace in children namespaces.
For example, if the targetNamespace is set to prod and includeChildNamespaces property is set to true, then:
namespace: dev in the flow source code will be overwritten by namespace: prod, and namespace: dev.marketing.crm will be overwritten by namespace: prod.marketing.crm.
See the table below for a practical explanation:
| Source namespace in the flow code | Git directory path | Synced to target namespace |
|---|---|---|
| namespace: dev | _flows/flow1.yml | namespace: prod |
| namespace: dev | _flows/flow2.yml | namespace: prod |
| namespace: dev.marketing | _flows/marketing/flow3.yml | namespace: prod.marketing |
| namespace: dev.marketing | _flows/marketing/flow4.yml | namespace: prod.marketing |
| namespace: dev.marketing.crm | _flows/marketing/crm/flow5.yml | namespace: prod.marketing.crm |
| namespace: dev.marketing.crm | _flows/marketing/crm/flow6.yml | namespace: prod.marketing.crm |
Default value is : false
Default value is : main
It’s false by default to avoid destructive behavior. Use this property with caution: when both delete and includeChildNamespaces are set to true, this task will delete all flows from the targetNamespace and all its child namespaces that are not present in Git, rather than only overwriting the changes.
Default value is : false
Default value is : false
Default value is : false
If not set, this task assumes your branch has a Git directory named _flows (equivalent to the default gitDirectory of the PushFlows task).
If the includeChildNamespaces property is set to true, this task will sync all flows from nested subdirectories into their corresponding child namespaces, e.g. if targetNamespace is set to prod, then:
- flows from the _flows directory will be synced to the prod namespace,
- flows from the _flows/marketing subdirectory in Git will be synced to the prod.marketing namespace,
- flows from the _flows/marketing/crm subdirectory will be synced to the prod.marketing.crm namespace.
Default value is : _flows
It’s false by default so that only flows from the explicitly declared gitDirectory are synced, without traversing child directories. If set to true, flows from subdirectories in Git will be synced to child namespaces in Kestra, using the dot notation (.) for each subdirectory in the folder structure.
Default value is : false
Default value is : false
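The namespace mapping described above can be sketched as a SyncFlows task. This is a sketch based on the properties documented in this section (targetNamespace, gitDirectory, includeChildNamespaces, delete, dryRun); verify the exact property names against the io.kestra.plugin.git.SyncFlows plugin documentation.

```yaml
id: sync_flows_from_git
namespace: system

tasks:
  - id: git
    type: io.kestra.plugin.git.SyncFlows
    url: https://github.com/kestra-io/flows
    branch: main
    username: git_username
    password: "{{ secret('GITHUB_ACCESS_TOKEN') }}"
    targetNamespace: prod
    gitDirectory: _flows          # optional; _flows by default
    includeChildNamespaces: true  # sync _flows/marketing to prod.marketing, etc.
    delete: false                 # keep false unless Git is the single source of truth
    dryRun: true                  # log planned changes without applying them
```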
To generate an ECDSA PEM format key from OpenSSH, use the following command: ssh-keygen -t ecdsa -b 256 -m PEM. You can then set this property with your private key content and put your public key on Git.
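The key-generation step above can be scripted; this is a minimal sketch (the file name kestra_git_key is arbitrary, and -N "" sets an empty passphrase for illustration only):

```shell
# Generate an ECDSA key pair in PEM format for Git authentication
# (-q suppresses output; use a real passphrase where appropriate)
ssh-keygen -t ecdsa -b 256 -m PEM -f ./kestra_git_key -N "" -q

# The private key (./kestra_git_key) goes into the task's privateKey property;
# add the public key to your Git host, e.g. as a deploy key:
cat ./kestra_git_key.pub
```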
1 nested properties
This task syncs Namespace Files from a given Git branch to a Kestra namespace. If the delete property is set to true, any Namespace Files available in Kestra but not present in the gitDirectory will be deleted, allowing you to maintain Git as the single source of truth for your Namespace Files. Check the Version Control with Git documentation for more details.
Examples
Sync Namespace Files from a Git repository. This flow can run either on a schedule (using the Schedule trigger) or anytime you push a change to a given Git branch (using the Webhook trigger).
id: sync_from_git
namespace: company.team
tasks:
- id: git
type: io.kestra.plugin.git.SyncNamespaceFiles
namespace: prod
gitDirectory: _files # optional; set to _files by default
delete: true # optional; by default, it's set to false to avoid destructive behavior
url: https://github.com/kestra-io/flows
branch: main
username: git_username
password: "{{ secret('GITHUB_ACCESS_TOKEN') }}"
dryRun: true # if true, the task will only log which files from Git would be added, modified, or deleted in Kestra, without making any changes to the Kestra backend yet
triggers:
- id: every_minute
type: io.kestra.plugin.core.trigger.Schedule
cron: "*/1 * * * *"
Default value is : false
Default value is : kestra
It’s false by default to avoid destructive behavior. Use with caution because when set to true, this task will delete all Namespace Files which are not present in Git.
Default value is : false
Default value is : false
Default value is : false
If not set, this task assumes your branch includes a directory named _files
Default value is : _files
Default value is : false
Default value is : "{{ flow.namespace }}"
To generate an ECDSA PEM format key from OpenSSH, use the following command: ssh-keygen -t ecdsa -b 256 -m PEM. You can then set this property with your private key content and put your public key on Git.
1 nested properties
Requires authentication.
Examples
Search for code in a repository.
id: github_code_search_flow
namespace: company.team
tasks:
- id: search_code
type: io.kestra.plugin.github.code.Search
oauthToken: your_github_token
query: "addClass in:file language:js repo:jquery/jquery"
Search for code in a repository.
id: github_code_search_flow
namespace: company.team
tasks:
- id: search_code
type: io.kestra.plugin.github.code.Search
oauthToken: your_github_token
query: addClass
in: file
language: js
repository: jquery/jquery
Default value is : false
Default value is : false
Whether to include forks.
Does not require additional fields to log in.
Default value is : false
Requires an additional field, oauthToken, to log in.
GitHub Personal Access Token. It can be used together with login or on its own.
ASC - the results will be in ascending order
DESC - the results will be in descending order
Default value is : ASC
Allows you to limit your search to specific areas of GitHub.
BEST_MATCH - the results will be sorted by best match results
INDEXED - the results will be sorted by the index
Default value is : BEST_MATCH
1 nested properties
Requires authentication.
Examples
Search for commits in a repository.
id: github_commit_search_flow
namespace: company.team
tasks:
- id: search_commit
type: io.kestra.plugin.github.commits.Search
oauthToken: your_github_token
query: "Initial repo:kestra-io/plugin-github language:java"
Search for commits in a repository.
id: github_commit_search_flow
namespace: company.team
tasks:
- id: search_commit
type: io.kestra.plugin.github.commits.Search
oauthToken: your_github_token
query: Initial
repository: kestra-io/plugin-github
Default value is : false
When you search for a date, you can use greater than, less than, and range qualifiers to further filter results.
Default value is : false
Does not require additional fields to log in.
Default value is : false
Requires an additional field, oauthToken, to log in.
GitHub Personal Access Token. It can be used together with login or on its own.
ASC - the results will be in ascending order
DESC - the results will be in descending order
Default value is : ASC
Allows you to limit your search to specific areas of GitHub.
COMMITTER_DATE - the results will be sorted by the commit's committer date
AUTHOR_DATE - the results will be sorted by the commit's author date
Default value is : COMMITTER_DATE
1 nested properties
If no authentication is provided, anonymous authentication will be used.
Examples
Put a comment on an issue in a repository.
id: github_comment_on_issue_flow
namespace: company.team
tasks:
- id: comment_on_issue
type: io.kestra.plugin.github.issues.Comment
oauthToken: your_github_token
repository: kestra-io/kestra
issueNumber: 1347
body: "{{ execution.id }} has failed on {{ taskrun.startDate }}. See the link below for more details"
Default value is : false
Default value is : false
Does not require additional fields to log in.
Default value is : false
Requires an additional field, oauthToken, to log in.
GitHub Personal Access Token. It can be used together with login or on its own.
Repository where the issue/ticket should be created, formatted as username/repository-name.
1 nested properties
If no authentication is provided, anonymous authentication will be used.
Examples
Create an issue in a repository using JWT token.
id: github_issue_create_flow
namespace: company.team
tasks:
- id: create_issue
type: io.kestra.plugin.github.issues.Create
jwtToken: your_github_jwt_token
repository: kestra-io/kestra
title: Workflow failed
body: "{{ execution.id }} has failed on {{ taskrun.startDate }}. See the link below for more details"
labels:
- bug
- workflow
Create an issue in a repository using OAuth token.
id: github_issue_create_flow
namespace: company.team
tasks:
- id: create_issue
type: io.kestra.plugin.github.issues.Create
login: your_github_login
oauthToken: your_github_token
repository: kestra-io/kestra
title: Workflow failed
body: "{{ execution.id }} has failed on {{ taskrun.startDate }}. See the link below for more details"
labels:
- bug
- workflow
Create an issue in a repository with assignees.
id: github_issue_create_flow
namespace: company.team
tasks:
- id: create_issue
type: io.kestra.plugin.github.issues.Create
oauthToken: your_github_token
repository: kestra-io/kestra
title: Workflow failed
body: "{{ execution.id }} has failed on {{ taskrun.startDate }}. See the link below for more details"
labels:
- bug
- workflow
assignees:
- MyDeveloperUserName
- MyDesignerUserName
Default value is : false
List of unique names of assignees.
Default value is : false
Does not require additional fields to log in.
List of labels for the ticket.
Default value is : false
Requires an additional field, oauthToken, to log in.
GitHub Personal Access Token. It can be used together with login or on its own.
Repository where the issue/ticket should be created, formatted as username/repository-name.
1 nested properties
If no authentication is provided, anonymous authentication will be used.
Examples
Search for issues in a repository.
id: github_issue_search_flow
namespace: company.team
tasks:
- id: search_issues
type: io.kestra.plugin.github.issues.Search
oauthToken: your_github_token
query: "repo:kestra-io/plugin-github is:open"
Search for open issues in a repository.
id: github_issue_search_flow
namespace: company.team
tasks:
- id: search_open_issues
type: io.kestra.plugin.github.issues.Search
oauthToken: your_github_token
repository: kestra-io/plugin-github
open: TRUE
Default value is : false
Default value is : false
Does not require additional fields to log in.
Default value is : false
Requires an additional field, oauthToken, to log in.
GitHub Personal Access Token. It can be used together with login or on its own.
ASC - the results will be in ascending order
DESC - the results will be in descending order
Default value is : ASC
CREATED - Sorts the results of the query by the time the issue was created (DEFAULT)
UPDATED - Sorts the results of the query by the time the issue was last updated
COMMENTS - Sorts the results of the query by the number of comments
Default value is : CREATED
1 nested properties
If no authentication is provided, anonymous authentication will be used.
Examples
Create a pull request in a repository.
id: github_pulls_create_flow
namespace: company.team
tasks:
- id: create_pull_request
type: io.kestra.plugin.github.pulls.Create
oauthToken: your_github_token
repository: kestra-io/kestra
sourceBranch: develop
targetBranch: main
title: Workflow failed
body: "{{ execution.id }} has failed on {{ taskrun.startDate }}. See the link below for more details"
Default value is : false
The contents of the pull request. This is the markdown description of a pull request.
Default value is : false
Boolean value indicates whether to create a draft pull request or not. Default is false.
Default value is : false
Does not require additional fields to log in.
Default value is : false
Requires an additional field, oauthToken, to log in.
Boolean value indicating whether maintainers can modify the pull request. Default is false.
Default value is : false
GitHub Personal Access Token. It can be used together with login or on its own.
Repository where the pull request should be created, formatted as username/repository-name.
Required. The name of the branch where your changes are implemented. For cross-repository pull requests in the same network, namespace head with a user like this: username:branch.
Required. The name of the branch you want your changes pulled into. This should be an existing branch on the current repository.
Required. The title of the pull request.
1 nested properties
If no authentication is provided, anonymous authentication will be used. Anonymous authentication can't retrieve full information.
Examples
Search for pull requests in a repository.
id: github_pulls_search_flow
namespace: company.team
tasks:
- id: search_pull_requests
type: io.kestra.plugin.github.pulls.Search
oauthToken: your_github_token
query: "repo:kestra-io/plugin-github is:open"
Search for open pull requests in a repository.
id: github_pulls_search_flow
namespace: company.team
tasks:
- id: search_open_pull_requests
type: io.kestra.plugin.github.pulls.Search
oauthToken: your_github_token
repository: kestra-io/plugin-github
open: TRUE
Default value is : false
You can use greater than, less than, and range qualifiers (.. between two dates) to further filter results.
The SHA syntax must be at least seven characters.
You can use greater than, less than, and range qualifiers (.. between two dates) to further filter results.
Requires authentication.
Default value is : false
Does not require additional fields to log in.
Default value is : false
Requires an additional field, oauthToken, to log in.
GitHub Personal Access Token. It can be used together with login or on its own.
ASC - the results will be in ascending order
DESC - the results will be in descending order
Default value is : ASC
Allows you to limit your search to specific areas of GitHub.
CREATED - Sorts the results of the query by the time the issue was created (DEFAULT)
UPDATED - Sorts the results of the query by the time the issue was last updated
COMMENTS - Sorts the results of the query by the number of comments
Default value is : CREATED
You can use greater than, less than, and range qualifiers (.. between two dates) to further filter results
1 nested properties
If no authentication is provided, anonymous authentication will be used. Anonymous authentication can't retrieve full information.
Examples
Search for Github repositories using query.
id: github_repo_search_flow
namespace: company.team
tasks:
- id: search_repositories
type: io.kestra.plugin.github.repositories.Search
oauthToken: your_github_token
query: "repo:kestra-io/plugin-github"
Search for Github repositories using repository.
id: github_repo_search_flow
namespace: company.team
tasks:
- id: search_repositories
type: io.kestra.plugin.github.repositories.Search
oauthToken: your_github_token
repository: kestra-io/plugin-github
Search for Github repositories and order the results.
id: github_repo_search_flow
namespace: company.team
tasks:
- id: search_repositories
type: io.kestra.plugin.github.repositories.Search
oauthToken: your_github_token
query: "user:kestra-io language:java is:public"
sort: STARS
order: DESC
Search for Github repositories with filters like language and visibility, and order the results.
id: github_repo_search_flow
namespace: company.team
tasks:
- id: search_repositories
type: io.kestra.plugin.github.repositories.Search
oauthToken: your_github_token
user: kestra-io
language: java
visibility: PUBLIC
sort: STARS
order: DESC
Default value is : false
Default value is : false
Does not require additional fields to log in.
Can be the language name or alias.
Default value is : false
Requires an additional field, oauthToken, to log in.
GitHub Personal Access Token. It can be used together with login or on its own.
ASC - the results will be in ascending order
DESC - the results will be in descending order
Default value is : ASC
Qualifiers allow you to limit your search to specific areas of GitHub.
Example string: "myUserName/MyRepository". query equivalent: "repo:myUserName/MyRepository".
UPDATED - the results will be sorted by when the repository was last updated
STARS - the results will be sorted by the number of stars the repository has
FORKS - the results will be sorted by the number of forks the repository has
Default value is : UPDATED
To search by organization, use: "query: org:myOrganization".
PUBLIC - shows only public repositories
PRIVATE - shows only private repositories that are available to the user who is searching
INTERNAL - shows only internal repositories
1 nested properties
If no authentication is provided, anonymous authentication will be used. Anonymous authentication can't retrieve full information.
Examples
Search for topics.
id: github_topic_search_flow
namespace: company.team
tasks:
- id: search_topics
type: io.kestra.plugin.github.topics.Search
oauthToken: your_github_token
query: "micronaut framework is:not-curated repositories:>100"
Search for topics with conditions.
id: github_topic_search_flow
namespace: company.team
tasks:
- id: search_topics
type: io.kestra.plugin.github.topics.Search
oauthToken: your_github_token
query: "micronaut framework"
is: NOT_CURATED
repositories: ">100"
Default value is : false
You can use greater than, less than, and range qualifiers to further filter results.
Default value is : false
CURATED - Matches topics that are curated
FEATURED - Matches topics that are featured on https://github.com/topics/
NOT_CURATED - Matches topics that don't have extra information, such as a description or logo
NOT_FEATURED - Matches topics that aren't featured on https://github.com/topics/
Does not require additional fields to log in.
Default value is : false
Requires an additional field, oauthToken, to log in.
GitHub Personal Access Token. It can be used together with login or on its own.
ASC - the results will be in ascending order
DESC - the results will be in descending order
Default value is : ASC
Allows you to limit your search to specific areas of GitHub.
You can use greater than, less than, and range qualifiers to further filter results.
1 nested properties
If no authentication is provided, anonymous authentication will be used. Anonymous authentication can't retrieve full information.
Examples
Search for users.
id: github_user_search_flow
namespace: company.team
tasks:
- id: search_users
type: io.kestra.plugin.github.users.Search
oauthToken: your_github_token
query: "kestra-io in:login language:java"
Search for users with conditions.
id: github_user_search_flow
namespace: company.team
tasks:
- id: search_users
type: io.kestra.plugin.github.users.Search
oauthToken: your_github_token
query: kestra-io
in: login
language: java
USER - the results will include only user accounts
ORGANIZATION - the results will include only organization accounts
Default value is : false
Available formats:
- '<=YYYY-MM-DD' - joined at or before
- '>=YYYY-MM-DD' - joined at or after
- '<YYYY-MM-DD', '>YYYY-MM-DD' - the strict variants of the above two
- 'YYYY-MM-DD..YYYY-MM-DD' - joined within the given period
Default value is : false
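As a sketch, the date qualifiers above could be combined with the user search task like this (the created property name follows the pattern of the other qualifier fields documented for this task; verify it against the plugin docs):

```yaml
id: github_user_search_by_join_date
namespace: company.team

tasks:
  - id: search_users
    type: io.kestra.plugin.github.users.Search
    oauthToken: your_github_token
    query: kestra-io
    created: ">=2020-01-01"  # only users who joined on or after this date
```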
Example: kenya in:login matches users with the word "kenya" in their username. It can also be used to search for users that have a sponsor profile, equivalent to query: is:sponsorable.
Does not require additional fields to log in.
Can be the language name or alias.
Default value is : false
Requires an additional field, oauthToken, to log in.
GitHub Personal Access Token. It can be used together with login or on its own.
ASC - the results will be in ascending order (DEFAULT)
DESC - the results will be in descending order
Default value is : ASC
Qualifiers allow you to limit your search to specific areas of GitHub.
JOINED - the results will be sorted by when the user joined GitHub (DEFAULT)
REPOSITORIES - the results will be sorted by the number of repositories owned by the user
FOLLOWERS - the results will be sorted by the number of followers the user has
Default value is : JOINED
1 nested properties
Examples
id: googleworkspace_drive_create
namespace: company.team
tasks:
- id: create
type: io.kestra.plugin.googleworkspace.drive.Create
name: "My Folder"
mimeType: "application/vnd.google-apps.folder"
Default value is : false
Default value is : false
Default value is : false
Drive will attempt to automatically detect an appropriate value from uploaded content if no value is provided. The value cannot be changed unless a new revision is uploaded. If a file is created with a Google Doc MIME type, the uploaded content will be imported if possible. The supported import formats are published here.
This is not necessarily unique within a folder
Default value is : 120
Default value is : - https://www.googleapis.com/auth/drive
Default value is : - https://www.googleapis.com/auth/drive
[
"https://www.googleapis.com/auth/drive"
]
1 nested properties
Examples
id: googleworkspace_drive_delete
namespace: company.team
tasks:
- id: delete
type: io.kestra.plugin.googleworkspace.drive.Delete
fileId: "1Dkd3W0OQo-wxz1rrORLP7YGSj6EBLEg74fiTdbJUIQE"
Default value is : false
Default value is : false
Default value is : false
Default value is : 120
Default value is : - https://www.googleapis.com/auth/drive
Default value is : - https://www.googleapis.com/auth/drive
[
"https://www.googleapis.com/auth/drive"
]
1 nested properties
Examples
id: googleworkspace_drive_download
namespace: company.team
tasks:
- id: download
type: io.kestra.plugin.googleworkspace.drive.Download
fileId: "1Dkd3W0OQo-wxz1rrORLP7YGSj6EBLEg74fiTdbJUIQE"
Default value is : false
Default value is : false
Default value is : false
Default value is : 120
Default value is : - https://www.googleapis.com/auth/drive
Default value is : - https://www.googleapis.com/auth/drive
[
"https://www.googleapis.com/auth/drive"
]
1 nested properties
Examples
id: googleworkspace_drive_export
namespace: company.team
tasks:
- id: export
type: io.kestra.plugin.googleworkspace.drive.Export
fileId: "1Dkd3W0OQo-wxz1rrORLP7YGSj6EBLEg74fiTdbJUIQE"
A valid RFC 2045 content type, like text/csv, application/msword, etc.
Default value is : false
Default value is : false
Default value is : false
Default value is : 120
Default value is : - https://www.googleapis.com/auth/drive
Default value is : - https://www.googleapis.com/auth/drive
[
"https://www.googleapis.com/auth/drive"
]
1 nested properties
Examples
List subfolders in a Drive folder
id: googleworkspace_drive_list
namespace: company.team
tasks:
- id: list
type: io.kestra.plugin.googleworkspace.drive.List
query: |
mimeType = 'application/vnd.google-apps.folder'
and '1z2GZgLEX12BN9zbVE6TodrCHyTRMj_ka' in parents
Default value is : false
'allTeamDrives' must be combined with 'user'; all other values must be used in isolation. Prefer 'user' or 'teamDrive' to 'allTeamDrives' for efficiency.
Default value is : false
Default value is : false
See details here. If not defined, the task will list all files that the service account has access to.
Default value is : 120
Default value is : - https://www.googleapis.com/auth/drive
Default value is : - https://www.googleapis.com/auth/drive
[
"https://www.googleapis.com/auth/drive"
]
1 nested properties
Examples
Upload a CSV file and convert it to Google Sheets format
id: googleworkspace_drive_upload
namespace: company.team
inputs:
- id: file
type: FILE
description: The file to be uploaded to Google Drive
tasks:
- id: upload
type: io.kestra.plugin.googleworkspace.drive.Upload
from: "{{ inputs.file }}"
parents:
- "1HuxzpLt1b0111MuKMgy8wAv-m9Myc1E_"
name: "My awesome CSV"
contentType: "text/csv"
mimeType: "application/vnd.google-apps.spreadsheet"
A valid RFC 2045 content type, like text/csv, application/msword, etc.
Default value is : false
Default value is : false
If not provided, it will create a new file
Default value is : false
Drive will attempt to automatically detect an appropriate value from uploaded content if no value is provided. The value cannot be changed unless a new revision is uploaded. If a file is created with a Google Doc MIME type, the uploaded content will be imported if possible. The supported import formats are published here.
This is not necessarily unique within a folder
Default value is : 120
Default value is : - https://www.googleapis.com/auth/drive
Default value is : - https://www.googleapis.com/auth/drive
[
"https://www.googleapis.com/auth/drive"
]
1 nested properties
Default value is : UTF-8
Default value is : ","
The default value is 0. This property is useful if you have header rows in the file that should be skipped.
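Putting the CSV options above together, a csvOptions fragment might look like the following. The fieldDelimiter name appears in the Load example later in this document; the encoding and skip-rows property names shown here are illustrative guesses based on the defaults listed above, so check the task's schema for the exact names.

```yaml
csvOptions:
  fieldDelimiter: ";"  # default is ","
  encoding: UTF-8      # default; hypothetical property name
  skipRows: 1          # skip one header row (default 0); hypothetical property name
```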
Examples
Create a spreadsheet in Google Workspace
id: googleworkspace_sheets_create
namespace: company.team
inputs:
- id: serviceAccount
type: STRING
tasks:
- id: create_spreadsheet
type: io.kestra.plugin.googleworkspace.sheets.CreateSpreadsheet
serviceAccount: "{{ inputs.serviceAccount }}"
Default value is : false
Default value is : false
Default value is : false
Default value is : 120
Default value is : - https://www.googleapis.com/auth/spreadsheets
Default value is : - https://www.googleapis.com/auth/spreadsheets
[
"https://www.googleapis.com/auth/spreadsheets"
]
1 nested properties
Examples
Delete a spreadsheet in Google Workspace
id: googleworkspace_sheets_delete
namespace: company.team
inputs:
- id: serviceAccount
type: STRING
tasks:
- id: delete_spreadsheet
type: io.kestra.plugin.googleworkspace.sheets.DeleteSpreadsheet
serviceAccount: "{{ inputs.serviceAccount }}"
spreadsheetId: "xxxxxxxxxxxxxxxx"
Default value is : false
Default value is : false
Default value is : false
Default value is : 120
Default value is : - https://www.googleapis.com/auth/spreadsheets
Default value is : - https://www.googleapis.com/auth/spreadsheets
[
"https://www.googleapis.com/auth/spreadsheets"
]
1 nested properties
Examples
Load data into a Google Workspace spreadsheet from an input file
id: googleworkspace_sheets_load
namespace: company.team
inputs:
- id: file
type: FILE
- id: serviceAccount
type: STRING
tasks:
- id: load_data
type: io.kestra.plugin.googleworkspace.sheets.Load
from: "{{ inputs.file }}"
spreadsheetId: xxxxxxxxxxxxxxxxx
range: Sheet2
serviceAccount: "{{ inputs.serviceAccount }}"
csvOptions:
fieldDelimiter: ";"
Default value is : false
If provided, the task will read avro objects using this schema.
Default value is : false
If not provided, the task will programmatically try to find the correct format based on the extension.
Default value is : false
Default value is : false
Default value is : Sheet1
Default value is : 120
Default value is : - https://www.googleapis.com/auth/spreadsheets
Default value is : - https://www.googleapis.com/auth/spreadsheets
[
"https://www.googleapis.com/auth/spreadsheets"
]
1 nested properties
Examples
id: googleworkspace_sheets_read
namespace: company.team
tasks:
- id: read
type: io.kestra.plugin.googleworkspace.sheets.Read
spreadsheetId: "1Dkd3W0OQo-wxz1rrORLP7YGSj6EBLEg74fiTdbJUIQE"
store: true
valueRender: FORMATTED_VALUE
Default value is : false
This is ignored if valueRender is FORMATTED_VALUE.
More details here
Default value is : FORMATTED_STRING
Default value is : false
Default value is : false
Default value is : true
Default value is : false
Default value is : 120
Default value is : - https://www.googleapis.com/auth/spreadsheets
Default value is : - https://www.googleapis.com/auth/spreadsheets
[
"https://www.googleapis.com/auth/spreadsheets"
]
If not provided all the sheets will be included.
Default value is : true
More details here
Default value is : UNFORMATTED_VALUE
1 nested properties
Examples
id: googleworkspace_sheets_readrange
namespace: company.team
tasks:
- id: read_range
type: io.kestra.plugin.googleworkspace.sheets.ReadRange
spreadsheetId: "1Dkd3W0OQo-wxz1rrORLP7YGSj6EBLEg74fiTdbJUIQE"
range: "Second One!A1:I"
store: true
valueRender: FORMATTED_VALUE
Default value is : false
This is ignored if valueRender is FORMATTED_VALUE.
More details here
Default value is : FORMATTED_STRING
Default value is : false
Default value is : false
Default value is : true
Default value is : false
Default value is : 120
Default value is : - https://www.googleapis.com/auth/spreadsheets
Default value is : - https://www.googleapis.com/auth/spreadsheets
[
"https://www.googleapis.com/auth/spreadsheets"
]
Default value is : true
More details here
Default value is : UNFORMATTED_VALUE
1 nested properties
Examples
id: hightouch_sync
namespace: company.team
tasks:
- id: sync
type: io.kestra.plugin.hightouch.Sync
token: YOUR_API_TOKEN
syncId: 1127166
Default value is : false
Default value is : false
Default value is : false
Default value is : false
Default value is : 300.000000000
Allows capturing the run status and logs.
Default value is : true
1 nested properties
Examples
id: hubspot_tickets_create
namespace: company.team
tasks:
- id: create_ticket
type: io.kestra.plugin.hubspot.tickets.Create
apiKey: my_api_key
subject: "Increased 5xx in Demo Service"
content: "The number of 5xx has increased beyond the threshold for Demo service."
stage: 3
priority: HIGH
Create a ticket when a Kestra workflow in any namespace with company as prefix fails.
id: create_ticket_on_failure
namespace: system
tasks:
- id: create_ticket
type: io.kestra.plugin.hubspot.tickets.Create
apiKey: my_api_key
subject: Workflow failed
content: "{{ execution.id }} has failed on {{ taskrun.startDate }}"
stage: 3
priority: HIGH
triggers:
- id: on_failure
type: io.kestra.plugin.core.trigger.Flow
conditions:
- type: io.kestra.plugin.core.condition.ExecutionStatusCondition
in:
- FAILED
- WARNING
- type: io.kestra.plugin.core.condition.ExecutionNamespaceCondition
namespace: company
comparison: PREFIX
Default value is : false
Default value is : false
Default value is : false
(Optional) Available values: LOW - low priority, MEDIUM - medium priority, HIGH - high priority.
Default value is : 1
1 nested properties
Examples
Send a SQL query to a database and fetch row(s) using Apache Arrow Flight SQL driver.
id: arrow_flight_sql_query
namespace: company.team
tasks:
- id: query
type: io.kestra.plugin.jdbc.arrowflight.Query
url: jdbc:arrow-flight-sql://localhost:31010/?useEncryption=false
username: db_user
password: db_password
sql: SELECT * FROM departments
fetchType: FETCH
Send a SQL query to a Dremio coordinator and fetch rows as output using Apache Arrow Flight SQL driver.
id: arrow_flight_sql_query
namespace: company.team
tasks:
- id: query
type: io.kestra.plugin.jdbc.arrowflight.Query
url: jdbc:arrow-flight-sql://dremio-coordinator:32010/?schema=postgres.public
username: dremio_user
password: dremio_password
sql: SELECT * FROM departments
fetchType: FETCH
Default value is : false
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
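For large result sets, STORE avoids materializing all rows in the execution context. A sketch reusing the connection settings from the example above:

```yaml
id: arrow_flight_sql_store
namespace: company.team

tasks:
  - id: query
    type: io.kestra.plugin.jdbc.arrowflight.Query
    url: jdbc:arrow-flight-sql://localhost:31010/?useEncryption=false
    username: db_user
    password: db_password
    sql: SELECT * FROM departments
    fetchType: STORE  # rows are written to a file in internal storage instead of an output variable
```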
Default value is : false
Default value is : false
1 nested properties
Examples
Wait for a SQL query to return results, and then iterate through rows.
id: jdbc_trigger
namespace: company.team
tasks:
- id: each
type: io.kestra.plugin.core.flow.EachSequential
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ json(taskrun.value) }}"
value: "{{ trigger.rows }}"
triggers:
- id: watch
type: io.kestra.plugin.jdbc.arrowflight.Trigger
username: dremio_user
password: dremio_password
url: jdbc:arrow-flight-sql://dremio-coordinator:32010/?schema=postgres.public
interval: "PT5M"
sql: "SELECT * FROM my_table"
fetchType: FETCH
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
The interval between two successive polls of the schedule; this can avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the minimal interval must be at least PT30S. See ISO 8601 durations for more information on available interval values.
Default value is : 60.000000000
Default value is : false
Default value is : false
1 nested properties
Examples
Send a SQL query to an AS400 database and fetch a row as output.
id: as400_query
namespace: company.team
tasks:
- id: query
type: io.kestra.plugin.jdbc.as400.Query
url: jdbc:as400://127.0.0.1:50000/
username: as400_user
password: as400_password
sql: select * from as400_types
fetchType: FETCH_ONE
Default value is : false
Sets this connection's auto-commit mode to the given state. If a connection is in auto-commit mode, all of its SQL statements are executed and committed as individual transactions. Otherwise, its SQL statements are grouped into transactions that are terminated by a call to either commit or rollback. By default, new connections are in auto-commit mode, except when you are using the store property, in which case auto-commit will be disabled.
Default value is : true
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
Default value is : false
Default value is : false
1 nested properties
Examples
Wait for a SQL query to return results, and then iterate through rows.
id: jdbc_trigger
namespace: company.team
tasks:
- id: each
type: io.kestra.plugin.core.flow.EachSequential
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ json(taskrun.value) }}"
value: "{{ trigger.rows }}"
triggers:
- id: watch
type: io.kestra.plugin.jdbc.as400.Trigger
interval: "PT5M"
url: jdbc:as400://127.0.0.1:50000/
username: as400_user
password: as400_password
sql: "SELECT * FROM my_table"
fetchType: FETCH
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
The interval between two successive polls of the schedule; this can avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the interval should be at least PT30S. See ISO 8601 durations for more information on the available interval values.
Default value is : 60.000000000
Default value is : false
Default value is : false
1 nested properties
Examples
Insert rows from another table into a ClickHouse database using asynchronous inserts.
id: clickhouse_bulk_insert
namespace: company.team
inputs:
- id: file
type: FILE
tasks:
- id: bulk_insert
type: io.kestra.plugin.jdbc.clickhouse.BulkInsert
from: "{{ inputs.file }}"
url: jdbc:clickhouse://127.0.0.1:56982/
username: ch_user
password: ch_password
sql: INSERT INTO YourTable SETTINGS async_insert=1, wait_for_async_insert=1 values( ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ? )
Insert data into specific columns via a SQL query to a ClickHouse database using asynchronous inserts.
id: clickhouse_bulk_insert
namespace: company.team
inputs:
- id: file
type: FILE
tasks:
- id: bulk_insert
type: io.kestra.plugin.jdbc.clickhouse.BulkInsert
from: "{{ inputs.file }}"
url: jdbc:clickhouse://127.0.0.1:56982/
username: ch_user
password: ch_password
sql: INSERT INTO YourTable ( field1, field2, field3 ) SETTINGS async_insert=1, wait_for_async_insert=1 values( ?, ?, ? )
Bulk insert data into a ClickHouse table by specifying only the table property; the INSERT statement is generated automatically.
id: clickhouse_bulk_insert
namespace: company.team
inputs:
- id: file
type: FILE
tasks:
- id: bulk_insert
type: io.kestra.plugin.jdbc.clickhouse.BulkInsert
from: "{{ inputs.file }}"
url: jdbc:clickhouse://127.0.0.1:56982/
username: ch_user
password: ch_password
table: YourTable
The query must have as many question marks as there are columns in the table. Example: 'insert into <table_name> values( ? , ? , ? )' for 3 columns. If you do not want to insert into all columns, specify the target columns in the query, as well as in the columns property. Example: 'insert into <table_name> (id, name) values( ? , ? )' to insert data into the 2 columns 'id' and 'name'.
Default value is : false
Default value is : 1000
If not provided, the number of ? placeholders must match the number of columns in the from data.
Default value is : false
Default value is : false
This property specifies the table name that will be used to retrieve the columns for the inserted values.
You can use it instead of manually specifying the columns in the columns property. In that case, the sql property can also be omitted; an INSERT statement will be generated automatically.
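A minimal sketch combining the table and columns properties so the INSERT statement is generated automatically (the table and column names are illustrative):

```yaml
- id: bulk_insert
  type: io.kestra.plugin.jdbc.clickhouse.BulkInsert
  from: "{{ inputs.file }}"
  url: jdbc:clickhouse://127.0.0.1:56982/
  username: ch_user
  password: ch_password
  table: YourTable
  columns:  # optional; omit to target every column of the table
    - id
    - name
```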
1 nested properties
Examples
Run clickhouse-local commands.
id: clickhouse-local
namespace: company.team
tasks:
- id: query
type: io.kestra.plugin.clickhouse.ClickHouseLocalCLI
commands:
- SELECT count() FROM s3('https://datasets-documentation.s3.eu-west-3.amazonaws.com/house_parquet/house_0.parquet')
Default value is : false
Default value is : clickhouse/clickhouse-server:latest
Default value is : false
Default value is : false
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
1 nested properties
Examples
Query a ClickHouse database.
id: clickhouse_query
namespace: company.team
tasks:
- id: query
type: io.kestra.plugin.jdbc.clickhouse.Query
url: jdbc:clickhouse://127.0.0.1:56982/
username: ch_user
password: ch_password
sql: select * from clickhouse_types
fetchType: STORE
Default value is : false
Sets this connection's auto-commit mode to the given state. If a connection is in auto-commit mode, all of its SQL statements are executed and committed as individual transactions. Otherwise, its SQL statements are grouped into transactions that are terminated by a call to either the commit method or the rollback method. By default, new connections are in auto-commit mode, except when you use the store property, in which case auto-commit is disabled.
Default value is : true
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
Default value is : false
Default value is : false
1 nested properties
Examples
Wait for a SQL query to return results, and then iterate through rows.
id: jdbc_trigger
namespace: company.team
tasks:
- id: each
type: io.kestra.plugin.core.flow.EachSequential
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ json(taskrun.value) }}"
value: "{{ trigger.rows }}"
triggers:
- id: watch
type: io.kestra.plugin.jdbc.clickhouse.Trigger
interval: "PT5M"
url: jdbc:clickhouse://127.0.0.1:56982/
username: ch_user
password: ch_password
sql: "SELECT * FROM my_table"
fetchType: FETCH
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
The interval between two successive polls of the schedule; this can avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the interval should be at least PT30S. See ISO 8601 durations for more information on the available interval values.
Default value is : 60.000000000
Default value is : false
Default value is : false
1 nested properties
Examples
Send a SQL query to a DB2 database and fetch a row as output.
id: db2_query
namespace: company.team
tasks:
- id: query
type: io.kestra.plugin.jdbc.db2.Query
url: jdbc:db2://127.0.0.1:50000/
username: db2inst
password: db2_password
sql: select * from db2_types
fetchType: FETCH_ONE
Default value is : false
Sets this connection's auto-commit mode to the given state. If a connection is in auto-commit mode, all of its SQL statements are executed and committed as individual transactions. Otherwise, its SQL statements are grouped into transactions that are terminated by a call to either the commit method or the rollback method. By default, new connections are in auto-commit mode, except when you use the store property, in which case auto-commit is disabled.
Default value is : true
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
Default value is : false
Default value is : false
1 nested properties
Examples
Wait for a SQL query to return results, and then iterate through rows.
id: jdbc_trigger
namespace: company.team
tasks:
- id: each
type: io.kestra.plugin.core.flow.EachSequential
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ json(taskrun.value) }}"
value: "{{ trigger.rows }}"
triggers:
- id: watch
type: io.kestra.plugin.jdbc.db2.Trigger
interval: "PT5M"
url: jdbc:db2://127.0.0.1:50000/
username: db2inst
password: db2_password
sql: "SELECT * FROM my_table"
fetchType: FETCH
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
The interval between two successive polls of the schedule; this can avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the interval should be at least PT30S. See ISO 8601 durations for more information on the available interval values.
Default value is : 60.000000000
Default value is : false
Default value is : false
1 nested properties
Examples
Send a SQL query to a Dremio database and fetch a row as output.
id: dremio_query
namespace: company.team
tasks:
- id: query
type: io.kestra.plugin.jdbc.dremio.Query
url: jdbc:dremio:direct=sql.dremio.cloud:443;ssl=true;PROJECT_ID=sampleProjectId;
username: dremio_token
password: samplePersonalAccessToken
sql: select * FROM source.database.table
fetchType: FETCH_ONE
Default value is : false
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
Default value is : false
Default value is : false
1 nested properties
Examples
Wait for a SQL query to return results, and then iterate through rows.
id: jdbc_trigger
namespace: company.team
tasks:
- id: each
type: io.kestra.plugin.core.flow.EachSequential
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ json(taskrun.value) }}"
value: "{{ trigger.rows }}"
triggers:
- id: watch
type: io.kestra.plugin.jdbc.dremio.Trigger
interval: "PT5M"
url: jdbc:dremio:direct=sql.dremio.cloud:443;ssl=true;PROJECT_ID=sampleProjectId;
username: dremio_token
password: samplePersonalAccessToken
sql: "SELECT * FROM source.database.my_table"
fetchType: FETCH
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
The interval between two successive polls of the schedule; this can avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the interval should be at least PT30S. See ISO 8601 durations for more information on the available interval values.
Default value is : 60.000000000
Default value is : false
Default value is : false
1 nested properties
Examples
Query an Apache Druid database.
id: druid_query
namespace: company.team
tasks:
- id: query
type: io.kestra.plugin.jdbc.druid.Query
url: jdbc:avatica:remote:url=http://localhost:8888/druid/v2/sql/avatica/;transparent_reconnection=true
sql: |
SELECT *
FROM wikiticker
fetchType: STORE
Default value is : false
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
Default value is : false
Default value is : false
1 nested properties
Examples
Wait for a SQL query to return results, and then iterate through rows.
id: jdbc_trigger
namespace: company.team
tasks:
- id: each
type: io.kestra.plugin.core.flow.EachSequential
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ json(taskrun.value) }}"
value: "{{ trigger.rows }}"
triggers:
- id: watch
type: io.kestra.plugin.jdbc.druid.Trigger
interval: "PT5M"
url: jdbc:avatica:remote:url=http://localhost:8888/druid/v2/sql/avatica/;transparent_reconnection=true
sql: "SELECT * FROM my_table"
fetchType: FETCH
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
The interval between two successive polls of the schedule; this can avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the interval should be at least PT30S. See ISO 8601 durations for more information on the available interval values.
Default value is : 60.000000000
Default value is : false
Default value is : false
1 nested properties
Examples
Execute a query that reads a CSV file and outputs another CSV file.
id: query_duckdb
namespace: company.team
tasks:
- id: http_download
type: io.kestra.plugin.core.http.Download
uri: "https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv"
- id: query
type: io.kestra.plugin.jdbc.duckdb.Query
url: 'jdbc:duckdb:'
timeZoneId: Europe/Paris
sql: |-
CREATE TABLE new_tbl AS SELECT * FROM read_csv_auto('{{ workingDir }}/in.csv', header=True);
COPY (SELECT order_id, customer_name FROM new_tbl) TO '{{ outputFiles.out }}' (HEADER, DELIMITER ',');
inputFiles:
in.csv: "{{ outputs.http_download.uri }}"
outputFiles:
- out
Execute a query that reads from an existing database file using a URL.
id: query_duckdb
namespace: company.team
tasks:
- id: query1
type: io.kestra.plugin.jdbc.duckdb.Query
url: jdbc:duckdb:/{{ vars.dbfile }}
sql: SELECT * FROM table_name;
fetchType: STORE
- id: query2
type: io.kestra.plugin.jdbc.duckdb.Query
url: jdbc:duckdb:/temp/folder/duck.db
sql: SELECT * FROM table_name;
fetchType: STORE
Execute a query that reads from an existing database file using the databaseFile variable.
id: query_duckdb
namespace: company.team
tasks:
- id: query1
type: io.kestra.plugin.jdbc.duckdb.Query
url: jdbc:duckdb:
databaseFile: "{{ vars.dbfile }}"
sql: SELECT * FROM table_name;
fetchType: STORE
- id: query2
type: io.kestra.plugin.jdbc.duckdb.Query
url: jdbc:duckdb:
databaseFile: /temp/folder/duck.db
sql: SELECT * FROM table_name;
fetchType: STORE
Default value is : false
Sets this connection's auto-commit mode to the given state. If a connection is in auto-commit mode, all of its SQL statements are executed and committed as individual transactions. Otherwise, its SQL statements are grouped into transactions that are terminated by a call to either the commit method or the rollback method. By default, new connections are in auto-commit mode, except when you use the store property, in which case auto-commit is disabled.
Default value is : true
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
A map of input files that will be written to the working directory and made usable by DuckDB. You can reach these files using the workingDir variable, for example: SELECT * FROM read_csv_auto('{{ workingDir }}/myfile.csv');
Default value is : false
List of keys that will generate temporary files.
In the SQL query, you can use a variable named outputFiles.key to reference the corresponding file.
If you add a file with ["first"], you can use the special variable in COPY tbl TO '{{ outputFiles.first }}' (HEADER, DELIMITER ','); and reuse this file in other tasks with {{ outputs.taskId.outputFiles.first }}.
Default value is : false
The default value, jdbc:duckdb:, will use a local in-memory database.
Set this property when connecting to a persisted database instance, for example jdbc:duckdb:md:my_database?motherduck_token=<my_token> to connect to MotherDuck.
Default value is : "jdbc:duckdb:"
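A sketch of connecting to a persisted MotherDuck database as described above (the database name is illustrative, and the secret() call assumes the token is stored as a Kestra secret):

```yaml
- id: motherduck_query
  type: io.kestra.plugin.jdbc.duckdb.Query
  url: "jdbc:duckdb:md:my_database?motherduck_token={{ secret('MOTHERDUCK_TOKEN') }}"
  sql: SELECT * FROM my_table;
  fetchType: STORE
```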
1 nested properties
Examples
Wait for a SQL query to return results, and then iterate through rows.
id: jdbc_trigger
namespace: company.team
tasks:
- id: each
type: io.kestra.plugin.core.flow.EachSequential
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ json(taskrun.value) }}"
value: "{{ trigger.rows }}"
triggers:
- id: watch
type: io.kestra.plugin.jdbc.duckdb.Trigger
interval: "PT5M"
url: 'jdbc:duckdb:'
sql: "SELECT * FROM my_table"
fetchType: FETCH
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
The interval between two successive polls of the schedule; this can avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the interval should be at least PT30S. See ISO 8601 durations for more information on the available interval values.
Default value is : 60.000000000
Default value is : false
Default value is : false
Default value is : jdbc:duckdb:null
1 nested properties
Examples
Fetch rows from a table, and bulk insert them into another one.
id: mysql_batch
namespace: company.team
tasks:
- id: query
type: io.kestra.plugin.jdbc.mysql.Query
url: jdbc:mysql://127.0.0.1:3306/
username: mysql_user
password: mysql_password
sql: |
SELECT *
FROM xref
LIMIT 1500;
fetchType: STORE
- id: update
type: io.kestra.plugin.jdbc.mysql.Batch
from: "{{ outputs.query.uri }}"
url: jdbc:mysql://127.0.0.1:3306/
username: mysql_user
password: mysql_password
sql: |
insert into xref values( ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ? )
Fetch rows from a table, and bulk insert them into another one, without using a SQL query.
id: mysql_batch
namespace: company.team
tasks:
- id: query
type: io.kestra.plugin.jdbc.mysql.Query
url: jdbc:mysql://127.0.0.1:3306/
username: mysql_user
password: mysql_password
sql: |
SELECT *
FROM xref
LIMIT 1500;
fetchType: STORE
- id: update
type: io.kestra.plugin.jdbc.mysql.Batch
from: "{{ outputs.query.uri }}"
url: jdbc:mysql://127.0.0.1:3306/
username: mysql_user
password: mysql_password
table: xref
The query must have as many question marks as there are columns in the table. Example: 'insert into <table_name> values( ? , ? , ? )' for 3 columns. If you do not want to insert into all columns, specify the target columns in the query, as well as in the columns property. Example: 'insert into <table_name> (id, name) values( ? , ? )' to insert data into the 2 columns 'id' and 'name'.
Default value is : false
Default value is : 1000
If not provided, the number of ? placeholders must match the number of columns in the from data.
Default value is : false
Default value is : false
This property specifies the table name that will be used to retrieve the columns for the inserted values.
You can use it instead of manually specifying the columns in the columns property. In that case, the sql property can also be omitted; an INSERT statement will be generated automatically.
1 nested properties
Examples
Send a SQL query to a MySQL database and fetch a row as output.
id: mysql_query
namespace: company.team
tasks:
- id: query
type: io.kestra.plugin.jdbc.mysql.Query
url: jdbc:mysql://127.0.0.1:3306/
username: mysql_user
password: mysql_password
sql: select * from mysql_types
fetchType: FETCH_ONE
Load a CSV file into a MySQL table.
id: mysql_query
namespace: company.team
tasks:
- id: http_download
type: io.kestra.plugin.core.http.Download
uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/products.csv
- id: query
type: io.kestra.plugin.jdbc.mysql.Query
url: jdbc:mysql://127.0.0.1:3306/
username: mysql_user
password: mysql_password
inputFile: "{{ outputs.http_download.uri }}"
sql: |
LOAD DATA LOCAL INFILE '{{ inputFile }}'
INTO TABLE products
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;
Default value is : false
Sets this connection's auto-commit mode to the given state. If a connection is in auto-commit mode, all of its SQL statements are executed and committed as individual transactions. Otherwise, its SQL statements are grouped into transactions that are terminated by a call to either the commit method or the rollback method. By default, new connections are in auto-commit mode, except when you use the store property, in which case auto-commit is disabled.
Default value is : true
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
The file must come from Kestra's internal storage.
Default value is : false
Default value is : false
1 nested properties
Examples
Wait for a SQL query to return results, and then iterate through rows.
id: jdbc_trigger
namespace: company.team
tasks:
- id: each
type: io.kestra.plugin.core.flow.EachSequential
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ json(taskrun.value) }}"
value: "{{ trigger.rows }}"
triggers:
- id: watch
type: io.kestra.plugin.jdbc.mysql.Trigger
interval: "PT5M"
url: jdbc:mysql://127.0.0.1:3306/
username: mysql_user
password: mysql_password
sql: "SELECT * FROM my_table"
fetchType: FETCH
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
The interval between two successive polls of the schedule; this can avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the interval should be at least PT30S. See ISO 8601 durations for more information on the available interval values.
Default value is : 60.000000000
Default value is : false
Default value is : false
1 nested properties
Examples
Fetch rows from a table and bulk insert them into another one.
id: oracle_batch
namespace: company.team
tasks:
- id: query
type: io.kestra.plugin.jdbc.oracle.Query
url: jdbc:oracle:thin:@dev:49161:XE
username: oracle
password: oracle_password
sql: |
SELECT *
FROM xref
LIMIT 1500;
fetchType: STORE
- id: update
type: io.kestra.plugin.jdbc.oracle.Batch
from: "{{ outputs.query.uri }}"
url: jdbc:oracle:thin:@prod:49161:XE
username: oracle
password: oracle_password
sql: |
insert into xref values( ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ? )
Fetch rows from a table and bulk insert them into another one, without using a SQL query.
id: oracle_batch
namespace: company.team
tasks:
- id: query
type: io.kestra.plugin.jdbc.oracle.Query
url: jdbc:oracle:thin:@dev:49161:XE
username: oracle
password: oracle_password
sql: |
SELECT *
FROM xref
LIMIT 1500;
fetchType: STORE
- id: update
type: io.kestra.plugin.jdbc.oracle.Batch
from: "{{ outputs.query.uri }}"
url: jdbc:oracle:thin:@prod:49161:XE
username: oracle
password: oracle_password
table: XREF
The query must have as many question marks as there are columns in the table. Example: 'insert into <table_name> values( ? , ? , ? )' for 3 columns. If you do not want to insert into all columns, specify the target columns in the query, as well as in the columns property. Example: 'insert into <table_name> (id, name) values( ? , ? )' to insert data into the 2 columns 'id' and 'name'.
Default value is : false
Default value is : 1000
If not provided, the number of ? placeholders must match the number of columns in the from data.
Default value is : false
Default value is : false
This property specifies the table name that will be used to retrieve the columns for the inserted values.
You can use it instead of manually specifying the columns in the columns property. In that case, the sql property can also be omitted; an INSERT statement will be generated automatically.
1 nested properties
Examples
Execute a query and fetch results on another task to update another table.
id: oracle_query
namespace: company.team
tasks:
- id: select
type: io.kestra.plugin.jdbc.oracle.Query
url: jdbc:oracle:thin:@localhost:49161:XE
username: oracle_user
password: oracle_password
sql: select * from source
fetchType: FETCH
- id: generate_update
type: io.kestra.plugin.jdbc.oracle.Query
url: jdbc:oracle:thin:@localhost:49161:XE
username: oracle_user
password: oracle_password
sql: "{% for row in outputs.select.rows %} INSERT INTO destination (year_month, store_code, update_date) values ({{ row.year_month }}, {{ row.store_code }}, TO_DATE('{{ row.date }}', 'MONTH DD, YYYY') ); {% endfor %}"
Default value is : false
Sets this connection's auto-commit mode to the given state. If a connection is in auto-commit mode, all of its SQL statements are executed and committed as individual transactions. Otherwise, its SQL statements are grouped into transactions that are terminated by a call to either the commit method or the rollback method. By default, new connections are in auto-commit mode, except when you use the store property, in which case auto-commit is disabled.
Default value is : true
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
Default value is : false
Default value is : false
1 nested properties
Examples
Wait for a SQL query to return results, and then iterate through rows.
id: jdbc_trigger
namespace: company.team
tasks:
- id: each
type: io.kestra.plugin.core.flow.EachSequential
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ json(taskrun.value) }}"
value: "{{ trigger.rows }}"
triggers:
- id: watch
type: io.kestra.plugin.jdbc.oracle.Trigger
interval: "PT5M"
url: jdbc:oracle:thin:@localhost:49161:XE
username: oracle_user
password: oracle_password
sql: "SELECT * FROM my_table"
fetchType: FETCH
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
The interval between two successive polls of the schedule; this can avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the interval should be at least PT30S. See ISO 8601 durations for more information on the available interval values.
Default value is : 60.000000000
Default value is : false
Default value is : false
1 nested properties
Examples
Query an Apache Pinot database.
id: pinot_query
namespace: company.team
tasks:
- id: query
type: io.kestra.plugin.jdbc.pinot.Query
url: jdbc:pinot://localhost:9000
sql: |
SELECT *
FROM airlineStats
fetchType: FETCH
Default value is : false
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
Default value is : false
Default value is : false
1 nested properties
Examples
Wait for a SQL query to return results, and then iterate through rows.
id: jdbc_trigger
namespace: company.team
tasks:
- id: each
type: io.kestra.plugin.core.flow.EachSequential
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ json(taskrun.value) }}"
value: "{{ trigger.rows }}"
triggers:
- id: watch
type: io.kestra.plugin.jdbc.pinot.Trigger
interval: "PT5M"
url: jdbc:pinot://localhost:9000
sql: "SELECT * FROM my_table"
fetchType: FETCH
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as an output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
The interval between two consecutive polls of the schedule; polling less often helps avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the minimal interval should be at least PT30S. See ISO 8601 durations for more information on available interval values.
Default value is : 60.000000000
Default value is : false
Default value is : false
1 nested properties
Examples
Fetch rows from a table, and bulk insert them to another one.
id: postgres_bulk_insert
namespace: company.team

tasks:
  - id: query
    type: io.kestra.plugin.jdbc.postgresql.Query
    url: jdbc:postgresql://dev:56982/
    username: pg_user
    password: pg_password
    sql: |
      SELECT *
      FROM xref
      LIMIT 1500;
    fetchType: STORE

  - id: update
    type: io.kestra.plugin.jdbc.postgresql.Batch
    from: "{{ outputs.query.uri }}"
    url: jdbc:postgresql://prod:56982/
    username: pg_user
    password: pg_password
    sql: |
      insert into xref values( ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ? )
Fetch rows from a table, and bulk insert them into another one without writing a SQL query.
id: postgres_bulk_insert
namespace: company.team

tasks:
  - id: query
    type: io.kestra.plugin.jdbc.postgresql.Query
    url: jdbc:postgresql://dev:56982/
    username: pg_user
    password: pg_password
    sql: |
      SELECT *
      FROM xref
      LIMIT 1500;
    fetchType: STORE

  - id: update
    type: io.kestra.plugin.jdbc.postgresql.Batch
    from: "{{ outputs.query.uri }}"
    url: jdbc:postgresql://prod:56982/
    username: pg_user
    password: pg_password
    table: xref
The query must contain as many question marks as there are columns in the table. Example: 'insert into <table_name> values( ? , ? , ? )' for 3 columns. If you do not want to insert into all columns, specify the target columns in the query and in the columns property. Example: 'insert into <table_name> (id, name) values( ? , ? )' to insert data into only the 'id' and 'name' columns.
Default value is : false
Default value is : 1000
If not provided, the number of question marks in the query must match the number of columns in the from data.
Default value is : false
Default value is : false
Default value is : false
Must be a PEM encoded certificate
Must be a PEM encoded key
Must be a PEM encoded certificate
This property specifies the table name used to retrieve the columns for the inserted values.
You can use it instead of manually listing the columns in the columns property. In that case, the sql property can also be omitted; an INSERT statement will be generated automatically.
1 nested properties
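As a sketch of the columns and table properties described above (the connection details and the xref table are placeholders carried over from the examples), a Batch task that inserts into only two columns could look like:

```yaml
id: postgres_batch_columns
namespace: company.team

tasks:
  - id: insert_partial
    type: io.kestra.plugin.jdbc.postgresql.Batch
    # Rows previously stored by a Query task with fetchType: STORE
    from: "{{ outputs.query.uri }}"
    url: jdbc:postgresql://prod:56982/
    username: pg_user
    password: pg_password
    # Two '?' placeholders, matching the two columns listed below
    sql: insert into xref (id, name) values( ? , ? )
    columns:
      - id
      - name
```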
Copy CSV, text, or binary data into a PostgreSQL table.
Examples
Load CSV data into a PostgreSQL table.
id: postgres_copy_in
namespace: company.team

tasks:
  - id: copy_in
    type: io.kestra.plugin.jdbc.postgresql.CopyIn
    url: jdbc:postgresql://127.0.0.1:56982/
    username: pg_user
    password: pg_password
    format: CSV
    from: "{{ outputs.export.uri }}"
    table: my_destination_table
    header: true
    delimiter: "\t"
Default value is : false
If no column list is specified, all columns of the table will be copied.
The default is a tab character in text format, a comma in CSV format. This must be a single one-byte character. This option is not allowed when using binary.
Default value is : false
If this option is omitted, the current client encoding is used. See the Notes below for more details.
The default is the same as the QUOTE value (so that the quoting character is doubled if it appears in the data). This must be a single one-byte character. This option is allowed only when using CSV format.
In the default case where the null string is empty, this means that empty values will be read as zero-length strings rather than nulls, even when they are not quoted. This option is allowed only in COPY FROM, and only when using CSV format.
In the default case where the null string is empty, this converts a quoted empty string into NULL. This option is allowed only in COPY FROM, and only when using CSV format.
NULL output is never quoted. If * is specified, non-NULL values will be quoted in all columns. This option is allowed only in COPY TO, and only when using CSV format.
Default value is : TEXT
This is intended as a performance option for initial data loading. Rows will be frozen only if the table being loaded has been created or truncated in the current sub-transaction, there are no cursors open and there are no older snapshots held by this transaction. It is currently not possible to perform a COPY FREEZE on a partitioned table.
Note that all other sessions will immediately be able to see the data once it has been successfully loaded. This violates the normal rules of MVCC visibility, and users specifying this option should be aware of the potential problems it might cause.
On output, the first line contains the column names from the table, and on input, the first line is ignored. This option is allowed only when using CSV.
Default value is : false
The default is \N (backslash-N) in text format, and an unquoted empty string in CSV format. You might prefer an empty string even in text format for cases where you don't want to distinguish nulls from empty strings. This option is not allowed when using binary format.
An error is raised if OIDs is specified for a table that does not have OIDs, or in the case of copying a query.
The default is double-quote. This must be a single one-byte character. This option is allowed only when using CSV format.
Default value is : false
Must be a PEM encoded certificate
Must be a PEM encoded key
Must be a PEM encoded certificate
1 nested properties
Examples
Export a PostgreSQL table or query to a CSV or TSV file.
id: postgres_copy_out
namespace: company.team

tasks:
  - id: copy_out
    type: io.kestra.plugin.jdbc.postgresql.CopyOut
    url: jdbc:postgresql://127.0.0.1:56982/
    username: pg_user
    password: pg_password
    format: CSV
    sql: SELECT 1 AS int, 't'::bool AS bool UNION SELECT 2 AS int, 'f'::bool AS bool
    header: true
    delimiter: "\t"
Default value is : false
If no column list is specified, all columns of the table will be copied.
The default is a tab character in text format, a comma in CSV format. This must be a single one-byte character. This option is not allowed when using binary.
Default value is : false
If this option is omitted, the current client encoding is used. See the Notes below for more details.
The default is the same as the QUOTE value (so that the quoting character is doubled if it appears in the data). This must be a single one-byte character. This option is allowed only when using CSV format.
In the default case where the null string is empty, this means that empty values will be read as zero-length strings rather than nulls, even when they are not quoted. This option is allowed only in COPY FROM, and only when using CSV format.
In the default case where the null string is empty, this converts a quoted empty string into NULL. This option is allowed only in COPY FROM, and only when using CSV format.
NULL output is never quoted. If * is specified, non-NULL values will be quoted in all columns. This option is allowed only in COPY TO, and only when using CSV format.
Default value is : TEXT
This is intended as a performance option for initial data loading. Rows will be frozen only if the table being loaded has been created or truncated in the current sub-transaction, there are no cursors open and there are no older snapshots held by this transaction. It is currently not possible to perform a COPY FREEZE on a partitioned table.
Note that all other sessions will immediately be able to see the data once it has been successfully loaded. This violates the normal rules of MVCC visibility, and users specifying this option should be aware of the potential problems it might cause.
On output, the first line contains the column names from the table, and on input, the first line is ignored. This option is allowed only when using CSV.
Default value is : false
The default is \N (backslash-N) in text format, and an unquoted empty string in CSV format. You might prefer an empty string even in text format for cases where you don't want to distinguish nulls from empty strings. This option is not allowed when using binary format.
An error is raised if OIDs is specified for a table that does not have OIDs, or in the case of copying a query.
The default is double-quote. This must be a single one-byte character. This option is allowed only when using CSV format.
For INSERT, UPDATE and DELETE queries a RETURNING clause must be provided, and the target relation must not have a conditional rule, nor an ALSO rule, nor an INSTEAD rule that expands to multiple statements.
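A minimal sketch of the note above, reusing the placeholder connection details from the other CopyOut examples (audit_log is a hypothetical table): a data-modifying statement can feed COPY TO as long as it carries a RETURNING clause.

```yaml
id: postgres_copy_out_returning
namespace: company.team

tasks:
  - id: copy_out
    type: io.kestra.plugin.jdbc.postgresql.CopyOut
    url: jdbc:postgresql://127.0.0.1:56982/
    username: pg_user
    password: pg_password
    format: CSV
    header: true
    # Without RETURNING, the INSERT would produce no rows for COPY to export
    sql: INSERT INTO audit_log (event) VALUES ('export') RETURNING id, event
```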
Default value is : false
Must be a PEM encoded certificate
Must be a PEM encoded key
Must be a PEM encoded certificate
1 nested properties
Examples
Execute a query and fetch results in a task, and update another table with fetched results in a different task.
id: postgres_query
namespace: company.team

tasks:
  - id: fetch
    type: io.kestra.plugin.jdbc.postgresql.Query
    url: jdbc:postgresql://127.0.0.1:56982/
    username: pg_user
    password: pg_password
    sql: select concert_id, available, a, b, c, d, play_time, library_record, floatn_test, double_test, real_test, numeric_test, date_type, time_type, timez_type, timestamp_type, timestampz_type, interval_type, pay_by_quarter, schedule, json_type, blob_type from pgsql_types
    fetchType: FETCH

  - id: use_fetched_data
    type: io.kestra.plugin.jdbc.postgresql.Query
    url: jdbc:postgresql://127.0.0.1:56982/
    username: pg_user
    password: pg_password
    sql: "{% for row in outputs.fetch.rows %} INSERT INTO pl_store_distribute (year_month,store_code, update_date) values ({{row.play_time}}, {{row.concert_id}}, TO_TIMESTAMP('{{row.timestamp_type}}', 'YYYY-MM-DDTHH:MI:SS.US') ); {% endfor %}"
Default value is : false
Sets this connection's auto-commit mode to the given state. If a connection is in auto-commit mode, all of its SQL statements are executed and committed as individual transactions. Otherwise, its SQL statements are grouped into transactions that are terminated by a call to either the commit or the rollback method. By default, new connections are in auto-commit mode, except when you use the store property, in which case auto-commit is disabled.
Default value is : true
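To illustrate the auto-commit behavior described above, a sketch (the accounts table and the connection details are placeholders) in which two statements commit or roll back together:

```yaml
id: postgres_transactional
namespace: company.team

tasks:
  - id: transfer
    type: io.kestra.plugin.jdbc.postgresql.Query
    url: jdbc:postgresql://127.0.0.1:56982/
    username: pg_user
    password: pg_password
    # With auto-commit disabled, both updates are applied as one transaction
    autoCommit: false
    sql: |
      UPDATE accounts SET balance = balance - 100 WHERE id = 1;
      UPDATE accounts SET balance = balance + 100 WHERE id = 2;
```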
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as an output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
Default value is : false
Default value is : false
Must be a PEM encoded certificate
Must be a PEM encoded key
Must be a PEM encoded certificate
Default value is : false
1 nested properties
Examples
Wait for a SQL query to return results, and then iterate through rows.
id: jdbc_trigger
namespace: company.team

tasks:
  - id: each
    type: io.kestra.plugin.core.flow.EachSequential
    tasks:
      - id: return
        type: io.kestra.plugin.core.debug.Return
        format: "{{ json(taskrun.value) }}"
    value: "{{ trigger.rows }}"

triggers:
  - id: watch
    type: io.kestra.plugin.jdbc.postgresql.Trigger
    interval: "PT5M"
    url: jdbc:postgresql://127.0.0.1:56982/
    username: pg_user
    password: pg_password
    sql: "SELECT * FROM my_table"
    fetchType: FETCH
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as an output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
The interval between two consecutive polls of the schedule; polling less often helps avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the minimal interval should be at least PT30S. See ISO 8601 durations for more information on available interval values.
Default value is : 60.000000000
Default value is : false
Default value is : false
Must be a PEM encoded certificate
Must be a PEM encoded key
Must be a PEM encoded certificate
Default value is : false
1 nested properties
Examples
Send a SQL query to a Redshift database and fetch a row as output.
id: redshift_query
namespace: company.team

tasks:
  - id: select
    type: io.kestra.plugin.jdbc.redshift.Query
    url: jdbc:redshift://123456789.eu-central-1.redshift-serverless.amazonaws.com:5439/dev
    username: admin
    password: admin_password
    sql: select * from redshift_types
    fetchType: FETCH_ONE
Default value is : false
Sets this connection's auto-commit mode to the given state. If a connection is in auto-commit mode, all of its SQL statements are executed and committed as individual transactions. Otherwise, its SQL statements are grouped into transactions that are terminated by a call to either the commit or the rollback method. By default, new connections are in auto-commit mode, except when you use the store property, in which case auto-commit is disabled.
Default value is : true
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as an output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
Default value is : false
Default value is : false
1 nested properties
Examples
Wait for a SQL query to return results, and then iterate through rows.
id: jdbc_trigger
namespace: company.team

tasks:
  - id: each
    type: io.kestra.plugin.core.flow.EachSequential
    tasks:
      - id: return
        type: io.kestra.plugin.core.debug.Return
        format: "{{ json(taskrun.value) }}"
    value: "{{ trigger.rows }}"

triggers:
  - id: watch
    type: io.kestra.plugin.jdbc.redshift.Trigger
    interval: "PT5M"
    url: jdbc:redshift://123456789.eu-central-1.redshift-serverless.amazonaws.com:5439/dev
    username: admin
    password: admin_password
    sql: "SELECT * FROM my_table"
    fetchType: FETCH
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as an output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
The interval between two consecutive polls of the schedule; polling less often helps avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the minimal interval should be at least PT30S. See ISO 8601 durations for more information on available interval values.
Default value is : 60.000000000
Default value is : false
Default value is : false
1 nested properties
Examples
id: snowflake_download
namespace: company.team

tasks:
  - id: download
    type: io.kestra.plugin.jdbc.snowflake.Download
    url: jdbc:snowflake://<account_identifier>.snowflakecomputing.com
    username: snowflake_user
    password: snowflake_password
    stageName: "@demo_db.public.%myStage"
    fileName: prefix/destFile.csv
~ or table name or stage name.
Default value is : false
Default value is : true
The specified database should be an existing database for which the specified default role has privileges.
If you need to use a different database after connecting, execute the USE DATABASE command.
Default value is : false
Default value is : false
It needs to be an un-encoded private key in plaintext.
It needs to be the path on the host where the private key file is located.
The specified role should be an existing role that has already been assigned to the specified user for the driver. If the specified role has not already been assigned to the user, the role is not used when the session is initiated by the driver.
If you need to use a different role after connecting, execute the USE ROLE command.
The specified schema should be an existing schema for which the specified default role has privileges.
If you need to use a different schema after connecting, execute the USE SCHEMA command.
The specified warehouse should be an existing warehouse for which the specified default role has privileges.
If you need to use a different warehouse after connecting, execute the USE WAREHOUSE command to set a different warehouse for the session.
1 nested properties
Examples
Execute a query and fetch results in a task, and update another table with fetched results in a different task.
id: snowflake_query
namespace: company.team

tasks:
  - id: select
    type: io.kestra.plugin.jdbc.snowflake.Query
    url: jdbc:snowflake://<account_identifier>.snowflakecomputing.com
    username: snowflake_user
    password: snowflake_password
    sql: select * from demo_db.public.customers
    fetchType: FETCH

  - id: generate_update
    type: io.kestra.plugin.jdbc.snowflake.Query
    url: jdbc:snowflake://<account_identifier>.snowflakecomputing.com
    username: snowflake_user
    password: snowflake_password
    sql: "INSERT INTO demo_db.public.customers_new (year_month, store_code, update_date) values {% for row in outputs.select.rows %} ({{ row.year_month }}, {{ row.store_code }}, TO_DATE('{{ row.date }}', 'MONTH DD, YYYY') ) {% if not loop.last %}, {% endif %}; {% endfor %}"
Default value is : false
Sets this connection's auto-commit mode to the given state. If a connection is in auto-commit mode, all of its SQL statements are executed and committed as individual transactions. Otherwise, its SQL statements are grouped into transactions that are terminated by a call to either the commit or the rollback method. By default, new connections are in auto-commit mode, except when you use the store property, in which case auto-commit is disabled.
Default value is : true
The specified database should be an existing database for which the specified default role has privileges.
If you need to use a different database after connecting, execute the USE DATABASE command.
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as an output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
Default value is : false
It needs to be an un-encoded private key in plaintext.
It needs to be the path on the host where the private key file is located.
The specified role should be an existing role that has already been assigned to the specified user for the driver. If the specified role has not already been assigned to the user, the role is not used when the session is initiated by the driver.
If you need to use a different role after connecting, execute the USE ROLE command.
The specified schema should be an existing schema for which the specified default role has privileges.
If you need to use a different schema after connecting, execute the USE SCHEMA command.
Default value is : false
The specified warehouse should be an existing warehouse for which the specified default role has privileges.
If you need to use a different warehouse after connecting, execute the USE WAREHOUSE command to set a different warehouse for the session.
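The session-context notes above (database, schema, role, warehouse) can be combined in one task. A sketch assuming those property names, with placeholder account, credentials, and object names:

```yaml
id: snowflake_session_context
namespace: company.team

tasks:
  - id: query
    type: io.kestra.plugin.jdbc.snowflake.Query
    url: jdbc:snowflake://<account_identifier>.snowflakecomputing.com
    username: snowflake_user
    password: snowflake_password
    database: demo_db     # must exist and be accessible to the role
    schema: public        # equivalent of USE SCHEMA at connection time
    role: analyst         # must already be granted to the user
    warehouse: compute_wh # equivalent of USE WAREHOUSE at connection time
    sql: select current_warehouse(), current_database(), current_schema()
    fetchType: FETCH_ONE
```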
1 nested properties
Examples
Wait for a SQL query to return results, and then iterate through rows.
id: jdbc_trigger
namespace: company.team

tasks:
  - id: each
    type: io.kestra.plugin.core.flow.EachSequential
    tasks:
      - id: return
        type: io.kestra.plugin.core.debug.Return
        format: "{{ json(taskrun.value) }}"
    value: "{{ trigger.rows }}"

triggers:
  - id: watch
    type: io.kestra.plugin.jdbc.snowflake.Trigger
    interval: "PT5M"
    url: jdbc:snowflake://<account_identifier>.snowflakecomputing.com
    username: snowflake_user
    password: snowflake_password
    sql: "SELECT * FROM demo_db.public.customers"
    fetchType: FETCH
The specified database should be an existing database for which the specified default role has privileges.
If you need to use a different database after connecting, execute the USE DATABASE command.
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as an output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
The interval between two consecutive polls of the schedule; polling less often helps avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the minimal interval should be at least PT30S. See ISO 8601 durations for more information on available interval values.
Default value is : 60.000000000
Default value is : false
It needs to be an un-encoded private key in plaintext.
It needs to be the path on the host where the private key file is located.
The specified role should be an existing role that has already been assigned to the specified user for the driver. If the specified role has not already been assigned to the user, the role is not used when the session is initiated by the driver.
If you need to use a different role after connecting, execute the USE ROLE command.
The specified schema should be an existing schema for which the specified default role has privileges.
If you need to use a different schema after connecting, execute the USE SCHEMA command.
Default value is : false
The specified warehouse should be an existing warehouse for which the specified default role has privileges.
If you need to use a different warehouse after connecting, execute the USE WAREHOUSE command to set a different warehouse for the session.
1 nested properties
Examples
id: snowflake_upload
namespace: company.team

tasks:
  - id: upload
    type: io.kestra.plugin.jdbc.snowflake.Upload
    url: jdbc:snowflake://<account_identifier>.snowflakecomputing.com
    username: snowflake_user
    password: snowflake_password
    from: '{{ outputs.extract.uri }}'
    fileName: data.csv
    prefix: raw
    stageName: "@demo_db.public.%myStage"
This can either be a stage name or a table name.
Default value is : false
Default value is : true
The specified database should be an existing database for which the specified default role has privileges.
If you need to use a different database after connecting, execute the USE DATABASE command.
Default value is : false
Default value is : false
It needs to be an un-encoded private key in plaintext.
It needs to be the path on the host where the private key file is located.
The specified role should be an existing role that has already been assigned to the specified user for the driver. If the specified role has not already been assigned to the user, the role is not used when the session is initiated by the driver.
If you need to use a different role after connecting, execute the USE ROLE command.
The specified schema should be an existing schema for which the specified default role has privileges.
If you need to use a different schema after connecting, execute the USE SCHEMA command.
The specified warehouse should be an existing warehouse for which the specified default role has privileges.
If you need to use a different warehouse after connecting, execute the USE WAREHOUSE command to set a different warehouse for the session.
1 nested properties
Examples
Execute a query and pass the results to another task.
id: sqlite_query
namespace: company.team

tasks:
  - id: update
    type: io.kestra.plugin.jdbc.sqlite.Query
    url: jdbc:sqlite:myfile.db
    sql: select concert_id, available, a, b, c, d, play_time, library_record, floatn_test, double_test, real_test, numeric_test, date_type, time_type, timez_type, timestamp_type, timestampz_type, interval_type, pay_by_quarter, schedule, json_type, blob_type from pgsql_types
    fetchType: FETCH

  - id: use_fetched_data
    type: io.kestra.plugin.jdbc.sqlite.Query
    url: jdbc:sqlite:myfile.db
    sql: "{% for row in outputs.update.rows %} INSERT INTO pl_store_distribute (year_month,store_code, update_date) values ({{row.play_time}}, {{row.concert_id}}, TO_TIMESTAMP('{{row.timestamp_type}}', 'YYYY-MM-DDTHH:MI:SS.US') ); {% endfor %}"
Execute a query using an existing SQLite file, and pass the results to another task.
id: sqlite_query_using_file
namespace: company.team

tasks:
  - id: update
    type: io.kestra.plugin.jdbc.sqlite.Query
    url: jdbc:sqlite:myfile.db
    sqliteFile: "{{ outputs.get.outputFiles['myfile.sqlite'] }}"
    sql: select * from pgsql_types
    fetchType: FETCH

  - id: use_fetched_data
    type: io.kestra.plugin.jdbc.sqlite.Query
    url: jdbc:sqlite:myfile.db
    sqliteFile: "{{ outputs.get.outputFiles['myfile.sqlite'] }}"
    sql: "{% for row in outputs.update.rows %} INSERT INTO pl_store_distribute (year_month,store_code, update_date) values ({{row.play_time}}, {{row.concert_id}}, TO_TIMESTAMP('{{row.timestamp_type}}', 'YYYY-MM-DDTHH:MI:SS.US') ); {% endfor %}"
Default value is : false
Sets this connection's auto-commit mode to the given state. If a connection is in auto-commit mode, all of its SQL statements are executed and committed as individual transactions. Otherwise, its SQL statements are grouped into transactions that are terminated by a call to either the commit or the rollback method. By default, new connections are in auto-commit mode, except when you use the store property, in which case auto-commit is disabled.
Default value is : true
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as an output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
Default value is : false
The file must be from Kestra's internal storage
Default value is : false
1 nested properties
Examples
Wait for a SQL query to return results, and then iterate through rows.
id: jdbc_trigger
namespace: company.team

tasks:
  - id: each
    type: io.kestra.plugin.core.flow.EachSequential
    tasks:
      - id: return
        type: io.kestra.plugin.core.debug.Return
        format: "{{ json(taskrun.value) }}"
    value: "{{ trigger.rows }}"

triggers:
  - id: watch
    type: io.kestra.plugin.jdbc.sqlite.Trigger
    interval: "PT5M"
    url: jdbc:sqlite:myfile.db
    sql: "SELECT * FROM my_table"
    fetchType: FETCH
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as an output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
The interval between two consecutive polls of the schedule; polling less often helps avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the minimal interval should be at least PT30S. See ISO 8601 durations for more information on available interval values.
Default value is : 60.000000000
Default value is : false
Default value is : false
1 nested properties
Examples
Fetch rows from a table and bulk insert to another one.
id: sqlserver_batch_query
namespace: company.team

tasks:
  - id: query
    type: io.kestra.plugin.jdbc.sqlserver.Query
    url: jdbc:sqlserver://dev:41433;trustServerCertificate=true
    username: sql_server_user
    password: sql_server_password
    sql: |
      SELECT TOP 1500 *
      FROM xref;
    fetchType: STORE

  - id: update
    type: io.kestra.plugin.jdbc.sqlserver.Batch
    from: "{{ outputs.query.uri }}"
    url: jdbc:sqlserver://prod:41433;trustServerCertificate=true
    username: sql_server_user
    password: sql_server_password
    sql: |
      insert into xref values( ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ? )
Fetch rows from a table and bulk insert them into another one without writing a SQL query.
id: sqlserver_batch_query
namespace: company.team

tasks:
  - id: query
    type: io.kestra.plugin.jdbc.sqlserver.Query
    url: jdbc:sqlserver://dev:41433;trustServerCertificate=true
    username: sql_server_user
    password: sql_server_passwd
    sql: |
      SELECT TOP 1500 *
      FROM xref;
    fetchType: STORE

  - id: update
    type: io.kestra.plugin.jdbc.sqlserver.Batch
    from: "{{ outputs.query.uri }}"
    url: jdbc:sqlserver://prod:41433;trustServerCertificate=true
    username: sql_server_user
    password: sql_server_passwd
    table: xref
The query must contain as many question marks as there are columns in the table. Example: 'insert into <table_name> values( ? , ? , ? )' for 3 columns. If you do not want to insert into all columns, specify the target columns in the query and in the columns property. Example: 'insert into <table_name> (id, name) values( ? , ? )' to insert data into only the 'id' and 'name' columns.
Default value is : false
Default value is : 1000
If not provided, the number of question marks in the query must match the number of columns in the from data.
Default value is : false
Default value is : false
This property specifies the table name used to retrieve the columns for the inserted values.
You can use it instead of manually listing the columns in the columns property. In that case, the sql property can also be omitted; an INSERT statement will be generated automatically.
1 nested properties
Examples
Execute a query and fetch results in a task, and update another table with fetched results in a different task.
id: sqlserver_query
namespace: company.team

tasks:
  - id: select
    type: io.kestra.plugin.jdbc.sqlserver.Query
    url: jdbc:sqlserver://localhost:41433;trustServerCertificate=true
    username: sql_server_user
    password: sql_server_password
    sql: select * from source
    fetchType: FETCH

  - id: generate_update
    type: io.kestra.plugin.jdbc.sqlserver.Query
    url: jdbc:sqlserver://localhost:41433;trustServerCertificate=true
    username: sql_server_user
    password: sql_server_password
    sql: "{% for row in outputs.select.rows %} INSERT INTO destination (year_month, store_code, update_date) values ({{row.year_month}}, {{row.store_code}}, '{{row.date}}'); {% endfor %}"
Default value is : false
Sets this connection's auto-commit mode to the given state. If a connection is in auto-commit mode, each of its SQL statements is executed and committed as an individual transaction. Otherwise, its SQL statements are grouped into transactions that are terminated by a call to either commit or rollback. By default, new connections are in auto-commit mode, except when you use the STORE fetch type, in which case auto-commit is disabled.
Default value is : true
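A minimal sketch of disabling auto-commit so that several statements commit (or roll back) together; the table and column names are hypothetical:

```yaml
# Sketch: with autoCommit: false, both updates are committed as one transaction.
- id: multi_statement
  type: io.kestra.plugin.jdbc.sqlserver.Query
  url: jdbc:sqlserver://localhost:41433;trustServerCertificate=true
  username: sql_server_user
  password: sql_server_password
  autoCommit: false
  sql: |
    UPDATE xref SET flag = 1 WHERE id = 10;
    UPDATE xref SET flag = 0 WHERE id = 11;
```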
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
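The choice of fetchType determines which output variable downstream tasks can read, as the examples in this document suggest (row for FETCH_ONE, rows for FETCH, uri for STORE). A minimal sketch:

```yaml
# Sketch: FETCH_ONE exposes the first row as {{ outputs.<task_id>.row }};
# FETCH would expose {{ outputs.<task_id>.rows }}, STORE a file via {{ outputs.<task_id>.uri }}.
- id: single_row
  type: io.kestra.plugin.jdbc.sqlserver.Query
  url: jdbc:sqlserver://localhost:41433;trustServerCertificate=true
  username: sql_server_user
  password: sql_server_password
  sql: SELECT TOP 1 * FROM xref
  fetchType: FETCH_ONE
- id: log_row
  type: io.kestra.plugin.core.log.Log
  message: "{{ outputs.single_row.row }}"
```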
Default value is : false
Default value is : false
1 nested properties
Examples
Wait for a SQL query to return results, and then iterate through rows.
id: jdbc_trigger
namespace: company.team
tasks:
- id: each
type: io.kestra.plugin.core.flow.EachSequential
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ json(taskrun.value) }}"
value: "{{ trigger.rows }}"
triggers:
- id: watch
type: io.kestra.plugin.jdbc.sqlserver.Trigger
interval: "PT5M"
url: jdbc:sqlserver://localhost:41433;trustServerCertificate=true
username: sql_server_user
password: sql_server_password
sql: "SELECT * FROM my_table"
fetchType: FETCH
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
The interval between two consecutive polls of the schedule; this avoids overloading the remote system with too many calls. For most triggers that depend on external systems, the interval should be at least PT30S. See ISO 8601 Durations for more information on available interval values.
Default value is : 60.000000000
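Interval values use ISO 8601 duration syntax. A sketch of a trigger like the one in the example below, with a few common values shown as comments:

```yaml
# Common ISO 8601 durations: PT30S = 30 seconds (a sensible minimum for
# external systems), PT5M = 5 minutes, PT1H = 1 hour, P1D = 1 day.
triggers:
  - id: watch
    type: io.kestra.plugin.jdbc.sqlserver.Trigger
    interval: "PT30S"
    url: jdbc:sqlserver://localhost:41433;trustServerCertificate=true
    username: sql_server_user
    password: sql_server_password
    sql: "SELECT * FROM my_table"
    fetchType: FETCH
```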
Default value is : false
Default value is : false
1 nested properties
Examples
Send a SQL query to a Sybase Database and fetch a row as output.
id: sybase_query
namespace: company.team
tasks:
- id: query
type: io.kestra.plugin.jdbc.sybase.Query
url: jdbc:sybase:Tds:127.0.0.1:5000/
username: syb_user
password: syb_password
sql: select * from syb_types
fetchType: FETCH_ONE
Default value is : false
Sets this connection's auto-commit mode to the given state. If a connection is in auto-commit mode, each of its SQL statements is executed and committed as an individual transaction. Otherwise, its SQL statements are grouped into transactions that are terminated by a call to either commit or rollback. By default, new connections are in auto-commit mode, except when you use the STORE fetch type, in which case auto-commit is disabled.
Default value is : true
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
Default value is : false
Default value is : false
1 nested properties
Examples
Wait for a SQL query to return results, and then iterate through rows.
id: jdbc_trigger
namespace: company.team
tasks:
- id: each
type: io.kestra.plugin.core.flow.EachSequential
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ json(taskrun.value) }}"
value: "{{ trigger.rows }}"
triggers:
- id: watch
type: io.kestra.plugin.jdbc.sybase.Trigger
interval: "PT5M"
url: jdbc:sybase:Tds:127.0.0.1:5000/
username: syb_user
password: syb_password
sql: "SELECT * FROM my_table"
fetchType: FETCH
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
The interval between two consecutive polls of the schedule; this avoids overloading the remote system with too many calls. For most triggers that depend on external systems, the interval should be at least PT30S. See ISO 8601 Durations for more information on available interval values.
Default value is : 60.000000000
Default value is : false
Default value is : false
1 nested properties
Examples
Execute a query and fetch results to pass them to downstream tasks.
id: trino_query
namespace: company.team
tasks:
- id: analyze_orders
type: io.kestra.plugin.jdbc.trino.Query
url: jdbc:trino://localhost:8080/tpch
username: trino_user
password: trino_password
sql: |
select orderpriority as priority, sum(totalprice) as total
from tpch.tiny.orders
group by orderpriority
order by orderpriority
fetchType: STORE
- id: csv_report
type: io.kestra.plugin.serdes.csv.IonToCsv
from: "{{ outputs.analyze_orders.uri }}"
Default value is : false
Sets this connection's auto-commit mode to the given state. If a connection is in auto-commit mode, each of its SQL statements is executed and committed as an individual transaction. Otherwise, its SQL statements are grouped into transactions that are terminated by a call to either commit or rollback. By default, new connections are in auto-commit mode, except when you use the STORE fetch type, in which case auto-commit is disabled.
Default value is : true
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
Default value is : false
Default value is : false
1 nested properties
Examples
Wait for a SQL query to return results, and then iterate through rows.
id: jdbc_trigger
namespace: company.team
tasks:
- id: each
type: io.kestra.plugin.core.flow.EachSequential
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ json(taskrun.value) }}"
value: "{{ trigger.rows }}"
triggers:
- id: watch
type: io.kestra.plugin.jdbc.trino.Trigger
interval: "PT5M"
url: jdbc:trino://localhost:8080/tpch
username: trino_user
password: trino_password
sql: "SELECT * FROM my_table"
fetchType: FETCH
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
The interval between two consecutive polls of the schedule; this avoids overloading the remote system with too many calls. For most triggers that depend on external systems, the interval should be at least PT30S. See ISO 8601 Durations for more information on available interval values.
Default value is : 60.000000000
Default value is : false
Default value is : false
1 nested properties
Examples
Fetch rows from a table and bulk-insert them into another one.
id: vectorwise_batch_query
namespace: company.team
tasks:
- id: query
type: io.kestra.plugin.jdbc.vectorwise.Query
url: jdbc:vectorwise://dev:port/base
username: admin
password: admin_password
sql: |
SELECT *
FROM xref
LIMIT 1500;
fetchType: STORE
- id: update
type: io.kestra.plugin.jdbc.vectorwise.Batch
from: "{{ outputs.query.uri }}"
url: jdbc:vectorwise://prod:port/base
username: admin
password: admin_password
sql: insert into xref values( ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ? )
Fetch rows from a table and bulk-insert them into another one without using a SQL query.
id: vectorwise_batch_query
namespace: company.team
tasks:
- id: query
type: io.kestra.plugin.jdbc.vectorwise.Query
url: jdbc:vectorwise://dev:port/base
username: admin
password: admin_passwd
sql: |
SELECT *
FROM xref
LIMIT 1500;
fetchType: STORE
- id: update
type: io.kestra.plugin.jdbc.vectorwise.Batch
from: "{{ outputs.query.uri }}"
url: jdbc:vectorwise://prod:port/base
username: admin
password: admin_passwd
table: xref
The query must contain as many question marks as there are columns in the table. Example: 'insert into <table_name> values( ? , ? , ? )' for 3 columns. If you do not want to insert all columns, you must specify them in the query and in the columns property. Example: 'insert into <table_name> (id, name) values( ? , ? )' to insert data into the 2 columns 'id' and 'name'.
Default value is : false
Default value is : 1000
If not provided, the number of ? placeholders must match the number of columns in the from data.
Default value is : false
Default value is : false
This property specifies the table name that will be used to retrieve the columns for the inserted values.
You can use it instead of manually specifying the columns in the columns property. In that case, the sql property can also be omitted; an INSERT statement will be generated automatically.
1 nested properties
Examples
Send a SQL query to a Vectorwise database and fetch a row as output.
id: vectorwise_query
namespace: company.team
tasks:
- id: query
type: io.kestra.plugin.jdbc.vectorwise.Query
url: jdbc:vectorwise://url:port/base
username: admin
password: admin_password
sql: select * from vectorwise_types
fetchType: FETCH_ONE
Default value is : false
Sets this connection's auto-commit mode to the given state. If a connection is in auto-commit mode, each of its SQL statements is executed and committed as an individual transaction. Otherwise, its SQL statements are grouped into transactions that are terminated by a call to either commit or rollback. By default, new connections are in auto-commit mode, except when you use the STORE fetch type, in which case auto-commit is disabled.
Default value is : true
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
Default value is : false
Default value is : false
1 nested properties
Examples
Wait for a SQL query to return results, and then iterate through rows.
id: jdbc_trigger
namespace: company.team
tasks:
- id: each
type: io.kestra.plugin.core.flow.EachSequential
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ json(taskrun.value) }}"
value: "{{ trigger.rows }}"
triggers:
- id: watch
type: io.kestra.plugin.jdbc.vectorwise.Trigger
interval: "PT5M"
url: jdbc:vectorwise://url:port/base
username: admin
password: admin_password
sql: "SELECT * FROM my_table"
fetchType: FETCH
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
The interval between two consecutive polls of the schedule; this avoids overloading the remote system with too many calls. For most triggers that depend on external systems, the interval should be at least PT30S. See ISO 8601 Durations for more information on available interval values.
Default value is : 60.000000000
Default value is : false
Default value is : false
1 nested properties
Examples
Fetch rows from a table and bulk-insert them into another one.
id: vertica_batch_query
namespace: company.team
tasks:
- id: query
type: io.kestra.plugin.jdbc.vertica.Query
url: jdbc:vertica://dev:56982/db
username: vertica_user
password: vertica_password
sql: |
SELECT *
FROM xref
LIMIT 1500;
fetchType: STORE
- id: update
type: io.kestra.plugin.jdbc.vertica.Batch
from: "{{ outputs.query.uri }}"
url: jdbc:vertica://prod:56982/db
username: vertica_user
password: vertica_password
sql: insert into xref values( ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ? )
Fetch rows from a table and bulk-insert them into another one, without using a SQL query.
id: vertica_batch_query
namespace: company.team
tasks:
- id: query
type: io.kestra.plugin.jdbc.vertica.Query
url: jdbc:vertica://dev:56982/db
username: vertica_user
password: vertica_passwd
sql: |
SELECT *
FROM xref
LIMIT 1500;
fetchType: STORE
- id: update
type: io.kestra.plugin.jdbc.vertica.Batch
from: "{{ outputs.query.uri }}"
url: jdbc:vertica://prod:56982/db
username: vertica_user
password: vertica_passwd
table: xref
The query must contain as many question marks as there are columns in the table. Example: 'insert into <table_name> values( ? , ? , ? )' for 3 columns. If you do not want to insert all columns, you must specify them in the query and in the columns property. Example: 'insert into <table_name> (id, name) values( ? , ? )' to insert data into the 2 columns 'id' and 'name'.
Default value is : false
Default value is : 1000
If not provided, the number of ? placeholders must match the number of columns in the from data.
Default value is : false
Default value is : false
This property specifies the table name that will be used to retrieve the columns for the inserted values.
You can use it instead of manually specifying the columns in the columns property. In that case, the sql property can also be omitted; an INSERT statement will be generated automatically.
1 nested properties
Examples
Send a SQL query to a Vertica database, and fetch a row as output.
id: vertica_query
namespace: company.team
tasks:
- id: query
type: io.kestra.plugin.jdbc.vertica.Query
url: jdbc:vertica://127.0.0.1:56982/db
username: vertica_user
password: vertica_password
sql: select * from customer
fetchType: FETCH_ONE
Default value is : false
Sets this connection's auto-commit mode to the given state. If a connection is in auto-commit mode, each of its SQL statements is executed and committed as an individual transaction. Otherwise, its SQL statements are grouped into transactions that are terminated by a call to either commit or rollback. By default, new connections are in auto-commit mode, except when you use the STORE fetch type, in which case auto-commit is disabled.
Default value is : true
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
Default value is : false
Default value is : false
1 nested properties
Examples
Wait for a SQL query to return results, and then iterate through rows.
id: jdbc_trigger
namespace: company.team
tasks:
- id: each
type: io.kestra.plugin.core.flow.EachSequential
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ json(taskrun.value) }}"
value: "{{ trigger.rows }}"
triggers:
- id: watch
type: io.kestra.plugin.jdbc.vertica.Trigger
interval: "PT5M"
url: jdbc:vertica://127.0.0.1:56982/db
username: vertica_user
password: vertica_password
sql: "SELECT * FROM my_table"
fetchType: FETCH
Default value is : false
Default value is : false
Default value is : false
Gives the JDBC driver a hint as to the number of rows that should be fetched from the database when more rows are needed for this ResultSet object. If the fetch size specified is zero, the JDBC driver ignores the value and is free to make its own best guess as to what the fetch size should be. Ignored if autoCommit is false.
Default value is : 10000
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : NONE
The interval between two consecutive polls of the schedule; this avoids overloading the remote system with too many calls. For most triggers that depend on external systems, the interval should be at least PT30S. See ISO 8601 Durations for more information on available interval values.
Default value is : 60.000000000
Default value is : false
Default value is : false
1 nested properties
Examples
Create a Jira ticket on a failed flow execution using basic authentication.
id: jira_flow
namespace: company.myteam
tasks:
- id: create_issue
type: io.kestra.plugin.jira.issues.Create
baseUrl: your-domain.atlassian.net
username: [email protected]
password: "{{ secret('your_jira_api_token') }}"
projectKey: myproject
summary: "Workflow failed"
description: "{{ execution.id }} has failed on {{ taskrun.startDate }}. See the link below for more details"
labels:
- bug
- workflow
Create a Jira ticket on a failed flow execution using OAuth2 access token authentication.
id: jira_flow
namespace: company.myteam
tasks:
- id: create_issue
type: io.kestra.plugin.jira.issues.Create
baseUrl: your-domain.atlassian.net
accessToken: "{{ secret('your_jira_access_token') }}"
projectKey: myproject
summary: "Workflow failed"
description: "{{ execution.id }} has failed on {{ taskrun.startDate }}. See the link below for more details"
labels:
- bug
- workflow
(Required for OAuth authorization)
Default value is : false
Default value is : false
Default value is : false
(Required for basic & API token authorization)
(Required for basic & API token authorization)
1 nested properties
Examples
Comment on a Jira ticket on a failed flow execution.
id: jira_flow
namespace: company.myteam
tasks:
- id: create_comment_on_a_ticket
type: io.kestra.plugin.jira.issues.CreateComment
baseUrl: your-domain.atlassian.net
username: [email protected]
password: "{{ secret('jira_api_token') }}"
projectKey: project_key
issueIdOrKey: "TID-53"
body: "This ticket is not moving, do we need to outsource this!"
(Required for OAuth authorization)
Default value is : false
Default value is : false
Default value is : false
(Required for basic & API token authorization)
(Required for basic & API token authorization)
1 nested properties
Update specific fields in a Jira ticket.
Examples
Update a Jira ticket fields
id: jira_update_field
namespace: company.myteam
tasks:
- id: update_ticket_field
type: io.kestra.plugin.jira.issues.UpdateFields
baseUrl: your-domain.atlassian.net
username: [email protected]
password: "{{ secret('your_jira_api_token') }}"
issueIdOrKey: YOUR_ISSUE_KEY
fields:
description: "Updated description of: {{ execution.id }}"
customfield_10005: "Updated value"
(Required for OAuth authorization)
Default value is : false
Default value is : false
Default value is : false
(Required for basic & API token authorization)
(Required for basic & API token authorization)
1 nested properties
Examples
id: kafka_consume
namespace: company.team
tasks:
- id: consume
type: io.kestra.plugin.kafka.Consume
topic: test_kestra
properties:
bootstrap.servers: localhost:9092
serdeProperties:
schema.registry.url: http://localhost:8085
keyDeserializer: STRING
valueDeserializer: AVRO
Connect to a Kafka cluster with SSL.
id: kafka_consume
namespace: company.team
tasks:
- id: consume
type: io.kestra.plugin.kafka.Consume
properties:
security.protocol: SSL
bootstrap.servers: localhost:19092
ssl.key.password: my-ssl-password
ssl.keystore.type: PKCS12
ssl.keystore.location: my-base64-encoded-keystore
ssl.keystore.password: my-ssl-password
ssl.truststore.location: my-base64-encoded-truststore
ssl.truststore.password: my-ssl-password
topic:
- kestra_workerinstance
keyDeserializer: STRING
valueDeserializer: STRING
The bootstrap.servers property is a minimal required configuration to connect to a Kafka topic.
This property can reference any valid Consumer Configs or Producer Configs
as key-value pairs.
If you want to pass a truststore or a keystore, you must provide a base64 encoded string for ssl.keystore.location and ssl.truststore.location.
Default value is : false
Default value is : false
Using a consumer group, we will fetch only records that haven't been consumed yet.
Possible values are: STRING, INTEGER, FLOAT, DOUBLE, LONG, SHORT, BYTE_ARRAY, BYTE_BUFFER, BYTES, UUID, VOID, AVRO, JSON.
Default value is : STRING
Default value is : false
It's a soft limit evaluated every second.
It's a soft limit evaluated every second.
Manually assign a list of partitions to the consumer.
If no records are available, the maximum wait duration to wait for new records.
Default value is : 5.000000000
Configuration that will be passed to the serializer or deserializer. The avro.use.logical.type.converters property is always passed when any of these values is set to true.
Default value is : {}
By default, all messages are consumed from the topics, either with no consumer group or according to the auto.offset.reset property. However, you can provide an arbitrary start time.
This property is ignored if a consumer group is used.
It must be a valid ISO 8601 date.
It can be a string or a list of strings to consume from one or multiple topics.
Consumer will subscribe to all topics matching the specified pattern to get dynamically assigned partitions.
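A minimal sketch of pattern-based subscription, assuming the property is named topicPattern; the topic names are hypothetical:

```yaml
- id: consume_by_pattern
  type: io.kestra.plugin.kafka.Consume
  properties:
    bootstrap.servers: localhost:9092
  topicPattern: "logs-.*"   # would match e.g. logs-app, logs-web (hypothetical topics)
  keyDeserializer: STRING
  valueDeserializer: STRING
  groupId: patternConsumerGroup
```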
Possible values are: STRING, INTEGER, FLOAT, DOUBLE, LONG, SHORT, BYTE_ARRAY, BYTE_BUFFER, BYTES, UUID, VOID, AVRO, JSON.
Default value is : STRING
1 nested properties
Examples
Read a CSV file, transform it and send it to Kafka.
id: send_message_to_kafka
namespace: company.team
inputs:
- id: file
type: FILE
description: "A CSV file with columns: id, username, tweet, and timestamp."
tasks:
- id: csv_to_ion
type: io.kestra.plugin.serdes.csv.CsvToIon
from: "{{ inputs.file }}"
- id: ion_to_avro_schema
type: io.kestra.plugin.scripts.nashorn.FileTransform
from: "{{ outputs.csv_to_ion.uri }}"
script: |
var result = {
"key": row.id,
"value": {
"username": row.username,
"tweet": row.tweet
},
"timestamp": row.timestamp,
"headers": {
"key": "value"
}
};
row = result
- id: avro_to_kafka
type: io.kestra.plugin.kafka.Produce
from: "{{ outputs.ion_to_avro_schema.uri }}"
keySerializer: STRING
properties:
bootstrap.servers: localhost:9092
serdeProperties:
schema.registry.url: http://localhost:8085
topic: test_kestra
valueAvroSchema: |
{"type":"record","name":"twitter_schema","namespace":"io.kestra.examples","fields":[{"name":"username","type":"string"},{"name":"tweet","type":"string"}]}
valueSerializer: AVRO
Can be a Kestra internal storage URI, a map (i.e. a list of key-value pairs) or a list of maps. The following keys are supported: key, value, partition, timestamp, and headers.
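A minimal sketch of the map form of from, using the supported keys listed above; the key, value, and header contents are hypothetical:

```yaml
# Sketch: produce a single record from an inline map instead of a storage URI.
- id: produce_inline
  type: io.kestra.plugin.kafka.Produce
  properties:
    bootstrap.servers: localhost:9092
  topic: test_kestra
  keySerializer: STRING
  valueSerializer: STRING
  from:
    key: "user-1"
    value: "hello"
    headers:
      source: "kestra"
```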
The bootstrap.servers property is a minimal required configuration to connect to a Kafka topic.
This property can reference any valid Consumer Configs or Producer Configs
as key-value pairs.
If you want to pass a truststore or a keystore, you must provide a base64 encoded string for ssl.keystore.location and ssl.truststore.location.
Default value is : false
Default value is : false
Possible values are: STRING, INTEGER, FLOAT, DOUBLE, LONG, SHORT, BYTE_ARRAY, BYTE_BUFFER, BYTES, UUID, VOID, AVRO, JSON.
Default value is : STRING
Default value is : false
Configuration that will be passed to the serializer or deserializer. The avro.use.logical.type.converters property is always passed when any of these values is set to true.
Default value is : {}
Could also be passed inside the from property using the key topic.
Default value is : true
Possible values are: STRING, INTEGER, FLOAT, DOUBLE, LONG, SHORT, BYTE_ARRAY, BYTE_BUFFER, BYTES, UUID, VOID, AVRO, JSON.
Default value is : STRING
1 nested properties
If you would like to consume multiple messages processed within a given time frame and process them in batch, you can use the io.kestra.plugin.kafka.Trigger instead.
Examples
Consume a message from a Kafka topic in real time.
id: kafka_realtime_trigger
namespace: company.team
tasks:
- id: log
type: io.kestra.plugin.core.log.Log
message: "{{ trigger.value }}"
triggers:
- id: realtime_trigger
type: io.kestra.plugin.kafka.RealtimeTrigger
topic: test_kestra
properties:
bootstrap.servers: localhost:9092
serdeProperties:
schema.registry.url: http://localhost:8085
keyDeserializer: STRING
valueDeserializer: AVRO
groupId: kafkaConsumerGroupId
Using a consumer group, we will fetch only records that haven't been consumed yet.
The bootstrap.servers property is a minimal required configuration to connect to a Kafka topic.
This property can reference any valid Consumer Configs or Producer Configs
as key-value pairs.
If you want to pass a truststore or a keystore, you must provide a base64 encoded string for ssl.keystore.location and ssl.truststore.location.
Default value is : false
Possible values are: STRING, INTEGER, FLOAT, DOUBLE, LONG, SHORT, BYTE_ARRAY, BYTE_BUFFER, BYTES, UUID, VOID, AVRO, JSON.
Default value is : STRING
Default value is : false
Manually assign a list of partitions to the consumer.
Configuration that will be passed to the serializer or deserializer. The avro.use.logical.type.converters property is always passed when any of these values is set to true.
Default value is : {}
By default, all messages are consumed from the topics, either with no consumer group or according to the auto.offset.reset property. However, you can provide an arbitrary start time.
This property is ignored if a consumer group is used.
It must be a valid ISO 8601 date.
It can be a string or a list of strings to consume from one or multiple topics.
Consumer will subscribe to all topics matching the specified pattern to get dynamically assigned partitions.
Possible values are: STRING, INTEGER, FLOAT, DOUBLE, LONG, SHORT, BYTE_ARRAY, BYTE_BUFFER, BYTES, UUID, VOID, AVRO, JSON.
Default value is : STRING
1 nested properties
Note that you don't need an extra task to consume the message from the event trigger. The trigger will automatically consume messages and you can retrieve their content in your flow using the {{ trigger.uri }} variable. If you would like to consume each message from a Kafka topic in real-time and create one execution per message, you can use the io.kestra.plugin.kafka.RealtimeTrigger instead.
Examples
id: kafka_trigger
namespace: company.team
tasks:
- id: log
type: io.kestra.plugin.core.log.Log
message: "{{ trigger.uri }}"
triggers:
- id: trigger
type: io.kestra.plugin.kafka.Trigger
topic: test_kestra
properties:
bootstrap.servers: localhost:9092
serdeProperties:
schema.registry.url: http://localhost:8085
keyDeserializer: STRING
valueDeserializer: AVRO
interval: PT30S
maxRecords: 5
groupId: kafkaConsumerGroupId
Using a consumer group, we will fetch only records that haven't been consumed yet.
The bootstrap.servers property is a minimal required configuration to connect to a Kafka topic.
This property can reference any valid Consumer Configs or Producer Configs
as key-value pairs.
If you want to pass a truststore or a keystore, you must provide a base64 encoded string for ssl.keystore.location and ssl.truststore.location.
Default value is : false
The interval between two consecutive polls of the schedule; this avoids overloading the remote system with too many calls. For most triggers that depend on external systems, the interval should be at least PT30S. See ISO 8601 Durations for more information on available interval values.
Default value is : 60.000000000
Possible values are: STRING, INTEGER, FLOAT, DOUBLE, LONG, SHORT, BYTE_ARRAY, BYTE_BUFFER, BYTES, UUID, VOID, AVRO, JSON.
Default value is : STRING
Default value is : false
It's a soft limit evaluated every second.
It's a soft limit evaluated every second.
Manually assign a list of partitions to the consumer.
If no records are available, the maximum wait duration to wait for new records.
Default value is : 5.000000000
Configuration that will be passed to the serializer or deserializer. The avro.use.logical.type.converters property is always passed when any of these values is set to true.
Default value is : {}
By default, all messages are consumed from the topics, either with no consumer group or according to the auto.offset.reset property. However, you can provide an arbitrary start time.
This property is ignored if a consumer group is used.
It must be a valid ISO 8601 date.
It can be a string or a list of strings to consume from one or multiple topics.
Consumer will subscribe to all topics matching the specified pattern to get dynamically assigned partitions.
Possible values are: STRING, INTEGER, FLOAT, DOUBLE, LONG, SHORT, BYTE_ARRAY, BYTE_BUFFER, BYTES, UUID, VOID, AVRO, JSON.
Default value is : STRING
1 nested properties
Examples
Launch a Pod
id: kubernetes_pod_create
namespace: company.team
tasks:
- id: pod_create
type: io.kestra.plugin.kubernetes.PodCreate
namespace: default
metadata:
labels:
my-label: my-value
spec:
containers:
- name: unittest
image: debian:stable-slim
command:
- 'bash'
- '-c'
- 'for i in {1..10}; do echo $i; sleep 0.1; done'
restartPolicy: Never
Launch a Pod with input files and gather its output files.
id: kubernetes
namespace: company.team
inputs:
- id: file
type: FILE
tasks:
- id: kubernetes
type: io.kestra.plugin.kubernetes.PodCreate
spec:
containers:
- name: unittest
image: centos
command:
- cp
- "{{workingDir}}/data.txt"
- "{{workingDir}}/out.txt"
restartPolicy: Never
waitUntilRunning: PT3M
inputFiles:
data.txt: "{{inputs.file}}"
outputFiles:
- out.txt
Default value is : false
Default value is : true
Default value is : false
The files will be available inside the kestra/working-dir directory of the container. You can use the special variable {{workingDir}} in your command to refer to it.
Default value is : false
Default value is : default
Only files created inside the kestra/working-dir directory of the container can be retrieved.
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
Default value is : true
Default value is : 2.000000000
Default value is : 3600.000000000
This timeout is the maximum time that the Kubernetes scheduler will take to:
- schedule the job
- pull the pod image
- start the pod
Default value is : 600.000000000
1 nested properties
Examples
Apply a Kubernetes resource, using YAML.
id: create_or_replace_deployment
namespace: company.team
tasks:
- id: apply
type: io.kestra.plugin.kubernetes.kubectl.Apply
namespace: default
spec: |-
apiVersion: apps/v1
kind: Deployment
metadata:
name: mypod
Apply a Kubernetes resource, using a namespace file.
id: create_or_replace_deployment
namespace: company.team
tasks:
- id: apply
type: io.kestra.plugin.kubernetes.kubectl.Apply
namespaceFiles:
enabled: true
namespace: default
spec: "{{ read('deployment.yaml') }}"
Default value is : false
Default value is : false
The files will be available inside the kestra/working-dir directory of the container. You can use the special variable {{workingDir}} in your command to refer to it.
Default value is : false
Only files created inside the kestra/working-dir directory of the container can be retrieved.
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
Default value is : 3600.000000000
This timeout is the maximum time that the Kubernetes scheduler will take to:
- schedule the job
- pull the pod image
- start the pod
Default value is : 600.000000000
1 nested properties
Default value is : v1
default is RSA
Default value is : RSA
Default value is : https://kubernetes.default.svc
Default value is : busybox
Creates a new entry, if allowed, for each line of the provided LDIF files.
Examples
Insert entries in LDAP server.
id: ldap_add
namespace: company.team
tasks:
- id: add
type: io.kestra.plugin.ldap.Add
description: What your task is supposed to do and why.
userDn: cn=admin,dc=orga,dc=en
password: admin
inputs:
- "{{outputs.someTask.uri_of_ldif_formated_file}}"
hostname: 0.0.0.0
port: 18060
Hostname for connection.
List of URI(s) of file(s) containing LDIF formatted entries to input into LDAP.
User password for connection.
A whole number describing the port for connection.
User DN for connection.
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Remove entries based on a targeted DN list.
Examples
id: ldap_delete
namespace: company.team
tasks:
- id: delete
type: io.kestra.plugin.ldap.Delete
description: What your task is supposed to do and why.
userDn: cn=admin,dc=orga,dc=fr
password: admin
inputs:
- "{{ outputs.some_task.uri_of_ldif_formated_file }}"
hostname: 0.0.0.0
port: 15060
Hostname for connection.
Targeted DN(s) in the LDAP.
User password for connection.
A whole number describing the port for connection.
User DN for connection.
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Transform .ion files to .ldif ones.
Examples
Make LDIF entries from ION ones.
id: ldap_ion_to_ldif
namespace: company.team
inputs:
- id: file1
type: FILE
- id: file2
type: FILE
tasks:
- id: ion_to_ldiff
type: io.kestra.plugin.ldap.IonToLdif
inputs:
- "{{ inputs.file1 }}"
- "{{ inputs.file2 }}"
Input example: an ION file that could be provided as input:
# simple entry
{dn:"[email protected],ou=diffusion_list,dc=orga,dc=com",attributes:{description:["Some description","Some other description"],someOtherAttribute:["perhaps","perhapsAgain"]}}
# modify changeRecord
{dn:"[email protected],ou=diffusion_list,dc=orga,dc=com",changeType:"modify",modifications:[{operation:"DELETE",attribute:"description",values:["Some description 3"]},{operation:"ADD",attribute:"description",values:["Some description 4"]},{operation:"REPLACE",attribute:"someOtherAttribute",values:["Loves herself more"]}]}
# delete changeRecord
{dn:"[email protected],ou=diffusion_list,dc=orga,dc=com",changeType:"delete"}
# moddn changeRecord (it is mandatory to specify a newrdn and a deleteoldrdn)
{dn:"[email protected],ou=diffusion_list,dc=orga,dc=com",changeType:"moddn",newDn:{newrdn:"[email protected]",deleteoldrdn:false,newsuperior:"ou=expeople,dc=example,dc=com"}}
# moddn changeRecord without new superior (it is optional to specify a new superior field)
{dn:"[email protected],ou=diffusion_list,dc=orga,dc=com",changeType:"moddn",newDn:{newrdn:"[email protected]",deleteoldrdn:true}}
Output example: the LDIF file that would be produced:
# simple entry
dn: [email protected],ou=diffusion_list,dc=orga,dc=com
description: Some description
someOtherAttribute: perhaps
description: Some other description
someOtherAttribute: perhapsAgain
# modify changeRecord
dn: [email protected],ou=diffusion_list,dc=orga,dc=com
changetype: modify
delete: description
description: Some description 3
-
add: description
description: Some description 4
-
replace: someOtherAttribute
someOtherAttribute: Loves herself more
-
# delete changeRecord
dn: [email protected],ou=diffusion_list,dc=orga,dc=com
changetype: delete
# moddn with new superior
dn: [email protected],ou=diffusion_list,dc=orga,dc=com
changetype: moddn
newrdn: [email protected]
deleteoldrdn: 0
newsuperior: ou=expeople,dc=example,dc=com
# moddn without new superior
dn: [email protected],ou=diffusion_list,dc=orga,dc=com
changetype: moddn
newrdn: [email protected]
deleteoldrdn: 1
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Transform .ldif files to .ion ones.
Examples
Make ION entries from LDIF ones.
id: ldap_ldif_to_ion
namespace: company.team
inputs:
- id: file1
type: FILE
- id: file2
type: FILE
tasks:
- id: ldif_to_ion
type: io.kestra.plugin.ldap.LdifToIon
inputs:
- "{{ inputs.file1 }}"
- "{{ inputs.file2 }}"
Input example: an LDIF file that could be provided as input:
# simple entry
dn: [email protected],ou=diffusion_list,dc=orga,dc=com
description: Some description
someOtherAttribute: perhaps
description: Some other description
someOtherAttribute: perhapsAgain
# modify changeRecord
dn: [email protected],ou=diffusion_list,dc=orga,dc=com
changetype: modify
delete: description
description: Some description 3
-
add: description
description: Some description 4
-
replace: someOtherAttribute
someOtherAttribute: Loves herself more
-
# delete changeRecord
dn: [email protected],ou=diffusion_list,dc=orga,dc=com
changetype: delete
# moddn and modrdn are equivalent; the fields must be specified in this order: newrdn -> deleteoldrdn -> (optional) newsuperior
dn: [email protected],ou=diffusion_list,dc=orga,dc=com
changetype: modrdn
newrdn: [email protected]
deleteoldrdn: 0
newsuperior: ou=expeople,dc=example,dc=com
# moddn without new superior
dn: [email protected],ou=diffusion_list,dc=orga,dc=com
changetype: moddn
newrdn: [email protected]
deleteoldrdn: 1
Output example: the ION file that would be produced:
# simple entry
{dn:"[email protected],ou=diffusion_list,dc=orga,dc=com",attributes:{description:["Some description","Some other description"],someOtherAttribute:["perhaps","perhapsAgain"]}}
# modify changeRecord
{dn:"[email protected],ou=diffusion_list,dc=orga,dc=com",changeType:"modify",modifications:[{operation:"DELETE",attribute:"description",values:["Some description 3"]},{operation:"ADD",attribute:"description",values:["Some description 4"]},{operation:"REPLACE",attribute:"someOtherAttribute",values:["Loves herself more"]}]}
# delete changeRecord
{dn:"[email protected],ou=diffusion_list,dc=orga,dc=com",changeType:"delete"}
# moddn changeRecord (it is mandatory to specify a newrdn and a deleteoldrdn)
{dn:"[email protected],ou=diffusion_list,dc=orga,dc=com",changeType:"moddn",newDn:{newrdn:"[email protected]",deleteoldrdn:false,newsuperior:"ou=expeople,dc=example,dc=com"}}
# moddn changeRecord without new superior (it is optional to specify a new superior field)
{dn:"[email protected],ou=diffusion_list,dc=orga,dc=com",changeType:"moddn",newDn:{newrdn:"[email protected]",deleteoldrdn:true}}
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Modify, delete, or add attributes or DNs following the LDIF changeType field of each entry provided.
Examples
Modify entries in LDAP server.
id: ldap_modify
namespace: company.team
tasks:
- id: modify
type: io.kestra.plugin.ldap.Modify
userDn: cn=admin,dc=orga,dc=en
password: admin
inputs:
- "{{ outputs.some_task.uri_of_ldif_change_record_formated_file }}"
hostname: 0.0.0.0
port: 18060
Hostname for connection.
List of URI(s) of file(s) containing LDIF formatted entries to modify into LDAP. Entries must provide a changeType field.
User password for connection.
A whole number describing the port for connection.
User DN for connection.
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Search and list entries based on a filter list for each base DN target.
Examples
Retrieve LDAP entries. In this example, assuming there is exactly one entry matching each of our filters, the task outputs four entries in this order (since we search twice in the same baseDn): (dn, description, mail) of {melusine, metatron, melusine, metatron}.
id: ldap_search
namespace: company.team
tasks:
- id: search
type: io.kestra.plugin.ldap.Search
userDn: cn=admin,dc=orga,dc=en
password: admin
baseDn: ou=people,dc=orga,dc=en
filter: (|(sn=melusine*)(sn=metatron*))
attributes:
- description
- mail
hostname: 0.0.0.0
port: 15060
Hostname for connection.
User password for connection.
A whole number describing the port for connection.
User DN for connection.
Default value is : false
Specific attributes to retrieve from the filtered entries. Retrieves all attributes by default. Special attributes may be specified: "+" -> OPERATIONAL_ATTRIBUTES, "1.1" -> NO_ATTRIBUTES, "0.0" -> ALL_ATTRIBUTES_EXCEPT_OPERATIONAL. These special attributes cannot be combined with other attributes; the search will ignore everything else.
Default value is : - '*'
[
"*"
]
Base DN target in the LDAP.
Default value is : ou=system
Default value is : false
Filter for the search in the LDAP.
Default value is : (objectclass=*)
Default value is : false
1 nested properties
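As a sketch of the special attribute values described above, this hypothetical flow retrieves only operational attributes by passing "+" (which, per the description, cannot be combined with other attributes). The connection values mirror the sample flow; substitute your own.

```yaml
id: ldap_search_operational
namespace: company.team

tasks:
  - id: search
    type: io.kestra.plugin.ldap.Search
    userDn: cn=admin,dc=orga,dc=en
    password: admin
    baseDn: ou=people,dc=orga,dc=en
    filter: (objectclass=*)
    attributes:
      - "+"   # OPERATIONAL_ATTRIBUTES; any other attribute listed here would be ignored
    hostname: 0.0.0.0
    port: 15060
```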
Examples
id: linear_issues_create
namespace: company.team
tasks:
- id: create_issue
type: io.kestra.plugin.linear.issues.Create
token: your_api_token
team: MyTeamName
title: "Increased 5xx in Demo Service"
description: "The number of 5xx has increased beyond the threshold for Demo service."
labels:
- Bug
- Workflow
Create an issue when a Kestra workflow in any namespace with the company prefix fails.
id: create_ticket_on_failure
namespace: system
tasks:
- id: create_issue
type: io.kestra.plugin.linear.issues.Create
token: your_api_token
team: MyTeamName
title: Workflow failed
description: "{{ execution.id }} has failed on {{ taskrun.startDate }}. See the link below for more details."
labels:
- Bug
- Workflow
triggers:
- id: on_failure
type: io.kestra.plugin.core.trigger.Flow
conditions:
- type: io.kestra.plugin.core.condition.ExecutionStatusCondition
in:
- FAILED
- WARNING
- type: io.kestra.plugin.core.condition.ExecutionNamespaceCondition
namespace: company
comparison: PREFIX
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
Create a Malloy script and run the malloy-cli run command.
id: malloy
namespace: company.team
tasks:
- id: run_malloy
type: io.kestra.plugin.malloy.CLI
inputFiles:
model.malloy: |
source: my_model is duckdb.table('https://huggingface.co/datasets/kestra/datasets/raw/main/csv/iris.csv')
run: my_model -> {
group_by: variety
aggregate:
avg_petal_width is avg(petal_width)
avg_petal_length is avg(petal_length)
avg_sepal_width is avg(sepal_width)
avg_sepal_length is avg(sepal_length)
}
commands:
- malloy-cli run model.malloy
Default value is : false
Default value is : ghcr.io/kestra-io/malloy
Default value is : false
If set to false, all commands will be executed one after the other. The final state of the task execution is determined by the last command. Note that this property may be ignored if an incompatible interpreter is specified.
You can also disable it if your interpreter does not support the set -e option.
Default value is : true
Default value is : `- /bin/sh
- -c`
[
"/bin/sh",
"-c"
]
Default value is : false
Required to use the {{ outputDir }} expression. Note that it could increase the starting time. Deprecated, use the outputFiles property instead.
Default value is : "false"
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
Only used if the taskRunner property is not set
Default value is : AUTO
Default value is : true
1 nested properties
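Since the {{ outputDir }} expression is deprecated in favor of the outputFiles property, here is a minimal sketch of capturing generated files with a glob expression. The redirected command and the file names are hypothetical; the model is borrowed from the example above.

```yaml
id: malloy_output_files
namespace: company.team

tasks:
  - id: run_malloy
    type: io.kestra.plugin.malloy.CLI
    inputFiles:
      model.malloy: |
        source: my_model is duckdb.table('https://huggingface.co/datasets/kestra/datasets/raw/main/csv/iris.csv')
        run: my_model -> { group_by: variety }
    commands:
      - mkdir -p results
      - malloy-cli run model.malloy > results/varieties.txt
    # Glob expressions relative to the working directory:
    outputFiles:
      - results/**
```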
Add one or multiple documents to a Meilisearch DB.
Examples
Add Document to Meilisearch
id: meilisearch-add-flow
namespace: company.team
variables:
host: http://172.18.0.3:7700/
tasks:
- id: http_download
type: io.kestra.plugin.core.http.Download
uri: https://pokeapi.co/api/v2/pokemon/jigglypuff
- id: to_ion
type: io.kestra.plugin.serdes.json.JsonToIon
from: "{{ outputs.http_download.uri }}"
- id: add
type: io.kestra.plugin.meilisearch.DocumentAdd
index: "pokemon"
url: "{{ vars.host }}"
key: "MASTER_KEY"
data: "{{ outputs.to_ion.uri }}"
3 nested properties
Index of the collection you want to add documents to
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Get a JSON document from Meilisearch using an ID and index.
Examples
Get Document from Meilisearch
id: meilisearch-get-flow
namespace: company.team
variables:
id: a123
index: pokemons
host: http://172.18.0.3:7700/
tasks:
- id: get_document
type: io.kestra.plugin.meilisearch.DocumentGet
index: "{{ vars.index }}"
documentId: "{{ vars.id }}"
url: "{{ vars.host }}"
key: "MASTER_KEY"
Index of the collection you want to retrieve your document from
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Perform a facet search on a Meilisearch DB. WARNING: make sure to set the filterable attributes beforehand.
Examples
Sample facet search
id: meilisearch-facet-search-flow
namespace: company.team
variables:
index: movies
facetQuery: fiction
facetName: genre
host: http://172.18.0.3:7700/
tasks:
- id: facet_search_documents
type: io.kestra.plugin.meilisearch.FacetSearch
index: "{{ vars.index }}"
facetQuery: "{{ vars.facetQuery }}"
facetName: "{{ vars.facetName }}"
filters:
- "rating > 3"
url: "{{ vars.host }}"
key: "MASTER_KEY"
- id: to_json
type: io.kestra.plugin.serdes.json.IonToJson
from: "{{ outputs.facet_search_documents.uri }}"
Name of the facet you want to perform a search on (e.g. facetName: "genre" on a film collection)
Index of the collection you want to search in
Default value is : false
Default value is : false
Query that will be used on the specified facetName
Default value is : --- ""
Additional filters to apply to your facet search
Default value is : "[]"
Default value is : false
1 nested properties
Perform a basic search query on a Meilisearch database with a specific query and return the results in an .ion file.
Examples
id: meilisearch-search-flow
namespace: company.team
variables:
index: movies
query: "Lord of the Rings"
host: http://172.18.0.3:7700/
tasks:
- id: search_documents
type: io.kestra.plugin.meilisearch.Search
index: "{{ vars.index }}"
query: "{{ vars.query }}"
url: "{{ vars.host }}"
key: "MASTER_KEY"
- id: to_json
type: io.kestra.plugin.serdes.json.IonToJson
from: "{{ outputs.search_documents.uri }}"
Default value is : false
Default value is : false
Index of the collection you want to perform a search on
Default value is : false
Query performed to search on a specific collection
1 nested properties
Examples
id: minio_copy
namespace: company.team
tasks:
- id: copy
type: io.kestra.plugin.minio.Copy
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
from:
bucket: "my-bucket"
key: "path/to/file"
to:
bucket: "my-bucket2"
key: "path/to/file2"
Copy file in an S3-compatible storage — here, Spaces Object Storage from Digital Ocean.
id: s3_compatible_copy
namespace: company.team
tasks:
- id: copy_file
type: io.kestra.plugin.minio.Copy
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
endpoint: https://<region>.digitaloceanspaces.com
from:
bucket: "my-bucket"
key: "path/to/file"
to:
bucket: "my-bucket2"
key: "path/to/file2"
Default value is : false
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
Create a new bucket with some options
id: minio_create_bucket
namespace: company.team
tasks:
- id: create_bucket
type: io.kestra.plugin.minio.CreateBucket
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
bucket: "my-bucket"
Create a new bucket on an S3-compatible storage — here, Spaces Object Storage from Digital Ocean.
id: s3_compatible_bucket
namespace: company.team
tasks:
- id: create_bucket
type: io.kestra.plugin.minio.CreateBucket
accessKeyId: "<access_key>"
secretKeyId: "<secret_key>"
endpoint: https://<region>.digitaloceanspaces.com #example region: nyc3, tor1
bucket: "kestra-test-bucket"
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
id: minio_delete
namespace: company.team
tasks:
- id: delete
type: io.kestra.plugin.minio.Delete
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
bucket: "my-bucket"
key: "path/to/file"
Delete file from an S3-compatible storage — here, Spaces Object Storage from Digital Ocean.
id: s3_compatible_delete
namespace: company.team
tasks:
- id: delete
type: io.kestra.plugin.minio.Delete
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
endpoint: https://<region>.digitaloceanspaces.com
bucket: "kestra-test-bucket"
key: "path/to/file"
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
id: minio_delete_objects
namespace: company.team
tasks:
- id: delete_objects
type: io.kestra.plugin.minio.DeleteList
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
bucket: "my-bucket"
prefix: "sub-dir"
Delete files from an S3-compatible storage — here, Spaces Object Storage from Digital Ocean.
id: s3_compatible_delete_objects
namespace: company.team
tasks:
- id: delete_objects
type: io.kestra.plugin.minio.DeleteList
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
endpoint: https://<region>.digitaloceanspaces.com
bucket: "kestra-test-bucket"
prefix: "sub-dir"
Default value is : false
Default value is : false
Default value is : false
Default value is : BOTH
Default value is : false
Start listing after this specified key. Marker can be any key in the bucket.
By default, the action returns up to 1,000 key names. The response might contain fewer keys but will never contain more.
Default value is : 1000
ex:
regExp: .* to match all files
regExp: .*2020-01-0.\\.csv to match files between 01 and 09 of January ending with .csv
1 nested properties
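Combining prefix, regExp, and the key limit described above into one hypothetical DeleteList call. The maxKeys property name is an assumption based on the 1,000-key default; the pattern and bucket values are placeholders.

```yaml
id: minio_delete_filtered
namespace: company.team

tasks:
  - id: delete_objects
    type: io.kestra.plugin.minio.DeleteList
    accessKeyId: "<access-key>"
    secretKeyId: "<secret-key>"
    region: "eu-central-1"
    bucket: "my-bucket"
    prefix: "sub-dir"
    # Only keys matching this pattern are deleted (files from 01 to 09
    # of January 2020 ending with .csv):
    regExp: ".*2020-01-0.\\.csv"
    maxKeys: 500   # assumed property name; defaults to 1000
```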
Examples
id: minio_download
namespace: company.team
tasks:
- id: download_from_storage
type: io.kestra.plugin.minio.Download
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
bucket: "my-bucket"
key: "path/to/file"
Download file from an S3-compatible storage — here, Spaces Object Storage from Digital Ocean.
id: s3_compatible_download
namespace: company.team
tasks:
- id: download_from_storage
type: io.kestra.plugin.minio.Download
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
endpoint: https://<region>.digitaloceanspaces.com
bucket: "kestra-test-bucket"
key: "data/orders.csv"
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
id: minio_downloads
namespace: company.team
tasks:
- id: downloads
type: io.kestra.plugin.minio.Downloads
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
bucket: "my-bucket"
prefix: "sub-dir"
action: "DELETE"
Download files from an S3-compatible storage — here, Spaces Object Storage from Digital Ocean.
id: s3_compatible_downloads
namespace: company.team
tasks:
- id: downloads
type: io.kestra.plugin.minio.Downloads
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
endpoint: https://<region>.digitaloceanspaces.com
bucket: "kestra-test-bucket"
prefix: "data/orders"
action: "DELETE"
Default value is : false
Default value is : false
Default value is : BOTH
Default value is : false
Start listing after this specified key. Marker can be any key in the bucket.
By default, the action returns up to 1,000 key names. The response might contain fewer keys but will never contain more.
Default value is : 1000
ex:
regExp: .* to match all files
regExp: .*2020-01-0.\\.csv to match files between 01 and 09 of January ending with .csv
1 nested properties
Examples
id: minio_list
namespace: company.team
tasks:
- id: list_objects
type: io.kestra.plugin.minio.List
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
bucket: "my-bucket"
prefix: "sub-dir"
List files from an S3-compatible storage — here, Spaces Object Storage from Digital Ocean.
id: s3_compatible_list
namespace: company.team
tasks:
- id: list_objects
type: io.kestra.plugin.minio.List
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
endpoint: https://<region>.digitaloceanspaces.com
bucket: "kestra-test-bucket"
Default value is : false
Default value is : false
Default value is : BOTH
Default value is : true
Default value is : false
Start listing after this specified key. Marker can be any key in the bucket.
By default, the action returns up to 1,000 key names. The response might contain fewer keys but will never contain more.
Default value is : 1000
Default value is : true
ex:
regExp: .* to match all files
regExp: .*2020-01-0.\\.csv to match files between 01 and 09 of January ending with .csv
1 nested properties
This trigger lists a bucket at every interval. You can search for all files in a bucket or directory via from, or you can filter the files with a regExp. The detection is atomic: internally we do a list and interact only with the files listed.
Once a file is detected, we download it to internal storage and process it with the declared action, in order to move or delete the file from the bucket (to avoid double detection on the next poll).
Examples
Wait for a list of files on a bucket and iterate through the files.
id: minio_listen
namespace: company.team
tasks:
- id: each
type: io.kestra.plugin.core.flow.EachSequential
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ taskrun.value }}"
value: "{{ trigger.objects | jq('.[].uri') }}"
triggers:
- id: watch
type: io.kestra.plugin.minio.Trigger
interval: "PT5M"
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
bucket: "my-bucket"
prefix: "sub-dir"
action: MOVE
moveTo:
key: archive
Wait for a list of files on a bucket and iterate through the files. Delete files manually after processing to prevent infinite triggering.
id: minio_listen
namespace: company.team
tasks:
- id: each
type: io.kestra.plugin.core.flow.EachSequential
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ taskrun.value }}"
- id: delete
type: io.kestra.plugin.minio.Delete
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
bucket: "my-bucket"
key: "{{ taskrun.value }}"
value: "{{ trigger.objects | jq('.[].key') }}"
triggers:
- id: watch
type: io.kestra.plugin.minio.Trigger
interval: "PT5M"
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
bucket: "my-bucket"
prefix: "sub-dir"
action: NONE
Wait for a list of files on a bucket on an S3-compatible storage — here, Spaces Object Storage from Digital Ocean. Iterate through those files and move them to another folder.
id: trigger_on_s3_compatible_storage
namespace: company.team
tasks:
- id: each
type: io.kestra.plugin.core.flow.EachSequential
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ taskrun.value }}"
value: "{{ trigger.objects | jq('.[].uri') }}"
triggers:
- id: watch
type: io.kestra.plugin.minio.Trigger
interval: "PT5M"
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
endpoint: https://<region>.digitaloceanspaces.com
bucket: "kestra-test-bucket"
prefix: "sub-dir"
action: MOVE
moveTo:
key: archive
Default value is : false
Default value is : BOTH
The interval between two consecutive polls, which can avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the minimal interval must be at least PT30S. See ISO 8601 durations for more information on available interval values.
Default value is : 60.000000000
Default value is : false
Default value is : 1000
2 nested properties
1 nested properties
Examples
id: minio_upload
namespace: company.team
inputs:
- id: file
type: FILE
tasks:
- id: upload_to_storage
type: io.kestra.plugin.minio.Upload
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
region: "eu-central-1"
from: "{{ inputs.file }}"
bucket: "my-bucket"
key: "path/to/file"
Upload file to an S3-compatible storage — here, Spaces Object Storage from Digital Ocean.
id: s3_compatible_upload
namespace: company.team
tasks:
- id: http_download
type: io.kestra.plugin.core.http.Download
uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv
- id: upload_to_storage
type: io.kestra.plugin.minio.Upload
accessKeyId: "<access-key>"
secretKeyId: "<secret-key>"
endpoint: https://<region>.digitaloceanspaces.com #example regions: nyc3, tor1
bucket: "kestra-test-bucket"
from: "{{ outputs.http_download.uri }}"
key: "data/orders.csv"
Default value is : false
Default value is : false
Can be a single file, a list of files, or a JSON array.
A full key (with filename), or the directory path if from is multiple files.
Default value is : false
1 nested properties
Examples
Execute a Python script on a GPU-powered instance in the cloud using Modal. Make sure to add the script that you want to orchestrate as a Namespace File in the Editor and point to it in the commands section.
id: modal
namespace: company.team
tasks:
- id: modal_cli
type: io.kestra.plugin.modal.cli.ModalCLI
namespaceFiles:
enabled: true
commands:
- modal run scripts/gpu.py
env:
MODAL_TOKEN_ID: "{{ secret('MODAL_TOKEN_ID') }}"
MODAL_TOKEN_SECRET: "{{ secret('MODAL_TOKEN_SECRET') }}"
Execute a Python script from Git on a cloud VM using Modal.
id: modal_git
namespace: company.team
tasks:
- id: repository
type: io.kestra.plugin.core.flow.WorkingDirectory
tasks:
- id: clone
type: io.kestra.plugin.git.Clone
branch: main
url: https://github.com/kestra-io/scripts
- id: modal_cli
type: io.kestra.plugin.modal.cli.ModalCLI
commands:
- modal run modal/getting_started.py
env:
MODAL_TOKEN_ID: "{{ secret('MODAL_TOKEN_ID') }}"
MODAL_TOKEN_SECRET: "{{ secret('MODAL_TOKEN_SECRET') }}"
Default value is : false
Default value is : ghcr.io/kestra-io/modal
Default value is : false
Default value is : false
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
1 nested properties
Here are sample file contents that can be provided as input to the Bulk task:
{ "insertOne" : {"firstName": "John", "lastName": "Doe", "city": "Paris"}}
{ "insertOne" : {"firstName": "Ravi", "lastName": "Singh", "city": "Mumbai"}}
{ "deleteMany": {"filter": {"city": "Bengaluru"}}}
Examples
id: mongodb_bulk
namespace: company.team
inputs:
- id: myfile
type: FILE
tasks:
- id: bulk
type: io.kestra.plugin.mongodb.Bulk
connection:
uri: "mongodb://root:example@localhost:27017/?authSource=admin"
database: "my_database"
collection: "my_collection"
from: "{{ inputs.myfile }}"
Default value is : false
Default value is : 1000
Default value is : false
Default value is : false
1 nested properties
Examples
id: mongodb_delete
namespace: company.team
tasks:
- id: delete
type: io.kestra.plugin.mongodb.Delete
connection:
uri: "mongodb://root:example@localhost:27017/?authSource=admin"
database: "my_database"
collection: "my_collection"
operation: "DELETE_ONE"
filter:
_id:
$oid: 60930c39a982931c20ef6cd6
Default value is : false
Default value is : false
Can be a BSON string, or a map.
Default value is : false
Default value is : DELETE_ONE
1 nested properties
Examples
id: mongodb_find
namespace: company.team
tasks:
- id: find
type: io.kestra.plugin.mongodb.Find
connection:
uri: "mongodb://root:example@localhost:27017/?authSource=admin"
database: "my_database"
collection: "my_collection"
filter:
_id:
$oid: 60930c39a982931c20ef6cd6
Default value is : false
Default value is : false
Can be a BSON string, or a map.
Default value is : false
Can be a BSON string, or a map.
Can be a BSON string, or a map.
Default value is : false
1 nested properties
Examples
Insert a document with a map.
id: mongodb_insertone
namespace: company.team
tasks:
- id: insertone
type: io.kestra.plugin.mongodb.InsertOne
connection:
uri: "mongodb://root:example@localhost:27017/?authSource=admin"
database: "my_database"
collection: "my_collection"
document:
_id:
$oid: 60930c39a982931c20ef6cd6
name: "John Doe"
city: "Paris"
Insert a document from a JSON string.
id: mongodb_insertone
namespace: company.team
tasks:
- id: insertone
type: io.kestra.plugin.mongodb.InsertOne
connection:
uri: "mongodb://root:example@localhost:27017/?authSource=admin"
database: "my_database"
collection: "my_collection"
document: "{{ outputs.task_id.data | json }}"
Can be a BSON string, or a map.
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
id: mongodb_load
namespace: company.team
inputs:
- id: file
type: FILE
tasks:
- id: load
type: io.kestra.plugin.mongodb.Load
connection:
uri: "mongodb://root:example@localhost:27017/?authSource=admin"
database: "my_database"
collection: "my_collection"
from: "{{ inputs.file }}"
Default value is : false
Default value is : 1000
Default value is : false
Default value is : false
Default value is : true
1 nested properties
Examples
Wait for a MongoDB query to return results, and then iterate through rows.
id: mongodb_trigger
namespace: company.team
tasks:
- id: each
type: io.kestra.plugin.core.flow.EachSequential
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ json(taskrun.value) }}"
value: "{{ trigger.rows }}"
triggers:
- id: watch
type: io.kestra.plugin.mongodb.Trigger
interval: "PT5M"
connection:
uri: mongodb://root:example@localhost:27017/?authSource=admin
database: samples
collection: books
filter:
pageCount:
$gte: 50
sort:
pageCount: -1
projection:
title: 1
publishedDate: 1
pageCount: 1
1 nested properties
Default value is : false
The interval between two consecutive polls, which can avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the minimal interval must be at least PT30S. See ISO 8601 durations for more information on available interval values.
Default value is : 60.000000000
Default value is : false
Default value is : false
1 nested properties
Examples
Replace a document.
id: mongodb_update
namespace: company.team
tasks:
- id: update
type: io.kestra.plugin.mongodb.Update
connection:
uri: "mongodb://root:example@localhost:27017/?authSource=admin"
database: "my_database"
collection: "my_collection"
operation: "REPLACE_ONE"
document:
_id:
$oid: 60930c39a982931c20ef6cd6
name: "John Doe"
city: "Paris"
filter:
_id:
$oid: 60930c39a982931c20ef6cd6
Update a document.
id: mongodb_update
namespace: company.team
tasks:
- id: update
type: io.kestra.plugin.mongodb.Update
connection:
uri: "mongodb://root:example@localhost:27017/?authSource=admin"
database: "my_database"
collection: "my_collection"
filter:
_id:
$oid: 60930c39a982931c20ef6cd6
document: '{"$set": {"tags": ["blue", "green", "red"]}}'
Can be a BSON string, or a map.
Can be a BSON string, or a map.
Default value is : false
Default value is : false
Default value is : false
Default value is : UPDATE_ONE
1 nested properties
Examples
id: mqtt_publish
namespace: company.team
tasks:
- id: publish
type: io.kestra.plugin.mqtt.Publish
server: tcp://localhost:1883
clientId: kestraProducer
topic: kestra/sensors/cpu
serdeType: JSON
retain: true
from:
type: "sensors"
value: 1.23
id: mqtt_publish
namespace: company.team
tasks:
- id: publish
type: io.kestra.plugin.mqtt.Publish
server: ssl://localhost:8883
clientId: kestraProducer
topic: kestra/sensors/cpu
crt: /home/path/to/ca.crt
serdeType: JSON
retain: true
from:
type: "sensors"
value: 1.23
A client identifier (clientId) must be specified and must be less than 65535 characters. It must be unique across all clients connecting to the same server. The server uses the clientId to store data related to the client and to identify the client when it reconnects, so the client must use the same identifier between connections if durable subscriptions or reliable message delivery is required.
Can be an internal storage URI, a map, or a list.
The serverURI parameter is typically combined with the clientId parameter to form a key, which is used to store and reference messages while they are being delivered.
The address of the server to connect to is specified as a URI. Two types of connection are supported: tcp:// for a plain TCP connection and ssl:// for a TCP connection secured by SSL/TLS. For example: tcp://localhost:1883 or ssl://localhost:8883. If the port is not specified, it defaults to 1883 for tcp:// URIs and 8883 for ssl:// URIs.
Default value is : false
Only available if version = V5
If set, this value contains the name of the authentication method to be used for extended authentication. If null, extended authentication is not performed.
This value defines the maximum time interval the client will wait for the network connection to the MQTT server to be established. The default timeout is 30 seconds. A value of 0 disables timeout processing meaning the client will wait until the network connection is made successfully or fails.
Default value is : false
When enabled, all CA certificates are trusted, effectively disabling server certificate verification.
Default value is : false
- Quality of Service 0: the message is delivered at most once (zero or one times). It is not persisted to disk and is not acknowledged across the network. This QoS is the fastest, but should only be used for messages that are not valuable; if the server cannot process the message (for example, due to an authorization problem), it is simply lost. Also known as "fire and forget".
- Quality of Service 1: the message is delivered at least once (one or more times). The message can only be delivered safely if it can be persisted, so the application must supply a means of persistence using MqttConnectOptions. If no persistence mechanism is specified, the message will not be delivered in the event of a client failure. The message is acknowledged across the network.
- Quality of Service 2: the message is delivered exactly once. It is persisted to disk and subject to a two-phase acknowledgement across the network. As with QoS 1, the application must supply a means of persistence using MqttConnectOptions; without one, the message will not be delivered in the event of a client failure. Even without persistence, QoS 1 and 2 messages are still delivered after a network or server problem, because the client holds state in memory; however, if the client is shut down or fails, that in-memory state is lost and delivery of QoS 1 and 2 messages cannot be maintained.
Default value is : 1
Sending a message with retain set to true and an empty payload (e.g. null) will clear the retained message from the server.
Default value is : false
Default value is : V5
1 nested properties
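As a hedged sketch of how the qos and retain properties described above combine on a Publish task (the topic, payload, and qos value are illustrative, not taken from the schema examples):

```yaml
id: mqtt_publish_qos
namespace: company.team

tasks:
  - id: publish
    type: io.kestra.plugin.mqtt.Publish
    server: tcp://localhost:1883
    clientId: kestraProducer
    topic: kestra/sensors/cpu   # illustrative topic
    serdeType: JSON
    qos: 2                      # exactly-once delivery (see the QoS notes above)
    retain: true                # the server keeps the last message for late subscribers
    from:
      type: "sensors"
      value: 1.23
```

Publishing later to the same topic with retain set to true and a null payload should clear the retained message, per the note above.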
If you would like to consume multiple messages processed within a given time frame and process them in batch, you can use the io.kestra.plugin.mqtt.Trigger instead.
Examples
Consume a message from MQTT topics in real-time.
id: mqtt_realtime_trigger
namespace: company.team
tasks:
- id: log
type: io.kestra.plugin.core.log.Log
message: "{{ trigger.payload }}"
triggers:
- id: realtime_trigger
type: io.kestra.plugin.mqtt.RealtimeTrigger
server: tcp://localhost:1883
clientId: kestraProducer
topic:
- kestra/sensors/cpu
- kestra/sensors/mem
serdeType: JSON
Can be a string or a list of strings to consume from multiple topics.
Default value is : false
Default value is : false
- Quality of Service 0: the message is delivered at most once (zero or one times). It is not persisted to disk and is not acknowledged across the network. This QoS is the fastest, but should only be used for messages that are not valuable; if the server cannot process the message (for example, due to an authorization problem), it is simply lost. Also known as "fire and forget".
- Quality of Service 1: the message is delivered at least once (one or more times). The message can only be delivered safely if it can be persisted, so the application must supply a means of persistence using MqttConnectOptions. If no persistence mechanism is specified, the message will not be delivered in the event of a client failure. The message is acknowledged across the network.
- Quality of Service 2: the message is delivered exactly once. It is persisted to disk and subject to a two-phase acknowledgement across the network. As with QoS 1, the application must supply a means of persistence using MqttConnectOptions; without one, the message will not be delivered in the event of a client failure. Even without persistence, QoS 1 and 2 messages are still delivered after a network or server problem, because the client holds state in memory; however, if the client is shut down or fails, that in-memory state is lost and delivery of QoS 1 and 2 messages cannot be maintained.
Default value is : 1
Default value is : JSON
Default value is : V5
1 nested properties
Examples
id: mqtt_subscribe
namespace: company.team
tasks:
- id: subscribe
type: io.kestra.plugin.mqtt.Subscribe
server: tcp://localhost:1883
clientId: kestraProducer
topic:
- kestra/sensors/cpu
- kestra/sensors/mem
serdeType: JSON
maxRecords: 10
id: mqtt_subscribe
namespace: company.team
tasks:
- id: subscribe
type: io.kestra.plugin.mqtt.Subscribe
server: ssl://localhost:8883
clientId: kestraProducer
topic:
- kestra/sensors/cpu
- kestra/sensors/mem
crt: /home/path/to/ca.crt
serdeType: JSON
maxRecords: 10
A client identifier (clientId) must be specified and must be less than 65535 characters. It must be unique across all clients connecting to the same server. The server uses the clientId to store data related to the client and to identify the client when it reconnects, so the client must use the same identifier between connections if durable subscriptions or reliable message delivery is required.
The serverURI parameter is typically combined with the clientId parameter to form a key, which is used to store and reference messages while they are being delivered.
The address of the server to connect to is specified as a URI. Two types of connection are supported: tcp:// for a plain TCP connection and ssl:// for a TCP connection secured by SSL/TLS. For example: tcp://localhost:1883 or ssl://localhost:8883. If the port is not specified, it defaults to 1883 for tcp:// URIs and 8883 for ssl:// URIs.
Can be a string or a list of strings to consume from multiple topics.
Default value is : false
Only available if version = V5
If set, this value contains the name of the authentication method to be used for extended authentication. If null, extended authentication is not performed.
This value defines the maximum time interval the client will wait for the network connection to the MQTT server to be established. The default timeout is 30 seconds. A value of 0 disables timeout processing meaning the client will wait until the network connection is made successfully or fails.
Default value is : false
When enabled, all CA certificates are trusted, effectively disabling server certificate verification.
Default value is : false
This is not a hard limit and is evaluated every second.
This is not a hard limit and is evaluated every second.
- Quality of Service 0: the message is delivered at most once (zero or one times). It is not persisted to disk and is not acknowledged across the network. This QoS is the fastest, but should only be used for messages that are not valuable; if the server cannot process the message (for example, due to an authorization problem), it is simply lost. Also known as "fire and forget".
- Quality of Service 1: the message is delivered at least once (one or more times). The message can only be delivered safely if it can be persisted, so the application must supply a means of persistence using MqttConnectOptions. If no persistence mechanism is specified, the message will not be delivered in the event of a client failure. The message is acknowledged across the network.
- Quality of Service 2: the message is delivered exactly once. It is persisted to disk and subject to a two-phase acknowledgement across the network. As with QoS 1, the application must supply a means of persistence using MqttConnectOptions; without one, the message will not be delivered in the event of a client failure. Even without persistence, QoS 1 and 2 messages are still delivered after a network or server problem, because the client holds state in memory; however, if the client is shut down or fails, that in-memory state is lost and delivery of QoS 1 and 2 messages cannot be maintained.
Default value is : 1
Default value is : JSON
Default value is : V5
1 nested properties
Note that you don't need an extra task to consume the message from the event trigger. The trigger will automatically consume messages and you can retrieve their content in your flow using the {{ trigger.uri }} variable. If you would like to consume each message from MQTT topics in real-time and create one execution per message, you can use the io.kestra.plugin.mqtt.RealtimeTrigger instead.
Examples
id: mqtt_trigger
namespace: company.team
tasks:
- id: log
type: io.kestra.plugin.core.log.Log
message: "{{ trigger.payload }}"
triggers:
- id: trigger
type: io.kestra.plugin.mqtt.Trigger
server: tcp://localhost:1883
clientId: kestraProducer
topic:
- kestra/sensors/cpu
- kestra/sensors/mem
serdeType: JSON
maxRecords: 10
Can be a string or a list of strings to consume from multiple topics.
Default value is : false
The interval between two successive polls of the schedule; a longer interval avoids overloading the remote system with too many calls. For most triggers that depend on external systems, the interval should be at least PT30S. See ISO 8601 durations for the available interval values.
Default value is : 60.000000000
Default value is : false
This is not a hard limit and is evaluated every second.
This is not a hard limit and is evaluated every second.
- Quality of Service 0: the message is delivered at most once (zero or one times). It is not persisted to disk and is not acknowledged across the network. This QoS is the fastest, but should only be used for messages that are not valuable; if the server cannot process the message (for example, due to an authorization problem), it is simply lost. Also known as "fire and forget".
- Quality of Service 1: the message is delivered at least once (one or more times). The message can only be delivered safely if it can be persisted, so the application must supply a means of persistence using MqttConnectOptions. If no persistence mechanism is specified, the message will not be delivered in the event of a client failure. The message is acknowledged across the network.
- Quality of Service 2: the message is delivered exactly once. It is persisted to disk and subject to a two-phase acknowledgement across the network. As with QoS 1, the application must supply a means of persistence using MqttConnectOptions; without one, the message will not be delivered in the event of a client failure. Even without persistence, QoS 1 and 2 messages are still delivered after a network or server problem, because the client holds state in memory; however, if the client is shut down or fails, that in-memory state is lost and delivery of QoS 1 and 2 messages cannot be maintained.
Default value is : 1
Default value is : JSON
Default value is : V5
1 nested properties
Please note that the server you run it against must have JetStream enabled for it to work. It should also have a stream configured to match the given subject.
Examples
Consume messages from any topic subject matching the kestra.> wildcard, using user password authentication.
id: nats_consume_messages
namespace: company.team
tasks:
- id: consume
type: io.kestra.plugin.nats.Consume
url: nats://localhost:4222
username: nats_user
password: nats_password
subject: kestra.>
durableId: someDurableId
pollDuration: PT5S
The format is (nats://)server_url:port. You can also provide a connection token like so: nats://token@server_url:port
Default value is : false
Default value is : 10
Possible settings are:
- All: the default policy. The consumer starts receiving from the earliest available message.
- Last: when first consuming messages, start with the last message added to the stream, or the last message in the stream that matches the consumer's filter subject if defined.
- New: when first consuming messages, only receive messages that were created after the consumer was created.
- ByStartSequence: when first consuming messages, start at the first message having the given sequence number, or the next one available.
- ByStartTime: when first consuming messages, start with messages on or after this time. The consumer is required to specify since, which defines this start time.
- LastPerSubject: when first consuming messages, start with the latest message for each filtered subject currently in the stream.
Default value is : All
Default value is : false
Default value is : false
This is not a hard limit and is evaluated every second.
The maximum duration to wait for new messages when none are available.
Default value is : 2.000000000
By default, all messages are consumed from the subjects, either from the beginning of the log or from the current durable id position. You can also provide an arbitrary start time to get all messages since that date for a new durable id. Note that if you don't provide a durable id, you will retrieve all messages starting from this date on every subsequent use of this task. Must be a valid ISO 8601 date.
1 nested properties
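A minimal sketch combining the deliverPolicy and since properties described above; the subject, durable id, and timestamp are illustrative assumptions, with the connection settings taken from the example above:

```yaml
id: nats_consume_since
namespace: company.team

tasks:
  - id: consume
    type: io.kestra.plugin.nats.Consume
    url: nats://localhost:4222
    username: nats_user
    password: nats_password
    subject: kestra.>               # illustrative subject
    durableId: sinceDurableId       # illustrative durable id
    deliverPolicy: ByStartTime      # start at messages on or after `since`
    since: "2024-01-01T00:00:00Z"   # ISO 8601 start time (illustrative)
    pollDuration: PT5S
```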
Examples
Produce a single message to kestra.publish subject, using user password authentication.
id: nats_produce_single_message
namespace: company.team
tasks:
- id: produce
type: io.kestra.plugin.nats.Produce
url: nats://localhost:4222
username: nats_user
password: nats_password
subject: kestra.publish
from:
headers:
someHeaderKey: someHeaderValue
data: Some message
Produce 2 messages to kestra.publish subject, using user password authentication.
id: nats_produce_two_messages
namespace: company.team
tasks:
- id: produce
type: io.kestra.plugin.nats.Produce
url: nats://localhost:4222
username: nats_user
password: nats_password
subject: kestra.publish
from:
- headers:
someHeaderKey: someHeaderValue
data: Some message
- data: Another message
Produce messages (1 / row) from an internal storage file to kestra.publish subject, using user password authentication.
id: nats_produce_messages_from_file
namespace: company.team
tasks:
- id: produce
type: io.kestra.plugin.nats.Produce
url: nats://localhost:4222
username: nats_user
password: nats_password
subject: kestra.publish
from: "{{ outputs.some_task_with_output_file.uri }}"
Can be an internal storage URI, a map, or a list, with the following format: headers, data.
The format is (nats://)server_url:port. You can also provide a connection token like so: nats://token@server_url:port
Default value is : false
Default value is : false
Default value is : false
1 nested properties
If you would like to consume multiple messages processed within a given time frame and process them in batch, you can use the io.kestra.plugin.nats.Trigger instead.
Examples
Subscribe to a NATS subject, getting every message from the beginning of the subject on first trigger execution.
id: nats
namespace: company.team
tasks:
- id: log
type: io.kestra.plugin.core.log.Log
message: "{{ trigger.data }}"
triggers:
- id: watch
type: io.kestra.plugin.nats.RealtimeTrigger
url: nats://localhost:4222
username: nats_user
password: nats_password
subject: kestra.trigger
durableId: natsTrigger
deliverPolicy: All
The format is (nats://)server_url:port. You can also provide a connection token like so: nats://token@server_url:port
Default value is : 10
Possible settings are:
- All: the default policy. The consumer starts receiving from the earliest available message.
- Last: when first consuming messages, start with the last message added to the stream, or the last message in the stream that matches the consumer's filter subject if defined.
- New: when first consuming messages, only receive messages that were created after the consumer was created.
- ByStartSequence: when first consuming messages, start at the first message having the given sequence number, or the next one available.
- ByStartTime: when first consuming messages, start with messages on or after this time. The consumer is required to specify since, which defines this start time.
- LastPerSubject: when first consuming messages, start with the latest message for each filtered subject currently in the stream.
Default value is : All
Default value is : false
Default value is : false
By default, all messages are consumed from the subjects, either from the beginning of the log or from the current durable id position. You can also provide an arbitrary start time to get all messages since that date for a new durable id. Note that if you don't provide a durable id, you will retrieve all messages starting from this date on every subsequent use of this task. Must be a valid ISO 8601 date.
1 nested properties
If you would like to consume each message from a NATS subject in real-time and create one execution per message, you can use the io.kestra.plugin.nats.RealtimeTrigger instead.
Examples
Subscribe to a NATS subject, getting every message from the beginning of the subject on first trigger execution.
id: nats
namespace: company.team
tasks:
- id: log
type: io.kestra.plugin.core.log.Log
message: "{{ trigger.data }}"
triggers:
- id: watch
type: io.kestra.plugin.nats.Trigger
url: nats://localhost:4222
username: nats_user
password: nats_password
subject: kestra.trigger
durableId: natsTrigger
deliverPolicy: All
maxRecords: 1
The format is (nats://)server_url:port. You can also provide a connection token like so: nats://token@server_url:port
Default value is : 10
Possible settings are:
- All: the default policy. The consumer starts receiving from the earliest available message.
- Last: when first consuming messages, start with the last message added to the stream, or the last message in the stream that matches the consumer's filter subject if defined.
- New: when first consuming messages, only receive messages that were created after the consumer was created.
- ByStartSequence: when first consuming messages, start at the first message having the given sequence number, or the next one available.
- ByStartTime: when first consuming messages, start with messages on or after this time. The consumer is required to specify since, which defines this start time.
- LastPerSubject: when first consuming messages, start with the latest message for each filtered subject currently in the stream.
Default value is : All
Default value is : false
The interval between two successive polls of the schedule; a longer interval avoids overloading the remote system with too many calls. For most triggers that depend on external systems, the interval should be at least PT30S. See ISO 8601 durations for the available interval values.
Default value is : 60.000000000
Default value is : false
This is not a hard limit and is evaluated every second.
The maximum duration to wait for new messages when none are available.
Default value is : 2.000000000
By default, all messages are consumed from the subjects, either from the beginning of the log or from the current durable id position. You can also provide an arbitrary start time to get all messages since that date for a new durable id. Note that if you don't provide a durable id, you will retrieve all messages starting from this date on every subsequent use of this task. Must be a valid ISO 8601 date.
1 nested properties
Examples
Creates a new Key/Value bucket, with all required properties.
id: nats_kv_create_bucket
namespace: company.team
tasks:
- id: create_bucket
type: io.kestra.plugin.nats.kv.CreateBucket
url: nats://localhost:4222
username: nats_user
password: nats_passwd
name: my_bucket
Creates a new Key/Value bucket.
id: nats_kv_create_bucket
namespace: company.team
tasks:
- id: create_bucket
type: io.kestra.plugin.nats.kv.CreateBucket
url: nats://localhost:4222
username: nats_user
password: nats_passwd
name: my_bucket
description: my bucket for special purposes
historyPerKey: 2
bucketSize: 1024
valueSize: 1024
metadata: {"key1":"value1","key2":"value2"}
The format is (nats://)server_url:port. You can also provide a connection token like so: nats://token@server_url:port
Default value is : false
Default value is : false
Default value is : 1
Default value is : false
1 nested properties
Examples
id: nats_kv_delete
namespace: company.team
tasks:
- id: delete
type: io.kestra.plugin.nats.kv.Delete
url: nats://localhost:4222
username: nats_user
password: nats_passwd
bucketName: my_bucket
keys:
- key1
- key2
The format is (nats://)server_url:port. You can also provide a connection token like so: nats://token@server_url:port
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
Gets a value from a NATS Key/Value bucket by keys.
id: nats_kv_get
namespace: company.team
tasks:
- id: get
type: io.kestra.plugin.nats.kv.Get
url: nats://localhost:4222
username: nats_user
password: nats_passwd
bucketName: my_bucket
keys:
- key1
- key2
Gets a value from a NATS Key/Value bucket by keys with revisions.
id: nats_kv_get
namespace: company.team
tasks:
- id: get
type: io.kestra.plugin.nats.kv.Get
url: nats://localhost:4222
username: nats_user
password: nats_passwd
bucketName: my_bucket
keyRevisions:
- key1: 1
- key2: 3
The format is (nats://)server_url:port. You can also provide a connection token like so: nats://token@server_url:port
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
id: nats_kv_put
namespace: company.team
tasks:
- id: put
type: io.kestra.plugin.nats.kv.Put
url: nats://localhost:4222
username: nats_user
password: nats_passwd
bucketName: my_bucket
values:
- key1: value1
- key2: value2
- key3:
- subKey1: some other value
The format is (nats://)server_url:port. You can also provide a connection token like so: nats://token@server_url:port
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
id: neo4j_batch
namespace: company.team
tasks:
- id: batch
type: io.kestra.plugin.neo4j.Batch
url: "{{ url }}"
username: "{{ username }}"
password: "{{ password }}"
query: |
UNWIND $props AS properties
MERGE (y:Year {year: properties.year})
MERGE (y)<-[:IN]-(e:Event {id: properties.id})
RETURN e.id AS x ORDER BY x
from: "{{ outputs.previous_task_id.uri }}"
chunk: 1000
The query must contain the clause "UNWIND $props AS X", where $props is the variable that receives the source data for the batch.
Default value is : false
Default value is : 1000
Default value is : false
Default value is : false
If not specified, basic auth won't be used.
The URL can either be in HTTP or Bolt format
If not specified, basic auth won't be used.
1 nested properties
Examples
id: neo4j_query
namespace: company.team
tasks:
- id: query
type: io.kestra.plugin.neo4j.Query
url: "{{ url }}"
username: "{{ username }}"
password: "{{ password }}"
query: |
MATCH (p:Person)
RETURN p
storeType: FETCH
Default value is : false
Default value is : false
Default value is : false
If not specified, basic auth won't be used.
- FETCH_ONE: outputs the first row
- FETCH: outputs all rows
- STORE: stores all rows in a file
- NONE: does nothing
Default value is : NONE
The URL can either be in HTTP or Bolt format
If not specified, basic auth won't be used.
1 nested properties
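As a sketch of the STORE option listed above, which writes all rows to an internal storage file instead of fetching them into the execution context (connection values are placeholders, as in the example above):

```yaml
id: neo4j_query_store
namespace: company.team

tasks:
  - id: query
    type: io.kestra.plugin.neo4j.Query
    url: "{{ url }}"
    username: "{{ username }}"
    password: "{{ password }}"
    query: |
      MATCH (p:Person)
      RETURN p
    storeType: STORE   # rows are written to a file rather than returned inline
```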
The message will include a link to the execution page in the UI along with the execution ID, namespace, flow name, the start date, duration and the final status of the execution, and (if failed) the task that led to a failure.
Use this notification task only in a flow that has a Flow trigger. Don't use this notification task in errors tasks. Instead, for errors tasks, use the DiscordIncomingWebhook task.
Examples
Send a Discord notification on a failed flow execution
id: failure_alert
namespace: company.team
tasks:
- id: send_alert
type: io.kestra.plugin.notifications.discord.DiscordExecution
url: "{{ secret('DISCORD_WEBHOOK') }}" # format: https://discord.com/api/webhooks/xxx/yyy
username: "MyUsername"
embedList:
- title: "Discord Notification"
color:
- 255
- 255
- 255
executionId: "{{trigger.executionId}}"
triggers:
- id: failed_prod_workflows
type: io.kestra.plugin.core.trigger.Flow
conditions:
- type: io.kestra.plugin.core.condition.ExecutionStatusCondition
in:
- FAILED
- WARNING
- type: io.kestra.plugin.core.condition.ExecutionNamespaceCondition
namespace: prod
prefix: true
Default value is : false
Default value is : false
Defaults to the current execution; change it to {{ trigger.executionId }} if you use this task with a Flow trigger, so that the original execution is referenced.
Default value is : "{{ execution.id }}"
Default value is : false
1 nested properties
Add this task to a list of errors tasks to implement custom flow-level failure notifications. Check the Discord documentation for more details.
Examples
Send a Discord notification on a failed flow execution
id: unreliable_flow
namespace: company.team
tasks:
- id: fail
type: io.kestra.plugin.scripts.shell.Commands
runner: PROCESS
commands:
- exit 1
errors:
- id: alert_on_failure
type: io.kestra.plugin.notifications.discord.DiscordIncomingWebhook
url: "{{ secret('DISCORD_WEBHOOK') }}" # https://discord.com/api/webhooks/000000/xxxxxxxxxxx
payload: |
{
"username": "MyUsername",
"content": "Failure alert for flow {{ flow.namespace }}.{{ flow.id }} with ID {{ execution.id }}",
"embedList": [{
"title": "Discord Notification"
}]
}
Send a Discord message via incoming webhook
id: discord_incoming_webhook
namespace: company.team
tasks:
- id: send_discord_message
type: io.kestra.plugin.notifications.discord.DiscordIncomingWebhook
url: "{{ secret('DISCORD_WEBHOOK') }}"
payload: |
{
"username": "MyUsername",
"tts": false,
"content": "Hello from the workflow {{ flow.id }}",
"embeds": [
{
"title": "Discord Hello",
"color": 16777215,
"description": "Namespace: dev\nFlow ID: discord\nExecution ID: 1p0JVFz24ZVLSK8iJN6hfs\nExecution Status: SUCCESS\n[Link to the Execution page](http://localhost:8080/ui/executions/dev/discord/1p0JVFz24ZVLSK8iJN6hfs)",
"footer": {
"text": "Succeeded after 00:00:00.385"
}
}
]
}
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Example: [255, 255, 255]
The message will include a link to the execution page in the UI along with the execution ID, namespace, flow name, the start date, duration and the final status of the execution, and (if failed) the task that led to a failure.
Use this notification task only in a flow that has a Flow trigger. Don't use this notification task in errors tasks. Instead, for errors tasks, use the GoogleChatIncomingWebhook task.
Examples
Send a Google Chat notification on a failed flow execution
id: failure_alert
namespace: company.team
tasks:
- id: send_alert
type: io.kestra.plugin.notifications.google.GoogleChatExecution
url: "{{ secret('GOOGLE_WEBHOOK') }}" # format: https://chat.googleapis.com/v1/spaces/xzy/messages
text: "Google Chat Notification"
executionId: "{{trigger.executionId}}"
triggers:
- id: failed_prod_workflows
type: io.kestra.plugin.core.trigger.Flow
conditions:
- type: io.kestra.plugin.core.condition.ExecutionStatusCondition
in:
- FAILED
- WARNING
- type: io.kestra.plugin.core.condition.ExecutionNamespaceCondition
namespace: prod
prefix: true
Check the Create an Incoming Webhook documentation for more details.
Default value is : false
Default value is : false
Defaults to the current execution; change it to {{ trigger.executionId }} if you use this task with a Flow trigger, so that the original execution is referenced.
Default value is : "{{ execution.id }}"
Default value is : false
1 nested properties
Add this task to a list of errors tasks to implement custom flow-level failure notifications. Check the Google documentation for more details.
Examples
Send a Google Chat notification on a failed flow execution
id: unreliable_flow
namespace: company.team
tasks:
- id: fail
type: io.kestra.plugin.scripts.shell.Commands
runner: PROCESS
commands:
- exit 1
errors:
- id: alert_on_failure
type: io.kestra.plugin.notifications.google.GoogleChatIncomingWebhook
url: "{{ secret('GOOGLE_WEBHOOK') }}" # https://chat.googleapis.com/v1/spaces/xzy/messages?threadKey=errorThread
payload: |
{
"text": "Google Chat Alert"
}
Send a Google Chat message via incoming webhook
id: google_incoming_webhook
namespace: company.team
tasks:
- id: send_google_chat_message
type: io.kestra.plugin.notifications.google.GoogleChatIncomingWebhook
url: "{{ secret('GOOGLE_WEBHOOK') }}"
payload: |
{
"text": "Google Chat Hello"
}
Check the Create an Incoming Webhook documentation for more details.
Default value is : false
Default value is : false
Default value is : false
1 nested properties
The message will include a link to the execution page in the UI along with the execution ID, namespace, flow name, the start date, duration and the final status of the execution, and (if failed) the task that led to a failure.
Use this notification task only in a flow that has a Flow trigger, as shown in this example. Don't use this notification task in errors tasks. Instead, for errors tasks, use the MailSend task.
Examples
Send an email notification on a failed flow execution
id: failure_alert
namespace: company.team
tasks:
  - id: send_alert
    type: io.kestra.plugin.notifications.mail.MailExecution
    to: [email protected]
    from: [email protected]
    subject: "The workflow execution {{trigger.executionId}} failed for the flow {{trigger.flowId}} in the namespace {{trigger.namespace}}"
    host: mail.privateemail.com
    port: 465
    username: "{{ secret('EMAIL_USERNAME') }}"
    password: "{{ secret('EMAIL_PASSWORD') }}"
    executionId: "{{ trigger.executionId }}"
triggers:
  - id: failed_prod_workflows
    type: io.kestra.plugin.core.trigger.Flow
    conditions:
      - type: io.kestra.plugin.core.condition.ExecutionStatusCondition
        in:
          - FAILED
          - WARNING
      - type: io.kestra.plugin.core.condition.ExecutionNamespaceCondition
        namespace: prod
        prefix: true
Default value is : false
The attachment will be shown in the email client as separate files available for download, or displayed inline if the client supports it (for example, most browsers display PDFs in a popup window)
Note that each email address must be compliant with the RFC2822 format
Default value is : false
The provided images are assumed to be of MIME type png, jpg or whatever the email client supports as valid image that can be embedded in HTML content
Default is the current execution, change it to {{ trigger.executionId }} if you use this task with a Flow trigger to use the original execution.
Default value is : "{{ execution.id }}"
Both text and HTML can be provided, which will be offered to the email client as alternative content. Email clients that support it will favor HTML over plain text and ignore the text body completely
Default value is : false
Both text and HTML can be provided, which will be offered to the email client as alternative content. Email clients that support it will favor HTML over plain text and ignore the text body completely
It controls the maximum timeout value when sending emails
Default value is : 10000
Note that each email address must be compliant with the RFC2822 format
Will default to SMTPS if left empty
Default value is : SMTPS
1 nested properties
##### Examples
Send an email on a failed flow execution
id: unreliable_flow
namespace: company.team
tasks:
  - id: fail
    type: io.kestra.plugin.scripts.shell.Commands
    runner: PROCESS
    commands:
      - exit 1
errors:
  - id: send_email
    type: io.kestra.plugin.notifications.mail.MailSend
    from: [email protected]
    to: [email protected]
    username: "{{ secret('EMAIL_USERNAME') }}"
    password: "{{ secret('EMAIL_PASSWORD') }}"
    host: mail.privateemail.com
    port: 465 # or 587
    subject: "Kestra workflow failed for the flow {{flow.id}} in the namespace {{flow.namespace}}"
    htmlTextContent: "Failure alert for flow {{ flow.namespace }}.{{ flow.id }} with ID {{ execution.id }}"
Default value is : false
The attachment will be shown in the email client as separate files available for download, or displayed inline if the client supports it (for example, most browsers display PDFs in a popup window)
Note that each email address must be compliant with the RFC2822 format
Default value is : false
The provided images are assumed to be of MIME type png, jpg or whatever the email client supports as valid image that can be embedded in HTML content
Both text and HTML can be provided, which will be offered to the email client as alternative content. Email clients that support it will favor HTML over plain text and ignore the text body completely
Default value is : false
Both text and HTML can be provided, which will be offered to the email client as alternative content. Email clients that support it will favor HTML over plain text and ignore the text body completely
It controls the maximum timeout value when sending emails
Default value is : 10000
Note that each email address must be compliant with the RFC2822 format
Will default to SMTPS if left empty
Default value is : SMTPS
1 nested properties
Note that each email address must be compliant with the RFC2822 format
Default value is : application/octet-stream
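A minimal sketch of how the attachment-related properties above might fit together in a MailSend task. This is an assumption-based illustration: the nested attachment object is assumed to accept name, uri, and contentType, and the upstream generate_report task is hypothetical.

```yaml
- id: send_report
  type: io.kestra.plugin.notifications.mail.MailSend
  from: [email protected]
  to: [email protected]
  host: mail.privateemail.com
  port: 465
  username: "{{ secret('EMAIL_USERNAME') }}"
  password: "{{ secret('EMAIL_PASSWORD') }}"
  subject: "Daily report"
  htmlTextContent: "See the attached report."
  attachments:
    # assumed nested-property names; contentType overrides the
    # application/octet-stream default listed above
    - name: report.csv
      uri: "{{ outputs.generate_report.outputFiles['report.csv'] }}" # hypothetical upstream task
      contentType: text/csv
```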
Add this task to a list of errors tasks to implement custom flow-level failure notifications. Check the Opsgenie documentation for more details.
##### Examples
Send a failed flow alert to Opsgenie
id: unreliable_flow
namespace: company.team
tasks:
  - id: fail
    type: io.kestra.plugin.scripts.shell.Commands
    runner: PROCESS
    commands:
      - exit 1
errors:
  - id: alert_on_failure
    type: io.kestra.plugin.notifications.opsgenie.OpsgenieAlert
    url: "{{ secret('OPSGENIE_REQUEST') }}" # https://api.opsgenie.com/v2/alerts/requests/xxx000xxxxx
    payload: |
      {
        "message": "Kestra Opsgenie alert",
        "alias": "ExecutionError",
        "responders": [
          {"id": "4513b7ea-3b91-438f-b7e4-e3e54af9147c", "type": "team"},
          {"id": "bb4d9938-c3c2-455d-aaab-727aa701c0d8", "type": "user"},
          {"id": "aee8a0de-c80f-4515-a232-501c0bc9d715", "type": "escalation"},
          {"id": "80564037-1984-4f38-b98e-8a1f662df552", "type": "schedule"}
        ],
        "visibleTo": [
          {"id": "4513b7ea-3b91-438f-b7e4-e3e54af9147c", "type": "team"},
          {"id": "bb4d9938-c3c2-455d-aaab-727aa701c0d8", "type": "user"}
        ],
        "tags": ["ExecutionFail", "Error", "Execution"],
        "priority": "P1"
      }
    authorizationToken: sampleAuthorizationToken
Send an Opsgenie alert
id: opsgenie_incoming_webhook
namespace: company.team
tasks:
  - id: send_opsgenie_message
    type: io.kestra.plugin.notifications.opsgenie.OpsgenieAlert
    url: "{{ secret('OPSGENIE_REQUEST') }}"
    payload: |
      {
        "message": "Kestra Opsgenie alert",
        "alias": "Some Execution",
        "responders": [
          {"id": "4513b7ea-3b91-438f-b7e4-e3e54af9147c", "type": "team"},
          {"id": "bb4d9938-c3c2-455d-aaab-727aa701c0d8", "type": "user"}
        ],
        "visibleTo": [
          {"id": "4513b7ea-3b91-438f-b7e4-e3e54af9147c", "type": "team"},
          {"id": "bb4d9938-c3c2-455d-aaab-727aa701c0d8", "type": "user"}
        ],
        "tags": ["Execution"],
        "priority": "P2"
      }
    authorizationToken: sampleAuthorizationToken
Default value is : false
Default value is : false
Default value is : false
1 nested properties
The message will include a link to the execution page in the UI along with the execution ID, namespace, flow name, the start date, duration and the final status of the execution, and (if failed) the task that led to a failure.
Use this notification task only in a flow that has a Flow trigger. Don't use this notification task in errors tasks. Instead, for errors tasks, use the OpsgenieAlert task.
##### Examples
Send notification on a failed flow execution via Opsgenie
id: failure_alert
namespace: company.team
tasks:
  - id: send_alert
    type: io.kestra.plugin.notifications.opsgenie.OpsgenieExecution
    url: "{{ secret('OPSGENIE_REQUEST') }}" # format: 'https://api.opsgenie.com/v2/alerts/requests/xxxxxxyx-yyyx-xyxx-yyxx-yyxyyyyyxxxx'
    message: "Kestra Opsgenie alert"
    alias: ExecutionError
    responders:
      4513b7ea-3b91-438f-b7e4-e3e54af9147c: team
      bb4d9938-c3c2-455d-aaab-727aa701c0d8: user
      aee8a0de-c80f-4515-a232-501c0bc9d715: escalation
      80564037-1984-4f38-b98e-8a1f662df552: schedule
    visibleTo:
      4513b7ea-3b91-438f-b7e4-e3e54af9147c: team
      bb4d9938-c3c2-455d-aaab-727aa701c0d8: user
    priority: P1
    tags:
      - ExecutionError
      - Error
      - Fail
      - Execution
    authorizationToken: sampleAuthorizationToken
    executionId: "{{trigger.executionId}}"
triggers:
  - id: failed_prod_workflows
    type: io.kestra.plugin.core.trigger.Flow
    conditions:
      - type: io.kestra.plugin.core.condition.ExecutionStatusCondition
        in:
          - FAILED
          - WARNING
      - type: io.kestra.plugin.core.condition.ExecutionNamespaceCondition
        namespace: prod
        prefix: true
Default value is : false
Default value is : false
Default is the current execution, change it to {{ trigger.executionId }} if you use this task with a Flow trigger to use the original execution.
Default value is : "{{ execution.id }}"
Default value is : false
1 nested properties
Add this task to a list of errors tasks to implement custom flow-level failure notifications. Check the PagerDuty documentation for more details.
##### Examples
Send a PagerDuty alert on a failed flow execution
id: unreliable_flow
namespace: company.team
tasks:
  - id: fail
    type: io.kestra.plugin.scripts.shell.Commands
    runner: PROCESS
    commands:
      - exit 1
errors:
  - id: alert_on_failure
    type: io.kestra.plugin.notifications.pagerduty.PagerDutyAlert
    url: "{{ secret('PAGERDUTY_EVENT') }}" # https://events.pagerduty.com/v2/enqueue
    payload: |
      {
        "dedup_key": "samplekey",
        "routing_key": "samplekey",
        "event_action": "trigger",
        "payload": {
          "summary": "PagerDuty alert"
        }
      }
Acknowledge a PagerDuty alert via the Events API
id: pagerduty_alert
namespace: company.team
tasks:
  - id: send_pagerduty_alert
    type: io.kestra.plugin.notifications.pagerduty.PagerDutyAlert
    url: "{{ secret('PAGERDUTY_EVENT') }}"
    payload: |
      {
        "dedup_key": "samplekey",
        "routing_key": "samplekey",
        "event_action": "acknowledge"
      }
Default value is : false
Default value is : false
Default value is : false
1 nested properties
The message will include a link to the execution page in the UI along with the execution ID, namespace, flow name, the start date, duration and the final status of the execution, and (if failed) the task that led to a failure.
Use this notification task only in a flow that has a Flow trigger. Don't use this notification task in errors tasks. Instead, for errors tasks, use the PagerDutyAlert task.
##### Examples
Send a PagerDuty notification on a failed flow execution
id: failure_alert
namespace: company.team
tasks:
  - id: send_alert
    type: io.kestra.plugin.notifications.pagerduty.PagerDutyExecution
    url: "{{ secret('PAGERDUTY_EVENT') }}" # format: https://events.pagerduty.com/v2/enqueue
    payloadSummary: "PagerDuty Alert"
    deduplicationKey: "dedupkey"
    routingKey: "routingkey"
    eventAction: "acknowledge"
    executionId: "{{trigger.executionId}}"
triggers:
  - id: failed_prod_workflows
    type: io.kestra.plugin.core.trigger.Flow
    conditions:
      - type: io.kestra.plugin.core.condition.ExecutionStatusCondition
        in:
          - FAILED
          - WARNING
      - type: io.kestra.plugin.core.condition.ExecutionNamespaceCondition
        namespace: prod
        prefix: true
Default value is : false
Default value is : false
Default is the current execution, change it to {{ trigger.executionId }} if you use this task with a Flow trigger to use the original execution.
Default value is : "{{ execution.id }}"
Default value is : false
1 nested properties
The message will include a link to the execution page in the UI along with the execution ID, namespace, flow name, the start date, duration and the final status of the execution, and (if failed) the task that led to a failure.
Use this notification task only in a flow that has a Flow trigger, as shown in this example. Don't use this notification task in errors tasks. Instead, for errors tasks, use the SendGridMailSend task.
##### Examples
Send a SendGrid email notification on a failed flow execution
id: failure_alert
namespace: company.team
tasks:
  - id: send_alert
    type: io.kestra.plugin.notifications.sendgrid.SendGridMailExecution
    to:
      - [email protected]
    from: [email protected]
    subject: "The workflow execution {{trigger.executionId}} failed for the flow {{trigger.flowId}} in the namespace {{trigger.namespace}}"
    sendgridApiKey: "{{ secret('SENDGRID_API_KEY') }}"
    executionId: "{{ trigger.executionId }}"
triggers:
  - id: failed_prod_workflows
    type: io.kestra.plugin.core.trigger.Flow
    conditions:
      - type: io.kestra.plugin.core.condition.ExecutionStatusCondition
        in:
          - FAILED
          - WARNING
      - type: io.kestra.plugin.core.condition.ExecutionNamespaceCondition
        namespace: prod
        prefix: true
Note that each email address must be compliant with the RFC2822 format
Default value is : false
The attachment will be shown in the email client as separate files available for download, or displayed inline if the client supports it (for example, most browsers display PDFs in a popup window)
Note that each email address must be compliant with the RFC2822 format
Default value is : false
The provided images are assumed to be of MIME type png, jpg or whatever the email client supports as valid image that can be embedded in HTML content
Default is the current execution, change it to {{ trigger.executionId }} if you use this task with a Flow trigger to use the original execution.
Default value is : "{{ execution.id }}"
Both text and HTML can be provided, which will be offered to the email client as alternative content. Email clients that support it will favor HTML over plain text and ignore the text body completely
Default value is : false
Both text and HTML can be provided, which will be offered to the email client as alternative content. Email clients that support it will favor HTML over plain text and ignore the text body completely
1 nested properties
##### Examples
Send an email on a failed flow execution
id: unreliable_flow
namespace: company.team
tasks:
  - id: fail
    type: io.kestra.plugin.scripts.shell.Commands
    runner: PROCESS
    commands:
      - exit 1
errors:
  - id: send_email
    type: io.kestra.plugin.notifications.sendgrid.SendGridMailSend
    from: [email protected]
    to:
      - [email protected]
    sendgridApiKey: "{{ secret('SENDGRID_API_KEY') }}"
    subject: "Kestra workflow failed for the flow {{flow.id}} in the namespace {{flow.namespace}}"
    htmlTextContent: "Failure alert for flow {{ flow.namespace }}.{{ flow.id }} with ID {{ execution.id }}"
Note that each email address must be compliant with the RFC2822 format
Default value is : false
The attachment will be shown in the email client as separate files available for download, or displayed inline if the client supports it (for example, most browsers display PDFs in a popup window)
Note that each email address must be compliant with the RFC2822 format
Default value is : false
The provided images are assumed to be of MIME type png, jpg or whatever the email client supports as valid image that can be embedded in HTML content
Both text and HTML can be provided, which will be offered to the email client as alternative content. Email clients that support it will favor HTML over plain text and ignore the text body completely
Default value is : false
Both text and HTML can be provided, which will be offered to the email client as alternative content. Email clients that support it will favor HTML over plain text and ignore the text body completely
1 nested properties
Note that each email address must be compliant with the RFC2822 format
Default value is : application/octet-stream
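The SendGrid attachment properties above follow the same shape; a minimal sketch, assuming the nested attachment object accepts name, uri, and contentType (the upstream export_data task is hypothetical):

```yaml
- id: send_report
  type: io.kestra.plugin.notifications.sendgrid.SendGridMailSend
  from: [email protected]
  to:
    - [email protected]
  sendgridApiKey: "{{ secret('SENDGRID_API_KEY') }}"
  subject: "Daily report"
  htmlTextContent: "See the attached report."
  attachments:
    # assumed nested-property names; contentType overrides the
    # application/octet-stream default listed above
    - name: report.csv
      uri: "{{ outputs.export_data.outputFiles['report.csv'] }}" # hypothetical upstream task
      contentType: text/csv
```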
Add this task to a list of errors tasks to implement custom flow-level failure notifications.
The only required input is a DSN string, which you can find in your Sentry project settings under Client Keys (DSN). For a more detailed description of how to find your DSN, see the Sentry documentation.
You can customize the alert payload, which is a JSON object, or you can skip it and use the default payload created by kestra. For more information about the payload, check the Sentry Event Payloads documentation.
The event_id is an optional payload attribute that you can use to override the default event ID. If you don't specify it (recommended), kestra will generate a random UUID. You can use this attribute to group events together, but note that this must be a UUID type. For more information, check the Sentry documentation.
##### Examples
Send a Sentry alert on a failed flow execution
id: unreliable_flow
namespace: company.team
tasks:
  - id: fail
    type: io.kestra.plugin.scripts.shell.Commands
    runner: PROCESS
    commands:
      - exit 1
errors:
  - id: alert_on_failure
    type: io.kestra.plugin.notifications.sentry.SentryAlert
    dsn: "{{ secret('SENTRY_DSN') }}" # format: https://[email protected]/xxx
    endpointType: ENVELOPE
Send a custom Sentry alert
id: sentry_alert
namespace: company.team
tasks:
  - id: send_sentry_message
    type: io.kestra.plugin.notifications.sentry.SentryAlert
    dsn: "{{ secret('SENTRY_DSN') }}"
    endpointType: "ENVELOPE"
    payload: |
      {
        "timestamp": "{{ execution.startDate }}",
        "platform": "java",
        "level": "error",
        "transaction": "/execution/id/{{ execution.id }}",
        "server_name": "localhost:8080",
        "message": {
          "message": "Execution {{ execution.id }} failed"
        },
        "extra": {
          "Namespace": "{{ flow.namespace }}",
          "Flow ID": "{{ flow.id }}",
          "Execution ID": "{{ execution.id }}",
          "Link": "http://localhost:8080/ui/executions/{{flow.namespace}}/{{flow.id}}/{{execution.id}}"
        }
      }
Default value is : false
Default value is : false
Default value is : ENVELOPE
Default value is : false
1 nested properties
The alert message will include a link to the execution page in the UI along with the execution ID, namespace, flow name, the start date, duration and the final status of the execution, and (if failed) the task that led to a failure.
Use this notification task only in a flow that has a Flow trigger. Don't use this notification task in errors tasks. Instead, for errors tasks, use the SentryAlert task.
The only required input is a DSN string, which you can find in your Sentry project settings under Client Keys (DSN). For a more detailed description of how to find your DSN, see the Sentry documentation.
You can customize the alert payload, which is a JSON object. For more information about the payload, check the Sentry Event Payloads documentation.
The level parameter is the severity of the issue. The task documentation lists all available options, including DEBUG, INFO, WARNING, ERROR, and FATAL. The default value is ERROR.
##### Examples
This monitoring flow is triggered anytime a flow fails in the prod namespace. It then sends a Sentry alert with the execution information. You can fully customize the trigger conditions.
id: failure_alert
namespace: company.team
tasks:
  - id: send_alert
    type: io.kestra.plugin.notifications.sentry.SentryExecution
    transaction: "/execution/id/{{ trigger.executionId }}"
    dsn: "{{ secret('SENTRY_DSN') }}"
    level: ERROR
    executionId: "{{ trigger.executionId }}"
triggers:
  - id: failed_prod_workflows
    type: io.kestra.plugin.core.trigger.Flow
    conditions:
      - type: io.kestra.plugin.core.condition.ExecutionStatusCondition
        in:
          - FAILED
          - WARNING
      - type: io.kestra.plugin.core.condition.ExecutionNamespaceCondition
        namespace: prod
        prefix: true
Default value is : false
Default value is : false
Default value is : ENVELOPE
Default value is : a generated unique identifier
Default is the current execution, change it to {{ trigger.executionId }} if you use this task with a Flow trigger to use the original execution.
Default value is : "{{ execution.id }}"
Acceptable values are: fatal, error, warning, info, debug.
Default value is : ERROR
Default value is : false
Default value is : JAVA
For example, in a web app, this might be the route name
1 nested properties
The message will include a link to the execution page in the UI along with the execution ID, namespace, flow name, the start date, duration and the final status of the execution, and (if failed) the task that led to a failure.
Use this notification task only in a flow that has a Flow trigger. Don't use this notification task in errors tasks. Instead, for errors tasks, use the SlackIncomingWebhook task.
##### Examples
Send a Slack notification on a failed flow execution
id: failure_alert
namespace: company.team
tasks:
  - id: send_alert
    type: io.kestra.plugin.notifications.slack.SlackExecution
    url: "{{ secret('SLACK_WEBHOOK') }}" # format: https://hooks.slack.com/services/xzy/xyz/xyz
    channel: "#general"
    executionId: "{{trigger.executionId}}"
triggers:
  - id: failed_prod_workflows
    type: io.kestra.plugin.core.trigger.Flow
    conditions:
      - type: io.kestra.plugin.core.condition.ExecutionStatusCondition
        in:
          - FAILED
          - WARNING
      - type: io.kestra.plugin.core.condition.ExecutionNamespaceCondition
        namespace: prod
        prefix: true
Send a Rocket.Chat notification on a failed flow execution
id: failure_alert
namespace: debug
tasks:
  - id: send_alert_to_rocket_chat
    type: io.kestra.plugin.notifications.slack.SlackExecution
    url: "{{ secret('ROCKET_CHAT_WEBHOOK') }}"
    channel: "#errors"
    executionId: "{{ trigger.executionId }}"
    username: "Kestra TEST"
    iconUrl: "https://avatars.githubusercontent.com/u/59033362?s=48"
triggers:
  - id: failed_prod_workflows
    type: io.kestra.plugin.core.trigger.Flow
    conditions:
      - type: io.kestra.plugin.core.condition.ExecutionStatusCondition
        in:
          - FAILED
          - WARNING
      - type: io.kestra.plugin.core.condition.ExecutionNamespaceCondition
        namespace: debug
        prefix: true
Check the Create an Incoming Webhook documentation for more details.
Default value is : false
Default value is : false
Default is the current execution, change it to {{ trigger.executionId }} if you use this task with a Flow trigger to use the original execution.
Default value is : "{{ execution.id }}"
Default value is : false
1 nested properties
Add this task to send direct Slack notifications. Check the Slack documentation for more details.
##### Examples
Send a Slack notification on a failed flow execution
id: unreliable_flow
namespace: company.team
tasks:
  - id: fail
    type: io.kestra.plugin.scripts.shell.Commands
    runner: PROCESS
    commands:
      - exit 1
errors:
  - id: alert_on_failure
    type: io.kestra.plugin.notifications.slack.SlackIncomingWebhook
    url: "{{ secret('SLACK_WEBHOOK') }}" # https://hooks.slack.com/services/xzy/xyz/xyz
    payload: |
      {
        "text": "Failure alert for flow {{ flow.namespace }}.{{ flow.id }} with ID {{ execution.id }}"
      }
Send a Slack message via incoming webhook with a text argument
id: slack_incoming_webhook
namespace: company.team
tasks:
  - id: send_slack_message
    type: io.kestra.plugin.notifications.slack.SlackIncomingWebhook
    url: "{{ secret('SLACK_WEBHOOK') }}"
    payload: |
      {
        "text": "Hello from the workflow {{ flow.id }}"
      }
Send a Slack message via incoming webhook with a blocks argument; read more about blocks in the Slack Block Kit documentation
id: slack_incoming_webhook
namespace: company.team
tasks:
  - id: send_slack_message
    type: io.kestra.plugin.notifications.slack.SlackIncomingWebhook
    url: "{{ secret('SLACK_WEBHOOK') }}"
    payload: |
      {
        "blocks": [
          {
            "type": "section",
            "text": {
              "type": "mrkdwn",
              "text": "Hello from the workflow *{{ flow.id }}*"
            }
          }
        ]
      }
Send a Rocket.Chat message via incoming webhook
id: rocket_chat_notification
namespace: company.team
tasks:
  - id: send_rocket_chat_message
    type: io.kestra.plugin.notifications.slack.SlackIncomingWebhook
    url: "{{ secret('ROCKET_CHAT_WEBHOOK') }}"
    payload: |
      {
        "alias": "Kestra TEST",
        "avatar": "https://avatars.githubusercontent.com/u/59033362?s=48",
        "emoji": ":smirk:",
        "roomId": "#my-channel",
        "text": "Sample",
        "tmshow": true,
        "attachments": [
          {
            "collapsed": false,
            "color": "#ff0000",
            "text": "Yay!",
            "title": "Attachment Example",
            "title_link": "https://rocket.chat",
            "title_link_download": false,
            "fields": [
              {
                "short": false,
                "title": "Test title",
                "value": "Test value"
              },
              {
                "short": true,
                "title": "Test title",
                "value": "Test value"
              }
            ]
          }
        ]
      }
Check the Create an Incoming Webhook documentation for more details.
Default value is : false
Default value is : false
Default value is : false
1 nested properties
The message will include a link to the execution page in the UI along with the execution ID, namespace, flow name, the start date, duration and the final status of the execution, and (if failed) the task that led to a failure.
Use this notification task only in a flow that has a Flow trigger. Don't use this notification task in errors tasks. Instead, for errors tasks, use the TeamsIncomingWebhook task.
##### Examples
Send a Microsoft Teams notification on a failed flow execution
id: failure_alert
namespace: company.team
tasks:
  - id: send_alert
    type: io.kestra.plugin.notifications.teams.TeamsExecution
    url: "{{ secret('TEAMS_WEBHOOK') }}" # format: https://microsoft.webhook.office.com/webhook/xyz
    activityTitle: "Kestra Teams notification"
    executionId: "{{ trigger.executionId }}"
triggers:
  - id: failed_prod_workflows
    type: io.kestra.plugin.core.trigger.Flow
    conditions:
      - type: io.kestra.plugin.core.condition.ExecutionStatusCondition
        in:
          - FAILED
          - WARNING
      - type: io.kestra.plugin.core.condition.ExecutionNamespaceCondition
        namespace: prod
        prefix: true
Default value is : false
Default value is : false
Default is the current execution, change it to {{ trigger.executionId }} if you use this task with a Flow trigger to use the original execution.
Default value is : "{{ execution.id }}"
Default value is : false
Default value is : 0076D7
1 nested properties
Add this task to a list of errors tasks to implement custom flow-level failure notifications. Check the Microsoft Teams documentation for more details.
##### Examples
Send a Microsoft Teams notification on a failed flow execution
id: unreliable_flow
namespace: company.team
tasks:
  - id: fail
    type: io.kestra.plugin.scripts.shell.Commands
    runner: PROCESS
    commands:
      - exit 1
errors:
  - id: alert_on_failure
    type: io.kestra.plugin.notifications.teams.TeamsIncomingWebhook
    url: "{{ secret('TEAMS_WEBHOOK') }}" # format: https://microsoft.webhook.office.com/webhook/xyz
    payload: |
      {
        "@type": "MessageCard",
        "@context": "http://schema.org/extensions",
        "themeColor": "0076D7",
        "summary": "Failure alert for flow {{ flow.namespace }}.{{ flow.id }} with ID {{ execution.id }}",
        "sections": [{
          "activityTitle": "Kestra Workflow Notification",
          "activitySubtitle": "Workflow Execution Finished With Errors",
          "markdown": true
        }],
        "potentialAction": [
          {
            "@type": "OpenUri",
            "name": "Kestra Workflow",
            "targets": [
              {
                "os": "default",
                "uri": "{{ vars.systemUrl }}"
              }
            ]
          }
        ]
      }
Send a Microsoft Teams notification message
url: "https://microsoft.webhook.office.com/webhookb2/XXXXXXXXXX"
payload: |
  {
    "@type": "MessageCard",
    "@context": "http://schema.org/extensions",
    "themeColor": "0076D7",
    "summary": "Notification message",
    "sections": [{
      "activityTitle": "Rolling Workflow started",
      "activitySubtitle": "Workflow Notification",
      "markdown": true
    }],
    "potentialAction": [
      {
        "@type": "OpenUri",
        "name": "Rolling Workflow",
        "targets": [
          {
            "os": "default",
            "uri": "{{ vars.systemUrl }}"
          }
        ]
      }
    ]
  }
Default value is : false
Default value is : false
Default value is : false
1 nested properties
The message will include a link to the execution page in the UI along with the execution ID, namespace, flow name, the start date, duration and the final status of the execution, and (if failed) the task that led to a failure.
Use this notification task only in a flow that has a Flow trigger. Don't use this notification task in errors tasks. Instead, for errors tasks, use the TelegramSend task.
##### Examples
Send a Telegram notification on a failed flow execution
id: failure_alert
namespace: company.team
tasks:
  - id: send_alert
    type: io.kestra.plugin.notifications.telegram.TelegramExecution
    token: "{{ secret('TELEGRAM_TOKEN') }}" # format: 6090305634:xyz
    channel: "2072728690"
    executionId: "{{ trigger.executionId }}"
triggers:
  - id: failed_prod_workflows
    type: io.kestra.plugin.core.trigger.Flow
    conditions:
      - type: io.kestra.plugin.core.condition.ExecutionStatusCondition
        in:
          - FAILED
          - WARNING
      - type: io.kestra.plugin.core.condition.ExecutionNamespaceCondition
        namespace: prod
        prefix: true
Default value is : false
Default value is : false
Default is the current execution, change it to {{ trigger.executionId }} if you use this task with a Flow trigger to use the original execution.
Default value is : "{{ execution.id }}"
Default value is : false
1 nested properties
Default value is : false
Default value is : false
Default value is : false
1 nested properties
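Unlike the other tasks in this section, the TelegramSend properties listed above have no accompanying example. A minimal sketch, assuming the task accepts token and channel as in the TelegramExecution example above and payload as the message text (the channel ID is a placeholder):

```yaml
id: telegram_message
namespace: company.team
tasks:
  - id: send_telegram_message
    type: io.kestra.plugin.notifications.telegram.TelegramSend
    token: "{{ secret('TELEGRAM_TOKEN') }}" # bot token, format: 6090305634:xyz
    channel: "2072728690"                   # placeholder chat ID
    payload: "Hello from the workflow {{ flow.id }}"
```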
Add this task to a list of errors tasks to implement custom flow-level failure notifications. Check the Twilio documentation for more details.
##### Examples
Send a Twilio notification on a failed flow execution
id: unreliable_flow
namespace: company.team
tasks:
  - id: fail
    type: io.kestra.plugin.scripts.shell.Commands
    runner: PROCESS
    commands:
      - exit 1
errors:
  - id: alert_on_failure
    type: io.kestra.plugin.notifications.twilio.TwilioAlert
    url: "{{ secret('TWILIO_ALERT') }}" # https://notify.twilio.com/v1/Services/ISXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Notifications
    payload: |
      {
        "identity": "0000001"
      }
Send a Twilio message via the Twilio Notify API
id: twilio_alert
namespace: company.team
tasks:
  - id: send_twilio_message
    type: io.kestra.plugin.notifications.twilio.TwilioAlert
    url: "{{ secret('TWILIO_ALERT') }}"
    payload: |
      {
        "identity": "0000001"
      }
Default value is : false
Default value is : false
Default value is : false
1 nested properties
The message will include a link to the execution page in the UI along with the execution ID, namespace, flow name, the start date, duration and the final status of the execution, and (if failed) the task that led to a failure.
Use this notification task only in a flow that has a Flow trigger. Don't use this notification task in errors tasks. Instead, for errors tasks, use the TwilioAlert task.
##### Examples
Send a Twilio notification on a failed flow execution
id: failure_alert
namespace: company.team
tasks:
  - id: send_alert
    type: io.kestra.plugin.notifications.twilio.TwilioExecution
    url: "{{ secret('TWILIO_ALERT') }}" # format: https://notify.twilio.com/v1/Services/ISXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/Notifications
    identity: 0000001
    executionId: "{{trigger.executionId}}"
triggers:
  - id: failed_prod_workflows
    type: io.kestra.plugin.core.trigger.Flow
    conditions:
      - type: io.kestra.plugin.core.condition.ExecutionStatusCondition
        in:
          - FAILED
          - WARNING
      - type: io.kestra.plugin.core.condition.ExecutionNamespaceCondition
        namespace: prod
        prefix: true
Default value is : false
Default value is : false
Default is the current execution, change it to {{ trigger.executionId }} if you use this task with a Flow trigger to use the original execution.
Default value is : "{{ execution.id }}"
Default value is : false
1 nested properties
The message will include a link to the execution page in the UI along with the execution ID, namespace, flow name, the start date, duration and the final status of the execution, and (if failed) the task that led to a failure.
Use this notification task only in a flow that has a Flow trigger. Don't use this notification task in errors tasks. Instead, for errors tasks, use the WhatsAppIncomingWebhook task.
##### Examples
Send a WhatsApp notification on a failed flow execution
id: failure_alert
namespace: company.team
tasks:
  - id: send_alert
    type: io.kestra.plugin.notifications.whatsapp.WhatsAppExecution
    url: "{{ secret('WHATSAPP_WEBHOOK') }}"
    profileName: "MyProfile"
    from: 380999999999
    whatsAppIds:
      - "some waId"
      - "waId No2"
    executionId: "{{trigger.executionId}}"
triggers:
  - id: failed_prod_workflows
    type: io.kestra.plugin.core.trigger.Flow
    conditions:
      - type: io.kestra.plugin.core.condition.ExecutionStatusCondition
        in:
          - FAILED
          - WARNING
      - type: io.kestra.plugin.core.condition.ExecutionNamespaceCondition
        namespace: prod
        prefix: true
Default value is : false
Default value is : false
Default is the current execution, change it to {{ trigger.executionId }} if you use this task with a Flow trigger to use the original execution.
Default value is : "{{ execution.id }}"
Default value is : false
1 nested properties
Add this task to a list of errors tasks to implement custom flow-level failure notifications. Check the WhatsApp documentation for more details.
##### Examples
Send a WhatsApp notification on a failed flow execution
id: unreliable_flow
namespace: company.team
tasks:
  - id: fail
    type: io.kestra.plugin.scripts.shell.Commands
    runner: PROCESS
    commands:
      - exit 1
errors:
  - id: alert_on_failure
    type: io.kestra.plugin.notifications.whatsapp.WhatsAppIncomingWebhook
    url: "{{ secret('WHATSAPP_WEBHOOK') }}" # https://webhook.your-domain
    payload: |
      {
        "profileName": "MyName",
        "whatsAppIds": ["IdNo1", "IdNo2"],
        "from": 380999999999
      }
Send a WhatsApp message via incoming webhook
id: whatsapp_incoming_webhook
namespace: company.team
tasks:
  - id: send_whatsapp_message
    type: io.kestra.plugin.notifications.whatsapp.WhatsAppIncomingWebhook
    url: "{{ secret('WHATSAPP_WEBHOOK') }}"
    payload: |
      {
        "profileName": "MyName",
        "whatsAppIds": ["IdNo1", "IdNo2"],
        "from": 380999999999,
        "messageId": "wamIdNo1"
      }
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Add this task to a list of errors tasks to implement custom flow-level failure notifications. Check the Zenduty integration documentation and the Zenduty Events API specification for more details.##### Examples
Send a Zenduty alert on a failed flow execution. Make sure that the payload follows the Zenduty Events API specification, including the required message and alert_type payload properties.
id: unreliable_flow
namespace: company.team
tasks:
- id: fail
type: io.kestra.plugin.scripts.shell.Commands
commands:
- exit 1
errors:
- id: alert_on_failure
type: io.kestra.plugin.notifications.zenduty.ZendutyAlert
url: "https://www.zenduty.com/api/events/{{ secret('ZENDUTY_INTEGRATION_KEY') }}/"
payload: |
{
"alert_type": "info",
"message": "This is info alert",
"summary": "This is the incident summary",
"suppressed": false,
"entity_id": 12345,
"payload": {
"status": "ACME Payments are failing",
"severity": "1",
"project": "kubeprod"
},
"urls": [
{
"link_url": "https://www.example.com/alerts/12345/",
"link_text": "Alert URL"
}
]
}
Default value is : false
Default value is : false
Default value is : false
1 nested properties
The message will include a link to the execution page in the UI along with the execution ID, namespace, flow name, the start date, duration and the final status of the execution, and (if failed) the task that led to a failure.
Use this notification task only in a flow that has a Flow trigger. Don't use this notification task in errors tasks. Instead, for errors tasks, use the ZendutyAlert task.
##### Examples
Send a Zenduty notification on a failed flow execution
id: zenduty_failure_alert
namespace: company.team
tasks:
- id: send_alert
type: io.kestra.plugin.notifications.zenduty.ZendutyExecution
url: "https://www.zenduty.com/api/events/{{ secret('ZENDUTY_INTEGRATION_KEY') }}/"
executionId: "{{ trigger.executionId }}"
message: Kestra workflow execution {{ trigger.executionId }} of a flow {{ trigger.flowId }} in the namespace {{ trigger.namespace }} changed status to {{ trigger.state }}
triggers:
- id: failed_prod_workflows
type: io.kestra.plugin.core.trigger.Flow
conditions:
- type: io.kestra.plugin.core.condition.ExecutionStatusCondition
in:
- FAILED
- WARNING
- type: io.kestra.plugin.core.condition.ExecutionNamespaceCondition
namespace: prod
prefix: true
Default value is : false
Default value is : false
Default is the current execution, change it to {{ trigger.executionId }} if you use this task with a Flow trigger to use the original execution.
Default value is : "{{ execution.id }}"
Default value is : false
1 nested properties
The message will include a link to the execution page in the UI along with the execution ID, namespace, flow name, the start date, duration and the final status of the execution, and (if failed) the task that led to a failure.
Use this notification task only in a flow that has a Flow trigger. Don't use this notification task in errors tasks. Instead, for errors tasks, use the ZulipIncomingWebhook task.
##### Examples
Send a Zulip notification on a failed flow execution
id: failure_alert
namespace: company.team
tasks:
- id: send_alert
type: io.kestra.plugin.notifications.zulip.ZulipExecution
url: "{{ secret('ZULIP_WEBHOOK') }}" # format: https://yourZulipDomain.zulipchat.com/api/v1/external/INTEGRATION_NAME?api_key=API_KEY
channel: "#general"
executionId: "{{trigger.executionId}}"
triggers:
- id: failed_prod_workflows
type: io.kestra.plugin.core.trigger.Flow
conditions:
- type: io.kestra.plugin.core.condition.ExecutionStatusCondition
in:
- FAILED
- WARNING
- type: io.kestra.plugin.core.condition.ExecutionNamespaceCondition
namespace: prod
prefix: true
Check the Incoming Webhook Integrations documentation for more details.
Default value is : false
Default value is : false
Default is the current execution, change it to {{ trigger.executionId }} if you use this task with a Flow trigger to use the original execution.
Default value is : "{{ execution.id }}"
Default value is : false
1 nested properties
Add this task to send direct Zulip notifications. Check the Zulip documentation for more details.
##### Examples
Send a Zulip notification on a failed flow execution
id: unreliable_flow
namespace: company.team
tasks:
- id: fail
type: io.kestra.plugin.scripts.shell.Commands
runner: PROCESS
commands:
- exit 1
errors:
- id: alert_on_failure
type: io.kestra.plugin.notifications.zulip.ZulipIncomingWebhook
url: "{{ secret('ZULIP_WEBHOOK') }}" # https://yourZulipDomain.zulipchat.com/api/v1/external/INTEGRATION_NAME?api_key=API_KEY
payload: |
{
"text": "Failure alert for flow {{ flow.namespace }}.{{ flow.id }} with ID {{ execution.id }}"
}
Send a Zulip message via incoming webhook with a text argument
id: zulip_incoming_webhook
namespace: company.team
tasks:
- id: send_zulip_message
type: io.kestra.plugin.notifications.zulip.ZulipIncomingWebhook
url: "{{ secret('ZULIP_WEBHOOK') }}" # https://yourZulipDomain.zulipchat.com/api/v1/external/INTEGRATION_NAME?api_key=API_KEY
payload: |
{
"text": "Hello from the workflow {{ flow.id }}"
}
Send a Zulip message via incoming webhook with a blocks argument, read more on blocks here
id: zulip_incoming_webhook
namespace: company.team
tasks:
- id: send_zulip_message
type: io.kestra.plugin.notifications.zulip.ZulipIncomingWebhook
url: "{{ secret('ZULIP_WEBHOOK') }}" # format: https://yourZulipDomain.zulipchat.com/api/v1/external/INTEGRATION_NAME?api_key=API_KEY
payload: |
{
"blocks": [
{
"type": "section",
"text": {
"type": "mrkdwn",
"text": "Hello from the workflow *{{ flow.id }}*"
}
}
]
}
Check the Incoming Webhook Integrations documentation for more details.
Default value is : false
Default value is : false
Default value is : false
1 nested properties
For more information, refer to the Chat Completions API docs.
##### Examples
Based on a prompt input, generate a completion response and pass it to a downstream task.
id: openai
namespace: company.team
inputs:
- id: prompt
type: STRING
defaults: What is data orchestration?
tasks:
- id: completion
type: io.kestra.plugin.openai.ChatCompletion
apiKey: "yourOpenAIapiKey"
model: gpt-4o
prompt: "{{ inputs.prompt }}"
- id: response
type: io.kestra.plugin.core.debug.Return
format: "{{ outputs.completion.choices[0].message.content }}"
Based on a prompt input, ask OpenAI to call a function that determines whether you need to respond to a customer's review immediately or wait until later, and then comes up with a suggested response.
id: openai
namespace: company.team
inputs:
- id: prompt
type: STRING
defaults: I love your product and would purchase it again!
tasks:
- id: prioritize_response
type: io.kestra.plugin.openai.ChatCompletion
apiKey: "yourOpenAIapiKey"
model: gpt-4o
messages:
- role: user
content: "{{ inputs.prompt }}"
functions:
- name: respond_to_review
description: Given the customer product review provided as input, determines how urgently a reply is required and then provides suggested response text.
parameters:
- name: response_urgency
type: string
description: How urgently this customer review needs a reply. Bad reviews
must be addressed immediately before anyone sees them. Good reviews can
wait until later.
required: true
enumValues:
- reply_immediately
- reply_later
- name: response_text
type: string
description: The text to post online in response to this review.
required: true
- id: response_urgency
type: io.kestra.plugin.core.debug.Return
format: "{{ outputs.prioritize_response.choices[0].message.function_call.arguments.response_urgency }}"
- id: response_text
type: io.kestra.plugin.core.debug.Return
format: "{{ outputs.prioritize_response.choices[0].message.function_call.arguments.response_text }}"
See the OpenAI model's documentation page for more details.
Default value is : false
Default value is : 10
Default value is : false
Enter a specific function name, or 'auto' to let the model decide. The default is auto.
Default value is : false
Required if prompt is not set.
If not provided, make sure to set the messages property.
1 nested properties
Provide as many details as possible to ensure the model returns an accurate parameter.
Valid types are string, number, integer, boolean, array, object
Optional, but useful for classification problems.
Defaults to false.
Default value is : false
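To force the model to call the function defined in the earlier example rather than letting it decide, the function-selection property described above can be set to the function's name. The property name functionCall below is an assumption inferred from the description, not confirmed by this reference:

```yaml
- id: prioritize_response
  type: io.kestra.plugin.openai.ChatCompletion
  apiKey: "yourOpenAIapiKey"
  model: gpt-4o
  functionCall: respond_to_review  # assumed property name; 'auto' (the default) lets the model decide
  messages:
    - role: user
      content: "{{ inputs.prompt }}"
  functions:
    - name: respond_to_review
      description: Determines how urgently a reply is required and suggests response text.
```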
For more information, refer to the OpenAI Image Generation API docs.
##### Examples
id: openai
namespace: company.team
tasks:
- id: create_image
type: io.kestra.plugin.openai.CreateImage
prompt: A funny cat in a black suit
apiKey: <your-api-key>
download: true
n: 5
Default value is : false
Default value is : 10
Default value is : false
If enabled, the generated image will be downloaded into Kestra's internal storage. Otherwise, the URL of the generated image will be available as a task output.
Default value is : false
Default value is : false
Default value is : LARGE
1 nested properties
An asynchronous refresh would be triggered.
Default value is : false
Default value is : false
Default value is : false
Default value is : 5.000000000
Default value is : false
Default value is : 600.000000000
1 nested properties
Must be a base64-encoded pem file.
Must be a base64-encoded pem file.
Must be a base64-encoded pem file.
Examples
id: pulsar_consume
namespace: company.team
tasks:
- id: consume
type: io.kestra.plugin.pulsar.Consume
uri: pulsar://localhost:26650
topic: test_kestra
deserializer: JSON
subscriptionName: kestra_flow
Using the subscription name, only records that haven't been consumed yet will be fetched.
Can be a string or a list of strings to consume from multiple topics.
You need to specify a Pulsar protocol URL.
- Localhost: pulsar://localhost:6650
- Multiple brokers: pulsar://localhost:6650,localhost:6651,localhost:6652
- TLS authentication: pulsar+ssl://pulsar.us-west.example.com:6651
Default value is : false
Authentication token that can be required by some providers such as Clever Cloud.
Default value is : false
Default value is : Earliest
Default value is : false
It's not a hard limit and is evaluated every second.
It's not a hard limit and is evaluated every second.
If no records are available, the maximum time to wait for a new record.
Default value is : 2.000000000
Required for connecting to topics with a defined schema and strict schema checking.
Can be one of NONE, AVRO, or JSON. NONE means no schema is enforced.
Default value is : NONE
Default value is : Exclusive
1 nested properties
Examples
Read a CSV file, transform it to the right format, and publish it to Pulsar topic.
id: produce
namespace: company.team
inputs:
- type: FILE
id: file
tasks:
- id: csv_reader
type: io.kestra.plugin.serdes.csv.CsvToIon
from: "{{ inputs.file }}"
- id: file_transform
type: io.kestra.plugin.scripts.nashorn.FileTransform
from: "{{ outputs.csv_reader.uri }}"
script: |
var result = {
"key": row.id,
"value": {
"username": row.username,
"tweet": row.tweet
},
"eventTime": row.timestamp,
"properties": {
"key": "value"
}
};
row = result
- id: produce
type: io.kestra.plugin.pulsar.Produce
from: "{{ outputs.file_transform.uri }}"
uri: pulsar://localhost:26650
serializer: JSON
topic: test_kestra
Can be a Kestra internal storage URI, a map or a list in the following format: key, value, eventTime, properties, deliverAt, deliverAfter and sequenceId.
You need to specify a Pulsar protocol URL.
- Localhost: pulsar://localhost:6650
- Multiple brokers: pulsar://localhost:6650,localhost:6651,localhost:6652
- TLS authentication: pulsar+ssl://pulsar.us-west.example.com:6651
Possible values are:
- Shared: by default, multiple producers can publish to a topic.
- Exclusive: require exclusive access for the producer; fail immediately if there's already a producer connected.
- WaitForExclusive: producer creation is pending until it can acquire exclusive access.
Default value is : false
Authentication token that can be required by some providers such as Clever Cloud.
By default, message payloads are not compressed. Supported compression types are:
- NONE: no compression (default).
- LZ4: compress with the LZ4 algorithm; faster but lower compression than ZLib.
- ZLIB: standard ZLib compression.
- ZSTD: compress with the Zstandard codec (since Pulsar 2.3).
- SNAPPY: compress with the Snappy codec (since Pulsar 2.4).
Default value is : false
Default value is : false
Required for connecting to topics with a defined schema and strict schema checking.
Can be one of NONE, AVRO, or JSON. NONE means no schema is enforced.
Default value is : NONE
1 nested properties
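The compression types and producer access modes listed above could be combined on a Produce task as follows. The property names compressionType and accessMode are assumptions inferred from the value lists, not confirmed by this reference:

```yaml
id: pulsar_produce_tuned
namespace: company.team

tasks:
  - id: produce
    type: io.kestra.plugin.pulsar.Produce
    uri: pulsar://localhost:26650
    topic: test_kestra
    serializer: JSON
    from: "{{ outputs.file_transform.uri }}"
    compressionType: ZSTD         # assumed name; one of NONE, LZ4, ZLIB, ZSTD, SNAPPY
    accessMode: WaitForExclusive  # assumed name; one of Shared, Exclusive, WaitForExclusive
```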
Examples
id: pulsar_reader
namespace: company.team
tasks:
- id: reader
type: io.kestra.plugin.pulsar.Reader
uri: pulsar://localhost:26650
topic: test_kestra
deserializer: JSON
Can be a string or a list of strings to consume from multiple topics.
You need to specify a Pulsar protocol URL.
- Localhost: pulsar://localhost:6650
- Multiple brokers: pulsar://localhost:6650,localhost:6651,localhost:6652
- TLS authentication: pulsar+ssl://pulsar.us-west.example.com:6651
Default value is : false
Authentication token that can be required by some providers such as Clever Cloud.
Default value is : false
Default value is : false
It's not a hard limit and is evaluated every second.
It's not a hard limit and is evaluated every second.
The first message read will be the one immediately after the specified message.
If no since or messageId is provided, we start at the beginning of the topic.
If no records are available, the maximum time to wait for a new record.
Default value is : 2.000000000
Required for connecting to topics with a defined schema and strict schema checking.
Can be one of NONE, AVRO, or JSON. NONE means no schema is enforced.
Default value is : NONE
This lets the broker find the latest message that was published before the given duration. For example, since set to 5 minutes (PT5M) tells the broker to find the message published 5 minutes in the past and set the initial position to that messageId.
1 nested properties
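The since property described above can be sketched on the Reader task from the earlier example; this is an illustrative fragment built from the properties shown in this section:

```yaml
id: pulsar_reader_since
namespace: company.team

tasks:
  - id: reader
    type: io.kestra.plugin.pulsar.Reader
    uri: pulsar://localhost:26650
    topic: test_kestra
    deserializer: JSON
    since: PT5M  # start from the message published 5 minutes ago (ISO 8601 duration)
```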
If you would like to consume multiple messages processed within a given time frame and process them in batch, you can use the io.kestra.plugin.pulsar.Trigger instead.
##### Examples
Consume a message from a Pulsar topic in real-time.
id: pulsar
namespace: company.team
tasks:
- id: log
type: io.kestra.plugin.core.log.Log
message: "{{ trigger.value }}"
triggers:
- id: realtime_trigger
type: io.kestra.plugin.pulsar.RealtimeTrigger
topic: kestra_trigger
uri: pulsar://localhost:26650
deserializer: JSON
subscriptionName: kestra_trigger_sub
Using the subscription name, only records that haven't been consumed yet will be fetched.
Can be a string or a list of strings to consume from multiple topics.
You need to specify a Pulsar protocol URL.
- Localhost: pulsar://localhost:6650
- Multiple brokers: pulsar://localhost:6650,localhost:6651,localhost:6652
- TLS authentication: pulsar+ssl://pulsar.us-west.example.com:6651
Authentication token that can be required by some providers such as Clever Cloud.
Default value is : false
Default value is : Earliest
Default value is : false
Required for connecting to topics with a defined schema and strict schema checking.
Can be one of NONE, AVRO, or JSON. NONE means no schema is enforced.
Default value is : NONE
Default value is : Exclusive
1 nested properties
Note that you don't need an extra task to consume the message from the event trigger. The trigger will automatically consume messages, and you can retrieve their content in your flow using the {{ trigger.uri }} variable. If you would like to consume each message from a Pulsar topic in real-time and create one execution per message, you can use the io.kestra.plugin.pulsar.RealtimeTrigger instead.
##### Examples
id: pulsar_trigger
namespace: company.team
tasks:
- id: log
type: io.kestra.plugin.core.log.Log
message: "{{ trigger.value }}"
triggers:
- id: trigger
type: io.kestra.plugin.pulsar.Trigger
interval: PT30S
topic: kestra_trigger
uri: pulsar://localhost:26650
deserializer: JSON
subscriptionName: kestra_trigger_sub
Using the subscription name, only records that haven't been consumed yet will be fetched.
Can be a string or a list of strings to consume from multiple topics.
You need to specify a Pulsar protocol URL.
- Localhost: pulsar://localhost:6650
- Multiple brokers: pulsar://localhost:6650,localhost:6651,localhost:6652
- TLS authentication: pulsar+ssl://pulsar.us-west.example.com:6651
Authentication token that can be required by some providers such as Clever Cloud.
Default value is : false
Default value is : Earliest
The interval between two consecutive polls of the schedule; this can avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the minimal interval should be at least PT30S. See ISO 8601 durations for more information on available interval values.
Default value is : 60.000000000
Default value is : false
It's not a hard limit and is evaluated every second.
It's not a hard limit and is evaluated every second.
If no records are available, the maximum time to wait for a new record.
Default value is : 2.000000000
Required for connecting to topics with a defined schema and strict schema checking.
Can be one of NONE, AVRO, or JSON. NONE means no schema is enforced.
Default value is : NONE
Default value is : Exclusive
1 nested properties
Examples
id: redis_list_pop
namespace: company.team
tasks:
- id: list_pop
type: io.kestra.plugin.redis.list.ListPop
url: redis://:redis@localhost:6379/0
key: mypopkeyjson
serdeType: JSON
maxRecords: 1
Default value is : false
Default value is : 100
Default value is : false
Default value is : false
It's not a hard limit and is evaluated every second.
It's not a hard limit and is evaluated every second.
1 nested properties
Examples
id: redis_list_push
namespace: company.team
tasks:
- id: list_push
type: io.kestra.plugin.redis.list.ListPush
url: redis://:redis@localhost:6379/0
key: mykey
from:
- value1
- value2
Default value is : false
Default value is : false
Default value is : false
1 nested properties
If you would like to consume multiple elements processed within a given time frame and process them in batch, you can use the io.kestra.plugin.redis.list.Trigger instead.
##### Examples
Consume an element from the head of a list in real-time.
id: list_listen
namespace: company.team
tasks:
- id: echo
type: io.kestra.plugin.core.log.Log
message: "Received '{{ trigger.value }}'"
triggers:
- id: watch
type: io.kestra.plugin.redis.RealtimeTrigger
url: redis://localhost:6379/0
key: mytriggerkey
Default value is : false
Default value is : false
1 nested properties
If you would like to consume each message from a list in real-time and create one execution per message, you can use the io.kestra.plugin.redis.list.RealtimeTrigger instead.
##### Examples
id: list_listen
namespace: company.team
tasks:
- id: echo
type: io.kestra.plugin.core.log.Log
message: "{{ trigger.uri }} containing {{ trigger.count }} lines"
triggers:
- id: watch
type: io.kestra.plugin.redis.list.Trigger
url: redis://localhost:6379/0
key: mytriggerkey
maxRecords: 2
Default value is : 100
Default value is : false
The interval between two consecutive polls of the schedule; this can avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the minimal interval should be at least PT30S. See ISO 8601 durations for more information on available interval values.
Default value is : 60.000000000
Default value is : false
It's not a hard limit and is evaluated every second.
It's not a hard limit and is evaluated every second.
1 nested properties
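The interval property described above slots into the earlier Redis list trigger example like this:

```yaml
id: list_listen
namespace: company.team

tasks:
  - id: echo
    type: io.kestra.plugin.core.log.Log
    message: "{{ trigger.uri }} containing {{ trigger.count }} lines"

triggers:
  - id: watch
    type: io.kestra.plugin.redis.list.Trigger
    url: redis://localhost:6379/0
    key: mytriggerkey
    maxRecords: 2
    interval: PT30S  # poll every 30 seconds; PT30S is the recommended minimum
```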
Examples
id: redis_publish
namespace: company.team
tasks:
- id: publish
type: io.kestra.plugin.redis.pubsub.Publish
url: redis://:redis@localhost:6379/0
channel: mych
from:
- value1
- value2
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
id: redis_delete
namespace: company.team
tasks:
- id: delete
type: io.kestra.plugin.redis.string.Delete
url: redis://:redis@localhost:6379/0
keys:
- keyDelete1
- keyDelete2
Default value is : false
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
id: redis_get
namespace: company.team
tasks:
- id: get
type: io.kestra.plugin.redis.string.Get
url: redis://:redis@localhost:6379/0
key: mykey
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
id: redis_set
namespace: company.team
tasks:
- id: set
type: io.kestra.plugin.redis.string.Set
url: redis://:redis@localhost:6379/0
key: mykey
value: myvalue
serdeType: STRING
Default value is : false
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Docker configuration file that can set access credentials to private container registries. Usually located in ~/.docker/config.json.
Default value is : ALWAYS
The size must be greater than 0. If omitted, the system uses 64MB.
Must be a valid mount expression as a string, for example: /home/user:/app.
Volume mounts are disabled by default for security reasons; you must enable them in the server configuration by setting kestra.tasks.scripts.docker.volume-enabled to true.
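Following the note above, volume mounts must first be enabled server-side before any task can use them. A minimal sketch of that server configuration, assuming a standard application.yml layout:

```yaml
# Kestra server configuration (e.g. application.yml) — not part of a flow.
kestra:
  tasks:
    scripts:
      docker:
        volume-enabled: true   # required before tasks can mount volumes
```

A mount expression then follows the /home/user:/app form described above.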
Examples
Make an API call and pass request body to a Groovy script.
id: api_request_to_groovy
namespace: company.team
tasks:
- id: request
type: io.kestra.plugin.core.http.Request
uri: "https://dummyjson.com/products/1"
- id: groovy
type: io.kestra.plugin.scripts.groovy.Eval
script: |
logger.info('{{ outputs.request.body }}')
- id: download
type: io.kestra.plugin.core.http.Download
uri: "https://dummyjson.com/products/1"
- id: run_context_groovy
type: io.kestra.plugin.scripts.groovy.Eval
script: |
// logger.info('Vars: {}', runContext.getVariables())
URI uri = new URI(runContext.variables.outputs.download.uri)
InputStream istream = runContext.storage().getFile(uri)
logger.info('Content: {}', istream.text)
id: groovy_eval
namespace: company.team
tasks:
- id: eval
type: io.kestra.plugin.scripts.groovy.Eval
outputs:
- out
- map
script: |
import io.kestra.core.models.executions.metrics.Counter
logger.info('executionId: {}', runContext.render('{{ execution.id }}'))
runContext.metric(Counter.of('total', 666, 'name', 'bla'))
map = Map.of('test', 'here')
File tempFile = runContext.workingDir().createTempFile().toFile()
var output = new FileOutputStream(tempFile)
output.write('555\n666\n'.getBytes())
out = runContext.storage().putFile(tempFile)
Default value is : false
Default value is : false
Default value is : false
1 nested properties
This allows you to transform the data, previously loaded by Kestra, as you need.
Take an ION format file from Kestra and iterate over it row by row.
Each row populates a row global variable; alter this variable, and the result will be saved to the output file.
If you set row to null, the row will be skipped.
You can create a rows variable to return multiple rows for a single input row.
Examples
Convert row by row of a file from Kestra's internal storage.
id: groovy_file_transform
namespace: company.team
inputs:
- id: file
type: FILE
tasks:
- id: file_transform
type: io.kestra.plugin.scripts.groovy.FileTransform
from: "{{ inputs.file }}"
script: |
logger.info('row: {}', row)
if (row.get('name') == 'richard') {
row = null
} else {
row.put('email', row.get('name') + '@kestra.io')
}
Create multiple rows from one row.
id: groovy_file_transform
namespace: company.team
inputs:
- id: file
type: FILE
tasks:
- id: file_transform
type: io.kestra.plugin.scripts.groovy.FileTransform
from: "{{ inputs.file }}"
script: |
logger.info('row: {}', row)
rows = [["action", "insert"], row]
Transform a JSON string to a file.
id: groovy_file_transform
namespace: company.team
inputs:
- id: json
type: JSON
defaults: [{"name":"jane"}, {"name":"richard"}]
tasks:
- id: file_transform
type: io.kestra.plugin.scripts.groovy.FileTransform
from: "{{ inputs.json }}"
script: |
logger.info('row: {}', row)
if (row.get('name') == 'richard') {
row = null
} else {
row.put('email', row.get('name') + '@kestra.io')
}
JSON transformations using jackson library
id: json_transform_using_jackson
namespace: company.team
tasks:
- id: file_transform
type: io.kestra.plugin.scripts.groovy.FileTransform
from: '[{"name":"John Doe", "age":99, "embedded":{"foo":"bar"}}]'
script: |
import com.fasterxml.jackson.*
def mapper = new databind.ObjectMapper();
def jsonStr = mapper.writeValueAsString(row);
logger.info('input in json str: {}', jsonStr)
def typeRef = new core.type.TypeReference<HashMap<String,Object>>() {};
data = mapper.readValue(jsonStr, typeRef);
logger.info('json object: {}', data);
logger.info('embedded field: {}', data.embedded.foo)
Can be Kestra's internal storage URI, a map or a list.
Default value is : false
Take care that the order is not preserved if you use parallelism.
Default value is : false
Default value is : false
1 nested properties
Examples
Execute JBang command to execute a JAR file.
id: jbang_commands
namespace: company.team
tasks:
- id: commands
type: io.kestra.plugin.scripts.jbang.Commands
commands:
- jbang --quiet --main picocli.codegen.aot.graalvm.ReflectionConfigGenerator info.picocli:picocli-codegen:4.6.3
Default value is : false
Default value is : jbangdev/jbang-action
Default value is : false
If set to false, all commands will be executed one after the other. The final state of the task execution is determined by the last command. Note that this property may be ignored if a non-compatible interpreter is specified.
You can also disable it if your interpreter does not support the set -e option.
Default value is : true
Default value is : ["/bin/sh", "-c"]
Default value is : ["/bin/sh", "-c"]
Default value is : false
Required to use the {{ outputDir }} expression. Note that it could increase the starting time. Deprecated, use the outputFiles property instead.
Default value is : "false"
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
Only used if the taskRunner property is not set
Default value is : AUTO
Default value is : true
1 nested properties
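The glob expressions described above can be sketched on a jbang Commands task; the file names below are illustrative:

```yaml
id: jbang_outputs
namespace: company.team

tasks:
  - id: commands
    type: io.kestra.plugin.scripts.jbang.Commands
    commands:
      - jbang --quiet hello.java > report.txt
    outputFiles:
      - report.txt   # a single file relative to the working directory
      - my-dir/**    # everything under my-dir, recursively
```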
Examples
Execute a script written in Java
id: jbang_script
namespace: company.team
tasks:
- id: script
type: io.kestra.plugin.scripts.jbang.Script
script: |
class helloworld {
public static void main(String[] args) {
if(args.length==0) {
System.out.println("Hello World!");
} else {
System.out.println("Hello " + args[0]);
}
}
}
Execute a script written in Java with dependencies
id: jbang_script
namespace: company.team
tasks:
- id: script_with_dependency
type: io.kestra.plugin.scripts.jbang.Script
script: |
//DEPS ch.qos.reload4j:reload4j:1.2.19
import org.apache.log4j.Logger;
import org.apache.log4j.BasicConfigurator;
class classpath_example {
static final Logger logger = Logger.getLogger(classpath_example.class);
public static void main(String[] args) {
BasicConfigurator.configure();
logger.info("Hello World");
}
}
Execute a script written in Kotlin.
id: jbang_script
namespace: company.team
tasks:
- id: script_kotlin
type: io.kestra.plugin.scripts.jbang.Script
extension: .kt
script: |
public fun main() {
println("Hello World");
}
Default value is : false
Default value is : jbangdev/jbang-action
Default value is : false
JBang supports more than Java scripts; you can use it with JShell (.jsh), Kotlin (.kt), Groovy (.groovy), or even Markdown (.md).
Default value is : .java
If set to false, all commands will be executed one after the other. The final state of the task execution is determined by the last command. Note that this property may be ignored if a non-compatible interpreter is specified.
You can also disable it if your interpreter does not support the set -e option.
Default value is : true
Default value is : ["/bin/sh", "-c"]
Default value is : ["/bin/sh", "-c"]
Default value is : false
Required to use the {{ outputDir }} expression. Note that it could increase the starting time. Deprecated, use the outputFiles property instead.
Default value is : "false"
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
By default, JBang logs to stderr, so quiet is set to true by default; no JBang logs are shown except errors.
Default value is : true
Only used if the taskRunner property is not set
Default value is : AUTO
Default value is : true
1 nested properties
Examples
Create a Julia script, install required packages, and execute it. Note that instead of defining the script inline, you could create the Julia script in the embedded VS Code editor and point to its location by path. If you do so, make sure to enable namespace files by setting the enabled flag of the namespaceFiles property to true.
id: julia_commands
namespace: company.team
tasks:
- id: commands
type: io.kestra.plugin.scripts.julia.Commands
warningOnStdErr: false
inputFiles:
main.jl: |
using DataFrames, CSV
df = DataFrame(Name = ["Alice", "Bob", "Charlie"], Age = [25, 30, 35])
CSV.write("output.csv", df)
outputFiles:
- output.csv
beforeCommands:
- julia -e 'using Pkg; Pkg.add("DataFrames"); Pkg.add("CSV")'
commands:
- julia main.jl
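Following the note above, a script stored as a namespace file can be executed by path once namespace files are enabled; the path scripts/main.jl is illustrative:

```yaml
id: julia_namespace_file
namespace: company.team

tasks:
  - id: commands
    type: io.kestra.plugin.scripts.julia.Commands
    namespaceFiles:
      enabled: true      # makes namespace files available in the working directory
    commands:
      - julia scripts/main.jl
```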
Default value is : false
Default value is : julia
Default value is : false
If set to false, all commands will be executed one after the other. The final state of the task execution is determined by the last command. Note that this property may be ignored if a non-compatible interpreter is specified.
You can also disable it if your interpreter does not support the set -e option.
Default value is : true
Default value is : ["/bin/sh", "-c"]
Default value is : ["/bin/sh", "-c"]
Default value is : false
Required to use the {{ outputDir }} expression. Note that it could increase the starting time. Deprecated, use the outputFiles property instead.
Default value is : "false"
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
Only used if the taskRunner property is not set
Default value is : AUTO
Default value is : true
1 nested properties
Examples
Create a Julia script, install required packages, and execute it. Note that instead of defining the script inline, you could create the Julia script in the embedded VS Code editor and read its content using the {{ read('your_script.jl') }} function.
id: julia_script
namespace: company.team
tasks:
- id: script
type: io.kestra.plugin.scripts.julia.Script
warningOnStdErr: false
script: |
using DataFrames, CSV
df = DataFrame(Name = ["Alice", "Bob", "Charlie"], Age = [25, 30, 35])
CSV.write("output.csv", df)
outputFiles:
- output.csv
beforeCommands:
- julia -e 'using Pkg; Pkg.add("DataFrames"); Pkg.add("CSV")'
Default value is : false
Default value is : julia
Default value is : false
If set to false, all commands will be executed one after the other. The final state of the task execution is determined by the last command. Note that this property may be ignored if a non-compatible interpreter is specified.
You can also disable it if your interpreter does not support the set -e option.
Default value is : true
Default value is : ["/bin/sh", "-c"]
Default value is : ["/bin/sh", "-c"]
Default value is : false
Required to use the {{ outputDir }} expression. Note that it could increase the starting time. Deprecated, use the outputFiles property instead.
Default value is : "false"
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
Only used if the taskRunner property is not set
Default value is : AUTO
Default value is : true
1 nested properties
Examples
id: jython_eval
namespace: company.team
tasks:
- id: eval
type: io.kestra.plugin.scripts.jython.Eval
outputs:
- out
- map
script: |
from io.kestra.core.models.executions.metrics import Counter
import tempfile
from java.io import File
logger.info('executionId: {}', runContext.render('{{ execution.id }}'))
runContext.metric(Counter.of('total', 666, 'name', 'bla'))
map = {'test': 'here'}
tempFile = tempfile.NamedTemporaryFile()
tempFile.write('555\n666\n')
out = runContext.storage().putFile(File(tempFile.name))
Default value is : false
Default value is : false
Default value is : false
1 nested properties
This allows you to transform the data, previously loaded by Kestra, as you need.
Take an ION format file from Kestra and iterate over it row by row.
Each row populates a row global variable. You need to alter this variable; it will be saved in the output file.
If you set row to null, the row will be skipped.
You can create a rows variable to return multiple rows for a single input row.
Examples
Extract data from an API, add a column, and store it as a downloadable CSV file.
id: etl_api_to_csv
namespace: company.team
tasks:
- id: download
type: io.kestra.plugin.fs.http.Download
uri: https://gorest.co.in/public/v2/users
- id: ion_to_json
type: io.kestra.plugin.serdes.json.JsonToIon
from: "{{ outputs.download.uri }}"
newLine: false
- id: write_json
type: io.kestra.plugin.serdes.json.IonToJson
from: "{{ outputs.ion_to_json.uri }}"
- id: add_column
type: io.kestra.plugin.scripts.jython.FileTransform
from: "{{ outputs.write_json.uri }}"
script: |
from datetime import datetime
logger.info('row: {}', row)
row['inserted_at'] = datetime.utcnow()
- id: csv
type: io.kestra.plugin.serdes.csv.IonToCsv
from: "{{ outputs.add_column.uri }}"
Transform with file from internal storage.
id: jython_file_transform
namespace: company.team
inputs:
- id: file
type: FILE
tasks:
- id: file_transform
type: io.kestra.plugin.scripts.jython.FileTransform
from: "{{ inputs.file }}"
script: |
logger.info('row: {}', row)
if row['name'] == 'richard':
row = None
else:
row['email'] = row['name'] + '@kestra.io'
Transform with file from JSON string.
id: jython_file_transform
namespace: company.team
inputs:
- id: json
type: JSON
defaults: {"name": "john"}
tasks:
- id: file_transform
type: io.kestra.plugin.scripts.jython.FileTransform
from: "{{ inputs.json }}"
script: |
logger.info('row: {}', row)
if row['name'] == 'richard':
row = None
else:
row['email'] = row['name'] + '@kestra.io'
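The rows variable described earlier lets a single input row fan out into multiple output rows. A minimal hypothetical sketch (the upstream task id and field names are placeholders):

```yaml
id: jython_fan_out
namespace: company.team
tasks:
  - id: fan_out
    type: io.kestra.plugin.scripts.jython.FileTransform
    from: "{{ outputs.previousTaskId.uri }}"
    script: |
      # Emit two output rows for each input row; setting `rows` takes precedence over `row`.
      rows = [
        {'name': row['name'], 'copy': 1},
        {'name': row['name'], 'copy': 2}
      ]
```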
Can be Kestra's internal storage URI, a map or a list.
Default value is : false
Take care that the order is not respected if you use parallelism.
Default value is : false
Default value is : false
1 nested properties
Examples
id: nashorn_eval
namespace: company.team
tasks:
- id: eval
type: io.kestra.plugin.scripts.nashorn.Eval
outputs:
- out
- map
script: |
var Counter = Java.type('io.kestra.core.models.executions.metrics.Counter');
var File = Java.type('java.io.File');
var FileOutputStream = Java.type('java.io.FileOutputStream');
logger.info('executionId: {}', runContext.render('{{ execution.id }}'));
runContext.metric(Counter.of('total', 666, 'name', 'bla'));
map = {'test': 'here'}
var tempFile = runContext.workingDir().createTempFile().toFile()
var output = new FileOutputStream(tempFile)
output.write('555\n666\n'.getBytes())
out = runContext.storage().putFile(tempFile)
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
Transform with file from internal storage
id: nashorn_file_transform
namespace: company.team
tasks:
- id: file_transform
type: io.kestra.plugin.scripts.nashorn.FileTransform
from: "{{ outputs['avro-to-gcs'] }}"
script: |
logger.info('row: {}', row)
if (row['name'] === 'richard') {
row = null
} else {
row['email'] = row['name'] + '@kestra.io'
}
Transform JSON string input with a Nashorn script.
id: nashorn_file_transform
namespace: company.team
tasks:
- id: file_transform
type: io.kestra.plugin.scripts.nashorn.FileTransform
from: '[{"name":"jane"}, {"name":"richard"}]'
script: |
logger.info('row: {}', row)
if (row['name'] === 'richard') {
row = null
} else {
row['email'] = row['name'] + '@kestra.io'
}
Can be Kestra's internal storage URI, a map or a list.
Default value is : false
Take care that the order is not respected if you use parallelism.
Default value is : false
Default value is : false
1 nested properties
Examples
Install required npm packages, create a Node.js script and execute it.
id: nodejs_commands
namespace: company.team
tasks:
- id: commands
type: io.kestra.plugin.scripts.node.Commands
inputFiles:
main.js: |
const colors = require("colors");
console.log(colors.red("Hello"));
beforeCommands:
- npm install colors
commands:
- node main.js
warningOnStdErr: false
Default value is : false
Default value is : node
Default value is : false
If set to false, all commands will be executed one after the other. The final state of the task execution is determined by the last command. Note that this property may be ignored if an incompatible interpreter is specified.
You can also disable it if your interpreter does not support the set -e option.
Default value is : true
Default value is : `- /bin/sh
- -c`
Default value is : `- /bin/sh
- -c`
[
"/bin/sh",
"-c"
]
Default value is : false
Required to use the {{ outputDir }} expression. Note that it could increase the starting time. Deprecated, use the outputFiles property instead.
Default value is : "false"
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
Only used if the taskRunner property is not set
Default value is : AUTO
Default value is : true
1 nested properties
Examples
Install package, create a Node.js script and execute it.
id: nodejs_script
namespace: company.team
tasks:
- id: script
type: io.kestra.plugin.scripts.node.Script
beforeCommands:
- npm install colors
script: |
const colors = require("colors");
console.log(colors.red("Hello"));
warningOnStdErr: false
If you want to generate files in your script to make them available for download and use in downstream tasks, you can leverage the
{{ outputDir }} variable. Files stored in that directory will be persisted in Kestra's internal storage. To access this output in downstream tasks, use the syntax {{ outputs.yourTaskId.outputFiles['yourFileName.fileExtension'] }}.
Alternatively, instead of the {{ outputDir }} variable, you could use the outputFiles property to output files from your script. You can access those files in downstream tasks using the same syntax {{ outputs.yourTaskId.outputFiles['yourFileName.fileExtension'] }}, and you can download the files from the UI's Output tab.
id: nodejs_script
namespace: company.team
tasks:
- id: node
type: io.kestra.plugin.scripts.node.Script
warningOnStdErr: false
beforeCommands:
- npm install json2csv > /dev/null 2>&1
script: |
const fs = require('fs');
const { Parser } = require('json2csv');
// Product prices in our simulation
const productPrices = {
'T-shirt': 20,
'Jeans': 75,
'Shoes': 80,
'Socks': 5,
'Hat': 25
}
const generateOrder = () => {
const products = ['T-shirt', 'Jeans', 'Shoes', 'Socks', 'Hat'];
const statuses = ['pending', 'shipped', 'delivered', 'cancelled'];
const randomProduct = products[Math.floor(Math.random() * products.length)];
const randomStatus = statuses[Math.floor(Math.random() * statuses.length)];
const randomQuantity = Math.floor(Math.random() * 10) + 1;
const order = {
product: randomProduct,
status: randomStatus,
quantity: randomQuantity,
total: randomQuantity * productPrices[randomProduct]
};
return order;
}
let totalSales = 0;
let orders = [];
for (let i = 0; i < 100; i++) {
const order = generateOrder();
orders.push(order);
totalSales += order.total;
}
console.log(`Total sales: $${totalSales}`);
const fields = ['product', 'status', 'quantity', 'total'];
const json2csvParser = new Parser({ fields });
const csvData = json2csvParser.parse(orders);
fs.writeFileSync('{{ outputDir }}/orders.csv', csvData);
console.log('Orders saved to orders.csv');
Default value is : false
Default value is : node
Default value is : false
If set to false, all commands will be executed one after the other. The final state of the task execution is determined by the last command. Note that this property may be ignored if an incompatible interpreter is specified.
You can also disable it if your interpreter does not support the set -e option.
Default value is : true
Default value is : `- /bin/sh
- -c`
Default value is : `- /bin/sh
- -c`
[
"/bin/sh",
"-c"
]
Default value is : false
Required to use the {{ outputDir }} expression. Note that it could increase the starting time. Deprecated, use the outputFiles property instead.
Default value is : "false"
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
Only used if the taskRunner property is not set
Default value is : AUTO
Default value is : true
1 nested properties
Examples
Execute PowerShell commands.
id: execute_powershell_commands
namespace: company.team
tasks:
- id: powershell
type: io.kestra.plugin.scripts.powershell.Commands
inputFiles:
main.ps1: |
'Hello, World!' | Write-Output
commands:
- ./main.ps1
Default value is : false
Default value is : ghcr.io/kestra-io/powershell:latest
Default value is : false
If set to false, all commands will be executed one after the other. The final state of the task execution is determined by the last command. Note that this property may be ignored if an incompatible interpreter is specified.
You can also disable it if your interpreter does not support the set -e option.
Default value is : true
Default value is : `- pwsh
- -NoProfile
- -NonInteractive
- -Command`
Default value is : `- pwsh
- -NoProfile
- -NonInteractive
- -Command`
[
"pwsh",
"-NoProfile",
"-NonInteractive",
"-Command"
]
Default value is : false
Required to use the {{ outputDir }} expression. Note that it could increase the starting time. Deprecated, use the outputFiles property instead.
Default value is : "false"
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
Only used if the taskRunner property is not set
Default value is : AUTO
Default value is : true
1 nested properties
Examples
Execute a PowerShell script.
id: execute_powershell_script
namespace: company.team
tasks:
- id: powershell
type: io.kestra.plugin.scripts.powershell.Script
script: |
'Hello, World!' | Write-Output
If you want to generate files in your script to make them available for download and use in downstream tasks, you can leverage the
{{ outputDir }} variable. Files stored in that directory will be persisted in Kestra's internal storage. To access this output in downstream tasks, use the syntax {{ outputs.yourTaskId.outputFiles['yourFileName.fileExtension'] }}.
id: powershell_generate_files
namespace: company.team
tasks:
- id: powershell
type: io.kestra.plugin.scripts.powershell.Script
script: |
Set-Content -Path {{ outputDir }}\hello.txt -Value "Hello World"
Default value is : false
Default value is : ghcr.io/kestra-io/powershell:latest
Default value is : false
If set to false, all commands will be executed one after the other. The final state of the task execution is determined by the last command. Note that this property may be ignored if an incompatible interpreter is specified.
You can also disable it if your interpreter does not support the set -e option.
Default value is : true
Default value is : `- pwsh
- -NoProfile
- -NonInteractive
- -Command`
Default value is : `- pwsh
- -NoProfile
- -NonInteractive
- -Command`
[
"pwsh",
"-NoProfile",
"-NonInteractive",
"-Command"
]
Default value is : false
Required to use the {{ outputDir }} expression. Note that it could increase the starting time. Deprecated, use the outputFiles property instead.
Default value is : "false"
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
Only used if the taskRunner property is not set
Default value is : AUTO
Default value is : true
1 nested properties
Examples
Execute a Python script in a Conda virtual environment. First, add the following script in the embedded Code Editor and name it
etl_script.py:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("--num", type=int, default=42, help="Enter an integer")
args = parser.parse_args()
result = args.num * 2
print(result)
Then, make sure to set the enabled flag of the namespaceFiles property to true to enable namespace files. We include only the etl_script.py file as that is the only file we require from namespace files.
This flow uses the io.kestra.plugin.core.runner.Process task runner and a Conda virtual environment for process isolation and dependency management. However, note that, by default, Kestra runs tasks in a Docker container (i.e. a Docker task runner), and you can use the taskRunner property to customize many options, as well as containerImage to choose the Docker image to use.
id: python_venv
namespace: company.team
tasks:
- id: python
type: io.kestra.plugin.scripts.python.Commands
namespaceFiles:
enabled: true
include:
- etl_script.py
taskRunner:
type: io.kestra.plugin.core.runner.Process
beforeCommands:
- conda activate myCondaEnv
commands:
- python etl_script.py
Execute a Python script from Git in a Docker container and output a file
id: python_commands_example
namespace: company.team
tasks:
- id: wdir
type: io.kestra.plugin.core.flow.WorkingDirectory
tasks:
- id: clone_repository
type: io.kestra.plugin.git.Clone
url: https://github.com/kestra-io/examples
branch: main
- id: git_python_scripts
type: io.kestra.plugin.scripts.python.Commands
warningOnStdErr: false
containerImage: ghcr.io/kestra-io/pydata:latest
beforeCommands:
- pip install faker > /dev/null
commands:
- python examples/scripts/etl_script.py
- python examples/scripts/generate_orders.py
outputFiles:
- orders.csv
- id: load_csv_to_s3
type: io.kestra.plugin.aws.s3.Upload
accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
secretKeyId: "{{ secret('AWS_SECRET_ACCESS_KEY') }}"
region: eu-central-1
bucket: kestraio
key: stage/orders.csv
from: "{{ outputs.git_python_scripts.outputFiles['orders.csv'] }}"
Execute a Python script on a remote worker with a GPU
id: gpu_task
namespace: company.team
tasks:
- id: python
type: io.kestra.plugin.scripts.python.Commands
taskRunner:
type: io.kestra.plugin.core.runner.Process
commands:
- python ml_on_gpu.py
workerGroup:
key: gpu
Pass detected S3 objects from the event trigger to a Python script
id: s3_trigger_commands
namespace: company.team
description: process CSV file from S3 trigger
tasks:
- id: wdir
type: io.kestra.plugin.core.flow.WorkingDirectory
tasks:
- id: clone_repository
type: io.kestra.plugin.git.Clone
url: https://github.com/kestra-io/examples
branch: main
- id: python
type: io.kestra.plugin.scripts.python.Commands
inputFiles:
data.csv: "{{ trigger.objects | jq('.[].uri') | first }}"
description: this script reads a file `data.csv` from S3 trigger
containerImage: ghcr.io/kestra-io/pydata:latest
warningOnStdErr: false
commands:
- python examples/scripts/clean_messy_dataset.py
outputFiles:
- "*.csv"
- "*.parquet"
triggers:
- id: wait_for_s3_object
type: io.kestra.plugin.aws.s3.Trigger
bucket: declarative-orchestration
maxKeys: 1
interval: PT1S
filter: FILES
action: MOVE
prefix: raw/
moveTo:
key: archive/raw/
accessKeyId: "{{ secret('AWS_ACCESS_KEY_ID') }}"
secretKeyId: "{{ secret('AWS_SECRET_ACCESS_KEY') }}"
region: "{{ secret('AWS_DEFAULT_REGION') }}"
Execute a Python script from Git using a private Docker container image
id: python_in_container
namespace: company.team
tasks:
- id: wdir
type: io.kestra.plugin.core.flow.WorkingDirectory
tasks:
- id: clone_repository
type: io.kestra.plugin.git.Clone
url: https://github.com/kestra-io/examples
branch: main
- id: git_python_scripts
type: io.kestra.plugin.scripts.python.Commands
warningOnStdErr: false
commands:
- python examples/scripts/etl_script.py
outputFiles:
- "*.csv"
- "*.parquet"
containerImage: annageller/kestra:latest
taskRunner:
type: io.kestra.plugin.scripts.runner.docker.Docker
config: |
{
"auths": {
"https://index.docker.io/v1/": {
"username": "annageller",
"password": "{{ secret('DOCKER_PAT') }}"
}
}
}
Create a python script and execute it in a virtual environment
id: script_in_venv
namespace: company.team
tasks:
- id: python
type: io.kestra.plugin.scripts.python.Commands
inputFiles:
main.py: |
import requests
from kestra import Kestra
response = requests.get('https://google.com')
print(response.status_code)
Kestra.outputs({'status': response.status_code, 'text': response.text})
beforeCommands:
- python -m venv venv
- . venv/bin/activate
- pip install requests kestra > /dev/null
commands:
- python main.py
Default value is : false
Default value is : ghcr.io/kestra-io/kestrapy:latest
Default value is : false
If set to false, all commands will be executed one after the other. The final state of the task execution is determined by the last command. Note that this property may be ignored if an incompatible interpreter is specified.
You can also disable it if your interpreter does not support the set -e option.
Default value is : true
Default value is : `- /bin/sh
- -c`
Default value is : `- /bin/sh
- -c`
[
"/bin/sh",
"-c"
]
Default value is : false
Required to use the {{ outputDir }} expression. Note that it could increase the starting time. Deprecated, use the outputFiles property instead.
Default value is : "false"
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
Only used if the taskRunner property is not set
Default value is : AUTO
Default value is : true
1 nested properties
Examples
Execute a Python script and generate an output.
id: python_generate_output
namespace: company.team
tasks:
- id: python
type: io.kestra.plugin.scripts.python.Script
script: |
from kestra import Kestra
import requests
response = requests.get('https://kestra.io')
print(response.status_code)
Kestra.outputs({'status': response.status_code, 'text': response.text})
beforeCommands:
- pip install requests kestra
Log messages at different log levels using Kestra logger.
id: python_logs
namespace: company.team
tasks:
- id: python_logger
type: io.kestra.plugin.scripts.python.Script
allowFailure: true
warningOnStdErr: false
script: |
import time
from kestra import Kestra
logger = Kestra.logger()
logger.debug("DEBUG is used for diagnostic info.")
time.sleep(0.5)
logger.info("INFO confirms normal operation.")
time.sleep(0.5)
logger.warning("WARNING signals something unexpected.")
time.sleep(0.5)
logger.error("ERROR indicates a serious issue.")
time.sleep(0.5)
logger.critical("CRITICAL means a severe failure.")
Execute a Python script with an input file from Kestra's local storage created by a previous task.
id: python_use_input_file
namespace: company.team
tasks:
- id: python
type: io.kestra.plugin.scripts.python.Script
script: |
with open('{{ outputs.previousTaskId.uri }}', 'r') as f:
print(f.read())
Execute a Python script that outputs a file.
id: python_output_file
namespace: company.team
tasks:
- id: python
type: io.kestra.plugin.scripts.python.Script
script: |
f = open("{{ outputDir }}/myfile.txt", "a")
f.write("Hello from a Kestra task!")
f.close()
If you want to generate files in your script to make them available for download and use in downstream tasks, you can leverage the
{{ outputDir }} expression. Files stored in that directory will be persisted in Kestra's internal storage. The first task in this example creates a file 'myfile.txt' and the next task can access it by leveraging the syntax {{ outputs.yourTaskId.outputFiles['yourFileName.fileExtension'] }}.
id: python_outputs
namespace: company.team
tasks:
- id: clean_dataset
type: io.kestra.plugin.scripts.python.Script
containerImage: ghcr.io/kestra-io/pydata:latest
script: |
import pandas as pd
df = pd.read_csv("https://huggingface.co/datasets/kestra/datasets/raw/main/csv/messy_dataset.csv")
# Replace non-numeric age values with NaN
df["Age"] = pd.to_numeric(df["Age"], errors="coerce")
# mean imputation: fill NaN values with the mean age
mean_age = int(df["Age"].mean())
print(f"Filling NULL values with mean: {mean_age}")
df["Age"] = df["Age"].fillna(mean_age)
df.to_csv("{{ outputDir }}/clean_dataset.csv", index=False)
- id: readFileFromPython
type: io.kestra.plugin.scripts.shell.Commands
taskRunner:
type: io.kestra.plugin.core.runner.Process
commands:
- head -n 10 {{ outputs.clean_dataset.outputFiles['clean_dataset.csv'] }}
Default value is : false
Default value is : ghcr.io/kestra-io/kestrapy:latest
Default value is : false
If set to false, all commands will be executed one after the other. The final state of the task execution is determined by the last command. Note that this property may be ignored if an incompatible interpreter is specified.
You can also disable it if your interpreter does not support the set -e option.
Default value is : true
Default value is : `- /bin/sh
- -c`
Default value is : `- /bin/sh
- -c`
[
"/bin/sh",
"-c"
]
Default value is : false
Required to use the {{ outputDir }} expression. Note that it could increase the starting time. Deprecated, use the outputFiles property instead.
Default value is : "false"
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
Only used if the taskRunner property is not set
Default value is : AUTO
Default value is : true
1 nested properties
Examples
Create an R script, install required packages and execute it. Note that instead of defining the script inline, you could create the script as a dedicated R script in the embedded VS Code editor and point to its location by path. If you do so, make sure to enable namespace files by setting the
enabled flag of the namespaceFiles property to true.
id: r_commands
namespace: company.team
tasks:
- id: r
type: io.kestra.plugin.scripts.r.Commands
inputFiles:
main.R: |
library(lubridate)
ymd("20100604");
mdy("06-04-2011");
dmy("04/06/2012")
beforeCommands:
- Rscript -e 'install.packages("lubridate")'
commands:
- Rscript main.R
Default value is : false
Default value is : r-base
Default value is : false
If set to false, all commands will be executed one after the other. The final state of the task execution is determined by the last command. Note that this property may be ignored if an incompatible interpreter is specified.
You can also disable it if your interpreter does not support the set -e option.
Default value is : true
Default value is : `- /bin/sh
- -c`
Default value is : `- /bin/sh
- -c`
[
"/bin/sh",
"-c"
]
Default value is : false
Required to use the {{ outputDir }} expression. Note that it could increase the starting time. Deprecated, use the outputFiles property instead.
Default value is : "false"
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
Only used if the taskRunner property is not set
Default value is : AUTO
Default value is : true
1 nested properties
Examples
Install a package and execute an R script
script: |
library(lubridate)
ymd("20100604");
mdy("06-04-2011");
dmy("04/06/2012")
beforeCommands:
- Rscript -e 'install.packages("lubridate")'
Add an R script in the embedded VS Code editor, install required packages and execute it.
Here is an example R script that you can add in the embedded VS Code editor. You can name the script file main.R:
library(dplyr)
library(arrow)
data(mtcars) # load mtcars data
print(head(mtcars))
final <- mtcars %>%
summarise(
avg_mpg = mean(mpg),
avg_disp = mean(disp),
avg_hp = mean(hp),
avg_drat = mean(drat),
avg_wt = mean(wt),
avg_qsec = mean(qsec),
avg_vs = mean(vs),
avg_am = mean(am),
avg_gear = mean(gear),
avg_carb = mean(carb)
)
final %>% print()
write.csv(final, "final.csv")
mtcars_clean <- na.omit(mtcars) # this line removes rows with NA values
write_parquet(mtcars_clean, "mtcars_clean.parquet")
Note that tasks in Kestra are stateless. Therefore, the files generated by a task, such as the CSV and Parquet files in the example above, are not persisted in Kestra's internal storage, unless you explicitly tell Kestra to do so. Make sure to add the outputFiles property to your task as shown below to persist the generated Parquet file (or any other file) in Kestra's internal storage and make them visible in the Outputs tab.
To access this output in downstream tasks, use the syntax {{outputs.yourTaskId.outputFiles['yourFileName.fileExtension']}}. Alternatively, you can wrap your tasks that need to pass data between each other in a WorkingDirectory task — this way, those tasks will share the same working directory and will be able to access the same files.
Note how we use the read function to read the content of the R script stored as a Namespace File.
Finally, note that the docker property is optional. If you don't specify it, Kestra will use the default R image. If you want to use a different image, you can specify it in the docker property as shown below.
id: r_cars
namespace: company.team
tasks:
- id: r
type: io.kestra.plugin.scripts.r.Script
warningOnStdErr: false
containerImage: ghcr.io/kestra-io/rdata:latest
script: "{{ read('main.R') }}"
outputFiles:
- "*.csv"
- "*.parquet"
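The WorkingDirectory pattern mentioned above, where tasks share files through a common working directory instead of passing them via internal storage, can be sketched as follows (task ids and file names are hypothetical):

```yaml
id: shared_working_directory
namespace: company.team
tasks:
  - id: wdir
    type: io.kestra.plugin.core.flow.WorkingDirectory
    tasks:
      - id: write_file
        type: io.kestra.plugin.scripts.shell.Commands
        taskRunner:
          type: io.kestra.plugin.core.runner.Process
        commands:
          - echo "shared data" > data.txt
      - id: read_file
        type: io.kestra.plugin.scripts.shell.Commands
        taskRunner:
          type: io.kestra.plugin.core.runner.Process
        commands:
          - cat data.txt  # sees the file written by the previous task
```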
Default value is : false
Default value is : r-base
Default value is : false
If set to false, all commands will be executed one after the other. The final state of the task execution is determined by the last command. Note that this property may be ignored if an incompatible interpreter is specified.
You can also disable it if your interpreter does not support the set -e option.
Default value is : true
Default value is : `- /bin/sh
- -c`
Default value is : `- /bin/sh
- -c`
[
"/bin/sh",
"-c"
]
Default value is : false
Required to use the {{ outputDir }} expression. Note that it could increase the starting time. Deprecated, use the outputFiles property instead.
Default value is : "false"
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
Only used if the taskRunner property is not set
Default value is : AUTO
Default value is : true
1 nested properties
Examples
Create a Ruby script and execute it. The easiest way to create a Ruby script is to use the embedded VS Code editor. Create a file named
main.rb and paste the following code:
require 'csv'
require 'json'
file = File.read('data.json')
data_hash = JSON.parse(file)
# Extract headers
headers = data_hash.first.keys
# Convert hashes to arrays
data = data_hash.map(&:values)
# Prepend headers to data
data.unshift(headers)
# Create and write data to CSV file
CSV.open('output.csv', 'wb') do |csv|
data.each { |row| csv << row }
end
In order to read that script from the Namespace File called main.rb, you need to enable the namespaceFiles property. We include only main.rb as that is the only file we want from the namespaceFiles.
Also, note how we use the inputFiles option to read additional files into the script's working directory. In this case, we read the data.json file, which contains the data that we want to convert to CSV.
Finally, we use the outputFiles option to specify that we want to output the output.csv file that is generated by the script. This allows us to access the file in the UI's Output tab and download it, or pass it to other tasks.
id: generate_csv
namespace: company.team
tasks:
- id: bash
type: io.kestra.plugin.scripts.ruby.Commands
namespaceFiles:
enabled: true
include:
- main.rb
inputFiles:
data.json: |
[
{"Name": "Alice", "Age": 30, "City": "New York"},
{"Name": "Bob", "Age": 22, "City": "Los Angeles"},
{"Name": "Charlie", "Age": 35, "City": "Chicago"}
]
beforeCommands:
- ruby -v
commands:
- ruby main.rb
outputFiles:
- "*.csv"
Default value is : false
Default value is : ruby
Default value is : false
If set to false, all commands will be executed one after the other. The final state of the task execution is determined by the last command. Note that this property may be ignored if an incompatible interpreter is specified.
You can also disable it if your interpreter does not support the set -e option.
Default value is : true
Default value is : `- /bin/sh
- -c`
Default value is : `- /bin/sh
- -c`
[
"/bin/sh",
"-c"
]
Default value is : false
Required to use the {{ outputDir }} expression. Note that it could increase the starting time. Deprecated, use the outputFiles property instead.
Default value is : "false"
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
Only used if the taskRunner property is not set
Default value is : AUTO
Default value is : true
1 nested properties
Examples
Create a Ruby script and execute it. The easiest way to create a Ruby script is to use the embedded VS Code editor. Create a file named
main.rb and paste the following code:
require 'csv'
require 'json'
file = File.read('data.json')
data_hash = JSON.parse(file)
# Extract headers
headers = data_hash.first.keys
# Convert hashes to arrays
data = data_hash.map(&:values)
# Prepend headers to data
data.unshift(headers)
# Create and write data to CSV file
CSV.open('output.csv', 'wb') do |csv|
data.each { |row| csv << row }
end
In order to read that script from the Namespace File called main.rb, you can leverage the {{ read('main.rb') }} function.
Also, note how we use the inputFiles option to read additional files into the script's working directory. In this case, we read the data.json file, which contains the data that we want to convert to CSV.
Finally, we use the outputFiles option to specify that we want to output the output.csv file that is generated by the script. This allows us to access the file in the UI's Output tab and download it, or pass it to other tasks.
id: generate_csv
namespace: company.team
tasks:
- id: bash
type: io.kestra.plugin.scripts.ruby.Script
inputFiles:
data.json: |
[
{"Name": "Alice", "Age": 30, "City": "New York"},
{"Name": "Bob", "Age": 22, "City": "Los Angeles"},
{"Name": "Charlie", "Age": 35, "City": "Chicago"}
]
beforeCommands:
- ruby -v
script: "{{ read('main.rb') }}"
outputFiles:
- "*.csv"
Default value is : false
Default value is : ruby
Default value is : false
If set to false, all commands will be executed one after the other. The final state of the task execution is determined by the last command. Note that this property may be ignored if an incompatible interpreter is specified.
You can also disable it if your interpreter does not support the set -e option.
Default value is : true
Default value is : `- /bin/sh
- -c`
Default value is : `- /bin/sh
- -c`
[
"/bin/sh",
"-c"
]
Default value is : false
Required to use the {{ outputDir }} expression. Note that it could increase the starting time. Deprecated, use the outputFiles property instead.
Default value is : "false"
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
Only used if the taskRunner property is not set
Default value is : AUTO
Default value is : true
1 nested properties
Make sure to set this to a numeric value, e.g. cpus: "1.5" or cpus: "4". For instance, if the host machine has two CPUs and you set cpus: "1.5", the container is guaranteed at most one and a half of the CPUs.
The auth field is a base64-encoded authentication string of username:password or a token.
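For instance, you can produce that base64-encoded string with a standard shell command; the credentials below are placeholders, and the -n flag prevents a trailing newline from corrupting the value:

```shell
# Encode "username:password" for the Docker registry `auth` field.
# "user" / "pass" are placeholder credentials.
echo -n 'user:pass' | base64
```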
If not defined, the registry will be extracted from the image name.
These options are passed directly to the driver.
This task runner executes tasks in a container-based Docker-compatible engine.
Use the containerImage property to configure the image for the task.
To access the task's working directory, use the {{ workingDir }} Pebble expression
or the WORKING_DIR environment variable.
Input files and namespace files added to the task will be accessible from that directory.
To generate output files, we recommend using the outputFiles task's property.
This allows you to explicitly define which files from the task's working directory
should be saved as output files.
Alternatively, when writing files in your task, you can leverage
the {{ outputDir }} Pebble expression or the OUTPUT_DIR environment variable.
All files written to that directory will be saved as output files automatically.
Examples
Execute a Shell command.
id: simple_shell_example
namespace: company.team
tasks:
- id: shell
type: io.kestra.plugin.scripts.shell.Commands
taskRunner:
type: io.kestra.plugin.scripts.runner.docker.Docker
commands:
- echo "Hello World"
Pass input files to the task, execute a Shell command, then retrieve output files.
id: shell_example_with_files
namespace: company.team
inputs:
- id: file
type: FILE
tasks:
- id: shell
type: io.kestra.plugin.scripts.shell.Commands
inputFiles:
data.txt: "{{ inputs.file }}"
outputFiles:
- "*.txt"
containerImage: centos
taskRunner:
type: io.kestra.plugin.scripts.runner.docker.Docker
commands:
- cp {{ workingDir }}/data.txt {{ workingDir }}/out.txt
Run a Python script in Docker and allocate a specific amount of memory.
id: allocate_memory_to_python_script
namespace: company.team
tasks:
- id: script
type: io.kestra.plugin.scripts.python.Script
taskRunner:
type: io.kestra.plugin.scripts.runner.docker.Docker
pullPolicy: IF_NOT_PRESENT
cpu:
cpus: 1
memory:
memory: "512Mb"
containerImage: ghcr.io/kestra-io/kestrapy:latest
script: |
from kestra import Kestra
data = dict(message="Hello from Kestra!")
Kestra.outputs(data)
Docker configuration file that can set access credentials to private container registries. Usually located in ~/.docker/config.json.
Default value is : true
Default value is : - ""
How to handle local files (input files, output files, namespace files, ...).
By default, we create a volume and copy the file into the volume bind path.
Configuring it to MOUNT will mount the working directory instead.
Default value is : VOLUME
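Assuming the strategy is exposed as a fileHandlingStrategy property on the Docker task runner (the property name is inferred from this description), switching to MOUNT could be sketched as:

```yaml
taskRunner:
  type: io.kestra.plugin.scripts.runner.docker.Docker
  fileHandlingStrategy: MOUNT  # mount the working directory instead of copying files into a volume
```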
The size must be greater than 0. If omitted, the system uses 64MB.
Make sure to provide a map of a local path to a container path in the format: /home/local/path:/app/container/path.
Volume mounts are disabled by default for security reasons — if you are sure you want to use them,
enable that feature in the plugin configuration
by setting volume-enabled to true.
Here is how you can add that setting to your kestra configuration:
kestra:
plugins:
configurations:
- type: io.kestra.plugin.scripts.runner.docker.Docker
values:
volume-enabled: true
The minimum allowed value is 4MB. Because kernel memory cannot be swapped out, a container which is starved of kernel memory may block host machine resources, which can have side effects on the host machine and on other containers. See the kernel-memory docs for more details.
Make sure to use the format number + unit (regardless of the case) without any spaces.
The unit can be KB (kilobytes), MB (megabytes), GB (gigabytes), etc.
Given that it's case-insensitive, the following values are equivalent:
"512MB", "512Mb", "512mb", "512000KB", "0.5GB"
It is recommended that you allocate at least 6MB.
If you use memoryReservation, it must be set lower than memory for it to take precedence. Because it is a soft limit, it does not guarantee that the container doesn’t exceed the limit.
If memory and memorySwap are set to the same value, this prevents containers from using any swap. This is because memorySwap includes both the physical memory and swap space, while memory is only the amount of physical memory that can be used.
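To make the swap rule above concrete, setting memory and memorySwap to the same value prevents the container from using any swap; a minimal sketch, with the nesting taken from the memory example earlier on this page:

```yaml
taskRunner:
  type: io.kestra.plugin.scripts.runner.docker.Docker
  memory:
    memory: "512MB"      # physical memory limit
    memorySwap: "512MB"  # equal to memory, so no swap is available
```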
By default, the host kernel can swap out a percentage of anonymous pages used by a container. You can set memorySwappiness to a value between 0 and 100 to tune this percentage.
To change this behavior, use the oomKillDisable option. Only disable the OOM killer on containers where you have also set the memory option. If the memory flag is not set, the host can run out of memory, and the kernel may need to kill the host system’s processes to free the memory.
Examples
Execute ETL in Rust in a Docker container and output CSV files generated as a result of the script.
id: rust_flow
namespace: company.team
tasks:
- id: rust
type: io.kestra.plugin.scripts.shell.Commands
commands:
- etl
containerImage: ghcr.io/kestra-io/rust:latest
outputFiles:
- "*.csv"
Execute a single Shell command.
id: shell_single_command
namespace: company.team
tasks:
- id: command
type: io.kestra.plugin.scripts.shell.Commands
commands:
- 'echo "The current execution is: {{ execution.id }}"'
Include only specific namespace files.
id: include_files
namespace: company.team
tasks:
- id: command
type: io.kestra.plugin.scripts.shell.Commands
description: "Only the included `namespaceFiles` get listed"
namespaceFiles:
enabled: true
include:
- test1.txt
- test2.yaml
commands:
- ls
Exclude specific namespace files.
id: exclude_files
namespace: company.team
tasks:
- id: command
type: io.kestra.plugin.scripts.shell.Commands
description: "All `namespaceFiles` except those that are excluded will be injected into the task's working directory"
namespaceFiles:
enabled: true
exclude:
- test1.txt
- test2.yaml
commands:
- ls
Execute Shell commands that generate files accessible by other tasks and available for download in the UI's Output tab.
id: shell_generate_files
namespace: company.team
tasks:
- id: commands
type: io.kestra.plugin.scripts.shell.Commands
outputFiles:
- first.txt
- second.txt
commands:
- echo "1" >> first.txt
- echo "2" >> second.txt
Execute a Shell command using an input file generated in a previous task.
id: use_input_file
namespace: company.team
tasks:
- id: http_download
type: io.kestra.plugin.core.http.Download
uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/products.csv
- id: commands
type: io.kestra.plugin.scripts.shell.Commands
commands:
- cat {{ outputs.http_download.uri }}
Run a PHP Docker container and execute a command.
id: run_php_code
namespace: company.team
tasks:
- id: commands
type: io.kestra.plugin.scripts.shell.Commands
taskRunner:
type: io.kestra.plugin.scripts.runner.docker.Docker
containerImage: php
commands:
- php -r 'print(phpversion());'
Create output variables from a standard output.
id: create_output_variables
namespace: company.team
tasks:
- id: commands
type: io.kestra.plugin.scripts.shell.Commands
commands:
- echo '::{"outputs":{"test":"value","int":2,"bool":true,"float":3.65}}::'
Send a counter metric from a standard output.
id: create_counter_metric
namespace: company.team
tasks:
- id: commands
type: io.kestra.plugin.scripts.shell.Commands
commands:
- echo '::{"metrics":[{"name":"count","type":"counter","value":1,"tags":{"tag1":"i","tag2":"win"}}]}::'
Default value is : false
Default value is : ubuntu
Default value is : false
If set to false, all commands will be executed one after the other. The final state of the task execution is determined by the last command. Note that this property may be ignored if an incompatible interpreter is specified.
You can also disable it if your interpreter does not support the `set -e` option.
Default value is : true
Default value is : `- /bin/sh
- -c`
Default value is : false
Required to use the {{ outputDir }} expression. Note that it could increase the task's startup time. Deprecated; use the outputFiles property instead.
Default value is : "false"
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
Only used if the taskRunner property is not set
Default value is : AUTO
Default value is : true
1 nested properties
Examples
Create an inline Shell script and execute it.
id: shell_script_example
namespace: company.team
tasks:
- id: http_download
type: io.kestra.plugin.core.http.Download
uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/orders.csv
- id: shell_script_task
type: io.kestra.plugin.scripts.shell.Script
outputFiles:
- first.txt
script: |
echo "The current execution is : {{ execution.id }}"
echo "1" >> first.txt
cat {{ outputs.http_download.uri }}
If you want to generate files in your script to make them available for download and use in downstream tasks, you can leverage the {{ outputDir }} variable. Files stored in that directory will be persisted in Kestra's internal storage. To access this output in downstream tasks, use the syntax {{ outputs.yourTaskId.outputFiles['yourFileName.fileExtension'] }}.
id: shell_script_example
namespace: company.team
tasks:
- id: hello
type: io.kestra.plugin.scripts.shell.Script
taskRunner:
type: io.kestra.plugin.core.runner.Process
outputFiles:
- hello.txt
script: |
echo "Hello world!" > hello.txt
Default value is : false
Default value is : ubuntu
Default value is : false
If set to false, all commands will be executed one after the other. The final state of the task execution is determined by the last command. Note that this property may be ignored if an incompatible interpreter is specified.
You can also disable it if your interpreter does not support the `set -e` option.
Default value is : true
Default value is : `- /bin/sh
- -c`
Default value is : false
Required to use the {{ outputDir }} expression. Note that it could increase the task's startup time. Deprecated; use the outputFiles property instead.
Default value is : "false"
Must be a list of glob expressions relative to the current working directory, some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
Only used if the taskRunner property is not set
Default value is : AUTO
Default value is : true
1 nested properties
Examples
Convert an Avro file to the Amazon Ion format.
id: avro_to_ion
namespace: company.team
tasks:
- id: http_download
type: io.kestra.plugin.core.http.Download
uri: https://huggingface.co/datasets/kestra/datasets/raw/main/avro/products.avro
- id: to_ion
type: io.kestra.plugin.serdes.avro.AvroToIon
from: "{{ outputs.http_download.uri }}"
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
Convert a CSV file to the Avro format.
id: divvy_tripdata
namespace: company.team
variables:
file_id: "{{ execution.startDate | dateAdd(-3, 'MONTHS') | date('yyyyMM') }}"
tasks:
- id: get_zipfile
type: io.kestra.plugin.core.http.Download
uri: "https://divvy-tripdata.s3.amazonaws.com/{{ render(vars.file_id) }}-divvy-tripdata.zip"
- id: unzip
type: io.kestra.plugin.compress.ArchiveDecompress
algorithm: ZIP
from: "{{ outputs.get_zipfile.uri }}"
- id: convert
type: io.kestra.plugin.serdes.csv.CsvToIon
from: "{{ outputs.unzip.files[render(vars.file_id) ~ '-divvy-tripdata.csv'] }}"
- id: to_avro
type: io.kestra.plugin.serdes.avro.IonToAvro
from: "{{ outputs.convert.uri }}"
datetimeFormat: "yyyy-MM-dd' 'HH:mm:ss"
schema: |
{
"type": "record",
"name": "Ride",
"namespace": "com.example.bikeshare",
"fields": [
{"name": "ride_id", "type": "string"},
{"name": "rideable_type", "type": "string"},
{"name": "started_at", "type": {"type": "long", "logicalType": "timestamp-millis"}},
{"name": "ended_at", "type": {"type": "long", "logicalType": "timestamp-millis"}},
{"name": "start_station_name", "type": "string"},
{"name": "start_station_id", "type": "string"},
{"name": "end_station_name", "type": "string"},
{"name": "end_station_id", "type": "string"},
{"name": "start_lat", "type": "double"},
{"name": "start_lng", "type": "double"},
{
"name": "end_lat",
"type": ["null", "double"],
"default": null
},
{
"name": "end_lng",
"type": ["null", "double"],
"default": null
},
{"name": "member_casual", "type": "string"}
]
}
Default value is : false
Default value is : "yyyy-MM-dd[XXX]"
Default value is : "yyyy-MM-dd'T'HH:mm[:ss][.SSSSSS][XXX]"
Default value is : .
Default value is : false
Default value is : `- f
- "false"
- disabled
- 0
- "off"
- "no"
- ""`
If true, we try to infer all fields using trueValues, falseValues, and nullValues. If false, we will infer bool & null only on fields declared in the schema as null and bool.
Default value is : false
Default value is : false
Default value is : `- ""
- "#N/A"
- "#N/A N/A"
- "#NA"
- -1.#IND
- -1.#QNAN
- -NaN
- 1.#IND
- 1.#QNAN
- NA
- n/a
- nan
- "null"`
Default value is : false
Default value is : "HH:mm[:ss][.SSSSSS][XXX]"
If null, the timezone will be UTC. Default value is the system timezone.
Default value is : Etc/UTC
Default value is : `- t
- "true"
- enabled
- 1
- "on"
- "yes"`
1 nested properties
Examples
Convert a CSV file to the Amazon Ion format.
id: csv_to_ion
namespace: company.team
tasks:
- id: http_download
type: io.kestra.plugin.core.http.Download
uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/products.csv
- id: to_ion
type: io.kestra.plugin.serdes.csv.CsvToIon
from: "{{ outputs.http_download.uri }}"
Default value is : false
Default value is : UTF-8
Default value is : false
Default value is : false
Default value is : ","
Default value is : true
Default value is : false
Default value is : false
Default value is : 0
Default value is : '"'
1 nested properties
Examples
Download a CSV file, transform it in SQL and store the transformed data as a CSV file.
id: ion_to_csv
namespace: company.team
tasks:
- id: download_csv
type: io.kestra.plugin.core.http.Download
description: salaries of data professionals from 2020 to 2023 (source ai-jobs.net)
uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/salaries.csv
- id: avg_salary_by_job_title
type: io.kestra.plugin.jdbc.duckdb.Query
inputFiles:
data.csv: "{{ outputs.download_csv.uri }}"
sql: |
SELECT
job_title,
ROUND(AVG(salary),2) AS avg_salary
FROM read_csv_auto('{{ workingDir }}/data.csv', header=True)
GROUP BY job_title
HAVING COUNT(job_title) > 10
ORDER BY avg_salary DESC;
store: true
- id: result
type: io.kestra.plugin.serdes.csv.IonToCsv
from: "{{ outputs.avg_salary_by_job_title.uri }}"
Default value is : false
Default value is : false
Default value is : UTF-8
Default value is : yyyy-MM-dd
Default value is : "yyyy-MM-dd'T'HH:mm:ss.SSS[XXX]"
Default value is : false
Default value is : ","
Default value is : true
Default value is : |2+
Default value is : false
Default value is : '"'
Default value is : "HH:mm:ss[XXX]"
Default value is : Etc/UTC
1 nested properties
Examples
Convert an Excel file to the Ion format.
id: excel_to_ion
namespace: company.team
tasks:
- id: http_download
type: io.kestra.plugin.core.http.Download
uri: https://huggingface.co/datasets/kestra/datasets/raw/main/excel/Products.xlsx
- id: to_ion
type: io.kestra.plugin.serdes.excel.ExcelToIon
from: "{{ outputs.http_download.uri }}"
Default value is : false
Default value is : UTF-8
Possible values: SERIAL_NUMBER, FORMATTED_STRING
Default value is : UNFORMATTED_VALUE
Default value is : false
Default value is : true
Default value is : false
Default value is : false
Default value is : 0
Possible values: FORMATTED_VALUE, UNFORMATTED_VALUE, FORMULA
Default value is : UNFORMATTED_VALUE
1 nested properties
Examples
Download a CSV file and convert it to the Excel file format.
id: ion_to_excel
namespace: company.team
tasks:
- id: http_download
type: io.kestra.plugin.core.http.Download
uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/products.csv
- id: convert
type: io.kestra.plugin.serdes.csv.CsvToIon
from: "{{ outputs.http_download.uri }}"
- id: to_excel
type: io.kestra.plugin.serdes.excel.IonToExcel
from: "{{ outputs.convert.uri }}"
Download CSV files and convert them into an Excel file with dedicated sheets.
id: excel
namespace: company.team
tasks:
- id: dataset1
type: io.kestra.plugin.core.http.Download
uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/products.csv
- id: dataset2
type: io.kestra.plugin.core.http.Download
uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/fruit.csv
- id: convert1
type: io.kestra.plugin.serdes.csv.CsvToIon
from: "{{ outputs.dataset1.uri }}"
- id: convert2
type: io.kestra.plugin.serdes.csv.CsvToIon
from: "{{ outputs.dataset2.uri }}"
- id: write
type: io.kestra.plugin.serdes.excel.IonToExcel
from:
Sheet_1: "{{ outputs.convert1.uri }}"
Sheet_2: "{{ outputs.convert2.uri }}"
Default value is : false
Default value is : UTF-8
Default value is : yyyy-MM-dd
Default value is : "yyyy-MM-dd'T'HH:mm:ss.SSS[XXX]"
Default value is : false
Default value is : true
Default value is : false
Default value is : Sheet
Excel is limited to 64,000 styles per document, and styles are applied to every date; disable this option when you have a lot of values.
Default value is : true
Default value is : "HH:mm:ss[XXX]"
Default value is : Etc/UTC
1 nested properties
Examples
Download a CSV file and convert it to a JSON format.
id: ion_to_json
namespace: company.team
tasks:
- id: http_download
type: io.kestra.plugin.core.http.Download
uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/products.csv
- id: convert
type: io.kestra.plugin.serdes.csv.CsvToIon
from: "{{ outputs.http_download.uri }}"
- id: to_json
type: io.kestra.plugin.serdes.json.IonToJson
from: "{{ outputs.convert.uri }}"
Default value is : false
Default value is : UTF-8
Default value is : false
Default value is : false
Whether the file is a JSON file with a newline separator (JSONL). Warning: if not, the whole file will be loaded in memory and can lead to an out-of-memory error!
Default value is : true
Default value is : Etc/UTC
1 nested properties
Please note that we support the JSONL format only, i.e. one JSON dictionary/map per line.
Here is what a sample JSONL file content might look like:
{"product_id":"1","product_name":"streamline turn-key systems","product_category":"Electronics","brand":"gomez"}
{"product_id":"2","product_name":"morph viral applications","product_category":"Household","brand":"wolfe"}
{"product_id":"3","product_name":"expedite front-end schemas","product_category":"Household","brand":"davis-martinez"}
We do NOT support an array of JSON objects. A JSON file in the following array format is not supported:
[
{"product_id":"1","product_name":"streamline turn-key systems","product_category":"Electronics","brand":"gomez"},
{"product_id":"2","product_name":"morph viral applications","product_category":"Household","brand":"wolfe"},
{"product_id":"3","product_name":"expedite front-end schemas","product_category":"Household","brand":"davis-martinez"}
]
Examples
Convert a JSON file to the Amazon Ion format.
id: json_to_ion
namespace: company.team
tasks:
- id: http_download
type: io.kestra.plugin.core.http.Download
uri: https://huggingface.co/datasets/kestra/datasets/raw/main/json/products.json
- id: to_ion
type: io.kestra.plugin.serdes.json.JsonToIon
from: "{{ outputs.http_download.uri }}"
Default value is : false
Default value is : UTF-8
Default value is : false
Default value is : false
Whether the file is a JSON file with a newline separator (JSONL). Warning: if not, the whole file will be loaded in memory and can lead to an out-of-memory error!
Default value is : true
1 nested properties
Examples
Read a CSV file, transform it and store the transformed data as a parquet file.
id: ion_to_parquet
namespace: company.team
tasks:
- id: download_csv
type: io.kestra.plugin.core.http.Download
description: salaries of data professionals from 2020 to 2023 (source ai-jobs.net)
uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/salaries.csv
- id: avg_salary_by_job_title
type: io.kestra.plugin.jdbc.duckdb.Query
inputFiles:
data.csv: "{{ outputs.download_csv.uri }}"
sql: |
SELECT
job_title,
ROUND(AVG(salary),2) AS avg_salary
FROM read_csv_auto('{{ workingDir }}/data.csv', header=True)
GROUP BY job_title
HAVING COUNT(job_title) > 10
ORDER BY avg_salary DESC;
store: true
- id: result
type: io.kestra.plugin.serdes.parquet.IonToParquet
from: "{{ outputs.avg_salary_by_job_title.uri }}"
schema: |
{
"type": "record",
"name": "Salary",
"namespace": "com.example.salary",
"fields": [
{"name": "job_title", "type": "string"},
{"name": "avg_salary", "type": "double"}
]
}
Default value is : false
Default value is : GZIP
Default value is : "yyyy-MM-dd[XXX]"
Default value is : "yyyy-MM-dd'T'HH:mm[:ss][.SSSSSS][XXX]"
Default value is : .
Default value is : 1048576
Default value is : false
Default value is : `- f
- "false"
- disabled
- 0
- "off"
- "no"
- ""`
If true, we try to infer all fields using trueValues, falseValues, and nullValues. If false, we will infer bool & null only on fields declared in the schema as null and bool.
Default value is : false
Default value is : false
Default value is : `- ""
- "#N/A"
- "#N/A N/A"
- "#NA"
- -1.#IND
- -1.#QNAN
- -NaN
- 1.#IND
- 1.#QNAN
- NA
- n/a
- nan
- "null"`
Default value is : 1048576
Default value is : 134217728
Default value is : false
Default value is : "HH:mm[:ss][.SSSSSS][XXX]"
If null, the timezone will be UTC. Default value is the system timezone.
Default value is : Etc/UTC
Default value is : `- t
- "true"
- enabled
- 1
- "on"
- "yes"`
Default value is : V2
1 nested properties
Examples
Convert a parquet file to the Amazon Ion format.
id: parquet_to_ion
namespace: company.team
tasks:
- id: http_download
type: io.kestra.plugin.core.http.Download
uri: https://huggingface.co/datasets/kestra/datasets/raw/main/parquet/products.parquet
- id: to_ion
type: io.kestra.plugin.serdes.parquet.ParquetToIon
from: "{{ outputs.http_download.uri }}"
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Examples
Read a CSV file, transform it and store the transformed data as an XML file.
id: ion_to_xml
namespace: company.team
tasks:
- id: download_csv
type: io.kestra.plugin.core.http.Download
description: salaries of data professionals from 2020 to 2023 (source ai-jobs.net)
uri: https://huggingface.co/datasets/kestra/datasets/raw/main/csv/salaries.csv
- id: avg_salary_by_job_title
type: io.kestra.plugin.jdbc.duckdb.Query
inputFiles:
data.csv: "{{ outputs.download_csv.uri }}"
sql: |
SELECT
job_title,
ROUND(AVG(salary),2) AS avg_salary
FROM read_csv_auto('{{ workingDir }}/data.csv', header=True)
GROUP BY job_title
HAVING COUNT(job_title) > 10
ORDER BY avg_salary DESC;
store: true
- id: result
type: io.kestra.plugin.serdes.xml.IonToXml
from: "{{ outputs.avg_salary_by_job_title.uri }}"
Default value is : false
Default value is : UTF-8
Default value is : false
Default value is : false
Default value is : items
Default value is : Etc/UTC
1 nested properties
Examples
Convert an XML file to the Amazon Ion format.
id: xml_to_ion
namespace: company.team
tasks:
- id: http_download
type: io.kestra.plugin.core.http.Download
uri: https://huggingface.co/datasets/kestra/datasets/raw/main/xml/products.xml
- id: to_ion
type: io.kestra.plugin.serdes.xml.XmlToIon
from: "{{ outputs.http_download.uri }}"
Default value is : false
Default value is : UTF-8
Default value is : false
Default value is : false
1 nested properties
Examples
Create an incident.
id: servicenow_post
namespace: company.team
tasks:
- id: post
type: io.kestra.plugin.servicenow.Post
domain: "snow_domain"
username: "snow_username"
password: "snow_password"
clientId: "snow_client_id"
clientSecret: "snow_client_secret"
table: incident
data:
short_description: "API Create Incident..."
requester_id: f8266e2adb16fb00fa638a3a489619d2
requester_for_id: a7ec77cbdefac300d322d182689619dc
product_id: 01a2e3c1db15f340d329d18c689ed922
Will be used to generate the URL: https://[[DOMAIN]].service-now.com/
Default value is : false
Default value is : false
Default value is : false
1 nested properties
Default value is : true
Full documentation can be found here
Please be aware that the larger the time period and amount of data, the longer the initial extraction can be expected to take.
Default value is : false
Default value is : python:3.10.12
Default value is : false
Default value is : false
This could cause records to be missed that were created after the last run finished, but during the same second and with the same timestamp.
Default value is : true
Default value is : singer-state
1 nested properties
Filters are optional, but we strongly recommend using them over a large partitioned table to control the cost.
Full documentation can be found here
Please be aware that the larger the time period and amount of data, the longer the initial extraction can be expected to take.
Default value is : false
Default value is : python:3.10.12
Default value is : false
Default value is : false
Default value is : singer-state
1 nested properties
Full documentation can be found here
Mostly in the form {site}.chargebee.com.
Please be aware that the larger the time period and amount of data, the longer the initial extraction can be expected to take.
Default value is : false
Default value is : python:3.10.12
Default value is : false
Default value is : false
Default value is : 1.0
Default value is : singer-state
1 nested properties
Full documentation can be found here
Please be aware that the larger the time period and amount of data, the longer the initial extraction can be expected to take.
Default value is : false
Default value is : EUR
Default value is : python:3.10.12
Default value is : false
Default value is : false
Default value is : singer-state
1 nested properties
Full documentation can be found here
Please be aware that the larger the time period and amount of data, the longer the initial extraction can be expected to take.
Default value is : false
Default value is : python:3.10.12
Default value is : false
Default value is : 0
Default value is : false
Default value is : singer-state
1 nested properties
Full documentation can be found here
Please be aware that the larger the time period and amount of data, the longer the initial extraction can be expected to take.
Default value is : false
Default value is : python:3.10.12
Default value is : false
Default value is : false
Default value is : singer-state
1 nested properties
Will be saved in config.json and used as arguments.
Default value is : false
Default value is : python:3.10.12
Default value is : false
Default value is : `- PROPERTIES
- DISCOVER
- STATE`
Default value is : false
Default value is : singer-state
1 nested properties
Full documentation can be found here
Login to your GitHub account, go to the Personal Access Tokens settings page, and generate a new token with at least the repo scope.
The repo path is relative to https://github.com/.
For example, the path for this repository is kestra-io/kestra.
Please be aware that the larger the time period and amount of data, the longer the initial extraction can be expected to take.
Default value is : false
Default value is : python:3.10.12
Default value is : false
Default value is : false
Default value is : 300
Default value is : singer-state
1 nested properties
Full documentation can be found here
Please be aware that the larger the time period and amount of data, the longer the initial extraction can be expected to take.
Default value is : false
When an API path is omitted, /api/v4/ is assumed.
Default value is : https://gitlab.com
Default value is : python:3.10.12
Default value is : false
This can slow down extraction considerably because of the many API calls required.
Default value is : false
This can slow down extraction considerably because of the many API calls required.
Default value is : false
Leave empty and provide a project name if you'd like to pull data from a project in a personal user namespace.
Default value is : false
Leave empty and provide a group name to extract data from all group projects.
Default value is : singer-state
Default value is : false
1 nested properties
Full documentation can be found here
Please be aware that the larger the time period and amount of data, the longer the initial extraction can be expected to take.
Default value is : false
Default value is : python:3.10.12
Default value is : 0
Default value is : false
Default value is : false
Default value is : singer-state
Default value is : tap-adwords via Kestra
1 nested properties
Full documentation can be found here
Please be aware that the larger the time period and amount of data, the longer the initial extraction can be expected to take.
Default value is : false
Default value is : python:3.10.12
Default value is : false
Default value is : false
Default value is : singer-state
1 nested properties
Full documentation can be found here
Do not include the domain-level property in the list.
Please be aware that the larger the time period and amount of data, the longer the initial extraction can be expected to take.
Default value is : false
Default value is : python:3.10.12
Default value is : false
Default value is : false
Default value is : singer-state
Default value is : tap-google-search-console via Kestra
1 nested properties
Full documentation can be found here
Please be aware that the larger the time period and amount of data, the longer the initial extraction can be expected to take.
Default value is : false
Default value is : python:3.10.12
Default value is : false
Default value is : false
Default value is : false
Default value is : singer-state
For more information check out Stream Maps.
1 nested properties
Full documentation can be found here
The base URL contains the account id (a.k.a. Munchkin id) and is therefore unique for each Marketo subscription. Your base URL is found by logging into Marketo and navigating to the Admin > Integration > Web Services menu. It is labeled as "Endpoint:" underneath the "REST API" section.
Identity is found directly below the endpoint entry. See https://developers.marketo.com/rest-api/base-url/
Please be aware that the larger the time period and amount of data, the longer the initial extraction can be expected to take.
Default value is : false
Default value is : python:3.10.12
Default value is : false
Default value is : false
Default value is : singer-state
1 nested properties
Full documentation can be found here
This can be found under Setup -> Company -> Company Information. Look for Account Id. Note _SB is for Sandbox account.
This should always be set to true if you are connecting to a Production account of NetSuite. Set it to false if you want to connect to a Sandbox account.
When new fields are discovered in NetSuite objects, the select_fields_by_default key describes whether or not the tap will select those fields by default.
Please be aware that the larger the time period and amount of data, the longer the initial extraction can be expected to take.
Default value is : false
Default value is : python:3.10.12
Default value is : false
Default value is : false
Default value is : singer-state
1 nested properties
Full documentation can be found here
Default value is : false
For LOG_BASED only.
Default value is : 1000
Default value is : python:3.10.12
Default value is : false
Default value is : false
Default value is : false
Default value is : false
Default value is : true
Default value is : singer-state
For LOG_BASED only, the buffer is flushed once the size is reached.
Default value is : 1
1 nested properties
Full documentation can be found here
Examples
host: 127.0.0.1
username: root
password: mysql_passwd
port: 63306
streamsConfigurations:
- stream: Category
replicationMethod: INCREMENTAL
replicationKeys: categoryId
selected: true
- propertiesPattern:
- description
selected: false
Default value is : false
Default value is : python:3.10.12
Default value is : false
Default value is : 50000
Default value is : false
Default value is : `- SET @@session.time_zone="+0:00"
- SET @@session.wait_timeout=28800
- SET @@session.net_read_timeout=3600
- SET @@session.innodb_lock_wait_timeout=3600`
Default value is : `- SET @@session.time_zone="+0:00"
- SET @@session.wait_timeout=28800
- SET @@session.net_read_timeout=3600
- SET @@session.innodb_lock_wait_timeout=3600`
[
"SET @@session.time_zone=\"+0:00\"",
"SET @@session.wait_timeout=28800",
"SET @@session.net_read_timeout=3600",
"SET @@session.innodb_lock_wait_timeout=3600"
]
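The session statements above can be overridden per task. A minimal sketch, assuming the Singer MySQL tap task exposes them through a `sessionSqls` property (both the task type and the property name are assumptions, not confirmed by this schema):

```yaml
id: mysql_session_sqls
namespace: company.team
tasks:
  - id: mysql_tap
    # task type and property name are assumptions for illustration
    type: io.kestra.plugin.singer.taps.PipelinewiseMysql
    host: 127.0.0.1
    username: root
    password: mysql_passwd
    port: 3306
    # overrides the default session statements listed above
    sessionSqls:
      - SET @@session.time_zone="+0:00"
      - SET @@session.wait_timeout=28800
```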
Default value is : false
Default value is : singer-state
1 nested properties
Full documentation can be found here
##### Examples
host: 127.0.0.1
username: oracle
password: oracle_passwd
port: 63306
sid: ORCL
streamsConfigurations:
- stream: Category
replicationMethod: INCREMENTAL
replicationKeys: categoryId
selected: true
- propertiesPattern:
- description
selected: false
Default value is : false
Default value is : python:3.10.12
Default value is : false
Default value is : false
Default value is : singer-state
1 nested properties
Full documentation can be found here
Default value is : false
Default value is : true
Default value is : python:3.10.12
Default value is : false
Default value is : false
Default value is : false
Default value is : 10800
Default value is : 43200
Default value is : false
Default value is : singer-state
1 nested properties
Full documentation can be found here
##### Examples
host: 127.0.0.1
username: SA
password: sqlserver_passwd
port: 57037
filterDbs: dbo
streamsConfigurations:
- stream: Categories
replicationMethod: INCREMENTAL
replicationKeys: CategoryID
selected: true
- propertiesPattern:
- Description
selected: false
Default value is : false
Default value is : python:3.10.12
The common query tuning scenario is for SELECT statements that return a large number of rows over a slow network. Increasing arraysize can improve performance by reducing the number of round-trips to the database. However increasing this value increases the amount of memory required.
Default value is : false
Default value is : false
Default value is : singer-state
Default value is : true
When true, the resulting SCHEMA message will contain an attribute in additionalProperties containing the scale and precision of the discovered property.
1 nested properties
Full documentation can be found here
Please be aware that the larger the time period and amount of data, the longer the initial extraction can be expected to take.
Default value is : false
Default value is : python:3.10.12
Default value is : false
Default value is : false
Default value is : false
Default value is : 8
Default value is : true
Default value is : 1000
Default value is : singer-state
1 nested properties
Full documentation can be found here
Please be aware that the larger the time period and amount of data, the longer the initial extraction can be expected to take.
Default value is : false
Default value is : python:3.10.12
Default value is : false
Default value is : false
Default value is : singer-state
User agent to send to ReCharge along with API requests. Typically includes the name of the integration and an email address where you can be reached.
1 nested properties
Full documentation can be found here
Please be aware that the larger the time period and amount of data, the longer the initial extraction can be expected to take.
Default value is : false
Default value is : python:3.10.12
Default value is : false
Default value is : false
Default value is : singer-state
1 nested properties
Full documentation can be found here
Please be aware that the larger the time period and amount of data, the longer the initial extraction can be expected to take.
Default value is : false
Default value is : BULK
Default value is : python:3.10.12
Default value is : false
Default value is : false
Default value is : false
Default value is : 8
Default value is : true
Default value is : 1000
Default value is : singer-state
1 nested properties
Full documentation can be found here
Ex. my-first-store
Please be aware that the larger the time period and amount of data, the longer the initial extraction can be expected to take.
Default value is : false
Default value is : python:3.10.12
Default value is : false
Default value is : false
Default value is : singer-state
1 nested properties
Full documentation can be found here
Please be aware that the larger the time period and amount of data, the longer the initial extraction can be expected to take.
Default value is : false
Specifies whether the tap will sync archived channels or not. Note that a bot cannot join an archived channel, so unless the bot was added to the channel prior to it being archived it will not be able to sync the data from that channel.
Default value is : false
By default the tap will sync all channels it has been invited to, but this can be overridden to limit it to specific channels. Note this needs to be channel ID, not the name, as recommended by the Slack API. To get the ID for a channel, either use the Slack API or find it in the URL.
Default value is : python:3.10.12
Due to the potentially high volume of data when syncing certain streams (messages, files, threads), this tap implements date windowing based on a configuration parameter. A value of 5 means the tap will sync 5 days of data per request, for applicable streams.
Default value is : 7
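The date-windowing parameter described above can be sketched as follows (the task type and all property names here are assumptions for illustration, not confirmed by this schema):

```yaml
id: slack_sync
namespace: company.team
tasks:
  - id: slack_tap
    # task type and property names are assumptions for illustration
    type: io.kestra.plugin.singer.taps.PipelinewiseSlack
    apiToken: "{{ secret('SLACK_TOKEN') }}"
    # sync 5 days of data per request for messages, files, and threads
    dateWindowSize: 5
```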
Default value is : false
Default value is : false
Specifies whether to sync private channels or not.
Default value is : true
Specifies whether to have the tap auto-join all public channels in your organization.
Default value is : false
Default value is : singer-state
1 nested properties
Full documentation can be found here
Ex. acct_1a2b3c4d5e
Please be aware that the larger the time period and amount of data, the longer the initial extraction can be expected to take.
Default value is : false
Default value is : python:3.10.12
Default value is : false
Default value is : false
Default value is : singer-state
1 nested properties
Full documentation can be found here
Please be aware that the larger the time period and amount of data, the longer the initial extraction can be expected to take.
Default value is : false
Default value is : python:3.10.12
Default value is : false
Default value is : false
Default value is : singer-state
1 nested properties
Full documentation can be found here
Default value is : false
Default value is : python:3.10.12
Default value is : false
Default value is : false
Default value is : singer-state
1 nested properties
Full documentation can be found here
Add _time_extracted and _time_loaded metadata columns.
Default value is : false
Default value is : false
Default value is : python:3.10.12
Default value is : false
Default value is : false
Default value is : 50
By default, multiple state messages from the tap are merged into the state file; if true, the last state message is used as the state file.
Default value is : false
Default value is : append
Default value is : singer-state
This option is disabled by default, and invalid RECORD messages will fail only at load time in Postgres. Enabling this option will detect invalid records earlier but could cause performance degradation.
Default value is : false
1 nested properties
Full documentation can be found here
Default value is : false
Default value is : python:3.10.12
Default value is : ","
Default value is : false
Default value is : false
Default value is : '"'
Default value is : singer-state
1 nested properties
Full documentation can be found here
These indexes will make data loading slightly slower but the deduplication phase much faster. Defaults to on for better baseline performance.
Default value is : true
Default value is : false
There's a slight performance penalty to checking the buffered records count or bytesize, so this controls how often this is polled in order to mitigate the penalty. This value is usually not necessary to set as the default is dynamically adjusted to check reasonably often.
Default is 5000, or 1/40th of maxBatchRows.
Useful for setup like SET ROLE or other connection state that is important.
Default value is : python:3.10.12
Default value is : false
Default value is : true
Default value is : 0
Default value is : false
Set to DEBUG to get things like queries executed, timing of those queries, etc. See Python's Logger Levels for information about valid values.
Default value is : INFO
Default value is : 200000
Default value is : 104857600
Default value is : false
Default value is : public
Default value is : prefer
Default value is : singer-state
1 nested properties
Will be saved in config.json and used as arguments.
Default value is : false
Default value is : python:3.10.12
Default value is : false
Default value is : false
Default value is : singer-state
1 nested properties
Full documentation can be found here
Default value is : false
Default value is : python:3.10.12
Default value is : false
Default value is : false
Default value is : singer-state
1 nested properties
Full documentation can be found here
(i.e. rtXXXXX.eu-central-1)
Default value is : true
Default value is : false
Default value is : python:3.10.12
Default value is : false
Default value is : false
Default value is : singer-state
1 nested properties
Full documentation can be found here
Default value is : false
Default value is : python:3.10.12
Default value is : false
Default value is : false
Default value is : singer-state
1 nested properties
Full documentation can be found here
Metadata columns add extra row-level information about data ingestion (i.e. when the row was read from the source, when it was inserted or deleted in Postgres, etc.). Metadata columns are created automatically by adding extra columns to the tables with the column prefix _SDC_. The column names follow the Stitch naming conventions. Enabling metadata columns will flag deleted rows by setting the _SDC_DELETED_AT metadata column. Without the add_metadata_columns option, rows deleted by singer taps will not be recognisable in Postgres.
Default value is : false
Default value is : false
At the end of each batch, the rows in the batch are loaded into Postgres.
Default value is : 100000
Default value is : python:3.10.12
When value is 0 (default) then flattening functionality is turned off.
Default value is : 0
If schemaMapping is not defined then every stream sent by the tap is loaded into this schema.
Default value is : false
Warning: This may trigger the COPY command to use files with a low number of records.
Default value is : false
When the hard_delete option is true, DELETE SQL commands will be performed in Postgres to delete rows in tables. This is achieved by continuously checking the _SDC_DELETED_AT metadata column sent by the singer tap. Because deleting rows requires metadata columns, the hard_delete option automatically enables the add_metadata_columns option as well.
Default value is : false
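A minimal sketch of the hard-delete behaviour described above (the task type and property names are assumptions for illustration):

```yaml
id: pg_load
namespace: company.team
tasks:
  - id: pg_target
    # task type and property names are assumptions for illustration
    type: io.kestra.plugin.singer.targets.PipelinewisePostgres
    host: 127.0.0.1
    username: postgres
    password: pg_passwd
    dbName: analytics
    # implicitly enables addMetadataColumns, since _SDC_DELETED_AT is required
    hardDelete: true
```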
Default value is : false
Default value is : 16
0 will create a thread for each stream, up to parallelism_max. -1 will create a thread for each CPU core. Any other positive number will create that number of threads, up to parallelism_max.
Default value is : 0
When set to true, stop loading data if no Primary Key is defined.
Default value is : true
Default value is : singer-state
This option is disabled by default, and invalid RECORD messages will fail only at load time in Postgres. Enabling this option will detect invalid records earlier but could cause performance degradation.
Default value is : false
1 nested properties
Full documentation can be found here
If schema_mapping is not defined then every stream sent by the tap is loaded into this schema.
Used for S3 and Redshift copy operations.
Metadata columns add extra row-level information about data ingestion (i.e. when the row was read from the source, when it was inserted or deleted in Redshift, etc.). Metadata columns are created automatically by adding extra columns to the tables with the column prefix SDC. The metadata columns are documented here. Enabling metadata columns will flag deleted rows by setting the _SDC_DELETED_AT metadata column. Without the addMetadataColumns option, rows deleted by singer taps will not be recognisable in Redshift.
Default value is : false
Default value is : false
At the end of each batch, the rows in the batch are loaded into Redshift.
Default value is : 100000
Default value is : bzip2
Default value is : python:3.10.12
Parameters to use in the COPY command when loading data to Redshift. Some basic file formatting parameters are fixed values, and overriding them with custom values is not recommended. They are: CSV GZIP DELIMITER ',' REMOVEQUOTES ESCAPE.
When the hardDelete option is true, DELETE SQL commands will be performed in Redshift to delete rows in tables. This is achieved by continuously checking the _SDC_DELETED_AT metadata column sent by the singer tap. Because deleting rows requires metadata columns, the hardDelete option automatically enables the addMetadataColumns option as well.
Default value is : 0
If schemaMapping is not defined then every stream sent by the tap is granted accordingly.
By default, the connector caches the available table structures in Redshift at startup, so it doesn't need to run additional queries when ingesting data to check whether altering the target tables is required. With the disable_table_cache option you can turn off this caching. You will always see the most recent table structures, but this will cause extra query runtime.
Default value is : false
Default value is : false
Warning: This may trigger the COPY command to use files with a low number of records.
Default value is : false
When the hardDelete option is true, DELETE SQL commands will be performed in Redshift to delete rows in tables. This is achieved by continuously checking the _SDC_DELETED_AT metadata column sent by the singer tap. Because deleting rows requires metadata columns, the hardDelete option automatically enables the addMetadataColumns option as well.
Default value is : false
Default value is : false
Default value is : 16
0 will create a thread for each stream, up to parallelism_max. -1 will create a thread for each CPU core. Any other positive number will create that number of threads, up to parallelism_max.
Default value is : 0
When set to true, stop loading data if no Primary Key is defined.
Default value is : true
AWS Role ARN to be used for the Redshift COPY operation. Used instead of the given AWS keys for the COPY operation if provided - the keys are still used for other S3 operations.
S3 Object ACL.
A static prefix before the generated S3 key names. Using prefixes you can upload files into specific directories in the S3 bucket. Default: None.
Useful if you want to load multiple streams from one tap into multiple Redshift schemas. If the tap sends the stream_id in <schema_name>-<table_name> format, then this option overwrites the default_target_schema value. Note that using schema_mapping you can overwrite the default_target_schema_select_permissions value to grant SELECT permissions to different groups per schema, or optionally create indices automatically for the replicated tables.
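The schema mapping described above can be sketched as follows, following the PipelineWise schema_mapping convention (the exact property shape here is an assumption, not confirmed by this schema):

```yaml
# maps tap streams sent as <schema_name>-<table_name> to Redshift schemas;
# exact property shape is an assumption based on the PipelineWise convention
schemaMapping:
  my_tap_schema:
    target_schema: analytics
    target_schema_select_permissions:
      - analytics_readers
```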
Used for S3 and Redshift copy operations.
S3 AWS STS token for temporary credentials.
Useful to improve performance when records are immutable, e.g. events.
Default value is : false
This should be set to the number of Redshift slices. The number of slices per node depends on the node size of the cluster - run SELECT COUNT(DISTINCT slice) slices FROM stv_slices to calculate this.
Default value is : 1
Default value is : singer-state
This option is disabled by default, and invalid RECORD messages will fail only at load time in Redshift. Enabling this option will detect invalid records earlier but could cause performance degradation.
Default value is : false
1 nested properties
Full documentation can be found here
(i.e. rtXXXXX.eu-central-1)
Default value is : false
Default value is : false
Default value is : false
Default value is : 100000
Default value is : python:3.10.12
Default value is : 0
Default value is : false
Default value is : false
Default value is : false
Default value is : false
Default value is : false
Default value is : false
Default value is : 0
Default value is : 16
Default value is : true
Default value is : singer-state
Default value is : false
1 nested properties
Full documentation can be found here
Default value is : false
Default value is : python:3.10.12
Default value is : false
Default value is : false
Default value is : singer-state
1 nested properties
Examples
Run a scan on BigQuery.
id: soda_scan
namespace: company.team
tasks:
- id: scan
type: io.kestra.plugin.soda.Scan
configuration:
data_source kestra:
type: bigquery
connection:
project_id: kestra-unit-test
dataset: demo
account_info_json: |
{{ secret('GCP_CREDS') }}
checks:
checks for orderDetail:
- row_count > 0
- max(unitPrice):
warn: when between 1 and 250
fail: when > 250
checks for territory:
- row_count > 0
- failed rows:
name: Failed rows query test
fail condition: regionId = 4
requirements:
- soda-core-bigquery
List of Python dependencies to install in the virtualenv, in the same format as requirements.txt. It must at least provide dbt.
Default value is : false
Default value is : sodadata/soda-core
Default value is : false
You can define the files as map or a JSON string. Each file can be defined inlined or can reference a file from Kestra's internal storage.
Default value is : false
Deprecated, use 'taskRunner' instead
Default value is : false
1 nested properties
Examples
Consume messages from a Solace queue.
id: consume_message_from_solace_queue
namespace: company.team
tasks:
- id: consume_from_solace
type: io.kestra.plugin.solace.Consume
host: localhost:55555
username: admin
password: admin
vpn: default
messageDeserializer: JSON
queueName: test_queue
queueType: DURABLE_EXCLUSIVE
Default value is : false
Default value is : false
Default value is : false
Default value is : 10.000000000
Default value is : 100
Default value is : STRING
Configs in key/value pairs.
Default value is : {}
{}
Enables support for message selection based on message header parameter and message properties values.
Default value is : {}
{}
Default value is : default
1 nested properties
Examples
Publish a file as messages into a Solace Broker.
id: send_messages_to_solace_queue
namespace: company.team
inputs:
- id: file
type: FILE
description: a CSV file with columns id, username, tweet, and timestamp
tasks:
- id: read_csv_file
type: io.kestra.plugin.serdes.csv.CsvToIon
from: "{{ inputs.file }}"
- id: transform_row_to_json
type: io.kestra.plugin.scripts.nashorn.FileTransform
from: "{{ outputs.read_csv_file.uri }}"
script: |
var result = {
"payload": {
"username": row.username,
"tweet": row.tweet
},
"properties": {
"correlationId": "42"
}
};
row = result
- id: send_message_to_solace
type: io.kestra.plugin.solace.Produce
from: "{{ outputs.transform_row_to_json.uri }}"
topicDestination: test/tweets
host: localhost:55555
username: admin
password: admin
vpn: default
messageSerializer: "JSON"
Can be an internal storage URI, a map (i.e. a list of key-value pairs) or a list of maps. The following keys are supported: payload, properties.
Default value is : false
Default value is : 60.000000000
Default value is : PERSISTENT
Default value is : false
Default value is : false
Additional properties must be provided with a Key of type String and a Value of type String. Each key can be customer-provided, or it can be one of the Solace message properties.
Default value is : {}
{}
Default value is : STRING
Configs in key/value pairs.
Default value is : {}
{}
Default value is : {}
{}
Default value is : default
1 nested properties
Examples
Trigger flow based on messages received from a Solace broker.
id: trigger_from_solace_queue
namespace: company.team
tasks:
- id: hello
type: io.kestra.plugin.core.log.Log
message: Hello there! I received {{ trigger.messagesCount }} from Solace!
triggers:
- id: read_from_solace
type: io.kestra.plugin.solace.Trigger
interval: PT30S
host: localhost:55555
username: admin
password: admin
vpn: default
messageDeserializer: JSON
queueName: test_queue
queueType: DURABLE_EXCLUSIVE
Default value is : false
The interval between two consecutive polls of the schedule; this can avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the minimal interval must be at least PT30S. See ISO 8601 Durations for more information about available interval values.
Default value is : 60.000000000
Default value is : false
Default value is : 10.000000000
Default value is : 100
Default value is : STRING
Configs in key/value pairs.
Default value is : {}
{}
Enables support for message selection based on message header parameter and message properties values.
Default value is : {}
{}
Default value is : default
1 nested properties
Examples
id: spark_jar_submit
namespace: company.team
inputs:
- id: file
type: FILE
tasks:
- id: jar_submit
type: io.kestra.plugin.spark.JarSubmit
runner: DOCKER
master: spark://localhost:7077
mainResource: "{{ inputs.file }}"
mainClass: spark.samples.App
This should be the location of a JAR file for Scala/Java applications, or a Python script for PySpark applications. Must be an internal storage URI.
Spark master URL formats.
Default value is : false
Must be an internal storage URI.
Default value is : bitnami/spark
Default value is : false
Must be an internal storage URI.
Default value is : false
Deprecated - use 'taskRunner' instead.
Default value is : spark-submit
Default value is : false
1 nested properties
Examples
id: spark_python_submit
namespace: company.team
tasks:
- id: python_submit
type: io.kestra.plugin.spark.PythonSubmit
runner: DOCKER
docker:
networkMode: host
user: root
master: spark://localhost:7077
args:
- "10"
mainScript: |
import sys
from random import random
from operator import add
from pyspark.sql import SparkSession
if __name__ == "__main__":
spark = SparkSession.builder.appName("PythonPi").getOrCreate()
partitions = int(sys.argv[1]) if len(sys.argv) > 1 else 2
n = 100000 * partitions
def f(_: int) -> float:
x = random() * 2 - 1
y = random() * 2 - 1
return 1 if x ** 2 + y ** 2 <= 1 else 0
count = spark.sparkContext.parallelize(range(1, n + 1), partitions).map(f).reduce(add)
print("Pi is roughly %f" % (4.0 * count / n))
spark.stop()
Spark master URL formats.
Default value is : false
Must be an internal storage URI.
Default value is : bitnami/spark
Default value is : false
Default value is : false
Must be an internal storage URI.
Deprecated - use 'taskRunner' instead.
Default value is : spark-submit
Default value is : false
1 nested properties
Examples
id: spark_r_submit
namespace: company.team
tasks:
- id: r_submit
type: io.kestra.plugin.spark.RSubmit
runner: DOCKER
docker:
networkMode: host
user: root
master: spark://localhost:7077
mainScript: |
library(SparkR, lib.loc = c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib")))
sparkR.session()
print("The SparkR session has initialized successfully.")
sparkR.stop()
Spark master URL formats.
Default value is : false
Must be an internal storage URI.
Default value is : bitnami/spark
Default value is : false
Default value is : false
Deprecated - use 'taskRunner' instead.
Default value is : spark-submit
Default value is : false
1 nested properties
Examples
Submit a PySpark job to a master node.
id: spark_cli
namespace: company.team
tasks:
- id: hello
type: io.kestra.plugin.spark.SparkCLI
inputFiles:
pi.py: |
import sys
from random import random
from operator import add
from pyspark.sql import SparkSession
if __name__ == "__main__":
spark = SparkSession.builder.appName("PythonPi").getOrCreate()
partitions = int(sys.argv[1]) if len(sys.argv) > 1 else 2
n = 100000 * partitions
def f(_: int) -> float:
x = random() * 2 - 1
y = random() * 2 - 1
return 1 if x ** 2 + y ** 2 <= 1 else 0
count = spark.sparkContext.parallelize(range(1, n + 1), partitions).map(f).reduce(add)
print("Pi is roughly %f" % (4.0 * count / n))
spark.stop()
docker:
image: bitnami/spark
networkMode: host
commands:
- spark-submit --name Pi --master spark://localhost:7077 pi.py
Default value is : false
Default value is : bitnami/spark
Default value is : false
If set to false, all commands will be executed one after the other. The final state of the task execution is determined by the last command. Note that this property may be ignored if a non-compatible interpreter is specified.
You can also disable it if your interpreter does not support the set -e option.
Default value is : true
Default value is : `- /bin/sh
- -c`
Default value is : `- /bin/sh
- -c`
[
"/bin/sh",
"-c"
]
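The default interpreter shown above can be swapped, e.g. for bash. A sketch, assuming the script task exposes an `interpreter` list property (the property name is an assumption; the task type appears elsewhere in this schema):

```yaml
id: bash_interpreter
namespace: company.team
tasks:
  - id: run
    type: io.kestra.plugin.scripts.shell.Commands
    # property name assumed; replaces the default ["/bin/sh", "-c"]
    interpreter:
      - /bin/bash
      - -c
    commands:
      - echo "running under bash"
```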
Default value is : false
Required to use the {{ outputDir }} expression. Note that it could increase the starting time. Deprecated, use the outputFiles property instead.
Default value is : "false"
Must be a list of glob expressions relative to the current working directory. Some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
Only used if the taskRunner property is not set
Default value is : AUTO
Default value is : true
1 nested properties
Examples
Orchestrate a SQLMesh project by automatically applying the plan
id: sqlmesh_transform
namespace: company.team
tasks:
- id: transform
type: io.kestra.plugin.sqlmesh.cli.SQLMeshCLI
beforeCommands:
- sqlmesh init duckdb
commands:
- sqlmesh plan --auto-apply
Default value is : false
Default value is : ghcr.io/kestra-io/sqlmesh
Default value is : false
Default value is : false
Must be a list of glob expressions relative to the current working directory. Some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
1 nested properties
Examples
Send a SurrealQL query to a SurrealDB database.
id: surrealdb_query
namespace: company.team
tasks:
- id: select
type: io.kestra.plugin.surrealdb.Query
useTls: true
host: localhost
port: 8000
username: surreal_user
password: surreal_passwd
database: surreal_db
namespace: surreal_namespace
query: SELECT * FROM SURREAL_TABLE
fetchType: STORE
Default value is : false
Default value is : 60
Default value is : false
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : STORE
Default value is : false
See the SurrealDB documentation about SurrealQL Prepared Statements for query syntax. This should be supplied with a parameter map using named parameters.
Default value is : {}
{}
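Supplying the named-parameter map described above can be sketched as follows (the `parameters` property name is an assumption; the task type and connection properties come from the example in this document):

```yaml
id: surrealdb_param_query
namespace: company.team
tasks:
  - id: select
    type: io.kestra.plugin.surrealdb.Query
    host: localhost
    port: 8000
    username: surreal_user
    password: surreal_passwd
    namespace: surreal_namespace
    database: surreal_db
    query: SELECT * FROM type::table($tableName) WHERE region = $region
    # property name assumed; keys match the $-placeholders in the query
    parameters:
      tableName: SURREAL_TABLE
      region: eu-west-1
    fetchType: STORE
```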
Default value is : 8000
Default value is : false
1 nested properties
Examples
Wait for SurrealQL query to return results, and then iterate through rows.
id: surrealdb_trigger
namespace: company.team
tasks:
- id: each
type: io.kestra.plugin.core.flow.EachSequential
tasks:
- id: return
type: io.kestra.plugin.core.debug.Return
format: "{{ json(taskrun.value) }}"
value: "{{ trigger.rows }}"
triggers:
- id: watch
type: io.kestra.plugin.surrealdb.Trigger
interval: "PT5M"
host: localhost
port: 8000
username: surreal_user
password: surreal_passwd
namespace: surreal_namespace
database: surreal_db
fetchType: FETCH
query: SELECT * FROM SURREAL_TABLE
Default value is : 60
Default value is : false
FETCH_ONE - output the first row. FETCH - output all rows as output variable. STORE - store all rows to a file. NONE - do nothing.
Default value is : STORE
The interval between two consecutive polls of the schedule; this can avoid overloading the remote system with too many calls. For most triggers that depend on external systems, the minimal interval must be at least PT30S. See ISO 8601 Durations for more information about available interval values.
Default value is : 60.000000000
Default value is : false
See the SurrealDB documentation about SurrealQL Prepared Statements for query syntax. This should be supplied with a parameter map using named parameters.
Default value is : {}
{}
Default value is : 8000
Default value is : false
1 nested properties
Examples
Initialize Terraform, then create and apply the Terraform plan
id: git_terraform
namespace: company.team
tasks:
- id: git
type: io.kestra.plugin.core.flow.WorkingDirectory
tasks:
- id: clone_repository
type: io.kestra.plugin.git.Clone
url: https://github.com/anna-geller/kestra-ci-cd
branch: main
- id: terraform
type: io.kestra.plugin.terraform.cli.TerraformCLI
beforeCommands:
- terraform init
inputFiles:
terraform.tfvars: |
username = "cicd"
password = "{{ secret('CI_CD_PASSWORD') }}"
hostname = "https://demo.kestra.io"
outputFiles:
- "*.txt"
commands:
- terraform plan 2>&1 | tee plan_output.txt
- terraform apply -auto-approve 2>&1 | tee apply_output.txt
env:
AWS_ACCESS_KEY_ID: "{{ secret('AWS_ACCESS_KEY_ID') }}"
AWS_SECRET_ACCESS_KEY: "{{ secret('AWS_SECRET_ACCESS_KEY') }}"
AWS_DEFAULT_REGION: "{{ secret('AWS_DEFAULT_REGION') }}"
Default value is : false
Default value is : hashicorp/terraform
Default value is : false
Default value is : false
Must be a list of glob expressions relative to the current working directory. Some examples: my-dir/**, my-dir/*/** or my-dir/my-file.txt.
1 nested properties
Examples
Extract text from a file.
id: tika_parse
namespace: company.team
inputs:
- id: file
type: FILE
tasks:
- id: parse
type: io.kestra.plugin.tika.Parse
from: '{{ inputs.file }}'
extractEmbedded: true
store: false
Extract text from an image using OCR.
id: tika_parse
namespace: company.team
inputs:
- id: file
type: FILE
tasks:
- id: parse
type: io.kestra.plugin.tika.Parse
from: '{{ inputs.file }}'
ocrOptions:
strategy: OCR_AND_TEXT_EXTRACTION
store: true
Default value is : false
Default value is : XHTML
Default value is : false
Default value is : false
Must be an internal storage URI.
Default value is : false
Default value is : true
1 nested properties
Apache Tika will run preprocessing of images (rotation detection and image normalizing with ImageMagick) before sending the image to Tesseract, if the user has included the dependencies (listed below) and opts to include these preprocessing steps.
You need to install Tesseract to enable OCR processing, along with a Tesseract language pack.
Default value is : NO_OCR
The TransformItems task is similar to the famous Logstash Grok filter from the ELK stack.
It is particularly useful for transforming unstructured data such as logs into a structured, indexable, and queryable data structure.
The TransformItems task ships with all the default patterns; you can find them here: https://github.com/kestra-io/plugin-transform/tree/main/plugin-transform-grok/src/main/resources/patterns.
Examples
Consume, parse, and structure logs events from Kafka topic.
id: grok_transform_items
namespace: company.team
tasks:
- id: transform_items
type: io.kestra.plugin.transform.grok.TransformItems
pattern: "%{TIMESTAMP_ISO8601:logdate} %{LOGLEVEL:loglevel} %{GREEDYDATA:message}"
from: "{{ trigger.uri }}"
triggers:
- id: trigger
type: io.kestra.plugin.kafka.Trigger
topic: test_kestra
properties:
bootstrap.servers: localhost:9092
serdeProperties:
schema.registry.url: http://localhost:8085
keyDeserializer: STRING
valueDeserializer: STRING
groupId: kafkaConsumerGroupId
interval: PT30S
maxRecords: 5
Must be a kestra:// internal storage URI.
Default value is : false
The first successful match by grok will result in the task being finished. Set to false if you want the task to try all configured patterns.
Default value is : true
Default value is : false
When an optional field cannot be captured, the empty field is retained in the output. Set to false if you want empty optional fields to be filtered out.
Default value is : false
Default value is : false
Default value is : true
A map of pattern-name and pattern pairs defining custom patterns to be used by the current tasks. Patterns matching existing names will override the pre-existing definition.
Directories must be paths relative to the working directory.
1 nested properties
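Defining a custom pattern as described above can be sketched as follows (the `patternDefinitions` property name is an assumption; the task type comes from the example in this document):

```yaml
id: grok_custom_pattern
namespace: company.team
tasks:
  - id: transform_items
    type: io.kestra.plugin.transform.grok.TransformItems
    from: "{{ trigger.uri }}"
    # ORDER_ID is a custom pattern; the property name below is an assumption
    pattern: "%{ORDER_ID:order} %{GREEDYDATA:message}"
    patternDefinitions:
      ORDER_ID: "ORD-[0-9]{6}"
```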
The TransformValue task is similar to the famous Logstash Grok filter from the ELK stack.
It is particularly useful for transforming unstructured data such as logs into a structured, indexable, and queryable data structure.
The TransformValue task ships with all the default patterns; you can find them here: https://github.com/kestra-io/plugin-transform/tree/main/plugin-transform-grok/src/main/resources/patterns.
Examples
Consume, parse, and structure log events from a Kafka topic.
id: grok_transform_value
namespace: company.team

tasks:
  - id: transform_value
    type: io.kestra.plugin.transform.grok.TransformValue
    pattern: "%{TIMESTAMP_ISO8601:logdate} %{LOGLEVEL:loglevel} %{GREEDYDATA:message}"
    from: "{{ trigger.value }}"

  - id: log_on_warn
    type: io.kestra.plugin.core.flow.If
    condition: "{{ grok.value['LOGLEVEL'] == 'ERROR' }}"
    then:
      - id: when_true
        type: io.kestra.plugin.core.log.Log
        message: "{{ outputs.transform_value.value }}"

triggers:
  - id: realtime_trigger
    type: io.kestra.plugin.kafka.RealtimeTrigger
    topic: test_kestra
    properties:
      bootstrap.servers: localhost:9092
    serdeProperties:
      schema.registry.url: http://localhost:8085
    keyDeserializer: STRING
    valueDeserializer: STRING
    groupId: kafkaConsumerGroupId
Default value is : false
The first successful grok match finishes the task. Set to false if you want the task to try all configured patterns.
Default value is : true
Default value is : false
When an optional field cannot be captured, the empty field is retained in the output. Set to false if you want empty optional fields to be filtered out.
Default value is : false
Default value is : false
Default value is : true
A map of pattern-name/pattern pairs defining custom patterns to be used by the current task. Patterns matching existing names override the pre-existing definitions.
Directories must be paths relative to the working directory.
1 nested properties
JSONata is a sophisticated query and transformation language for JSON data.
Examples
Transform JSON payload using JSONata expression.
id: jsonata_example
namespace: company.team

tasks:
  - id: http_download
    type: io.kestra.plugin.core.http.Download
    uri: https://dummyjson.com/products

  - id: get_product_and_brand_name
    description: "String Transformation"
    type: io.kestra.plugin.transform.jsonata.TransformItems
    from: "{{ outputs.http_download.uri }}"
    expression: products.(title & ' by ' & brand)

  - id: get_total_price
    description: "Number Transformation"
    type: io.kestra.plugin.transform.jsonata.TransformItems
    from: "{{ outputs.http_download.uri }}"
    expression: $sum(products.price)

  - id: get_discounted_price
    type: io.kestra.plugin.transform.jsonata.TransformItems
    from: "{{ outputs.http_download.uri }}"
    expression: $sum(products.(price-(price*discountPercentage/100)))

  - id: sum_up
    description: "Writing out results in the form of JSON"
    type: io.kestra.plugin.transform.jsonata.TransformItems
    from: "{{ outputs.http_download.uri }}"
    expression: |
      {
        "total_products": $count(products),
        "total_price": $sum(products.price),
        "total_discounted_price": $sum(products.(price-(price*discountPercentage/100)))
      }
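The JSONata expressions used above correspond to ordinary mapping and aggregation operations. A hedged plain-Python illustration of what they compute (the sample payload here is invented for the sketch, not the actual dummyjson.com response):

```python
# Hedged sketch: plain-Python equivalents of the JSONata expressions above,
# over a hypothetical payload shaped like {"products": [...]}.
payload = {
    "products": [
        {"title": "Phone", "brand": "Acme", "price": 100.0, "discountPercentage": 10.0},
        {"title": "Laptop", "brand": "Globex", "price": 500.0, "discountPercentage": 5.0},
    ]
}

# products.(title & ' by ' & brand)
names = [f"{p['title']} by {p['brand']}" for p in payload["products"]]

# $sum(products.price)
total_price = sum(p["price"] for p in payload["products"])

# $sum(products.(price-(price*discountPercentage/100)))
discounted = sum(p["price"] * (1 - p["discountPercentage"] / 100) for p in payload["products"])

print(names, total_price, discounted)
```

JSONata's `products.(...)` maps an expression over each array element, and `$sum(...)` aggregates the result, just as the comprehension and `sum` do here.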
Must be a kestra:// internal storage URI.
Default value is : false
Default value is : false
If the JSONata expression results in a JSON array and this property is set to true, then a record will be written for each element. Otherwise, the JSON array is kept as a single record.
Default value is : true
Default value is : false
Default value is : 1000
1 nested properties
JSONata is a sophisticated query and transformation language for JSON data.
Examples
Transform JSON data using a JSONata expression.
id: jsonata_transform_value
namespace: company.team

tasks:
  - id: transform_json
    type: io.kestra.plugin.transform.jsonata.TransformValue
    from: |
      {
        "order_id": "ABC123",
        "first_name": "John",
        "last_name": "Doe",
        "address": {
          "city": "Paris",
          "country": "France"
        },
        "items": [
          {
            "product_id": "001",
            "name": "Apple",
            "quantity": 5,
            "price_per_unit": 0.5
          },
          {
            "product_id": "002",
            "name": "Banana",
            "quantity": 3,
            "price_per_unit": 0.3
          },
          {
            "product_id": "003",
            "name": "Orange",
            "quantity": 2,
            "price_per_unit": 0.4
          }
        ]
      }
    expression: |
      {
        "order_id": order_id,
        "customer_name": first_name & ' ' & last_name,
        "address": address.city & ', ' & address.country,
        "total_price": $sum(items.(quantity * price_per_unit))
      }
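For reference, a hedged plain-Python rendering of the same transformation (not JSONata itself, just the equivalent computation over the order document above):

```python
# Hedged sketch: plain-Python equivalent of the JSONata expression above.
order = {
    "order_id": "ABC123",
    "first_name": "John",
    "last_name": "Doe",
    "address": {"city": "Paris", "country": "France"},
    "items": [
        {"name": "Apple", "quantity": 5, "price_per_unit": 0.5},
        {"name": "Banana", "quantity": 3, "price_per_unit": 0.3},
        {"name": "Orange", "quantity": 2, "price_per_unit": 0.4},
    ],
}

result = {
    "order_id": order["order_id"],
    # first_name & ' ' & last_name
    "customer_name": f"{order['first_name']} {order['last_name']}",
    # address.city & ', ' & address.country
    "address": f"{order['address']['city']}, {order['address']['country']}",
    # $sum(items.(quantity * price_per_unit)), rounded for float cleanliness
    "total_price": round(sum(i["quantity"] * i["price_per_unit"] for i in order["items"]), 2),
}
print(result)
```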
Must be a valid JSON string.
Default value is : false
Default value is : false
Default value is : false
Default value is : 1000
1 nested properties
Data can be provided either as an ION-serialized file or as a list of key-value pairs. If the schema doesn't exist yet, it will be created automatically.
Examples
Send a batch object creation request to a Weaviate database.
id: weaviate_batch_load
namespace: company.team

tasks:
  - id: batch_load
    type: io.kestra.plugin.weaviate.BatchCreate
    url: "https://demo-cluster-id.weaviate.network"
    apiKey: "{{ secret('WEAVIATE_API_KEY') }}"
    className: WeaviateDemo
    objects:
      - textField: "some text"
        numField: 24
      - textField: "another text"
        numField: 42
Send a batch object creation request to a Weaviate database using an ION input file, e.g. passed from the output of another task.
id: weaviate_batch_insert
namespace: company.team

tasks:
  - id: extract
    type: io.kestra.plugin.core.http.Download
    uri: https://huggingface.co/datasets/kestra/datasets/raw/main/ion/ingest.ion

  - id: batch_insert
    type: io.kestra.plugin.weaviate.BatchCreate
    url: "https://demo-cluster-id.weaviate.network"
    apiKey: "{{ secret('WEAVIATE_API_KEY') }}"
    className: Titles
    objects: "{{ outputs.extract.uri }}"
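Conceptually, each entry in `objects` becomes one object in a Weaviate batch request. A hedged sketch of the kind of JSON body Weaviate's REST batch endpoint (`POST /v1/batch/objects`) accepts; this only builds the payload locally and makes no network call:

```python
import json

# Hedged sketch: shape of a Weaviate batch-insert request body.
# Class and field names mirror the first example above; this is an
# illustration of the REST payload, not the task's internal code.
objects = [
    {"textField": "some text", "numField": 24},
    {"textField": "another text", "numField": 42},
]

payload = {
    "objects": [
        {"class": "WeaviateDemo", "properties": props} for props in objects
    ]
}

body = json.dumps(payload)
print(body)
```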
ION file URI or a list of objects to insert.
Example: localhost:8080 or https://cluster-id.weaviate.network
Default value is : false
If not provided, the anonymous authentication scheme will be used.
Default value is : false
Default value is : {}
{}
Default value is : false
1 nested properties
Examples
Send a delete request to a Weaviate database, using the object ID or other properties.
id: weaviate_delete_flow
namespace: company.team

tasks:
  - id: delete
    type: io.kestra.plugin.weaviate.Delete
    url: https://demo-cluster-id.weaviate.network
    className: WeaviateObject
    filter:
      fieldName: field value to be deleted by
Example: localhost:8080 or https://cluster-id.weaviate.network
Default value is : false
If not provided, the anonymous authentication scheme will be used.
Default value is : false
Default value is : {}
{}
Default value is : false
1 nested properties
Examples
Execute a GraphQL query to fetch data from a Weaviate database.
id: weaviate_query
namespace: company.team

tasks:
  - id: query
    type: io.kestra.plugin.weaviate.Query
    url: https://demo-cluster-id.weaviate.network
    apiKey: "{{ secret('WEAVIATE_API_KEY') }}"
    query: |
      {
        Get {
          Question(limit: 5) {
            question
            answer
            category
          }
        }
      }
Query data from a Weaviate database using Generative Search with OpenAI.
id: weaviate_generative_search
namespace: company.team

tasks:
  - id: query
    type: io.kestra.plugin.weaviate.Query
    url: https://demo-cluster-id.weaviate.network
    apiKey: "{{ secret('WEAVIATE_API_KEY') }}"
    headers:
      X-OpenAI-Api-Key: "{{ secret('OPENAI_API_KEY') }}"
    query: |
      {
        Get {
          Question(limit: 5, nearText: {concepts: ["biology"]}) {
            question
            answer
            category
          }
        }
      }
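The `query` string travels as a standard GraphQL request body. A hedged sketch of the JSON envelope a GraphQL endpoint such as Weaviate's `/v1/graphql` expects (payload construction only, no request sent):

```python
import json

# Hedged sketch: a GraphQL query is posted as a JSON body {"query": "..."}.
query = """
{
  Get {
    Question(limit: 5) {
      question
      answer
      category
    }
  }
}
"""

body = json.dumps({"query": query})
print(body)
```

Extra headers on the task, such as `X-OpenAI-Api-Key` in the generative-search example, are sent as plain HTTP headers alongside this body.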
Example: localhost:8080 or https://cluster-id.weaviate.network
Default value is : false
If not provided, the anonymous authentication scheme will be used.
Default value is : false
FETCH_ONE outputs only the first row.
FETCH outputs all rows.
STORE stores all rows in a file.
NONE doesn't store any data; it's particularly useful when you execute DDL statements or run queries that insert data into another table, e.g. using INSERT INTO ... SELECT statements.
Default value is : STORE
Default value is : {}
{}
Default value is : false
1 nested properties
Examples
Send a schema creation request to a Weaviate database.
id: create_weaviate_schema
namespace: company.team

tasks:
  - id: schema
    type: io.kestra.plugin.weaviate.SchemaCreate
    url: "https://demo-cluster-id.weaviate.network"
    apiKey: "{{ secret('WEAVIATE_API_KEY') }}"
    className: Movies
    fields:
      name:
        - string
      description:
        - string
      category:
        - string
Example: localhost:8080 or https://cluster-id.weaviate.network
Default value is : false
If not provided, the anonymous authentication scheme will be used.
Default value is : false
Requires a field name and a list of data types that will be stored in that field.
Default value is : {}
{}
Default value is : false
1 nested properties
Examples
Create a Zendesk ticket using a username and API token.
id: zendesk_flow
namespace: company.team

tasks:
  - id: create_ticket
    type: io.kestra.plugin.zendesk.tickets.Create
    domain: mycompany.zendesk.com
    username: [email protected]
    token: zendesk_api_token
    subject: "Increased 5xx in Demo Service"
    description: |
      "The number of 5xx has increased beyond the threshold for Demo service."
    priority: NORMAL
    ticketType: INCIDENT
    assigneeId: 1
    tags:
      - bug
      - workflow
Create a Zendesk ticket using an OAuth token.
id: zendesk_flow
namespace: company.team

tasks:
  - id: create_ticket
    type: io.kestra.plugin.zendesk.tickets.Create
    domain: mycompany.zendesk.com
    oauthToken: zendesk_oauth_token
    subject: "Increased 5xx in Demo Service"
    description: |
      "The number of 5xx has increased beyond the threshold for Demo service."
    priority: NORMAL
    ticketType: INCIDENT
    assigneeId: 1
    tags:
      - bug
      - workflow
Create a ticket when a Kestra workflow in any namespace with "company" as prefix fails.
id: create_ticket_on_failure
namespace: company.team

tasks:
  - id: create_ticket
    type: io.kestra.plugin.zendesk.tickets.Create
    domain: mycompany.zendesk.com
    oauthToken: zendesk_oauth_token
    subject: Workflow failed
    description: |
      "{{ execution.id }} has failed on {{ taskrun.startDate }}.
      See the link below for more details."
    priority: NORMAL
    ticketType: INCIDENT
    assigneeId: 1
    tags:
      - bug
      - workflow

triggers:
  - id: on_failure
    type: io.kestra.plugin.core.trigger.Flow
    conditions:
      - type: io.kestra.plugin.core.condition.ExecutionStatusCondition
        in:
          - FAILED
          - WARNING
      - type: io.kestra.plugin.core.condition.ExecutionNamespaceCondition
        namespace: company
        comparison: PREFIX
Default value is : false
Default value is : false
Default value is : false
Available values:
- URGENT
- HIGH
- NORMAL
- LOW
Available values:
- PROBLEM
- INCIDENT
- QUESTION
- TASK
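These enumerated `priority` and `ticketType` values correspond to Zendesk's REST ticket fields, which the public Ticket API (`POST /api/v2/tickets.json`) expects as lowercase strings nested under a `ticket` key. A hedged sketch of building such a payload; the helper function is hypothetical and no request is sent:

```python
import json

# Hedged sketch: shape of a Zendesk ticket-creation request body.
# build_ticket is a hypothetical helper for illustration only.
def build_ticket(subject: str, description: str, priority: str,
                 ticket_type: str, tags: list) -> str:
    return json.dumps({
        "ticket": {
            "subject": subject,
            "comment": {"body": description},
            "priority": priority.lower(),   # e.g. NORMAL -> "normal"
            "type": ticket_type.lower(),    # e.g. INCIDENT -> "incident"
            "tags": tags,
        }
    })

body = build_ticket(
    "Increased 5xx in Demo Service",
    "The number of 5xx has increased beyond the threshold for Demo service.",
    "NORMAL",
    "INCIDENT",
    ["bug", "workflow"],
)
print(body)
```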