Type object
File match source-*-manifest.yaml destination-*-manifest.yaml **/source-*/manifest.yaml **/destination-*/manifest.yaml
Schema URL https://catalog.lintel.tools/schemas/schemastore/airbyte-declarative-connectors-specification-manifest-yaml/latest.json
Source https://raw.githubusercontent.com/airbytehq/airbyte-python-cdk/49c5a482de7bdfbaa3a68373a940b90c0690a56f/airbyte_cdk/sources/declarative/generated/declarative_component_schema.json

Validate with Lintel

npx @lintel/lintel check

An API source that extracts data according to its declarative components.

Properties

type string required
Values: "DeclarativeSource"
check CheckStream | CheckDynamicStream required
version string required

The version of the Airbyte CDK used to build and test the source.

streams ConditionalStreams | DeclarativeStream | StateDelegatingStream[]
dynamic_streams DynamicDeclarativeStream[]
schemas object

The stream schemas representing the shape of the data emitted by the stream.

definitions object
spec object

A source specification made up of connector metadata and how it can be configured.

5 nested properties
type string required
Values: "Spec"
connection_specification object required

A connection specification describing how the connector can be configured.

documentation_url string

URL of the connector's documentation page.

Examples: "https://docs.airbyte.com/integrations/sources/dremio"
advanced_auth object

Additional and optional specification object to describe what an 'advanced' Auth flow would need to function.

  • A connector should be able to fully function with the configuration as described by the ConnectorSpecification in a 'basic' mode.
  • The 'advanced' mode provides easier UX for the user with UI improvements and automations. However, this requires further setup on the server side by instance or workspace admins beforehand. The trade-off is that the user does not have to provide as many technical inputs anymore and the auth process is faster and easier to complete.
4 nested properties
auth_flow_type string

The type of auth to use

Values: "oauth2.0" "oauth1.0"
predicate_key string[]

JSON path to a field in the connectorSpecification that should exist for the advanced auth to be applicable.

Examples: ["credentials","auth_type"]
predicate_value string

Value of the predicate_key fields for the advanced auth to be applicable.

Examples: "Oauth"
oauth_config_specification object

Specification describing how an 'advanced' Auth flow would need to function.

5 nested properties
oauth_user_input_from_connector_config_specification object

OAuth-specific blob. This is a JSON Schema used to validate JSON configurations used as input to OAuth. Must be a valid non-nested JSON that refers to properties from ConnectorSpecification.connectionSpecification using the special annotation 'path_in_connector_config'. These are input values that the user enters through the UI to authenticate to the connector and that might also be shared as inputs for syncing data via the connector. For example:

  • If no connector values are shared during the OAuth flow: oauth_user_input_from_connector_config_specification=[]
  • If connector values such as 'app_id' at the top level are used to generate the API URL for the OAuth flow: oauth_user_input_from_connector_config_specification={ app_id: { type: string, path_in_connector_config: ['app_id'] } }
  • If connector values such as 'info.app_id' nested inside another object are used to generate the API URL for the OAuth flow: oauth_user_input_from_connector_config_specification={ app_id: { type: string, path_in_connector_config: ['info', 'app_id'] } }

Examples: {"app_id":{"type":"string","path_in_connector_config":["app_id"]}}, {"app_id":{"type":"string","path_in_connector_config":["info","app_id"]}}
oauth_connector_input_specification object

The DeclarativeOAuth specific blob. Pertains to the fields defined by the connector relating to the OAuth flow.

Interpolation capabilities:

  • Variable placeholders are declared as {{my_var}}.

  • Nested variable resolution, such as {{ {{my_nested_var}} }}, is also allowed.

  • The allowed interpolation context is:

    • base64Encoder - encode to base64, {{ {{my_var_a}}:{{my_var_b}} | base64Encoder }}
    • base64Decoder - decode from a base64-encoded string, {{ {{my_string_variable_or_string_value}} | base64Decoder }}
    • urlEncoder - encode the input string to URL-like format, {{ https://test.host.com/endpoint | urlEncoder}}
    • urlDecoder - decode the input URL-encoded string into text format, {{ urlDecoder:https%3A%2F%2Fairbyte.io | urlDecoder}}
    • codeChallengeS256 - get the codeChallenge encoded value to provide additional data-provider specific authorisation values, {{ {{state_value}} | codeChallengeS256 }}

complete_oauth_output_specification object

OAuth-specific blob. This is a JSON Schema used to validate JSON configurations produced by the OAuth flows as they are returned by the remote OAuth APIs. Must be a valid JSON describing the fields to merge back to ConnectorSpecification.connectionSpecification. For each field, a special annotation path_in_connector_config can be specified to determine where to merge it. For example: complete_oauth_output_specification={ refresh_token: { type: string, path_in_connector_config: ['credentials', 'refresh_token'] } }

Examples: {"refresh_token":{"type":"string","path_in_connector_config":["credentials","refresh_token"]}}
complete_oauth_server_input_specification object

OAuth-specific blob. This is a JSON Schema used to validate JSON configurations persisted as Airbyte Server configurations. Must be a valid non-nested JSON describing additional fields configured by the Airbyte Instance or Workspace Admins to be used by the server when completing an OAuth flow (typically exchanging an auth code for a refresh token). For example: complete_oauth_server_input_specification={ client_id: { type: string }, client_secret: { type: string } }

Examples: {"client_id":{"type":"string"},"client_secret":{"type":"string"}}
complete_oauth_server_output_specification object

OAuth-specific blob. This is a JSON Schema used to validate JSON configurations persisted as Airbyte Server configurations that also need to be merged back into the connector configuration at runtime. This is a subset of complete_oauth_server_input_specification that retains only the fields necessary for the connector to function with OAuth. Some fields may be used during OAuth flows but not needed afterwards; those are listed in complete_oauth_server_input_specification but not in complete_oauth_server_output_specification. Must be a valid non-nested JSON describing additional fields configured by the Airbyte Instance or Workspace Admins to be used by the connector when using OAuth flow APIs. These fields are merged back to ConnectorSpecification.connectionSpecification. For each field, a special annotation path_in_connector_config can be specified to determine where to merge it. For example: complete_oauth_server_output_specification={ client_id: { type: string, path_in_connector_config: ['credentials', 'client_id'] }, client_secret: { type: string, path_in_connector_config: ['credentials', 'client_secret'] } }

Examples: {"client_id":{"type":"string","path_in_connector_config":["credentials","client_id"]},"client_secret":{"type":"string","path_in_connector_config":["credentials","client_secret"]}}
config_normalization_rules object
4 nested properties
type string required
Values: "ConfigNormalizationRules"
config_migrations ConfigMigration[]

The discrete migrations that will be applied on the incoming config. Each migration will be applied in the order they are defined.

Default:
[]
transformations ConfigRemapField | ConfigAddFields | ConfigRemoveFields | CustomConfigTransformation[]

The list of transformations that will be applied on the incoming config at the start of each sync. The transformations will be applied in the order they are defined.

Default:
[]
validations DpathValidator | PredicateValidator[]

The list of validations that will be performed on the incoming config at the start of each sync.

Default:
[]
concurrency_level object

Defines the amount of parallelization for the streams that are being synced. The factor of parallelization is how many partitions or streams are synced at the same time. For example, with a concurrency_level of 10, ten streams or partitions of data will be processed at the same time. Note that a value of 1 could create a deadlock if a stream has a very high number of partitions.

4 nested properties
default_concurrency integer | string required

The amount of concurrency that will be applied during a sync. This value can be hardcoded or user-defined in the config if different users have varying volume thresholds in the target API.

Examples: 10, "{{ config['num_workers'] or 10 }}"
type string
Values: "ConcurrencyLevel"
max_concurrency integer

The maximum level of concurrency that will be used during a sync. This becomes a required field when the default_concurrency derives from the config, because it serves as a safeguard against a user-defined threshold that is too high.

Examples: 20, 100
$parameters object
api_budget object

Defines how many requests can be made to the API in a given time frame. HTTPAPIBudget extracts the remaining call count and the reset time from HTTP response headers using the header names provided by ratelimit_remaining_header and ratelimit_reset_header. Only requests using HttpRequester are rate-limited; custom components that bypass HttpRequester are not covered by this budget.

5 nested properties
type string required
Values: "HTTPAPIBudget"
policies FixedWindowCallRatePolicy | MovingWindowCallRatePolicy | UnlimitedCallRatePolicy[] required

List of call rate policies that define how many calls are allowed.

ratelimit_reset_header string

The HTTP response header name that indicates when the rate limit resets.

Default: "ratelimit-reset"
ratelimit_remaining_header string

The HTTP response header name that indicates the number of remaining allowed calls.

Default: "ratelimit-remaining"
status_codes_for_ratelimit_hit integer[]

List of HTTP status codes that indicate a rate limit has been hit.

Default:
[
  429
]
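
The header-based bookkeeping described for api_budget can be pictured with a small sketch. This is not the CDK's implementation; the function names are hypothetical, and the reset value is returned as the raw header string because the property list does not state its format.

```python
# Hypothetical helpers illustrating the api_budget defaults listed above.
def read_rate_limit_headers(
    headers,
    remaining_header="ratelimit-remaining",  # default of ratelimit_remaining_header
    reset_header="ratelimit-reset",          # default of ratelimit_reset_header
):
    """Return (remaining_calls, raw_reset_value) taken from response headers."""
    # HTTP header names are case-insensitive, so compare them in lower case.
    lowered = {name.lower(): value for name, value in headers.items()}
    remaining = lowered.get(remaining_header.lower())
    reset = lowered.get(reset_header.lower())
    return (int(remaining) if remaining is not None else None, reset)


def is_rate_limit_hit(status_code, status_codes_for_ratelimit_hit=(429,)):
    """True when the response status is one of the configured rate-limit codes."""
    return status_code in status_codes_for_ratelimit_hit


response_headers = {"RateLimit-Remaining": "3", "RateLimit-Reset": "1700000000"}
print(read_rate_limit_headers(response_headers))
print(is_rate_limit_hit(429))
```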
max_concurrent_async_job_count integer | string

Maximum number of concurrent asynchronous jobs to run. This property is only relevant for sources/streams that support asynchronous job execution through the AsyncRetriever (e.g. a report-based stream that initiates a job, polls the job status, and then fetches the job results). This is often set by the API's maximum number of concurrent jobs on the account level. Refer to the API's documentation for this information.

Examples: 3, "{{ config['max_concurrent_async_job_count'] }}"
metadata object

For internal Airbyte use only - DO NOT modify manually. Used by consumers of declarative manifests for storing related metadata.

description string

A description of the connector. It will be presented on the Source documentation page.

Any of

1. variant
2. variant
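
As a rough orientation, the top-level properties above can be assembled into a skeleton manifest. The sketch below expresses it as a Python dict and only checks the three keys marked required in this listing; it is not a substitute for validating against the schema, and the version string and stream name are placeholders.

```python
# Keys marked "required" in the top-level property list above.
REQUIRED_KEYS = ("type", "check", "version")


def missing_required_keys(manifest):
    """Return the required top-level keys that the manifest lacks."""
    return [key for key in REQUIRED_KEYS if key not in manifest]


# Placeholder values: the version and the "users" stream are illustrative only.
manifest = {
    "type": "DeclarativeSource",
    "version": "0.0.0",
    "check": {"type": "CheckStream", "stream_names": ["users"]},
    "streams": [],  # stream definitions omitted in this sketch
}

print(missing_required_keys(manifest))
```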

Definitions

AddedFieldDefinition object

Defines the field to add on a record.

type string required
Values: "AddedFieldDefinition"
path string[] required

List of strings defining the path where to add the value on the record.

Examples: ["segment_id"], ["metadata","segment_id"]
value string required

Value of the new field. Use {{ record['existing_field'] }} syntax to refer to other fields in the record.

Examples: "{{ record['updates'] }}", "{{ record['MetaData']['LastUpdatedTime'] }}", "{{ stream_partition['segment_id'] }}"
value_type string

A schema type.

Values: "string" "number" "integer" "boolean"
$parameters object
AddFields object

Transformation that adds fields to an output record. The path of the added field can be nested.

type string required
Values: "AddFields"
fields AddedFieldDefinition[] required

List of transformations (path and corresponding value) that will be added to the record.

condition string

Fields are added only if the expression evaluates to True.

Default: ""
Examples: "{{ property|string == '' }}", "{{ property is integer }}", "{{ property|length > 5 }}", "{{ property == 'some_string_to_match' }}"
$parameters object
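
To make the nested-path behaviour of AddFields concrete, here is a minimal sketch of writing a value at a path such as ["metadata", "segment_id"]. The helper name is hypothetical, the value is assumed to be already interpolated, and intermediate keys are assumed to be absent or to hold dicts.

```python
import copy


def add_field(record, path, value):
    """Return a copy of record with value written at the (possibly nested) path."""
    result = copy.deepcopy(record)
    node = result
    # Walk or create the intermediate objects, then set the final key.
    for key in path[:-1]:
        node = node.setdefault(key, {})
    node[path[-1]] = value
    return result


record = {"id": 1}
print(add_field(record, ["metadata", "segment_id"], "s1"))
```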
ApiKeyAuthenticator object

Authenticator for requests authenticated with an API token injected as an HTTP request header.

type string required
Values: "ApiKeyAuthenticator"
api_token string

The API key to inject in the request. Fill it in the user inputs.

Examples: "{{ config['api_key'] }}", "Token token={{ config['api_key'] }}"
header string

The name of the HTTP header that will be set to the API key. This setting is deprecated; use inject_into instead. header and inject_into cannot be defined at the same time.

Examples: "Authorization", "Api-Token", "X-Auth-Token"
inject_into object

Specifies the key field or path and where in the request a component's value should be injected.

4 nested properties
type string required
Values: "RequestOption"
inject_into enum required

Configures where the descriptor should be set on the HTTP requests. Note that request parameters that are already encoded in the URL path will not be duplicated.

Values: "request_parameter" "header" "body_data" "body_json"
Examples: "request_parameter", "header", "body_data", "body_json"
field_name string

Configures which key should be used in the location that the descriptor is being injected into. We hope to eventually deprecate this field in favor of field_path for all request_options, but must currently maintain it for backwards compatibility in the Builder.

Examples: "segment_id"
field_path string[]

Configures a path to be used for nested structures in JSON body requests (e.g. GraphQL queries)

Examples: ["data","viewer","id"]
$parameters object
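
The four inject_into locations can be illustrated with a small sketch that places a value into a request description. The request keys ("params", "headers", "data", "json") are an assumption made for this example and are not taken from the schema; field_path is used when present, otherwise field_name.

```python
# Hypothetical mapping from inject_into values to keys of a request dict.
TARGET_BY_LOCATION = {
    "request_parameter": "params",
    "header": "headers",
    "body_data": "data",
    "body_json": "json",
}


def inject(request, option, value):
    """Write value into the part of the request selected by a RequestOption."""
    target = request.setdefault(TARGET_BY_LOCATION[option["inject_into"]], {})
    # field_path addresses nested JSON bodies; field_name is a single flat key.
    path = option.get("field_path") or [option["field_name"]]
    for key in path[:-1]:
        target = target.setdefault(key, {})
    target[path[-1]] = value
    return request


header_option = {"type": "RequestOption", "inject_into": "header", "field_name": "X-Auth-Token"}
print(inject({}, header_option, "abc"))
```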
AuthFlow object

Additional and optional specification object to describe what an 'advanced' Auth flow would need to function.

  • A connector should be able to fully function with the configuration as described by the ConnectorSpecification in a 'basic' mode.
  • The 'advanced' mode provides easier UX for the user with UI improvements and automations. However, this requires further setup on the server side by instance or workspace admins beforehand. The trade-off is that the user does not have to provide as many technical inputs anymore and the auth process is faster and easier to complete.
auth_flow_type string

The type of auth to use

Values: "oauth2.0" "oauth1.0"
predicate_key string[]

JSON path to a field in the connectorSpecification that should exist for the advanced auth to be applicable.

Examples: ["credentials","auth_type"]
predicate_value string

Value of the predicate_key fields for the advanced auth to be applicable.

Examples: "Oauth"
oauth_config_specification object

Specification describing how an 'advanced' Auth flow would need to function.

5 nested properties
oauth_user_input_from_connector_config_specification object

OAuth-specific blob. This is a JSON Schema used to validate JSON configurations used as input to OAuth. Must be a valid non-nested JSON that refers to properties from ConnectorSpecification.connectionSpecification using the special annotation 'path_in_connector_config'. These are input values that the user enters through the UI to authenticate to the connector and that might also be shared as inputs for syncing data via the connector. For example:

  • If no connector values are shared during the OAuth flow: oauth_user_input_from_connector_config_specification=[]
  • If connector values such as 'app_id' at the top level are used to generate the API URL for the OAuth flow: oauth_user_input_from_connector_config_specification={ app_id: { type: string, path_in_connector_config: ['app_id'] } }
  • If connector values such as 'info.app_id' nested inside another object are used to generate the API URL for the OAuth flow: oauth_user_input_from_connector_config_specification={ app_id: { type: string, path_in_connector_config: ['info', 'app_id'] } }

Examples: {"app_id":{"type":"string","path_in_connector_config":["app_id"]}}, {"app_id":{"type":"string","path_in_connector_config":["info","app_id"]}}
oauth_connector_input_specification object

The DeclarativeOAuth specific blob. Pertains to the fields defined by the connector relating to the OAuth flow.

Interpolation capabilities:

  • Variable placeholders are declared as {{my_var}}.

  • Nested variable resolution, such as {{ {{my_nested_var}} }}, is also allowed.

  • The allowed interpolation context is:

    • base64Encoder - encode to base64, {{ {{my_var_a}}:{{my_var_b}} | base64Encoder }}
    • base64Decoder - decode from a base64-encoded string, {{ {{my_string_variable_or_string_value}} | base64Decoder }}
    • urlEncoder - encode the input string to URL-like format, {{ https://test.host.com/endpoint | urlEncoder}}
    • urlDecoder - decode the input URL-encoded string into text format, {{ urlDecoder:https%3A%2F%2Fairbyte.io | urlDecoder}}
    • codeChallengeS256 - get the codeChallenge encoded value to provide additional data-provider specific authorisation values, {{ {{state_value}} | codeChallengeS256 }}
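
For intuition, the four encoding filters in the list above correspond to standard transformations. The sketch below shows plausible Python equivalents using only the standard library; the function names are hypothetical, the exact escaping rules of the real filters are an assumption, and codeChallengeS256 is left out because its precise output format is not described here.

```python
import base64
from urllib.parse import quote, unquote


def base64_encoder(text):
    """Encode text to base64, e.g. for 'client_id:client_secret' pairs."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def base64_decoder(text):
    """Decode a base64-encoded string back to text."""
    return base64.b64decode(text).decode("utf-8")


def url_encoder(text):
    """Percent-encode text; safe='' also escapes ':' and '/' (an assumption)."""
    return quote(text, safe="")


def url_decoder(text):
    """Decode a percent-encoded string back to text."""
    return unquote(text)


print(base64_encoder("user:pass"))
print(url_decoder("https%3A%2F%2Fairbyte.io"))
```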

13 nested properties
consent_url string required

The DeclarativeOAuth Specific URL string template used to initiate authentication. The placeholders are replaced during processing to provide the necessary values.

Examples: "https://domain.host.com/marketing_api/auth?{{client_id_key}}={{client_id_value}}&{{redirect_uri_key}}={{{{redirect_uri_value}} | urlEncoder}}&{{state_key}}={{state_value}}", "https://endpoint.host.com/oauth2/authorize?{{client_id_key}}={{client_id_value}}&{{redirect_uri_key}}={{{{redirect_uri_value}} | urlEncoder}}&{{scope_key}}={{{{scope_value}} | urlEncoder}}&{{state_key}}={{state_value}}&subdomain={{subdomain}}"
access_token_url string required

The DeclarativeOAuth Specific URL string template used to obtain the access_token, refresh_token, etc. The placeholders are replaced during processing to provide the necessary values.

Examples: "https://auth.host.com/oauth2/token?{{client_id_key}}={{client_id_value}}&{{client_secret_key}}={{client_secret_value}}&{{auth_code_key}}={{auth_code_value}}&{{redirect_uri_key}}={{{{redirect_uri_value}} | urlEncoder}}"
scope string

The DeclarativeOAuth Specific string of scopes that need to be granted for the authenticated user.

Examples: "user:read user:read_orders workspaces:read"
access_token_headers object

The DeclarativeOAuth Specific optional headers to inject while exchanging the auth_code for an access_token during the completeOAuthFlow step.

Examples: {"Authorization":"Basic {{ {{ client_id_value }}:{{ client_secret_value }} | base64Encoder }}"}
access_token_params object

The DeclarativeOAuth Specific optional query parameters to inject while exchanging the auth_code for an access_token during the completeOAuthFlow step. When this property is provided, the query params will be encoded as JSON and included in the outgoing API request.

Examples: {"{{ auth_code_key }}":"{{ auth_code_value }}","{{ client_id_key }}":"{{ client_id_value }}","{{ client_secret_key }}":"{{ client_secret_value }}"}
extract_output string[]

The DeclarativeOAuth Specific list of strings to indicate which keys should be extracted and returned back to the input config.

Examples: ["access_token","refresh_token","other_field"]
state object

The DeclarativeOAuth Specific object to provide the criteria of how the state query param should be constructed, including length and complexity.

Examples: {"min":7,"max":128}
client_id_key string

The DeclarativeOAuth Specific optional override to provide the custom client_id key name, if required by data-provider.

Examples: "my_custom_client_id_key_name"
client_secret_key string

The DeclarativeOAuth Specific optional override to provide the custom client_secret key name, if required by data-provider.

Examples: "my_custom_client_secret_key_name"
scope_key string

The DeclarativeOAuth Specific optional override to provide the custom scope key name, if required by data-provider.

Examples: "my_custom_scope_key_key_name"
state_key string

The DeclarativeOAuth Specific optional override to provide the custom state key name, if required by data-provider.

Examples: "my_custom_state_key_key_name"
auth_code_key string

The DeclarativeOAuth Specific optional override to provide the custom code key name to something like auth_code or custom_auth_code, if required by data-provider.

Examples: "my_custom_auth_code_key_name"
redirect_uri_key string

The DeclarativeOAuth Specific optional override to provide the custom redirect_uri key name to something like callback_uri, if required by data-provider.

Examples: "my_custom_redirect_uri_key_name"
complete_oauth_output_specification object

OAuth-specific blob. This is a JSON Schema used to validate JSON configurations produced by the OAuth flows as they are returned by the remote OAuth APIs. Must be a valid JSON describing the fields to merge back to ConnectorSpecification.connectionSpecification. For each field, a special annotation path_in_connector_config can be specified to determine where to merge it. For example: complete_oauth_output_specification={ refresh_token: { type: string, path_in_connector_config: ['credentials', 'refresh_token'] } }

Examples: {"refresh_token":{"type":"string","path_in_connector_config":["credentials","refresh_token"]}}
complete_oauth_server_input_specification object

OAuth-specific blob. This is a JSON Schema used to validate JSON configurations persisted as Airbyte Server configurations. Must be a valid non-nested JSON describing additional fields configured by the Airbyte Instance or Workspace Admins to be used by the server when completing an OAuth flow (typically exchanging an auth code for a refresh token). For example: complete_oauth_server_input_specification={ client_id: { type: string }, client_secret: { type: string } }

Examples: {"client_id":{"type":"string"},"client_secret":{"type":"string"}}
complete_oauth_server_output_specification object

OAuth-specific blob. This is a JSON Schema used to validate JSON configurations persisted as Airbyte Server configurations that also need to be merged back into the connector configuration at runtime. This is a subset of complete_oauth_server_input_specification that retains only the fields necessary for the connector to function with OAuth. Some fields may be used during OAuth flows but not needed afterwards; those are listed in complete_oauth_server_input_specification but not in complete_oauth_server_output_specification. Must be a valid non-nested JSON describing additional fields configured by the Airbyte Instance or Workspace Admins to be used by the connector when using OAuth flow APIs. These fields are merged back to ConnectorSpecification.connectionSpecification. For each field, a special annotation path_in_connector_config can be specified to determine where to merge it. For example: complete_oauth_server_output_specification={ client_id: { type: string, path_in_connector_config: ['credentials', 'client_id'] }, client_secret: { type: string, path_in_connector_config: ['credentials', 'client_secret'] } }

Examples: {"client_id":{"type":"string","path_in_connector_config":["credentials","client_id"]},"client_secret":{"type":"string","path_in_connector_config":["credentials","client_secret"]}}
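
The path_in_connector_config annotation used by the output specifications can be read as "write this field at this path in the connector config". The sketch below shows one way such a merge could work; the function name is hypothetical, and falling back to a top-level key when the annotation is missing is an assumption of this example rather than documented behaviour.

```python
import copy


def merge_oauth_output(config, oauth_output, specification):
    """Copy fields named in the specification from oauth_output into config."""
    merged = copy.deepcopy(config)
    for field, field_spec in specification.items():
        if field not in oauth_output:
            continue
        # Assumption: without the annotation, the field lands at the top level.
        path = field_spec.get("path_in_connector_config", [field])
        node = merged
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = oauth_output[field]
    return merged


specification = {
    "refresh_token": {
        "type": "string",
        "path_in_connector_config": ["credentials", "refresh_token"],
    }
}
config = {"credentials": {"auth_type": "Oauth"}}
print(merge_oauth_output(config, {"refresh_token": "r1"}, specification))
```

Fields returned by the OAuth API but absent from the specification are ignored in this sketch.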
BasicHttpAuthenticator object

Authenticator for requests authenticated with the Basic HTTP authentication scheme, which encodes a username and an optional password in the Authorization request header.

type string required
Values: "BasicHttpAuthenticator"
username string required

The username that will be combined with the password, base64 encoded and used to make requests. Fill it in the user inputs.

Examples: "{{ config['username'] }}", "{{ config['api_key'] }}"
password string

The password that will be combined with the username, base64 encoded and used to make requests. Fill it in the user inputs.

Default: ""
Examples: "{{ config['password'] }}", ""
$parameters object
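
The Basic scheme mentioned above joins the username and password with a colon and base64-encodes the result. A minimal sketch of building that header follows; the helper name is hypothetical and the values are assumed to be already resolved from the config.

```python
import base64


def basic_auth_header(username, password=""):
    """Build the Authorization header for HTTP Basic authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


print(basic_auth_header("user", "pass"))
# With the default empty password, only "user:" is encoded.
print(basic_auth_header("user"))
```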
BearerAuthenticator object

Authenticator for requests authenticated with a bearer token injected as a request header of the form Authorization: Bearer <token>.

type string required
Values: "BearerAuthenticator"
api_token string required

Token to inject as request header for authenticating with the API.

Examples: "{{ config['api_key'] }}", "{{ config['token'] }}"
$parameters object
SelectiveAuthenticator object

Authenticator that selects a concrete authenticator based on a config property.

type string required
Values: "SelectiveAuthenticator"
authenticator_selection_path string[] required

Path of the field in the config that holds the name of the selected authenticator.

Examples: ["auth"], ["auth","type"]
authenticators Record<string, ApiKeyAuthenticator | BasicHttpAuthenticator | BearerAuthenticator | OAuthAuthenticator | JwtAuthenticator | SessionTokenAuthenticator | LegacySessionTokenAuthenticator | CustomAuthenticator | NoAuth> required

Authenticators to select from.

Examples: {"authenticators":{"token":"#/definitions/ApiKeyAuthenticator","oauth":"#/definitions/OAuthAuthenticator","jwt":"#/definitions/JwtAuthenticator"}}
$parameters object
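
The selection step can be sketched as a lookup: follow authenticator_selection_path into the config, then use the value found there as a key into authenticators. The function name and the error handling are assumptions of this example, and the authenticator entries are reduced to stubs.

```python
def select_authenticator(config, selection_path, authenticators):
    """Pick the authenticator whose key matches the config value at selection_path."""
    node = config
    for key in selection_path:
        node = node[key]
    if node not in authenticators:
        raise ValueError(f"No authenticator configured for {node!r}")
    return authenticators[node]


# Stub definitions standing in for full authenticator components.
authenticators = {
    "token": {"type": "ApiKeyAuthenticator"},
    "oauth": {"type": "OAuthAuthenticator"},
}
config = {"auth": {"type": "oauth"}}
print(select_authenticator(config, ["auth", "type"], authenticators))
```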
CheckStream object

Defines the streams to try reading when running a check operation.

type string required
Values: "CheckStream"
stream_names string[]

Names of the streams to try reading from when running a check operation.

Examples: ["users"], ["users","contacts"]
dynamic_streams_check_configs DynamicStreamCheckConfig[]
DynamicStreamCheckConfig object
type string required
Values: "DynamicStreamCheckConfig"
dynamic_stream_name string required

The dynamic stream name.

stream_count integer

The number of streams to attempt reading from during a check operation. If stream_count exceeds the total number of available streams, the minimum of the two values will be used.

Default: 0
CheckDynamicStream object

(This component is experimental. Use at your own risk.) Defines the dynamic streams to try reading when running a check operation.

type string required
Values: "CheckDynamicStream"
stream_count integer required

Number of streams to try reading from when running a check operation.

use_check_availability boolean

Enables stream check availability. This field is automatically set by the CDK.

Default: true
CompositeErrorHandler object

Error handler that sequentially iterates over a list of error handlers.

type string required
Values: "CompositeErrorHandler"
error_handlers CompositeErrorHandler | DefaultErrorHandler[] required

List of error handlers to iterate on to determine how to handle a failed response.

$parameters object
ConcurrencyLevel object

Defines the amount of parallelization for the streams that are being synced. The factor of parallelization is how many partitions or streams are synced at the same time. For example, with a concurrency_level of 10, ten streams or partitions of data will be processed at the same time. Note that a value of 1 could create a deadlock if a stream has a very high number of partitions.

default_concurrency integer | string required

The amount of concurrency that will be applied during a sync. This value can be hardcoded or user-defined in the config if different users have varying volume thresholds in the target API.

Examples: 10, "{{ config['num_workers'] or 10 }}"
type string
Values: "ConcurrencyLevel"
max_concurrency integer

The maximum level of concurrency that will be used during a sync. This becomes a required field when the default_concurrency derives from the config, because it serves as a safeguard against a user-defined threshold that is too high.

Examples: 20, 100
$parameters object
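
One way to read the relationship between default_concurrency and max_concurrency is as a cap on a possibly user-supplied value. The sketch below assumes the interpolated string has already been rendered to a number or numeric string, and that the safeguard works by clamping; both points are assumptions, and the function name is hypothetical.

```python
def resolve_concurrency(default_concurrency, max_concurrency=None):
    """Return the concurrency to use, capped by max_concurrency when it is set."""
    level = int(default_concurrency)  # accepts 10 or a rendered string such as "10"
    if max_concurrency is not None:
        level = min(level, max_concurrency)
    return level


print(resolve_concurrency(10))
# A user-defined value above the cap is reduced to the cap.
print(resolve_concurrency("50", max_concurrency=20))
```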
ConditionalStreams object

Streams that are only available while performing a connector operation when the condition is met.

type string required
Values: "ConditionalStreams"
condition string required

Condition that will be evaluated to determine if a set of streams should be available.

Examples: "{{ config['is_sandbox'] }}"
streams DeclarativeStream[] required

Streams that will be used during an operation based on the condition.

$parameters object
ConstantBackoffStrategy object

Backoff strategy with a constant backoff interval.

type string required
Values: "ConstantBackoffStrategy"
backoff_time_in_seconds number | string required

Backoff time in seconds.

Examples: 30, 30.5, "{{ config['backoff_time'] }}"
$parameters object
CursorPagination object

Pagination strategy that evaluates an interpolated string to define the next page to fetch.

type string required
Values: "CursorPagination"
cursor_value string required

Value of the cursor defining the next page to fetch.

Examples: "{{ headers.link.next.cursor }}", "{{ last_record['key'] }}", "{{ response['nextPage'] }}"
page_size integer

The number of records to include in each page.

Examples: 100
stop_condition string

Template string evaluating when to stop paginating.

Examples: "{{ response.data.has_more is false }}", "{{ 'next' not in headers['link'] }}"
$parameters object
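
The interplay of cursor_value and stop_condition can be pictured as a loop: fetch a page, check the stop condition, then compute the next cursor. In the sketch below the two interpolated strings are replaced by plain callables, the "data" key and the page contents are invented for illustration, and max_pages is a safety bound added for this example.

```python
def paginate(fetch_page, next_cursor, should_stop, max_pages=1000):
    """Collect records page by page until the stop condition or an empty cursor."""
    records = []
    cursor = None  # the first request carries no cursor
    for _ in range(max_pages):
        response = fetch_page(cursor)
        records.extend(response["data"])
        if should_stop(response):
            break
        cursor = next_cursor(response)
        if not cursor:
            break
    return records


# Invented responses keyed by the cursor used to request them.
pages = {
    None: {"data": [1, 2], "nextPage": "p2", "has_more": True},
    "p2": {"data": [3], "nextPage": None, "has_more": False},
}
result = paginate(
    fetch_page=lambda cursor: pages[cursor],
    next_cursor=lambda response: response["nextPage"],       # like "{{ response['nextPage'] }}"
    should_stop=lambda response: response["has_more"] is False,
)
print(result)
```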
CustomAuthenticator object

Authenticator component whose behavior is derived from a custom code implementation of the connector.

type string required
Values: "CustomAuthenticator"
class_name string required

Fully-qualified name of the class that will be implementing the custom authentication strategy. Has to be a subclass of DeclarativeAuthenticator. The format is source_<name>.<package>.<class_name>.

Examples: "source_railz.components.ShortLivedTokenAuthenticator"
$parameters object
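
A class_name value is a dotted path whose last segment is the class and whose prefix is the module. The sketch below shows how such a string could be resolved with importlib; it is an illustration of the naming format, not the CDK's loader, and it uses a standard-library class in the usage line because a connector package such as source_railz is not assumed to be installed.

```python
import importlib


def load_class(class_name):
    """Resolve a dotted 'package.module.ClassName' string to the class object."""
    module_path, _, attribute = class_name.rpartition(".")
    if not module_path:
        raise ValueError(f"Expected a fully-qualified name, got {class_name!r}")
    module = importlib.import_module(module_path)
    return getattr(module, attribute)


# Stand-in for a value like "source_railz.components.ShortLivedTokenAuthenticator".
decoder_class = load_class("json.JSONDecoder")
print(decoder_class.__name__)
```

The same resolution applies to the class_name property of the other Custom* components in this section.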
CustomBackoffStrategy object

Backoff strategy component whose behavior is derived from a custom code implementation of the connector.

type string required
Values: "CustomBackoffStrategy"
class_name string required

Fully-qualified name of the class that will be implementing the custom backoff strategy. The format is source_<name>.<package>.<class_name>.

Examples: "source_railz.components.MyCustomBackoffStrategy"
$parameters object
CustomErrorHandler object

Error handler component whose behavior is derived from a custom code implementation of the connector.

type string required
Values: "CustomErrorHandler"
class_name string required

Fully-qualified name of the class that will be implementing the custom error handler. The format is source_<name>.<package>.<class_name>.

Examples: "source_railz.components.MyCustomErrorHandler"
$parameters object
CustomIncrementalSync object

Incremental component whose behavior is derived from a custom code implementation of the connector.

type string required
Values: "CustomIncrementalSync"
class_name string required

Fully-qualified name of the class that will be implementing the custom incremental sync. The format is source_<name>.<package>.<class_name>.

Examples: "source_railz.components.MyCustomIncrementalSync"
cursor_field string required

The location of the value on a record that will be used as a bookmark during sync.

$parameters object
CustomPaginationStrategy object

Pagination strategy component whose behavior is derived from a custom code implementation of the connector.

type string required
Values: "CustomPaginationStrategy"
class_name string required

Fully-qualified name of the class that will be implementing the custom pagination strategy. The format is source_<name>.<package>.<class_name>.

Examples: "source_railz.components.MyCustomPaginationStrategy"
$parameters object
CustomRecordExtractor object

Record extractor component whose behavior is derived from a custom code implementation of the connector.

type string required
Values: "CustomRecordExtractor"
class_name string required

Fully-qualified name of the class that will be implementing the custom record extraction strategy. The format is source_<name>.<package>.<class_name>.

Examples: "source_railz.components.MyCustomRecordExtractor"
$parameters object
CustomRecordFilter object

Record filter component whose behavior is derived from a custom code implementation of the connector.

type string required
Values: "CustomRecordFilter"
class_name string required

Fully-qualified name of the class that will be implementing the custom record filter strategy. The format is source_<name>.<package>.<class_name>.

Examples: "source_railz.components.MyCustomRecordFilter"
$parameters object
CustomRequester object

Requester component whose behavior is derived from a custom code implementation of the connector.

type string required
Values: "CustomRequester"
class_name string required

Fully-qualified name of the class that will be implementing the custom requester strategy. The format is source_<name>.<package>.<class_name>.

Examples: "source_railz.components.MyCustomRequester"
$parameters object
CustomRetriever object

Retriever component whose behavior is derived from a custom code implementation of the connector.

type string required
Values: "CustomRetriever"
class_name string required

Fully-qualified name of the class that will be implementing the custom retriever strategy. The format is source_<name>.<package>.<class_name>.

Examples: "source_railz.components.MyCustomRetriever"
$parameters object
CustomPartitionRouter object

Partition router component whose behavior is derived from a custom code implementation of the connector.

type string required
Values: "CustomPartitionRouter"
class_name string required

Fully-qualified name of the class that will be implementing the custom partition router. The format is source_<name>.<package>.<class_name>.

Examples: "source_railz.components.MyCustomPartitionRouter"
$parameters object
CustomSchemaLoader object

Schema Loader component whose behavior is derived from a custom code implementation of the connector.

type string required
Values: "CustomSchemaLoader"
class_name string required

Fully-qualified name of the class that will be implementing the custom schema loader. The format is source_<name>.<package>.<class_name>.

Examples: "source_railz.components.MyCustomSchemaLoader"
$parameters object
CustomSchemaNormalization object

Schema normalization component whose behavior is derived from a custom code implementation of the connector.

type string required
Values: "CustomSchemaNormalization"
class_name string required

Fully-qualified name of the class that will be implementing the custom normalization. The format is source_<name>.<package>.<class_name>.

Examples: "source_amazon_seller_partner.components.LedgerDetailedViewReportsTypeTransformer"
$parameters object
CustomStateMigration object

Apply a custom transformation on the input state.

type string required
Values: "CustomStateMigration"
class_name string required

Fully-qualified name of the class that will be implementing the custom state migration. The format is source_<name>.<package>.<class_name>.

Examples: "source_railz.components.MyCustomStateMigration"
$parameters object
CustomTransformation object

Transformation component whose behavior is derived from a custom code implementation of the connector.

type string required
Values: "CustomTransformation"
class_name string required

Fully-qualified name of the class that will be implementing the custom transformation. The format is source_<name>.<package>.<class_name>.

Examples: "source_railz.components.MyCustomTransformation"
$parameters object
LegacyToPerPartitionStateMigration object

Transforms the input state for per-partitioned streams from the legacy format to the low-code format. The cursor field and partition ID fields are automatically extracted from the stream's DatetimeBasedCursor and SubstreamPartitionRouter. Example input state: { "13506132": { "last_changed": "2022-12-27T08:34:39+00:00" } } Example output state: { "partition": {"id": "13506132"}, "cursor": {"last_changed": "2022-12-27T08:34:39+00:00"} }

type string
Values: "LegacyToPerPartitionStateMigration"
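
The example input and output states in the description can be reproduced with a few lines of Python. This is only a sketch of the reshaping step: migrate_legacy_state is a hypothetical function, the partition key and cursor field are passed in by hand rather than extracted from the stream's components, and any wrapper the CDK places around the per-partition entries is not shown.

```python
def migrate_legacy_state(legacy_state: dict, partition_key: str, cursor_field: str) -> list:
    """Turn {partition_id: {cursor_field: value}} into one entry per partition."""
    return [
        {
            "partition": {partition_key: partition_id},
            "cursor": {cursor_field: cursor_values[cursor_field]},
        }
        for partition_id, cursor_values in legacy_state.items()
    ]


legacy = {"13506132": {"last_changed": "2022-12-27T08:34:39+00:00"}}
migrated = migrate_legacy_state(legacy, partition_key="id", cursor_field="last_changed")
```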
IncrementingCountCursor object

Cursor that allows for incremental sync according to a continuously increasing integer.

type string required
Values: "IncrementingCountCursor"
cursor_field string required

The location of the value on a record that will be used as a bookmark during sync. To ensure no data loss, the API must return records in ascending order based on the cursor field. Nested fields are not supported, so the field must be at the top level of the record. You can use a combination of Add Field and Remove Field transformations to move the nested field to the top.

Examples: "created_at", "{{ config['record_cursor'] }}"
start_value string | integer

The value that determines the earliest record that should be synced.

Examples: 0, "{{ config['start_value'] }}"
start_value_option object

Specifies the key field or path and where in the request a component's value should be injected.

4 nested properties
type string required
Values: "RequestOption"
inject_into enum required

Configures where the descriptor should be set on the HTTP requests. Note that request parameters that are already encoded in the URL path will not be duplicated.

Values: "request_parameter" "header" "body_data" "body_json"
Examples: "request_parameter", "header", "body_data", "body_json"
field_name string

Configures which key should be used in the location that the descriptor is being injected into. We hope to eventually deprecate this field in favor of field_path for all request_options, but must currently maintain it for backwards compatibility in the Builder.

Examples: "segment_id"
field_path string[]

Configures a path to be used for nested structures in JSON body requests (e.g. GraphQL queries)

Examples: ["data","viewer","id"]
$parameters object
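
The RequestOption properties above (inject_into, field_name, field_path) can be pictured as writing a value into one of four request locations. The sketch below is illustrative only: inject_option is a hypothetical helper, the request is modeled as a plain dict keyed by location, and restricting field_path to body_json is an assumption drawn from the field_path description.

```python
def inject_option(request: dict, inject_into: str, value, field_name=None, field_path=None) -> dict:
    """Write value into request[inject_into], flat by field_name or nested by field_path."""
    allowed = {"request_parameter", "header", "body_data", "body_json"}
    if inject_into not in allowed:
        raise ValueError(f"unsupported inject_into: {inject_into!r}")
    target = request.setdefault(inject_into, {})
    if field_path:
        if inject_into != "body_json":
            # Assumption: nested paths only make sense for JSON bodies.
            raise ValueError("field_path is only used with body_json in this sketch")
        for key in field_path[:-1]:
            target = target.setdefault(key, {})
        target[field_path[-1]] = value
    elif field_name:
        target[field_name] = value
    else:
        raise ValueError("either field_name or field_path is required")
    return request


flat = inject_option({}, "request_parameter", 100, field_name="segment_id")
nested = inject_option({}, "body_json", "abc", field_path=["data", "viewer", "id"])
```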
DatetimeBasedCursor object

Cursor to provide incremental capabilities over datetime.

type string required
Values: "DatetimeBasedCursor"
cursor_field string required

The location of the value on a record that will be used as a bookmark during sync. To ensure no data loss, the API must return records in ascending order based on the cursor field. Nested fields are not supported, so the field must be at the top level of the record. You can use a combination of Add Field and Remove Field transformations to move the nested field to the top.

Examples: "created_at", "{{ config['record_cursor'] }}"
start_datetime MinMaxDatetime | string required

The datetime that determines the earliest record that should be synced.

Examples: "2020-01-01T00:00:00Z", "{{ config['start_time'] }}"
datetime_format string required

The datetime format used to format the datetime values that are sent in outgoing requests to the API. Use placeholders starting with "%" to describe the format the API is using. The following placeholders are available:

  • %s: Epoch unix timestamp - 1686218963
  • %s_as_float: Epoch unix timestamp in seconds as float with microsecond precision - 1686218963.123456
  • %ms: Epoch unix timestamp (milliseconds) - 1686218963123
  • %a: Weekday (abbreviated) - Sun
  • %A: Weekday (full) - Sunday
  • %w: Weekday (decimal) - 0 (Sunday), 6 (Saturday)
  • %d: Day of the month (zero-padded) - 01, 02, ..., 31
  • %b: Month (abbreviated) - Jan
  • %B: Month (full) - January
  • %m: Month (zero-padded) - 01, 02, ..., 12
  • %y: Year (without century, zero-padded) - 00, 01, ..., 99
  • %Y: Year (with century) - 0001, 0002, ..., 9999
  • %H: Hour (24-hour, zero-padded) - 00, 01, ..., 23
  • %I: Hour (12-hour, zero-padded) - 01, 02, ..., 12
  • %p: AM/PM indicator
  • %M: Minute (zero-padded) - 00, 01, ..., 59
  • %S: Second (zero-padded) - 00, 01, ..., 59
  • %f: Microsecond (zero-padded to 6 digits) - 000000
  • %_ms: Millisecond (zero-padded to 3 digits) - 000
  • %z: UTC offset - (empty), +0000, -04:00
  • %Z: Time zone name - (empty), UTC, GMT
  • %j: Day of the year (zero-padded) - 001, 002, ..., 366
  • %U: Week number of the year (starting Sunday) - 00, ..., 53
  • %W: Week number of the year (starting Monday) - 00, ..., 53
  • %c: Date and time - Tue Aug 16 21:30:00 1988
  • %x: Date standard format - 08/16/1988
  • %X: Time standard format - 21:30:00
  • %%: Literal '%' character

Some placeholders depend on the locale of the underlying system - in most cases this locale is configured as en/US. For more information see the Python documentation.

Examples: "%Y-%m-%dT%H:%M:%S.%f%z", "%Y-%m-%d", "%s", "%ms", "%s_as_float"
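
Most placeholders in the list above are the standard Python strftime directives, so their output can be checked directly. The sketch below formats the sample epoch value from the list (1686218963) with a few of them. The epoch-style placeholders (%s, %ms) are reproduced here with plain arithmetic rather than strftime, since they are not portable Python directives; how the CDK handles them internally is not shown.

```python
from datetime import datetime, timezone

# The sample timestamp used in the placeholder list, interpreted as UTC.
moment = datetime.fromtimestamp(1686218963, tz=timezone.utc)

date_only = moment.strftime("%Y-%m-%d")
full_stamp = moment.strftime("%Y-%m-%dT%H:%M:%S.%f%z")
day_of_year = moment.strftime("%j")

# Equivalents of the %s and %ms placeholders, computed without strftime.
epoch_seconds = int(moment.timestamp())
epoch_millis = int(moment.timestamp() * 1000)
```

Locale-dependent placeholders such as %a, %b, %c, %x and %X are left out of the sketch because their output varies with the system locale.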
clamping object

Adjusts the lower and upper boundaries of each datetime window to the beginning and end of the target period (day, week, or month).

2 nested properties
target string required

The period of time that datetime windows will be clamped by

Examples: "DAY", "WEEK", "MONTH", "{{ config['target'] }}"
target_details object
cursor_datetime_formats string[]

The possible formats for the cursor field, in order of preference. The first format that matches the cursor field value will be used to parse it. If not provided, the Outgoing Datetime Format will be used. Use placeholders starting with "%" to describe the format the API is using. The following placeholders are available:

  • %s: Epoch unix timestamp - 1686218963
  • %s_as_float: Epoch unix timestamp in seconds as float with microsecond precision - 1686218963.123456
  • %ms: Epoch unix timestamp (milliseconds) - 1686218963123
  • %a: Weekday (abbreviated) - Sun
  • %A: Weekday (full) - Sunday
  • %w: Weekday (decimal) - 0 (Sunday), 6 (Saturday)
  • %d: Day of the month (zero-padded) - 01, 02, ..., 31
  • %b: Month (abbreviated) - Jan
  • %B: Month (full) - January
  • %m: Month (zero-padded) - 01, 02, ..., 12
  • %y: Year (without century, zero-padded) - 00, 01, ..., 99
  • %Y: Year (with century) - 0001, 0002, ..., 9999
  • %H: Hour (24-hour, zero-padded) - 00, 01, ..., 23
  • %I: Hour (12-hour, zero-padded) - 01, 02, ..., 12
  • %p: AM/PM indicator
  • %M: Minute (zero-padded) - 00, 01, ..., 59
  • %S: Second (zero-padded) - 00, 01, ..., 59
  • %f: Microsecond (zero-padded to 6 digits) - 000000, 000001, ..., 999999
  • %_ms: Millisecond (zero-padded to 3 digits) - 000, 001, ..., 999
  • %z: UTC offset - (empty), +0000, -04:00
  • %Z: Time zone name - (empty), UTC, GMT
  • %j: Day of the year (zero-padded) - 001, 002, ..., 366
  • %U: Week number of the year (Sunday as first day) - 00, 01, ..., 53
  • %W: Week number of the year (Monday as first day) - 00, 01, ..., 53
  • %c: Date and time representation - Tue Aug 16 21:30:00 1988
  • %x: Date representation - 08/16/1988
  • %X: Time representation - 21:30:00
  • %%: Literal '%' character

Some placeholders depend on the locale of the underlying system - in most cases this locale is configured as en/US. For more information see the Python documentation.

Examples: "%Y-%m-%d", "%Y-%m-%d %H:%M:%S", "%Y-%m-%dT%H:%M:%S", "%Y-%m-%dT%H:%M:%SZ", "%Y-%m-%dT%H:%M:%S%z", "%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%S.%f%z", "%Y-%m-%d %H:%M:%S.%f+00:00", "%s", "%ms"
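
The "first format that matches" rule for cursor_datetime_formats can be sketched as a loop over candidate formats. This is a simplified illustration using only standard strptime directives; parse_cursor_value is a hypothetical helper, and the CDK-specific placeholders (%s, %ms, %s_as_float, %_ms) are not handled here.

```python
from datetime import datetime


def parse_cursor_value(value: str, formats: list) -> datetime:
    """Return the value parsed with the first format that accepts it."""
    for fmt in formats:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue  # this format does not match, try the next one
    raise ValueError(f"no format in {formats!r} matches {value!r}")


formats = ["%Y-%m-%dT%H:%M:%SZ", "%Y-%m-%d"]
from_timestamp = parse_cursor_value("2023-06-08T10:09:23Z", formats)
from_date = parse_cursor_value("2023-06-08", formats)
```

Because the list is tried in order, the more specific formats are usually listed first.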
start_time_option object

Specifies the key field or path and where in the request a component's value should be injected.

4 nested properties
type string required
Values: "RequestOption"
inject_into enum required

Configures where the descriptor should be set on the HTTP requests. Note that request parameters that are already encoded in the URL path will not be duplicated.

Values: "request_parameter" "header" "body_data" "body_json"
Examples: "request_parameter", "header", "body_data", "body_json"
field_name string

Configures which key should be used in the location that the descriptor is being injected into. We hope to eventually deprecate this field in favor of field_path for all request_options, but must currently maintain it for backwards compatibility in the Builder.

Examples: "segment_id"
field_path string[]

Configures a path to be used for nested structures in JSON body requests (e.g. GraphQL queries)

Examples: ["data","viewer","id"]
end_datetime MinMaxDatetime | string

The datetime that determines the last record that should be synced. If not provided, {{ now_utc() }} will be used.

Examples: "2021-01-01T00:00:00Z", "{{ now_utc() }}", "{{ day_delta(-1) }}"
end_time_option object

Specifies the key field or path and where in the request a component's value should be injected.

4 nested properties
type string required
Values: "RequestOption"
inject_into enum required

Configures where the descriptor should be set on the HTTP requests. Note that request parameters that are already encoded in the URL path will not be duplicated.

Values: "request_parameter" "header" "body_data" "body_json"
Examples: "request_parameter", "header", "body_data", "body_json"
field_name string

Configures which key should be used in the location that the descriptor is being injected into. We hope to eventually deprecate this field in favor of field_path for all request_options, but must currently maintain it for backwards compatibility in the Builder.

Examples: "segment_id"
field_path string[]

Configures a path to be used for nested structures in JSON body requests (e.g. GraphQL queries)

Examples: ["data","viewer","id"]
cursor_granularity string

Smallest increment the datetime_format has (ISO 8601 duration) that is used to ensure the start of a slice does not overlap with the end of the previous one, e.g. for %Y-%m-%d the granularity should be P1D, for %Y-%m-%dT%H:%M:%SZ the granularity should be PT1S. Given this field is provided, step needs to be provided as well.

  • PT0.000001S: 1 microsecond
  • PT0.001S: 1 millisecond
  • PT1S: 1 second
  • PT1M: 1 minute
  • PT1H: 1 hour
  • P1D: 1 day
Examples: "PT1S"
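
The relationship between step and cursor_granularity is easiest to see with concrete windows. In the sketch below, each window ends one granularity unit before the next one starts, so consecutive slices never overlap. The function name, the timedelta inputs (standing in for parsed ISO 8601 durations), and the exact boundary arithmetic are assumptions for illustration, not the CDK's implementation.

```python
from datetime import datetime, timedelta


def date_windows(start: datetime, end: datetime, step: timedelta, granularity: timedelta) -> list:
    """Split [start, end] into consecutive windows that do not share a boundary value."""
    windows = []
    window_start = start
    while window_start <= end:
        # End one granularity unit short of the next window's start.
        window_end = min(window_start + step - granularity, end)
        windows.append((window_start, window_end))
        window_start = window_end + granularity
    return windows


# step = P3D, cursor_granularity = P1D, matching a "%Y-%m-%d" datetime_format.
windows = date_windows(
    datetime(2021, 1, 1), datetime(2021, 1, 10), timedelta(days=3), timedelta(days=1)
)
```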
is_data_feed boolean

A data feed API is an API that does not allow filtering and paginates the content from the most recent to the least recent. Given this, the CDK needs to know when to stop paginating and this field will generate a stop condition for pagination.

is_client_side_incremental boolean

Set to True if the target API endpoint does not take cursor values to filter records and returns all records anyway. This will cause the connector to filter out records locally, and only emit new records from the last sync, hence incremental. This means that all records would be read from the API, but only new records will be emitted to the destination.
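
Client-side incremental filtering amounts to reading everything and dropping records at or before the stored cursor value. A minimal sketch, assuming a strict "greater than" comparison and ISO-formatted string cursor values that sort chronologically (both are assumptions; keep_new_records is a hypothetical helper):

```python
def keep_new_records(records: list, cursor_field: str, last_cursor_value: str) -> list:
    """Keep only records whose cursor value is newer than the saved state."""
    return [record for record in records if record[cursor_field] > last_cursor_value]


all_records = [
    {"id": 1, "updated_at": "2023-01-01"},
    {"id": 2, "updated_at": "2023-02-01"},
    {"id": 3, "updated_at": "2023-03-01"},
]
emitted = keep_new_records(all_records, "updated_at", "2023-01-15")
```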

is_compare_strictly boolean

Set to True if the target API does not accept queries where the start time equals the end time. This will cause those requests to be skipped.

Default: false
global_substream_cursor boolean

Setting to True causes the connector to store the cursor as one value, instead of per-partition. This setting optimizes performance when the parent stream has thousands of partitions. Notably, the substream state is updated only at the end of the sync, which helps prevent data loss in case of a sync failure. See more info in the docs.

Default: false
lookback_window string

Time interval (ISO8601 duration) before the start_datetime to read data for, e.g. P1M for looking back one month.

  • PT1H: 1 hour
  • P1D: 1 day
  • P1W: 1 week
  • P1M: 1 month
  • P1Y: 1 year
Examples: "P1D", "P{{ config['lookback_days'] }}D"
partition_field_end string

Name of the partition end time field.

Examples: "ending_time"
partition_field_start string

Name of the partition start time field.

Examples: "starting_time"
step string

The size of the time window (ISO8601 duration). Given this field is provided, cursor_granularity needs to be provided as well.

  • PT1H: 1 hour
  • P1D: 1 day
  • P1W: 1 week
  • P1M: 1 month
  • P1Y: 1 year
Examples: "P1W", "{{ config['step_increment'] }}"
$parameters object
JwtAuthenticator object

Authenticator for requests using JWT authentication flow.

type string required
Values: "JwtAuthenticator"
secret_key string required

Secret used to sign the JSON web token.

Examples: "{{ config['secret_key'] }}"
algorithm string required

Algorithm used to sign the JSON web token.

Values: "HS256" "HS384" "HS512" "ES256" "ES256K" "ES384" "ES512" "RS256" "RS384" "RS512" "PS256" "PS384" "PS512" "EdDSA"
Examples: "ES256", "HS256", "RS256", "{{ config['algorithm'] }}"
base64_encode_secret_key boolean

When set to true, the secret key will be base64 encoded prior to being encoded as part of the JWT. Only set to "true" when required by the API.

Default: false
token_duration integer

The amount of time in seconds a JWT token can be valid after being issued.

Default: 1200
Examples: 1200, 3600
header_prefix string

The prefix to be used within the Authorization header.

Examples: "Bearer", "Basic"
jwt_headers object

JWT headers used when signing JSON web token.

3 nested properties
kid string

Private key ID for user account.

Examples: "{{ config['kid'] }}"
typ string

The media type of the complete JWT.

Default: "JWT"
Examples: "JWT"
cty string

Content type of JWT header.

Examples: "JWT"
additional_jwt_headers object

Additional headers to be included with the JWT headers object.

jwt_payload object

JWT Payload used when signing JSON web token.

3 nested properties
iss string

The user/principal that issued the JWT. Commonly a value unique to the user.

Examples: "{{ config['iss'] }}"
sub string

The subject of the JWT. Commonly defined by the API.

aud string

The recipient that the JWT is intended for. Commonly defined by the API.

Examples: "appstoreconnect-v1"
additional_jwt_payload object

Additional properties to be added to the JWT payload.

$parameters object
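
To make the JwtAuthenticator properties concrete, the sketch below signs a token with HS256 using only the standard library. It is not the CDK's implementation: sign_hs256 and decode_segment are hypothetical helpers, only the HS256 algorithm is covered, and mapping token_duration to the exp claim as iat plus the duration is an assumption based on the property description.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used for JWT segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def decode_segment(segment: str) -> dict:
    """Decode one base64url JWT segment back into a dict."""
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def sign_hs256(secret_key: str, jwt_payload: dict, issued_at: int,
               token_duration: int = 1200, jwt_headers=None) -> str:
    header = {"typ": "JWT", "alg": "HS256", **(jwt_headers or {})}
    # Assumption: token_duration becomes the gap between iat and exp.
    claims = {**jwt_payload, "iat": issued_at, "exp": issued_at + token_duration}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(
        secret_key.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{b64url(signature)}"


token = sign_hs256(
    "example-secret",
    {"iss": "example-issuer", "aud": "appstoreconnect-v1"},
    issued_at=1700000000,
    jwt_headers={"kid": "example-key-id"},
)
```

The resulting string would typically be sent with the configured header_prefix, for example "Bearer" followed by the token.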
OAuthAuthenticator object

Authenticator for requests using OAuth 2.0 authorization flow.

type string required
Values: "OAuthAuthenticator"
client_id_name string

The name of the property that carries the client ID in the request sent to refresh the access token.

Default: "client_id"
Examples: "custom_app_id"
client_id string

The OAuth client ID. Fill it in the user inputs.

Examples: "{{ config['client_id'] }}", "{{ config['credentials']['client_id'] }}"
client_secret_name string

The name of the property that carries the client secret in the request sent to refresh the access token.

Default: "client_secret"
Examples: "custom_app_secret"
client_secret string

The OAuth client secret. Fill it in the user inputs.

Examples: "{{ config['client_secret'] }}", "{{ config['credentials']['client_secret'] }}"
refresh_token_name string

The name of the property that carries the refresh token in the request sent to refresh the access token.

Default: "refresh_token"
Examples: "custom_app_refresh_value"
refresh_token string

Credential artifact used to get a new access token.

Examples: "{{ config['refresh_token'] }}", "{{ config['credentials']['refresh_token'] }}"
token_refresh_endpoint string

The full URL to call to obtain a new access token.

Examples: "https://connect.squareup.com/oauth2/token"
access_token_name string

The name of the property which contains the access token in the response from the token refresh endpoint.

Default: "access_token"
Examples: "access_token"
access_token_value string

The value of the access_token to bypass the token refreshing using refresh_token.

Examples: "secret_access_token_value"
expires_in_name string

The name of the property which contains the expiry date in the response from the token refresh endpoint.

Default: "expires_in"
Examples: "expires_in"
grant_type_name string

The name of the property that carries the grant type in the request sent to refresh the access token.

Default: "grant_type"
Examples: "custom_grant_type"
grant_type string

Specifies the OAuth2 grant type. If set to refresh_token, the refresh_token needs to be provided as well. For client_credentials, only client id and secret are required. Other grant types are not officially supported.

Default: "refresh_token"
Examples: "refresh_token", "client_credentials"
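
The *_name properties above rename the keys of the token refresh request, while the matching properties without the suffix supply the values. A sketch of how such a body might be assembled, assuming the keys are sent as flat form fields (build_refresh_request_body is a hypothetical helper, not a CDK function):

```python
def build_refresh_request_body(
    client_id, client_secret, refresh_token=None, grant_type="refresh_token",
    client_id_name="client_id", client_secret_name="client_secret",
    refresh_token_name="refresh_token", grant_type_name="grant_type",
) -> dict:
    body = {
        grant_type_name: grant_type,
        client_id_name: client_id,
        client_secret_name: client_secret,
    }
    if grant_type == "refresh_token":
        # The refresh_token grant needs the token itself; client_credentials does not.
        if refresh_token is None:
            raise ValueError("grant_type 'refresh_token' requires a refresh_token")
        body[refresh_token_name] = refresh_token
    return body


renamed = build_refresh_request_body(
    "my-id", "my-secret", refresh_token="my-refresh", client_id_name="custom_app_id"
)
client_credentials = build_refresh_request_body(
    "my-id", "my-secret", grant_type="client_credentials"
)
```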
refresh_request_body object

Body of the request sent to get a new access token.

Examples: {"applicationId":"{{ config['application_id'] }}","applicationSecret":"{{ config['application_secret'] }}","token":"{{ config['token'] }}"}
refresh_request_headers object

Headers of the request sent to get a new access token.

Examples: {"Authorization":"<AUTH_TOKEN>","Content-Type":"application/x-www-form-urlencoded"}
scopes string[]

List of scopes that should be granted to the access token.

Examples: ["crm.list.read","crm.objects.contacts.read","crm.schema.contacts.read"]
token_expiry_date string

The access token expiry date.

Examples: "2023-04-06T07:12:10.421833+00:00", 1680842386
token_expiry_date_format string

The format of the token expiry datetime. Provide it if the expiry is returned as a date-time string instead of a number of seconds.

Examples: "%Y-%m-%d %H:%M:%S.%f+00:00"
refresh_token_updater

When the refresh token updater is defined, new refresh tokens, access tokens and the access token expiry date are written back from the authentication response to the config object. This is important if the refresh token can only be used once.

7 nested properties
refresh_token_name string

The name of the property which contains the updated refresh token in the response from the token refresh endpoint.

Default: "refresh_token"
Examples: "refresh_token"
access_token_config_path string[]

Config path to the access token. Make sure the field actually exists in the config.

Default:
[
  "credentials",
  "access_token"
]
Examples: ["credentials","access_token"], ["access_token"]
refresh_token_config_path string[]

Config path to the refresh token. Make sure the field actually exists in the config.

Default:
[
  "credentials",
  "refresh_token"
]
Examples: ["credentials","refresh_token"], ["refresh_token"]
token_expiry_date_config_path string[]

Config path to the expiry date. Make sure the field actually exists in the config.

Default:
[
  "credentials",
  "token_expiry_date"
]
Examples: ["credentials","token_expiry_date"]
refresh_token_error_status_codes integer[]

Status codes that identify a refresh token error in the response (Refresh Token Error Key and Refresh Token Error Values should also be specified). Responses with one of the error status codes and containing an error value will be flagged as a config error.

Default:
[]
Examples: [400,500]
refresh_token_error_key string

Key that identifies a refresh token error in the response (Refresh Token Error Status Codes and Refresh Token Error Values should also be specified).

Default: ""
Examples: "error"
refresh_token_error_values string[]

List of values to check for during the token refresh process. The value found in the response under the Refresh Token Error Key is compared against this list (e.g. response={"error": "invalid_grant"}). Only responses with one of the error status codes and containing an error value will be flagged as a config error.

Default:
[]
Examples: ["invalid_grant","invalid_permissions"]
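
The refresh_token_updater properties describe two behaviors: writing new tokens back to a path in the config, and deciding when a failed refresh counts as a config error. Both are sketched below with hypothetical helpers. Raising KeyError for a missing field is an assumption that mirrors the "make sure the field actually exists" note, and the error check follows the rule that status code and error value must both match.

```python
def write_back(config: dict, path: list, value) -> None:
    """Overwrite an existing nested config field addressed by a list of keys."""
    node = config
    for key in path[:-1]:
        node = node[key]
    if path[-1] not in node:
        raise KeyError(f"config path {path!r} does not exist")
    node[path[-1]] = value


def is_refresh_token_config_error(status_code: int, response_body: dict,
                                  error_status_codes: list, error_key: str,
                                  error_values: list) -> bool:
    """True only when both the status code and the error value match."""
    return status_code in error_status_codes and response_body.get(error_key) in error_values


config = {"credentials": {"access_token": "old-access", "refresh_token": "old-refresh"}}
write_back(config, ["credentials", "refresh_token"], "new-refresh")

flagged = is_refresh_token_config_error(
    400, {"error": "invalid_grant"}, [400, 500], "error", ["invalid_grant", "invalid_permissions"]
)
```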
profile_assertion object

Authenticator for requests using JWT authentication flow.

11 nested properties
type string required
Values: "JwtAuthenticator"
secret_key string required

Secret used to sign the JSON web token.

Examples: "{{ config['secret_key'] }}"
algorithm string required

Algorithm used to sign the JSON web token.

Values: "HS256" "HS384" "HS512" "ES256" "ES256K" "ES384" "ES512" "RS256" "RS384" "RS512" "PS256" "PS384" "PS512" "EdDSA"
Examples: "ES256", "HS256", "RS256", "{{ config['algorithm'] }}"
base64_encode_secret_key boolean

When set to true, the secret key will be base64 encoded prior to being encoded as part of the JWT. Only set to "true" when required by the API.

Default: false
token_duration integer

The amount of time in seconds a JWT token can be valid after being issued.

Default: 1200
Examples: 1200, 3600
header_prefix string

The prefix to be used within the Authorization header.

Examples: "Bearer", "Basic"
jwt_headers object

JWT headers used when signing JSON web token.

3 nested properties
kid string

Private key ID for user account.

Examples: "{{ config['kid'] }}"
typ string

The media type of the complete JWT.

Default: "JWT"
Examples: "JWT"
cty string

Content type of JWT header.

Examples: "JWT"
additional_jwt_headers object

Additional headers to be included with the JWT headers object.

jwt_payload object

JWT Payload used when signing JSON web token.

3 nested properties
iss string

The user/principal that issued the JWT. Commonly a value unique to the user.

Examples: "{{ config['iss'] }}"
sub string

The subject of the JWT. Commonly defined by the API.

aud string

The recipient that the JWT is intended for. Commonly defined by the API.

Examples: "appstoreconnect-v1"
additional_jwt_payload object

Additional properties to be added to the JWT payload.

$parameters object
use_profile_assertion boolean

Enable using profile assertion as a flow for OAuth authorization.

Default: false
$parameters object
DeclarativeStream object

A stream whose behavior is described by a set of declarative low code components.

type string required
Values: "DeclarativeStream"

retriever required

Component used to coordinate how records are extracted across stream slices and request pages.

name string

The stream name.

Default: ""

incremental_sync

Component used to fetch data incrementally based on a time field in the data.

primary_key string | string[] | string[][]

The stream field to be used to distinguish unique records. Can either be a single field, an array of fields representing a composite key, or an array of arrays representing a composite key where the fields are nested fields.

Default: ""
Examples: "id", ["code","type"]
schema_loader InlineSchemaLoader | DynamicSchemaLoader | JsonFileSchemaLoader | CustomSchemaLoader | (InlineSchemaLoader | DynamicSchemaLoader | JsonFileSchemaLoader | CustomSchemaLoader)[]

One or more schema loaders can be used to retrieve the schema for the current stream. When multiple schema loaders are defined, their schema properties are merged together. Schema loaders defined first take precedence in the event of a conflict.
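
The merge rule for multiple schema loaders (earlier loaders win on conflict) can be sketched over plain JSON-schema dicts. merge_schema_properties is a hypothetical helper that merges only the top-level "properties" mapping, which is an assumption about the merge depth:

```python
def merge_schema_properties(schemas: list) -> dict:
    """Merge 'properties' from several schemas; the first definition of a name is kept."""
    merged = {}
    for schema in schemas:
        for name, definition in schema.get("properties", {}).items():
            merged.setdefault(name, definition)
    return merged


first = {"properties": {"id": {"type": "integer"}, "name": {"type": "string"}}}
second = {"properties": {"id": {"type": "string"}, "email": {"type": "string"}}}
merged = merge_schema_properties([first, second])
```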

transformations AddFields | RemoveFields | KeysToLower | KeysToSnakeCase | FlattenFields | DpathFlattenFields | KeysReplace | CustomTransformation[]

A list of transformations to be applied to each output record.

state_migrations LegacyToPerPartitionStateMigration | CustomStateMigration[]

Array of state migrations to be applied on the input state

Default:
[]
file_uploader object

(experimental) Describes how to fetch a file

6 nested properties
type string required
Values: "FileUploader"
requester HttpRequester | CustomRequester required

Requester component that describes how to prepare HTTP requests to send to the source API.

download_target_extractor DpathExtractor | CustomRecordExtractor required

Responsible for fetching the URL where the file is located. This is applied to each record, not to the HTTP response.

file_extractor

Responsible for fetching the content of the file. If not defined, the assumption is that the whole response body is the file content.

filename_extractor string

Defines the name to store the file. Stream name is automatically added to the file path. File unique ID can be used to avoid overwriting files. Random UUID will be used if the extractor is not provided.

Examples: "{{ record.id }}/{{ record.file_name }}/", "{{ record.id }}_{{ record.file_name }}/"
$parameters object
$parameters object
HTTPAPIBudget object

Defines how many requests can be made to the API in a given time frame. HTTPAPIBudget extracts the remaining call count and the reset time from HTTP response headers using the header names provided by ratelimit_remaining_header and ratelimit_reset_header. Only requests using HttpRequester are rate-limited; custom components that bypass HttpRequester are not covered by this budget.

type string required
Values: "HTTPAPIBudget"
policies FixedWindowCallRatePolicy | MovingWindowCallRatePolicy | UnlimitedCallRatePolicy[] required

List of call rate policies that define how many calls are allowed.

ratelimit_reset_header string

The HTTP response header name that indicates when the rate limit resets.

Default: "ratelimit-reset"
ratelimit_remaining_header string

The HTTP response header name that indicates the number of remaining allowed calls.

Default: "ratelimit-remaining"
status_codes_for_ratelimit_hit integer[]

List of HTTP status codes that indicate a rate limit has been hit.

Default:
[
  429
]
FixedWindowCallRatePolicy object

A policy that allows a fixed number of calls within a specific time window.

type string required
Values: "FixedWindowCallRatePolicy"
period string required

The time interval for the rate limit window.

call_limit integer required

The maximum number of calls allowed within the period.

matchers HttpRequestRegexMatcher[] required

List of matchers that define which requests this policy applies to.
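
A fixed window policy allows up to call_limit calls per period and then resets. The sketch below models that with an explicit clock value so the behavior is easy to follow. FixedWindowCounter is a hypothetical class, the period is given in seconds instead of a duration string, and aligning the window to the first call is an assumption.

```python
class FixedWindowCounter:
    """Allow at most call_limit calls in each window of period_seconds."""

    def __init__(self, call_limit: int, period_seconds: float):
        self.call_limit = call_limit
        self.period_seconds = period_seconds
        self.window_start = None
        self.calls = 0

    def try_acquire(self, now: float) -> bool:
        # Open a new window on the first call or once the current one has elapsed.
        if self.window_start is None or now - self.window_start >= self.period_seconds:
            self.window_start = now
            self.calls = 0
        if self.calls < self.call_limit:
            self.calls += 1
            return True
        return False


counter = FixedWindowCounter(call_limit=2, period_seconds=60)
outcomes = [counter.try_acquire(now) for now in (0, 1, 2, 60)]
```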

MovingWindowCallRatePolicy object

A policy that allows a fixed number of calls within a moving time window.

type string required
Values: "MovingWindowCallRatePolicy"
rates Rate[] required

List of rates that define the call limits for different time intervals.

matchers HttpRequestRegexMatcher[] required

List of matchers that define which requests this policy applies to.

UnlimitedCallRatePolicy object

A policy that allows unlimited calls for specific requests.

type string required
Values: "UnlimitedCallRatePolicy"
matchers HttpRequestRegexMatcher[] required

List of matchers that define which requests this policy applies to.

Rate object

Defines a rate limit with a specific number of calls allowed within a time interval.

limit integer | string required

The maximum number of calls allowed within the interval.

interval string required

The time interval for the rate limit.

Examples: "PT1H", "P1D"
HttpRequestRegexMatcher object

Matches HTTP requests based on method, base URL, URL path pattern, query parameters, and headers. Use url_base to specify the scheme and host (without trailing slash) and url_path_pattern to apply a regex to the request path.

method string

The HTTP method to match (e.g., GET, POST).

url_base string

The base URL (scheme and host, e.g. "https://api.example.com") to match.

url_path_pattern string

A regular expression pattern to match the URL path.

params object

The query parameters to match.

headers object

The headers to match.
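
The matcher fields can be read as a conjunction: every field that is set must match, and unset fields match anything. A sketch under several assumptions (request_matches is a hypothetical function, the path regex is applied with re.search, header names are compared exactly, and query values are compared as strings):

```python
import re
from urllib.parse import parse_qs, urlsplit


def request_matches(matcher: dict, method: str, url: str, headers=None) -> bool:
    parts = urlsplit(url)
    if "method" in matcher and matcher["method"].upper() != method.upper():
        return False
    if "url_base" in matcher and f"{parts.scheme}://{parts.netloc}" != matcher["url_base"].rstrip("/"):
        return False
    if "url_path_pattern" in matcher and not re.search(matcher["url_path_pattern"], parts.path):
        return False
    query = parse_qs(parts.query)
    for name, expected in matcher.get("params", {}).items():
        if str(expected) not in query.get(name, []):
            return False
    for name, expected in matcher.get("headers", {}).items():
        if (headers or {}).get(name) != expected:
            return False
    return True


matcher = {
    "method": "GET",
    "url_base": "https://api.example.com",
    "url_path_pattern": "^/v1/users",
    "params": {"limit": "10"},
}
hit = request_matches(matcher, "get", "https://api.example.com/v1/users/42?limit=10")
miss = request_matches(matcher, "POST", "https://api.example.com/v1/users/42?limit=10")
```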

DefaultErrorHandler object

Component defining how to handle errors. Default behavior includes only retrying server errors (HTTP 5XX) and too many requests (HTTP 429) with an exponential backoff.

type string required
Values: "DefaultErrorHandler"
backoff_strategies ConstantBackoffStrategy | ExponentialBackoffStrategy | WaitTimeFromHeader | WaitUntilTimeFromHeader | CustomBackoffStrategy[]

List of backoff strategies to use to determine how long to wait before retrying a retryable request.

max_retries integer

The maximum number of times to retry a retryable request before giving up and failing.

Default: 5
Examples: 5, 0, 10
response_filters HttpResponseFilter[]

List of response filters to iterate on when deciding how to handle an error. When using an array of multiple filters, the filters are applied sequentially and the response is selected if it matches any of the filters' predicates.

$parameters object
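
The default behavior described for DefaultErrorHandler (retry HTTP 5XX and 429, up to max_retries times) can be sketched as a small loop. Both functions are hypothetical, the request is reduced to a callable that returns a status code, backoff waits are left out, and treating max_retries as retries in addition to the first attempt is an assumption.

```python
def is_retryable_by_default(status_code: int) -> bool:
    """Server errors (5XX) and too many requests (429) are retried."""
    return status_code == 429 or 500 <= status_code <= 599


def send_with_retries(send, max_retries: int = 5) -> tuple:
    """Call send() until it returns a non-retryable status or retries run out.

    Returns (final_status, retries_used).
    """
    for attempt in range(max_retries + 1):
        status = send()
        if not is_retryable_by_default(status):
            return status, attempt
    return status, max_retries


statuses = iter([500, 429, 200])
result = send_with_retries(lambda: next(statuses))
gave_up = send_with_retries(lambda: 503, max_retries=2)
```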
DefaultPaginator object

Default pagination implementation to request pages of results with a fixed size until the pagination strategy no longer returns a next_page_token.

type string required
Values: "DefaultPaginator"

pagination_strategy required

Strategy defining how records are paginated.

page_size_option object

Specifies the key field or path and where in the request a component's value should be injected.

4 nested properties
type string required
Values: "RequestOption"
inject_into enum required

Configures where the descriptor should be set on the HTTP requests. Note that request parameters that are already encoded in the URL path will not be duplicated.

Values: "request_parameter" "header" "body_data" "body_json"
Examples: "request_parameter", "header", "body_data", "body_json"
field_name string

Configures which key should be used in the location that the descriptor is being injected into. We hope to eventually deprecate this field in favor of field_path for all request_options, but must currently maintain it for backwards compatibility in the Builder.

Examples: "segment_id"
field_path string[]

Configures a path to be used for nested structures in JSON body requests (e.g. GraphQL queries)

Examples: ["data","viewer","id"]
page_token_option RequestOption | RequestPath
$parameters object
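
The DefaultPaginator description boils down to a loop that keeps requesting pages until the strategy stops producing a next_page_token. A sketch with a fake page source (read_all_pages and fake_fetch are hypothetical, and the token here is a simple offset):

```python
def read_all_pages(fetch_page, page_size: int) -> list:
    """Collect records page by page until no next_page_token is returned."""
    records = []
    next_page_token = None
    while True:
        page, next_page_token = fetch_page(page_size, next_page_token)
        records.extend(page)
        if next_page_token is None:
            return records


DATA = list(range(5))


def fake_fetch(page_size: int, token):
    """Stand-in for an API call: the token is the offset of the next page."""
    offset = token or 0
    page = DATA[offset:offset + page_size]
    next_offset = offset + page_size
    return page, (next_offset if next_offset < len(DATA) else None)


all_records = read_all_pages(fake_fetch, page_size=2)
```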
DpathExtractor object

Record extractor that searches a decoded response over a path defined as an array of fields.

type string required
Values: "DpathExtractor"
field_path string[] required

List of potentially nested fields describing the full path of the field to extract. Use "*" to extract all values from an array. See more info in the docs.

Examples: ["data"], ["data","records"], ["data","{{ parameters.name }}"], ["data","*","record"]
$parameters object
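To make the field_path semantics concrete, here is a minimal sketch of path-based extraction with "*" treated as "every element of an array". It is a simplified stand-in written for illustration, not the matching logic of the dpath library or the CDK; the function name extract and the sample response are assumptions.

```python
from typing import Any, Iterator, List


def extract(body: Any, field_path: List[str]) -> Iterator[Any]:
    """Yield the values found at `field_path` inside a decoded response (illustrative only)."""
    if not field_path:
        # End of the path: a list yields its items, anything else is a single record.
        if isinstance(body, list):
            yield from body
        else:
            yield body
        return
    head, rest = field_path[0], field_path[1:]
    if head == "*":
        # Wildcard: descend into every element of an array.
        if isinstance(body, list):
            for item in body:
                yield from extract(item, rest)
    elif isinstance(body, dict) and head in body:
        yield from extract(body[head], rest)


response = {"data": [{"record": {"id": 1}}, {"record": {"id": 2}}]}
records = list(extract(response, ["data", "*", "record"]))
wrapped = list(extract(response, ["data"]))
```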
ResponseToFileExtractor object

A record extractor designed for handling large responses that may exceed memory limits (to prevent OOM issues). It downloads a CSV file to disk, reads the data from disk, and deletes the file once it has been fully processed.

type string required
Values: "ResponseToFileExtractor"
$parameters object
ExponentialBackoffStrategy object

Backoff strategy with an exponential backoff interval. The interval is defined as factor * 2^attempt_count.

type string required
Values: "ExponentialBackoffStrategy"
factor number | string

Multiplicative constant applied on each retry.

Default: 5
Examples: 5, 5.5, "10"
$parameters object
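The interval formula above can be checked with a few lines of arithmetic. This sketch assumes the result is in seconds and that attempt_count starts at 1 for the first retry; neither detail is stated in this schema. The helper name backoff_seconds is hypothetical.

```python
from typing import Union


def backoff_seconds(attempt_count: int, factor: Union[float, str] = 5) -> float:
    """Compute factor * 2^attempt_count; `factor` may be given as a string, as the schema allows."""
    return float(factor) * 2 ** attempt_count


# Assumption: attempts are counted from 1.
delays = [backoff_seconds(attempt) for attempt in range(1, 4)]
```

With the default factor of 5, the waits double on each retry, so a small factor is usually enough to back off quickly.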
GroupByKeyMergeStrategy object

Record merge strategy that combines records according to fields on the record.

type string required
Values: "GroupByKeyMergeStrategy"
key string | string[] required

The name of the field on the record whose value will be used to group properties that were retrieved through multiple API requests.

Examples: "id", ["parent_id","end_date"]
$parameters object
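The following sketch shows one way records returned by several requests could be combined by key. It assumes that later values overwrite earlier ones for the same field and that output order follows first appearance; the CDK's actual behavior may differ. The name merge_by_key and the sample data are hypothetical.

```python
from typing import Any, Dict, List, Tuple, Union


def merge_by_key(records: List[Dict[str, Any]], key: Union[str, List[str]]) -> List[Dict[str, Any]]:
    """Group partial records by `key` and merge their fields (illustrative only)."""
    keys = [key] if isinstance(key, str) else list(key)
    merged: Dict[Tuple[Any, ...], Dict[str, Any]] = {}
    for record in records:
        group = tuple(record.get(k) for k in keys)
        # Assumption: fields from later chunks overwrite earlier ones.
        merged.setdefault(group, {}).update(record)
    return list(merged.values())


chunks = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 1, "email": "a@example.com"},
]
combined = merge_by_key(chunks, "id")
```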
SessionTokenAuthenticator object

Authenticator for requests using the session token as an API key that's injected into the request.

type string required
Values: "SessionTokenAuthenticator"
login_requester object required

Requester submitting HTTP requests and extracting records from the response.

16 nested properties
type string required
Values: "HttpRequester"
url_base string

Deprecated, use the url instead. Base URL of the API source. Do not put sensitive information (e.g. API tokens) into this field - Use the Authenticator component for this.

Examples: "https://connect.squareup.com/v2", "{{ config['base_url'] or 'https://app.posthog.com'}}/api", "https://connect.squareup.com/v2/quotes/{{ stream_partition['id'] }}/quote_line_groups", "https://example.com/api/v1/resource/{{ next_page_token['id'] }}"
url string

The URL of the source API endpoint. Do not put sensitive information (e.g. API tokens) into this field - Use the Authenticator component for this.

Examples: "https://connect.squareup.com/v2", "{{ config['url'] or 'https://app.posthog.com'}}/api", "https://connect.squareup.com/v2/quotes/{{ stream_partition['id'] }}/quote_line_groups", "https://example.com/api/v1/resource/{{ next_page_token['id'] }}"
path string

Deprecated, use the url instead. Path of the specific API endpoint that this stream represents. Do not put sensitive information (e.g. API tokens) into this field - Use the Authenticator component for this.

Examples: "/products", "/quotes/{{ stream_partition['id'] }}/quote_line_groups", "/trades/{{ config['symbol_id'] }}/history"
http_method string

The HTTP method used to fetch data from the source (can be GET or POST).

Default: "GET"
Values: "GET" "POST"
Examples: "GET", "POST"
fetch_properties_from_endpoint object

Defines the behavior for fetching the list of properties from an API that will be loaded into the requests to extract records.

4 nested properties
type string required
Values: "PropertiesFromEndpoint"
property_field_path string[] required

Describes the path to the field that should be extracted

Examples: ["name"]
retriever SimpleRetriever | CustomRetriever required

Requester component that describes how to fetch the properties to query from a remote API endpoint.

$parameters object
query_properties object

For APIs that require explicit specification of the properties to query for, this component specifies which property fields and how they are supplied to outbound requests.

5 nested properties
type string required
Values: "QueryProperties"
property_list string[] | PropertiesFromEndpoint required

The set of properties that will be queried for in the outbound request. This can either be statically defined or dynamic based on an API endpoint

always_include_properties string[]

The list of properties that should be included in every set of properties when multiple chunks of properties are being requested.

property_chunking object

For APIs with restrictions on the number of properties that can be requested per request, property chunking can be applied to make multiple requests, each with a subset of the properties.

$parameters object
request_parameters object | string

Specifies the query parameters that should be set on an outgoing HTTP request given the inputs.

Examples: {"unit":"day"}, {"query":"last_event_time BETWEEN TIMESTAMP \"{{ stream_interval.start_time }}\" AND TIMESTAMP \"{{ stream_interval.end_time }}\""}, {"searchIn":"{{ ','.join(config.get('search_in', [])) }}"}, {"sort_by[asc]":"updated_at"}
request_headers object | string

Specifies any non-auth headers to set on the request. Authentication headers will overwrite any overlapping headers defined here.

Examples: {"Output-Format":"JSON"}, {"Version":"{{ config['version'] }}"}
request_body_data object | string

Specifies how to populate the body of the request with a non-JSON payload. Plain text will be sent as is, whereas objects will be converted to a urlencoded form.

Examples: "[{"clause": {"type": "timestamp", "operator": 10, "parameters": [{"value": {{ stream_interval['start_time'] | int * 1000 }} }] }, "orderBy": 1, "columnName": "Timestamp"}]/ "
request_body_json object | string

Specifies how to populate the body of the request with a JSON payload. Can contain nested objects.

Examples: {"sort_order":"ASC","sort_field":"CREATED_AT"}, {"key":"{{ config['value'] }}"}, {"sort":{"field":"updated_at","order":"ascending"}}

Specifies how to populate the body of the request with a payload. Can contain nested objects.

Error handler component that defines how to handle errors.

use_cache boolean

Enables stream requests caching. This field is automatically set by the CDK.

Default: false
$parameters object
session_token_path string[] required

The path in the response body returned from the login requester to the session token.

Examples: ["access_token"], ["result","token"]

Authentication method to use for requests sent to the API, specifying how to inject the session token.

expiration_duration string

The duration in ISO 8601 duration notation after which the session token expires, starting from the time it was obtained. Omitting it will result in the session token being refreshed for every request.

  • PT1H: 1 hour
  • P1D: 1 day
  • P1W: 1 week
  • P1M: 1 month
  • P1Y: 1 year
Examples: "PT1H", "P1D"

Component used to decode the response.

$parameters object
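The interplay of session_token_path and expiration_duration can be sketched as follows. The Python standard library has no ISO 8601 duration parser, so this example represents "PT1H" directly as a timedelta; the function names and the sample login response are hypothetical and do not reflect the CDK's internals.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List, Optional


def extract_token(body: Dict[str, Any], session_token_path: List[str]) -> Any:
    """Follow `session_token_path` through the login response body."""
    value: Any = body
    for key in session_token_path:
        value = value[key]
    return value


def needs_refresh(obtained_at: Optional[datetime], expiration: Optional[timedelta], now: datetime) -> bool:
    """Decide whether a new login request is needed (illustrative only)."""
    if obtained_at is None or expiration is None:
        # No token yet, or no expiration_duration given: refresh on every request.
        return True
    return now >= obtained_at + expiration


login_response = {"result": {"token": "abc123"}}
token = extract_token(login_response, ["result", "token"])

obtained = datetime(2024, 1, 1, 12, 0, 0)
one_hour = timedelta(hours=1)  # stands in for expiration_duration "PT1H"
still_valid = not needs_refresh(obtained, one_hour, obtained + timedelta(minutes=30))
```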
SessionTokenRequestApiKeyAuthenticator object

Authenticator for requests using the session token as an API key that's injected into the request.

type enum required
Values: "ApiKey"
inject_into object required

Specifies the key field or path and where in the request a component's value should be injected.

4 nested properties
type string required
Values: "RequestOption"
inject_into enum required

Configures where the descriptor should be set on the HTTP requests. Note that request parameters that are already encoded in the URL path will not be duplicated.

Values: "request_parameter" "header" "body_data" "body_json"
Examples: "request_parameter", "header", "body_data", "body_json"
field_name string

Configures which key should be used in the location that the descriptor is being injected into. We hope to eventually deprecate this field in favor of field_path for all request_options, but must currently maintain it for backwards compatibility in the Builder.

Examples: "segment_id"
field_path string[]

Configures a path to be used for nested structures in JSON body requests (e.g. GraphQL queries)

Examples: ["data","viewer","id"]
SessionTokenRequestBearerAuthenticator object

Authenticator for requests using the session token as a standard bearer token.

type enum required
Values: "Bearer"
HttpRequester object

Requester submitting HTTP requests and extracting records from the response.

type string required
Values: "HttpRequester"
url_base string

Deprecated, use the url instead. Base URL of the API source. Do not put sensitive information (e.g. API tokens) into this field - Use the Authenticator component for this.

Examples: "https://connect.squareup.com/v2", "{{ config['base_url'] or 'https://app.posthog.com'}}/api", "https://connect.squareup.com/v2/quotes/{{ stream_partition['id'] }}/quote_line_groups", "https://example.com/api/v1/resource/{{ next_page_token['id'] }}"
url string

The URL of the source API endpoint. Do not put sensitive information (e.g. API tokens) into this field - Use the Authenticator component for this.

Examples: "https://connect.squareup.com/v2", "{{ config['url'] or 'https://app.posthog.com'}}/api", "https://connect.squareup.com/v2/quotes/{{ stream_partition['id'] }}/quote_line_groups", "https://example.com/api/v1/resource/{{ next_page_token['id'] }}"
path string

Deprecated, use the url instead. Path of the specific API endpoint that this stream represents. Do not put sensitive information (e.g. API tokens) into this field - Use the Authenticator component for this.

Examples: "/products", "/quotes/{{ stream_partition['id'] }}/quote_line_groups", "/trades/{{ config['symbol_id'] }}/history"
http_method string

The HTTP method used to fetch data from the source (can be GET or POST).

Default: "GET"
Values: "GET" "POST"
Examples: "GET", "POST"
fetch_properties_from_endpoint object

Defines the behavior for fetching the list of properties from an API that will be loaded into the requests to extract records.

4 nested properties
type string required
Values: "PropertiesFromEndpoint"
property_field_path string[] required

Describes the path to the field that should be extracted

Examples: ["name"]
retriever SimpleRetriever | CustomRetriever required

Requester component that describes how to fetch the properties to query from a remote API endpoint.

$parameters object
query_properties object

For APIs that require explicit specification of the properties to query for, this component specifies which property fields and how they are supplied to outbound requests.

5 nested properties
type string required
Values: "QueryProperties"
property_list string[] | PropertiesFromEndpoint required

The set of properties that will be queried for in the outbound request. This can either be statically defined or dynamic based on an API endpoint

always_include_properties string[]

The list of properties that should be included in every set of properties when multiple chunks of properties are being requested.

property_chunking object

For APIs with restrictions on the number of properties that can be requested per request, property chunking can be applied to make multiple requests, each with a subset of the properties.

5 nested properties
type string required
Values: "PropertyChunking"
property_limit_type enum required

The type used to determine the maximum number of properties per chunk

Values: "characters" "property_count"
property_limit integer

The maximum number of properties that can be retrieved per request according to the limit type.

record_merge_strategy object

Record merge strategy that combines records according to fields on the record.

$parameters object
$parameters object
request_parameters object | string

Specifies the query parameters that should be set on an outgoing HTTP request given the inputs.

Examples: {"unit":"day"}, {"query":"last_event_time BETWEEN TIMESTAMP \"{{ stream_interval.start_time }}\" AND TIMESTAMP \"{{ stream_interval.end_time }}\""}, {"searchIn":"{{ ','.join(config.get('search_in', [])) }}"}, {"sort_by[asc]":"updated_at"}
request_headers object | string

Specifies any non-auth headers to set on the request. Authentication headers will overwrite any overlapping headers defined here.

Examples: {"Output-Format":"JSON"}, {"Version":"{{ config['version'] }}"}
request_body_data object | string

Specifies how to populate the body of the request with a non-JSON payload. Plain text will be sent as is, whereas objects will be converted to a urlencoded form.

Examples: "[{"clause": {"type": "timestamp", "operator": 10, "parameters": [{"value": {{ stream_interval['start_time'] | int * 1000 }} }] }, "orderBy": 1, "columnName": "Timestamp"}]/ "
request_body_json object | string

Specifies how to populate the body of the request with a JSON payload. Can contain nested objects.

Examples: {"sort_order":"ASC","sort_field":"CREATED_AT"}, {"key":"{{ config['value'] }}"}, {"sort":{"field":"updated_at","order":"ascending"}}

Specifies how to populate the body of the request with a payload. Can contain nested objects.

Error handler component that defines how to handle errors.

use_cache boolean

Enables stream requests caching. This field is automatically set by the CDK.

Default: false
$parameters object
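The property_chunking settings described under query_properties can be illustrated with a small splitting routine. This is a sketch under stated assumptions: always-included properties are added to every chunk without counting toward the limit, and the "characters" limit type counts only the length of each property name. Neither assumption is confirmed by this schema, and chunk_properties is a hypothetical name.

```python
from typing import List, Sequence


def chunk_properties(
    properties: Sequence[str],
    property_limit: int,
    property_limit_type: str = "property_count",
    always_include: Sequence[str] = (),
) -> List[List[str]]:
    """Split `properties` into chunks that respect `property_limit` (illustrative only)."""

    def cost(name: str) -> int:
        # Assumption: "characters" counts the length of each name only.
        return len(name) if property_limit_type == "characters" else 1

    chunks: List[List[str]] = []
    current: List[str] = []
    used = 0
    for name in properties:
        if name in always_include:
            continue  # added to every chunk below
        if current and used + cost(name) > property_limit:
            chunks.append(list(always_include) + current)
            current, used = [], 0
        current.append(name)
        used += cost(name)
    if current:
        chunks.append(list(always_include) + current)
    return chunks


by_count = chunk_properties(["id", "name", "email", "phone", "city"], 2, always_include=["id"])
by_chars = chunk_properties(["name", "email", "phone"], 10, "characters")
```

Each chunk would correspond to one outbound request; a record merge strategy keyed on the always-included property can then stitch the partial records back together.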
HttpResponseFilter object

A filter used to select on properties of the HTTP response received. When used with additional filters, a response will be selected if it matches any of the filters' criteria.

type string required
Values: "HttpResponseFilter"
action string

Action to execute if a response matches the filter.

Values: "SUCCESS" "FAIL" "RETRY" "IGNORE" "RATE_LIMITED"
Examples: "SUCCESS", "FAIL", "RETRY", "IGNORE", "RATE_LIMITED"
failure_type string

Failure type of traced exception if a response matches the filter.

Values: "system_error" "config_error" "transient_error"
Examples: "system_error", "config_error", "transient_error"
error_message string

Error message to display if the response matches the filter.

error_message_contains string

Match the response if its error message contains the substring.

http_codes integer[]

Match the response if its HTTP code is included in this list.

Examples: [420,429], [500]
uniqueItems=true
predicate string

Match the response if the predicate evaluates to true.

Examples: "{{ 'Too much requests' in response }}", "{{ 'error_code' in response and response['error_code'] == 'ComplexityException' }}"
$parameters object
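A simplified view of how a list of response filters might be evaluated is sketched below. It covers only http_codes and error_message_contains (predicate is left out because it requires template interpolation) and assumes that the first matching filter decides the action. The function names and the dictionary representation of a filter are hypothetical.

```python
from typing import Any, Dict, List, Optional


def filter_matches(flt: Dict[str, Any], status_code: int, error_message: str = "") -> bool:
    """Return True if the response satisfies any criterion of this filter."""
    if status_code in flt.get("http_codes", []):
        return True
    needle = flt.get("error_message_contains")
    return bool(needle) and needle in error_message


def resolve_action(filters: List[Dict[str, Any]], status_code: int, error_message: str = "") -> Optional[str]:
    """Apply filters in order; assumption: the first match wins."""
    for flt in filters:
        if filter_matches(flt, status_code, error_message):
            return flt["action"]
    return None  # no filter matched; default handling would apply


filters = [
    {"http_codes": [420, 429], "action": "RATE_LIMITED"},
    {"error_message_contains": "not found", "action": "IGNORE"},
    {"http_codes": [500], "action": "RETRY"},
]
action = resolve_action(filters, 429)
```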
ComplexFieldType object

(This component is experimental. Use at your own risk.) Represents a complex field type.

field_type string required
items string | ComplexFieldType
TypesMap object

(This component is experimental. Use at your own risk.) Represents a mapping between a current type and its corresponding target type.

target_type string | string[] | ComplexFieldType required
current_type string | string[] required
condition string
SchemaTypeIdentifier object

(This component is experimental. Use at your own risk.) Identifies schema details for dynamic schema extraction and processing.

key_pointer string[] required

List of potentially nested fields describing the full path of the field key to extract.

type string
Values: "SchemaTypeIdentifier"
schema_pointer string[]

List of nested fields defining the schema field path to extract. Defaults to [].

Default:
[]
type_pointer string[]

List of potentially nested fields describing the full path of the field type to extract.

types_mapping TypesMap[]
$parameters object
DynamicSchemaLoader object

(This component is experimental. Use at your own risk.) Loads a schema by extracting data from retrieved records.

type string required
Values: "DynamicSchemaLoader"

Component used to coordinate how records are extracted across stream slices and request pages.

schema_type_identifier object required

(This component is experimental. Use at your own risk.) Identifies schema details for dynamic schema extraction and processing.

6 nested properties
key_pointer string[] required

List of potentially nested fields describing the full path of the field key to extract.

type string
Values: "SchemaTypeIdentifier"
schema_pointer string[]

List of nested fields defining the schema field path to extract. Defaults to [].

Default:
[]
type_pointer string[]

List of potentially nested fields describing the full path of the field type to extract.

types_mapping TypesMap[]
$parameters object

Responsible for filtering fields to be added to json schema.

schema_transformations AddFields | RemoveFields | KeysToLower | KeysToSnakeCase | FlattenFields | DpathFlattenFields | KeysReplace | CustomTransformation[]

A list of transformations to be applied to the schema.

$parameters object
InlineSchemaLoader object

Loads a schema that is defined directly in the manifest file.

type string required
Values: "InlineSchemaLoader"
schema object

Describes a stream's schema. Refer to the Data Types documentation for more details on which types are valid.

JsonFileSchemaLoader object

Loads the schema from a JSON file.

type string required
Values: "JsonFileSchemaLoader"
file_path string

Path to the JSON file defining the schema. The path is relative to the connector module's root.

$parameters object
JsonDecoder object

Select 'JSON' if the response is formatted as a JSON object.

type string required
Values: "JsonDecoder"
JsonlDecoder object

Select 'JSON Lines' if the response consists of JSON objects separated by new lines ('\n') in JSONL format.

type string required
Values: "JsonlDecoder"
KeysToLower object

A transformation that renames all keys to lower case.

type string required
Values: "KeysToLower"
$parameters object
KeysToSnakeCase object

A transformation that renames all keys to snake case.

type string required
Values: "KeysToSnakeCase"
$parameters object
FlattenFields object

A transformation that flattens a record to a single-level format.

type string required
Values: "FlattenFields"
flatten_lists boolean

Whether to flatten lists or leave them as is. Default is True.

Default: true
$parameters object
KeyTransformation object
type string required
Values: "KeyTransformation"
prefix string

Prefix to add for object keys. If not provided, original keys remain unchanged.

Examples: "flattened_"
suffix string

Suffix to add for object keys. If not provided, original keys remain unchanged.

Examples: "_flattened"
DpathFlattenFields object

A transformation that flattens field values to the top level of the record.

type string required
Values: "DpathFlattenFields"
field_path string[] required

Path to the field that needs to be flattened.

Examples: ["data"], ["data","*","field"]
delete_origin_value boolean

Whether to delete the origin value or keep it. Default is False.

replace_record boolean

Whether to replace the origin record or not. Default is False.

key_transformation object
3 nested properties
type string required
Values: "KeyTransformation"
prefix string

Prefix to add for object keys. If not provided, original keys remain unchanged.

Examples: "flattened_"
suffix string

Suffix to add for object keys. If not provided, original keys remain unchanged.

Examples: "_flattened"
$parameters object
KeysReplace object

A transformation that replaces symbols in keys.

type string required
Values: "KeysReplace"
old string required

Old value to replace.

Examples: " ", "{{ record.id }}", "{{ config['id'] }}", "{{ stream_slice['id'] }}"
new string required

New value to set.

Examples: "_", "{{ record.id }}", "{{ config['id'] }}", "{{ stream_slice['id'] }}"
$parameters object
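The effect of KeysReplace with old set to " " and new set to "_" can be pictured with the sketch below. Recursing into nested objects and lists is an assumption made for this example, and replace_in_keys is a hypothetical helper rather than the CDK's transformation class.

```python
from typing import Any


def replace_in_keys(value: Any, old: str, new: str) -> Any:
    """Return a copy of `value` with `old` replaced by `new` in every key (illustrative only)."""
    if isinstance(value, dict):
        return {key.replace(old, new): replace_in_keys(item, old, new) for key, item in value.items()}
    if isinstance(value, list):
        # Assumption: objects nested inside lists are transformed as well.
        return [replace_in_keys(item, old, new) for item in value]
    return value


record = {"first name": "Ada", "home address": {"zip code": "12345"}}
cleaned = replace_in_keys(record, " ", "_")
```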
IterableDecoder object

Select 'Iterable' if the response consists of strings separated by new lines (\n). The string will then be wrapped into a JSON object with the record key.

type string required
Values: "IterableDecoder"
XmlDecoder object

Select 'XML' if the response consists of XML-formatted data.

type string required
Values: "XmlDecoder"
CustomDecoder object

Use this to implement custom decoder logic.

type string required
Values: "CustomDecoder"
class_name string required

Fully-qualified name of the class that will be implementing the custom decoding. Has to be a subclass of Decoder. The format is source_<name>.<package>.<class_name>.

Examples: "source_amazon_ads.components.GzipJsonlDecoder"
$parameters object
ZipfileDecoder object

Select 'ZIP file' for response data that is returned as a zipfile. Requires specifying an inner data type/decoder to parse the unzipped data.

type string required
Values: "ZipfileDecoder"

Parser to parse the decompressed data from the zipfile(s).

ListPartitionRouter object

A partition router that specifies a list of attributes where each attribute describes a portion of the complete data set for a stream. During a sync, each value is iterated over and can be used as input to outbound API requests.

type string required
Values: "ListPartitionRouter"
cursor_field string required

While iterating over list values, the name of the field used to reference a list value. The partition value can be accessed with string interpolation, e.g. "{{ stream_partition['my_key'] }}" where "my_key" is the value of the cursor_field.

Examples: "section", "{{ config['section_key'] }}"
values string | string[] required

The list of attributes being iterated over and used as input for the requests made to the source API.

Examples: ["section_a","section_b","section_c"], "{{ config['sections'] }}"
request_option object

Specifies the key field or path and where in the request a component's value should be injected.

4 nested properties
type string required
Values: "RequestOption"
inject_into enum required

Configures where the descriptor should be set on the HTTP requests. Note that request parameters that are already encoded in the URL path will not be duplicated.

Values: "request_parameter" "header" "body_data" "body_json"
Examples: "request_parameter", "header", "body_data", "body_json"
field_name string

Configures which key should be used in the location that the descriptor is being injected into. We hope to eventually deprecate this field in favor of field_path for all request_options, but must currently maintain it for backwards compatibility in the Builder.

Examples: "segment_id"
field_path string[]

Configures a path to be used for nested structures in JSON body requests (e.g. GraphQL queries)

Examples: ["data","viewer","id"]
$parameters object
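Conceptually, the router turns values into one partition per entry, keyed by cursor_field. The sketch below shows that mapping and how a partition value might end up in a request path; the helper list_partitions and the "/articles/" path are hypothetical, and real manifests would use "{{ stream_partition['section'] }}" interpolation instead of Python string formatting.

```python
from typing import Dict, List


def list_partitions(cursor_field: str, values: List[str]) -> List[Dict[str, str]]:
    """Build one partition per value, keyed by `cursor_field` (illustrative only)."""
    return [{cursor_field: value} for value in values]


partitions = list_partitions("section", ["section_a", "section_b", "section_c"])

# Hypothetical endpoint: one request path per partition.
paths = [f"/articles/{partition['section']}" for partition in partitions]
```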
MinMaxDatetime object

Compares the provided date against optional minimum or maximum times. The max_datetime serves as the ceiling and will be returned when datetime exceeds it. The min_datetime serves as the floor.

type string required
Values: "MinMaxDatetime"
datetime string required

Datetime value.

Examples: "2021-01-01", "2021-01-01T00:00:00+00:00", "{{ config['start_time'] }}", "{{ now_utc().strftime('%Y-%m-%dT%H:%M:%SZ') }}"
datetime_format string

Format of the datetime value. Defaults to "%Y-%m-%dT%H:%M:%S.%f%z" if left empty. Use placeholders starting with "%" to describe the format the API is using. The following placeholders are available:

  • %s: Epoch unix timestamp in seconds - 1686218963
  • %s_as_float: Epoch unix timestamp in seconds as float with microsecond precision - 1686218963.123456
  • %ms: Epoch unix timestamp in milliseconds - 1686218963123
  • %a: Weekday (abbreviated) - Sun
  • %A: Weekday (full) - Sunday
  • %w: Weekday (decimal) - 0 (Sunday), 6 (Saturday)
  • %d: Day of the month (zero-padded) - 01, 02, ..., 31
  • %b: Month (abbreviated) - Jan
  • %B: Month (full) - January
  • %m: Month (zero-padded) - 01, 02, ..., 12
  • %y: Year (without century, zero-padded) - 00, 01, ..., 99
  • %Y: Year (with century) - 0001, 0002, ..., 9999
  • %H: Hour (24-hour, zero-padded) - 00, 01, ..., 23
  • %I: Hour (12-hour, zero-padded) - 01, 02, ..., 12
  • %p: AM/PM indicator
  • %M: Minute (zero-padded) - 00, 01, ..., 59
  • %S: Second (zero-padded) - 00, 01, ..., 59
  • %f: Microsecond (zero-padded to 6 digits) - 000000, 000001, ..., 999999
  • %_ms: Millisecond (zero-padded to 3 digits) - 000, 001, ..., 999
  • %z: UTC offset - (empty), +0000, -04:00
  • %Z: Time zone name - (empty), UTC, GMT
  • %j: Day of the year (zero-padded) - 001, 002, ..., 366
  • %U: Week number of the year (Sunday as first day) - 00, 01, ..., 53
  • %W: Week number of the year (Monday as first day) - 00, 01, ..., 53
  • %c: Date and time representation - Tue Aug 16 21:30:00 1988
  • %x: Date representation - 08/16/1988
  • %X: Time representation - 21:30:00
  • %%: Literal '%' character

Some placeholders depend on the locale of the underlying system - in most cases this locale is configured as en/US. For more information see the Python documentation.

Default: ""
Examples: "%Y-%m-%dT%H:%M:%S.%f%z", "%Y-%m-%d", "%s"
max_datetime string

Ceiling applied on the datetime value. Must be formatted with the datetime_format field.

Examples: "2021-01-01T00:00:00Z", "2021-01-01"
min_datetime string

Floor applied on the datetime value. Must be formatted with the datetime_format field.

Examples: "2010-01-01T00:00:00Z", "2010-01-01"
$parameters object
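The floor and ceiling behavior can be sketched with the standard library. This example only handles placeholders that Python's strptime understands (so not %s, %ms, %s_as_float, or %_ms from the list above) and returns a datetime object rather than a formatted string; clamp_datetime is a hypothetical name.

```python
from datetime import datetime
from typing import Optional


def clamp_datetime(
    value: str,
    datetime_format: str,
    min_datetime: Optional[str] = None,
    max_datetime: Optional[str] = None,
) -> datetime:
    """Parse `value` and bound it by the optional floor and ceiling (illustrative only)."""
    result = datetime.strptime(value, datetime_format)
    if min_datetime is not None:
        # The floor is returned when the value is earlier than it.
        result = max(result, datetime.strptime(min_datetime, datetime_format))
    if max_datetime is not None:
        # The ceiling is returned when the value exceeds it.
        result = min(result, datetime.strptime(max_datetime, datetime_format))
    return result


too_early = clamp_datetime("2005-06-01", "%Y-%m-%d", min_datetime="2010-01-01", max_datetime="2021-01-01")
too_late = clamp_datetime("2030-01-01", "%Y-%m-%d", min_datetime="2010-01-01", max_datetime="2021-01-01")
```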
NoAuth object

Authenticator for requests requiring no authentication.

type string required
Values: "NoAuth"
$parameters object
NoPagination object

Pagination implementation that never returns a next page.

type string required
Values: "NoPagination"
OAuthConfigSpecification object

Specification describing how an 'advanced' Auth flow would need to function.

oauth_user_input_from_connector_config_specification object

OAuth specific blob. This is a JSON Schema used to validate JSON configurations used as input to OAuth. Must be a valid non-nested JSON that refers to properties from ConnectorSpecification.connectionSpecification using the special annotation 'path_in_connector_config'. These are input values that the user enters through the UI to authenticate to the connector and that might also be shared as inputs for syncing data via the connector. Examples:

  • If no connector values are shared during the OAuth flow: oauth_user_input_from_connector_config_specification=[]
  • If connector values such as 'app_id' at the top level are used to generate the API URL for the OAuth flow: oauth_user_input_from_connector_config_specification={ app_id: { type: string, path_in_connector_config: ['app_id'] } }
  • If connector values such as 'info.app_id' nested inside another object are used to generate the API URL for the OAuth flow: oauth_user_input_from_connector_config_specification={ app_id: { type: string, path_in_connector_config: ['info', 'app_id'] } }

Examples: {"app_id":{"type":"string","path_in_connector_config":["app_id"]}}, {"app_id":{"type":"string","path_in_connector_config":["info","app_id"]}}
oauth_connector_input_specification object

The DeclarativeOAuth specific blob. Pertains to the fields defined by the connector relating to the OAuth flow.

Interpolation capabilities:

  • Variable placeholders are declared as {{my_var}}.

  • Nested variable resolution, such as {{ {{my_nested_var}} }}, is allowed as well.

  • The allowed interpolation context is:

    • base64Encoder - encode to base64, {{ {{my_var_a}}:{{my_var_b}} | base64Encoder }}
    • base64Decoder - decode from base64 encoded string, {{ {{my_string_variable_or_string_value}} | base64Decoder }}
    • urlEncoder - encode the input string to URL-like format, {{ https://test.host.com/endpoint | urlEncoder}}
    • urlDecoder - decode the input url-encoded string into text format, {{ urlDecoder:https%3A%2F%2Fairbyte.io | urlDecoder}}
    • codeChallengeS256 - get the codeChallenge encoded value to provide additional data-provider specific authorisation values, {{ {{state_value}} | codeChallengeS256 }}
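For orientation, the encoding filters listed above correspond to familiar operations that can be reproduced with the Python standard library. The functions below are stand-ins written for this illustration, not the implementation behind the DeclarativeOAuth interpolation, and the exact escaping rules of urlEncoder are assumed to match percent-encoding of all reserved characters. codeChallengeS256 is omitted because its exact output format is not described here.

```python
import base64
from urllib.parse import quote, unquote


def base64_encode(text: str) -> str:
    """Rough equivalent of the base64Encoder filter."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def base64_decode(text: str) -> str:
    """Rough equivalent of the base64Decoder filter."""
    return base64.b64decode(text).decode("utf-8")


def url_encode(text: str) -> str:
    """Rough equivalent of urlEncoder; assumption: every reserved character is escaped."""
    return quote(text, safe="")


def url_decode(text: str) -> str:
    """Rough equivalent of the urlDecoder filter."""
    return unquote(text)


# Mirrors {{ {{my_var_a}}:{{my_var_b}} | base64Encoder }} with sample values.
basic_credentials = base64_encode("user:pass")
encoded_url = url_encode("https://airbyte.io")
```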

13 nested properties
consent_url string required

The DeclarativeOAuth Specific URL string template used to initiate the authentication. The placeholders are replaced during processing to provide the necessary values.

Examples: "https://domain.host.com/marketing_api/auth?{{client_id_key}}={{client_id_value}}&{{redirect_uri_key}}={{{{redirect_uri_value}} | urlEncoder}}&{{state_key}}={{state_value}}", "https://endpoint.host.com/oauth2/authorize?{{client_id_key}}={{client_id_value}}&{{redirect_uri_key}}={{{{redirect_uri_value}} | urlEncoder}}&{{scope_key}}={{{{scope_value}} | urlEncoder}}&{{state_key}}={{state_value}}&subdomain={{subdomain}}"
access_token_url string required

The DeclarativeOAuth Specific URL string template used to obtain the access_token, refresh_token, etc. The placeholders are replaced during processing to provide the necessary values.

Examples: "https://auth.host.com/oauth2/token?{{client_id_key}}={{client_id_value}}&{{client_secret_key}}={{client_secret_value}}&{{auth_code_key}}={{auth_code_value}}&{{redirect_uri_key}}={{{{redirect_uri_value}} | urlEncoder}}"
scope string

The DeclarativeOAuth Specific string of the scopes that need to be granted for the authenticated user.

Examples: "user:read user:read_orders workspaces:read"
access_token_headers object

The DeclarativeOAuth Specific optional headers to inject while exchanging the auth_code for an access_token during the completeOAuthFlow step.

Examples: {"Authorization":"Basic {{ {{ client_id_value }}:{{ client_secret_value }} | base64Encoder }}"}
access_token_params object

The DeclarativeOAuth Specific optional query parameters to inject while exchanging the auth_code for an access_token during the completeOAuthFlow step. When this property is provided, the query params will be encoded as JSON and included in the outgoing API request.

Examples: {"{{ auth_code_key }}":"{{ auth_code_value }}","{{ client_id_key }}":"{{ client_id_value }}","{{ client_secret_key }}":"{{ client_secret_value }}"}
extract_output string[]

The DeclarativeOAuth Specific list of strings to indicate which keys should be extracted and returned back to the input config.

Examples: ["access_token","refresh_token","other_field"]
state object

The DeclarativeOAuth Specific object to provide the criteria of how the state query param should be constructed, including length and complexity.

Examples: {"min":7,"max":128}
2 nested properties
min integer required
max integer required
client_id_key string

The DeclarativeOAuth Specific optional override to provide the custom client_id key name, if required by data-provider.

Examples: "my_custom_client_id_key_name"
client_secret_key string

The DeclarativeOAuth Specific optional override to provide the custom client_secret key name, if required by data-provider.

Examples: "my_custom_client_secret_key_name"
scope_key string

The DeclarativeOAuth Specific optional override to provide the custom scope key name, if required by data-provider.

Examples: "my_custom_scope_key_key_name"
state_key string

The DeclarativeOAuth Specific optional override to provide the custom state key name, if required by data-provider.

Examples: "my_custom_state_key_key_name"
auth_code_key string

The DeclarativeOAuth Specific optional override to provide the custom code key name to something like auth_code or custom_auth_code, if required by data-provider.

Examples: "my_custom_auth_code_key_name"
redirect_uri_key string

The DeclarativeOAuth Specific optional override to provide the custom redirect_uri key name to something like callback_uri, if required by data-provider.

Examples: "my_custom_redirect_uri_key_name"
complete_oauth_output_specification object

OAuth specific blob. This is a JSON Schema used to validate JSON configurations produced by the OAuth flows as they are returned by the distant OAuth APIs. Must be a valid JSON describing the fields to merge back to ConnectorSpecification.connectionSpecification. For each field, a special annotation path_in_connector_config can be specified to determine where to merge it. Example: complete_oauth_output_specification={ refresh_token: { type: string, path_in_connector_config: ['credentials', 'refresh_token'] } }

Examples: {"refresh_token":{"type":"string,","path_in_connector_config":["credentials","refresh_token"]}}
complete_oauth_server_input_specification object

OAuth specific blob. This is a JSON Schema used to validate JSON configurations persisted as Airbyte Server configurations. Must be a valid non-nested JSON describing additional fields configured by the Airbyte Instance or Workspace Admins to be used by the server when completing an OAuth flow (typically exchanging an auth code for a refresh token). Example: complete_oauth_server_input_specification={ client_id: { type: string }, client_secret: { type: string } }

Examples: {"client_id":{"type":"string"},"client_secret":{"type":"string"}}
complete_oauth_server_output_specification object

OAuth specific blob. This is a JSON Schema used to validate JSON configurations persisted as Airbyte Server configurations that also need to be merged back into the connector configuration at runtime. It is a subset of complete_oauth_server_input_specification that retains only the fields the connector needs in order to function with OAuth. Some fields are used during OAuth flows but are not needed afterwards, so they are listed in complete_oauth_server_input_specification but not in complete_oauth_server_output_specification. Must be valid non-nested JSON describing additional fields configured by the Airbyte Instance or Workspace Admins to be used by the connector when using OAuth flow APIs. These fields are merged back into ConnectorSpecification.connectionSpecification. For each field, a special annotation path_in_connector_config can be specified to determine where to merge it. Example: complete_oauth_server_output_specification={ client_id: { type: string, path_in_connector_config: ['credentials', 'client_id'] }, client_secret: { type: string, path_in_connector_config: ['credentials', 'client_secret'] } }

Examples: {"client_id":{"type":"string","path_in_connector_config":["credentials","client_id"]},"client_secret":{"type":"string","path_in_connector_config":["credentials","client_secret"]}}
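The path_in_connector_config annotation described above can be illustrated with a minimal Python sketch. This is not the Airbyte implementation: the function name merge_oauth_output is hypothetical, and falling back to a top-level key when the annotation is absent is an assumption made for the example.

```python
import copy


def merge_oauth_output(config, oauth_output, output_spec):
    """Return a copy of config with OAuth output values placed at their annotated paths."""
    merged = copy.deepcopy(config)
    for field, field_spec in output_spec.items():
        if field not in oauth_output:
            continue
        # Assumption: without an annotation, the field is merged at the top level.
        path = field_spec.get("path_in_connector_config", [field])
        target = merged
        for key in path[:-1]:
            # Create intermediate objects along the path as needed.
            target = target.setdefault(key, {})
        target[path[-1]] = oauth_output[field]
    return merged


spec = {
    "refresh_token": {
        "type": "string",
        "path_in_connector_config": ["credentials", "refresh_token"],
    }
}
merged = merge_oauth_output(
    {"credentials": {"auth_type": "oauth"}},
    {"refresh_token": "abc"},
    spec,
)
```

Here the refresh token ends up under credentials.refresh_token while the existing auth_type value is left in place.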
OffsetIncrement object

Pagination strategy that returns the number of records read so far as the next page token.

type string required
Values: "OffsetIncrement"
page_size integer | string

The number of records to include in each page.

Examples: 100, "{{ config['page_size'] }}"
inject_on_first_request boolean

Whether to inject the offset (with value 0) into the first request.

Default: false
$parameters object
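As a rough illustration of offset-based pagination, the following Python sketch uses the count of records read so far as the next offset. The fetch_page callable and the stop condition (a page shorter than page_size ends the loop) are assumptions made for the example, not behavior confirmed by this schema.

```python
def read_with_offset_increment(fetch_page, page_size):
    """Read all records, using the number of records read so far as the next offset."""
    offset = 0
    records = []
    offsets_sent = []
    while True:
        offsets_sent.append(offset)
        page = fetch_page(offset, page_size)  # hypothetical callable: (offset, limit) -> list
        records.extend(page)
        # Assumed stop condition: a short (or empty) page means there is no more data.
        if len(page) < page_size:
            break
        offset += len(page)
    return records, offsets_sent


data = list(range(250))
records, offsets_sent = read_with_offset_increment(
    lambda offset, limit: data[offset:offset + limit], page_size=100
)
```

With 250 records and a page size of 100, the offsets sent are 0, 100 and 200.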
PageIncrement object

Pagination strategy that returns the number of pages read so far as the next page token.

type string required
Values: "PageIncrement"
page_size integer | string

The number of records to include in each page.

Examples: 100, "100", "{{ config['page_size'] }}"
start_from_page integer

Index of the first page to request.

Default: 0
Examples: 0, 1
inject_on_first_request boolean

Whether to inject the page number (the value defined by start_from_page) into the first request.

Default: false
$parameters object
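Page-number pagination can be sketched the same way. In this illustrative Python example the fetch_page callable and the stop condition (a page shorter than page_size) are assumptions; only page_size and start_from_page come from the schema above.

```python
def read_with_page_increment(fetch_page, page_size, start_from_page=0):
    """Read all records by requesting consecutive page numbers."""
    page_number = start_from_page
    records = []
    pages_requested = []
    while True:
        pages_requested.append(page_number)
        page = fetch_page(page_number, page_size)  # hypothetical callable
        records.extend(page)
        # Assumed stop condition: a short (or empty) page ends pagination.
        if len(page) < page_size:
            break
        page_number += 1
    return records, pages_requested


data = ["a", "b", "c", "d", "e"]


def fake_api(page_number, page_size):
    # This fake API numbers its pages starting at 1.
    start = (page_number - 1) * page_size
    return data[start:start + page_size]


records, pages_requested = read_with_page_increment(fake_api, page_size=2, start_from_page=1)
```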
ParentStreamConfig object

Describes how to construct partitions from the records retrieved from the parent stream.

type string required
Values: "ParentStreamConfig"

Reference to the parent stream.

parent_key string required

The primary key of records from the parent stream that will be used during the retrieval of records for the current substream. This parent identifier field is typically a characteristic of the child records being extracted from the source API.

Examples: "id", "{{ config['parent_record_id'] }}"
partition_field string required

While iterating over parent records during a sync, the parent_key value can be referenced by using this field.

Examples: "parent_id", "{{ config['parent_partition_field'] }}"
request_option object

Specifies the key field or path and where in the request a component's value should be injected.

4 nested properties
type string required
Values: "RequestOption"
inject_into enum required

Configures where the descriptor should be set on the HTTP requests. Note that request parameters that are already encoded in the URL path will not be duplicated.

Values: "request_parameter" "header" "body_data" "body_json"
Examples: "request_parameter", "header", "body_data", "body_json"
field_name string

Configures which key should be used in the location that the descriptor is being injected into. We hope to eventually deprecate this field in favor of field_path for all request_options, but must currently maintain it for backwards compatibility in the Builder.

Examples: "segment_id"
field_path string[]

Configures a path to be used for nested structures in JSON body requests (e.g. GraphQL queries)

Examples: ["data","viewer","id"]
incremental_dependency boolean

Indicates whether the parent stream should be read incrementally based on updates in the child stream.

Default: false
lazy_read_pointer string[]

If set, this will enable lazy reading, using the initial read of parent records to extract child records.

Default: []
extra_fields string[][]

Array of field paths to include as additional fields in the stream slice. Each path is an array of strings representing keys to access fields in the respective parent record. Accessible via stream_slice.extra_fields. Missing fields are set to None.

$parameters object
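The relationship between parent_key, partition_field and extra_fields can be sketched in Python as follows. The helper names, the shape of the returned slice, and joining each extra field path with a dot to form its key are all assumptions made for illustration; only the rule that missing fields become None is taken from the description above.

```python
def get_path(record, path):
    """Follow a list of keys into a nested dict, returning None if any key is missing."""
    node = record
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def build_partitions(parent_records, parent_key, partition_field, extra_fields=()):
    """Build one partition per parent record that carries the parent key."""
    partitions = []
    for record in parent_records:
        if parent_key not in record:
            continue
        partitions.append({
            # The parent_key value is exposed under the partition_field name.
            "partition": {partition_field: record[parent_key]},
            # Hypothetical key naming: path segments joined with ".".
            "extra_fields": {".".join(path): get_path(record, path) for path in extra_fields},
        })
    return partitions


parents = [{"id": 1, "owner": {"name": "Ada"}}, {"id": 2}]
partitions = build_partitions(parents, "id", "parent_id", [["owner", "name"]])
```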
PrimaryKey string | string[] | string[][]

The stream field to be used to distinguish unique records. Can either be a single field, an array of fields representing a composite key, or an array of arrays representing a composite key where the fields are nested fields.

Examples:
  • "id"
  • [ "code", "type" ]
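The three accepted shapes of a primary key can be normalized into a list of field paths. The Python sketch below is illustrative only; the function name primary_key_value and the choice to return a tuple are assumptions.

```python
def primary_key_value(record, primary_key):
    """Return the key value(s) of a record as a tuple, for any of the three key shapes."""
    if isinstance(primary_key, str):
        paths = [[primary_key]]  # single field
    else:
        # A list of strings is a composite key; a list of lists addresses nested fields.
        paths = [[part] if isinstance(part, str) else list(part) for part in primary_key]
    values = []
    for path in paths:
        node = record
        for key in path:
            node = node[key]
        values.append(node)
    return tuple(values)


record = {"id": 7, "code": "a", "type": "b", "meta": {"region": "eu"}}
single = primary_key_value(record, "id")
composite = primary_key_value(record, ["code", "type"])
nested = primary_key_value(record, [["meta", "region"], ["id"]])
```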
PropertiesFromEndpoint object

Defines the behavior for fetching the list of properties from an API that will be loaded into the requests to extract records.

type string required
Values: "PropertiesFromEndpoint"
property_field_path string[] required

Describes the path to the field that should be extracted

Examples: ["name"]
retriever SimpleRetriever | CustomRetriever required

Requester component that describes how to fetch the properties to query from a remote API endpoint.

$parameters object
PropertyChunking object

For APIs that restrict the number of properties that can be requested per request, property chunking can be applied to make multiple requests, each with a subset of the properties.

type string required
Values: "PropertyChunking"
property_limit_type enum required

The type used to determine the maximum number of properties per chunk

Values: "characters" "property_count"
property_limit integer

The maximum number of properties that can be retrieved per request according to the limit type.

record_merge_strategy object

Record merge strategy that combines records according to fields on the record.

3 nested properties
type string required
Values: "GroupByKeyMergeStrategy"
key string | string[] required

The name of the field on the record whose value will be used to group properties that were retrieved through multiple API requests.

Examples: "id", ["parent_id","end_date"]
$parameters object
$parameters object
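A minimal Python sketch of property chunking followed by a group-by-key merge is shown below. It is an illustration under stated assumptions: for the "characters" limit type only the lengths of the property names are counted (delimiters are ignored), and the function names are hypothetical.

```python
def chunk_properties(properties, limit_type, limit):
    """Split property names into chunks that respect the configured limit."""
    chunks, current, size = [], [], 0
    for prop in properties:
        # Assumption: "characters" counts name lengths only; otherwise count properties.
        cost = len(prop) if limit_type == "characters" else 1
        if current and size + cost > limit:
            chunks.append(current)
            current, size = [], 0
        current.append(prop)
        size += cost
    if current:
        chunks.append(current)
    return chunks


def merge_by_key(partial_records, key):
    """Combine partial records from several requests that share the same key value(s)."""
    keys = [key] if isinstance(key, str) else list(key)
    merged = {}
    for record in partial_records:
        group = tuple(record[k] for k in keys)
        merged.setdefault(group, {}).update(record)
    return list(merged.values())


by_count = chunk_properties(["id", "name", "email", "phone"], "property_count", 2)
by_chars = chunk_properties(["id", "name", "email", "phone"], "characters", 8)
merged = merge_by_key(
    [{"id": 1, "name": "a"}, {"id": 1, "email": "e"}, {"id": 2, "name": "b"}], "id"
)
```

In practice the key field would have to be present in every chunk for the merge to work, which is consistent with the purpose of always_include_properties on QueryProperties.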
QueryProperties object

For APIs that require explicit specification of the properties to query for, this component specifies which property fields and how they are supplied to outbound requests.

type string required
Values: "QueryProperties"
property_list string[] | PropertiesFromEndpoint required

The set of properties that will be queried for in the outbound request. This can either be statically defined or dynamic based on an API endpoint

always_include_properties string[]

The list of properties that should be included in every set of properties when multiple chunks of properties are being requested.

property_chunking object

For APIs that restrict the number of properties that can be requested per request, property chunking can be applied to make multiple requests, each with a subset of the properties.

5 nested properties
type string required
Values: "PropertyChunking"
property_limit_type enum required

The type used to determine the maximum number of properties per chunk

Values: "characters" "property_count"
property_limit integer

The maximum number of properties that can be retrieved per request according to the limit type.

record_merge_strategy object

Record merge strategy that combines records according to fields on the record.

3 nested properties
type string required
Values: "GroupByKeyMergeStrategy"
key string | string[] required

The name of the field on the record whose value will be used to group properties that were retrieved through multiple API requests.

Examples: "id", ["parent_id","end_date"]
$parameters object
$parameters object
$parameters object
RecordFilter object

Filter applied on a list of records.

type string required
Values: "RecordFilter"
condition string

The predicate to filter a record. Records will be removed if evaluated to False.

Default: ""
Examples: "{{ record['created_at'] >= stream_interval['start_time'] }}", "{{ record.status in ['active', 'expired'] }}"
$parameters object
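The filtering rule (records are dropped when the condition evaluates to False) can be sketched in Python. In the manifest the condition is an interpolated string; here a plain Python callable stands in for it, which is a simplification made for the example.

```python
def filter_records(records, predicate):
    """Keep only the records for which the predicate is truthy."""
    return [record for record in records if predicate(record)]


records = [
    {"id": 1, "status": "active"},
    {"id": 2, "status": "deleted"},
    {"id": 3, "status": "expired"},
]
# Python stand-in for the condition "{{ record.status in ['active', 'expired'] }}".
kept = filter_records(records, lambda record: record["status"] in ["active", "expired"])
```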
RecordSelector object

Responsible for translating an HTTP response into a list of records by extracting records from the response and optionally filtering records based on a heuristic.

type string required
Values: "RecordSelector"
extractor DpathExtractor | CustomRecordExtractor required

Responsible for filtering records to be emitted by the Source.

Responsible for normalization according to the schema.

transform_before_filtering boolean

If true, transformation will be applied before record filtering.

$parameters object
SchemaNormalization string

Responsible for normalization according to the schema.

Examples:
  • "Default"
  • "None"
RemoveFields object

A transformation which removes fields from a record. The fields removed are designated using FieldPointers. During transformation, if a field or any of its parents does not exist in the record, no error is thrown.

type string required
Values: "RemoveFields"
field_pointers string[][] required

Array of paths defining the fields to remove. Each item is an array whose elements describe the path of a field to remove.

Examples: ["tags"], [["content","html"],["content","plain_text"]]
condition string

The predicate to filter a property by a property value. The property will be removed if it is empty OR the expression evaluates to True.

Default: ""
Examples: "{{ property|string == '' }}", "{{ property is integer }}", "{{ property|length > 5 }}", "{{ property == 'some_string_to_match' }}"
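Removing fields by pointer, without raising when a field or one of its parents is missing, can be sketched as follows. This Python example is illustrative only; it ignores the optional condition property, and the function name is hypothetical.

```python
import copy


def remove_fields(record, field_pointers):
    """Return a copy of record without the fields addressed by field_pointers."""
    result = copy.deepcopy(record)
    for pointer in field_pointers:
        node = result
        for key in pointer[:-1]:
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                break  # a parent is missing: nothing to remove, no error
        else:
            if isinstance(node, dict):
                node.pop(pointer[-1], None)  # missing leaf is ignored as well
    return result


record = {"id": 1, "tags": ["x"], "content": {"html": "<p>hi</p>", "plain_text": "hi"}}
cleaned = remove_fields(record, [["tags"], ["content", "html"], ["missing", "field"]])
```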
RequestPath object

Specifies where in the request path a component's value should be inserted.

type string required
Values: "RequestPath"
RequestOption object

Specifies the key field or path and where in the request a component's value should be injected.

type string required
Values: "RequestOption"
inject_into enum required

Configures where the descriptor should be set on the HTTP requests. Note that request parameters that are already encoded in the URL path will not be duplicated.

Values: "request_parameter" "header" "body_data" "body_json"
Examples: "request_parameter", "header", "body_data", "body_json"
field_name string

Configures which key should be used in the location that the descriptor is being injected into. We hope to eventually deprecate this field in favor of field_path for all request_options, but must currently maintain it for backwards compatibility in the Builder.

Examples: "segment_id"
field_path string[]

Configures a path to be used for nested structures in JSON body requests (e.g. GraphQL queries)

Examples: ["data","viewer","id"]
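How inject_into, field_name and field_path interact can be sketched in Python. The request is modeled as a plain dict whose keys (params, headers, data, json) are an assumption borrowed from common HTTP client conventions, and the function name inject is hypothetical.

```python
# Assumed mapping from inject_into values to sections of a request dict.
SECTION_BY_TARGET = {
    "request_parameter": "params",
    "header": "headers",
    "body_data": "data",
    "body_json": "json",
}


def inject(request, option, value):
    """Place value into the request section named by the option."""
    target = request.setdefault(SECTION_BY_TARGET[option["inject_into"]], {})
    # field_path addresses a nested location; field_name is a single key.
    path = option.get("field_path") or [option["field_name"]]
    for key in path[:-1]:
        target = target.setdefault(key, {})
    target[path[-1]] = value
    return request


with_header = inject({}, {"inject_into": "header", "field_name": "segment_id"}, "42")
with_body = inject({}, {"inject_into": "body_json", "field_path": ["data", "viewer", "id"]}, 7)
```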
Schemas object

The stream schemas representing the shape of the data emitted by the stream.

LegacySessionTokenAuthenticator object

Deprecated - use SessionTokenAuthenticator instead. Authenticator for requests authenticated using session tokens. A session token is a random value generated by a server to identify a specific user for the duration of one interaction session.

type string required
Values: "LegacySessionTokenAuthenticator"
header string required

The name of the session token header that will be injected in the request

Examples: "X-Session"
login_url string required

Path of the login URL (do not include the base URL)

Examples: "session"
session_token_response_key string required

Name of the key of the session token to be extracted from the response

Examples: "id"
validate_session_url string required

Path of the URL to use to validate that the session token is valid (do not include the base URL)

Examples: "user/current"
session_token string

Session token to use if using a pre-defined token. Not needed if authenticating with username + password pair

username string

Username used to authenticate and obtain a session token

Examples: "{{ config['username'] }}"
password string

Password used to authenticate and obtain a session token

Default: ""
Examples: "{{ config['password'] }}", ""
$parameters object
StateDelegatingStream object

(This component is experimental. Use at your own risk.) Orchestrate the retriever's usage based on the state value.

type string required
Values: "StateDelegatingStream"
name string required

The stream name.

Default: ""
full_refresh_stream object required

A stream whose behavior is described by a set of declarative low code components.

10 nested properties
type string required
Values: "DeclarativeStream"

Component used to coordinate how records are extracted across stream slices and request pages.

name string

The stream name.

Default: ""

Component used to fetch data incrementally based on a time field in the data.

primary_key string | string[] | string[][]

The stream field to be used to distinguish unique records. Can either be a single field, an array of fields representing a composite key, or an array of arrays representing a composite key where the fields are nested fields.

Default: ""
Examples: "id", ["code","type"]
schema_loader InlineSchemaLoader | DynamicSchemaLoader | JsonFileSchemaLoader | InlineSchemaLoader | DynamicSchemaLoader | JsonFileSchemaLoader | CustomSchemaLoader[] | CustomSchemaLoader

One or many schema loaders can be used to retrieve the schema for the current stream. When multiple schema loaders are defined, schema properties will be merged together. Schema loaders defined first take precedence in the event of a conflict.

transformations AddFields | RemoveFields | KeysToLower | KeysToSnakeCase | FlattenFields | DpathFlattenFields | KeysReplace | CustomTransformation[]

A list of transformations to be applied to each output record.

state_migrations LegacyToPerPartitionStateMigration | CustomStateMigration[]

Array of state migrations to be applied on the input state

Default: []
file_uploader object

(experimental) Describes how to fetch a file

6 nested properties
type string required
Values: "FileUploader"
requester HttpRequester | CustomRequester required

Requester component that describes how to prepare HTTP requests to send to the source API.

download_target_extractor DpathExtractor | CustomRecordExtractor required

Responsible for fetching the URL where the file is located. This is applied to each record and not to the HTTP response.

Responsible for fetching the content of the file. If not defined, the assumption is that the whole response body is the file content

filename_extractor string

Defines the name to store the file. Stream name is automatically added to the file path. File unique ID can be used to avoid overwriting files. Random UUID will be used if the extractor is not provided.

Examples: "{{ record.id }}/{{ record.file_name }}/", "{{ record.id }}_{{ record.file_name }}/"
$parameters object
$parameters object
incremental_stream object required

A stream whose behavior is described by a set of declarative low code components.

10 nested properties
type string required
Values: "DeclarativeStream"

Component used to coordinate how records are extracted across stream slices and request pages.

name string

The stream name.

Default: ""

Component used to fetch data incrementally based on a time field in the data.

primary_key string | string[] | string[][]

The stream field to be used to distinguish unique records. Can either be a single field, an array of fields representing a composite key, or an array of arrays representing a composite key where the fields are nested fields.

Default: ""
Examples: "id", ["code","type"]
schema_loader InlineSchemaLoader | DynamicSchemaLoader | JsonFileSchemaLoader | InlineSchemaLoader | DynamicSchemaLoader | JsonFileSchemaLoader | CustomSchemaLoader[] | CustomSchemaLoader

One or many schema loaders can be used to retrieve the schema for the current stream. When multiple schema loaders are defined, schema properties will be merged together. Schema loaders defined first take precedence in the event of a conflict.

transformations AddFields | RemoveFields | KeysToLower | KeysToSnakeCase | FlattenFields | DpathFlattenFields | KeysReplace | CustomTransformation[]

A list of transformations to be applied to each output record.

state_migrations LegacyToPerPartitionStateMigration | CustomStateMigration[]

Array of state migrations to be applied on the input state

Default: []
file_uploader object

(experimental) Describes how to fetch a file

6 nested properties
type string required
Values: "FileUploader"
requester HttpRequester | CustomRequester required

Requester component that describes how to prepare HTTP requests to send to the source API.

download_target_extractor DpathExtractor | CustomRecordExtractor required

Responsible for fetching the URL where the file is located. This is applied to each record and not to the HTTP response.

Responsible for fetching the content of the file. If not defined, the assumption is that the whole response body is the file content

filename_extractor string

Defines the name to store the file. Stream name is automatically added to the file path. File unique ID can be used to avoid overwriting files. Random UUID will be used if the extractor is not provided.

Examples: "{{ record.id }}/{{ record.file_name }}/", "{{ record.id }}_{{ record.file_name }}/"
$parameters object
$parameters object
$parameters object
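One plausible reading of "based on the state value" is that the full refresh stream is used when no state exists and the incremental stream is used once state is available. That reading is an assumption, not something this schema text states; the Python sketch below only illustrates the delegation idea, and the function name is hypothetical.

```python
def select_stream(definition, stream_state):
    """Pick which nested stream definition to use for this sync."""
    # Assumption: empty or missing state means no previous sync has run.
    if stream_state:
        return definition["incremental_stream"]
    return definition["full_refresh_stream"]


definition = {
    "type": "StateDelegatingStream",
    "name": "orders",
    "full_refresh_stream": {"type": "DeclarativeStream", "name": "orders_full"},
    "incremental_stream": {"type": "DeclarativeStream", "name": "orders_incremental"},
}
first_sync = select_stream(definition, {})
later_sync = select_stream(definition, {"updated_at": "2024-01-01"})
```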
SimpleRetriever object

Retrieves records by synchronously sending requests to fetch records. The retriever acts as an orchestrator between the requester, the record selector, the paginator, and the partition router.

type string required
Values: "SimpleRetriever"
requester HttpRequester | CustomRequester required

Requester component that describes how to prepare HTTP requests to send to the source API.

record_selector object required

Responsible for translating an HTTP response into a list of records by extracting records from the response and optionally filtering records based on a heuristic.

6 nested properties
type string required
Values: "RecordSelector"
extractor DpathExtractor | CustomRecordExtractor required

Responsible for filtering records to be emitted by the Source.

Responsible for normalization according to the schema.

transform_before_filtering boolean

If true, transformation will be applied before record filtering.

$parameters object

Component decoding the response so records can be extracted.

Paginator component that describes how to navigate through the API's pages.

ignore_stream_slicer_parameters_on_paginated_requests boolean

If true, the partition router and incremental request options will be ignored when paginating requests. Request options set directly on the requester will not be ignored.

Default: false
partition_router SubstreamPartitionRouter | ListPartitionRouter | GroupingPartitionRouter | CustomPartitionRouter | SubstreamPartitionRouter | ListPartitionRouter | GroupingPartitionRouter | CustomPartitionRouter[]

Used to iteratively execute requests over a set of values, such as a parent stream's records or a list of constant values.

$parameters object
GzipDecoder object

Select 'gzip' for response data that is compressed with gzip. Requires specifying an inner data type/decoder to parse the decompressed data.

type string required
Values: "GzipDecoder"
CsvDecoder object

Select 'CSV' for response data that is formatted as CSV (comma-separated values). Can specify an encoding (default: 'utf-8') and a delimiter (default: ',').

type string required
Values: "CsvDecoder"
encoding string
Default: "utf-8"
delimiter string
Default: ","
set_values_to_none string[]
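The combination of a gzip decoder wrapping an inner CSV decoder can be sketched with the Python standard library. This is an illustration, not the CDK implementation: the function names are hypothetical and the set_values_to_none property is not modeled.

```python
import csv
import gzip
import io


def decode_csv(body, encoding="utf-8", delimiter=","):
    """Parse a CSV response body (bytes) into a list of row dicts."""
    text = body.decode(encoding)
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def decode_gzip_csv(body, encoding="utf-8", delimiter=","):
    """Decompress a gzip body, then hand the result to the inner CSV decoder."""
    return decode_csv(gzip.decompress(body), encoding=encoding, delimiter=delimiter)


raw = "id;name\n1;Ada\n2;Grace\n".encode("utf-8")
rows = decode_csv(raw, delimiter=";")
rows_from_gzip = decode_gzip_csv(gzip.compress(raw), delimiter=";")
```

All values come back as strings because CSV carries no type information.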
AsyncJobStatusMap object

Maps the API job status to the Async Job Status.

running string[] required
completed string[] required
failed string[] required
timeout string[] required
type string
Values: "AsyncJobStatusMap"
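A status map of this shape is naturally inverted into a lookup from API status to the four categories. The Python sketch below is illustrative; the API status strings and the choice to reject a status listed under two categories are assumptions.

```python
def build_status_lookup(status_map):
    """Invert the map so each API status points at its category."""
    lookup = {}
    for category in ("running", "completed", "failed", "timeout"):
        for api_status in status_map[category]:
            if api_status in lookup:
                # Assumption: an API status may belong to only one category.
                raise ValueError(f"status {api_status!r} is mapped twice")
            lookup[api_status] = category
    return lookup


status_map = {
    "type": "AsyncJobStatusMap",
    "running": ["queued", "in_progress"],
    "completed": ["done"],
    "failed": ["error"],
    "timeout": ["expired"],
}
lookup = build_status_lookup(status_map)
```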
AsyncRetriever object

Retrieves records by asynchronously sending requests to fetch records. The retriever acts as an orchestrator between the requester, the record selector, the paginator, and the partition router.

type string required
Values: "AsyncRetriever"
record_selector object required

Responsible for translating an HTTP response into a list of records by extracting records from the response and optionally filtering records based on a heuristic.

6 nested properties
type string required
Values: "RecordSelector"
extractor DpathExtractor | CustomRecordExtractor required

Responsible for filtering records to be emitted by the Source.

Responsible for normalization according to the schema.

transform_before_filtering boolean

If true, transformation will be applied before record filtering.

$parameters object
status_mapping AsyncJobStatusMap required

Async Job Status to Airbyte CDK Async Job Status mapping.

status_extractor DpathExtractor | CustomRecordExtractor required

Responsible for fetching the actual status of the async job.

download_target_extractor DpathExtractor | CustomRecordExtractor required

Responsible for fetching the final result urls provided by the completed / finished / ready async job.

creation_requester HttpRequester | CustomRequester required

Requester component that describes how to prepare HTTP requests to send to the source API to create the async server-side job.

polling_requester HttpRequester | CustomRequester required

Requester component that describes how to prepare HTTP requests to send to the source API to fetch the status of the running async job.

download_requester HttpRequester | CustomRequester required

Requester component that describes how to prepare HTTP requests to send to the source API to download the data provided by the completed async job.

Responsible for fetching the records from provided urls.

polling_job_timeout integer | string

The time in minutes after which the single Async Job should be considered as Timed Out.

download_target_requester HttpRequester | CustomRequester

Requester component that describes how to prepare HTTP requests to send to the source API to extract the url from polling response by the completed async job.

download_paginator DefaultPaginator | NoPagination

Paginator component that describes how to navigate through the API's pages during download.

abort_requester HttpRequester | CustomRequester

Requester component that describes how to prepare HTTP requests to send to the source API to abort a job once it is timed out from the source's perspective.

delete_requester HttpRequester | CustomRequester

Requester component that describes how to prepare HTTP requests to send to the source API to delete a job once the records are extracted.

partition_router ListPartitionRouter | SubstreamPartitionRouter | GroupingPartitionRouter | CustomPartitionRouter | ListPartitionRouter | SubstreamPartitionRouter | GroupingPartitionRouter | CustomPartitionRouter[]

PartitionRouter component that describes how to partition the stream, enabling incremental syncs and checkpointing.

Default: []

Component decoding the response so records can be extracted.

Component decoding the download response so records can be extracted.

$parameters object
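The create, poll, download sequence implied by the requesters above can be sketched in Python. Everything here is a simplification under stated assumptions: the three callables stand in for creation_requester, polling_requester and download_requester, the status lookup stands in for status_mapping, and max_polls is a hypothetical stand-in for polling_job_timeout (which is expressed in minutes).

```python
def run_async_job(create, poll, download, status_lookup, max_polls=10):
    """Create a job, poll until it completes, then download its records."""
    job_id = create()
    for _ in range(max_polls):
        status = status_lookup.get(poll(job_id))
        if status == "completed":
            return list(download(job_id))
        if status in ("failed", "timeout"):
            raise RuntimeError(f"job {job_id} ended with status {status}")
        # "running" or an unmapped status: keep polling.
    raise TimeoutError(f"job {job_id} did not complete after {max_polls} polls")


api_statuses = iter(["queued", "in_progress", "done"])
records = run_async_job(
    create=lambda: "job-1",
    poll=lambda job_id: next(api_statuses),
    download=lambda job_id: [{"id": 1}, {"id": 2}],
    status_lookup={"queued": "running", "in_progress": "running", "done": "completed"},
)
```

A real retriever would also wait between polls and could call the abort and delete requesters; those steps are left out to keep the sketch short.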
Spec object

A source specification made up of connector metadata and how it can be configured.

type string required
Values: "Spec"
connection_specification object required

A connection specification describing how the connector can be configured.

documentation_url string

URL of the connector's documentation page.

Examples: "https://docs.airbyte.com/integrations/sources/dremio"
advanced_auth object

Additional and optional specification object to describe what an 'advanced' Auth flow would need to function.

  • A connector should be able to fully function with the configuration as described by the ConnectorSpecification in a 'basic' mode.
  • The 'advanced' mode provides easier UX for the user with UI improvements and automations. However, this requires further setup on the server side by instance or workspace admins beforehand. The trade-off is that the user does not have to provide as many technical inputs anymore and the auth process is faster and easier to complete.
4 nested properties
auth_flow_type string

The type of auth to use

Values: "oauth2.0" "oauth1.0"
predicate_key string[]

JSON path to a field in the connectorSpecification that should exist for the advanced auth to be applicable.

Examples: ["credentials","auth_type"]
predicate_value string

Value of the predicate_key fields for the advanced auth to be applicable.

Examples: "Oauth"
oauth_config_specification object

Specification describing how an 'advanced' Auth flow would need to function.

5 nested properties
oauth_user_input_from_connector_config_specification object

OAuth specific blob. This is a JSON Schema used to validate JSON configurations used as input to OAuth. Must be valid non-nested JSON that refers to properties from ConnectorSpecification.connectionSpecification using the special annotation 'path_in_connector_config'. These are input values the user enters through the UI to authenticate to the connector, and they might also be shared as inputs for syncing data via the connector. Examples: if no connector values are shared during the OAuth flow, oauth_user_input_from_connector_config_specification=[]; if connector values such as 'app_id' at the top level are used to generate the API URL for the OAuth flow, oauth_user_input_from_connector_config_specification={ app_id: { type: string, path_in_connector_config: ['app_id'] } }; if connector values such as 'info.app_id' nested inside another object are used to generate the API URL for the OAuth flow, oauth_user_input_from_connector_config_specification={ app_id: { type: string, path_in_connector_config: ['info', 'app_id'] } }

Examples: {"app_id":{"type":"string","path_in_connector_config":["app_id"]}}, {"app_id":{"type":"string","path_in_connector_config":["info","app_id"]}}
oauth_connector_input_specification object

The DeclarativeOAuth specific blob. Pertains to the fields defined by the connector relating to the OAuth flow.

Interpolation capabilities:

  • Variable placeholders are declared as {{my_var}}.

  • Nested variable resolution such as {{ {{my_nested_var}} }} is also allowed.

  • The allowed interpolation context is:

    • base64Encoder - encode to base64, {{ {{my_var_a}}:{{my_var_b}} | base64Encoder }}
    • base64Decoder - decode from base64 encoded string, {{ {{my_string_variable_or_string_value}} | base64Decoder }}
    • urlEncoder - encode the input string to URL-like format, {{ https://test.host.com/endpoint | urlEncoder}}
    • urlDecoder - decode the input url-encoded string into text format, {{ urlDecoder:https%3A%2F%2Fairbyte.io | urlDecoder}}
    • codeChallengeS256 - get the codeChallenge encoded value to provide additional data-provider specific authorisation values, {{ {{state_value}} | codeChallengeS256 }}
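What the interpolation filters listed above compute can be approximated with the Python standard library. These functions are stand-ins, not the Airbyte implementation. In particular, treating codeChallengeS256 as the PKCE S256 transform (SHA-256 digest, URL-safe base64, padding removed) is an assumption based on the filter's name.

```python
import base64
import hashlib
import urllib.parse


def base64_encoder(value):
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


def base64_decoder(value):
    return base64.b64decode(value).decode("utf-8")


def url_encoder(value):
    # safe="" also percent-encodes "/" and ":".
    return urllib.parse.quote(value, safe="")


def url_decoder(value):
    return urllib.parse.unquote(value)


def code_challenge_s256(value):
    # Assumed to follow the PKCE S256 method: base64url(sha256(value)) without padding.
    digest = hashlib.sha256(value.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


basic_credentials = base64_encoder("id:secret")
encoded_url = url_encoder("https://airbyte.io")
challenge = code_challenge_s256("some-state-value")
```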

Examples:

complete_oauth_output_specification object

OAuth specific blob. This is a JSON Schema used to validate JSON configurations produced by the OAuth flows as they are returned by the remote OAuth APIs. Must be valid JSON describing the fields to merge back into ConnectorSpecification.connectionSpecification. For each field, a special annotation path_in_connector_config can be specified to determine where to merge it. Example: complete_oauth_output_specification={ refresh_token: { type: string, path_in_connector_config: ['credentials', 'refresh_token'] } }

Examples: {"refresh_token":{"type":"string","path_in_connector_config":["credentials","refresh_token"]}}
complete_oauth_server_input_specification object

OAuth specific blob. This is a Json Schema used to validate Json configurations persisted as Airbyte Server configurations. Must be a valid non-nested JSON describing additional fields configured by the Airbyte Instance or Workspace Admins to be used by the server when completing an OAuth flow (typically exchanging an auth code for refresh token). Examples: complete_oauth_server_input_specification={ client_id: { type: string }, client_secret: { type: string } }

Examples: {"client_id":{"type":"string"},"client_secret":{"type":"string"}}
complete_oauth_server_output_specification object

OAuth specific blob. This is a JSON Schema used to validate JSON configurations persisted as Airbyte Server configurations that also need to be merged back into the connector configuration at runtime. It is a subset of complete_oauth_server_input_specification that retains only the fields the connector needs in order to function with OAuth. Some fields are used during OAuth flows but are not needed afterwards, so they are listed in complete_oauth_server_input_specification but not in complete_oauth_server_output_specification. Must be valid non-nested JSON describing additional fields configured by the Airbyte Instance or Workspace Admins to be used by the connector when using OAuth flow APIs. These fields are merged back into ConnectorSpecification.connectionSpecification. For each field, a special annotation path_in_connector_config can be specified to determine where to merge it. Example: complete_oauth_server_output_specification={ client_id: { type: string, path_in_connector_config: ['credentials', 'client_id'] }, client_secret: { type: string, path_in_connector_config: ['credentials', 'client_secret'] } }

Examples: {"client_id":{"type":"string","path_in_connector_config":["credentials","client_id"]},"client_secret":{"type":"string","path_in_connector_config":["credentials","client_secret"]}}
config_normalization_rules object
4 nested properties
type string required
Values: "ConfigNormalizationRules"
config_migrations ConfigMigration[]

The discrete migrations that will be applied on the incoming config. Each migration will be applied in the order they are defined.

Default: []
transformations ConfigRemapField | ConfigAddFields | ConfigRemoveFields | CustomConfigTransformation[]

The list of transformations that will be applied on the incoming config at the start of each sync. The transformations will be applied in the order they are defined.

Default: []
validations DpathValidator | PredicateValidator[]

The list of validations that will be performed on the incoming config at the start of each sync.

Default: []
ConfigMigration object

A config migration that will be applied on the incoming config at the start of a sync.

type string required
Values: "ConfigMigration"
transformations ConfigRemapField | ConfigAddFields | ConfigRemoveFields | CustomConfigTransformation[] required

The list of transformations that will attempt to be applied on an incoming unmigrated config. The transformations will be applied in the order they are defined.

Default: []
description string

The description/purpose of the config migration.

SubstreamPartitionRouter object

Partition router that is used to retrieve records that have been partitioned according to records from the specified parent streams. An example of a parent stream is automobile brands, where the substream would be the various car models associated with each brand.

type string required
Values: "SubstreamPartitionRouter"
parent_stream_configs ParentStreamConfig[] required

Specifies which parent streams are being iterated over and how parent records should be used to partition the child stream data set.

$parameters object
ValueType string

A schema type.

WaitTimeFromHeader object

Extract wait time from an HTTP header in the response.

type string required
Values: "WaitTimeFromHeader"
header string required

The name of the response header defining how long to wait before retrying.

Examples: "Retry-After"
regex string

Optional regex to apply on the header to extract its value. The regex should define a capture group defining the wait time.

Examples: "([-+]?\d+)"
max_waiting_time_in_seconds number

If the value extracted from the header is greater than this value, stop the stream.

Examples: 3600
$parameters object
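
A rough Python sketch of how `header`, `regex`, and `max_waiting_time_in_seconds` might interact is shown here. It is not the CDK's code. The function name, the use of a plain dictionary for headers (real HTTP header lookup is case-insensitive), and raising `RuntimeError` to represent "stop the stream" are all assumptions made for illustration.

```python
import re
from typing import Mapping, Optional


def wait_time_from_header(
    headers: Mapping[str, str],
    header: str,
    regex: Optional[str] = None,
    max_waiting_time_in_seconds: Optional[float] = None,
) -> Optional[float]:
    """Return the wait time in seconds, or None if it cannot be determined."""
    raw = headers.get(header)
    if raw is None:
        return None
    if regex is not None:
        match = re.search(regex, raw)
        if match is None:
            return None
        # Prefer the first capture group; fall back to the whole match.
        raw = match.group(1) if match.groups() else match.group(0)
    try:
        wait = float(raw)
    except ValueError:
        return None
    if max_waiting_time_in_seconds is not None and wait > max_waiting_time_in_seconds:
        # Assumption: an exception stands in for stopping the stream.
        raise RuntimeError(f"wait time {wait}s exceeds the configured maximum")
    return wait


print(wait_time_from_header({"Retry-After": "120"}, "Retry-After", r"([-+]?\d+)"))
```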
GroupingPartitionRouter object

A decorator on top of a partition router that groups partitions into batches of a specified size. This is useful for APIs that support filtering by multiple partition keys in a single request. Note that per-partition incremental syncs may not work as expected because the grouping of partitions might change between syncs, potentially leading to inconsistent state tracking.

type string required
Values: "GroupingPartitionRouter"
group_size integer required

The number of partitions to include in each group. This determines how many partition values are batched together in a single slice.

Examples: 10, 50
underlying_partition_router ListPartitionRouter | SubstreamPartitionRouter | CustomPartitionRouter required

The partition router whose output will be grouped. This can be any valid partition router component.

deduplicate boolean

If true, ensures that partitions are unique within each group by removing duplicates based on the partition key.

Default: true
$parameters object
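
The batching behaviour can be illustrated with a small Python generator. This is a sketch under stated assumptions, not the CDK implementation: `group_partitions` is a hypothetical name, partitions are reduced to hashable keys, and duplicates are dropped across the whole run (which also guarantees uniqueness within each group).

```python
from typing import Hashable, Iterable, Iterator, List


def group_partitions(
    partitions: Iterable[Hashable], group_size: int, deduplicate: bool = True
) -> Iterator[List[Hashable]]:
    """Yield lists of at most group_size partition keys."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    seen = set()
    batch: List[Hashable] = []
    for key in partitions:
        if deduplicate:
            if key in seen:
                continue  # assumption: duplicates are skipped globally
            seen.add(key)
        batch.append(key)
        if len(batch) == group_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly smaller, group


groups = list(group_partitions([1, 2, 2, 3, 4, 5], group_size=2))
print(groups)
```

Because group membership depends on the order and content of the underlying partitions, the same key can land in a different group on the next sync, which is why per-partition state may be inconsistent.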
WaitUntilTimeFromHeader object

Extracts the time at which the request can be retried from a response header, then waits for the difference between now and that time.

type string required
Values: "WaitUntilTimeFromHeader"
header string required

The name of the response header defining the time at which the request can be retried.

Examples: "wait_time"
min_wait number | string

Minimum time to wait before retrying.

Examples: 10, "60"
regex string

Optional regex to apply on the header to extract its value. The regex should define a capture group defining the wait time.

Examples: "([-+]?\d+)"
$parameters object
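
The "difference between now and that time" computation might look like the following Python sketch. It assumes the header carries a Unix timestamp in seconds, which this section does not state, and the function name and the `now` parameter (added so the example is deterministic) are hypothetical.

```python
import re
import time
from typing import Mapping, Optional, Union


def wait_until_time_from_header(
    headers: Mapping[str, str],
    header: str,
    min_wait: Optional[Union[float, str]] = None,
    regex: Optional[str] = None,
    now: Optional[float] = None,
) -> Optional[float]:
    """Return seconds to wait until the retry time, or None if unknown."""
    raw = headers.get(header)
    if raw is None:
        return None
    if regex is not None:
        match = re.search(regex, raw)
        if match is None:
            return None
        raw = match.group(1) if match.groups() else match.group(0)
    try:
        retry_at = float(raw)  # assumption: Unix epoch seconds
    except ValueError:
        return None
    current = time.time() if now is None else now
    wait = max(retry_at - current, 0.0)
    if min_wait is not None:
        # min_wait may be given as a number or a numeric string.
        wait = max(wait, float(min_wait))
    return wait


print(wait_until_time_from_header({"wait_time": "1030"}, "wait_time", now=1000.0))
```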
ComponentMappingDefinition object

(This component is experimental. Use at your own risk.) Specifies a mapping definition to update or add fields in a record or configuration. This allows dynamic mapping of data by interpolating values into the template based on provided contexts.

type string required
Values: "ComponentMappingDefinition"
field_path string[] required

A list of potentially nested fields indicating the full path where the value will be added or updated.

Examples: ["data"], ["data","records"], ["data",1,"name"], ["data","{{ components_values.name }}"], ["data","*","record"], ["*","**","name"]
value string required

The dynamic or static value to assign to the key. Interpolated values can be used to dynamically determine the value during runtime.

Examples: "{{ components_values['updates'] }}", "{{ components_values['MetaData']['LastUpdatedTime'] }}", "{{ config['segment_id'] }}", "{{ stream_slice['parent_id'] }}", "{{ stream_slice['extra_fields']['name'] }}"
value_type string

A schema type.

Values: "string" "number" "integer" "boolean"
create_or_update boolean

Determines whether to create a new path if it doesn't exist (true) or only update existing paths (false). When set to true, the resolver will create new paths in the stream template if they don't exist. When false (default), it will only update existing paths.

Default: false
condition string

A condition that must be met for the mapping to be applied. This property is only supported for ConfigComponentsResolver.

Examples: "{{ components_values.get('cursor_field', None) }}", "{{ '_incremental' in components_values.get('stream_name', '') }}"
$parameters object
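
The effect of `create_or_update` can be illustrated with a simplified Python sketch. It handles only plain string keys in nested dictionaries (no wildcards, list indexes, or interpolation), and `apply_mapping` is a hypothetical helper, not a CDK function.

```python
from typing import Any, Dict, List


def apply_mapping(
    template: Dict[str, Any],
    field_path: List[str],
    value: Any,
    create_or_update: bool = False,
) -> bool:
    """Set value at field_path in place. Return True if the template changed."""
    if not field_path:
        return False
    node = template
    for key in field_path[:-1]:
        if not isinstance(node.get(key), dict):
            if not create_or_update:
                return False  # path is missing and may not be created
            node[key] = {}  # assumption: non-dict values are replaced
        node = node[key]
    leaf = field_path[-1]
    if leaf not in node and not create_or_update:
        return False
    node[leaf] = value
    return True


template = {"retriever": {"requester": {"path": "/old"}}}
updated = apply_mapping(template, ["retriever", "requester", "path"], "/new")
skipped = apply_mapping(template, ["retriever", "paginator", "page_size"], 50)
created = apply_mapping(template, ["retriever", "paginator", "page_size"], 50, create_or_update=True)
print(template)
```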
HttpComponentsResolver object

(This component is experimental. Use at your own risk.) Resolves and populates stream templates with components fetched via an HTTP retriever.

type string required
Values: "HttpComponentsResolver"

Component used to coordinate how records are extracted across stream slices and request pages.

components_mapping ComponentMappingDefinition[] required
$parameters object
StreamConfig object

(This component is experimental. Use at your own risk.) Describes how to get stream configs from the source config.

type string required
Values: "StreamConfig"
configs_pointer string[] required

A list of potentially nested fields indicating the full path in the source config file where the stream configs are located.

Examples: ["data"], ["data","streams"], ["data","{{ parameters.name }}"]
default_values object[]

A list of default values, each matching the structure expected from the parsed component value.

$parameters object
ConfigComponentsResolver object

(This component is experimental. Use at your own risk.) Resolves and populates stream templates with components fetched from the source config.

type string required
Values: "ConfigComponentsResolver"
stream_config StreamConfig[] | StreamConfig required
components_mapping ComponentMappingDefinition[] required
$parameters object
StreamParametersDefinition object

(This component is experimental. Use at your own risk.) Represents a stream parameters definition used to set up dynamic streams from values defined in the manifest.

type string required
Values: "StreamParametersDefinition"
list_of_parameters_for_stream object[] required

A list of parameter objects. Each object in the list holds the parameters for one stream.

Examples: [{"name":"test stream","$parameters":{"entity":"test entity"},"primary_key":"test key"}]
ParametrizedComponentsResolver object

(This component is experimental. Use at your own risk.) Resolves and populates dynamic streams from parametrized values defined in the manifest.

type string required
Values: "ParametrizedComponentsResolver"
stream_parameters object required

(This component is experimental. Use at your own risk.) Represents a stream parameters definition used to set up dynamic streams from values defined in the manifest.

2 nested properties
type string required
Values: "StreamParametersDefinition"
list_of_parameters_for_stream object[] required

A list of parameter objects. Each object in the list holds the parameters for one stream.

Examples: [{"name":"test stream","$parameters":{"entity":"test entity"},"primary_key":"test key"}]
components_mapping ComponentMappingDefinition[] required
$parameters object
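
As a loose illustration of "one stream per parameter object", the Python sketch below copies a stream template once for each entry in `list_of_parameters_for_stream`. The top-level `dict.update` is a deliberate simplification standing in for `components_mapping`, and `expand_streams` is a hypothetical name, so the real resolver may behave differently.

```python
import copy
from typing import Any, Dict, List


def expand_streams(
    stream_template: Dict[str, Any],
    list_of_parameters_for_stream: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Return one independent stream definition per parameter object."""
    streams = []
    for params in list_of_parameters_for_stream:
        stream = copy.deepcopy(stream_template)  # keep the template untouched
        stream.update(params)  # assumption: stand-in for components_mapping
        streams.append(stream)
    return streams


stream_template = {"type": "DeclarativeStream", "name": "", "primary_key": None}
parameters = [
    {"name": "test stream", "$parameters": {"entity": "test entity"}, "primary_key": "test key"},
    {"name": "other stream", "$parameters": {"entity": "other entity"}, "primary_key": "id"},
]
streams = expand_streams(stream_template, parameters)
print([s["name"] for s in streams])
```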
DynamicDeclarativeStream object

(This component is experimental. Use at your own risk.) A component that describes how declarative streams are created from a stream template.

type string required
Values: "DynamicDeclarativeStream"
stream_template DeclarativeStream | StateDelegatingStream required

Reference to the stream template.

Component that resolves and populates stream templates with component values.

name string

The dynamic stream name.

Default: ""
use_parent_parameters boolean

Whether or not to prioritize parent parameters over component parameters when constructing dynamic streams. Defaults to true for backward compatibility.

Default: true
RequestBodyPlainText object

Request body value is sent as plain text

type string required
Values: "RequestBodyPlainText"
value string required
RequestBodyUrlEncodedForm object

Request body value is converted into a url-encoded form

type string required
Values: "RequestBodyUrlEncodedForm"
value Record<string, string> required
RequestBodyJsonObject object

Request body value converted into a JSON object

type string required
Values: "RequestBodyJsonObject"
value object required
RequestBodyGraphQL object

Request body value converted into a GraphQL query object

type string required
Values: "RequestBodyGraphQL"
value object required

Request body GraphQL query object

1 nested property
query string required

The GraphQL query to be executed

Default: "query { }"
RequestBodyGraphQlQuery object

Request body GraphQL query object

query string required

The GraphQL query to be executed

Default: "query { }"
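
To show how the four request body types differ on the wire, here is a small Python sketch using only the standard library. It is not how the CDK builds requests. The `encode_body` helper is hypothetical, and treating a GraphQL body as a JSON-serialized object containing `query` is an assumption based on common GraphQL-over-HTTP practice.

```python
import json
from typing import Any, Dict
from urllib.parse import urlencode


def encode_body(body: Dict[str, Any]) -> str:
    """Serialize a request body definition into the string that would be sent."""
    kind = body["type"]
    value = body["value"]
    if kind == "RequestBodyPlainText":
        return value
    if kind == "RequestBodyUrlEncodedForm":
        return urlencode(value)  # key=value pairs joined with "&"
    if kind in ("RequestBodyJsonObject", "RequestBodyGraphQL"):
        return json.dumps(value)
    raise ValueError(f"unknown request body type: {kind}")


form = encode_body({
    "type": "RequestBodyUrlEncodedForm",
    "value": {"grant_type": "client_credentials", "scope": "read write"},
})
graphql = encode_body({"type": "RequestBodyGraphQL", "value": {"query": "query { }"}})
print(form)
print(graphql)
```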
DpathValidator object

Validator that extracts the value located at a given field path.

type string required
Values: "DpathValidator"
field_path string[] required

List of potentially nested fields describing the full path of the field to validate. Use "*" to validate all values from an array.

Examples: ["data"], ["data","records"], ["data","{{ parameters.name }}"], ["data","*","record"]
validation_strategy ValidateAdheresToSchema | CustomValidationStrategy required

The condition that the specified config value will be evaluated against.
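
The way `field_path` and the `"*"` wildcard select values can be sketched in plain Python. This does not use the dpath library itself, and `values_at` is a hypothetical helper that ignores interpolated path segments.

```python
from typing import Any, List


def values_at(config: Any, field_path: List[str]) -> List[Any]:
    """Collect every value reachable by following field_path through config."""
    nodes = [config]
    for key in field_path:
        next_nodes: List[Any] = []
        for node in nodes:
            if key == "*":
                # The wildcard fans out over list items or dict values.
                if isinstance(node, list):
                    next_nodes.extend(node)
                elif isinstance(node, dict):
                    next_nodes.extend(node.values())
            elif isinstance(node, dict) and key in node:
                next_nodes.append(node[key])
        nodes = next_nodes
    return nodes


config = {"data": [{"record": 1}, {"record": 2}, {"other": 3}]}
selected = values_at(config, ["data", "*", "record"])
print(selected)
```

Each selected value would then be handed to the configured `validation_strategy`.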

PredicateValidator object

Validator that applies a validation strategy to a specified value.

type string required
Values: "PredicateValidator"
value string | number | object | array | boolean | null required

The value to be validated. Can be a literal value or interpolated from configuration.

Examples: "test-value", "{{ config['api_version'] }}", "{{ config['tenant_id'] }}", 123
validation_strategy ValidateAdheresToSchema | CustomValidationStrategy required

The validation strategy to apply to the value.

ValidateAdheresToSchema object

Validates that a user-provided schema adheres to a specified JSON schema.

type string required
Values: "ValidateAdheresToSchema"
base_schema string | object required

The base JSON schema against which the user-provided schema will be validated.

Examples: "{{ config['report_validation_schema'] }}", "'{ "$schema": "http://json-schema.org/draft-07/schema#", "title": "Person", "type": "object", "properties": { "name": { "type": "string", "description": "The person's name" }, "age": { "type": "integer", "minimum": 0, "description": "The person's age" } }, "required": ["name", "age"] }' ", {"$schema":"http://json-schema.org/draft-07/schema#","title":"Person","type":"object","properties":{"name":{"type":"string","description":"The person's name"},"age":{"type":"integer","minimum":0,"description":"The person's age"}},"required":["name","age"]}
CustomValidationStrategy object

Custom validation strategy that allows for custom validation logic.

type string required
Values: "CustomValidationStrategy"
class_name string required

Fully-qualified name of the class that will be implementing the custom validation strategy. Has to be a subclass of ValidationStrategy. The format is source_<name>.<package>.<class_name>.

Examples: "source_declarative_manifest.components.MyCustomValidationStrategy"
ConfigRemapField object

Transformation that remaps a field's value to another value based on a static map.

type string required
Values: "ConfigRemapField"
map object | string required

A mapping of original values to new values. When a field value matches a key in this map, it will be replaced with the corresponding value.

Examples: {"pending":"in_progress","done":"completed","cancelled":"terminated"}, "{{ config['status_mapping'] }}"
field_path string[] required

The path to the field whose value should be remapped. Specified as a list of path components to navigate through nested objects.

Examples: ["status"], ["data","status"], ["data","{{ config.name }}","status"], ["data","*","status"]
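
A minimal Python sketch of the remapping follows. It supports only literal path components in nested dictionaries (no `"*"` wildcard or interpolation), and `remap_field` is a hypothetical helper rather than the CDK's implementation.

```python
from collections.abc import Hashable
from typing import Any, Dict, List


def remap_field(
    config: Dict[str, Any], field_path: List[str], mapping: Dict[Any, Any]
) -> Dict[str, Any]:
    """Replace the value at field_path when it appears as a key in mapping."""
    if not field_path:
        return config
    node: Any = config
    for key in field_path[:-1]:
        node = node.get(key) if isinstance(node, dict) else None
        if node is None:
            return config  # path does not exist; leave the config unchanged
    leaf = field_path[-1]
    if isinstance(node, dict) and leaf in node:
        current = node[leaf]
        if isinstance(current, Hashable) and current in mapping:
            node[leaf] = mapping[current]
    return config


status_map = {"pending": "in_progress", "done": "completed", "cancelled": "terminated"}
remapped = remap_field({"data": {"status": "done"}}, ["data", "status"], status_map)
print(remapped)
```

Values that do not match any key in the map are left as they are.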
ConfigAddFields object

Transformation that adds fields to a config. The path of the added field can be nested.

type string required
Values: "ConfigAddFields"
fields AddedFieldDefinition[] required

A list of transformations (path and corresponding value) that will be added to the config.

condition string

Fields are added if the expression evaluates to True.

Default: ""
Examples: "{{ config['environment'] == 'sandbox' }}", "{{ property is integer }}", "{{ property|length > 5 }}", "{{ property == 'some_string_to_match' }}"
ConfigRemoveFields object

Transformation that removes a field from the config.

type string required
Values: "ConfigRemoveFields"
field_pointers string[][] required

A list of field pointers to be removed from the config.

Examples: ["tags"], [["content","html"],["content","plain_text"]]
condition string

Fields are removed if the expression evaluates to True.

Default: ""
Examples: "{{ config['environment'] == 'sandbox' }}", "{{ property is integer }}", "{{ property|length > 5 }}", "{{ property == 'some_string_to_match' }}"
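
Removing fields by pointer can be sketched as follows in Python. The `condition` expression is not modelled here, and `remove_fields` is a hypothetical helper that assumes each pointer is a list of literal keys into nested dictionaries.

```python
from typing import Any, Dict, List


def remove_fields(
    config: Dict[str, Any], field_pointers: List[List[str]]
) -> Dict[str, Any]:
    """Delete the field at each pointer in place; missing paths are ignored."""
    for pointer in field_pointers:
        if not pointer:
            continue
        node: Any = config
        for key in pointer[:-1]:
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                break  # parent path does not exist
        if isinstance(node, dict):
            node.pop(pointer[-1], None)
    return config


config = {"content": {"html": "<p>hi</p>", "plain_text": "hi", "id": 1}, "tags": ["a"]}
cleaned = remove_fields(config, [["content", "html"], ["content", "plain_text"], ["tags"]])
print(cleaned)
```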
CustomConfigTransformation object

A custom config transformation that can be used to transform the connector configuration.

type string required
Values: "CustomConfigTransformation"
class_name string required

Fully-qualified name of the class that will be implementing the custom config transformation. The format is source_<name>.<package>.<class_name>.

Examples: "source_declarative_manifest.components.MyCustomConfigTransformation"
$parameters object

Additional parameters to be passed to the custom config transformation.