Open Data Contract Standard (ODCS)
Open Data Contract Standard contract file, from the Bitol project at The Linux Foundation
| Type | object |
|---|---|
| File match | `*.odcs.yaml`, `*.odcs.yml` |
| Schema URL | https://catalog.lintel.tools/schemas/schemastore/open-data-contract-standard-odcs/latest.json |
| Source | https://raw.githubusercontent.com/bitol-io/open-data-contract-standard/main/schema/odcs-json-schema-latest.json |
Validate with Lintel: `npx @lintel/lintel check`
An open data contract specification to establish agreement between data producers and consumers.
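As a sketch of the overall shape, a minimal contract file might look like the following. Field names follow the ODCS v3 schema; the id, names, and values are illustrative:

```yaml
apiVersion: v3.1.0    # version of the standard used to build the contract
kind: DataContract    # the only valid kind
id: 53581432-6c55-4ba2-a65f-72344a91553a   # unique identifier, e.g. a UUID
status: active        # current status of the dataset
name: transactions    # name of the data contract
version: 1.0.0        # current version of the data contract
domain: payments      # logical data domain
tags: ['finance']
```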
Properties
Current version of the data contract.
The kind of file this is. Valid value is DataContract.
Version of the standard used to build the data contract. Default value is v3.1.0.
A unique identifier used to reduce the risk of dataset name collisions, such as a UUID.
Current status of the dataset.
Name of the data contract.
Indicates the property the data is primarily associated with. Value is case insensitive.
A list of tags that may be assigned to the elements (object or property); the tags keyword may appear at any level. Tags may be used to better categorize an element. For example, finance, sensitive, employee_record.
List of servers where the datasets reside.
The name of the data product.
High level description of the dataset.
Intended usage of the dataset.
Purpose of the dataset.
Limitations of the dataset.
List of links to sources that provide more details on the dataset; examples would be a link to an external definition, a training video, a git repo, data catalog, or another tool. Authoritative definitions follow the same structure in the standard.
A list of key/value pairs for custom properties.
Name of the logical data domain.
A list of elements within the schema to be cataloged.
Top level for support channels.
Stable technical identifier for references. Must be unique within its containing array. Cannot contain special characters ('-', '_' allowed).
Subscription price per unit of measure in priceUnit.
Currency of the subscription price in price.priceAmount.
The unit of measure for calculating cost. Examples: megabyte, gigabyte.
A list of roles that will provide user access to the dataset.
DEPRECATED SINCE 3.1. WILL BE REMOVED IN ODCS 4.0. Element (using the element path notation) to do the checks on.
A list of key/value pairs for SLA-specific properties. There is no limit on the type of properties (more details to come).
List of links to sources that provide more details on the dataset; examples would be a link to an external definition, a training video, a git repo, data catalog, or another tool. Authoritative definitions follow the same structure in the standard.
A list of key/value pairs for custom properties.
Timestamp in UTC of when the data contract was created.
Definitions
Shorthand notation using name fields (table_name.column_name)
Fully qualified notation using id fields (section/id/properties/id), optionally prefixed with external file reference
Stable technical identifier for references. Must be unique within its containing array. Cannot contain special characters ('-', '_' allowed).
Data source details of where data is physically stored.
Identifier of the server.
Type of the server.
Stable technical identifier for references. Must be unique within its containing array. Cannot contain special characters ('-', '_' allowed).
Description of the server.
Environment of the server.
List of roles that have access to the server.
A list of key/value pairs for custom properties.
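Combining the server fields above, a servers entry might be sketched as follows, assuming a Postgres data source (host, names, and values are illustrative):

```yaml
servers:
  - server: payments-prod     # identifier of the server
    type: postgres            # type of the server
    environment: production
    description: Primary production database
    host: db.example.com
    port: 5432
    database: payments
    schema: public
```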
Compatibility wrapper for relationship definitions.
Stable technical identifier for references. Must be unique within its containing array. Cannot contain special characters ('-', '_' allowed).
Name of the element.
The physical element data type in the data source.
Description of the element.
The business name of the element.
List of links to sources that provide more details on the dataset; examples would be a link to an external definition, a training video, a git repo, data catalog, or another tool. Authoritative definitions follow the same structure in the standard.
A list of tags that may be assigned to the elements (object or property); the tags keyword may appear at any level. Tags may be used to better categorize an element. For example, finance, sensitive, employee_record.
A list of key/value pairs for custom properties.
The logical element data type.
Physical name.
Granular level of the data in the object.
A list of properties for the object.
A list of relationships to other properties. Each relationship must have 'from', 'to' and optionally 'type' field.
Data quality rules with all the relevant information for rule setup and execution.
Boolean value specifying whether the element is a primary key. Default is false.
If the element is a primary key, the position of the primary key element. Starts from 1. For example, if account_id and name are the primary key columns, account_id has primaryKeyPosition 1 and name has primaryKeyPosition 2. Defaults to -1.
The logical element data type.
Additional optional metadata to describe the logical type.
The physical element data type in the data source. For example, VARCHAR(2), DOUBLE, INT.
Physical name.
Indicates if the element may contain Null values; possible values are true and false. Default is false.
Indicates if the element contains unique values; possible values are true and false. Default is false.
Indicates if the element is partitioned; possible values are true and false.
If the element is used for partitioning, the position of the partition element. Starts from 1. For example, if country and year are the partition columns, country has partitionKeyPosition 1 and year has partitionKeyPosition 2. Defaults to -1.
Can be anything, from confidential, restricted, and public to more advanced categorizations. Some companies, like PayPal, use a data classification indicating the class of data in the element; expected values are 1, 2, 3, 4, or 5.
The element name within the dataset that contains the encrypted element value. For example, unencrypted element email_address might have an encryptedName of email_address_encrypt.
List of objects in the data source used in the transformation.
Logic used in the element transformation.
Describes the transform logic in very simple terms.
List of sample element values.
Boolean indicator; true if the element is considered a critical data element (CDE), false otherwise.
A list of relationships to other properties. When defined at property level, the 'from' field is implicit and should not be specified.
Data quality rules with all the relevant information for rule setup and execution.
Boolean value specifying whether the element is a primary key. Default is false.
If the element is a primary key, the position of the primary key element. Starts from 1. For example, if account_id and name are the primary key columns, account_id has primaryKeyPosition 1 and name has primaryKeyPosition 2. Defaults to -1.
The logical element data type.
Additional optional metadata to describe the logical type.
The physical element data type in the data source. For example, VARCHAR(2), DOUBLE, INT.
Physical name.
Indicates if the element may contain Null values; possible values are true and false. Default is false.
Indicates if the element contains unique values; possible values are true and false. Default is false.
Indicates if the element is partitioned; possible values are true and false.
If the element is used for partitioning, the position of the partition element. Starts from 1. For example, if country and year are the partition columns, country has partitionKeyPosition 1 and year has partitionKeyPosition 2. Defaults to -1.
Can be anything, from confidential, restricted, and public to more advanced categorizations. Some companies, like PayPal, use a data classification indicating the class of data in the element; expected values are 1, 2, 3, 4, or 5.
The element name within the dataset that contains the encrypted element value. For example, unencrypted element email_address might have an encryptedName of email_address_encrypt.
List of objects in the data source used in the transformation.
Logic used in the element transformation.
Describes the transform logic in very simple terms.
List of sample element values.
Boolean indicator; true if the element is considered a critical data element (CDE), false otherwise.
A list of relationships to other properties. When defined at property level, the 'from' field is implicit and should not be specified.
Data quality rules with all the relevant information for rule setup and execution.
Boolean value specifying whether the element is a primary key. Default is false.
If the element is a primary key, the position of the primary key element. Starts from 1. For example, if account_id and name are the primary key columns, account_id has primaryKeyPosition 1 and name has primaryKeyPosition 2. Defaults to -1.
The logical element data type.
Additional optional metadata to describe the logical type.
The physical element data type in the data source. For example, VARCHAR(2), DOUBLE, INT.
Physical name.
Indicates if the element may contain Null values; possible values are true and false. Default is false.
Indicates if the element contains unique values; possible values are true and false. Default is false.
Indicates if the element is partitioned; possible values are true and false.
If the element is used for partitioning, the position of the partition element. Starts from 1. For example, if country and year are the partition columns, country has partitionKeyPosition 1 and year has partitionKeyPosition 2. Defaults to -1.
Can be anything, from confidential, restricted, and public to more advanced categorizations. Some companies, like PayPal, use a data classification indicating the class of data in the element; expected values are 1, 2, 3, 4, or 5.
The element name within the dataset that contains the encrypted element value. For example, unencrypted element email_address might have an encryptedName of email_address_encrypt.
List of objects in the data source used in the transformation.
Logic used in the element transformation.
Describes the transform logic in very simple terms.
List of sample element values.
Boolean indicator; true if the element is considered a critical data element (CDE), false otherwise.
A list of relationships to other properties. When defined at property level, the 'from' field is implicit and should not be specified.
Data quality rules with all the relevant information for rule setup and execution.
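Putting the property fields above together, a schema entry might be sketched as follows (table and column names are illustrative; field names follow the ODCS v3 schema):

```yaml
schema:
  - name: transactions           # logical object name
    physicalName: txn_tbl        # physical name in the data source
    logicalType: object
    properties:
      - name: account_id
        logicalType: string
        physicalType: VARCHAR(26)
        required: true           # may not contain null values
        unique: true
        primaryKey: true
        primaryKeyPosition: 1
        classification: restricted
        criticalDataElement: true
      - name: country
        logicalType: string
        physicalType: VARCHAR(2)
        partitioned: true
        partitionKeyPosition: 1
```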
A list of tags that may be assigned to the elements (object or property); the tags keyword may appear at any level. Tags may be used to better categorize an element. For example, finance, sensitive, employee_record.
Stable technical identifier for references. Must be unique within its containing array. Cannot contain special characters ('-', '_' allowed).
List of links to sources that provide more details on the dataset; examples would be a link to an external definition, a training video, a git repo, data catalog, or another tool. Authoritative definitions follow the same structure in the standard.
Consequences of the rule failure.
Additional properties required for rule execution.
Describe the quality check to be completed.
The key performance indicator (KPI) or dimension for data quality.
Name of the data quality check.
Rule execution schedule details.
The name or type of scheduler used to start the data quality check.
The severity of the quality rule.
A list of tags that may be assigned to the elements (object or property); the tags keyword may appear at any level. Tags may be used to better categorize an element. For example, finance, sensitive, employee_record.
The type of quality check. 'text' is human-readable text that describes the quality of the data. 'library' is a set of maintained predefined quality attributes such as row count or unique. 'sql' is an individual SQL query that returns a value that can be compared. 'custom' is quality attributes that are vendor-specific, such as Soda or Great Expectations.
Unit the rule is using, popular values are rows or percent, but any value is allowed.
Data quality rules with all the relevant information for rule setup and execution.
Common comparison operators for data quality checks.
Define a data quality check based on the predefined metrics as per ODCS.
Deprecated; use metric instead.
Additional arguments for the metric, if needed.
Query string that adheres to the dialect of the provided server.
Name of the engine which executes the data quality checks.
List of links to sources that provide more details on the dataset; examples would be a link to an external definition, a training video, a git repo, data catalog, or another tool. Authoritative definitions follow the same structure in the standard.
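The quality fields above might combine as in the sketch below. The library rule name and the mustBe threshold syntax are assumptions based on common ODCS examples, not requirements of this page:

```yaml
quality:
  - name: No duplicate account ids
    description: account_id must be unique across the table
    dimension: uniqueness        # data quality KPI/dimension
    type: library                # predefined quality attribute
    rule: duplicateCount
    mustBe: 0
    unit: rows
    severity: error
    scheduler: cron
    schedule: '0 20 * * *'
  - name: No null references
    type: sql                    # individual SQL query returning a comparable value
    query: SELECT COUNT(*) FROM transactions WHERE account_id IS NULL
    mustBe: 0
```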
Top level for support channels.
Channel name or identifier.
Stable technical identifier for references. Must be unique within its containing array. Cannot contain special characters ('-', '_' allowed).
Access URL using normal URL scheme (https, mailto, etc.).
Description of the channel, free text.
Name of the tool, value can be email, slack, teams, discord, ticket, googlechat, or other.
Scope can be: interactive, announcements, issues, notifications.
Some tools use an invitation URL for requesting access or subscribing. Follows the URL scheme.
A list of key/value pairs for custom properties.
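A support block using the channel fields above might be sketched as follows (channel names and URLs are illustrative):

```yaml
support:
  - channel: '#data-help'        # channel name or identifier
    tool: slack
    scope: interactive
    url: https://example.slack.com/archives/C0000000
  - channel: data-outages
    tool: email
    scope: announcements
    url: mailto:data-outages@example.com
```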
Stable technical identifier for references. Must be unique within its containing array. Cannot contain special characters ('-', '_' allowed).
Subscription price per unit of measure in priceUnit.
Currency of the subscription price in price.priceAmount.
The unit of measure for calculating cost. Examples: megabyte, gigabyte.
Team member information.
The user's username or email.
Stable technical identifier for references. Must be unique within its containing array. Cannot contain special characters ('-', '_' allowed).
The user's name.
The user's description.
The user's job role; examples might be owner or data steward. There is no limit on the role.
The date when the user joined the team.
The date when the user ceased to be part of the team.
The username of the user who replaced the previous user.
A list of tags that may be assigned to the elements (object or property); the tags keyword may appear at any level. Tags may be used to better categorize an element. For example, finance, sensitive, employee_record.
Custom properties block.
List of links to sources that provide more details on the dataset; examples would be a link to an external definition, a training video, a git repo, data catalog, or another tool. Authoritative definitions follow the same structure in the standard.
Team information.
Stable technical identifier for references. Must be unique within its containing array. Cannot contain special characters ('-', '_' allowed).
Team name.
Team description.
List of members.
A list of tags that may be assigned to the elements (object or property); the tags keyword may appear at any level. Tags may be used to better categorize an element. For example, finance, sensitive, employee_record.
Custom properties block.
List of links to sources that provide more details on the dataset; examples would be a link to an external definition, a training video, a git repo, data catalog, or another tool. Authoritative definitions follow the same structure in the standard.
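Assuming the team layout described above (name, description, and a list of members), a sketch with illustrative names and dates:

```yaml
team:
  name: payments-data
  description: Owns the payments datasets
  members:
    - username: jdoe@example.com   # username or email
      name: Jane Doe
      role: owner
      dateIn: 2022-10-01           # date the user joined the team
    - username: rsmith@example.com
      role: data steward
      dateIn: 2023-02-14
```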
Name of the IAM role that provides access to the dataset.
Stable technical identifier for references. Must be unique within its containing array. Cannot contain special characters ('-', '_' allowed).
Description of the IAM role and its permissions.
The type of access provided by the IAM role.
The name(s) of the first-level approver(s) of the role.
The name(s) of the second-level approver(s) of the role.
A list of key/value pairs for custom properties.
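A roles entry using the IAM fields above might look like this sketch (role and approver names are illustrative):

```yaml
roles:
  - role: payments-reader          # IAM role providing access to the dataset
    access: read                   # type of access provided
    firstLevelApprovers: Data Owner
    secondLevelApprovers: Compliance Office
```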
Specific property in the SLA; check the periodic table. May require units (more details to come).
Agreement value. The label will change based on the property itself.
Stable technical identifier for references. Must be unique within its containing array. Cannot contain special characters ('-', '_' allowed).
d, day, days for days; y, yr, years for years, etc. Units use the ISO standard.
Element(s) to check on. Multiple elements should be extremely rare and, if so, separated by commas.
Describes the importance of the SLA from the list of: regulatory, analytics, or operational.
Description of the SLA for humans.
Name of the scheduler; can be cron or any tool your organization supports.
Configuration information for the scheduling tool, for cron a possible value is 0 20 * * *.
A list of key/value pairs for custom properties.
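The SLA fields above might combine as in this sketch; the property names latency and generalAvailability are taken from common ODCS examples and the element path is illustrative:

```yaml
slaProperties:
  - property: latency              # property from the periodic table
    value: 4
    unit: d                        # days, per the ISO unit shorthand
    element: transactions.txn_ref_dt
    driver: regulatory             # importance of the SLA
  - property: generalAvailability
    value: '2022-05-12T09:30:10-08:00'
```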
The name of the key. Names should be in camel case, the same as if they were permanent properties in the contract.
Stable technical identifier for references. Must be unique within its containing array. Cannot contain special characters ('-', '_' allowed).
Description of the custom property.
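A custom property is a simple key/value pair; the key and value below are illustrative:

```yaml
customProperties:
  - property: refRulesetName     # key in camel case
    value: gcsc.ruleset.name
```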
Base definition for relationships between properties, typically for foreign key constraints.
The type of relationship. Defaults to 'foreignKey'.
Source property or properties.
Target property or properties to reference.
A list of key/value pairs for custom properties.
Relationship definition at schema level, requiring both 'from' and 'to' fields with matching types.
Relationship definition at property level, where 'from' is implicitly the current property.
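A schema-level relationship using the fields above might be sketched as follows (table and column names are illustrative). At property level, the same shape applies but 'from' is omitted, since it is implicitly the current property:

```yaml
relationships:
  - from: transactions.account_id   # source property (shorthand name notation)
    to: accounts.id                 # target property to reference
    type: foreignKey                # default type
```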
The URL of the API.
Amazon Athena automatically stores query results and metadata information for each query that runs in a query result location that you can specify in Amazon S3.
Identify the schema in the data source in which your tables exist.
Identify the name of the Data Source, also referred to as a Catalog.
The region your AWS account uses.
Fully qualified path to Azure Blob Storage or Azure Data Lake Storage (ADLS), supports globs.
File format.
Only for format = json. How multiple JSON documents are delimited within one file.
The GCP project name.
The GCP dataset name.
The host of the ClickHouse server.
The port to the ClickHouse server.
The name of the database.
The name of the Hive or Unity catalog.
The schema name in the catalog.
The Databricks host.
The host of the Denodo server.
The port of the Denodo server.
The name of the database.
The host of the Dremio server.
The port of the Dremio server.
The name of the schema.
Path to the DuckDB database file.
The name of the schema.
The AWS Glue account.
The AWS Glue database name.
The AWS S3 path. Must be in the form of a URL.
The format of the files.
The host of the Google Cloud SQL server.
The port of the Google Cloud SQL server.
The name of the database.
The name of the schema.
The host of the IBM DB2 server.
The port of the IBM DB2 server.
The name of the database.
The name of the schema.
The host to the Hive server.
The name of the Hive database.
The port to the Hive server. Defaults to 10000.
The host to the Impala server.
The name of the Impala database.
The port to the Impala server. Defaults to 21050.
The host to the Informix server.
The name of the database.
The port to the Informix server. Defaults to 9088.
Hostname or IP address of the Zen server.
Database name to connect to on the Zen server.
Zen server SQL connections port. Defaults to 1583.
Account used by the server.
Name of the catalog.
Name of the database.
Name of the dataset.
Delimiter.
Server endpoint.
File format.
Host name or IP address.
A URL to a location.
Relative or absolute path to the data file(s).
Port to the server. No default value is assumed for custom servers.
Project name.
Cloud region.
Region name.
Name of the schema.
Name of the service.
Staging directory.
Name of the cluster or warehouse.
Name of the data stream.
Kafka Server
The bootstrap server of the Kafka cluster.
The format of the messages.
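For a streaming source, a Kafka server entry might be sketched as follows; the field names host and format are assumed from the descriptions above, and the broker address is illustrative:

```yaml
servers:
  - server: payments-stream
    type: kafka
    host: broker1.example.com:9092   # bootstrap server of the cluster
    format: avro                     # format of the messages
```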
Kinesis Data Streams Server
AWS region.
The format of the record.
The relative or absolute path to the data file(s).
The format of the file(s).
The host of the MySQL server.
The port of the MySQL server.
The name of the database.
The host to the Oracle server.
The port to the Oracle server.
The name of the service.
The host to the Postgres server.
The port to the Postgres server.
The name of the database.
The name of the schema in the database.
The host to the Presto server.
The name of the catalog.
The name of the schema.
The GCP project name.
The name of the database.
The name of the schema.
An optional string describing the server.
AWS region of Redshift server.
The account used by the server.
S3 URL, starting with s3://
The server endpoint for S3-compatible servers.
File format.
Only for format = json. How multiple JSON documents are delimited within one file.
SFTP URL, starting with sftp://
File format.
Only for format = json. How multiple JSON documents are delimited within one file.
The Snowflake account used by the server.
The name of the database.
The name of the schema.
The host to the Snowflake server.
The port to the Snowflake server.
The name of the cluster of resources that is a Snowflake virtual warehouse.
The host to the database server.
The name of the database.
The name of the schema in the database.
The port to the database server.
The host of the Synapse server.
The port of the Synapse server.
The name of the database.
The Trino host URL.
The Trino port.
The name of the catalog.
The name of the schema in the database.
The host of the Vertica server.
The port of the Vertica server.
The name of the database.
The name of the schema.