Type object
Schema URL https://catalog.lintel.tools/schemas/schemastore/serverless-framework-configuration/_shared/latest--aws-sagemaker-dataqualityjobdefinition.json
Parent schema serverless-framework-configuration

Resource type definition for AWS::SageMaker::DataQualityJobDefinition. Source: no source definition found; add one manually.

Properties

DataQualityAppSpecification object required

Container image configuration object for the monitoring job.

6 nested properties
ImageUri string | Aws_CF_FunctionString required

The container image to be run by the monitoring job.

ContainerArguments string[]

An array of arguments for the container used to run the monitoring job.

maxItems=50
ContainerEntrypoint string[]

Specifies the entrypoint for a container used to run the monitoring job.

maxItems=100
PostAnalyticsProcessorSourceUri string | Aws_CF_FunctionString

The Amazon S3 URI.

RecordPreprocessorSourceUri string | Aws_CF_FunctionString

The Amazon S3 URI.

Environment object

Sets the environment variables in the Docker container
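The constraints listed for DataQualityAppSpecification can be checked with a few lines of Python. This is a minimal sketch, not part of the schema: the helper name check_app_spec and the image URI are hypothetical, and only the required ImageUri and the two maxItems limits stated above are checked.

```python
def check_app_spec(spec):
    """Return a list of problems found in a DataQualityAppSpecification mapping."""
    problems = []
    if "ImageUri" not in spec:
        problems.append("ImageUri is required")
    # Limits taken from the property list: maxItems=50 and maxItems=100.
    limits = {"ContainerArguments": 50, "ContainerEntrypoint": 100}
    for name, max_items in limits.items():
        value = spec.get(name, [])
        if len(value) > max_items:
            problems.append(f"{name} has {len(value)} items, limit is {max_items}")
    return problems


app_spec = {
    # Placeholder image URI, used only for illustration.
    "ImageUri": "123456789012.dkr.ecr.us-east-1.amazonaws.com/example-monitor:latest",
    "ContainerEntrypoint": ["python3", "analyze.py"],
    "Environment": {"LOG_LEVEL": "info"},
}
print(check_app_spec(app_spec))
```

An empty list means none of the checked constraints were violated; it does not replace validation against the full schema.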

DataQualityJobInput object required

The inputs for a monitoring job.

2 nested properties
EndpointInput object

The endpoint for a monitoring job.

4 nested properties
EndpointName string | Aws_CF_FunctionString required

The name of the endpoint used to run the monitoring job.

LocalPath string | Aws_CF_FunctionString required

Path to the filesystem where the endpoint data is available to the container.

S3DataDistributionType string | Aws_CF_FunctionString

Whether input data distributed in Amazon S3 is fully replicated or sharded by an S3 key. Defaults to FullyReplicated.

S3InputMode string | Aws_CF_FunctionString

Whether Pipe or File is used as the input mode for transferring data for the monitoring job. Pipe mode is recommended for large datasets. File mode is useful for small files that fit in memory. Defaults to File.

BatchTransformInput object

The batch transform input for a monitoring job.

5 nested properties
DataCapturedDestinationS3Uri string | Aws_CF_FunctionString required

A URI that identifies the Amazon S3 storage location where Batch Transform Job captures data.

DatasetFormat object required

The dataset format of the data to monitor

3 nested properties
Csv object

The CSV format

Json object

The JSON format

Parquet boolean

A flag indicating whether the dataset format is Parquet

LocalPath string | Aws_CF_FunctionString required

Path to the filesystem where the endpoint data is available to the container.

S3DataDistributionType string | Aws_CF_FunctionString

Whether input data distributed in Amazon S3 is fully replicated or sharded by an S3 key. Defaults to FullyReplicated.

S3InputMode string | Aws_CF_FunctionString

Whether Pipe or File is used as the input mode for transferring data for the monitoring job. Pipe mode is recommended for large datasets. File mode is useful for small files that fit in memory. Defaults to File.
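As an illustration of the required keys inside DataQualityJobInput, the sketch below reports which required keys are missing from whichever input block is present. The helper name missing_input_keys, the bucket name, and the local path are hypothetical. The schema lists both EndpointInput and BatchTransformInput as optional, so this sketch does not assume which combination the service accepts.

```python
# Required keys per input block, copied from the property list above.
REQUIRED = {
    "EndpointInput": ("EndpointName", "LocalPath"),
    "BatchTransformInput": ("DataCapturedDestinationS3Uri", "DatasetFormat", "LocalPath"),
}


def missing_input_keys(job_input):
    """Map each input block that is present to the required keys it lacks."""
    missing = {}
    for block, required in REQUIRED.items():
        if block in job_input:
            absent = [key for key in required if key not in job_input[block]]
            if absent:
                missing[block] = absent
    return missing


job_input = {
    "BatchTransformInput": {
        "DataCapturedDestinationS3Uri": "s3://example-bucket/captured/",  # placeholder
        "DatasetFormat": {"Csv": {"Header": True}},
        "LocalPath": "/opt/ml/processing/input",  # placeholder
        "S3InputMode": "File",
    }
}
print(missing_input_keys(job_input))
```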

DataQualityJobOutputConfig object required

The output configuration for monitoring jobs.

2 nested properties
MonitoringOutputs MonitoringOutput[] required

Monitoring outputs for monitoring jobs. This is where the output of the periodic monitoring jobs is uploaded.

minLength=1 maxLength=1
KmsKeyId string | Aws_CF_FunctionString

The AWS Key Management Service (AWS KMS) key that Amazon SageMaker uses to encrypt the model artifacts at rest using Amazon S3 server-side encryption.

JobResources object required

Identifies the resources to deploy for a monitoring job.

1 nested property
ClusterConfig object required

Configuration for the cluster used to run model monitoring jobs.

4 nested properties
InstanceCount integer required

The number of ML compute instances to use in the model monitoring job. For distributed processing jobs, specify a value greater than 1. The default value is 1.

min=1 max=100
InstanceType string | Aws_CF_FunctionString required

The ML compute instance type for the processing job.

VolumeSizeInGB integer required

The size of the ML storage volume, in gigabytes, that you want to provision. You must specify sufficient ML storage for your scenario.

min=1 max=16384
VolumeKmsKeyId string | Aws_CF_FunctionString

The AWS Key Management Service (AWS KMS) key that Amazon SageMaker uses to encrypt data on the storage volume attached to the ML compute instance(s) that run the model monitoring job.
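The numeric ranges on ClusterConfig lend themselves to a simple pre-deployment check. The following is a sketch under stated assumptions: check_cluster_config is a hypothetical helper, ml.m5.xlarge is only an example value, and properties supplied as CloudFormation intrinsic functions (non-integer values) are skipped rather than evaluated.

```python
# Ranges copied from the property list: min=1 max=100 and min=1 max=16384.
RANGES = {"InstanceCount": (1, 100), "VolumeSizeInGB": (1, 16384)}
REQUIRED_KEYS = ("InstanceCount", "InstanceType", "VolumeSizeInGB")


def check_cluster_config(cfg):
    """Return a list of problems found in a ClusterConfig mapping."""
    errors = []
    for key in REQUIRED_KEYS:
        if key not in cfg:
            errors.append(f"{key} is required")
    for key, (low, high) in RANGES.items():
        value = cfg.get(key)
        # Only literal integers are range-checked.
        if isinstance(value, int) and not low <= value <= high:
            errors.append(f"{key}={value} is outside {low}..{high}")
    return errors


cluster = {"InstanceCount": 1, "InstanceType": "ml.m5.xlarge", "VolumeSizeInGB": 20}
print(check_cluster_config(cluster))
```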

RoleArn string | Aws_CF_FunctionString required

The Amazon Resource Name (ARN) of an IAM role that Amazon SageMaker can assume to perform tasks on your behalf.

JobDefinitionName string | Aws_CF_FunctionString

The name of the job definition.

DataQualityBaselineConfig object

Baseline configuration used to validate that the data conforms to the specified constraints and statistics.

3 nested properties
BaseliningJobName string | Aws_CF_FunctionString

The name of a processing job

ConstraintsResource object

The baseline constraints resource for a monitoring job.

1 nested property
S3Uri string | Aws_CF_FunctionString

The Amazon S3 URI.

StatisticsResource object

The baseline statistics resource for a monitoring job.

1 nested property
S3Uri string | Aws_CF_FunctionString

The Amazon S3 URI.

NetworkConfig object

Networking options for a job, such as network traffic encryption between containers, whether to allow inbound and outbound network calls to and from containers, and the VPC subnets and security groups to use for VPC-enabled jobs.

3 nested properties
EnableInterContainerTrafficEncryption boolean

Whether to encrypt all communications between distributed processing jobs. Choose True to encrypt communications. Encryption provides greater security for distributed processing jobs, but the processing might take longer.

EnableNetworkIsolation boolean

Whether to allow inbound and outbound network calls to and from the containers used for the processing job.

VpcConfig object

Specifies a VPC that your training jobs and hosted models have access to. Control access to and from your training and model containers by configuring the VPC.

2 nested properties
SecurityGroupIds string[] required

The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field.

minItems=1 maxItems=5
Subnets string[] required

The IDs of the subnets in the VPC to which you want to connect your monitoring jobs.

minItems=1 maxItems=16
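The item-count limits on VpcConfig can be verified in the same spirit. This sketch assumes literal lists of IDs; check_vpc_config is a hypothetical helper and the IDs shown are placeholders.

```python
# Item-count limits copied from the property list above.
LIMITS = {"SecurityGroupIds": (1, 5), "Subnets": (1, 16)}


def check_vpc_config(vpc):
    """Return a list of problems found in a VpcConfig mapping."""
    errors = []
    for key, (low, high) in LIMITS.items():
        items = vpc.get(key)
        if items is None:
            errors.append(f"{key} is required")
        elif not low <= len(items) <= high:
            errors.append(f"{key} must have {low} to {high} items, got {len(items)}")
    return errors


vpc_config = {
    "SecurityGroupIds": ["sg-0123456789abcdef0"],  # placeholder ID
    "Subnets": ["subnet-0123456789abcdef0"],  # placeholder ID
}
print(check_vpc_config(vpc_config))
```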
EndpointName string | Aws_CF_FunctionString

The name of the endpoint used to run the monitoring job.

StoppingCondition object

Specifies a time limit for how long the monitoring job is allowed to run.

1 nested property
MaxRuntimeInSeconds integer required

The maximum runtime allowed in seconds.

min=1 max=86400
Tags Tag[]

An array of key-value pairs to apply to this resource.

maxItems=50
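Putting the top-level properties together, a complete resource entry might look like the sketch below. It is assembled as a Python dictionary and printed as JSON; every concrete value (account ID, bucket, endpoint name, role name, instance type) is a placeholder, and the S3Output shape follows the MonitoringOutput definition given under Definitions. Only the five properties marked required above are checked.

```python
import json

# Top-level properties marked "required" in the list above.
REQUIRED_TOP_LEVEL = (
    "DataQualityAppSpecification",
    "DataQualityJobInput",
    "DataQualityJobOutputConfig",
    "JobResources",
    "RoleArn",
)

# All concrete values below are placeholders chosen for illustration.
properties = {
    "DataQualityAppSpecification": {
        "ImageUri": "123456789012.dkr.ecr.us-east-1.amazonaws.com/example-monitor:latest",
    },
    "DataQualityJobInput": {
        "EndpointInput": {
            "EndpointName": "example-endpoint",
            "LocalPath": "/opt/ml/processing/input",
        },
    },
    "DataQualityJobOutputConfig": {
        "MonitoringOutputs": [
            {
                "S3Output": {
                    "LocalPath": "/opt/ml/processing/output",
                    "S3Uri": "s3://example-bucket/reports/",
                }
            }
        ],
    },
    "JobResources": {
        "ClusterConfig": {
            "InstanceCount": 1,
            "InstanceType": "ml.m5.xlarge",
            "VolumeSizeInGB": 20,
        },
    },
    "RoleArn": "arn:aws:iam::123456789012:role/example-monitoring-role",
    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
}

# Names of required properties that are absent from the mapping.
missing = [name for name in REQUIRED_TOP_LEVEL if name not in properties]

resource = {
    "Type": "AWS::SageMaker::DataQualityJobDefinition",
    "Properties": properties,
}
print("missing required properties:", missing)
print(json.dumps(resource, indent=2))
```

The printed JSON has the Type and Properties shape that a CloudFormation resource entry uses; where it is placed in a Serverless Framework configuration depends on the project layout.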

Definitions

DataQualityBaselineConfig object

Baseline configuration used to validate that the data conforms to the specified constraints and statistics.

BaseliningJobName string | Aws_CF_FunctionString

The name of a processing job

ConstraintsResource object

The baseline constraints resource for a monitoring job.

1 nested property
S3Uri string | Aws_CF_FunctionString

The Amazon S3 URI.

StatisticsResource object

The baseline statistics resource for a monitoring job.

1 nested property
S3Uri string | Aws_CF_FunctionString

The Amazon S3 URI.

ConstraintsResource object

The baseline constraints resource for a monitoring job.

S3Uri string | Aws_CF_FunctionString

The Amazon S3 URI.

StatisticsResource object

The baseline statistics resource for a monitoring job.

S3Uri string | Aws_CF_FunctionString

The Amazon S3 URI.

S3Uri string | Aws_CF_FunctionString

The Amazon S3 URI.

DataQualityAppSpecification object

Container image configuration object for the monitoring job.

ImageUri string | Aws_CF_FunctionString required

The container image to be run by the monitoring job.

ContainerArguments string[]

An array of arguments for the container used to run the monitoring job.

maxItems=50
ContainerEntrypoint string[]

Specifies the entrypoint for a container used to run the monitoring job.

maxItems=100
PostAnalyticsProcessorSourceUri string | Aws_CF_FunctionString

The Amazon S3 URI.

RecordPreprocessorSourceUri string | Aws_CF_FunctionString

The Amazon S3 URI.

Environment object

Sets the environment variables in the Docker container

DataQualityJobInput object

The inputs for a monitoring job.

EndpointInput object

The endpoint for a monitoring job.

4 nested properties
EndpointName string | Aws_CF_FunctionString required

The name of the endpoint used to run the monitoring job.

LocalPath string | Aws_CF_FunctionString required

Path to the filesystem where the endpoint data is available to the container.

S3DataDistributionType string | Aws_CF_FunctionString

Whether input data distributed in Amazon S3 is fully replicated or sharded by an S3 key. Defaults to FullyReplicated.

S3InputMode string | Aws_CF_FunctionString

Whether Pipe or File is used as the input mode for transferring data for the monitoring job. Pipe mode is recommended for large datasets. File mode is useful for small files that fit in memory. Defaults to File.

BatchTransformInput object

The batch transform input for a monitoring job.

5 nested properties
DataCapturedDestinationS3Uri string | Aws_CF_FunctionString required

A URI that identifies the Amazon S3 storage location where Batch Transform Job captures data.

DatasetFormat object required

The dataset format of the data to monitor

3 nested properties
Csv object

The CSV format

Json object

The JSON format

Parquet boolean

A flag indicating whether the dataset format is Parquet

LocalPath string | Aws_CF_FunctionString required

Path to the filesystem where the endpoint data is available to the container.

S3DataDistributionType string | Aws_CF_FunctionString

Whether input data distributed in Amazon S3 is fully replicated or sharded by an S3 key. Defaults to FullyReplicated.

S3InputMode string | Aws_CF_FunctionString

Whether Pipe or File is used as the input mode for transferring data for the monitoring job. Pipe mode is recommended for large datasets. File mode is useful for small files that fit in memory. Defaults to File.

EndpointInput object

The endpoint for a monitoring job.

EndpointName string | Aws_CF_FunctionString required

The name of the endpoint used to run the monitoring job.

LocalPath string | Aws_CF_FunctionString required

Path to the filesystem where the endpoint data is available to the container.

S3DataDistributionType string | Aws_CF_FunctionString

Whether input data distributed in Amazon S3 is fully replicated or sharded by an S3 key. Defaults to FullyReplicated.

S3InputMode string | Aws_CF_FunctionString

Whether Pipe or File is used as the input mode for transferring data for the monitoring job. Pipe mode is recommended for large datasets. File mode is useful for small files that fit in memory. Defaults to File.

BatchTransformInput object

The batch transform input for a monitoring job.

DataCapturedDestinationS3Uri string | Aws_CF_FunctionString required

A URI that identifies the Amazon S3 storage location where Batch Transform Job captures data.

DatasetFormat object required

The dataset format of the data to monitor

3 nested properties
Csv object

The CSV format

1 nested property
Header boolean

A boolean flag indicating whether the given CSV has a header

Json object

The JSON format

1 nested property
Line boolean

A boolean flag indicating whether the data is in JSON Lines format

Parquet boolean

A flag indicating whether the dataset format is Parquet

LocalPath string | Aws_CF_FunctionString required

Path to the filesystem where the endpoint data is available to the container.

S3DataDistributionType string | Aws_CF_FunctionString

Whether input data distributed in Amazon S3 is fully replicated or sharded by an S3 key. Defaults to FullyReplicated.

S3InputMode string | Aws_CF_FunctionString

Whether Pipe or File is used as the input mode for transferring data for the monitoring job. Pipe mode is recommended for large datasets. File mode is useful for small files that fit in memory. Defaults to File.

MonitoringOutputConfig object

The output configuration for monitoring jobs.

MonitoringOutputs MonitoringOutput[] required

Monitoring outputs for monitoring jobs. This is where the output of the periodic monitoring jobs is uploaded.

minLength=1 maxLength=1
KmsKeyId string | Aws_CF_FunctionString

The AWS Key Management Service (AWS KMS) key that Amazon SageMaker uses to encrypt the model artifacts at rest using Amazon S3 server-side encryption.

MonitoringOutput object

The output object for a monitoring job.

S3Output object required

Information about where and how to store the results of a monitoring job.

3 nested properties
LocalPath string | Aws_CF_FunctionString required

The local path to the Amazon S3 storage location where Amazon SageMaker saves the results of a monitoring job. LocalPath is an absolute path for the output data.

S3Uri string | Aws_CF_FunctionString required

A URI that identifies the Amazon S3 storage location where Amazon SageMaker saves the results of a monitoring job.

S3UploadMode string | Aws_CF_FunctionString

Whether to upload the results of the monitoring job continuously or after the job completes.
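A MonitoringOutput entry is small enough to build with a helper. The sketch below is one possible shape: make_monitoring_output is a hypothetical function, the default local path and bucket name are placeholders, and the absolute-path check reflects the LocalPath description above rather than any rule enforced by the schema itself.

```python
def make_monitoring_output(s3_uri, local_path="/opt/ml/processing/output", upload_mode=None):
    """Build a MonitoringOutput mapping with its required S3Output keys."""
    # The description above says LocalPath is an absolute path.
    if not local_path.startswith("/"):
        raise ValueError("LocalPath must be an absolute path")
    s3_output = {"S3Uri": s3_uri, "LocalPath": local_path}
    # S3UploadMode is optional, so it is only emitted when given.
    if upload_mode is not None:
        s3_output["S3UploadMode"] = upload_mode
    return {"S3Output": s3_output}


output_config = {
    "MonitoringOutputs": [make_monitoring_output("s3://example-bucket/reports/")],
}
print(output_config)
```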

S3Output object

Information about where and how to store the results of a monitoring job.

LocalPath string | Aws_CF_FunctionString required

The local path to the Amazon S3 storage location where Amazon SageMaker saves the results of a monitoring job. LocalPath is an absolute path for the output data.

S3Uri string | Aws_CF_FunctionString required

A URI that identifies the Amazon S3 storage location where Amazon SageMaker saves the results of a monitoring job.

S3UploadMode string | Aws_CF_FunctionString

Whether to upload the results of the monitoring job continuously or after the job completes.

MonitoringResources object

Identifies the resources to deploy for a monitoring job.

ClusterConfig object required

Configuration for the cluster used to run model monitoring jobs.

4 nested properties
InstanceCount integer required

The number of ML compute instances to use in the model monitoring job. For distributed processing jobs, specify a value greater than 1. The default value is 1.

min=1 max=100
InstanceType string | Aws_CF_FunctionString required

The ML compute instance type for the processing job.

VolumeSizeInGB integer required

The size of the ML storage volume, in gigabytes, that you want to provision. You must specify sufficient ML storage for your scenario.

min=1 max=16384
VolumeKmsKeyId string | Aws_CF_FunctionString

The AWS Key Management Service (AWS KMS) key that Amazon SageMaker uses to encrypt data on the storage volume attached to the ML compute instance(s) that run the model monitoring job.

ClusterConfig object

Configuration for the cluster used to run model monitoring jobs.

InstanceCount integer required

The number of ML compute instances to use in the model monitoring job. For distributed processing jobs, specify a value greater than 1. The default value is 1.

min=1 max=100
InstanceType string | Aws_CF_FunctionString required

The ML compute instance type for the processing job.

VolumeSizeInGB integer required

The size of the ML storage volume, in gigabytes, that you want to provision. You must specify sufficient ML storage for your scenario.

min=1 max=16384
VolumeKmsKeyId string | Aws_CF_FunctionString

The AWS Key Management Service (AWS KMS) key that Amazon SageMaker uses to encrypt data on the storage volume attached to the ML compute instance(s) that run the model monitoring job.

NetworkConfig object

Networking options for a job, such as network traffic encryption between containers, whether to allow inbound and outbound network calls to and from containers, and the VPC subnets and security groups to use for VPC-enabled jobs.

EnableInterContainerTrafficEncryption boolean

Whether to encrypt all communications between distributed processing jobs. Choose True to encrypt communications. Encryption provides greater security for distributed processing jobs, but the processing might take longer.

EnableNetworkIsolation boolean

Whether to allow inbound and outbound network calls to and from the containers used for the processing job.

VpcConfig object

Specifies a VPC that your training jobs and hosted models have access to. Control access to and from your training and model containers by configuring the VPC.

2 nested properties
SecurityGroupIds string[] required

The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field.

minItems=1 maxItems=5
Subnets string[] required

The IDs of the subnets in the VPC to which you want to connect your monitoring jobs.

minItems=1 maxItems=16
VpcConfig object

Specifies a VPC that your training jobs and hosted models have access to. Control access to and from your training and model containers by configuring the VPC.

SecurityGroupIds string[] required

The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field.

minItems=1 maxItems=5
Subnets string[] required

The IDs of the subnets in the VPC to which you want to connect your monitoring jobs.

minItems=1 maxItems=16
StoppingCondition object

Specifies a time limit for how long the monitoring job is allowed to run.

MaxRuntimeInSeconds integer required

The maximum runtime allowed in seconds.

min=1 max=86400
Tag object

A key-value pair to associate with a resource.

Key string | Aws_CF_FunctionString required

The key name of the tag. You can specify a value that is 1 to 127 Unicode characters in length and cannot be prefixed with aws:. You can use any of the following characters: the set of Unicode letters, digits, whitespace, _, ., /, =, +, and -.

Value string | Aws_CF_FunctionString required

The value for the tag. You can specify a value that is 1 to 255 Unicode characters in length and cannot be prefixed with aws:. You can use any of the following characters: the set of Unicode letters, digits, whitespace, _, ., /, =, +, and -.
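The Tag rules above (length limits, the aws: prefix restriction, and the allowed character set) can be approximated with a regular expression. This is a rough sketch: check_tag is a hypothetical helper, it handles literal strings only, and Python's \w matches slightly more than "Unicode letters and digits", so the character check is looser than the description.

```python
import re

# Approximation of "Unicode letters, digits, whitespace, _, ., /, =, +, -".
ALLOWED = re.compile(r"[\w\s./=+\-]*")
# Maximum lengths copied from the Key and Value descriptions.
MAX_LENGTH = {"Key": 127, "Value": 255}


def check_tag(tag):
    """Return a list of problems found in a Tag mapping with literal strings."""
    errors = []
    for field, max_len in MAX_LENGTH.items():
        text = tag.get(field)
        if text is None:
            errors.append(f"{field} is required")
            continue
        if not 1 <= len(text) <= max_len:
            errors.append(f"{field} length must be 1 to {max_len}")
        if text.startswith("aws:"):
            errors.append(f"{field} must not start with aws:")
        if not ALLOWED.fullmatch(text):
            errors.append(f"{field} contains characters outside the allowed set")
    return errors


print(check_tag({"Key": "team", "Value": "ml-platform"}))
```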

EndpointName string | Aws_CF_FunctionString

The name of the endpoint used to run the monitoring job.

JobDefinitionName string | Aws_CF_FunctionString

The name of the job definition.

ProcessingJobName string | Aws_CF_FunctionString

The name of a processing job

DatasetFormat object

The dataset format of the data to monitor

Csv object

The CSV format

1 nested property
Header boolean

A boolean flag indicating whether the given CSV has a header

Json object

The JSON format

1 nested property
Line boolean

A boolean flag indicating whether the data is in JSON Lines format

Parquet boolean

A flag indicating whether the dataset format is Parquet

Csv object

The CSV format

Header boolean

A boolean flag indicating whether the given CSV has a header

Json object

The JSON format

Line boolean

A boolean flag indicating whether the data is in JSON Lines format

Parquet boolean

A flag indicating whether the dataset format is Parquet