bioimageio resource description
Bioimage.io resource descriptions may be produced or consumed by bioimage.io-compatible software
| Type | One of: `bioimageio__spec__application__v0_2__ApplicationDescr`, `bioimageio__spec__application__v0_3__ApplicationDescr`, `bioimageio__spec__dataset__v0_2__DatasetDescr`, `bioimageio__spec__dataset__v0_3__DatasetDescr`, `bioimageio__spec__model__v0_4__ModelDescr`, `bioimageio__spec__model__v0_5__ModelDescr`, `bioimageio__spec__notebook__v0_2__NotebookDescr`, `bioimageio__spec__notebook__v0_3__NotebookDescr` |
|---|---|
| File match | `bioimageio.yaml`, `*.bioimageio.yaml` |
| Schema URL | https://catalog.lintel.tools/schemas/schemastore/bioimageio-resource-description/latest.json |
| Source | https://bioimage-io.github.io/spec-bioimage-io/bioimageio_schema_latest.json |
Validate with Lintel:

```shell
npx @lintel/lintel check
```
One of
Definitions
Architecture source file
Identifier of the callable that returns a torch.nn.Module instance.
SHA256 hash value of the source file.
keyword arguments for the callable
Identifier of the callable that returns a torch.nn.Module instance.
Where to import the callable from, i.e. from <import_from> import <callable>
keyword arguments for the callable
File attachments
A custom badge
badge label to display on hover
target URL
badge icon (included in bioimage.io package if not a URL)
A short description of this axis beyond its type and id.
The batch size may be fixed to 1, otherwise (the default) it may be chosen arbitrarily depending on available memory
Known biases, risks, technical limitations, and recommendations for model use.
Biases in training data or model behavior.
Potential risks in the context of bioimage analysis.
Technical limitations and failure modes.
Mitigation strategies regarding known_biases, risks, and limitations, as well as applicable best practices.
Consider:
- How to use a validation dataset?
- How to manually validate?
- Feasibility of domain adaptation for different experimental setups?
keyword arguments for BinarizeDescr
The fixed threshold values along axis
The threshold axis
A short description of this axis beyond its type and id.
Cast the tensor data type to EnsureDtypeKwargs.dtype (if not matching).
This can for example be used to ensure the inner neural network model gets a different input tensor data type than the fully described bioimage.io model does.
Examples:
The described bioimage.io model (incl. preprocessing) accepts any
float32-compatible tensor, normalizes it with percentiles and clipping and then
casts it to uint8, which is what the neural network in this example expects.
- in YAML
```yaml
inputs:
- data:
    type: float32  # described bioimage.io model is compatible with any float32 input tensor
  preprocessing:
  - id: scale_range
    kwargs:
      axes: ['y', 'x']
      max_percentile: 99.8
      min_percentile: 5.0
  - id: clip
    kwargs:
      min: 0.0
      max: 1.0
  - id: ensure_dtype  # the neural network of the model requires uint8
    kwargs:
      dtype: uint8
```
- in Python:
```python
>>> preprocessing = [
...     ScaleRangeDescr(
...         kwargs=ScaleRangeKwargs(
...             axes=(AxisId('y'), AxisId('x')),
...             max_percentile=99.8,
...             min_percentile=5.0,
...         )
...     ),
...     ClipDescr(kwargs=ClipKwargs(min=0.0, max=1.0)),
...     EnsureDtypeDescr(kwargs=EnsureDtypeKwargs(dtype="uint8")),
... ]
```
keyword arguments for EnsureDtypeDescr
1 nested property
keyword arguments for EnsureDtypeDescr
Environmental considerations for model training and deployment.
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
GPU/CPU specifications
Total compute hours
If applicable
Geographic location
kg CO2 equivalent
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
Dataset used for evaluation.
Source of the dataset.
Role of the dataset used for evaluation.
- `train`: dataset was (part of) the training data
- `validation`: dataset was (part of) the validation data used during training, e.g. used for model selection or hyperparameter tuning
- `test`: dataset was (part of) the designated test data; not used during training or validation, but acquired from the same source/distribution as the training data
- `independent`: dataset is entirely independent test data; not used during training or validation, and acquired from a different source/distribution than the training data
- `unknown`: role of the dataset is unknown; choose this if you are not certain whether (a subset of) the data was seen by the model during training.
Number of evaluated samples.
(Abbreviations of) each evaluation factor.
Evaluation factors are criteria along which model performance is evaluated, e.g. different image conditions like 'low SNR', 'high cell density', or different biological conditions like 'cell type A', 'cell type B'. An 'overall' factor may be included to summarize performance across all conditions.
Descriptions (long form) of each evaluation factor.
(Abbreviations of) metrics used for evaluation.
Description of each metric used.
Results for each metric (rows; outer list) and each evaluation factor (columns; inner list).
Model being evaluated.
Interpretation of results for general audience.
Consider:
- Overall model performance
- Comparison to existing methods
- Limitations and areas for improvement
A file description
File source
SHA256 hash value of the source file.
keyword arguments for FixedZeroMeanUnitVarianceDescr
The mean value(s) to normalize with.
The standard deviation value(s) to normalize with.
Size must match mean values.
The axis of the mean/std values to normalize each entry along that dimension separately.
Subtract a given mean and divide by the standard deviation.
Normalize with fixed, precomputed values for
FixedZeroMeanUnitVarianceKwargs.mean and FixedZeroMeanUnitVarianceKwargs.std
Use FixedZeroMeanUnitVarianceAlongAxisKwargs for independent scaling along given
axes.
Examples:
- scalar value for whole tensor
- in YAML

```yaml
preprocessing:
- id: fixed_zero_mean_unit_variance
  kwargs:
    mean: 103.5
    std: 13.7
```

- in Python

```python
>>> preprocessing = [FixedZeroMeanUnitVarianceDescr(
...     kwargs=FixedZeroMeanUnitVarianceKwargs(mean=103.5, std=13.7)
... )]
```
- independently along an axis
- in YAML

```yaml
preprocessing:
- id: fixed_zero_mean_unit_variance
  kwargs:
    axis: channel
    mean: [101.5, 102.5, 103.5]
    std: [11.7, 12.7, 13.7]
```

- in Python

```python
>>> preprocessing = [FixedZeroMeanUnitVarianceDescr(
...     kwargs=FixedZeroMeanUnitVarianceAlongAxisKwargs(
...         axis=AxisId("channel"),
...         mean=[101.5, 102.5, 103.5],
...         std=[11.7, 12.7, 13.7],
...     )
... )]
```
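As a plain-numpy illustration (a sketch, not part of bioimageio.spec; the helper name is made up), the normalization described above amounts to subtracting the mean and dividing by the standard deviation, optionally with per-entry values along one axis:

```python
import numpy as np

# Illustrative sketch of fixed_zero_mean_unit_variance (not the spec's
# implementation): subtract a fixed mean and divide by a fixed std.
def fixed_zero_mean_unit_variance(tensor, mean, std, axis=None):
    mean = np.asarray(mean, dtype="float32")
    std = np.asarray(std, dtype="float32")
    if axis is not None:
        # move the normalized axis last so per-entry mean/std broadcast
        tensor = np.moveaxis(tensor, axis, -1)
        result = (tensor - mean) / std
        return np.moveaxis(result, -1, axis)
    return (tensor - mean) / std

image = np.full((2, 2), 103.5, dtype="float32")
print(fixed_zero_mean_unit_variance(image, mean=103.5, std=13.7))  # all zeros
```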
keyword arguments for FixedZeroMeanUnitVarianceDescr
The mean value to normalize with.
The standard deviation value to normalize with.
Output tensor shape depending on an input tensor shape.
shape(output_tensor) = shape(input_tensor) * scale + 2 * offset
Name of the reference tensor.
output_pix/input_pix for each dimension.
'null' values indicate new dimensions, whose length is defined by 2*offset
Position of origin with respect to the input.
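The shape relation above can be sketched in plain Python (the helper name and example values are illustrative, not part of the spec):

```python
# Sketch: compute an output tensor shape from an input shape via
#   shape(output_tensor) = shape(input_tensor) * scale + 2 * offset,
# where a scale of None marks a new dimension of length 2 * offset.
def implicit_output_shape(input_shape, scale, offset):
    return [
        int(2 * off) if s is None else int(length * s + 2 * off)
        for length, s, off in zip(input_shape, scale, offset)
    ]

# halve y/x and introduce a new leading dimension of length 2 * 1.5 = 3
print(implicit_output_shape([1, 512, 512], scale=[None, 0.5, 0.5], offset=[1.5, 0, 0]))  # [3, 256, 256]
```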
The size/length of this axis can be specified as
- a fixed integer
- a parameterized series of valid sizes (`ParameterizedSize`)
- a reference to another axis with an optional offset (`SizeReference`)
A short description of this axis beyond its type and id.
If a model has a concatenable input axis, it can be processed blockwise,
splitting a longer sample axis into blocks matching its input tensor description.
Output axes are concatenable if they have a SizeReference to a concatenable
input axis.
The size/length of this axis can be specified as
- a fixed integer
- a reference to another axis with an optional offset (`SizeReference`)
- a data dependent size using `DataDependentSize` (size is only known after model inference)
A short description of this axis beyond its type and id.
Tuple (minimum, maximum) specifying the allowed range of the data in this tensor.
None corresponds to min/max of what can be expressed by type.
[
null,
null
]
Scale for data on an interval (or ratio) scale.
Offset for data on a ratio scale.
Source of the .keras weights file.
wraps a packaging.version.Version instance for validation in pydantic models
Keras backend used to create these weights.
SHA256 hash value of the source file.
Authors
Either the person(s) that have trained this model resulting in the original weights file.
(If this is the initial weights entry, i.e. it does not have a parent)
Or the person(s) who have converted the weights to this weights format.
(If this is a child weight, i.e. it has a parent field)
The source weights these weights were converted from.
For example, if a model's weights were converted from the pytorch_state_dict format to torchscript,
the pytorch_state_dict weights entry has no parent and is the parent of the torchscript weights.
All weights entries except one (the initial set of weights resulting from training the model)
need to have this field.
A comment about this weights entry, for example how these weights were created.
A fixed set of nominal or an ascending sequence of ordinal values.
In this case data.type is required to be an unsigned integer type, e.g. 'uint8'.
String values are interpreted as labels for tensor values 0, ..., N.
Note: as YAML 1.2 does not natively support a "set" datatype,
nominal values should be given as a sequence (aka list/array) as well.
A sequence of valid shapes given by shape_k = min + k * step for k in {0, 1, ...}.
The minimum input shape
The minimum shape change
Describes a range of valid tensor axis sizes as size = min + n*step.
- min and step are given by the model description.
- All blocksize parameters n = 0, 1, 2, ... yield a valid size.
- A greater blocksize parameter n results in a greater size. This allows adjusting the axis size more generically.
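The first few valid sizes of such an axis can be enumerated directly (a small sketch; the function name is made up for illustration):

```python
# Sketch: enumerate valid sizes of a parameterized axis, size = min + n * step.
def valid_sizes(min_size, step, n_max=5):
    return [min_size + n * step for n in range(n_max + 1)]

print(valid_sizes(64, 16))  # [64, 80, 96, 112, 128, 144]
```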
A path relative to the rdf.yaml file (also if the RDF source is a URL).
Describes what small numerical differences -- if any -- may be tolerated in the generated output when executing in different environments.
A tensor element output is considered mismatched to the test_tensor if abs(output - test_tensor) > absolute_tolerance + relative_tolerance * abs(test_tensor). (Internally we call numpy.testing.assert_allclose.)
Motivation: For testing we can request the respective deep learning frameworks to be as reproducible as possible by setting seeds and choosing deterministic algorithms, but differences in operating systems, available hardware and installed drivers may still lead to numerical differences.
Maximum relative tolerance of reproduced test tensor.
Maximum absolute tolerance of reproduced test tensor.
Maximum number of mismatched elements/pixels per million to tolerate.
Limits the output tensor IDs these reproducibility details apply to.
[]
Limits the weights formats these details apply to.
[]
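The mismatch criterion quoted above can be sketched in numpy (a sketch only; the tolerance values below are illustrative, not defaults taken from the spec):

```python
import numpy as np

# Sketch: count elements considered mismatched per the criterion
#   abs(output - test_tensor) > absolute_tolerance + relative_tolerance * abs(test_tensor)
def mismatched_elements_per_million(output, test_tensor, relative_tolerance=1e-4, absolute_tolerance=2e-7):
    mismatched = np.abs(output - test_tensor) > (
        absolute_tolerance + relative_tolerance * np.abs(test_tensor)
    )
    return int(round(mismatched.mean() * 1e6))

reference = np.ones((1000,), dtype="float32")
reproduced = reference.copy()
reproduced[0] += 1.0  # one clearly mismatched element out of 1000
print(mismatched_elements_per_million(reproduced, reference))  # 1000
```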
Run mode name
Run mode specific keyword arguments
Keyword arguments for ScaleLinearDescr
The axis of gain and offset values.
multiplicative factor
additive term
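In plain numpy the gain/offset pair amounts to an affine transform (a sketch, not the spec's implementation):

```python
import numpy as np

# Sketch: scale_linear applies out = tensor * gain + offset elementwise.
def scale_linear(tensor, gain, offset):
    return tensor * gain + offset

print(scale_linear(np.array([0.0, 1.0, 2.0]), gain=2.0, offset=1.0))  # [1. 3. 5.]
```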
A tensor axis size (extent in pixels/frames) defined in relation to a reference axis.
axis.size = reference.size * reference.scale / axis.scale + offset
Note:
- The axis and the referenced axis need to have the same unit (or no unit).
- Batch axes may not be referenced.
- Fractions are rounded down.
- If the reference axis is `concatenable`, the referencing axis is assumed to be `concatenable` as well, with the same block order.
Example:
An anisotropic input image of w*h=100*49 pixels depicts a physical space of 200*196 mm².
Let's assume that we want to express the image height h in relation to its width w
instead of only accepting input images of exactly 100*49 pixels
(for example to express a range of valid image shapes by parametrizing w, see ParameterizedSize).

```python
>>> w = SpaceInputAxis(id=AxisId("w"), size=100, unit="millimeter", scale=2)
>>> h = SpaceInputAxis(
...     id=AxisId("h"),
...     size=SizeReference(tensor_id=TensorId("input"), axis_id=AxisId("w"), offset=-1),
...     unit="millimeter",
...     scale=4,
... )
>>> print(h.size.get_size(h, w))
49
```
⇒ h = w * w.scale / h.scale + offset = 100 * 2mm / 4mm - 1 = 49
tensor id of the reference axis
axis id of the reference axis
The softmax function.
Examples:
- in YAML

```yaml
postprocessing:
- id: softmax
  kwargs:
    axis: channel
```

- in Python:

```python
>>> postprocessing = [SoftmaxDescr(kwargs=SoftmaxKwargs(axis=AxisId("channel")))]
```
keyword arguments for SoftmaxDescr
1 nested property
The axis to apply the softmax function along. Note: Defaults to 'channel' axis (which may not exist, in which case a different axis id has to be specified).
keyword arguments for SoftmaxDescr
The axis to apply the softmax function along. Note: Defaults to 'channel' axis (which may not exist, in which case a different axis id has to be specified).
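A numerically stabilized softmax along a chosen axis can be sketched in numpy (an illustration, not the spec's implementation):

```python
import numpy as np

# Sketch: softmax along `axis`, shifted by the max for numerical stability.
def softmax(tensor, axis=-1):
    shifted = np.exp(tensor - tensor.max(axis=axis, keepdims=True))
    return shifted / shifted.sum(axis=axis, keepdims=True)

logits = np.array([[0.0, 1.0, 2.0]])
probabilities = softmax(logits, axis=1)
print(probabilities.sum(axis=1))  # each row sums to 1
```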
The size/length of this axis can be specified as
- a fixed integer
- a parameterized series of valid sizes (`ParameterizedSize`)
- a reference to another axis with an optional offset (`SizeReference`)
A short description of this axis beyond its type and id.
If a model has a concatenable input axis, it can be processed blockwise,
splitting a longer sample axis into blocks matching its input tensor description.
Output axes are concatenable if they have a SizeReference to a concatenable
input axis.
The size/length of this axis can be specified as
- a fixed integer
- a reference to another axis with an optional offset (see `SizeReference`)
A short description of this axis beyond its type and id.
The halo should be cropped from the output tensor to avoid boundary effects.
It is to be cropped from both sides, i.e. size_after_crop = size - 2 * halo.
To document a halo that is already cropped by the model use size.offset instead.
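Cropping such a halo can be sketched as follows (the helper name is illustrative; not part of bioimageio.spec):

```python
import numpy as np

# Sketch: crop `halo` elements from both sides of one axis,
# so size_after_crop = size - 2 * halo.
def crop_halo(tensor, halo, axis):
    index = [slice(None)] * tensor.ndim
    index[axis] = slice(halo, tensor.shape[axis] - halo)
    return tensor[tuple(index)]

output = np.zeros((1, 128, 128), dtype="float32")
print(crop_halo(output, halo=16, axis=1).shape)  # (1, 96, 128)
```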
A tensor axis size (extent in pixels/frames) defined in relation to a reference axis.
axis.size = reference.size * reference.scale / axis.scale + offset
Note:
- The axis and the referenced axis need to have the same unit (or no unit).
- Batch axes may not be referenced.
- Fractions are rounded down.
- If the reference axis is `concatenable`, the referencing axis is assumed to be `concatenable` as well, with the same block order.
Example:
An anisotropic input image of w*h=100*49 pixels depicts a physical space of 200*196 mm².
Let's assume that we want to express the image height h in relation to its width w
instead of only accepting input images of exactly 100*49 pixels
(for example to express a range of valid image shapes by parametrizing w, see ParameterizedSize).

```python
>>> w = SpaceInputAxis(id=AxisId("w"), size=100, unit="millimeter", scale=2)
>>> h = SpaceInputAxis(
...     id=AxisId("h"),
...     size=SizeReference(tensor_id=TensorId("input"), axis_id=AxisId("w"), offset=-1),
...     unit="millimeter",
...     scale=4,
... )
>>> print(h.size.get_size(h, w))
49
```
⇒ h = w * w.scale / h.scale + offset = 100 * 2mm / 4mm - 1 = 49
3 nested properties
tensor id of the reference axis
axis id of the reference axis
A short description of this axis beyond its type and id.
Stardist postprocessing including non-maximum suppression and converting polygon representations to instance labels
as described in:
- Uwe Schmidt, Martin Weigert, Coleman Broaddus, and Gene Myers. Cell Detection with Star-convex Polygons. International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), Granada, Spain, September 2018.
- Martin Weigert, Uwe Schmidt, Robert Haase, Ko Sugawara, and Gene Myers. Star-convex Polyhedra for 3D Object Detection and Segmentation in Microscopy. The IEEE Winter Conference on Applications of Computer Vision (WACV), Snowmass Village, Colorado, March 2020.
Note: Only available if the stardist package is installed.
The probability threshold for object candidate selection.
The IoU threshold for non-maximum suppression.
Grid size of network predictions.
Border region in which object probability is set to zero.
The probability threshold for object candidate selection.
The IoU threshold for non-maximum suppression.
Grid size of network predictions.
Border region in which object probability is set to zero.
Number of rays for 3D star-convex polyhedra.
Anisotropy factors for 3D star-convex polyhedra, i.e. the physical pixel size along each spatial axis.
Optional label to apply to any area of overlapping predicted objects.
The size/length of this axis can be specified as
- a fixed integer
- a parameterized series of valid sizes (`ParameterizedSize`)
- a reference to another axis with an optional offset (`SizeReference`)
A short description of this axis beyond its type and id.
If a model has a concatenable input axis, it can be processed blockwise,
splitting a longer sample axis into blocks matching its input tensor description.
Output axes are concatenable if they have a SizeReference to a concatenable
input axis.
The size/length of this axis can be specified as
- a fixed integer
- a reference to another axis with an optional offset (see `SizeReference`)
A short description of this axis beyond its type and id.
The halo should be cropped from the output tensor to avoid boundary effects.
It is to be cropped from both sides, i.e. size_after_crop = size - 2 * halo.
To document a halo that is already cropped by the model use size.offset instead.
A tensor axis size (extent in pixels/frames) defined in relation to a reference axis.
axis.size = reference.size * reference.scale / axis.scale + offset
Note:
- The axis and the referenced axis need to have the same unit (or no unit).
- Batch axes may not be referenced.
- Fractions are rounded down.
- If the reference axis is `concatenable`, the referencing axis is assumed to be `concatenable` as well, with the same block order.
Example:
An anisotropic input image of w*h=100*49 pixels depicts a physical space of 200*196 mm².
Let's assume that we want to express the image height h in relation to its width w
instead of only accepting input images of exactly 100*49 pixels
(for example to express a range of valid image shapes by parametrizing w, see ParameterizedSize).

```python
>>> w = SpaceInputAxis(id=AxisId("w"), size=100, unit="millimeter", scale=2)
>>> h = SpaceInputAxis(
...     id=AxisId("h"),
...     size=SizeReference(tensor_id=TensorId("input"), axis_id=AxisId("w"), offset=-1),
...     unit="millimeter",
...     scale=4,
... )
>>> print(h.size.get_size(h, w))
49
```
⇒ h = w * w.scale / h.scale + offset = 100 * 2mm / 4mm - 1 = 49
3 nested properties
tensor id of the reference axis
axis id of the reference axis
A short description of this axis beyond its type and id.
Detailed image preprocessing steps during model training:
Mention:
- Normalization methods
- Augmentation strategies
- Resizing/resampling procedures
- Artifact handling
Number of training epochs.
Batch size used in training.
Initial learning rate used in training.
Learning rate schedule used in training.
Loss function used in training, e.g. nn.MSELoss.
keyword arguments for the loss_function
optimizer, e.g. torch.optim.Adam
keyword arguments for the optimizer
Regularization techniques used during training, e.g. drop-out or weight decay.
Total training duration in hours.
name
wraps a packaging.version.Version instance for validation in pydantic models
Bioimage.io description of an application.
A human-friendly name of the resource description
The format version of this resource specification
(not the version of the resource description)
When creating a new resource always use the latest micro/patch version described here.
The format_version is important for any consumer software to understand how to parse the fields.
Cover images. Please use an image smaller than 500KB and an aspect ratio width to height of 2:1. The supported image formats are: ('.gif', '.jpeg', '.jpg', '.png', '.svg', '.tif', '.tiff')
UTF-8 emoji for display alongside the id.
The authors are the creators of the RDF and the primary points of contact.
file and other attachments
citations
A field for custom configuration that can contain any keys not present in the RDF spec.
This means you should not store, for example, a github repo URL in config since we already have the
git_repo field defined in the spec.
Keys in config may be very specific to a tool or consumer software. To avoid conflicting definitions,
it is recommended to wrap added configuration into a sub-field named with the specific domain or tool name,
for example:
```yaml
config:
  bioimageio:  # here is the domain name
    my_custom_key: 3837283
    another_key:
      nested: value
  imagej:  # config specific to ImageJ
    macro_dir: path/to/macro/file
```
If possible, please use snake_case for keys in config.
You may want to list linked files additionally under attachments to include them when packaging a resource
(packaging a resource means downloading/copying important linked files and creating a ZIP archive that contains
an altered rdf.yaml file with local references to the downloaded files)
URL to download the resource from (deprecated)
A URL to the Git repository where the resource is being developed.
An icon for illustration
IDs of other bioimage.io resources
The person who uploaded the model (e.g. to bioimage.io)
Maintainers of this resource.
If not specified, authors are maintainers and at least some of them should specify their github_user name.
Resource description file (RDF) source; used to keep track of where an rdf.yaml was loaded from. Do not set this field in a YAML file.
Associated tags
The version of the resource following SemVer 2.0.
version number (n-th published version, not the semantic version)
badges associated with this resource
URL or relative path to a markdown file with additional documentation.
The recommended documentation file name is README.md. An .md suffix is mandatory.
A SPDX license identifier. We do not support custom licenses beyond the SPDX license list; if you need that, please open a GitHub issue to discuss your intentions with the community.
bioimage.io-wide unique resource identifier assigned by bioimage.io; version unspecific.
URL or path to the source of the application
Bioimage.io description of an application.
A human-friendly name of the resource description. May only contain letters, digits, underscore, minus, parentheses and spaces.
The format version of this resource specification
A string containing a brief description.
Cover images. Please use an image smaller than 500KB and an aspect ratio width to height of 2:1 or 1:1. The supported image formats are: ('.gif', '.jpeg', '.jpg', '.png', '.svg')
UTF-8 emoji for display alongside the id.
The authors are the creators of this resource description and the primary points of contact.
file attachments
citations
A SPDX license identifier. We do not support custom licenses beyond the SPDX license list; if you need that, please open a GitHub issue to discuss your intentions with the community.
A URL to the Git repository where the resource is being developed.
An icon for illustration, e.g. on bioimage.io
IDs of other bioimage.io resources
The person who uploaded the model (e.g. to bioimage.io)
Maintainers of this resource.
If not specified, authors are maintainers and at least some of them have to specify their github_user name.
Associated tags
The version of the resource following SemVer 2.0.
A comment on the version of the resource.
URL or relative path to a markdown file encoded in UTF-8 with additional documentation.
The recommended documentation file name is README.md. An .md suffix is mandatory.
badges associated with this resource
A place to store additional metadata (often tool specific).
Such additional metadata is typically set programmatically by the respective tool or by people with specific insights into the tool. If you want to store additional metadata that does not match any of the other fields, think of a key unlikely to collide with anyone else's use-case/tool and save it here.
Please consider creating an issue in the bioimageio.spec repository if you are not sure if an existing field could cover your use case or if you think such a field should exist.
1 nested property
bioimage.io internal metadata.
bioimage.io-wide unique resource identifier assigned by bioimage.io; version unspecific.
The description from which this one is derived
URL or path to the source of the application
A bioimage.io dataset resource description file (dataset RDF) describes a dataset relevant to bioimage processing.
A human-friendly name of the resource description
The format version of this resource specification
(not the version of the resource description)
When creating a new resource always use the latest micro/patch version described here.
The format_version is important for any consumer software to understand how to parse the fields.
Cover images. Please use an image smaller than 500KB and an aspect ratio width to height of 2:1. The supported image formats are: ('.gif', '.jpeg', '.jpg', '.png', '.svg', '.tif', '.tiff')
UTF-8 emoji for display alongside the id.
The authors are the creators of the RDF and the primary points of contact.
file and other attachments
citations
A field for custom configuration that can contain any keys not present in the RDF spec.
This means you should not store, for example, a github repo URL in config since we already have the
git_repo field defined in the spec.
Keys in config may be very specific to a tool or consumer software. To avoid conflicting definitions,
it is recommended to wrap added configuration into a sub-field named with the specific domain or tool name,
for example:
```yaml
config:
  bioimageio:  # here is the domain name
    my_custom_key: 3837283
    another_key:
      nested: value
  imagej:  # config specific to ImageJ
    macro_dir: path/to/macro/file
```
If possible, please use snake_case for keys in config.
You may want to list linked files additionally under attachments to include them when packaging a resource
(packaging a resource means downloading/copying important linked files and creating a ZIP archive that contains
an altered rdf.yaml file with local references to the downloaded files)
URL to download the resource from (deprecated)
A URL to the Git repository where the resource is being developed.
An icon for illustration
IDs of other bioimage.io resources
The person who uploaded the model (e.g. to bioimage.io)
Maintainers of this resource.
If not specified, authors are maintainers and at least some of them should specify their github_user name.
Resource description file (RDF) source; used to keep track of where an rdf.yaml was loaded from. Do not set this field in a YAML file.
Associated tags
The version of the resource following SemVer 2.0.
version number (n-th published version, not the semantic version)
badges associated with this resource
URL or relative path to a markdown file with additional documentation.
The recommended documentation file name is README.md. An .md suffix is mandatory.
A SPDX license identifier. We do not support custom licenses beyond the SPDX license list; if you need that, please open a GitHub issue to discuss your intentions with the community.
bioimage.io-wide unique resource identifier assigned by bioimage.io; version unspecific.
URL to the source of the dataset.
Reference to a bioimage.io dataset.
A valid dataset id from the bioimage.io collection.
version number (n-th published version, not the semantic version) of linked dataset
A bioimage.io dataset resource description file (dataset RDF) describes a dataset relevant to bioimage processing.
A human-friendly name of the resource description. May only contain letters, digits, underscore, minus, parentheses and spaces.
The format version of this resource specification
A string containing a brief description.
Cover images. Please use an image smaller than 500KB and an aspect ratio width to height of 2:1 or 1:1. The supported image formats are: ('.gif', '.jpeg', '.jpg', '.png', '.svg')
UTF-8 emoji for display alongside the id.
The authors are the creators of this resource description and the primary points of contact.
file attachments
citations
A SPDX license identifier. We do not support custom licenses beyond the SPDX license list; if you need that, please open a GitHub issue to discuss your intentions with the community.
A URL to the Git repository where the resource is being developed.
An icon for illustration, e.g. on bioimage.io
IDs of other bioimage.io resources
The person who uploaded the model (e.g. to bioimage.io)
Maintainers of this resource.
If not specified, authors are maintainers and at least some of them have to specify their github_user name.
Associated tags
The version of the resource following SemVer 2.0.
A comment on the version of the resource.
URL or relative path to a markdown file encoded in UTF-8 with additional documentation.
The recommended documentation file name is README.md. An .md suffix is mandatory.
badges associated with this resource
A place to store additional metadata (often tool specific).
Such additional metadata is typically set programmatically by the respective tool or by people with specific insights into the tool. If you want to store additional metadata that does not match any of the other fields, think of a key unlikely to collide with anyone else's use-case/tool and save it here.
Please consider creating an issue in the bioimageio.spec repository if you are not sure if an existing field could cover your use case or if you think such a field should exist.
1 nested property
bioimage.io internal metadata.
bioimage.io-wide unique resource identifier assigned by bioimage.io; version unspecific.
The description from which this one is derived
URL to the source of the dataset.
Reference to a bioimage.io dataset.
A valid dataset id from the bioimage.io collection.
The version of the linked resource following SemVer 2.0.
free text description
A digital object identifier (DOI) is the preferred citation reference.
See https://www.doi.org/ for details. (alternatively specify url)
URL to cite (preferably specify a doi instead)
bioimage.io internal metadata.
A citation that should be referenced in work using this resource.
free text description
A digital object identifier (DOI) is the preferred citation reference. See https://www.doi.org/ for details. Note: Either doi or url have to be specified.
URL to cite (preferably specify a doi instead/also). Note: Either doi or url have to be specified.
A place to store additional metadata (often tool specific).
Such additional metadata is typically set programmatically by the respective tool or by people with specific insights into the tool. If you want to store additional metadata that does not match any of the other fields, think of a key unlikely to collide with anyone else's use-case/tool and save it here.
Please consider creating an issue in the bioimageio.spec repository if you are not sure if an existing field could cover your use case or if you think such a field should exist.
bioimage.io internal metadata.
Binarize the tensor with a fixed BinarizeKwargs.threshold.
Values above the threshold will be set to one, values below the threshold to zero.
keyword arguments for BinarizeDescr
1 nested property
The fixed threshold
keyword arguments for BinarizeDescr
The fixed threshold
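In numpy the binarize step described above reduces to a comparison (a sketch, not the spec's implementation):

```python
import numpy as np

# Sketch: binarize with a fixed threshold; values above become 1, others 0.
def binarize(tensor, threshold):
    return (tensor > threshold).astype(tensor.dtype)

print(binarize(np.array([0.1, 0.5, 0.9], dtype="float32"), threshold=0.5))  # [0. 0. 1.]
```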
Clip tensor values to a range.
Set tensor values below ClipKwargs.min to ClipKwargs.min
and above ClipKwargs.max to ClipKwargs.max.
keyword arguments for ClipDescr
2 nested properties
minimum value for clipping
maximum value for clipping
keyword arguments for ClipDescr
minimum value for clipping
maximum value for clipping
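Clipping maps directly onto `numpy.clip` (an illustration, not the spec's implementation):

```python
import numpy as np

# Sketch: clip tensor values into the closed range [min_value, max_value].
def clip(tensor, min_value, max_value):
    return np.clip(tensor, min_value, max_value)

print(clip(np.array([-0.5, 0.3, 1.7], dtype="float32"), 0.0, 1.0))
```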
Tensor name. No duplicates are allowed.
Axes identifying characters. Same length and order as the axes in shape.
| axis | description |
|---|---|
| b | batch (groups multiple samples) |
| i | instance/index/element |
| t | time |
| c | channel |
| z | spatial dimension z |
| y | spatial dimension y |
| x | spatial dimension x |
For now an input tensor is expected to be given as float32.
The data flow in bioimage.io models is explained in this diagram.
Specification of input tensor shape.
Tuple (minimum, maximum) specifying the allowed range of the data in this tensor.
If not specified, the full data range that can be expressed in data_type is allowed.
Description of how this input should be preprocessed.
The weights file.
SHA256 hash value of the source file.
Attachments that are specific to this weights entry.
Authors
Either the person(s) that have trained this model resulting in the original weights file.
(If this is the initial weights entry, i.e. it does not have a parent)
Or the person(s) who have converted the weights to this weights format.
(If this is a child weight, i.e. it has a parent field)
Dependency manager and dependency file, specified as <dependency manager>:<relative file path>.
The source weights these weights were converted from.
For example, if a model's weights were converted from the pytorch_state_dict format to torchscript,
The pytorch_state_dict weights entry has no parent and is the parent of the torchscript weights.
All weight entries except one (the initial set of weights resulting from training the model),
need to have this field.
TensorFlow version used to create these weights
Reference to a bioimage.io model.
A valid model id from the bioimage.io collection.
version number (n-th published version, not the semantic version) of linked model
Specification of the fields used in a bioimage.io-compliant RDF that describes AI models with pretrained weights.
These fields are typically stored in a YAML file which we call a model resource description file (model RDF).
A human-readable name of this model. It should be no longer than 64 characters and only contain letters, numbers, underscores, minus signs, or spaces.
The authors are the creators of the model RDF and the primary points of contact.
Version of the bioimage.io model description specification used.
When creating a new model always use the latest micro/patch version described here.
The format_version is important for any consumer software to understand how to parse the fields.
Specialized resource type 'model'
URL or relative path to a markdown file with additional documentation.
The recommended documentation file name is README.md. An .md suffix is mandatory.
The documentation should include a '[#[#]]# Validation' (sub)section
with details on how to quantitatively validate the model on unseen data.
Describes the input tensors expected by this model.
A SPDX license identifier. We do not support custom licenses beyond the SPDX license list; if you need one, please open a GitHub issue to discuss your intentions with the community.
Describes the output tensors.
Test input tensors compatible with the inputs description for a single test case.
This means if your model has more than one input, you should provide one URL/relative path for each input.
Each test input should be a file with an ndarray in
numpy.lib file format.
The extension must be '.npy'.
Analogous to test_inputs.
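A test tensor for either field can be written with numpy; the shape below is a hypothetical single-channel 64x64 input:

```python
import numpy as np
import os
import tempfile

# One .npy file per model input/output; the extension must be '.npy'.
arr = np.random.rand(1, 1, 64, 64).astype("float32")
path = os.path.join(tempfile.mkdtemp(), "test_input_0.npy")
np.save(path, arr)
loaded = np.load(path)  # consumers read it back with numpy.lib's format
```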
6 nested properties
Cover images. Please use an image smaller than 500KB and an aspect ratio width to height of 2:1. The supported image formats are: ('.gif', '.jpeg', '.jpg', '.png', '.svg', '.tif', '.tiff')
UTF-8 emoji for display alongside the id.
file and other attachments
citations
A field for custom configuration that can contain any keys not present in the RDF spec.
This means you should not store, for example, a github repo URL in config since we already have the
git_repo field defined in the spec.
Keys in config may be very specific to a tool or consumer software. To avoid conflicting definitions,
it is recommended to wrap added configuration into a sub-field named with the specific domain or tool name,
for example:
config:
bioimageio: # here is the domain name
my_custom_key: 3837283
another_key:
nested: value
imagej: # config specific to ImageJ
macro_dir: path/to/macro/file
If possible, please use snake_case for keys in config.
You may want to list linked files additionally under attachments to include them when packaging a resource
(packaging a resource means downloading/copying important linked files and creating a ZIP archive that contains
an altered rdf.yaml file with local references to the downloaded files)
URL to download the resource from (deprecated)
A URL to the Git repository where the resource is being developed.
An icon for illustration
IDs of other bioimage.io resources
The person who uploaded the model (e.g. to bioimage.io)
Maintainers of this resource.
If not specified, authors are maintainers and at least some of them should specify their github_user name
Resource description file (RDF) source; used to keep track of where an rdf.yaml was loaded from. Do not set this field in a YAML file.
Associated tags
The version of the resource following SemVer 2.0.
version number (n-th published version, not the semantic version)
bioimage.io-wide unique resource identifier assigned by bioimage.io; version unspecific.
The persons that have packaged and uploaded this model.
Only required if those persons differ from the authors.
The model from which this model is derived, e.g. by fine-tuning the weights.
Custom run mode for this model: for more complex prediction procedures like test time data augmentation that currently cannot be expressed in the specification. No standard run modes are defined yet.
URLs/relative paths to sample inputs to illustrate possible inputs for the model, for example stored as PNG or TIFF images. The sample files primarily serve to inform a human user about an example use case.
URLs/relative paths to sample outputs corresponding to the sample_inputs.
The dataset used to train this model
The weights file.
SHA256 hash value of the source file.
Attachments that are specific to this weights entry.
Authors
Either the person(s) that have trained this model resulting in the original weights file.
(If this is the initial weights entry, i.e. it does not have a parent)
Or the person(s) who have converted the weights to this weights format.
(If this is a child weight, i.e. it has a parent field)
Dependency manager and dependency file, specified as <dependency manager>:<relative file path>.
The source weights these weights were converted from.
For example, if a model's weights were converted from the pytorch_state_dict format to torchscript,
The pytorch_state_dict weights entry has no parent and is the parent of the torchscript weights.
All weight entries except one (the initial set of weights resulting from training the model),
need to have this field.
ONNX opset version
Tensor name. No duplicates are allowed.
Axes identifying characters. Same length and order as the axes in shape.
| axis | description |
|---|---|
| b | batch (groups multiple samples) |
| i | instance/index/element |
| t | time |
| c | channel |
| z | spatial dimension z |
| y | spatial dimension y |
| x | spatial dimension x |
Data type. The data flow in bioimage.io models is explained in this diagram.
Output tensor shape.
Tuple (minimum, maximum) specifying the allowed range of the data in this tensor.
If not specified, the full data range that can be expressed in data_type is allowed.
The halo that should be cropped from the output tensor to avoid boundary effects.
The halo is to be cropped from both sides, i.e. shape_after_crop = shape - 2 * halo.
To document a halo that is already cropped by the model shape.offset has to be used instead.
Description of how this output should be postprocessed.
The weights file.
callable returning a torch.nn.Module instance.
Local implementation: <relative path to file>:<identifier of implementation within the file>.
Implementation in a dependency: <dependency-package>.<[dependency-module]>.<identifier>.
SHA256 hash value of the source file.
Attachments that are specific to this weights entry.
Authors
Either the person(s) that have trained this model resulting in the original weights file.
(If this is the initial weights entry, i.e. it does not have a parent)
Or the person(s) who have converted the weights to this weights format.
(If this is a child weight, i.e. it has a parent field)
Dependency manager and dependency file, specified as <dependency manager>:<relative file path>.
The source weights these weights were converted from.
For example, if a model's weights were converted from the pytorch_state_dict format to torchscript,
The pytorch_state_dict weights entry has no parent and is the parent of the torchscript weights.
All weight entries except one (the initial set of weights resulting from training the model),
need to have this field.
The SHA256 of the architecture source file, if the architecture is not defined in a module listed in dependencies
You can drag and drop your file to this
online tool to generate a SHA256 in your browser.
Or you can generate a SHA256 checksum with Python's hashlib; here is a code snippet.
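A sketch of such a hashlib snippet (the function name and chunk size are illustrative choices):

```python
import hashlib

def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    # Stream the file in chunks so large weight files need not fit in memory.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```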
key word arguments for the architecture callable
Version of the PyTorch library used.
If dependencies is specified, it should include pytorch and the version has to match.
(dependencies overrules pytorch_version)
Fixed linear scaling.
key word arguments for ScaleLinearDescr
3 nested properties
The subset of axes to scale jointly. For example xy to scale the two image axes for 2d data jointly.
multiplicative factor
additive term
key word arguments for ScaleLinearDescr
The subset of axes to scale jointly. For example xy to scale the two image axes for 2d data jointly.
multiplicative factor
additive term
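A minimal numpy sketch of the scale_linear computation (gain and offset may be scalars or per-channel values; this is illustrative, not the reference implementation):

```python
import numpy as np

def scale_linear(tensor: np.ndarray, gain=1.0, offset=0.0) -> np.ndarray:
    # out = tensor * gain + offset, broadcast elementwise
    return tensor * np.asarray(gain) + np.asarray(offset)

x = np.array([1.0, 2.0], dtype=np.float32)
y = scale_linear(x, gain=2.0, offset=1.0)  # [3. 5.]
```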
Scale the tensor s.t. its mean and variance match a reference tensor.
key word arguments for ScaleMeanVarianceDescr
4 nested properties
Mode for computing mean and variance.
| mode | description |
|---|---|
| per_dataset | Compute for the entire dataset |
| per_sample | Compute for each sample individually |
Name of tensor to match.
The subset of axes to scale jointly. For example xy to normalize the two image axes for 2d data jointly. Default: scale all non-batch axes jointly.
Epsilon for numeric stability: out = (tensor - mean) / (std + eps) * (ref_std + eps) + ref_mean.
key word arguments for ScaleMeanVarianceDescr
Mode for computing mean and variance.
| mode | description |
|---|---|
| per_dataset | Compute for the entire dataset |
| per_sample | Compute for each sample individually |
Name of tensor to match.
The subset of axes to scale jointly. For example xy to normalize the two image axes for 2d data jointly. Default: scale all non-batch axes jointly.
Epsilon for numeric stability: out = (tensor - mean) / (std + eps) * (ref_std + eps) + ref_mean.
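The formula above can be sketched in numpy as follows (per_sample mode over the whole tensor; a simplification of the spec's per-axis handling):

```python
import numpy as np

def scale_mean_variance(tensor: np.ndarray, ref: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    # out = (tensor - mean) / (std + eps) * (ref_std + eps) + ref_mean
    mean, std = tensor.mean(), tensor.std()
    return (tensor - mean) / (std + eps) * (ref.std() + eps) + ref.mean()

rng = np.random.default_rng(0)
t = rng.normal(5.0, 2.0, size=1000)
ref = rng.normal(0.0, 1.0, size=1000)
out = scale_mean_variance(t, ref)  # mean/std now match ref's
```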
Scale with percentiles.
key word arguments for ScaleRangeDescr
For min_percentile=0.0 (the default) and max_percentile=100 (the default)
this processing step normalizes data to the [0, 1] interval.
For other percentiles the normalized values will partially be outside the [0, 1]
interval. Use ScaleRange followed by ClipDescr if you want to limit the
normalized values to a range.
6 nested properties
Mode for computing percentiles.
| mode | description |
|---|---|
| per_dataset | compute for the entire dataset |
| per_sample | compute for each sample individually |
The subset of axes to normalize jointly. For example xy to normalize the two image axes for 2d data jointly.
The lower percentile used to determine the value to align with zero.
The upper percentile used to determine the value to align with one.
Has to be bigger than min_percentile.
The range is 1 to 100 instead of 0 to 100 to avoid mistakenly
accepting percentiles specified in the range 0.0 to 1.0.
Epsilon for numeric stability.
out = (tensor - v_lower) / (v_upper - v_lower + eps);
with v_lower,v_upper values at the respective percentiles.
Tensor name to compute the percentiles from. Default: The tensor itself.
For any tensor in inputs only input tensor references are allowed.
For a tensor in outputs only input tensor references are allowed if mode: per_dataset
key word arguments for ScaleRangeDescr
For min_percentile=0.0 (the default) and max_percentile=100 (the default)
this processing step normalizes data to the [0, 1] interval.
For other percentiles the normalized values will partially be outside the [0, 1]
interval. Use ScaleRange followed by ClipDescr if you want to limit the
normalized values to a range.
Mode for computing percentiles.
| mode | description |
|---|---|
| per_dataset | compute for the entire dataset |
| per_sample | compute for each sample individually |
The subset of axes to normalize jointly. For example xy to normalize the two image axes for 2d data jointly.
The lower percentile used to determine the value to align with zero.
The upper percentile used to determine the value to align with one.
Has to be bigger than min_percentile.
The range is 1 to 100 instead of 0 to 100 to avoid mistakenly
accepting percentiles specified in the range 0.0 to 1.0.
Epsilon for numeric stability.
out = (tensor - v_lower) / (v_upper - v_lower + eps);
with v_lower,v_upper values at the respective percentiles.
Tensor name to compute the percentiles from. Default: The tensor itself.
For any tensor in inputs only input tensor references are allowed.
For a tensor in outputs only input tensor references are allowed if mode: per_dataset
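A minimal numpy sketch of the scale_range computation for a single tensor (illustrative only; the spec additionally supports joint axes and reference tensors):

```python
import numpy as np

def scale_range(tensor: np.ndarray, min_percentile: float = 0.0,
                max_percentile: float = 100.0, eps: float = 1e-6) -> np.ndarray:
    # v_lower/v_upper are the values at the requested percentiles;
    # out = (tensor - v_lower) / (v_upper - v_lower + eps)
    v_lower = np.percentile(tensor, min_percentile)
    v_upper = np.percentile(tensor, max_percentile)
    return (tensor - v_lower) / (v_upper - v_lower + eps)

x = np.linspace(10.0, 20.0, 5)
out = scale_range(x)  # approximately [0, 0.25, 0.5, 0.75, 1]
```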
The logistic sigmoid function, a.k.a. expit function.
The multi-file weights. All required files/folders should be packaged in a ZIP archive.
SHA256 hash value of the source file.
Attachments that are specific to this weights entry.
Authors
Either the person(s) that have trained this model resulting in the original weights file.
(If this is the initial weights entry, i.e. it does not have a parent)
Or the person(s) who have converted the weights to this weights format.
(If this is a child weight, i.e. it has a parent field)
Dependency manager and dependency file, specified as <dependency manager>:<relative file path>.
The source weights these weights were converted from.
For example, if a model's weights were converted from the pytorch_state_dict format to torchscript,
The pytorch_state_dict weights entry has no parent and is the parent of the torchscript weights.
All weight entries except one (the initial set of weights resulting from training the model),
need to have this field.
Version of the TensorFlow library used.
The weights file.
SHA256 hash value of the source file.
Attachments that are specific to this weights entry.
Authors
Either the person(s) that have trained this model resulting in the original weights file.
(If this is the initial weights entry, i.e. it does not have a parent)
Or the person(s) who have converted the weights to this weights format.
(If this is a child weight, i.e. it has a parent field)
Dependency manager and dependency file, specified as <dependency manager>:<relative file path>.
The source weights these weights were converted from.
For example, if a model's weights were converted from the pytorch_state_dict format to torchscript,
The pytorch_state_dict weights entry has no parent and is the parent of the torchscript weights.
All weight entries except one (the initial set of weights resulting from training the model),
need to have this field.
Version of the TensorFlow library used.
The weights file.
SHA256 hash value of the source file.
Attachments that are specific to this weights entry.
Authors
Either the person(s) that have trained this model resulting in the original weights file.
(If this is the initial weights entry, i.e. it does not have a parent)
Or the person(s) who have converted the weights to this weights format.
(If this is a child weight, i.e. it has a parent field)
Dependency manager and dependency file, specified as <dependency manager>:<relative file path>.
The source weights these weights were converted from.
For example, if a model's weights were converted from the pytorch_state_dict format to torchscript,
The pytorch_state_dict weights entry has no parent and is the parent of the torchscript weights.
All weight entries except one (the initial set of weights resulting from training the model),
need to have this field.
Version of the PyTorch library used.
Subtract mean and divide by variance.
key word arguments for ZeroMeanUnitVarianceDescr
5 nested properties
The subset of axes to normalize jointly.
For example xy to normalize the two image axes for 2d data jointly.
Mode for computing mean and variance.
| mode | description |
|---|---|
| fixed | Fixed values for mean and variance |
| per_dataset | Compute for the entire dataset |
| per_sample | Compute for each sample individually |
The mean value(s) to use for mode: fixed.
For example [1.1, 2.2, 3.3] in the case of a 3 channel image with axes: xy.
The standard deviation values to use for mode: fixed. Analogous to mean.
epsilon for numeric stability: out = (tensor - mean) / (std + eps).
key word arguments for ZeroMeanUnitVarianceDescr
The subset of axes to normalize jointly.
For example xy to normalize the two image axes for 2d data jointly.
Mode for computing mean and variance.
| mode | description |
|---|---|
| fixed | Fixed values for mean and variance |
| per_dataset | Compute for the entire dataset |
| per_sample | Compute for each sample individually |
The mean value(s) to use for mode: fixed.
For example [1.1, 2.2, 3.3] in the case of a 3 channel image with axes: xy.
The standard deviation values to use for mode: fixed. Analogous to mean.
epsilon for numeric stability: out = (tensor - mean) / (std + eps).
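The zero-mean/unit-variance formula can be sketched in numpy (the `axis` argument plays the role of the axes normalized jointly; illustrative only):

```python
import numpy as np

def zero_mean_unit_variance(tensor: np.ndarray, axis=None, eps: float = 1e-6) -> np.ndarray:
    # out = (tensor - mean) / (std + eps)
    mean = tensor.mean(axis=axis, keepdims=True)
    std = tensor.std(axis=axis, keepdims=True)
    return (tensor - mean) / (std + eps)

x = np.random.default_rng(0).normal(3.0, 5.0, size=(4, 16))
out = zero_mean_unit_variance(x)  # mean close to 0, std close to 1
```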
Binarize the tensor with a fixed threshold.
Values above BinarizeKwargs.threshold/BinarizeAlongAxisKwargs.threshold
will be set to one, values below the threshold to zero.
Examples:
- in YAML:
    postprocessing:
      - id: binarize
        kwargs:
          axis: 'channel'
          threshold: [0.25, 0.5, 0.75]
- in Python:
    postprocessing = [BinarizeDescr(
        kwargs=BinarizeAlongAxisKwargs(
            axis=AxisId('channel'),
            threshold=[0.25, 0.5, 0.75],
        )
    )]
key word arguments for BinarizeDescr
The fixed threshold
Tolerances to allow when reproducing the model's test outputs from the model's test inputs. Only the first entry matching tensor id and weights format is considered.
[]
Funding agency, grant number if applicable
Model architecture type, e.g., 3D U-Net, ResNet, transformer
Text description of model architecture.
Input modality, e.g., fluorescence microscopy, electron microscopy
Biological structure(s) the model is designed to analyze, e.g., nuclei, mitochondria, cells
Bioimage-specific task type, e.g., segmentation, classification, detection, denoising
A new version of this model exists with a different model id.
Describe how the model may be misused in bioimage analysis contexts and what users should not do with the model.
Known biases, risks, technical limitations, and recommendations for model use.
4 nested properties
Biases in training data or model behavior.
Potential risks in the context of bioimage analysis.
Technical limitations and failure modes.
Mitigation strategies regarding known_biases, risks, and limitations, as well as applicable best practices.
Consider:
- How to use a validation dataset?
- How to manually validate?
- Feasibility of domain adaptation for different experimental setups?
Total number of model parameters.
11 nested properties
Detailed image preprocessing steps during model training:
Mention:
- Normalization methods
- Augmentation strategies
- Resizing/resampling procedures
- Artifact handling
Number of training epochs.
Batch size used in training.
Initial learning rate used in training.
Learning rate schedule used in training.
Loss function used in training, e.g. nn.MSELoss.
key word arguments for the loss_function
optimizer, e.g. torch.optim.Adam
key word arguments for the optimizer
Regularization techniques used during training, e.g. drop-out or weight decay.
Total training duration in hours.
Average inference time per image/tile. Specify hardware and image size. Multiple examples can be given.
GPU memory needed for inference. Multiple examples with different image size can be given.
GPU memory needed for training. Multiple examples with different image/batch sizes can be given.
Quantitative model evaluations.
Note: At the moment we recommend including only a single test dataset (with evaluation factors that may mark subsets of the dataset) to avoid confusion and make the presentation of results cleaner.
Environmental considerations for model training and deployment.
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
5 nested properties
GPU/CPU specifications
Total compute hours
If applicable
Geographic location
kg CO2 equivalent
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
Set tensor values below min to min and above max to max.
See ScaleRangeDescr for examples.
key word arguments for ClipDescr
5 nested properties
Minimum value for clipping.
Exclusive with min_percentile
Minimum percentile for clipping.
Exclusive with min.
In range [0, 100).
Maximum value for clipping.
Exclusive with max_percentile.
Maximum percentile for clipping.
Exclusive with max.
In range (1, 100].
The subset of axes to determine percentiles jointly,
i.e. axes to reduce to compute min/max from min_percentile/max_percentile.
For example to clip 'batch', 'x' and 'y' jointly in a tensor ('batch', 'channel', 'y', 'x')
resulting in a tensor of equal shape with clipped values per channel, specify axes=('batch', 'x', 'y').
To clip samples independently, leave out the 'batch' axis.
Only valid if min_percentile and/or max_percentile are set.
Default: Compute percentiles over all axes jointly.
key word arguments for ClipDescr
Minimum value for clipping.
Exclusive with min_percentile
Minimum percentile for clipping.
Exclusive with min.
In range [0, 100).
Maximum value for clipping.
Exclusive with max_percentile.
Maximum percentile for clipping.
Exclusive with max.
In range (1, 100].
The subset of axes to determine percentiles jointly,
i.e. axes to reduce to compute min/max from min_percentile/max_percentile.
For example to clip 'batch', 'x' and 'y' jointly in a tensor ('batch', 'channel', 'y', 'x')
resulting in a tensor of equal shape with clipped values per channel, specify axes=('batch', 'x', 'y').
To clip samples independently, leave out the 'batch' axis.
Only valid if min_percentile and/or max_percentile are set.
Default: Compute percentiles over all axes jointly.
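A numpy sketch of percentile-based clipping with joint axes (illustrative only; `axis` corresponds to the axes reduced to compute the bounds):

```python
import numpy as np

def clip_percentiles(tensor: np.ndarray, min_percentile: float = 0.0,
                     max_percentile: float = 100.0, axis=None) -> np.ndarray:
    # Percentile bounds are computed over `axis` (keepdims so they broadcast),
    # then values outside [lo, hi] are clipped to the bounds.
    lo = np.percentile(tensor, min_percentile, axis=axis, keepdims=True)
    hi = np.percentile(tensor, max_percentile, axis=axis, keepdims=True)
    return np.clip(tensor, lo, hi)

x = np.arange(100.0)
out = clip_percentiles(x, min_percentile=5, max_percentile=95)
```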
17 nested properties
Tolerances to allow when reproducing the model's test outputs from the model's test inputs. Only the first entry matching tensor id and weights format is considered.
[]
Funding agency, grant number if applicable
Model architecture type, e.g., 3D U-Net, ResNet, transformer
Text description of model architecture.
Input modality, e.g., fluorescence microscopy, electron microscopy
Biological structure(s) the model is designed to analyze, e.g., nuclei, mitochondria, cells
Bioimage-specific task type, e.g., segmentation, classification, detection, denoising
A new version of this model exists with a different model id.
Describe how the model may be misused in bioimage analysis contexts and what users should not do with the model.
Known biases, risks, technical limitations, and recommendations for model use.
4 nested properties
Biases in training data or model behavior.
Potential risks in the context of bioimage analysis.
Technical limitations and failure modes.
Mitigation strategies regarding known_biases, risks, and limitations, as well as applicable best practices.
Consider:
- How to use a validation dataset?
- How to manually validate?
- Feasibility of domain adaptation for different experimental setups?
Total number of model parameters.
11 nested properties
Detailed image preprocessing steps during model training:
Mention:
- Normalization methods
- Augmentation strategies
- Resizing/resampling procedures
- Artifact handling
Number of training epochs.
Batch size used in training.
Initial learning rate used in training.
Learning rate schedule used in training.
Loss function used in training, e.g. nn.MSELoss.
key word arguments for the loss_function
optimizer, e.g. torch.optim.Adam
key word arguments for the optimizer
Regularization techniques used during training, e.g. drop-out or weight decay.
Total training duration in hours.
Average inference time per image/tile. Specify hardware and image size. Multiple examples can be given.
GPU memory needed for inference. Multiple examples with different image size can be given.
GPU memory needed for training. Multiple examples with different image/batch sizes can be given.
Quantitative model evaluations.
Note: At the moment we recommend including only a single test dataset (with evaluation factors that may mark subsets of the dataset) to avoid confusion and make the presentation of results cleaner.
Environmental considerations for model training and deployment.
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
5 nested properties
GPU/CPU specifications
Total compute hours
If applicable
Geographic location
kg CO2 equivalent
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
tensor axes
Input tensor id. No duplicates are allowed across all inputs and outputs.
free text description
An example tensor to use for testing. Using the model with the test input tensors is expected to yield the test output tensors. Each test tensor has to be an ndarray in the numpy.lib file format. The file extension must be '.npy'.
A sample tensor to illustrate a possible input/output for the model,
The sample image primarily serves to inform a human user about an example use case
and is typically stored as .hdf5, .png or .tiff.
It has to be readable by the imageio library
(numpy's .npy format is not supported).
The image dimensionality has to match the number of axes specified in this tensor description.
Description of the tensor's data values, optionally per channel.
If specified per channel, the data type needs to match across channels.
{
"type": "float32",
"range": [
null,
null
],
"unit": "arbitrary unit",
"scale": 1.0,
"offset": null
}
indicates that this tensor may be None
Description of how this input should be preprocessed.
notes:
- If preprocessing does not start with an 'ensure_dtype' entry, it is added to ensure an input tensor's data type matches the input tensor's data description.
- If preprocessing does not end with an 'ensure_dtype' or 'binarize' entry, an 'ensure_dtype' step is added to ensure preprocessing steps are not unintentionally changing the data type.
Source of the weights file.
wraps a packaging.version.Version instance for validation in pydantic models
SHA256 hash value of the source file.
Authors
Either the person(s) that have trained this model resulting in the original weights file.
(If this is the initial weights entry, i.e. it does not have a parent)
Or the person(s) who have converted the weights to this weights format.
(If this is a child weight, i.e. it has a parent field)
The source weights these weights were converted from.
For example, if a model's weights were converted from the pytorch_state_dict format to torchscript,
The pytorch_state_dict weights entry has no parent and is the parent of the torchscript weights.
All weight entries except one (the initial set of weights resulting from training the model),
need to have this field.
A comment about this weights entry, for example how these weights were created.
Reference to a bioimage.io model.
A valid model id from the bioimage.io collection.
The version of the linked resource following SemVer 2.0.
Specification of the fields used in a bioimage.io-compliant RDF to describe AI models with pretrained weights. These fields are typically stored in a YAML file which we call a model resource description file (model RDF).
A human-readable name of this model. It should be no longer than 64 characters and may only contain letters, numbers, underscores, minus signs, parentheses, and spaces. We recommend choosing a name that refers to the model's task and image modality.
Version of the bioimage.io model description specification used.
When creating a new model always use the latest micro/patch version described here.
The format_version is important for any consumer software to understand how to parse the fields.
Specialized resource type 'model'
Describes the input tensors expected by this model.
Describes the output tensors.
7 nested properties
A string containing a brief description.
Cover images. Please use an image smaller than 500KB and an aspect ratio width to height of 2:1 or 1:1. The supported image formats are: ('.gif', '.jpeg', '.jpg', '.png', '.svg')
UTF-8 emoji for display alongside the id.
The authors are the creators of the model RDF and the primary points of contact.
file attachments
citations
A SPDX license identifier. We do not support custom license beyond the SPDX license list, if you need that please open a GitHub issue to discuss your intentions with the community.
A URL to the Git repository where the resource is being developed.
An icon for illustration, e.g. on bioimage.io
IDs of other bioimage.io resources
The person who uploaded the model (e.g. to bioimage.io)
Maintainers of this resource.
If not specified, authors are maintainers and at least some of them have to specify their github_user name
Associated tags
The version of the resource following SemVer 2.0.
A comment on the version of the resource.
bioimage.io-wide unique resource identifier assigned by bioimage.io; version unspecific.
URL or relative path to a markdown file with additional documentation.
The recommended documentation file name is README.md. An .md suffix is mandatory.
The documentation should include a '#[#] Validation' (sub)section
with details on how to quantitatively validate the model on unseen data.
The persons that have packaged and uploaded this model.
Only required if those persons differ from the authors.
The model from which this model is derived, e.g. by fine-tuning the weights.
Custom run mode for this model: for more complex prediction procedures like test time data augmentation that currently cannot be expressed in the specification. No standard run modes are defined yet.
The dataset used to train this model
2 nested properties
17 nested properties
Tolerances to allow when reproducing the model's test outputs from the model's test inputs. Only the first entry matching tensor id and weights format is considered.
[]
Funding agency, grant number if applicable
Model architecture type, e.g., 3D U-Net, ResNet, transformer
Text description of model architecture.
Input modality, e.g., fluorescence microscopy, electron microscopy
Biological structure(s) the model is designed to analyze, e.g., nuclei, mitochondria, cells
Bioimage-specific task type, e.g., segmentation, classification, detection, denoising
A new version of this model exists with a different model id.
Describe how the model may be misused in bioimage analysis contexts and what users should not do with the model.
Known biases, risks, technical limitations, and recommendations for model use.
Total number of model parameters.
Average inference time per image/tile. Specify hardware and image size. Multiple examples can be given.
GPU memory needed for inference. Multiple examples with different image size can be given.
GPU memory needed for training. Multiple examples with different image/batch sizes can be given.
Quantitative model evaluations.
Note: At the moment we recommend including only a single test dataset (with evaluation factors that may mark subsets of the dataset) to avoid confusion and make the presentation of results cleaner.
Environmental considerations for model training and deployment.
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
Source of the weights file.
ONNX opset version
SHA256 hash value of the source file.
Authors
Either the person(s) who trained this model, resulting in the original weights file
(if this is the initial weights entry, i.e. it does not have a parent),
or the person(s) who converted the weights to this weights format
(if this is a child weights entry, i.e. it has a parent field).
The source weights these weights were converted from.
For example, if a model's weights were converted from the pytorch_state_dict format to torchscript,
the pytorch_state_dict weights entry has no parent and is the parent of the torchscript weights.
All weight entries except one (the initial set of weights resulting from training the model)
need to have this field.
A comment about this weights entry, for example how these weights were created.
Source of the external ONNX data file holding the weights. (If present source holds the ONNX architecture without weights).
tensor axes
Output tensor id. No duplicates are allowed across all inputs and outputs.
free text description
An example tensor to use for testing. Using the model with the test input tensors is expected to yield the test output tensors. Each test tensor has to be an ndarray in the numpy.lib file format. The file extension must be '.npy'.
A sample tensor to illustrate a possible input/output for the model,
The sample image primarily serves to inform a human user about an example use case
and is typically stored as .hdf5, .png or .tiff.
It has to be readable by the imageio library
(numpy's .npy format is not supported).
The image dimensionality has to match the number of axes specified in this tensor description.
Description of the tensor's data values, optionally per channel.
If specified per channel, the data type needs to match across channels.
{
"type": "float32",
"range": [
null,
null
],
"unit": "arbitrary unit",
"scale": 1.0,
"offset": null
}
Description of how this output should be postprocessed.
note: postprocessing always ends with an 'ensure_dtype' operation.
If not given, this operation is added to cast to this tensor's data.type.
Source of the weights file.
wraps a packaging.version.Version instance for validation in pydantic models
SHA256 hash value of the source file.
Authors
Either the person(s) who trained this model, resulting in the original weights file
(if this is the initial weights entry, i.e. it does not have a parent),
or the person(s) who converted the weights to this weights format
(if this is a child weights entry, i.e. it has a parent field).
The source weights these weights were converted from.
For example, if a model's weights were converted from the pytorch_state_dict format to torchscript,
the pytorch_state_dict weights entry has no parent and is the parent of the torchscript weights.
All weight entries except one (the initial set of weights resulting from training the model)
need to have this field.
A comment about this weights entry, for example how these weights were created.
Custom dependencies beyond pytorch, described in a conda environment file. Allows you to specify custom dependencies; see the conda docs:
The conda environment file should include pytorch, and any version pinning has to be compatible with pytorch_version.
Fixed linear scaling.
Examples:
- Scale with scalar gain and offset
- in YAML
```yaml
preprocessing:
  - id: scale_linear
    kwargs:
      gain: 2.0
      offset: 3.0
```
- in Python:
>>> preprocessing = [
...     ScaleLinearDescr(kwargs=ScaleLinearKwargs(gain=2.0, offset=3.0))
... ]
- Independent scaling along an axis
- in YAML
```yaml
preprocessing:
  - id: scale_linear
    kwargs:
      axis: 'channel'
      gain: [1.0, 2.0, 3.0]
```
- in Python:
>>> preprocessing = [
... ScaleLinearDescr(
... kwargs=ScaleLinearAlongAxisKwargs(
... axis=AxisId("channel"),
... gain=[1.0, 2.0, 3.0],
... )
... )
... ]
Key word arguments for ScaleLinearDescr
multiplicative factor
additive term
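The linear scaling above (out = gain * tensor + offset) can be sketched in NumPy. The `scale_linear` function name and signature here are illustrative only, not part of bioimageio.spec:

```python
import numpy as np

def scale_linear(tensor, gain=1.0, offset=0.0):
    """Illustrative sketch of the scale_linear operation: out = gain * tensor + offset.

    gain and offset may be scalars, or arrays that broadcast along an axis
    (e.g. per-channel values).
    """
    return np.asarray(gain) * np.asarray(tensor, dtype=np.float64) + np.asarray(offset)

# matches the YAML example above: gain 2.0, offset 3.0
scaled = scale_linear([1.0, 2.0, 3.0], gain=2.0, offset=3.0)
```

With the example kwargs, each value is doubled and shifted by three.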
Scale a tensor's data distribution to match another tensor's mean/std.
out = (tensor - mean) / (std + eps) * (ref_std + eps) + ref_mean.
key word arguments for ScaleMeanVarianceKwargs
ID of unprocessed input tensor to match.
The subset of axes to normalize jointly, i.e. axes to reduce to compute mean/std.
For example to normalize 'batch', 'x' and 'y' jointly in a tensor ('batch', 'channel', 'y', 'x')
resulting in a tensor of equal shape normalized per channel, specify axes=('batch', 'x', 'y').
To normalize samples independently, leave out the 'batch' axis.
Default: Scale all axes jointly.
Epsilon for numeric stability:
out = (tensor - mean) / (std + eps) * (ref_std + eps) + ref_mean.
key word arguments for ScaleMeanVarianceKwargs
ID of unprocessed input tensor to match.
The subset of axes to normalize jointly, i.e. axes to reduce to compute mean/std.
For example to normalize 'batch', 'x' and 'y' jointly in a tensor ('batch', 'channel', 'y', 'x')
resulting in a tensor of equal shape normalized per channel, specify axes=('batch', 'x', 'y').
To normalize samples independently, leave out the 'batch' axis.
Default: Scale all axes jointly.
Epsilon for numeric stability:
out = (tensor - mean) / (std + eps) * (ref_std + eps) + ref_mean.
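The formula above can be sketched in NumPy; the `scale_mean_variance` function below is an illustrative assumption, not the reference implementation from bioimageio.spec:

```python
import numpy as np

def scale_mean_variance(tensor, ref_tensor, axes=None, eps=1e-6):
    """Illustrative sketch: out = (tensor - mean) / (std + eps) * (ref_std + eps) + ref_mean.

    axes=None reduces over all axes jointly (the documented default).
    """
    mean = tensor.mean(axis=axes, keepdims=True)
    std = tensor.std(axis=axes, keepdims=True)
    ref_mean = ref_tensor.mean(axis=axes, keepdims=True)
    ref_std = ref_tensor.std(axis=axes, keepdims=True)
    return (tensor - mean) / (std + eps) * (ref_std + eps) + ref_mean

tensor = np.array([0.0, 2.0, 4.0])
ref = np.array([10.0, 20.0, 30.0])
matched = scale_mean_variance(tensor, ref)
```

After the operation, the output distribution matches the reference tensor's mean and standard deviation (up to eps).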
Scale with percentiles.
Examples:
- Scale linearly to map the 5th percentile to 0 and the 99.8th percentile to 1.0
  - in YAML
  ```yaml
  preprocessing:
    - id: scale_range
      kwargs:
        axes: ['y', 'x']
        max_percentile: 99.8
        min_percentile: 5.0
  ```
  - in Python
  >>> preprocessing = [
  ...     ScaleRangeDescr(
  ...         kwargs=ScaleRangeKwargs(
  ...             axes=(AxisId('y'), AxisId('x')),
  ...             max_percentile=99.8,
  ...             min_percentile=5.0,
  ...         )
  ...     )
  ... ]
- Combine the above scaling with additional clipping to clip values outside the range given by the percentiles.
  - in YAML
  ```yaml
  preprocessing:
    - id: scale_range
      kwargs:
        axes: ['y', 'x']
        max_percentile: 99.8
        min_percentile: 5.0
    - id: clip
      kwargs:
        min: 0.0
        max: 1.0
  ```
  - in Python
>>> preprocessing = [
... ScaleRangeDescr(
... kwargs=ScaleRangeKwargs(
...             axes=(AxisId('y'), AxisId('x')),
...             max_percentile=99.8,
...             min_percentile=5.0,
... )
... ),
... ClipDescr(
... kwargs=ClipKwargs(
... min=0.0,
... max=1.0,
... )
... ),
... ]
key word arguments for ScaleRangeDescr
For min_percentile=0.0 (the default) and max_percentile=100 (the default)
this processing step normalizes data to the [0, 1] interval.
For other percentiles the normalized values will partially lie outside the [0, 1]
interval. Use ScaleRange followed by ClipDescr if you want to limit the
normalized values to a range.
The subset of axes to normalize jointly, i.e. axes to reduce to compute the min/max percentile value.
For example to normalize 'batch', 'x' and 'y' jointly in a tensor ('batch', 'channel', 'y', 'x')
resulting in a tensor of equal shape normalized per channel, specify axes=('batch', 'x', 'y').
To normalize samples independently, leave out the "batch" axis.
Default: Scale all axes jointly.
The lower percentile used to determine the value to align with zero.
The upper percentile used to determine the value to align with one.
Has to be bigger than min_percentile.
The range is 1 to 100 instead of 0 to 100 to avoid mistakenly
accepting percentiles specified in the range 0.0 to 1.0.
Epsilon for numeric stability.
out = (tensor - v_lower) / (v_upper - v_lower + eps);
with v_lower,v_upper values at the respective percentiles.
ID of the unprocessed input tensor to compute the percentiles from. Default: The tensor itself.
key word arguments for ScaleRangeDescr
For min_percentile=0.0 (the default) and max_percentile=100 (the default)
this processing step normalizes data to the [0, 1] interval.
For other percentiles the normalized values will partially lie outside the [0, 1]
interval. Use ScaleRange followed by ClipDescr if you want to limit the
normalized values to a range.
The subset of axes to normalize jointly, i.e. axes to reduce to compute the min/max percentile value.
For example to normalize 'batch', 'x' and 'y' jointly in a tensor ('batch', 'channel', 'y', 'x')
resulting in a tensor of equal shape normalized per channel, specify axes=('batch', 'x', 'y').
To normalize samples independently, leave out the "batch" axis.
Default: Scale all axes jointly.
The lower percentile used to determine the value to align with zero.
The upper percentile used to determine the value to align with one.
Has to be bigger than min_percentile.
The range is 1 to 100 instead of 0 to 100 to avoid mistakenly
accepting percentiles specified in the range 0.0 to 1.0.
Epsilon for numeric stability.
out = (tensor - v_lower) / (v_upper - v_lower + eps);
with v_lower,v_upper values at the respective percentiles.
ID of the unprocessed input tensor to compute the percentiles from. Default: The tensor itself.
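The percentile scaling and the optional clip step can be sketched in NumPy. The `scale_range` function below is illustrative only; names and signature are assumptions, not the bioimageio.spec implementation:

```python
import numpy as np

def scale_range(tensor, min_percentile=0.0, max_percentile=100.0, axes=None, eps=1e-6):
    """Illustrative sketch: out = (tensor - v_lower) / (v_upper - v_lower + eps),

    with v_lower and v_upper the values at the respective percentiles.
    """
    v_lower = np.percentile(tensor, min_percentile, axis=axes, keepdims=True)
    v_upper = np.percentile(tensor, max_percentile, axis=axes, keepdims=True)
    return (tensor - v_lower) / (v_upper - v_lower + eps)

data = np.linspace(0.0, 100.0, num=1001)
normalized = scale_range(data, min_percentile=5.0, max_percentile=99.8)
# values below the 5th / above the 99.8th percentile fall outside [0, 1];
# a subsequent clip step (as in the combined example above) bounds them:
clipped = np.clip(normalized, 0.0, 1.0)
```

This illustrates why ScaleRange with non-default percentiles is typically followed by ClipDescr.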
The logistic sigmoid function, a.k.a. expit function.
Examples:
- in YAML
  ```yaml
  postprocessing:
    - id: sigmoid
  ```
- in Python:
  >>> postprocessing = [SigmoidDescr()]
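The sigmoid postprocessing itself is the standard logistic function; a minimal NumPy sketch (the `sigmoid` function name here is illustrative, not from bioimageio.spec):

```python
import numpy as np

def sigmoid(tensor):
    """Illustrative sketch of the logistic sigmoid (expit): 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-np.asarray(tensor, dtype=np.float64)))

activated = sigmoid([-1.0, 0.0, 1.0])
```

The output lies in (0, 1), with sigmoid(0) = 0.5 and sigmoid(-x) = 1 - sigmoid(x).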
The multi-file weights. All required files/folders should be packaged in a zip archive.
wraps a packaging.version.Version instance for validation in pydantic models
SHA256 hash value of the source file.
Authors
Either the person(s) who trained this model, resulting in the original weights file
(if this is the initial weights entry, i.e. it does not have a parent),
or the person(s) who converted the weights to this weights format
(if this is a child weights entry, i.e. it has a parent field).
The source weights these weights were converted from.
For example, if a model's weights were converted from the pytorch_state_dict format to torchscript,
the pytorch_state_dict weights entry has no parent and is the parent of the torchscript weights.
All weight entries except one (the initial set of weights resulting from training the model)
need to have this field.
A comment about this weights entry, for example how these weights were created.
The multi-file weights. All required files/folders should be packaged in a zip archive.
wraps a packaging.version.Version instance for validation in pydantic models
SHA256 hash value of the source file.
Authors
Either the person(s) who trained this model, resulting in the original weights file
(if this is the initial weights entry, i.e. it does not have a parent),
or the person(s) who converted the weights to this weights format
(if this is a child weights entry, i.e. it has a parent field).
The source weights these weights were converted from.
For example, if a model's weights were converted from the pytorch_state_dict format to torchscript,
the pytorch_state_dict weights entry has no parent and is the parent of the torchscript weights.
All weight entries except one (the initial set of weights resulting from training the model)
need to have this field.
A comment about this weights entry, for example how these weights were created.
Custom dependencies beyond tensorflow. The environment should include tensorflow, and any version pinning has to be compatible with tensorflow_version.
Source of the weights file.
wraps a packaging.version.Version instance for validation in pydantic models
SHA256 hash value of the source file.
Authors
Either the person(s) who trained this model, resulting in the original weights file
(if this is the initial weights entry, i.e. it does not have a parent),
or the person(s) who converted the weights to this weights format
(if this is a child weights entry, i.e. it has a parent field).
The source weights these weights were converted from.
For example, if a model's weights were converted from the pytorch_state_dict format to torchscript,
the pytorch_state_dict weights entry has no parent and is the parent of the torchscript weights.
All weight entries except one (the initial set of weights resulting from training the model)
need to have this field.
A comment about this weights entry, for example how these weights were created.
Subtract mean and divide by standard deviation.
Examples:
Subtract tensor mean and variance
- in YAML
  ```yaml
  preprocessing:
    - id: zero_mean_unit_variance
  ```
- in Python
>>> preprocessing = [ZeroMeanUnitVarianceDescr()]
key word arguments for ZeroMeanUnitVarianceDescr
The subset of axes to normalize jointly, i.e. axes to reduce to compute mean/std.
For example to normalize 'batch', 'x' and 'y' jointly in a tensor ('batch', 'channel', 'y', 'x')
resulting in a tensor of equal shape normalized per channel, specify axes=('batch', 'x', 'y').
To normalize each sample independently leave out the 'batch' axis.
Default: Scale all axes jointly.
epsilon for numeric stability: out = (tensor - mean) / (std + eps).
key word arguments for ZeroMeanUnitVarianceDescr
The subset of axes to normalize jointly, i.e. axes to reduce to compute mean/std.
For example to normalize 'batch', 'x' and 'y' jointly in a tensor ('batch', 'channel', 'y', 'x')
resulting in a tensor of equal shape normalized per channel, specify axes=('batch', 'x', 'y').
To normalize each sample independently leave out the 'batch' axis.
Default: Scale all axes jointly.
epsilon for numeric stability: out = (tensor - mean) / (std + eps).
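The normalization can be sketched in NumPy, including the per-channel case described above (reducing over 'batch', 'y' and 'x' of a ('batch', 'channel', 'y', 'x') tensor). The `zero_mean_unit_variance` function name is illustrative, not from bioimageio.spec:

```python
import numpy as np

def zero_mean_unit_variance(tensor, axes=None, eps=1e-6):
    """Illustrative sketch: out = (tensor - mean) / (std + eps).

    axes=None reduces over all axes jointly (the documented default);
    passing a tuple of axis indices normalizes per remaining axis.
    """
    mean = tensor.mean(axis=axes, keepdims=True)
    std = tensor.std(axis=axes, keepdims=True)
    return (tensor - mean) / (std + eps)

# Normalize per channel: reduce over 'batch', 'y' and 'x' (indices 0, 2, 3)
# of a ('batch', 'channel', 'y', 'x') tensor.
batch = np.random.default_rng(0).normal(5.0, 2.0, size=(2, 3, 8, 8))
normalized = zero_mean_unit_variance(batch, axes=(0, 2, 3))
```

Each channel of the output has (approximately) zero mean and unit standard deviation.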
Bioimage.io description of a Jupyter Notebook.
A human-friendly name of the resource description
The format version of this resource specification
(not the version of the resource description)
When creating a new resource always use the latest micro/patch version described here.
The format_version is important for any consumer software to understand how to parse the fields.
The Jupyter notebook
Cover images. Please use an image smaller than 500KB and an aspect ratio width to height of 2:1. The supported image formats are: ('.gif', '.jpeg', '.jpg', '.png', '.svg', '.tif', '.tiff')
UTF-8 emoji for display alongside the id.
The authors are the creators of the RDF and the primary points of contact.
file and other attachments
citations
A field for custom configuration that can contain any keys not present in the RDF spec.
This means you should not store, for example, a github repo URL in config since we already have the
git_repo field defined in the spec.
Keys in config may be very specific to a tool or consumer software. To avoid conflicting definitions,
it is recommended to wrap added configuration into a sub-field named with the specific domain or tool name,
for example:
```yaml
config:
  bioimageio:  # here is the domain name
    my_custom_key: 3837283
    another_key:
      nested: value
  imagej:  # config specific to ImageJ
    macro_dir: path/to/macro/file
```
If possible, please use snake_case for keys in config.
You may want to list linked files additionally under attachments to include them when packaging a resource
(packaging a resource means downloading/copying important linked files and creating a ZIP archive that contains
an altered rdf.yaml file with local references to the downloaded files)
URL to download the resource from (deprecated)
A URL to the Git repository where the resource is being developed.
An icon for illustration
IDs of other bioimage.io resources
The person who uploaded the model (e.g. to bioimage.io)
Maintainers of this resource.
If not specified, the authors are the maintainers, and at least some of them should specify their github_user name.
Resource description file (RDF) source; used to keep track of where an rdf.yaml was loaded from. Do not set this field in a YAML file.
Associated tags
The version of the resource following SemVer 2.0.
version number (n-th published version, not the semantic version)
badges associated with this resource
URL or relative path to a markdown file with additional documentation.
The recommended documentation file name is README.md. An .md suffix is mandatory.
A SPDX license identifier. We do not support custom licenses beyond the SPDX license list; if you need one, please open a GitHub issue to discuss your intentions with the community.
bioimage.io-wide unique resource identifier assigned by bioimage.io; version unspecific.
Bioimage.io description of a Jupyter notebook.
A human-friendly name of the resource description. May only contain letters, digits, underscore, minus, parentheses and spaces.
The format version of this resource specification
The Jupyter notebook
A string containing a brief description.
Cover images. Please use an image smaller than 500KB and an aspect ratio width to height of 2:1 or 1:1. The supported image formats are: ('.gif', '.jpeg', '.jpg', '.png', '.svg')
UTF-8 emoji for display alongside the id.
The authors are the creators of this resource description and the primary points of contact.
file attachments
citations
A SPDX license identifier. We do not support custom licenses beyond the SPDX license list; if you need one, please open a GitHub issue to discuss your intentions with the community.
A URL to the Git repository where the resource is being developed.
An icon for illustration, e.g. on bioimage.io
IDs of other bioimage.io resources
The person who uploaded the model (e.g. to bioimage.io)
Maintainers of this resource.
If not specified, the authors are the maintainers, and at least some of them have to specify their github_user name.
Associated tags
The version of the resource following SemVer 2.0.
A comment on the version of the resource.
URL or relative path to a markdown file encoded in UTF-8 with additional documentation.
The recommended documentation file name is README.md. An .md suffix is mandatory.
badges associated with this resource
A place to store additional metadata (often tool specific).
Such additional metadata is typically set programmatically by the respective tool or by people with specific insights into the tool. If you want to store additional metadata that does not match any of the other fields, think of a key unlikely to collide with anyone else's use case/tool and save it here.
Please consider creating an issue in the bioimageio.spec repository if you are not sure if an existing field could cover your use case or if you think such a field should exist.
bioimage.io internal metadata.
bioimage.io-wide unique resource identifier assigned by bioimage.io; version unspecific.
The description from which this one is derived