POPxf (SchemaStore) JSON Schema

Type	object
File match	`.popxf` `.popxf.json`
Schema URL	https://catalog.lintel.tools/schemas/schemastore/popxf/latest.json
Source	https://www.schemastore.org/popxf-1.0.json

Validate with Lintel

npx @lintel/lintel check

A detailed specification of all fields in the proposed POPxf data format is given below. Each subsection describes the structure, expected data type, and allowed values of the corresponding entries in the JSON object. The data type object mentioned below refers to a JSON object literal, i.e. a set of key/value pairs representing named subfields. The format is divided into two main components: the metadata and data fields. An additional $schema field specifies the version of the POPxf JSON schema used.

All quantities defined in this specification refer to a single datafile. They may be indexed by a superscript $(n)$ with $n \in [1,N]$ to denote quantities in a collection of $N$ datafiles, which is particularly relevant for discussing correlated predictions stored in separate files. Since this specification focuses on the format of a single datafile, the superscript $(n)$ is omitted to keep the notation concise. By convention, all dimensionful quantities are given in units of GeV.

Properties

$schema string required

The $schema field allows identifying a JSON file as conforming to the POPxf format and specifies the version of the POPxf JSON schema used. It must be set to

"https://json.schemastore.org/popxf-1.0.json"

for files conforming to this version of the specification. The version number will be incremented for future revisions of the JSON schema.

Constant: "https://json.schemastore.org/popxf-1.0.json"
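As a minimal sketch of how a reader of POPxf files might use this field, the check below loads a JSON document and compares its $schema entry with the constant above. The function name declares_popxf_1_0 and the example document are illustrative assumptions, not part of the specification, and the check does not replace validation against the full schema.

```python
import json

# The constant required by this version of the specification.
POPXF_SCHEMA_URL = "https://json.schemastore.org/popxf-1.0.json"


def declares_popxf_1_0(text):
    """Return True if the JSON text is an object whose $schema is the POPxf 1.0 URL."""
    doc = json.loads(text)
    return isinstance(doc, dict) and doc.get("$schema") == POPXF_SCHEMA_URL


# Hypothetical skeleton file; metadata and data are left empty for brevity.
example = '{"$schema": "https://json.schemastore.org/popxf-1.0.json", "metadata": {}, "data": {}}'
print(declares_popxf_1_0(example))
```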

metadata object required

The metadata field contains all contextual and structural information required to interpret the numerical predictions. It is a JSON object with the following subfields:

9 nested properties

observable_names string[] required

Array of $M$ names identifying each observable $O_m$. Must be an array of unique, non-empty strings, with at least one entry.

Examples: {"observable_names":["observable1","observable2","observable3"]}

minItems=1 uniqueItems=true

parameters string[] required

Array of $S$ names identifying each model parameter $C_s$ (e.g., Wilson coefficient names). Must be an array of unique, non-empty strings, with at least one entry. In general, this includes $S_\mathbb{R}$ real-valued and $S_\mathbb{C}$ complex-valued parameters with $S = S_\mathbb{R} + S_\mathbb{C}$. The real-valued parameters and the real and imaginary parts of the complex-valued parameters are used as the $R=S_\mathbb{R} + 2\ S_\mathbb{C}$ independent variables of all polynomial terms and can be grouped together in a real-valued parameter vector $\vec{C}$ of length $R$.

Examples: {"parameters":["C1","C2","C3"]}

minItems=1 uniqueItems=true
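The counting $R = S_\mathbb{R} + 2\ S_\mathbb{C}$ can be illustrated with a short sketch that flattens parameter values into a real-valued vector. The helper name real_parameter_vector, the example values, and the ordering of the entries (real part followed by imaginary part, in dictionary order) are assumptions made for illustration; the specification does not prescribe an ordering of $\vec{C}$.

```python
def real_parameter_vector(values):
    """Flatten parameter values: a real value gives one entry, a complex value gives two."""
    vector = []
    for value in values.values():
        if isinstance(value, complex):
            vector.extend([value.real, value.imag])
        else:
            vector.append(float(value))
    return vector


# Two real parameters and one complex parameter: S_R = 2, S_C = 1, so R = 2 + 2*1 = 4.
values = {"C1": 0.5, "C2": 1.0 + 2.0j, "C3": -0.3}
vector = real_parameter_vector(values)
print(len(vector), vector)
```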

basis object required

Defines the parameter basis (e.g. an operator basis in an EFT). At least one of the two subfields wcxf and custom has to be present. If both subfields are present, any element of parameters (see above) not belonging to the wcxf basis is interpreted as belonging to the custom basis. The subfields are defined as follows:

Examples: {"basis":{"wcxf":{"eft":"SMEFT","basis":"Warsaw","sectors":["dB=de=dmu=dtau=0"]}}}

Any of: object object, object object

2 nested properties

wcxf object

Specifies an EFT basis defined by the Wilson Coefficient exchange format (WCxf) [@Aebischer:2017ugx]. This object contains the following fields:

3 nested properties

eft string required

EFT name defined by WCxf (e.g., "SMEFT")

basis string required

Operator basis name defined by WCxf (e.g., "Warsaw")

sectors string[]

Array of renormalisation-group-closed sectors of Wilson coefficients containing the Wilson coefficients given in parameters (see above). The available sectors for each EFT are defined by WCxf.

custom

Field of any type and substructure to unambiguously specify any parameter basis not defined by WCxf.
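The structural rules for basis can be sketched as a small check: at least one of wcxf and custom must be present, and a wcxf object needs the string fields eft and basis. The function name check_basis and the error messages are illustrative assumptions; whether the given EFT, basis and sector names actually exist in WCxf is not checked here.

```python
def check_basis(basis):
    """Return a list of problems with metadata.basis; an empty list means none were found."""
    if not isinstance(basis, dict) or not ({"wcxf", "custom"} & set(basis)):
        return ["basis needs at least one of the subfields 'wcxf' and 'custom'"]
    errors = []
    wcxf = basis.get("wcxf")
    if wcxf is not None:
        if not isinstance(wcxf, dict):
            return ["wcxf must be an object"]
        for field in ("eft", "basis"):  # both are marked as required above
            if not isinstance(wcxf.get(field), str):
                errors.append(f"wcxf.{field} must be a string")
        sectors = wcxf.get("sectors", [])  # optional array of strings
        if not (isinstance(sectors, list) and all(isinstance(s, str) for s in sectors)):
            errors.append("wcxf.sectors must be an array of strings")
    return errors


# The example given above for this field.
example = {"wcxf": {"eft": "SMEFT", "basis": "Warsaw", "sectors": ["dB=de=dmu=dtau=0"]}}
print(check_basis(example))
```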

scale number | number[] required

The renormalisation scale in GeV at which the parameter vector $\vec{C}$, the polynomial coefficients ${\vec{p}_k \supset \vec{b}_k, \vec{c}_k, ...}$, and the observable coefficients ${\vec{o}_m \supset \vec{b}_m, \vec{c}_m, ...}$ and their uncertainties $\vec{\sigma}_m$ are defined. The parameter vector $\vec{C}$ that enters a given polynomial $P_k$ or observable $O_m$ has to be given at the same scale at which the polynomial coefficients $\vec{p}_k$ or observable coefficients $\vec{o}_m$ are defined, such that the polynomial or observable itself is scale-independent up to higher-order corrections in perturbation theory.

This field can take one of two forms:

single number: A common scale $\mu$ at which all polynomial coefficients $\vec p_k$ or observable coefficients $\vec o_m$ are defined.
- If the observables $O_m$ are expressed in terms of polynomials $P_k$, the polynomials are functions of the parameters evolved to the common scale $\mu$:
  
  $$P_k = a_{k} + \vec{C}(\mu) \cdot \vec{b}_{k}(\mu) + \dots\ $$
- If the observables $O_m$ are themselves polynomials, they are functions of the parameters evolved to the common scale $\mu$:
  
  $$O_m = a_m + \vec{C}(\mu) \cdot \vec{b}_m(\mu) + \dots\ $$
array of numbers: An array defining separate scales $\mu_k$ of polynomial coefficients $\vec p_k$ if metadata.polynomial_names is present, or separate scales $\mu_m$ of observable coefficients $\vec o_m$ if metadata.polynomial_names is absent.
- If metadata.polynomial_names is present, the observables $O_m$ are expressed in terms of polynomials $P_k$ and each polynomial is a function of the parameters evolved to its corresponding scale $\mu_k$:
  
  $$P_k = a_{k} + \vec{C}(\mu_k) \cdot \vec{b}_{k}(\mu_k) + \dots\ $$
  
  The length and order of the array defining the scales $\mu_k$ must match those of the field metadata.polynomial_names. To avoid ambiguities, the following restrictions apply to this case:
  - data.observable_central must be absent;
  - data.observable_uncertainties must be absent or only define uncertainties for the parameter-independent terms (i.e. only the SM uncertainties in EFT applications).
- If metadata.polynomial_names is absent, the observables $O_m$ are themselves polynomials and each observable is a function of the parameters evolved to its corresponding scale $\mu_m$:
  
  $$O_m = a_m + \vec{C}(\mu_m) \cdot \vec{b}_m(\mu_m) + \dots\ $$
  
  The length and order of the array defining the scales $\mu_m$ must match those of the field metadata.observable_names.

Examples: {"scale":91.1876}, {"scale":[100.0,200.0,300.0,400.0,500.0]}
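The length rules for the array form of scale can be sketched as follows. The function name check_scale and the example document are illustrative assumptions. The sketch covers the array length and the requirement that data.observable_central be absent in the polynomial case; the restriction on data.observable_uncertainties is not checked.

```python
def check_scale(doc):
    """Return a list of problems with metadata.scale; an empty list means none were found."""
    metadata, data = doc["metadata"], doc.get("data", {})
    scale = metadata["scale"]
    if not isinstance(scale, list):
        return []  # a single number is a common scale for all coefficients
    errors = []
    if "polynomial_names" in metadata:
        expected = len(metadata["polynomial_names"])  # one scale per polynomial P_k
        if "observable_central" in data:
            errors.append("observable_central must be absent when scale is an array over polynomials")
    else:
        expected = len(metadata["observable_names"])  # one scale per observable O_m
    if len(scale) != expected:
        errors.append(f"scale has {len(scale)} entries, expected {expected}")
    return errors


# Hypothetical document with two polynomials but three scales.
doc = {
    "metadata": {
        "observable_names": ["observable1", "observable2"],
        "polynomial_names": ["polynomial 1", "polynomial 2"],
        "scale": [100.0, 200.0, 300.0],
    },
    "data": {"polynomial_central": {}},
}
print(check_scale(doc))
```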

polynomial_names string[]

This field is required to express observables as functions of polynomials. It requires the simultaneous presence of metadata.observable_expressions and data.polynomial_central.

Array of $K$ names identifying the individual polynomials $P_k$ that enter the observable predictions through the functions defined in metadata.observable_expressions (see below). Must contain unique, non-empty strings.

Examples: {"polynomial_names":["polynomial 1","polynomial 2"]}

minItems=1 uniqueItems=true

observable_expressions object[]

This field is required to express observables as functions of polynomials. It requires the simultaneous presence of metadata.polynomial_names and data.polynomial_central.

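Defines how each observable is constructed from the named polynomials. Must be an array of $M$ objects, one per observable. The length and order of the array must match those of the observable_names field. Each object must contain the fields variables, which maps the variable names used in the expression to entries of metadata.polynomial_names, and expression, a string giving the observable as a function of those variables (see the examples below).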

Examples:

{"observable_expressions":[{"variables":{"num":"polynomial 1","den":"polynomial 2"},"expression":"num / den"},{"variables":{"num":"polynomial 2","den":"polynomial 1"},"expression":"num / den"},{"variables":{"p1":"polynomial 1"},"expression":"sqrt(p1**2)"}]}

minItems=1
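A rough sketch of how such expressions might be evaluated once the polynomial values are known is given below. It assumes that expressions use Python syntax, as the examples suggest, and that sqrt is an allowed function; the set of permitted functions is not listed in this section, so the allowed dictionary is an assumption. The name evaluate_observables and the numerical values are illustrative, and eval is used only for brevity: it is not suitable for untrusted files.

```python
import math


def evaluate_observables(polynomial_names, polynomial_values, observable_expressions):
    """Evaluate each observable expression with its variables bound to polynomial values."""
    by_name = dict(zip(polynomial_names, polynomial_values))
    allowed = {"sqrt": math.sqrt}  # assumed function set, taken from the example above
    results = []
    for entry in observable_expressions:
        variables = {var: by_name[poly] for var, poly in entry["variables"].items()}
        # Builtins are emptied so that only the assumed functions and variables resolve.
        results.append(eval(entry["expression"], {"__builtins__": {}, **allowed}, variables))
    return results


# The expressions from the example above, with hypothetical polynomial values.
expressions = [
    {"variables": {"num": "polynomial 1", "den": "polynomial 2"}, "expression": "num / den"},
    {"variables": {"num": "polynomial 2", "den": "polynomial 1"}, "expression": "num / den"},
    {"variables": {"p1": "polynomial 1"}, "expression": "sqrt(p1**2)"},
]
observables = evaluate_observables(["polynomial 1", "polynomial 2"], [2.0, 4.0], expressions)
print(observables)
```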

polynomial_degree integer

Specifies the maximum degree of polynomial terms included in the expansion. If omitted, the default value is 2 (i.e., quadratic polynomial). Values higher than 2 may be used to represent observables involving higher-order terms in the model parameters. The current implementation of the JSON schema defining the data format supports values up to 5. Higher degrees are not prohibited in principle but are currently unsupported to avoid excessively large data structures.

Values: 1 2 3 4 5

Examples: {"polynomial_degree":2}
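To give a feeling for why high degrees lead to large data structures, the sketch below counts the distinct monomials of degree at most $d$ in $R$ real variables, which is the binomial coefficient $\binom{R+d}{d}$. The function name n_monomials and the choice of 10 real variables are illustrative assumptions.

```python
from math import comb


def n_monomials(n_real_variables, degree=2):
    """Number of distinct monomials of degree <= degree in the given number of real variables."""
    return comb(n_real_variables + degree, degree)


# Growth of the number of possible monomials for 10 real variables.
for degree in range(1, 6):
    print(degree, n_monomials(10, degree))
```

For 10 real variables the count grows from 66 at degree 2 to 3003 at degree 5, and each monomial carries one coefficient per polynomial or observable.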

reproducibility object[]

Collects relevant data that may be required by a third party to reproduce the prediction. Each element of the array should be an object that corresponds to a step in the workflow and has three predefined fields: description, tool and inputs, specified below. Any further fields containing data deemed useful in this context can also be included.

Schematic example:

  "reproducibility": [
    {
      "description": "Description of the first step",
      "tool": { ... },
      "inputs": { ... }
    },
    {
      "description": "Description of the second step",
      "tool": { ... },
      "inputs": { ... }
    },
    ...
  ]

The predefined fields are as follows:

minItems=1

misc object

Optional free-form metadata for documentation purposes. May include fields such as authorship, contact information, date, description of the observable, information identifying the associated correlation file (e.g. hash value or filename), or external references. The format is unrestricted, allowing any JSON-encodable content.

Examples:

{"misc":{"author":"John Doe","contact":"[email protected]","description":"Example dataset","URL":"johndoe.com/exampledata","correlation_file":"correlations.json","correlation_file_hash":"AB47BG3F11DA7DCAA5726008BAAFE176"}}

data object required

The data field contains the numerical representation of all polynomial terms, which define the polynomials $P_k$ and observables $O_m$. This information is provided in terms of central values of polynomial coefficients $\vec{p}_k$ and observable coefficients $\vec{o}_m$, and uncertainties of observable coefficients $\vec{\sigma}_m$.

Each component of $\vec{o}_m$, $\vec{p}_k$, and $\vec{\sigma}_m$ is labelled by a monomial key, written as a stringified tuple of model parameters (e.g., Wilson coefficients) defined in metadata.parameters. For example, the key "('C1', 'C2')" corresponds to the monomial $C_1 C_2$. While the model parameters can be complex numbers, the monomials are defined for the real and imaginary parts of the model parameters (see below) and are therefore strictly real. The format and conventions for monomial keys are as follows:

- Each key is a string representation of a Python-style tuple: a comma-separated sequence of strings enclosed in parentheses.
- The length of the tuple is determined by the polynomial degree $d$, as defined by the metadata field polynomial_degree (default value: $d=2$, i.e. quadratic polynomial, if polynomial_degree is omitted). The tuple length equals $d$, unless a real/imaginary tag is included (see below), in which case the length is $d+1$.
- The first $d$ entries in the tuple are model parameter names, as defined in the metadata field parameters. These names must be sorted alphabetically to ensure unique monomial keys (assuming the same sorting rules as Python's list.sort() method, which compares strings by Unicode code point, so that all upper-case letters come before all lower-case letters and a string that is a prefix of another string comes first). Empty strings '' are used to represent constant terms (equivalent to $1$) and to pad monomials of lower degree. For example, for a quadratic polynomial in real parameters (see below for how complex parameters are handled):
  - A constant $1$ is written as "('', '')",
  - A linear term $C_1$ is written as "('', 'C1')",
  - A quadratic term $C_1 C_2$ is written as "('C1', 'C2')".
- To handle complex parameters, the tuple may optionally include a real/imaginary tag as its final element. This tag consists of R (real) and I (imaginary) characters, and its length must match the polynomial degree $d$. It indicates whether each parameter refers to its real or imaginary part. For example:
  - "('', 'C1', 'RI')" corresponds to $\mathrm{Im}(C_1)$;
  - "('C1', 'C2', 'IR')" corresponds to $\mathrm{Im}(C_1)\mathrm{Re}(C_2)$.
- If the real/imaginary tag is omitted, the parameters are assumed to be real. For example:
  - "('', 'C1')" corresponds to $\mathrm{Re}(C_1)$;
  - "('C1', 'C2')" corresponds to $\mathrm{Re}(C_1)\mathrm{Re}(C_2)$.

These conventions ensure a canonical and unambiguous representation of polynomial terms while offering flexibility in the naming of model parameters. Missing monomials are implicitly treated as having zero coefficients.
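The key conventions above can be sketched with two small helpers that build and parse monomial keys. The names monomial_key and parse_monomial_key are illustrative assumptions, as is the choice to produce keys with Python's own tuple formatting, which matches the quadratic examples in this document. For degree 1 Python prints a trailing comma, and whether the schema accepts that form is not covered in this section. The ordering of two factors with the same name but different real/imaginary parts is also not specified here, so the sketch keeps them in input order.

```python
import ast


def monomial_key(names, parts=None, degree=2):
    """Build a key from parameter names and an optional string of 'R'/'I' parts."""
    if len(names) > degree:
        raise ValueError("more factors than the polynomial degree allows")
    if parts is not None and len(parts) != len(names):
        raise ValueError("need one R/I character per parameter name")
    padding = [("", "R")] * (degree - len(names))  # '' pads lower-degree monomials
    pairs = padding + list(zip(names, parts or "R" * len(names)))
    pairs.sort(key=lambda pair: pair[0])  # sort by name; stable for equal names
    entries = [name for name, _ in pairs]
    if parts is not None:
        entries.append("".join(part for _, part in pairs))
    return str(tuple(entries))


def parse_monomial_key(key, degree=2):
    """Return (name, part) pairs; a missing tag means all factors are real parts."""
    entries = ast.literal_eval(key)
    tag = entries[degree] if len(entries) == degree + 1 else "R" * degree
    return list(zip(entries[:degree], tag))


print(monomial_key([]))                         # constant term
print(monomial_key(["C1"]))                     # linear term in a real parameter
print(monomial_key(["C2", "C1"], parts="RI"))   # Re(C2) Im(C1), sorted by name
print(parse_monomial_key("('', 'C1', 'RI')"))
```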

The data field is a JSON object with the following subfields:

Any of: object object, object object

3 nested properties

polynomial_central Record<string, number[]>

This field is required to express observables as functions of polynomials. It requires the simultaneous presence of metadata.polynomial_names and metadata.observable_expressions.

An object representing the central values of the polynomial coefficients $\vec{p}_k$ for each named polynomial $P_k$. Each key must be a monomial key as defined above. Each value must be an array of $K$ numbers whose order matches metadata.polynomial_names.

Examples:

{"description":"Specifying two polynomials, $P_k$, given in terms of two complex parameters $C_1$ and $C_2$ as\n\n$$\n\\begin{aligned}\n    P_1 &= 1.0 + 1.2 \\ \\mathrm{Im}(C_1) + 0.8 \\ \\mathrm{Re}(C_1) \\mathrm{Re}(C_2) + 0.5 \\ \\mathrm{Re}(C_1) \\mathrm{Im}(C_2)+ 0.2 \\ \\mathrm{Im}(C_1) \\mathrm{Im}(C_2)\\ , \\\\\n    P_2 &= 1.1 + 1.3 \\ \\mathrm{Im}(C_1)  + 0.85 \\ \\mathrm{Re}(C_1) \\mathrm{Re}(C_2) + 0.55 \\ \\mathrm{Re}(C_1) \\mathrm{Im}(C_2)+ 0.25 \\ \\mathrm{Im}(C_1) \\mathrm{Im}(C_2)\\ .\n\\end{aligned}\n$$","polynomial_central":{"('', '', 'RR')":[1.0,1.1],"('', 'C1', 'RI')":[1.2,1.3],"('C1', 'C2', 'RR')":[0.8,0.85],"('C1', 'C2', 'RI')":[0.5,0.55],"('C1', 'C2', 'II')":[0.2,0.25]}}
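As an illustration of how the example above might be consumed, the sketch below sums coefficient times monomial over all keys and returns one value per polynomial. The function name evaluate_central and the parameter values are illustrative assumptions, and keys are parsed with ast.literal_eval on the assumption that they are valid Python tuple literals.

```python
import ast


def evaluate_central(central, values, degree=2):
    """Evaluate all polynomials at the given (possibly complex) parameter values."""
    n_columns = len(next(iter(central.values())))
    totals = [0.0] * n_columns
    for key, coefficients in central.items():
        entries = ast.literal_eval(key)
        tag = entries[degree] if len(entries) > degree else "R" * degree
        monomial = 1.0
        for name, part in zip(entries[:degree], tag):
            if name:  # '' stands for the constant factor 1
                z = complex(values[name])
                monomial *= z.real if part == "R" else z.imag
        for column, coefficient in enumerate(coefficients):
            totals[column] += coefficient * monomial
    return totals


# The polynomial_central object from the example above.
polynomial_central = {
    "('', '', 'RR')": [1.0, 1.1],
    "('', 'C1', 'RI')": [1.2, 1.3],
    "('C1', 'C2', 'RR')": [0.8, 0.85],
    "('C1', 'C2', 'RI')": [0.5, 0.55],
    "('C1', 'C2', 'II')": [0.2, 0.25],
}
# Hypothetical parameter point: C1 = 1 + 2i, C2 = 2 + i.
p1, p2 = evaluate_central(polynomial_central, {"C1": 1 + 2j, "C2": 2 + 1j})
print(p1, p2)
```

At this parameter point the two polynomials evaluate to approximately 5.9 and 6.45, up to floating-point rounding.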

observable_central Record<string, number[]>

An object representing the central values of the observable coefficients $\vec{o}_m$ for each observable $O_m$. In case the observables are not themselves polynomials, the observable coefficients correspond to the polynomial approximation of the observables obtained from a Taylor expansion of the observable expressions defined in metadata.observable_expressions. Each key must be a monomial key as defined above. Each value must be an array of $M$ numbers whose order matches metadata.observable_names.

Examples:

{"description":"Specifying three observable predictions, $O_{m}$, given in terms of the three real parameters $C_1$, $C_2$, and $C_3$ as\n\n$$\n\\begin{aligned}\n    O_1 &= 1.0 + 1.2 \\ C_1 + 1.4 \\ C_1C_2+ 1.6 \\ C_1C_3\\ , \\\\\n    O_2 &= 1.1 + 1.3 \\ C_1 + 1.5 \\ C_1C_2+ 1.7 \\ C_1C_3\\ , \\\\\n    O_3 &= 2.3 + 0.3\\ C_1 + 0.7 \\ C_1C_2 + 0.5 \\ C_1C_3\\ .\n\\end{aligned}\n$$","observable_central":{"('', '')":[1.0,1.1,2.3],"('', 'C1')":[1.2,1.3,0.3],"('C1', 'C2')":[1.4,1.5,0.7],"('C1', 'C3')":[1.6,1.7,0.5]}}
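The structural requirements on such an object (sorted parameter names, known parameters, and one number per observable) can be sketched as a simple check. The function name check_central and its messages are illustrative assumptions; the check is not a substitute for validation against the JSON schema.

```python
import ast


def check_central(central, parameters, n_columns, degree=2):
    """Return a list of problems found in a *_central object; empty means none were found."""
    errors = []
    known = set(parameters) | {""}  # '' is the padding entry for constant factors
    for key, coefficients in central.items():
        names = list(ast.literal_eval(key)[:degree])
        if names != sorted(names):
            errors.append(f"{key}: parameter names are not sorted")
        if not set(names) <= known:
            errors.append(f"{key}: unknown parameter name")
        if len(coefficients) != n_columns:
            errors.append(f"{key}: expected {n_columns} values, got {len(coefficients)}")
    return errors


# The observable_central object from the example above, with M = 3 observables.
observable_central = {
    "('', '')": [1.0, 1.1, 2.3],
    "('', 'C1')": [1.2, 1.3, 0.3],
    "('C1', 'C2')": [1.4, 1.5, 0.7],
    "('C1', 'C3')": [1.6, 1.7, 0.5],
}
print(check_central(observable_central, ["C1", "C2", "C3"], 3))
```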

observable_uncertainties Record<string, object | number[]>

An object representing the uncertainties $\vec{\sigma}_m$ on the observable coefficients for each observable $O_m$. In case the observables are not themselves polynomials, the observable coefficients correspond to the polynomial approximation of the observables obtained from a Taylor expansion of the observable expressions defined in metadata.observable_expressions. The top-level fields specify the nature of the quoted uncertainty. In many cases there may only be a single top-level field, "total", but multiple fields can be used to specify a breakdown into several sources of uncertainty (e.g., statistical, scale, PDF, ...). To avoid mistakes, the names of the top-level fields must not have the format of a monomial key (i.e., stringified tuples as defined above). The value of each top-level field can be either an object or an array of floats. Objects must have the same structure as observable_central; arrays must have length $M$. An array of floats is assumed to correspond to the parameter-independent uncertainty only (e.g. the uncertainty on the SM prediction), which is equivalent to specifying an object containing a single element with the monomial key of the constant term (e.g. "('', '')" for a quadratic polynomial).

Examples:

{"observable_uncertainties":{"total":{"('', '')":[0.05,0.06,0.01],"('', 'C1')":[0.1,0.12,0.01],"('C1', 'C2')":[0.02,0.03,0.02],"('C1', 'C3')":[0.05,0.06,0.01]}}}

{"description":"Specifying only the SM uncertainties:","observable_uncertainties":{"total":[0.05,0.06,0.01]}}

{"description":"Specifying an uncertainty breakdown:","observable_uncertainties":{"MC_stats":{"('', '')":[0.002,0.0012,0.001],"('', 'C1')":[0.001,0.0015,0.0001]},"scale":{"('', '')":[0.04,0.05,0.06],"('', 'C1')":[0.1,0.12,0.01]},"PDF":{"('', '')":[0.03,0.04,0.05],"('', 'C1')":[0.02,0.08,0.01]}}}

{"description":"Specifying a breakdown for SM uncertainties only:","observable_uncertainties":{"MC_stats":[0.002,0.0012,0.001],"scale":[0.04,0.05,0.06],"PDF":[0.03,0.04,0.05]}}
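The equivalence between the array form and the object form can be sketched by rewriting every array as an object with the constant-term key. The function name normalise_uncertainties is an illustrative assumption, and the constant key is built with Python's tuple formatting, which gives "('', '')" for the default quadratic case.

```python
def normalise_uncertainties(uncertainties, degree=2):
    """Rewrite array-valued sources as objects keyed by the constant-term monomial."""
    constant_key = str(("",) * degree)  # "('', '')" for degree 2
    return {
        source: ({constant_key: list(value)} if isinstance(value, list) else value)
        for source, value in uncertainties.items()
    }


# The breakdown for SM uncertainties only, taken from the examples above.
breakdown = {
    "MC_stats": [0.002, 0.0012, 0.001],
    "scale": [0.04, 0.05, 0.06],
    "PDF": [0.03, 0.04, 0.05],
}
print(normalise_uncertainties(breakdown))
```

After this step every source has the same structure as observable_central, so downstream code only needs to handle the object form.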

All of

1. variant

2. variant

3. variant

4. variant

5. variant

6. variant

7. variant

8. variant

Definitions

conditionScaleArrayPolynomial object

metadata object

2 nested properties

polynomial_names required

scale array required

stringifiedTuplePattern string

monomialKeyPatternDeg1 object

monomialKeyPatternDeg2 object

monomialKeyPatternDeg3 object

monomialKeyPatternDeg4 object

monomialKeyPatternDeg5 object

monomialKeyPatternConstantDeg1 object

monomialKeyPatternConstantDeg2 object

monomialKeyPatternConstantDeg3 object

monomialKeyPatternConstantDeg4 object

monomialKeyPatternConstantDeg5 object