Core Concepts
Key concepts for working with the Boltz API — entities, metrics, binding, file inputs, constraints, modifications, and bonds.
Entities and chain IDs
Molecular systems are described as a list of entities, each with a type and value field, along with one or more chain IDs. The supported entity types are:
- Protein — Amino acid sequence (single-letter codes) in the `value` field. Supports `modifications` and `cyclic` options.
- RNA — Ribonucleotide sequence in the `value` field.
- DNA — Deoxyribonucleotide sequence in the `value` field.
- Ligand (SMILES) — Small molecule defined by a SMILES string in the `value` field.
- Ligand (CCD) — Small molecule defined by a CCD code in the `value` field.
Chain IDs are used throughout the API to reference specific chains in constraints, bonds, binding configuration, and results.
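As a rough illustration, an entity list could be assembled in Python as shown here. The `type` and `value` fields come from the description above; the `chain_ids` key, the type strings (`"protein"`, `"ligand_smiles"`, and so on), and the assumption that chain IDs must be unique across entities are all hypothetical and may not match the actual request schema.

```python
from collections import Counter

# Hypothetical type strings; the actual API may use different names.
ENTITY_TYPES = {"protein", "rna", "dna", "ligand_smiles", "ligand_ccd"}


def make_entity(entity_type, value, chain_ids):
    """Build one entity dict with a type, a value, and one or more chain IDs."""
    if entity_type not in ENTITY_TYPES:
        raise ValueError(f"unsupported entity type: {entity_type!r}")
    if not chain_ids:
        raise ValueError("each entity needs at least one chain ID")
    return {"type": entity_type, "value": value, "chain_ids": list(chain_ids)}


def duplicate_chain_ids(entities):
    """Return chain IDs that appear more than once across all entities."""
    counts = Counter(cid for e in entities for cid in e["chain_ids"])
    return sorted(cid for cid, n in counts.items() if n > 1)


entities = [
    # Two chain IDs on one entity: two copies of the same sequence.
    make_entity("protein", "MKTAYIAKQR", ["A", "B"]),
    # Aspirin as a SMILES ligand with a single copy.
    make_entity("ligand_smiles", "CC(=O)Oc1ccccc1C(=O)O", ["L"]),
]
```

Checking for duplicate chain IDs before sending a request is a cheap safeguard, since constraints, bonds, and binding configuration all refer to chains by ID.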
Metrics
The API returns several confidence and quality metrics with prediction and screening results:
| Metric | What it measures |
|---|---|
| pTM (predicted TM-score) | Overall predicted structural similarity to the true structure. Higher is better. |
| ipTM (interface predicted TM-score) | Confidence in the predicted interface between chains. Variants include `protein_iptm` and `ligand_iptm` for specific interaction types. |
| pLDDT (predicted Local Distance Difference Test) | Per-residue confidence in the predicted structure. Reported as normalized 0–1 floats via `complex_plddt` across the complex and `complex_iplddt` for the interface. |
| PDE (Predicted Distance Error) | Expected positional error between residue pairs. Reported as `complex_pde` and `complex_ipde` for the interface. |
| Structure confidence | Overall confidence score for the predicted structure. |
| Binding confidence | Confidence that binding occurs (when binding is requested). |
| Optimization score | Binding strength ranking score, useful for lead optimization (when binding is requested). |
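A small sketch of how these metrics might be consumed follows. It assumes, hypothetically, that a result exposes its metrics as a flat dict keyed by the names in the table (`complex_plddt`, `ligand_iptm`). The threshold values and the sample numbers are arbitrary illustrations, not recommendations or real output.

```python
def plddt_percent(metrics):
    """Convert the normalized 0-1 complex_plddt to a 0-100 scale."""
    return 100.0 * metrics["complex_plddt"]


def is_confident(metrics, min_plddt=0.70, min_ligand_iptm=0.50):
    """Check a result against illustrative (arbitrary) confidence cutoffs."""
    return (
        metrics["complex_plddt"] >= min_plddt
        and metrics["ligand_iptm"] >= min_ligand_iptm
    )


# Made-up values for demonstration only.
example_metrics = {"complex_plddt": 0.75, "ligand_iptm": 0.5}
```

Because pLDDT is reported on a 0–1 scale here, any cutoff carried over from tools that report 0–100 needs to be rescaled first.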
Binding
Binding configuration tells the model to compute binding metrics for the prediction. There are two binding types:
- Ligand-protein binding (`ligand_protein_binding`) — Specify a `binder_chain_id` pointing to a ligand chain. The ligand must have exactly one copy (single chain ID) and the complex must contain only ligands and proteins.
- Protein-protein binding (`protein_protein_binding`) — Specify `binder_chain_ids` pointing to one or more protein chains.
When binding is provided, the prediction output includes binding metrics (`binding_confidence` and `optimization_score`) in addition to structural results.
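The rules for ligand-protein binding can be checked client-side before submitting a request. The sketch below assumes a hypothetical entity shape (`type`, `value`, `chain_ids`), hypothetical type strings, and a hypothetical layout for the binding object. Only `ligand_protein_binding`, `protein_protein_binding`, `binder_chain_id`, and `binder_chain_ids` are names taken from the text above.

```python
LIGAND_TYPES = {"ligand_smiles", "ligand_ccd"}  # hypothetical type strings


def ligand_protein_binding(entities, binder_chain_id):
    """Return a ligand-protein binding config, enforcing the rules stated above."""
    # Rule: the complex must contain only ligands and proteins.
    if any(e["type"] != "protein" and e["type"] not in LIGAND_TYPES for e in entities):
        raise ValueError("complex must contain only ligands and proteins")
    # Find the entity that owns the binder chain.
    binder = next((e for e in entities if binder_chain_id in e["chain_ids"]), None)
    if binder is None or binder["type"] not in LIGAND_TYPES:
        raise ValueError("binder_chain_id must point to a ligand chain")
    # Rule: the ligand must have exactly one copy (a single chain ID).
    if len(binder["chain_ids"]) != 1:
        raise ValueError("the binder ligand must have exactly one chain ID")
    return {"type": "ligand_protein_binding", "binder_chain_id": binder_chain_id}


def protein_protein_binding(binder_chain_ids):
    """Return a protein-protein binding config for one or more binder chains."""
    if not binder_chain_ids:
        raise ValueError("at least one binder chain ID is required")
    return {"type": "protein_protein_binding", "binder_chain_ids": list(binder_chain_ids)}


entities = [
    {"type": "protein", "value": "MKTAYIAKQR", "chain_ids": ["A"]},
    {"type": "ligand_ccd", "value": "ATP", "chain_ids": ["L"]},
]
binding = ligand_protein_binding(entities, "L")
```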
File inputs
The API accepts file inputs in two formats:
- URL — Provide a publicly accessible URL to the file.
- Base64 — Provide the file contents as a Base64-encoded string, along with a `media_type` (e.g., `chemical/x-cif`).
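The two formats might be prepared along these lines. Only `media_type` and the `chemical/x-cif` value are taken from the text above; the `url` and `data` key names are placeholders chosen for this sketch, and the byte string stands in for real file contents.

```python
import base64


def url_file_input(url):
    """Reference a file by a publicly accessible URL."""
    return {"url": url}


def base64_file_input(raw_bytes, media_type):
    """Embed file contents as a Base64 string together with a media type."""
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    return {"data": encoded, "media_type": media_type}


# Stand-in bytes; in practice these would be read from a structure file.
cif_bytes = b"data_example\n#\n"
file_input = base64_file_input(cif_bytes, "chemical/x-cif")
```

Base64 encoding inflates the payload by roughly a third, so the URL form may be preferable for large files when a public location is available.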
Constraints
Constraints guide predictions by specifying spatial relationships. There are two constraint types:
- Pocket constraints — Define a binding pocket by specifying a `binder_chain_id` and `contact_residues` (a mapping of chain IDs to arrays of 0-based residue indices). Includes a `max_distance_angstrom` parameter.
- Contact constraints — Require two tokens to be within a maximum distance. Tokens can be:
  - `polymer_contact` — Identifies a residue on a polymer chain (chain ID + residue index).
  - `ligand_contact` — Identifies an atom on a ligand chain (chain ID + atom name).
All residue indices in constraints are 0-indexed.
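Since sequence positions are often quoted 1-based, an off-by-one error is easy to make here. The following sketch converts 1-based positions to the 0-based indices the constraints expect and builds one constraint of each kind. The field names `binder_chain_id`, `contact_residues`, `max_distance_angstrom`, `polymer_contact`, and `ligand_contact` come from the text above; the remaining keys (`type`, `token1`, `token2`, `chain_id`, `residue_index`, `atom_name`), the reuse of `max_distance_angstrom` for contact constraints, and the atom name `C1` are assumptions.

```python
def to_zero_based(positions):
    """Convert 1-based sequence positions to 0-based residue indices."""
    if any(p < 1 for p in positions):
        raise ValueError("1-based positions must be >= 1")
    return [p - 1 for p in positions]


# Pocket constraint: ligand chain L should sit near residues 42 and 87
# (1-based) of protein chain A.
pocket = {
    "type": "pocket",
    "binder_chain_id": "L",
    "contact_residues": {"A": to_zero_based([42, 87])},
    "max_distance_angstrom": 6.0,
}

# Contact constraint: one polymer residue and one ligand atom within a
# maximum distance of each other.
contact = {
    "type": "contact",
    "token1": {"type": "polymer_contact", "chain_id": "A", "residue_index": 41},
    "token2": {"type": "ligand_contact", "chain_id": "L", "atom_name": "C1"},
    "max_distance_angstrom": 5.0,
}
```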
Modifications
Modifications can be applied to residues in protein, RNA, and DNA entities:
- CCD modifications — Reference a modification by its CCD code at a specific residue index.
- SMILES modifications — Define a custom modification using a SMILES string at a specific residue index.
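A modification entry might be built as in this sketch. The key names (`type`, `residue_index`, `value`) are hypothetical, and 0-based indexing is assumed by analogy with constraints rather than confirmed for modifications. The example places phosphoserine (CCD code `SEP`) on the serine of a made-up sequence.

```python
def _check_index(sequence, residue_index):
    """Reject indices that fall outside the sequence (0-based, assumed)."""
    if not 0 <= residue_index < len(sequence):
        raise IndexError(f"residue index {residue_index} is outside the sequence")


def ccd_modification(sequence, residue_index, ccd_code):
    """Reference a modification by CCD code at a given residue index."""
    _check_index(sequence, residue_index)
    return {"type": "ccd", "residue_index": residue_index, "value": ccd_code}


def smiles_modification(sequence, residue_index, smiles):
    """Define a custom modification by SMILES string at a given residue index."""
    _check_index(sequence, residue_index)
    return {"type": "smiles", "residue_index": residue_index, "value": smiles}


sequence = "MKTSAYIAKQ"  # made-up sequence; index 3 is the serine
phosphoserine = ccd_modification(sequence, 3, "SEP")
```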
Bonds
Bonds are separate from constraints and define covalent bonds between specific atoms. Each bond specifies two atoms via `atom1` and `atom2`, where each atom reference includes a `chain_id`, `residue_index`, and `atom_name`.
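A bond between two atoms could be expressed as shown here, using the `atom1`, `atom2`, `chain_id`, `residue_index`, and `atom_name` fields named above. The sequence is made up, 0-based residue indexing is assumed rather than confirmed for bonds, and the disulfide between two cysteine `SG` atoms serves purely as an example of a covalent bond.

```python
def atom_ref(chain_id, residue_index, atom_name):
    """Build one atom reference for a bond."""
    return {"chain_id": chain_id, "residue_index": residue_index, "atom_name": atom_name}


def make_bond(atom1, atom2):
    """Build a bond between two distinct atoms."""
    if atom1 == atom2:
        raise ValueError("a bond needs two different atoms")
    return {"atom1": atom1, "atom2": atom2}


sequence = "GSCAKTWEQCR"  # made-up sequence with cysteines at indices 2 and 9
disulfide = make_bond(atom_ref("A", 2, "SG"), atom_ref("A", 9, "SG"))
```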