Predict structure and binding

Predict 3D structure coordinates, per-residue confidence scores, and binding metrics for an arbitrary biomolecular complex.

Predictions are single-input, single-output operations. You submit a biomolecular complex and get back predicted 3D structures, per-residue confidence, and (optionally) binding metrics. A prediction runs to completion and cannot be paused or stopped.

run() submits the prediction, waits for it to finish, and downloads the result to a local directory. Use start() + client.experiments.download_results() to submit now and download later; download_results() resumes if the download is interrupted. The model (for example boltz-2.1) is always a separate argument from the input body.

import os

from boltz_api import Boltz

client = Boltz(api_key=os.environ["BOLTZ_API_KEY"])

# All the fields from the example below go in `input`; `model` is a separate argument.
prediction_input = {
    "entities": [
        {"type": "protein", "value": "MKTIIALSYIFCLVFA", "chain_ids": ["A"]},
        {"type": "ligand_smiles", "value": "CC(=O)OC1=CC=CC=C1C(=O)O", "chain_ids": ["B"]},
    ],
    "binding": {"type": "ligand_protein_binding", "binder_chain_id": "B"},
    "num_samples": 3,
}  # see Input format for constraints, bonds, templates, model_options, …

# One call: submit, wait, and download the result to a run directory.
run_dir = client.predictions.structure_and_binding.run(
    model="boltz-2.1", input=prediction_input, name="my-prediction"
)

# ...or submit now and download later:
prediction = client.predictions.structure_and_binding.start(model="boltz-2.1", input=prediction_input)
run_dir = client.experiments.download_results(id=prediction.id, name="my-prediction")  # rerun to resume an interrupted download

A prediction takes the entities that make up the complex, plus optional binding, constraints, bonds, model options, and templates to steer the prediction. Only entities is required. The model (for example boltz-2.1) is a separate argument from the input body.

The example below is a complete input for a ligand–protein binding prediction, with every optional field filled in.

{
  "entities": [
    # polymer entities also accept optional "cyclic", "modifications", and "msa"
    { "type": "protein", "value": "MKTIIALSYIFCLVFA", "chain_ids": ["A"] },
    { "type": "ligand_ccd", "value": "AIN", "chain_ids": ["B"] }
  ],
  "binding": { "type": "ligand_protein_binding", "binder_chain_id": "B" },
  "constraints": [ # optional: guide the geometry (pocket and/or contact)
    {
      "type": "pocket", # keep the binder near a set of receptor residues
      "binder_chain_id": "B",
      "contact_residues": { "A": [10, 11, 12] }, # receptor residues lining the pocket (0-indexed)
      "max_distance_angstrom": 6.0,
      "force": False # bias by default; true = hard-enforce
    },
    {
      "type": "contact", # keep two residues/atoms within a distance
      "token1": { "type": "polymer_contact", "chain_id": "A", "residue_index": 3 },
      "token2": { "type": "polymer_contact", "chain_id": "A", "residue_index": 11 },
      "max_distance_angstrom": 8.0,
      "force": False
    }
  ],
  "bonds": [ # optional: covalent bonds between specific atoms
    {
      "atom1": { "type": "polymer_atom", "chain_id": "A", "residue_index": 11, "atom_name": "SG" },
      "atom2": { "type": "polymer_atom", "chain_id": "A", "residue_index": 1, "atom_name": "NZ" }
    }
  ],
  "model_options": { # optional: sampler / diffusion controls
    "recycling_steps": 3, # default 3
    "sampling_steps": 200, # default 200
    "step_scale": 1.638 # diffusion temperature; higher = more diverse poses. default 1.638
  },
  "templates": [ # optional: up to 4 CIF/PDB templates that guide protein-chain geometry
    {
      "template_structure": { # or base64 with media_type "chemical/x-cif" or "chemical/x-pdb"
        "type": "url",
        "url": "https://files.rcsb.org/download/1CRN.cif" # real RCSB mmCIF (crambin); swap for a homolog of your actual target
      },
      "template_chains": [
        { "input_chain_id": "A", "template_chain_id": "A" } # request chain -> template-file chain
      ],
      "force_threshold_angstroms": 5.0 # optional: force the template reference within this distance
    }
  ],
  "num_samples": 3 # optional: number of independent structure samples to predict
}

| Field | Required | What it is | Link |
| --- | --- | --- | --- |
| entities | Yes | The chains that make up the complex (proteins, nucleic acids, ligands). | Entities |
| binding | No | Asks the model to compute binding metrics for the complex. | Binding |
| constraints | No | Pocket and contact constraints that guide the geometry. | Constraints |
| bonds | No | Covalent bonds between specific atoms. | Bonds |
| model_options | No | Sampler and diffusion controls. | Model options |
| templates | No | CIF/PDB structures that guide protein-chain geometry. | Templates |
| num_samples | No | How many independent structure samples to predict. | |

entities is the list of chains in your complex. Each has a type (protein, rna, dna, ligand_smiles, or ligand_ccd), a value (sequence, SMILES, or CCD code), and chain_ids. Polymer entities accept optional modifications and a cyclic flag. Protein entities also accept an msa: { "type": "empty" } to predict without an MSA, or { "type": "custom", "format": "a3m" | "csv", "source": ... } to supply your own (by URL or base64). See Core Concepts for entities, constraints, modifications, and bonds.
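
The entity shapes described above can be assembled as plain dictionaries. The sketch below is only an illustration: `make_entity` is a hypothetical helper, not part of the SDK, and it checks nothing beyond the `type` values listed in this section and the rule that `msa` belongs to protein entities.

```python
# Hypothetical helper (not part of boltz_api): builds one entry for `entities`.
ENTITY_TYPES = {"protein", "rna", "dna", "ligand_smiles", "ligand_ccd"}


def make_entity(entity_type, value, chain_ids, msa=None):
    """Return an entity dict; `msa` is only accepted for protein entities."""
    if entity_type not in ENTITY_TYPES:
        raise ValueError(f"unknown entity type: {entity_type!r}")
    entity = {"type": entity_type, "value": value, "chain_ids": list(chain_ids)}
    if msa is not None:
        if entity_type != "protein":
            raise ValueError("msa is only documented for protein entities")
        entity["msa"] = msa
    return entity


entities = [
    # Protein predicted without an MSA.
    make_entity("protein", "MKTIIALSYIFCLVFA", ["A"], msa={"type": "empty"}),
    # Ligand given by its CCD code.
    make_entity("ligand_ccd", "AIN", ["B"]),
]
```

The resulting list can be used as the `entities` value of the input body.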

Providing binding tells the model to compute binding metrics alongside the structure. Omit it for a structure-only prediction. It has two shapes, selected by type:

  • Ligand–protein (type: "ligand_protein_binding"): set binder_chain_id to the ligand chain. The ligand must have exactly one copy and the complex must contain only proteins and ligands.
  • Protein–protein (type: "protein_protein_binding"): set binder_chain_ids to one or more protein binder chains.

When binding is requested, the output includes binding_metrics in addition to the per-sample structure results.
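
The ligand–protein rules above can be checked locally before submitting. This is a minimal sketch, not the API's own validation: `check_ligand_protein_binding` is a hypothetical helper, and it assumes that "exactly one copy" means the ligand entity lists a single chain ID.

```python
LIGAND_TYPES = {"ligand_smiles", "ligand_ccd"}


def check_ligand_protein_binding(prediction_input):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    entities = prediction_input["entities"]
    binder = prediction_input["binding"]["binder_chain_id"]

    # Find the entity that declares the binder chain.
    owners = [e for e in entities if binder in e["chain_ids"]]
    if not owners:
        problems.append(f"binder chain {binder!r} is not in any entity")
    else:
        owner = owners[0]
        if owner["type"] not in LIGAND_TYPES:
            problems.append(f"binder chain {binder!r} is not a ligand")
        # Assumption: "exactly one copy" means a single chain ID on the entity.
        elif len(owner["chain_ids"]) != 1:
            problems.append("the binder ligand must have exactly one copy")

    # The complex may contain only proteins and ligands.
    for e in entities:
        if e["type"] != "protein" and e["type"] not in LIGAND_TYPES:
            problems.append(f"entity type {e['type']!r} is not allowed")
    return problems


good = {
    "entities": [
        {"type": "protein", "value": "MKTIIALSYIFCLVFA", "chain_ids": ["A"]},
        {"type": "ligand_ccd", "value": "AIN", "chain_ids": ["B"]},
    ],
    "binding": {"type": "ligand_protein_binding", "binder_chain_id": "B"},
}
print(check_ligand_protein_binding(good))  # prints []
```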

constraints steer where chains sit relative to each other. Each entry is one of two types:

  • Pocket (type: "pocket"): keep binder_chain_id near the receptor residues in contact_residues (keyed by chain ID, 0-indexed), within max_distance_angstrom (typical 4–8 Å).
  • Contact (type: "contact"): keep two tokens within max_distance_angstrom. Each token is a polymer_contact (chain_id + residue_index) or a ligand_contact (chain_id + atom_name).

Set force: true to hard-enforce a constraint instead of biasing toward it. Atom-level ligand references (ligand_contact) support ligand_ccd entities only, not ligand_smiles.
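
Because `ligand_contact` tokens work only with `ligand_ccd` entities, it can help to catch a mismatch before submitting. The following is a sketch under that one rule; `unsupported_ligand_contacts` is a hypothetical helper and the atom name `C1` is purely illustrative.

```python
def chain_types(entities):
    """Map each chain ID to the type of the entity that declares it."""
    return {cid: e["type"] for e in entities for cid in e["chain_ids"]}


def unsupported_ligand_contacts(entities, constraints):
    """Return chain IDs used in ligand_contact tokens that are not ligand_ccd."""
    types = chain_types(entities)
    bad = []
    for constraint in constraints:
        if constraint["type"] != "contact":
            continue  # pocket constraints carry no atom-level tokens
        for key in ("token1", "token2"):
            token = constraint[key]
            if token["type"] == "ligand_contact" and types.get(token["chain_id"]) != "ligand_ccd":
                bad.append(token["chain_id"])
    return bad


entities = [
    {"type": "protein", "value": "MKTIIALSYIFCLVFA", "chain_ids": ["A"]},
    {"type": "ligand_smiles", "value": "CC(=O)OC1=CC=CC=C1C(=O)O", "chain_ids": ["B"]},
]
constraints = [
    {
        "type": "contact",
        "token1": {"type": "polymer_contact", "chain_id": "A", "residue_index": 3},
        "token2": {"type": "ligand_contact", "chain_id": "B", "atom_name": "C1"},  # illustrative atom name
        "max_distance_angstrom": 8.0,
        "force": False,
    }
]
print(unsupported_ligand_contacts(entities, constraints))  # prints ['B']
```

Here chain B is a `ligand_smiles` entity, so the check flags it; declaring the ligand as `ligand_ccd` would clear the list.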

bonds declares covalent bonds between specific atoms (for example a disulfide or a covalent ligand). Each bond has atom1 and atom2, where an atom is a polymer_atom (chain_id, residue_index, atom_name) or a ligand_atom (chain_id, atom_name). Ligand atoms support ligand_ccd entities only.
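
As an illustration of the bond shape, the sketch below builds a disulfide between two cysteines in one chain. `disulfide_bond` is a hypothetical helper and the sequence is made up. It assumes `residue_index` is 0-indexed here as well, which matches the input example (index 11 of `MKTIIALSYIFCLVFA` is the cysteine carrying `SG`) but is not stated explicitly for bonds.

```python
def disulfide_bond(sequence, chain_id, index_a, index_b):
    """Build a bond entry linking the SG atoms of two cysteines (0-indexed)."""
    for idx in (index_a, index_b):
        if not 0 <= idx < len(sequence):
            raise ValueError(f"residue index {idx} is outside the sequence")
        if sequence[idx] != "C":
            raise ValueError(f"residue {idx} is {sequence[idx]!r}, not cysteine")

    def sg(idx):
        return {"type": "polymer_atom", "chain_id": chain_id, "residue_index": idx, "atom_name": "SG"}

    return {"atom1": sg(index_a), "atom2": sg(index_b)}


# Made-up sequence with cysteines at positions 1 and 5.
bond = disulfide_bond("ACDEFCGH", "A", 1, 5)
```

The returned dict can be appended to the `bonds` list of the input body.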

model_options tunes the sampler and diffusion:

| Option | Default | What it does |
| --- | --- | --- |
| recycling_steps | 3 | Number of recycling steps during prediction. |
| sampling_steps | 200 | Number of diffusion sampling steps. |
| step_scale | 1.638 | Diffusion step scale (temperature); higher values produce more varied poses. |

templates supplies up to four CIF or PDB structures that guide Boltz-2.1's protein-chain geometry at inference time. Each template has a template_structure (an HTTPS url, or a base64 upload with media_type chemical/x-cif or chemical/x-pdb); the file format is inferred from the URL extension or base64 media type. Use template_chains to map each request chain (input_chain_id) to the matching chain in the template file (template_chain_id). Set force_threshold_angstroms to force the template reference within that distance; omit it to use the template without hard forcing.
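
A URL-based template entry can be put together as follows. This is a sketch rather than the SDK's own builder: `url_template` and `check_template_count` are hypothetical helpers, and restricting URLs to `.cif` and `.pdb` extensions is an assumption based on the format being inferred from the extension.

```python
MAX_TEMPLATES = 4  # "up to four" templates per prediction


def url_template(url, chain_map, force_threshold_angstroms=None):
    """Build one `templates` entry from an HTTPS URL and a chain mapping.

    chain_map maps request chain IDs to chain IDs in the template file.
    """
    if not url.startswith("https://"):
        raise ValueError("template URLs must use HTTPS")
    # Assumption: only these two extensions are recognised.
    if not url.lower().endswith((".cif", ".pdb")):
        raise ValueError("expected a .cif or .pdb URL")
    template = {
        "template_structure": {"type": "url", "url": url},
        "template_chains": [
            {"input_chain_id": request_chain, "template_chain_id": template_chain}
            for request_chain, template_chain in chain_map.items()
        ],
    }
    # Omit the threshold to use the template without hard forcing.
    if force_threshold_angstroms is not None:
        template["force_threshold_angstroms"] = force_threshold_angstroms
    return template


def check_template_count(templates):
    """Raise if more templates are supplied than the documented limit."""
    if len(templates) > MAX_TEMPLATES:
        raise ValueError(f"at most {MAX_TEMPLATES} templates are allowed")
    return templates


templates = check_template_count(
    [url_template("https://files.rcsb.org/download/1CRN.cif", {"A": "A"})]
)
```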

A prediction returns a single object; there's no streaming page of results.

When you download with run(), or with start() followed by client.experiments.download_results() (or the CLI's download-results), the result lands in a self-contained run directory:

boltz-experiments/<name>/
boltz-experiments/
└── my-prediction/            # the name you chose (or an auto-generated one)
    ├── .boltz-run.json       # run + resume state, managed for you (don't edit)
    ├── run.json              # the prediction object: status, model, metrics (download URLs stripped)
    └── outputs/
        ├── archive.tar.gz    # the downloaded result archive
        └── files/            # extracted from the archive
            ├── sample_0.cif  # one predicted structure per requested sample
            └── ...

A prediction has a single output, so it lands under outputs/ rather than the results/index.jsonl + results/<result-id>/ layout the design and screening guides use for their many results.
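
Once a run directory exists, its contents can be read with the standard library alone. The sketch below assumes the layout shown above, with the `.cif` samples sitting directly under `outputs/files/`; `load_run` is a hypothetical helper, not an SDK function.

```python
import json
from pathlib import Path


def load_run(run_dir):
    """Return (prediction object from run.json, sorted list of sample .cif paths)."""
    run_dir = Path(run_dir)
    with open(run_dir / "run.json") as fh:
        run = json.load(fh)
    # One structure file per requested sample, per the layout above.
    structures = sorted((run_dir / "outputs" / "files").glob("*.cif"))
    return run, structures


# Usage, assuming a completed download named "my-prediction":
# run, structures = load_run("boltz-experiments/my-prediction")
# print(run["status"], [p.name for p in structures])
```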

The prediction object is what retrieve() returns and what run.json mirrors. Poll its status; output is null until the prediction succeeds, then carries the structure samples and binding metrics:

{
  "id": "pred_8f3a2b",
  "status": "succeeded", # pending | running | succeeded | failed
  "model": "boltz-2.1",
  "error": None, # { code, message } once status is "failed"
  "output": { # null until status is "succeeded"
    "best_sample": {}, # the highest-confidence sample (same shape as an entry in all_sample_results)
    "all_sample_results": [
      {
        "metrics": {
          "structure_confidence": 0.91, # 0–1; confidence in the predicted structure
          "ptm": 0.92, # 0–1; global predicted TM-score
          "iptm": 0.86, # 0–1; interface predicted TM-score
          "complex_plddt": 0.95, # 0–1; per-residue confidence averaged over the complex
          "complex_iplddt": 0.88, # 0–1; confidence at inter-chain interfaces
          "complex_pde": 1.2, # predicted distance error; lower is better
          "complex_ipde": 1.8 # interface predicted distance error; lower is better
          # ligand_iptm / protein_iptm appear when ligands / multiple proteins are present
        },
        "structure": { # predicted structure (.cif) for this sample
          "url": "https://.../sample_0.cif",
          "url_expires_at": "2026-02-25T14:03:40Z"
        }
      }
      # one entry per requested sample (num_samples)
    ],
    "binding_metrics": { # present only when "binding" was requested
      "type": "ligand_protein_binding_metrics",
      "binding_confidence": 0.88, # 0–1; confidence protein binding occurs (affinity probability + structural quality); 0.7+ high-confidence
      "optimization_score": 0.53 # 0–1; ranks relative binding strength for lead optimization (ligand–protein only); higher is better
    },
    "archive": { # optional: full result archive (.tar.gz)
      "url": "https://.../archive.tar.gz",
      "url_expires_at": "2026-02-25T14:03:40Z"
    }
  }
}
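
Each download link in the object carries a `url_expires_at` timestamp, so it is worth checking before fetching a file. A minimal sketch, assuming the timestamp is always ISO 8601 in UTC with a trailing `Z` as in the example; `url_is_expired` is a hypothetical helper and the URL in the usage lines is a placeholder. Presumably a fresh `retrieve()` returns new links once the old ones lapse, but that is an assumption, not something this page states.

```python
from datetime import datetime, timezone


def url_is_expired(file_ref, now=None):
    """file_ref is a dict with "url" and "url_expires_at" (ISO 8601, UTC "Z")."""
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    expires = datetime.fromisoformat(file_ref["url_expires_at"].replace("Z", "+00:00"))
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= expires


structure_ref = {
    "url": "https://example.invalid/sample_0.cif",  # placeholder URL
    "url_expires_at": "2026-02-25T14:03:40Z",
}
before = datetime(2026, 2, 25, 14, 0, 0, tzinfo=timezone.utc)
print(url_is_expired(structure_ref, now=before))  # prints False
```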

For full control, or in TypeScript, where there's no managed download, drive the REST API yourself: submit with start(), then poll the prediction with retrieve() until it finishes and read the samples and binding metrics. A prediction has no list_results or stop; results arrive all at once when it succeeds. See Output format for the object shape. Request multiple structure samples with num_samples; output.best_sample is the highest-confidence one.

import time

prediction = client.predictions.structure_and_binding.start(model="boltz-2.1", input=prediction_input)

# Poll until the prediction reaches a terminal status.
while prediction.status not in ("succeeded", "failed"):
    time.sleep(5)
    prediction = client.predictions.structure_and_binding.retrieve(prediction.id)

if prediction.status == "succeeded":
    print(f"Binding confidence: {prediction.output.binding_metrics.binding_confidence}")
    for sample in prediction.output.all_sample_results:
        print(f"{sample.metrics.structure_confidence:.2f} {sample.structure.url}")
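
The polling loop above runs until the prediction reaches a terminal status, however long that takes. One way to bound the wait is sketched below. `wait_until_done` is a hypothetical helper, not part of the SDK; it takes any zero-argument callable that returns an object with a `status` attribute, so it does not depend on the client itself.

```python
import time

TERMINAL_STATUSES = {"succeeded", "failed"}


def wait_until_done(fetch, interval=5.0, timeout=3600.0, sleep=time.sleep, clock=time.monotonic):
    """Call fetch() until its result has a terminal status or the timeout passes."""
    deadline = clock() + timeout
    while True:
        prediction = fetch()
        if prediction.status in TERMINAL_STATUSES:
            return prediction
        if clock() >= deadline:
            raise TimeoutError(f"still {prediction.status!r} after {timeout} seconds")
        sleep(interval)


# Usage with the client from the example above:
# prediction = wait_until_done(
#     lambda: client.predictions.structure_and_binding.retrieve(prediction.id)
# )
```

Passing `sleep` and `clock` as parameters keeps the helper testable without real delays.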

| Metric | Range | What it measures |
| --- | --- | --- |
| structure_confidence | 0–1 | Measures the confidence of the predicted structure (0 = low, 1 = high). |
| ptm | 0–1 | Global predicted TM-score. |
| iptm | 0–1 | Interface predicted TM-score. ligand_iptm / protein_iptm appear per interface type. |
| complex_plddt | 0–1 | Per-residue confidence averaged over the complex. |
| complex_iplddt | 0–1 | Confidence at inter-chain interfaces. |
| complex_pde | Ångström | Predicted distance error across the complex. Lower is better. |
| complex_ipde | Ångström | Interface predicted distance error. Lower is better. |
| binding_confidence | 0–1 | Confidence that protein binding occurs, combining affinity probability with structural quality. For triage, 0.7+ is typically the high-confidence range. Present when binding is requested. |
| optimization_score | 0–1 | Ranks relative binding strength for lead optimization (ligand–protein binding only), normalized 0–1; higher is better. Use it to prioritize the top-scoring candidates within the same run. |
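
The metrics differ in direction: the two distance-error metrics are better when lower, the rest when higher. The sketch below ranks the samples of one prediction by a chosen metric with that in mind, and applies the 0.7 triage threshold to `binding_confidence`. Both helpers are hypothetical and work on plain dicts shaped like the output object; the sample values are made up.

```python
LOWER_IS_BETTER = {"complex_pde", "complex_ipde"}


def rank_samples(all_sample_results, metric="structure_confidence"):
    """Sort samples best-first by one metric, respecting its direction."""
    higher_is_better = metric not in LOWER_IS_BETTER
    return sorted(
        all_sample_results,
        key=lambda sample: sample["metrics"][metric],
        reverse=higher_is_better,
    )


def is_high_confidence_binder(binding_metrics, threshold=0.7):
    """Apply the triage threshold quoted in the table to binding_confidence."""
    return binding_metrics["binding_confidence"] >= threshold


# Made-up values for two samples of the same prediction.
samples = [
    {"metrics": {"structure_confidence": 0.80, "complex_ipde": 1.2}},
    {"metrics": {"structure_confidence": 0.91, "complex_ipde": 1.8}},
]
best_by_confidence = rank_samples(samples)[0]
best_by_ipde = rank_samples(samples, metric="complex_ipde")[0]
```

In this made-up case the two metrics disagree: the second sample has the highest `structure_confidence`, while the first has the lowest `complex_ipde`.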

| Status | Meaning |
| --- | --- |
| pending | The prediction is queued and has not started yet. |
| running | The prediction is currently being computed. |
| succeeded | The prediction completed successfully. Results are available. |
| failed | The prediction encountered an error. Check the error field. |