Predict structure and binding

Predict 3D structure coordinates, per-residue confidence scores, and binding metrics for an arbitrary biomolecular complex.

Predictions are single-input, single-output operations. You submit a biomolecular complex and get back predicted 3D structures, per-residue confidence, and (optionally) binding metrics. A prediction runs to completion and cannot be paused or stopped.

Run

run() submits the prediction, waits for it to finish, and downloads the result to a local directory. Use start() + client.experiments.download_results() to submit now and download later; download_results() resumes if the download is interrupted. The model (for example boltz-2.1) is always a separate argument from the input body.

import os
from boltz_api import Boltz

client = Boltz(api_key=os.environ["BOLTZ_API_KEY"])

# All the fields from the example below go in `input`; `model` is a separate argument.
prediction_input = {
    "entities": [
        {"type": "protein", "value": "MKTIIALSYIFCLVFA", "chain_ids": ["A"]},
        {"type": "ligand_smiles", "value": "CC(=O)OC1=CC=CC=C1C(=O)O", "chain_ids": ["B"]},
    ],
    "binding": {"type": "ligand_protein_binding", "binder_chain_id": "B"},
    "num_samples": 3,
}  # see Input format for constraints, bonds, templates, model_options, …

# One call: submit, wait, and download the result to a run directory.
run_dir = client.predictions.structure_and_binding.run(
    model="boltz-2.1", input=prediction_input, name="my-prediction"
)

# ...or submit now and download later:
prediction = client.predictions.structure_and_binding.start(model="boltz-2.1", input=prediction_input)
run_dir = client.experiments.download_results(id=prediction.id, name="my-prediction")  # rerun to resume an interrupted download

With the CLI, write your input to structure-input.yaml (see Input format), then submit it and download the result:

PREDICTION_ID=$(
  boltz-api --format raw predictions:structure-and-binding start \
    --model boltz-2.1 --input @yaml://./structure-input.yaml | jq -r '.id'
)

# download-results polls and downloads on your behalf; rerun with the same --name to resume.
boltz-api download-results --id "$PREDICTION_ID" --name my-prediction

The TypeScript client drives the REST API directly. Submit with start() (passing model and the input body), then poll and read the result yourself (see Use the API directly).

import Boltz from "boltz-api";

const client = new Boltz({ apiKey: process.env["BOLTZ_API_KEY"] });

const entities = [
  { type: "protein", value: "MKTIIALSYIFCLVFA", chain_ids: ["A"] },
  { type: "ligand_smiles", value: "CC(=O)OC1=CC=CC=C1C(=O)O", chain_ids: ["B"] },
]; // see Input format for binding, constraints, bonds, model_options, templates, …

const prediction = await client.predictions.structureAndBinding.start({ model: "boltz-2.1", input: { entities } });

Input format

A prediction takes the entities that make up the complex, plus optional binding, constraints, bonds, model options, and templates to steer the prediction. Only entities is required. The model (for example boltz-2.1) is a separate argument from the input body.

The example below is a complete ligand–protein input with every optional field set. For protein–protein binding, swap the binding block and the entities to match (see Binding).

{
  "entities": [
    # polymer entities also accept optional "cyclic" and "modifications"; protein entities also accept "msa"
    { "type": "protein", "value": "MKTIIALSYIFCLVFA", "chain_ids": ["A"] },
    { "type": "ligand_ccd", "value": "AIN", "chain_ids": ["B"] }
  ],
  "binding": { "type": "ligand_protein_binding", "binder_chain_id": "B" },
  "constraints": [ # optional: guide the geometry (pocket and/or contact)
    {
      "type": "pocket", # keep the binder near a set of receptor residues
      "binder_chain_id": "B",
      "contact_residues": { "A": [10, 11, 12] }, # receptor residues lining the pocket (0-indexed)
      "max_distance_angstrom": 6.0,
      "force": False # False (default) biases toward the constraint; True hard-enforces it
    },
    {
      "type": "contact", # keep two residues/atoms within a distance
      "token1": { "type": "polymer_contact", "chain_id": "A", "residue_index": 3 },
      "token2": { "type": "polymer_contact", "chain_id": "A", "residue_index": 11 },
      "max_distance_angstrom": 8.0,
      "force": False
    }
  ],
  "bonds": [ # optional: covalent bonds between specific atoms
    {
      "atom1": { "type": "polymer_atom", "chain_id": "A", "residue_index": 11, "atom_name": "SG" },
      "atom2": { "type": "polymer_atom", "chain_id": "A", "residue_index": 1, "atom_name": "NZ" }
    }
  ],
  "model_options": { # optional: sampler / diffusion controls
    "recycling_steps": 3, # default 3
    "sampling_steps": 200, # default 200
    "step_scale": 1.638 # diffusion temperature; higher = more diverse poses. default 1.638
  },
  "templates": [ # optional: up to 4 CIF/PDB templates that guide protein-chain geometry
    {
      "template_structure": { # or base64 with media_type "chemical/x-cif" or "chemical/x-pdb"
        "type": "url",
        "url": "https://files.rcsb.org/download/1CRN.cif" # real RCSB mmCIF (crambin); swap for a homolog of your actual target
      },
      "template_chains": [
        { "input_chain_id": "A", "template_chain_id": "A" } # request chain -> template-file chain
      ],
      "force_threshold_angstroms": 5.0 # optional: force the template reference within this distance
    }
  ],
  "num_samples": 3 # optional: number of independent structure samples to predict
}

Field	Required	What it is	Link
`entities`	Yes	The chains that make up the complex (proteins, nucleic acids, ligands).	Entities
`binding`	No	Asks the model to compute binding metrics for the complex.	Binding
`constraints`	No	Pocket and contact constraints that guide the geometry.	Constraints
`bonds`	No	Covalent bonds between specific atoms.	Bonds
`model_options`	No	Sampler and diffusion controls.	Model options
`templates`	No	CIF/PDB structures that guide protein-chain geometry.	Templates
`num_samples`	No	How many independent structure samples to predict.	—

Entities (`entities`)

entities is the list of chains in your complex. Each has a type (protein, rna, dna, ligand_smiles, or ligand_ccd), a value (sequence, SMILES, or CCD code), and chain_ids. Polymer entities accept optional modifications and a cyclic flag. Protein entities also accept an msa: { "type": "empty" } to predict without an MSA, or { "type": "custom", "format": "a3m" | "csv", "source": ... } to supply your own (by URL or base64). See Core Concepts for entities, constraints, modifications, and bonds.
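
As a minimal sketch, the entity shape described above can be assembled with a small helper before submitting. The helper name make_entity is hypothetical and not part of the Boltz client; it only builds the plain dicts shown in the example and rejects unknown type values. The check that chain_ids is non-empty is an assumption.

```python
ENTITY_TYPES = {"protein", "rna", "dna", "ligand_smiles", "ligand_ccd"}


def make_entity(entity_type, value, chain_ids, **optional):
    """Build one entry for the `entities` list (hypothetical helper)."""
    if entity_type not in ENTITY_TYPES:
        raise ValueError(f"unknown entity type: {entity_type!r}")
    if not chain_ids:
        # Assumption: every entity needs at least one chain ID.
        raise ValueError("chain_ids must list at least one chain")
    entity = {"type": entity_type, "value": value, "chain_ids": list(chain_ids)}
    entity.update(optional)  # e.g. msa={"type": "empty"} on a protein entity
    return entity


entities = [
    # Protein predicted without an MSA.
    make_entity("protein", "MKTIIALSYIFCLVFA", ["A"], msa={"type": "empty"}),
    # Ligand given as a CCD code.
    make_entity("ligand_ccd", "AIN", ["B"]),
]
```

The resulting list can be passed as the entities field of the input body.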

Binding (`binding`)

Providing binding tells the model to compute binding metrics alongside the structure. Omit it for a structure-only prediction. It has two shapes, selected by type:

Ligand–protein (type: "ligand_protein_binding"): set binder_chain_id to the ligand chain. The ligand must have exactly one copy and the complex must contain only proteins and ligands.
Protein–protein (type: "protein_protein_binding"): set binder_chain_ids to one or more protein binder chains.

When binding is requested, the output includes binding_metrics in addition to the per-sample structure results.
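
The ligand–protein rules above can be checked locally before submitting. The following is a sketch under two assumptions: "exactly one copy" is read as the ligand entity having a single entry in chain_ids, and the function name check_ligand_protein_binding is made up for illustration. The service may apply further validation.

```python
LIGAND_TYPES = {"ligand_smiles", "ligand_ccd"}


def check_ligand_protein_binding(entities, binder_chain_id):
    """Return a list of problems; an empty list means no problem was found."""
    problems = []
    allowed = LIGAND_TYPES | {"protein"}
    for entity in entities:
        if entity["type"] not in allowed:
            problems.append(
                f"{entity['type']} entity is not allowed with ligand-protein binding"
            )
    # Find the entity that owns the binder chain.
    binders = [e for e in entities if binder_chain_id in e["chain_ids"]]
    if len(binders) != 1 or binders[0]["type"] not in LIGAND_TYPES:
        problems.append(f"chain {binder_chain_id!r} is not a ligand chain")
    elif len(binders[0]["chain_ids"]) != 1:
        # Assumption: one copy means one chain ID on the ligand entity.
        problems.append("the binder ligand must have exactly one copy")
    return problems


example_entities = [
    {"type": "protein", "value": "MKTIIALSYIFCLVFA", "chain_ids": ["A"]},
    {"type": "ligand_ccd", "value": "AIN", "chain_ids": ["B"]},
]
problems = check_ligand_protein_binding(example_entities, "B")
```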

Constraints (`constraints`)

constraints steer where chains sit relative to each other. Each entry is one of two types:

Pocket (type: "pocket"): keep binder_chain_id near the receptor residues in contact_residues (keyed by chain ID, 0-indexed), within max_distance_angstrom (typical 4–8 Å).
Contact (type: "contact"): keep two tokens within max_distance_angstrom. Each token is a polymer_contact (chain_id + residue_index) or a ligand_contact (chain_id + atom_name).

Set force: true to hard-enforce a constraint instead of biasing toward it. Atom-level ligand references (ligand_contact) support ligand_ccd entities only, not ligand_smiles.
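
Because contact_residues is 0-indexed while residue numbers in papers and structure viewers usually start at 1, an off-by-one error is easy to make. A small sketch of the conversion follows; pocket_constraint is a hypothetical helper, and its default distance of 6.0 Å is simply the value used in the example input.

```python
def pocket_constraint(binder_chain_id, receptor_chain_id, residue_numbers,
                      max_distance_angstrom=6.0, force=False):
    """Build a pocket constraint from 1-indexed residue numbers."""
    if min(residue_numbers) < 1:
        raise ValueError("residue numbers are 1-indexed and must be >= 1")
    return {
        "type": "pocket",
        "binder_chain_id": binder_chain_id,
        # Convert 1-indexed residue numbers to the 0-indexed form the input uses.
        "contact_residues": {receptor_chain_id: [n - 1 for n in residue_numbers]},
        "max_distance_angstrom": max_distance_angstrom,
        "force": force,
    }


# Residues 11-13 (1-indexed) of chain A become indices 10-12.
constraint = pocket_constraint("B", "A", [11, 12, 13])
```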

Bonds (`bonds`)

bonds declares covalent bonds between specific atoms (for example a disulfide or a covalent ligand). Each bond has atom1 and atom2, where an atom is a polymer_atom (chain_id, residue_index, atom_name) or a ligand_atom (chain_id, atom_name). Ligand atoms support ligand_ccd entities only.
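
As an illustration, a disulfide bond can be built by locating the cysteines in a sequence and linking their SG atoms. This sketch assumes residue_index in bonds is 0-indexed like contact_residues, which matches the example input but is not stated explicitly. The sequence and the helper name disulfide_bond are made up for the example.

```python
def disulfide_bond(sequence, chain_id, index1, index2):
    """Build a bond entry linking the SG atoms of two cysteines (0-indexed)."""
    for index in (index1, index2):
        if sequence[index] != "C":
            raise ValueError(f"residue {index} is {sequence[index]!r}, not cysteine")

    def sg_atom(index):
        return {
            "type": "polymer_atom",
            "chain_id": chain_id,
            "residue_index": index,
            "atom_name": "SG",
        }

    return {"atom1": sg_atom(index1), "atom2": sg_atom(index2)}


sequence = "MKCTIIALSYIFCLVFA"  # made-up sequence containing two cysteines
cysteines = [i for i, residue in enumerate(sequence) if residue == "C"]
bond = disulfide_bond(sequence, "A", *cysteines)
```

Checking the residue letter before building the bond catches index mistakes locally instead of at submission time.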

Model options (`model_options`)

model_options tunes the sampler and diffusion:

Option	Default	What it does
`recycling_steps`	3	Number of recycling steps during prediction.
`sampling_steps`	200	Number of diffusion sampling steps.
`step_scale`	1.638	Diffusion step scale (temperature); higher values produce more varied poses.
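
One way to keep overrides explicit is to merge them onto the documented defaults and reject misspelled keys. This is a sketch only: the helper name model_options is hypothetical, and the value 1.8 is an arbitrary example, not a recommendation.

```python
# Defaults copied from the table above.
DEFAULT_MODEL_OPTIONS = {
    "recycling_steps": 3,
    "sampling_steps": 200,
    "step_scale": 1.638,
}


def model_options(**overrides):
    """Return the defaults with the given overrides applied."""
    unknown = set(overrides) - set(DEFAULT_MODEL_OPTIONS)
    if unknown:
        raise ValueError(f"unknown model options: {sorted(unknown)}")
    return {**DEFAULT_MODEL_OPTIONS, **overrides}


# Raise step_scale for more varied poses; the other options keep their defaults.
options = model_options(step_scale=1.8)
```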

Templates (`templates`)

templates supplies up to four CIF or PDB structures that guide Boltz-2.1's protein-chain geometry at inference time. Each template has a template_structure (an HTTPS url, or a base64 upload with media_type chemical/x-cif or chemical/x-pdb); the file format is inferred from the URL extension or base64 media type. Use template_chains to map each request chain (input_chain_id) to the matching chain in the template file (template_chain_id). Set force_threshold_angstroms to force the template reference within that distance; omit it to use the template without hard forcing.
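
The URL form of a template can be sketched as below. Both helper names are hypothetical, and only the url variant of template_structure is covered; the base64 variant is left out because its exact field layout is not shown in this guide.

```python
MAX_TEMPLATES = 4  # limit stated above


def url_template(url, chain_map, force_threshold_angstroms=None):
    """Build one template entry from a URL and a request-chain -> template-chain map."""
    template = {
        "template_structure": {"type": "url", "url": url},
        "template_chains": [
            {"input_chain_id": input_chain, "template_chain_id": template_chain}
            for input_chain, template_chain in chain_map.items()
        ],
    }
    if force_threshold_angstroms is not None:
        # Omitting this field uses the template without hard forcing.
        template["force_threshold_angstroms"] = force_threshold_angstroms
    return template


def check_templates(templates):
    """Raise if the list breaks the limits described above."""
    if len(templates) > MAX_TEMPLATES:
        raise ValueError(f"at most {MAX_TEMPLATES} templates are supported")
    for template in templates:
        if not template["template_structure"]["url"].startswith("https://"):
            raise ValueError("template URLs must use HTTPS")


templates = [
    url_template("https://files.rcsb.org/download/1CRN.cif", {"A": "A"}, 5.0)
]
check_templates(templates)
```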

Output format

A prediction returns a single object; results are not paginated or streamed.

When you download with run(), with start() followed by client.experiments.download_results(), or with the CLI's download-results, the result lands in a self-contained run directory:

boltz-experiments/
└── my-prediction/                # the name you chose (or an auto-generated one)
    ├── .boltz-run.json           # run + resume state, managed for you (don't edit)
    ├── run.json                  # the prediction object: status, model, metrics (download URLs stripped)
    └── outputs/
        ├── archive.tar.gz        # the downloaded result archive
        └── files/                # extracted from the archive
            ├── sample_0.cif      # one predicted structure per requested sample
            └── ...

A prediction has a single output, so it lands under outputs/ rather than the results/index.jsonl + results/<result-id>/ layout the design and screening guides use for their many results.
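
A downloaded run directory can be read back with the standard library alone. The sketch below assumes the layout shown above (run.json at the top level, extracted .cif files under outputs/files/); load_run is a hypothetical helper, not a client method.

```python
import json
from pathlib import Path


def load_run(run_dir):
    """Return the prediction object from run.json and the extracted .cif paths."""
    run_dir = Path(run_dir)
    run = json.loads((run_dir / "run.json").read_text())
    # Sorted by file name; with 10+ samples, sample_10 sorts before sample_2.
    samples = sorted((run_dir / "outputs" / "files").glob("*.cif"))
    return run, samples
```

For example, load_run("boltz-experiments/my-prediction") returns the prediction object as a dict, so run["status"] and run["model"] are available without another API call, along with the local paths of the structure files.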

The prediction object is what retrieve() returns and what run.json mirrors. Poll its status; output is null until the prediction succeeds, then carries the structure samples and binding metrics:

{
  "id": "pred_8f3a2b",
  "status": "succeeded", # pending | running | succeeded | failed
  "model": "boltz-2.1",
  "error": None, # { code, message } once status is "failed"
  "output": { # null until status is "succeeded"
    "best_sample": {}, # the highest-confidence sample (same shape as an entry in all_sample_results)
    "all_sample_results": [
      {
        "metrics": {
          "structure_confidence": 0.91, # 0–1; confidence in the predicted structure
          "ptm": 0.92, # 0–1; global predicted TM-score
          "iptm": 0.86, # 0–1; interface predicted TM-score
          "complex_plddt": 0.95, # 0–1; per-residue confidence averaged over the complex
          "complex_iplddt": 0.88, # 0–1; confidence at inter-chain interfaces
          "complex_pde": 1.2, # predicted distance error; lower is better
          "complex_ipde": 1.8 # interface predicted distance error; lower is better
          # ligand_iptm / protein_iptm appear when ligands / multiple proteins are present
        },
        "structure": { # predicted structure (.cif) for this sample
          "url": "https://.../sample_0.cif",
          "url_expires_at": "2026-02-25T14:03:40Z"
        }
      }
      # one entry per requested sample (num_samples)
    ],
    "binding_metrics": { # present only when "binding" was requested
      "type": "ligand_protein_binding_metrics",
      "binding_confidence": 0.88, # 0–1; confidence protein binding occurs (affinity probability + structural quality); 0.7+ high-confidence
      "optimization_score": 0.53 # 0–1; ranks relative binding strength for lead optimization (ligand–protein only); higher is better
    },
    "archive": { # optional: full result archive (.tar.gz)
      "url": "https://.../archive.tar.gz",
      "url_expires_at": "2026-02-25T14:03:40Z"
    }
  }
}
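
The structure and archive URLs carry a url_expires_at timestamp, so it can be worth checking freshness before downloading. A minimal sketch on the plain-dict form of the object follows; url_is_fresh is a hypothetical helper, and the example URL is a placeholder.

```python
from datetime import datetime, timezone


def url_is_fresh(file_ref, now=None):
    """Return True if the download URL in `file_ref` has not expired yet."""
    now = now or datetime.now(timezone.utc)
    # Replace the trailing "Z" so fromisoformat accepts it on older Python versions.
    expires = datetime.fromisoformat(file_ref["url_expires_at"].replace("Z", "+00:00"))
    return now < expires


structure = {
    "url": "https://example.invalid/sample_0.cif",  # placeholder URL
    "url_expires_at": "2026-02-25T14:03:40Z",
}
before_expiry = datetime(2026, 2, 25, 14, 0, 0, tzinfo=timezone.utc)
fresh = url_is_fresh(structure, now=before_expiry)
```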

Use the API directly

For full control, or in TypeScript, where there's no managed download, drive the REST API yourself: submit with start(), then poll the prediction with retrieve() until it finishes and read the samples and binding metrics. A prediction has no list_results or stop; results arrive all at once when it succeeds. See Output format for the object shape. Request multiple structure samples with num_samples; output.best_sample is the highest-confidence one.

import time

prediction = client.predictions.structure_and_binding.start(model="boltz-2.1", input=prediction_input)

while prediction.status not in ("succeeded", "failed"):
    time.sleep(5)
    prediction = client.predictions.structure_and_binding.retrieve(prediction.id)

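if prediction.status == "succeeded":
    print(f"Binding confidence: {prediction.output.binding_metrics.binding_confidence}")
    for sample in prediction.output.all_sample_results:
        print(f"{sample.metrics.structure_confidence:.2f}  {sample.structure.url}")
else:
    # status is "failed": the error field carries a code and a message
    print(f"Prediction failed: {prediction.error.code}: {prediction.error.message}")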

boltz-api predictions:structure-and-binding retrieve --id "$PREDICTION_ID"  # rerun until status is succeeded or failed

let prediction = await client.predictions.structureAndBinding.start({
  model: "boltz-2.1",
  input: {
    entities: [
      { type: "protein", value: "MKTIIALSYIFCLVFA", chain_ids: ["A"] },
      { type: "ligand_smiles", value: "CC(=O)OC1=CC=CC=C1C(=O)O", chain_ids: ["B"] },
    ],
    num_samples: 3,
    binding: { type: "ligand_protein_binding", binder_chain_id: "B" },
  },
});

while (prediction.status !== "succeeded" && prediction.status !== "failed") {
  await new Promise((r) => setTimeout(r, 5000));
  prediction = await client.predictions.structureAndBinding.retrieve(prediction.id);
}

if (prediction.status === "succeeded") {
  console.log(`Binding confidence: ${prediction.output.binding_metrics.binding_confidence}`);
  for (const sample of prediction.output.all_sample_results) {
    console.log(`${sample.metrics.structure_confidence}  ${sample.structure.url}`);
  }
}
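
The polling loops above run until a terminal status arrives. For unattended scripts, a variant with a timeout may be safer. The sketch below is one way to do that; wait_for and its parameters are hypothetical, and it takes the retrieve function as an argument so it can be exercised without network access.

```python
import time

TERMINAL_STATUSES = {"succeeded", "failed"}


def wait_for(retrieve, prediction_id, interval=5.0, timeout=3600.0,
             sleep=time.sleep, clock=time.monotonic):
    """Poll `retrieve(prediction_id)` until a terminal status or the timeout."""
    deadline = clock() + timeout
    while True:
        prediction = retrieve(prediction_id)
        if prediction.status in TERMINAL_STATUSES:
            return prediction
        if clock() >= deadline:
            raise TimeoutError(f"prediction {prediction_id} still {prediction.status}")
        sleep(interval)
```

With the Python client from the earlier examples, this would be called as wait_for(client.predictions.structure_and_binding.retrieve, prediction.id). Giving up on the client side does not stop the prediction, which still runs to completion.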

Metrics

Metric	Range	What it measures
`structure_confidence`	0–1	Measures the confidence of the predicted structure (0 = low, 1 = high).
`ptm`	0–1	Global predicted TM-score.
`iptm`	0–1	Interface predicted TM-score. `ligand_iptm` / `protein_iptm` appear per interface type.
`complex_plddt`	0–1	Per-residue confidence averaged over the complex.
`complex_iplddt`	0–1	Confidence at inter-chain interfaces.
`complex_pde`	Ångström	Predicted distance error across the complex. Lower is better.
`complex_ipde`	Ångström	Interface predicted distance error. Lower is better.
`binding_confidence`	0–1	Confidence that protein binding occurs, combining affinity probability with structural quality. For triage, 0.7+ is typically the high-confidence range. Present when binding is requested.
`optimization_score`	0–1	Ranks relative binding strength for lead optimization (ligand–protein binding only), normalized 0–1; higher is better. Use it to prioritize the top-scoring candidates within the same run.
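
As a rough sketch of how these metrics might be used for triage, the functions below work on the plain-dict form of output. Ranking samples by structure_confidence is an assumption made for illustration (the guide does not say which metric selects best_sample), and both function names are hypothetical.

```python
HIGH_CONFIDENCE = 0.7  # triage threshold quoted for binding_confidence above


def rank_samples(output):
    """Sort samples from highest to lowest structure_confidence."""
    return sorted(
        output["all_sample_results"],
        key=lambda sample: sample["metrics"]["structure_confidence"],
        reverse=True,
    )


def binding_call(output):
    """Classify binding_confidence against the triage threshold."""
    metrics = output.get("binding_metrics")
    if metrics is None:
        return "not requested"  # binding was omitted from the input
    if metrics["binding_confidence"] >= HIGH_CONFIDENCE:
        return "high"
    return "below threshold"


example_output = {
    "all_sample_results": [
        {"metrics": {"structure_confidence": 0.80}},
        {"metrics": {"structure_confidence": 0.91}},
    ],
    "binding_metrics": {"binding_confidence": 0.88, "optimization_score": 0.53},
}
ranked = rank_samples(example_output)
call = binding_call(example_output)
```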

Status values

Status	Meaning
`pending`	The prediction is queued and has not started yet.
`running`	The prediction is currently being computed.
`succeeded`	The prediction completed successfully. Results are available.
`failed`	The prediction encountered an error. Check the `error` field.
