
Design proteins

Generate novel protein binders against a target, monitor progress, fetch results, and stop early if needed.

Protein design generates novel protein binders optimized for binding confidence and structure confidence. You give it a target to design against and a binder specification describing what to design. It streams back scored binders as they're generated, so you can fetch them before the run finishes and stop early.

run() submits the design, waits while binders are generated, and downloads scored results to a local directory. Use start() + client.experiments.download_results() to submit now and download later; download_results() resumes if the download is interrupted.

import os

from boltz_api import Boltz

client = Boltz(api_key=os.environ["BOLTZ_API_KEY"])

target = {
    "type": "no_template",
    "entities": [{"type": "protein", "value": "MKTIIALSYIFCLVFA", "chain_ids": ["A"]}],
    "epitope_residues": {"A": [10, 11, 12]},
}  # see Input format for structure templates, rules, ...

binder_specification = {
    "type": "no_template",
    "modality": "custom_protein",
    "entities": [{"type": "designed_protein", "chain_ids": ["B"], "value": "10..20"}],
}

# One call: submit, wait, and download results to a run directory.
run_dir = client.protein.design.run(target=target, binder_specification=binder_specification, num_proteins=10, name="my-design")

# ...or submit now and download later:
design = client.protein.design.start(target=target, binder_specification=binder_specification, num_proteins=10)
run_dir = client.experiments.download_results(id=design.id, name="my-design")  # rerun to resume an interrupted download

A design run pairs a target (the molecule or complex you're designing a binder against) with a binder specification that says what to design. For each, you choose how you're providing it: use No template if you only have a sequence, or Structure template if you have a 3D structure (a CIF file) to guide the design.

The example request below uses the no-template form for both target and binder_specification. The deep-dive sections explain each part, and Run shows the run() / start() calls that consume it.

{
  "target": {
    "type": "no_template",
    "entities": [
      # entity types: protein | rna | dna | ligand_smiles | ligand_ccd (at least one)
      # polymer entities also accept optional "modifications" and "cyclic": true
      { "type": "protein", "value": "MKTIIALSYIFCLVFA", "chain_ids": ["A"] },
      { "type": "ligand_ccd", "value": "ATP", "chain_ids": ["L1"] }
    ],
    "epitope_residues": { "A": [13, 14, 15] }, # optional: keyed by chain ID; residues the binder should contact (0-indexed)
    "non_binding_residues": { "A": [0, 1, 2] }, # optional: keyed by chain ID; residues to keep the binder away from
    "epitope_ligand_chains": ["L1"] # optional: ligand chain IDs that form part of the epitope
    # also optional: "constraints" and "bonds" (see Core Concepts)
  },
  "binder_specification": {
    "type": "no_template",
    "modality": "custom_protein", # peptide | antibody | nanobody | custom_protein
    "entities": [
      # at least one designed_protein; UPPERCASE = fixed residues, numbers/ranges = designed length
      # may also include fixed protein | rna | dna | ligand_smiles | ligand_ccd entities
      { "type": "designed_protein", "chain_ids": ["B"], "value": "MKTAYI10..20VKSHFSRQ" }
    ],
    "rules": {
      "excluded_amino_acids": ["C"], # optional: single-letter codes to exclude
      "max_hydrophobic_fraction": 0.5, # optional: cap hydrophobic-residue fraction (0–1)
      "excluded_sequence_motifs": ["NXS"] # optional: reject these motifs; X = any residue
    }
    # "rules" is optional; "bonds" is also optional (see Core Concepts)
  },
  "num_proteins": 10 # how many binders to generate (10 to 1,000,000)
}
| Field | Required | What it is | Link |
| --- | --- | --- | --- |
| target | Yes | The molecule or complex you're designing a binder against. | Target |
| binder_specification | Yes | What to design: a scaffold to redesign, or a sequence pattern to generate. | Binder |
| num_proteins | Yes | How many binders to generate. | |

The target is the molecule or complex you're designing a binder against: one or more proteins, optionally with nucleic acids or ligands. The type field picks how you provide it.

No template (type: "no_template")

Use this when you only have sequences. List the target's entities (proteins, RNA, DNA, or ligands by SMILES or CCD code) and the pipeline assembles the complex without a reference structure. Polymer entities can carry modifications and a cyclic flag.

You then shape where the binder engages:

  • Epitope: epitope_residues marks the residues you want the binder to contact, keyed by chain ID (0-indexed). To put a whole ligand in the epitope, list its chain in epitope_ligand_chains.
  • Non-binding residues: non_binding_residues marks residues to steer the binder away from. They can't overlap the epitope on the same chain.
  • Constraints and bonds: constraints (pocket and contact) and bonds (covalent links) give finer geometric control. See Core Concepts.
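
The overlap rule above can be checked locally before submitting. The following is a minimal sketch, not part of the Boltz client: check_site_annotations is a hypothetical helper, and the range check assumes that a 0-indexed residue must be smaller than the length of the chain's sequence value.

```python
def check_site_annotations(target: dict) -> list[str]:
    """Return a list of problems found in a no_template target's site annotations."""
    # Map each polymer chain ID to the length of its sequence.
    lengths = {}
    for entity in target.get("entities", []):
        if entity["type"] in ("protein", "rna", "dna"):
            for chain_id in entity["chain_ids"]:
                lengths[chain_id] = len(entity["value"])

    problems = []
    epitope = target.get("epitope_residues", {})
    avoid = target.get("non_binding_residues", {})

    # Assumption: indices are 0-based positions into the chain's sequence.
    for field, by_chain in (("epitope_residues", epitope), ("non_binding_residues", avoid)):
        for chain_id, indices in by_chain.items():
            if chain_id not in lengths:
                problems.append(f"{field}: unknown polymer chain {chain_id!r}")
                continue
            bad = [i for i in indices if not 0 <= i < lengths[chain_id]]
            if bad:
                problems.append(f"{field}: chain {chain_id} indices out of range: {bad}")

    # Epitope and non-binding residues must not overlap on the same chain.
    for chain_id in sorted(epitope.keys() & avoid.keys()):
        overlap = sorted(set(epitope[chain_id]) & set(avoid[chain_id]))
        if overlap:
            problems.append(
                f"chain {chain_id}: residues in both epitope and non-binding sets: {overlap}"
            )
    return problems
```

An empty list means the annotations are consistent with each other; the service may still apply checks that this sketch does not know about.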

Structure template (type: "structure_template")


Use this when you have a 3D structure. Provide the CIF file as structure (base64-encoded or a URL) and choose the chains to use in chain_selection. Only the chains you list are included; anything else in the file is ignored.
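
For the base64 route, the file contents can be encoded with the standard library. This is a sketch only: encode_cif is a hypothetical helper, and how the resulting string is wrapped inside the structure field is not shown on this page, so check the Input format reference before relying on it.

```python
import base64
from pathlib import Path


def encode_cif(path: str) -> str:
    """Read a CIF file and return its bytes as a base64-encoded ASCII string."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")
```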

For each polymer chain you select:

  • Crop: crop_residues chooses which residues to keep ("all", or a list of 0-indexed positions). Cropping away distant regions focuses the pipeline on the binding site.
  • Epitope, non-binding, flexible: the same epitope and non-binding concepts as above, plus flexible_residues, the residues allowed to move during design (e.g. a flexible loop). Every index must fall within the cropped set.

Ligand chains are given as { "chain_type": "ligand" } and are always kept whole.
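
The "every index must fall within the cropped set" rule can be checked with a few lines of set arithmetic. This is an illustrative sketch: check_chain_indices is a hypothetical helper, it takes plain lists rather than the real chain_selection object, and it assumes indices refer to positions in the original (uncropped) chain.

```python
def check_chain_indices(chain_length, crop_residues, **index_lists):
    """Return {field_name: [indices outside the cropped set]} for one chain."""
    # "all" keeps every residue; otherwise only the listed 0-indexed positions.
    if crop_residues == "all":
        kept = set(range(chain_length))
    else:
        kept = set(crop_residues)

    outside = {}
    for name, indices in index_lists.items():
        missing = sorted(set(indices) - kept)
        if missing:
            outside[name] = missing
    return outside


# Residue 14 is flexible but was cropped away, so it is reported.
print(check_chain_indices(
    100,
    [10, 11, 12, 13],
    epitope_residues=[11, 12],
    flexible_residues=[13, 14],
))  # {'flexible_residues': [14]}
```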

The binder specification is what you're designing. It's independent of the target: any target form works with any binder form. All three forms share two settings:

  • Modality: modality declares the kind of binder (peptide, antibody, nanobody, or custom_protein). It sets the design priors the pipeline uses.
  • Rules (optional): rules constrains the designed sequence. excluded_amino_acids drops specific residues, max_hydrophobic_fraction caps the fraction of hydrophobic residues (0–1), and excluded_sequence_motifs rejects sequence motifs (X is a wildcard, so "NXS" avoids N-glycosylation sites).
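
To make the three rules concrete, here is a small local checker that tests a candidate sequence against a rules object. It is a sketch under assumptions: rule_violations and motif_to_regex are hypothetical helpers, and the HYDROPHOBIC set is an illustrative choice, since this page does not say which residues the service counts as hydrophobic.

```python
import re

# Assumption: illustrative hydrophobic set; the service's definition may differ.
HYDROPHOBIC = set("AVILMFWY")


def motif_to_regex(motif: str) -> re.Pattern:
    """Turn a motif such as "NXS" into a regex in which X matches any residue."""
    return re.compile("".join("." if ch == "X" else re.escape(ch) for ch in motif))


def rule_violations(sequence: str, rules: dict) -> list[str]:
    """Return human-readable violations of the given rules (empty if none)."""
    violations = []

    banned = set(rules.get("excluded_amino_acids", [])) & set(sequence)
    if banned:
        violations.append(f"contains excluded residues: {sorted(banned)}")

    cap = rules.get("max_hydrophobic_fraction")
    if cap is not None and sequence:
        fraction = sum(aa in HYDROPHOBIC for aa in sequence) / len(sequence)
        if fraction > cap:
            violations.append(f"hydrophobic fraction {fraction:.2f} exceeds {cap}")

    for motif in rules.get("excluded_sequence_motifs", []):
        if motif_to_regex(motif).search(sequence):
            violations.append(f"contains motif {motif}")

    return violations


rules = {
    "excluded_amino_acids": ["C"],
    "max_hydrophobic_fraction": 0.5,
    "excluded_sequence_motifs": ["NXS"],
}
print(rule_violations("GSNGSK", rules))  # ['contains motif NXS']
```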

Pick one of three forms:

No template (type: "no_template")

Design from sequence, with no starting structure. Provide entities with at least one designed_protein entity. Its value interleaves fixed residues (UPPERCASE) with designed regions, each written as a length or a min..max range:

  • "MKTAYI5..10VKSHFSRQ": fixed MKTAYI, then 5–10 designed residues, then fixed VKSHFSRQ
  • "20": 20 fully designed residues
  • "ACDE8GHI": fixed ACDE, 8 designed residues, then fixed GHI
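
The pattern syntax in the examples above can be parsed with one regular expression, which is handy for checking the total length a value allows before submitting it. This is a sketch of the syntax as described here, not the service's own parser; parse_design_pattern is a hypothetical helper.

```python
import re

# One token is either a run of uppercase letters (fixed residues)
# or a number with an optional "..max" suffix (a designed region).
TOKEN = re.compile(r"([A-Z]+)|(\d+)(?:\.\.(\d+))?")


def parse_design_pattern(value: str):
    """Return (segments, (min_length, max_length)) for a designed_protein value."""
    segments = []
    pos = 0
    while pos < len(value):
        match = TOKEN.match(value, pos)
        if match is None:
            raise ValueError(f"unexpected character {value[pos]!r} at position {pos}")
        if match.group(1):
            segments.append(("fixed", match.group(1)))
        else:
            low = int(match.group(2))
            high = int(match.group(3)) if match.group(3) else low
            if low > high:
                raise ValueError(f"range {low}..{high} has min greater than max")
            segments.append(("designed", (low, high)))
        pos = match.end()

    min_length = sum(len(s) if kind == "fixed" else s[0] for kind, s in segments)
    max_length = sum(len(s) if kind == "fixed" else s[1] for kind, s in segments)
    return segments, (min_length, max_length)


segments, bounds = parse_design_pattern("MKTAYI5..10VKSHFSRQ")
print(segments)  # [('fixed', 'MKTAYI'), ('designed', (5, 10)), ('fixed', 'VKSHFSRQ')]
print(bounds)    # (19, 24)
```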

You can add fixed entities (protein, RNA, DNA, ligand) to form a larger complex, and bonds for covalent links.

Structure template (type: "structure_template")


Redesign parts of an existing scaffold. Upload it as structure and select chains with chain_selection, cropping each as needed. Mark the regions to redesign with design_motifs, a list of two kinds:

  • Replacement (type: "replacement"): replace residues start_index to end_index (inclusive, 0-indexed) with a designed region sized by design_length_range {min, max}.
  • Insertion (type: "insertion"): insert a designed region after after_residue_index (use -1 to insert before the first residue), sized by design_length_range.

Leave design_motifs off a chain to keep it as fixed scaffold context.
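
As a worked example of the index arithmetic, the sketch below computes the range of chain lengths a set of motifs can produce. designed_length_range is a hypothetical helper; it uses the field names described above and assumes the motifs do not overlap one another.

```python
def designed_length_range(scaffold_length: int, design_motifs: list[dict]) -> tuple[int, int]:
    """Return (min, max) chain length after applying replacement and insertion motifs."""
    min_len = max_len = scaffold_length
    for motif in design_motifs:
        size = motif["design_length_range"]
        if motif["type"] == "replacement":
            # start_index and end_index are both inclusive, hence the +1.
            removed = motif["end_index"] - motif["start_index"] + 1
        elif motif["type"] == "insertion":
            removed = 0  # insertions keep every scaffold residue
        else:
            raise ValueError(f"unknown motif type: {motif['type']!r}")
        min_len += size["min"] - removed
        max_len += size["max"] - removed
    return min_len, max_len


motifs = [
    # Replace residues 10..19 (10 residues) with 5 to 15 designed residues.
    {"type": "replacement", "start_index": 10, "end_index": 19,
     "design_length_range": {"min": 5, "max": 15}},
    # Insert exactly 3 designed residues before the first residue.
    {"type": "insertion", "after_residue_index": -1,
     "design_length_range": {"min": 3, "max": 3}},
]
print(designed_length_range(100, motifs))  # (98, 108)
```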

Let Boltz supply the scaffold. Set binder to a curated family (boltz_nanobody or boltz_antibody), and Boltz designs from template lists that it maintains and updates over time.

run() and start() + client.experiments.download_results() (or the CLI's download-results) poll on your behalf, append each result to the manifest as it's generated, and download its files into a self-contained run directory. Rerun with the same name to resume.

boltz-experiments/<name>/
boltz-experiments/
└── my-run/                                  # the name you chose (or an auto-generated one)
    ├── .boltz-run.json                      # run + resume state, managed for you (don't edit)
    ├── run.json                             # the run object: status, progress, engine (download URLs stripped)
    └── results/
        ├── index.jsonl                      # the manifest: one JSON record per result
        └── <result-id>/
            ├── archive.tar.gz               # the downloaded result archive
            ├── metadata.json                # this result's fields (metrics, sequence/SMILES, …)
            └── files/                       # extracted from the archive
                ├── metrics.json
                ├── <result-id>_predicted.cif  # predicted structure
                └── pae.npz

results/index.jsonl is what you read to triage a run: one compact JSON record per result, appended as results arrive. Each record mirrors the API result minus its artifacts (those are short-lived download URLs), and adds a paths map pointing at the files downloaded for that result. Each record also carries the designed binder entities (the generated sequence is the value of the chain you marked for design) and all metrics.

one results/index.jsonl record (pretty-printed; the file stores one per line)
{
  "id": "<result-id>",
  "created_at": "2026-02-25T13:03:40Z",
  "metrics": { "binding_confidence": 0.94, "structure_confidence": 0.95 },
  "paths": {
    "archive": "results/<result-id>/archive.tar.gz",
    "files": "results/<result-id>/files",
    "metrics": "results/<result-id>/files/metrics.json",
    "structure": "results/<result-id>/files/<result-id>_predicted.cif",
    "pae": "results/<result-id>/files/pae.npz"
  }
}
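
Because the manifest is plain JSON Lines, it can be triaged with the standard library alone. The sketch below assumes the record shape shown above; load_index and top_results are hypothetical helpers, and the 0.7 default is the high-confidence threshold this page suggests for binding_confidence.

```python
import json
from pathlib import Path


def load_index(run_dir):
    """Read results/index.jsonl from a run directory into a list of dicts."""
    index_path = Path(run_dir) / "results" / "index.jsonl"
    records = []
    with index_path.open() as fh:
        for line in fh:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records


def top_results(run_dir, min_binding=0.7, limit=5):
    """Return up to `limit` records at or above the threshold, best first."""
    records = [
        r for r in load_index(run_dir)
        if r["metrics"]["binding_confidence"] >= min_binding
    ]
    records.sort(key=lambda r: r["metrics"]["binding_confidence"], reverse=True)
    return records[:limit]
```

If you read the manifest while a run is still in progress, the last line may be incomplete; wrapping json.loads in a try/except for that line is one way to handle it.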

Everything is downloaded by default. To keep just the manifest and skip the archives, pass download_mode="metadata_only".

Each result is a scored, designed complex. This is what list_results() streams (and what each index.jsonl record mirrors):

{
  "data": [
    {
      "id": "prot_des_result_8f3a2b", # unique result ID
      "created_at": "2026-02-25T13:03:40Z",
      "entities": [
        # the designed binder complex: designed chains plus any fixed entities from the input.
        # the generated binder sequence is the "value" of the chain you marked for design (chain "B" here)
        { "type": "protein", "chain_ids": ["A"], "value": "MKTIIALSYIFCLVFA" },
        { "type": "protein", "chain_ids": ["B"], "value": "GSAEELKKLAEELAKQGNSEEVKKLAEKLAQ" }
      ],
      "metrics": {
        "binding_confidence": 0.88, # 0–1; confidence protein binding occurs (affinity probability + structural quality); 0.7+ high-confidence
        "structure_confidence": 0.91, # 0–1; confidence in the predicted structure
        "iptm": 0.86, # 0–1; interface predicted TM-score
        "min_interaction_pae": 4.9, # Ångström; interface error, lower is better
        "helix_fraction": 0.74, # 0–1; fraction of the designed sequence in alpha helices
        "sheet_fraction": 0.0, # 0–1; fraction in beta sheets
        "loop_fraction": 0.26 # 0–1; fraction in coil/loop regions
      },
      "artifacts": {
        # short-lived presigned download URLs; check url_expires_at and download promptly
        "structure": { # predicted bound structure (.cif); may be null until ready
          "url": "https://.../structure.cif",
          "url_expires_at": "2026-02-25T14:03:40Z"
        },
        "archive": { # full result archive (.tar.gz): structure, metrics.json, and pae.npz
          "url": "https://.../archive.tar.gz",
          "url_expires_at": "2026-02-25T14:03:40Z"
        }
      },
      "warnings": [] # optional quality warnings for this result, if any
    }
    # ...more results on this page
  ],
  "has_more": True, # true if more pages remain
  "first_id": "prot_des_result_8f3a2b", # ID of the first item; pass as before_id for the previous page
  "last_id": "prot_des_result_4ab7e0" # ID of the last item; pass as after_id for the next page
}
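
If you page through results yourself rather than iterating the client's list_results(), the cursor fields above drive a simple loop. This is a sketch of that loop only: fetch_page is a hypothetical callable that you supply (it takes an after_id or None and returns one page shaped like the object above), so no HTTP details are assumed here.

```python
from typing import Callable, Iterator, Optional


def iter_all_results(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every result by following last_id cursors until has_more is false."""
    after_id = None
    while True:
        page = fetch_page(after_id)
        yield from page["data"]
        # Stop when the server reports no more pages (or returns an empty page).
        if not page.get("has_more") or not page["data"]:
            break
        after_id = page["last_id"]


# Usage with two canned pages standing in for real responses.
pages = {
    None: {"data": [{"id": "a"}, {"id": "b"}], "has_more": True, "first_id": "a", "last_id": "b"},
    "b": {"data": [{"id": "c"}], "has_more": False, "first_id": "c", "last_id": "c"},
}
print([r["id"] for r in iter_all_results(lambda after_id: pages[after_id])])  # ['a', 'b', 'c']
```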

The run object tracks status and progress. It's what retrieve() returns, and what run.json mirrors:

{
  "id": "prot_des_run_8f3a2b",
  "status": "running", # pending | running | succeeded | failed | stopped
  "progress": {
    "total_proteins_to_generate": 100,
    "num_proteins_generated": 37, # generated and available to download so far
    "latest_result_id": "prot_des_result_8f3a2b"
  },
  "error": None, # { code, message } once status is "failed"
  "pipeline": "boltzprot",
  "pipeline_version": "1.0",
  "livemode": True, # false for runs created with a test key
  "workspace_id": "ws_3a2b",
  "created_at": "2026-02-25T12:00:00Z",
  "started_at": "2026-02-25T12:00:05Z",
  "completed_at": None, # set when the run finishes
  "stopped_at": None, # set if you stop the run early
  "data_deleted_at": None # set once the run's data is deleted
  # "input" echoes the request you submitted (null after data deletion)
}
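
For logging or dashboards, the run object condenses into a single status line. The sketch below works on a plain dict with the fields shown above (for example, the contents of run.json); summarize_run is a hypothetical helper, not part of the client.

```python
def summarize_run(run: dict) -> str:
    """Format a run's status, progress, and error (if any) as a single line."""
    progress = run["progress"]
    total = progress["total_proteins_to_generate"]
    done = progress["num_proteins_generated"]
    percent = 100 * done / total if total else 0
    line = f"{run['id']}: {run['status']} {done}/{total} ({percent:.0f}%)"
    # The error object is only populated once status is "failed".
    if run["status"] == "failed" and run.get("error"):
        line += f" error={run['error']['code']}: {run['error']['message']}"
    return line


run = {
    "id": "prot_des_run_8f3a2b",
    "status": "running",
    "progress": {"total_proteins_to_generate": 100, "num_proteins_generated": 37},
    "error": None,
}
print(summarize_run(run))  # prot_des_run_8f3a2b: running 37/100 (37%)
```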

For full control (and the only option in TypeScript, which has no managed download), drive the REST API yourself: poll the run for status, page through results as they're generated (cursor-paginated, so you can read them before the run finishes), and stop early. See Output format for the object shapes.

import time

# Reuses client, target, and binder_specification from the first example.
design = client.protein.design.start(target=target, binder_specification=binder_specification, num_proteins=10)

# Poll the run for status and progress until it reaches a terminal status.
while design.status not in ("succeeded", "failed", "stopped"):
    time.sleep(10)
    design = client.protein.design.retrieve(design.id)
    p = design.progress
    print(f"{design.status}: {p.num_proteins_generated}/{p.total_proteins_to_generate}")

BINDER_CHAIN = "B"  # the chain you marked for design

def binder_sequence(result):
    """Return the generated sequence of the designed chain."""
    return next(e.value for e in result.entities if BINDER_CHAIN in e.chain_ids)

# Best first: highest binding confidence, then lowest interface error.
results = list(client.protein.design.list_results(design.id))
results.sort(key=lambda r: (-r.metrics.binding_confidence, r.metrics.min_interaction_pae))
for r in results[:5]:
    print(
        f"{r.id} "
        f"bind={r.metrics.binding_confidence:.2f} "
        f"struct={r.metrics.structure_confidence:.2f} "
        f"iPAE={r.metrics.min_interaction_pae:.1f}Å "
        f"{binder_sequence(r)}"
    )

# Stop early once you've collected enough (call this while the run is still
# pending or running); results already produced stay available.
client.protein.design.stop(design.id)
prot_des_result_8f3a2b bind=0.88 struct=0.91 iPAE=4.9Å GSAEELKKLAEELAKQGNSEEVKKLAEKLAQ
prot_des_result_4ab7e0 bind=0.81 struct=0.90 iPAE=5.1Å GSDELQKLAESLAKKGNTEEAKKLAEELANG
prot_des_result_1c92d4 bind=0.81 struct=0.88 iPAE=5.3Å GSKEEVERLAKKLEELGGSDEELKRLAEKLA
| Metric | Range | What it measures |
| --- | --- | --- |
| binding_confidence | 0–1 | Confidence that protein binding occurs, combining affinity probability with structural quality. For triage, 0.7+ is typically the high-confidence range. |
| structure_confidence | 0–1 | Confidence in the predicted structure (0 = low, 1 = high). |
| iptm | 0–1 | Interface predicted TM-score. Confidence in the protein–protein interface. |
| min_interaction_pae | Ångström | Minimum predicted aligned error at the interface. Lower values mean higher confidence. |
| helix_fraction | 0–1 | Fraction of the designed sequence forming alpha helices. |
| sheet_fraction | 0–1 | Fraction of the designed sequence forming beta sheets. |
| loop_fraction | 0–1 | Fraction of the designed sequence in coil/loop regions. |

| Status | Meaning |
| --- | --- |
| pending | The run is queued and has not started yet. |
| running | The run is actively generating protein binders. Results may already be available. |
| succeeded | The run completed all requested proteins. |
| failed | The run encountered an error. Check the error field. |
| stopped | The run was stopped early. Partial results are available. |