Screen protein libraries

Score your own protein sequences against a target, fetch results as they arrive, and stop early if needed.

A protein library screen scores protein sequences you provide against a target. Each protein is evaluated for binding confidence, structure confidence, and secondary structure composition. Results stream in as proteins are scored, so you can fetch them before the screen finishes and stop the screen early once you have what you need.

Results and artifacts

Library screens generate protein results over time. As soon as a protein is scored, you can read and download the result without waiting for the full screen to finish.

Each screened protein result includes scoring metrics such as binding confidence, structure confidence, and interaction PAE. Each result also includes downloadable artifacts for the predicted bound structure and PAE.

Define a target

There are two ways to define the target.

Structure template target

Provide a CIF file and select which chains and residues to include. The file can be provided as a URL or base64-encoded content. Each chain gets a crop_residues field (an array of 0-indexed residue indices, or "all" to keep the entire chain). You can optionally specify epitope_residues (residues the binder should contact) and flexible_residues (residues allowed to move during design).

{
  "target": {
    "type": "structure_template",
    "structure": {
      "type": "base64",
      "media_type": "chemical/x-cif",
      "data": "ZGF0YV90YXJnZXQK..."
    },
    "chain_selection": {
      "A": {
        "chain_type": "polymer",
        "crop_residues": "all",
        "epitope_residues": [45, 46, 47, 48, 49],
        "flexible_residues": [44, 50]
      },
      "B": {
        "chain_type": "polymer",
        "crop_residues": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
      }
    }
  }
}
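When the CIF file is on disk, the base64 variant of the structure object can be assembled with the standard library. The following is a minimal sketch: the helper name cif_to_structure_payload is hypothetical, and only the field names type, media_type, and data are taken from the example above.

```python
import base64
from pathlib import Path


def cif_to_structure_payload(cif_path):
    """Build a base64 `structure` object from a local CIF file.

    Hypothetical helper: only the field names mirror the JSON example.
    """
    raw = Path(cif_path).read_bytes()
    return {
        "type": "base64",
        "media_type": "chemical/x-cif",
        # b64encode returns bytes; a JSON payload needs a str.
        "data": base64.b64encode(raw).decode("ascii"),
    }
```

The returned dict can be placed under target["structure"] alongside a chain_selection of your choice.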

No-template target

Define the target from an entity list when you don’t have a structure file. Optionally add epitope_residues (keyed by chain ID), constraints, and bonds.

{
  "target": {
    "type": "no_template",
    "entities": [
      { "type": "protein", "value": "MKTIIALSYIFCLVFA", "chain_ids": ["A"] }
    ],
    "epitope_residues": {
      "A": [10, 11, 12]
    }
  }
}
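Before submitting a no-template target, it can help to confirm that every epitope index falls inside the sequence of its chain. The sketch below is one way to do that locally. The function name check_epitope_residues is hypothetical, and it assumes epitope_residues are 0-indexed positions into the protein sequence, as crop_residues are; confirm the indexing against the API reference.

```python
def check_epitope_residues(target):
    """Return a list of problems found in a no_template target dict.

    Assumption: epitope_residues are 0-indexed positions into the
    sequence of the protein entity that owns the chain.
    """
    # Map each protein chain ID to the length of its sequence.
    chain_lengths = {}
    for entity in target.get("entities", []):
        if entity.get("type") != "protein":
            continue
        for chain_id in entity.get("chain_ids", []):
            chain_lengths[chain_id] = len(entity["value"])

    problems = []
    for chain_id, residues in target.get("epitope_residues", {}).items():
        if chain_id not in chain_lengths:
            problems.append(f"chain {chain_id!r} is not a protein chain in entities")
            continue
        for index in residues:
            if not 0 <= index < chain_lengths[chain_id]:
                problems.append(
                    f"residue {index} is outside chain {chain_id!r} "
                    f"(length {chain_lengths[chain_id]})"
                )
    return problems


target = {
    "type": "no_template",
    "entities": [
        {"type": "protein", "value": "MKTIIALSYIFCLVFA", "chain_ids": ["A"]},
    ],
    "epitope_residues": {"A": [10, 11, 12]},
}
print(check_epitope_residues(target))  # prints []
```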

Provide proteins

Pass proteins inline as an array of objects. Each entry has an entities field (an array of entities forming the complex to screen) and an optional id:

{
  "proteins": [
    {
      "entities": [
        { "type": "protein", "value": "MKTAYIVKSHFSRQ", "chain_ids": ["B"] }
      ],
      "id": "binder-001"
    },
    {
      "entities": [
        {
          "type": "protein",
          "value": "ACDEFGHIKLMNPQRSTVWY",
          "chain_ids": ["B"]
        }
      ],
      "id": "binder-002"
    }
  ]
}

When you provide an id, it is returned as external_id on the corresponding result, so you can correlate results back to your input library.
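The id and external_id round trip can be sketched as follows. Both helper names are hypothetical. The first builds the proteins array from a mapping of your own IDs to sequences, and the second pairs each returned result with its input sequence. It assumes only that each result exposes an external_id attribute, as described above.

```python
def build_proteins(sequences, chain_id="B"):
    """Turn {id: sequence} into the `proteins` array shown above."""
    return [
        {
            "entities": [
                {"type": "protein", "value": seq, "chain_ids": [chain_id]}
            ],
            "id": protein_id,
        }
        for protein_id, seq in sequences.items()
    ]


def join_results(sequences, results):
    """Pair each result with its input sequence via external_id.

    `results` is any iterable of objects with an `external_id` attribute.
    Results without a matching input ID are left out.
    """
    return {
        r.external_id: (sequences[r.external_id], r)
        for r in results
        if r.external_id in sequences
    }
```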

Run a screen and download results (Python)

run_protein_library_screen() submits the screen, waits while proteins are scored, downloads result archives, and returns a local run directory.

import os
from boltz_api import Boltz

client = Boltz(api_key=os.environ["BOLTZ_API_KEY"])

run_dir = client.experiments.run_protein_library_screen(
    target={
        "type": "no_template",
        "entities": [
            {"type": "protein", "value": "MKTIIALSYIFCLVFA", "chain_ids": ["A"]},
        ],
        "epitope_residues": {"A": [10, 11, 12]},
    },
    proteins=[
        {
            "entities": [
                {"type": "protein", "value": "MKTAYIVKSHFSRQ", "chain_ids": ["B"]}
            ],
            "id": "binder-001",
        },
        {
            "entities": [
                {"type": "protein", "value": "ACDEFGHIKLMNPQRSTVWY", "chain_ids": ["B"]}
            ],
            "id": "binder-002",
        },
    ],
    name="protein-screen-hotspots",
)
print(run_dir)

The run directory contains the sanitized run record, resumable download state, a result manifest, and downloaded files for each screened protein:

boltz-experiments/protein-screen-hotspots/
  .boltz-run.json
  run.json
  results/
    index.jsonl
    prot_scr_result_.../
      metadata.json
      archive.tar.gz
      files/
        result/
          metrics.json
          predicted_structure.cif
          pae.npz
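Given that layout, the per-protein metrics can be gathered with a short directory walk. This is a sketch under stated assumptions: load_metrics is a hypothetical helper, the path results/<result>/files/result/metrics.json is taken from the tree above, and the contents of metrics.json are treated as an opaque JSON document.

```python
import json
from pathlib import Path


def load_metrics(run_dir):
    """Map each result directory name to its parsed metrics.json.

    Results whose files have not been downloaded yet are skipped.
    """
    metrics = {}
    for result_dir in sorted((Path(run_dir) / "results").iterdir()):
        if not result_dir.is_dir():
            continue  # e.g. index.jsonl
        metrics_path = result_dir / "files" / "result" / "metrics.json"
        if metrics_path.is_file():
            metrics[result_dir.name] = json.loads(metrics_path.read_text())
    return metrics
```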

Run a screen and download results (CLI)

The start command submits the screen remotely. download-results then waits for results, resumes if interrupted, and writes them under boltz-experiments/protein-screen-hotspots/.

Save your inputs to protein-screen.yaml:

target:
  type: no_template
  entities:
    - type: protein
      value: MKTIIALSYIFCLVFA
      chain_ids: ["A"]
proteins:
  - entities:
      - type: protein
        value: MKTAYIVKSHFSRQ
        chain_ids: ["B"]
    id: binder-001
  - entities:
      - type: protein
        value: ACDEFGHIKLMNPQRSTVWY
        chain_ids: ["B"]
    id: binder-002

Then start the screen and download:

SCREEN_ID=$(
  boltz-api --format raw protein:library-screen start \
    --input @yaml://./protein-screen.yaml |
    jq -r '.id'
)

run_dir=$(boltz-api download-results --id "$SCREEN_ID" --name protein-screen-hotspots)
echo "$run_dir"

Use --input @json://./protein-screen.json if your input file is JSON.

Start a screen (Python)

import os
from boltz_api import Boltz

client = Boltz(api_key=os.environ["BOLTZ_API_KEY"])

screen = client.protein.library_screen.start(
    target={
        "type": "no_template",
        "entities": [
            {"type": "protein", "value": "MKTIIALSYIFCLVFA", "chain_ids": ["A"]},
        ],
    },
    proteins=[
        {
            "entities": [
                {"type": "protein", "value": "MKTAYIVKSHFSRQ", "chain_ids": ["B"]}
            ],
            "id": "binder-001",
        },
        {
            "entities": [
                {"type": "protein", "value": "ACDEFGHIKLMNPQRSTVWY", "chain_ids": ["B"]}
            ],
            "id": "binder-002",
        },
    ],
)
print(f"Screen ID: {screen.id}, Status: {screen.status}")

Start a screen (TypeScript)

import Boltz from "boltz-api";

const apiKey = process.env["BOLTZ_API_KEY"];

const client = new Boltz({ apiKey });

let screen = await client.protein.libraryScreen.start({
  target: {
    type: "no_template",
    entities: [
      { type: "protein", value: "MKTIIALSYIFCLVFA", chain_ids: ["A"] },
    ],
  },
  proteins: [
    {
      entities: [
        { type: "protein", value: "MKTAYIVKSHFSRQ", chain_ids: ["B"] },
      ],
      id: "binder-001",
    },
    {
      entities: [
        { type: "protein", value: "ACDEFGHIKLMNPQRSTVWY", chain_ids: ["B"] },
      ],
      id: "binder-002",
    },
  ],
});
console.log(`Screen ID: ${screen.id}, Status: ${screen.status}`);

Run a screen and download results (agent)

Ask the agent to save the payload to protein-screen.yaml, estimate cost, then submit with a stable idempotency key.

target:
  type: no_template
  entities:
    - type: protein
      value: MKTIIALSYIFCLVFA
      chain_ids: ["A"]
proteins:
  - entities:
      - type: protein
        value: MKTAYIVKSHFSRQ
        chain_ids: ["B"]
    id: binder-001
  - entities:
      - type: protein
        value: ACDEFGHIKLMNPQRSTVWY
        chain_ids: ["B"]
    id: binder-002

boltz-api protein:library-screen estimate-cost \
  --input @yaml://./protein-screen.yaml

SCREEN_ID=$(
  boltz-api protein:library-screen start \
    --idempotency-key "protein-screen-hotspots" \
    --input @yaml://./protein-screen.yaml \
    --raw-output --transform id
)

boltz-api download-results \
  --id "$SCREEN_ID" \
  --name "protein-screen-hotspots" \
  --root-dir "./boltz-experiments" \
  --poll-interval-seconds 10

Start now, download later

The main run_protein_library_screen() example already waits and downloads. To submit now and download later, use start_protein_library_screen() and resume with wait_and_download().

run_dir = client.experiments.start_protein_library_screen(
    target={
        "type": "no_template",
        "entities": [
            {"type": "protein", "value": "MKTIIALSYIFCLVFA", "chain_ids": ["A"]},
        ],
        "epitope_residues": {"A": [10, 11, 12]},
    },
    proteins=[
        {
            "entities": [
                {"type": "protein", "value": "MKTAYIVKSHFSRQ", "chain_ids": ["B"]}
            ],
            "id": "binder-001",
        },
    ],
    name="submit-now-finish-later",
)

client.experiments.wait_and_download(run_dir=run_dir)

Resume downloads (CLI)

download-results is the progress monitor and downloader. It can be rerun with the same name to resume from local checkpoint state.

boltz-api download-results --name protein-screen-hotspots
boltz-api --format json download-status --name protein-screen-hotspots

Monitor progress (Python)

import time

while screen.status not in ("succeeded", "failed", "stopped"):
    time.sleep(10)
    screen = client.protein.library_screen.retrieve(screen.id)
    progress = screen.progress
    print(
        f"Status: {screen.status}, Screened: {progress.num_proteins_screened}/{progress.total_proteins_to_screen}"
    )
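The loop above polls until a terminal status with no upper bound on waiting time. A bounded variant might look like the sketch below. wait_until_terminal is a hypothetical helper, not part of the SDK. It takes any zero-argument callable that returns an object with a status attribute, so the same function works with a real client or a test double.

```python
import time

TERMINAL_STATUSES = ("succeeded", "failed", "stopped")


def wait_until_terminal(retrieve, poll_seconds=10.0, timeout_seconds=3600.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll `retrieve()` until the screen reaches a terminal status.

    Raises TimeoutError if the deadline passes first. `sleep` and `clock`
    are injectable so the loop can be tested without real delays.
    """
    deadline = clock() + timeout_seconds
    while True:
        screen = retrieve()
        if screen.status in TERMINAL_STATUSES:
            return screen
        if clock() >= deadline:
            raise TimeoutError(
                f"screen still {screen.status!r} after {timeout_seconds}s"
            )
        sleep(poll_seconds)
```

With the client from the earlier examples, a call could be written as wait_until_terminal(lambda: client.protein.library_screen.retrieve(screen.id)).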

Monitor progress (TypeScript)

while (!["succeeded", "failed", "stopped"].includes(screen.status)) {
  await new Promise((r) => setTimeout(r, 10000));
  screen = await client.protein.libraryScreen.retrieve(screen.id);
  const progress = screen.progress;
  console.log(
    `Status: ${screen.status}, Screened: ${progress.num_proteins_screened}/${progress.total_proteins_to_screen}`,
  );
}

Resume downloads (agent)

Agents can rerun download-results with the same name and root directory to resume from local checkpoint state.

boltz-api download-results \
  --name "protein-screen-hotspots" \
  --root-dir "./boltz-experiments" \
  --poll-interval-seconds 10

boltz-api --format json download-status \
  --name "protein-screen-hotspots" \
  --root-dir "./boltz-experiments"

Inspect downloaded results (Python)

Result pages and artifact archives are already downloaded into the run directory.

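from pathlib import Path
print(run_dir)
# Each subdirectory of results/ holds one screened protein; index.jsonl is skipped.
for result_dir in sorted((Path(run_dir) / "results").iterdir()):
    if result_dir.is_dir():
        print(result_dir)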

Inspect downloaded results (CLI)

Result pages and artifact archives are already downloaded into the run directory.

ls boltz-experiments/protein-screen-hotspots/results

Fetch paginated results (Python)

for result in client.protein.library_screen.list_results(screen.id):
    print(f"Result {result.id} (external: {result.external_id}):")
    print(f"  Binding confidence: {result.metrics.binding_confidence}")
    print(f"  Structure confidence: {result.metrics.structure_confidence}")
    print(f"  Min interaction PAE: {result.metrics.min_interaction_pae}")
    print(f"  Archive URL: {result.artifacts.archive.url}")
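Because results arrive while the screen is still running, listing them repeatedly returns proteins you have already seen. One way to process each result only once is to track result IDs between calls, as in this sketch. fetch_new_results is a hypothetical helper that assumes only that each result has an id attribute, as in the loop above.

```python
def fetch_new_results(list_results, seen_ids):
    """Return results not seen on earlier calls and record their IDs.

    `list_results` is a zero-argument callable returning an iterable of
    results; `seen_ids` is a set that the caller keeps between calls.
    """
    fresh = []
    for result in list_results():
        if result.id not in seen_ids:
            seen_ids.add(result.id)
            fresh.append(result)
    return fresh
```

With the client from the earlier examples, list_results could be lambda: client.protein.library_screen.list_results(screen.id), called once per polling interval.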

Fetch paginated results (TypeScript)

for await (const result of client.protein.libraryScreen.listResults(
  screen.id,
)) {
  console.log(`Result ${result.id} (external: ${result.external_id}):`);
  console.log(`  Binding confidence: ${result.metrics.binding_confidence}`);
  console.log(`  Structure confidence: ${result.metrics.structure_confidence}`);
  console.log(`  Min interaction PAE: ${result.metrics.min_interaction_pae}`);
  console.log(`  Archive URL: ${result.artifacts.archive.url}`);
}

Inspect downloaded results (agent)

Use the local result manifest and downloaded artifact directories after download-results completes or while it is resuming.

ls ./boltz-experiments/protein-screen-hotspots/results
head -n 5 ./boltz-experiments/protein-screen-hotspots/results/index.jsonl
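The manifest can also be read programmatically. The sketch below parses a JSON Lines file into a list of records. read_manifest is a hypothetical helper, and the record schema of index.jsonl is not described here, so each line is treated as an opaque JSON value. Ignoring an unparsable final line is an assumption, made in case a download is still appending to the file.

```python
import json
from pathlib import Path


def read_manifest(index_path):
    """Parse a JSON Lines file into a list of records.

    Blank lines are skipped. A final line that fails to parse is ignored
    on the assumption that it is a partially written record.
    """
    lines = Path(index_path).read_text().splitlines()
    records = []
    for number, line in enumerate(lines):
        if not line.strip():
            continue
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            if number == len(lines) - 1:
                break  # likely a partially written trailing record
            raise
    return records
```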

Stop early

# Python (by run directory)
client.experiments.stop(run_dir=run_dir)

# CLI
boltz-api protein:library-screen stop --id "$SCREEN_ID"

# Python (by screen ID)
client.protein.library_screen.stop(screen.id)

// TypeScript
await client.protein.libraryScreen.stop(screen.id);

# Agent (CLI)
boltz-api protein:library-screen stop --id "<screen-id-from-start>"

Metrics

Metric	Range or unit	What it measures
`binding_confidence`	0–1	Confidence that binding occurs.
`structure_confidence`	0–1	Overall confidence in the predicted structure.
`iptm`	0–1	Interface predicted TM-score (ipTM). Higher is better.
`min_interaction_pae`	Å (≥ 0)	Minimum predicted aligned error at the interface. Lower is better.
`helix_fraction`	0–1	Fraction of residues in helices.
`sheet_fraction`	0–1	Fraction of residues in sheets.
`loop_fraction`	0–1	Fraction of residues in loops.
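As an illustration of how these metrics might be combined, the sketch below keeps proteins above a binding confidence cutoff and orders them by binding_confidence (higher first), breaking ties with min_interaction_pae (lower first). rank_hits is a hypothetical helper, the 0.5 cutoff is an arbitrary placeholder rather than a recommended threshold, and the input is assumed to be a mapping from your own identifiers to dicts keyed by the metric names in the table.

```python
def rank_hits(metrics_by_id, min_binding_confidence=0.5):
    """Return protein IDs that pass the cutoff, best candidates first.

    Sort order: binding_confidence descending, then
    min_interaction_pae ascending.
    """
    hits = [
        (protein_id, m)
        for protein_id, m in metrics_by_id.items()
        if m["binding_confidence"] >= min_binding_confidence
    ]
    hits.sort(
        key=lambda item: (
            -item[1]["binding_confidence"],
            item[1]["min_interaction_pae"],
        )
    )
    return [protein_id for protein_id, _ in hits]
```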

Status values

Status	Meaning
`pending`	The screen is queued and has not started yet.
`running`	The screen is actively scoring proteins. Results may already be available.
`succeeded`	All proteins have been screened.
`failed`	The screen encountered an error. Check the `error` field.
`stopped`	The screen was stopped early. Partial results are available.