Screen small molecule libraries

Score your own small molecules against a protein target, fetch results as they arrive, and stop early if needed.

A small molecule library screen scores molecules you provide against a protein target. Each molecule is evaluated for binding confidence, optimization score, and structure confidence. Results stream in as molecules are scored, so you can fetch them before the screen finishes and stop the screen early if needed.

Results and artifacts

Library screens generate molecule results over time. As soon as a molecule is scored, you can read and download the result without waiting for the full screen to finish.

Each screened molecule result includes scoring metrics such as binding confidence, optimization score, and structure confidence. Each result also includes downloadable artifacts for the predicted bound structure and the predicted aligned error (PAE).

Define a target

Targets are protein-only entities. The engine automatically identifies the binding pocket to use during the run. You can provide hints to help it find the right one:

pocket_residues — If you already know the pocket residues, pass them directly as a map of chain ID to an array of 0-indexed residue indices.
reference_ligands — If you have known binders, pass them as an array of SMILES strings. The engine uses these to locate the pocket region.

You can provide one or both. Either will help the engine use the correct binding pocket, and providing both gives it the strongest signal.

{
  "target": {
    "entities": [
      { "type": "protein", "value": "MKTIIALSYIFCLVFA...", "chain_ids": ["A"] }
    ],
    "pocket_residues": {
      "A": [10, 11, 12, 35, 36, 37]
    }
  }
}
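
Structure files and papers often number residues from 1, while pocket_residues expects 0-indexed positions. The conversion can be sketched as below, assuming your numbering is 1-based and runs sequentially along the sequence you pass in value. The helper name to_zero_indexed is hypothetical and not part of the Boltz client.

```python
def to_zero_indexed(pocket_1based: dict[str, list[int]]) -> dict[str, list[int]]:
    """Convert 1-based residue positions to the 0-indexed form used by pocket_residues."""
    converted = {}
    for chain_id, positions in pocket_1based.items():
        if any(p < 1 for p in positions):
            raise ValueError(f"chain {chain_id}: 1-based positions must be >= 1")
        # Deduplicate and sort so the payload is stable between runs.
        converted[chain_id] = sorted({p - 1 for p in positions})
    return converted


pocket = to_zero_indexed({"A": [11, 12, 13, 36, 37, 38]})
print(pocket)  # {'A': [10, 11, 12, 35, 36, 37]}
```

If your source numbering has gaps or insertion codes, map residues to sequence positions first; a plain offset would point at the wrong residues.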

Provide molecules

Pass molecules inline as an array of objects, each with a smiles field and an optional id:

{
  "molecules": [
    { "smiles": "CC(=O)OC1=CC=CC=C1C(=O)O", "id": "aspirin" },
    { "smiles": "C1=CC=C(C=C1)O", "id": "phenol" },
    { "smiles": "CC1=CC=CC=C1" }
  ]
}

When you provide an id, it’s returned as external_id on the corresponding result — use this to correlate results back to your input library.
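
Because external_id is the link between a result and your input row, it helps to make sure IDs are unique before you submit. The sketch below builds the molecules array from (id, SMILES) pairs and rejects duplicate IDs. The helper name and the uniqueness check are illustrative assumptions, not a documented API requirement.

```python
def build_molecules(rows):
    """Turn (id, smiles) pairs into the molecules payload; id may be None."""
    seen = set()
    molecules = []
    for mol_id, smiles in rows:
        entry = {"smiles": smiles.strip()}
        if mol_id is not None:
            if mol_id in seen:
                raise ValueError(f"duplicate id: {mol_id}")
            seen.add(mol_id)
            entry["id"] = mol_id
        molecules.append(entry)
    return molecules


library = [
    ("aspirin", "CC(=O)OC1=CC=CC=C1C(=O)O"),
    ("phenol", "C1=CC=C(C=C1)O"),
    (None, "CC1=CC=CC=C1"),
]
molecules = build_molecules(library)
print(molecules[2])  # {'smiles': 'CC1=CC=CC=C1'}
```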

Molecular filters

Filters control which of the molecules you submit pass through to results. All custom filters use AND logic: a molecule must pass every filter.

Built-in filter

The boltz_smarts_catalog_filter_level parameter controls Boltz’s built-in structural alert filtering. Our medicinal chemistry team has curated these filters from extensive drug discovery experience, encoding patterns known to cause toxicity, reactivity, or poor pharmacokinetics.

Level	Description
`recommended` (default)	Balanced filtering that catches the most common problematic substructures.
`extra`	Stricter filtering with additional alerts.
`aggressive`	Most conservative — rejects anything with a known structural concern.
`disabled`	No built-in filtering.

Custom filters

Add any combination of these to the custom_filters array:

Filter type	What it does
`lipinski_filter`	Lipinski’s Rule of Five — set `max_mw`, `max_logp`, `max_hbd`, `max_hba`. Optional `allow_single_violation`.
`rdkit_descriptor_filter`	RDKit descriptor ranges — `mol_wt`, `mol_logp`, `tpsa`, `num_h_donors`, `num_h_acceptors`, `num_rotatable_bonds`, `num_heteroatoms`, `num_aromatic_rings`, `num_rings`, `fraction_csp3`. Each accepts `{min, max}`.
`smarts_custom_filter`	Reject molecules matching any of the provided SMARTS `patterns`.
`smarts_catalog_filter`	Reject molecules matching a named catalog: `PAINS`, `PAINS_A`, `PAINS_B`, `PAINS_C`, `BRENK`, `CHEMBL`, `CHEMBL_BMS`, `CHEMBL_Dundee`, `CHEMBL_Glaxo`, `CHEMBL_Inpharmatica`, `CHEMBL_LINT`, `CHEMBL_MLSMR`, `CHEMBL_SureChEMBL`, `NIH`.
`smiles_regex_filter`	Reject molecules whose SMILES matches any of the provided regex `patterns`.

Molecules that don’t pass the filters are skipped and won’t appear in results.
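
To make the AND semantics concrete, here is a local illustration that applies {min, max} ranges to descriptor values you have already computed. It does not call the service and is not how Boltz implements filtering. The function name and the sample descriptor values are assumptions made for the example.

```python
def passes_all(descriptors, ranges):
    """Return True only if every descriptor lies inside its {min, max} range (AND logic)."""
    for name, bounds in ranges.items():
        value = descriptors[name]
        if "min" in bounds and value < bounds["min"]:
            return False
        if "max" in bounds and value > bounds["max"]:
            return False
    return True


ranges = {"mol_wt": {"min": 150, "max": 500}, "tpsa": {"max": 140}}

# Illustrative descriptor values, not computed here.
small_polar = {"mol_wt": 180.2, "tpsa": 63.6}
too_heavy = {"mol_wt": 612.0, "tpsa": 95.0}

print(passes_all(small_polar, ranges))  # True
print(passes_all(too_heavy, ranges))    # False
```

The second molecule satisfies the tpsa range but fails mol_wt, so it is rejected: one failing filter is enough.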

Run a screen and download results

run_small_molecule_library_screen() submits the screen, waits while molecules are scored, downloads result archives, and returns a local run directory.

import os
from boltz_api import Boltz

client = Boltz(api_key=os.environ["BOLTZ_API_KEY"])

run_dir = client.experiments.run_small_molecule_library_screen(
    target={
        "entities": [
            {"type": "protein", "value": "MKTIIALSYIFCLVFA", "chain_ids": ["A"]},
        ],
    },
    molecules=[
        {"smiles": "CC(=O)OC1=CC=CC=C1C(=O)O", "id": "aspirin"},
        {"smiles": "C1=CC=C(C=C1)O", "id": "phenol"},
        {"smiles": "CC1=CC=CC=C1"},
    ],
    name="small-molecule-library-screen",
)
print(run_dir)

The run directory contains the sanitized run record, resumable download state, a result manifest, and downloaded files for each screened molecule:

boltz-experiments/small-molecule-library-screen/
  .boltz-run.json
  run.json
  results/
    index.jsonl
    sm_scr_result_.../
      metadata.json
      archive.tar.gz
      files/
        result/
          metrics.json
          predicted_structure.cif
          pae.npz
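
Given that layout, per-molecule metrics can be collected with the standard library alone. This is a sketch that assumes each metrics.json holds a JSON object keyed by metric names such as binding_confidence; the file's exact schema is not specified here, and load_metrics is a hypothetical helper.

```python
import json
from pathlib import Path


def load_metrics(run_dir):
    """Map each result directory name to its parsed metrics.json."""
    metrics = {}
    for path in sorted(Path(run_dir).glob("results/*/files/result/metrics.json")):
        # path.parents[2] is the per-result directory (results/<result-dir>/).
        result_id = path.parents[2].name
        metrics[result_id] = json.loads(path.read_text(encoding="utf-8"))
    return metrics
```

Calling load_metrics("boltz-experiments/small-molecule-library-screen") returns an empty dict until at least one result archive has been downloaded and unpacked.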

Run a screen and download results (CLI)

The CLI starts the remote screen. The download-results command then waits for results, resumes if interrupted, and writes them under boltz-experiments/small-molecule-library-screen/.

Save your inputs to small-molecule-screen.yaml:

target:
  entities:
    - type: protein
      value: MKTIIALSYIFCLVFA
      chain_ids: ["A"]
molecules:
  - smiles: CC(=O)OC1=CC=CC=C1C(=O)O
    id: aspirin
  - smiles: C1=CC=C(C=C1)O
    id: phenol
  - smiles: CC1=CC=CC=C1

Then start the screen and download:

SCREEN_ID=$(
  boltz-api --format raw small-molecule:library-screen start \
    --input @yaml://./small-molecule-screen.yaml |
    jq -r '.id'
)

run_dir=$(boltz-api download-results --id "$SCREEN_ID" --name small-molecule-library-screen)
echo "$run_dir"

Use --input @json://./small-molecule-screen.json if your input file is JSON.
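
If your library already lives in Python, you can generate the JSON input file instead of writing YAML by hand. A minimal sketch using only the standard library follows; write_screen_input is a hypothetical helper, and the payload mirrors the YAML above.

```python
import json
from pathlib import Path


def write_screen_input(payload, path):
    """Serialize the screen payload as JSON and return the path written."""
    path = Path(path)
    path.write_text(json.dumps(payload, indent=2) + "\n", encoding="utf-8")
    return path


payload = {
    "target": {
        "entities": [
            {"type": "protein", "value": "MKTIIALSYIFCLVFA", "chain_ids": ["A"]},
        ],
    },
    "molecules": [
        {"smiles": "CC(=O)OC1=CC=CC=C1C(=O)O", "id": "aspirin"},
        {"smiles": "C1=CC=C(C=C1)O", "id": "phenol"},
        {"smiles": "CC1=CC=CC=C1"},
    ],
}
```

Calling write_screen_input(payload, "small-molecule-screen.json") produces a file you can pass as --input @json://./small-molecule-screen.json.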

Start a screen

import os
from boltz_api import Boltz

client = Boltz(api_key=os.environ["BOLTZ_API_KEY"])

screen = client.small_molecule.library_screen.start(
    target={
        "entities": [
            {"type": "protein", "value": "MKTIIALSYIFCLVFA", "chain_ids": ["A"]},
        ],
    },
    molecules=[
        {"smiles": "CC(=O)OC1=CC=CC=C1C(=O)O", "id": "aspirin"},
        {"smiles": "C1=CC=C(C=C1)O", "id": "phenol"},
        {"smiles": "CC1=CC=CC=C1"},
    ],
)
print(f"Screen ID: {screen.id}, Status: {screen.status}")

Start a screen (TypeScript)

import Boltz from "boltz-api";

const apiKey = process.env["BOLTZ_API_KEY"];

const client = new Boltz({ apiKey });

let screen = await client.smallMolecule.libraryScreen.start({
  target: {
    entities: [
      { type: "protein", value: "MKTIIALSYIFCLVFA", chain_ids: ["A"] },
    ],
  },
  molecules: [
    { smiles: "CC(=O)OC1=CC=CC=C1C(=O)O", id: "aspirin" },
    { smiles: "C1=CC=C(C=C1)O", id: "phenol" },
    { smiles: "CC1=CC=CC=C1" },
  ],
});
console.log(`Screen ID: ${screen.id}, Status: ${screen.status}`);

Run a screen and download results (agent)

Ask the agent to save the payload to small-molecule-screen.yaml, estimate cost, then submit with a stable idempotency key.

target:
  entities:
    - type: protein
      value: MKTIIALSYIFCLVFA
      chain_ids: ["A"]
molecules:
  - smiles: CC(=O)OC1=CC=CC=C1C(=O)O
    id: aspirin
  - smiles: C1=CC=C(C=C1)O
    id: phenol
  - smiles: CC1=CC=CC=C1

boltz-api small-molecule:library-screen estimate-cost \
  --input @yaml://./small-molecule-screen.yaml

SCREEN_ID=$(
  boltz-api small-molecule:library-screen start \
    --idempotency-key "small-molecule-library-screen" \
    --input @yaml://./small-molecule-screen.yaml \
    --raw-output --transform id
)

boltz-api download-results \
  --id "$SCREEN_ID" \
  --name "small-molecule-library-screen" \
  --root-dir "./boltz-experiments" \
  --poll-interval-seconds 10
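
A stable idempotency key should stay the same when the same payload is retried and change when the payload changes. One way to get that, offered here as an assumption rather than a documented recommendation, is to hash a canonical serialization of the payload. The helper name and the key prefix are hypothetical.

```python
import hashlib
import json


def idempotency_key(payload, prefix="sm-screen"):
    """Derive a deterministic key from the payload contents."""
    # sort_keys and fixed separators make the serialization independent of dict order.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{prefix}-{digest[:16]}"


first = idempotency_key({"molecules": [{"smiles": "CC1=CC=CC=C1"}], "name": "demo"})
retry = idempotency_key({"name": "demo", "molecules": [{"smiles": "CC1=CC=CC=C1"}]})
print(first == retry)  # True
```

The resulting string can be passed to --idempotency-key in place of a hand-written name, so an edited library gets a new key automatically.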

Start now, download later

The main run_small_molecule_library_screen() example already waits and downloads. To submit now and download later, use start_small_molecule_library_screen() and resume with wait_and_download().

run_dir = client.experiments.start_small_molecule_library_screen(
    target={
        "entities": [
            {"type": "protein", "value": "MKTIIALSYIFCLVFA", "chain_ids": ["A"]},
        ],
    },
    molecules=[
        {"smiles": "CC(=O)OC1=CC=CC=C1C(=O)O", "id": "aspirin"},
        {"smiles": "C1=CC=C(C=C1)O", "id": "phenol"},
    ],
    name="submit-now-finish-later",
)

client.experiments.wait_and_download(run_dir=run_dir)

Resume downloads

download-results is the progress monitor and downloader. It can be rerun with the same name to resume from local checkpoint state.

boltz-api download-results --name small-molecule-library-screen
boltz-api --format json download-status --name small-molecule-library-screen

Monitor progress

import time

while screen.status not in ("succeeded", "failed", "stopped"):
    time.sleep(10)
    screen = client.small_molecule.library_screen.retrieve(screen.id)
    progress = screen.progress
    print(
        f"Status: {screen.status}, Screened: {progress.num_molecules_screened}/{progress.total_molecules_to_screen}"
    )
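
The loop above runs until the screen reaches a terminal status. If you want an upper bound on how long to wait, the same idea can be wrapped in a helper with a timeout. This is a sketch: wait_for_terminal, its parameters, and the default timeout are assumptions for illustration, and fetch_status is any callable that returns the current status string.

```python
import time

TERMINAL_STATUSES = {"succeeded", "failed", "stopped"}


def wait_for_terminal(fetch_status, interval_seconds=10.0, timeout_seconds=3600.0, sleep=time.sleep):
    """Poll fetch_status() until it returns a terminal status or the timeout elapses."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout_seconds} seconds")
        sleep(interval_seconds)


# Stand-in status source so the example runs without a network call.
fake_statuses = iter(["pending", "running", "succeeded"])
print(wait_for_terminal(lambda: next(fake_statuses), sleep=lambda seconds: None))  # succeeded
```

With the client from the earlier example, fetch_status could be lambda: client.small_molecule.library_screen.retrieve(screen.id).status.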

Monitor progress (TypeScript)

while (!["succeeded", "failed", "stopped"].includes(screen.status)) {
  await new Promise((r) => setTimeout(r, 10000));
  screen = await client.smallMolecule.libraryScreen.retrieve(screen.id);
  const progress = screen.progress;
  console.log(
    `Status: ${screen.status}, Screened: ${progress.num_molecules_screened}/${progress.total_molecules_to_screen}`,
  );
}

Resume downloads (agent)

Agents can rerun download-results with the same name and root directory to resume from local checkpoint state.

boltz-api download-results \
  --name "small-molecule-library-screen" \
  --root-dir "./boltz-experiments" \
  --poll-interval-seconds 10

boltz-api --format json download-status \
  --name "small-molecule-library-screen" \
  --root-dir "./boltz-experiments"

Inspect downloaded results

Result pages and artifact archives are already downloaded into the run directory.

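from pathlib import Path

print(run_dir)
for result_dir in sorted(Path(run_dir, "results").iterdir()):
    if result_dir.is_dir():  # skip index.jsonl, the result manifest
        print(result_dir)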

Inspect downloaded results (CLI)

From the shell, list the result directories that have already been downloaded into the run directory:

ls boltz-experiments/small-molecule-library-screen/results

Fetch paginated results

for result in client.small_molecule.library_screen.list_results(screen.id):
    print(f"Result {result.id} (external: {result.external_id}): {result.smiles}")
    print(f"  Binding confidence: {result.metrics.binding_confidence}")
    print(f"  Optimization score: {result.metrics.optimization_score}")
    print(f"  Structure URL: {result.artifacts.structure.url}")
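
Since binding_confidence is described as the primary metric for hit discovery, a common next step is to keep only the highest-scoring results. The sketch below assumes result objects expose metrics.binding_confidence as in the loop above, and that a missing value may appear as None. top_hits is a hypothetical helper, and the demo values are made up.

```python
from types import SimpleNamespace


def top_hits(results, n=10):
    """Return the n results with the highest binding_confidence, best first."""
    scored = [r for r in results if r.metrics.binding_confidence is not None]
    return sorted(scored, key=lambda r: r.metrics.binding_confidence, reverse=True)[:n]


# Stand-in objects shaped like the results printed above.
demo = [
    SimpleNamespace(external_id="aspirin", metrics=SimpleNamespace(binding_confidence=0.42)),
    SimpleNamespace(external_id="phenol", metrics=SimpleNamespace(binding_confidence=0.17)),
    SimpleNamespace(external_id=None, metrics=SimpleNamespace(binding_confidence=0.63)),
]
print([r.external_id for r in top_hits(demo, n=2)])  # [None, 'aspirin']
```

Passing the iterator from list_results(screen.id) directly works too, because the helper consumes it once.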

Fetch paginated results (TypeScript)

for await (const result of client.smallMolecule.libraryScreen.listResults(
  screen.id,
)) {
  console.log(
    `Result ${result.id} (external: ${result.external_id}): ${result.smiles}`,
  );
  console.log(`  Binding confidence: ${result.metrics.binding_confidence}`);
  console.log(`  Optimization score: ${result.metrics.optimization_score}`);
  console.log(`  Structure URL: ${result.artifacts.structure.url}`);
}

Inspect downloaded results (agent)

Use the local result manifest and downloaded artifact directories after download-results completes or while it is resuming.

ls ./boltz-experiments/small-molecule-library-screen/results
head -n 5 ./boltz-experiments/small-molecule-library-screen/results/index.jsonl
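
The manifest is a JSON Lines file, so it can also be read from Python one record per line. The record fields are not described here, so this sketch only parses the lines and leaves the keys to you. read_manifest is a hypothetical helper.

```python
import json
from pathlib import Path


def read_manifest(index_path):
    """Parse a JSON Lines file into a list of dicts, skipping blank lines."""
    records = []
    with Path(index_path).open(encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                records.append(json.loads(line))
    return records
```

For example, read_manifest("boltz-experiments/small-molecule-library-screen/results/index.jsonl") returns one dict per manifest line. If you read the file while a download is still running, a partially written last line could raise json.JSONDecodeError, so retrying after a short wait is a reasonable precaution.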

Stop early

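Stopping keeps the results scored so far. Use the call that matches how you started the screen.

Python, from a run directory:

client.experiments.stop(run_dir=run_dir)

CLI:

boltz-api small-molecule:library-screen stop --id "$SCREEN_ID"

Python, from a screen ID:

client.small_molecule.library_screen.stop(screen.id)

TypeScript:

await client.smallMolecule.libraryScreen.stop(screen.id);

Agent, passing the ID returned by start:

boltz-api small-molecule:library-screen stop --id "<screen-id-from-start>"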

Metrics

Metric	Range	What it measures
`binding_confidence`	0–1	Likelihood of binding. Primary metric for hit discovery.
`optimization_score`	—	Binding strength ranking. Use for lead optimization.
`structure_confidence`	0–1	Overall confidence in the predicted structure.
`iptm`	0–1	Interface predicted TM-score.
`ptm`	0–1	Global predicted TM-score.
`plddt`	0–1	Per-residue structure confidence.
`complex_plddt`	0–1	pLDDT across the full complex.
`complex_iplddt`	0–1	Interface pLDDT for the complex.
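
The two 0–1 confidence metrics can be combined into a simple triage step: keep molecules that are both likely binders and have a trustworthy predicted structure. The thresholds below are arbitrary placeholders, not recommended cutoffs, and triage is a hypothetical helper operating on plain dicts with made-up values.

```python
def triage(metrics_by_id, min_binding=0.5, min_structure=0.7):
    """Keep IDs whose binding_confidence and structure_confidence both clear a threshold."""
    kept = []
    for mol_id, m in metrics_by_id.items():
        if m["binding_confidence"] >= min_binding and m["structure_confidence"] >= min_structure:
            kept.append(mol_id)
    # Most likely binders first.
    return sorted(kept, key=lambda i: metrics_by_id[i]["binding_confidence"], reverse=True)


demo_metrics = {
    "a": {"binding_confidence": 0.8, "structure_confidence": 0.9},
    "b": {"binding_confidence": 0.9, "structure_confidence": 0.5},
    "c": {"binding_confidence": 0.6, "structure_confidence": 0.75},
}
print(triage(demo_metrics))  # ['a', 'c']
```

Molecule b has the highest binding confidence but is dropped because its structure confidence is below the chosen threshold.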

Status values

Status	Meaning
`pending`	The screen is queued and has not started yet.
`running`	The screen is actively scoring molecules. Results may already be available.
`succeeded`	All molecules have been screened.
`failed`	The screen encountered an error. Check the `error` field.
`stopped`	The screen was stopped early. Partial results are available.
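
When writing your own polling or download logic, the table reduces to two questions: should I keep polling, and is it worth asking for results yet? A small sketch of that mapping follows. The grouping of failed as not having results is an assumption, since the table does not say whether partial results survive a failure.

```python
KNOWN = {"pending", "running", "succeeded", "failed", "stopped"}
TERMINAL = {"succeeded", "failed", "stopped"}
# Statuses where the table implies results can exist.
MAY_HAVE_RESULTS = {"running", "succeeded", "stopped"}


def classify(status):
    """Summarize what a status implies for polling and fetching."""
    if status not in KNOWN:
        raise ValueError(f"unknown status: {status!r}")
    return {
        "keep_polling": status not in TERMINAL,
        "try_fetching_results": status in MAY_HAVE_RESULTS,
    }


print(classify("running"))  # {'keep_polling': True, 'try_fetching_results': True}
print(classify("failed"))   # {'keep_polling': False, 'try_fetching_results': False}
```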