How It Works

Under the hood

Four steps, one pipeline, results in under a minute.

01

Structure prediction

Your sequence is passed through a deep learning structure prediction model that estimates the 3D coordinates of each residue directly from the sequence, with no multiple sequence alignment (MSA) required. Skipping the alignment search makes prediction much faster than MSA-based approaches while remaining accurate enough for most research and screening workflows.

The output is a PDB file (a standard format for molecular coordinates) and a per-residue confidence score called pLDDT (predicted Local Distance Difference Test), scaled 0–100. Regions scoring above 70 are generally reliable; regions below 50 are likely disordered or poorly constrained; scores in between should be read with caution.
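As a rough illustration of working with these confidence scores, the sketch below reads per-residue pLDDT from PDB text and sorts residues into the bands described above. It assumes pLDDT is stored in the B-factor column of each ATOM record, a common convention for predicted structures that this page does not confirm for foldfunc. The function names, the "intermediate" label, and the sample records are hypothetical.

```python
def plddt_per_residue(pdb_text):
    """Map residue number -> pLDDT, read from the B-factor column of CA atoms.

    Assumption: the predictor writes pLDDT into the B-factor field.
    """
    scores = {}
    for line in pdb_text.splitlines():
        # PDB is fixed-width: atom name in columns 13-16, residue number in
        # columns 23-26, B-factor in columns 61-66 (1-based), hence these slices.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores[int(line[22:26])] = float(line[60:66])
    return scores


def confidence_band(plddt):
    """Apply the thresholds from the text: above 70 and below 50."""
    if plddt > 70:
        return "reliable"
    if plddt < 50:
        return "likely disordered"
    return "intermediate"  # hypothetical label for the 50-70 range


def _ca_record(res_seq, plddt):
    """Build a synthetic, correctly aligned CA ATOM record for the demo."""
    return (f"ATOM  {res_seq:5d}  CA  ALA A{res_seq:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{plddt:6.2f}")


# Synthetic three-residue example (made-up scores, not real output).
sample_pdb = "\n".join(
    _ca_record(i, s) for i, s in [(1, 92.5), (2, 61.0), (3, 34.2)]
)

for residue, score in plddt_per_residue(sample_pdb).items():
    print(residue, score, confidence_band(score))
```

Reading only the CA atom gives one value per residue, which is enough here because predicted structures typically repeat the same score on every atom of a residue.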

02

Mutation scoring

Mutation scoring uses a protein language model trained on hundreds of millions of protein sequences from across the tree of life. During training, the model learns which amino acids are evolutionarily tolerated at each position, a strong proxy for functional and structural importance.

For each residue, the model outputs a log-probability score for the wild-type amino acid. Log-probabilities are never positive, so a score near zero means the residue is highly expected at that position; a strongly negative score suggests the wild-type residue is unusual there and the position is likely sensitive to substitution. The heatmap lets you identify mutational hotspots at a glance.
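To make the scoring concrete, here is a minimal sketch of how a wild-type log-probability can be derived from a model's per-position logits using a log-softmax. The page does not name the language model or its vocabulary, so the 20-letter alphabet, the toy logits, the function names, and the -3.0 cutoff are all assumptions for illustration.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed 20-letter vocabulary


def log_softmax(logits):
    """Numerically stable log-softmax over one position's logits."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_norm for x in logits]


def wild_type_log_probs(sequence, logits_per_position):
    """Log-probability the model assigns to the observed residue at each position."""
    scores = []
    for residue, logits in zip(sequence, logits_per_position):
        log_probs = log_softmax(logits)
        scores.append(log_probs[AMINO_ACIDS.index(residue)])
    return scores


def flag_positions(scores, threshold=-3.0):
    """1-based positions scoring below the (hypothetical) threshold."""
    return [i + 1 for i, s in enumerate(scores) if s < threshold]


# Toy logits, not real model output: position 1 strongly favours M,
# position 2 disfavours the observed K.
sequence = "MK"
favour_m = [10.0 if aa == "M" else 0.0 for aa in AMINO_ACIDS]
disfavour_k = [-5.0 if aa == "K" else 0.0 for aa in AMINO_ACIDS]

scores = wild_type_log_probs(sequence, [favour_m, disfavour_k])
print([round(s, 3) for s in scores])
print(flag_positions(scores))
```

Position 1 scores close to zero and position 2 scores strongly negative, so only position 2 falls below the cutoff. A real cutoff would need to be calibrated against the model actually in use.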

03

Literature retrieval

If you provide a protein name, foldfunc queries a utility API to retrieve relevant published abstracts. These abstracts are passed into the interpretation step to ground the analysis in published findings rather than model priors alone.

Steps 1 and 3 run concurrently to minimise total latency.
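The concurrency described here can be sketched with asyncio. This is an illustration, not foldfunc's actual code: both coroutines are hypothetical stand-ins that sleep instead of calling a real model or API, and all names are assumptions.

```python
import asyncio
import time


async def predict_structure(sequence):
    """Stand-in for step 1; the sleep simulates model latency."""
    await asyncio.sleep(0.3)
    return {"step": "structure", "residues": len(sequence)}


async def fetch_abstracts(protein_name):
    """Stand-in for step 3; skipped when no protein name is provided."""
    if protein_name is None:
        return []
    await asyncio.sleep(0.3)
    return [f"placeholder abstract about {protein_name}"]


async def run_steps_1_and_3(sequence, protein_name=None):
    # gather() schedules both coroutines at once, so total latency is
    # roughly the slower of the two rather than their sum.
    structure, abstracts = await asyncio.gather(
        predict_structure(sequence),
        fetch_abstracts(protein_name),
    )
    return structure, abstracts


start = time.perf_counter()
structure, abstracts = asyncio.run(run_steps_1_and_3("MKTAYIAK", "example protein"))
elapsed = time.perf_counter() - start
print(structure, abstracts, f"{elapsed:.2f}s")
```

The two steps can overlap because neither needs the other's output: structure prediction depends only on the sequence, and retrieval depends only on the protein name.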

04

Biological interpretation

The sequence, confidence scores, mutation profile, and retrieved literature are synthesised by an AI reasoning model into a structured biological interpretation. The output covers:

  • Protein family classification and functional context
  • Structural observations tied to high- and low-confidence regions
  • Mutation-sensitive positions and their likely significance
  • Open research questions suggested by the analysis
  • A reliability note based on pLDDT confidence

The model reasons only from what is passed in — it has no internet access during this step. Retrieved literature is the primary external knowledge source.
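Because the reasoning model sees only what it is given, everything it needs has to be bundled into a single context. The sketch below shows one way that assembly might look. The dataclass, field names, and section layout are hypothetical, and the page does not describe the actual prompt format.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class InterpretationInput:
    """Hypothetical bundle of the four inputs named in the text."""
    sequence: str
    plddt: list
    wild_type_log_probs: list
    abstracts: list = field(default_factory=list)


def build_context(inputs):
    """Render every input as plain text; nothing else reaches the model."""
    profile = ", ".join(
        f"{i + 1}:{s:.2f}" for i, s in enumerate(inputs.wild_type_log_probs)
    )
    if inputs.abstracts:
        literature = "\n".join(f"- {a}" for a in inputs.abstracts)
    else:
        literature = "(none retrieved)"
    sections = [
        "SEQUENCE\n" + inputs.sequence,
        "CONFIDENCE (pLDDT, 0-100)\n"
        + f"mean={mean(inputs.plddt):.1f}, min={min(inputs.plddt):.1f}",
        "MUTATION PROFILE (position:log-probability)\n" + profile,
        "LITERATURE\n" + literature,
    ]
    return "\n\n".join(sections)


# Made-up values for a three-residue sequence, with no protein name given.
example = InterpretationInput(
    sequence="MKT",
    plddt=[90.0, 70.0, 50.0],
    wild_type_log_probs=[-0.10, -4.25, -1.00],
)
print(build_context(example))
```

Stating explicitly that no literature was retrieved, rather than omitting the section, lets the model tell a missing input apart from an empty one.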