osprey.ccs

Osprey for Compiled Conformation Spaces (CCS)

Provides functions to set up designs using the compiled conformation space system and to run them.

Precision

Determines the level of floating-point precision for calculations.

Enumeration: Precision

  • Float32: 32-bit (single-precision) floating point.

  • Float64: 64-bit (double-precision) floating point.
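To get a feel for what the two settings trade off, the sketch below rounds a 64-bit Python float through a 32-bit representation using only the standard library. It does not call Osprey, and the helper name to_float32 is made up for this illustration.

```python
import struct


def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float and back."""
    return struct.unpack("f", struct.pack("f", x))[0]


value = 0.1
rounded = to_float32(value)

# The difference is the extra rounding error introduced by 32-bit storage.
print(f"float64: {value!r}")
print(f"float32: {rounded!r}")
print(f"error:   {abs(rounded - value):.3e}")
```

Values that are exactly representable in 32 bits, such as 0.5, pass through unchanged; most energies will not be.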

PosInterDist

Defines how position interactions should be distributed among conformation fragments.

Enumeration: PosInterDist

  • DesmetEtAl1992: Uses the traditional distribution introduced in the original DEE paper:

The dead-end elimination theorem and its use in protein side-chain positioning
J Desmet, M De Maeyer, B Hazes, I Lasters
Nature, 1992
https://doi.org/10.1038/356539a0

Namely, pos-static and pos interactions are placed on “single” conf fragments, and pos-pos interactions are placed on “pair” conf fragments.

  • TighterBounds: A newer distribution that yields tighter lower bounds on conformation energy. No interactions are placed on “single” conf fragments. Instead, the pos-static and pos interactions are distributed evenly among the “pair” conf fragments involving that position, without double-counting. This re-distribution allows minimizing “pair” conf fragments in the presence of the fixed atoms, which generally results in less optimistic lower bounds for “pair” energies.
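The bookkeeping behind TighterBounds can be illustrated with a toy example. The sketch below is not Osprey's implementation. It assumes each position has one combined "single" energy (pos-static plus pos) and splits it evenly over the n - 1 pairs that involve that position, which is one way to satisfy "distributed evenly, without double-counting". The function name redistribute and all energy values are hypothetical.

```python
from itertools import combinations


def redistribute(singles, pairs):
    """Fold each position's single energy evenly into the pairs that involve it.

    singles: {pos: energy}; pairs: {(pos_a, pos_b): energy} with pos_a < pos_b.
    Toy illustration only; not Osprey's implementation.
    """
    n = len(singles)
    if n < 2:
        raise ValueError("need at least two positions to form a pair")
    # Each position takes part in n - 1 pairs, so each pair gets a 1/(n - 1) share.
    share = {pos: e / (n - 1) for pos, e in singles.items()}
    return {
        (a, b): pairs.get((a, b), 0.0) + share[a] + share[b]
        for a, b in combinations(sorted(singles), 2)
    }


singles = {0: 2.0, 1: 4.0, 2: 6.0}
pairs = {(0, 1): 1.0, (0, 2): -1.0, (1, 2): 0.5}

tighter = redistribute(singles, pairs)
total_before = sum(singles.values()) + sum(pairs.values())
total_after = sum(tighter.values())
print(tighter)
print(total_before, total_after)
```

The total energy is the same before and after, which is the "without double-counting" property: only the placement of the terms changes, not their sum.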

loadConfSpace

loadConfSpace(path)

Loads a compiled conformation space.

Arguments

  • path str: Path to the compiled conformation space file (usually has a .ccsx or .ccs extension)

Returns

ConfSpace

cudaEnergyCalculator

cudaEnergyCalculator(confSpace, precision, parallelism)

An energy calculator that uses CUDA GPUs to accelerate the computation. GPU information is taken from the Parallelism instance.

Arguments

Returns

CudaConfEnergyCalculator

nativeEnergyCalculator

nativeEnergyCalculator(confSpace, precision)

A reference implementation of a conformation minimizer and energy calculator in C++, rather than Java.

It’s not terribly optimized, but it’s already faster than the original implementation in Java.

Arguments

Returns

NativeConfEnergyCalculator

javaEnergyCalculator

javaEnergyCalculator(confSpace)

A basic conformation energy calculator implementation, written in pure Java.

Not the fastest implementation available. Try the NativeConfEnergyCalculator first, or the CudaConfEnergyCalculator if you have GPUs available.

Arguments

Returns

CPUConfEnergyCalculator

bestEnergyCalculator

bestEnergyCalculator(confSpace, parallelism)

Builds the best conformation energy calculator for the given resources, using Float64 precision.

Arguments

Returns

ConfEnergyCalculator

calcReferenceEnergies

calcReferenceEnergies(ecalc, minimize=True)

Calculate reference energies for a conformation space.

This calculator uses just the ‘single’ energy of the design position as the reference energy.

Arguments

  • ecalc ConfEnergyCalculator:
  • minimize bool: True to minimize conformations, False to use rigid conformations.

Returns

SimpleReferenceEnergies

calcEnergyMatrix

calcEnergyMatrix(ecalc, tasks=_useJavaDefault, eref=None, posInterDist=PosInterDist.DesmetEtAl1992, minimize=True, includeStaticStatic=True, cachePath=None)

Calculates energy matrices for a conformation space.

Arguments

  • ecalc ConfEnergyCalculator:
  • eref SimpleReferenceEnergies: Reference energies used to design against the unfolded state, useful for GMEC designs
  • posInterDist PosInterDist: Defines what position interactions should be used for conformations and conformation fragments
  • minimize bool: True to minimize conformations, False to use rigid conformations.
  • includeStaticStatic bool: True to include the static-static energies in conformation energies
  • cachePath str: Path to file where energy matrix should be saved between computations.

If design settings are changed between runs, Osprey will make some effort to detect that the energy matrix cache is out of date and compute a new energy matrix instead of using the cached, incorrect one. Osprey might not detect all design changes, though, and could incorrectly reuse a cached energy matrix, so it is best to manually delete the energy matrix cache file after changing design settings.

Energy matrix computation can take a long time, but often the results can be reused between computations. Use a cache file to skip energy matrix computation on the next Osprey run if the energy matrix has already been computed once before.

Returns

EnergyMatrix
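The functions documented so far can be chained roughly as follows. This is a sketch based only on the signatures listed in this reference. It assumes the module is importable as osprey.ccs and that any start-up your Osprey installation requires has already been done. The wrapper name compute_energy_matrix and its parameters are hypothetical, and the Parallelism instance is passed in rather than constructed here.

```python
def compute_energy_matrix(ccsx_path, parallelism, cache_path=None):
    """Load a compiled conf space and compute (or reload) its energy matrix."""
    # Imported lazily so this sketch can be read and imported without Osprey;
    # assumes the module documented here is importable as osprey.ccs.
    import osprey.ccs as ccs

    conf_space = ccs.loadConfSpace(ccsx_path)
    ecalc = ccs.bestEnergyCalculator(conf_space, parallelism)

    # Reference energies are optional; they are described as useful for GMEC designs.
    eref = ccs.calcReferenceEnergies(ecalc, minimize=True)

    # With a cache path, a later run can skip the expensive matrix computation.
    return ccs.calcEnergyMatrix(
        ecalc,
        eref=eref,
        minimize=True,
        cachePath=cache_path,
    )
```

Remember to delete the cache file by hand after changing design settings, since stale caches are not always detected.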

ecalcAdapter

ecalcAdapter(ecalc, tasks, eref=None, posInterDist=PosInterDist.DesmetEtAl1992, minimize=True, includeStaticStatic=True)

A translation layer that allows the new conf energy calculators to be somewhat compatible with the old ones.

Arguments

  • ecalc ConfEnergyCalculator:
  • tasks TaskExecutor:
  • eref SimpleReferenceEnergies: Reference energies used to design against the unfolded state, useful for GMEC designs
  • posInterDist PosInterDist: Defines what position interactions should be used for conformations and conformation fragments
  • minimize bool:
  • includeStaticStatic bool: True to include the static-static energies in conformation energies

Returns

ConfEnergyCalculatorAdapter

freeEnergyCalc

freeEnergyCalc(confSpace, parallelism=None, cluster=None, precision=Structs.Precision.Float64, nodeDBFile=None, nodeDBMem=2 * 1024 * 1024, seqDBFile=None, seqDBMathContext=new MathContext(2048, RoundingMode.HALF_UP), posInterDist=PosInterDist.TighterBounds, staticStatic=True, tripleCorrectionThreshold=None, conditions=BoltzmannCalculator.Conditions.Classic, nodeScoringLog=None, nodeStatsReportingSeconds=None)

A free energy calculator based on a scalable and memory-bounded partition function calculator.

Used with memory-bounded implementations of K*. See kstarBoundedMem.

Arguments

  • confSpace MultiStateConfSpace:
  • parallelism Parallelism: Information about parallel hardware resources available, if any.
  • cluster Cluster: Information about the cluster resources available, if any.
  • precision Precision: The precision of floating-point calculations for the energy function.
  • nodeDBFile str: Path to the Node Database file, if any. If no file is given, the calculation will be done in RAM.
  • nodeDBMem int: Amount of memory (RAM, in bytes) to use for the calculation. This memory will be pre-allocated.
  • seqDBFile str: Path to the file for the Sequence Database, if any. If no file is given, the sequence info will be stored in RAM.
  • seqDBMathContext java.math.MathContext: Precision to use for calculations involving sequence partition function values.

SeqDB needs a lot of precision to keep the bounds accurate, since it adds and subtracts numbers with very different exponents. Empirical testing shows that anything higher than 512 starts to incur noticeable performance penalties. Unfortunately, testing also shows that values of 1024 or less can cause some partition function (pfunc) computations to get stuck, so the default is very large to be safe for most computations.

  • posInterDist PosInterDist:
  • staticStatic bool: True to include interactions within the static region (unaffected by motions and mutations) of the input molecules.

Including these energies, or not, won’t change the rankings of sequences/conformations in a single design, but including them will make your energies comparable across different designs on the same molecules.

Osprey has been optimized so that including these interactions, or not, should have no noticeable impact on performance.

  • tripleCorrectionThreshold java.lang.Double: Threshold (in kcal/mol) for when triple interactions (ie between three design positions, rather than one or two) should be included in the energy matrix.

Use null to never include triple corrections to the energy matrix.

Use a non-null value to include triple corrections in the energy matrix. For example, a value like 10 kcal/mol seems to work well, but feel free to experiment with different values in your own designs.

Pairwise energies in energy matrices are often overly optimistic (ie too low) which leads to loose lower bounds on conformation energies. Using triple information to correct (ie raise) the energy bounds on conformations can speed up large Osprey calculations significantly by using a more accurate conformation order to compute free energies. However, the tradeoff for faster free energy calculation is a slower energy matrix calculation step, so the speedup tends to only pay off for larger designs.

A triple correction will be added to the energy matrix when all of the triple’s constituent single and pairwise energies are below the threshold. This setting ensures we don’t waste time correcting energy matrix energies that are already very high, like clashes, since triple corrections are expensive to compute.

  • conditions Conditions: Environmental conditions (like temperature) for calculations that use them (like Boltzmann weighting).
  • nodeScoringLog java.io.File: For debugging Osprey performance, most users will not need this.
  • nodeStatsReportingSeconds int: For debugging Osprey performance, most users will not need this.
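The reason seqDBMathContext needs so many digits can be seen with Python's decimal module, used here only as a stand-in for Java's MathContext, whose precision is also a count of decimal digits. The sketch adds a small value to one 40 orders of magnitude larger and then subtracts the large value again. The numbers and the helper name add_then_subtract are made up for this illustration.

```python
from decimal import Context, Decimal


def add_then_subtract(big, small, digits):
    """Compute (big + small) - big using `digits` significant decimal digits."""
    ctx = Context(prec=digits)
    return ctx.subtract(ctx.add(big, small), big)


big = Decimal("1e40")  # a large partition-function-like value
small = Decimal(1)     # a contribution 40 orders of magnitude smaller

low = add_then_subtract(big, small, digits=16)   # too few digits: small term is lost
high = add_then_subtract(big, small, digits=64)  # enough digits: small term survives
print(low, high)
```

With too few digits the small contribution vanishes during the addition, so bounds that depend on it can no longer be updated correctly.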

Returns

Coffee
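The rule described under tripleCorrectionThreshold can be written down as a small predicate. This is a sketch of the stated rule only, not Osprey's code. The data layout (dictionaries keyed by position labels and frozensets), the function name should_correct_triple, and the energies are all assumptions made for the example.

```python
from itertools import combinations


def should_correct_triple(triple, singles, pairs, threshold):
    """Return True if every single and pair energy in the triple is below threshold.

    triple: three keys; singles: {key: energy};
    pairs: {frozenset({key_a, key_b}): energy}.
    A threshold of None means triple corrections are never applied.
    """
    if threshold is None:
        return False
    energies = [singles[k] for k in triple]
    energies += [pairs[frozenset(p)] for p in combinations(triple, 2)]
    return all(e < threshold for e in energies)


singles = {"a": -2.0, "b": 1.5, "c": 0.3}
pairs = {
    frozenset({"a", "b"}): -0.7,
    frozenset({"a", "c"}): 4.0,
    frozenset({"b", "c"}): 250.0,  # a clash-like pair energy
}

# The clash exceeds a 10 kcal/mol threshold, so this triple is skipped.
print(should_correct_triple(("a", "b", "c"), singles, pairs, threshold=10.0))
```

Skipping triples that contain a clash avoids spending time on corrections for conformations that are already very unfavorable.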

kstarBoundedMem

kstarBoundedMem(complex, design, target, gWidthMax=1.0, maxSimultaneousMutations=1, stabilityThreshold=5.0, timing=Timing.Efficient, reportStateProgress=True, ensembleTracking=_useJavaDefault, ensembleMinUpdate=_useJavaDefault)

An implementation of the K* design algorithm that uses the bounded-memory free energy calculator. See freeEnergyCalc.

For an example of how to use this function, see examples/python.ccs/kstar/kstar.boundedMem.py in your Osprey distribution.

Arguments

  • complex ConfSpace:
  • design ConfSpace:
  • target ConfSpace:
  • gWidthMax float: Sets the largest precision desired for free energy calculations. Resulting free energy values may be more precise than this maximum value. If the computation is stopped early, the free energy values may be less precise than this value.
  • maxSimultaneousMutations java.lang.Integer:
  • stabilityThreshold java.lang.Double: Pruning criteria to remove sequences with unstable unbound states relative to the wild type sequence. Defined in units of kcal/mol.

Set to null to disable the filter entirely.

If the wild-type sequence is not in the list of sequences to compute, and the stability threshold is not null, then the free energies for the wild-type sequence will be calculated anyway.

More precisely, a sequence is pruned when the following expression is true:

L(G_s) > U(G_w) + t

where:
    • L(G_s) is the lower bound on the free energy for sequence s
    • U(G_w) is the upper bound on the free energy for the wild-type sequence w
    • t is the stability threshold
  • timing Timing:
  • reportStateProgress bool: Set to true to report progress on computing individual conf space states to the log.
  • ensembleTracking [int,str]: Tracks the lowest-energy conformations for the best sequences and periodically writes out ensemble PDB files to the specified directory.
  • ensembleMinUpdate [int,java.util.concurrent.TimeUnit]: Sets the minimum interval for writing the next lowest-energy ensemble PDB files.

Returns

KStarDirector
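The stability filter described under stabilityThreshold amounts to a one-line comparison. The sketch below applies L(G_s) > U(G_w) + t to a few made-up bounds. The function name is_pruned and all numbers are hypothetical, and in a real run Osprey computes the bounds itself.

```python
def is_pruned(seq_lower, wt_upper, threshold):
    """Apply the pruning test L(G_s) > U(G_w) + t; a threshold of None disables it."""
    if threshold is None:
        return False
    return seq_lower > wt_upper + threshold


wt_upper = -30.0  # U(G_w): hypothetical upper bound for the wild type, in kcal/mol
candidates = {"A": -32.0, "B": -26.0, "C": -24.5}  # hypothetical L(G_s) values

# With t = 5.0 the cutoff is -25.0, so only sequence C is pruned.
kept = [s for s, lower in candidates.items() if not is_pruned(lower, wt_upper, 5.0)]
print(kept)
```

Because the test compares a lower bound against an upper bound, a sequence is only pruned when it is certainly less stable than the wild type by more than the threshold.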