Voronota-JS is an expansion of the core Voronota software. Voronota-JS provides a way to write JavaScript scripts for the comprehensive analysis of macromolecular structures, including the Voronoi tesselation-based analysis.
Currently, the Voronota-JS package contains several executables:
Voronota-JS relies on several externally developed software projects (big thanks to their authors):
Please refer to the core Voronota quick install guide.
Download the latest archive from the official downloads page: https://github.com/kliment-olechnovic/voronota/releases.
The archive contains the Voronota-JS software in the ‘expansion_js’ subdirectory. There is a ready-to-use statically compiled ‘voronota-js’ program for 64 bit Linux systems. This executable can be rebuilt from the provided source code to work on any modern Linux, macOS or Windows operating systems.
Voronota-JS has no required external dependencies, only a C++14-compliant compiler is needed to build it.
You can build using CMake for makefile generation. Starting in the directory containing “CMakeLists.txt” file, run the sequence of commands:
cmake ./
make
Alternatively, to keep files more organized, CMake can be run in a separate “build” directory:
mkdir build
cd build
cmake ../
make
cp ./voronota-js ../voronota-js
For example, “voronota-js” executable can be built from the sources in “src” directory using GNU C++ compiler:
g++ -std=c++14 -I"./src/dependencies" -O3 -o "./voronota-js" $(find ./src/ -name '*.cpp')
The script ‘voronota-js-ligand-cadscore’ for the protein-ligand variation of CAD-score is included to the Voronota-JS expansion of Voronota.
The ‘voronota-js-ligand-cadscore’ scripts has a special input mode for CASP15 protein-ligand model files: when using –casp15-* options for specifying input, the input files are split into receptor and ligand parts and passed to the same script with basic input options. The target input in the CASP15 mode should be formatted in the same as the model input. The basic input mode option is more flexible, but it requires all the receptor and ligand files available separately, and the ligand IDs (names) to be specified explicitly.
For example, let us take two files with protein-ligand models from CASP15: ‘H1135LG035_1’ and ‘H1135LG046_1’. Then let us execute the following command:
voronota-js-ligand-cadscore \
--casp15-target ./T1118v1LG035_1 \
--casp15-model ./T1118v1LG046_1 \
--casp15-target-pose 1 \
--casp15-model-pose 2 \
| column -t
which outputs:
target model contacts_set CAD_score target_area model_area
T1118v1LG035_1_pose1 T1118v1LG046_1_pose2 interface 0.568301 648.036 808.139
T1118v1LG035_1_pose1 T1118v1LG046_1_pose2 interface_plus_adjacent 0.629109 1781.9 2189.88
Note, that CAD-score values will be different if we swap target and model inputs. The script can do the swap automatically and output additional results when the ‘–and-swap true’ is added to the arguments, then the output will look as follows:
target model contacts_set CAD_score target_area model_area
T1118v1LG035_1_pose1 T1118v1LG046_1_pose2 interface 0.568301 648.036 808.139
T1118v1LG035_1_pose1 T1118v1LG046_1_pose2 interface_plus_adjacent 0.629109 1781.9 2189.88
T1118v1LG046_1_pose2 T1118v1LG035_1_pose1 interface 0.528076 808.139 648.036
T1118v1LG046_1_pose2 T1118v1LG035_1_pose1 interface_plus_adjacent 0.519405 2189.88 1781.9
The output table columns mean the following
There are some important aspects:
Examples of files generated using the ‘–details-dir’ option:
Example pymol call for displaying files generated using the ‘–drawing-dir’ option:
pymol \
draw_T1118v1LG035_1_pose1_interface_and_adjacent_contacts_in_pymol.py \
T1118v1LG035_1_pose1_cutout_interface_residues_ligand.pdb \
T1118v1LG035_1_pose1_cutout_interface_residues_receptor.pdb
Example of visualized contacts (with direct interface contacts in green, adjacent contacts in yellow, ligand atoms in red, receptor atoms in cyan): protein_ligand_3D.png.
‘voronota-js-voromqa’ script provides an interface to VoroMQA dark (newer) and light (classic) methods.
Options:
--input | -i string * input file path or '_list' to read file paths from stdin
--restrict-input string query to restrict input atoms, default is '[]'
--sequence-file string input sequence file
--select-contacts string query to select contacts, default is '[-min-seq-sep 2]'
--inter-chain string set query to select contacts to '[-inter-chain]'
--constraints-file string input distance constraints file
--output-table-file string output table file path, default is '_stdout' to print to stdout
--output-dark-scores string output PDB file with dark scores as B-factors
--output-light-scores string output PDB file with light scores as B-factors
--output-alignment string output alignment sequence alignment file
--cache-dir string path to cache directory
--tour-sort string tournament sorting mode, default is '_mono', options are '_homo', '_hetero' or custom
--sbatch-parameters string parameters to run Slurm sbatch if input is '_list'
--split-models-prefix string name prefix for splitting input PDB file by models
--smoothing-window number residue scores smoothing window size, default is 0
--processors number maximum number of processors to use, default is 1
--casp-b-factors flag to write CASP-required B-factors in output PDB files
--use-scwrl flag to use Scwrl4 to rebuild side-chains
--guess-chain-names flag to guess chain names based on sequence nubmering
--order-by-residue-id flag to order atoms by residue ids
--input-is-script flag to treat input file as vs script
--as-assembly flag to treat input file as biological assembly
--help | -h flag to display help message and exit
Standard output:
space-separated table of scores
Examples:
voronota-js-voromqa --input model.pdb
voronota-js-voromqa --cache-dir ./cache --input model.pdb
ls *.pdb | voronota-js-voromqa --cache-dir ./cache --input _list
ls *.pdb | voronota-js-voromqa --cache-dir ./cache --input _list | column -t
ls *.pdb | voronota-js-voromqa --cache-dir ./cache --input _list \
--processors 8 \
--inter-chain \
--tour-sort _hetero
ls *.pdb | voronota-js-voromqa --cache-dir ./cache --input _list \
--processors 8 \
--select-contacts '[-a1 [-chain A -rnum 1:500] -a2 [-chain B -rnum 1:500]]' \
--tour-sort '-columns full_dark_score sel_energy -multipliers 1 -1 -tolerances 0.02 0.0'
‘voronota-js-only-global-voromqa’ script computes global VoroMQA scores and can use fast caching.
Options:
--input | -i string * input file path or '_list' to read file paths from stdin
--restrict-input string query to restrict input atoms, default is '[]'
--output-table-file string output table file path, default is '_stdout' to print to stdout
--output-dark-pdb string output path for PDB file with VoroMQA-dark scores, default is ''
--output-light-pdb string output path for PDB file with VoroMQA-light scores, default is ''
--processors number maximum number of processors to run in parallel, default is 1
--sbatch-parameters string sbatch parameters to run in parallel, default is ''
--cache-dir string cache directory path to store results of past calls
--run-faspr string path to FASPR binary to rebuild side-chains
--input-is-script flag to treat input file as vs script
--as-assembly flag to treat input file as biological assembly
--help | -h flag to display help message and exit
Standard output:
space-separated table of scores
Examples:
voronota-js-only-global-voromqa --input model.pdb
ls *.pdb | voronota-js-only-global-voromqa --input _list --processors 8 | column -t
‘voronota-js-membrane-voromqa’ script provides an interface to the VoroMQA-based method for assessing membrane protein structures.
Options:
--input | -i string * input file path
--restrict-input string query to restrict input atoms, default is '[]'
--membrane-width number membrane width or list of widths to use, default is 25.0
--output-local-scores string prefix to output PDB files with local scores as B-factors
--as-assembly flag to treat input file as biological assembly
--help | -h flag to display help message and exit
Standard output:
space-separated table of scores
Examples:
voronota-js-membrane-voromqa --input model.pdb --membrane-width 30.0
voronota-js-membrane-voromqa --input model.pdb --membrane-width 20.0,25.0,30.0
voronota-js-membrane-voromqa --input model.pdb \
--membrane-width 20,25,30 \
--output-local-scores ./local_scores/
‘voronota-js-ifeatures-voromqa’ script computes multiple VoroMQA-based features of protein-protein complexes.
Options:
--input | -i string * input file path or '_list' to read file paths from stdin
--output-table-file string output table file path, default is '_stdout' to print to stdout
--processors number maximum number of processors to use, default is 1
--use-scwrl flag to use Scwrl4 to rebuild side-chains
--as-assembly flag to treat input file as biological assembly
--help | -h flag to display help message and exit
Standard output:
space-separated table of scores
Examples:
voronota-js-ifeatures-voromqa --input model.pdb
ls *.pdb | voronota-js-ifeatures-voromqa --input _list --processors 8 | column -t
‘voronota-js-fast-iface-voromqa’ script rapidly computes VoroMQA-based interface energy of protein complexes.
Options:
--input | -i string * input file path or '_list' to read file paths from stdin
--restrict-input string query to restrict input atoms, default is '[]'
--subselect-contacts string query to subselect inter-chain contacts, default is '[]'
--constraints-required string query to check required contacts, default is ''
--constraints-banned string query to check banned contacts, default is ''
--constraint-clashes number max allowed clash score, default is 0.9
--output-table-file string output table file path, default is '_stdout' to print to stdout
--output-ia-contacts-file string output inter-atom contacts file path, default is ''
--output-ir-contacts-file string output inter-residue contacts file path, default is ''
--processors number maximum number of processors to run in parallel, default is 1
--sbatch-parameters string sbatch parameters to run in parallel, default is ''
--stdin-file string input file path to replace stdin
--run-faspr string path to FASPR binary to rebuild side-chains
--input-is-script flag to treat input file as vs script
--as-assembly flag to treat input file as biological assembly
--detailed-times flag to output detailed times
--score-symmetry flag to score interface symmetry
--blanket flag to keep nucleic acids and use blanket potential
--help | -h flag to display help message and exit
Standard output:
space-separated table of scores
Examples:
voronota-js-fast-iface-voromqa --input model.pdb
ls *.pdb | voronota-js-fast-iface-voromqa --input _list --processors 8 | column -t
‘voronota-js-fast-iface-cadscore’ script rapidly computes interface CAD-score for two protein complexes.
Options:
--target | -t string * target file path
--model | -m string * model file path or '_list' to read file paths from stdin
--restrict-input string query to restrict input atoms, default is '[]'
--subselect-contacts string query to subselect inter-chain contacts, default is '[]'
--subselect-site string query to subselect atoms for binding site scoring, default is '[]'
--output-table-file string output table file path, default is '_stdout' to print to stdout
--processors number maximum number of processors to run in parallel, default is 1
--sbatch-parameters string sbatch parameters to run in parallel, default is ''
--stdin-file string input file path to replace stdin
--run-faspr string path to FASPR binary to rebuild side-chains
--permuting-allowance number maximum number of chains for exhaustive remapping, default is 4
--as-assembly flag to treat input files as biological assemblies
--remap-chains flag to calculate and use optimal chains remapping
--remap-chains-logging flag to print log of chains remapping to stderr
--ignore-residue-names flag to ignore residue names in residue identifiers
--test-common-ids flag to fail quickly if there are no common residues
--crude flag to enable very crude faster mode
--lt flag to enable voronota-lt faster mode
--help | -h flag to display help message and exit
Standard output:
space-separated table of scores
Examples:
voronota-js-fast-iface-cadscore --target target.pdb --model model.pdb
ls *.pdb | voronota-js-fast-iface-cadscore --target target.pdb --model _list --processors 8 | column -t
‘voronota-js-fast-iface-cadscore-matrix’ script rapidly computes interface CAD-score between complexes.
Options:
--restrict-input string query to restrict input atoms, default is '[]'
--subselect-contacts string query to subselect inter-chain contacts, default is '[]'
--output-table-file string output table file path, default is '_stdout' to print to stdout
--processors number maximum number of processors to run in parallel, default is 1
--sbatch-parameters string sbatch parameters to run in parallel, default is ''
--stdin-file string input file path to replace stdin
--permuting-allowance number maximum number of chains for exhaustive remapping, default is 4
--as-assembly flag to treat input files as biological assemblies
--remap-chains flag to calculate and use optimal chains remapping
--ignore-residue-names flag to ignore residue names in residue identifiers
--crude flag to enable very crude faster mode
--lt flag to enable voronota-lt faster mode
--help | -h flag to display help message and exit
Standard output:
space-separated table of scores
Examples:
ls *.pdb | voronota-js-fast-iface-cadscore-matrix | column -t
find ./complexes/ -type f -name '*.pdb' | voronota-js-fast-iface-cadscore-matrix > "full_matrix.txt"
(find ./group1/ -type f | awk '{print $1 " a"}' ; find ./group2/ -type f | awk '{print $1 " b"}') | voronota-js-fast-iface-cadscore-matrix > "itergroup_matrix.txt"
‘voronota-js-fast-iface-contacts’ script rapidly computes contacts of inter-chain interface in a molecular complex.
Options:
--input string * input file path or '_list' to read file paths from stdin
--restrict-input string query to restrict input atoms, default is '[]'
--subselect-contacts string query to subselect inter-chain contacts, default is '[]'
--output-contacts-file string output table file path, default is '_stdout' to print to stdout
--output-bsite-file string output binding site table file path, default is ''
--output-drawing-script string output PyMol drawing script file path, default is ''
--processors number maximum number of processors to run in parallel, default is 1
--sbatch-parameters string sbatch parameters to run in parallel, default is ''
--stdin-file string input file path to replace stdin
--run-faspr string path to FASPR binary to rebuild side-chains
--custom-radii-file string path to file with van der Waals radii assignment rules
--with-sas-areas flag to also compute and output solvent-accessible areas of interface residue atoms
--coarse-grained flag to output inter-residue contacts
--input-is-script flag to treat input file as vs script
--as-assembly flag to treat input file as biological assembly
--use-hbplus flag to run 'hbplus' to tag H-bonds
--expand-ids flag to output expanded IDs
--og-pipeable flag to format output to be pipeable to 'voronota query-contacts'
--help | -h flag to display help message and exit
Standard output:
tab-separated table of contacts
Examples:
voronota-js-fast-iface-contacts --input "./model.pdb" --expand-ids > "./contacts.tsv"
voronota-js-fast-iface-contacts --input "./model.pdb" --with-sas-areas --coarse-grained --og-pipeable | voronota query-contacts --summarize-by-first
cat "./model.pdb" | voronota-js-fast-iface-contacts --input _stream --with-sas-areas --coarse-grained --og-pipeable | voronota query-contacts --summarize
ls *.pdb | voronota-js-fast-iface-contacts --input _list --processors 8 --output-contacts-file "./output/-BASENAME-.tsv"
‘voronota-js-fast-iface-data-graph’ script generates interface data graphs of protein complexes.
Options:
--input string * input file path or '_list' to read file paths from stdin
--restrict-input string query to restrict input atoms, default is '[]'
--subselect-contacts string query to subselect inter-chain contacts, default is '[]'
--with-reference string input reference complex structure file path, default is ''
--output-data-prefix string * output data files prefix
--output-table-file string output table file path, default is '_stdout' to print to stdout
--processors number maximum number of processors to run in parallel, default is 1
--sbatch-parameters string sbatch parameters to run in parallel, default is ''
--stdin-file string input file path to replace stdin
--run-faspr string path to FASPR binary to rebuild side-chains
--coarse-grained flag to output a coarse-grained graph
--input-is-script flag to treat input file as vs script
--as-assembly flag to treat input file as biological assembly
--help | -h flag to display help message and exit
Standard output:
space-separated table of generated file paths
Examples:
voronota-js-fast-iface-data-graph --input model.pdb --output-prefix ./data_graphs/
ls *.pdb | voronota-js-fast-iface-data-graph --input _list --processors 8 --output-prefix ./data_graphs/ | column -t
‘voronota-js-voroif-gnn’ scores protein-protein interfaces using the VoroIF-GNN method
Options:
--input string * input file path, or '_list' to read multiple input files paths from stdin
--gnn string GNN package file or directory with package files, default is '${SCRIPT_DIRECTORY}/voroif/gnn_packages/v1'
--gnn-add string additional GNN package file or directory with package files, default is ''
--restrict-input string query to restrict input atoms, default is '[]'
--subselect-contacts string query to subselect inter-chain contacts, default is '[]'
--as-assembly string flag to treat input file as biological assembly
--input-is-script string flag to treat input file as vs script
--conda-path string conda installation path, default is ''
--conda-env string conda environment name, default is ''
--faspr-path string path to FASPR binary, default is ''
--run-faspr string flag to rebuild sidechains using FASPR, default is 'false'
--processors number maximum number of processors to run in parallel, default is 1
--sbatch-parameters string sbatch parameters to run in parallel, default is ''
--stdin-file string input file path to replace stdin, default is '_stream'
--local-column string flag to add per-residue scores to the global output table, default is 'false'
--cache-dir string cache directory path to store results of past calls, default is ''
--output-dir string output directory path for all results, default is ''
--output-pdb-file string output path for PDB file with interface residue scores, default is ''
--output-pdb-mode string mode to write b-factors ('overwrite_all', 'overwrite_iface' or 'combine'), default is 'overwrite_all'
--help | -h flag to display help message and exit
Standard output:
space-separated table of global scores
Important note about output interpretation:
higher GNN scores are better, lower GNN scores are worse (with VoroMQA energy it is the other way around)
Examples:
voronota-js-voroif-gnn --conda-path ~/miniconda3 --input './model.pdb'
voronota-js-voroif-gnn --input './model.pdb' --gnn "${HOME}/git/voronota/expansion_js/voroif/gnn_packages/v1"
find ./models/ -type f | voronota-js-voroif-gnn --conda-path ~/miniconda3 --input _list
Requirements installation example using Miniconda (may need more than 10 GB of disk space):
# prepare Miniconda
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
source ~/miniconda3/bin/activate
# install PyTorch using instructions from 'https://pytorch.org/get-started/locally/'
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
# install PyTorch Geometric using instructions from 'https://pytorch-geometric.readthedocs.io/en/latest/install/installation.html'
pip install torch_geometric
pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-2.1.0+cu121.html
# install Pandas
conda install pandas
# if you do not have R installed in you system, install it (not necessarily using conda, e.g 'sudo apt-get install r-base' in Ubuntu)
conda install r -c conda-forge
‘voronota-js-ligand-cadscore’ script computes protein-ligand variation of CAD-score.
Input options, basic:
--target-receptor string * target receptor file path
--target-ligands string * list of target ligand file paths
--target-ligand-ids string * list of target ligand IDs
--model-receptor string * model receptor file path
--model-ligands string * list of model ligand file paths
--model-ligand-ids string * list of model ligand IDs
--target-name string target name to use for output, default is 'target_complex'
--model-name string model name to use for output, default is 'model_complex'
Input options, alternative:
--casp15-target string * target data file in CASP15 format, alternative to --target-* options
--casp15-target-pose string * pose number to select from the target data file in CASP15 format
--casp15-model string * model data file in CASP15 format, alternative to --model-* options
--casp15-model-pose string * pose number to select from the model data file in CASP15 format
Other options:
--table-dir string directory to output scores table file named '${TARGET_NAME}_vs_${MODEL_NAME}.txt'
--details-dir string directory to output lists of contacts used for scoring
--drawing-dir string directory to output files to visualize with pymol
--and-swap string flag to compute everything after swapping target and model, default is 'false'
--ignore-ligand-headers string flag to ignore title header in ligand files
--help | -h display help message and exit
Standard output:
space-separated table of scores
Examples:
voronota-js-ligand-cadscore --casp15-target ./T1118v1LG035_1 --casp15-target-pose 1 --casp15-model ./T1118v1LG046_1 --casp15-model-pose 1
voronota-js-ligand-cadscore \
--target-receptor ./t_protein.pdb --target-ligands './t_ligand1.mol ./t_ligand2.mol ./t_ligand3.mol' --target-ligand-ids 'a a b' \
--model-receptor ./m_protein.pdb --model-ligands './m_ligand1.mol ./m_ligand2.mol ./m_ligand3.mol' --model-ligand-ids 'a a b'