This page is meant to make users aware of helpful packages for our research, both in-house and external. Feel free to add any more packages as you see fit!

GoodVibes

GoodVibes is an in-house Python package from the Paton group designed to compute thermochemical values from electronic structure frequency calculations. It collects energies, enthalpies, entropies, and free energies from quantum chemistry output files at variable temperatures and concentrations, while applying a variety of useful corrections including quasi-harmonic entropy corrections, zero-point energy corrections, and frequency scaling. Other features include Boltzmann averaging, relative energy and thermochemistry calculations and plotting, and duplicate checking.

Publication is available here

Documentation can be found in the Readme here
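To illustrate what Boltzmann averaging means in this context, here is a minimal standalone sketch. It is not GoodVibes code and does not use the GoodVibes API; the function names, the example energies, and the property values are hypothetical, and relative free energies are assumed to be in kcal/mol.

```python
import math

# Gas constant in kcal/(mol*K); energies below are assumed to be in kcal/mol.
R_KCAL = 0.0019872041


def boltzmann_weights(free_energies, temperature=298.15):
    """Return normalized Boltzmann populations for a list of free energies."""
    g_min = min(free_energies)  # shift by the minimum for numerical stability
    factors = [
        math.exp(-(g - g_min) / (R_KCAL * temperature)) for g in free_energies
    ]
    total = sum(factors)
    return [f / total for f in factors]


def boltzmann_average(values, free_energies, temperature=298.15):
    """Population-weighted average of a property over a set of conformers."""
    weights = boltzmann_weights(free_energies, temperature)
    return sum(w * v for w, v in zip(weights, values))


# Hypothetical relative free energies of three conformers (kcal/mol)
conformer_g = [0.0, 0.5, 2.0]
weights = boltzmann_weights(conformer_g)
print([round(w, 3) for w in weights])

# Hypothetical per-conformer property (for example, a computed descriptor)
print(boltzmann_average([1.0, 2.0, 3.0], conformer_g))
```

Lower-energy conformers receive larger weights, so the averaged property is dominated by the most stable structures at the chosen temperature.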

DBSTEP

DFT-Based STEric Parameters (or DBSTEP) is an in-house Python package from the Paton group for collecting steric parameters from 3D molecular coordinates. Steric parameters available include percent buried volume, Sterimol (L, Bmin, Bmax) and new vectorized versions of these parameters, vol2vec and Sterimol2vec. This program can be used from the command line or in a script to obtain steric values from a variety of file formats or RDKit mol objects as input. Optional output allows for the visualization of the parameter measurements in PyMOL.

Documentation can be found in the Readme here

REGGAE

REGression Generator and AnalyzEr (or REGGAE) is an in-house R-language script from the Paton group for statistical analysis of datasets. Statistical diagnostics include linear and non-linear regression modeling, feature selection, data splitting, pairwise correlations, ANOVA and QSAR analysis, PCA, and cross-validation analysis. Users can also generate plots for these analyses.

Documentation can be found in the Readme here

DISCO

DIStributing Computed Outputs (or DISCO) is an in-house Python script from the Paton group that parses Gaussian NBO and GIAO outputs for atomic and molecular properties. DISCO collects NBO atomic charges, NMR tensor values and/or NMR chemical shifts, HOMO and LUMO values, and bond distances.

Documentation can be found in the Readme here.

AQME

Automated QM Environments (or AQME) is an in-house Python package from the Paton group. It is an ensemble of automated QM workflows that can be run through Jupyter Notebooks, the command line, and YAML files. Workflows include conformational sampling, post-processing of QM output files to fix errors, generation of ready-to-run QM files, and generation of semi-empirical descriptors in JSON and CSV files, among others.

Publication is available here

Documentation can be found in the Readme here

Slurm Job Tracking

An in-house Python script from the Paton group that tracks the completion of Slurm jobs. It is especially useful if you are running numerous jobs in numerous locations.

To install and use:

  • Copy jobcheck.py to the machine
  • (optional) Add an alias to your .bashrc: alias sq='python ~/jobcheck.py'
  • Run the command
    • When you run it for the first time, the script initializes and logs current job information.
    • Note: run the command every time after you submit jobs, or else it won't log current job info.

This is what it looks like when you run the command and a job has ended since the last time the command was run:

running_example

You can then quickly cd into the directory of the finished job, or copy it over to your local machine.
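The reason the command has to be run after every submission can be seen in a small sketch of snapshot-based tracking. This is an assumption about how such a tracker could work, not the actual contents of jobcheck.py; the function names, the JSON snapshot file, and the job IDs and paths are all hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path


def load_snapshot(path):
    """Load the previously logged jobs (job ID -> working directory)."""
    path = Path(path)
    if not path.exists():
        return {}  # first run: nothing has been logged yet
    return json.loads(path.read_text())


def finished_jobs(previous, current):
    """Jobs that were logged last time but are no longer in the queue."""
    return {
        job_id: workdir
        for job_id, workdir in previous.items()
        if job_id not in current
    }


def check(snapshot_path, current):
    """Report jobs finished since the last run, then log the current queue."""
    previous = load_snapshot(snapshot_path)
    done = finished_jobs(previous, current)
    Path(snapshot_path).write_text(json.dumps(current))
    return done


# In real use `current` would be built from the scheduler's queue listing;
# here it is hard-coded with made-up job IDs and directories.
with tempfile.TemporaryDirectory() as tmp:
    snapshot = os.path.join(tmp, "jobs.json")
    first = check(snapshot, {"101": "/scratch/a", "102": "/scratch/b"})
    second = check(snapshot, {"102": "/scratch/b"})

print(first)   # first run only initializes the log
print(second)  # job 101 has left the queue since the last run
```

A job that is submitted and finishes between two runs never appears in a snapshot, so it cannot be reported. That is why the queue should be logged right after each submission.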

pyssian

pyssian is an object oriented library for parsing Gaussian logfiles and input files which aims to facilitate the everyday scripting of computational chemists using Gaussian.

Source

Documentation

A package containing some useful scripts based on this library that can be used as examples of the usage of the library can be found in the pyssian-utils repository.

Source

Documentation

RDKit

RDKit is a collection of cheminformatics and machine-learning software written in C++ and Python. It includes:

  • BSD license, a business-friendly license for open source
  • Core data structures and algorithms in C++
  • Python 3.x wrapper generated using Boost.Python
  • Java and C# wrappers generated with SWIG
  • 2D and 3D molecular operations
  • Descriptor and fingerprint generation for machine learning
  • Molecular database cartridge for PostgreSQL supporting substructure and similarity searches as well as many descriptor calculators
  • Cheminformatics nodes for KNIME
  • Contrib folder with useful community-contributed software harnessing the power of the RDKit

GitHub Tutorial GitHub

CREST

CREST was developed as a program for conformational sampling at the extended tight-binding level GFN-xTB. It provides a variety of sampling procedures, for example for improved thermochemistry or solvation. Access the GitHub.

ROBERT

ROBERT is a Python package designed to help inexperienced researchers get started training machine-learning models. It is an ensemble of automated machine-learning protocols that can be run sequentially through a single command line or graphical user interface. The program works for regression and classification problems.

Documentation can be found here.

Publication can be found here.

YADDA

YADDA, or Yet Another Distortion-interaction Decomposition Automation, is a package developed by the Paton group that helps automate distortion-interaction analysis along an IRC, as well as energy decomposition analysis.

Documentation can be found here.
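For readers unfamiliar with the distortion-interaction model, the arithmetic at a single geometry can be sketched as follows. This is a generic illustration of the decomposition, not YADDA's code or API; the function name and all energies are made up, and energies are assumed to be in hartree.

```python
HARTREE_TO_KCAL = 627.5095  # conversion factor, hartree to kcal/mol


def distortion_interaction(e_complex, e_frags_distorted, e_frags_relaxed):
    """Split the energy of a complex (relative to relaxed fragments)
    into per-fragment distortion terms and an interaction term.

    All inputs are in hartree; all outputs are in kcal/mol.
    """
    if len(e_frags_distorted) != len(e_frags_relaxed):
        raise ValueError("Need one relaxed energy per distorted fragment")

    # Cost of deforming each fragment from its relaxed geometry
    distortion = [
        (e_dist - e_rel) * HARTREE_TO_KCAL
        for e_dist, e_rel in zip(e_frags_distorted, e_frags_relaxed)
    ]
    # Energy change when the distorted fragments are brought together
    interaction = (e_complex - sum(e_frags_distorted)) * HARTREE_TO_KCAL
    total = sum(distortion) + interaction
    return {"distortion": distortion, "interaction": interaction, "total": total}


# Made-up energies for a two-fragment transition state (hartree)
result = distortion_interaction(
    e_complex=-310.950,
    e_frags_distorted=[-155.480, -155.460],
    e_frags_relaxed=[-155.500, -155.490],
)
print(result)
```

Repeating the same calculation at each point along an IRC gives distortion and interaction curves as a function of the reaction coordinate.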