End-to-End Workflow in ACME

1. Activating the conda environment

First we start by activating the DL_CPU conda environment. This environment is only on ACME and not in the local machines and it is shared between the groups.

conda activate DL_CPU

Activating this conda environment gives us access to the different software/packages that we might want to use, such as xTB and AQME. Without this step, the computer will behave as though the programs are not installed.

2. Using OpenBabel

We have previously created a .pdb file with gaussian for the pentene in the folder example_workflow in our home folder ( /home/username/ ). Our first step of the workflow will be to go to that folder and convert the .pdb file to a .xyz file that can be read by the xtb software.

cd example_workflow
obabel -ipdb pentene.pdb -oxyz -O*.xyz

Here the -ipdb specifies that the format of the input is pdb

3. Running an XTB calculation

We shouldn't directly run the xtb software because that will make the calculation not to run in one of the compute node but actually run in the node that controls the queue system and keeps everything connected.

As a consecuence our next step is to prepare the SLURM script to submit the calculation to the queue system. Luckily we have available several scripts in acme that automate that process. For setting up an xtb calculation we can use xtbsetup.sh

xtbsetup.sh -i pentene.xyz -c 0 -s ch2cl2

We can also specify the number of processors as well as any extra command line options:

xtbsetup.sh -i pentene.xyz -c 0 -p 1 -a '--opt'

Note

Remember that we can always see how to use a command using either -h or --help i.e. xtbsetup.sh -h

the output file will look like:

(Currently empty if someone runs the workflow it will be very appreciated
if they can provide the generated files to include them as examples)

Now we can proceed to submit the calculation to the queue system

sbatch ./job_xtb.sh

We can check the status of our calculation using the squeue command which will return us something along the lines of:

#Currently empty it would be awesome if someone runs the workflow example and 
#gives the generated files so that they can be included here.

4. Running a CREST calculation

Next, we will perform a conformational search using crest. To submit crest calculations we have the crestsetup.sh script

crestsetup.sh -i xtbopt.xyz -c 0 -p 32

This will create the file job_crest.sh with the contents:

(Currently empty if someone runs the workflow it will be very appreciated
if they can provide the generated files to include them as examples)

Now we submit the calculation the same way that we did with the xtb job.

sbatch job_crest.sh

Note

One way that we can follow how the calculation is going is to see the end of the log file. for crest calculations this file is crest.out and we can use the tail -f crest.out command to display in the terminal the end of the file. To go back to the usual console, use ctrl+c

5. Running a CENSO calculation

Now we are going to prepare a censo calculation. To do so we need to first create a .censorc file in our home directory.

cd ..
touch .censorc

Now we can use our favorite command-line editor to include all the parameters that are required for this program to run. For a detailed explanation please go to url. Here we include an example of its contents:

$CENSO global configuration file: .censorc
$VERSION:1.2.0 

ORCA: /usr/local/ORCA503/
ORCA version: 5.0.3
GFN-xTB: /usr/local/xtb/bin/xtb
CREST: /opt/apps/xtb/crest
mpshift: /path/including/binary/mpshift-binary
escf: /opt/apps/BIOVIA/TURBOMOLE/bin/x86_64-unknown-linux-gnu/escf

#COSMO-RS
ctd = BP_TZVP_C30_1601.ctd cdir = "/software/cluster/COSMOthermX16/COSMOtherm/CTDATA-FILES" ldir = "/software/cluster/COSMOthermX16/COSMOtherm/CTDATA-FILES"
$ENDPROGRAMS

$CRE SORTING SETTINGS:
$GENERAL SETTINGS:
nconf: all                       # ['all', 'number e.g. 10 up to all conformers'] 
charge: 0                        # ['number e.g. 0'] 
unpaired: 0                      # ['number e.g. 0'] 
solvent: gas                     # ['gas', 'acetone', 'acetonitrile', 'aniline', 'benzaldehyde', 'benzene', 'ccl4', '...'] 
prog_rrho: xtb                   # ['xtb'] 
temperature: 298.15              # ['temperature in K e.g. 298.15'] 
trange: [273.15, 378.15, 5]      # ['temperature range [start, end, step]'] 
multitemp: on                    # ['on', 'off'] 
evaluate_rrho: on                # ['on', 'off'] 
consider_sym: on                 # ['on', 'off'] 
bhess: on                        # ['on', 'off'] 
imagthr: automatic               # ['automatic or e.g., -100    # in cm-1'] 
sthr: automatic                  # ['automatic or e.g., 50     # in cm-1'] 
scale: automatic                 # ['automatic or e.g., 1.0 '] 
rmsdbias: off                    # ['on', 'off'] 
sm_rrho: alpb                    # ['alpb', 'gbsa'] 
progress: off                    # ['on', 'off'] 
check: on                        # ['on', 'off'] 
prog: orca                       # ['tm', 'orca'] 
func: r2scan-3c                  # ['b3-lyp', 'b3lyp', 'b3lyp-3c', 'b3lyp-d3', 'b3lyp-d3(0)', 'b3lyp-d4', 'b3lyp-nl', '...'] 
basis: automatic                 # ['automatic', 'def2-TZVP', 'def2-mSVP', 'def2-mSVP', 'def2-mSVP', 'def2-mSVP', '...'] 
maxthreads: 1                    # ['number of threads e.g. 2'] 
omp: 1                           # ['number cores per thread e.g. 4'] 
balance: off                     # ['on', 'off'] 
cosmorsparam: automatic          # ['automatic', '12-fine', '12-normal', '13-fine', '13-normal', '14-fine', '...'] 

$PART0 - CHEAP-PRESCREENING - SETTINGS:
part0: on                        # ['on', 'off']
func0: b97-3c                     # ['b3-lyp', 'b3lyp', 'b3lyp-3c', 'b3lyp-d3', 'b3lyp-d3(0)', 'b3lyp-d4', '...']
basis0: automatic               # ['automatic', 'def2-SV(P)', 'def2-TZVP', 'def2-mSVP', 'def2-mSVP', 'def2-mSVP', '...']
part0_gfnv: gfn2                 # ['gfn1', 'gfn2', 'gfnff']
part0_threshold: 6.0             # ['number e.g. 4.0']

$PART1 - PRESCREENING - SETTINGS:
# func and basis is set under GENERAL SETTINGS
part1: on                        # ['on', 'off']
smgsolv1: cosmo                # ['alpb_gsolv', 'cosmo', 'cosmors', 'cosmors-fine', 'cpcm', 'dcosmors', '...']
part1_gfnv: gfn2                 # ['gfn1', 'gfn2', 'gfnff']
part1_threshold: 4.0             # ['number e.g. 5.0']

$PART2 - OPTIMIZATION - SETTINGS:
# func and basis is set under GENERAL SETTINGS
part2: off                        # ['on', 'off'] 
prog2opt: prog                   # ['tm', 'orca', 'prog', 'automatic'] 
part2_threshold: 2.5             # ['number e.g. 4.0'] 
sm2: smd                         # ['cosmo', 'cpcm', 'dcosmors', 'default', 'smd'] 
smgsolv2: smd                    # ['alpb_gsolv', 'cosmo', 'cosmors', 'cosmors-fine', 'cpcm', 'dcosmors', '...'] 
part2_gfnv: gfn2                 # ['gfn1', 'gfn2', 'gfnff'] 
ancopt: on                       # ['on'] 
hlow: 0.01                       # ['lowest force constant in ANC generation, e.g. 0.01'] 
opt_spearman: on                 # ['on', 'off'] 
part2_P_threshold: 99            # ['Boltzmann sum threshold in %. e.g. 95 (between 1 and 100)'] 
optlevel2: automatic             # ['crude', 'sloppy', 'loose', 'lax', 'normal', 'tight', 'vtight', 'extreme', '...'] 
optcycles: 8                     # ['number e.g. 5 or 10'] 
spearmanthr: -4.0                # ['value between -1 and 1, if outside set automatically'] 
radsize: 10                      # ['number e.g. 8 or 10'] 
crestcheck: off                  # ['on', 'off'] 

$PART3 - REFINEMENT - SETTINGS:
part3: off                       # ['on', 'off'] 
prog3: prog                      # ['tm', 'orca', 'prog'] 
func3: pw6b95                    # ['b3-lyp', 'b3lyp', 'b3lyp-3c', 'b3lyp-d3', 'b3lyp-d3(0)', 'b3lyp-d4', 'b3lyp-nl', '...'] 
basis3: def2-TZVPD               # ['DZ', 'QZV', 'QZVP', 'QZVPP', 'SV(P)', 'SVP', 'TZVP', 'TZVPP', 'aug-cc-pV5Z', '...'] 
smgsolv3: smd                    # ['alpb_gsolv', 'cosmo', 'cosmors', 'cosmors-fine', 'cpcm', 'dcosmors', '...'] 
part3_gfnv: gfn2                 # ['gfn1', 'gfn2', 'gfnff'] 
part3_threshold: 99              # ['Boltzmann sum threshold in %. e.g. 95 (between 1 and 100)'] 

$NMR PROPERTY SETTINGS:
$PART4 SETTINGS:
part4: off                       # ['on', 'off'] 
couplings: on                    # ['on', 'off'] 
progJ: prog                      # ['tm', 'orca', 'prog'] 
funcJ: pbe0                      # ['b3-lyp', 'b3lyp', 'b3lyp-3c', 'b3lyp-d3', 'b3lyp-d3(0)', 'b3lyp-d4', 'b3lyp-nl', '...'] 
basisJ: def2-TZVP                # ['DZ', 'QZV', 'QZVP', 'QZVPP', 'SV(P)', 'SVP', 'TZVP', 'TZVPP', 'aug-cc-pV5Z', '...'] 
sm4J: smd                        # ['cosmo', 'cpcm', 'dcosmors', 'smd'] 
shieldings: on                   # ['on', 'off'] 
progS: prog                      # ['tm', 'orca', 'prog'] 
funcS: pbe0                      # ['b3-lyp', 'b3lyp', 'b3lyp-3c', 'b3lyp-d3', 'b3lyp-d3(0)', 'b3lyp-d4', 'b3lyp-nl', '...'] 
basisS: def2-TZVP                # ['DZ', 'QZV', 'QZVP', 'QZVPP', 'SV(P)', 'SVP', 'TZVP', 'TZVPP', 'aug-cc-pV5Z', '...'] 
sm4S: smd                        # ['cosmo', 'cpcm', 'dcosmors', 'smd'] 
reference_1H: TMS                # ['TMS'] 
reference_13C: TMS               # ['TMS'] 
reference_19F: CFCl3             # ['CFCl3'] 
reference_29Si: TMS              # ['TMS'] 
reference_31P: TMP               # ['TMP', 'PH3'] 
1H_active: on                    # ['on', 'off'] 
13C_active: on                   # ['on', 'off'] 
19F_active: off                  # ['on', 'off'] 
29Si_active: off                 # ['on', 'off'] 
31P_active: off                  # ['on', 'off'] 
resonance_frequency: 300.0       # ['MHz number of your experimental spectrometer setup'] 

$OPTICAL ROTATION PROPERTY SETTINGS:
$PART5 SETTINGS:
optical_rotation: off            # ['on', 'off'] 
funcOR: pbe                      # ['functional for opt_rot e.g. pbe'] 
funcOR_SCF: r2scan-3c            # ['functional for SCF in opt_rot e.g. r2scan-3c'] 
basisOR: def2-SVPD               # ['basis set for opt_rot e.g. def2-SVPD'] 
frequency_optical_rot: [589.0]   # ['list of freq in nm to evaluate opt rot at e.g. [589, 700]'] 
$END CENSORC

We are now ready to generate the submit script for our censo calculation using censosetup.sh

cd example_workflow
censosetup.sh -i crest_conformers.xyz -c 0 -p 4 -o 8

This will create the file job_censo.sh with the contents:

(Currently empty if someone runs the workflow it will be very appreciated
if they can provide the generated files to include them as examples)

Now we submit the calculation to the queue system.

sbatch job_censo.sh

Note

We can follow the calculation status in the log file censo.out

6. Running QM calculations. Gaussian, ORCA, AQME and Goodvibes

After we have generated our conformers using semi-empirical and low cost DFT we now proceed to refine them using the DFT method of our choice with Gaussian. To do so we will use the in-house developed software aqme

python -m aqme --qprep --program gaussian --files enso_ensemble_part2.xyz --qm_input 'm062x def2svp opt freq=noraman' --mem 16GB --nproc 8

This will create a new folder named QCALC with our gaussian inputs.

cd QCALC
# this command will display the full contents of a file in our terminal
cat enso_ensemble_part2_conf_6.com
(Currently empty if someone runs the workflow it will be very appreciated
if they can provide the generated files to include them as examples)

Now to submit all the gausian calculations we have the gsub

gsub *.com -n 8 -q normal

Note

We will be able to follow each one of the calculations progress in their respective .log files.

We now will use another in-house developed software, goodvibes, to check if we have any imaginary frequencies or duplicate geometries.

python -m goodvibes *.log --imag --dup

Finally we are going to run SP calculations using ORCA. We use aqme to generate the input files (That have the extension .inp) and proceed to submit them following the same exact process as with Gaussian calculations

python -m aqme --qprep --program orca --files '*.log' --qm_input 'wb97m-v def2-tzvp' --mem 2GB --nproc 8
cd QCALC/
gsub *.inp -n 8 -q normal

Warning

Be careful of how you write the basis set and functional as sometimes the "spelling" changes across programs and will lead to error terminations. i.e. def2tzvp is valid for Gaussian but for ORCA def2-tzvp is the appropriate "spelling".

Now, we are going to obtain the thermochemistry of our calculations with the single point corrections using goodvibes

First we will rename all ORCA calculations so that we can clearly remember that they are SP calculations. We can do it with a python script or in a python console:

from pathlib import Path
files = list(Path.cwd().glob('*.out'))
# we rename the input and the output files
for ofile in files:
    ifile = ofile.parent/f'{ofile.stem}.inp'
    ifile.rename(f'{ifile.stem}_sp.inp')
    ofile.rename(f'{ofile.stem}_sp.out')

We can actually do this in a single line:

python -c "from pathlib import Path; files = list(Path.cwd().glob('*.out')); for ofile in files: ifile = ofile.parent/f'{ofile.stem}.inp'; ifile.rename(f'{ifile.stem}_sp.inp'); ofile.rename(f'{ofile.stem}_sp.out')"

We can do it using bash:

# It is recommended to run first the following line to ensure that the
# renaming is going to work as expected:
for item in *.out; do echo $item " -> " ${item%.*}_sp.out; done

# After checking that it will do the modifications that we want we proceed to
# rename the files:
for item in *.out; do mv -v $item ${item%.*}_sp.out; done
for item in *.inp; do mv -v $item ${item%.*}_sp.inp; done

Finally we move all the QM outputs calculations to the same folder and run goodvibes

# We first rename all the calculations
# we first move all calculations to the same place
mv *.out ../
cd ..
python -m goodvibes *.log --imag --dup --spc sp

The final step of our example workflow is going to include the drawing of the Potential Energy Surface (PES). To do so we need a file, pes.yaml with the following contents:

--- # PES
​
Me: [CIS-Me, TRANS-Me]
Ph: [CIS-Ph, TRANS-Ph]
2ClPh: [CIS-2ClPh, TRANS-2ClPh]
tBu: [CIS-tBu, TRANS-tBu]
Cy: [CIS-Cy, TRANS-Cy]
Furan: [CIS-Furan, TRANS-Furan]
2MeThiofuran: [CIS-2MeThiofuran, TRANS-2MeThiofuran]
2Me: [CIS-2Me, TRANS-2Me]
4ClPh: [CIS-4ClPh, TRANS-4ClPh] 
Ene: [CIS-Ene, TRANS-Ene]
Nap: [CIS-Nap, TRANS-Nap]
​
​
--- # SPECIES
​
CIS-Me: CIS_MeBoryl_*
TRANS-Me: TRANS_MeBoryl_*
CIS-Ph: CIS_PhBoryl_*
TRANS-Ph: TRANS_PhBoryl_*
CIS-2ClPh: CIS_Ph2ClBoryl_*
TRANS-2ClPh: TRANS_Ph2ClBoryl_*
CIS-tBu: CIS_tBuBoryl_*
TRANS-tBu: TRANS_tBuBoryl_*
CIS-Cy: CIS_CyBoryl_*
TRANS-Cy: TRANS_CyBoryl_*
CIS-Furan: CIS_FuranBoryl_*
TRANS-Furan: TRANS_FuranBoryl_*
CIS-2MeThiofuran: CIS_2MeThiofuranBoryl_*
TRANS-2MeThiofuran: TRANS_2MeThiofuranBoryl_*
CIS-2Me: CIS_Me2Boryl_*
TRANS-2Me: TRANS_Me2Boryl_*
CIS-4ClPh: CIS_4ClPh*
TRANS-4ClPh: TRANS_4ClPh*
CIS-Ene: CIS_Ene*
TRANS-Ene: TRANS_Ene*
CIS-Nap: CIS_Nap*
TRANS-Nap: TRANS_Nap*
​
--- # FORMAT
    dec :  2 
    legend : False
    color : black, #ab3737
    pointlabel : False 
    gridlines: False
    show_conformers: False
    show_gconf: False
    dpi : 400
    title: BR
    ylim : -15,30
​

And now we use goodvibes to draw the PES.

python -m goodvibes *.log --imag --dup --spc sp --pes pes.yaml

PES