Investigating the S22 Dataset: Interaction Energies

Abstract: We investigate the accuracy of simulations on the JSCH-2005 S22 dataset as computed by Qbox and Quantum Espresso. The S22 dataset consists of 22 molecular complexes that have been used as a benchmark set to evaluate the accuracy of approaches such as density functional theory (DFT). The dataset is representative of a variety of biologically important noncovalent interactions, such as hydrogen bonding and dispersion interactions. An extensive evaluation of 40 density functionals on the S22 set can be found in the publication of Yan Zhao and Donald Truhlar, who carried out their DFT calculations using a modified Gaussian 03 program. Their published supporting data tabulates detailed interaction energies calculated for the different functionals. We evaluate the S22 dataset using the planewave pseudopotential codes Qbox and Quantum Espresso, and compare the calculated interaction energies with each other and with those published by Zhao and Truhlar. In addition, we use the post-processing features of the ESTEST framework to present our supporting data.


The S22 database designed by Jurecka et al. is intended to be used as a benchmark to evaluate methods like DFT. It has been used to evaluate the accuracy of different density functionals, as in the published work of Zhao and Truhlar, and to test new methods or improvements to methods and implementations run on a single code.

The focus of this work is to compare the accuracy of existing DFT methods and functionals across different codes, namely Qbox and Quantum Espresso, and to compare against the published results of a non-planewave code. The simulations use the PBE density functional for the purposes of verification, and one Vanderbilt ultrasoft pseudopotential to give additional insight into the relative accuracy of these methods. We show how these comparisons can be easily compiled using the ESTEST framework, and we present our results along with links to supporting data at the level of the original simulation input/output, which can easily be used for further post-processing or verification.

The S22 Data

The S22 data consists of 7 hydrogen-bonded complexes, 8 dispersion-dominated complexes, and 7 mixed complexes. In the order listed below, the hydrogen-bonded complexes are systems 1--7, the dispersion-dominated complexes are systems 8--15, and the mixed complexes are systems 16--22.

Hydrogen-bonded complexes

  1. (NH3)2
  2. (H2O)2
  3. (HCOOH)2
  4. (HCONH2)2
  5. HB uracil dimer
  6. 2-pyridoxine...2-aminopyridine
  7. A...T WC

Dispersion-dominated complexes

  8. Benzene...CH4
  9. (CH4)2
  10. (C2H4)2
  11. PD-Benzene dimer
  12. Pyrazine dimer
  13. Stacked uracil dimer
  14. Stacked indole-benzene
  15. Stacked A...T

Mixed complexes

  16. Ethene...Ethyne
  17. Benzene...H2O
  18. Benzene...NH3
  19. Benzene...HCN
  20. T indole...benzene
  21. T benzene dimer
  22. Phenol dimer


For each of the 22 systems we ran SCF calculations to self-consistency to determine the total energy of the complex. In addition, for dimer systems we calculated the total energy of each monomer separately, and similar considerations were taken for complexes with two distinct molecular components. Thus a total of 66 simulations (22 complexes plus 44 separate components) were executed with each code, Qbox and Quantum Espresso, to calculate the interaction energies of the S22 data.
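The bookkeeping described above can be sketched in a few lines of Python. This is a minimal illustration, not the post-processing actually used with ESTEST: the function name, the sign convention (a negative value means the complex is bound) and the numerical energies are assumptions made for the example, and the conversion factor is the commonly quoted approximate value of 627.509 kcal/mol per hartree.

```python
# Sketch: interaction energy from three total-energy calculations.
# All names and numbers here are illustrative assumptions.

HARTREE_TO_KCAL_PER_MOL = 627.509  # approximate conversion factor


def interaction_energy(e_complex, e_component_a, e_component_b):
    """Return E(complex) - E(A) - E(B) in kcal/mol.

    Inputs are total energies in hartree. With this (assumed) sign
    convention a negative result means the complex is bound; some
    publications report the opposite sign as a binding energy.
    """
    delta_hartree = e_complex - e_component_a - e_component_b
    return delta_hartree * HARTREE_TO_KCAL_PER_MOL


# One complex plus its two separately computed components gives
# three runs per system, hence 66 runs per code for 22 systems.
n_systems = 22
runs_per_code = n_systems * 3

# Placeholder total energies in hartree (NOT results from this work).
example_ie = interaction_energy(-34.5032, -17.2500, -17.2500)
print(f"runs per code: {runs_per_code}")
print(f"example interaction energy: {example_ie:.2f} kcal/mol")
```

If a code reports total energies in a different unit, the values would need to be converted to hartree before being passed to this function.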

The pseudopotential used for Qbox is a PBE pseudopotential generated using the method of Hamann, Schluter and Chiang, as modified by Vanderbilt, and is available for download at the website. A translation of this potential was also used for the Quantum Espresso simulations, providing a set of comparisons for validation purposes. We also simulated the S22 data with Quantum Espresso using a Vanderbilt ultrasoft pseudopotential for additional comparisons. Each of these planewave simulations was carried out in a simple cubic supercell.


The computed interaction energies of the S22 data are tabulated below.

Table 1. Interaction energies (kcal/mol) computed using Qbox with the PBE pseudopotential

Table 2. Interaction energies (kcal/mol) computed using Quantum Espresso with the PBE pseudopotential

Table 3. Interaction energies (kcal/mol) computed using Quantum Espresso with the Vanderbilt ultrasoft pseudopotential


System                   6     7     8     11    14    15    20    21    22
Δ = |IE_Qbox - IE_QE|    0.02  0.02  0.02  0.01  0.01  0.01  0.03  0.03  0.03

Table 4. Absolute difference (kcal/mol), for selected systems, between the interaction energies computed by Qbox and by Quantum Espresso using the PBE pseudopotential

In terms of validation, there is very good agreement between simulations run with Qbox and Quantum Espresso using the same parameters and equivalent pseudopotentials. Table 4 lists the absolute differences in interaction energy between the Qbox and Quantum Espresso calculations for selected systems; the largest difference is only three hundredths of one kcal/mol, or about 5 x 10^-5 hartree. The systems not listed in Table 4 had differences of less than 0.01 kcal/mol.
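The unit conversion quoted above can be checked with a short Python sketch. The dictionary simply restates the values of Table 4; the variable names and the approximate conversion factor of 627.509 kcal/mol per hartree are assumptions of this example.

```python
# Sketch: largest Qbox vs. Quantum Espresso difference from Table 4,
# converted from kcal/mol to hartree.

HARTREE_TO_KCAL_PER_MOL = 627.509  # approximate conversion factor

# System index -> |IE_Qbox - IE_QE| in kcal/mol, copied from Table 4.
table4 = {
    6: 0.02, 7: 0.02, 8: 0.02,
    11: 0.01, 14: 0.01, 15: 0.01,
    20: 0.03, 21: 0.03, 22: 0.03,
}

largest_kcal = max(table4.values())
largest_hartree = largest_kcal / HARTREE_TO_KCAL_PER_MOL

print(f"{largest_kcal:.2f} kcal/mol = {largest_hartree:.1e} hartree")
# prints: 0.03 kcal/mol = 4.8e-05 hartree
```

The result, roughly 4.8 x 10^-5 hartree, is consistent with the rounded figure given in the text.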



A. Links to interaction energy tables with supporting data references

B. Links to energy component tables with supporting data references