# The Dalton quantum chemistry program system

The authors have declared no conflicts of interest in relation to this article.

## Abstract

Dalton is a powerful general-purpose program system for the study of molecular electronic structure at the Hartree–Fock, Kohn–Sham, multiconfigurational self-consistent-field, Møller–Plesset, configuration-interaction, and coupled-cluster levels of theory. Apart from the total energy, a wide variety of molecular properties may be calculated using these electronic-structure models. Molecular gradients and Hessians are available for geometry optimizations, molecular dynamics, and vibrational studies, whereas magnetic resonance and optical activity can be studied in a gauge-origin-invariant manner. Frequency-dependent molecular properties can be calculated using linear, quadratic, and cubic response theory. A large number of singlet and triplet perturbation operators are available for the study of one-, two-, and three-photon processes. Environmental effects may be included using various dielectric-medium and quantum-mechanics/molecular-mechanics models. Large molecules may be studied using linear-scaling and massively parallel algorithms. Dalton is distributed at no cost from http://www.daltonprogram.org for a number of UNIX platforms.

This article is categorized under:

- Software > Quantum Chemistry

## INTRODUCTION

Dalton is a general-purpose program system for advanced quantum-chemical molecular electronic-structure calculations, distributed under a license agreement at no cost to the user. With Dalton, molecular systems may be studied using a variety of electronic-structure methods, including the Hartree–Fock (HF) and Kohn–Sham (KS) self-consistent-field (SCF) methods for wide applicability, the multiconfigurational SCF (MCSCF) method for high flexibility, and various coupled-cluster (CC) methods for high accuracy. At all these levels of theory, a wealth of molecular properties may be calculated, enabling the user of the program to study, for example, molecular structure, energetics, reactivity, spectroscopic parameters, and linear and nonlinear optical processes. Small systems may be accurately benchmarked using full configuration-interaction (FCI) techniques. Environmental effects may be incorporated at different levels of theory. For some electronic-structure models, large molecules can be studied using linear-scaling and massively parallel algorithms. In the present paper, we give an overview of the Dalton program system, with illustrations and emphasis on the Dalton2013 release.

## ELECTRONIC-STRUCTURE MODELS

With the Dalton program, the electronic structure of a molecule can be described using all standard nonrelativistic wave-function and density-functional models of modern quantum chemistry. In particular, the wave function may be calculated using HF, MCSCF, configuration-interaction (CI), and CC theories. Using density-functional theory (DFT), calculations may be carried out with a range of KS exchange–correlation functionals. Dalton calculations are performed with generally contracted Gaussian-type orbitals (GTOs) with solid-harmonic or Cartesian angular factors as one-electron basis functions; some models also use two-electron functions for explicit correlation.

### HF and KS SCF Theories

In Dalton, SCF calculations may be performed using a variety of optimization techniques. Apart from the traditional iterative Roothaan–Hall diagonalization method with direct-inversion-in-the-iterative-subspace (DIIS) convergence acceleration, the SCF energy may be optimized using a robust second-order trust-region method.1 All calculations may be carried out in either serial or parallel manner. SCF calculations may also be carried out using the linear-scaling module discussed later.
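For orientation, the DIIS acceleration mentioned above can be summarized in its common textbook (Pulay) form. This is a generic sketch rather than a description of the Dalton code; the symbols $\mathbf{F}_i$ (Fock/KS matrix of iteration $i$), $\mathbf{D}_i$ (density matrix), $\mathbf{S}$ (overlap matrix), and $c_i$ (extrapolation coefficients) are notation introduced here.

$$
\mathbf{e}_i = \mathbf{F}_i \mathbf{D}_i \mathbf{S} - \mathbf{S} \mathbf{D}_i \mathbf{F}_i, \qquad
\min_{\{c_i\}} \Bigl\| \sum_{i=1}^{n} c_i \mathbf{e}_i \Bigr\|^2 \quad \text{subject to} \quad \sum_{i=1}^{n} c_i = 1, \qquad
\bar{\mathbf{F}} = \sum_{i=1}^{n} c_i \mathbf{F}_i .
$$

The error vectors $\mathbf{e}_i$ vanish at convergence, so the extrapolated matrix $\bar{\mathbf{F}}$ that is diagonalized in the next Roothaan–Hall step is the combination of stored matrices with the smallest estimated residual.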

With Dalton, KS studies may be performed with a variety of KS exchange–correlation functionals, including local-density-approximation (LDA) functionals, generalized-gradient-approximation (GGA) functionals such as BLYP and PBE, global hybrid functionals such as B3LYP and PBE0, and range-separated hybrid functionals such as CAM-B3LYP and r-CAM-B3LYP. Additional functionals may be obtained by combining the included functionals for exchange and correlation in new ways. Double-hybrid functionals such as B2PLYP and MPW2PLYP, with a second-order perturbation contribution, are available. For the GGA and hybrid functionals, the exchange–correlation potentials may be asymptotically corrected to ensure a proper long-range behavior, important for the study of Rydberg states using response theory.2 Empirical DFT-D2 and DFT-D3 dispersion corrections may be applied. Dalton offers an implementation of spin-restricted and spin-unrestricted DFT for open-shell states.3, 4
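As a reminder of what distinguishes the range-separated hybrids listed above, the Coulomb-attenuating partition of the electron–electron interaction is usually written as follows. This is the standard literature form, given here as a sketch; the parameter names $\alpha$, $\beta$, and $\mu$ are the conventional ones and are not taken from the Dalton input syntax.

$$
\frac{1}{r_{12}} = \frac{1 - \bigl[\alpha + \beta\, \mathrm{erf}(\mu r_{12})\bigr]}{r_{12}} + \frac{\alpha + \beta\, \mathrm{erf}(\mu r_{12})}{r_{12}}
$$

The first term is treated with density-functional exchange and the second with exact (HF) exchange, so that the fraction of exact exchange grows from $\alpha$ at short range to $\alpha + \beta$ at long range. A global hybrid corresponds to the special case $\beta = 0$.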

### MCSCF Theory

For complicated electronic ground states characterized by static correlation and for valence-excited and core–hole states, MCSCF theory often provides the best solution. Dalton is characterized by an advanced MCSCF functionality with respect to both the construction of the MCSCF wave function and its optimization.5-8 MCSCF ground- and excited-state wave functions can be constructed using the flexible concept of a generalized active space (GAS),9 including the complete-active-space (CAS) and restricted-active-space (RAS)10 models as special cases. Dalton uses robust and efficient second-order MCSCF optimization techniques, based on the concept of a trust region.

To recover dynamic as well as static correlation, MCSCF theory may be combined with *N*-electron valence state second-order perturbation theory (NEVPT2).11 The NEVPT2 approach is similar to second-order CAS perturbation theory (CASPT2) but it is not affected by intruder-state problems.12, 13 All MCSCF and post-MCSCF multireference-CI calculations in Dalton are performed using the parallelized LUCITA module14, 15 with the wave function expanded either in configuration-state functions or in Slater determinants. As an important special case, FCI wave functions for benchmarking may be calculated using LUCITA.

### CC Theory

For dynamical correlation, a variety of (spin-restricted) CC-based models have been implemented in Dalton: the coupled-cluster-singles (CCS) model, second-order-Møller–Plesset (MP2) theory, degeneracy-corrected second-order perturbation theory (DCPT2), the coupled-cluster-doubles (CCD) model, the coupled-cluster-singles-and-doubles (CCSD) model,16, 17 and the coupled-cluster-singles-doubles-perturbative-triples (CCSD(T)) model. Several ring-CCD and random-phase-approximation (RPA) models are also available,18 including RPA with second-order screened exchange (RPA+SOSEX).19 Frequency-dependent ground- and excited-state properties can be studied using several specially developed models.20, 21 In particular, Dalton contains the iterative CC222 and CC323 models for frequency-dependent properties and the noniterative CCSDR(3) triples-correction model24 for excitation energies.

For faster basis-set convergence, the explicitly correlated MP2-R12/-F12, CCSD(T)(F12), and CC3(F12) models are available in Dalton.25-27 For higher efficiency, Cholesky decomposition techniques have been implemented for the CCSD(T) energy28 and for CC2 linear response properties.29 Atomic subsystems can be defined by Cholesky decomposition to reduce the computational requirements of the CC models.30

### Relativistic Corrections

Although Dalton is a nonrelativistic electronic-structure code based on the Schrödinger equation, it also has some functionality for incorporating the effects of relativity—in particular, effective core potentials may be used for energy and response functions.31 Relativistic effects may also be included using Douglas–Kroll–Hess second-order one-electron scalar integrals.31 When molecular response properties are calculated, the Breit–Pauli one- and two-electron spin–orbit operators may be included perturbatively.32-34 Dalton offers scaled spin–orbit integrals and the atomic-mean-field approximation to the spin–orbit integrals35 and parity-violation integrals.36

## MOLECULAR PROPERTIES

Dalton is a versatile tool for studying molecular properties. By response theory, the linear, quadratic, and cubic response of a molecule to external perturbations, including geometrical distortions and electromagnetic fields, may be explored in detail and a wide variety of spectroscopic parameters can be calculated. In the following, the molecular properties implemented in Dalton are described.

### Geometrical Properties

Molecular gradients (forces) and Hessians (force constants) may be calculated analytically at the HF, KS, and MCSCF37, 38 levels of theory, whereas analytical molecular gradients are available for the CCS, CC2, CCD, MP2, RPA, CCSD, and CCSD(T) models,18, 39 also in the frozen-core approximation. Numerical differentiation may be performed automatically for these quantities when an analytical implementation is not available.40

Dalton has an extensive functionality for exploring potential-energy surfaces by means of molecular gradients and Hessians. Minima can be determined using first-order (quasi-Newton) and second-order (Newton) methods.41 Dalton also has several first- and second-order methods for determining saddle points (transition states), including mode following42 and image minimization.43 Geometry optimizations may be performed for excited states44, 45 and core-ionized states.46 Potential-energy surfaces can be mapped by the calculation of intrinsic reaction coordinates. Born–Oppenheimer direct-dynamics studies can be performed by calculating classical molecular trajectories on the potential-energy surfaces, iteratively solving Newton's equation of motion for the atoms in the system, allowing dynamical studies of systems containing more than one hundred atoms.47, 48
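The idea of generating a classical trajectory by iteratively solving Newton's equation of motion can be illustrated with a minimal sketch. The velocity-Verlet integrator and the one-dimensional harmonic model surface used below are assumptions chosen for illustration; the text does not state which integrator Dalton uses, and in a real direct-dynamics run the force would come from an electronic-structure gradient at each step.

```python
import math


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Propagate one classical degree of freedom with the velocity-Verlet scheme."""
    trajectory = [(x, v)]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # new force (the expensive step in practice)
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
        trajectory.append((x, v))
    return trajectory


# Hypothetical model surface: V(x) = 0.5 * K * x**2 in arbitrary units.
K = 1.0
MASS = 1.0
DT = 0.01
N_STEPS = 1000


def harmonic_force(x):
    """Force on the model surface, F = -dV/dx."""
    return -K * x


def total_energy(x, v):
    """Kinetic plus potential energy on the model surface."""
    return 0.5 * MASS * v * v + 0.5 * K * x * x


trajectory = velocity_verlet(1.0, 0.0, harmonic_force, MASS, DT, N_STEPS)
energies = [total_energy(x, v) for x, v in trajectory]
energy_drift = max(energies) - min(energies)

# Analytic reference for the harmonic oscillator started at x = 1, v = 0.
x_exact = math.cos(math.sqrt(K / MASS) * DT * N_STEPS)
x_final = trajectory[-1][0]
```

For this model the total energy fluctuates only slightly along the trajectory and the final position stays close to the analytic solution, which is the behaviour one looks for when choosing a time step.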

Dalton may be used to calculate molecular vibrational spectra. In addition to harmonic frequencies, infrared intensities are obtained in the double-harmonic approximation49 and Raman intensities at a chosen frequency in the Placzek approximation. Dalton has an automated procedure for calculating rovibrationally averaged molecular geometries50 and vibrational averages for molecular properties at the HF, KS, and MCSCF levels of theory.40, 51

### Magnetic Properties

Nuclear shielding and indirect nuclear spin–spin coupling constants of nuclear-magnetic-resonance (NMR) spectroscopy may be studied at many levels of theory, including DFT,52-54 HF, and MCSCF theory,55, 56 the second-order polarization-propagator approximation (SOPPA),57 and CC theory; for an illustration, see the KS calculation of spin–spin coupling constants in valinomycin in Figure 1. Likewise, the hyperfine coupling tensors,58, 59 electronic *g* tensors,60 and zero-field splitting tensors61 of electron-paramagnetic-resonance (EPR) spectroscopy may be calculated. A number of related properties, such as spin–rotation tensors,62 magnetizabilities,63 and rotational62 and vibrational64 *g* tensors, are available. In addition, several chiroptical properties may be studied: vibrational circular dichroism,65 electronic circular dichroism,66 optical rotation,67 and vibrational Raman optical activity.68

In all calculations involving an external magnetic field, gauge-origin invariance is ensured by the use of London atomic orbitals, also known as gauge-including atomic orbitals. Alternatively, invariance may be ensured by the diamagnetic-zero version of the method of continuous transformation of the origin of the current density (CTOCD-DZ).69
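For reference, a London atomic orbital is conventionally defined as an ordinary basis function multiplied by a field-dependent phase factor. The expression below is the standard textbook form in atomic units; the symbols $\chi_\mu$ (a GTO centered at $\mathbf{R}_\mu$), $\mathbf{B}$ (the external magnetic field), and $\mathbf{O}$ (the gauge origin) are notation introduced here.

$$
\omega_\mu(\mathbf{r}; \mathbf{B}) = \exp\!\left[-\tfrac{i}{2}\, \mathbf{B} \times (\mathbf{R}_\mu - \mathbf{O}) \cdot \mathbf{r}\right] \chi_\mu(\mathbf{r})
$$

The phase factor effectively moves the gauge origin of each basis function to its own center $\mathbf{R}_\mu$, which is why the calculated magnetic properties no longer depend on the arbitrary choice of $\mathbf{O}$.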

### Frequency-Dependent Properties

Frequency-dependent linear response functions may be calculated at the HF, KS,70 SOPPA,71 and MCSCF levels of theory; at most of these levels of theory, quadratic and cubic response functions are also available. These functions contain a wealth of information not only about the reference state but also about other states of the molecular electronic system.72 For example, the poles of a linear response function (such as the dipole–dipole polarizability tensor) correspond to excitation energies from the reference state (not necessarily the ground state), whereas the corresponding residues represent transition moments between the reference state and the excited states. Quadratic response functions73 such as the first hyperpolarizability tensor provide information about nonlinear molecular response and two-photon processes—for example, coherent two-photon absorption74, 75 and two-photon absorption circular dichroism.76 By means of quadratic response theory, phosphorescence phenomena such as spin-forbidden dipole transitions induced by spin–orbit coupling can be studied.55, 77 Cubic response functions78-80 such as the second hyperpolarizability tensor give information about three-photon processes81 and excited-state polarizabilities,82 various higher-order spectroscopies such as second harmonic generation circular intensity difference,83 as well as a wide range of birefringences. Many of these properties may be studied in the presence of an externally applied electric field.
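The statement above about the poles and residues of the linear response function can be made explicit with the exact-state spectral representation. This is the common textbook form in atomic units; sign and normalization conventions differ between texts, and the notation ($|0\rangle$ for the reference state, $|n\rangle$ for the other states, $\omega_n = E_n - E_0$) is introduced here.

$$
\langle\langle A; B \rangle\rangle_\omega = \sum_{n \neq 0} \left[ \frac{\langle 0 | A | n \rangle \langle n | B | 0 \rangle}{\omega - \omega_n} - \frac{\langle 0 | B | n \rangle \langle n | A | 0 \rangle}{\omega + \omega_n} \right]
$$

The function has poles at $\omega = \pm\omega_n$, and the residue at $\omega = \omega_n$ is the product of transition moments $\langle 0 | A | n \rangle \langle n | B | 0 \rangle$. In this convention, the dipole–dipole polarizability is $\alpha(\omega) = -\langle\langle \mu; \mu \rangle\rangle_\omega$.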

Several response functions have been implemented for the CCS, CC2, and CCSD models, including ground-state expectation values,84 excitation energies,85 linear response functions,86 quadratic response functions,87 cubic response functions,88 and their residues.86, 89 For the CC3 model, response properties including excitation energies have been implemented.85, 90 The double residues of the quadratic and cubic response functions allow first- and second-order properties of excited states—excited-state dipole moments and polarizabilities,91 for example—to be obtained even for single-reference methods.

Dalton contains implementations of special methodologies for the calculation of a variety of X-ray spectroscopies, such as X-ray emission, absorption,92 shake-up,93 X-ray Raman,94 and X-ray circular dichroism95 spectroscopies. For HF and KS theories, this is accomplished with the ‘static exchange’ approximation92 and the RPA-restricted channel technology, both of which are implemented with integral-direct methods, for applications to large systems such as polymers and surface adsorbates.

In standard response theory, the response functions become divergent under resonance conditions. Such divergences are unphysical, arising from the assumption of infinite lifetimes of excited states. In reality, relaxation mechanisms deplete the excited-state populations, making perturbation theory sound also under these conditions. Using the Ehrenfest theorem, an equation of motion for state vectors that mimics the inclusion of relaxation in density-matrix theory by the Liouville equation has been established.96 The resulting complex-polarization-propagator (CPP) approach has been implemented at the HF, KS, MCSCF, and CC levels of theory.96-100 The CPP response functions are resonance convergent and complex valued, providing a direct means of addressing spectroscopies carried out in regions of electronic resonances such as electronic circular dichroism,101 resonance Raman,102 magnetic circular dichroism,103 and vis/UV/X-ray absorption98, 104 spectroscopies. In Figure 2, we have plotted the NEXAFS linear absorption cross-section of Gd acetate nanoparticles, extracted from the imaginary part of the KS CPP electric-dipole polarizability.105 This calculation addresses the spectral region of the carbon K-edge, demonstrating the universal treatment of the spectrum in CPP theory.
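The effect of the damping can be seen in a small numerical sketch. The code below evaluates a sum-over-states model of the damped polarizability and the absorption cross-section obtained from its imaginary part, $\sigma(\omega) = (4\pi\omega/c)\,\mathrm{Im}\,\alpha(\omega)$ in atomic units. The two-state model, its excitation energies and transition strengths, and the damping parameter are hypothetical values chosen for illustration; Dalton solves damped response equations directly rather than summing over states.

```python
import math

C_AU = 137.035999  # speed of light in atomic units


def damped_polarizability(omega, excitations, gamma):
    """Model polarizability with a common damping parameter gamma.

    excitations is a list of (excitation_energy, squared_transition_moment)
    pairs; all quantities are in atomic units.
    """
    alpha = 0j
    for omega_n, mu_squared in excitations:
        alpha += mu_squared * (
            1.0 / (omega_n - omega - 1j * gamma)
            + 1.0 / (omega_n + omega + 1j * gamma)
        )
    return alpha


def absorption_cross_section(omega, excitations, gamma):
    """Linear absorption cross-section from the imaginary part of alpha."""
    alpha = damped_polarizability(omega, excitations, gamma)
    return 4.0 * math.pi * omega / C_AU * alpha.imag


# Hypothetical model: two excited states and a damping of 0.005 a.u.
excitations = [(0.30, 1.0), (0.45, 0.5)]
gamma = 0.005

omegas = [0.20 + 0.001 * i for i in range(351)]
spectrum = [absorption_cross_section(w, excitations, gamma) for w in omegas]
peak_omega = omegas[spectrum.index(max(spectrum))]

# Exactly on resonance the damped response stays finite and complex.
alpha_on_resonance = damped_polarizability(0.30, excitations, gamma)
```

With `gamma` set to zero, the first term would divide by zero at `omega = 0.30`; with a finite `gamma`, the same point gives a finite Lorentzian peak whose width is set by the damping parameter.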

### Environmental Effects

In Dalton, environmental effects may be included in quantum-chemical calculations either by treating the environment as a homogeneous dielectric continuum or by describing it at the level of polarizable molecular mechanics. In the simplest continuum approach, the self-consistent-reaction-field (SCRF) approach,106-108 the solute is placed in a spherical cavity in the dielectric medium. The SCRF method has been implemented for the HF, KS, CC2, CCSD, MCSCF, and SOPPA models and may be applied to linear, quadratic, and cubic response functions (up to quadratic response functions for the CC models) as well as to geometrical properties (molecular gradients and Hessians) and spin-resonance properties.109-114 In the more elaborate polarizable-continuum model (PCM),115 the solute is placed in a molecule-shaped cavity, thereby improving the solute–solvent description. Solvent effects at the level of PCM may be evaluated using the HF, KS, or MCSCF models and are available for properties up to the level of cubic response theory116-118 and for molecular gradients. The SCRF and PCM solvation models have both been implemented in a general nonequilibrium formalism.
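To give a feel for the spherical-cavity picture, the sketch below evaluates the textbook Kirkwood/Onsager reaction-field factor for a multipole of order `l` in a cavity of radius `radius` surrounded by a medium of dielectric constant `epsilon`. The functional form is the standard one for a fixed charge distribution in atomic units; that it matches the detailed working equations in Dalton is an assumption, and the function names and the numerical values are hypothetical. The self-consistent coupling to the wave function is not included.

```python
def reaction_field_factor(l, epsilon, radius):
    """Kirkwood reaction-field factor for a multipole of order l (atomic units)."""
    return (l + 1) * (epsilon - 1.0) / (l + epsilon * (l + 1)) / radius ** (2 * l + 1)


def dipole_solvation_energy(dipole, epsilon, radius):
    """Onsager-type energy of a fixed point dipole: -1/2 * g_1 * mu**2."""
    return -0.5 * reaction_field_factor(1, epsilon, radius) * dipole ** 2


# Hypothetical solute: dipole of 1.0 a.u. in a cavity of radius 4.0 bohr.
EPS_WATER = 78.39
energy_in_water = dipole_solvation_energy(1.0, EPS_WATER, 4.0)
energy_in_vacuum = dipole_solvation_energy(1.0, 1.0, 4.0)
```

For `l = 0` the factor reduces to the Born expression for a charge, and for `l = 1` to the Onsager expression for a dipole; the factor vanishes for `epsilon = 1`, so no stabilization is obtained in vacuum.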

Using the polarizable-embedding (PE) scheme,119-125 the environment may be treated as a structured and polarizable medium; that is, the discrete nature of the environment is retained and is described by distributed multipoles and anisotropic dipole–dipole polarizabilities. The PE method may be applied at the HF, KS, MP2, SOPPA, CCSD, or CC2 levels of theory and properties up to linear response may be evaluated. In addition, quadratic-response properties are available at the HF and KS levels.

Figure 3 illustrates the calculation of biophotonics properties with Dalton. The left insert shows an X-ray structure of channelrhodopsin, highlighting the chromophore embedded in the binding pocket of the protein. Using the PE method, the effect of the protein on the chromophore is described by representing the protein by distributed multipoles and (anisotropic) polarizabilities, retaining a quantum-mechanical description of the chromophore itself. The calculated embedding potential is highly anisotropic, significantly affecting the optical properties of the chromophore as illustrated by the one- and two-photon absorption spectra calculated at different levels of theory: by neglecting the protein environment entirely (‘vacuum’), by adjusting the chromophore geometry to that inside the protein (‘geometry’), by including the electrostatics but neglecting protein relaxation (‘electrostatics’), and by including both electrostatics and protein relaxation (‘polarization’).

## LARGE MOLECULES

For large molecules, containing a thousand atoms or more, Dalton contains a massively parallel, linear-scaling module for the HF, KS, and MP2 electronic-structure models; CCSD energies are also available in a massively parallel but not linear-scaling manner. For large molecules, it is essential to employ stable energy-optimization schemes so as to avoid divergence or convergence to spurious solutions, which is achieved by combining the robust three-level/augmented Roothaan–Hall method with a line-search trust-region technique.126-128 Fock/KS matrices and integrals are calculated by combining *J*-engine and density-fitting techniques with linear-scaling techniques such as the continuous fast multipole method (CFMM)129, 130 and the linear-scaling exchange (LinK) method.131, 132 This new SCF module is efficient also for medium-sized molecules, often reducing CPU time by an order of magnitude or more relative to optimizations carried out with the standard module, enabling, for example, studies of dynamics for systems containing more than 100 atoms.48

To illustrate the capabilities of Dalton for large molecules, we consider two systems—namely, the 168-atom valinomycin molecule and the 392-atom titin-I27SS model,38 see Figure 4. For valinomycin, geometry optimizations were conducted at the BP86/6-31G** and CAM-B3LYP/6-31G** levels of theory. Using 64 cores on eight 2.60 GHz Intel Xeon E5-2670 nodes, the geometry optimizations converged in 33 and 30 geometry steps, respectively. These optimizations took a total of 67 and 219 min, with an average of 2.0 and 7.3 min per geometry step, respectively. For the titin model, a single geometry step takes 25 min at the CAM-B3LYP/6-31G** level of theory.
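The per-step averages quoted above follow directly from the total wall times and step counts. The short check below reproduces them; the dictionary layout and variable names are chosen here for illustration, and only the numbers are taken from the text.

```python
# Total wall time (minutes) and number of geometry steps for valinomycin,
# as quoted in the text for the two levels of theory.
runs = {
    "BP86/6-31G**": {"total_min": 67, "steps": 33},
    "CAM-B3LYP/6-31G**": {"total_min": 219, "steps": 30},
}

per_step = {label: run["total_min"] / run["steps"] for label, run in runs.items()}

for label, minutes in per_step.items():
    print(f"{label}: {minutes:.1f} min per geometry step")
```

The ratio of the two averages shows that a geometry step with the range-separated hybrid functional costs between three and four times as much as a step with the GGA functional for this system.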

Dalton has linear-scaling HF and KS modules for a number of molecular properties,133, 134 including polarizabilities, excitation energies, one- and two-photon absorption spectra, magnetic-circular-dichroism parameters,135, 136 and NMR shielding constants. Excited-state geometry optimizations are available.45 It is also possible to generate various absorption spectra using damped response theory.100, 137

To recover dynamical correlation in large molecules, Dalton utilizes a novel algorithm for generating highly local HF orbitals.138, 139 Such orbitals constitute the basis for the linear-scaling, massively parallel MP2 implementation in Dalton, which exploits the inherent locality of dynamical correlation in the Divide–Expand–Consolidate (DEC) strategy.140-142

As an illustration, we consider the calculation of the MP2 electrostatic potential of insulin. The HF orbitals were localized by minimizing the sum of the second powers of the orbital variances, generating a set of local occupied orbitals and a set of local virtual orbitals.138 Even the least local orbitals from this set, which are plotted in Figure 5 (left), are localized to small regions of the insulin molecule. The use of these local orbitals allows the inherently local electron correlation effects to be described efficiently using the DEC scheme. In Figure 5 (right), the DEC scheme was used to calculate the MP2 electrostatic potential for insulin, which may be used to identify areas of reactivity.
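The localization criterion described above, the sum of the second powers of the orbital variances, can be written compactly as follows. The notation ($\xi$ for the localization function, $\sigma_i^2$ for the variance of orbital $i$) is introduced here as a sketch of the criterion as stated in the text.

$$
\xi = \sum_{i} \left(\sigma_i^2\right)^2, \qquad \sigma_i^2 = \langle i | \mathbf{r}^2 | i \rangle - \langle i | \mathbf{r} | i \rangle^2
$$

The function is minimized with respect to unitary rotations within the occupied orbital space and, separately, within the virtual orbital space. Squaring the variances penalizes the most delocalized orbitals more strongly than a plain sum of variances would, which is consistent with the observation that even the least local orbitals remain confined to small regions.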

## DALTON HISTORY

The development that led to Dalton began in the early 1980s, as a collaboration between Scandinavian research groups at the Universities of Aarhus (H. J. Aa. Jensen, P. Jørgensen, J. Olsen), Oslo (T. Helgaker), and Uppsala (H. Ågren). Up to the mid 1990s, several programs that later became the Dalton code were run separately from a simple Bourne shell script. Molecular integrals over GTOs were calculated using the HERMIT code,143 HF and MCSCF wave functions were optimized using the SIRIUS code,1 whereas frequency-independent and -dependent molecular properties were calculated using the ABACUS144 and RESPONS145 codes, respectively.

The separate HERMIT, SIRIUS, ABACUS, and RESPONS codes were merged in 1995 and released as **Dalton 1.0** in 1997. This first release was essentially an MCSCF code with emphasis on molecular properties, including molecular gradients, Hessians, NMR properties, and linear, quadratic, and cubic response theory. **Dalton 1.1** was released later in 1997, with bug fixes and performance improvements.

The next major release was **Dalton 1.2** in 2001, which included integral-direct CC functionality that had been developed since the early 1990s16 and properties up to cubic response. Other important additions were SOPPA, nonequilibrium solvation, and vibrational averaging of molecular properties. The release included a new integral module ERI, developed for vector computers and integral-driven CC theory.17

In 2005, **Dalton 2.0** was released with a DFT module with functionality for molecular gradients and Hessians, NMR and EPR parameters, and linear and quadratic response properties, based on an initial DFT implementation from 2000.52 In addition, Dalton 2.0 included NEVPT2 and MP2–R12 modules and numerous other additions and improvements.

The **Dalton2011** release introduced a new module for linear-scaling HF and KS calculations of large systems, using density fitting and CFMM techniques. In addition, Dalton2011 introduced a variety of additions and improvements to existing modules (CC3 and CCSD-R12), Cholesky techniques in the CC module,146 range-separated functionals,147 an excitation-energy diagnostic for DFT,148 and an atomic integral-direct implementation of the SOPPA149 and RPA(D)150 models. For the treatment of environmental effects, the PCM module was introduced.

In the **Dalton2013** release, the linear-scaling module has been extended significantly to incorporate local orbitals, the DEC-MP2 model, magnetic properties for HF and KS theories, and molecular dynamics. New massively parallel techniques have been introduced for SCF, DEC-MP2, and CCSD energies. Parallel MCSCF techniques have been introduced, extending the limit for MCSCF configuration-space sizes by at least an order of magnitude. Spin-multiplet151 and spin-flip152 density-functional response-theory methodologies, based on collinear and noncollinear exchange–correlation kernels, have been introduced, allowing computations of excited states with double-excitation character and various low-spin ground states.

## PROGRAM STRUCTURE

### Program Modules

Dalton consists of several modules, developed more or less independently. The HERMIT module calculates not only the integrals needed for energies but also those needed for a large number of molecular properties, including all one- and two-electron integrals of the Breit–Pauli Hamiltonian. The ERI module is a vectorized and distribution-oriented integral generator that is invoked in certain calculations, in particular in integral-direct CC calculations.

The SIRIUS module contains the (MC)SCF energy optimization code, whereas the CC module performs CC optimizations and property calculations. The LUCITA module performs large-scale CI calculations for general CI expansions and serves the MCSCF module with parallel evaluation of configuration vectors and density matrices. The DFT module performs the numerical exchange–correlation integration and serves the SIRIUS module with the required KS matrix elements.

The ABACUS module evaluates second-order properties for the (MC)SCF models, in particular second-order static molecular properties in which the basis set depends on the applied perturbation. The RESPONS module is a general-purpose code for evaluating response functions: up to cubic response for (MC)SCF wave functions, quadratic and cubic response for KS theory, and linear response for the SOPPA model. Linear-scaling calculations are performed with the LSDALTON module.

### Programming Details and Language

The Dalton program is written in FORTRAN 77, FORTRAN 90, and C, with machine dependencies isolated using C preprocessor directives. All floating-point computations are performed in 64-bit precision, but the code takes advantage of 32-bit integers to reduce storage requirements in some sections.

The parallelization of the regular Dalton SCF modules for small molecules has been done exclusively using MPI and scales (for sufficiently large systems) with up to 90% efficiency on 1000 processors,153, 154 even demonstrating superlinear scaling.155 The CI routines have also been parallelized using MPI.15 The Dalton SCF modules for large molecules have been parallelized in a hybrid MPI/OpenMP scheme.
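The scaling figures above use the usual definitions of speedup and parallel efficiency, which the following sketch spells out. The function names and the timing values are hypothetical and serve only to show what 90% efficiency on 1000 processors means.

```python
def speedup(t_serial, t_parallel):
    """Ratio of the one-processor wall time to the parallel wall time."""
    return t_serial / t_parallel


def parallel_efficiency(t_serial, t_parallel, n_processors):
    """Speedup divided by the number of processors (1.0 is ideal scaling)."""
    return speedup(t_serial, t_parallel) / n_processors


# Hypothetical timings: 9000 s on one processor, 10 s on 1000 processors.
example_speedup = speedup(9000.0, 10.0)
example_efficiency = parallel_efficiency(9000.0, 10.0, 1000)
```

An efficiency of 0.9 on 1000 processors therefore corresponds to a speedup of 900, and superlinear scaling corresponds to an efficiency above 1.0.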

### Hardware/Software Supported

Dalton runs on a variety of UNIX systems. The current release of the program has been tested on IBM-AIX, Linux, and OS X but is easily ported to other UNIX systems.

### Program Distribution

The Dalton source code is distributed to users at no cost. Three types of licenses are available: personal, site, and benchmark. With all licenses, the copyright to the code remains with the authors and is not put in the public domain; in particular, the source and binary code may not be redistributed. Furthermore, no fee may be charged for the use of Dalton. For copies of Dalton distributed with a site license, users may not access the source and object files. Benchmark licenses differ from personal and site licenses in that access is given to Dalton only for a restricted period of time. One year after its release, more than 700 personal and 150 site licenses had been issued for Dalton2011.

### User Support

Dalton is distributed with limited user support. The code is installed using CMake with an automatic adaptation to supported platforms. Dalton is distributed with a cross-referenced manual and a test suite comprising more than 400 test jobs. Information about patches, releases, and other updates is provided on the Dalton forum (http://daltonprogram.org/forum), where users can also exchange experiences, view tutorials, and seek help on installing and using the Dalton programs.

## CONCLUSION

We have presented the Dalton quantum chemistry program system—a highly flexible general-purpose code for molecular electronic-structure calculations. As a result of 30 years of continuous and vigorous development by a large number of authors, Dalton offers today not only a large selection of molecular electronic-structure models (including all standard models) but also the ability to calculate an exceptionally broad range of molecular properties from these models. In this manner, Dalton constitutes a unique tool for quantum-mechanical calculations of molecular systems, ranging from linear-scaling studies of large systems to high-accuracy benchmark studies on small systems.