Incremental Treatments of the Full Configuration Interaction Problem

The recent many-body expanded full configuration interaction (MBE-FCI) method is reviewed by critically assessing its advantages and drawbacks in the context of contemporary near-exact electronic structure theory. Besides providing a succinct summary of the history of MBE-FCI to date within a generalized and unified theoretical setting, its finer algorithmic details are discussed alongside our optimized computational implementation of the theory. A selected few of the most recent applications of MBE-FCI are revisited, before we close by outlining its future research directions as well as its place among modern near-exact wave function-based methods.

holds all occupied orbitals of a given system, the method will be that of our initial work on virtual orbital MBEs. 11 However, for any choice of reference space different from these limiting cases, the basic theory is entirely generalizable to all conceivable correlation domains.
An MBE-FCI calculation now proceeds by performing an orbital-based MBE in the resulting expansion space, which serves to recover the residual correlation missing from an initial complete active space configuration interaction (CASCI) calculation in the reference space alone. Denoting this (truncated FCI) reference correlation energy as E_ref, an MBE-FCI decomposition of the FCI correlation energy will formally read as given in Eq. 1. In Eq. 2, the action of the operator S_l onto [Ω]_k is to yield all possible unique subtuples of order l (1 ≤ l < k), and [Ω]_k is defined on par with p above in Eq. 1.
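For orientation, the decomposition and the recursive increment definition referred to above can be sketched in the standard form of an orbital-based MBE. The notation is an assumption pieced together from the surrounding text (E_ref, the tuples [Ω]_k, and the operator S_l): ε denotes the CASCI correlation energy, relative to E_ref, of the reference space augmented by the orbitals of a given tuple, Δε denotes the corresponding increment (with Δε_p ≡ ε_p), and the equation numbering is likewise assumed.

```latex
\begin{align}
E_{\mathrm{FCI}} &= E_{\mathrm{ref}} + \sum_{p} \epsilon_{p} + \sum_{p<q} \Delta\epsilon_{pq}
                  + \sum_{p<q<r} \Delta\epsilon_{pqr} + \ldots \\
\Delta\epsilon_{[\Omega]_k} &= \epsilon_{[\Omega]_k}
                  - \sum_{l=1}^{k-1} \; \sum_{[\Omega]_l \in S_l[\Omega]_k} \Delta\epsilon_{[\Omega]_l}
\end{align}
```

Read this way, an increment at order k is whatever part of the CASCI correlation energy of a tuple is not already accounted for by the increments of its proper subtuples.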
The most general treatment of electron correlation is now achieved by choosing an empty reference space, in which case the total expansion space will span all of the MOs of the system at hand. Unlike when the reference space encompasses all virtual or occupied orbitals, the expansion then starts at second rather than first order, with all possible unique pairs of correlating occupied and virtual orbitals. However, despite the fact that MBE-FCI will be free of any constraints in this case, it will importantly not represent a true tabula rasa approach to FCI, as an implicit bias still exists in the choice of MO basis. To that end, Stoll has recently presented work on the optimization of localized orbitals at a partially correlated level, rather than for the traditional uncorrelated HF starting point. 29 In our experience so far, spatially localized orbitals often represent an excellent choice of MO basis within the scope of MBE-FCI, as they allow for more compressed expansions (fewer significant contributions altogether), which in turn leads to faster convergence. Another typical choice is to use a set of orbitals tailored to a specific correlated method. 30,31 We have previously proposed the use of these MOs in combination with a so-called base model in the expansion. 12 In this case, rather than the total correlation energy, one expands the gap between FCI and an intermediate correlated model for which a preliminary calculation on the whole system is feasible, in part also to compute the one-electron reduced density matrix needed to derive its natural orbitals (NOs).
As the quantity that needs to be recovered by the MBE will be significantly reduced, the use of base models can lead to faster convergence. However, additional constraints are tied into the MBE in this case, as the base model is fundamentally required to yield a reasonable approximation to FCI for it to accelerate the underlying MBE-FCI algorithm. In cases where this is not true, e.g., in the presence of static and strong correlation, typical choices of base models, e.g., the acclaimed CCSD(T) method of coupled cluster theory, 32 will generally be limited in their performance as they are themselves based on a single determinant.
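The base-model variant can be sketched as follows. This is an illustration under assumptions that are not spelled out in the text above: the base model is taken to be evaluated in the same active spaces as the CASCI calculations, E_base denotes its correlation energy for the whole system, and δ_A is an illustrative symbol for the difference between the FCI and base-model correlation energies in an active space A (the reference space, or the reference space augmented by a tuple).

```latex
\begin{align}
\delta_{A} &= E^{\mathrm{FCI}}_{A} - E^{\mathrm{base}}_{A} \\
E_{\mathrm{FCI}} &= E_{\mathrm{base}} + \delta_{\mathrm{ref}}
                  + \sum_{p} \tilde{\epsilon}_{p}
                  + \sum_{p<q} \Delta\tilde{\epsilon}_{pq} + \ldots ,
\qquad
\tilde{\epsilon}_{[\Omega]_k} = \delta_{\mathrm{ref}\,\cup\,[\Omega]_k} - \delta_{\mathrm{ref}}
\end{align}
```

If the base model already approximates FCI well in every active space, each δ is small and so are the increments built from it, which is the sense in which the expansion has less to recover.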
Now, in the case where a given system is indeed dominated by a single reference determinant (say, the HF determinant), this determinant will have the largest weight in the linear FCI expansion of the wave function and the system is said to be dominated by dynamical correlation alone. However, in cases where more than one determinant has a significant weight, one will need to be able to describe the important (and common) constituents of these from the beginning, and these will then need to be included in the reference space. More generally, capturing the integral part of the total electron correlation in the reference space will lead to a faster convergence of the expansion. This observation will hold true even when the HF determinant dominates the FCI wave function, and reducing the size of the expansion space will additionally lead to fewer possible orbital tuples throughout the MBE.
For linear molecules, the π-pruning technique of Refs. 13 and 15 has further been introduced in order to deal with the D_∞h and C_∞v point groups within their D_2h and C_2v subgroups, respectively. Essentially, π-pruning is a prescreening filter that prunes away all increment calculations that fail to simultaneously include the x- and y-components of a given pair of degenerate π-orbitals. The use of π-pruning generally results in much shorter (faster) expansions for molecules belonging to non-Abelian point groups, while at the same time ensuring convergence onto states spanned by the correct irreducible representation.
While π-pruning is a specific filter designed for the treatment of a specific type of system, it also serves as an example of how it is generally possible to add alternative filters in a top-down manner in order to accelerate or assure convergence onto a target FCI property.
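To make the idea of a prescreening filter concrete, the following is a minimal sketch of a π-pruning check. It is not the PyMBE routine; the function name `pi_prune`, the representation of degenerate pairs as index pairs, and the choice to keep tuples that contain no π-orbitals at all are assumptions made for illustration.

```python
from itertools import combinations


def pi_prune(tup, pi_pairs):
    """Return True if a tuple of orbital indices passes the filter.

    A tuple is rejected if it contains only one component of any degenerate
    (pi_x, pi_y) pair. Tuples containing both or neither component are kept
    (the latter is an assumption of this sketch).
    """
    members = set(tup)
    for x, y in pi_pairs:
        if (x in members) != (y in members):
            return False
    return True


# Hypothetical expansion space of 7 orbitals, in which the orbitals (2, 3)
# and (5, 6) form two pairs of degenerate pi-orbitals.
pi_pairs = [(2, 3), (5, 6)]
exp_space = range(7)

all_triples = list(combinations(exp_space, 3))
kept = [t for t in all_triples if pi_prune(t, pi_pairs)]
print(len(all_triples), len(kept))
```

Other top-down filters could follow the same pattern: a predicate that is evaluated on a tuple before any CASCI calculation is started.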

Implementation
As a platform for our theoretical work on MBE-FCI over the past couple of years, we have developed our own Python-based, open-source PyMBE code. 33 All electronic structure kernels within PyMBE are formulated upon the PySCF program, 34,35 and the MPI4Python module handles parallel communication over the message passing interface (MPI) standard. 36-38 Ever since its conception, the PyMBE code has seen several rounds of heavy optimization.
In particular, the memory handling and footprint of the involved 1- and 2-electron integrals as well as all involved intermediates and results have been overhauled. The recursive nature of Eq. 2 implies that we need to be able to look up a vast number of subset contributions when calculating a given increment. In PyMBE, this is achieved by representing every tuple of orbitals by its hash and using binary searches to find all subset occurrences. However, this setup implies that a sorted array of hashes must be stored in addition to the actual increments to which they correspond, adding to the memory requirements of the code. For that reason, a hybrid MPI+MPI approach to shared-memory allocation and access has been implemented throughout our code, in which the MPI_Win_allocate_shared function is used as a departure from the standard abstract and distributed memory model of MPI. 39 As previously detailed in Ref. 14, the underlying memory organisation on a given compute node gets exposed to MPI, allowing us, in turn, to bypass the expensive and convoluted MPI-
At any given order in the MBE, all possible orbital tuples are yielded by a generator function, which takes into account the composition of the reference space. In case this is empty, only tuples that make reference to both occupied and virtual orbitals are allowed, as touched upon in Section 2. The task scheduling has been implemented in a round-robin fashion, and the input generator has been designed in such a way that those tuples that correspond to large (determinant-wise) CASCI calculations will precede the smaller ones. Once a given process has been assigned a tuple (i.e., a unique work task), it proceeds by computing the specific core and active space indices needed for the CASCI calculation. Next, the corresponding 1- and 2-electron integrals are extracted alongside the core energy.
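The tuple generation and the hash-based lookup described above can be sketched in a few lines of standard-library Python. None of the names below are taken from PyMBE. The key function is a hypothetical stand-in for a hash (a bitmask, valid for small orbital indices), the lookup uses `bisect` in place of a binary search on arrays in shared memory, and the rule applied for non-empty reference spaces is an assumed generalization of the rule stated for an empty one.

```python
from bisect import bisect_left
from itertools import combinations


def tuple_key(tup):
    """Hypothetical stand-in for a hash: one bit per orbital index."""
    key = 0
    for i in tup:
        key |= 1 << i
    return key


def gen_tuples(occ, virt, order, ref_occ=False, ref_virt=False):
    """Yield expansion-space tuples of a given order.

    A tuple is yielded only if, together with the reference space, it refers
    to both occupied and virtual orbitals. ref_occ / ref_virt state whether
    the reference space already holds occupied / virtual orbitals.
    """
    occ_set, virt_set = set(occ), set(virt)
    for tup in combinations(sorted(occ_set | virt_set), order):
        has_occ = ref_occ or any(i in occ_set for i in tup)
        has_virt = ref_virt or any(i in virt_set for i in tup)
        if has_occ and has_virt:
            yield tup


def build_table(tuples, values):
    """Return a sorted list of keys and the values in matching order."""
    table = sorted(zip((tuple_key(t) for t in tuples), values))
    return [k for k, _ in table], [v for _, v in table]


def lookup(keys, vals, tup):
    """Binary search for the value stored for a tuple, or None if absent."""
    key = tuple_key(tup)
    pos = bisect_left(keys, key)
    if pos < len(keys) and keys[pos] == key:
        return vals[pos]
    return None


# Empty reference space, two occupied (0, 1) and two virtual (2, 3) orbitals;
# the stored numbers are arbitrary placeholders, not computed increments.
pairs = list(gen_tuples(occ=[0, 1], virt=[2, 3], order=2))
keys, vals = build_table(pairs, [-0.1, -0.2, -0.3, -0.4])
print(pairs, lookup(keys, vals, (1, 2)), lookup(keys, vals, (0, 1)))
```

Storing the keys in sorted order is what makes the lookup logarithmic in the number of stored tuples, at the price of keeping the key array in memory next to the increments.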
In PyMBE, these are stored in the transformed MO basis in shared memory, with the electron repulsion integrals compressed into a matrix form with 4-fold symmetry. Finally, the CASCI energy is calculated, before the increment is computed and stored alongside the corresponding hash. If enabled, a full suite of restart functionalities guarantees that MBE-FCI is trivially protected against hardware failures, strict time limits, etc., at only a minimal associated penalty.
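The final step, turning an energy for a tuple into an increment by subtracting all subtuple increments, can be illustrated with a toy model. The sketch below is not the PyMBE implementation: `toy_energy` is a made-up stand-in for a CASCI calculation (additive one-orbital terms plus a constant pair coupling), and a plain dictionary replaces the hashed arrays.

```python
from itertools import combinations


def increment(tup, energy, increments):
    """Increment of a tuple: its energy minus all proper-subtuple increments.

    Results are cached in `increments` so each tuple is evaluated only once.
    """
    tup = tuple(sorted(tup))
    if tup not in increments:
        inc = energy(tup)
        for l in range(1, len(tup)):
            for sub in combinations(tup, l):
                inc -= increment(sub, energy, increments)
        increments[tup] = inc
    return increments[tup]


def toy_energy(tup):
    """Made-up energy model: one-orbital terms plus -0.25 per orbital pair."""
    single = {0: -1.0, 1: -2.0, 2: -4.0}
    n = len(tup)
    return sum(single[i] for i in tup) + (-0.25) * n * (n - 1) / 2


increments = {}
total = sum(
    increment(t, toy_energy, increments)
    for k in (1, 2, 3)
    for t in combinations(range(3), k)
)
print(increments[(0, 1)], increments[(0, 1, 2)], total)
```

Because the toy model contains nothing beyond pair couplings, the third-order increment vanishes and the sum of all increments reproduces the energy of the full tuple, which is the behaviour an untruncated expansion is expected to show.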
The most recent version of our screening protocol has been implemented such that MOs get screened away from the full expansion space according to the absolute magnitude of the increments of the tuples in which they take part, such that a certain percentage of the least contributing MOs gets removed (governed by a dedicated input parameter). In turn, shrinking the expansion space reduces the number of increment calculations at the orders to follow, as only those MOs of the expansion space (at any given order) that give rise to the numerically largest increments are retained among the tuples at the following order.
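A percentage-based orbital screening of this kind might look as follows. This is a sketch under assumptions: each orbital is scored by the largest absolute increment among the tuples of the current order that contain it (the text does not specify the exact measure), and the function name and the `frac` parameter are hypothetical.

```python
import math


def screen_orbitals(exp_space, increments, order, frac):
    """Remove the fraction `frac` of orbitals with the smallest scores.

    The score of an orbital is assumed to be the largest absolute increment
    among all tuples of the given order that contain it.
    """
    score = {i: 0.0 for i in exp_space}
    for tup, inc in increments.items():
        if len(tup) == order:
            for i in tup:
                if i in score:
                    score[i] = max(score[i], abs(inc))
    n_remove = math.floor(frac * len(exp_space))
    ranked = sorted(exp_space, key=lambda i: score[i])  # least important first
    removed = set(ranked[:n_remove])
    return [i for i in exp_space if i not in removed]


# Arbitrary placeholder increments at order 2 for four orbitals.
incs = {(0, 1): -0.5, (0, 2): -0.01, (1, 3): 0.2, (2, 3): -0.02}
print(screen_orbitals([0, 1, 2, 3], incs, order=2, frac=0.25))
```

With these placeholder numbers, orbital 2 only appears in tuples with small increments and is therefore the first to be removed.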
In case one or more orbitals get screened away from the effective expansion space, the code next enters a recently implemented purging module, which is designed to retain only those contributions at lower orders that are needed going forward. The rationale behind this step in an MBE-FCI calculation is that no subsequent tuple will make reference to the screened orbitals, so their increments at lower orders are no longer required. This purging procedure generally works to lower our memory requirements significantly.
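In its simplest form, purging amounts to dropping every stored increment whose tuple refers to a screened orbital. The sketch below illustrates this with a plain dictionary; the function name is hypothetical, and the actual code operates on hashed arrays in shared memory rather than on Python dictionaries.

```python
def purge(increments, exp_space):
    """Keep only increments whose tuples lie entirely within exp_space."""
    keep = set(exp_space)
    return {tup: inc for tup, inc in increments.items() if keep.issuperset(tup)}


# Arbitrary placeholder increments; orbital 2 is assumed to have been screened.
stored = {
    (0,): -1.0, (1,): -2.0, (2,): -0.1, (3,): -0.5,
    (0, 1): -0.5, (0, 2): -0.01, (1, 3): 0.2, (2, 3): -0.02,
}
purged = purge(stored, exp_space=[0, 1, 3])
print(sorted(purged))
```

Every tuple generated at later orders is built from the retained orbitals only, so none of the removed entries would ever be looked up again.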
The parallel scaling potential of MBE-FCI was assessed in Ref. 13. Being compute- rather than memory-bound, the resource utilization of the theory and its implementation within the PyMBE code is best measured in terms of its strong scalability (Figure 1). At scale, the efficiencies at 512 (12,288 cores) and 1,024 nodes (24,576 cores) amount to 79% and 91% for the expansions in the cc-pVTZ and cc-pV5Z basis sets, respectively, and the difference in performance between the two basis sets can be ascribed to the significantly larger number of individual CASCI calculations in the latter of the two expansions. MBE-FCI is thus seen to offer a highly scalable treatment of the electron correlation problem, with a massive parallelism that is ideally suited to modern distributed supercomputers.

Applications
Having covered its theoretical basis, we will now review a selected few of the molecular systems to which MBE-FCI has been applied to date. Most recently, MBE-FCI took part
Although improvements have subsequently been made to the code base (to the extent that the total compute time can be reduced by close to a factor of 2, cf. Section 3), the theory behind MBE-FCI is arguably somewhat more expensive than its alternatives. However, with these substantial resource requirements follows rigour, as evidenced by how well converged the final energy is. The change in energy across the final two orders in the expansion amounts to a mere −0.04 mE_H, or −0.1 kJ/mol, that is, well within thermochemical tolerance. However, and this is important to emphasize, no methodical measure of the final uncertainty against FCI currently exists, a point on which we elaborate further in Section 5. That being said, given ample computational resources, even larger systems (of a similar nature) will also be amenable to a treatment by MBE-FCI, as the dimension of the largest CASCI calculations in Fig. 2 is well within the capabilities of even today's optimized FCI kernels.
As an example of how to deal with static correlation, MBE-FCI was applied in Ref.
promise of providing detailed information in the thermodynamic limit whenever it has to rely on CASSCF(N,N) expansion references. To that end, as was discussed in detail in Ref. 13, canonical CASSCF orbitals are bound to remain delocalized over large sections of the chain, a fact which in turn inhibits the orbital screening. For this reason, additional results were presented that use a reference space comprising only the RHF determinant, but localized PM rather than canonical virtual orbitals. Presented in Fig. 3, these results are in excellent agreement with DMRG for all but the shortest bond distances in the repulsive region, where the concept of locality is anyway somewhat ill-defined. Importantly, due to the formulation on a standard RHF rather than a CASSCF reference, MBE-FCI is potentially transferable to larger chains and basis sets (and even other topologies, e.g., rings and sheets), thus offering a viable approach for the treatment of the thermodynamic limit.
While its original formulation was focussed solely on the calculation of correlation energies for closed- and open-shell systems, MBE-FCI has recently been extended to the treatment of excitation energies and dipole moments for ground and excited states. 15 In analogy with Eq. 1, excitation energies may be computed by an expansion of the energetic gap between the ground and an excited state, E_0n, rather than the correlation energy.
As a CASCI calculation in an active space absent of any form of electron correlation will yield no correlation energy (comprising only the HF solution) and hence no excited states,
Adding the nuclear component, μ_nuc = Σ_K Z_K r_K, returns the molecular dipole moment.
Finally, transition dipole moments, t_0n, may be evaluated on par with Eq. 5, except for the fact that the individual increments are computed on the basis of transition RDMs, γ_0n, which may be arrived at using the wave functions of both states involved in a given CASCI calculation. As (transition) dipole moments are vector rather than scalar quantities, the screening procedure proceeds along all three Cartesian components (x, y, z), and the screening condition must be fulfilled simultaneously for all three if a given MO is to be screened away from the expansion space.
What are then the practical limitations of current-generation incremental approaches like MBE-FCI? Consider the potential energy curve of the chromium dimer in Fig. 5, which has nowadays developed into an established stress test for high-accuracy electronic structure theory. As per valence-bond theory, Cr2 has a formal sextuple bond, and the ^1Σ_g^+ ground state will eventually dissociate into two equivalent atoms, each in a high-spin configuration with a total of 6 unpaired electrons in the Cr 3d and 4s atomic orbitals (AOs). Unlike for the H10 chain in Fig. 3, where the occupied (and to some extent even the virtual) MOs localize optimally onto the involved atomic centers, this is not necessarily the case in general, more complex systems built from atoms of arbitrary covalency. For instance, in the present case of Cr2, the 12 electrons in question, alongside the 12 MOs that map to the corresponding AOs, will demand special consideration along the bond dissociation coordinate. In the language of MBE-FCI, the smallest possible reference space that would remain unaltered as the bonds are elongated will thus coincide with this (12,12) active space.
As CASSCF in this valence space only yields a very shallow minimum (at an unreasonably large bond length) for this particular system, the actual MBE in the expansion space, which is accountable for the general treatment of dynamical correlation, will soon come to involve excessively large
For the specific case of the Cr2 dissociation, alternative approaches like DMRG or even state-of-the-art semistochastic heat-bath CI (SHCI) are needed for a qualitatively correct description of the electronic structure in all correlation domains. 47 The SHCI results from Ref. 47 are reproduced in Fig. 5, but it is important to note that even this method has its capabilities stretched in the shoulder region from 1.8 to 2.7 Å, despite the use of a modest basis set of double-ζ quality and a scalar-relativistic Hamiltonian. 49 Further to that, the basis set dependence exhibited by, e.g., CCSD(T) or SHCI is strong at all considered bond lengths, which explains the pronounced differences with respect to experimental data that are still visible in Fig. 5. As impressive as the SHCI results are on their own, the prospects of rationalizing the odd profile of the Cr2 dissociation curve by means of near-exact quantum chemistry thus remain somewhat elusive for now.

Many of the alternative methods in existence today are bound to allow for faster and perhaps more affordable routes towards simulating FCI properties than MBE-FCI, cf. Ref. 10. However, MBE-FCI arguably sets itself apart from the rest by offering an incremental, robust, and widely applicable approach which is in principle not restricted by the exponential scaling wall encountered in, e.g., the various SCI approaches. In addition, the flexibility of orbital-based MBEs admits a number of further knobs to turn over traditional approaches centred around individual determinants or configuration state functions; namely, besides variations with respect to the employed orbital basis, generalized MBE-FCI allows for the use of different reference spaces. In the asymptotic limit of an untruncated expansion, the choice of reference space will be irrelevant as the expansion trivially yields the exact FCI results, but upon introducing effective protocols for screening (incrementally) negligible orbitals away from the corresponding expansion spaces, some choices of reference spaces will yield noticeably more compact expansions than others.
However, how to decide upon an optimal choice in as black-box a manner as possible still remains mostly unsolved. One feasible option may be to leverage information on independent orbital correlations at low orders in the MBE, but more automated selection schemes will be the topic of future work on extensions and refinements of MBE-FCI.
Moving forward, it will furthermore be interesting to see if regression techniques or related statistical processes may be implemented to learn certain components of MBE-FCI in lieu of a brute-force account of the underlying electronic structure all the way up through the MBE. For instance, despite the fact that efficient protocols have been implemented to screen away incremental contributions to the MBE that are deemed energetically redundant, these remain rather ad hoc in the sense that they lack rigour and rely on simplified estimates of the correlation between MOs. A promising idea is now to use modern machine learning, 50,51 particularly models capable of disentangling the correlation patterns present among the individual MOs on the basis of the corresponding increments. 52 Given machine learning models for different systems, these will collectively aid in the design of transferable descriptors for use in all future MBE-FCI calculations. Not only will these enhancements of MBE-FCI see its runtime execution accelerated, but refined models may further be used to correct final results by terms that account for the most important of the screened increments. In turn, this will enable proxies for assessing the inherent uncertainty of an MBE-FCI run, something which is currently missing. While the use of regression for this purpose will necessarily result in a departure from the otherwise rigorous grounds of MBE-FCI, the importance of available error estimates cannot be overstated, a point which is equally true in the case of other near-exact methods. This pertinent issue was also recently discussed in Ref. 41.
In extending MBE-FCI further, it is worth noting that the underlying theory is by no means restricted to FCI targets; for instance, orbital-based expansions may equally well be employed within coupled cluster theory. In addition, the use of alternative and approximate FCI solvers as a means to allow for larger (and faster) CASCI calculations remains to be explored within MBE-FCI. Finally, in the spirit of recent work seeking to revitalize the idea of transcorrelation, 53-58 a natural, albeit non-trivial, extension of the current generation of MBE-FCI will be to allow for expansions to spawn from a similarity-transformed Hamiltonian, akin to what is found in equation-of-motion coupled cluster theory. 59 Its appealing traits as a correlated zeroth-order formulation of electron correlation aside, the non-Hermiticity of the theory will pose entirely new conceptual as well as technical challenges.
Incorporating correlation directly into the very Ansatz of MBE-FCI, however, is bound to result in even faster converging MBEs and may further prompt the development of a wealth of alternative methods. These may not necessarily be aimed at a rigorously defined target property (e.g., FCI ground- or excited-state energies or dipole moments), but may rather seek to profit from the fact that even low-order truncations of MBE-FCI will yield qualitatively accurate properties, as an alternative to contemporary correlated wave function theory.