v1.5.0 (2025-11-24)
-------------------
* New Features (PBC systems)
  - PBC GDF extended to k-mesh computations; k-point GDF integrals stored in host memory with compression.
  - Multigrid algorithm supports PBC k-point SCF and band structure calculations.
  - Add the .analyze() method for PBC gamma-point and k-mesh DFT to summarize results and charge populations.
  - Fermi and Gaussian smearing for PBC and molecular DFT.
  - PBC RSJK algorithm for J/K matrix evaluation (J via MD J-engine; K via Rys quadrature).
  - Analytical nuclear gradients using RSJK and AFTDF for PBC gamma-point HF, k-mesh HF, and hybrid DFT.
  - Stress tensor evaluation using RSJK and AFTDF for PBC gamma-point HF, k-mesh HF, and hybrid DFT.
  - Geometry optimizer for PBC DFT.
* New Features (molecular systems)
  - Support for QMMM point charges and external electric fields.
  - 3c2e integrals contracted with density matrices and auxiliary vectors for memory-efficient DF Coulomb matrix evaluation.
  - DFT Hessian second-derivative grid response.
  - Minimum energy crossing point (MECP) search functionality.
  - PCM support for TDDFT derivative coupling calculations.
  - Basic GKS and two-component numerical integration, including GPU-accelerated multi-collinear functionals.
  - Multi-collinear spin-flip TDA/TDDFT excitation energies and analytical gradients.
* Improvements (PBC systems)
  - Linear-dependency handling for basis functions in molecular and PBC DFT calculations.
  - Refactored PBC nuclear gradients for more efficient GTH pseudopotential evaluation.
  - Faster GTH pseudopotential evaluation using the multigrid algorithm on large systems.
* Improvements (molecular systems)
  - Optimized DFT numerical integration memory usage, achieving ~20% performance gains.
  - Refactored and optimized molecular four-center-integral J/K builder, achieving 50-100% speed-up.
  - Improved phase determination method for NACV.
  - More numerically stable Hessian integrals for large-exponent GTOs.
  - MD J-engine optimized with reduced CUDA register pressure.
  - Third-order XC derivatives can be evaluated on GPU (requires gpu4pyscf-libxc 0.7).
  - Default auxbasis_response level increased to 2 for Hessian calculations with DF integrals.
  - Dimension checks for eigh, enabling scipy fallback for large arrays (size > 21350).
* Fixes
  - Handle eps=inf in solvent models.
  - Fixed an edge case in EDA electrostatics when cross-fragment nocc is 1.
  - Fixed EDA crash caused by fragments accessing JK matrices after DF 3-index tensors were freed.
  - Workaround for CUDA 13 compiler bugs affecting ECP kernels (disabling compiler optimizations).
  - Molecular and PBC 3c2e integral dimension issues for generally contracted basis sets.
  - Removed pre-allocated streams that caused inconsistent synchronization.
  - SMD Hessian.
  - UHF crash when level_shift is enabled.
* API updates
  - Fix to_gpu/to_cpu interface in SMD, TDDFT, and PCM-TDDFT.
  - Added from_cpu hook on the GPU side, allowing pyscf to invoke this hook in its to_gpu method.
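
The Fermi smearing listed under the v1.5.0 new features can be illustrated with
a small, library-independent sketch. It is not the gpu4pyscf implementation or
API. The function name `fermi_occupations`, its arguments, and the bisection
search for the chemical potential are assumptions chosen for illustration, for
a spin-restricted system with at most two electrons per orbital.

```python
import math

def fermi_occupations(energies, nelec, sigma, tol=1e-12):
    """Return (mu, occupations) for Fermi-Dirac smearing of width sigma.

    Hypothetical helper for illustration only: spin-restricted, so each
    orbital holds between 0 and 2 electrons.
    """
    def occupations(mu):
        occ = []
        for e in energies:
            x = (e - mu) / sigma
            # Evaluate 1 / (1 + exp(x)) without overflow for large |x|.
            if x > 0:
                ex = math.exp(-x)
                f = ex / (1.0 + ex)
            else:
                f = 1.0 / (1.0 + math.exp(x))
            occ.append(2.0 * f)
        return occ

    # Bracket the chemical potential well outside the orbital energy range.
    lo = min(energies) - 50.0 * sigma
    hi = max(energies) + 50.0 * sigma
    # The electron count grows monotonically with mu, so bisection applies.
    while hi - lo > tol:
        mu = 0.5 * (lo + hi)
        if sum(occupations(mu)) < nelec:
            lo = mu
        else:
            hi = mu
    mu = 0.5 * (lo + hi)
    return mu, occupations(mu)

# Example: five orbital energies (arbitrary units), six electrons.
mu, occ = fermi_occupations([-1.0, -0.5, -0.1, 0.1, 0.5], nelec=6, sigma=0.01)
```

With a small sigma the occupations approach the integer aufbau filling, and a
larger sigma spreads electrons across orbitals near the chemical potential.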

v1.4.3 (2025-08-20)
-------------------
* New Features
  - Geometry optimization for excited states using TDDFT-ris methods.
  - Non-adiabatic coupling vectors for TDDFT-ris methods.
  - Analytical gradients for DFT+U with k-point sampling.
  - Stress tensor calculations for semi-local DFT with k-point sampling and at the gamma-point.
  - ASE interface to support crystal lattice optimization.
  - Multigrid v2 algorithm for meta-GGA functionals.
* Improvements
  - GPU kernels for PBC overlap and kinetic integrals.
  - Reduced GPU memory usage for TDDFT-ris by storing tensors in host memory.
  - In PBC methods, scaled k-points (fractional coordinates) are stored to
    simplify lattice optimization calculations.
  - A preconditioned Krylov solver to accelerate convergence in TDDFT and
    dynamic polarizability calculations.
* Fixes
  - Basis decontraction issue for d and f orbitals.
  - A bug in Multigrid v2 algorithm related to non-orthogonal lattices.
  - Incorrect virtual orbital energies when level_shift was enabled.
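
The note above on storing scaled k-points can be illustrated with a short,
library-independent sketch. Fractional k-points stay fixed while the lattice
changes, and Cartesian k-points are recomputed from the current reciprocal
vectors. The helper names below are hypothetical and this is not gpu4pyscf
code. It assumes the common convention that the rows of `a` are the lattice
vectors and that the reciprocal vectors satisfy a_i . b_j = 2*pi*delta_ij.

```python
import math

def cross(u, v):
    """Cross product of two 3-vectors."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, v))

def reciprocal_vectors(a):
    """Rows are b1, b2, b3 with a_i . b_j = 2*pi*delta_ij."""
    a1, a2, a3 = a
    factor = 2.0 * math.pi / dot(a1, cross(a2, a3))
    return [[factor * c for c in cross(a2, a3)],
            [factor * c for c in cross(a3, a1)],
            [factor * c for c in cross(a1, a2)]]

def scaled_to_cartesian(k_scaled, a):
    """Map fractional k-points to Cartesian ones for lattice a."""
    b = reciprocal_vectors(a)
    return [[sum(k[i] * b[i][j] for i in range(3)) for j in range(3)]
            for k in k_scaled]

# The same fractional k-point on two cubic lattices of different size.
k_scaled = [[0.5, 0.0, 0.0]]
cubic = lambda length: [[length, 0, 0], [0, length, 0], [0, 0, length]]
k_before = scaled_to_cartesian(k_scaled, cubic(4.0))
k_after = scaled_to_cartesian(k_scaled, cubic(5.0))
```

Because only the fractional coordinates are stored, a lattice update during
optimization needs no separate bookkeeping for the k-mesh.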

v1.4.2 (2025-07-20)
-------------------
* New Features
  - Raman spectrum calculations.
  - Non-adiabatic coupling vectors for time-dependent RKS, including the coupling
    between the ground state and excited states as well as among excited states.
  - DFT+U for molecule and PBC systems.
  - ALMO EDA 2 method.
  - Analytical gradients for TDDFT-ris method.
  - Analytical gradients for PBC k-point DFT.
  - Efficient analytical gradients for PBC Gamma-point DFT using the multigrid algorithm.
  - A custom CuPy memory pool to reduce GPU memory usage.
* Improvements
  - Improved PBC GDF integral computation at the Gamma point, including reduced
    GPU memory usage and enhanced computational efficiency.
  - Set the J engine as the default Coulomb matrix algorithm in the direct SCF driver.
  - Efficient Multigrid integral algorithm for various functions in PBC DFT
    Gamma-point computations, such as get_nuc, get_pp, and GGA functionals.
  - Supporting xc='HF' setting in DFT.
* Fixes
  - Ensured compatibility with CUDA 12.3.
  - Issues related to the combination of density fitting, PCM solvent, and TDDFT.

v1.4.1 (2025-05-20)
-------------------
* New Features
  - Analytical hessian for VV10 functionals
  - DFT polarizability with VV10 functionals
  - TDDFT for VV10 functionals
  - Non-adiabatic coupling constants for TDDFT states
  - TDDFT gradients and geometry optimization solver for excited states
  - LR-PCM for TDDFT and TDDFT gradients
  - TDDFT-ris method
* Improvements
  - Optimized CUDA kernel and integral screening for the MD J-engine. The MD
    J-engine is used by default for large-system HF and DFT computations.
  - Optimization for PBC Gaussian density fitting at the gamma point.
  - ECP gradients CUDA kernel
  - Reduced atomicAdd overhead in Rys JK kernel
* Fixes
  - MINAO initial guess for ghost atoms

v1.4.0 (2025-03-27)
-------------------
* New Features
  - RKS and UKS TDDFT Gradients for density fitting and direct-SCF methods.
  - ECP integrals and their first and second derivatives accelerated on GPU.
  - Multigrid algorithm for Coulomb matrix and LDA, GGA, MGGA functionals computation.
  - PBC Gaussian density fitting integrals.
  - ASE interface for molecular systems.
* Improvements
  - Reduce memory footprint in SCF driver.
  - Reduce memory requirements for PCM energy and gradients.
  - Reduce memory requirements for DFT gradients.
  - Utilize the sparsity of cart2sph coefficients in the cart2sph transformation in the scf.jk kernel.
  - Molecular 3c2e integrals generated using the block-divergent algorithm.
  - Support I orbitals in DFT.
* Fixes
  - LRU cached cart2sph under the multiple GPU environment.
  - A maxDynamicSharedMemorySize setting bug in gradient and hessian calculation under the multiple GPU environment.
  - Removed the limit of 6000 GTO shells in the DFT numerical integration module.

v1.3.2 (2025-03-10)
-------------------
* Improvements
  - Dump xc info and grids into the log file
  - Optimize 4-center integral evaluation CUDA kernels using warp divergent algorithm
  - Support up to I orbitals in DFT
  - Fix out-of-bound issue in DFT hessian for heavy atoms (>=19)
* Deprecation
  - SM60 is not supported in the PyPI package

v1.3.1 (2025-02-04)
-------------------
* New Features
  - Analytical Hessian for PCM solvent model
  - Driver for 3c methods (wB97x-3c, R2Scan-3c, B97-3c, etc.)
* Improvements
  - Preconditioner and computation efficiency of Davidson iterations for TDDFT


v1.3.0 (2025-01-07)
-------------------
* New Features
  - PBC analytical Fourier transform on GPU
* Improvements
  - Optimized computation efficiency and memory footprint for density fitting Hessian
  - Support pickle serialization for most classes (SCF, DF, PCM, etc.)
  - Efficiency of moving CuPy arrays between GPU cards


v1.2.1 (2024-12-20)
-------------------
* New Features
  - Change the license from GPL v3.0 to Apache 2.0
  - Multi-GPU support for SCF, Gradients, and Hessian computation using AO-direct algorithm
  - Add PBC HF and DFT with k-points, UHF/UKS, and density fitting
* Improvements
  - Change the default conv_tol_cpscf = 1e-3 / batch of atoms to conv_tol_cpscf = 1e-6 / atom
  - Fix numerical instability in complex-valued TDHF diagonalization
  - Improve PCM and QMMM with int1e_grids kernel
  - Support non-symmetric int3c2e integral
  - Optimize Hessian calculation with direct SCF
  - Improve the numerical stability of int3c2e for point charge
  - Add CI workflow for multi-GPU
* Fixes
  - Fix non-contiguous array error in p2p transfer between GPUs.
  - Fix bugs in NMR calculations


v1.2.0 (2024-12-09)
-------------------
* New Features
  - Spin-conserved TDA and TDDFT methods
  - Spin-flip TDA method.
  - J-engine using McMurchie-Davidson integral algorithm
  - Support multi-GPU density fitting energy, gradients and Hessian computation.
  - Second order SCF solver
* Improvements
  - Support non-hermitian density matrix in J/K builder
  - Secondary grids for CPHF solver
  - 3-center integral computation efficiency for gradients and hessian
  - One-electron Coulomb integrals against point charges and Gaussian charge distributions on grids.
  - Automatically apply SCF initial guess from existing wavefunction


v1.1.0 (2024-10-29)
-------------------
* New Features
  - Add esp charge and resp charge by @wxj6000 in #208
  - New Rys kernel by @sunqm in #221
  - Optimize nuclear gradients using new Rys kernel by @sunqm in #224
  - GPU kernel for analytical hessian by @sunqm in #227
  - Add QM/MM by @MoleOrbitalHybridAnalyst in #218
* Improvements
  - Improved compatibility with pyscf 2.7.0 by @wxj6000 in #216
  - Add skipping SCF cycles by @kvkarandashev in #229
  - Skip building gint, gvhf, ... when building libxc by @wxj6000 in #210
* Bugfix
  - Typo in build_wheels.sh by @wxj6000 in #209
  - Typo in dft_driver.py by @wxj6000 in #220
  - Bugfix: cusolver error when specifying gpu by @wxj6000 in #213
  - Bugfix: error in int2c2e by @wxj6000 in #212
  - Bugfix: inconsistent gradient with CPU. Improved to_cpu, uks gradient, and grid_response by @wxj6000 in #230
  - Bugfix: recompute int3c2e in DF UHF by @wxj6000 in #226
* New Contributors
  - @MoleOrbitalHybridAnalyst made their first contribution in #218
  - @kvkarandashev made their first contribution in #229


v1.0.2 (2024-09-03)
-------------------
* Bugfix: append data in h5 file by @wxj6000 in #200
* Support customized CHELPG radii by @wxj6000 in #202
* Add cupy installation guide for developer installation instructions by @henryw7 in #204
* Bugfix: save density when spin unrestricted by @wxj6000 in #205
* Add chkfile support for pysisyphus by @henryw7 in #203


v1.0.1 (2024-08-24)
-------------------
* Bugfix in rks.reset by @wxj6000 in #191. The bug leads to the failure of geometry optimization with direct SCF (#190)
* Bugfix when CUDA unified memory is disabled. Removed CUDA unified memory in libxc, and reduced the overhead in calling libxc by @wxj6000 in #180, #189
* Bugfix and Improvement in opt_driver by @wxj6000 in #187 #197
* Support SMD in opt_driver and dft driver by @liuyu-chem1996 in #196
* Support thermo calculation in dft_driver by @liuyu-chem1996 in #192


v1.0.0 (2024-07-23)
-------------------
Released features:
* Density fitting scheme and direct SCF scheme
* SCF, analytical gradient, and analytical Hessian calculations for Hartree-Fock and DFT
* Spin-conserved and spin-flip TDA and TDDFT for excited states
* Nonlocal functional correction (vv10) for SCF and gradient
* PCM models, SMD model, their analytical gradients, and semi-analytical Hessian matrix
* Unrestricted Hartree-Fock and unrestricted DFT, gradient, and Hessian
* MP2/DF-MP2 and CCSD (experimental)
* Polarizability, IR, and NMR shielding (experimental)
