Skip to content

Adding DFTKMakieExt.jl & BandsUnfold.jl#1264

Draft
physarief78 wants to merge 5 commits intoJuliaMolSim:masterfrom
physarief78:Supercell-bands-unfolding-technique
Draft

Adding DFTKMakieExt.jl & BandsUnfold.jl#1264
physarief78 wants to merge 5 commits intoJuliaMolSim:masterfrom
physarief78:Supercell-bands-unfolding-technique

Conversation

@physarief78
Copy link
Copy Markdown

@physarief78 physarief78 commented Feb 24, 2026

Dear maintainers and community.

Based on open issues #1235 that i make—i have done added the unfolding technique to map the supercell bands into its primitive to analyze the matching between expanded system to its fundamental physics theory. So i create two new files:

  1. BandsUnfold.jl for the main calculation for unfolding
  2. DFTKMakieExt.jl for plotting the results

Why i use Makie? Because i want more high-quality and optimized plotting package to make a comperhensive plot, compact enough, yet adaptive. For the calculation technique, mathematically we can do it by letting the supercell lattice $\mathbf{L}_s$ and the primitive lattice $\mathbf{L}_p$ be related by an integer transformation matrix $\mathbf{M}$:

$$ \mathbf{L}_s = \mathbf{L}_p \mathbf{M}. $$

Their reciprocal lattices satisfy $\mathbf{G}_s = \mathbf{M}^{\mathsf T} \mathbf{G}_p$. A Bloch state in the supercell at wavevector $\mathbf{k}_s$ (in the supercell Brillouin zone) and band index $n$ is expanded as

$$ \psi_{n,\mathbf{k}_s}(\mathbf{r}) = \sum_{\mathbf{G}_s} c_{\mathbf{G}_s}^{(n)} e^{i(\mathbf{k}_s + \mathbf{G}_s)\cdot \mathbf{r}}, $$

where $\mathbf{G}_s$ are reciprocal lattice vectors of the supercell. Because the supercell is a periodic repetition of the primitive cell, this state can be expressed as a superposition of primitive‑cell Bloch states at a primitive wavevector $\mathbf{k}_p$ that satisfies

$$ \mathbf{k}_p = \mathbf{k}_s + \mathbf{G}_s \quad (\text{mod } \mathbf{G}_p), $$

with $\mathbf{G}_p$ any reciprocal lattice vector of the primitive cell. In practice, for a given $\mathbf{k}_s$ we examine all plane‑wave coefficients and test whether the shifted wavevector falls onto the desired primitive $\mathbf{k}$-point (or along the chosen $\mathbf{k}$-path).

The spectral weight with which the supercell state at $(\mathbf{k}_s, n)$ contributes to the primitive state at $\mathbf{k}_p$ is defined as

$$ w_{n,\mathbf{k}_s}(\mathbf{k}_p) = \sum_{\substack{\mathbf{G}_s \ \mathbf{k}_p = \mathbf{k}_s + \mathbf{G}_s ;(\text{mod } \mathbf{G}_p)}} |c_{\mathbf{G}_s}^{(n)}|^2. $$

For spin‑polarized calculations (collinear spin, :full), the sum includes contributions from both spin components.

The implementation proceeds in several steps:

  1. Precompute transformation matrices
    From the primitive cell band data (band_data_prim) we extract the primitive lattice $\mathbf{L}_p$ and reciprocal lattice $\mathbf{G}_p$. From the supercell results (scfres_sc) we obtain the supercell reciprocal lattice $\mathbf{G}_s$.
    The matrix
    $$
    \mathbf{T} = \frac{\mathbf{L}*p^{\mathsf T}}{2\pi} \cdot \mathbf{G}_s
    $$
    maps a supercell reciprocal vector (as an integer triplet) to primitive‑cell fractional coordinates.

  2. Define the target $\mathbf{k}$-path
    The primitive $\mathbf{k}$-points are taken from band_data_prim. Their Cartesian coordinates are
    $\mathbf{k}_{\text{cart}} = \mathbf{G}p \cdot \mathbf{k}{\text{frac}}$. For each such point, we compute the corresponding supercell fractional coordinate $\mathbf{k}_s^{\text{frac}} = \mathbf{G}s^{-1} \mathbf{k}{\text{cart}}$. These are the $\mathbf{k}$-points for which we will run the supercell band calculation in batches.

  3. Batch processing
    To avoid excessive memory usage, we process supercell $\mathbf{k}$-points in batches of size batch_size. For each batch, we call compute_bands to obtain wavefunction coefficients and eigenvalues.
    Inside the batch, we use threading to process individual $\mathbf{k}$-points in parallel. For each $\mathbf{k}$-point we:

    • Compute the difference $\Delta \mathbf{k}_{\text{frac}} = \mathbf{T}(\mathbf{k}_s^{\text{cart}} - \mathbf{k}_p^{\text{cart}})$.
    • Loop over all plane‑wave $\mathbf{G}$-vectors $\mathbf{G}s$ and check whether
      $\Delta \mathbf{k}
      {\text{frac}} + \mathbf{T}\mathbf{G}_s$ has all components within a tolerance (e.g., $10^{-4}$) of an integer. If yes, that $\mathbf{G}$-vector contributes.
    • For each band, sum the squared coefficients of all contributing $\mathbf{G}$-vectors to obtain the spectral weight $w$. If $w > \text{threshold}$, we record the primitive $\mathbf{k}$-point index, the eigenvalue, and the weight.

    Thread safety is ensured by using thread‑local temporary arrays for each $\mathbf{k}$-point in the batch. After the batch finishes, results are merged into global arrays.

  4. Return data
    The function returns the unfolded $\mathbf{k}$-indices, eigenvalues, and spectral weights for each spin channel, together with plotting information (kdistances, ticks) and the Fermi energy from the supercell calculation.

For the code implementation in the running file like this:

  1. Packages that used:
using DFTK
using LinearAlgebra
using CairoMakie
using PseudoPotentialData
  1. Compute the primitve cell (THIS IS MUST! Because we will map the folded [supercell] bands to match the primitive):

setup_threading() # use the 6 cores 12 threads with AMD Ryzen 5 8500G

a = 10.26 
lattice_prim = a / 2 * [[0 1 1.]; [1 0 1.]; [1 1 0.]]
psp_family = PseudoFamily("dojo.nc.sr.lda.v0_4_1.standard.upf")
Si = ElementPsp(:Si, psp_family)

atoms_prim = [Si, Si]
positions_prim = [ones(3)/8, -ones(3)/8]

model_prim = model_DFT(lattice_prim, atoms_prim, positions_prim; functionals=LDA())
basis_prim = PlaneWaveBasis(model_prim; Ecut=50.0, kgrid=(4, 4, 4))

scfres_prim = self_consistent_field(basis_prim, tol=1e-4)

band_data_prim = compute_bands(scfres_prim; kline_density=30)
  1. Compute the supercell:
sc_system = create_supercell(lattice_prim, atoms_prim, positions_prim, (2, 2, 2))
model_sc = model_DFT(sc_system.lattice, sc_system.atoms, sc_system.positions; functionals=LDA())
basis_sc = PlaneWaveBasis(model_sc; Ecut=50.0, kgrid=(2, 2, 2))

scfres_sc = self_consistent_field(basis_sc, tol=1e-5)
  1. Compare unfolded bands and folded bands + plotting:
unfold_results = unfold_bands(scfres_sc, band_data_prim; n_bands=128)

band_data_sc_raw = compute_folded_bands(scfres_sc, band_data_prim; n_bands=128)

fig = Figure(size = (1200, 600), fontsize = 20)

ax_raw = plot_folded_bands!(
        fig[1, 1], 
        band_data_sc_raw,
        unfold_results; 
        title_text="Raw Supercell Bands (Folded)"
 )

ax_unf, sc_unf = plot_unfolded_bands!(
        fig[1, 2], 
        unfold_results; 
        title_text="Unfolded Supercell Bands"
 )

Colorbar(fig[1, 3], sc_unf, label="Spectral Weight")

ylims!(ax_raw, -10, 10) 
ylims!(ax_unf, -10, 10)

hideydecorations!(ax_unf, grid=false, ticks=false)

display(fig)

save_path = "band_unfolding_comparison_lowqual.png"
save(save_path, fig, px_per_unit = 2)

and we can get these results:

band_unfolding_comparison

For more analysis and robust physics proof. We can do the same thing—but with the doping to be N-type and P-type Si. So the results will like these:

doped_si_unfolding_comparison

As can be seen from the N-type and P-type comparison, the valence/conduction bands will be shifted due to we introduce an atom that will bring the new charge carrier.

For performance analysis of bands unfolding can be seen in this table:

Tot / % measured: 2137s / 100.0% | 0.98TiB / 100.0%

Section ncalls time %tot avg alloc %tot avg
Stage 4: Plotting & Saving 1 1013s 47.4% 1013s 458GiB 45.6% 458GiB
Stage 3: Band Unfolding 1 1010s 47.2% 1010s 501GiB 49.9% 501GiB
Stage 2: Supercell SCF 1 79.2s 3.7% 79.2s 27.0GiB 2.7% 27.0GiB
Stage 1: Primitive SCF & Path 1 35.2s 1.6% 35.2s 17.6GiB 1.8% 17.6GiB

Note: the stage 4 part is computing the folded bands + saving the plotting with CairoMakie.

With kgrid = 2 x 2 x 2 for $2^3$ supercell (for primitive the kgrid = 4 x 4 x 4), Ecut = 50.0, and nbands = 128, this is an impersive performance.

And that's all my proposal of the PR. Sorry for all the shortcomings, this is my first time doing the PR. I hope this can help to make the DFTK.jl more reliable and capable to do complex things with its methodological developent.

I would be grateful to receive your feedback and any suggestions for improvement.

@physarief78 physarief78 marked this pull request as draft February 24, 2026 04:54
@physarief78 physarief78 marked this pull request as ready for review February 24, 2026 17:54
@physarief78 physarief78 marked this pull request as draft February 24, 2026 17:58
@physarief78 physarief78 marked this pull request as ready for review February 24, 2026 17:59
@antoine-levitt
Copy link
Copy Markdown
Member

Hi,

Thanks for this, looks very nice! However this is the kind of postprocessing that is largely independent from DFTK itself and can fully be done in an external package. I think the best way forward is to keep it in a separate package. Maybe we can add a reference somewhere in the docs to an "applications" page where we list all apps that depend on DFTK? There's a bunch of stuff like that that could go to such mini-apps (like the periodic green function, or even some stuff currently in DFTK itself, like DOS plotting or refinement).

(also when making this kind of PR please disclose which part was generated by AI, what steps you took to make sure it was correct, etc. We're also interested in feedback in AI workflows, if something can be improved on how to make AI tools work better with DFTK, etc)

@physarief78
Copy link
Copy Markdown
Author

physarief78 commented Feb 25, 2026

Thank you for the feedback and response, mister @antoine-levitt !

Honestly, I first thought about making this into a new package. However, the computation I need heavily depends on DFTK.jl for the band structure and density of states (DoS) calculations, and I cannot imagine how my code would work without DFTK.jl. So, I decided to simply implement this directly in DFTK.jl.

Plus, I'm also not entirely sure how to expand my code into a standalone package, since my background and motivation for creating it are solely to analyze my college work using DFTK.jl (specifically for band folding and unfolding calculations). That's really my main focus. However, if it's better to keep this work outside of DFTK.jl, would you be willing to offer any suggestions on how I should approach developing it as a separate package?

Regarding the AI I used for coding, I'm still unsure how to acknowledge it in my code. Initially, I hard-coded everything, but my code took 9 hours just for the unfolding computation—that's far too long and inefficient. So I used Gemini to help improve it, but rather than directly asking the AI to rewrite the code, I asked it for insights on how I could enhance it and which parts I should modify to achieve better performance. Then I tried coding it again, and if I encountered errors, I turned to the AI again, asking it to explain why the code still wouldn't run properly. If the error persisted, I gathered all the information from manually reading the documentation or watching tutorial videos, fed that context back to the AI, and guided it on what needed to be done correctly.

@antoine-levitt
Copy link
Copy Markdown
Member

Regarding making a new package, you can just make it like any other julia package (you can find tutorials on how to do it). Then you make that package depend on dftk, with an explicit upper bound on the dftk version, so it's always guaranteed to work even if we make breaking changes to dftk. If we make a list of packages that depend on dftk that allows us to look at how potentially breaking changes we make affect them and then we can help downstream packages adapt.

Re AI, in my limited experience it's great at making one-shot versions of features (with appropriate guidance), and refactoring when needed. It's terrible at figuring out its own bugs. I think to go to the next level it needs better feedback loops as well as better instructions (eg agents.md files) but I haven't been able to set it up correctly (julia compilation times appears to be a big factor, the AI just gives up if running tests takes more than a few seconds and then tries to "fix" it)

@physarief78
Copy link
Copy Markdown
Author

Thanks for your suggestions!

I think I will set this PR back to draft while I try to finalize this into a package that will be integrated with and use DFTK as a dependency.

And regarding the AI, based on my experience using it, we also need to brainstorm with it before prompting it to generate something (like ask the AI the fundamental physics/math logic for the problem that we want to code it for it solver). When we do that, the answers we get are often much better than if we just tell the AI straight away what we want.

@physarief78 physarief78 marked this pull request as draft February 26, 2026 21:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants