Quantum GPT (Hybrid QNN-NanoGPT)

Python · PyTorch · PennyLane · MIT License

A hybrid Quantum-Classical implementation of a Generative Pre-trained Transformer (GPT). This project adapts Andrej Karpathy's nanoGPT architecture by replacing classical linear layers in the Self-Attention mechanism with Variational Quantum Circuits (VQC) using PennyLane.

🚀 Scientific Concept

In a standard Transformer, the Attention Head projects input tokens into Query, Key, and Value spaces using linear matrices ($W_Q, W_K, W_V$).

In this Quantum-Hybrid architecture, we replace these dense layers with a parameterized quantum evolution:

$$ x \xrightarrow{\text{Adapter}} z \in \mathbb{R}^n \xrightarrow{R(\phi)} |\psi(z)\rangle \xrightarrow{U(\theta)_{\text{entangle}}} \langle Z \rangle \to y $$

Where:

  • Adapter: A classical bottleneck layer compressing high-dimensional embeddings to $n$ qubits.
  • $R(\phi)$: Angle embedding encoding classical data into quantum states.
  • $U(\theta)$: A sequence of trainable entangling layers (Strongly Entangling Layers).
  • $\langle Z \rangle$: Expectation value measurement returning the projected vector.

Why?

This architecture lets us study whether the high-dimensional Hilbert space and quantum interference can capture semantic relationships more parameter-efficiently than classical linear algebra, despite the constraints of current NISQ-era simulation. It also provides a testbed for exploring the expressivity of variational quantum circuits within a sequence modeling task.

Note: We employ a Quantum Bottleneck architecture. High-dimensional classical embeddings are projected down to a lower-dimensional quantum latent space via a trainable adapter, processed by the VQC, and projected back. This maintains computational feasibility while exploiting quantum interference.
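To make the pipeline concrete, here is a minimal sketch of one such head in PennyLane + PyTorch. This is an illustration of the idea, not the repository's actual code (that lives in src/quantum_layers.py): the class and variable names are hypothetical, while qml.AngleEmbedding, qml.StronglyEntanglingLayers, and qml.qnn.TorchLayer are standard PennyLane APIs.

import pennylane as qml
import torch.nn as nn

N_QUBITS = 4   # quantum latent dimension
N_QLAYERS = 2  # circuit depth

dev = qml.device("default.qubit", wires=N_QUBITS)

@qml.qnode(dev, interface="torch")
def vqc(inputs, weights):
    # R(phi): encode the adapter output as rotation angles
    qml.AngleEmbedding(inputs, wires=range(N_QUBITS))
    # U(theta): trainable strongly entangling layers
    qml.StronglyEntanglingLayers(weights, wires=range(N_QUBITS))
    # <Z>: one expectation value per qubit
    return [qml.expval(qml.PauliZ(i)) for i in range(N_QUBITS)]

class QuantumHead(nn.Module):
    """Hypothetical adapter -> VQC -> read-out bottleneck."""
    def __init__(self, embed_dim):
        super().__init__()
        self.adapter = nn.Linear(embed_dim, N_QUBITS)  # compress to n qubits
        self.vqc = qml.qnn.TorchLayer(vqc, {"weights": (N_QLAYERS, N_QUBITS, 3)})
        self.readout = nn.Linear(N_QUBITS, embed_dim)  # project back

    def forward(self, x):
        # x: (..., embed_dim); flatten leading dims for the quantum layer
        lead = x.shape[:-1]
        z = self.adapter(x).reshape(-1, N_QUBITS)
        return self.readout(self.vqc(z)).reshape(*lead, -1)

qml.qnn.TorchLayer registers the circuit weights as ordinary nn.Parameters, so the whole head trains end-to-end against the classical loss.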

📂 Project Structure

quantum-gpt/
├── checkpoints/                # Saved models
├── data/                       # Input text data
├── src/                        # Source code
│   ├── config.py               # Hyperparameters & flags
│   ├── dataset.py              # Tokenizer & Dataloader
│   ├── model.py                # Transformer Architecture
│   └── quantum_layers.py       # PennyLane Circuits & Hybrid Layers
├── main.py                     # Entry point (Train/Generate)
└── requirements.txt            # Dependencies

🛠️ Installation

Clone the repository:

git clone https://github.com/lorenzomaiuri-dev/quantum-gpt.git
cd quantum-gpt

Install dependencies:

pip install -r requirements.txt

⚡ Usage

Training

To train the model on the Shakespeare dataset (included in data/):

python main.py --mode train

Note: Quantum simulation is CPU-intensive. The default configuration uses a "Quantum Bottleneck" (4-8 qubits) to keep training times feasible on consumer hardware.

Generation

To generate text using the trained checkpoint:

python main.py --mode generate

⚙️ Configuration

You can modify hyperparameters in src/config.py:

# Quantum Settings
USE_QUANTUM = True      # Set False to use standard Linear Layers
N_QUBITS = 4            # Number of qubits per head
N_QLAYERS = 2           # Depth of the quantum circuit
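A hedged sketch of how such a flag might be consumed when building the model (hypothetical wiring and import path; the real logic lives in src/model.py, and QuantumHead refers to the sketch above):

import torch.nn as nn
from src import config  # assumed import path

def make_projection(embed_dim, head_size):
    # Fall back to a plain linear projection when quantum mode is off
    if config.USE_QUANTUM:
        return QuantumHead(embed_dim)  # hypothetical module from the earlier sketch
    return nn.Linear(embed_dim, head_size, bias=False)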

🧠 Architecture Details

  • Embedding Dimension: 8 (scaled down for simulation speed)
  • Heads: 2
  • Qubits per Head: 4

📊 Preliminary Results (Coming Soon)

Comparison of Classical (64 params) vs. Hybrid Quantum (4-qubit) attention heads:

  • Loss Convergence: Comparing training stability.
  • Parameter Efficiency: Can quantum circuits learn with fewer parameters? (A rough count follows below.)
  • Runtime Analysis: Quantifying the overhead of quantum simulation.
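As a back-of-envelope count under the defaults above: PennyLane's Strongly Entangling Layers use 3 rotation angles per qubit per layer, so each quantum head carries N_QLAYERS × N_QUBITS × 3 = 2 × 4 × 3 = 24 trainable circuit angles, plus whatever the classical adapter and read-out projections contribute (their exact shapes live in the source).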

🙏 Acknowledgements

  • Andrej Karpathy for the original nanoGPT and the accompanying video lecture.
  • Xanadu for PennyLane, the quantum machine learning library used in this project.
