
Commit 95ea87b

Refactored
Now focuses on Microsoft's official BitNet model
1 parent fb944f4 commit 95ea87b

File tree

5 files changed: +280 additions, −112 deletions

.github/workflows/docker-image.yml

Lines changed: 29 additions & 0 deletions

@@ -0,0 +1,29 @@
+name: Build and Push Docker Image
+
+on:
+  push:
+    tags:
+      - "v*.*.*"
+
+jobs:
+  build-and-push:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+
+      - name: Set up Docker Buildx
+        uses: docker/setup-buildx-action@v3
+
+      - name: Log in to Docker Hub
+        uses: docker/login-action@v3
+        with:
+          username: ${{ secrets.DOCKERHUB_USERNAME }}
+          password: ${{ secrets.DOCKERHUB_TOKEN }}
+
+      - name: Build and push Docker image
+        uses: docker/build-push-action@v5
+        with:
+          context: .
+          push: true
+          tags: grctest/fastapi_bitnet:latest
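The `v*.*.*` tag filter above means only semver-style tags start a build. A minimal shell sketch of which refs would trigger it (tag names are illustrative, and a shell `case` glob only approximates GitHub's filter-pattern syntax):

```shell
# Approximate the workflow's v*.*.* tag filter with a shell case glob
matches_trigger() {
  case "$1" in
    v*.*.*) echo "builds" ;;
    *)      echo "skipped" ;;
  esac
}

matches_trigger "v1.2.3"   # builds
matches_trigger "main"     # skipped
matches_trigger "v2"       # skipped
```

So pushing a tag such as `v1.2.3` (e.g. `git tag v1.2.3 && git push origin v1.2.3`) is what kicks off the image build and push.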

Dockerfile

Lines changed: 16 additions & 28 deletions

@@ -1,44 +1,32 @@
-FROM python:3.9
+FROM python:3.10
 
 WORKDIR /code
 
 COPY ./app /code
 
-RUN if [ -z "$(ls -A /code/models)" ]; then \
-    echo "Error: No models found in /code/models" && exit 1; \
-    fi
+# Clone BitNet with submodules directly into /code (ensures all files and submodules are present)
+RUN git clone --recursive https://github.com/microsoft/BitNet.git /tmp/BitNet && \
+    cp -r /tmp/BitNet/* /code && \
+    rm -rf /tmp/BitNet
 
+# Install dependencies
 RUN apt-get update && apt-get install -y \
     wget \
     lsb-release \
     software-properties-common \
     gnupg \
-    cmake && \
-    bash -c "$(wget -O - https://apt.llvm.org/llvm.sh)" && \
-    apt-get clean && \
-    rm -rf /var/lib/apt/lists/*
-
-RUN git clone --recursive https://github.com/microsoft/BitNet.git /tmp/BitNet && \
-    cp -r /tmp/BitNet/* /code && \
-    rm -rf /tmp/BitNet
+    cmake \
+    clang \
+    && bash -c "$(wget -O - https://apt.llvm.org/llvm.sh)" \
+    && apt-get clean \
+    && rm -rf /var/lib/apt/lists/*
 
+# Install Python dependencies
 RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt && \
-    pip install "fastapi[standard]" "uvicorn[standard]"
-
-RUN if [ -d "/code/models/Llama3-8B-1.58-100B-tokens" ]; then \
-    python /code/setup_env.py -md /code/models/Llama3-8B-1.58-100B-tokens -q i2_s --use-pretuned && \
-    find /code/models/Llama3-8B-1.58-100B-tokens -type f -name "*f32*.gguf" -delete; \
-    fi
-
-RUN if [ -d "/code/models/bitnet_b1_58-large" ]; then \
-    python /code/setup_env.py -md /code/models/bitnet_b1_58-large -q i2_s --use-pretuned && \
-    find /code/models/bitnet_b1_58-large -type f -name "*f32*.gguf" -delete; \
-    fi
-
-RUN if [ -d "/code/models/bitnet_b1_58-3B" ]; then \
-    python /code/setup_env.py -md /code/models/bitnet_b1_58-3B -q i2_s --use-pretuned && \
-    find /code/models/bitnet_b1_58-3B -type f -name "*f32*.gguf" -delete; \
-    fi
+    pip install "fastapi[standard]" "uvicorn[standard]" httpx fastapi-mcp
+
+# (Optional) Run your setup_env.py if needed
+RUN python /code/setup_env.py -md /code/models/BitNet-b1.58-2B-4T -q i2_s
 
 EXPOSE 8080
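Note that the refactor drops the old build-time guard that failed the build when the models directory was empty. If that safety net is still wanted, the same check can be kept as a small shell function; a sketch (the directory path and file name here are illustrative):

```shell
# Fail fast when a models directory is missing or empty, as the old Dockerfile did
check_models() {
  if [ -z "$(ls -A "$1" 2>/dev/null)" ]; then
    echo "Error: No models found in $1"
    return 1
  fi
  echo "ok"
}

dir=$(mktemp -d)
check_models "$dir" || true   # prints the error: the directory is empty
touch "$dir/model.gguf"
check_models "$dir"           # ok
```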

README.md

Lines changed: 8 additions & 16 deletions

@@ -8,12 +8,16 @@ It's offers the same functionality as the [Electron-BitNet](https://github.com/g
 
 ## Setup instructions
 
+If running in dev mode, run Docker Desktop on windows to initialize docker in WSL2.
+
+Launch WSL: `wsl`
+
 Install Conda: https://anaconda.org/anaconda/conda
 
 Initialize the python environment:
 ```
 conda init
-conda create -n bitnet python=3.9
+conda create -n bitnet python=3.11
 conda activate bitnet
 ```
 
@@ -22,11 +26,9 @@ Install the Huggingface-CLI tool to download the models:
 pip install -U "huggingface_hub[cli]"
 ```
 
-Download one/many of the 1-bit models from Huggingface below:
+Download Microsoft's official BitNet model:
 ```
-huggingface-cli download 1bitLLM/bitnet_b1_58-large --local-dir app/models/bitnet_b1_58-large
-huggingface-cli download 1bitLLM/bitnet_b1_58-3B --local-dir app/models/bitnet_b1_58-3B
-huggingface-cli download HF1BitLLM/Llama3-8B-1.58-100B-tokens --local-dir app/models/Llama3-8B-1.58-100B-tokens
+huggingface-cli download microsoft/BitNet-b1.58-2B-4T-gguf --local-dir app/models/BitNet-b1.58-2B-4T
 ```
 
 Build the docker image:
@@ -39,14 +41,4 @@ Run the docker image:
 docker run -d --name ai_container -p 8080:8080 fastapi_bitnet
 ```
 
-Once it's running navigate to http://127.0.0.1:8080/docs
-
----
-
-Note:
-
-If seeking to use this in production, make sure to extend the docker image with additional [authentication security](https://github.com/mjhea0/awesome-fastapi?tab=readme-ov-file#auth) steps. In its current state it's intended for use locally.
-
-Building the docker file image requires upwards of 40GB RAM for `Llama3-8B-1.58-100B-tokens`, if you have less than 64GB RAM you will probably run into issues.
-
-The Dockerfile deletes the larger f32 files, so as to reduce the time to build the docker image file, you'll need to comment out the `find /code/models/....` lines if you want the larger f32 files included.
+Once it's running navigate to http://127.0.0.1:8080/docs
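Beyond opening the docs page in a browser, a quick smoke test from the host can confirm the app is serving; FastAPI exposes its OpenAPI schema at `/openapi.json` by default, and the port comes from the README's `-p 8080:8080` mapping:

```shell
# Print the HTTP status of the OpenAPI schema; 200 means the app is up
curl -s -o /dev/null -w '%{http_code}\n' http://127.0.0.1:8080/openapi.json
```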
