ALOS (Advanced Logistics Operating System) is a concept for a regional logistics OS for complex trade corridors
(Iran–Caucasus–Black Sea, INSTC, Middle Corridor, etc.).
This repository contains an MVP of the customs engine module,
ANI (Automated Neural Inspector).
Goal: given a messy, mixed-language item description
(e.g. `բլուտուզ խոսփաքեր`, a phonetic Armenian-script spelling of "bluetooth speaker"),
produce a suggested HS (Harmonized System tariff) code and a simple risk flag.
This prototype demonstrates the core idea of the customs engine:
- **Normalization**
  - Converts Armenian script and mixed text into a cleaned Latin form.
  - Example: `բլուտուզ խոսփաքեր (новый, 2025)` → `blutuz khospaker novyi 2025`
- **Embeddings & semantic search**
  - Uses `sentence-transformers` to encode the normalized text into vector embeddings.
  - Compares it to a tiny in-memory knowledge base of reference descriptions.
- **HS classification (very small demo)**
  - Picks the closest description and returns:
    - HS code (e.g. `851821` for bluetooth speakers),
    - risk flag (`low`, `encryption`, `dual-use`),
    - confidence score.
It is not production-ready, but is meant to show the architecture:
normalization → embeddings → semantic search → HS classification.
alos-ani-mvp/
├─ README.md
├─ requirements.txt
├─ alos_core/
│  ├─ __init__.py
│  └─ customs/
│     ├─ __init__.py
│     ├─ normalizer.py   # Armenian → Latin + cleanup
│     ├─ engine.py       # Embeddings + semantic HS lookup
│     └─ api.py          # FastAPI HTTP endpoint
└─ examples/
   └─ demo_classify.py   # Simple CLI demo
## Core modules

`alos_core/customs/normalizer.py`

`ArmNormalizer` class:
- lowercases and trims the input,
- maps Armenian letters to Latin (բ → b, խ → kh, ու → u, …),
- removes non-alphanumeric noise.
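
The normalization steps listed above can be sketched roughly as follows. This is a minimal illustration and not the repository's `ArmNormalizer`: the function name `normalize` and the table `ARMENIAN_TO_LATIN` are assumptions made for this example, and the table covers only the letters needed for the sample input. Cyrillic handling is deliberately left out.

```python
import re

# Hypothetical, partial mapping (assumption): only the letters needed for
# the sample input below. A real table would cover the full alphabet.
ARMENIAN_TO_LATIN = {
    "ու": "u",  # digraph, must be replaced before the single letter "ո"
    "ա": "a", "բ": "b", "ե": "e", "զ": "z", "խ": "kh", "լ": "l",
    "ո": "o", "ս": "s", "տ": "t", "ր": "r", "փ": "p", "ք": "k",
}


def normalize(text: str) -> str:
    """Lowercase, trim, transliterate Armenian letters, drop punctuation."""
    text = text.strip().lower()
    # Longest keys first, so the digraph "ու" wins over the single "ո".
    for src in sorted(ARMENIAN_TO_LATIN, key=len, reverse=True):
        text = text.replace(src, ARMENIAN_TO_LATIN[src])
    # Replace anything that is not a letter, digit, underscore or whitespace.
    text = re.sub(r"[^\w\s]", " ", text)
    # Collapse runs of whitespace left behind by the previous step.
    return re.sub(r"\s+", " ", text).strip()
```

With this table, `normalize("բլուտուզ խոսփաքեր!")` yields `blutuz khospaker`, which matches the normalized form shown in the sample output of this README.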
`alos_core/customs/engine.py`

`CustomsEngine` class:
- loads a multilingual SentenceTransformer model (`paraphrase-multilingual-MiniLM-L12-v2`),
- keeps a small, hard-coded knowledge base of item descriptions with HS codes,
- provides `predict_hs(raw_query: str) -> dict`, returning `hs_code`, `risk_flag`, `confidence`, etc.
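
The lookup described above is a nearest-neighbour search by cosine similarity over a small knowledge base. The sketch below shows that idea under stated assumptions: `embed` is a stand-in that counts character trigrams so the example runs without downloading a model (the real engine uses a SentenceTransformer model instead), the two knowledge-base entries are copied from the sample output in this README, and the function takes an already normalized query.

```python
import math
from collections import Counter

# Two illustrative entries, copied from the sample output in this README.
KNOWLEDGE_BASE = [
    {"desc": "wireless bluetooth speaker portable audio device",
     "hs_code": "851821", "risk_flag": "low"},
    {"desc": "smartphone mobile phone iphone samsung with encryption",
     "hs_code": "851712", "risk_flag": "encryption"},
]


def embed(text: str) -> Counter:
    """Stand-in for a sentence-embedding model (assumption): trigram counts."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def predict_hs(query: str) -> dict:
    """Return the knowledge-base entry closest to an already normalized query."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(entry["desc"])), entry)
              for entry in KNOWLEDGE_BASE]
    confidence, best = max(scored, key=lambda pair: pair[0])
    return {
        "normalized": query,
        "matched_desc": best["desc"],
        "confidence": round(confidence, 2),
        "hs_code": best["hs_code"],
        "risk_flag": best["risk_flag"],
    }
```

Swapping `embed` for a real embedding model changes the scores but not the structure: encode the query, compare it with every stored description, and return the best match together with its similarity as the confidence value.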
`alos_core/customs/api.py`

FastAPI application exposing a single endpoint:
- `POST /classify` with JSON body `{"text": "..."}`,
- returns JSON with the HS code prediction and metadata.

`examples/demo_classify.py`

Simple interactive CLI tool for quick testing.
## Installation
git clone https://github.com/arutovan-droid/alos-ani-mvp.git
cd alos-ani-mvp
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
## Usage

### 1. CLI demo
Run:
python -m examples.demo_classify
Sample interaction:
ALOS ANI demo. Type text (Ctrl+C to exit).
> բլուտուզ խոսփաքեր
HS: 851821 | conf=0.91 | risk=low | matched='wireless bluetooth speaker portable audio device' | norm='blutuz khospaker'
> айфон 15 про
HS: 851712 | conf=0.88 | risk=encryption | matched='smartphone mobile phone iphone samsung with encryption' | norm='аифон 15 про'
(Numbers in this example are illustrative — exact scores depend on the model run.)
### 2. HTTP API (FastAPI)
Start the API server:
uvicorn alos_core.customs.api:app --reload
You should see something like:
INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
#### Swagger UI
Open in a browser:
http://127.0.0.1:8000/docs
You can test the /classify endpoint interactively there.
#### cURL example
curl -X POST http://127.0.0.1:8000/classify \
-H "Content-Type: application/json" \
-d '{"text": "բլուտուզ խոսփաքեր"}'
Example JSON response:
{
"raw_input": "բլուտուզ խոսփաքեր",
"normalized": "blutuz khospaker",
"matched_desc": "wireless bluetooth speaker portable audio device",
"confidence": 0.91,
"hs_code": "851821",
"risk_flag": "low"
}
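
The same request can also be sent from Python. The sketch below uses only the standard library and assumes the server is running locally on the default address shown in this README. The helper names `build_classify_request` and `classify` are hypothetical and are not part of the repository.

```python
import json
import urllib.request

# Local development address from this README (assumption: default uvicorn port).
API_URL = "http://127.0.0.1:8000/classify"


def build_classify_request(text: str, url: str = API_URL) -> urllib.request.Request:
    """Build the POST request with a UTF-8 encoded JSON body."""
    body = json.dumps({"text": text}, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(text: str) -> dict:
    """Send the request and decode the JSON reply (needs a running server)."""
    request = build_classify_request(text)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `classify("բլուտուզ խոսփաքեր")` should return a dictionary with the same keys as the JSON response above.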
## Notes on language & data

In real customs documents in this region, we mostly see:
- Armenian script + Russian/English,
- sometimes mixed with transliterated Chinese/Turkish product names.

Armenian written in Latin (“barev dzez”-style, hy-Latn) is common in chats,
but relatively rare in official documents.

The current MVP:
- explicitly handles Armenian script via a custom normalizer,
- treats existing Latin text as-is,
- is easy to extend later with:
  - better Armenian normalization,
  - dedicated support for hy-Latn,
  - larger HS code dictionaries and proper vector databases.
## Roadmap / TODO (high-level)

- Replace the tiny hard-coded knowledge base with:
  - a proper HS code + description dataset,
  - a vector DB (Qdrant / FAISS / Chroma).
- Add top-k results and ambiguity handling:
  - multiple candidate HS codes with scores,
  - thresholds for human review.
- Extend language handling:
  - richer Armenian normalization,
  - better support for Russian + hybrid text,
  - optional hy-Latn handling if real data requires it.
- Add logging & telemetry:
  - track corrections from human inspectors,
  - support human-in-the-loop learning.
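
The top-k and review-threshold item in the roadmap above could look roughly like the sketch below. It is one possible design, not something the MVP implements: the function name `rank_candidates` and the threshold values `min_confidence` and `min_margin` are placeholders that would need tuning on real declarations.

```python
def rank_candidates(scored, k=3, min_confidence=0.5, min_margin=0.05):
    """Rank HS candidates and flag ambiguous results for human review.

    scored: list of (hs_code, similarity) pairs from the semantic search.
    The default thresholds are placeholders (assumption), not tuned values.
    """
    # Keep the best score per HS code, since several descriptions may share one.
    best_per_code = {}
    for hs_code, score in scored:
        if score > best_per_code.get(hs_code, float("-inf")):
            best_per_code[hs_code] = score

    ranked = sorted(best_per_code.items(), key=lambda item: item[1], reverse=True)[:k]
    top_score = ranked[0][1] if ranked else 0.0
    # Gap between the best and second-best candidate; small gap = ambiguous.
    margin = top_score - ranked[1][1] if len(ranked) > 1 else top_score

    needs_review = top_score < min_confidence or margin < min_margin
    return {"candidates": ranked, "needs_review": needs_review}
```

A result is sent to a human when the best score is low or when the two best candidates are too close to call. An empty candidate list is also flagged for review.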
## Technical Roadmap (Vector Database Integration)

**Phase 1 – Current MVP (done):**
- Customs microservice (ANI – Automated Neural Inspector) built on FastAPI.
- Semantic search using SentenceTransformer to match free-text item descriptions with HS codes.
- Simple in-memory knowledge base for pilot scenarios.

**Phase 2 – Vector Database for Scale (Qdrant / FAISS):**
- Integrate a dedicated vector store (a Qdrant database or a FAISS index) as the primary store for embeddings.
- Support millions of records (item descriptions, HS codes, historical declarations) with fast cosine-similarity search.
- Enable online learning: every correction made by a human broker (human-in-the-loop) is written back into the vector store and improves future predictions.
- Partition the vector index by region/jurisdiction (EU, EAEU, Iran, China, etc.) to reflect local rules and tariff specifics.
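
To make the Phase 2 ideas concrete, the sketch below models a region-partitioned store with correction write-back using plain Python lists. It is an in-memory stand-in and does not use the Qdrant or FAISS APIs. The class name `PartitionedStore` and its methods are hypothetical, and the vectors in any usage would come from the embedding model.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class PartitionedStore:
    """In-memory stand-in (assumption) for a vector DB with one index per region."""

    def __init__(self):
        self._partitions = defaultdict(list)  # region -> [(vector, hs_code)]

    def add(self, region, vector, hs_code):
        self._partitions[region].append((list(vector), hs_code))

    def search(self, region, vector):
        """Return (hs_code, similarity) of the nearest record in one region."""
        records = self._partitions.get(region, [])
        if not records:
            return None
        best_vec, best_code = max(records, key=lambda rec: cosine(vector, rec[0]))
        return best_code, cosine(vector, best_vec)

    def record_correction(self, region, vector, corrected_hs_code):
        """Human-in-the-loop write-back: store the broker's corrected label."""
        self.add(region, vector, corrected_hs_code)
```

Searches only see records from the requested region, so a correction recorded for one jurisdiction does not affect another. This sketch simply appends corrections. A real system would also need a rule for how a correction overrides an older record with the same or a near-identical vector.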
**Phase 3 – Shared “central memory” for all ALOS modules:**
- Logistics planning, insurance, and customs modules all run on top of the same vector knowledge base.
- This allows:
  - full history per client and item (routes, risks, delays, claims);
  - cross-modal risk models (route + cargo type + jurisdiction);
  - more accurate delay and cost predictions across the entire corridor.

One-line summary for slides:
Next step: migrate from a simple in-memory example store to a production-grade vector database (Qdrant/FAISS), so the system can handle millions of items and declarations in real time.
## Disclaimer
This repository is a minimal proof-of-concept for the ALOS customs engine (ANI):
It is not meant for production use as-is.
The HS codes and descriptions in the knowledge base are illustrative.
There is no legal or compliance guarantee in this code.
It is intended to demonstrate one possible architecture for
an AI-assisted customs classification engine in a complex multilingual environment.
## Quickstart
ALOS ANI MVP is a minimal **Automated Neural Inspector** for customs HS code classification.
Input: free-text description from invoice/CMR.
Output: suggested HS code, confidence and risk flag.
---
## 1. Run locally (Python + venv)
```bash
# From repo root
python -m venv .venv
.venv\Scripts\activate            # on Windows
# source .venv/bin/activate       # on macOS/Linux
pip install -r requirements.txt
uvicorn alos_core.customs.api:app --reload
```
Open the interactive docs:
http://127.0.0.1:8000/docs
Example request (Windows CMD; the `^` line continuations do not work in PowerShell):
curl -X POST "http://127.0.0.1:8000/classify" ^
-H "Content-Type: application/json" ^
-d "{\"text\": \"բլուտուզ խոսփաքեր\"}"
## 2. Run with Docker
Build image:
docker buildx build -t alos-customs --load .
Run container:
docker run -d --name alos-customs -p 8000:8000 alos-customs
Check that the container is up:
docker ps
API will be available at:
Docs: http://127.0.0.1:8000/docs
Classify example:
curl -X POST "http://127.0.0.1:8000/classify" ^
-H "Content-Type: application/json" ^
-d "{\"text\": \"բլուտուզ խոսփաքեր\"}"
Response example:
{
"raw_input": "բլուտուզ խոսփաքեր",
"normalized": "blutuz khospaker",
"matched_desc": "wireless bluetooth speaker portable audio device blutuz khospaker колонка blutuz speaker",
"confidence": 0.26,
"hs_code": "851821",
"risk_flag": "low"
}