AI-Flow-Information Capacity

🏆 Leaderboard | 🖥️ GitHub | 🤗 Hugging Face | 📑 Paper

Information Capacity evaluates an LLM's efficiency based on text compression performance relative to computational complexity, harnessing the inherent correlation between compression and intelligence. Larger models can predict the next token more accurately, leading to higher compression gains but at increased computational costs. Consequently, a series of models with varying sizes exhibits consistent information capacity, which can be used to compare model capability across model series and predict model performance within a series. It also facilitates dynamic routing of different-sized models for efficient handling of tasks with varying difficulties, which is especially relevant to the device-edge-cloud infrastructure detailed in the AI Flow framework. With the rapid evolution of edge intelligence, we believe that this hierarchical network will replace the mainstream cloud-centric computing scheme in the near future.

Compared to existing metrics of LLM efficiency, a key distinction of information capacity is that it accounts for tokenizer efficiency. An effective tokenizer represents a given text with fewer tokens, reducing both input and output token counts. This reduction not only lowers computational cost and inference latency but also facilitates long-context memory and in-depth reasoning. Tokenizer efficiency is increasingly significant given exploding input lengths and the widespread use of test-time scaling, yet it is often neglected in LLM evaluations. We assess the information capacity of 49 models across 5 heterogeneous datasets and find consistent evidence regarding the influences of tokenizer efficiency, pretraining data, and the mixture-of-experts (MoE) architecture.
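The effect of tokenizer efficiency is easy to observe directly. The snippet below is a minimal sketch (not part of this repo) that counts how many tokens two Hugging Face tokenizers need for the same text; the model IDs are arbitrary public examples, not the models evaluated here.

from transformers import AutoTokenizer

# Compare how many tokens different tokenizers need for the same text.
# The model IDs below are illustrative placeholders, not the evaluated models.
text = "Information capacity measures compression gain relative to compute."
for name in ["gpt2", "Qwen/Qwen2.5-7B"]:
    tok = AutoTokenizer.from_pretrained(name)
    n_tokens = len(tok(text)["input_ids"])
    print(f"{name}: {n_tokens} tokens")

A tokenizer that needs fewer tokens for the same text lowers both prefill and decoding cost, which is why it enters the information capacity calculation.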

Method

Model intelligence is measured by the data-size saving achieved through the LLM's probability predictions. A text sample in the given dataset has an original size of $C$ and is transformed into a sequence of $L$ tokens by the tokenizer of an LLM $M$. The symbol length of the $i$-th token under entropy coding is approximately $-\log p(x_i \mid x_{<i} ; M)$, and the compression gain is the difference between the original data size and the summed symbol lengths of all tokens. Computational complexity is measured by the inference floating-point operations (FLOPs) $N_M$ on a logarithmic scale, following the scaling law. We introduce a negative bias $b$ in the numerator so that different-sized models in a series have nearly identical information capacities, enabling convenient comparison across model sizes and architectures.

In summary, the information capacity is defined as: $$ \text{Information Capacity} = \frac{C - \sum_{i} \left(-\log p(x_i \mid x_{<i} ; M)\right) + b}{\log N_M} . $$
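For concreteness, here is a minimal sketch of the computation, assuming per-token log-probabilities are already available. The function name, the unit convention (everything measured in nats), and the additive treatment of the bias $b$ are assumptions for illustration, not the repo's implementation.

import math

def information_capacity(logprobs, original_size, flops, bias=0.0):
    # logprobs: per-token log-probabilities log p(x_i | x_<i; M), natural log (assumption)
    # original_size: raw text size C, in the same units as logprobs
    # flops: inference FLOPs N_M of the model
    # bias: the (negative) bias b described above, assumed to be added to the numerator
    symbol_lengths = [-lp for lp in logprobs]            # entropy-coding length per token
    compression_gain = original_size - sum(symbol_lengths) + bias
    return compression_gain / math.log(flops)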

Usage

Step 1. Set up an environment suitable for model inference.

pip install numpy torch transformers tqdm flash_attn huggingface_hub

Step 2. Clone this repo.

git clone https://github.com/TeleAI-AI-Flow/InformationCapacity.git
cd InformationCapacity

Step 3. Download test datasets.

hf download TeleAI-AI-Flow/InformationCapacity --repo-type=dataset --include "datasets/**" --local-dir .

Step 4. Run evaluation code.

python calc_ic.py -m path/to/model -d datasets/mixed_text.jsonl -l 1024 -b 1
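To sweep several models or datasets, the same command can be wrapped in a small driver script. The sketch below is hypothetical: the model paths are placeholders, and the -l and -b flags are passed through unchanged from the command above (presumably the sequence length and batch size, which is an assumption).

import subprocess

models = ["path/to/model_a", "path/to/model_b"]   # placeholder model paths
datasets = ["datasets/mixed_text.jsonl"]

for model in models:
    for data in datasets:
        # Invoke the repo's evaluation script once per (model, dataset) pair.
        subprocess.run(
            ["python", "calc_ic.py", "-m", model, "-d", data, "-l", "1024", "-b", "1"],
            check=True,
        )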

Citation

@misc{yuan2025informationcapacity,
      title={Information Capacity: Evaluating the Efficiency of Large Language Models via Text Compression}, 
      author={Cheng Yuan and Jiawei Shao and Chi Zhang and Xuelong Li},
      year={2025},
      eprint={2511.08066},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2511.08066}, 
}

@misc{an2025aiflowperspectivesscenarios,
      title={AI Flow: Perspectives, Scenarios, and Approaches}, 
      author={Hongjun An and Wenhan Hu and Sida Huang and Siqi Huang and Ruanjun Li and Yuanzhi Liang and Jiawei Shao and Yiliang Song and Zihan Wang and Cheng Yuan and Chi Zhang and Hongyuan Zhang and Wenhao Zhuang and Xuelong Li},
      year={2025},
      eprint={2506.12479},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2506.12479}, 
}

@misc{shao2025aiflownetworkedge,
      title={AI Flow at the Network Edge}, 
      author={Jiawei Shao and Xuelong Li},
      year={2025},
      eprint={2411.12469},
      archivePrefix={arXiv},
      primaryClass={eess.SP},
      url={https://arxiv.org/abs/2411.12469}, 
}
