kawabata-tomoko/LongNetSample


This repository is a lightly modified version of torchscale:

  • Removed the MoE, VisionEmbedding, MultiWay, and related components
  • Simplified the code paths:
    • Merged EncoderConfig, DecoderConfig, etc. into a single LongnetConfig
    • Built LongNetTokenizer on top of ESMTokenizer
    • Folded sparse-attention support into the config settings
  • Wrapped the models with the Transformers library, specialized for language modeling:
    • LongNetPretrainedModel simplifies and consolidates the parameter initialization that was scattered across the original repo
    • Three base models: encoder-only (LongNetEncoderLM), decoder-only (LongNetDecoderLM), and seq2seq (LongNetEncoderDecoderLM)
    • Example sequence-classification models built on the encoder-only and decoder-only bases, with corresponding pre-training examples
    • A sample training script for sequence classification
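To illustrate the merged-config idea, a single config object can hold the settings that torchscale splits across EncoderConfig and DecoderConfig, with the sparse-attention parameters folded in alongside them. This is a minimal sketch of the pattern, not the repository's actual class; the field names here (e.g. `segment_lengths`, `dilation_rates`) are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class LongnetConfigSketch:
    # Shared transformer settings, formerly split across
    # EncoderConfig / DecoderConfig in torchscale.
    vocab_size: int = 64000
    hidden_size: int = 768
    num_layers: int = 12
    num_heads: int = 12
    # Sparse (dilated) attention settings, folded into the same
    # config instead of living in a separate attention module setup.
    segment_lengths: list = field(default_factory=lambda: [2048, 4096, 8192])
    dilation_rates: list = field(default_factory=lambda: [1, 2, 4])

# One object configures both stacks and the attention pattern.
cfg = LongnetConfigSketch(num_layers=6)
assert len(cfg.segment_lengths) == len(cfg.dilation_rates)
```

Collapsing the configs this way also fits the Transformers convention of one `PretrainedConfig` subclass per model family, which is what makes the `LongNetPretrainedModel` wrapper straightforward.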
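The sparse attention referred to above is LongNet-style dilated attention: the sequence is split into segments, and within each segment only every r-th position participates in a given attention pattern. A toy helper (not from this repo) shows which positions one (segment length, dilation rate) pair selects:

```python
def dilated_indices(seq_len: int, segment_len: int, dilation: int) -> list:
    """Positions attended under one (segment_len, dilation) pattern.

    The sequence is cut into segments of segment_len tokens; within
    each segment, every `dilation`-th position is kept. LongNet mixes
    several such patterns (with different segment lengths and rates)
    to cover both local and long-range interactions.
    """
    idx = []
    for start in range(0, seq_len, segment_len):
        end = min(start + segment_len, seq_len)
        idx.extend(range(start, end, dilation))
    return idx

# With dilation 1 every position is kept; larger rates thin out
# the pattern, which is what makes the attention cost sub-quadratic.
print(dilated_indices(8, 4, 2))  # → [0, 2, 4, 6]
```

In the actual model these index sets drive which query/key rows enter each attention computation; here they only illustrate how the config's segment-length and dilation-rate lists translate into a sparsity pattern.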
