
Commit 6aabbf6: update ref format
1 parent d605f61
2 files changed: 37 additions & 13 deletions

README-ZH.md (19 additions & 7 deletions)
@@ -227,15 +227,15 @@ $ torchrun --nnodes=${NNODES} --nproc_per_node=${GPU_PER_NODE} --rdzv_id=1 --rdz
 
 ## Supported Models
 
-- [CPM: A Large-scale Generative Chinese Pre-trained Language Model.](https://arxiv.org/abs/2012.00413) Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, Maosong Sun. We support loading the following models via ``CPM1.from_pretrained(identifier)``:
+- CPM-1[^1]. We support loading the following models via ``CPM1.from_pretrained(identifier)``:
 
 - cpm1-large
 
-- [CPM-2: Large-scale Cost-efficient Pre-trained Language Models.](https://arxiv.org/abs/2106.10715) Zhengyan Zhang, Yuxian Gu, Xu Han, Shengqi Chen, Chaojun Xiao, Zhenbo Sun, Yuan Yao, Fanchao Qi, Jian Guan, Pei Ke, Yanzheng Cai, Guoyang Zeng, Zhixing Tan, Zhiyuan Liu, Minlie Huang, Wentao Han, Yang Liu, Xiaoyan Zhu, Maosong Sun. We support loading the following models via ``CPM2.from_pretrained(identifier)``:
+- CPM-2[^2]. We support loading the following models via ``CPM2.from_pretrained(identifier)``:
 
 - cpm2-large
 
-- [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.](https://arxiv.org/abs/1810.04805) Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. We support loading the following models via ``Bert.from_pretrained(identifier)``:
+- BERT[^3]. We support loading the following models via ``Bert.from_pretrained(identifier)``:
 
 - bert-base-cased
 - bert-base-uncased
@@ -244,22 +244,22 @@ $ torchrun --nnodes=${NNODES} --nproc_per_node=${GPU_PER_NODE} --rdzv_id=1 --rdz
 - bert-base-chinese
 - bert-base-multilingual-cased
 
-- [T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/abs/1910.10683) Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. We support loading the following models via ``T5.from_pretrained(identifier)``:
+- T5[^4]. We support loading the following models via ``T5.from_pretrained(identifier)``:
 
 - t5-small
 - t5-base
 - t5-large
 - t5-3b
 - t5-11b
 
-- [GPT2: Language Models are Unsupervised Multitask Learners.](http://www.persagen.com/files/misc/radford2019language.pdf) Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. We support loading the following models via ``GPT2.from_pretrained(identifier)``:
+- GPT-2[^5]. We support loading the following models via ``GPT2.from_pretrained(identifier)``:
 
 - gpt2-base
 - gpt2-medium
 - gpt2-large
 - gpt2-xl
 
-- [GPT-J](https://github.com/kingoflolz/mesh-transformer-jax) (from EleutherAI) released in the repo [mesh-transformer-jax](https://github.com/kingoflolz/mesh-transformer-jax) by Ben Wang and Aran Komatsuzaki. We support loading the following models via ``GPTj.from_pretrained(identifier)``:
+- GPT-J[^6]. We support loading the following models via ``GPTj.from_pretrained(identifier)``:
 
 - gptj-6b
 

@@ -279,4 +279,16 @@ $ torchrun --nnodes=${NNODES} --nproc_per_node=${GPU_PER_NODE} --rdzv_id=1 --rdz
 
 ## License
 
-This toolkit is released under the [Apache 2.0](https://github.com/OpenBMB/ModelCenter/blob/main/LICENSE) license.
+This toolkit is released under the [Apache 2.0](https://github.com/OpenBMB/ModelCenter/blob/main/LICENSE) license.
+
+[^1]: [CPM: A Large-scale Generative Chinese Pre-trained Language Model.](https://arxiv.org/abs/2012.00413) Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, Maosong Sun.
+
+[^2]: [CPM-2: Large-scale Cost-efficient Pre-trained Language Models.](https://arxiv.org/abs/2106.10715) Zhengyan Zhang, Yuxian Gu, Xu Han, Shengqi Chen, Chaojun Xiao, Zhenbo Sun, Yuan Yao, Fanchao Qi, Jian Guan, Pei Ke, Yanzheng Cai, Guoyang Zeng, Zhixing Tan, Zhiyuan Liu, Minlie Huang, Wentao Han, Yang Liu, Xiaoyan Zhu, Maosong Sun.
+
+[^3]: [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.](https://arxiv.org/abs/1810.04805) Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova.
+
+[^4]: [T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer.](https://arxiv.org/abs/1910.10683) Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu.
+
+[^5]: [GPT2: Language Models are Unsupervised Multitask Learners.](http://www.persagen.com/files/misc/radford2019language.pdf) Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever.
+
+[^6]: [GPT-J](https://github.com/kingoflolz/mesh-transformer-jax) (from EleutherAI), released in the repo [mesh-transformer-jax](https://github.com/kingoflolz/mesh-transformer-jax) by Ben Wang and Aran Komatsuzaki.

README.md (18 additions & 6 deletions)
@@ -231,15 +231,15 @@ For more information, please refer to the [documentation](https://pytorch.org/do
 ## Supported Models
 
 
-- [CPM: A Large-scale Generative Chinese Pre-trained Language Model.](https://arxiv.org/abs/2012.00413) Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, Maosong Sun. We currently support loading the following checkpoints via ``CPM1.from_pretrained(identifier)``:
+- CPM-1[^1]. We currently support loading the following checkpoints via ``CPM1.from_pretrained(identifier)``:
 
 - cpm1-large
 
-- [CPM-2: Large-scale Cost-efficient Pre-trained Language Models.](https://arxiv.org/abs/2106.10715) Zhengyan Zhang, Yuxian Gu, Xu Han, Shengqi Chen, Chaojun Xiao, Zhenbo Sun, Yuan Yao, Fanchao Qi, Jian Guan, Pei Ke, Yanzheng Cai, Guoyang Zeng, Zhixing Tan, Zhiyuan Liu, Minlie Huang, Wentao Han, Yang Liu, Xiaoyan Zhu, Maosong Sun. We currently support loading the following checkpoints via ``CPM2.from_pretrained(identifier)``:
+- CPM-2[^2]. We currently support loading the following checkpoints via ``CPM2.from_pretrained(identifier)``:
 
 - cpm2-large
 
-- [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.](https://arxiv.org/abs/1810.04805) Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. We currently support loading the following checkpoints via ``Bert.from_pretrained(identifier)``:
+- BERT[^3]. We currently support loading the following checkpoints via ``Bert.from_pretrained(identifier)``:
 
 - bert-base-cased
 - bert-base-uncased
@@ -248,22 +248,22 @@ For more information, please refer to the [documentation](https://pytorch.org/do
 - bert-base-chinese
 - bert-base-multilingual-cased
 
-- [T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/abs/1910.10683) Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. We currently support loading the following checkpoints via ``T5.from_pretrained(identifier)``:
+- T5[^4]. We currently support loading the following checkpoints via ``T5.from_pretrained(identifier)``:
 
 - t5-small
 - t5-base
 - t5-large
 - t5-3b
 - t5-11b
 
-- [GPT2: Language Models are Unsupervised Multitask Learners.](http://www.persagen.com/files/misc/radford2019language.pdf) Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. We currently support loading the following checkpoints via ``GPT2.from_pretrained(identifier)``:
+- GPT-2[^5]. We currently support loading the following checkpoints via ``GPT2.from_pretrained(identifier)``:
 
 - gpt2-base
 - gpt2-medium
 - gpt2-large
 - gpt2-xl
 
-- [GPT-J](https://github.com/kingoflolz/mesh-transformer-jax) (from EleutherAI) released in the repo [mesh-transformer-jax](https://github.com/kingoflolz/mesh-transformer-jax) by Ben Wang and Aran Komatsuzaki. We currently support loading the following checkpoints via ``GPTj.from_pretrained(identifier)``:
+- GPT-J[^6]. We currently support loading the following checkpoints via ``GPTj.from_pretrained(identifier)``:
 
 - gptj-6b
 

@@ -284,3 +284,15 @@ You can also find us on other platforms:
 ## License
 
 The package is released under the [Apache 2.0](https://github.com/OpenBMB/ModelCenter/blob/main/LICENSE) License.
+
+[^1]: [CPM: A Large-scale Generative Chinese Pre-trained Language Model.](https://arxiv.org/abs/2012.00413) Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, Maosong Sun.
+
+[^2]: [CPM-2: Large-scale Cost-efficient Pre-trained Language Models.](https://arxiv.org/abs/2106.10715) Zhengyan Zhang, Yuxian Gu, Xu Han, Shengqi Chen, Chaojun Xiao, Zhenbo Sun, Yuan Yao, Fanchao Qi, Jian Guan, Pei Ke, Yanzheng Cai, Guoyang Zeng, Zhixing Tan, Zhiyuan Liu, Minlie Huang, Wentao Han, Yang Liu, Xiaoyan Zhu, Maosong Sun.
+
+[^3]: [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.](https://arxiv.org/abs/1810.04805) Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova.
+
+[^4]: [T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer.](https://arxiv.org/abs/1910.10683) Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu.
+
+[^5]: [GPT2: Language Models are Unsupervised Multitask Learners.](http://www.persagen.com/files/misc/radford2019language.pdf) Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever.
+
+[^6]: [GPT-J](https://github.com/kingoflolz/mesh-transformer-jax) (from EleutherAI), released in the repo [mesh-transformer-jax](https://github.com/kingoflolz/mesh-transformer-jax) by Ben Wang and Aran Komatsuzaki.
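Both READMEs describe the same class-specific ``X.from_pretrained(identifier)`` loading pattern. The identifier lookup it implies can be sketched as follows; note this is a minimal illustrative stand-in, and the `SUPPORTED_CHECKPOINTS` registry and `checkpoints/...` path scheme are assumptions, not ModelCenter's actual implementation:

```python
# Illustrative sketch of identifier-based checkpoint loading.
# The registry below mirrors the model lists in the diff above;
# the path scheme is a hypothetical stand-in.
SUPPORTED_CHECKPOINTS = {
    "CPM1": ["cpm1-large"],
    "CPM2": ["cpm2-large"],
    "Bert": ["bert-base-cased", "bert-base-uncased",
             "bert-base-chinese", "bert-base-multilingual-cased"],
    "T5": ["t5-small", "t5-base", "t5-large", "t5-3b", "t5-11b"],
    "GPT2": ["gpt2-base", "gpt2-medium", "gpt2-large", "gpt2-xl"],
    "GPTj": ["gptj-6b"],
}

def resolve_checkpoint(model_class: str, identifier: str) -> str:
    """Validate an identifier and return a (hypothetical) checkpoint path."""
    if identifier not in SUPPORTED_CHECKPOINTS.get(model_class, []):
        raise ValueError(f"unknown checkpoint {identifier!r} for {model_class}")
    return f"checkpoints/{model_class}/{identifier}"

print(resolve_checkpoint("Bert", "bert-base-uncased"))
# prints checkpoints/Bert/bert-base-uncased
```

In the real library, `from_pretrained` would additionally download the weights and build the model; this sketch only captures the identifier-to-checkpoint validation step.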
