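GPT-1, in full Generative Pre-trained Transformer 1, is the first large language model released by OpenAI, following Google's introduction of the Transformer architecture in 2017[3]. In 2018, OpenAI published a paper titled "Improving Language Understanding by Generative Pre-Training", which introduced this initial model along with the general concept of generative pre-trained transformers[4].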
Zhu, Yukun; Kiros, Ryan; Zemel, Richard; Salakhutdinov, Ruslan; Urtasun, Raquel; Torralba, Antonio; Fidler, Sanja. Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books. 2015-06-22. arXiv:1506.06724 [cs.CV]. # of books: 11,038 / # of sentences: 74,004,228 / # of words: 984,846,357 / mean # of words per sentence: 13 / median # of words per sentence: 11
Williams, Adina; Nangia, Nikita; Bowman, Samuel. A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference (PDF). Association for Computational Linguistics. 2018-06-01 [2021-01-23]. (Archived (PDF) from the original on 2020-02-11). At 433k examples, this resource is one of the largest corpora available for natural language inference (a.k.a. recognizing textual entailment), [...] offering data from ten distinct genres of written and spoken English [...] while supplying an explicit setting for evaluating cross-genre domain adaptation.
Mostafazadeh, Nasrin; Roth, Michael; Louis, Annie; Chambers, Nathanael; Allen, James F. LSDSem 2017 Shared Task: The Story Cloze Test (PDF). Association for Computational Linguistics. 2017-04-03 [2021-01-23]. (Archived (PDF) from the original on 2020-11-22). The LSDSem'17 shared task is the Story Cloze Test, a new evaluation for story understanding and script learning. This test provides a system with a four-sentence story and two possible endings, and the system must choose the correct ending to the story. Successful narrative understanding (getting closer to human performance of 100%) requires systems to link various levels of semantics to commonsense knowledge.
Wang, Alex; Singh, Amanpreet; Michael, Julian; Hill, Felix; Levy, Omar; Bowman, Samuel R. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. 2018-04-20. arXiv:1804.07461 [cs.CL].