11/9/2023

Define emergent phenomena

This article is part of our coverage of the latest in AI research.

Large language models (LLMs) have become the center of attention and hype because of their seemingly magical abilities to produce long stretches of coherent text, do things they weren't trained on, and engage (to some extent) in topics of conversation that were thought to be off-limits for computers. But there is still a lot to be learned about the way LLMs work and don't work.

A new study by researchers at Google, Stanford University, DeepMind, and the University of North Carolina at Chapel Hill explores novel tasks that LLMs can accomplish as they grow larger and are trained on more data. The study sheds light on the relation between the scale of large language models and their "emergent" abilities.

This new study is focused on emergence in the sense that has long been discussed in domains such as physics, biology, and computer science. In an essay titled "More is Different" (PDF), Nobel laureate physicist Philip Anderson discussed the idea that quantitative changes can lead to qualitatively different and unexpected phenomena. Inspired by Anderson's work, Jacob Steinhardt, Professor at UC Berkeley, defined emergence as "when quantitative changes in a system result in qualitative changes in behavior."

"Since we wanted to provide a more precise definition, we defined emergent abilities as abilities that are 'not present in smaller models but are present in larger models,'" Rishi Bommasani, PhD student at Stanford University and co-author of the paper, told TechTalks.

Emergent abilities in large language models

Scale can be measured in different ways, including computation (FLOPs), model size (number of parameters), or data size. In their study, the researchers focus on computation and model size, but stress that "there is not a single proxy that adequately captures all aspects of scale."

One of the interesting features of LLMs is their capacity for few-shot and zero-shot learning, the ability to perform tasks that were not included in their training examples. Few-shot learning in LLMs drew much attention with the introduction of OpenAI's GPT-3 in 2020, and its extent and limits have been much studied since then.

"GPT-3 is iconic in having introduced the truly distinctive first wave of emergent abilities in LMs with the now well-known few-shot prompting/in-context learning," Bommasani said.

To identify emergent abilities in large language models, the researchers looked for phase transitions, where below a certain threshold of scale, model performance is near-random, and beyond that threshold, performance is well above random.

"This distinguishes emergent abilities from abilities that smoothly improve with scale: it is much more difficult to predict when emergent abilities will arise," Bommasani said.

In their study, the researchers tested several popular LLM families, including LaMDA, GPT-3, Gopher, Chinchilla, and PaLM. They chose several tasks from BIG-Bench, a crowd-sourced benchmark of over 200 tasks "that are believed to be beyond the capabilities of current language models." They also used challenges from TruthfulQA, Massive Multi-task Language Understanding (MMLU), and Word in Context (WiC), all benchmarks designed to test the limits of LLMs in tackling complicated language tasks. The researchers also made an extra effort to test the LLMs on multi-step reasoning, instruction following, and multi-step computation.
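The phase-transition criterion described above can be sketched in a few lines of code. This is a minimal illustration, not the study's methodology: the scale values and accuracy numbers below are invented for the example, and the `margin` parameter is an assumed way of operationalizing "well above random."

```python
# Illustrative sketch: spotting an "emergent" jump in task accuracy as
# model scale grows. All numbers are invented for illustration; they are
# not taken from the study discussed in the article.

RANDOM_BASELINE = 0.25  # e.g. random chance on a 4-way multiple-choice task

# (training FLOPs, task accuracy) pairs for models of increasing scale
results = [
    (1e20, 0.26), (1e21, 0.25), (1e22, 0.27),  # near random chance
    (1e23, 0.55), (1e24, 0.71),                # well above random
]

def emergence_threshold(results, baseline, margin=0.10):
    """Return the smallest scale at which accuracy first exceeds the
    random baseline by `margin`, or None if no such jump occurs.
    `margin` is an assumed cutoff for "well above random"."""
    for flops, acc in sorted(results):
        if acc > baseline + margin:
            return flops
    return None

print(emergence_threshold(results, RANDOM_BASELINE))  # -> 1e+23
```

An ability that improves smoothly with scale would cross any such margin at a point you could extrapolate from smaller models; the defining feature of emergence, per the study, is that the jump at the threshold is not predictable from the sub-threshold trend.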