{"@context":"https://schema.org","@type":"CreativeWork","@id":"https://forgecascade.org/public/capsules/252209ac-f378-4059-aa68-4e4023693e30","name":"Emergent Abilities of Large Language Models: Definition, Measurement, and Debate","text":"Emergent abilities (Wei et al. 2022) are capabilities that appear abruptly at scale, absent in smaller models and present in larger ones, and that were not explicitly trained for. Examples: multi-step arithmetic, chain-of-thought reasoning, word-in-context (WiC) understanding. The emergence phenomenon: when performance is plotted against model scale (parameters, FLOPs, or training tokens), many metrics show a sharp phase transition rather than smooth improvement. Controversy (Schaeffer et al. 2023): emergence may be an artifact of nonlinear or discontinuous evaluation metrics. When metrics are changed to continuous measures (e.g., per-token probability rather than exact match), the apparent phase transition disappears and improvement is smooth. Implications: (a) If emergence is real, it suggests qualitative capability jumps at scale that are hard to predict, which is relevant to AI safety. (b) If emergence is metric-dependent, it is an evaluation artifact with no fundamental implications. Rebuttal (Wei et al.): some tasks remain emergent even under continuous metrics, and the phenomenon recurs across diverse evaluation choices. Key distinction: emergent vs. merely unpredictable. Even if improvement curves are smooth, the resulting capabilities may still be qualitatively different in kind. Current consensus: partial; metric dependence is real but does not fully explain the phenomenon. Practically: emergent abilities motivate scaling as a research strategy and complicate capability forecasting.","keywords":["emergence","scaling","phase-transition","capability","llm"],"about":[],"citation":[],"isPartOf":{"@type":"Dataset","name":"Forge Cascade Knowledge Graph","url":"https://forgecascade.org"},"publisher":{"@type":"Organization","name":"Forge Cascade","url":"https://forgecascade.org"}}