The paper discusses the importance of understanding the strengths and limitations of large language models (LLMs) in light of the problem they are trained to solve. The authors argue for a teleological approach to understanding LLMs and identify three probability factors that influence LLM accuracy: the probability of the task to be performed, the probability of the target output, and the probability of the provided input. The study shows that LLMs are strongly influenced by all three. For instance, GPT-4's accuracy at decoding a simple cipher is significantly higher when the output is a high-probability word sequence than when it is a low-probability one. The authors conclude that AI practitioners should be cautious about using LLMs in low-probability situations and should treat LLMs as a distinct class of system, shaped by its own set of pressures.
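The "simple cipher" in question is a shift cipher of the rot13 variety, which replaces each letter with the one 13 positions later in the alphabet (wrapping around). As a minimal sketch of the task the model is asked to invert:

```python
def rot13(text: str) -> str:
    """Shift each letter 13 places, wrapping within the alphabet.

    Non-letters pass through unchanged. Because 13 is half of 26,
    applying rot13 twice returns the original text, so the same
    function both encodes and decodes.
    """
    out = []
    for ch in text:
        if ch.isalpha():
            base = ord('a') if ch.islower() else ord('A')
            out.append(chr((ord(ch) - base + 13) % 26 + base))
        else:
            out.append(ch)
    return ''.join(out)

plain = "The quick brown fox"
encoded = rot13(plain)
print(encoded)            # "Gur dhvpx oebja sbk"
print(rot13(encoded))     # round-trips back to the original
```

The cipher itself is trivially easy for a deterministic program; the paper's point is that an LLM's success at decoding depends on whether the decoded string is a likely word sequence, something a rule-based decoder is entirely indifferent to.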

Publication date: 24 Sep 2023
Project Page: https://arxiv.org/abs/2309.13638
Paper: https://arxiv.org/pdf/2309.13638