Iterative Large Language Models Evolution through Self-Critique

  • Author / Creator
    Li, Qianxi
  • Training large language models (LLMs) often requires extensive human supervision and struggles with modeling long-range semantic dependencies in text. To address these challenges, we introduce ELITE — Evolving Language models Iteratively Through self-critiquE — a framework inspired by human learning processes such as self-critique and experience-based refinement. ELITE allows an LLM to autonomously generate, refine, and iteratively improve its self-critique ability by learning a mapping from question–answer pairs to critique labels through supervised fine-tuning (SFT). This approach drastically reduces the required human annotations from 2550 to 98 demonstrations, using a small set of prompt examples for initial configuration. Experimental results demonstrate that ELITE significantly outperforms an existing self-evolution baseline by 11% and SFT baselines by 4.8% on in-domain tasks. It also improves performance by 13.8%, 11.2%, and 7.2% on three out-of-domain datasets (SQuAD, BoolQ, and GSM8K), demonstrating generalization. The ELITE training framework can thus enable more adaptive, intelligent LLM systems that improve themselves with relatively little human assistance.

  • Subjects / Keywords
  • Graduation date
    Fall 2024
  • Type of Item
    Thesis
  • Degree
    Master of Science
  • DOI
    https://doi.org/10.7939/r3-e3ag-9c86
  • License
    This thesis is made available by the University of Alberta Library with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
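The abstract's core data mapping — turning question–answer pairs into supervised fine-tuning records whose targets are critique labels — can be sketched as below. This is a minimal illustration, not the thesis's actual pipeline: the record format, the label set, and the seed examples are all hypothetical stand-ins for the ~98 human demonstrations the abstract mentions.

```python
# Illustrative sketch of an ELITE-style SFT data mapping: each
# (question, answer) pair becomes a training record whose completion
# is a critique label. All names and labels here are assumptions.

def make_sft_record(question: str, answer: str, critique_label: str) -> dict:
    """Format one SFT example: the prompt shows the question and the
    model's answer; the completion is the critique label to learn."""
    prompt = (
        "Question: " + question + "\n"
        "Answer: " + answer + "\n"
        "Critique:"
    )
    return {"prompt": prompt, "completion": " " + critique_label}

# A couple of seed demonstrations stand in for the small human-labeled set
# used for initial configuration; iterative rounds would grow this set
# with model-generated critiques refined by the framework.
seed = [
    ("What is 2 + 2?", "5", "incorrect"),
    ("What is 2 + 2?", "4", "correct"),
]
dataset = [make_sft_record(q, a, c) for q, a, c in seed]
```

Records in this prompt/completion shape can be fed to any standard SFT trainer; the iterative self-critique loop described in the abstract would then regenerate and refine such labels with progressively less human input.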