Enhancing LLM Capabilities Beyond Scaling Up

(Nov. 15, 2024; EMNLP'24 Tutorial)

Contact: wenpeng@psu.edu

Wenpeng Yin (Penn State)

Muhao Chen (UC Davis)

Rui Zhang (Penn State)

Dan Roth (UPenn & Oracle)

Abstract

General-purpose large language models (LLMs) are progressively expanding in both scale and access to non-public training data, which has led to notable progress on a variety of AI problems. Nevertheless, two questions remain: i) Is scaling up the sole avenue for extending the capabilities of LLMs? ii) Instead of developing general-purpose LLMs, how can we endow LLMs with specific knowledge? This tutorial targets researchers and practitioners who are interested in extending the capabilities of LLMs beyond scaling up. To this end, we will discuss several lines of research in this direction, including: (i) optimizing input prompts to fully exploit the potential of LLMs, (ii) enabling LLMs to self-improve their responses through various feedback signals, (iii) updating or editing the internal knowledge of LLMs when necessary, (iv) leveraging incidental structural supervision from target tasks, and (v) defending against potential attacks and threats from malicious users. Finally, we will conclude the tutorial by outlining directions for further investigation.

Schedule

Dan Roth: Introduction (Slides)

Rui Zhang: Prompt Optimization for LLMs (Slides)

Wenpeng Yin: LLM Self-Improvement (Slides)

Fei Wang: Knowledge Update of LLMs (Slides)

Ben Zhou: Aligning with Structures of Target Problems (Slides)

Muhao Chen: Safety Enhancement for LLMs (Slides)

Dan Roth: Conclusion & Future Directions (Slides)

Recording