In-Context Learning

(Reference)

How does in-context learning work? A framework for understanding the differences from traditional supervised learning

Bayesian inference view of in-context learning

The claim: because pretraining exposes an LLM to a wide variety of concepts, the model can draw on those concepts at inference time in a way that resembles Bayesian inference.

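Might this undercut the need for slots that proponents of the "semi-finished product" approach to TODS (task-oriented dialogue systems) had in mind?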

Is this the essence of the remarkable language ability that ChatGPT shows?
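The Bayesian-inference view above can be made concrete with a toy sketch: the prompt's demonstrations update a posterior over latent concepts, and the prediction averages over that posterior. Everything here is an assumption for illustration (the two hand-made concepts, the word tables, the uniform prior, and the function names `posterior` and `predict`); it is not the setup used in the referenced post, and it says nothing about how a real LLM implements this internally.

```python
import math

# Hypothetical toy concepts: each maps an input word to p(label | word, concept).
CONCEPTS = {
    "sentiment": {
        "great": {"pos": 0.9, "neg": 0.1},
        "awful": {"pos": 0.1, "neg": 0.9},
    },
    "inverted": {
        "great": {"pos": 0.1, "neg": 0.9},
        "awful": {"pos": 0.9, "neg": 0.1},
    },
}

# Assumed uniform prior over concepts (stands in for what pretraining provides).
PRIOR = {"sentiment": 0.5, "inverted": 0.5}


def posterior(demos):
    """Return p(concept | demos), proportional to p(concept) * prod p(y | x, concept)."""
    unnormalized = {}
    for concept, table in CONCEPTS.items():
        log_p = math.log(PRIOR[concept])
        for x, y in demos:
            log_p += math.log(table[x][y])
        unnormalized[concept] = math.exp(log_p)
    total = sum(unnormalized.values())
    return {concept: value / total for concept, value in unnormalized.items()}


def predict(demos, query):
    """Return p(label | demos, query) by averaging over the concept posterior."""
    post = posterior(demos)
    return {
        label: sum(post[c] * CONCEPTS[c][query][label] for c in CONCEPTS)
        for label in ("pos", "neg")
    }


demos = [("great", "pos"), ("awful", "neg")]
print(posterior(demos))
print(predict(demos, "great"))
```

With two demonstrations consistent with the "sentiment" concept, almost all posterior mass moves to that concept, so the prediction for a new query follows it. With no demonstrations the posterior stays at the prior.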

An interesting experiment (Min et al., Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?)

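Even when the labels in the demonstrations are assigned at random, inference performance does not drop much as long as the concept matches. What matters is whether the model can pick up the rule from the repeated pattern.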
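The random-label condition can be sketched as a prompt builder that keeps the inputs and the format fixed and only swaps the labels. The template ("Review:" / "Sentiment:"), the function name `build_prompt`, and the example reviews are hypothetical choices for illustration, not the paper's exact templates or data.

```python
import random


def build_prompt(demos, query, label_space, randomize_labels=False, seed=0):
    """Build a few-shot prompt; optionally replace each gold label with a
    label drawn uniformly at random from label_space."""
    rng = random.Random(seed)
    blocks = []
    for text, gold_label in demos:
        label = rng.choice(label_space) if randomize_labels else gold_label
        blocks.append(f"Review: {text}\nSentiment: {label}")
    # The query uses the same format, with the label left for the model to fill in.
    blocks.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(blocks)


label_space = ["positive", "negative"]
demos = [
    ("A moving, beautifully shot film.", "positive"),
    ("Two hours I will never get back.", "negative"),
    ("The cast is clearly having fun.", "positive"),
]

gold_prompt = build_prompt(demos, "Flat and forgettable.", label_space)
random_prompt = build_prompt(
    demos, "Flat and forgettable.", label_space, randomize_labels=True
)
print(random_prompt)
```

Both prompts expose the same input distribution, label space, and input-label format; only the input-to-label mapping differs, which is the factor the experiment isolates.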

https://arxiv.org/pdf/2205.05055.pdf

DeepMind's analysis of in-context learning (worth reading for a deeper understanding of in-context learning)

what aspects of the training regime lead to this emergent behavior? Here, we show that this behavior is driven by the distributions of the training data itself.

we found that in-context learning traded off against more conventional weight-based learning, and models were unable to achieve both simultaneously

we found that naturalistic data distributions were only able to elicit in-context learning in transformers, and not in recurrent models.

In-context Learning and Induction Heads