Project group Facts4Chat

Current generative language models such as ChatGPT struggle with factual correctness. They work very well for creative writing, but they are far less useful in a professional or scientific context, since every answer must be checked for factual accuracy. For example, a chatbot answering product questions on a company's web site must state correct product features and must not recommend competitors' products. Many ongoing projects address this by integrating information retrieval functionality (e.g., via plugins in ChatGPT) to supply the chat model with the facts, so that the generative model only provides the conversational interface.
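The division of labor described above can be sketched in a few lines: a retriever selects the relevant facts, and the generative model is only asked to phrase an answer from them. This is a minimal illustrative sketch; the product facts, function names, and the toy bag-of-words retriever are assumptions for demonstration (a real system would use learned sentence embeddings, e.g. Sentence-BERT from the reading list, and an actual language model).

```python
import math
import re
from collections import Counter

# Hypothetical fact base, e.g. product data from a company web site.
FACTS = [
    "The X100 camera has a 24 megapixel sensor.",
    "The X100 battery lasts about 8 hours.",
    "The X100 weighs 450 grams.",
]

def embed(text: str) -> Counter:
    # Toy embedding: lowercase bag of words. A production system would
    # replace this with dense sentence embeddings.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, k: int = 1) -> list[str]:
    # Rank the fact base by similarity to the question and keep the top k.
    q = embed(question)
    ranked = sorted(FACTS, key=lambda f: cosine(q, embed(f)), reverse=True)
    return ranked[:k]

def build_prompt(question: str) -> str:
    # The generative model is handed only the retrieved facts and asked
    # to phrase the answer; it is not trusted to know the facts itself.
    facts = "\n".join(retrieve(question))
    return f"Answer using only these facts:\n{facts}\n\nQuestion: {question}"

print(build_prompt("How many megapixels does the X100 camera have?"))
```

The key design point is that factual content and language generation are decoupled: the retriever is responsible for correctness, while the language model turns the retrieved facts into a fluent reply.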

A detailed description of the project group's task is available on the German version of this page.


  1. A. Vaswani, N. Shazeer, N. Parmar et al. “Attention is All you Need”. In: NeurIPS. 2017, pp. 5998–6008.
  2. A. Radford, K. Narasimhan, T. Salimans et al. Improving language understanding by generative pre-training. 2018.
  3. J. Devlin, M. Chang, K. Lee et al. “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding”. In: NAACL-HLT. 2019, pp. 4171–4186.
  4. N. Reimers and I. Gurevych. “Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks”. In: EMNLP-IJCNLP. 2019, pp. 3980–3990.
  5. D. M. Ziegler, N. Stiennon, J. Wu et al. Fine-Tuning Language Models from Human Preferences. 2019.
  6. T. B. Brown, B. Mann, N. Ryder et al. “Language Models are Few-Shot Learners”. In: NeurIPS. 2020.
  7. N. Stiennon, L. Ouyang, J. Wu et al. Learning to summarize from human feedback. 2020.
  8. Y. Bai, S. Kadavath, S. Kundu et al. Constitutional AI: Harmlessness from AI Feedback. 2022.
  9. S. Borgeaud, A. Mensch, J. Hoffmann et al. “Improving Language Models by Retrieving from Trillions of Tokens”. In: International Conference on Machine Learning, ICML. Vol. 162. Proceedings of Machine Learning Research. 2022, pp. 2206–2240.
  10. T. Dao, D. Y. Fu, S. Ermon et al. “FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness”. In: NeurIPS. 2022.
  11. J. Geiping and T. Goldstein. Cramming: Training a Language Model on a Single GPU in One Day. 2022.
  12. J. Hoffmann, S. Borgeaud, A. Mensch et al. Training Compute-Optimal Large Language Models. 2022.
  13. L. Ouyang, J. Wu, X. Jiang et al. “Training language models to follow instructions with human feedback”. In: NeurIPS. 2022.
  14. K. Shuster, J. Xu, M. Komeili et al. BlenderBot 3: a deployed conversational agent that continually learns to responsibly engage. 2022.
  15. R. Thoppilan, D. D. Freitas, J. Hall et al. LaMDA: Language Models for Dialog Applications. 2022.
  16. V. Lialin, V. Deshpande and A. Rumshisky. Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning. 2023.
  17. R. Taori, I. Gulrajani, T. Zhang et al. Stanford Alpaca: An Instruction-following LLaMA model. GitHub repository. 2023.
  18. H. Touvron, T. Lavril, G. Izacard et al. LLaMA: Open and Efficient Foundation Language Models. 2023.