Publications
Conference papers |
Building a generative AI showroom for foundation models with different modalities. International Conference on Multimedia Information Processing and Retrieval, MIPR 2024 |
Do Multilingual Large Language Models Mitigate Stereotype Bias? Workshop on Cross-Cultural Considerations in NLP, C3NLP 2024 |
Evaluation of Document Deduplication Algorithms for Large Text Corpora. International Conference on Machine Learning, Optimization and Data Science, LOD 2024, to appear |
ILLUMINER: Instruction-tuned Large Language Models as Few-shot Intent Classifier and Slot Filler. Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 |
Investigating Multilingual Instruction-Tuning: Do Polyglot Models Demand for Multilingual Instructions? Conference on Empirical Methods in Natural Language Processing, EMNLP 2024 |
OpenGPT-X – Novel Architecture Exploration. International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2023, by project partner Forschungszentrum Jülich |
OpenGPT-X – Training Large Language Models on HPC Systems. Joint Laboratory for Extreme-Scale Computing Workshop, JLESC 2022, by project partner Forschungszentrum Jülich |
OpenGPT-X – Training Large Language Models on HPC Systems. ISC High Performance Conference 2023, by project partner Forschungszentrum Jülich |
Tokenizer Choice For LLM Training: Negligible or Crucial? Annual Conference of the North American Chapter of the Association for Computational Linguistics, NAACL 2024 |
Papers |
Data Processing for the OpenGPT-X Model Family. 2024 |
Teuken-7B-Base & Teuken-7B-Instruct: Towards European LLMs. 2024 |
Towards Multilingual LLM Evaluation for European Languages. 2024 |