OpenGPT-X: Teuken 7B

The European, open, multilingual large language model

Companies from all industries can now build AI applications with »Teuken 7B«. The large language model from the OpenGPT-X research project is available to download free of charge as open source on Hugging Face. »Teuken 7B-instruct-v0.4« was trained from scratch on the 24 official languages of the EU and has seven billion parameters. Developers in research and industry can download »Teuken 7B« and adapt, extend and fine-tune it as a basis for their applications. The result is a model optimized for the company's specific use cases.

»Teuken 7B« is available in the following versions:

  • for research purposes:
    »Teuken 7B-instruct-research-v0.4«
  • for non-commercial purposes:
    »Teuken 7B-instruct-v0.6«
    »Teuken 7B-base-v0.6«
  • for companies for commercial purposes under the »Apache 2.0« license:
    »Teuken 7B-instruct-commercial-v0.4«
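
The version list above can be captured as a small lookup table, for example to keep a deployment script from pulling a variant whose terms do not fit the intended use. The sketch below is illustrative only: the `Variant` class, the `VARIANTS` table and the `variants_for` helper are hypothetical names, and the usage categories simply mirror the three groups listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str   # model name as written on this page
    usage: str  # "research", "non-commercial" or "commercial"


# Transcribed from the version list above; not an official registry.
VARIANTS = (
    Variant("Teuken 7B-instruct-research-v0.4", "research"),
    Variant("Teuken 7B-instruct-v0.6", "non-commercial"),
    Variant("Teuken 7B-base-v0.6", "non-commercial"),
    Variant("Teuken 7B-instruct-commercial-v0.4", "commercial"),
)


def variants_for(usage: str) -> list[str]:
    """Return the names of all variants listed for the given usage."""
    matches = [v.name for v in VARIANTS if v.usage == usage]
    if not matches:
        raise ValueError(f"no variant listed for usage {usage!r}")
    return matches


if __name__ == "__main__":
    print(variants_for("commercial"))
```

Raising an error for an unknown usage, rather than returning an empty list, makes a mistyped category fail loudly instead of silently selecting nothing.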

In addition to the two Fraunhofer Institutes IAIS and IIS and the Jülich Research Center, the German AI Association, TU Dresden, the German Research Center for Artificial Intelligence (DFKI), IONOS, Aleph Alpha, ControlExpert and Westdeutscher Rundfunk (WDR) have collaborated on OpenGPT-X as partners.

 

Register for the free webinar

Multilingual

  • Our model is fully multilingual, with training in all 24 EU languages.
  • It contains approximately 50 percent non-English pretraining data.
  • A performance comparison demonstrates that the model produces similar results across the linguistic spectrum.
  • Consequently, it reflects European characteristics, norms, and values, facilitating effective multilingual communication.

Open

  • The model can be downloaded free of charge in various versions and licenses from Hugging Face.
  • Versions 0.4 and 0.6 of »Teuken 7B« can be used and further developed for research.
  • The »Apache 2.0« license permits adaptation, further development, and utilization of »Teuken 7B-instruct-commercial-v0.4« for commercial artificial intelligence applications.
  • Sensitive data may remain within the organization.

Science-driven

  • The product was developed by scientists for commercial use.
  • The multilingual tokenizer allows for particularly energy-efficient training and operation of multilingual applications.
  • The European Leaderboard, developed by our team, evaluates a range of models on multilingual tasks.
  • Podcast Knowledge Science: Mehdi Ali and Michael Fromm from Fraunhofer IAIS explain the development of multilingual European AI systems (in German).
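
One way to see why a tokenizer affects training and operating cost: a model processes text token by token, so the fewer tokens a tokenizer needs per word (often called its fertility), the less computation a given text requires. The sketch below measures that ratio for two toy tokenizers. Both tokenizers and the sample sentence are made up for illustration; neither is the Teuken tokenizer, and the numbers say nothing about its actual behavior.

```python
from typing import Callable, Iterable


def fertility(tokenize: Callable[[str], list[str]], texts: Iterable[str]) -> float:
    """Average number of tokens produced per whitespace-separated word."""
    n_tokens = 0
    n_words = 0
    for text in texts:
        n_tokens += len(tokenize(text))
        n_words += len(text.split())
    if n_words == 0:
        raise ValueError("no words in input")
    return n_tokens / n_words


def char_chunks(text: str, size: int = 3) -> list[str]:
    """Toy tokenizer: cut every word into pieces of at most `size` characters."""
    return [w[i:i + size] for w in text.split() for i in range(0, len(w), size)]


def whole_words(text: str) -> list[str]:
    """Toy tokenizer: one token per word, as if every word were in the vocabulary."""
    return text.split()


if __name__ == "__main__":
    sample = ["Sprachmodelle verarbeiten Texte"]  # hypothetical sample sentence
    print(fertility(char_chunks, sample))
    print(fertility(whole_words, sample))
```

To compare real tokenizers, the same `fertility` function could be given any callable that maps a string to a list of tokens, evaluated on the same texts in each language of interest.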

Application of Teuken-7B in the company

Download »Teuken 7B«

Developers can download »Teuken 7B« free of charge on Hugging Face.
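
A minimal sketch of the download step, assuming the `huggingface_hub` Python package is installed. The organization namespace `openGPT-X` and the hyphenated repository naming are assumptions not stated on this page; check the model card on Hugging Face for the exact repository ID before relying on them. `repo_id_for` and `download` are hypothetical helpers.

```python
HF_NAMESPACE = "openGPT-X"  # assumed organization name on Hugging Face


def repo_id_for(variant: str) -> str:
    """Build a repository ID of the form '<namespace>/Teuken-7B-<variant>'."""
    if not variant or "/" in variant or " " in variant:
        raise ValueError(f"unexpected variant name: {variant!r}")
    return f"{HF_NAMESPACE}/Teuken-7B-{variant}"


def download(variant: str) -> str:
    """Download all files of one variant and return the local directory."""
    # Imported here so repo_id_for works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id_for(variant))
```

Calling `download("instruct-commercial-v0.4")` would then fetch the commercial variant into the local Hugging Face cache. The weights of a seven-billion-parameter model amount to several gigabytes, and a repository may require accepting its terms and logging in first.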

Webinar / in German

Teuken 7B – GenAI in the public sector

  • 19.09.25, 11:00 - 11:45 a.m.
  • for companies / free of charge

Webinar / in English

As soon as a webinar in English becomes available, we will announce it here. Follow us on social media to stay up to date!

Get started with us

We adapt »Teuken 7B« to align with your organizational procedures. To learn more about our offerings or to schedule a consultation, please do not hesitate to contact us.

 

Technical Info & Research

 

Model cards and benchmarks

Technical data regarding the model and its utilization. Comparative graphical illustrations and technical explanations of the model in relation to other models.

 

Use Cases

A representative sample of application examples from a variety of sectors, including industry, healthcare, legal, finance, and media.

 

Publications and Code Repositories

Research results on multilingual language models

 

LLM Community

We respond to technical and scientific inquiries from the community and provide a forum for feedback and discourse on the OpenGPT-X Discord server.

FAQ about Teuken-7B

  • »Teuken 7B« is available in the following versions:

    • »Teuken 7B-instruct-research-v0.4« for research purposes.
    • »Teuken 7B-instruct-commercial-v0.4« for companies for commercial purposes under the »Apache 2.0« license. The model has already been optimized for chat through instruction tuning. »Teuken 7B-instruct-commercial-v0.4« is comparable to the research version in terms of performance, although the research version scores one to two percent higher in the benchmarks.
    • »Teuken 7B-instruct-v0.6« and »Teuken 7B-base-v0.6« for non-commercial purposes under the »CC BY-NC 4.0« license. The update features significant improvements compared to »Teuken 7B-instruct-v0.4«, including increased performance, improved robustness and reliability, and expanded application flexibility.
  • »Teuken 7B« is available for download free of charge and open source on Hugging Face.

  • Companies in particular have the opportunity to take part in a free webinar in which Fraunhofer scientists explain which applications can be realized with appropriate further processing on the basis of »Teuken 7B«.

  • »Teuken 7B-instruct-commercial-v0.4« is multilingual and has been optimized for chat through instruction tuning, so it can be used as a multilingual chatbot, e.g. in international customer service or to make company knowledge accessible to employees.

    Other applications that can be implemented with »Teuken 7B« include:

    • Summarizing documents
    • Generating texts
    • Extracting information from texts

    »Teuken 7B« can be further processed through continued pretraining, fine-tuning, instruction tuning, model merging, etc. to adapt the model to the company's own purposes. The result is a model that is optimized for the individual use cases in the company.

  • Select the version »Teuken 7B-instruct-commercial-v0.4«.

    To adapt the model to your own business purposes, »Teuken 7B-instruct-commercial-v0.4« can also be further processed with your own data through continued pretraining, fine-tuning, instruction tuning, model merging, etc.

    The model performs well in a performance comparison with other open source models, but still has potential for development in the areas of logical thinking, coding and mathematics. In addition, »Teuken 7B«, like other large language models, can generate content that is inappropriate, offensive or harmful.

  • »Teuken 7B-instruct« is a chatbot that is primarily intended for corporate applications and research projects. Developers from companies and the scientific community can use it to develop their individual chat applications. »Teuken 7B-instruct-commercial-v0.4« can also be further processed with your own data through continued pretraining, fine-tuning, instruction tuning, model merging, etc. to adapt the model to your own business purposes.

  • Yes, companies can use »Teuken 7B-instruct-commercial-v0.4« commercially for their AI applications under the »Apache 2.0« license.

  • The base model »Teuken 7B-base-v0.6« can be downloaded here for research purposes and for private, educational, and non-commercial use.

    Base models are particularly prone to generating inappropriate, offensive or harmful content. At the same time, they offer the advantage that, if used correctly and responsibly, they can be developed into powerful specialized models through fine-tuning and instruction tuning.

  • Currently no. The EU AI Act will not apply until August 2025. AI models that were placed on the market before this date do not have to comply with the requirements of the EU AI Act until August 2027 (grandfathering).

  • The OpenGPT-X research project has been completed.

    As this is an open source project, we also assume that adapted or specialized versions of the model will be developed for different applications by the scientific community or companies.

  • Our scientists are in contact with the LLM community via the OpenGPT-X Discord server. This is also the place for questions and feedback about the model.

OpenGPT-X: Digital sovereignty for Europe

The OpenGPT-X project, comprising ten partners, began on January 1, 2022 and ended on March 31, 2025. It was funded by the Federal Ministry for Economic Affairs and Climate Action (BMWK) with approximately €14 million. Led by Fraunhofer IAIS and Fraunhofer IIS, the project investigated the entire value chain of generative AI: large-scale GPU-based infrastructure and data for training large language models, model development, and productive application in the form of prototypes and proofs of concept (PoCs). The project's overarching goal was to develop a large open-source AI language model for research and industry, tailored to Europe's multilingual needs.

The release of »Teuken 7B« marks the achievement of this goal, offering a publicly funded alternative for future scientific investigations and economic applications of generative AI.

 

 


Teuken 7B webinar

We recommend that all interested parties participate in our free webinar.

The webinar is an introduction to Teuken and LLMs.

The next dates:

Teuken 7B – GenAI in the public sector

  • 19.09.25, 11:00 - 11:45 a.m.
  • for companies / free of charge
  • The webinar will be held in German.

If you have already taken part in a webinar or have a specific request, you can also start directly with a consultation appointment. Please use the following form:

Teuken 7B consultation date
