31Jan
Visit to the German Stasi Records Archive: Participation in the expression of interest procedure for the virtual reconstruction of the Stasi files

The German Federal Archives store 40 to 55 million pages of torn Stasi files. These are to be restored through automatic virtual reconstruction. A previous pilot project was unable to complete the task adequately, as even the Tagesschau reported on 21.04.2023. An expression of interest procedure has now been launched for a two-part project consisting of a scanning process and virtual reconstruction. We are applying for the automatic virtual reconstruction. The core task is to develop an automated process for arranging scanned document snippets into full pages and complete documents.

At the end of January, we visited the German Stasi Records Archive to talk to the Vice President of the Federal Archives, Alexandra Titze, and present our approach. As a research-oriented technology provider, LangTec showcased an innovative AI-based approach for efficiently processing the large amounts of text and data that the volume of Stasi documents inevitably represents.

We will follow the topic with great interest and look forward to a possible future collaboration!

   

21Dec
A journey from Bethlehem to South America – This year’s LangTec Winter Holiday Party

’Tis the season of joy, laughter, twinkling lights and delicious food, and this year’s LangTec Winter Holiday Party was no exception. We began on a snowy evening at the Hamburg Planetarium, where we travelled back 2000 years to investigate theories about the true origin of the celestial wonder known as the “Star of Bethlehem”.
From there we took a Moia and sped forward in time to Yaku, a modern Peruvian/Mexican fusion restaurant in Hamburg’s Grindelviertel, where we indulged in course after course of new, wonderful and colourful flavour combinations and merry mescal cocktails.

Happy holidays from our LangTec family to yours!

02Nov
parson and LangTec announce AI cooperation

The use of artificial intelligence is becoming increasingly important in technical communication. In order to offer our customers the best possible solutions for AI-based text and language technology applications, parson has started a cooperation with the Hamburg-based technology provider LangTec. LangTec develops innovative language technology solutions for the efficient processing of large amounts of text and data with special focus on AI and machine learning.

“LangTec’s know-how and many years of experience with machine learning and artificial intelligence perfectly complement our expertise in the field of technical documentation. Together we can develop the best possible solutions for the use of artificial intelligence in technical communication,” says Ulrike Parson, CEO of parson AG.

“We are very pleased about this close cooperation with parson AG. As one of the leading providers of AI-based language technology in the German-speaking market, it is particularly important for us to work with established players who are ready to put custom text analytics solutions into production. AI becomes valuable when it helps to gain concrete competitive advantages,” says Dr. Patrick McCrae, founder and managing director of LangTec.

About parson

parson is a leading service provider of smart content and intelligent information solutions. parson AG advises its customers on the digitalization of content processes and the introduction of a sustainable content strategy. For products, software and services, parson delivers semantically enriched, modular content such as user documentation, programming instructions, online help, eLearning content and specifications.

 

Fine-Tuning of a Language Model

To kick off the partnership with LangTec, parson presents a pilot project at this year’s tcworld conference 2023. This model project deals with the fine-tuning of a large language model (LLM) and was realized jointly with LangTec.

In their presentation, Helle Hannken-Illjes and Ulrike Parson show first results of the domain-specific fine-tuning of a pre-trained large language model (LLM) on customer-specific data. The presented model can be operated locally and is hence also perfectly suited for processing sensitive customer data:

AI yes, but not ChatGPT! How do I get my own language model? (in German)

Helle Hannken-Illjes and Ulrike Parson, presentation

tcworld 2023, November 15, 2023, 9.00 a.m., room C6.2


22Oct
LangTec Lands a Deal as Tough as Steel

In a new project aimed at advancing sustainability in the industrial sector, LangTec is happy to announce its collaboration with one of the leading global players in the steel and metal distribution industry. As part of this collaboration, LangTec contributes senior backend development expertise to help build solutions that revolutionise the way we think about carbon emissions in the supply chain.

Our client’s Sustainability Team has been on a mission to empower end customers with all the information they need to make environmentally conscious choices. Equipped with a groundbreaking system for calculating carbon emissions throughout the entire product lifecycle, the Sustainability Team is now adding LangTec’s development expertise to take this initiative to the next level.

We are contributing to an improved solution that will make it easier to track and analyse the environmental impact of various materials and vendors, from raw material delivery to the final leg of transport to the end-user. Moreover, LangTec will be involved in integrating an authentication system to further streamline the user experience and provide a seamless interaction path towards greener, more sustainable customer choices.

07Aug
Deployed: massively parallel production solution in the cloud for transforming short texts at scale

For our client from Southern Germany, we designed, extensively tested and have now deployed to the cloud a production system that transforms tens of thousands of texts within a few hours each day and reports them back. The short texts sent by the client are automatically transformed using ChatGPT from OpenAI. As part of this process, the ChatGPT answers are checked linguistically for undesired words, whose presence triggers regeneration of the affected texts. To reach the required processing speed while staying below load limits, we use a high degree of parallelization at different architectural levels.
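The interplay of parallelization, load limits and regeneration can be pictured roughly as follows. This is only a minimal sketch and not the deployed system: the word list, the limits and the `fake_generate` stand-in for the language model call are all hypothetical, and a semaphore is just one plausible way to cap the number of simultaneous requests.

```python
import asyncio
import re

# All values below are hypothetical and chosen for illustration only.
BANNED = {"cheap", "guarantee"}  # undesired words that trigger regeneration
MAX_PARALLEL = 8                 # cap on simultaneous requests (load limit)
MAX_ATTEMPTS = 3                 # how often a text may be regenerated


def contains_banned(text, banned=BANNED):
    """Return True if any undesired word occurs in the text."""
    words = set(re.findall(r"\w+", text.lower()))
    return bool(words & set(banned))


async def transform_with_retry(text, generate, semaphore):
    """Regenerate until the output is clean or the attempts run out."""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        async with semaphore:  # never exceed MAX_PARALLEL calls at once
            candidate = await generate(text, attempt)
        if not contains_banned(candidate):
            return candidate
    return None  # no clean output produced; leave for manual review


async def process_batch(texts, generate):
    """Transform all texts concurrently, keeping the input order."""
    semaphore = asyncio.Semaphore(MAX_PARALLEL)
    tasks = [transform_with_retry(t, generate, semaphore) for t in texts]
    return await asyncio.gather(*tasks)


async def fake_generate(text, attempt):
    """Stand-in for the model call; the first attempt may be 'bad'."""
    await asyncio.sleep(0)
    if attempt == 1 and "offer" in text:
        return text.upper() + " CHEAP"
    return text.upper()


if __name__ == "__main__":
    print(asyncio.run(process_batch(["special offer", "hello"], fake_generate)))
```

In a real deployment the stand-in would be replaced by the actual API call, and the cap would be derived from the provider's rate limits.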

02Aug
Convincing results for AI-based authoring assistance for technical writers

As already reported here, LangTec has been working in a joint research phase with a leading northern German company in the field of technical documentation on an AI-based authoring support system for text completion in the domain of technical documentation.

Now we can report extremely convincing results for this proof of concept. Among other results, fine-tuning the base language model on client data improved next-word prediction accuracy by 45 (top 10 predictions) to 62 (top 1) percentage points. Additionally, this project allowed LangTec to use and extend its expertise in the field of neural language models, especially with respect to deeper interventions in the standard software architecture of the language model.
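For readers unfamiliar with the metric, top-k next-word accuracy counts a prediction as correct when the true next word is among the model's k highest-ranked suggestions. The sketch below shows one straightforward way to compute it and to express an improvement in percentage points. The function names and the toy data are made up for illustration and are not taken from the project.

```python
def top_k_accuracy(ranked_predictions, targets, k):
    """Share of positions where the true word is among the top k guesses."""
    hits = sum(
        1
        for preds, target in zip(ranked_predictions, targets)
        if target in preds[:k]
    )
    return hits / len(targets)


def percentage_point_gain(base_accuracy, tuned_accuracy):
    """Absolute improvement between two accuracies, in percentage points."""
    return round((tuned_accuracy - base_accuracy) * 100, 1)


# Toy data (invented): ranked suggestions per position and the true word.
predictions = [
    ["valve", "pump", "hose"],
    ["open", "close", "remove"],
    ["screw", "bolt", "nut"],
    ["the", "a", "this"],
]
targets = ["valve", "close", "washer", "this"]

top1 = top_k_accuracy(predictions, targets, k=1)
top3 = top_k_accuracy(predictions, targets, k=3)
```

An improvement "by 62 percentage points" is therefore an absolute difference between two accuracies, not a relative increase.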

27Jul
From The Alster off to Thailand – This Year’s Summer Party

This year the weather was not on our side at first: it took us three attempts, but on a sunny day it finally worked out and we met at the Supper Club Hamburg for cool drinks and water sports. With an SUP, a canoe and a pedal boat we toured the Alster, meeting Irish musicians and the waves of the Alster steamer on the way… Everyone stayed dry, so in the evening we moved on towards Schanze to the restaurant JING JING. An excellent, super flavourful 4-course menu propelled us straight to Thailand! What exactly was part of the concept, and what was not, remains open, but we can say for sure that we had a fantastic time 🙂

 

01Jun
AI-based Authoring Support System

Technical documentation requires a specific sublanguage by convention and regulation, which poses a challenge when different technical writers must create new documents in a coherent manner. So far, our client has employed rule-based systems that enforce consistent structure and style but can only detect problems that trigger manually created style-checking rules.

In a joint research phase, LangTec will now develop the proof of concept for an authoring support system that offers text completion features in document creation without the need to specify any explicitly defined rules up-front.

The goal of this AI project is to support technical writers by reducing repetitive manual work. The central task will be to suggest the most fitting continuation of sentences when creating new technical documents. LangTec will select one of the available very large language models, perform domain-specific fine-tuning based on a large number of existing documents, and formally evaluate the resulting model’s prediction accuracy.

15May
New Project: Paraphrasing Short Texts Using ChatGPT

In order to increase the search engine relevance of text extracts, we are investigating the paraphrasing competence of the language model ChatGPT for one of our customers. The goal is to automatically condense full texts (text summarization) and to use paraphrasing to generate text snippets that the search engine evaluates as being as “unique” as possible. By optimizing textual uniqueness, the snippets should rank as high as possible in the web search hit list and thus lead to better conversion.
For the implementation of this project, LangTec specifically designed and operationalized a measure to quantify textual uniqueness. In this project, LangTec thus also contributes its computational linguistics expertise and many years of experience in the automated generation of texts.
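The measure LangTec designed is not described here. Purely as an illustration of what a quantified notion of textual uniqueness can look like, the sketch below uses an assumed stand-in: the Jaccard distance between the word trigram sets of a snippet and a reference text. The function names and the choice of trigrams are hypothetical.

```python
import re


def shingles(text, n=3):
    """Set of word n-grams of a text (lower-cased, punctuation ignored)."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < n:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def uniqueness(candidate, reference, n=3):
    """Jaccard distance of the n-gram sets: 0.0 = identical, 1.0 = no overlap.

    Assumed illustrative measure, not the one used in the project.
    """
    a, b = shingles(candidate, n), shingles(reference, n)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


score = uniqueness(
    "the quick brown fox jumps",
    "the quick brown fox sleeps",
)
```

A paraphrase that shares fewer word sequences with the source scores closer to 1.0 under this stand-in measure.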

28Apr
Successful implementation of data-driven job ads generation with ChatGPT

For a provider in the field of recruiting, we successfully implemented the generation of sample job advertisements using the large-scale language model ChatGPT. The texts generated with our solution serve recruiting companies in the creation of their job advertisements as a starting point for further individual adaptations.
For this purpose, LangTec developed a web application as well as suitable instructions, so-called “prompts,” for OpenAI’s language model. The prompts also included textually relevant information from a domain-specific database, which ChatGPT was to take into account in its output.
Having implemented extensive template-based text generation (NLG) solutions over many years with our in-house solution TextWriter, we are happy to now use generative language models in our text generation projects.
