RESEARCH-DRIVEN LANGUAGE TECHNOLOGY AND AI
WE DEVELOP INNOVATIVE LANGUAGE TECHNOLOGY AND AI SOLUTIONS FOR YOUR ENTERPRISE
Our core areas of expertise are semantic text analytics (NLP), automated text, data and document generation (NLG), machine learning (ML) and artificial intelligence (AI). With our team of computational linguists, data scientists, software engineers and classical linguists, we’ve been operating successfully in the marketplace since 2011. A comprehensive list of our clients and project descriptions is available on our website.
Text analytics, or text mining, uses a range of methods from computational linguistics and artificial intelligence to convert unstructured textual data into structured information. Specifically, patterns and structures are extracted from input texts based on lexical properties, syntactic structures, statistical observations and machine learning, with the overall aim of gaining deep semantic insights from textual input.
Depending on the project objective and context, LangTec chooses from a wide range of machine learning methods. In our projects we have used both unsupervised learning, e.g., in document clustering, topic modelling or the creation of vector-based word and language models, and supervised learning, e.g., in text and document classification or selective information extraction. In the deep-learning domain we increasingly make use of pretrained language models. For our research-driven projects we also use transfer learning and model distillation.
The term ‘Artificial Intelligence’ denotes a broad category subsuming all of our project- and product-related activities here at LangTec. We build solutions that intelligently solve real-world problems. Our expectation of our own work is that the resulting solutions achieve a level of quality and efficiency that substantially exceeds human performance. Only if a solution notably outperforms humans do we deem the label ‘artificial intelligence’ appropriate. And even though artificial intelligence and machine learning are extremely closely interwoven these days, our understanding of the term ‘AI’ extends far beyond just machine learning.
Today, the automated generation of journalistic content from structured data is almost a commodity. Automated text generation draws on methods from computational linguistics and artificial intelligence to create human-readable copy informed by structured data. LangTec’s solution TextWriter lets you optimise generated texts with regard to a number of parameters such as text uniqueness, SEO relevance, readability, text length, target group or output channel. Typical applications of TextWriter are domains whose data is highly volatile over time, e.g., weather reports or sports coverage, and domains that require extremely broad individual coverage, such as product data in e-commerce or localised regional news.
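To make the idea of generating copy from structured data concrete, here is a minimal Python sketch of template-based text generation. It is purely illustrative and does not show how TextWriter works internally: the templates, the record fields and the function name generate_report are all assumptions made for this example.

```python
import random

# Hypothetical templates for a weather report; not taken from TextWriter.
# Several phrasings of the same content give the output some variation.
TEMPLATES = [
    "{city} can expect {condition} skies today, "
    "with a high of {high} °C and a low of {low} °C.",
    "Today in {city}: {condition}, "
    "with temperatures between {low} °C and {high} °C.",
]


def generate_report(record, seed=None):
    """Fill one of the templates with the values of a structured record."""
    rng = random.Random(seed)  # a fixed seed makes the choice reproducible
    template = rng.choice(TEMPLATES)
    return template.format(**record)


# Assumed structured input: one row of weather data.
record = {"city": "Hamburg", "condition": "cloudy", "high": 14, "low": 8}
print(generate_report(record, seed=0))
```

Production systems typically go well beyond fixed templates, for example by adapting wording, length and tone to the target group or output channel, but the basic step from data record to sentence is the same.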
With the advent of deep learning, new machine learning techniques have become available over the past 10 years. Their gains in performance come at the cost of a substantially increased need for annotated training data. Regrettably, consistently and comprehensively annotated training data is not always available in practice, be it for reasons of data protection or copyright, or simply because costly manual annotation is insufficient in scope or quality.
LangTec’s DocumentCreator addresses this challenge and makes it possible to create large volumes of training data with wide structural variance based on just a few input samples. With DocumentCreator in place, your machine learning algorithms can be trained, evaluated and tuned robustly prior to deployment into production, even when only very little actual data is at hand.
Structured knowledge representations, i.e., formal models of qualitative interdependencies within a given domain, have gained increasing importance in industrial applications over the past decade, not least because it is now technically feasible to implement and use large-scale knowledge representations efficiently in commercial contexts. Structured knowledge representations such as ontologies, knowledge graphs or triple stores make modelled domain knowledge accessible to computational processing and thus contribute significantly to raising the quality of deep semantic analytics beyond what data-only approaches can achieve. We help you build and use knowledge representations tailored to your specific needs.
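As a rough illustration of what a triple-based knowledge representation looks like, the following Python sketch stores facts as (subject, predicate, object) triples and answers simple pattern queries. It is a toy example under stated assumptions: the facts, the predicate names and the helper match are hypothetical and do not describe any particular triple store or LangTec product.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical mini knowledge graph: each fact is one triple.
TRIPLES: Set[Triple] = {
    ("Hamburg", "is_a", "City"),
    ("Hamburg", "located_in", "Germany"),
    ("Berlin", "is_a", "City"),
    ("Berlin", "located_in", "Germany"),
    ("Germany", "is_a", "Country"),
}


def match(subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> List[Triple]:
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        (s, p, o)
        for (s, p, o) in TRIPLES
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    )


# Which entities are known to be located in Germany?
for s, _, _ in match(predicate="located_in", obj="Germany"):
    print(s)
```

Real knowledge graphs add schemas, inference and dedicated query languages on top of this basic structure, but the pattern-matching idea carries over.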