Natural Language Processing and Software Engineering

Content

This lecture introduces the fundamentals of the automated processing of natural language texts, a field that is becoming increasingly important.

Linguistic input plays a critical role in interactive systems such as voice-command interfaces, assistance systems, and query interfaces. Additionally, the analysis and processing of text-based software artifacts is an important field of research. Computational linguistics is therefore of great importance not only for software applications but also for software engineering itself.

The aim of this lecture is to provide basic knowledge of natural language processing (NLP) and its potential applications in the development of software systems. Key topics include the automated analysis of texts, the challenges posed by the inherent ambiguity of natural language, the translation of natural language texts into software models, and the use of large language models (LLMs) in software engineering. The lecture will also explore current research developments and trends in the field.

Competency Goals: 

Students know basic concepts of linguistics such as syntax, semantics, and pragmatics, and can explain and compare them. They are familiar with lexical relations such as polysemy, homonymy, and troponymy and can identify relevant examples. Furthermore, they can identify and compare the connections between these relations.

Students are familiar with basic concepts of computational linguistics. They can explain basic techniques such as part-of-speech tagging, lemmatization, word similarity, and disambiguation. They can describe the associated methods (lexical, rule-based, or probabilistic) and assess their respective strengths and weaknesses. They can name and explain different parsing methods and reproduce them conceptually.

Students can describe and compare the structure, content, and benefits of different knowledge bases. In addition to the overarching concepts of ontologies, lexical databases, and other knowledge representations, they are familiar with specific examples such as WordNet and DBpedia and can use them.

Students understand how the functionality of basic computational linguistics techniques relates to their applicability in software engineering. They can break down tool chains into individual components and evaluate them. In particular, they can analyze and evaluate different applications, including automated modeling, the improvement of requirements specifications, and traceability link recovery. Students can also explain the concept of large language models (LLMs) and their use in language processing. Finally, they can identify application scenarios for text analysis systems in software engineering and design their own solutions.

Workload:

3 ECTS correspond to approximately 90 hours of work, including:

approx. 30 hours of attending lectures

approx. 45 hours of preparation and follow-up work

approx. 15 hours of exam preparation

Language of instruction: English