Trustworthy Human Language Technologies (TrustHLT) is a research group associated with the Professorship for Natural Language Processing at Paderborn University and led by Ivan Habernal. TrustHLT started in 2021 as an independent research group at the Technical University of Darmstadt, where some of the group members are affiliated.
I hold a W2 Professorship for Natural Language Processing at Paderborn University, Germany. My current research areas include privacy-preserving NLP, legal NLP, and explainable and trustworthy models. My research track spans argument mining and computational argumentation, crowdsourcing, large-scale corpora, serious games, sentiment and sarcasm on social media, and semantic web.
Lena is currently exploring the research area of computational argumentation in the legal domain.
Timour's current research areas include privacy-preserving NLP, differential privacy in graph neural networks, and privacy-preserving semantic representations of language.
Sebastian's research areas include privacy-preserving NLP with a focus on text rewriting with provable guarantees.
Martin's thesis compares privacy-preserving inference methods, applying them to NLP tasks and developing software to connect PyTorch with techniques like homomorphic encryption and garbled circuits.
Marius investigates question answering in the German legal domain. His thesis explores how well existing models can help laypeople obtain initial legal aid, based on a newly created dataset that pairs questions in lay language with answers in legalese.
Chris's thesis focused on finding best practices for adapting the concept of differential privacy to NLP environments while putting the needs of end users first and considering perceptual biases, in order to make differential privacy more accessible.
Lijie is a second-year PhD student in Computer Science at King Abdullah University of Science and Technology. Her research interests cover machine learning algorithms for Explainable AI (XAI), Differential Privacy, and Differentially Private Natural Language Models. She is also interested in Machine Unlearning and other data security issues.
2022, Research internship. Sudarshan is an undergraduate student in Computer Science from India. His primary research interest is in creating language processing tools that are socially and ethically responsible. He is working on a research project related to differentially private synthetic data generation.
Nina wrote her thesis on privacy-preserving techniques for crowdsourcing sensitive text data.
Johanna studied computer science at TU Darmstadt. In her bachelor thesis, she compiled an easily accessible legal benchmark dataset to enable evaluating models on a variety of legal NLP tasks.
Lars, a student of information systems technologies, cooperated with political scientists to identify indoctrination in German history textbooks through entity emotion analysis.
Ying explored privacy-preserving transformer models in the legal domain. Her thesis combined large-scale pre-training with differential privacy and evaluated the trade-off between privacy-preserving capability and downstream performance.
Sarah explored ethical argumentation in scientific literature. Her thesis focused on controversial technologies and automatic mining of absent, shifting, and evolving ethical arguments.
Manuel was a bachelor's student at TU Darmstadt focusing on machine learning. He wrote his thesis on the effectiveness of differential privacy in NLP and its impact on accuracy.
Lena studied computer science at TU Darmstadt. In her thesis she dealt with differentially private language representation learning.
Daniel explored legal argument mining in court decisions, with a focus on ECHR decisions and how their type of argumentation relates to their importance level.
Fabian's research area included legal argument mining, expert annotations, and low-resource and few-shot transfer learning for annotation recommendations.
TrustHLT currently has the following open positions:
We're looking for a postdoctoral researcher to join our group at Paderborn University and strengthen our research on privacy-preserving NLP. Read the full job posting.
We are looking for a student research assistant (HiWi) to implement models and aid research in the field of Empirical Computational Argumentation in Legal Proceedings. The primary focus will be on processing long documents, argument mining and argument reasoning in the context of the European Court of Human Rights, which is one of the most important courts concerning human rights. Read the full job posting (PDF).
The Text, Speech and Dialogue (TSD 2023) conference invited me to the beautiful city of Pilsen, Czech Republic, to give a keynote talk on privacy in NLP.
Timour Igamberdiev successfully defended his dissertation on Differential Privacy in NLP with the grade magna cum laude. Timour is the first PhD student graduating from the TrustHLT group!
TrustHLT has two papers on privacy-preserving NLP accepted to the Findings of the Association for Computational Linguistics: ACL 2023, co-authored by Timour Igamberdiev, Cleo Matzken, and Steffen Eger.
We held a tutorial on Privacy-Preserving Natural Language Processing at the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023) in Dubrovnik, Croatia. The slides are available on GitHub.
It was my pleasure to give an invited talk about Privacy-Preserving Natural Language Processing at Aalto University in Helsinki. The video recording should soon become available.
The 17th Conference of the European Chapter of the Association for Computational Linguistics will host our tutorial on Privacy-Preserving Natural Language Processing in Dubrovnik, in May 2023.
Our new paper "One size does not fit all: Investigating strategies for differentially-private learning across NLP tasks" by Manuel Senge, Timour Igamberdiev, and myself will be presented at the 2022 Conference on Empirical Methods in Natural Language Processing in Abu Dhabi in December this year.
In this winter term, I'm holding a W2 interim professorship at the Center for Information and Language Processing at the Ludwig-Maximilians-Universität München.
Our new paper "DP-Rewrite: Towards Reproducibility and Transparency in Differentially Private Text Rewriting" by Timour Igamberdiev (TrustHLT), Thomas Arnold (UKP), and myself will be presented at the 29th International Conference on Computational Linguistics in Korea in October this year.
I'm now a member of hessian.AI — The Hessian Center for Artificial Intelligence. Its mission is to drive research excellence, education, practice and leadership in AI to foster economic growth and improve the human condition.
Our paper on protecting privacy of models trained on graph data using differential privacy has been accepted at the International Conference on Language Resources and Evaluation (LREC) to be held in Marseille, France in June.
Our paper analyzing the trickiness of differentially-private text representation learning will be presented at the 60th Annual Meeting of the Association for Computational Linguistics, the world's top conference for natural language processing.
I'm giving an invited lecture at the School of Computing and Information Science, University of Maine, with a slightly provocative title: "If all you have is a hammer, everything looks like a nail: SGD-DP in privacy-preserving NLP" (download slides).
Our paper on the pitfalls of differential privacy in NLP will be presented at the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), one of the world's leading conferences for natural language processing.
I'll be giving a guest lecture at the International Summer School on "AI and Criminal Justice" in Rome on July 12th. This summer school is a great opportunity to acquire interdisciplinary and in-depth knowledge in the cutting-edge area of AI and criminal justice.
I'm happy to volunteer as a mentor for early career researchers at this year's Conference of the European Chapter of the Association for Computational Linguistics (EACL). One of the topics on the agenda is "How to survive grad school". I'm very much looking forward to some fresh perspectives!
Thanks to Yang Gao for inviting me over to Royal Holloway, University of London, to give a talk on privacy-preserving NLP, a joint work with Timour Igamberdiev. Slides available here.
Happy to join the Area Chairs for sentiment analysis and argument mining at this year's Conference on Empirical Methods in Natural Language Processing (EMNLP).
I happily accepted an invitation to join the standing reviewer board of Computational Linguistics, the "longest-running publication devoted exclusively to the computational and mathematical properties of language".
Together with Isabelle Augenstein and the tutorial chairs for NAACL, EMNLP, and ACL-IJCNLP, we are preparing next year's selection of tutorials to be presented either virtually or in person.
In this interdisciplinary collaboration, we look into argumentation in the verdicts of the European Court of Human Rights. What makes a verdict of high importance? Is it the facts? Is it the argumentation pattern? Is it the judges? Or is it something left between the lines?
We combine legal expertise with state-of-the-art NLP.
We collaborate with expert legal researcher Prof. Dr. Christoph Burchard from Goethe University Frankfurt.
Chair for German, European and International Criminal Law and Procedure, Comparative Law and Legal Theory
What does it mean for machine translation models to protect privacy? What personal information do neural machine translation systems leak? Can we protect users during inference?
In this research project, supported by the Hessisches Ministerium des Innern und für Sport, we tackle privacy-preserving natural language processing in the context of machine translation, including differential privacy and cryptographic tools.
The goal of this project is to explore Natural Language Processing methods that can dynamically identify and obfuscate sensitive information in texts, with a focus on implicit attributes of individuals, for example their ethnic background, income range, or personality traits. These methods will help to preserve the privacy of all individuals, both authors and other persons mentioned in the text. Further, we go beyond specific text sources, like social media, and aim to develop robust and highly adaptable methods that can generalize across domains and registers.
We collaborate with the UKP Lab led by Prof. Dr. Iryna Gurevych.
Director of the Ubiquitous Knowledge Processing (UKP) Lab
Slides are freely available on GitHub under open licences.
Recorded lectures are in a YouTube playlist.