Semi-automated analysis, aggregation, and visualization of user comments


The Forum 4.0 project builds on previous work of the SCAN-4J project with social scientists and strives for an informatics-centered, interdisciplinary research partnership. The project runs for three years (2018 – 2020).

Project Summary

Online user comments, for example on journalistic content or product features, are often associated with low quality, hate speech, or even excessive demands on moderation. Forum 4.0 will develop new methods based on text analysis, machine learning (with a human in the loop), and empirical software engineering to better exploit the constructive and deliberative potential of user comments. The aim is to systematically analyze, aggregate, and visualize the content and quality of comments at runtime in order to enable constructive participation. The project builds on our previous work with social scientists and aims for an informatics-centered, interdisciplinary research network that combines the focal points “Information Governance Technologies” and “Data Science”. We foresee immense transfer potential for Hamburg as a media location.

Involved Research Institutes

  • Universität Hamburg
  • HAW Informatik
  • Hans-Bredow-Institut

Publications



  • Stanik, Christoph, Haering, Marlo, Jesdabodi, Chakajkla, and Maalej, Walid (2020, August). Which App Features Are Being Used? Learning App Feature Usages from Interaction Data. In 2020 IEEE 28th International Requirements Engineering Conference (RE) (pp. 66-77). IEEE.
  • Andersen, J. S., Schöner, T., Maalej, W. (2020). A Word-Level Uncertainty Estimation Approach for Black-Box Text Classifiers using RNNs. In Proceedings of The 28th International Conference on Computational Linguistics COLING 2020. To Appear.
  • Wiedemann G, Yimam S.M., Biemann C. (2020): UHH-LT at SemEval-2020 Task 12: Fine-Tuning of Pre-Trained Transformer Networks for Offensive Language Detection. Proceedings of The 14th International Workshop on Semantic Evaluation (SemEval), Barcelona, Spain.
  • Yimam, S.M., Alemayehu, H.M., Ayele, A.A., Biemann, C. (2020): Exploring Amharic Sentiment Analysis from Social Media Texts: Building Annotation Tools and Classification Models. In Proceedings of the 28th International Conference on Computational Linguistics (COLING 2020). Barcelona, Spain (online). (to appear)
  • Fedtke, C., & Wiedemann, G. (2020). Hass- und Gegenrede in der Kommentierung massenmedialer Berichterstattung auf Facebook: Eine computergestützte kritische Diskursanalyse. In P. Klimczak, C. Petersen, & S. Breidenbach (Eds.), Soziale Medien? Interdisziplinäre Zugänge zur Onlinekommunikation (91-120). Springer. (to appear)
  • (Minor Revision) Haering, M., Bano, M., Zowghi, D., Kearney, M., Maalej, W. “Automating the Evaluation of Education Apps with App Store Data.” IEEE Transactions on Learning Technologies.


  • Stanik, Christoph, Marlo Haering, and Walid Maalej. “Classifying multilingual user feedback using traditional machine learning and deep learning.” 2019 IEEE 6th International Workshop on Artificial Intelligence for Requirements Engineering (AIRE). IEEE, 2019.
  • Tropmann-Frick, M., & Andersen, J. S. (2019). Towards Visual Data Science - An Exploration. In Proceedings of the International Conference on Human Interaction and Emerging Technologies (pp. 371-377). Springer.
  • Andersen, J. S. (2019). A User Centric Visual Analytics Framework for News Discussions. In Proceedings of the Workshops of the EDBT/ICDT 2019 Joint Conference.
  • Yimam S. M., Ayele, A. A., Biemann C. (2019): Analysis of the Ethiopic Twitter Dataset for Abusive Speech in Amharic. In Proceedings of International Conference On Language Technologies For All: Enabling Linguistic Diversity And Multilingualism Worldwide (LT4ALL 2019). Paris, France
  • Wiedemann, G., Remus, S., Chawla, A., Biemann, C. (2019): Does BERT Make Any Sense? Interpretable Word Sense Disambiguation with Contextualized Embeddings. Proceedings of KONVENS 2019, Erlangen, Germany
  • Wiedemann, G., Ruppert, E. and Biemann, C. (2019): UHH-LT at SemEval-2019 Task 6: Supervised vs. Unsupervised Transfer Learning for Offensive Language Detection. Proceedings of SemEval 2019, Minneapolis, MN, USA


  • Häring, Marlo, Wiebke Loosen, and Walid Maalej. “Who is Addressed in this Comment? Automatically Classifying Meta-Comments in News Comments.” Proceedings of the ACM on Human-Computer Interaction 2.CSCW (2018): 1-20.
  • Wiedemann, G., Ruppert, E., Jindal, R. and Biemann, C. (2018): Transfer Learning from LDA to BiLSTM-CNN for Offensive Language Detection in Twitter. In Proceedings of GermEval 2018, 14th Conference on Natural Language Processing (KONVENS 2018). Vienna, Austria

Bachelor and Master Theses

  • Bachelor (in progress), Mats Grashoff: Architektur einer Machine-Learning Pipeline für die Semi-Automatische Analyse von Nutzerkommentaren
  • Master, Max Wiechmann: ActiveAnno: Flexible and Efficient Document Annotation Tool with Active Learning Integration. M.Sc. Thesis, U Hamburg
  • Bachelor, Janik Schröder: Entwicklung eines Browser-Plugins zur nutzerseitigen Filtration von Hate Speech in sozialen Netzwerken. B.Sc. Thesis, U Hamburg
  • Bachelor, Master, Tim Dobert: Sentiment Analysis of Informal Online Texts with Neural Networks. Universität Hamburg
  • Bachelor, Carolin Dohmen: Detecting Hate Speech in Social Media – A Machine Learning Approach.
  • Bachelor, Antonio Sanchez Friedeberg: Konzeption und Realisierung von Pipelines für die systematische Kombination von Dimensionsreduktion und Clustering.
  • Bachelor, Olimpio Machachane: Supervised Machine Learning Methods For Classification.
  • Bachelor, Martin Gosch: Experimentelle Evaluation von Frameworks zur Speicherung und Verarbeitung unsicherer Daten.
  • Bachelor, Finn Masurat: Realtime-Erkennung von Hate-Speech auf Twitter: Eine Evaluation von Apache Flink und Spark.
  • Bachelor, Christopher Wolfarth: Echtzeitanalyse- und Moderation von Online-Kommentaren durch Deep Learning.
  • Master, Tom Schöner: Detecting Uncertainty in Text Classifications: A Sequence to Sequence Approach using Bayesian RNNs.
  • Master, Falko Winkler: Scalable and reproducible processing of data pipelines.



1st Forum 4.0 Colloquium (in German)

We are proud to host our first interdisciplinary colloquium with local researchers in the area of semi-automated analysis, aggregation, and visualization of online user comments. We have compiled an agenda of diverse talks that summarize the current state of our research in this field. The colloquium will take place on Thursday, 11 October 2018, at the informatics campus of the University of Hamburg, room D-220 (1st floor).

Agenda – First Forum 4.0 Colloquium
09:15 – 09:30 Arrival and welcome
09:30 – 10:45 Session 1: Bots and offensive participation (W. Maalej)
  • Opening (Walid Maalej)
  • Interevent timing in automated retweet cascades (Philipp Kessling & Christian Stöcker)
  • The Anatomy of Hate Speech: Datasets, Analysis and Automatic Detection (Michael Wojatzki)
  • Transfer Learning from LDA to BiLSTM-CNN for Offensive Language Detection in Twitter (Gregor Wiedemann)
10:45 – 11:00 Break
11:00 – 12:45 Session 2: Towards constructive participation (C. Biemann)
  • Tinder die Stadt: die App, die Bürger*innen (wieder) mit Lokaljournalismus verkuppelt (Wiebke Loosen & Julius Reimer)
  • Who is Addressed in this Comment? Automatically Classifying Meta-Comments in News Comments (Marlo Häring)
  • User Comment Recommendation in Online News Forums (Volodymyr Biryuk)
  • Ein Visual Analytics Framework für Nutzerkommentare (Jakob Andersen)
  • Closing (Walid Maalej)
13:00 – 14:00 Get-together with drinks and finger food
14:00 – 18:00 Internal Forum 4.0 meeting (project members only)