Semi-automated analysis, aggregation, and visualization of user comments

The Forum 4.0 project builds on previous work with social scientists in the SCAN-4J project and aims to establish an informatics-centered, interdisciplinary research partnership. The project runs for three years (2018 – 2020).

Project Summary

Online user comments, such as those on journalistic content or product features, are often associated with low quality, hate speech, or even excessive demands on moderation. Forum 4.0 will develop new methods based on text analysis, machine learning (with human-in-the-loop), and empirical software engineering to better exploit the constructive and deliberative potential of user comments. The aim is to systematically analyze, aggregate, and visualize the content and quality of comments at runtime in order to enable constructive participation. The project builds on our previous work with social scientists and aims for an informatics-centered, interdisciplinary research network that combines the focal points “Information Governance Technologies” and “Data Science” of ahoi.digital. We foresee immense transfer potential for Hamburg as a media hub.

Involved Research Institutes

Universität Hamburg: Applied Software Technology (Prof. Maalej); Language Technology Group (Prof. Biemann)
HAW Informatik: Technik und Informatik (Prof. Zukunft); Fakultät Design, Medien und Information (Prof. Stöcker) (associated)
Hans-Bredow-Institut: Journalism Research (Prof. Loosen) (associated)

Publications

2020

Stanik, Christoph, Haering, Marlo, Jesdabodi, Chakajkla, and Maalej, Walid (2020, August). Which App Features Are Being Used? Learning App Feature Usages from Interaction Data. In 2020 IEEE 28th International Requirements Engineering Conference (RE) (pp. 66-77). IEEE.
Andersen, J. S., Schöner, T., Maalej, W. (2020). A Word-Level Uncertainty Estimation Approach for Black-Box Text Classifiers using RNNs. In Proceedings of The 28th International Conference on Computational Linguistics COLING 2020. To Appear.
Wiedemann, G., Yimam, S. M., Biemann, C. (2020): UHH-LT at SemEval-2020 Task 12: Fine-Tuning of Pre-Trained Transformer Networks for Offensive Language Detection. Proceedings of The 14th International Workshop on Semantic Evaluation (SemEval), Barcelona, Spain.
Yimam, S.M., Alemayehu, H.M., Ayele, A.A., Biemann, C. (2020): Exploring Amharic Sentiment Analysis from Social Media Texts: Building Annotation Tools and Classification Models. In Proceedings of the 28th International Conference on Computational Linguistics (COLING 2020). Barcelona, Spain (online). (to appear)
Fedtke, C., & Wiedemann, G. (2020). Hass- und Gegenrede in der Kommentierung massenmedialer Berichterstattung auf Facebook: Eine computergestützte kritische Diskursanalyse. In P. Klimczak, C. Petersen, & S. Breidenbach (Eds.), Soziale Medien? Interdisziplinäre Zugänge zur Onlinekommunikation (pp. 91-120). Springer. (to appear)
(Minor Revision) Haering, M., Bano, M., Zowghi, D., Kearney, M., Maalej, W. “Automating the Evaluation of Education Apps with App Store Data.” In IEEE Transactions on Learning Technologies.

2019

Stanik, Christoph, Marlo Haering, and Walid Maalej. “Classifying multilingual user feedback using traditional machine learning and deep learning.” 2019 IEEE 6th International Workshop on Artificial Intelligence for Requirements Engineering (AIRE). IEEE, 2019.
Tropmann-Frick, M., & Andersen, J. S. (2019). Towards Visual Data Science – An Exploration. In Proceedings of the International Conference on Human Interaction and Emerging Technologies (pp. 371-377). Springer.
Andersen, J. S. (2019). A User Centric Visual Analytics Framework for News Discussions. In Proceedings of the Workshops of the EDBT/ICDT 2019 Joint Conference.
Yimam, S. M., Ayele, A. A., Biemann, C. (2019): Analysis of the Ethiopic Twitter Dataset for Abusive Speech in Amharic. In Proceedings of International Conference On Language Technologies For All: Enabling Linguistic Diversity And Multilingualism Worldwide (LT4ALL 2019). Paris, France.
Wiedemann, G., Remus, S., Chawla, A., Biemann, C. (2019): Does BERT Make Any Sense? Interpretable Word Sense Disambiguation with Contextualized Embeddings. Proceedings of KONVENS 2019, Erlangen, Germany
Wiedemann, G., Ruppert, E. and Biemann, C. (2019): UHH-LT at SemEval-2019 Task 6: Supervised vs. Unsupervised Transfer Learning for Offensive Language Detection. Proceedings of SemEval 2019, Minneapolis, MN, USA

2018

Häring, Marlo, Wiebke Loosen, and Walid Maalej. “Who is Addressed in this Comment? Automatically Classifying Meta-Comments in News Comments.” Proceedings of the ACM on Human-Computer Interaction 2.CSCW (2018): 1-20.
Wiedemann, G., Ruppert, E., Jindal, R. and Biemann, C. (2018): Transfer Learning from LDA to BiLSTM-CNN for Offensive Language Detection in Twitter. In Proceedings of GermEval 2018, 14th Conference on Natural Language Processing (KONVENS 2018). Vienna, Austria. https://www.inf.uni-hamburg.de/en/inst/ab/lt/publications/2018-wiedemannetal-germeval.pdf

Bachelor and Master Theses

Bachelor (in progress), Mats Grashoff: Architektur einer Machine-Learning Pipeline für die Semi-Automatische Analyse von Nutzerkommentaren
Master, Max Wiechmann: ActiveAnno: Flexible and Efficient Document Annotation Tool with Active Learning Integration. M.Sc. Thesis, U Hamburg https://www.inf.uni-hamburg.de/en/inst/ab/lt/teaching/theses/completed-theses/2020-ma-wiechmann.pdf
Bachelor, Janik Schröder: Entwicklung eines Browser-Plugins zur nutzerseitigen Filtration von Hate Speech in sozialen Netzwerken. B.Sc. Thesis, U Hamburg https://www.inf.uni-hamburg.de/en/inst/ab/lt/teaching/theses/completed-theses/2020-ba-schroeder.pdf
Master, Tim Dobert: Sentiment Analysis of Informal Online Texts with Neural Networks. Universität Hamburg https://www.inf.uni-hamburg.de/en/inst/ab/lt/teaching/theses/completed-theses/2019-ma-dobert.pdf
Bachelor, Carolin Dohmen: Detecting Hate Speech in Social Media – A Machine Learning Approach.
Bachelor, Antonio Sanchez Friedeberg: Konzeption und Realisierung von Pipelines für die systematische Kombination von Dimensionsreduktion und Clustering.
Bachelor, Olimpio Machachane: Supervised Machine Learning Methods For Classification.
Bachelor, Martin Gosch: Experimentelle Evaluation von Frameworks zur Speicherung und Verarbeitung unsicherer Daten.
Bachelor, Finn Masurat: Realtime-Erkennung von Hate-Speech auf Twitter: Eine Evaluation von Apache Flink und Spark.
Bachelor, Christopher Wolfarth: Echtzeitanalyse- und Moderation von Online-Kommentaren durch Deep Learning.
Master, Tom Schöner: Detecting Uncertainty in Text Classifications: A Sequence to Sequence Approach using Bayesian RNNs.
Master, Falko Winkler: Scalable and reproducible processing of data pipelines.

Prototypes

Forum 4.0 Kommentaranalyse – https://mast-se.informatik.uni-hamburg.de/
Word Sense Disambiguation using BERT, ELMo and Flair – https://github.com/uhh-lt/bert-sense/
Aspect Based Sentiment Analysis (extended during the project) – https://github.com/uhh-lt/LT-ABSA/

Events

1st Forum 4.0 Kolloquium (in German)

We are proud to host our first interdisciplinary colloquium with local researchers in the area of semi-automated analysis, aggregation, and visualization of online user comments. We compiled an agenda of diverse talks that summarize the current state of our research in this field. The colloquium will take place on Thursday, 11 October 2018, at the informatics campus of the University of Hamburg, room D-220 (1st floor).

Agenda – First Forum 4.0 Kolloquium
09:15 – 09:30	Arrival and welcome
09:30 – 10:45	Session 1: Bots and offensive participation (W. Maalej)
	Opening – Walid Maalej
	Interevent timing in automated retweet cascades – Philipp Kessling & Christian Stöcker
	The Anatomy of Hate Speech: Datasets, Analysis and Automatic Detection – Michael Wojatzki
	Transfer Learning from LDA to BiLSTM-CNN for Offensive Language Detection in Twitter – Gregor Wiedemann
10:45 – 11:00	Break
11:00 – 12:45	Session 2: Towards constructive participation (C. Biemann)
	Tinder die Stadt: die App, die Bürgerinnen (wieder) mit Lokaljournalismus verkuppelt* – Wiebke Loosen & Julius Reimer
	Who is Addressed in this Comment? Automatically Classifying Meta-Comments in News Comments – Marlo Häring
	User Comment Recommendation in Online News Forums – Volodymyr Biryuk
	Ein Visual Analytics Framework für Nutzerkommentare – Jakob Andersen
	Closing – Walid Maalej
13:00 – 14:00	Get-together with drinks and finger food
14:00 – 18:00	Internal Forum 4.0 meeting (project members only)