Comparing Text Representations: In Search Of Caring Communities

Publication date

DOI

Document Type

Master Thesis

Collections

License

CC-BY-NC-ND

Abstract

Multiple text representation techniques (BERT, Word2Vec, LDA topics, etc.) are compared on a text classification task: identifying caring communities in Dutch Chamber of Commerce data using an RF classifier. The goal is to identify the best-performing text representation. The classifier using the Word2Vec representation achieves the highest F1-score.
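
The selection criterion described in the abstract (score each representation's classifier by F1 and keep the highest) can be sketched as follows. This is a minimal illustration, not the thesis's code: the function names, the placeholder representation names (`rep_a`, `rep_b`, `rep_c`) and the toy labels are all hypothetical, and the numbers are not results from the thesis.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined), so F1 is 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_representation(y_true, predictions):
    """Return the name of the representation with the highest F1, plus all scores.

    `predictions` maps a representation name to the labels predicted by the
    classifier trained on that representation (hypothetical structure).
    """
    scores = {name: f1_score(y_true, y_pred) for name, y_pred in predictions.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Toy data, made up for illustration only (1 = caring community, 0 = other).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
toy_predictions = {
    "rep_a": [1, 0, 1, 0, 0, 1, 1, 0],
    "rep_b": [1, 0, 1, 1, 0, 0, 0, 0],
    "rep_c": [0, 0, 1, 0, 1, 0, 1, 0],
}

best, scores = best_representation(y_true, toy_predictions)
for name, score in sorted(scores.items()):
    print(f"{name}: F1 = {score:.3f}")
print("best:", best)
```

F1 is a natural metric when the positive class is rare, since it ignores true negatives; whether that holds for the thesis's data is an assumption here.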

Keywords

NLP; BERT; ADS; ML

Citation