Multi-SimLex for Dutch: Comparing Embedding and Prompt-Based Model Performance on Semantic Similarity

Document Type

Master Thesis

License

CC-BY-NC-ND

Abstract

This study introduces a Dutch expansion of the Multi-SimLex dataset: 1,888 word pairs annotated for semantic similarity by native Dutch speakers. Using this resource, the research evaluates 18 models with both embedding-based and prompt-based methods. Prompt-based evaluation produced the highest correlation with human judgments, with GPT-4 reaching 0.761, which suggests that large generative models apply dynamic reasoning to similarity judgments. In contrast, embedding-based evaluation favored smaller, specialized models such as FastText and BERTje. These findings underscore the importance of aligning the evaluation strategy with a model's architecture. Beyond providing a foundational resource for Dutch lexical semantics, the study suggests that large language models could eventually serve as a proxy for human similarity ratings.

Keywords

Lexical semantic similarity, Multi-SimLex dataset, computational models, embedding-based evaluation, prompt-based evaluation, dynamic reasoning
