Determining Compositionality of Word Expressions Using Word Space Models

This research focuses on determining the semantic compositionality of word expressions using word space models (WSMs). We discuss previous work employing WSMs and present the differences among the proposed approaches, which include the types of WSMs, corpora, preprocessing techniques, methods for determining compositionality, and evaluation testbeds. We also present the results of our own approach to determining semantic compositionality, based on comparing distributional vectors of expressions and their components. The vectors were obtained by Latent Semantic Analysis (LSA) applied to the ukWaC corpus. Our results outperform those of all the participants in the Distributional Semantics and Compositionality (DISCO) 2011 shared task.
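
The idea of comparing the distributional vector of an expression with the vectors of its components can be illustrated with a minimal sketch. It is not the implementation used in the publication: the toy co-occurrence counts, the function names, the choice of three latent dimensions, and the use of the average cosine as the score are all assumptions made for illustration. The abstract states only that LSA vectors of expressions and of their components were compared.

```python
import numpy as np

def lsa_vectors(counts, k):
    """Project the rows of a term-by-context count matrix onto k latent dimensions."""
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    return u[:, :k] * s[:k]

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector has zero length."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def compositionality_score(vectors, index, expression, components):
    """Average cosine between an expression and each of its components (assumed measure)."""
    e = vectors[index[expression]]
    sims = [cosine(e, vectors[index[c]]) for c in components]
    return sum(sims) / len(sims)

# Hypothetical toy counts; each row is a term, each column a context feature.
# In practice the expression would be treated as a single token in the corpus.
terms = ["red_car", "red", "car", "hot_dog", "hot", "dog"]
index = {t: i for i, t in enumerate(terms)}
counts = np.array([
    [5, 4, 0, 0, 0],  # red_car
    [1, 6, 0, 0, 0],  # red
    [6, 1, 0, 0, 0],  # car
    [0, 0, 6, 0, 0],  # hot_dog
    [0, 0, 1, 0, 5],  # hot
    [0, 0, 0, 6, 0],  # dog
], dtype=float)

vectors = lsa_vectors(counts, k=3)
compositional = compositionality_score(vectors, index, "red_car", ["red", "car"])
idiomatic = compositionality_score(vectors, index, "hot_dog", ["hot", "dog"])
print(f"red_car: {compositional:.3f}, hot_dog: {idiomatic:.3f}")
```

In this toy setting the expression that shares contexts with both of its components receives the higher score, which is the intuition behind treating low similarity as a sign of non-compositional (idiomatic) use.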

Keywords: compositionality, word space model, distributional semantics, semantic space, MWE, multiword expression, idiom, latent semantic analysis

Year: 2013

Authors of this publication:

Lubomír Krčmář


Luboš graduated from the University of West Bohemia in 2009 and is now a PhD student. His research focuses on natural language processing, information retrieval, and the semantic similarity of texts of varying length. He is especially interested in the automatic extraction of collocations and idiomatic expressions from large corpora.

Karel Ježek

Phone: +420 377632475, 377632400

Karel is a group coordinator and a supervisor of PhD students working on the Group's research projects.

