Determining Compositionality of Word Expressions Using Word Space Models

This research focuses on determining the semantic compositionality of word expressions using word space models (WSMs). We discuss previous work employing WSMs and present the differences among the proposed approaches, which include the types of WSMs, corpora, preprocessing techniques, methods for determining compositionality, and evaluation testbeds. We also present the results of our own approach to determining semantic compositionality, which is based on comparing the distributional vectors of expressions with those of their components. The vectors were obtained by Latent Semantic Analysis (LSA) applied to the ukWaC corpus. Our results outperform those of all the participants in the Distributional Semantics and Compositionality (DISCO) 2011 shared task.
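The idea of comparing an expression's vector with the vectors of its components can be illustrated with a minimal sketch. It is not the authors' implementation: the three-dimensional vectors below are made-up stand-ins for LSA-reduced vectors, and both the function names and the choice of averaging cosine similarities over the components are assumptions made purely for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def compositionality_score(vectors, expression, components):
    """Average cosine similarity between an expression and each of its components.

    Averaging is an illustrative assumption, not a method taken from the paper.
    """
    sims = [cosine(vectors[expression], vectors[c]) for c in components]
    return sum(sims) / len(sims)


# Hypothetical, hand-written vectors standing in for LSA output.
vectors = {
    "hot_dog": [0.9, 0.1, 0.1],
    "hot": [0.1, 0.9, 0.2],
    "dog": [0.2, 0.1, 0.9],
    "red_car": [0.2, 0.1, 0.9],
    "red": [0.6, 0.2, 0.5],
    "car": [0.1, 0.1, 0.95],
}

idiom = compositionality_score(vectors, "hot_dog", ["hot", "dog"])
literal = compositionality_score(vectors, "red_car", ["red", "car"])
print(f"hot_dog: {idiom:.3f}  red_car: {literal:.3f}")
```

Under these toy vectors, the literal expression scores higher than the idiomatic one, which is the kind of signal such a comparison is meant to provide. Real vectors would come from a word space model built over a large corpus.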

Keywords: compositionality, word space model, distributional semantics, semantic space, MWE, multiword expression, idiom, latent semantic analysis

Year: 2013

Authors of this publication:


Lubomír Krčmář


E-mail: lkrcmar@kiv.zcu.cz

Luboš graduated from the University of West Bohemia in 2009 and is now a PhD student. His research focuses on natural language processing, information retrieval, and the semantic similarity of texts of varying length. He is especially interested in the automatic extraction of collocations and idiomatic expressions from large corpora.

Karel Ježek


Phone:  +420 377632475, 377632400
E-mail: jezek_ka@kiv.zcu.cz
WWW: http://www-kiv.zcu.cz/~jezek_ka/

Karel is a group coordinator and supervises PhD students working on the group's research projects.

Related Projects:


Exploration of Semantic Spaces

Authors:  Karel Ježek, Lubomír Krčmář, Miloslav Konopík
Desc.: This work focuses on semantic relations between words and on the application of these relations in research fields such as information retrieval, machine translation, or document clustering.