Statistical semantics
In linguistics, statistical semantics applies the methods of statistics to the problem of determining the meaning of words or phrases, ideally through unsupervised learning, to a degree of precision at least sufficient for the purpose of information retrieval.
History
The term statistical semantics was first used by Warren Weaver in his well-known paper on machine translation.[1] He argued that word-sense disambiguation for machine translation should be based on the co-occurrence frequency of the context words near a given target word. The underlying assumption that "a word is characterized by the company it keeps" was advocated by J. R. Firth.[2] This assumption is known in linguistics as the distributional hypothesis.[3] Emile Delavenay defined statistical semantics as the "statistical study of the meanings of words and their frequency and order of recurrence".[4] "Furnas et al. 1983" is frequently cited as a foundational contribution to statistical semantics.[5] An early success in the field was latent semantic analysis.
Applications
Research in statistical semantics has resulted in a wide variety of algorithms that use the distributional hypothesis to discover many aspects of semantics, by applying statistical techniques to large corpora:
- Measuring the similarity in word meanings[6][7][8][9]
- Measuring the similarity in word relations[10]
- Modeling similarity-based generalization[11]
- Discovering words with a given relation[12]
- Classifying relations between words[13]
- Extracting keywords from documents[14][15]
- Measuring the cohesiveness of text[16]
- Discovering the different senses of words[17]
- Distinguishing the different senses of words[18]
- Subcognitive aspects of words[19]
- Distinguishing praise from criticism[20]
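As an illustration of the first item in the list, the distributional hypothesis can be turned into a similarity measure by counting the context words that occur near each target word and comparing the resulting count vectors. The following is a minimal sketch, not a reproduction of any cited algorithm: the toy corpus, the window size of 2, the function names `cooccurrence_vectors` and `cosine`, and the choice of raw counts with cosine similarity are all assumptions made for the example.

```python
from collections import Counter, defaultdict
import math


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, target in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:  # a word is not its own context
                    vectors[target][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[word] for word, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus; real systems use corpora of millions of words.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the ball",
    "the car needs fuel",
]

vectors = cooccurrence_vectors(corpus)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_fuel = cosine(vectors["cat"], vectors["fuel"])
print(sim_cat_dog, sim_cat_fuel)
```

In this toy corpus, "cat" and "dog" share most of their context words and so score higher than "cat" and "fuel", which share none. Practical systems typically replace raw counts with a weighting such as pointwise mutual information and reduce the dimensionality of the vectors, as latent semantic analysis does.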
Related fields
Statistical semantics focuses on the meanings of common words and the relations between common words, unlike text mining, which tends to focus on whole documents, document collections, or named entities (names of people, places, and organizations). Statistical semantics is a subfield of computational semantics, which is in turn a subfield of computational linguistics and natural language processing.
Many of the applications of statistical semantics (listed above) can also be addressed by lexicon-based algorithms, instead of the corpus-based algorithms of statistical semantics. One advantage of corpus-based algorithms is that they are typically not as labour-intensive as lexicon-based algorithms. Another advantage is that they are usually easier to adapt to new languages, or to noisier text types such as social media text, than lexicon-based algorithms are.[21] However, the best performance on an application is often achieved by combining the two approaches.[22]
References
1. Weaver 1955
2. Firth 1957
3. Sahlgren 2008
4. Delavenay 1960
5. Furnas et al. 1983
6. Lund, Burgess & Atchley 1995
7. Landauer & Dumais 1997
8. McDonald & Ramscar 2001
9. Terra & Clarke 2003
10. Turney 2006
11. Yarlett 2008
12. Hearst 1992
13. Turney & Littman 2005
14. Frank et al. 1999
15. Turney 2000
16. Turney 2003
17. Pantel & Lin 2002
18. Turney 2004
19. Turney 2001
20. Turney & Littman 2003
21. Sahlgren & Karlgren 2009
22. Turney et al. 2003