The 53rd Language Lunch

Date: 2016-06-16

Location: G.07 Informatics Forum

Document Embeddings with Context Sampling

Stefanos Angelidis

The Paragraph Vector model has recently been proposed as a method for learning low-dimensional representations of multi-sentence documents. As with most embedding models, it relies on a proximity context to associate co-occurring words, while jointly learning embeddings for the documents that contain them. We argue that this technique has certain limitations and propose a general context sampling framework that allows context policies of varying linguistic complexity to be instantiated, as a means of better reflecting document semantic content. We evaluate our proposal on a series of document classification and information retrieval tasks and show that context sampling results in significant improvements over the original Paragraph Vector model.
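As a rough illustration of what a "context policy" could look like, the sketch below contrasts a proximity window with one alternative policy. It is not the authors' implementation: the function names, the `size` and `k` parameters, and the uniform document-level policy are all assumptions made for the example.

```python
import random

def window_policy(tokens, i, rng, size=2):
    """Proximity context: the words within `size` positions of tokens[i]."""
    lo, hi = max(0, i - size), min(len(tokens), i + size + 1)
    return [tokens[j] for j in range(lo, hi) if j != i]

def document_policy(tokens, i, rng, k=4):
    """Hypothetical alternative: k words drawn uniformly from the rest of the document."""
    others = [tokens[j] for j in range(len(tokens)) if j != i]
    return rng.sample(others, min(k, len(others)))

def training_pairs(tokens, policy, seed=0):
    """Yield (target, context) pairs; the policy decides which contexts are paired."""
    rng = random.Random(seed)
    for i, target in enumerate(tokens):
        for context in policy(tokens, i, rng):
            yield target, context

tokens = "the cat sat on the mat".split()
window_pairs = list(training_pairs(tokens, window_policy))
sampled_pairs = list(training_pairs(tokens, document_policy))
```

Swapping the policy changes which word pairs an embedding model would be trained on, while the rest of the training loop stays the same.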

Musical Meter Detection Using Context-Free Grammars

Andrew McLoead

Meter identification is the organisation of the beats of a given musical performance into a metrical structure, a tree in which each node represents a single note value. The children of each node divide its note into some number of equal-length notes (usually two or three). The metrical structure must be properly aligned in phase with the underlying musical performance so that the root of the tree corresponds to a logical musical segment, often a bar. We show that using a probabilistic context-free grammar (PCFG) to model the rhythmic structure of a musical piece can aid musical meter detection. Additionally, we show that the use of a lexicalized PCFG improves performance even further, as it is able to model the rhythmic dependencies found in music.
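The metrical tree described above can be sketched in a few lines. This covers only the tree structure, not the PCFG; the function name `metrical_levels` and the choice to measure a bar in eighth notes are assumptions made for the example.

```python
from fractions import Fraction

def metrical_levels(bar_length, divisions):
    """Return, for each level of the tree, the onset times of its nodes.

    bar_length: duration of the root node (here, a bar counted in eighth notes).
    divisions:  how many equal children each node has at successive levels.
    """
    levels = [[Fraction(0)]]        # the root starts at the beginning of the bar
    span = Fraction(bar_length)     # duration of each node at the current level
    for d in divisions:
        if d not in (2, 3):
            raise ValueError("this sketch only supports duple or triple division")
        child_span = span / d
        levels.append([onset + k * child_span
                       for onset in levels[-1] for k in range(d)])
        span = child_span
    return levels

# Two ways to organise a bar of six eighth notes:
compound = metrical_levels(6, (2, 3))   # 6/8: two beats, each split in three
simple = metrical_levels(6, (3, 2))     # 3/4: three beats, each split in two
```

Both trees have the same leaves but different beat levels (`[0, 3]` versus `[0, 2, 4]`), which is the ambiguity that meter detection has to resolve.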

Eventuality-control and disambiguation in free adjuncts and supplementary relative clauses

James Reid

Free adjuncts and supplementary relative clauses are noteworthy in that they can take eventualities as well as individuals as antecedents:

'Although there is no written law barring female citizens from driving, they are not issued local licenses, making it effectively illegal for them to drive'

'Although there is no written law barring female citizens from driving, they are not issued local licenses, which makes it effectively illegal for them to drive'

(from http://tinyurl.com/zaacele, with the interpretation: 'Their not being issued local licenses makes it effectively illegal for them to drive')

Existing analyses of such free adjuncts (e.g. Behrens 1998) separate the task of establishing a coherence relation (e.g. Result, for the above examples) from that of selecting an antecedent. Analysing and manipulating corpus examples reveals a highly complex interplay not only between these two tasks, but also with the resolution of, for example, the scope of modals and temporal adverbs. This suggests that an appropriate analysis should resolve the underspecified and ambiguous aspects of these constructions in tandem, as is done in discourse-coherent frameworks (e.g. Asher and Lascarides 2003). Here, I analyse several corpus examples as a means of demonstrating (i) how syntactic complexity can give rise to ambiguity in complex sentences containing either of these two constructions, and (ii) how hearers must draw on world knowledge to resolve this ambiguity.

Who is ziji? Binding interpretations in Chinese-English Bilinguals

Wenjia Cai; s1342561@sms.ed.ac.uk

It has been well established that prolonged exposure to the second language (L2), accompanied by long-term disuse of the first language (L1), can induce some kind of restructuring or change in the L1 grammar. Language attrition doesn't occur in an across-the-board manner within the syntactic module, and studies have shown that structures lying at the interface of different modules, for example syntax and discourse, are more susceptible to language attrition than those within core grammar. According to the Interface Hypothesis (Sorace, 2011), the "interface structures" employ resources from different cognitive domains during online processing and are thus more cognitively demanding. Consequently, bilingual speakers who don't have enough cognitive resources at their disposal could find it more difficult to process these structures. The current study intends to replicate previous findings in Tsimpli and Sorace (2004) by comparing the online performance of Chinese-English bilinguals when processing the reflexive pronoun ziji at the syntax-semantics and syntax-discourse interfaces. My hypothesis is that the syntax-discourse interface will show effects of L1 attrition while the syntax-semantics interface remains intact. The current study will also investigate the role of L1 input/contact during language attrition by comparing potential attriters with high and low frequency of L1 use.

Left-to-right Transition-based Parsing for Abstract Meaning Representation

Marco Damonte; s1333293@sms.ed.ac.uk

Semantic parsing aims at carrying out the difficult task of canonicalizing language and representing its meaning: given a natural language sentence, we want to retrieve its meaning. Abstract Meaning Representation (AMR) is a semantic representation that provides sentences with a deep semantic interpretation that includes most of the shallow-semantic NLP tasks that are usually addressed separately, such as Named Entity Recognition and Coreference Resolution. AMR was devised to be easy to annotate, so that large datasets could be easily developed, rather than easy to process with current algorithms and techniques, which makes parsing it a challenging and interesting problem. One of the common efficient and accurate parsing strategies, especially for dependency parsing, is greedy transition-based parsing. A transition system is an abstract machine characterized by a set of states and a set of actions that move from one state to another. The sentence is scanned left to right only once, hence guaranteeing linear time and space complexity. In this work we adapt the arc-eager transition system of Nivre (2004) for parsing AMR graphs.
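For readers unfamiliar with transition systems, here is a minimal sketch of the standard arc-eager system for dependency trees. It is not the AMR adaptation described in the abstract. The function name `arc_eager` and the action labels are assumptions made for the example, and the action sequence is supplied by hand where a real parser would predict each action with a classifier.

```python
from collections import deque

def arc_eager(n_tokens, actions):
    """Apply arc-eager actions to tokens 1..n_tokens (0 is an artificial root).

    Returns a dict mapping each dependent to its head.
    """
    stack = [0]                               # partially processed tokens
    buffer = deque(range(1, n_tokens + 1))    # unread tokens, left to right
    heads = {}
    for action in actions:
        if action == "SHIFT":
            stack.append(buffer.popleft())
        elif action == "LEFT-ARC":
            # The next buffer token becomes the head of the stack top.
            dep = stack[-1]
            if dep == 0 or dep in heads:
                raise ValueError("LEFT-ARC needs a headless, non-root stack top")
            heads[dep] = buffer[0]
            stack.pop()
        elif action == "RIGHT-ARC":
            # The stack top becomes the head of the next buffer token.
            dep = buffer.popleft()
            heads[dep] = stack[-1]
            stack.append(dep)
        elif action == "REDUCE":
            if stack[-1] not in heads:
                raise ValueError("REDUCE needs a stack top that already has a head")
            stack.pop()
        else:
            raise ValueError(f"unknown action: {action}")
    return heads

# "She reads books": 1 = She, 2 = reads, 3 = books
heads = arc_eager(3, ["SHIFT", "LEFT-ARC", "RIGHT-ARC", "RIGHT-ARC"])
print(heads)  # {1: 2, 2: 0, 3: 2}
```

Each token enters the stack once and leaves it at most once, so the number of actions grows linearly with sentence length.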
