The 54th Language Lunch

Date: 2016-10-20

Location: G.07 Informatics Forum

Variation of adjective placement in complex noun phrases in Italian

Kristina Gulordava

Romance languages show a substantial degree of variation in prenominal versus postnominal adjective placement. In particular, the semantic differences between prenominal and postnominal adjectives have been studied extensively, both in theoretical and computational linguistics (Cinque, 1994; Bouchard, 1998; Alexiadou, 2001; Laenzlinger, 2005; Boleda, 2007; Vecchi, 2013). This previous work has focused on the relative order of a single noun and one or several adjectival modifiers. By contrast, in this work we investigate the distribution of adjectives in complex noun phrases which include additional postnominal modifiers such as a prepositional phrase (PP). In such noun phrases, three word orders are possible in principle, as illustrated by the following examples in Italian:

(1) Adj N PP:  [ un importanteA compitoN [PP di matematica] ]
(2) N Adj PP:  [ un compitoN importanteA [PP di matematica] ]
(3) N PP Adj:  [ un compitoN [PP di matematica] importanteA ]
‘an importantA math homeworkN’

First, we quantitatively describe the distribution of these orders in Italian, based on data extracted from a syntactically annotated corpus. The statistical analysis of these new data reveals that the prenominal adjective position is more frequent in complex noun phrases than in simple noun phrases. We investigate this phenomenon in more depth for the case of Italian noun phrases with PP complements introduced by the preposition ‘di’. To this end, we collect a large number of cases of adjective variation from the Wikipedia corpus of Italian. Our initial findings suggest that the preference for prenominal adjective position is induced by the lexico-statistical properties of the N-PP phrases.

The Sound of Social Mobility: Investigating ‘New Middle Class’ Speech in Edinburgh

Victoria Dickson

Social class has traditionally been a major topic of interest in the fields of sociolinguistics and dialectology, in which socioeconomic status is regarded as a major source of language variation and change. However, studies of class mobility in variationist sociolinguistics are relatively sparse, the focus remaining largely on contrasts between static social class groups. The present study explores the relationship between class mobility and sociophonetic variation with an auditory analysis of two phonetic variables that are reported to be socially stratified in the context of urban speech in Scotland. These variables are (1) the glottal replacement of /t/ in coda or non-foot-initial onset positions, e.g. bu[ʔ], bu[ʔ]er, moun[ʔ]ain (cf. Speitel & Johnston 1983; Johnston 1997; Stuart-Smith 1999), and (2) the phonemic distinction of /w/ and /ʍ/, where sounds represented orthographically with wh (e.g. white, somewhere) are realised as [ʍ] (Stuart-Smith 2004: 61).

Spontaneous speech is analysed from native speakers of Scottish English born in Edinburgh, aged 57-69 years, from three socioeconomic groups: Working Class (WC), Established Middle Class (EMC) and New Middle Class (NMC), the third category consisting of speakers who have experienced upward mobility over their lifetime. Patterns of realisation across the three socioeconomic groups indicate widespread glottalisation and the merging of /ʍ/ with /w/ in WC speech, while EMC speakers in comparison show a higher rate of the prestigious alveolar [tʰ] realisation of /t/ and variable retention of the [ʍ] realisation.

Most striking in the results is that the upwardly mobile NMC group shows the highest production rate of the [tʰ] and [ʍ] variants. Thus, despite their arguably intermediate socioeconomic status, speakers from the NMC group exceed the proportion of overt prestige variants observed for EMC speech. This result mirrors previous findings by Dickson and Hall-Lew (2015) of an NMC cross-over pattern in the realisation of non-prevocalic /r/ in Edinburgh. It is argued that this distinct pattern among upwardly mobile speakers reflects an ideology of linguistic prestige distinct from that of speakers from a stable socioeconomic background. These results extend previous findings of unique patterns of phonetic variation among upwardly mobile individuals, offering greater insight into the linguistic representation of evolving class identities.

Regular Graph Languages for NLP

Sorcha Gilroy

Distributions over strings and trees can be represented by probabilistic regular languages, and this representation characterizes many models in natural language processing. Recently, several datasets have become available which represent compositional semantics as graphs, so it is natural to seek the equivalent of probabilistic regular languages for graphs. To this end, we survey three families of graph languages: Hyperedge Replacement Languages (HRL), which can be made probabilistic; Monadic Second Order Languages (MSOL), which support crucial closure properties of regular languages such as intersection; and Regular Graph Languages (RGL; Courcelle, 1991), a subfamily of both HRL and MSOL which inherits the desirable properties of each, and has not been widely studied or previously applied to NLP. Focusing on RGL, we give a new inclusion proof, provide the first concrete algorithm for grammar intersection and parsing, and demonstrate that RGL is expressive enough to represent some common semantic phenomena.

Evaluating Informal-Domain Word Representations With UrbanDictionary

Naomi Saphra

Existing corpora for intrinsic evaluation are not targeted towards tasks in informal domains such as Twitter or news comment forums. We want to test whether a representation of informal words fulfills the promise of eliding explicit text normalization as a preprocessing step. One possible evaluation metric for such domains is the proximity of spelling variants. We propose how such a metric might be computed and how a spelling variant dataset can be collected using UrbanDictionary.
