The 32nd Language Lunch

Date: 2012-04-12

Location: G.07 Informatics Forum

Determining the number of speakers in a meeting using microphone array features

Erich,Zwyssig; EADS IW; e.p.zwyssig@sms.ed.ac.uk

Steve,Renals; Informatics; S.Renals@ed.ac.uk

Mike,Lincoln; Informatics; mlincol1@inf.ed.ac.uk

The accuracy of speaker diarisation in meetings relies heavily on determining the correct number of speakers. In this paper we present a novel algorithm based on time difference of arrival (TDOA) features that aims to find the correct number of active speakers in a meeting and thus aid the speaker segmentation and clustering process. With our proposed method, the microphone array TDOA values and the known geometry of the array are used to calculate a speaker matrix, from which we determine the correct number of active speakers with the aid of the Bayesian information criterion (BIC). In addition, we analyse several well-known voice activity detection (VAD) algorithms and verify their fitness for meeting recordings. Experiments were performed using the NIST RT06, RT07 and RT09 data sets, and resulted in reduced error rates compared with BIC-based approaches.
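To illustrate how a TDOA value and the array geometry together constrain where a speaker is, the sketch below uses the standard far-field model for a single microphone pair, in which the delay equals the spacing times the cosine of the arrival angle, divided by the speed of sound. It is not the speaker-matrix algorithm from the abstract, whose details are not given here; the function names, the 0.2 m spacing and the 343 m/s speed of sound are assumptions chosen for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value


def tdoa_from_angle(angle_rad, mic_spacing_m, c=SPEED_OF_SOUND):
    """Far-field TDOA in seconds for a source at angle_rad from the microphone axis."""
    return mic_spacing_m * math.cos(angle_rad) / c


def angle_from_tdoa(tdoa_s, mic_spacing_m, c=SPEED_OF_SOUND):
    """Invert the far-field model to recover the arrival angle in radians.

    The ratio is clamped to [-1, 1] so that a noisy TDOA estimate slightly
    outside the physically possible range does not raise a math domain error.
    """
    ratio = max(-1.0, min(1.0, c * tdoa_s / mic_spacing_m))
    return math.acos(ratio)


# Hypothetical example: two microphones 0.2 m apart, source at 60 degrees.
spacing = 0.2
delay = tdoa_from_angle(math.radians(60.0), spacing)
recovered = math.degrees(angle_from_tdoa(delay, spacing))
print(f"TDOA = {delay * 1e6:.1f} microseconds, recovered angle = {recovered:.1f} degrees")
```

Under this model, speakers seated at different angles produce distinct TDOA values, which is what makes TDOA features usable for counting speakers. A single pair only resolves the angle relative to its own axis, so a real array would combine several pairs.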

Feature spreading and coarticulation coexist in Libyan Arabic

Tareq,Maiteq; Linguistics; tareqmaiteq@ling.ed.ac.uk

Alice,Turk; Linguistics; turk@ling.ed.ac.uk

This is part of ongoing PhD research aiming to quantify how anticipatory pharyngealisation in Arabic varies as a function of prosodic boundary level (syllable vs. word vs. phrase vs. intonation phrase). Pharyngealisation is manifested in F2 lowering in emphatic compared to plain contexts. F2 was measured at offset, mid and onset points of both vowels in [V2 b V1 # Emphatic trigger] sequences, where the strength of the # was varied syntactically. The duration of the final vowel V1 was also measured to assess how pharyngealisation was affected by temporal distance from the trigger. Six Libyans produced two repetitions of 62 minimal pairs in all boundary conditions. Linear mixed effects results show (1) that pharyngealisation on both vowels across a syllable boundary is stable, (2) effects of pharyngealisation on the final vowel, i.e. V1, across word and phrase boundaries, and (3) no evidence of pharyngealisation across an IP boundary. An examination of V1 + pause durations suggests that the lack of coarticulatory effects on the final vowel, i.e. V1, across an IP boundary may be due to the temporal distance from the trigger: all tokens in this condition had a pre-trigger pause. These results are consistent with the view that anticipatory coarticulation is qualitatively different within as compared to across word boundaries. They suggest that pharyngealisation within words may be phonological, whereas across word boundaries it is primarily a phonetic process, conditioned by the temporal proximity of the trigger. Implications for speech production models, speaker variability, and prosodic constituency structure are considered.

Verbal problem-solving in Deafness and Autism Spectrum Disorders

Ben,Alderson-Day; Psychology; b.d.alderson-day@sms.ed.ac.uk

People with autism spectrum disorders (ASDs) use less efficient strategies than typically-developing participants on measures of verbal problem-solving such as the Twenty Questions Task (TQT; Minshew et al., 1994). While this can be explained with reference to autism-specific cognitive deficits, the problem-solving of deaf participants suggests a contributory role of atypical language development. Like participants with ASD, deaf participants have been reported to ask over-specific questions in their problem-solving on the TQT, even when they possess good language skills (Marschark & Everhart, 1999). It is thought that this reflects atypical organization of semantic networks (Marschark et al., 2004). However, previous research on this profile has not controlled for verbal and non-verbal IQ differences between deaf and hearing participants, so it is unclear how similar deaf problem-solving is to ASD. Moreover, the link between problem-solving and semantic organization has not been demonstrated empirically. Preliminary results suggest that the problem-solving profile of deaf participants on the TQT is a) less efficient than hearing counterparts and b) very similar to ASD performance. Semantic decision performance in deaf children also indicates links between basic-superordinate category associations and questioning efficiency in problem-solving. Overlaps in deaf and ASD problem-solving are important in understanding the long-term effects of atypical language development on cognitive skills.

Towards a measure of optimization in natural vowel systems

Jon William,Carr; PPLS; j.w.carr@sms.ed.ac.uk

Computational simulations of the emergence and evolution of phonological systems have shown that, given sufficient time, organizations of the articulatory space emerge in which phonemes are optimally distinctive (e.g. Steels, 1997; de Boer, 2000; Oudeyer, 2005; de Boer & Zuidema, 2010). However, there has been little investigation into the typological description of articulatory optimization across the world’s languages. In this poster I introduce a methodology for measuring the optimization of vowel systems, which proceeds in four steps: first, we measure the formant frequencies of a language’s monophthongs; second, we plot the vowels in a perceptual vowel space; third, following Liljencrants and Lindblom (1972), we calculate the potential energy in the system using the inverse-square law from Theoretical Physics; finally, we use Monte Carlo techniques to measure the non-randomness of the system. Using recordings from the UCLA Phonetics Lab Archive (Ladefoged & Blankenship, 2007), this method has been applied to 100 languages. The results suggest that there is a high level of variation in the optimization of vowel systems. I also explore the potential for cross-linguistic correlational studies using this measure, which could reveal whether external social pressures affect the emergent state of vowel systems.
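The last two steps of the methodology described in this abstract can be sketched roughly as follows. The energy function sums the inverse squared distances between all vowel pairs, so a lower value means a more dispersed system. The Monte Carlo step then asks how often a random system of the same size is at least as dispersed. The vowel coordinates, the bounding box, the uniform null model and the function names are all assumptions made for illustration; the poster's actual perceptual scale and sampling scheme are not specified here.

```python
import itertools
import math
import random


def potential_energy(vowels):
    """Sum of 1/d^2 over all vowel pairs; lower energy means more dispersion."""
    total = 0.0
    for (x1, y1), (x2, y2) in itertools.combinations(vowels, 2):
        d2 = (x1 - x2) ** 2 + (y1 - y2) ** 2
        if d2 == 0.0:
            return math.inf  # coincident vowels: infinitely high energy
        total += 1.0 / d2
    return total


def monte_carlo_p(vowels, bounds, trials=2000, seed=0):
    """Fraction of random systems whose energy is no higher than the observed one.

    Random systems have the same number of vowels, drawn uniformly from the
    rectangle given by bounds = ((xmin, xmax), (ymin, ymax)). This null model
    is an assumption of the sketch.
    """
    rng = random.Random(seed)
    (xmin, xmax), (ymin, ymax) = bounds
    observed = potential_energy(vowels)
    hits = 0
    for _ in range(trials):
        sample = [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)) for _ in vowels]
        if potential_energy(sample) <= observed:
            hits += 1
    return hits / trials


# Hypothetical five-vowel system, (F1, F2) already converted to perceptual units.
system = [(2.8, 13.8), (4.5, 12.8), (7.0, 10.5), (4.6, 7.5), (3.0, 6.5)]
space = ((2.0, 8.0), (6.0, 15.0))  # assumed extent of the vowel space
print("energy:", potential_energy(system))
print("proportion of random systems at least as dispersed:", monte_carlo_p(system, space))
```

A small proportion indicates that the observed system is more dispersed than chance would predict, which is one way to read "non-randomness" in this setting.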
