Gaelic Algorithmic Research Group

News and resources about computational research on the Gaelic languages

Category: Agallamhan – Interviews

An interview with Lucy Evans

In this series, we look at people who have significantly advanced the field of Gaelic, Irish and Manx language technology. For the third interview, we hear from Lucy Evans. Lucy has only recently come to the worlds of Gaelic and language technology, but she is involved in a project that will, it is hoped, come to have great importance in the future. In August 2020, she finished her MSc in Speech and Language Processing at the University of Edinburgh. Shortly after that, she joined a research team that is working to develop the first working speech recogniser for Scottish Gaelic. The project began in September 2020 with funding from Soillse, the national research network for the maintenance and revitalisation of Gaelic language and culture. The research project is a collaboration between the University of Edinburgh, the University of the Highlands and Islands (UHI) and Quorate Technology Ltd. In the interview, Lucy tells us how she took an interest in speech and language technology and how someone who, at present, has little Gaelic is able to work on such a complicated project.

Interview with Lucy Evans

“You’ve recently joined the research team developing an automatic speech recogniser for Scottish Gaelic. Tell us a little bit about your background. For example, where are you from, and what got you into language technology work?”

Lucy Evans

I grew up bilingually in Switzerland, speaking English and Italian, before moving to the UK for secondary school. Being bilingual at a young age definitely sparked a curiosity about language, and I went on to study French and Linguistics at the University of Leeds. There, I absolutely loved studying linguistics, so started looking for jobs where I could apply my knowledge from the subject. This led me to discover the field of computational linguistics, and through this I found the MSc in Speech and Language Processing. The MSc encompasses all aspects of language technology, and so was a perfect introduction to the field!

“You’ve just finished the MSc in Speech and Language Processing at the University of Edinburgh. What did you find particularly interesting about the course? Do you have any advice for someone who is thinking about doing it in the future?”

Honestly, I found the whole course really interesting! I was constantly in awe of what I was learning – the interface between computer science and linguistics is niche, and so the techniques used are really specialised. I just find the ability of computers to pick up on all the complexities of language so interesting.

My advice for anyone taking the MSc in the future is simply to be prepared for a really intense year – you’ll be challenged constantly, not only academically, but with time management too. Having said this, the stress is definitely worth it! The course covers a huge amount of content in such a short period of time, which means you’ll be left with a really strong background in the field. A second piece of advice is to get friendly with your peers – there is such a sense of community within the course, and this is undoubtedly one of the loveliest aspects of the MSc. You’ll also get a huge amount of support from Simon King, the course director – make the most of this. Everyone really is there to help and support you, and there is so much more to the MSc than just the course content.

“For those not involved in speech technology, it might seem incredible that someone without Gaelic could develop a speech recogniser for the language. Can you explain how this is possible? And how is working with a minority language going to be different from working with a large language like English?”

As long as you have the necessary resources, it’s only the computer that has to do the language learning! One of the resources I’m talking about here is the dictionary – which essentially maps any written Gaelic word to its phonetic pronunciation. Using this and some transcribed speech data, we can split the speech into its smaller phonetic units, depending on the words in the transcription. Then we train the speech recogniser to learn what these smaller units generally sound like. When new speech is input to the speech recogniser, it can use this lower-level acoustic knowledge to predict which phones (and consequent words) make up the input speech. In this way, as long as you have appropriate (and high-quality) resources, you don’t actually need to learn the language you’re working on – the computer can do that itself!
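
As a rough illustration of the dictionary step described above, the following Python sketch expands a transcription into a sequence of phones by looking each word up in a pronunciation lexicon. It is a minimal sketch only: the lexicon entries and phone symbols are invented for illustration and are not taken from the project’s dictionary, and the function name `transcript_to_phones` is hypothetical.

```python
# Toy pronunciation lexicon. The entries and phone symbols are invented for
# illustration; they are not taken from any real Gaelic dictionary.
TOY_LEXICON = {
    "madainn": ["m", "a", "t", "i", "N"],
    "mhath": ["v", "a"],
}


def transcript_to_phones(transcript, lexicon):
    """Expand a transcript into one phone sequence via dictionary lookup.

    Returns the phones plus any words missing from the lexicon, since
    out-of-vocabulary words cannot be split into phones without an entry.
    """
    phones, unknown = [], []
    for word in transcript.lower().split():
        if word in lexicon:
            phones.extend(lexicon[word])
        else:
            unknown.append(word)
    return phones, unknown


phones, unknown = transcript_to_phones("Madainn mhath", TOY_LEXICON)
print(phones)   # phone labels that training would pair with the audio
print(unknown)  # words that still need a dictionary entry
```

In a real system, the phone sequence would then be aligned with the recorded audio so that the recogniser can learn what each unit sounds like; that step is beyond this sketch.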

Working with a minority language adds a challenge in that we won’t necessarily have these resources available. Luckily, for Scottish Gaelic, a digital dictionary has already been created. But this is definitely not the case for most minority languages, making the task significantly harder for non-native speakers to attempt. Furthermore, good-quality transcribed speech data is generally not so easy to come by in minority languages. In the world of machine learning, the general pattern is that the more data you have, the better your system will be. So, with less data available for these languages, it’s harder to get a good system up and running. But there are many mitigating methods we can use to boost the performance of a low-resource system – it’s really about finding what works best for the dataset.

“In your own lifetime, you’ve seen language technology change and permeate how we work and live. What’s been your own experience of the changes that it has brought?”

When I was younger, I used language technology but was never really aware of what was going on in the background. Take something like a sat-nav: this is probably one of the first speech technologies I came across, and I remember just laughing about the robotic quality of the synthesised speech – I had no idea how complex the problem actually is! But the amount this has progressed in the last 10 years is crazy – it’s really impressive to see how far things have come in such a short time. For example, we can now ask a mobile phone any question and have it answer us instantly, in near-perfect speech. Things like predictive text and spell-check are other language technologies that are now so embedded in my day-to-day life that I almost forget the complex things they’re doing behind the scenes.

“What are your predictions for language technology in the year 2050? If you had your own way, what would you like to see by that time?”

This is a tricky question – considering just the changes in my lifetime, who knows where we’ll be 30 years from now! In an ideal world, I’d love to see language tech being used more to help people and cultures. This project is an example of that – creating modern technology for endangered languages is an important way to revitalise and preserve those languages! Something I’m also really interested in is using technology to help people with speech disorders, which is definitely something that’s gaining momentum at the moment – it’ll be interesting to see how this can be further improved in years to come.

An interview with Michael Bauer

In this series, we look at heroes of Gaelic, Irish and Manx language technology. For our second interview, we hear from someone who is perhaps as important to the Gaelic world in the 21st century as the famous lexicographer Edward Dwelly: Michael Bauer. Michael is best known for the work he did with Will Robertson on Am Faclair Beag, the online Gaelic dictionary that he began over 20 years ago, when he was still a student at the University of Edinburgh. It would be hard to overstate how useful and important this dictionary is. But he has been involved in a wide variety of other projects connected to Gaelic language technology since then. For instance, he has been instrumental in the recent development of a Gaelic speech synthesiser and handwriting recogniser. He has also produced a number of excellent Gaelic-related books, such as Blas na Gàidhlig, a superb, linguistically informed guide to Gaelic pronunciation. He is also in high demand as a translator, especially in the government and commercial sectors. Many thanks to Michael for taking the time to do this interview with us.

(NB: We’re presenting some of these interviews in a Gaelic or Irish only format. If required, they can be translated to English using Google Translate.)  

Interview with Michael Bauer

“Where are you from, and how did you first get involved in the Gaelic world?”

I’m from Germany, from the south of the country. It was a coincidence that brought me here – I was at university at LMU around 1997 and I met someone who was living near Inverness. That was on the internet.

Mìcheal Bauer (Akerbeltz)

I came here on a long holiday at that point and moved the following year, after the University of Edinburgh offered me a place. I was doing linguistics and phonology at LMU at the time, and it was a natural thing for me to sign up for Gaelic as well as linguistics. That’s how it happened.

“What led you to work with language technology? How did you get started in this field?”

To tell the truth, another coincidence. I wasn’t all that fond of – or good at – technology when I was young. Even today I couldn’t program a dot that runs across a screen, and my father had to threaten me to get me to write the big essay that a student does in the final year of secondary school on the PC rather than on a typewriter. Little by little, I became familiar with the internet and things like that. I was in my first year at the University of Edinburgh when a friend drew my attention to a project that was running at the time called Google in Your Language. Google scrapped that project a few years ago, but for a number of years you could sign up as a volunteer translator and put your language on the resources they had opened up for translation, such as their search interface. I was captivated by that notion: that it was easier – to an extent – to make a corner for small languages in the digital world, because bits and bytes were cheaper than road signs or printed books. And the fascination has not let go of me since.

“Among the technology projects you have been involved in, which was the most important or the most enjoyable for you?”

May I name two of them? [ed. Go ahead!] The first is the predictive texting tools I was involved in. I had wanted to do that for years, ever since I first saw how fast writing on mobile devices was with a tool like that, compared with writing things letter by letter. But I had no programming ability whatsoever, as I said earlier, and after what happened to the Irish, I didn’t want to make the same mistake again. What happened in Ireland was that Foras na Gaeilge set up the Téacs project, a predictive texting app for Irish. It worked well enough for years, but they weren’t updating it, it only worked on a handful of handsets, and it eventually died. I was looking for a large project involving many languages and a team of developers who would keep it going. But such a thing was not at all easy to find. Eventually, though, I came across Adaptxt, and with help from Kevin Scannell, the digital champion of the small languages, I managed to gather the data that was needed, and Adaptxt added Scottish Gaelic, Manx and Irish to their languages. We had to move to another tool years later, Swiftkey, and Gaelic has appeared in one or two other tools since then. But I was as happy as a priest with a load of books on his back when Adaptxt came out. Other tools, like Firefox, had appeared in Gaelic before that, but it was – and is – hard to draw people away from all-English computers. Most people use them just as they came from the shop, and usually that means English, English, English. But so many people were willing to put Adaptxt on their phones and tablets that I was greatly surprised – and as pleased as could be.

The other one is a sub-project of Am Faclair Beag, the maps feature. We’re all familiar with those debates about words “that nobody would say on this island or that”. I was never willing to take part in them; although I’m no mathematician, I understand what a representative sample is, and no single person, however well they know a language, is a representative sample. I was struck by the maps that Rob Ó Maolalaigh gave us in his course on the dialects of Gaelic, showing the different areas that used, say, siobhag rather than buaic, and there was a wicked voice in my head telling me that something like that would be neat in Am Faclair Beag. And because of that, quite early in the dictionary’s life, we added a feature that would hold data about the places that words were associated with. There is nothing better than working on an old recording or text from an old man or woman who died decades ago and adding data to the maps that would tell you that, at one time, ponach was the word people had for a boy in the town of Inverness. It’s almost like a little séance, speaking to the ages gone by. And it finally sheds a little light on some of the words in dictionaries like Dwelly that previously left you scratching your head as to where this or that strange word came from.

Map for ‘mand’ (Am Faclair Beag)

“What difficulties are involved in developing technology for a minority language like Gaelic?”

There are many things that make it hard, but at the end of the day, after being involved in technology initiatives of this kind for twenty years, I’d say that a means of dissemination is the biggest thing we lack. I’ll tell you why. Take the accents. It’s not hard to set up a PC or Mac so that they give you the accents in every program, without resorting to archaic and convoluted methods like ‘Alt 0224’ for ‘à’. But if you don’t have confidence in fiddling with your computer, it usually scares the life out of you if someone suggests you go into the settings. On the other hand, people who ought to know about such things, say tech support people, or the people who deal with computing administration in schools, are just as ignorant about it themselves. How many letters I have written to councils about things like the “UK Extended keyboard layout” on school computers, and you’d think I had asked them to program the space shuttle… What we need is a body that goes round the Gaelic communities – and the offices of the people who make decisions affecting the digital world of Gaelic – that supports them with the Gaelic technology that exists today, from keyboard layouts to Firefox in Gaelic, and that spreads information about it. But apparently that isn’t sexy enough for the established bodies… and because of that, we still have Gaelic units with computers on which you can’t type à without copy-paste or something silly like that.
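
For readers wondering where ‘Alt 0224’ comes from: on Windows, the digits typed after a leading zero are the character’s byte value in the system’s Windows code page, and à is byte 224 in Windows-1252. The Python sketch below derives the code for each grave-accented vowel used in Gaelic. It assumes a Western European (Windows-1252) setup, and the helper name `alt_code` is hypothetical.

```python
def alt_code(char):
    """Return the Windows 'Alt 0nnn' sequence for a single character.

    Assumes the Windows-1252 code page (Western European systems).
    """
    byte_value = char.encode("cp1252")[0]
    # Zero-pad to four digits, matching the 'Alt 0224' style of notation.
    return f"Alt {byte_value:04d}"


for vowel in "àèìòù":
    print(vowel, alt_code(vowel))
```

Memorising five such codes (and five more for the capitals) is exactly the kind of workaround that a properly configured keyboard layout makes unnecessary.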

“On the other hand, are there any opportunities if you work with a minority language? What are they?”

Yes and no. At times it’s like sitting on a heap of sand. There’s no firm foundation beneath you at all, and what stood yesterday will be gone tomorrow. Take Google in Your Language: it was put in place without the community having a hand in it, and it was snatched away without the community having a hand in it. Or take things like Adaptxt and Swiftkey – just when we had taken a step forward, Amazon and Google are putting a box in our homes that speaks only English. And if you speak to families in Wales who speak nothing but Welsh at home, those devices are not having a good effect. There are many opportunities, but we need a somewhat more level footing. We need to stand together with the other small languages – and I include ones like Estonian and Catalan in that – and fight for a legal basis at European Union level that would compel large companies to give languages like Gaelic and Luxembourgish the chance to keep pace with new technologies.

“In your own opinion, what is the biggest challenge for Gaelic technology in the next five years?”

The accursed Sasamach [ed. a neat word for ‘Brexit’]. No small amount of money has come from the European Union’s various purses to support Gaelic technology projects over the years, from academic funding to regional funding, say. I’d bet my head that London won’t keep up the same support for us.

“What is your prediction for language technology in the year 2050? What would you like to see for Gaelic technology by then?”

That law I proposed above! But if it was a piece of technology itself you were asking about, a tool that turns speech into writing well would be good, given how poor people generally are at writing Gaelic. But on the other hand, if we put a Gaelic school in every village in the Islands, as we had before, that would be just as good, wouldn’t it?

Sgoil Staoineabrig: the last school in Uist where all the children were learning through the medium of Gaelic. It was closed in 2010 (© Ailean Dòmhnallach 2010)

An interview with Professor Kevin Scannell

In this series, we look at heroes of language technology who have made significant progress for the Gaelic languages. For the first interview, we couldn’t do better than Professor Kevin Scannell of St. Louis University (USA). Kevin has produced a vast number of resources for the three Gaelic languages (Gaelic, Irish and Manx), and has recently been awarded a Fulbright Award (2019) to develop tools for Irish Gaelic that utilise neural networks and deep learning techniques. Many thanks to Kevin for agreeing to do this interview with us.

We’re presenting some of these interviews in a Gaelic or Irish only format. If required, they can be translated to English using Google Translate.  

Interview with Professor Kevin Scannell

Professor Kevin Scannell

Kevin Scannell is a Professor of Mathematics and Computer Science at Saint Louis University, Missouri. He works with groups around the world to develop computing resources that help them use their native language online. He has a particular interest in Irish and the other Celtic languages; he has developed an Irish grammar checker, spell checker and thesaurus, as well as dictionaries and a translation engine for Scottish Gaelic, Manx and Irish. He takes part in a project that provides Irish versions of several well-known computing products: Mozilla Firefox, LibreOffice, Gmail and Twitter, for example. In 2011, he founded the Indigenous Tweets site to promote minority and indigenous languages on social media.

“Where are you from, and where did you first learn Irish?”

I’m originally from Boston in America. I started learning Irish in America in the 1990s, on my own, from books and dictionaries. I knew a good deal about Irish literature and Irish grammar, but I wasn’t comfortable with the spoken language for many years. I started coming to Ireland around 2006 and my speaking ability gradually improved.

“What led you to work with language technology? How did you get started in this field?”

Basically, I started on this work because of my own needs as a learner. In the 1990s, I took part in the Gaelic-L and Gaeilge-A email lists and was disappointed that no spell checker was available. As it happened, I was collecting a lexical database as part of my learning process. It didn’t take much work to create a spell checker from that – this was GaelSpell – and the first version was published 20 years ago. I had no expertise in this field at the time – I was a mathematician, but I had reasonably good computing skills. And it was clear to me at the time that it would be worth building a corpus to help me build the lexical database more quickly, and to be sure I had the most common words. I collected perhaps a million words of Irish from the Internet in the 90s, and I continued with that work (in other languages too), and now there are more than 200 million words in the Irish corpus on my computer!
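
The two ideas in this answer, a spell checker built from a word list and a corpus used to find common words the list still lacks, can be sketched in a few lines of Python. This is only an illustration: the word list and sample sentences are toy data, the function names are hypothetical, and it makes no claim about how GaelSpell itself is implemented.

```python
import re
from collections import Counter

# Toy word list standing in for a lexical database (illustrative only).
WORD_LIST = {"tá", "an", "lá", "go", "maith"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[^\W\d_]+", text.lower())


def misspelled(text, word_list):
    """Return tokens that are not in the word list, in order of appearance."""
    return [tok for tok in tokenize(text) if tok not in word_list]


def missing_by_frequency(corpus, word_list):
    """Rank corpus words absent from the word list, most frequent first."""
    counts = Counter(tok for tok in tokenize(corpus) if tok not in word_list)
    return counts.most_common()


# Toy corpus: the most frequent unknown word is the best candidate to add next.
corpus = "Tá an lá go maith. Tá an oíche go deas, agus tá an oíche fada."
print(misspelled("Tá an lá go maiht", WORD_LIST))
print(missing_by_frequency(corpus, WORD_LIST))
```

A larger corpus makes the frequency ranking more trustworthy, which is one reason growing the corpus and growing the word list go hand in hand.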

“Among the technology projects you have been involved in, which was the most important or the most enjoyable for you?”

Believe it or not, I’d say GaelSpell is the most important project (going by the number of people using it), although it isn’t very interesting from a technology point of view. I also made a grammar checker called An Gramadóir, and plenty of school pupils and university students use it to check essays. But the most important one, in my view, is “An Caighdeánaitheoir”, a project that isn’t widely known at all. It’s an extremely simple thing – it standardises the spelling and grammar of Irish texts that were written before the Caighdeán Oifigiúil (the Official Standard). The Government of Ireland published many Irish books in the 1930s, but the old spelling is used in them (and the old typeface as well). As a result, it is much harder to make use of them in NLP, for example, and Irish lexicographers have problems searching these texts.

Variants of the word “Gaeilge” in the corpus

The lexicography project focloir.ie (An Gúm) and Foclóir na Nua-Ghaeilge (Royal Irish Academy) use An Caighdeánaitheoir. And it can be used to prepare old texts for today’s readers, people who aren’t used to the old spelling. I have done that with several old books.

“What are some of the problems with developing technology for a minority language like Irish?”

I’d like to attract more young people to work in the field. There are research groups in various places in Ireland (DCU, Trinity and NUIG in particular) and they have master’s/PhD students now and again, but that isn’t enough to put the work on a secure long-term footing. The Government of Ireland should make a major investment in those groups (and in others besides!): more students, lecturers, professors, etc. The same people are now working on a technology plan for Irish, under the auspices of Roinn na Gaeltachta (the Department of the Gaeltacht), and God willing more money will come forward as a result of the plan.

The other thing that’s needed is better cooperation with the big technology companies. We have technological expertise and data that Google, for example, doesn’t have, and it would be very easy to greatly improve Google products such as Google Translate. And we need the platform they have! For example, I made a Scottish Gaelic > Irish and Manx > Irish translator some years ago, but people rarely use it; it’s much easier to translate things directly in Chrome.

“On the other hand, are there any opportunities if you work with a minority language? What are they?”

Yes! The Irish-language community is very passionate about the language, and for the most part they are ready to fight for the language or for language rights. In the cases where we were able to cooperate with the technology companies, for example the translations we did of GMail or WhatsApp, it was voluntary work. But even so, it was very easy to recruit a large group of people to do the work; they immediately understood the importance of these products being available in Irish.

“In your opinion, what is the biggest challenge for Gaelic technology over the next five years?”

Not being left out of the race to create large neural models. I’m working on these matters at present, and I see the power and the possibilities there. But the research going on at Google, Facebook, NVIDIA, etc. is focused one hundred per cent on English. And that means the research is “overfit” to languages without much morphology, for example, and (worse) to languages that have hundreds of billions of words available for training. We must do our own research: what are the best techniques when you don’t have much training data? How can we make use of the other resources we have: first-class dictionaries, linguistic expertise, etc.?

“What vision do you have for language technology in the year 2050? What would you like to see for Gaelic technology before then?”

Voice interfaces on every device/computer/electronic product in almost any language. That is the new digital divide there will be; voice technology will be available in certain languages, and it won’t be available in others. The danger is that people won’t be willing to stick with the languages in the second group. My opinion is that Irish is right on the border at present. With a long-term vision, more investment from the Government, and cooperation with tech companies, we’ll be able to climb the mountain. But it isn’t hard to imagine the opposite either.

Links

  • Cadhan: all of Kevin’s Irish-language resources
  • Intergaelic: machine translation between Scottish Gaelic and Irish, and between Manx and Irish
  • A lecture Kevin gave on “service-oriented architecture” for Irish-language technologies
