ELAN Scripts
Extending the functions of ELAN
ELAN is a media annotation application produced by the Max Planck Institute for Psycholinguistics. It’s widely used for annotating audio and video in language documentation.
This page includes links to scripts and tutorials I’ve created to add functions that weren’t present as of ELAN 5.9.
Resources are by me, not by the creators of ELAN. Please report bugs via this GitHub repository.
Note that I don’t currently maintain these scripts, but to my knowledge the functionality they add still isn’t available as of ELAN 6.7. ELAN’s structure has changed very little between versions 5 and 6, and the scripts still work with more recent releases.
Import/Export between ELAN and FLEx
With Sophie Pierson, Sunny Ananthanarayan, and Claire Bowern, I created Flibl, a tool that greatly improves the import/export process between ELAN and FLEx, as well as increasing the dimensionality of lines in FLEx.
Download Flibl from our OSF repository here.
The OSF repository includes documentation, but we also wrote an overview of what Flibl does, how it works, and why it’s better than the built-in import/export. You can read it here.
Multiple Find and Replace (Orthography Conversion)
ELAN has a find-and-replace function that works with regular expressions, but you need to carry out each find-and-replace individually. This makes it hard to use the function to transliterate between orthographies with very different character sets.
I wrote a group of R scripts which carry out find and replace inside ELAN files based on an external CSV file with find-replace pairs. All of the scripts support comments: you can exclude text that is inside certain delimiters from the find-and-replace. They also support multiple orthographies: you can change text on one tier according to the equivalences in one CSV file, and text on another tier according to equivalences in another CSV.
EAF_ChangeOrthography_ByTierName_File.R
User selects a set of tiers (they can be of any type) in a single ELAN file. Script finds and replaces characters in the tiers according to equivalences in an external CSV. All tiers must use the same CSV.
EAF_ChangeOrthography_ByTierType_File.R
User selects up to two tier types in a single ELAN file. Script finds and replaces characters in all tiers of each type according to equivalences in an external CSV. Types can use different CSVs.
EAF_ChangeOrthography_ByTierType_Directory.zip
Same as script 2, but applies to a directory of files, not just one file. To run, download and unzip in the directory where you want to work. Modify the user inputs in the R script as needed. Then, run by opening a command-line window and running the shell script (in bash: using the command ‘sh batch_R.txt’).
Sample CSV file for orthography replacement scripts
This shows the required format for the CSV file used to define replacements in the above scripts.
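The CSV-driven replacement described in this section can be sketched roughly as follows. This is an illustrative Python sketch, not the R code in the scripts above. It assumes a two-column CSV with a header row (find, replace) and square brackets as comment delimiters; both are assumptions, and the sample CSV file is the authority on the real format. The function names load_pairs, convert, and convert_tiers are hypothetical. The element and attribute names (TIER, TIER_ID, ANNOTATION_VALUE) come from the EAF file format.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET


def load_pairs(csv_text):
    """Read (find, replace) pairs from CSV text, skipping an assumed header row."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    return [(r[0], r[1]) for r in rows[1:] if len(r) >= 2 and r[0]]


def convert(text, pairs, open_delim="[", close_delim="]"):
    """Apply every pair in a single pass, leaving delimited comments untouched."""
    if not pairs:
        return text
    table = dict(pairs)
    # Try longer keys first so that digraphs win over single characters.
    keys = sorted(table, key=len, reverse=True)
    finder = re.compile("|".join(re.escape(k) for k in keys))
    comment = re.compile(re.escape(open_delim) + ".*?" + re.escape(close_delim))

    def replace(chunk):
        return finder.sub(lambda m: table[m.group(0)], chunk)

    out, pos = [], 0
    for m in comment.finditer(text):
        out.append(replace(text[pos:m.start()]))  # text before the comment
        out.append(m.group(0))                    # the comment, unchanged
        pos = m.end()
    out.append(replace(text[pos:]))
    return "".join(out)


def convert_tiers(root, tier_ids, pairs):
    """Rewrite annotation values on the selected tiers of a parsed EAF tree."""
    for tier in root.iter("TIER"):
        if tier.get("TIER_ID") in tier_ids:
            for value in tier.iter("ANNOTATION_VALUE"):
                if value.text:
                    value.text = convert(value.text, pairs)
```

The single-pass design means the output of one replacement is never replaced again by a later pair, which matters when the two orthographies share characters. The R scripts may handle this differently.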
Copying to Parent Tiers
ELAN has limited functionality for copying annotation values onto a parent (time-aligned) tier. If you copy onto a parent tier with annotations on its child (dependent) tiers, all of the annotations on the child tier are removed.
I wrote a group of R scripts which allow you to copy onto parent tiers without affecting their child annotations.
Copying Between Two Parent Tiers
User selects two parent tiers in a single ELAN file. Script finds all annotations on the source tier which have the same timepoints as annotations on the target tier. Then it copies the source annotation values to the target tier.
To use this script, you must already have blank annotations on your target tier with exactly the same timepoints as the annotations on the source tier.
EAF_CopyParentToParent_Directory.zip
Same as above, but works on all ELAN files in a directory. To run, download and unzip in the directory where you want to work. Modify the user inputs in the R script as needed. Then, run by opening a command-line window and running the shell script (in bash: using the command ‘sh batch_R.txt’).
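The timepoint matching used for copying between two parent tiers can be sketched like this. It is a hedged Python illustration, not the R script itself. It assumes both tiers are time-aligned and compares resolved TIME_VALUEs rather than time slot IDs, since two tiers can point to different slots with the same time. The function name copy_matching is hypothetical. Only the text of ANNOTATION_VALUE is changed, so the annotation IDs that child annotations refer to stay intact.

```python
import xml.etree.ElementTree as ET


def copy_matching(root, source_id, target_id):
    """Copy values from source to target annotations with identical start/end times."""
    times = {
        ts.get("TIME_SLOT_ID"): ts.get("TIME_VALUE")
        for ts in root.iter("TIME_SLOT")
    }

    def span(ann):
        # Resolve the two time slot references to actual time values.
        return (
            times.get(ann.get("TIME_SLOT_REF1")),
            times.get(ann.get("TIME_SLOT_REF2")),
        )

    def aligned(tier_id):
        for tier in root.iter("TIER"):
            if tier.get("TIER_ID") == tier_id:
                return list(tier.iter("ALIGNABLE_ANNOTATION"))
        raise KeyError(tier_id)

    source = {
        span(a): a.findtext("ANNOTATION_VALUE", default="")
        for a in aligned(source_id)
    }
    copied = 0
    for ann in aligned(target_id):
        key = span(ann)
        if None in key or key not in source:
            continue  # unaligned slot or no source annotation with these times
        value = ann.find("ANNOTATION_VALUE")
        if value is None:
            value = ET.SubElement(ann, "ANNOTATION_VALUE")
        value.text = source[key]
        copied += 1
    return copied
```

After calling copy_matching on a tree parsed with ET.parse, the tree would be written back out with its write method. Working on a copy of the EAF file is advisable.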
Copying From Child to Parent Tiers
EAF_CopyChildToParent_ByTier.R
User selects a parent tier and a child tier in a single ELAN file. Script copies all annotations on the child tier onto the parent tier. Child tier must have the stereotype ‘Symbolic Association.’
EAF_CopyChildToParent_ByType.R
User selects a parent tier type and a child tier type in a single ELAN file. In each parent-child pair of tiers, script copies all annotations from the child tier onto the parent tier. Child tier type must have the stereotype ‘Symbolic Association.’ All child tiers must be dependent on parent tiers of the selected parent type.
EAF_CopyChildToParent_ByType_Directory.zip
Same as above, but works on all ELAN files in a directory. To run, download and unzip in the directory where you want to work. Modify the user inputs in the R script as needed. Then, run by opening a command-line window and running the shell script (in bash: using the command ‘sh batch_R.txt’).
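Copying from a ‘Symbolic Association’ child tier works because each child annotation in an EAF file is a REF_ANNOTATION whose ANNOTATION_REF attribute names its parent annotation. A minimal Python sketch of that lookup follows. It is an illustration under the assumption that the parent tier is time-aligned, not a copy of the R scripts, and the function name copy_child_to_parent is hypothetical.

```python
import xml.etree.ElementTree as ET


def copy_child_to_parent(root, parent_id, child_id):
    """Copy each child annotation's value onto the parent annotation it refers to."""
    tiers = {t.get("TIER_ID"): t for t in root.iter("TIER")}
    # Map parent annotation IDs to their ANNOTATION_VALUE elements.
    parent_values = {
        a.get("ANNOTATION_ID"): a.find("ANNOTATION_VALUE")
        for a in tiers[parent_id].iter("ALIGNABLE_ANNOTATION")
    }
    copied = 0
    for ref in tiers[child_id].iter("REF_ANNOTATION"):
        target = parent_values.get(ref.get("ANNOTATION_REF"))
        if target is not None:
            target.text = ref.findtext("ANNOTATION_VALUE", default="")
            copied += 1
    return copied
```

Parent annotations without a child annotation are left as they are, and the child tier itself is not modified.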
EAF_ConcatenateChildToParent_ByTier.R
User selects a parent tier and a child tier of the stereotype ‘Symbolic Subdivision’ (multiple child annotations fully included in the timespan of the parent annotation) in a single ELAN file. Script concatenates all annotations on the child tier and copies them to the parent tier. Useful if you have a child tier tokenized to words/morphs that is the child of a blank parent tier (like in some EAFs imported from FieldWorks Language Explorer, a.k.a. FLEx).
batch_EAF_ConcatenateChildToParent.zip
Version of the above script for batch processing. Includes a batch-friendly version of the EAF_ConcatenateChildToParent_ByTier script plus a shell script for running it over a directory.
Version of the ‘Concatenate Child to Parent’ script that works with two different child tiers: one that contains punctuation and one that contains text. User selects two child tiers (one of type Symbolic Subdivision + its child of type Symbolic Association) and their parent tier. Script concatenates the punctuation and text from the child tiers together, then pastes the concatenated strings onto the parent tier.
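The basic single-child concatenation can be sketched as follows. On a ‘Symbolic Subdivision’ tier, each child REF_ANNOTATION points to its parent through ANNOTATION_REF, and every annotation but the first in a group points to its predecessor through PREVIOUS_ANNOTATION. This Python sketch is an illustration rather than the R script. Joining with a single space, the fallback to file order, and the name concatenate_children are all assumptions.

```python
import xml.etree.ElementTree as ET


def concatenate_children(root, parent_id, child_id, sep=" "):
    """Join the subdivision annotations under each parent and store the result on it."""
    tiers = {t.get("TIER_ID"): t for t in root.iter("TIER")}
    groups = {}
    for ref in tiers[child_id].iter("REF_ANNOTATION"):
        groups.setdefault(ref.get("ANNOTATION_REF"), []).append(ref)

    updated = 0
    for ann in tiers[parent_id].iter("ALIGNABLE_ANNOTATION"):
        refs = groups.get(ann.get("ANNOTATION_ID"))
        if not refs:
            continue
        # Follow the PREVIOUS_ANNOTATION chain, starting from the annotation
        # that has no predecessor.
        by_prev = {r.get("PREVIOUS_ANNOTATION"): r for r in refs}
        ordered, current = [], by_prev.get(None)
        while current is not None and len(ordered) < len(refs):
            ordered.append(current)
            current = by_prev.get(current.get("ANNOTATION_ID"))
        if len(ordered) != len(refs):
            ordered = refs  # chain incomplete: fall back to file order (assumption)
        words = [r.findtext("ANNOTATION_VALUE", default="") for r in ordered]
        value = ann.find("ANNOTATION_VALUE")
        if value is None:
            value = ET.SubElement(ann, "ANNOTATION_VALUE")
        value.text = sep.join(w for w in words if w)
        updated += 1
    return updated
```

A variant that also handles a separate punctuation tier would need its own rule for where punctuation attaches, which this sketch does not attempt.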
Creating a subtitled video from an ELAN file
ELAN exports subtitles in only one format – .srt files. Subtitles in .srt format don’t work in all video players.
I created a guide for exporting ELAN subtitles in an alternative format and ‘burning’ them into the video so that they work in all players. See the guide here.
Alternatively, if you already have the ELAN file and video clip that you want to export, check out this script:
This is a combination of a shell script and R script which provide a fully automated way to generate burned subtitles from ELAN files. User selects parent and child tiers (max one child per parent) in an ELAN file. Script converts the parent and child tiers into .ass format subtitles and burns them into the video clip associated with the ELAN file.
To run, download and unzip the ZIP file in the folder with your input files, then type ‘sh eaf_to_burned-subtitles.txt’ in the (Unix/Mac) command line. The script requires R and ffmpeg to be installed on your computer.
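For readers curious about what the conversion involves, the two core steps (writing annotation times as .ass events and handing the file to ffmpeg) can be sketched as below. This is a Python illustration, not the shell and R code in the ZIP file. The function names and the ‘Default’ style name are assumptions, and a complete .ass file also needs its header and style sections before the events. The ass video filter is part of ffmpeg when it is built with libass.

```python
def ass_time(ms):
    """Format milliseconds as an ASS timestamp (H:MM:SS.cc, centisecond precision)."""
    cs = ms // 10
    hours, rem = divmod(cs, 360000)
    minutes, rem = divmod(rem, 6000)
    seconds, centis = divmod(rem, 100)
    return f"{hours}:{minutes:02d}:{seconds:02d}.{centis:02d}"


def dialogue_line(start_ms, end_ms, text, style="Default"):
    """Build one line for the [Events] section of an .ass file."""
    text = text.replace("\n", "\\N")  # ASS marks a hard line break with \N
    return f"Dialogue: 0,{ass_time(start_ms)},{ass_time(end_ms)},{style},,0,0,0,,{text}"


def burn_command(video_in, ass_file, video_out):
    """Return an ffmpeg argument list that renders the subtitles into the video."""
    # Paths containing characters such as ':' or ',' need extra escaping
    # inside the filter argument; plain file names work as-is.
    return ["ffmpeg", "-i", video_in, "-vf", f"ass={ass_file}", "-c:a", "copy", video_out]
```

The argument list from burn_command could be passed to subprocess.run. Because the subtitles are drawn onto the frames, the video stream is re-encoded, while the audio is copied unchanged.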