CopyColumns script
Description
copycolumns is a Python script that copies selected columns from an Excel spreadsheet to a new spreadsheet. It is intended for making a subset of the data available in a new spreadsheet.
Usage
The script takes 3 positional arguments:
- The name of a text file containing the header names of the columns to copy.
- The name of a spreadsheet containing the source data. A header row matching the column names in the specified header names file should be present within the first 10 rows of the spreadsheet.
- The name of the output file. This file will be created or overwritten.
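The header search described for the source spreadsheet could be sketched as follows. This is a minimal illustration, not the script's actual code: the function name find_header_row is hypothetical, rows are plain Python lists (they might come from, for example, openpyxl's iter_rows(values_only=True), which is an assumption about the library used), and "matching" is assumed to mean the row within the first 10 that contains the most requested names.

```python
def find_header_row(rows, wanted, search_limit=10):
    """Return (row_index, {name: column_index}) for the best-matching row.

    Only the first `search_limit` rows are inspected. Returns None when no
    row in that range contains any of the wanted names.
    """
    wanted_set = set(wanted)
    best = None
    for row_index, row in enumerate(rows[:search_limit]):
        positions = {}
        for col_index, cell in enumerate(row):
            # Only text cells can be header names; compare without padding.
            if isinstance(cell, str) and cell.strip() in wanted_set:
                positions.setdefault(cell.strip(), col_index)
        # Assumption: prefer the row that matches the most requested names.
        if positions and (best is None or len(positions) > len(best[1])):
            best = (row_index, positions)
    return best


rows = [
    ["Report", None, None],
    [None, None, None],
    ["Speaker", "Beg", "End"],
    ["A", 0.1, 0.5],
]
header = find_header_row(rows, ["Speaker", "End"])
```

Any requested name that is absent from the returned mapping is one that was not found in the header.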
The following switches are available to modify the script’s behavior:
-p include a minimal set of predefined PlanArt column names
-d include a minimal set of predefined DoubleTalk column names
-i ignore any column names that are not found in the source data’s header
-t place the output data within an Excel table
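As an illustration of how the arguments and switches above fit together, here is a minimal argparse sketch. It is not taken from the script itself: the function name build_parser and the destination names (planart, doubletalk, ignore_missing, as_table) are hypothetical, and using argparse at all is an assumption.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented arguments (names are illustrative)."""
    parser = argparse.ArgumentParser(prog="copycolumns")
    # The three positional arguments, in the documented order.
    parser.add_argument("column_name_file", help="text file listing the column names to copy")
    parser.add_argument("source_file", help="spreadsheet containing the source data")
    parser.add_argument("destination_file", help="output spreadsheet, created or overwritten")
    # The four optional switches; each is a simple on/off flag.
    parser.add_argument("-p", dest="planart", action="store_true",
                        help="include the predefined PlanArt column names")
    parser.add_argument("-d", dest="doubletalk", action="store_true",
                        help="include the predefined DoubleTalk column names")
    parser.add_argument("-i", dest="ignore_missing", action="store_true",
                        help="ignore column names not found in the source header")
    parser.add_argument("-t", dest="as_table", action="store_true",
                        help="place the output data within an Excel table")
    return parser


# Short switches can be combined, so "-dt" sets both -d and -t.
args = build_parser().parse_args(["-dt", "columns.txt", "source.xlsx", "out.xlsx"])
```

With a parser like this, the switches are given after the script name, since anything placed before the script name is read by the Python interpreter instead.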
Column name file format
The column name file should be a simple text file with one column name per line.
Leading and trailing spaces and blank lines will be ignored.
Any line whose first character (excluding leading spaces) is a hash (#) will be ignored. This can be used for comments or to prevent a column name from being used.
Columns in the new spreadsheet are created in the order in which they are defined in the column name text file. If the -p or -d switch is specified, the corresponding set of predefined column names is prepended to the list of names.
If you are asked to select a text encoding when saving the text file, select UTF-8 if available.
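The file format rules above can be sketched in a few lines of Python. This is only an illustration of the rules as described, not the script's own code; the function names read_column_names and load_column_names are hypothetical.

```python
def read_column_names(lines):
    """Return the column names from an iterable of text lines, in file order."""
    names = []
    for line in lines:
        name = line.strip()  # drop leading and trailing spaces and the newline
        if not name:
            continue  # blank line
        if name.startswith("#"):
            continue  # comment, or a column name that has been disabled
        names.append(name)
    return names


def load_column_names(path):
    """Read the column name file as UTF-8 (a leading byte-order mark is tolerated)."""
    with open(path, encoding="utf-8-sig") as handle:
        return read_column_names(handle)


sample = ["  Speaker  \n", "\n", "# Beg\n", "   # also a comment\n", "End\n"]
names = read_column_names(sample)  # ["Speaker", "End"]
```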
Predefined column names
DoubleTalk_headers
Unnamed: 0, RecNo, File Number, Speaker, Beg, A_Offset_Time, B_Onset_Time, End
PlanArt_headers
Filename, Speaker, Beg, End, ExperimentID, Use
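A minimal sketch of how the predefined names might be prepended to the names read from the column name file. The constant and function names are hypothetical. Two behaviours here are assumptions that the description does not specify: PlanArt names are placed before DoubleTalk names when both switches are given, and repeated names are dropped after their first occurrence.

```python
DOUBLETALK_HEADERS = [
    "Unnamed: 0", "RecNo", "File Number", "Speaker",
    "Beg", "A_Offset_Time", "B_Onset_Time", "End",
]
PLANART_HEADERS = ["Filename", "Speaker", "Beg", "End", "ExperimentID", "Use"]


def build_column_list(user_names, planart=False, doubletalk=False):
    """Prepend the selected predefined names to the user-supplied names."""
    names = []
    if planart:
        names.extend(PLANART_HEADERS)
    if doubletalk:
        names.extend(DOUBLETALK_HEADERS)
    names.extend(user_names)
    # Assumption: keep only the first occurrence of each name, preserving order.
    return list(dict.fromkeys(names))


columns = build_column_list(["Use", "Notes"], planart=True)
```

Here "Use" is already in the PlanArt set, so only "Notes" is appended after the predefined names.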
Typical usage examples:
Create a copy of the columns identified in the file column_name_file.txt. The source data is in source_data_file.xlsx, and the columns are placed in a new spreadsheet named destination_file.xlsx:
python copycolumns.py column_name_file.txt source_data_file.xlsx destination_file.xlsx
Create a copy of the columns identified in the file column_name_file.txt, include the predefined columns for DoubleTalk data, and place the data in an Excel table:
python copycolumns.py -dt column_name_file.txt source_data_file.xlsx destination_file.xlsx
Availability
The Python source script and a packaged Mac version of the script are available in the @ToDo: add link to script once uploaded.