linguist.page@gmail.com

Extract a specific column from a TSV file

cut -f 2 data.tsv
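
As a minimal sketch of how this behaves, the snippet below builds a small sample file and extracts one column and then two. The file contents, the temporary directory, and the variable names `col2` and `cols23` are assumptions made for illustration only; tab is the default delimiter for `cut`, so no `-d` option is needed.

```shell
# Work in a throwaway directory so no real files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Hypothetical three-column TSV: id, source text, target text.
printf '1\tmarhaban\thello\n2\tshukran\tthanks\n' > data.tsv

# Column 2 only (tab is the default delimiter for cut).
col2=$(cut -f 2 data.tsv)

# Columns 2 and 3 together; the output stays tab-separated.
cols23=$(cut -f 2,3 data.tsv)

printf '%s\n' "$col2"
printf '%s\n' "$cols23"
```

A comma-separated list such as `-f 2,3` or a range such as `-f 2-3` selects several columns at once.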

Extract a specific column from a CSV file

cut -d ',' -f 1 data.csv
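
One caveat worth illustrating: `cut` splits on every comma and does not understand CSV quoting, so a quoted field that itself contains a comma is cut in two. The sample row and the variable names below are hypothetical.

```shell
# Hypothetical CSV row whose second field contains a quoted comma.
line='1,"Hello, world",en'

# cut splits on every comma, so field 2 stops at the embedded comma.
field2=$(printf '%s\n' "$line" | cut -d ',' -f 2)

# Prints the truncated field, including the opening quote.
printf '%s\n' "$field2"
```

For files known to contain no quoted commas the command above is enough; otherwise a CSV-aware tool is the safer choice.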

Merge two parallel corpus files side by side

paste arabic.txt english.txt > parallel.txt
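
A side-by-side merge is only meaningful when both files have the same number of lines, so a quick check before merging may help. The following is a rough sketch under that assumption; the sample sentences and the variable names are invented for illustration. By default `paste` joins the lines with a tab, which means `cut -f` can split them again.

```shell
# Work in a throwaway directory so no real files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Hypothetical two-line corpora, one sentence per line.
printf 'marhaban\nshukran\n' > arabic.txt
printf 'hello\nthanks\n' > english.txt

# Count the lines on each side (tr strips padding some wc versions add).
src_lines=$(wc -l < arabic.txt | tr -d ' ')
tgt_lines=$(wc -l < english.txt | tr -d ' ')

if [ "$src_lines" -eq "$tgt_lines" ]; then
    # Lines are joined with a tab by default.
    paste arabic.txt english.txt > parallel.txt
    # Recover the English side to confirm the merge is reversible.
    english_back=$(cut -f 2 parallel.txt)
else
    echo "line counts differ: $src_lines vs $tgt_lines" >&2
fi
```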

Merge files with a custom delimiter

paste -d '|' arabic.txt english.txt > parallel.txt
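
A custom delimiter only works if it never appears in the text itself. The sketch below, with hypothetical sample data and variable names, counts lines that already contain `|` before merging and then splits the result again with `cut`.

```shell
# Work in a throwaway directory so no real files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Hypothetical two-line corpora, one sentence per line.
printf 'marhaban\nshukran\n' > arabic.txt
printf 'hello\nthanks\n' > english.txt

# Count lines that already contain the delimiter.
# -F matches '|' literally; "|| true" absorbs grep's non-zero exit on no match.
clashes=$(cat arabic.txt english.txt | grep -c -F '|' || true)

if [ "$clashes" -eq 0 ]; then
    paste -d '|' arabic.txt english.txt > parallel.txt
    # The same delimiter splits the columns apart again.
    english_back=$(cut -d '|' -f 2 parallel.txt)
else
    echo "delimiter already occurs in $clashes line(s)" >&2
fi
```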

Split a large corpus into smaller files by line count

split -l 10000 corpus.txt chunk_
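
To see what the output looks like, here is a small sketch that uses a 25-line stand-in corpus and a chunk size of 10 instead of 10000; the sample data and variable names are assumptions. `split` appends two-letter suffixes to the prefix (`chunk_aa`, `chunk_ab`, ...), and the last chunk holds whatever lines remain.

```shell
# Work in a throwaway directory so no real files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Hypothetical 25-line corpus, split into chunks of at most 10 lines.
seq 1 25 > corpus.txt
split -l 10 corpus.txt chunk_

# Expect chunk_aa, chunk_ab and chunk_ac (10, 10 and 5 lines).
chunk_count=$(ls chunk_* | wc -l | tr -d ' ')
last_chunk_lines=$(wc -l < chunk_ac | tr -d ' ')

# Concatenating the chunks in glob order should reproduce the original.
cat chunk_* > rejoined.txt
if cmp -s corpus.txt rejoined.txt; then
    roundtrip=ok
else
    roundtrip=mismatch
fi

echo "$chunk_count chunks, last has $last_chunk_lines lines, roundtrip: $roundtrip"
```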