John Ortega

indigenous

multilingual

low-resource

quechua

adaptation

word alignment

unseen

language adaptation

galician

web crawl

perplexity

giza

fastalign

deep learning

low-resource languages

4 presentations

5 views

SHORT BIO

John Evan Ortega received his PhD from the University of Alicante in Alicante, Spain, in March 2021. He has served as an organizer or committee member for conferences and workshops including PEMDT at AMTA 2020, AACL-IJCNLP 2020, and AmericasNLP at NAACL-HLT 2021. He has held several roles in computer science: instructor and researcher at New York University, lecturer at Columbia University and Rutgers University, and researcher at the University of Santiago de Compostela (CITIUS). He has published several articles on South American low-resource languages, including Quechua and Ashaninka. Additionally, Dr. Ortega has extensive experience creating algorithms for post-editing in machine translation using machine learning and natural language processing. His nearly twenty years of software experience in private industry, at companies such as Nuance Communications, AIG, iHeartRadio, and Cigna, are paired with a deep appreciation for academic approaches. His latest works reflect his passion for solving machine translation problems for low-resource languages.

Presentations

Meeting the Needs of Low-Resource Languages: The Value of Automatic Alignments via Pretrained Models

Abteen Ebrahimi and 7 other authors

Introducing QuBERT: A Large Monolingual Corpus and BERT Model for Southern Quechua

John Ortega

AmericasNLI: Evaluating Zero-shot Natural Language Understanding of Pretrained Multilingual Models in Truly Low-resource Languages

Abteen Ebrahimi and 16 other authors

Revisiting CCNet for Quality Measurements in Galician

John Ortega and 3 other authors