Our Portfolio

Showcasing our ongoing projects in NLP resource creation, linguistic analysis, and technology development for low-resource languages.

🗣️
Corpus Creation

Bhojpuri Speech Corpora

Bootstrapped extensive NLP resources and speech corpora for the Bhojpuri language to support the Sampark Machine Translation System, funded by the Government of India.

View Project
🤖
Machine Translation

Maithili & Magahi MT

Conducted typological analysis and POS labelling to generate critical language data for experiments on low-resource Indian languages.

View Project
📝
Linguistic Tools

Advanced POS Tagger

Text chunking, data generation, and morpho-syntactic feature annotation designed specifically for scheduled and low-resource Indian languages.

View Project