OPEN INITIATIVE

AI for the Nepali language.

Open datasets, models, and tools for 17 million speakers. Built in the open under CC BY 4.0 and Apache 2.0. Project Bhasa is Suswo's commitment to low-resource language AI.

View on Hugging Face ↗ Contribute on GitHub ↗

Project Bhasa

            
ROADMAP

Three phases. Long game.

● ACTIVE

Phase 1 — Now

Open Contributor

Open community. Datasets and models published freely on Hugging Face and GitHub under CC BY 4.0 and Apache 2.0.

✓ Project scaffolding
✓ First dataset pipeline
○ Public dataset release
○ First model checkpoint

Phase 2 — Next

Open Core

Free base models with a hosted API. Developers call the API for free at a limited rate and pay for production scale.

FUTURE

Phase 3 — Future

Commercial Products

Nepali speech-to-text, a translation API, a writing assistant, and government and NGO contracts.

FUTURE

            
OPEN DATASETS

Open datasets.

Datasets publishing soon.

Phase 1 is underway. The first dataset will appear here and on Hugging Face when it passes our quality review.

Watch on Hugging Face ↗

            
OPEN MODELS

Models we've trained.

Models publishing soon.

We train on the published datasets. The first checkpoint will be released alongside the first dataset or shortly after it.

            
GET INVOLVED

How to contribute.

Native speakers

Review and validate data — no ML experience needed. If you speak Nepali, you can help.

Contributor guide →

Engineers

Open PRs for dataset pipelines, evaluation scripts, and NLP tooling. We use Python and Hugging Face Transformers.

View repos →

Linguists

Help curate and annotate data correctly. Linguistic expertise ensures our datasets are actually useful.

Get involved →

Help us build AI in Nepali.

Every dataset row, every commit, every share helps build the future of Nepali AI. Open source, open data, open research.

Hugging Face ↗ GitHub ↗