AI · R&D

A Nepali-language AI assistant for Nepal's legal archives.

A retrieval-grounded, fine-tuned LLM that helps lawyers, students, and citizens understand and find prior court verdicts — built on the Nepal Kanoon Patrika and Supreme Court Faisala archives.

Find and understand court verdicts in Nepali, grounded in legal documents.

Status: Active R&D
Base model: Gemma 4 31B IT
Architecture: RAG + QLoRA fine-tune
Hardware: RTX 5090 (Blackwell)
Target v1: 2.5–3 months

The problem

Six decades of verdicts. None of them searchable in Nepali.

The Nepal Kanoon Patrika (NKP) has been the official record of Nepal's reported court decisions since 1958. The Supreme Court's Faisala archive holds every published verdict since. Together: tens of thousands of decisions, mostly in Devanagari Nepali — scanned PDFs before 2008, Unicode after.

A lawyer or law student needing to find precedent on a specific issue today has two options: pay a commercial database that doesn't speak Nepali well, or hand-scroll through scanned PDFs. Citizens trying to understand their own legal system have effectively zero options.

Off-the-shelf LLMs hallucinate case citations, a documented failure that has led to court sanctions against lawyers elsewhere. For Nepali legal use, an architecture that can invent a citation is unacceptable.

The approach

RAG + fine-tune.
Not either alone.

A fine-tuned LLM memorizes patterns, not facts. Trained on verdicts, it will confidently fabricate case citations, dates, and rulings — the exact failure mode that ends legal careers.

Our architecture splits the two responsibilities. The fine-tune teaches the model legal-Nepali vocabulary, judgment-style reasoning, and the citation format we want. The retrieval layer keeps every factual claim tied to a real, retrieved verdict.

If the relevant verdict isn't in the retrieved context, the model is trained to refuse — explicitly. That's the difference between a legal product and a liability.

Offline pipeline

  1. Scrape NKP + Supreme Court Faisala archives, politely rate-limited
  2. OCR Devanagari (Surya / Cloud Vision) for pre-2008 scans
  3. Extract structured metadata: case number, judges, parties, ratio, decision
  4. Section-aware chunking (तथ्य / facts, ठहर / holding, फैसला / decision), 512–800 tokens
  5. Embed with BGE-M3, index in Qdrant
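
Step 4 can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes each verdict's text carries the section headings तथ्य, ठहर, and फैसला on their own lines, and it approximates tokens by whitespace-separated words, where a real pipeline would count tokens with the embedding model's tokenizer. The names `split_sections` and `chunk_verdict` are hypothetical, and only the upper bound of the 512–800 token range is enforced.

```python
from typing import Dict, List

# Section headings assumed to appear alone on a line: facts, holding, decision.
SECTION_HEADINGS = ("तथ्य", "ठहर", "फैसला")


def split_sections(text: str) -> Dict[str, str]:
    """Group non-empty lines under the most recent section heading seen."""
    sections: Dict[str, List[str]] = {}
    current = "preamble"  # text that appears before the first heading
    for line in text.splitlines():
        stripped = line.strip()
        if stripped in SECTION_HEADINGS:
            current = stripped
            continue
        if stripped:
            sections.setdefault(current, []).append(stripped)
    return {name: " ".join(lines) for name, lines in sections.items()}


def chunk_verdict(text: str, max_tokens: int = 800) -> List[dict]:
    """Split each section into chunks of at most max_tokens words.

    Chunks never cross a section boundary, so every chunk can be
    tagged with the section it came from.
    """
    chunks = []
    for name, body in split_sections(text).items():
        words = body.split()
        for start in range(0, len(words), max_tokens):
            piece = words[start:start + max_tokens]
            chunks.append({"section": name, "text": " ".join(piece)})
    return chunks
```

Keeping the section label on each chunk lets the retriever or the prompt distinguish a statement of facts from the court's holding.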

Runtime query

  1. User question in Nepali
  2. Hybrid retrieve top-50 chunks (dense + sparse)
  3. bge-reranker-v2-m3 picks top-5
  4. Fine-tuned Gemma 4 answers in formal Nepali, cites by case number
  5. Refuses if context insufficient; citations verified against the corpus
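
The hybrid step (2) has to merge a dense ranking and a sparse ranking into one candidate list. The page does not say how the two are combined; the sketch below assumes reciprocal rank fusion (RRF), one common choice, with the conventional constant `k = 60`. The function name and the chunk ids are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(
    rankings: List[List[str]], k: int = 60, top_n: int = 50
) -> List[str]:
    """Fuse several ranked lists of chunk ids into a single ranking.

    Each list contributes 1 / (k + rank) per chunk, with rank starting at 1.
    Chunks ranked highly by both retrievers accumulate the largest scores.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id so the output is deterministic.
    ordered = sorted(scores, key=lambda cid: (-scores[cid], cid))
    return ordered[:top_n]


dense = ["a", "b", "c"]   # hypothetical ids from the dense retriever
sparse = ["b", "d", "a"]  # hypothetical ids from the sparse retriever
candidates = reciprocal_rank_fusion([dense, sparse])
```

The fused top-50 would then go to the reranker, which keeps the top-5 for the prompt.
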
Why it matters

Legal information is civic infrastructure.

When citizens can read their own legal system, courts get better questions, lawyers do less rote work, and the gap between the law and the public it serves narrows.

Native Nepali

Most legal AI today understands English law in English. Ours reads Devanagari, writes in formal Nepali, and treats both as first-class.

Anti-hallucination by design

Citations are verified against the corpus before display. The model refuses to answer when context is thin — better silent than confidently wrong.
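
A minimal sketch of that verification gate, under stated assumptions: the model marks citations as `[case:ID]` (a hypothetical format), the ids shown are placeholders, and an answer is displayed only if every cited case exists in the corpus and was also among the retrieved verdicts. The refusal text and function names are illustrative.

```python
import re
from typing import List, Set

# Placeholder refusal text; the real product would answer in formal Nepali.
REFUSAL = "No sufficiently relevant verdict was found in the archive."

# Hypothetical citation marker emitted by the model, e.g. [case:CASE-001].
CASE_PATTERN = re.compile(r"\[case:([^\]]+)\]")


def extract_citations(answer: str) -> List[str]:
    """Return every case id cited in the answer, in order of appearance."""
    return [match.strip() for match in CASE_PATTERN.findall(answer)]


def verify_answer(
    answer: str, retrieved_case_ids: Set[str], corpus_case_ids: Set[str]
) -> str:
    """Return the answer only if all of its citations check out."""
    cited = extract_citations(answer)
    if not cited:
        # An answer with no citation cannot be audited, so refuse.
        return REFUSAL
    for case_id in cited:
        if case_id not in corpus_case_ids or case_id not in retrieved_case_ids:
            return REFUSAL
    return answer
```

Requiring the cited case to be in the retrieved set, not just in the corpus, is a stricter design choice: it rejects citations the model recalled from training rather than read from context.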

Auditable

Every answer surfaces the underlying verdicts it cites — case number, NKP citation, year. Users can click through and read the source.

Grounded in real archives

NKP volumes from BS 2065 (~2008) onward are available as Unicode text; older volumes come in via OCR. The Constitution, statutes, and the 125K bilingual legal MT corpus layer on top.

Edge-deployable

4-bit quantization keeps the model small enough to self-host on a single workstation, and the QLoRA adapters add little on top. No dependence on foreign hyperscalers for sensitive legal queries.
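
The self-hosting claim can be sanity-checked with rough arithmetic. The sketch below assumes the parameter count is the 31 billion suggested by the model name, and it counts weight storage only: KV cache, activations, and quantization metadata are ignored, so real usage will be higher.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


PARAMS = 31e9  # assumption: parameter count read off the "31B" model name

four_bit = weight_memory_gb(PARAMS, 4)   # 4-bit quantized weights
bf16 = weight_memory_gb(PARAMS, 16)      # unquantized 16-bit weights
print(f"4-bit: {four_bit:.1f} GB, 16-bit: {bf16:.1f} GB")
# prints "4-bit: 15.5 GB, 16-bit: 62.0 GB"
```

On these assumptions the 4-bit weights fit within the 32 GB of the RTX 5090 listed under Hardware, with room left for the KV cache, while 16-bit weights would not.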

Knowledge wiki layer

Planned for v1.5: a curated wiki of landmark cases and statute interpretations, maintained by the model, for fast, consistent answers to common questions.

Engineering

The stack.

Modern, open-weight, self-hostable. No vendor lock-in.

Models

  • Gemma 4 31B IT (primary)
  • Gemma 4 26B-A4B (speed)
  • Qwen 3 14B (backup)
  • QLoRA via Unsloth 2026.4

Retrieval

  • BGE-M3 embeddings
  • Qdrant vector DB
  • bge-reranker-v2-m3
  • Hybrid dense + sparse

Data

  • NKP archive (BS 2065+)
  • Supreme Court Faisala
  • Constitution of Nepal
  • Surya OCR for scans

Hardware

  • RTX 5090 (32 GB sm_120)
  • CUDA 12.8 / PyTorch 2.11
  • FA2 community kernels
  • Single workstation

Training

  • 10K–20K seed instructions
  • Lawyer-reviewed gold set
  • Synthetic expansion via stronger LLM
  • Citation-correctness eval
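
One way the citation-correctness eval could be scored is per-answer precision and recall of cited case ids against the lawyer-reviewed gold set. This is an assumed metric, not one the plan specifies; the function name and the treatment of empty sets are illustrative conventions.

```python
from typing import Set, Tuple


def citation_scores(predicted: Set[str], gold: Set[str]) -> Tuple[float, float]:
    """Return (precision, recall) of predicted case ids against gold ids.

    Convention assumed here: if the gold set is empty and the model cited
    nothing, that is a correct refusal and scores (1.0, 1.0). Any other
    empty denominator scores 0.0.
    """
    if not predicted and not gold:
        return 1.0, 1.0
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall
```

Precision penalizes fabricated or irrelevant citations, which is the failure this project cares most about; recall shows whether the model misses verdicts it should have cited.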

Safety

  • Source-cited answers required
  • Refusal training on thin context
  • Lawyer disclaimer on every output
  • Query logging for failure analysis

Phased plan.

Realistic v1 in 2.5–3 months. Faster with a Nepali lawyer engaged from day one.

Phase 0–1

In progress

Eval & scrape

200-Q ground-truth set with lawyer review · NKP + Faisala cached locally

Phase 2–3

Up next

Pipeline & RAG

Structured JSONL extraction · Qdrant index · recall@5 baseline
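
The recall@5 baseline could be computed along these lines. The sketch assumes the hit-rate definition (a question counts as a hit if at least one gold chunk appears in its top k results) and hypothetical chunk ids; other definitions of recall@k exist, and the plan does not say which one is used.

```python
from typing import List, Set


def recall_at_k(
    retrieved: List[List[str]], relevant: List[Set[str]], k: int = 5
) -> float:
    """Fraction of questions with at least one relevant chunk in the top k.

    retrieved[i] is the ranked list of chunk ids for question i;
    relevant[i] is the set of gold chunk ids for the same question.
    """
    if not retrieved:
        return 0.0
    hits = 0
    for ranked, gold in zip(retrieved, relevant):
        if gold & set(ranked[:k]):
            hits += 1
    return hits / len(retrieved)
```

Run over the 200-question ground-truth set, this gives a single number to compare retrieval configurations before any fine-tuning starts.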

Phase 4–6

Planned

Instruction data & fine-tune

10K–20K examples · QLoRA training · checkpoint evals

Phase 7–8

Planned

Iterate & release

Re-train weak slices · model card · public demo with disclaimers

Working on Nepali language AI?

We collaborate with lawyers, researchers, and government bodies on Nepali NLP. If you have a corpus or a question, or want to compare notes, get in touch.