AI · R&D

A Nepali-language AI assistant for Nepal's legal archives.

A retrieval-grounded, fine-tuned LLM that helps lawyers, students, and citizens understand and find prior court verdicts — built on the Nepal Kanoon Patrika and Supreme Court Faisala archives.

Search and understand verdicts in Nepali, grounded in legal documents.

Status: Active R&D
Base model: Gemma 4 31B IT
Architecture: RAG + QLoRA fine-tune
Hardware: RTX 5090 (Blackwell)
Target v1: 2.5–3 months

The problem

Six decades of verdicts. None of them searchable in Nepali.

The Nepal Kanoon Patrika (NKP) has been the official record of Nepal's reported court decisions since 1958, and the Supreme Court's Faisala archive holds every published verdict since then. Together they amount to tens of thousands of decisions, mostly in Devanagari Nepali: scanned PDFs before 2008, Unicode text after.

A lawyer or law student needing to find precedent on a specific issue today has two options: pay a commercial database that doesn't speak Nepali well, or hand-scroll through scanned PDFs. Citizens trying to understand their own legal system have effectively zero options.

Off-the-shelf LLMs hallucinate case citations, a documented failure that has cost lawyers their licenses elsewhere. For Nepali legal use, an architecture that can invent citations is unacceptable.

The approach

RAG + fine-tune.
Not either alone.

A fine-tuned LLM memorizes patterns, not facts. Trained on verdicts, it will confidently fabricate case citations, dates, and rulings — the exact failure mode that ends legal careers.

Our architecture splits the two responsibilities. The fine-tune teaches the model legal-Nepali vocabulary, judgment-style reasoning, and the citation format we want. The retrieval layer keeps every factual claim tied to a real, retrieved verdict.

If the relevant verdict isn't in the retrieved context, the model is trained to refuse — explicitly. That's the difference between a legal product and a liability.

Offline pipeline

1. Scrape NKP + Supreme Court Faisala archives, politely rate-limited
2. OCR Devanagari (Surya / Cloud Vision) for pre-2008 scans
3. Extract structured metadata: case#, judges, parties, ratio, decision
4. Section-aware chunking (तथ्य · ठहर · फैसला, i.e. facts · holding · verdict), 512–800 tokens
5. Embed with BGE-M3, index in Qdrant
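
The section-aware chunking in step 4 could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's extractor: it assumes the three section names appear as standalone heading lines, it approximates tokens by whitespace-separated words (a real pipeline would count with the embedding model's tokenizer), and the names `SECTION_HEADINGS`, `Chunk`, and `chunk_verdict` are hypothetical.

```python
from dataclasses import dataclass

# Assumption: verdict sections are introduced by these standalone heading lines.
SECTION_HEADINGS = ("तथ्य", "ठहर", "फैसला")


@dataclass
class Chunk:
    case_no: str  # case number carried on every chunk so answers can cite it
    section: str  # section the text came from ("" for text before any heading)
    text: str


def chunk_verdict(case_no: str, body: str, max_tokens: int = 800) -> list[Chunk]:
    """Split a verdict into chunks that never cross a section boundary."""
    chunks: list[Chunk] = []
    section = ""
    words: list[str] = []

    def flush() -> None:
        # Emit the buffered words in windows of at most max_tokens "tokens".
        for i in range(0, len(words), max_tokens):
            chunks.append(Chunk(case_no, section, " ".join(words[i:i + max_tokens])))
        words.clear()

    for line in body.splitlines():
        stripped = line.strip()
        if stripped in SECTION_HEADINGS:
            flush()  # close the previous section before switching
            section = stripped
        else:
            words.extend(stripped.split())
    flush()
    return chunks
```

Keeping the section label and case number on every chunk means the retriever can later filter or weight by section, and the generator always has the identifier it needs to cite.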

Runtime query

1. User question in Nepali
2. Hybrid retrieve top-50 chunks (dense + sparse)
3. bge-reranker-v2-m3 picks top-5
4. Fine-tuned Gemma 4 answers in formal Nepali, cites by case#
5. Refuses if context insufficient; citations verified against the corpus
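
Steps 2, 3, and 5 can be sketched as follows. The page does not say how the dense and sparse result lists are combined; the sketch uses reciprocal rank fusion (RRF) with the conventional constant `k = 60` as one common option, which is an assumption. The reranker is passed in as a plain scoring function, and `min_score` and all function names here are hypothetical.

```python
from typing import Callable, Optional


def rrf_fuse(dense: list[str], sparse: list[str], k: int = 60, top_n: int = 50) -> list[str]:
    """Merge two ranked lists of chunk ids with reciprocal rank fusion."""
    scores: dict[str, float] = {}
    for ranking in (dense, sparse):
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score; break ties by id so the output is deterministic.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))[:top_n]


def select_context(
    candidates: list[str],
    rerank_score: Callable[[str], float],
    top_k: int = 5,
    min_score: float = 0.5,
) -> Optional[list[str]]:
    """Keep the top_k reranked chunks, or return None to signal a refusal."""
    scored = sorted(((rerank_score(c), c) for c in candidates), reverse=True)
    kept = [c for s, c in scored[:top_k] if s >= min_score]
    return kept or None
```

Returning `None` when no chunk clears the threshold gives the calling code an explicit refusal path, rather than handing the model weak context and hoping it declines on its own.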

Why it matters

Legal information is civic infrastructure.

When citizens can read their own legal system, courts get better questions, lawyers do less rote work, and the gap between the law and the public it serves narrows.

Native Nepali

Most legal AI today understands English law in English. Ours reads Devanagari, writes in formal Nepali, and treats both as first-class.

Anti-hallucination by design

Citations are verified against the corpus before display. The model refuses to answer when context is thin — better silent than confidently wrong.
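
A minimal sketch of that verification step, assuming the model emits citations in a bracketed `[case:ID]` form. That marker, the case ids used with it, and the function names are made up for illustration; the project's real citation format may differ.

```python
import re

# Assumption: citations are emitted as [case:ID]; the real format may differ.
CITATION_RE = re.compile(r"\[case:([^\]\s]+)\]")


def verify_citations(answer: str, known_case_ids: set[str]) -> tuple[list[str], list[str]]:
    """Return (verified, unverified) case ids cited in the answer, in order of first appearance."""
    verified: list[str] = []
    unverified: list[str] = []
    # dict.fromkeys de-duplicates while preserving order.
    for case_id in dict.fromkeys(CITATION_RE.findall(answer)):
        (verified if case_id in known_case_ids else unverified).append(case_id)
    return verified, unverified


def safe_to_display(answer: str, known_case_ids: set[str]) -> bool:
    """Show an answer only if it cites at least one case and every cited case exists."""
    verified, unverified = verify_citations(answer, known_case_ids)
    return bool(verified) and not unverified
```

A stricter variant could check citations against only the chunks retrieved for the query, which would also catch real case numbers cited for the wrong proposition.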

Auditable

Every answer surfaces the underlying verdicts it cites — case number, NKP citation, year. Users can click through and read the source.

Grounded in real archives

NKP from BS 2065 (~2008) forward as Unicode; older volumes via OCR. Constitution, statutes, and the 125K bilingual legal MT corpus layer on top.

Edge-deployable

QLoRA trains small adapters on top of a 4-bit quantized base model, which keeps fine-tuning and serving within reach of a single self-hosted workstation. No dependence on foreign hyperscalers for sensitive legal queries.

Knowledge wiki layer

A planned v1.5 — a curated wiki of landmark cases + statute interpretations the model maintains, for fast consistent answers to common questions.

Engineering

The stack.

Modern, open-weight, self-hostable. No vendor lock-in.

Models

Gemma 4 31B IT (primary)
Gemma 4 26B-A4B (speed)
Qwen 3 14B (backup)
QLoRA via Unsloth 2026.4

Retrieval

BGE-M3 embeddings
Qdrant vector DB
bge-reranker-v2-m3
Hybrid dense + sparse

Data

NKP archive (BS 2065+)
Supreme Court Faisala
Constitution of Nepal
Surya OCR for scans

Hardware

RTX 5090 (32 GB sm_120)
CUDA 12.8 / PyTorch 2.11
FlashAttention-2 (FA2) community kernels
Single workstation
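
A back-of-the-envelope check of why a 31B-parameter model is plausible on a 32 GB card: at 4 bits per weight, as in QLoRA-style quantization, the weights take about half a byte per parameter. The sketch counts weights only and takes the 31B figure from the model name. Quantization metadata, the KV cache, activations, and adapter or optimizer state are ignored, so real usage will be higher. The function name is hypothetical.

```python
def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


four_bit = weight_memory_gb(31e9, 4)      # 4-bit quantized base weights
sixteen_bit = weight_memory_gb(31e9, 16)  # unquantized 16-bit weights, for comparison
print(f"4-bit: {four_bit:.1f} GB, 16-bit: {sixteen_bit:.1f} GB")
```

Under these assumptions the 4-bit weights come to about 15.5 GB, roughly half of the card's memory, while 16-bit weights at about 62 GB would not fit at all.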

Training

10K–20K seed instructions
Lawyer-reviewed gold set
Synthetic expansion via stronger LLM
Citation-correctness eval

Safety

Source-cited answers required
Refusal training on thin context
Lawyer disclaimer on every output
Query logging for failure analysis

Phased plan.

Realistic v1 in 2.5–3 months. Faster with a Nepali lawyer engaged from day one.

Phase 0–1

In progress

Eval & scrape

200-Q ground-truth set with lawyer review · NKP + Faisala cached locally

Phase 2–3

Up next

Pipeline & RAG

Structured JSONL extraction · Qdrant index · recall@5 baseline
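
For reference, a recall@5 baseline can be computed as in the sketch below. It assumes each question in the ground-truth set is labeled with the set of case ids a correct answer must draw on, and that the retriever returns case ids in ranked order; the function names are hypothetical.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int = 5) -> float:
    """Fraction of the relevant case ids that appear in the top-k retrieved results."""
    if not relevant:
        raise ValueError("each evaluation question needs at least one relevant case")
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def mean_recall_at_k(runs: list[tuple[list[str], set[str]]], k: int = 5) -> float:
    """Average recall@k over (retrieved, relevant) pairs, one pair per question."""
    return sum(recall_at_k(r, rel, k) for r, rel in runs) / len(runs)
```

Measuring this before any fine-tuning separates retrieval failures from generation failures: if the right verdict never reaches the top 5, no amount of model training will fix the answer.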

Phase 4–6

Planned

Instruction data & fine-tune

10K–20K examples · QLoRA training · checkpoint evals

Phase 7–8