Language Identification NLP Tools

Tools for automatically detecting and classifying the language of input text. This category does NOT include language-specific NLP processing, multilingual models for downstream tasks, or code-switching analysis beyond language identification.

There are 42 language identification tools tracked. Three score 50 or higher (Established tier). The highest-rated is indix/whatthelang at 54/100 with 167 stars.

Get all 42 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=language-identification&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
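The same request can be made from a script. The following is a minimal Python sketch using only the standard library. It assumes the endpoint returns a JSON body. The response schema is not described here, so the code does not rely on any particular field names. The helper names `build_url` and `fetch_projects` are hypothetical and chosen for illustration. The example URL uses `limit=20`; whether a larger value such as 42 is accepted is an assumption that has not been verified.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="nlp", subcategory="language-identification", limit=20):
    """Build the dataset URL with the same query parameters as the curl example."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(limit=20, timeout=10):
    """Fetch the listing and return the decoded JSON.

    Assumption: the body is valid JSON. Its structure is not documented
    on this page, so the caller should inspect it before relying on keys.
    """
    with urllib.request.urlopen(build_url(limit=limit), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_projects()` performs one unauthenticated request, which counts against the 100 requests/day allowance. How an API key is supplied is not shown on this page, so the sketch leaves it out.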

| # | Tool | Description | Score | Tier |
|---|------|-------------|-------|------|
| 1 | indix/whatthelang | Lightning Fast Language Prediction 🚀 | 54 | Established |
| 2 | nitotm/efficient-language-detector-js | Fast and accurate natural language detection. Detector written in... | 52 | Established |
| 3 | pemistahl/lingua-py | The most accurate natural language detection library for Python, suitable... | 50 | Established |
| 4 | nitotm/efficient-language-detector | Fast and accurate natural language detection. Detector written in PHP. Nito-ELD, ELD. | 46 | Emerging |
| 5 | mbanon/fastspell | Targeted language identifier, based on FastText and Hunspell. | 46 | Emerging |
| 6 | nickdavidhaynes/spacy-cld | Language detection extension for spaCy 2.0+ | 45 | Emerging |
| 7 | patrickschur/language-detection | A language detection library for PHP. Detects the language from a given text string. | 44 | Emerging |
| 8 | Ankush-Chander/messandei | A simple stopword based language detector | 39 | Emerging |
| 9 | fedelopez77/langdetect | Language detection software | 34 | Emerging |
| 10 | honeybhardwaj/Language_Identification | A language identifier that detects different languages. | 34 | Emerging |
| 11 | kaiidams/LanguageDetection | C# port of https://github.com/shuyo/language-detection | 33 | Emerging |
| 12 | nitotm/efficient-language-detector-py | Fast and accurate natural language detection. Detector written in Python.... | 33 | Emerging |
| 13 | aravind-selvam/language_identification-using-cnn-and-audio-processing | A web application Language Identification project that uses PyTorch and... | 33 | Emerging |
| 14 | zamgi/lingvo--LanguageDetector | Implementation of detection for a few languages | 32 | Emerging |
| 15 | searchpioneer/lingua-dotnet | Natural language detection library for .NET, suitable for long and short text alike | 32 | Emerging |
| 16 | loonghuey/native-language-cnn | Speech subtask of the 2017 NLI Shared Task | 31 | Emerging |
| 17 | tomelf/CNIT623-Native-Language-Identification-On-English-Learner-Dataset | Exploring how to identify the nationality of authors who answered exam... | 31 | Emerging |
| 18 | floydhub/language-identification-template | Detect the languages from short pieces of text | 30 | Emerging |
| 19 | lkevers/ldig-models-TAL62-3 | Language identification models for 17 European official languages and... | 30 | Emerging |
| 20 | ilinguistics/geoLid | Geographically-informed language identification | 29 | Experimental |
| 21 | Al00X/LanguageDetector | Detect language from a text string in Swift! | 27 | Experimental |
| 22 | SomeAB/somelang | Natural Language Detection | 27 | Experimental |
| 23 | ffreemt/fast-langid | Detect language of a given text, fast | 26 | Experimental |
| 24 | andrianllmm/tagLID | A word-level Language Identification (LID) tool for Tagalog-English (Taglish) text | 24 | Experimental |
| 25 | Jason-Oleana/fasttext-language-detection | Fasttext language detection wrapped in Fastapi + Docker 🐋 | 23 | Experimental |
| 26 | PhilWicke/Language_Identifier | Language Identification classification using XGBoost | 23 | Experimental |
| 27 | Lidan0241/language-detection | A language detection model for code-switched texts in es/en/zh | 21 | Experimental |
| 28 | br-pki/detectLanguage | To 1) create train/test samples of Tatoeba sentences for NLP-related tasks &... | 21 | Experimental |
| 29 | masalha-alaa/native-language-recognition | Mother tongue prediction from reddit posts (Deep Learning vs. Regular... | 20 | Experimental |
| 30 | javadr/PyTorch-Detect-Code-Switching | Implementation of a deep learning model (BiLSTM) to detect code-switching | 20 | Experimental |
| 31 | aparnadutta/code-mixed-lid | Word-level language identification for Bangla-English code-mixed social... | 20 | Experimental |
| 32 | Ehsan-Tavan/Language_Identification | Automatic detection of languages in text utilizing machine learning and Deep... | 19 | Experimental |
| 33 | Interaction-Bot/LanguageDetection | Experimental language detector used by Interaction Bot. | 19 | Experimental |
| 34 | xzhren/PreferenceAwareLID | Unsupervised Preference-Aware Language Identification | 18 | Experimental |
| 35 | AigozhiyevB/kazakh-russian-classification | A small model for classifying the Kazakh and Russian languages | 17 | Experimental |
| 36 | javadr/Language_Detection | Detection of the language of a text with Multinomial Naive Bayes method and... | 15 | Experimental |
| 37 | FR34KY-CODER/Language-Detection | Language Detection Project Using NLP | 12 | Experimental |
| 38 | ibtihelgharsalah/NLP-Language-Detection | An AI model that automatically detects the language of a given text... | 11 | Experimental |
| 39 | Aayushinit/LanguageDetectorApp | Real-time background subtraction using OpenCV + Flask with switchable... | 11 | Experimental |
| 40 | giacomolat/MuseumLangID---Model-for-Identifying-the-Language-of-Texts-for-a-Museum | This repository contains a Language Identification project to classify... | 11 | Experimental |
| 41 | golecalicja/language-recognition-neural-network | A single-layer neural network written from scratch that predicts the... | 10 | Experimental |
| 42 | caiselvas/language-identification | An NLP project leveraging character trigrams and smoothing techniques... | 10 | Experimental |