zhihu/cuBERT

Fast implementation of BERT inference directly on NVIDIA (CUDA, cuBLAS) or Intel MKL

Quality score: 47 / 100 (Emerging)

This project dramatically speeds up text analysis with BERT models. It takes raw text and quickly produces classifications (such as sentiment or topic), extracted features, or pooled representations. It is aimed at developers and MLOps engineers who need to deploy and run BERT-based natural language processing applications at scale, without the overhead of a full machine learning framework.

549 stars. No commits in the last 6 months.

Use this if you need to perform BERT inference with significantly reduced latency and higher throughput, especially in production environments where speed and efficiency are critical for text processing tasks.

Not ideal if you need to train BERT models, perform tasks beyond standard BERT inference, or if your application does not involve high-volume, performance-critical text analysis.

Tags: Natural Language Processing · Text Analytics · Machine Learning · Inference · Deep Learning · Deployment · High-Performance Computing
Badges: Stale (6m) · No Package · No Dependents

Maintenance: 0 / 25
Adoption: 10 / 25
Maturity: 16 / 25
Community: 21 / 25

How are scores calculated?

Stars: 549
Forks: 84
Language: C++
License: MIT
Last pushed: Nov 18, 2020
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/zhihu/cuBERT"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.