zhihu/cuBERT
Fast implementation of BERT inference directly on NVIDIA GPUs (CUDA, cuBLAS) and Intel MKL
This project dramatically speeds up BERT inference. It takes raw text as input and quickly outputs classifications (such as sentiment or topic), extracted features, or pooled representations. It is aimed at developers and MLOps engineers who need to deploy and run BERT-based natural language processing applications at scale, without the overhead of a full machine learning framework.
549 stars. No commits in the last 6 months.
Use this if you need BERT inference with significantly lower latency and higher throughput, especially in production environments where speed and efficiency are critical.
Not ideal if you need to train BERT models, perform tasks beyond standard BERT inference, or if your application does not involve high-volume, performance-critical text processing.
Stars
549
Forks
84
Language
C++
License
MIT
Category
Last pushed
Nov 18, 2020
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/zhihu/cuBERT"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related frameworks
ThalesGroup/ConceptBERT
Implementation of ConceptBert: Concept-Aware Representation for Visual Question Answering
dimitreOliveira/bert-as-a-service_TFX
End-to-end pipeline with TFX to train and deploy a BERT model for sentiment analysis.
kpi6research/Bert-as-a-Library
Bert as a Library is a Tensorflow library for quick and easy training and finetuning of models...
SapienzaNLP/mosaico
A multilingual open-text semantically annotated interlinked corpus
Statistical-Impossibility/Feline-Project
Domain-adaptive NLP pipeline for feline veterinary NER using BERT