kennethleungty/Llama-2-Open-Source-LLM-CPU-Inference
Running Llama 2 and other Open-Source LLMs on CPU Inference Locally for Document Q&A
This project helps teams who need to extract specific information from long documents, like annual reports or legal filings, by asking questions in plain language. You input your documents and your questions, and it provides direct answers. This is ideal for analysts, researchers, or anyone handling sensitive information that can't be shared with external AI services.
974 stars. No commits in the last 6 months.
Use this if you need to run a question-answering system on your own private documents without relying on external cloud-based AI services, especially due to data privacy or cost concerns.
Not ideal if you're comfortable using commercial AI services like OpenAI's GPT-4 or if you need to process extremely large volumes of data very quickly on high-end GPUs.
Stars: 974
Forks: 207
Language: Python
License: MIT
Category:
Last pushed: Nov 06, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/kennethleungty/Llama-2-Open-Source-LLM-CPU-Inference"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
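The same endpoint can be queried from Python. A minimal sketch using only the standard library, assuming the endpoint returns JSON (the response field names are not documented here, so the parsed dictionary is returned as-is):

```python
import json
import urllib.request

# Endpoint from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"
repo = "kennethleungty/Llama-2-Open-Source-LLM-CPU-Inference"
url = f"{BASE}/{repo}"

def fetch_quality(url: str) -> dict:
    """Fetch the quality data for a repo and parse the JSON response."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)
```

Calling `fetch_quality(url)` performs a live HTTP request, so it counts against the 100-requests/day limit noted above.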
Related models
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
sgl-project/sglang
SGLang is a high-performance serving framework for large language models and multimodal models.
alibaba/MNN
MNN: A blazing-fast, lightweight inference engine battle-tested by Alibaba, powering...
xorbitsai/inference
Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source,...
tensorzero/tensorzero
TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM...