LastBotInc/llama2j

Pure Java Llama2 inference with optional multi-GPU CUDA implementation

Quality score: 27 / 100 (Experimental)

This project integrates Llama 2 language models directly into Java-based applications, allowing them to generate text, summarize information, or answer questions as part of existing backend services. It loads a Llama 2 model checkpoint file and generates text responses from your prompts. It is aimed at backend developers and architects building scalable Java applications that need to run large language models locally and efficiently.

No commits in the last 6 months.

Use this if you are a Java backend developer needing to embed Llama 2 inference capabilities directly within your application, prioritizing high performance and local deployment on CPU or NVIDIA GPUs.

Not ideal if you need to run large language models other than Llama 2, prefer cloud-based LLM APIs, or are working outside of the Java ecosystem.

Tags: Java backend development · local LLM inference · application integration · text generation · AI-powered applications
Stale (6m) · No Package · No Dependents
Maintenance 0 / 25
Adoption 5 / 25
Maturity 16 / 25
Community 6 / 25


Stars: 13
Forks: 1
Language: Java
License: Apache-2.0
Last pushed: Sep 02, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/LastBotInc/llama2j"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
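The curl call above can also be issued from Java itself using the standard `java.net.http` client. The class and method names below are illustrative helpers, not part of llama2j or the API:

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class QualityApiExample {
    // Builds the quality-API URI for a given GitHub owner/repo pair.
    static URI qualityUri(String owner, String repo) {
        return URI.create(
            "https://pt-edge.onrender.com/api/v1/quality/transformers/"
            + owner + "/" + repo);
    }

    public static void main(String[] args) {
        HttpRequest req = HttpRequest
            .newBuilder(qualityUri("LastBotInc", "llama2j"))
            .GET()
            .build();

        // Actually sending the request requires network access and returns JSON:
        // HttpResponse<String> res = java.net.http.HttpClient.newHttpClient()
        //     .send(req, java.net.http.HttpResponse.BodyHandlers.ofString());

        System.out.println(req.uri());
        // prints "https://pt-edge.onrender.com/api/v1/quality/transformers/LastBotInc/llama2j"
    }
}
```

Without an API key this is limited to 100 requests/day, so cache responses rather than calling the endpoint on every request.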