LastBotInc/llama2j
Pure Java Llama2 inference with optional multi-GPU CUDA implementation
This project integrates Llama 2 language models directly into Java-based applications, letting them generate text, summarize information, or answer questions as part of existing backend services. It loads a Llama 2 model checkpoint file and produces text responses from your prompts, on the CPU or on NVIDIA GPUs via CUDA.
No commits in the last 6 months.
Use this if you are a Java backend developer needing to embed Llama 2 inference capabilities directly within your application, prioritizing high performance and local deployment on CPU or NVIDIA GPUs.
Not ideal if you need to run large language models other than Llama 2, prefer cloud-based LLM APIs, or are working outside of the Java ecosystem.
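The listing above does not document llama2j's entry point or flags, so the snippet below is only a sketch of how a Java backend might shell out to a packaged runner. The jar name (`llama2j.jar`) and the `--model`, `--prompt`, and `--max-tokens` flags are illustrative assumptions, not the project's actual CLI.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.Arrays;
import java.util.List;

public class Llama2jRunner {

    // Build a command line for a hypothetical packaged runner.
    // The jar name and flag names are assumptions for illustration only.
    static List<String> buildCommand(String checkpointPath, String prompt, int maxTokens) {
        return Arrays.asList(
                "java", "-jar", "llama2j.jar",
                "--model", checkpointPath,
                "--prompt", prompt,
                "--max-tokens", Integer.toString(maxTokens));
    }

    public static void main(String[] args) throws Exception {
        ProcessBuilder pb = new ProcessBuilder(
                buildCommand("llama2-7b.bin", "Summarize this ticket:", 256));
        pb.redirectErrorStream(true); // merge stderr into stdout
        Process p = pb.start();
        try (BufferedReader out = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = out.readLine()) != null) {
                System.out.println(line); // stream generated text to the caller
            }
        }
        p.waitFor();
    }
}
```

For an in-process alternative, the project's own classes could be called directly once the jar is on the classpath; the subprocess approach above just isolates native CUDA resources from the host JVM.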
Stars: 13
Forks: 1
Language: Java
License: Apache-2.0
Category:
Last pushed: Sep 02, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/LastBotInc/llama2j"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
Higher-rated alternatives
ludwig-ai/ludwig
Low-code framework for building custom LLMs, neural networks, and other AI models
withcatai/node-llama-cpp
Run AI models locally on your machine with node.js bindings for llama.cpp. Enforce a JSON schema...
mudler/LocalAI
🤖 The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and...
zhudotexe/kani
kani (カニ) is a highly hackable microframework for tool-calling language models. (NLP-OSS @ EMNLP 2023)
SciSharp/LLamaSharp
A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.