LastBotInc/llama2j

Pure Java Llama2 inference with optional multi-GPU CUDA implementation

Quality score: 27 / 100 (Experimental)

This project integrates Llama 2 language models directly into Java-based applications, allowing them to generate text, summarize information, or answer questions as part of existing backend services. It loads a Llama 2 model checkpoint file and generates text responses from your prompts. It is aimed at backend developers and architects building scalable Java applications that need to run large language models locally and efficiently.

No commits in the last 6 months.

Use this if you are a Java backend developer needing to embed Llama 2 inference capabilities directly within your application, prioritizing high performance and local deployment on CPU or NVIDIA GPUs.

Not ideal if you need to run large language models other than Llama 2, prefer cloud-based LLM APIs, or are working outside of the Java ecosystem.

Tags: Java backend development · local LLM inference · application integration · text generation · AI-powered applications
Stale (6m) · No Package · No Dependents
Maintenance 0 / 25
Adoption 5 / 25
Maturity 16 / 25
Community 6 / 25


Stars: 13
Forks: 1
Language: Java
License: Apache-2.0
Last pushed: Sep 02, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/LastBotInc/llama2j"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
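The curl call above can also be issued from Java itself using the standard `java.net.http` client. The class and method names below are illustrative helpers, not part of llama2j or the API:

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class QualityApiExample {
    // Builds the quality-API URI for a given GitHub owner/repo pair.
    static URI qualityUri(String owner, String repo) {
        return URI.create(
            "https://pt-edge.onrender.com/api/v1/quality/transformers/"
            + owner + "/" + repo);
    }

    public static void main(String[] args) {
        HttpRequest req = HttpRequest
            .newBuilder(qualityUri("LastBotInc", "llama2j"))
            .GET()
            .build();

        // Actually sending the request requires network access and returns JSON:
        // HttpResponse<String> res = java.net.http.HttpClient.newHttpClient()
        //     .send(req, java.net.http.HttpResponse.BodyHandlers.ofString());

        System.out.println(req.uri());
        // prints "https://pt-edge.onrender.com/api/v1/quality/transformers/LastBotInc/llama2j"
    }
}
```

Without an API key this is limited to 100 requests/day, so cache responses rather than calling the endpoint on every request.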