Collections: awesome-llama

https://github.com/vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

amd blackwell cuda deepseek deepseek-v3 gpt gpt-oss inference kimi llama llm llm-serving model-serving moe openai pytorch qwen qwen3 tpu transformer

Last synced: 01 Jun 2026

https://github.com/ggml-org/llama.cpp

LLM inference in C/C++

ggml

Last synced: 01 Jun 2026

https://github.com/unslothai/unsloth

Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.6, DeepSeek, gpt-oss locally.

agent deepseek fine-tuning gemma gemma3 gpt-oss llama llama3 llm llms mistral openai qwen reinforcement-learning self-hosted text-to-speech tts ui unsloth

Last synced: 01 Jun 2026

https://github.com/run-llama/LlamaIndexTS

Data framework for your LLM applications, focused on server-side solutions.

agent chatbot claude-ai create-llama embedding groq-ai javascript llama llama-index llama3 llamaindex llm node nodejs openai react typescript

Last synced: 01 Jun 2026

https://github.com/langgenius/dify

Production-ready platform for agentic workflow development.

agent agentic-ai agentic-framework agentic-workflow ai automation gemini genai gpt gpt-4 llm low-code mcp nextjs no-code openai orchestration python rag workflow

Last synced: 01 Jun 2026

https://github.com/mangiucugna/json_repair

A python module to repair invalid JSON from LLMs

deep-learning gpt-4 json llama3 llm machine-learning mistral openai-api parser repair

Last synced: 01 Jun 2026

https://github.com/scisharp/llamasharp

A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.

chatbot gpt llama llama-cpp llama2 llama3 llamacpp llava llm multi-modal semantic-kernel

Last synced: 01 Jun 2026

https://github.com/nomic-ai/gpt4all

GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.

ai-chat llm-inference

Last synced: 01 Jun 2026

https://github.com/withcatai/node-llama-cpp

Run AI models locally on your machine with Node.js bindings for llama.cpp. Enforce a JSON schema on the model output at the generation level.

ai bindings catai cmake cmake-js cuda embedding function-calling gguf gpu grammar json-schema llama llama-cpp llm metal nodejs prebuilt-binaries self-hosted vulkan

Last synced: 01 Jun 2026

https://github.com/modelscope/ms-swift

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-V4, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Gemma4, Llava, Phi4, ...) (AAAI 2025).

deepseek-r1 embedding grpo internvl liger llama llama4 llm lora megatron moe multimodal open-r1 peft qwen3 qwen3-6 qwen3-omni qwen3-vl reranker sft

Last synced: 01 Jun 2026

https://github.com/huggingface/text-generation-inference

Large Language Model Text Generation Inference

bloom deep-learning falcon gpt inference nlp pytorch starcoder transformer

Last synced: 01 Jun 2026

https://github.com/PaddlePaddle/PaddleNLP

Easy-to-use and powerful LLM and SLM library with awesome model zoo.

bert compression distributed-training document-intelligence embedding ernie information-extraction llama llm neural-search nlp paddlenlp pretrained-models question-answering search-engine semantic-analysis sentiment-analysis transformers uie

Last synced: 01 Jun 2026

https://github.com/modular/modular

The Modular Platform (includes MAX & Mojo)

ai language machine-learning max modular mojo programming-language

Last synced: 01 Jun 2026

https://github.com/InternLM/lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

codellama cuda-kernels deepspeed fastertransformer internlm llama llama2 llama3 llm llm-inference turbomind

Last synced: 01 Jun 2026

https://github.com/zilliztech/GPTCache

Semantic cache for LLMs. Fully integrated with LangChain and llama_index.

aigc autogpt babyagi chatbot chatgpt chatgpt-api dolly gpt langchain llama llama-index llm memcache milvus openai redis semantic-search similarity-search vector-search

Last synced: 01 Jun 2026

https://github.com/awaescher/OllamaSharp

The easiest way to use Ollama in .NET

ai gpt ichatclient library llama llamacpp llm localllama microsoft-extensions-ai ollama ollama-api streaming

Last synced: 01 Jun 2026

https://github.com/mlc-ai/web-llm

High-performance In-browser LLM Inference Engine

chatgpt deep-learning language-model llm tvm webgpu webml

Last synced: 01 Jun 2026

https://github.com/xorbitsai/inference

Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source, speech, and multimodal models on cloud, on-prem, or your laptop — all through one unified, production-ready inference API.

artificial-intelligence chatglm deployment flan-t5 gemma ggml glm4 inference llama llama3 llamacpp llm machine-learning mistral openai-api pytorch qwen vllm whisper wizardlm

Last synced: 01 Jun 2026

https://github.com/oobabooga/textgen

Open-source desktop app for local LLMs. Text, vision, tool-calling, OpenAI/Anthropic-compatible API. 100% private.

Last synced: 01 Jun 2026

https://github.com/sigoden/aichat

All-in-one LLM CLI tool featuring Shell Assistant, Chat-REPL, RAG, AI Tools & Agents, with access to OpenAI, Claude, Gemini, Ollama, Groq, and more.

ai ai-agents chatbot claude cli function-calling gemini llm ollama openai rag rust shell webui

Last synced: 01 Jun 2026

https://github.com/ollama/ollama

Get up and running with Kimi-K2.5, GLM-5, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma and other models.

deepseek gemma gemma3 glm go golang gpt-oss llama llama3 llm llms minimax mistral ollama qwen

Last synced: 01 Jun 2026

https://github.com/sobelio/llm-chain

`llm-chain` is a powerful Rust crate for building chains of large language model calls, allowing you to summarise text and complete complex tasks.

chatgpt langchain llama llm openai rust text-summary

Last synced: 01 Jun 2026

https://github.com/run-llama/llama_cloud_services

Knowledge Agents and Management in the Cloud

document document-parser document-parsing docx-to-markdown parsing pdf pdf-document-processor pdf-to-excel pdf-to-json pdf-to-markdown pdf-to-text ppt-to-json ppt-to-markdown pptx structured-data tables

Last synced: 01 Jun 2026

https://github.com/TheR1D/shell_gpt

A command-line productivity tool powered by AI large language models like GPT-5 that helps you accomplish your tasks faster and more efficiently.

chatgpt cheat-sheet cli commands gpt-3 gpt-4 gpt-5 linux llama llm ollama openai productivity python shell terminal

Last synced: 01 Jun 2026

https://github.com/twinnydotdev/twinny

The most no-nonsense, locally or API-hosted AI code completion plugin for Visual Studio Code - like GitHub Copilot but 100% free.

artificial-intelligence code-chat code-completion code-generation codellama copilot free llama2 llamacpp ollama ollama-api ollama-chat private symmetry vscode-extension

Last synced: 01 Jun 2026

https://github.com/langroid/langroid

Harness LLMs with Multi-Agent Programming

agents ai chatgpt function-calling gpt gpt-4 gpt4 information-retrieval language-model llama llm llm-agent llm-framework local-llm multi-agent-systems openai-api rag retrieval-augmented-generation

Last synced: 01 Jun 2026

https://github.com/ludwig-ai/ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models

computer-vision data-centric data-science deep deep-learning deeplearning fine-tuning learning llama llama2 llm llm-training machine-learning machinelearning mistral ml natural-language natural-language-processing neural-network pytorch

Last synced: 01 Jun 2026

https://github.com/explosion/curated-transformers

🤖 A PyTorch library of curated Transformer models and their composable components

albert bert camembert dolly2 falcon gptneox llama llm llms nlp pytorch roberta transformer transformers xlm-roberta

Last synced: 01 Jun 2026

https://github.com/open-compass/opencompass

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

benchmark chatgpt evaluation large-language-model llama2 llama3 llm openai

Last synced: 01 Jun 2026

https://github.com/bentoml/OpenLLM

Run any open-source LLMs, such as DeepSeek and Llama, as OpenAI compatible API endpoint in the cloud.

bentoml fine-tuning llama llama2 llama3-1 llama3-2 llama3-2-vision llm llm-inference llm-ops llm-serving llmops mistral mlops model-inference open-source-llm openllm vicuna

Last synced: 01 Jun 2026

https://github.com/tenstorrent/tt-metal

:metal: TT-NN operator library, and TT-Metalium low level kernel programming model.

accelerator ai cuda deepseek gpu img-gen kernels llama llm metal scale-out stable-diffusion tenstorrent video-gen

Last synced: 01 Jun 2026

https://github.com/predibase/lorax

Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

fine-tuning gpt llama llm llm-inference llm-serving llmops lora model-serving pytorch transformers

Last synced: 01 Jun 2026

https://github.com/meta-llama/llama-cookbook

Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using the Llama model family on various provider services.

ai finetuning langchain llama llama2 llm machine-learning python pytorch vllm

Last synced: 01 Jun 2026

https://github.com/datajuicer/data-juicer

Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

data data-analysis data-pipeline data-processing data-science data-visualization foundation-models instruction-tuning large-language-models llm llms multi-modal pre-training synthetic-data

Last synced: 01 Jun 2026

https://github.com/SilasMarvin/lsp-ai

LSP-AI is an open-source language server that serves as a backend for AI-powered functionality, designed to assist and empower software engineers, not replace them.

ai auto-completion developer-tools ide language-client llama llamacpp llm lsp mistral openai self-hosted

Last synced: 01 Jun 2026

https://github.com/InternLM/xtuner

A Next-Generation Training Engine Built for Ultra-Large MoE Models

agent deepseek-v3 gpt-oss intern-s1 internvl kimi-k2 llm multimodal qwen3-moe qwen3-vl reinforcement-learning

Last synced: 01 Jun 2026

https://github.com/mybigday/llama.rn

React Native binding of llama.cpp

android ios llama llama-cpp llm react-native

Last synced: 01 Jun 2026

https://github.com/bigscience-workshop/petals

🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading

bloom chatbot deep-learning distributed-systems falcon gpt guanaco language-models large-language-models llama machine-learning mixtral neural-networks nlp pipeline-parallelism pretrained-models pytorch tensor-parallelism transformer volunteer-computing

Last synced: 01 Jun 2026

https://github.com/explosion/spacy-llm

🦙 Integrating LLMs into structured NLP pipelines

anthropic claude cohere dolly falcon gpt-3 gpt-4 large-language-models llama llm machine-learning named-entity-recognition natural-language-processing nlp openai prompt-engineering spacy text-classification

Last synced: 01 Jun 2026

https://github.com/yoshoku/llama_cpp.rb

llama_cpp.rb provides Ruby bindings for llama.cpp

ai gem llama llm ruby

Last synced: 01 Jun 2026

https://github.com/haotian-liu/LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

chatbot chatgpt foundation-models gpt-4 instruction-tuning llama llama-2 llama2 llava multi-modality multimodal vision-language-model visual-language-learning

Last synced: 01 Jun 2026

https://github.com/ChunelFeng/CGraph

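A commonly used C++ & Python DAG framework: a general-purpose, cross-platform, flow-graph-based parallel computing framework with no third-party dependencies, listed in awesome-cpp. Stars, forks, and discussion are welcome.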

ai ai-agents dag graph pipeline taskflow workflow

Last synced: 01 Jun 2026

https://github.com/tak-bro/aicommit2

A Reactive CLI that generates commit messages for Git and Jujutsu with Ollama, ChatGPT, Gemini, Claude, Mistral and other AI

ai-commits aicommit aicommits anthropic chatgpt claude cli codestral cohere deepseek git-commit groq jj lazygit llama mistral ollama perplexity pre-commit pre-commit-hook

Last synced: 01 Jun 2026

https://github.com/bolna-ai/bolna

Conversational voice AI agents

agentic-ai agents ai-agents cartesia conversational-ai deepgram deepseek deepseek-chat elevenlabs function-calling gpt-4 llama openai plivo twilio voice-agents voice-ai-agents voice-assistant whisper

Last synced: 01 Jun 2026

https://github.com/kyegomez/zeta

Build high-performance AI models with modular building blocks

attention-mechanism attention-model chatgpt ffns llms lucidrains openai pytorch pytorch-implementation pytorch-tutorial tensorflow transformer-architecture transformers

Last synced: 01 Jun 2026

https://github.com/k8sgpt-ai/k8sgpt

Giving Kubernetes Superpowers to everyone

ai devops kubernetes llama openai sre tooling

Last synced: 01 Jun 2026

https://github.com/sendbird/sendbird-uikit-react-native

Build chat in minutes with Sendbird UIKit open source code.

api-for-chat bard chat-api chat-api-platform chat-platform chat-sdk chat-ui chatbot-api chatbot-ui chatgpt communications-platform genai-chatbot genai-chatbot-api gpt-powered-chatbot gpt-ui llama2 messaging-api messaging-platform messaging-sdk palm2

Last synced: 01 Jun 2026

https://github.com/mdrokz/rust-llama.cpp

LLama.cpp rust bindings

api-bindings cpp crates-io ffi llama llama-cpp machine-learning model rust

Last synced: 01 Jun 2026

https://github.com/snowby666/poe-api-wrapper

👾 A Python API wrapper for Poe.com. With this, you will have free access to GPT-4, Claude, Llama, Gemini, Mistral and more! 🚀

api chatbot chatgpt claude code-llama dall-e gemini gpt-4 groq llama mistral openai palm2 poe poe-api python quora qwen reverse-engineering stable-diffusion

Last synced: 01 Jun 2026

https://github.com/gbaptista/ollama-ai

A Ruby gem for interacting with Ollama's API that allows you to run open source AI LLMs (Large Language Models) locally.

ai alpaca bakllava dolphin llama llama2 llava llm mistral mistral-ai mixtral nano-bots ollama ollama-api openorca vicuna

Last synced: 01 Jun 2026

https://github.com/sendbird/sendbird-chat-sdk-javascript

Sendbird Chat SDK for JavaScript.

api-for-chat bard chat-api chat-api-platform chat-platform chat-sdk chatbot-api chatbot-sdk chatgpt communications-platform genai-chatbot genai-chatbot-api genai-chatbot-sdk gpt-powered-chatbot instant-messaging-api llama2 messaging-api messaging-platform messaging-sdk palm2

Last synced: 01 Jun 2026

https://github.com/mishushakov/llm-scraper

Turn any webpage into structured data using LLMs

ai artificial-intelligence browser browser-automation gpt gpt-4 langchain llama llm openai playwright puppeteer scraper

Last synced: 01 Jun 2026

https://github.com/ngxson/wllama

WebAssembly binding for llama.cpp - Enabling on-browser LLM inference

llama llamacpp llm wasm webassembly

Last synced: 01 Jun 2026

https://github.com/shroominic/codeinterpreter-api

👾 Open source implementation of the ChatGPT Code Interpreter

chatgpt chatgpt-code-generation code-interpreter codeinterpreter langchain llm-agent

Last synced: 01 Jun 2026

https://github.com/h2oai/h2ogpt

Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/

ai chatgpt embeddings fedramp generative gpt gpt4all llama2 llm mixtral pdf private privategpt vectorstore

Last synced: 01 Jun 2026

https://github.com/smallcloudai/refact

AI Agent that handles engineering tasks end-to-end: integrates with developers’ tools, plans, executes, and iterates until it achieves a successful result.

ai-agent developer-tools enterprise fine-tuning on-prem open-source rag self-hosted swe-bench vscode

Last synced: 01 Jun 2026

https://github.com/zya/litellmjs

JavaScript implementation of LiteLLM.

javascript llama2 llm nodejs ollama openai

Last synced: 01 Jun 2026

https://github.com/lobehub/lobehub

LobeHub organizes your agents into 7×24 operation. It hires, schedules, reports on your entire AI team. You stay in charge — without staying online.

agent agent-collaboration agent-harness ai chatgpt claude deepseek gemini gpt knowledge-base mcp openai

Last synced: 01 Jun 2026

https://github.com/expectedparrot/edsl

Design, conduct and analyze results of AI-powered surveys and experiments. Simulate social science and market research with large numbers of AI agents and LLMs.

anthropic data-labeling deepinfra domain-specific-language experiments llama2 llm llm-agent llm-framework llm-inference market-research mixtral open-source openai python social-science surveys synthetic-data

Last synced: 01 Jun 2026

https://github.com/hiyouga/LlamaFactory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

agent ai deepseek fine-tuning gemma gpt instruction-tuning large-language-models llama llama3 llm lora moe nlp peft qlora quantization qwen rlhf transformers

Last synced: 01 Jun 2026

https://github.com/melih-unsal/demogpt

🤖 Everything you need to create an LLM Agent—tools, prompts, frameworks, and models—all in one place.

agent agents ai artificial-intelligence autogpt autonomous-agents chatgpt chatgpt-api deepseek demo gpt-4 langchain langchain-app langchain-python llms o1 openai python streamlit streamlit-application

Last synced: 01 Jun 2026

https://github.com/unifyai/unify

Notion for AI Observability 📊

ai claude gpt gpt-4 llama2 llm llm-inference llms mixtral openai python

Last synced: 01 Jun 2026

https://github.com/eidolon-ai/eidolon

The first AI Agent Server, Eidolon is a pluggable Agent SDK and enterprise ready, deployment server for Agentic applications

agents generative-ai langchain llama llm openai python services

Last synced: 01 Jun 2026

https://github.com/zhudotexe/kani

kani (カニ) is a highly hackable microframework for tool-calling language models. (NLP-OSS @ EMNLP 2023)

chatgpt framework function-calling gpt-4 large-language-models llama llms microframework openai tool-use

Last synced: 01 Jun 2026

https://github.com/liltom-eth/llama2-webui

Run any Llama 2 locally with gradio UI on GPU or CPU from anywhere (Linux/Windows/Mac). Use `llama2-wrapper` as your local llama2 backend for Generative Agents/Apps.

llama-2 llama2 llm llm-inference

Last synced: 01 Jun 2026

https://github.com/pleisto/flappy

Production-Ready LLM Agent SDK for Every Developer

agent chatgpt generative-ai llama llm rewoo transformers

Last synced: 01 Jun 2026

https://github.com/Picovoice/picollm

On-device LLM Inference Powered by X-Bit Quantization

compression efficient-inference gemma generative-ai language-model language-models large-language-model llama llama2 llama3 llm llm-inference llms mistral mixtral model-compression natural-language-processing quantization self-hosted

Last synced: 01 Jun 2026

https://github.com/aandrew-me/tgpt

AI Chatbots in terminal for free

ai chatbot chatgpt cli go golang gpt4 linux llama macos mixtral terminal windows

Last synced: 01 Jun 2026

https://github.com/ad-si/cai

User friendly CLI tool for AI tasks. Stop thinking about LLMs and prompts, start getting results!

ai anthropic chatgpt claude cli gpt gpt-4o gpt-5 groq llama llama3 llamafile llm machine-learning mistral ml ollama openai prompt rust

Last synced: 01 Jun 2026

https://github.com/10Nates/ollama-autocoder

A simple to use Ollama autocompletion engine with options exposed and streaming functionality

ai autocomplete copilot llama llm ollama ollama-interface vscode vscode-extension

Last synced: 01 Jun 2026

https://github.com/Noeda/rllama

Rust+OpenCL+AVX2 implementation of LLaMA inference code

Last synced: 01 Jun 2026

https://github.com/janhq/jan

Jan is an open source alternative to ChatGPT that runs 100% offline on your computer.

chatgpt gpt llamacpp llm localai open-source self-hosted tauri

Last synced: 01 Jun 2026

https://github.com/Gimer-Studios/APIMyLlama

API up your Ollama Server.

ai easy-to-use ollama ollama-api open-source

Last synced: 01 Jun 2026

https://github.com/icebaker/ruby-nano-bots

Ruby Implementation of Nano Bots: small, AI-powered bots that can be easily shared as a single file, designed to support multiple providers such as Anthropic Claude, Cohere Command, Google Gemini, Maritaca AI, Mistral AI, Ollama, OpenAI ChatGPT, and others, with support for calling tools (functions).

anthropic anthropic-claude chatgpt cohere-ai gemini gemini-pro google-ai google-gemini google-vertex-ai gpt gpt-4 llama llm maritaca-ai mistral mistral-ai nano-bots ollama openai openai-api

Last synced: 01 Jun 2026

https://github.com/stochasticai/xturing

Build, personalize and control your own LLMs. From data pre-processing to fine-tuning, xTuring provides an easy way to personalize open-source LLMs. Join our discord community: https://discord.gg/TgHXuSJEk6

adapter deep-learning fine-tuning finetuning gen-ai generative-ai gpt-2 gpt-j language-model llama llm lora mistral mixed-precision peft quantization

Last synced: 01 Jun 2026

https://github.com/Vali-98/cui-llama.rn

React Native binding of llama.cpp

Last synced: 01 Jun 2026

https://github.com/zjunlp/EasyEdit

[ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.

artificial-intelligence baichuan chatgpt cknowedit easyedit easyedit2 efficient gpt knowedit knowledge-editing knowlm large-language-models llama mmedit model-editing natural-language-processing safeedit tool trustworthy-ai unlearning

Last synced: 01 Jun 2026

https://github.com/strvm/meta-ai-api

Llama 3 API 70B & 405B (MetaAI Reverse Engineered)

405b 70b ai api llama llama2 llama3 meta

Last synced: 01 Jun 2026

https://github.com/belladoreai/llama3-tokenizer-js

JS tokenizer for LLaMA 3 and LLaMA 3.1

llama llama3 llm tokenizer

Last synced: 01 Jun 2026

https://github.com/dzhng/zod-gpt

Get structured, fully typed, and validated JSON outputs from OpenAI and Anthropic models.

Last synced: 01 Jun 2026

https://github.com/Tiiny-AI/PowerInfer

High-speed Large Language Model Serving for Local Deployment

large-language-models llama llm llm-inference local-inference

Last synced: 01 Jun 2026

https://github.com/aidatatools/ollama-benchmark

LLM Benchmark for Throughput via Ollama (Local LLMs)

ai ai-tools benchmark llm ollama

Last synced: 01 Jun 2026

https://github.com/tongjilibo/bert4torch

An elegant PyTorch implementation of transformers

belle bert bert4keras bert4torch chatglm large-language-models llama llm named-entity-recognition nlp pytorch relation-extraction seq2seq text-classification transformers

Last synced: 01 Jun 2026

https://github.com/dzhng/llm-api

Fully typed & consistent chat APIs for OpenAI, Anthropic, Groq, and Azure's chat models for browser, edge, and node environments.

Last synced: 01 Jun 2026

https://github.com/Simatwa/python-tgpt

AI Chat in Terminal + Package + REST-API

ai blackboxai chatgp chatgpt fastapi gemini gpt koboldai llama llama2 novita openai perplexity poe python-tgpt terminal-gpt tgpt

Last synced: 01 Jun 2026

https://github.com/Atome-FE/llama-node

Believe in AI democratization. Llama for Node.js, backed by llama-rs, llama.cpp and rwkv.cpp, that works locally on your laptop CPU. Supports llama/alpaca/gpt4all/vicuna/rwkv models.

ai embeddings gpt langchain large-language-models llama llama-node llama-rs llamacpp llm napi napi-rs nodejs rwkv

Last synced: 01 Jun 2026

https://github.com/datawhalechina/self-llm

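"A Guide to Using Open-Source Large Models": a tutorial tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying Chinese and international open-source large language models (LLMs) and multimodal large models (MLLMs) in a Linux environment.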

chatglm chatglm3 gemma-2b-it glm-4 internlm2 llama3 llm lora minicpm q-wen qwen qwen1-5 qwen2

Last synced: 01 Jun 2026

https://github.com/livingbio/fuzzy-json

Fuzzy-JSON is a compact Python package with no dependencies, designed to address the pesky JSONDecodeError that sometimes occurs when utilizing OpenAI's powerful call function.

json llama llm openai openai-chatgpt python

Last synced: 01 Jun 2026