Collections: awesome-llama
https://github.com/vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
amd blackwell cuda deepseek deepseek-v3 gpt gpt-oss inference kimi llama llm llm-serving model-serving moe openai pytorch qwen qwen3 tpu transformer
Last synced: 04 Feb 2026
https://github.com/unslothai/unsloth
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM.
agent deepseek deepseek-r1 fine-tuning gemma gemma3 gpt-oss llama llama3 llm llms mistral openai qwen qwen3 reinforcement-learning text-to-speech tts unsloth voice-cloning
Last synced: 04 Feb 2026
https://github.com/run-llama/LlamaIndexTS
Data framework for your LLM applications. Focuses on server-side solutions.
agent chatbot claude-ai create-llama embedding groq-ai javascript llama llama-index llama3 llamaindex llm node nodejs openai react typescript
Last synced: 03 Feb 2026
https://github.com/lobehub/lobehub
🤯 LobeHub - an open-source, modern design AI Agent Workspace. Supports multiple AI providers, Knowledge Base (file upload / RAG ), one click install MCP Marketplace and Artifacts / Thinking. One-click FREE deployment of your private AI Agent application.
agent ai artifacts chat chatgpt claude deepseek deepseek-r1 function-calling gemini gpt knowledge-base mcp nextjs ollama openai rag
Last synced: 04 Feb 2026
https://github.com/mangiucugna/json_repair
A python module to repair invalid JSON from LLMs
deep-learning gpt-4 json llama3 llm machine-learning mistral openai-api parser repair
Last synced: 04 Feb 2026
https://github.com/langgenius/dify
Production-ready platform for agentic workflow development.
agent agentic-ai agentic-framework agentic-workflow ai automation gemini genai gpt gpt-4 llm low-code mcp nextjs no-code openai orchestration python rag workflow
Last synced: 04 Feb 2026
https://github.com/PaddlePaddle/PaddleNLP
Easy-to-use and powerful LLM and SLM library with awesome model zoo.
bert compression distributed-training document-intelligence embedding ernie information-extraction llama llm neural-search nlp paddlenlp pretrained-models question-answering search-engine semantic-analysis sentiment-analysis transformers uie
Last synced: 04 Feb 2026
https://github.com/nomic-ai/gpt4all
GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
ai-chat llm-inference
Last synced: 04 Feb 2026
https://github.com/SciSharp/LLamaSharp
A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.
chatbot gpt llama llama-cpp llama2 llama3 llamacpp llava llm multi-modal semantic-kernel
Last synced: 04 Feb 2026
https://github.com/huggingface/text-generation-inference
Large Language Model Text Generation Inference
bloom deep-learning falcon gpt inference nlp pytorch starcoder transformer
Last synced: 04 Feb 2026
https://github.com/modelscope/ms-swift
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, Phi4, ...) (AAAI 2025).
deepseek-r1 embedding grpo internvl liger llama llama4 llm lora megatron moe multimodal open-r1 peft qwen3 qwen3-next qwen3-omni qwen3-vl reranker sft
Last synced: 04 Feb 2026
https://github.com/zilliztech/GPTCache
Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
aigc autogpt babyagi chatbot chatgpt chatgpt-api dolly gpt langchain llama llama-index llm memcache milvus openai redis semantic-search similarity-search vector-search
Last synced: 04 Feb 2026
https://github.com/hiyouga/LlamaFactory
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
agent ai deepseek fine-tuning gemma gpt instruction-tuning large-language-models llama llama3 llm lora moe nlp peft qlora quantization qwen rlhf transformers
Last synced: 04 Feb 2026
https://github.com/awaescher/OllamaSharp
The easiest way to use Ollama in .NET
ai gpt ichatclient library llama llamacpp llm localllama microsoft-extensions-ai ollama ollama-api streaming
Last synced: 04 Feb 2026
https://github.com/mlc-ai/web-llm
High-performance In-browser LLM Inference Engine
chatgpt deep-learning language-model llm tvm webgpu webml
Last synced: 03 Feb 2026
https://github.com/InternLM/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
codellama cuda-kernels deepspeed fastertransformer internlm llama llama2 llama3 llm llm-inference turbomind
Last synced: 04 Feb 2026
https://github.com/modular/modular
The Modular Platform (includes MAX & Mojo)
ai language machine-learning max modular mojo programming-language
Last synced: 04 Feb 2026
https://github.com/xorbitsai/inference
Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source, speech, and multimodal models on cloud, on-prem, or your laptop — all through one unified, production-ready inference API.
artificial-intelligence chatglm deployment flan-t5 gemma ggml glm4 inference llama llama3 llamacpp llm machine-learning mistral openai-api pytorch qwen vllm whisper wizardlm
Last synced: 04 Feb 2026
https://github.com/oobabooga/text-generation-webui
The definitive Web UI for local AI, with powerful features and easy setup.
Last synced: 04 Feb 2026
https://github.com/sigoden/aichat
All-in-one LLM CLI tool featuring Shell Assistant, Chat-REPL, RAG, AI Tools & Agents, with access to OpenAI, Claude, Gemini, Ollama, Groq, and more.
ai ai-agents chatbot claude cli function-calling gemini llm ollama openai rag rust shell webui
Last synced: 04 Feb 2026
https://github.com/sobelio/llm-chain
`llm-chain` is a powerful rust crate for building chains in large language models allowing you to summarise text and complete complex tasks
chatgpt langchain llama llm openai rust text-summary
Last synced: 04 Feb 2026
https://github.com/ollama/ollama
Get up and running with GLM-4.7, DeepSeek, gpt-oss, Qwen, Gemma and other models.
deepseek gemma gemma3 gemma3n go golang gpt-oss llama llama2 llama3 llava llm llms mistral ollama phi4 qwen
Last synced: 04 Feb 2026
https://github.com/open-compass/opencompass
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
benchmark chatgpt evaluation large-language-model llama2 llama3 llm openai
Last synced: 04 Feb 2026
https://github.com/run-llama/llama_cloud_services
Knowledge Agents and Management in the Cloud
document document-parser document-parsing docx-to-markdown parsing pdf pdf-document-processor pdf-to-excel pdf-to-json pdf-to-markdown pdf-to-text ppt-to-json ppt-to-markdown pptx structured-data tables
Last synced: 04 Feb 2026
https://github.com/ludwig-ai/ludwig
Low-code framework for building custom LLMs, neural networks, and other AI models
computer-vision data-centric data-science deep deep-learning deeplearning fine-tuning learning llama llama2 llm llm-training machine-learning machinelearning mistral ml natural-language natural-language-processing neural-network pytorch
Last synced: 04 Feb 2026
https://github.com/floneum/floneum
Instant, controllable, local pre-trained AI models in Rust
ai candle constrained-generation dioxus floneum-v3 kalosm llama llamacpp llm mistral rust transcription whisper
Last synced: 04 Feb 2026
https://github.com/twinnydotdev/twinny
The most no-nonsense, locally or API-hosted AI code completion plugin for Visual Studio Code - like GitHub Copilot but 100% free.
artificial-intelligence code-chat code-completion code-generation codellama copilot free llama2 llamacpp ollama ollama-api ollama-chat private symmetry vscode-extension
Last synced: 04 Feb 2026
https://github.com/TheR1D/shell_gpt
A command-line productivity tool powered by AI large language models like GPT-4 that helps you accomplish your tasks faster and more efficiently.
chatgpt cheat-sheet cli commands gpt-3 gpt-4 linux llama llm ollama openai productivity python shell terminal
Last synced: 04 Feb 2026
https://github.com/explosion/curated-transformers
🤖 A PyTorch library of curated Transformer models and their composable components
albert bert camembert dolly2 falcon gptneox llama llm llms nlp pytorch roberta transformer transformers xlm-roberta
Last synced: 04 Feb 2026
https://github.com/bentoml/OpenLLM
Run any open-source LLMs, such as DeepSeek and Llama, as OpenAI compatible API endpoint in the cloud.
bentoml fine-tuning llama llama2 llama3-1 llama3-2 llama3-2-vision llm llm-inference llm-ops llm-serving llmops mistral mlops model-inference open-source-llm openllm vicuna
Last synced: 04 Feb 2026
https://github.com/withcatai/node-llama-cpp
Run AI models locally on your machine with node.js bindings for llama.cpp. Enforce a JSON schema on the model output on the generation level
ai bindings catai cmake cmake-js cuda embedding function-calling gguf gpu grammar json-schema llama llama-cpp llm metal nodejs prebuilt-binaries self-hosted vulkan
Last synced: 04 Feb 2026
https://github.com/tenstorrent/tt-metal
:metal: TT-NN operator library, and TT-Metalium low level kernel programming model.
accelerator ai cuda deepseek gpu img-gen kernels llama llm metal scale-out stable-diffusion tenstorrent video-gen
Last synced: 04 Feb 2026
https://github.com/meta-llama/llama-cookbook
Welcome to the Llama Cookbook! This is your go to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We also show you how to solve end to end problems using Llama model family and using them on various provider services
ai finetuning langchain llama llama2 llm machine-learning python pytorch vllm
Last synced: 04 Feb 2026
https://github.com/langroid/langroid
Harness LLMs with Multi-Agent Programming
agents ai chatgpt function-calling gpt gpt-4 gpt4 information-retrieval language-model llama llm llm-agent llm-framework local-llm multi-agent-systems openai-api rag retrieval-augmented-generation
Last synced: 04 Feb 2026
https://github.com/predibase/lorax
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
fine-tuning gpt llama llm llm-inference llm-serving llmops lora model-serving pytorch transformers
Last synced: 04 Feb 2026
https://github.com/SilasMarvin/lsp-ai
LSP-AI is an open-source language server that serves as a backend for AI-powered functionality, designed to assist and empower software engineers, not replace them.
ai auto-completion developer-tools ide language-client llama llamacpp llm lsp mistral openai self-hosted
Last synced: 04 Feb 2026
https://github.com/InternLM/xtuner
A Next-Generation Training Engine Built for Ultra-Large MoE Models
agent deepseek-v3 gpt-oss intern-s1 internvl kimi-k2 llm multimodal qwen3-moe qwen3-vl reinforcement-learning
Last synced: 04 Feb 2026
https://github.com/datajuicer/data-juicer
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
data data-analysis data-pipeline data-processing data-science data-visualization foundation-models instruction-tuning large-language-models llm llms multi-modal pre-training synthetic-data
Last synced: 04 Feb 2026
https://github.com/haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
chatbot chatgpt foundation-models gpt-4 instruction-tuning llama llama-2 llama2 llava multi-modality multimodal vision-language-model visual-language-learning
Last synced: 03 Feb 2026
https://github.com/explosion/spacy-llm
🦙 Integrating LLMs into structured NLP pipelines
anthropic claude cohere dolly falcon gpt-3 gpt-4 large-language-models llama llm machine-learning named-entity-recognition natural-language-processing nlp openai prompt-engineering spacy text-classification
Last synced: 04 Feb 2026
https://github.com/yoshoku/llama_cpp.rb
llama_cpp.rb provides Ruby bindings for llama.cpp
ai gem llama llm ruby
Last synced: 04 Feb 2026
https://github.com/mybigday/llama.rn
React Native binding of llama.cpp
android ios llama llama-cpp llm react-native
Last synced: 04 Feb 2026
https://github.com/shroominic/codeinterpreter-api
👾 Open source implementation of the ChatGPT Code Interpreter
chatgpt chatgpt-code-generation code-interpreter codeinterpreter langchain llm-agent
Last synced: 03 Feb 2026
https://github.com/bigscience-workshop/petals
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
bloom chatbot deep-learning distributed-systems falcon gpt guanaco language-models large-language-models llama machine-learning mixtral neural-networks nlp pipeline-parallelism pretrained-models pytorch tensor-parallelism transformer volunteer-computing
Last synced: 03 Feb 2026
https://github.com/k8sgpt-ai/k8sgpt
Giving Kubernetes Superpowers to everyone
ai devops kubernetes llama openai sre tooling
Last synced: 04 Feb 2026
https://github.com/kyegomez/zeta
Build high-performance AI models with modular building blocks
attention-mechanism attention-model chatgpt ffns llms lucidrains openai pytorch pytorch-implementation pytorch-tutorial tensorflow transformer-architecture transformers
Last synced: 03 Feb 2026
https://github.com/mdrokz/rust-llama.cpp
LLama.cpp rust bindings
api-bindings cpp crates-io ffi llama llama-cpp machine-learning model rust
Last synced: 04 Feb 2026
https://github.com/gbaptista/ollama-ai
A Ruby gem for interacting with Ollama's API that allows you to run open source AI LLMs (Large Language Models) locally.
ai alpaca bakllava dolphin llama llama2 llava llm mistral mistral-ai mixtral nano-bots ollama ollama-api openorca vicuna
Last synced: 04 Feb 2026
https://github.com/sendbird/sendbird-chat-sdk-javascript
Sendbird Chat SDK for JavaScript.
api-for-chat bard chat-api chat-api-platform chat-platform chat-sdk chatbot-api chatbot-sdk chatgpt communications-platform genai-chatbot genai-chatbot-api genai-chatbot-sdk gpt-powered-chatbot instant-messaging-api llama2 messaging-api messaging-platform messaging-sdk palm2
Last synced: 03 Feb 2026
https://github.com/mishushakov/llm-scraper
Turn any webpage into structured data using LLMs
ai artificial-intelligence browser browser-automation gpt gpt-4 langchain llama llm openai playwright puppeteer scraper
Last synced: 04 Feb 2026
https://github.com/h2oai/h2ogpt
Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/
ai chatgpt embeddings fedramp generative gpt gpt4all llama2 llm mixtral pdf private privategpt vectorstore
Last synced: 04 Feb 2026
https://github.com/sendbird/sendbird-uikit-react-native
Build chat in minutes with Sendbird UIKit open source code.
api-for-chat bard chat-api chat-api-platform chat-platform chat-sdk chat-ui chatbot-api chatbot-ui chatgpt communications-platform genai-chatbot genai-chatbot-api gpt-powered-chatbot gpt-ui llama2 messaging-api messaging-platform messaging-sdk palm2
Last synced: 04 Feb 2026
https://github.com/smallcloudai/refact
AI Agent that handles engineering tasks end-to-end: integrates with developers’ tools, plans, executes, and iterates until it achieves a successful result.
ai-agent developer-tools enterprise fine-tuning on-prem open-source rag self-hosted swe-bench vscode
Last synced: 04 Feb 2026
https://github.com/snowby666/poe-api-wrapper
👾 A Python API wrapper for Poe.com. With this, you will have free access to GPT-4, Claude, Llama, Gemini, Mistral and more! 🚀
api chatbot chatgpt claude code-llama dall-e gemini gpt-4 groq llama mistral openai palm2 poe poe-api python quora qwen reverse-engineering stable-diffusion
Last synced: 04 Feb 2026
https://github.com/ChunelFeng/CGraph
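[A commonly used C++ & Python DAG framework] A general-purpose, cross-platform parallel computing framework based on flow graphs, with no third-party dependencies, listed in awesome-cpp. Stars, forks, and discussion are welcome.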
ai ai-agents dag graph pipeline taskflow workflow
Last synced: 04 Feb 2026
https://github.com/tak-bro/aicommit2
A Reactive CLI that generates commit messages for Git and Jujutsu with Ollama, ChatGPT, Gemini, Claude, Mistral and other AI
ai-commits aicommit aicommits anthropic chatgpt claude cli codestral cohere deepseek git-commit groq jj jujutsu llama mistral ollama perplexity pre-commit pre-commit-hook
Last synced: 04 Feb 2026
https://github.com/expectedparrot/edsl
Design, conduct and analyze results of AI-powered surveys and experiments. Simulate social science and market research with large numbers of AI agents and LLMs.
anthropic data-labeling deepinfra domain-specific-language experiments llama2 llm llm-agent llm-framework llm-inference market-research mixtral open-source openai python social-science surveys synthetic-data
Last synced: 04 Feb 2026
https://github.com/unifyai/unify
Notion for AI Observability 📊
ai claude gpt gpt-4 llama2 llm llm-inference llms mixtral openai python
Last synced: 03 Feb 2026
https://github.com/eidolon-ai/eidolon
The first AI Agent Server, Eidolon is a pluggable Agent SDK and enterprise ready, deployment server for Agentic applications
agents generative-ai langchain llama llm openai python services
Last synced: 04 Feb 2026
https://github.com/pleisto/flappy
Production-Ready LLM Agent SDK for Every Developer
agent chatgpt generative-ai llama llm rewoo transformers
Last synced: 04 Feb 2026
https://github.com/tongjilibo/bert4torch
An elegant PyTorch implementation of transformers
belle bert bert4keras bert4torch chatglm large-language-models llama llm named-entity-recognition nlp pytorch relation-extraction seq2seq text-classification transformers
Last synced: 04 Feb 2026
https://github.com/ngxson/wllama
WebAssembly binding for llama.cpp - Enabling on-browser LLM inference
llama llamacpp llm wasm webassembly
Last synced: 04 Feb 2026
https://github.com/aandrew-me/tgpt
AI Chatbots in terminal without needing API keys
ai chatbot chatgpt cli go golang gpt4 linux llama macos mixtral terminal windows
Last synced: 03 Feb 2026
https://github.com/atome-fe/llama-node
Believe in AI democratization. Llama for Node.js, backed by llama-rs, llama.cpp and rwkv.cpp, runs locally on your laptop CPU. Supports llama/alpaca/gpt4all/vicuna/rwkv models.
ai embeddings gpt langchain large-language-models llama llama-node llama-rs llamacpp llm napi napi-rs nodejs rwkv
Last synced: 03 Feb 2026
https://github.com/Noeda/rllama
Rust+OpenCL+AVX2 implementation of LLaMA inference code
Last synced: 04 Feb 2026
https://github.com/zya/litellmjs
JavaScript implementation of LiteLLM.
javascript llama2 llm nodejs ollama openai
Last synced: 03 Feb 2026
https://github.com/Gimer-Studios/APIMyLlama
API up your Ollama Server.
ai easy-to-use ollama ollama-api open-source
Last synced: 04 Feb 2026
https://github.com/ad-si/cai
User friendly CLI tool for AI tasks. Stop thinking about LLMs and prompts, start getting results!
ai anthropic chatgpt claude cli gpt gpt-4o gpt-5 groq llama llama3 llamafile llm machine-learning mistral ml ollama openai prompt rust
Last synced: 03 Feb 2026
https://github.com/melih-unsal/DemoGPT
🤖 Everything you need to create an LLM Agent—tools, prompts, frameworks, and models—all in one place.
agent agents ai artificial-intelligence autogpt autonomous-agents chatgpt chatgpt-api deepseek demo gpt-4 langchain langchain-app langchain-python llms o1 openai python streamlit streamlit-application
Last synced: 03 Feb 2026
https://github.com/icebaker/ruby-nano-bots
Ruby Implementation of Nano Bots: small, AI-powered bots that can be easily shared as a single file, designed to support multiple providers such as Anthropic Claude, Cohere Command, Google Gemini, Maritaca AI, Mistral AI, Ollama, OpenAI ChatGPT, and others, with support for calling tools (functions).
anthropic anthropic-claude chatgpt cohere-ai gemini gemini-pro google-ai google-gemini google-vertex-ai gpt gpt-4 llama llm maritaca-ai mistral mistral-ai nano-bots ollama openai openai-api
Last synced: 03 Feb 2026
https://github.com/bolna-ai/bolna
Conversational voice AI agents
agentic-ai agents ai-agents cartesia conversational-ai deepgram deepseek deepseek-chat elevenlabs function-calling gpt-4 llama openai plivo twilio voice-agents voice-ai-agents voice-assistant whisper
Last synced: 04 Feb 2026
https://github.com/Picovoice/picollm
On-device LLM Inference Powered by X-Bit Quantization
compression efficient-inference gemma generative-ai language-model language-models large-language-model llama llama2 llama3 llm llm-inference llms mistral mixtral model-compression natural-language-processing quantization self-hosted
Last synced: 04 Feb 2026
https://github.com/zjunlp/EasyEdit
[ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.
artificial-intelligence baichuan chatgpt cknowedit easyedit easyedit2 efficient gpt knowedit knowledge-editing knowlm large-language-models llama mmedit model-editing natural-language-processing safeedit tool trustworthy-ai unlearning
Last synced: 04 Feb 2026
https://github.com/artitw/text2text
Text2Text Language Modeling Toolkit
chatbot chatgpt cross-lingual embeddings information-retrieval levenshtein-distance llama llm multi-lingual nlp question-generation rag search tf-idf tokenizer transformers translator
Last synced: 04 Feb 2026
https://github.com/Tiiny-AI/PowerInfer
High-speed Large Language Model Serving for Local Deployment
large-language-models llama llm llm-inference local-inference
Last synced: 04 Feb 2026
https://github.com/10Nates/ollama-autocoder
A simple to use Ollama autocompletion engine with options exposed and streaming functionality
ai autocomplete copilot llama llm ollama ollama-interface vscode vscode-extension
Last synced: 04 Feb 2026
https://github.com/belladoreai/llama3-tokenizer-js
JS tokenizer for LLaMA 3 and LLaMA 3.1
llama llama3 llm tokenizer
Last synced: 04 Feb 2026
https://github.com/zhudotexe/kani
kani (カニ) is a highly hackable microframework for tool-calling language models. (NLP-OSS @ EMNLP 2023)
chatgpt framework function-calling gpt-4 large-language-models llama llms microframework openai tool-use
Last synced: 04 Feb 2026
https://github.com/livingbio/fuzzy-json
Fuzzy-JSON is a compact Python package with no dependencies, designed to address the pesky JSONDecodeError that sometimes occurs when utilizing OpenAI's powerful call function.
json llama llm openai openai-chatgpt python
Last synced: 04 Feb 2026
https://github.com/aidatatools/ollama-benchmark
LLM Benchmark for Throughput via Ollama (Local LLMs)
ai ai-tools benchmark llm ollama
Last synced: 04 Feb 2026
https://github.com/datawhalechina/self-llm
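"A Guide to Consuming Open-Source Large Models": tutorials tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying domestic and international open-source large language models (LLMs) and multimodal large models (MLLMs) in a Linux environment.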
chatglm chatglm3 gemma-2b-it glm-4 internlm2 llama3 llm lora minicpm q-wen qwen qwen1-5 qwen2
Last synced: 04 Feb 2026
https://github.com/liltom-eth/llama2-webui
Run any Llama 2 locally with gradio UI on GPU or CPU from anywhere (Linux/Windows/Mac). Use `llama2-wrapper` as your local llama2 backend for Generative Agents/Apps.
llama-2 llama2 llm llm-inference
Last synced: 03 Feb 2026
https://github.com/karpathy/llama2.c
Inference Llama 2 in one file of pure C
Last synced: 03 Feb 2026
https://github.com/mybigday/llama.node
Node.js binding of llama.cpp
llama llama-cpp llamacpp llm node-js nodejs
Last synced: 03 Feb 2026
https://github.com/Simatwa/python-tgpt
AI Chat in Terminal + Package + REST-API
ai blackboxai chatgp chatgpt fastapi gemini gpt koboldai llama llama2 novita openai perplexity poe python-tgpt terminal-gpt tgpt
Last synced: 04 Feb 2026
https://github.com/lenML/tokenizers
a lightweight no-dependency fork from transformers.js (only tokenizers)
baichuan chatglm chatgpt gpt4 llama2 llama3 mistral tokenizer transfomers
Last synced: 03 Feb 2026
https://github.com/axflow/axflow
The TypeScript framework for AI development
ai llm typescript
Last synced: 04 Feb 2026
https://github.com/fardjad/rs-llama-cpp
Automated Rust bindings generation for LLaMA.cpp
Last synced: 04 Feb 2026
https://github.com/Strvm/meta-ai-api
Llama 3 API 70B & 405B (MetaAI Reverse Engineered)
405b 70b ai api llama llama2 llama3 meta
Last synced: 04 Feb 2026
Statistics
- Projects: 2,239
- Last updated: over 1 year ago