Stitch simplifies and scales LLM application deployment, reducing infrastructure complexity and costs.
A framework for few-shot evaluation of autoregressive language models.
CodePhoenix is an application built on a system of intelligent multi-agents, designed to refactor and improve legacy code. Developed for the CodeMotion Hackathon challenge for CodeRebirth 2024, it integrates specialized microservices for code analysis, refactoring, testing, and vulnerability scanning.
Automating the deployment of the Takeoff Server on AWS for LLMs
Okik is a serving framework for deploying LLMs and much more.
Real-time streaming of LLM responses using FastAPI and Streamlit (see the sketch at the end of this list).
Building static web applications using large language models: from hand-sketched documents, images, and screenshots to proper web pages.
You can run any large language model on your local machine with this repository.
A guide on how to run LLMs on Intel CPUs
A Production-Ready, Scalable RAG-powered LLM-based Context-Aware QA App
LLM Security Platform.
Claude 3.5 Sonnet ComputerUse (Beta) for Win64
A library to benchmark LLMs via their exposed APIs
Lightweight wrapper for cortecs.ai enabling 🔵 instant provisioning
A Framework For Intelligence Farming
A production-ready REST API for vLLM
EmbeddedLLM: API server for Embedded Device Deployment. Currently supports CUDA/OpenVINO/IpexLLM/DirectML/CPU
Friendli: the fastest serving engine for generative AI
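The FastAPI + Streamlit streaming entry above describes a common pattern: the server yields tokens as they are generated and the client renders them incrementally. The sketch below shows one way to do the server side with FastAPI's StreamingResponse; it is not code from the listed repository, and the `token_stream` generator and `/generate` route are hypothetical names standing in for a real LLM client.

```python
# Minimal sketch of streaming LLM output over HTTP with FastAPI.
# token_stream is a hypothetical stand-in for a real model client.
import asyncio

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()


async def token_stream(prompt: str):
    # Stand-in for token-by-token output from an LLM; a real client would
    # yield chunks as the model produces them.
    for token in ["Streaming ", "tokens ", "as ", "they ", "arrive."]:
        await asyncio.sleep(0.1)  # simulate generation latency
        yield token


@app.get("/generate")
async def generate(prompt: str):
    # StreamingResponse forwards each yielded chunk to the client immediately,
    # so a Streamlit (or any HTTP) front end can render partial output.
    return StreamingResponse(token_stream(prompt), media_type="text/plain")
```

On the client side, a Streamlit app can consume the endpoint with a streaming HTTP request (for example, `requests.get(..., stream=True)`) and append each chunk to the page as it arrives.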