Stitch simplifies and scales LLM application deployment, reducing infrastructure complexity and costs.
A framework for few-shot evaluation of autoregressive language models.
CodePhoenix is an application built on a system of intelligent multi-agents, designed to refactor and improve legacy code. Developed for the CodeMotion Hackathon challenge for CodeRebirth 2024, it integrates specialized microservices for code analysis, refactoring, testing, and vulnerability scanning.
Automating the deployment of the Takeoff Server on AWS for LLMs
Okik is a serving framework for deploying LLMs and much more.
Real-time streaming of LLM responses using FastAPI and Streamlit (see the sketch at the end of this list).
Building static web applications using large language models: from hand-sketched documents, images, and screenshots to proper web pages.
You can run any large language model on your local machine with this repository.
A guide on how to run LLMs on Intel CPUs
A Production-Ready, Scalable RAG-powered LLM-based Context-Aware QA App
LLM Security Platform.
Claude 3.5 Sonnet ComputerUse (Beta) for Win64
A library to benchmark LLMs via their exposed APIs
Lightweight wrapper for cortecs.ai enabling 🔵 instant provisioning
A Framework For Intelligence Farming
A production-ready REST API for vLLM
EmbeddedLLM: API server for Embedded Device Deployment. Currently supports CUDA/OpenVINO/IpexLLM/DirectML/CPU
Friendli: the fastest serving engine for generative AI
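The FastAPI + Streamlit streaming entry above describes a common pattern: the server yields tokens as they are generated and the client renders them incrementally. The sketch below shows one way to do the server side with FastAPI's StreamingResponse; it is not code from the listed repository, and the `token_stream` generator and `/generate` route are hypothetical names standing in for a real LLM client.

```python
# Minimal sketch of streaming LLM output over HTTP with FastAPI.
# token_stream is a hypothetical stand-in for a real model client.
import asyncio

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()


async def token_stream(prompt: str):
    # Stand-in for token-by-token output from an LLM; a real client would
    # yield chunks as the model produces them.
    for token in ["Streaming ", "tokens ", "as ", "they ", "arrive."]:
        await asyncio.sleep(0.1)  # simulate generation latency
        yield token


@app.get("/generate")
async def generate(prompt: str):
    # StreamingResponse forwards each yielded chunk to the client immediately,
    # so a Streamlit (or any HTTP) front end can render partial output.
    return StreamingResponse(token_stream(prompt), media_type="text/plain")
```

On the client side, a Streamlit app can consume the endpoint with a streaming HTTP request (for example, `requests.get(..., stream=True)`) and append each chunk to the page as it arrives.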