Stitch simplifies and scales LLM application deployment, reducing infrastructure complexity and costs.
Updated Jun 2, 2024 - Python
A framework for few-shot evaluation of autoregressive language models.
Deep learning environment setups
Automating the deployment of the Takeoff Server on AWS for LLMs
Okik is a serving framework for deploying LLMs and much more.
Real-time streaming of LLM responses using FastAPI and Streamlit.
Building static web applications using large language models: from hand-sketched documents, images, and screenshots to proper web pages.
A Curated Paper List for Efficient Large Models
A quick tutorial on deploying a vLLM instance using the vLLM-OpenAI Docker image.
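A minimal sketch of such a deployment, assuming an NVIDIA GPU host with the NVIDIA Container Toolkit installed; the model name is an illustrative placeholder, not taken from the repository:

```shell
# Pull and run the official vLLM OpenAI-compatible server image.
# --gpus all      : expose host GPUs to the container
# --ipc=host      : share host IPC namespace (needed for PyTorch shared memory)
# -p 8000:8000    : publish the OpenAI-compatible API port
docker run --gpus all --ipc=host -p 8000:8000 \
    vllm/vllm-openai:latest \
    --model mistralai/Mistral-7B-Instruct-v0.2

# Once up, the server speaks the OpenAI API, e.g.:
# curl http://localhost:8000/v1/models
```

The served endpoint is a drop-in replacement for the OpenAI API, so existing OpenAI client libraries can point at `http://localhost:8000/v1` unchanged.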
You can run any large language model on your local machine with this repository.
Internet LLM: access your Ollama (or any other local LLM) instance from across the internet.
LLM Security Platform Docs
A guide on running LLMs on Intel CPUs.
Lightweight and extensible LLM Inference serving benchmark tool written in Rust.
An unofficial Go port of the official Tavily API Python Wrapper.
A Production-Ready, Scalable RAG-powered LLM-based Context-Aware QA App
A library to benchmark LLMs via their API exposure