Models
Teapot models are designed to run anywhere — from local CPUs and edge devices to production-scale systems — while staying strong on general-purpose tasks like question answering, summarization, and information extraction. They’re optimized for fast, efficient, and grounded responses, making them a good fit when latency, cost, and reliability matter.
Whether you need a lightweight model for on-device inference or a larger model for higher accuracy, the Teapot family provides flexible options that excel at in-context reasoning and hallucination-resistant outputs. If you're looking to fine-tune for your specific use case (proprietary data, domain workflows, custom formatting or refusals), contact us to discuss custom training and deployment.
TinyTeapot 🫖
Edge / CPU-friendly
A lightweight grounded model designed for fast, low-latency inference that still performs strongly on in-context Q&A and hallucination-resistant extraction when given a document or passages to cite from.
TeapotLLM 🫖
Higher accuracy
The earlier, larger model in the Teapot family: stronger grounding and extraction fidelity for context-faithful Q&A, refusal behavior, and structured information extraction, at higher compute cost.
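The grounded, context-faithful behavior described above boils down to answering only from supplied passages and refusing otherwise. A minimal sketch of that pattern as a prompt template (the `build_grounded_prompt` helper and its wording are illustrative assumptions, not the models' actual prompt format):

```python
def build_grounded_prompt(context: str, question: str) -> str:
    """Assemble a grounded Q&A prompt: instruct the model to answer
    only from the supplied passage and to refuse when the answer
    is not present (hallucination-resistant behavior)."""
    return (
        "Answer the question using ONLY the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

# Example: a document to cite from plus a user question.
prompt = build_grounded_prompt(
    "The kettle reached 100 °C after four minutes.",
    "How long did the kettle take to boil?",
)
print(prompt)
```

The same template works for both models; the refusal instruction is what makes out-of-context questions fail safely instead of producing a fabricated answer.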