teapot

How We Built Teapot AI: An Overview of Our Browser-Based Private LLM Agent

Check out our demo at: https://teapotai.com/chat

Introduction

We’re excited to introduce Teapot AI, an innovative browser-based language model agent that prioritizes privacy and runs entirely on the user’s device. This breakthrough allows for strong AI reasoning capabilities without compromising user data. Here’s a deep dive into the architecture that makes it all possible.

Goals

We created Teapot AI with a simple goal: to keep your data private and make AI affordable by running it on your own device. No more data leaks or hefty cloud fees. Pursuing that goal uncovered many interesting technical challenges.

Architecture

Teapot AI is structured into several key components that interact to provide a seamless and private AI experience (architecture diagram: ./architecture.png).

Scraper

The foundation of Teapot AI’s knowledge acquisition lies in its scraping abilities. The scraper consists of three pieces, sketched in code below:

Chrome Extension DOM (Document Object Model): This allows Teapot AI to read the content of the web page you’re viewing directly within the browser.

jsdom: A JavaScript implementation of the DOM, which is used to manipulate the content of a page as if it were being rendered, but within Node.js.

Scraping Server: To fetch data from external pages while bypassing CORS (Cross-Origin Resource Sharing) restrictions, the scraping server securely retrieves publicly available data without requiring additional permissions.
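
To make the flow concrete, here is a minimal sketch of fetching a page through a CORS-free proxy and extracting its readable text with jsdom. The proxy URL is a hypothetical placeholder, not Teapot AI’s actual scraping server.

```javascript
import { JSDOM } from "jsdom";

// Hypothetical proxy endpoint; the real scraping server's API may differ.
const SCRAPER_PROXY = "https://scraper.example.com/fetch?url=";

async function scrapePageText(url) {
  // The scraping server fetches the page server-side, sidestepping browser CORS limits.
  const res = await fetch(SCRAPER_PROXY + encodeURIComponent(url));
  const html = await res.text();

  // jsdom parses the HTML as if it were rendered, but inside Node.js.
  const { document } = new JSDOM(html).window;

  // Drop non-visible elements and return the readable text.
  document.querySelectorAll("script, style, noscript").forEach((el) => el.remove());
  return document.body.textContent.replace(/\s+/g, " ").trim();
}
```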

Once the data is scraped, Teapot AI needs to understand and find the best information related to your queries:

Nearest Neighbor Search Model: This model rapidly searches through the indexed information to find the most relevant data points in response to a query.

Vector Store: A storage system for text embeddings, enabling efficient similarity searches. It uses:

Text Chunking & Embedding: Breaking down text into manageable pieces and converting them into vector representations for better searchability.

pouchdb: An open-source database that stores the embeddings and enables quick retrieval.
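
Below is a minimal sketch of the chunk, embed, store, and search flow using Transformers.js and pouchdb. The embedding model (Xenova/all-MiniLM-L6-v2), chunk size, and database name are illustrative assumptions, not Teapot AI’s exact configuration.

```javascript
import PouchDB from "pouchdb";
import { pipeline } from "@xenova/transformers";

const db = new PouchDB("teapot-embeddings"); // illustrative database name
const embedderPromise = pipeline("feature-extraction", "Xenova/all-MiniLM-L6-v2");

// Naive fixed-size chunking; a production chunker would respect sentence boundaries.
function chunkText(text, size = 500) {
  const chunks = [];
  for (let i = 0; i < text.length; i += size) chunks.push(text.slice(i, i + size));
  return chunks;
}

async function embedText(text) {
  const embedder = await embedderPromise;
  const output = await embedder(text, { pooling: "mean", normalize: true });
  return Array.from(output.data);
}

// Store each chunk of a scraped page alongside its embedding.
async function indexDocument(id, text) {
  const chunks = chunkText(text);
  for (let i = 0; i < chunks.length; i++) {
    await db.put({ _id: `${id}-${i}`, text: chunks[i], vector: await embedText(chunks[i]) });
  }
}

// Brute-force nearest neighbor search: embeddings are normalized,
// so the dot product equals cosine similarity.
async function nearestChunks(query, k = 3) {
  const q = await embedText(query);
  const { rows } = await db.allDocs({ include_docs: true });
  return rows
    .map((r) => ({
      text: r.doc.text,
      score: r.doc.vector.reduce((sum, v, i) => sum + v * q[i], 0),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```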

Teapot Agent

At the heart of our system is the Teapot Agent, which handles the core interactions with the user:

Conversation: Manages the ongoing dialogue, maintaining the context and flow of the conversation.

Chat Context: Retains the user’s conversation history to provide relevant and coherent responses.
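
As a rough illustration, here is one way a chat context could be kept and folded into a prompt. The message cap and prompt template are assumptions for the sketch, not Teapot AI’s actual implementation.

```javascript
// Keeps the most recent turns so the prompt stays within the model's context window.
class ChatContext {
  constructor(maxMessages = 10) {
    this.maxMessages = maxMessages;
    this.messages = []; // { role: "user" | "assistant", text }
  }

  add(role, text) {
    this.messages.push({ role, text });
    if (this.messages.length > this.maxMessages) this.messages.shift();
  }

  // Fold the history and the new question into a single prompt string.
  toPrompt(question) {
    const history = this.messages
      .map((m) => `${m.role === "user" ? "User" : "Teapot"}: ${m.text}`)
      .join("\n");
    return `${history}\nUser: ${question}\nTeapot:`;
  }
}

const ctx = new ChatContext();
ctx.add("user", "What is Teapot AI?");
ctx.add("assistant", "A private, browser-based AI agent.");
console.log(ctx.toPrompt("Does it send my data to a server?"));
```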

Custom Models

To accurately interpret and respond to user queries, Teapot AI employs custom models:

Intent Model: Powered by brain.js, it determines whether the user is asking a question, requesting data scraping, or initiating social interaction (a minimal brain.js sketch appears at the end of this section).

Transformers.js

The AI’s brain is built on Transformers.js:

Fine-tuned text2text model: Specifically trained for browser efficiency, it generates relevant completions based on input data.

Embedding Model: Converts text to vector form, facilitating the nearest neighbor search for relevant information.
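
For the intent model, a minimal brain.js sketch might look like the following; the intent labels and training phrases are illustrative, not our actual training data.

```javascript
import brain from "brain.js";

// An LSTM can be trained directly on string input/output pairs.
const net = new brain.recurrent.LSTM();

net.train(
  [
    { input: "what is the capital of France", output: "question" },
    { input: "summarize this page", output: "scrape" },
    { input: "read the article I'm looking at", output: "scrape" },
    { input: "hi there, how are you", output: "social" },
    { input: "who wrote this paper", output: "question" },
  ],
  { iterations: 200 } // kept low here so training stays fast; tune for real use
);

console.log(net.run("hello, good morning")); // likely "social"
```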

Benchmarks & Key Learnings

In our tests, Teapot AI handled browser-based AI tasks efficiently and accurately. One of our key learnings has been that a smaller model running locally can significantly reduce latency while keeping user data private: we measured an average end-to-end latency of under 10 seconds on a MacBook Pro. We also found that while models like flan-t5 are not capable of robust chain-of-thought reasoning, they perform well on direct question answering when augmented with techniques like Retrieval-Augmented Generation (RAG), sketched below.

Average Latency: <10 seconds on MacBook Pro
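
As a rough sketch of the RAG-style augmentation described above, retrieved chunks can be prepended to a direct question and passed to a small text2text model via Transformers.js. The Xenova/flan-t5-small checkpoint and prompt wording are assumptions; Teapot AI’s fine-tuned model and template may differ.

```javascript
import { pipeline } from "@xenova/transformers";

async function answerWithContext(question, retrievedChunks) {
  const generator = await pipeline("text2text-generation", "Xenova/flan-t5-small");

  // Small models do better with direct question answering over retrieved context
  // than with open-ended chain-of-thought prompts.
  const prompt = `Context: ${retrievedChunks.join("\n")}\n\nQuestion: ${question}\nAnswer:`;
  const output = await generator(prompt, { max_new_tokens: 64 });
  return output[0].generated_text;
}
```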

Contribute

We believe in community-driven development and are looking to build a team focused on creating strong AI reasoning agents that respect user privacy.

Join us on Discord: Discord

Check us out on Hugging Face: Hugging Face

Join our mission to bring private, powerful AI directly to your browser!