Today, we release the first open source architecture for building on-chain AI Agents. Here is the repository: https://github.com/RSS3-Network/OpenAgent.
AI Agents have been a recurring topic within the AI x Web3 domain. The thesis is that we can build AI Agents that execute various on-chain tasks from natural language instructions.
The core team at RSS3 initiated a technical investigation in late 2022. The first demonstration was presented in mid-2023, and the final architecture, a Mixture of Experts (MoE), was completed by the end of that year.
We understand that project teams in Web3 are building such agents behind closed doors, each for their own reasons. We would rather see this work happen in a distinctly open way.
Here, we open source the OpenAgent architecture, which we expect to shorten the development time of AI Agents from months to hours. Looking forward, we hope to see more agents contributed back to the repository and even deployed on chain (preferably on the RSS3 VSL :D).
Technical Design
The OpenAgent architecture defines an AI Agent with five core components, plus an optional App Server:
- Client: a user interface for interactions
- (App Server): a server to store user-related data like user info, chat history, etc. (Optional)
- Interpreter: an LLM (or multiple LLMs) to invoke the appropriate expert based on messages received
- Experts: a group of experts that are designed to complete subtasks
- Information Source: a source that experts depend upon to determine the next step
- Executor: a special type of expert to execute transactions on chain
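As a rough illustration of how these components relate, here is a minimal Python sketch. It is not the code in the repository: the class names (Agent, EchoExpert), the keyword_interpreter function, and the in-memory history list standing in for the App Server are all assumptions made for this example, and a keyword match stands in for the LLM.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Protocol


class Expert(Protocol):
    """Completes one kind of subtask."""
    name: str

    def run(self, request: str) -> str: ...


@dataclass
class EchoExpert:
    """Toy expert standing in for a real one (hypothetical)."""
    name: str

    def run(self, request: str) -> str:
        return f"[{self.name}] handled: {request}"


@dataclass
class Agent:
    # Interpreter: maps a message and the available expert names to one name.
    interpreter: Callable[[str, list[str]], str]
    experts: dict[str, Expert]
    # Optional App Server stand-in: an in-memory chat history, or None.
    history: Optional[list[str]] = None

    def handle(self, message: str) -> str:
        if self.history is not None:
            self.history.append(message)
        chosen = self.interpreter(message, list(self.experts))
        return self.experts[chosen].run(message)


def keyword_interpreter(message: str, names: list[str]) -> str:
    """Stand-in for an LLM: pick the first expert whose name appears in the message."""
    for name in names:
        if name in message.lower():
            return name
    return names[0]


agent = Agent(
    interpreter=keyword_interpreter,
    experts={"balance": EchoExpert("balance"), "transfer": EchoExpert("transfer")},
    history=[],
)
reply = agent.handle("Please transfer 1 USDT")
print(reply)
```

Setting history to None models an Agent that runs without an App Server.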
To better illustrate the process of handling a natural language request made by a user, here is a diagram:
Client
The client for such an AI agent often mimics ChatGPT, with a UI constructed primarily for a seamless conversational flow, and a history panel on the left. The OpenAgent architecture suggests that the client also adds a Task section where pending tasks can be viewed and edited.
This section makes it possible for users to check on pending transactions that are yet to be confirmed on chain, while also seeing trigger-based or scheduled tasks. A simple UI we’ve made for demonstration looks like this:
A button in the top right corner opens the Task section.
This is a typical conversation-based UI, but since OpenAgent gives you an architecture with great flexibility and extensibility, we imagine builders will come up with far more innovative ways to interact.
(App Server)
App Server refers to a server that stores client data in a database. This data includes basic information, chat history, and other client specific configurations.
Although deploying such an app server is the conventional approach for managing such data, creating an Agent without one is entirely feasible. You can build it in a local-first way where sensitive data never leaves users’ devices, thereby enhancing the system's openness, transparency, and decentralization.
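To make the local-first option concrete, here is a small sketch that keeps chat history in a SQLite database on the user's own device, using only Python's standard sqlite3 module. The LocalHistory class and its table layout are assumptions for illustration, not part of OpenAgent.

```python
import sqlite3


class LocalHistory:
    """Chat history kept in a local SQLite file (hypothetical helper)."""

    def __init__(self, path: str = ":memory:") -> None:
        # ":memory:" keeps the example self-contained; pass a file path to persist.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "role TEXT NOT NULL, content TEXT NOT NULL)"
        )

    def add(self, role: str, content: str) -> None:
        self.conn.execute(
            "INSERT INTO messages (role, content) VALUES (?, ?)", (role, content)
        )
        self.conn.commit()

    def all(self) -> list[tuple[str, str]]:
        rows = self.conn.execute("SELECT role, content FROM messages ORDER BY id")
        return list(rows.fetchall())


history = LocalHistory()
history.add("user", "hi")
history.add("agent", "hello")
print(history.all())
```

Nothing in this sketch touches the network, which is the point of the local-first approach.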
Interpreter
When a user sends a natural language request, our first step is to recognize its intent using an LLM. Yes, you can use any LLM for this as the architecture is fully composable - we recommend an open source LLM that is compatible with (conforms to) OpenAI Functions (and yes, you can use GPT 3.5/4 if that’s what you like).
The request is then routed to the appropriate experts. Collectively, this is known as the Mixture of Experts (MoE) architecture.
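The routing step might look roughly like the sketch below. Each expert is described with a JSON Schema in the style of OpenAI function calling, and the model's answer is a function name plus a JSON string of arguments. The expert names (check_balance, transfer), their parameters, and the fake_llm function are assumptions for this example; fake_llm is a hard-coded stand-in so the code runs offline, and a real Interpreter would send the schemas to an LLM instead.

```python
import json
from typing import Callable

# Expert descriptions in the JSON Schema style used by OpenAI function calling.
EXPERT_SCHEMAS = [
    {
        "name": "check_balance",
        "description": "Look up a token balance for the current user.",
        "parameters": {
            "type": "object",
            "properties": {"token": {"type": "string"}},
            "required": ["token"],
        },
    },
    {
        "name": "transfer",
        "description": "Prepare a token transfer to a recipient.",
        "parameters": {
            "type": "object",
            "properties": {
                "token": {"type": "string"},
                "amount": {"type": "number"},
                "to": {"type": "string"},
            },
            "required": ["token", "amount", "to"],
        },
    },
]


def check_balance(token: str) -> str:
    return f"balance lookup requested for {token}"


def transfer(token: str, amount: float, to: str) -> str:
    return f"transfer of {amount} {token} to {to} queued for confirmation"


EXPERTS: dict[str, Callable[..., str]] = {
    "check_balance": check_balance,
    "transfer": transfer,
}


def fake_llm(message: str, schemas: list[dict]) -> dict:
    """Hypothetical stand-in for the model call; returns a function-call shaped dict."""
    if "transfer" in message.lower():
        args = {"token": "USDT", "amount": 100, "to": "vitalik"}
        return {"name": "transfer", "arguments": json.dumps(args)}
    return {"name": "check_balance", "arguments": json.dumps({"token": "USDT"})}


def interpret(message: str) -> str:
    call = fake_llm(message, EXPERT_SCHEMAS)
    handler = EXPERTS[call["name"]]
    # Arguments arrive as a JSON string and are decoded before dispatch.
    return handler(**json.loads(call["arguments"]))


print(interpret("transfer 100 USDT to Vitalik"))
```

Because the Interpreter only sees names and schemas, adding an expert means adding one schema and one handler.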
Experts
The MoE architecture improves the overall performance of an Agent because each expert receives its own carefully crafted prompts, data sources, and interpreter parameters. The layered structure significantly enhances composability and flexibility: any expert can be integrated into the framework, so the range of possible experts is open-ended. There can be experts specializing in information processing, data analysis, on-chain swaps, arbitrage, content creation, and much more. These experts can also be chained together to complete a complex task.
One can even build a marketplace for people to build and charge for different experts (such a highly promising ecosystem!).
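Chaining experts can be sketched as simple function composition. In the example below, each expert takes a shared context dictionary and returns an updated copy. The expert names, the lookup table, and the address "0xA11CE" are placeholders invented for illustration.

```python
from typing import Callable

# An expert here is any function that takes a context dict and returns a new one.
Expert = Callable[[dict], dict]


def resolve_name(ctx: dict) -> dict:
    # Hypothetical lookup table; a real expert would query an information source.
    known = {"alice": "0xA11CE"}
    return {**ctx, "address": known[ctx["recipient"]]}


def build_transfer(ctx: dict) -> dict:
    tx = {"to": ctx["address"], "amount": ctx["amount"], "token": ctx["token"]}
    return {**ctx, "tx": tx}


def chain(*experts: Expert) -> Expert:
    """Run experts in order, feeding each one the previous expert's output."""
    def run(ctx: dict) -> dict:
        for expert in experts:
            ctx = expert(ctx)
        return ctx
    return run


pipeline = chain(resolve_name, build_transfer)
result = pipeline({"recipient": "alice", "amount": 5, "token": "USDT"})
print(result["tx"])
```

A chained pipeline is itself an Expert under this definition, so pipelines can be nested inside larger ones.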
Information Source
Some experts depend heavily on real-time on-chain data and information. In that case, an Agent needs to be connected to such data and information sources through specific experts. In the demo we have built, the RSS3 Network is the major information source. However, you can always switch to other sources to cover different usage scenarios.
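One way to keep sources swappable is to have experts depend on a small interface rather than on a specific provider. The sketch below assumes a hypothetical InformationSource protocol with a single latest_posts method, and uses an in-memory StaticSource so it runs offline; it does not call the RSS3 Network or any real API.

```python
from typing import Protocol


class InformationSource(Protocol):
    """Hypothetical interface an expert depends on."""

    def latest_posts(self, account: str) -> list[str]: ...


class StaticSource:
    """In-memory stand-in; a real source would query a network service."""

    def __init__(self, posts: dict[str, list[str]]) -> None:
        self._posts = posts

    def latest_posts(self, account: str) -> list[str]:
        return self._posts.get(account, [])


class FeedExpert:
    """Expert that answers questions about an account's feed."""

    def __init__(self, source: InformationSource) -> None:
        self.source = source

    def has_posted(self, account: str) -> bool:
        return len(self.source.latest_posts(account)) > 0


expert = FeedExpert(StaticSource({"alice": ["gm"]}))
print(expert.has_posted("alice"), expert.has_posted("bob"))
```

Switching to another source then only requires another class with the same method.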
Executor
Some experts need to execute specific tasks on chain. Such an expert needs to be able to sign transactions on behalf of a user. The default executor expert has a built-in smart contract wallet to make this possible. We expect people to build other types of executor experts, such as one that leverages ERC-4337 Account Abstraction.
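The sketch below shows one possible shape for an executor: it refuses to act until the user has confirmed, and it delegates signing to a pluggable signer. Everything here is an assumption for illustration. In particular, DemoSigner only hashes the payload so the example runs offline; it is not real transaction signing and not how the default executor's wallet works.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Protocol


class Signer(Protocol):
    """Hypothetical signing interface; a wallet would sit behind it."""

    def sign(self, payload: bytes) -> str: ...


class DemoSigner:
    """NOT real cryptography: hashes the payload so the example runs offline."""

    def __init__(self, secret: str) -> None:
        self._secret = secret.encode()

    def sign(self, payload: bytes) -> str:
        return hashlib.sha256(self._secret + payload).hexdigest()


@dataclass
class Transfer:
    to: str
    token: str
    amount: int
    confirmed: bool = False  # set once the user approves in the Client


class Executor:
    def __init__(self, signer: Signer) -> None:
        self.signer = signer

    def execute(self, tx: Transfer) -> str:
        if not tx.confirmed:
            raise PermissionError("user has not confirmed this transaction")
        body = {"to": tx.to, "token": tx.token, "amount": tx.amount}
        payload = json.dumps(body, sort_keys=True).encode()
        return self.signer.sign(payload)


executor = Executor(DemoSigner("demo-secret"))
signature = executor.execute(Transfer("0xRecipient", "USDT", 100, confirmed=True))
print(signature)
```

Keeping the signer behind an interface is what would let a different wallet type be swapped in without touching the executor logic.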
An Example
Here is one example to better explain the architecture. When a user asks to "transfer 100 USDT to Vitalik when he posts something on Farcaster", here's how the process flows:
- The user types in "transfer 100 USDT to Vitalik when he posts something on Farcaster" and hits enter in the Client.
- The App Server receives the request, checks necessary business logic in the database, and then forwards it to the Interpreter.
- The Interpreter analyzes the request to determine the most appropriate Expert(s) to handle it. It asks an Expert to check the balance of the user (Information Source involved), an Expert to convert the famous name "Vitalik" to the correct wallet address (Information Source involved), an Expert to track Vitalik’s Farcaster feed (Information Source involved), and a trigger-based Executor to handle the transfer when the condition is met.
- The Client prompts the user for confirmation.
- The Executor waits for the trigger and then completes the transaction on chain.
- The response circles back to the Client, completing the request.
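The trigger-based part of this flow can be sketched as a task queue that is checked periodically: a task runs only once the user has confirmed it and its condition is met. The TriggeredTask and TaskQueue names are assumptions, the feed is a plain list standing in for an information source, and the action returns a string instead of sending a transaction.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class TriggeredTask:
    description: str
    condition: Callable[[], bool]  # e.g. "has the account posted?"
    action: Callable[[], str]      # e.g. "send the transfer"
    confirmed: bool = False        # set when the user approves in the Client
    result: Optional[str] = None


@dataclass
class TaskQueue:
    pending: list[TriggeredTask] = field(default_factory=list)
    done: list[TriggeredTask] = field(default_factory=list)

    def tick(self) -> None:
        """Check every pending task once and run those that are ready."""
        still_pending = []
        for task in self.pending:
            if task.confirmed and task.condition():
                task.result = task.action()
                self.done.append(task)
            else:
                still_pending.append(task)
        self.pending = still_pending


feed: list[str] = []  # stand-in for the tracked Farcaster feed
queue = TaskQueue()
queue.pending.append(
    TriggeredTask(
        description="transfer 100 USDT to Vitalik when he posts something on Farcaster",
        condition=lambda: len(feed) > 0,
        action=lambda: "transfer submitted",
        confirmed=True,
    )
)

queue.tick()  # no post yet, so the task stays pending
pending_after_first_tick = len(queue.pending)
feed.append("a new post")
queue.tick()  # the trigger fires and the task moves to done
print(queue.done[0].result)
```

The pending and done lists correspond to what the Task section in the Client would display.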
Conclusion
The above explains the OpenAgent architecture in detail with an example. We believe this is by far the most advanced yet practical way to build on-chain AI Agents for the whole Web3 ecosystem. While it is by no means the ultimate solution, we expect it to be a foundation and an inspiration for much greater things to come.
We believe that Web3 AI needs to be expedited - we need AI with open architectures to grow and thrive in order to fend off proprietary and closed AIs. That’s why, instead of keeping it to ourselves after months of research and investment, we decided to share it with the whole world.
The RSS3 Foundation will always be here to fight for Open Information and the Open Web.
Love you all.
Demo Screenshots
A demo we built on the OpenAgent architecture, with a few simple Experts we made for fun.