A Node.js/TypeScript implementation of microsoft/autogen, providing a framework for building multi-agent AI systems with conversational agents.
This project brings the multi-agent orchestration capabilities of Microsoft's AutoGen framework to the Node.js ecosystem. Its design follows the .NET code structure and class definitions, so the API should feel familiar to developers who already use AutoGen in other languages.
- Event-Driven Architecture (AutoGen v0.4): Asynchronous message passing and distributed agent systems
  - AgentRuntime: Core runtime for hosting and managing agents
  - Direct Messaging: Send messages between agents asynchronously
  - Publish/Subscribe: Topic-based broadcast messaging
  - Cancellation Tokens: Control and cancel async operations
  - State Management: Persist and restore runtime state
- Base Agent Framework: Core interfaces and abstract classes for building custom agents
- Multiple LLM Providers: Support for OpenAI, OpenRouter, Ollama, Anthropic, and Google Gemini
  - OpenAI: GPT-3.5, GPT-4, and other OpenAI models
  - Anthropic: Claude 3.5 Sonnet, Claude 3 Opus, and Claude 3 Haiku
  - Google Gemini: Gemini 1.5 Flash, Gemini 1.5 Pro, and Gemini Pro
  - OpenRouter: Access to 100+ models from multiple providers
  - Ollama: Run LLMs locally for privacy and offline use
- AssistantAgent: LLM-powered conversational agent with provider flexibility
- UserProxyAgent: Human-in-the-loop agent for interactive conversations
- PlannerAgent: Planning agent that breaks down requirements into structured tasks
- SupervisorAgent: Supervisor agent that verifies task completion and provides feedback
- Group Chat: Multi-agent collaboration system for complex tasks
- Function Calling: Register and execute custom functions with agents
- Code Execution: Automatically execute code generated by agents (JavaScript, Python, Bash)
  - LocalCodeExecutor: Execute code on the local machine
  - DockerCodeExecutor: Execute code safely in isolated Docker containers
- Memory System: Persistent memory for agents to maintain context across conversations (based on Microsoft AutoGen)
- Type-Safe: Built with TypeScript for enhanced developer experience
- Flexible Message System: Support for different message types and roles
- Conversation Management: Built-in conversation history and state management
- Advanced Conversation Patterns: Complete implementation of AutoGen patterns
  - Nested Chat: Hierarchical conversations with task delegation
  - Sequential Chat: Predefined workflow execution
  - Speaker Selection: Multiple strategies (Round-robin, Random, Manual, Constrained, Auto/LLM-based)
  - Swarm Mode: Dynamic multi-agent task distribution and collaboration
- Tools & Extensions: Comprehensive toolset for agent capabilities
  - File System Tools: Safe file read/write and directory operations
  - Browser Tools: Web automation with Playwright (scraping, screenshots, interaction)
  - API Tools: REST and GraphQL API call wrappers
  - Image Generation: DALL-E integration for AI image generation
  - Database Tools: SQL/NoSQL database connection interfaces
  - Tool Caching: Result caching with multiple eviction strategies
npm install
import { AssistantAgent, UserProxyAgent, HumanInputMode } from './src/index';
// Create an AI assistant
const assistant = new AssistantAgent({
name: 'assistant',
provider: 'openai', // optional, this is the default
apiKey: process.env.OPENAI_API_KEY!,
systemMessage: 'You are a helpful assistant.',
model: 'gpt-3.5-turbo',
temperature: 0
});
// Create a user proxy for human interaction
const userProxy = new UserProxyAgent({
name: 'user',
humanInputMode: HumanInputMode.ALWAYS
});
// Start a conversation
await userProxy.initiateChat(
assistant,
'Hello! Can you help me?',
10 // max rounds
);
const assistant = new AssistantAgent({
name: 'assistant',
provider: 'openrouter',
apiKey: process.env.OPENROUTER_API_KEY!,
model: 'anthropic/claude-2',
temperature: 0.7
});
const assistant = new AssistantAgent({
name: 'assistant',
provider: 'anthropic',
apiKey: process.env.ANTHROPIC_API_KEY!,
model: 'claude-3-5-sonnet-20241022',
temperature: 0.7
});
const assistant = new AssistantAgent({
name: 'assistant',
provider: 'gemini',
apiKey: process.env.GEMINI_API_KEY!,
model: 'gemini-1.5-flash',
temperature: 0.7
});
const assistant = new AssistantAgent({
name: 'assistant',
provider: 'ollama',
model: 'llama2',
temperature: 0.7
});
See LLM_PROVIDERS.md for detailed provider documentation.
autogen_node/
├── src/
│ ├── core/ # Core interfaces and base classes
│ │ ├── IAgent.ts # Agent interface definitions
│ │ ├── BaseAgent.ts # Base agent implementation
│ │ ├── IFunctionCall.ts # Function calling interfaces
│ │ ├── FunctionContract.ts # Function contract builder
│ │ ├── FunctionCallMiddleware.ts # Function execution middleware
│ │ └── ICodeExecutor.ts # Code execution interface
│ ├── agents/ # Agent implementations
│ │ ├── AssistantAgent.ts # LLM-powered assistant with function calling
│ │ └── UserProxyAgent.ts # Human proxy with code execution
│ ├── executors/ # Code execution implementations
│ │ └── LocalCodeExecutor.ts # Local code executor
│ ├── providers/ # LLM provider implementations
│ │ ├── OpenAIProvider.ts
│ │ ├── OpenRouterProvider.ts
│ │ └── OllamaProvider.ts
│ ├── examples/ # Example applications
│ │ ├── basic-chat.ts
│ │ ├── function-calling-example.ts
│ │ └── code-execution-example.ts
│ └── index.ts # Main export file
├── dist/ # Compiled JavaScript output
├── package.json
├── tsconfig.json
└── README.md
This implementation follows the AutoGen architecture with both traditional and event-driven patterns:
The new event-driven architecture enables scalable, distributed multi-agent systems:
- AgentRuntime: Core runtime for hosting and managing agents
  - sendMessage(): Direct asynchronous message passing
  - publishMessage(): Topic-based broadcast messaging
  - Agent registration and lifecycle management
  - State persistence and restoration
- AgentId & TopicId: Distributed agent addressing
  - Unique identification for agents across processes
  - Topic-based message routing
- CancellationToken: Async operation control
  - Cancel long-running operations
  - Cleanup on cancellation
See EVENT_DRIVEN.md for detailed documentation.
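The cancellation behavior listed above (cancel long-running operations, clean up on cancellation) can be illustrated with a minimal sketch. SimpleCancellationToken and delay are hypothetical names written for this illustration; they are not the library's CancellationToken API, which is described in EVENT_DRIVEN.md.

```typescript
// Hypothetical sketch: not the library's CancellationToken implementation.
class SimpleCancellationToken {
  private cancelled = false;
  private readonly callbacks: Array<() => void> = [];

  get isCancelled(): boolean {
    return this.cancelled;
  }

  // Register cleanup work; it runs immediately if the token is already cancelled.
  onCancel(callback: () => void): void {
    if (this.cancelled) {
      callback();
    } else {
      this.callbacks.push(callback);
    }
  }

  // Flip the flag once and run every registered callback exactly once.
  cancel(): void {
    if (this.cancelled) return;
    this.cancelled = true;
    for (const callback of this.callbacks.splice(0)) callback();
  }
}

// A cancellable wait: rejects and clears its timer when the token is cancelled.
function delay(ms: number, token: SimpleCancellationToken): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    if (token.isCancelled) {
      reject(new Error('cancelled'));
      return;
    }
    const timer = setTimeout(() => resolve(), ms);
    token.onCancel(() => {
      clearTimeout(timer);
      reject(new Error('cancelled'));
    });
  });
}
```

A long-running operation would accept the token, check isCancelled between steps, and register any cleanup through onCancel.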
- IAgent Interface: Defines the contract for all agents
  - generateReply(): Generate responses to messages
  - getName(): Get the agent's name
- BaseAgent: Abstract base class providing:
  - Conversation history management
  - Message sending and receiving
  - Chat initiation logic
  - Termination detection
- Agent Implementations:
  - AssistantAgent: Uses LLM providers for intelligent responses with function calling support
  - UserProxyAgent: Facilitates human interaction with configurable input modes and code execution
- Function Calling: Enable agents to call custom functions
  - Define functions with FunctionContract
  - Automatic function execution via FunctionCallMiddleware
  - OpenAI-compatible function definitions
- Code Execution: Execute code generated by agents
  - LocalCodeExecutor for JavaScript, Python, and Bash
  - Automatic code extraction from markdown code blocks
  - Safe execution in temporary directories
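Extracting code from markdown code blocks, mentioned in the list above, could look roughly like the following. This is a sketch under assumptions: extractCodeBlocks and CodeBlock are hypothetical names, and the library's own extraction logic may differ (for example in how it treats unterminated fences).

```typescript
// Hypothetical helper: illustrative only, not part of the library's API.
interface CodeBlock {
  language: string; // fence info string, or '' when absent
  code: string;
}

// The fence marker is built at runtime so this sketch can itself sit inside a fence.
const FENCE = '`'.repeat(3);

function extractCodeBlocks(markdown: string): CodeBlock[] {
  // Opening fence + optional language tag, lazily captured body, closing fence.
  const pattern = new RegExp(
    `${FENCE}([\\w+-]*)[ \\t]*\\r?\\n([\\s\\S]*?)${FENCE}`,
    'g'
  );
  const blocks: CodeBlock[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    blocks.push({ language: match[1].toLowerCase(), code: match[2].trim() });
  }
  return blocks;
}
```

An executor could then dispatch each block on its language field and skip languages it does not support.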
Messages follow a structured format:
interface IMessage {
content: string;
role: 'user' | 'assistant' | 'system' | 'function' | 'tool';
name?: string;
functionCall?: {
name: string;
arguments: string;
};
toolCalls?: Array<{
id: string;
type: 'function';
function: {
name: string;
arguments: string;
};
}>;
toolCallId?: string;
}
Create a .env file in the project root:
OPENAI_API_KEY=your_openai_api_key_here
ANTHROPIC_API_KEY=your_anthropic_api_key_here # Optional
GEMINI_API_KEY=your_gemini_api_key_here # Optional
OPENROUTER_API_KEY=your_openrouter_api_key_here # Optional
# OLLAMA_BASE_URL=http://localhost:11434/v1 # Optional
# Build the project
npm run build
# Run the basic interactive example (OpenAI)
npm run example:basic
# Run the automated two-agent conversation (OpenAI)
npm run example:auto
# Run the group chat example (OpenAI)
npm run example:group
# Run Anthropic Claude example
npm run example:anthropic
# Run Google Gemini example
npm run example:gemini
# Run OpenRouter example
npm run example:openrouter
# Run Ollama example (local LLM)
npm run example:ollama
# Run Ollama file organizer example (automatic file renaming and tagging)
npm run example:ollama-organizer
# Run planner-supervisor example (task planning and verification with Ollama)
npm run example:planner-supervisor
# Run GitHub AI search example (web search and API tools with Ollama)
npm run example:github-ai-search
# Run function calling example
npm run example:functions
# Run code execution example
npm run example:code
# Run memory example
npm run example:memory
# Run nested chat example
npm run example:nested
# Run sequential chat example
npm run example:sequential
# Run speaker selection strategies example
npm run example:speaker
# Run swarm mode example
npm run example:swarm
# Run event-driven architecture example (AutoGen v0.4)
npm run example:events
# Run tools examples
npm run example:filesystem # File system operations
npm run example:browser # Web automation with Playwright
npm run example:api # REST/GraphQL API calls
npm run example:docker # Docker code execution
npm run example:image # Image generation with DALL-E
# Run tests
npm test
# Run tests with coverage
npm run test:coverage
# Development mode with auto-reload
npm run dev
# Clean build artifacts
npm run clean
autogen_node provides a comprehensive set of tools and extensions to enhance agent capabilities:
Safe file and directory operations with security restrictions:
import { FileSystemTool, AssistantAgent } from 'autogen_node';
const fsTool = new FileSystemTool({
basePath: '/safe/directory',
allowedExtensions: ['.txt', '.md', '.json']
});
// Create function contracts for agents
const functions = FileSystemTool.createFunctionContracts(fsTool);
const assistant = new AssistantAgent({
name: 'file_assistant',
functions,
// ... other config
});
Available Operations:
- read_file - Read file contents
- write_file - Write content to a file
- list_directory - List directory contents
- create_directory - Create new directories
- delete_file - Delete files
- file_exists - Check if file/directory exists
- rename_file - Rename or move files to new locations
Example Use Case: See ollama-file-organizer-example.ts for a file organization system that uses an LLM to analyze file content, suggest categories, and automatically move files into appropriate folders with descriptive names.
Web automation using Playwright:
import { BrowserTool } from 'autogen_node';
const browser = new BrowserTool({ headless: true });
await browser.navigate('https://example.com');
const text = await browser.getText('h1');
await browser.screenshot({ path: 'screenshot.png' });
// Use with agents
const functions = BrowserTool.createFunctionContracts(browser);
Safe code execution in isolated containers:
import { DockerCodeExecutor } from 'autogen_node';
const executor = new DockerCodeExecutor({ timeout: 30000 });
const result = await executor.executeCode(
'console.log("Hello from Docker!");',
'javascript'
);
REST and GraphQL API wrapper:
import { APITool } from 'autogen_node';
const apiTool = new APITool({
baseURL: 'https://api.example.com'
});
const data = await apiTool.get('/users');
const result = await apiTool.graphql(query);
AI image generation with DALL-E:
import { ImageGenerationTool } from 'autogen_node';
const imageTool = new ImageGenerationTool({
openaiApiKey: process.env.OPENAI_API_KEY
});
const imageUrls = await imageTool.generateImage(
'A serene landscape with mountains',
{ size: '1024x1024', quality: 'hd' }
);
Cache expensive tool operations:
import { ToolCache, CacheStrategy } from 'autogen_node';
const cache = new ToolCache({
maxSize: 100,
defaultTTL: 5 * 60 * 1000,
strategy: CacheStrategy.LRU
});
const cachedFn = cache.wrap('expensiveOp', asyncFunction);
For detailed documentation on all tools, see TOOLS.md.
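To make the caching idea above concrete, here is a minimal sketch of a size-bounded cache with LRU eviction and a per-entry TTL. TinyLruCache is a hypothetical class written for this illustration, not the library's ToolCache; the injectable clock argument is likewise an assumption, added only to make expiry easy to test.

```typescript
// Hypothetical sketch: not the library's ToolCache implementation.
interface Entry<V> {
  value: V;
  expiresAt: number; // epoch milliseconds
}

class TinyLruCache<V> {
  private readonly entries = new Map<string, Entry<V>>();
  private readonly maxSize: number;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(maxSize: number, ttlMs: number, now: () => number = () => Date.now()) {
    this.maxSize = maxSize;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired entries are dropped on read
      return undefined;
    }
    // Re-insert to mark as most recently used (Map preserves insertion order).
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    if (this.entries.size >= this.maxSize) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  // Wrap an async function so repeated calls with the same arguments hit the cache.
  // Results equal to undefined are not cached in this sketch.
  wrap<A extends unknown[]>(
    name: string,
    fn: (...args: A) => Promise<V>
  ): (...args: A) => Promise<V> {
    return async (...args: A) => {
      const key = `${name}:${JSON.stringify(args)}`;
      const cached = this.get(key);
      if (cached !== undefined) return cached;
      const result = await fn(...args);
      this.set(key, result);
      return result;
    };
  }
}
```

Relying on Map insertion order keeps both lookup and eviction simple; other eviction strategies would need additional bookkeeping such as access counts.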
import {
AgentId,
TopicId,
SingleThreadedAgentRuntime,
createSubscription,
} from './src/index';
// Create event-driven agent
class EventAgent {
async handleMessage(message: any, sender: AgentId | null) {
return {
role: 'assistant',
content: `Processed: ${message.content}`,
};
}
}
// Create runtime and register agents
const runtime = new SingleThreadedAgentRuntime();
const agent = new EventAgent();
const agentId = new AgentId('event_agent', 'agent1');
await runtime.registerAgentInstance(agent, agentId);
// Direct message passing
const response = await runtime.sendMessage(
{ content: 'Hello!' },
agentId
);
// Topic-based pub/sub
const topic = new TopicId('notifications', 'system');
await runtime.addSubscription(
createSubscription('sub1', topic, agentId)
);
await runtime.publishMessage(
{ content: 'Broadcast message' },
topic
);
See EVENT_DRIVEN.md for complete documentation.
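The topic-based routing used above can be pictured as a map from topic keys to subscribers. The sketch below is an illustration only: TinyBus and Handler are hypothetical names, and keying topics by "type/source" is an assumption that mirrors the two TopicId arguments rather than a description of how SingleThreadedAgentRuntime works internally.

```typescript
// Hypothetical sketch: not the library's runtime implementation.
type Handler = (message: { content: string }) => Promise<void> | void;

class TinyBus {
  private readonly subscribers = new Map<string, Map<string, Handler>>();

  // Assumption: a topic is identified by its type and source together.
  private static key(type: string, source: string): string {
    return `${type}/${source}`;
  }

  subscribe(id: string, type: string, source: string, handler: Handler): void {
    const key = TinyBus.key(type, source);
    const handlers = this.subscribers.get(key) ?? new Map<string, Handler>();
    handlers.set(id, handler);
    this.subscribers.set(key, handlers);
  }

  unsubscribe(id: string, type: string, source: string): void {
    this.subscribers.get(TinyBus.key(type, source))?.delete(id);
  }

  // Deliver to every subscriber of the topic; resolves with the delivery count.
  async publish(
    message: { content: string },
    type: string,
    source: string
  ): Promise<number> {
    const handlers = this.subscribers.get(TinyBus.key(type, source));
    if (!handlers) return 0;
    await Promise.all(Array.from(handlers.values(), (handler) => handler(message)));
    return handlers.size;
  }
}
```

Unlike a direct message, a published message returns no reply to the sender; subscribers on other topics never see it.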
import { AssistantAgent, UserProxyAgent, HumanInputMode } from './src/index';
const assistant = new AssistantAgent({
name: 'assistant',
apiKey: process.env.OPENAI_API_KEY!,
systemMessage: 'You are a helpful math tutor.',
model: 'gpt-3.5-turbo'
});
const user = new UserProxyAgent({
name: 'user',
humanInputMode: HumanInputMode.ALWAYS
});
await user.initiateChat(assistant, 'Help me solve 2x + 3 = 7', 10);
const user = new UserProxyAgent({
name: 'user',
humanInputMode: HumanInputMode.NEVER
});
// Agent will auto-reply without human intervention
import { AssistantAgent, FunctionContract } from './src/index';
// Define a weather function
const getWeather = FunctionContract.fromFunction(
'get_weather',
'Get the current weather for a location',
[
{
name: 'location',
type: 'string',
description: 'The city and state, e.g. San Francisco, CA',
required: true
}
],
async (location: string) => {
// Your weather API logic here
return `The weather in ${location} is sunny, 72°F`;
}
);
// Create assistant with functions
const assistant = new AssistantAgent({
name: 'assistant',
apiKey: process.env.OPENAI_API_KEY!,
systemMessage: 'You are a helpful assistant with access to weather data.',
model: 'gpt-3.5-turbo',
functions: [getWeather]
});
// The assistant will automatically call the function when needed.
// userProxy is a UserProxyAgent created as in the earlier examples.
await userProxy.initiateChat(assistant, "What's the weather in San Francisco?", 3);
import { AssistantAgent, UserProxyAgent, LocalCodeExecutor, HumanInputMode } from './src/index';
// Create code executor
const codeExecutor = new LocalCodeExecutor();
// Create assistant that writes code
const assistant = new AssistantAgent({
name: 'assistant',
apiKey: process.env.OPENAI_API_KEY!,
systemMessage: 'You are a coding assistant. Write code in markdown code blocks.',
model: 'gpt-3.5-turbo'
});
// Create user proxy with code execution enabled
const userProxy = new UserProxyAgent({
name: 'user_proxy',
humanInputMode: HumanInputMode.NEVER,
codeExecutor: codeExecutor,
autoExecuteCode: true
});
// The agent will write code, and it will be automatically executed
await userProxy.initiateChat(
assistant,
'Write JavaScript code to calculate the sum of numbers from 1 to 100',
3
);
await codeExecutor.cleanup();
import { AssistantAgent, GroupChat, GroupChatManager } from './src/index';
// Create multiple specialized agents
const designer = new AssistantAgent({
name: 'designer',
apiKey: process.env.OPENAI_API_KEY!,
systemMessage: 'You are a creative designer.',
model: 'gpt-3.5-turbo'
});
const engineer = new AssistantAgent({
name: 'engineer',
apiKey: process.env.OPENAI_API_KEY!,
systemMessage: 'You are a practical engineer.',
model: 'gpt-3.5-turbo'
});
// Create group chat
const groupChat = new GroupChat({
agents: [designer, engineer],
maxRound: 10
});
// Create manager
const manager = new GroupChatManager({
groupChat: groupChat
});
// Run the discussion
await manager.runChat('Design a new mobile app feature');
autogen_node implements all major conversation patterns from Microsoft AutoGen:
Delegate tasks to specialist agents:
import { AssistantAgent, supportsNestedChat } from './src/index';
const projectManager = new AssistantAgent({
name: 'project_manager',
systemMessage: 'You delegate tasks to specialists.',
apiKey: process.env.OPENAI_API_KEY!
});
const specialist = new AssistantAgent({
name: 'specialist',
systemMessage: 'You are a code review specialist.',
apiKey: process.env.OPENAI_API_KEY!
});
// Delegate task to specialist
const result = await projectManager.initiateNestedChat(
'Review this code: ...',
specialist,
{ maxRounds: 3, addToParentHistory: true }
);
Execute agents in predefined workflow order:
import { runSequentialChat, AssistantAgent } from './src/index';
const result = await runSequentialChat({
steps: [
{ agent: researcher, maxRounds: 1 },
{ agent: writer, maxRounds: 1 },
{ agent: editor, maxRounds: 1 }
],
initialMessage: 'Write an article about AI'
});
Control who speaks next in group chats:
import { AssistantAgent, GroupChat, RoundRobinSelector, RandomSelector, AutoSelector } from './src/index';
// Round-robin selection
const chat1 = new GroupChat({
agents: [agent1, agent2, agent3],
speakerSelector: new RoundRobinSelector()
});
// Random selection
const chat2 = new GroupChat({
agents: [agent1, agent2, agent3],
speakerSelector: new RandomSelector()
});
// LLM-based intelligent selection
const coordinator = new AssistantAgent({ ... });
const chat3 = new GroupChat({
agents: [agent1, agent2, agent3],
speakerSelector: new AutoSelector({ selectorAgent: coordinator })
});
Distribute tasks among agents dynamically:
import { SwarmChat, RoundRobinSelector } from './src/index';
const swarm = new SwarmChat({
agents: [researcher, writer, coder, reviewer],
maxRoundsPerTask: 3,
taskAssignmentSelector: new RoundRobinSelector()
});
const result = await swarm.run([
'Research TypeScript benefits',
'Write a tutorial',
'Create code examples',
'Review documentation'
]);
console.log(`Completed: ${result.completedTasks.length}`);
See CONVERSATION_PATTERNS.md for detailed documentation.
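Of the speaker selection strategies above, round-robin is the simplest to reason about. A minimal sketch follows, assuming a selector only needs the agent list and the previous speaker; Speaker and RoundRobin are hypothetical names, and the library's RoundRobinSelector may use a different interface.

```typescript
// Hypothetical sketch: not the library's RoundRobinSelector.
interface Speaker {
  name: string;
}

class RoundRobin {
  // Pick the agent after the last speaker, wrapping around at the end of the list.
  selectNext(agents: Speaker[], lastSpeaker?: Speaker): Speaker {
    if (agents.length === 0) throw new Error('no agents to select from');
    if (lastSpeaker === undefined) return agents[0];
    // An unknown last speaker yields index -1, so selection restarts at agents[0].
    const index = agents.findIndex((agent) => agent.name === lastSpeaker.name);
    return agents[(index + 1) % agents.length];
  }
}
```

Keeping the selector stateless means the same instance can serve several group chats; a random or LLM-based strategy could expose the same selectNext shape.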
Memory allows agents to maintain context across conversations:
import { AssistantAgent, ListMemory, MemoryMimeType } from './src/index';
// Create memory instance
const userMemory = new ListMemory({ name: 'user_preferences' });
// Add memory content
await userMemory.add({
content: 'User prefers formal language',
mimeType: MemoryMimeType.TEXT,
metadata: { timestamp: Date.now() }
});
await userMemory.add({
content: 'User is interested in TypeScript and AI',
mimeType: MemoryMimeType.TEXT
});
// Create agent with memory
const assistant = new AssistantAgent({
name: 'assistant',
provider: 'openai',
apiKey: process.env.OPENAI_API_KEY!,
memory: [userMemory]
});
// Memory is automatically injected into context
const reply = await assistant.generateReply([
{ role: 'user', content: 'What should I learn next?' }
]);
For more details, see MEMORY.md.
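How memory reaches the model is not spelled out here, so the following is only one plausible approach: rendering stored items into an extra system message placed ahead of the conversation. MemoryItem, ChatMessage, and injectMemory are hypothetical names for this sketch, and the wording of the injected message is an assumption.

```typescript
// Hypothetical sketch: the library's actual injection format may differ.
interface MemoryItem {
  content: string;
}

interface ChatMessage {
  role: 'user' | 'assistant' | 'system';
  content: string;
}

function injectMemory(messages: ChatMessage[], memory: MemoryItem[]): ChatMessage[] {
  if (memory.length === 0) return messages;
  const lines = memory.map((item, i) => `${i + 1}. ${item.content}`);
  const memoryMessage: ChatMessage = {
    role: 'system',
    content: `Relevant memory content:\n${lines.join('\n')}`,
  };
  // Keep leading system prompts first, then the memory, then the conversation.
  const firstNonSystem = messages.findIndex((m) => m.role !== 'system');
  const split = firstNonSystem === -1 ? messages.length : firstNonSystem;
  return [...messages.slice(0, split), memoryMessage, ...messages.slice(split)];
}
```

Because the function returns a new array, the stored conversation history stays free of memory text and the injection can be recomputed on every call.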
| Feature | .NET AutoGen | autogen_node |
|---|---|---|
| Base Agent Framework | ✅ | ✅ |
| AssistantAgent | ✅ | ✅ |
| UserProxyAgent | ✅ | ✅ |
| ConversableAgent | ✅ | ✅ |
| RetrieveUserProxyAgent (RAG) | ✅ | ✅ |
| GPTAssistantAgent | ✅ | ✅ |
| MultimodalConversableAgent | ✅ | ✅ |
| TeachableAgent | ✅ | ✅ |
| CompressibleAgent | ✅ | ✅ |
| SocietyOfMindAgent | ✅ | ✅ |
| OpenAI Integration | ✅ | ✅ |
| Group Chat | ✅ | ✅ |
| Multiple LLM Providers | ✅ | ✅ (OpenAI, Anthropic, Gemini, OpenRouter, Ollama) |
| Function Calling | ✅ | ✅ |
| Code Execution | ✅ | ✅ (JavaScript, Python, Bash) |
| Memory System | ✅ | ✅ (Based on Python AutoGen) |
| Event-Driven Architecture (v0.4) | ✅ | ✅ |
| AgentRuntime | ✅ | ✅ (SingleThreadedAgentRuntime) |
| Async Message Passing | ✅ | ✅ |
| Publish/Subscribe | ✅ | ✅ |
autogen_node now includes all major agent types from Microsoft AutoGen:
- ConversableAgent: Flexible agent with optional LLM integration and configurable behaviors
- RetrieveUserProxyAgent: RAG-enabled agent for document Q&A and knowledge base queries
- GPTAssistantAgent: Integration with OpenAI's Assistant API for persistent conversations
- MultimodalConversableAgent: Support for images, audio, and multimodal interactions
- TeachableAgent: Learns user preferences and provides personalized responses
- CompressibleAgent: Manages long conversations with automatic history compression
- SocietyOfMindAgent: Complex reasoning using multiple specialized inner agents
- PlannerAgent: Breaks down complex requirements into structured, executable task plans
- SupervisorAgent: Verifies task completion and ensures requirements are met with feedback loops
For detailed documentation and examples, see:
- ADVANCED_AGENTS.md - ConversableAgent, RAG, GPT Assistant, Multimodal, etc.
- PLANNER_SUPERVISOR.md - Planning and Supervision workflow (English)
- PLANNER_SUPERVISOR_CN.md - Planning and Supervision workflow (中文)
- Base agent framework
- AssistantAgent with OpenAI
- UserProxyAgent
- Group chat capabilities
- Multiple LLM provider support (OpenAI, OpenRouter, Ollama)
- Function calling support
- Code execution agent (JavaScript, Python, Bash)
- Additional LLM provider integrations (Anthropic SDK, Google Gemini)
- Memory system (ListMemory implementation)
- Event-driven architecture (AutoGen v0.4)
  - AgentRuntime interface
  - SingleThreadedAgentRuntime implementation
  - AgentId and TopicId for addressing
  - CancellationToken for async control
  - Direct message passing (sendMessage)
  - Publish/Subscribe messaging (publishMessage)
  - State persistence and management
  - Distributed runtime (multi-process/multi-machine)
- Advanced agent types
  - ConversableAgent (flexible conversable agent)
  - RetrieveUserProxyAgent (RAG support)
  - GPTAssistantAgent (OpenAI Assistant API)
  - MultimodalConversableAgent (image/audio support)
  - TeachableAgent (learning and personalization)
  - CompressibleAgent (conversation compression)
  - SocietyOfMindAgent (multi-agent reasoning)
  - PlannerAgent (task planning and decomposition)
  - SupervisorAgent (task verification and feedback)
- Advanced conversation patterns
  - Nested Chat (task delegation)
  - Sequential Chat (workflow automation)
  - Speaker Selection Strategies (Round-robin, Random, Manual, Constrained, Auto)
  - Swarm Mode (dynamic multi-agent collaboration)
- Tools & Extensions
  - File System Tools (read/write/directory operations)
  - Browser Tools (Playwright web automation)
  - Docker Code Executor (isolated code execution)
  - API Tools (REST/GraphQL wrappers)
  - Image Generation Tools (DALL-E integration)
  - Database Tools (SQL/NoSQL interfaces)
  - Tool Caching (result caching with eviction strategies)
- MCP (Model Context Protocol) Server Support
- Streaming responses
- Performance optimizations
- Additional memory backends (Vector, Database, File-based)
Contributions are welcome! This project aims to maintain feature parity with the .NET version of AutoGen while adapting to Node.js/TypeScript best practices.
MIT
This project is inspired by and based on the architecture of microsoft/autogen. Special thanks to the AutoGen team for creating such a powerful framework.
- microsoft/autogen - Original Python implementation
- microsoft/autogen (dotnet) - .NET implementation