RoboticDevice - Autonomous Mobile Device Automation 🤖

Python 3.9+ | Node.js 18+ | Appium 3.x | Temporal Workflows | Docker Compose | License: Proprietary

Enterprise-grade robotic process automation (RPA) - Autonomous orchestration of Android mobile applications using Appium, ADB, Temporal workflows, and intelligent vision-based decision-making.

🎯 What It Does

RoboticDevice is a comprehensive automation platform for autonomous mobile device control:

  1. Mobile UI Automation - Deep-link into Android apps and automate complex user flows
  2. Temporal Workflow Orchestration - Manage resilient, retriable automation jobs with built-in error handling
  3. Vision-Based Intelligence - OCR and computer vision for adaptive automation decisions
  4. Real-time Monitoring Dashboard - Live tracking of device states, execution logs, and error diagnostics
  5. Multi-Device Support - Scale automation across multiple Android devices simultaneously

💡 Key Capabilities

✅ Automate complex Android app workflows end-to-end
✅ Handle network failures and timeouts gracefully with Temporal retries
✅ Use OCR and vision models for intelligent decision-making
✅ Direct ADB integration for performance-critical tasks
✅ Real-time dashboard monitoring with WebSocket updates
✅ Comprehensive debugging with screenshots and XML dumps
✅ Enterprise-grade logging and metrics


🏗️ Architecture

RoboticDevice is a full-stack distributed system:

┌─────────────────────────────────────────────────────┐
│         RoboticDevice Automation Platform           │
└─────────────────────────────────────────────────────┘
                          │
        ┌─────────────────┼─────────────────┐
        ▼                 ▼                 ▼
    ┌─────────┐     ┌─────────┐     ┌─────────┐
    │Frontend │     │  Core   │     │Temporal │
    │(Next.js)│     │ Backend │     │Server   │
    │ Port    │     │(FastAPI)│     │ Port    │
    │ 3000    │     │Port 8000│     │ 7233    │
    └─────────┘     └─────────┘     └─────────┘
                        │
        ┌───────────────┼───────────────┐
        ▼               ▼               ▼
    ┌─────────┐   ┌──────────┐  ┌─────────┐
    │Appium   │   │PostgreSQL│  │Redis    │
    │Server   │   │Database  │  │Cache    │
    │Port4723 │   │Port 5432 │  │Port 6379│
    └─────────┘   └──────────┘  └─────────┘
        │
        ▼
    ┌──────────────────┐
    │ Android Device   │
    │ (USB connected)  │
    └──────────────────┘

Component Overview

| Component | Purpose | Technology |
| --- | --- | --- |
| Frontend | Real-time monitoring dashboard | Next.js 15, React, WebSocket |
| Core Backend | Workflow orchestration & API | FastAPI, Python |
| Temporal Server | Distributed workflow engine | Temporal.io |
| Appium Server | Mobile device automation driver | Appium 3.x, UiAutomator2 |
| Database | Persistence & audit logs | PostgreSQL 15 |
| Cache | Job queues & state management | Redis 7+ |
| ADB Bridge | Direct OS-level device control | Android Debug Bridge |

🛠️ Core Components

1. Frontend Dashboard (frontend/)

Real-time monitoring and control interface

  • Live device status and connection monitoring
  • Workflow execution tracker with step-by-step visualization
  • Automation history with detailed audit logs
  • Error diagnostics with screenshot previews
  • OCR result inspection and manual correction
  • Task scheduling and workflow builder UI

Tech Stack: Next.js 15 (App Router), React 18, TypeScript, TailwindCSS, Radix UI, React Query, WebSocket

2. Core Backend (backend/)

Distributed automation orchestration engine

  • FastAPI REST API for workflow management
  • Temporal workflow client and activity registrations
  • Appium integration (mobile_driver_service.py)
  • ADB direct command execution (adb_client.py)
  • OCR and vision processing (reproduce_ocr.py)
  • Comprehensive error handling and retry logic

Tech Stack: FastAPI, Temporal Python SDK, Appium Python Client, ADB, SQLAlchemy, PostgreSQL async driver
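
The retry logic listed above relies on exponential backoff between attempts. As a rough illustration only (plain Python, not the Temporal SDK and not code from this repository), the sketch below computes the wait before each retry from an initial interval, a backoff coefficient, and a cap. The function name `backoff_schedule` and all default values are assumptions chosen for the example.

```python
from typing import List


def backoff_schedule(
    attempts: int,
    initial_interval: float = 1.0,
    backoff_coefficient: float = 2.0,
    maximum_interval: float = 30.0,
) -> List[float]:
    """Return the delay in seconds before each retry (hypothetical helper)."""
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    delays = []
    for retry in range(attempts):
        # Grow the delay geometrically, then clamp it to the configured cap.
        delay = initial_interval * (backoff_coefficient ** retry)
        delays.append(min(delay, maximum_interval))
    return delays


if __name__ == "__main__":
    print(backoff_schedule(6))  # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
```

A cap matters for device automation: without it, a phone that stays offline for a while would push later retries many minutes apart.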

3. Infrastructure (infrastructure/)

Complete Docker Compose setup

  • Temporal Server (orchestration engine)
  • Temporal UI (workflow monitoring)
  • PostgreSQL (database & persistence)
  • Redis (state & job queue)
  • Appium Server (device driver)

Tech Stack: Docker, Docker Compose, Temporal Cloud/Self-Hosted


🚀 Quick Start

Prerequisites

  • Docker & Docker Compose (recommended)
  • Python 3.9+ (for local development)
  • Node.js 18+ (for frontend)
  • Android Device (USB debugging enabled, connected via USB)
  • Appium Desktop (optional, for debugging)
  • Android SDK (ADB tools)

Option 1: Full Stack with Docker (Recommended)

# Clone repository
git clone https://github.com/slingvector/RoboticDevice.git
cd RoboticDevice

# Copy environment template
cp .env.example .env

# Start infrastructure (Temporal, PostgreSQL, Redis, Appium)
cd infrastructure
docker-compose up -d

# Verify services
docker-compose ps
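
Besides `docker-compose ps`, it can help to confirm that each service actually accepts TCP connections. The following is a minimal sketch using only the standard library. The port numbers are taken from the architecture diagram, while the names `SERVICE_PORTS`, `is_port_open`, and `check_services` are hypothetical and not part of this repository.

```python
import socket
from typing import Dict

# Ports as listed in the architecture diagram; adjust if your compose file differs.
SERVICE_PORTS: Dict[str, int] = {
    "temporal": 7233,
    "appium": 4723,
    "postgres": 5432,
    "redis": 6379,
}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host: str = "localhost") -> Dict[str, bool]:
    """Probe every known service port and report which ones are reachable."""
    return {name: is_port_open(host, port) for name, port in SERVICE_PORTS.items()}


if __name__ == "__main__":
    for name, ok in check_services().items():
        print(f"{name:10s} {'up' if ok else 'DOWN'}")
```

An open port only shows that something is listening. It does not prove the service is healthy, so treat this as a first check rather than a readiness probe.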

Option 2: Local Development

1. Start Infrastructure:

cd infrastructure
docker-compose up -d

2. Start Frontend:

cd frontend
npm install
npm run dev
# Runs on http://localhost:3000

3. Start Backend:

cd backend
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt
uvicorn main:app --host 0.0.0.0 --port 8000 --reload
# API Docs: http://localhost:8000/docs

4. Verify Android Device Connection:

adb devices
# Should show your device as "device" (not "offline")
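
If you want to script this check, the text printed by `adb devices` can be parsed into a serial-to-state mapping. This is a small sketch under the assumption that the output has the usual header line followed by one whitespace-separated `serial state` pair per line. The helper names `parse_adb_devices` and `ready_devices` are illustrative and not part of this repository.

```python
from typing import Dict, List


def parse_adb_devices(output: str) -> Dict[str, str]:
    """Map each device serial to its state from `adb devices` text output."""
    devices: Dict[str, str] = {}
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines, the header, and adb daemon status lines.
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices[parts[0]] = parts[1]
    return devices


def ready_devices(output: str) -> List[str]:
    """Return serials whose state is exactly "device" (connected and authorized)."""
    return [s for s, state in parse_adb_devices(output).items() if state == "device"]


if __name__ == "__main__":
    sample = "List of devices attached\nemulator-5554\tdevice\nABC123\toffline\n"
    print(ready_devices(sample))  # ['emulator-5554']
```

To feed it real data, capture the command output with `subprocess.run(["adb", "devices"], capture_output=True, text=True, check=True).stdout`.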

5. Access the Dashboard: Open http://localhost:3000 in your browser


🔐 Environment Setup

Create a .env file in the root directory:

# Database
DATABASE_URL=postgresql://automation:password@localhost:5432/robotic_db
REDIS_URL=redis://localhost:6379

# Temporal Configuration
TEMPORAL_HOST=localhost
TEMPORAL_PORT=7233
TEMPORAL_NAMESPACE=default

# Appium Configuration
APPIUM_HOST=localhost
APPIUM_PORT=4723

# Android Device
# Get the UDID from: adb devices
DEVICE_UDID=emulator-5554
# Adjust to your system
ANDROID_HOME=/opt/homebrew/Cellar/android-sdk

# Backend API
API_HOST=0.0.0.0
API_PORT=8000
API_LOG_LEVEL=INFO

# Frontend
NEXT_PUBLIC_API_URL=http://localhost:8000
NEXT_PUBLIC_WS_URL=ws://localhost:8000/ws

# Security
JWT_SECRET=your_super_secret_key
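
One way the backend could consume these variables is to read them once at startup and fail fast when a required value is missing. The sketch below is an assumption about how that might look, not the project's actual configuration code. The `Settings` class, the `load_settings` function, and the choice of which variables are required are all hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    database_url: str
    temporal_address: str
    appium_url: str
    device_udid: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables (hypothetical helper)."""
    # Treating these two as required is an assumption made for this example.
    missing = [key for key in ("DATABASE_URL", "DEVICE_UDID") if not env.get(key)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")
    temporal_host = env.get("TEMPORAL_HOST", "localhost")
    temporal_port = env.get("TEMPORAL_PORT", "7233")
    appium_host = env.get("APPIUM_HOST", "localhost")
    appium_port = env.get("APPIUM_PORT", "4723")
    return Settings(
        database_url=env["DATABASE_URL"],
        temporal_address=f"{temporal_host}:{temporal_port}",
        appium_url=f"http://{appium_host}:{appium_port}",
        device_udid=env["DEVICE_UDID"],
    )
```

Passing the environment as a mapping keeps the function easy to test with a plain dictionary.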

📊 Usage Examples

Execute a Workflow

curl -X POST http://localhost:8000/api/workflows/execute \
  -H "Content-Type: application/json" \
  -d '{
    "workflow_name": "instagram_post_automation",
    "input": {
      "account_id": "user123",
      "caption": "Check out my automation setup!"
    }
  }'
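
The same request can be issued from Python. The sketch below only uses the standard library and mirrors the curl call above: same path, same JSON body shape. The function name `build_execute_request` is hypothetical, and the shape of the server's response is not documented here, so the example prints whatever JSON comes back.

```python
import json
import urllib.request
from typing import Any, Dict


def build_execute_request(
    base_url: str, workflow_name: str, workflow_input: Dict[str, Any]
) -> urllib.request.Request:
    """Create the POST request for /api/workflows/execute without sending it."""
    body = json.dumps({"workflow_name": workflow_name, "input": workflow_input})
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/workflows/execute",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_execute_request(
        "http://localhost:8000",
        "instagram_post_automation",
        {"account_id": "user123", "caption": "Check out my automation setup!"},
    )
    # Requires the backend to be running locally.
    with urllib.request.urlopen(request, timeout=10) as response:
        print(json.load(response))
```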

Get Workflow Status

curl http://localhost:8000/api/workflows/{workflow_run_id}/status
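
Since workflows run asynchronously, a client usually polls this endpoint until the run finishes. The following is a sketch of such a loop. The status strings in `TERMINAL_STATUSES` are assumptions, because the actual values returned by the API are not listed in this README, and `wait_for_workflow` is a hypothetical helper. The HTTP call is injected as `fetch_status` so the loop itself stays independent of any client library.

```python
import time
from typing import Callable

# Assumed terminal states; replace with the values your API actually returns.
TERMINAL_STATUSES = frozenset({"completed", "failed", "cancelled"})


def wait_for_workflow(
    fetch_status: Callable[[], str],
    interval: float = 2.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until fetch_status() returns a terminal status or polls run out."""
    for _ in range(max_polls):
        status = fetch_status().lower()
        if status in TERMINAL_STATUSES:
            return status
        sleep(interval)
    raise TimeoutError(f"workflow still running after {max_polls} polls")


if __name__ == "__main__":
    # Simulated status sequence standing in for real HTTP calls.
    fake = iter(["running", "running", "completed"])
    print(wait_for_workflow(lambda: next(fake), interval=0.0))  # completed
```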

List Recent Executions

curl "http://localhost:8000/api/executions?limit=50"

📈 Workflow Examples

1. Instagram Reel Posting Flow

Automatically download a video and post it to your Instagram feed:

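from datetime import timedelta
from temporalio import workflow

with workflow.unsafe.imports_passed_through():
    import activities  # assumed module holding the project's activities

# Temporal requires @workflow.defn on a class with a @workflow.run method
@workflow.defn(name="instagram_post_workflow")
class InstagramPostWorkflow:
    @workflow.run
    async def run(self, account_id: str, video_url: str) -> None:
        opts = {"start_to_close_timeout": timedelta(minutes=5)}  # illustrative timeout

        # Download video
        video_path = await workflow.execute_activity(activities.download_video, video_url, **opts)

        # Push to device
        await workflow.execute_activity(activities.push_to_device, video_path, **opts)

        # Automate Instagram UI
        await workflow.execute_activity(activities.open_instagram, account_id, **opts)
        await workflow.execute_activity(activities.navigate_to_gallery, **opts)
        await workflow.execute_activity(activities.select_video, video_path, **opts)
        await workflow.execute_activity(activities.add_caption, "My awesome video!", **opts)
        await workflow.execute_activity(activities.post_to_feed, **opts)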

2. App Installation & Verification

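from datetime import timedelta
from temporalio import workflow
from temporalio.exceptions import ApplicationError

with workflow.unsafe.imports_passed_through():
    import activities  # assumed module holding the project's activities

@workflow.defn(name="install_and_verify_app")
class InstallAndVerifyAppWorkflow:
    @workflow.run
    async def run(self, app_package: str) -> dict:
        opts = {"start_to_close_timeout": timedelta(minutes=5)}  # illustrative timeout
        await workflow.execute_activity(activities.install_app, app_package, **opts)
        await workflow.execute_activity(activities.launch_app, app_package, **opts)

        # Use OCR to verify UI
        screenshot = await workflow.execute_activity(activities.take_screenshot, **opts)
        text = await workflow.execute_activity(activities.ocr_screenshot, screenshot, **opts)

        if "Welcome" in text:
            return {"status": "success"}
        raise ApplicationError("App verification failed")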
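
An exact substring check such as `"Welcome" in text` can be brittle, because OCR output often varies in letter case and line breaks. As a hedged sketch, the verification step could normalize the text first. The helpers `normalize_ocr_text` and `screen_contains` are hypothetical and not part of this repository.

```python
import re
from typing import Iterable


def normalize_ocr_text(text: str) -> str:
    """Lowercase and collapse whitespace so OCR line breaks do not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def screen_contains(text: str, expected: Iterable[str]) -> bool:
    """Return True if any non-empty expected phrase appears in the OCR text."""
    haystack = normalize_ocr_text(text)
    phrases = [normalize_ocr_text(p) for p in expected]
    return any(p in haystack for p in phrases if p)


if __name__ == "__main__":
    ocr_output = "  WELCOME\nback, user123 "
    print(screen_contains(ocr_output, ["Welcome back"]))  # True
```

This does not handle character-level OCR mistakes such as `We1come`. Those would need fuzzy matching, which is outside the scope of this sketch.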

🧪 Testing

Run tests:

# Backend tests
cd backend && pytest tests/ -v

# Frontend tests
cd frontend && npm test

# Workflow simulation (no device required)
cd backend && pytest tests/workflows/ --temporal-local

🔍 Debugging

View Workflow History

# Access Temporal UI (macOS; use xdg-open on Linux)
open http://localhost:8233

Check Device Screenshots

# Inspect debug directory
ls -la backend/debug/
# Look for timestamped PNG files
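
When the debug directory grows large, listing only the newest screenshots is often enough. The snippet below is a small sketch that sorts PNG files by modification time rather than by file name, since the exact timestamp format of the file names is not specified here. The function name `latest_screenshots` is hypothetical.

```python
from pathlib import Path
from typing import List


def latest_screenshots(debug_dir: str, limit: int = 5) -> List[Path]:
    """Return the most recently modified PNG files in debug_dir, newest first."""
    pngs = [p for p in Path(debug_dir).glob("*.png") if p.is_file()]
    pngs.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return pngs[:limit]


if __name__ == "__main__":
    for path in latest_screenshots("backend/debug"):
        print(path)
```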

View Backend Logs

docker-compose logs -f backend

Inspect ADB Connection

adb devices
adb logcat | grep -i "your_package_name"

📚 API Reference

| Endpoint | Method | Purpose |
| --- | --- | --- |
| /api/devices | GET | List connected devices |
| /api/workflows/execute | POST | Start a workflow |
| /api/workflows/{id}/status | GET | Get workflow status |
| /api/executions | GET | List execution history |
| /api/screenshots/{execution_id} | GET | Get debug screenshots |
| /ws/device/{device_id} | WebSocket | Real-time device updates |
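
Device identifiers can contain characters such as `:` (for example, a device connected over TCP), so it is safer to percent-encode them when building the WebSocket path. The sketch below assumes the base URL is the `NEXT_PUBLIC_WS_URL` value from the environment section. The helper name `device_ws_url` is hypothetical.

```python
from urllib.parse import quote


def device_ws_url(ws_base: str, device_id: str) -> str:
    """Build the /ws/device/{device_id} URL from the /ws base URL."""
    # safe="" also encodes "/" and ":" so the ID stays a single path segment.
    return f"{ws_base.rstrip('/')}/device/{quote(device_id, safe='')}"


if __name__ == "__main__":
    print(device_ws_url("ws://localhost:8000/ws", "emulator-5554"))
    # ws://localhost:8000/ws/device/emulator-5554
```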

🛠️ Tech Stack Summary

| Layer | Technology |
| --- | --- |
| Frontend | Next.js 15, React 18, TypeScript, TailwindCSS |
| Backend | FastAPI, Python 3.9+, SQLAlchemy |
| Orchestration | Temporal.io (distributed workflows) |
| Mobile Automation | Appium 3.x, UiAutomator2, ADB |
| Database | PostgreSQL 15 |
| Cache/Queue | Redis 7+ |
| Containers | Docker, Docker Compose |
| Deployment | AWS/GCP/Azure ready |

🚒 Production Deployment

Prerequisites

  • AWS/GCP/Azure account
  • Kubernetes cluster (optional)
  • CI/CD pipeline (GitHub Actions, GitLab CI)

Deploy Steps

  1. Build Docker images
  2. Push to container registry
  3. Deploy Temporal cluster
  4. Deploy PostgreSQL & Redis
  5. Deploy backend & frontend services
  6. Configure load balancer

See Deployment Guide


📋 Workflow Management

Available Workflows

  • instagram_post_automation - Full Reel posting pipeline
  • app_installation - Install and verify apps
  • ui_navigation - Complex multi-screen flows
  • data_extraction - OCR and content capture
  • user_interaction_simulation - Realistic user behavior
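
A client can validate the workflow name locally before calling the API, which gives a clearer error than a failed request. This is only a sketch: the set below copies the names listed above, and `validate_workflow_name` is a hypothetical helper, not something the backend is known to expose.

```python
AVAILABLE_WORKFLOWS = frozenset(
    {
        "instagram_post_automation",
        "app_installation",
        "ui_navigation",
        "data_extraction",
        "user_interaction_simulation",
    }
)


def validate_workflow_name(name: str) -> str:
    """Return the trimmed name if it is a known workflow, else raise ValueError."""
    cleaned = name.strip()
    if cleaned not in AVAILABLE_WORKFLOWS:
        known = ", ".join(sorted(AVAILABLE_WORKFLOWS))
        raise ValueError(f"Unknown workflow {cleaned!r}; expected one of: {known}")
    return cleaned
```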

🀝 Contributing

Contributions welcome! See CONTRIBUTING.md for guidelines.


📄 License

Proprietary - Internal use only. All rights reserved.


👨‍💻 Author

Slingvector


🙏 Acknowledgments

Built with:

  • Temporal.io for resilient workflow orchestration
  • Appium for cross-platform mobile automation
  • FastAPI for modern Python APIs
  • React/Next.js for beautiful UIs

📞 Support

For issues and questions:

  1. Check Issues
  2. Review Documentation
  3. Open a detailed issue with logs and screenshots

🗺️ Roadmap

  • iOS support via XCUITest
  • Multi-device parallel execution
  • Advanced ML-based UI element detection
  • Natural language workflow description
  • Self-healing automation with vision fallback
  • Advanced analytics dashboard
  • API rate limiting and throttling
  • RBAC and audit trail

⭐ If you find this useful, please star the repository!
