SamuelOshin/codebegen_be

🚀 CodebeGen - AI-Powered FastAPI Backend Generator

CodebeGen is an AI-powered platform that turns natural language descriptions into production-ready FastAPI backend projects. Its multi-model AI pipeline generates a complete backend architecture, including authentication, database models, APIs, tests, and deployment configurations, in minutes.

🌟 Key Features

🤖 Multi-Model AI Pipeline

  • Schema Extraction: Llama-3.1-8B for intelligent entity and relationship parsing
  • Code Generation: Qwen2.5-Coder-32B (fine-tuned) for high-quality FastAPI code
  • Code Review: Starcoder2-15B for security and best practices validation
  • Documentation: Mistral-7B-Instruct for comprehensive project documentation
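
The four stages above run in sequence, each building on the previous stage's output. The sketch below shows one way such a pipeline could be wired together. It is not the project's `ai_orchestrator.py`: the `Pipeline` class, the stage functions, and the context keys are all hypothetical, and the stages return canned data instead of calling any model.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical stage signature: a stage receives the shared context and returns an updated one.
Stage = Callable[[dict], dict]


@dataclass
class Pipeline:
    stages: list = field(default_factory=list)

    def add(self, name: str, stage: Stage) -> "Pipeline":
        self.stages.append((name, stage))
        return self

    def run(self, prompt: str) -> dict:
        context = {"prompt": prompt, "completed": []}
        for name, stage in self.stages:
            context = stage(context)
            context["completed"].append(name)  # record progress for status reporting
        return context


# Stand-in stages with canned output; real stages would call the models listed above.
def extract_schema(ctx: dict) -> dict:
    return {**ctx, "schema": {"entities": ["User", "Post"]}}


def generate_code(ctx: dict) -> dict:
    files = {f"app/models/{e.lower()}.py": f"# model for {e}" for e in ctx["schema"]["entities"]}
    return {**ctx, "files": files}


def review_code(ctx: dict) -> dict:
    return {**ctx, "review": {"issues": [], "files_reviewed": len(ctx["files"])}}


def write_docs(ctx: dict) -> dict:
    return {**ctx, "docs": f"Generated {len(ctx['files'])} files."}


pipeline = (
    Pipeline()
    .add("schema_extraction", extract_schema)
    .add("code_generation", generate_code)
    .add("code_review", review_code)
    .add("documentation", write_docs)
)
result = pipeline.run("A blog API with users and posts")
print(result["completed"])
```

Passing one context dictionary through every stage keeps each stage independent of how the others are implemented, which makes it easy to swap a model or insert a stage.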

🏗️ Production-Ready Architecture

  • Clean Architecture: Modular design with clear separation of concerns
  • Authentication & Authorization: JWT-based auth with role-based access control
  • Database Integration: PostgreSQL with SQLAlchemy 2.0 and async support
  • API Documentation: Automatic OpenAPI/Swagger generation
  • Testing: Comprehensive test suites with pytest and async testing
  • Deployment Ready: Docker configurations and deployment scripts

🔥 Advanced Capabilities

  • Real-time Generation: WebSocket streaming for live progress updates
  • Iterative Refinement: AI-powered code iteration and improvement
  • Multi-Template Support: FastAPI with PostgreSQL, MongoDB, and more
  • GitHub Integration: Direct repository creation and deployment
  • Quality Assurance: Automated code review and quality scoring

📊 Version Tracking & Management

  • Hierarchical Storage: Organized file structure with project/version separation
  • Generation History: Track multiple iterations with automatic versioning
  • Active Generation Management: Switch between different versions seamlessly
  • Diff Generation: Compare changes between versions with detailed file differences
  • Metadata Tracking: Store generation statistics, file counts, and change summaries
  • Backward Compatibility: Support for existing flat storage structure
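
To illustrate the diff generation idea, here is a minimal sketch that compares two versions held in memory as `{path: content}` dictionaries. The `diff_versions` function and its return shape are assumptions made for this example, not the project's actual implementation; it uses only the standard library's `difflib`.

```python
import difflib


def diff_versions(old: dict, new: dict) -> dict:
    """Summarize added, removed, and modified files between two versions."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    modified = {}
    for path in sorted(old.keys() & new.keys()):
        if old[path] != new[path]:
            # Unified diff of the file contents, line by line
            modified[path] = "".join(
                difflib.unified_diff(
                    old[path].splitlines(keepends=True),
                    new[path].splitlines(keepends=True),
                    fromfile=f"a/{path}",
                    tofile=f"b/{path}",
                )
            )
    return {"added": added, "removed": removed, "modified": modified}


v1 = {"app/main.py": "x = 1\n", "README.md": "hello\n"}
v2 = {"app/main.py": "x = 2\n", "app/models.py": "class User: ...\n"}

summary = diff_versions(v1, v2)
print(summary["added"], summary["removed"])
print(summary["modified"]["app/main.py"])
```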

🛠️ Technology Stack

Backend Framework

  • FastAPI - High-performance async web framework
  • Python 3.11+ - Latest Python features and performance
  • Uvicorn - Lightning-fast ASGI server

Database & Storage

  • PostgreSQL - Primary database with advanced features
  • SQLAlchemy 2.0 - Modern ORM with async support
  • Alembic - Database migrations and schema management
  • Redis - Caching and background task management
  • Supabase Storage - Optional cloud storage for generated projects (hybrid local/cloud approach)

AI & Machine Learning

  • Qwen2.5-Coder-32B - Primary code generation model
  • Llama-3.1-8B - Schema extraction and parsing
  • Starcoder2-15B - Code review and quality analysis
  • Mistral-7B-Instruct - Documentation generation

Authentication & Security

  • JWT Tokens - Secure authentication with python-jose
  • Bcrypt - Password hashing with passlib
  • Rate Limiting - SlowAPI for request throttling
  • CORS & Security Headers - Production security measures

Development & Testing

  • Poetry - Dependency management and packaging
  • Pytest - Comprehensive testing framework
  • Black & isort - Code formatting and import sorting
  • MyPy - Static type checking
  • Pre-commit - Git hooks for code quality

🚀 Quick Start

Prerequisites

  • Python 3.11+
  • PostgreSQL 14+
  • Redis 6+
  • Git

Installation

  1. Clone the repository
git clone https://github.com/SamuelOshin/codebegen_be.git
cd codebegen_be
  2. Install dependencies with Poetry
# Install Poetry if not already installed
curl -sSL https://install.python-poetry.org | python3 -

# Install project dependencies
poetry install
  3. Set up environment variables
cp .env.example .env
# Edit .env with your configuration
  4. Set up the database
# Run database migrations
poetry run alembic upgrade head

# Optional: Seed with sample data
poetry run python scripts/seed_data.py
  5. Start the development server
poetry run uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

The API will be available at http://localhost:8000 with interactive documentation at http://localhost:8000/docs.

Optional: Supabase Cloud Storage Setup

CodebeGen supports optional cloud storage via Supabase for scalable project file storage.

  1. Create a Supabase project at supabase.com

  2. Configure environment variables

# Add to .env
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_SERVICE_KEY=your-service-role-key
USE_CLOUD_STORAGE=true
  3. Migrate existing projects (optional)
# Preview migration
poetry run python scripts/migrate_to_supabase.py --dry-run

# Perform migration
poetry run python scripts/migrate_to_supabase.py

For detailed setup instructions, see docs/STORAGE_SETUP.md.

Note: Cloud storage is optional. With USE_CLOUD_STORAGE=false, the system stores all generated projects locally and does not need a Supabase project.

📁 Project Structure

codebegen_be/
├── app/                          # Main application package
│   ├── main.py                   # FastAPI app entry point
│   ├── auth/                     # Authentication & authorization
│   │   ├── dependencies.py       # Auth dependencies
│   │   ├── handlers.py           # Auth handlers
│   │   └── models.py             # Auth models
│   ├── core/                     # Core application components
│   │   ├── config.py             # Configuration management
│   │   ├── database.py           # Database connection
│   │   ├── exceptions.py         # Custom exceptions
│   │   └── security.py           # Security utilities
│   ├── models/                   # SQLAlchemy database models
│   │   ├── base.py               # Base model class
│   │   ├── user.py               # User model
│   │   ├── project.py            # Project model
│   │   ├── generation.py         # AI generation model
│   │   └── organization.py       # Organization model
│   ├── repositories/             # Data access layer
│   │   ├── base.py               # Base repository
│   │   ├── user_repository.py    # User data operations
│   │   ├── project_repository.py # Project data operations
│   │   └── generation_repository.py # Generation data operations
│   ├── routers/                  # API route handlers
│   │   ├── auth.py               # Authentication routes
│   │   ├── projects.py           # Project management routes
│   │   ├── generations.py        # Code generation routes
│   │   ├── ai.py                 # AI service routes
│   │   └── webhooks.py           # Webhook handlers
│   ├── schemas/                  # Pydantic schemas/DTOs
│   │   ├── base.py               # Base schemas
│   │   ├── user.py               # User schemas
│   │   ├── project.py            # Project schemas
│   │   ├── generation.py         # Generation schemas
│   │   └── ai.py                 # AI request/response schemas
│   ├── services/                 # Business logic layer
│   │   ├── ai_orchestrator.py    # AI pipeline coordination
│   │   ├── generation_service.py # Generation management & versioning
│   │   ├── file_manager.py       # Hierarchical file storage management
│   │   ├── supabase_storage_service.py # Cloud storage integration
│   │   ├── storage_manager.py    # Hybrid local/cloud storage manager
│   │   ├── storage_integration_helper.py # Storage integration helper
│   │   ├── code_generator.py     # Code generation service
│   │   ├── code_reviewer.py      # Code review service
│   │   ├── docs_generator.py     # Documentation service
│   │   ├── github_service.py     # GitHub integration
│   │   ├── billing_service.py    # Billing and subscriptions
│   │   └── schema_parser.py      # Schema extraction service
│   └── utils/                    # Utility functions
│       ├── file_utils.py         # File operations
│       ├── formatters.py         # Code formatting
│       └── validators.py         # Input validation
├── ai_models/                    # AI model implementations
│   ├── qwen_generator.py         # Qwen code generation model
│   ├── llama_parser.py           # Llama schema parser
│   ├── starcoder_reviewer.py     # Starcoder code reviewer
│   ├── mistral_docs.py           # Mistral documentation generator
│   └── model_loader.py           # Model loading utilities
├── alembic/                      # Database migrations
│   ├── versions/                 # Migration files
│   └── env.py                    # Alembic configuration
├── templates/                    # Project templates
│   ├── fastapi_basic/            # Basic FastAPI template
│   ├── fastapi_mongo/            # FastAPI + MongoDB template
│   └── fastapi_sqlalchemy/       # FastAPI + SQLAlchemy template
├── tests/                        # Test suites
│   ├── test_auth/                # Authentication tests
│   ├── test_services/            # Service layer tests
│   ├── test_ai/                  # AI pipeline tests
│   └── test_integration/         # Integration tests
├── infra/                        # Infrastructure & deployment
│   ├── docker-compose.yml        # Local development setup
│   ├── Dockerfile                # Container configuration
│   └── nginx.conf                # Nginx configuration
├── docs/                         # Documentation
│   ├── architecture.md           # System architecture
│   ├── deployment.md             # Deployment guide
│   └── openapi.yaml              # API specification
├── scripts/                      # Utility scripts
│   ├── migrate.py                # Database migration runner
│   ├── seed_data.py              # Sample data seeder
│   ├── migrate_to_supabase.py    # Supabase migration script
│   └── setup.py                  # Environment setup
├── storage/                      # File storage with hierarchical structure
│   └── projects/                 # Project-specific storage
│       └── {project_id}/         # Individual project directory
│           ├── generations/      # Version-tracked generations
│           │   ├── v1__{gen_id}/ # Version 1 generation files
│           │   │   ├── source/   # Generated source code
│           │   │   └── manifest.json # Generation metadata
│           │   ├── v2__{gen_id}/ # Version 2 generation files
│           │   └── active -> v2__{gen_id} # Symlink to active version
│           └── legacy/           # Backward compatibility for old flat storage
└── requirements/                 # Dependency files
    ├── base.txt                  # Core dependencies
    ├── dev.txt                   # Development dependencies
    └── prod.txt                  # Production dependencies
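
The `active` symlink in the storage layout above can be switched between versions without leaving a window where the link is missing. The sketch below shows one way to do that on a POSIX filesystem. The `generation_dir` and `set_active` helpers and the `.active.tmp` name are hypothetical; only the `v{N}__{gen_id}` naming and the `active` link come from the layout above.

```python
import os
import tempfile
from pathlib import Path


def generation_dir(generations: Path, version: int, gen_id: str) -> Path:
    """Directory name follows the v{N}__{gen_id} convention from the layout above."""
    return generations / f"v{version}__{gen_id}"


def set_active(generations: Path, target_name: str) -> None:
    """Point generations/active at target_name, replacing any existing link."""
    if not (generations / target_name).is_dir():
        raise FileNotFoundError(f"no such generation: {target_name}")
    tmp_link = generations / ".active.tmp"  # hypothetical scratch name
    if tmp_link.is_symlink():
        tmp_link.unlink()
    os.symlink(target_name, tmp_link)             # relative link, as in the tree
    os.replace(tmp_link, generations / "active")  # rename is atomic on POSIX


# Demo in a throwaway directory
generations = Path(tempfile.mkdtemp()) / "generations"
for version, gen_id in [(1, "abc123"), (2, "def456")]:
    generation_dir(generations, version, gen_id).mkdir(parents=True)

set_active(generations, "v1__abc123")
set_active(generations, "v2__def456")
print(os.readlink(generations / "active"))
```

Creating the link under a temporary name and renaming it over `active` means readers always see either the old or the new target. Creating symlinks on Windows may require elevated privileges, so this sketch assumes Linux or macOS.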

🔌 API Endpoints

Authentication

  • POST /auth/register - User registration
  • POST /auth/login - User login
  • GET /auth/me - Get current user
  • POST /auth/refresh - Refresh access token

Projects

  • GET /projects/ - List user projects
  • POST /projects/ - Create new project
  • GET /projects/{id} - Get project details
  • PUT /projects/{id} - Update project
  • DELETE /projects/{id} - Delete project
  • GET /projects/public - List public projects
  • GET /projects/search - Search projects

AI Generation

  • POST /generations/ - Start code generation
  • GET /generations/{id} - Get generation status
  • GET /generations/{id}/stream - Stream generation progress
  • POST /generations/{id}/iterate - Iterate on generation
  • GET /generations/{id}/files - Download generated files

Version Management

  • GET /projects/{project_id}/generations - List all versions for a project
  • GET /projects/{project_id}/generations/{version} - Get specific version details
  • GET /projects/{project_id}/generations/active - Get active generation
  • POST /projects/{project_id}/generations/{generation_id}/activate - Set active generation
  • GET /projects/{project_id}/generations/compare/{from_version}/{to_version} - Compare two versions

AI Services

  • POST /ai/generate - Generate project from prompt
  • POST /ai/iterate - Iterate and improve code
  • GET /ai/models - List available AI models

🔧 Configuration

Environment Variables

# Application Settings
APP_NAME=codebegen
ENVIRONMENT=development
DEBUG=true
SECRET_KEY=your-secret-key-here

# Database Configuration
DATABASE_URL=postgresql://user:password@localhost:5432/codebegen
REDIS_URL=redis://localhost:6379

# AI Model Paths
QWEN_MODEL_PATH=Qwen/Qwen2.5-Coder-32B
LLAMA_MODEL_PATH=meta-llama/Llama-3.1-8B
STARCODER_MODEL_PATH=bigcode/starcoder2-15b
MISTRAL_MODEL_PATH=mistralai/Mistral-7B-Instruct-v0.1

# GitHub Integration
GITHUB_CLIENT_ID=your-github-client-id
GITHUB_CLIENT_SECRET=your-github-client-secret

# External Services
STRIPE_SECRET_KEY=your-stripe-key
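
As an illustration of how these variables might be read at startup, here is a minimal standard-library sketch. It is not the project's `app/core/config.py`: the `Settings` class, the `load_settings` function, the default values, and the choice of which variables are required are all assumptions made for this example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumption for this sketch: these two variables have no safe default.
REQUIRED = ("SECRET_KEY", "DATABASE_URL")


def _as_bool(value: str) -> bool:
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    app_name: str
    environment: str
    debug: bool
    secret_key: str
    database_url: str
    redis_url: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping (defaults to the process environment)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return Settings(
        app_name=env.get("APP_NAME", "codebegen"),
        environment=env.get("ENVIRONMENT", "development"),
        debug=_as_bool(env.get("DEBUG", "false")),
        secret_key=env["SECRET_KEY"],
        database_url=env["DATABASE_URL"],
        redis_url=env.get("REDIS_URL", "redis://localhost:6379"),
    )


settings = load_settings({
    "DEBUG": "true",
    "SECRET_KEY": "your-secret-key-here",
    "DATABASE_URL": "postgresql://user:password@localhost:5432/codebegen",
})
print(settings.environment, settings.debug)
```

Failing at startup when a required value is missing surfaces configuration mistakes before the first request reaches the database.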

🧪 Testing

Run All Tests

poetry run pytest

Run Specific Test Suites

# Authentication tests
poetry run pytest tests/test_auth/

# Service layer tests
poetry run pytest tests/test_services/

# Integration tests
poetry run pytest tests/test_integration/

# AI pipeline tests
poetry run pytest tests/test_ai/

Test Coverage

poetry run pytest --cov=app --cov-report=html

🐳 Docker Deployment

Development Environment

# Start all services
# (the compose file lives in infra/; run from that directory or pass -f infra/docker-compose.yml)
docker-compose up -d

# View logs
docker-compose logs -f

# Stop services
docker-compose down

Production Deployment

# Build production image
docker build -f infra/Dockerfile -t codebegen:latest .

# Run with production settings
docker run -d \
  --name codebegen \
  -p 8000:8000 \
  -e ENVIRONMENT=production \
  -e DATABASE_URL=your-production-db-url \
  codebegen:latest

🤝 Development Workflow

Phase Implementation Status

✅ Phase 1: Core Infrastructure (Completed)

  • FastAPI application setup
  • Database models and migrations
  • Authentication system
  • Basic API structure
  • Testing framework

✅ Phase 2: Project Management (Completed)

  • Project CRUD operations
  • User management
  • API endpoints
  • Data validation
  • Repository pattern

🚧 Phase 3: AI Integration (In Progress)

  • AI model loading and inference
  • Multi-model pipeline implementation
  • Code generation service
  • Real-time streaming
  • Quality scoring

📋 Phase 4: Advanced Features (Planned)

  • GitHub integration
  • Template system
  • Billing integration
  • Advanced analytics
  • Performance optimization

Code Quality Standards

Formatting & Linting

# Format code
poetry run black .
poetry run isort .

# Type checking
poetry run mypy app/

# Run pre-commit hooks
poetry run pre-commit run --all-files

Git Workflow

# Create feature branch
git checkout -b feature/your-feature-name

# Make changes and commit
git add .
git commit -m "feat: add new feature"

# Push and create PR
git push origin feature/your-feature-name

📊 Performance & Monitoring

Metrics Tracked

  • Generation Time: End-to-end code generation performance
  • Quality Scores: AI-generated code quality metrics
  • User Engagement: Project creation and iteration rates
  • Model Performance: Individual AI model accuracy and speed

Health Monitoring

  • Health Check: GET /health - System status
  • Database: Connection pool and query performance
  • Redis: Cache hit rates and connection status
  • AI Models: Model loading status and inference times
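
A health endpoint of this kind typically runs one probe per component and aggregates the results. The sketch below shows that aggregation using only the standard library. The `run_health_checks` function, the response shape, and the `ok`/`error`/`degraded` labels are assumptions for illustration; the README does not specify what `GET /health` returns.

```python
import time
from typing import Callable


def run_health_checks(checks: dict) -> dict:
    """Run each probe; a probe signals failure by raising an exception."""
    components = {}
    for name, check in checks.items():
        started = time.perf_counter()
        try:
            check()
            state = {"status": "ok"}
        except Exception as exc:
            state = {"status": "error", "detail": str(exc)}
        state["latency_ms"] = round((time.perf_counter() - started) * 1000, 2)
        components[name] = state
    healthy = all(c["status"] == "ok" for c in components.values())
    return {"status": "ok" if healthy else "degraded", "components": components}


# Stand-in probes; real ones would ping PostgreSQL, Redis, and the model loader.
def check_database() -> None:
    pass


def check_redis() -> None:
    raise ConnectionError("redis unreachable")


report = run_health_checks({"database": check_database, "redis": check_redis})
print(report["status"])
```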

🔒 Security Features

Authentication & Authorization

  • JWT token-based authentication
  • Role-based access control (RBAC)
  • Secure password hashing with bcrypt
  • Token refresh mechanism
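
To show what a signed JWT with a role claim looks like under the hood, here is a standard-library sketch of HS256 signing and verification. The project itself uses python-jose for this; the `encode_token` and `decode_token` functions, the `role` claim name, and the error messages below are illustrative assumptions, and this sketch is not a substitute for a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()


def encode_token(claims: dict, secret: str) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, claims)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def decode_token(token: str, secret: str, now: Optional[float] = None) -> dict:
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unsupported algorithm")
    expected = _sign(f"{header_b64}.{payload_b64}", secret)
    # Constant-time comparison avoids leaking how many bytes matched
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if "exp" in claims and current >= claims["exp"]:
        raise ValueError("token expired")
    return claims


token = encode_token({"sub": "user@example.com", "role": "admin", "exp": 2000}, "secret")
claims = decode_token(token, "secret", now=1000)
print(claims["sub"], claims["role"])
```

A role-based check can then compare `claims["role"]` against the roles an endpoint allows, and a refresh flow issues a new token with a later `exp`.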

API Security

  • Rate limiting on all endpoints
  • CORS configuration
  • Request validation with Pydantic
  • SQL injection prevention
  • XSS protection headers
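
The project relies on SlowAPI for rate limiting. To illustrate the general idea behind request throttling, the sketch below implements a small in-memory token bucket. This is a swapped-in technique for explanation only: the `TokenBucket` class is hypothetical and is not how SlowAPI is configured or implemented.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TokenBucket:
    capacity: int              # maximum burst size
    refill_per_second: float   # sustained request rate
    tokens: float = field(init=False)
    updated_at: float = field(init=False)

    def __post_init__(self) -> None:
        self.tokens = float(self.capacity)
        self.updated_at = time.monotonic()

    def allow(self, now: Optional[float] = None) -> bool:
        """Return True and consume a token if a request may proceed."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated_at)
        # Refill in proportion to elapsed time, capped at capacity
        self.tokens = min(float(self.capacity), self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(capacity=2, refill_per_second=1.0)
start = bucket.updated_at
decisions = [bucket.allow(now=start) for _ in range(3)]  # burst of three requests
later = bucket.allow(now=start + 2.0)                    # two seconds later
print(decisions, later)
```

In a real deployment there would be one bucket per client key (for example per IP address or per user), and the state would live in a shared store such as Redis so that all workers see the same counts.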

Data Protection

  • Encrypted sensitive data storage
  • Secure environment variable handling
  • Database connection encryption
  • API key rotation support

🤖 AI Pipeline Details

Schema Extraction (Llama-3.1-8B)

  • Parses natural language requirements
  • Extracts entities, relationships, and constraints
  • Generates database schema suggestions
  • Handles complex domain modeling

Code Generation (Qwen2.5-Coder-32B)

  • Fine-tuned on FastAPI and clean architecture patterns
  • Generates complete project structures
  • Implements best practices and patterns
  • Supports multiple tech stacks

Code Review (Starcoder2-15B)

  • Security vulnerability detection
  • Performance optimization suggestions
  • Code quality assessment
  • Best practices validation

Documentation (Mistral-7B-Instruct)

  • README generation
  • API documentation
  • Code comments and docstrings
  • Deployment guides

🌐 API Examples

Generate a Complete FastAPI Project

import httpx

# Authentication
auth_response = httpx.post("http://localhost:8000/auth/login", json={
    "username": "user@example.com",
    "password": "your-password"
})
auth_response.raise_for_status()  # stop here if the login failed
token = auth_response.json()["access_token"]

headers = {"Authorization": f"Bearer {token}"}

# Create a project
project_response = httpx.post(
    "http://localhost:8000/projects/",
    json={
        "name": "E-commerce API",
        "description": "A complete e-commerce backend with products, orders, and payments",
        "domain": "ecommerce",
        "tech_stack": ["FastAPI", "PostgreSQL", "Redis"],
        "is_public": False
    },
    headers=headers
)
project_id = project_response.json()["id"]

# Start AI generation
generation_response = httpx.post(
    "http://localhost:8000/generations/",
    json={
        "project_id": project_id,
        "prompt": """
        Create a complete e-commerce API with:
        - User authentication and profiles
        - Product catalog with categories and inventory
        - Shopping cart functionality
        - Order processing and payment integration
        - Admin dashboard for management
        - Inventory tracking
        - Email notifications
        - Comprehensive testing
        """,
        "context": {
            "complexity": "high",
            "include_tests": True,
            "include_docs": True,
            "deployment_target": "docker"
        }
    },
    headers=headers
)

generation_id = generation_response.json()["id"]

# Monitor progress via WebSocket or polling
status_response = httpx.get(
    f"http://localhost:8000/generations/{generation_id}",
    headers=headers
)
print(status_response.json())
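
The single status request above can be turned into a polling loop. The sketch below is one way to do it: `wait_for_generation` is a hypothetical helper, the `completed` status comes from the streaming example in this README, and the `failed` and `processing` status names are assumptions. The fetch function is injected so the loop can be tried without a running server.

```python
import time
from typing import Callable


def wait_for_generation(
    fetch_status: Callable[[], dict],
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the generation reaches a terminal status or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status.get("status") in {"completed", "failed"}:  # "failed" is an assumed name
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("generation did not finish in time")
        sleep(interval)


# Demo with canned responses instead of real HTTP calls
responses = iter([
    {"status": "processing", "progress": 40},
    {"status": "processing", "progress": 80},
    {"status": "completed", "progress": 100},
])
final = wait_for_generation(lambda: next(responses), interval=0, sleep=lambda _: None)
print(final)
```

With the client code above, the fetch function could be `lambda: httpx.get(f"http://localhost:8000/generations/{generation_id}", headers=headers).json()`.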

Stream Generation Progress

// WebSocket connection for real-time updates
const ws = new WebSocket(`ws://localhost:8000/generations/${generationId}/stream`);

ws.onmessage = function(event) {
    const data = JSON.parse(event.data);
    console.log(`Progress: ${data.progress}% - ${data.stage}`);
    
    if (data.status === 'completed') {
        console.log('Generation completed!');
        console.log('Files generated:', data.files);
    }
};

📚 Documentation Links

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details.

Development Setup

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests for new functionality
  5. Ensure all tests pass
  6. Submit a pull request

Code Style

  • Follow PEP 8 style guidelines
  • Use type hints for all functions
  • Write comprehensive docstrings
  • Maintain test coverage above 90%

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • FastAPI team for the excellent framework
  • Hugging Face for the transformer models
  • SQLAlchemy team for the powerful ORM
  • OpenAI for inspiration in AI-powered development

📞 Support


Built with ❤️ by the CodebeGen Team

Transform your ideas into production-ready backends with the power of AI.
