CodebeGen is an advanced AI-powered platform that transforms natural language descriptions into production-ready FastAPI backend projects. Leveraging a multi-model AI pipeline, it generates complete backend architectures including authentication, database models, APIs, tests, and deployment configurations in minutes.
- Schema Extraction: Llama-3.1-8B for intelligent entity and relationship parsing
- Code Generation: Qwen2.5-Coder-32B (fine-tuned) for high-quality FastAPI code
- Code Review: Starcoder2-15B for security and best practices validation
- Documentation: Mistral-7B-Instruct for comprehensive project documentation
- Clean Architecture: Modular design with clear separation of concerns
- Authentication & Authorization: JWT-based auth with role-based access control
- Database Integration: PostgreSQL with SQLAlchemy 2.0 and async support
- API Documentation: Automatic OpenAPI/Swagger generation
- Testing: Comprehensive test suites with pytest and async testing
- Deployment Ready: Docker configurations and deployment scripts
- Real-time Generation: WebSocket streaming for live progress updates
- Iterative Refinement: AI-powered code iteration and improvement
- Multi-Template Support: FastAPI with PostgreSQL, MongoDB, and more
- GitHub Integration: Direct repository creation and deployment
- Quality Assurance: Automated code review and quality scoring
- Hierarchical Storage: Organized file structure with project/version separation
- Generation History: Track multiple iterations with automatic versioning
- Active Generation Management: Switch between different versions seamlessly
- Diff Generation: Compare changes between versions with detailed file differences
- Metadata Tracking: Store generation statistics, file counts, and change summaries
- Backward Compatibility: Support for existing flat storage structure
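The diff generation feature listed above can be pictured as a comparison of two version directories. The following is a minimal sketch using only the standard library, not the project's actual implementation; `list_files` and `diff_versions` are hypothetical names, and it assumes generated files are UTF-8 text.

```python
import difflib
from pathlib import Path


def list_files(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to its text content."""
    return {
        p.relative_to(root).as_posix(): p.read_text(encoding="utf-8")
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff_versions(old_root: Path, new_root: Path) -> dict:
    """Summarize added, removed, and modified files between two version directories."""
    old, new = list_files(old_root), list_files(new_root)
    modified = {}
    for name in sorted(old.keys() & new.keys()):
        if old[name] != new[name]:
            # Store a unified diff for every file whose content changed.
            modified[name] = "".join(
                difflib.unified_diff(
                    old[name].splitlines(keepends=True),
                    new[name].splitlines(keepends=True),
                    fromfile=f"a/{name}",
                    tofile=f"b/{name}",
                )
            )
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "modified": modified,
    }
```

Keeping the summary (added/removed/modified) separate from the per-file diffs lets a caller show a change overview without loading every diff.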
- FastAPI - High-performance async web framework
- Python 3.11+ - Latest Python features and performance
- Uvicorn - Lightning-fast ASGI server
- PostgreSQL - Primary database with advanced features
- SQLAlchemy 2.0 - Modern ORM with async support
- Alembic - Database migrations and schema management
- Redis - Caching and background task management
- Supabase Storage - Optional cloud storage for generated projects (hybrid local/cloud approach)
- Qwen2.5-Coder-32B - Primary code generation model
- Llama-3.1-8B - Schema extraction and parsing
- Starcoder2-15B - Code review and quality analysis
- Mistral-7B-Instruct - Documentation generation
- JWT Tokens - Secure authentication with python-jose
- Bcrypt - Password hashing with passlib
- Rate Limiting - SlowAPI for request throttling
- CORS & Security Headers - Production security measures
- Poetry - Dependency management and packaging
- Pytest - Comprehensive testing framework
- Black & isort - Code formatting and import sorting
- MyPy - Static type checking
- Pre-commit - Git hooks for code quality
- Python 3.11+
- PostgreSQL 14+
- Redis 6+
- Git
- Clone the repository
git clone https://github.com/yourusername/codebegen-be.git
cd codebegen-be
- Install dependencies with Poetry
# Install Poetry if not already installed
curl -sSL https://install.python-poetry.org | python3 -
# Install project dependencies
poetry install
- Set up environment variables
cp .env.example .env
# Edit .env with your configuration
- Set up the database
# Run database migrations
poetry run alembic upgrade head
# Optional: Seed with sample data
poetry run python scripts/seed_data.py
- Start the development server
poetry run uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
The API will be available at http://localhost:8000 with interactive documentation at http://localhost:8000/docs.
CodebeGen supports optional cloud storage via Supabase for scalable project file storage.
- Create a Supabase project at supabase.com
- Configure environment variables
# Add to .env
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_SERVICE_KEY=your-service-role-key
USE_CLOUD_STORAGE=true
- Migrate existing projects (optional)
# Preview migration
poetry run python scripts/migrate_to_supabase.py --dry-run
# Perform migration
poetry run python scripts/migrate_to_supabase.py
For detailed setup instructions, see docs/STORAGE_SETUP.md.
Note: Cloud storage is optional. Set USE_CLOUD_STORAGE=false to run with local-only storage.
codebegen_be/
├── app/  # Main application package
│   ├── main.py  # FastAPI app entry point
│   ├── auth/  # Authentication & authorization
│   │   ├── dependencies.py  # Auth dependencies
│   │   ├── handlers.py  # Auth handlers
│   │   └── models.py  # Auth models
│   ├── core/  # Core application components
│   │   ├── config.py  # Configuration management
│   │   ├── database.py  # Database connection
│   │   ├── exceptions.py  # Custom exceptions
│   │   └── security.py  # Security utilities
│   ├── models/  # SQLAlchemy database models
│   │   ├── base.py  # Base model class
│   │   ├── user.py  # User model
│   │   ├── project.py  # Project model
│   │   ├── generation.py  # AI generation model
│   │   └── organization.py  # Organization model
│   ├── repositories/  # Data access layer
│   │   ├── base.py  # Base repository
│   │   ├── user_repository.py  # User data operations
│   │   ├── project_repository.py  # Project data operations
│   │   └── generation_repository.py  # Generation data operations
│   ├── routers/  # API route handlers
│   │   ├── auth.py  # Authentication routes
│   │   ├── projects.py  # Project management routes
│   │   ├── generations.py  # Code generation routes
│   │   ├── ai.py  # AI service routes
│   │   └── webhooks.py  # Webhook handlers
│   ├── schemas/  # Pydantic schemas/DTOs
│   │   ├── base.py  # Base schemas
│   │   ├── user.py  # User schemas
│   │   ├── project.py  # Project schemas
│   │   ├── generation.py  # Generation schemas
│   │   └── ai.py  # AI request/response schemas
│   ├── services/  # Business logic layer
│   │   ├── ai_orchestrator.py  # AI pipeline coordination
│   │   ├── generation_service.py  # Generation management & versioning
│   │   ├── file_manager.py  # Hierarchical file storage management
│   │   ├── supabase_storage_service.py  # Cloud storage integration
│   │   ├── storage_manager.py  # Hybrid local/cloud storage manager
│   │   ├── storage_integration_helper.py  # Storage integration helper
│   │   ├── code_generator.py  # Code generation service
│   │   ├── code_reviewer.py  # Code review service
│   │   ├── docs_generator.py  # Documentation service
│   │   ├── github_service.py  # GitHub integration
│   │   ├── billing_service.py  # Billing and subscriptions
│   │   └── schema_parser.py  # Schema extraction service
│   └── utils/  # Utility functions
│       ├── file_utils.py  # File operations
│       ├── formatters.py  # Code formatting
│       └── validators.py  # Input validation
├── ai_models/  # AI model implementations
│   ├── qwen_generator.py  # Qwen code generation model
│   ├── llama_parser.py  # Llama schema parser
│   ├── starcoder_reviewer.py  # Starcoder code reviewer
│   ├── mistral_docs.py  # Mistral documentation generator
│   └── model_loader.py  # Model loading utilities
├── alembic/  # Database migrations
│   ├── versions/  # Migration files
│   └── env.py  # Alembic configuration
├── templates/  # Project templates
│   ├── fastapi_basic/  # Basic FastAPI template
│   ├── fastapi_mongo/  # FastAPI + MongoDB template
│   └── fastapi_sqlalchemy/  # FastAPI + SQLAlchemy template
├── tests/  # Test suites
│   ├── test_auth/  # Authentication tests
│   ├── test_services/  # Service layer tests
│   ├── test_ai/  # AI pipeline tests
│   └── test_integration/  # Integration tests
├── infra/  # Infrastructure & deployment
│   ├── docker-compose.yml  # Local development setup
│   ├── Dockerfile  # Container configuration
│   └── nginx.conf  # Nginx configuration
├── docs/  # Documentation
│   ├── architecture.md  # System architecture
│   ├── deployment.md  # Deployment guide
│   └── openapi.yaml  # API specification
├── scripts/  # Utility scripts
│   ├── migrate.py  # Database migration runner
│   ├── seed_data.py  # Sample data seeder
│   ├── migrate_to_supabase.py  # Supabase migration script
│   └── setup.py  # Environment setup
├── storage/  # File storage with hierarchical structure
│   └── projects/  # Project-specific storage
│       └── {project_id}/  # Individual project directory
│           ├── generations/  # Version-tracked generations
│           │   ├── v1__{gen_id}/  # Version 1 generation files
│           │   │   ├── source/  # Generated source code
│           │   │   └── manifest.json  # Generation metadata
│           │   ├── v2__{gen_id}/  # Version 2 generation files
│           │   └── active -> v2__{gen_id}  # Symlink to active version
│           └── legacy/  # Backward compatibility for old flat storage
└── requirements/  # Dependency files
    ├── base.txt  # Core dependencies
    ├── dev.txt  # Development dependencies
    └── prod.txt  # Production dependencies
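The `storage/projects/{project_id}/generations/v{n}__{gen_id}` layout shown in the tree implies that each new generation gets the next version number. A minimal sketch of how that path could be computed follows; `generations_dir` and `next_version_dir` are hypothetical helpers, not functions from `file_manager.py`, and the numbering rule (highest existing version plus one) is an assumption.

```python
import re
from pathlib import Path

# Matches directory names such as "v2__a1b2c3".
VERSION_DIR = re.compile(r"^v(\d+)__(.+)$")


def generations_dir(storage_root: Path, project_id: str) -> Path:
    """Directory that holds all version directories for one project."""
    return storage_root / "projects" / project_id / "generations"


def next_version_dir(storage_root: Path, project_id: str, gen_id: str) -> Path:
    """Return the path for the next version directory, e.g. v3__<gen_id>."""
    base = generations_dir(storage_root, project_id)
    children = base.iterdir() if base.exists() else []
    # Entries that do not match the pattern (such as "active") are ignored.
    versions = [
        int(m.group(1))
        for child in children
        if child.is_dir() and (m := VERSION_DIR.match(child.name))
    ]
    return base / f"v{max(versions, default=0) + 1}__{gen_id}"
```

The function only computes the path; creating the directory and repointing the `active` symlink would be separate steps.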
- POST /auth/register - User registration
- POST /auth/login - User login
- GET /auth/me - Get current user
- POST /auth/refresh - Refresh access token

- GET /projects/ - List user projects
- POST /projects/ - Create new project
- GET /projects/{id} - Get project details
- PUT /projects/{id} - Update project
- DELETE /projects/{id} - Delete project
- GET /projects/public - List public projects
- GET /projects/search - Search projects

- POST /generations/ - Start code generation
- GET /generations/{id} - Get generation status
- GET /generations/{id}/stream - Stream generation progress
- POST /generations/{id}/iterate - Iterate on generation
- GET /generations/{id}/files - Download generated files

- GET /projects/{project_id}/generations - List all versions for a project
- GET /projects/{project_id}/generations/{version} - Get specific version details
- GET /projects/{project_id}/generations/active - Get active generation
- POST /projects/{project_id}/generations/{generation_id}/activate - Set active generation
- GET /projects/{project_id}/generations/compare/{from_version}/{to_version} - Compare two versions

- POST /ai/generate - Generate project from prompt
- POST /ai/iterate - Iterate and improve code
- GET /ai/models - List available AI models
# Application Settings
APP_NAME=codebegen
ENVIRONMENT=development
DEBUG=true
SECRET_KEY=your-secret-key-here
# Database Configuration
DATABASE_URL=postgresql://user:password@localhost:5432/codebegen
REDIS_URL=redis://localhost:6379
# AI Model Paths
QWEN_MODEL_PATH=Qwen/Qwen2.5-Coder-32B
LLAMA_MODEL_PATH=meta-llama/Llama-3.1-8B
STARCODER_MODEL_PATH=bigcode/starcoder2-15b
MISTRAL_MODEL_PATH=mistralai/Mistral-7B-Instruct-v0.1
# GitHub Integration
GITHUB_CLIENT_ID=your-github-client-id
GITHUB_CLIENT_SECRET=your-github-client-secret
# External Services
STRIPE_SECRET_KEY=your-stripe-key
# Run all tests
poetry run pytest
# Authentication tests
poetry run pytest tests/test_auth/
# Service layer tests
poetry run pytest tests/test_services/
# Integration tests
poetry run pytest tests/test_integration/
# AI pipeline tests
poetry run pytest tests/test_ai/
# Run with an HTML coverage report
poetry run pytest --cov=app --cov-report=html
# Start all services
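# The project layout lists the compose file under infra/; if it is not in the current directory, add -f infra/docker-compose.yml
docker-compose up -d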
# View logs
docker-compose logs -f
# Stop services
docker-compose down
# Build production image
docker build -f infra/Dockerfile -t codebegen:latest .
# Run with production settings
docker run -d \
--name codebegen \
-p 8000:8000 \
-e ENVIRONMENT=production \
-e DATABASE_URL=your-production-db-url \
codebegen:latest
- FastAPI application setup
- Database models and migrations
- Authentication system
- Basic API structure
- Testing framework
- Project CRUD operations
- User management
- API endpoints
- Data validation
- Repository pattern
- AI model loading and inference
- Multi-model pipeline implementation
- Code generation service
- Real-time streaming
- Quality scoring
- GitHub integration
- Template system
- Billing integration
- Advanced analytics
- Performance optimization
# Format code
poetry run black .
poetry run isort .
# Type checking
poetry run mypy app/
# Run pre-commit hooks
poetry run pre-commit run --all-files
# Create feature branch
git checkout -b feature/your-feature-name
# Make changes and commit
git add .
git commit -m "feat: add new feature"
# Push and create PR
git push origin feature/your-feature-name
- Generation Time: End-to-end code generation performance
- Quality Scores: AI-generated code quality metrics
- User Engagement: Project creation and iteration rates
- Model Performance: Individual AI model accuracy and speed
- Health Check: GET /health - System status
- Database: Connection pool and query performance
- Redis: Cache hit rates and connection status
- AI Models: Model loading status and inference times
- JWT token-based authentication
- Role-based access control (RBAC)
- Secure password hashing with bcrypt
- Token refresh mechanism
- Rate limiting on all endpoints
- CORS configuration
- Request validation with Pydantic
- SQL injection prevention
- XSS protection headers
- Encrypted sensitive data storage
- Secure environment variable handling
- Database connection encryption
- API key rotation support
- Parses natural language requirements
- Extracts entities, relationships, and constraints
- Generates database schema suggestions
- Handles complex domain modeling
- Fine-tuned on FastAPI and clean architecture patterns
- Generates complete project structures
- Implements best practices and patterns
- Supports multiple tech stacks
- Security vulnerability detection
- Performance optimization suggestions
- Code quality assessment
- Best practices validation
- README generation
- API documentation
- Code comments and docstrings
- Deployment guides
import httpx

# Authentication
auth_response = httpx.post("http://localhost:8000/auth/login", json={
    "username": "user@example.com",
    "password": "your-password"
})
token = auth_response.json()["access_token"]
headers = {"Authorization": f"Bearer {token}"}

# Create a project
project_response = httpx.post(
    "http://localhost:8000/projects/",
    json={
        "name": "E-commerce API",
        "description": "A complete e-commerce backend with products, orders, and payments",
        "domain": "ecommerce",
        "tech_stack": ["FastAPI", "PostgreSQL", "Redis"],
        "is_public": False
    },
    headers=headers
)
project_id = project_response.json()["id"]

# Start AI generation
generation_response = httpx.post(
    "http://localhost:8000/generations/",
    json={
        "project_id": project_id,
        "prompt": """
        Create a complete e-commerce API with:
        - User authentication and profiles
        - Product catalog with categories and inventory
        - Shopping cart functionality
        - Order processing and payment integration
        - Admin dashboard for management
        - Inventory tracking
        - Email notifications
        - Comprehensive testing
        """,
        "context": {
            "complexity": "high",
            "include_tests": True,
            "include_docs": True,
            "deployment_target": "docker"
        }
    },
    headers=headers
)
generation_id = generation_response.json()["id"]

# Monitor progress via WebSocket or polling
status_response = httpx.get(
    f"http://localhost:8000/generations/{generation_id}",
    headers=headers
)
print(status_response.json())
// WebSocket connection for real-time updates
const ws = new WebSocket(`ws://localhost:8000/generations/${generationId}/stream`);
ws.onmessage = function(event) {
  const data = JSON.parse(event.data);
  console.log(`Progress: ${data.progress}% - ${data.stage}`);
  if (data.status === 'completed') {
    console.log('Generation completed!');
    console.log('Files generated:', data.files);
  }
};
- Architecture Guide - Detailed system architecture
- Deployment Guide - Production deployment instructions
- API Reference - Complete API documentation
- Development Setup - Local development guide
We welcome contributions! Please see our Contributing Guide for details.
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests for new functionality
- Ensure all tests pass
- Submit a pull request
- Follow PEP 8 style guidelines
- Use type hints for all functions
- Write comprehensive docstrings
- Maintain test coverage above 90%
This project is licensed under the MIT License - see the LICENSE file for details.
- FastAPI team for the excellent framework
- Hugging Face for the transformer models
- SQLAlchemy team for the powerful ORM
- OpenAI for inspiration in AI-powered development
- Documentation: docs/
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: support@codebegen.com
Built with ❤️ by the CodebeGen Team
Transform your ideas into production-ready backends with the power of AI.