SajidAli8015/TradingAI-Extractor

TradingAI — YouTube Trading Strategy Extractor

TradingAI turns a YouTube trading video into a structured strategy report and executable trading code in minutes. Paste a URL and TradingAI downloads the audio, transcribes it locally with OpenAI Whisper (the audio never leaves your machine; only the text transcript is sent to the LLM), uses GPT-4o to extract the strategies, indicators, risk rules, and algorithms discussed, then generates Pine Script v6 (TradingView), an MQL5 Expert Advisor, and MetaTrader 5 Python code from the extracted strategies.

Built for traders, analysts, and quant researchers who want to capture insights from hours of trading content — without watching every video.

✨ Key Features

  • 🎥 YouTube to report in one click — paste any YouTube URL and get a full strategy report
  • 🎙️ Local transcription — Whisper runs on your machine, no audio sent to third parties
  • 📊 Live transcription progress — real-time progress bar shows % complete while Whisper processes
  • 🧠 AI strategy extraction — GPT-4o extracts strategies, indicators, risk rules and algorithms
  • 🧩 Code generation — auto-generates Pine Script v6 (TradingView), an MQL5 Expert Advisor, and MT5 Python from every report
  • 📄 Export anywhere — download as PDF, Markdown, plain text, Pine Script (.pine), MQL5 (.mq5) or MT5 Python (.py)
  • 🔁 Handles long videos — automatic token-aware chunking for videos of any length
  • 🌐 Beautiful web UI — React frontend with live progress, GPU/CPU selector, and code tabs
  • 🖥️ CLI support — full pipeline available as a command-line tool
  • 🐳 Docker ready — CPU variant, one command to run on any machine
  • 🔌 Flexible LLM — works with Azure OpenAI or standard OpenAI API

🛠️ Tech Stack

| Layer | Technology |
| --- | --- |
| Transcription | OpenAI Whisper (local, GPU or CPU) |
| LLM | Azure OpenAI GPT-4o / OpenAI GPT-4o |
| Audio download | yt-dlp + ffmpeg |
| Backend | FastAPI + Python 3.9+ |
| Frontend | React 18 + Vite |
| Containerisation | Docker + docker-compose |
| Token counting | tiktoken |
| Code generation | Pine Script v6 + MetaTrader 5 Python |
| Progress tracking | Chunked Whisper transcription with callbacks |

⚙️ How It Works

YouTube URL
    │
    ▼
┌─────────────────┐
│  downloader.py  │  Downloads audio-only via yt-dlp + ffmpeg
└────────┬────────┘
         │
         ▼
┌──────────────────┐
│  transcriber.py  │  Transcribes locally with Whisper (GPU/CPU)
│                  │  Live progress bar — shows % complete in UI + logs
└────────┬─────────┘
         │
         ▼
┌─────────────────┐
│   chunker.py    │  Counts tokens — splits into chunks if too long
└────────┬────────┘
         │
         ▼
┌──────────────────────────┐
│  strategy_extractor.py   │  Sends to GPT-4o — extracts strategies,
│                          │  indicators, risk rules, algorithms
└────────┬─────────────────┘
         │
         ▼
┌──────────────────────────┐
│  code_generator.py       │  Generates Pine Script v6 (TradingView)
│                          │  and Python MT5 code from strategies
└────────┬─────────────────┘
         │
         ▼
┌─────────────────┐
│   reporter.py   │  Formats and saves the final report
└─────────────────┘
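
The flow in the diagram can be read as a chain of stages, each consuming the previous stage's output. The following is a minimal sketch of that idea only: `run_pipeline`, the `Stage` type, and the stub stages are hypothetical and do not reflect the actual function names inside the modules.

```python
from typing import Any, Callable, List, Optional, Tuple

# A stage is a (name, function) pair; names mirror the diagram, functions are stand-ins.
Stage = Tuple[str, Callable[[Any], Any]]


def run_pipeline(url: str, stages: List[Stage],
                 on_step: Optional[Callable[[str], None]] = None) -> Any:
    """Feed the URL through each stage in order, passing results along."""
    data: Any = url
    for name, func in stages:
        if on_step is not None:
            on_step(name)  # e.g. update a status indicator in a UI
        data = func(data)
    return data


if __name__ == "__main__":
    # Stub stages so the sketch runs without Whisper or an LLM.
    demo_stages: List[Stage] = [
        ("download", lambda u: f"{u}.mp3"),
        ("transcribe", lambda path: f"transcript of {path}"),
        ("extract", lambda text: [f"strategy from {text}"]),
        ("report", lambda strategies: "\n".join(strategies)),
    ]
    print(run_pipeline("https://www.youtube.com/watch?v=EXAMPLE", demo_stages, print))
```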

📁 Project Structure

trading_app/
├── backend/
│   ├── app.py                  # FastAPI server + SSE streaming
│   ├── main.py                 # CLI entry point
│   ├── config.py               # Config loader (.env → constants)
│   ├── logger.py               # Logging setup (console + file)
│   ├── requirements.txt        # Core dependencies
│   ├── requirements_web.txt    # Web/API dependencies
│   └── modules/
│       ├── downloader.py          # yt-dlp audio download
│       ├── transcriber.py         # Whisper transcription (with live progress %)
│       ├── chunker.py             # Token counting + smart chunking
│       ├── strategy_extractor.py  # GPT-4o strategy extraction
│       ├── code_generator.py      # Pine Script v6 + MT5 Python generation
│       └── reporter.py            # Report formatting + saving
├── frontend/
│   └── src/                    # React 18 + Vite source
├── docker/
│   ├── Dockerfile              # CPU Docker image
│   └── docker-compose.yml      # CPU compose config
├── start.py                    # One-command launcher (builds frontend + starts server)
├── .env.example                # Environment variable template
└── README.md

🔌 LLM Provider — Azure OpenAI or OpenAI

TradingAI supports two LLM providers. Set LLM_PROVIDER in your .env to choose:

Option A — Azure OpenAI

LLM_PROVIDER=azure
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=your_azure_key_here
AZURE_OPENAI_API_VERSION=2024-02-01
AZURE_DEPLOYMENT_NAME=gpt-4o

Option B — Standard OpenAI

LLM_PROVIDER=openai
OPENAI_API_KEY=sk-your-openai-key-here
OPENAI_MODEL=gpt-4o

Not sure which to use? Standard OpenAI is simpler to set up. Azure OpenAI is better for enterprise/team use with compliance requirements.


🐳 Quick Start (Docker — recommended)

No Python, Node.js, or ffmpeg setup needed. Just Docker.

Prerequisites

  • Docker Desktop installed and running
  • Your LLM credentials (Azure OpenAI or standard OpenAI)

Setup

git clone https://github.com/your-username/trading_strategy_extractor.git
cd trading_strategy_extractor
cp .env.example .env
# Open .env and fill in your credentials

Run — CPU version (works on all machines)

docker compose -f docker/docker-compose.yml up --build

Open in browser

http://localhost:8000

First build takes 3–5 minutes. Subsequent starts are instant.

| Command | Description |
| --- | --- |
| docker compose -f docker/docker-compose.yml up --build | First run |
| docker compose -f docker/docker-compose.yml up | Start (no rebuild) |
| docker compose -f docker/docker-compose.yml down | Stop the app |
| docker compose -f docker/docker-compose.yml logs -f | View live logs |

💻 Manual Setup (without Docker)

Prerequisites

  • Python 3.9+
  • FFmpeg
  • CUDA GPU (optional but recommended for Whisper speed)
  • Azure OpenAI or OpenAI API credentials

Installation

# 1. Create and activate a virtual environment
python -m venv venv
# Run only the activation line for your OS
venv\Scripts\activate        # Windows
source venv/bin/activate     # macOS / Linux

# 2. Install PyTorch (run only ONE of these; check your CUDA version with nvidia-smi)
pip install torch --index-url https://download.pytorch.org/whl/cu118   # CUDA 11.8
pip install torch --index-url https://download.pytorch.org/whl/cu121   # CUDA 12.1
pip install torch                                                        # CPU only

# 3. Install dependencies
pip install -r backend/requirements.txt

# 4. Set up environment
cp .env.example .env
# Fill in your credentials

Run the Web UI

pip install -r backend/requirements_web.txt
python start.py
# Open http://localhost:8000

⚙️ Environment Variables

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| LLM_PROVIDER | Yes | azure | azure or openai |
| AZURE_OPENAI_ENDPOINT | If azure | | Azure OpenAI endpoint URL |
| AZURE_OPENAI_API_KEY | If azure | | Azure OpenAI API key |
| AZURE_OPENAI_API_VERSION | If azure | 2024-02-01 | Azure API version |
| AZURE_DEPLOYMENT_NAME | If azure | gpt-4o | GPT-4o deployment name |
| OPENAI_API_KEY | If openai | | OpenAI API key (sk-...) |
| OPENAI_MODEL | If openai | gpt-4o | OpenAI model name |
| WHISPER_MODEL | No | small | tiny / base / small / medium / large |
| WHISPER_DEVICE | No | cpu | cuda or cpu |
| LOG_LEVEL | No | DEBUG | DEBUG / INFO / WARNING / ERROR |
| MAX_TOKENS_SAFE | No | 60000 | Token limit before chunking kicks in |
| CHUNK_OVERLAP_WORDS | No | 200 | Overlap words between chunks |
| YTDLP_COOKIES_FILE | No | | Path to cookies.txt for authenticated downloads |
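
To make the table concrete, here is a rough sketch of how a loader could apply these defaults and reject missing credentials with a ValueError. It is not the project's config.py: `load_config`, `DEFAULTS`, and `REQUIRED` are hypothetical names, and only the variable names and default values come from the table.

```python
import os
from typing import Dict, Mapping

# Defaults copied from the table above.
DEFAULTS: Dict[str, str] = {
    "LLM_PROVIDER": "azure",
    "AZURE_OPENAI_API_VERSION": "2024-02-01",
    "AZURE_DEPLOYMENT_NAME": "gpt-4o",
    "OPENAI_MODEL": "gpt-4o",
    "WHISPER_MODEL": "small",
    "WHISPER_DEVICE": "cpu",
    "LOG_LEVEL": "DEBUG",
    "MAX_TOKENS_SAFE": "60000",
    "CHUNK_OVERLAP_WORDS": "200",
}

# Credentials with no default, per provider.
REQUIRED: Dict[str, list] = {
    "azure": ["AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_API_KEY"],
    "openai": ["OPENAI_API_KEY"],
}

KNOWN_KEYS = set(DEFAULTS) | {k for keys in REQUIRED.values() for k in keys} | {"YTDLP_COOKIES_FILE"}


def load_config(env: Mapping[str, str]) -> Dict[str, str]:
    """Merge known variables over the defaults and validate the chosen provider."""
    cfg = dict(DEFAULTS)
    for key in KNOWN_KEYS:
        if env.get(key):  # ignore unset or empty values
            cfg[key] = env[key]

    provider = cfg["LLM_PROVIDER"].lower()
    if provider not in REQUIRED:
        raise ValueError(f"LLM_PROVIDER must be 'azure' or 'openai', got {provider!r}")

    missing = [key for key in REQUIRED[provider] if not cfg.get(key)]
    if missing:
        raise ValueError(f"Missing required settings for {provider}: {', '.join(missing)}")

    cfg["LLM_PROVIDER"] = provider
    return cfg


if __name__ == "__main__":
    try:
        print(load_config(os.environ)["LLM_PROVIDER"])
    except ValueError as exc:
        print(f"Configuration error: {exc}")
```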

🍪 YouTube Authentication (if required)

YouTube may require authentication to download some videos. If you see a "Sign in to confirm you're not a bot" error:

  1. Install the "Get cookies.txt LOCALLY" Chrome extension
  2. Go to youtube.com while logged in to your Google account
  3. Click the extension and export cookies as cookies.txt
  4. Place cookies.txt in the project root folder
  5. Add to your .env file:
    YTDLP_COOKIES_FILE=cookies.txt

Note: cookies.txt is gitignored and should never be committed to version control as it contains your session credentials.
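
For readers wondering how the cookies file reaches yt-dlp, the sketch below builds an options dictionary for an audio-only download. The option keys (`format`, `outtmpl`, `postprocessors`, `cookiefile`) are standard yt-dlp options; the function name `build_ydl_opts`, the output template, and the MP3 settings are assumptions and may differ from what downloader.py does.

```python
from pathlib import Path
from typing import Any, Dict, Optional


def build_ydl_opts(output_dir: str, cookies_file: Optional[str] = None) -> Dict[str, Any]:
    """Return yt-dlp options for an audio-only download, optionally authenticated."""
    opts: Dict[str, Any] = {
        "format": "bestaudio/best",  # skip the video stream
        "outtmpl": str(Path(output_dir) / "%(id)s.%(ext)s"),
        "postprocessors": [{
            "key": "FFmpegExtractAudio",  # requires ffmpeg on PATH
            "preferredcodec": "mp3",
            "preferredquality": "192",
        }],
        "quiet": True,
    }
    if cookies_file:
        path = Path(cookies_file)
        if not path.is_file():
            raise FileNotFoundError(f"Cookies file not found: {path}")
        opts["cookiefile"] = str(path)  # Netscape-format cookies.txt
    return opts


# Usage, assuming yt-dlp is installed:
#   import yt_dlp
#   with yt_dlp.YoutubeDL(build_ydl_opts("temp", "cookies.txt")) as ydl:
#       ydl.download(["https://www.youtube.com/watch?v=EXAMPLE"])
```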


🎙️ Whisper Model Reference

| Model | Size | VRAM | Speed | Recommended for |
| --- | --- | --- | --- | --- |
| tiny | 75 MB | ~1 GB | fastest | quick tests |
| base | 145 MB | ~1 GB | fast | short videos |
| small | 465 MB | ~2 GB | good | default (4 GB GPU) |
| medium | 1.5 GB | ~5 GB | better | longer/complex content |
| large | 3 GB | ~10 GB | best | maximum accuracy |

Docker CPU mode uses small by default. Change via WHISPER_MODEL in .env.
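
If you are unsure which model fits your GPU, the VRAM column can be turned into a simple lookup. This helper is purely illustrative (TradingAI does not necessarily ship anything like it); the figures are the approximate values from the table, so leave some headroom in practice.

```python
from typing import List, Optional, Tuple

# (model name, approximate VRAM in GB), ordered from least to most accurate.
WHISPER_MODELS: List[Tuple[str, float]] = [
    ("tiny", 1.0),
    ("base", 1.0),
    ("small", 2.0),
    ("medium", 5.0),
    ("large", 10.0),
]


def largest_model_for_vram(vram_gb: float) -> Optional[str]:
    """Return the most accurate model whose approximate VRAM need fits, or None."""
    best: Optional[str] = None
    for name, needed_gb in WHISPER_MODELS:
        if needed_gb <= vram_gb:
            best = name  # later entries are more accurate, so keep overwriting
    return best


if __name__ == "__main__":
    for gb in (0.5, 1, 4, 8, 12):
        print(f"{gb} GB -> {largest_model_for_vram(gb)}")
```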


🔁 Handling Long Videos

Token checking and chunking happen automatically — no manual steps needed.

| Video length | Approx. tokens | Behaviour |
| --- | --- | --- |
| Under ~5 hours | Under 60,000 | Single pass — full transcript sent to GPT-4o |
| Over ~5 hours | Over 60,000 | Auto-split → MAP phase → REDUCE phase → merged report |

How it works

  1. chunker.py counts tokens using tiktoken with the cl100k_base encoding (GPT-4o itself uses the o200k_base encoding, so treat the count as an estimate)
  2. If within limit → single pass, normal flow
  3. If over limit → split into overlapping chunks → extract strategies per chunk (MAP) → merge and deduplicate (REDUCE)

# Test the chunker on any saved transcript
python backend/modules/chunker.py ./outputs/test_transcript.txt
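
The split, MAP, and REDUCE steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in chunker.py: the function names are hypothetical, a word count stands in for the tiktoken counter so the example runs with the standard library alone, chunk size is estimated from the overall tokens-per-word ratio (so a chunk can land slightly over the limit), and a plain case-insensitive dedupe stands in for the merge step, which in the real tool may well be another LLM call.

```python
from typing import Callable, Iterable, List


def count_words_as_tokens(text: str) -> int:
    """Stand-in counter: one whitespace-separated word counts as one token."""
    return len(text.split())


def chunk_transcript(
    text: str,
    max_tokens: int = 60_000,   # MAX_TOKENS_SAFE default
    overlap_words: int = 200,   # CHUNK_OVERLAP_WORDS default
    count_tokens: Callable[[str], int] = count_words_as_tokens,
) -> List[str]:
    """Return [text] if it fits, otherwise overlapping word-based chunks."""
    total = count_tokens(text)
    if total <= max_tokens:
        return [text]

    words = text.split()
    # Estimate how many words fit in one chunk from the overall ratio.
    words_per_chunk = max(1, int(len(words) * max_tokens / total))
    step = words_per_chunk - overlap_words
    if step <= 0:
        raise ValueError("overlap_words must be smaller than the chunk size")

    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + words_per_chunk]))
        if start + words_per_chunk >= len(words):
            break  # the last chunk already reaches the end
    return chunks


def map_reduce(chunks: Iterable[str], extract: Callable[[str], List[str]]) -> List[str]:
    """MAP: extract strategies per chunk. REDUCE: merge, dropping duplicates."""
    seen = set()
    merged: List[str] = []
    for chunk in chunks:
        for strategy in extract(chunk):
            key = strategy.strip().lower()
            if key not in seen:
                seen.add(key)
                merged.append(strategy.strip())
    return merged
```

The overlap exists so that a strategy explained across a chunk boundary appears whole in at least one chunk; the dedupe step then removes the copies that the overlap produces.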

🧩 Code Generation

After extracting strategies, TradingAI automatically generates executable trading code in three formats — ready to use in real trading platforms.

Pine Script v6 (TradingView)

  • Ready to paste directly into TradingView's Pine Script Editor
  • Implements entry/exit conditions, indicators, stop loss and take profit
  • Uses Pine Script v6 syntax with built-in ta.* functions
  • How to use:
    1. Open TradingView
    2. Go to Pine Script Editor
    3. Click "New Script"
    4. Paste the generated code
    5. Click "Add to chart"

MQL5 Expert Advisor — MetaTrader 5 (Native)

  • Runs natively inside MetaTrader 5 — no Python or external software needed
  • Complete Expert Advisor (.mq5) with OnInit(), OnDeinit(), OnTick() handlers
  • Uses MT5's built-in indicator functions (iMA, iRSI, iATR, etc.)
  • Input parameters configurable directly from the MT5 interface
  • How to use:
    1. Download the .mq5 file
    2. Open MetaTrader 5 → File → Open Data Folder
    3. Place file in MQL5/Experts/ folder
    4. In MT5 Navigator → Expert Advisors → right-click → Refresh
    5. Double-click the EA to attach to a chart
    6. Enable "Allow Algo Trading" in MT5 settings

Python MT5 — MetaTrader 5 (External)

  • Python script that connects to MT5 externally via the MetaTrader5 library
  • MT5 must be open and running on your machine
  • More flexible for custom data processing and ML integration
  • Prerequisite: pip install MetaTrader5

Which MT5 option should I use?

| | MQL5 | Python MT5 |
| --- | --- | --- |
| Runs inside MT5 | ✅ Native | ❌ External |
| Requires Python | ❌ No | ✅ Yes |
| Best for | Pure trading automation | Custom logic + ML |
| Recommended for | Most traders | Quant developers |

Where to find the generated code

All three formats are available in the web UI as tabs:

[ Strategy Report ] [ Pine Script ] [ MQL5 (MT5) ] [ MT5 Python ]

Each tab has:

  • Syntax-highlighted code block (github-dark theme)
  • Copy button — copies code to clipboard instantly
  • Download button — saves as .pine, .mq5, or .py file

Test code generation independently

python backend/modules/code_generator.py ./outputs/test_transcript.txt

This generates and saves:

  • ./outputs/test_pine_script.pine
  • ./outputs/test_expert_advisor.mq5
  • ./outputs/test_mt5_strategy.py

🌐 Web UI

What the UI shows

  • URL bar — paste any YouTube URL, select CPU or GPU (CUDA), click Generate
  • Left panel — live pipeline steps with animated status indicators, real-time transcription progress bar (updates every ~10% as Whisper processes), then video info card, token consumption card with progress bar, and stats grid (words, tokens, API calls, total time)
  • Right panel — four tabs when report is ready:
    • Strategy Report — fully formatted markdown report with section headers
    • Pine Script — syntax-highlighted Pine Script v6 code with Copy + Download
    • MQL5 (MT5) — Native MetaTrader 5 Expert Advisor with Copy + Download
    • MT5 Python — syntax-highlighted MetaTrader 5 Python code with Copy + Download
  • Download buttons — TXT · MD · PDF · Pine Script (.pine) · MQL5 (.mq5) · MT5 Python (.py)
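
The live updates in the left panel arrive over Server-Sent Events (app.py is described as "FastAPI server + SSE streaming"). As a rough sketch of the wire format only, the helper below serialises one event; the event names and payload fields are hypothetical and may not match what app.py emits.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def sse_event(event: str, payload: Dict[str, Any]) -> str:
    """Format one Server-Sent Event: an event line, a data line, then a blank line."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


def progress_stream(percentages: Iterable[int]) -> Iterator[str]:
    """Yield a progress event per percentage, then a final 'done' event."""
    for pct in percentages:
        yield sse_event("progress", {"step": "transcribe", "percent": pct})
    yield sse_event("done", {})


# With FastAPI, a generator like this can be returned as
#   StreamingResponse(progress_stream(...), media_type="text/event-stream")
if __name__ == "__main__":
    for chunk in progress_stream([10, 20, 30]):
        print(chunk, end="")
```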

🖥️ CLI Usage

# Full pipeline
python backend/main.py "https://www.youtube.com/watch?v=EXAMPLE"

# Transcription only (skip LLM)
python backend/main.py "https://youtube.com/watch?v=EXAMPLE" --transcript-only

# Override Whisper model
python backend/main.py "https://youtube.com/watch?v=EXAMPLE" --whisper-model base

# Verbose debug logging
python backend/main.py "https://youtube.com/watch?v=EXAMPLE" --log-level DEBUG

# Test individual modules
python backend/modules/downloader.py "https://youtube.com/watch?v=EXAMPLE"
python backend/modules/transcriber.py ./temp/audio.mp3
python backend/modules/strategy_extractor.py ./outputs/test_transcript.txt
python backend/modules/reporter.py

# Test code generator independently
python backend/modules/code_generator.py ./outputs/test_transcript.txt

📁 Output Format

Reports are saved to ./outputs/ as {title}_{timestamp}.txt:

================================================
TRADING STRATEGY REPORT
================================================
Source Video : Video Title Here
Channel      : Channel Name
Duration     : 28m 24s
Transcript   : 6,732 words
Generated    : 2026-04-06 14:22:00
Tokens Used  : 8,738
================================================

[extracted strategies]

================================================
END OF REPORT
================================================
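
A layout like the one above could be produced along these lines. This is a sketch, not reporter.py: the function names are hypothetical, and the filename timestamp format and title sanitising are assumptions, since the README only specifies {title}_{timestamp}.txt.

```python
import re
from datetime import datetime
from typing import Optional

RULE = "=" * 48


def format_duration(seconds: int) -> str:
    """Render seconds as 'Xm Ys', matching the sample report."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}m {secs}s"


def build_report(title: str, channel: str, duration_s: int, transcript: str,
                 strategies: str, tokens_used: int,
                 now: Optional[datetime] = None) -> str:
    """Assemble the header, extracted strategies, and footer as plain text."""
    now = now or datetime.now()
    lines = [
        RULE, "TRADING STRATEGY REPORT", RULE,
        f"Source Video : {title}",
        f"Channel      : {channel}",
        f"Duration     : {format_duration(duration_s)}",
        f"Transcript   : {len(transcript.split()):,} words",
        f"Generated    : {now:%Y-%m-%d %H:%M:%S}",
        f"Tokens Used  : {tokens_used:,}",
        RULE, "", strategies, "",
        RULE, "END OF REPORT", RULE,
    ]
    return "\n".join(lines) + "\n"


def report_filename(title: str, now: datetime) -> str:
    """Build '{title}_{timestamp}.txt' with a filesystem-safe title (assumed format)."""
    safe_title = re.sub(r"[^A-Za-z0-9]+", "_", title).strip("_")
    return f"{safe_title}_{now:%Y%m%d_%H%M%S}.txt"
```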

🔍 Troubleshooting

CUDA / GPU issues

  • "CUDA not available" — Whisper falls back to CPU automatically. Check CUDA with nvidia-smi.
  • GPU out of memory — Use a smaller model: --whisper-model tiny or base
  • torch not found — Use the CUDA-specific pip install command above

yt-dlp / ffmpeg errors

  • "Video unavailable" — Video may be private, age-restricted, or region-blocked
  • "ffmpeg not found" — Install FFmpeg and add its bin/ folder to your system PATH
  • "Sign in to confirm you're not a bot" — See the YouTube Authentication section above

Token / chunking issues

  • A WARNING is logged automatically when transcript exceeds safe token limit
  • Chunking is fully automatic — no manual steps needed
  • Use python backend/modules/chunker.py ./outputs/transcript.txt to inspect token counts

Missing .env values

  • A clear ValueError is raised if required credentials are missing
  • Check your .env file is in the project root (not inside a subfolder)
  • Make sure LLM_PROVIDER matches the credentials you have filled in

Pine Script or MT5 code not generating

  • Check that your LLM API key is valid and has remaining quota
  • Code generation makes 2 additional API calls after strategy extraction
  • If code tabs appear empty, check the logs for code_generator errors
  • Test independently:
    python backend/modules/code_generator.py ./outputs/test_transcript.txt
  • If the transcript is very short (under 100 words), the LLM may not generate meaningful code — try with a longer video

MQL5 Expert Advisor not compiling in MetaTrader 5

  • Open MetaTrader 5 → Tools → MetaEditor (or press F4)
  • Open the .mq5 file in MetaEditor
  • Press F7 to compile — check the Errors tab for issues
  • Common fix: make sure #include <Trade\Trade.mqh> is available (comes with standard MT5 installation)
  • If you see "CTrade" errors, the Trade library may need to be enabled in MetaEditor's Include paths
  • The generated EA is a starting template — you may need to adjust lot sizes and broker-specific settings

Transcription progress shows 0% or does not update

  • Progress updates every ~10% as each audio chunk is processed
  • Short videos (under 2 minutes) may complete too fast to see progress
  • Check logs for [DEBUG] Transcription progress: lines to confirm it is working
  • If using GPU (CUDA), transcription is faster and progress updates may flash briefly
  • Try with a longer video (5+ minutes) to see progress updates clearly
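
To see why short videos may show little or no visible progress, consider a sketch of chunk-based progress reporting. The names here are hypothetical and the real transcriber.py may work differently; `transcribe_chunk` stands in for the Whisper call, and the callback fires only when progress has advanced by at least `step_pct` points or the last chunk completes.

```python
from typing import Callable, List, Sequence


def transcribe_with_progress(
    chunks: Sequence[str],
    transcribe_chunk: Callable[[str], str],
    on_progress: Callable[[int], None],
    step_pct: int = 10,
) -> str:
    """Transcribe chunks in order, reporting progress roughly every step_pct percent."""
    texts: List[str] = []
    last_reported = 0
    total = len(chunks)
    for index, chunk in enumerate(chunks, start=1):
        texts.append(transcribe_chunk(chunk))
        pct = index * 100 // total
        if pct - last_reported >= step_pct or index == total:
            on_progress(pct)
            last_reported = pct
    return " ".join(texts)


if __name__ == "__main__":
    fake_chunks = [f"audio-{i}" for i in range(20)]
    transcribe_with_progress(
        fake_chunks,
        transcribe_chunk=lambda c: f"text({c})",  # stand-in for Whisper
        on_progress=lambda pct: print(f"Transcription progress: {pct}%"),
    )
```

With only a handful of chunks, the callback fires just a few times in quick succession, which matches the behaviour described for short videos.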

🤝 Contributing

Pull requests are welcome! For major changes please open an issue first.

  1. Fork the repo
  2. Create a feature branch: git checkout -b feature/your-feature
  3. Commit: git commit -m "add your feature"
  4. Push: git push origin feature/your-feature
  5. Open a Pull Request
