⚡ Flow Agent

Generate AI videos and images from your terminal with no API key. Image generation is unlimited; video generation uses your Flow credits.

Omni Flash for cinematic video generation · Nano Banana 2 for unlimited image creation
Auto watermark removal · Reference-based editing · One command, zero setup.


🎬 T2V V2V I2V — Video generation with auto watermark clean (uses credits)
🖼️ T2I I2I — Unlimited image generation with reference support (no credits needed)
🔑 Uses your Google account via Chrome extension — no API key required

✅ Features & Status

| Feature | What it does | Time | Status |
|---|---|---|---|
| T2V | Generate video from text prompt | ~44s | ✅ Working |
| T2I | Generate image from text prompt | ~10-30s | ✅ Working |
| V2V | Edit/restyle existing video | ~3min | ✅ Working |
| I2I | Edit image with reference | ~10-30s | ✅ Working |
| I2V | Animate a still image into video | ~44s | ✅ Working |
| FL | First + Last frame video control | ~44s | ✅ Working |
| R2V | Reference-based video generation | ~44s | ✅ Working |
| Upload | Upload video/image to Flow | ~12s | ✅ Working |
| Watermark Remove | Auto-remove Gemini watermark | ~1s | ✅ Auto |
| Auto-Retry | Auto-open/refresh Flow tab for token | auto | ✅ Built-in |
| API Sniffer | Discover new endpoints/payloads | - | ✅ Working |

📋 Prerequisites

| Requirement | Details |
|---|---|
| Python | 3.9 or higher |
| Chrome | Latest version |
| Google Account | Logged into Flow |
| ffmpeg | Only for V2V merge (optional) |
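
The Python and ffmpeg requirements above can be checked locally before installing. The script below is a minimal sketch and not part of the repo; the function name `check_prerequisites` is made up for illustration.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 9)):
    """Return a dict describing whether each local prerequisite is met."""
    return {
        # Python 3.9 or higher is required.
        "python": sys.version_info[:2] >= min_python,
        # ffmpeg is optional; it is only needed for the V2V merge step.
        "ffmpeg": shutil.which("ffmpeg") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

A missing ffmpeg only matters if you plan to merge V2V segments.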

🛠️ Installation (Step by Step)

Step 1: Clone the repo

git clone https://github.com/kodelyx/flow-agent.git
cd flow-agent

Step 2: Install Python dependencies

pip install -r requirements.txt

This installs websockets, opencv-python-headless, and numpy.

Step 3: Install the Chrome Extension

  1. Open Chrome browser
  2. Go to chrome://extensions in the address bar
  3. Toggle "Developer mode" ON (top-right corner)
  4. Click "Load unpacked"
  5. Select the extension/ folder from this repo
  6. You should see the Flow Agent extension appear

Step 4: Open Google Flow

  1. Open labs.google/fx/tools/flow in Chrome
  2. Make sure you're logged into your Google account
  3. The extension icon should show a green badge, which means it is connected
  4. The extension auto-opens this tab when needed

⚠️ The Flow tab auto-opens when you run a command. No manual tab management needed!


🚀 Usage

Text → Video (T2V)

Generate a new video from a text description.

# Basic (portrait 9:16, 10 seconds)
python -m cli.generate "A samurai drawing his katana on a cliff at golden sunset"

# Landscape mode (16:9)
python -m cli.generate "Eagle soaring over snowy mountains" --aspect landscape

# Custom output file
python -m cli.generate "A dragon breathing fire" -o dragon.mp4

# Shorter duration (4/6/8/10 seconds)
python -m cli.generate "Dog playing in the park" --duration 6

# Generate multiple variations
python -m cli.generate "Cyberpunk city at night" --count 4

# I2V — animate a still image
python -m cli.generate "Character comes alive" --start photo.png

# FL — First + Last frame (controlled transition)
python -m cli.generate "Person walks forward" --start start.png --end end.png

# R2V — Reference images (character consistency)
python -m cli.generate "Character in new scene" --ref char1.png char2.png

CLI Options:

| Flag | Short | Default | Description |
|---|---|---|---|
| --output | -o | omni_output.mp4 | Output filename |
| --aspect | -a | portrait | portrait or landscape |
| --duration | -d | 10 | 4, 6, 8, or 10 seconds |
| --count | -c | 1 | Generate 1-4 videos |
| --edit | -e | - | Pass media_id for V2V edit mode |
| --start | -s | - | Start frame image (I2V / FL mode) |
| --end | | - | End frame image (use with --start for FL) |
| --ref | -r | - | Reference image(s) for R2V mode |
| --no-clean | | - | Skip auto watermark removal |
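
The flags above decide which generation mode runs. As a rough illustration, the sketch below declares them with `argparse` and maps the parsed result to a mode name. Both `build_parser` and `pick_mode` are hypothetical, and the precedence used when several flags are combined is an assumption; the real `cli/generate.py` may resolve this differently.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags listed above."""
    p = argparse.ArgumentParser(prog="cli.generate")
    p.add_argument("prompt")
    p.add_argument("-o", "--output", default="omni_output.mp4")
    p.add_argument("-a", "--aspect", default="portrait",
                   choices=["portrait", "landscape"])
    p.add_argument("-d", "--duration", type=int, default=10,
                   choices=[4, 6, 8, 10])
    p.add_argument("-c", "--count", type=int, default=1, choices=range(1, 5))
    p.add_argument("-e", "--edit", metavar="MEDIA_ID")
    p.add_argument("-s", "--start")
    p.add_argument("--end")
    p.add_argument("-r", "--ref", nargs="+")
    p.add_argument("--no-clean", action="store_true")
    return p


def pick_mode(args):
    """Map parsed flags to a mode name (the precedence is an assumption)."""
    if args.edit:
        return "V2V"
    if args.start and args.end:
        return "FL"
    if args.start:
        return "I2V"
    if args.ref:
        return "R2V"
    return "T2V"


args = build_parser().parse_args(["Person walks forward",
                                  "--start", "start.png", "--end", "end.png"])
print(pick_mode(args))  # FL
```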

Text → Image (T2I)

Generate images from a text description.

# Basic (portrait 9:16)
python -m cli.image "A dragon breathing fire in a cyberpunk city"

# Landscape
python -m cli.image "Mountain sunset" --aspect landscape -o sunset.png

# Square (for logos, icons)
python -m cli.image "Minimal logo design" --aspect square

# Generate 4 variations
python -m cli.image "Abstract art" --count 4

# I2I: Edit with reference image
python -m cli.image "Make it anime style" --ref original.png -o anime.png

CLI Options:

| Flag | Short | Default | Description |
|---|---|---|---|
| --output | -o | output/image.png | Output filename |
| --aspect | -a | portrait | portrait, landscape, square, 4x3, 3x4 |
| --count | -c | 1 | Generate 1-4 variations |
| --ref | -r | - | Reference image(s) for I2I |

Upload Video

Upload a local video to Google Flow. Returns a media_id needed for V2V editing.

# Single video
python -m cli.upload my_video.mp4

# Batch upload all .mp4 in a folder
python -m cli.upload chunks/ --batch

The media_id is automatically saved to media-id.js:

my_video.mp4 : 49f7d936-01e3-41ad-917a-2f9bb6ead00b
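
If a script needs to look up a media_id without importing the package, entries in the form shown above can be parsed by hand. This is a minimal sketch that assumes each entry sits on its own line as `filename : media_id`; the real file may carry extra JavaScript wrapping, and `parse_media_entries` is a made-up helper, so prefer the package's own `media_store` module where possible.

```python
def parse_media_entries(text):
    """Parse 'filename : media_id' lines into a dict; skip anything else."""
    entries = {}
    for line in text.splitlines():
        name, sep, media_id = line.partition(" : ")
        if sep and name.strip() and media_id.strip():
            entries[name.strip()] = media_id.strip()
    return entries


sample = "my_video.mp4 : 49f7d936-01e3-41ad-917a-2f9bb6ead00b"
print(parse_media_entries(sample))
```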

Video → Video Edit (V2V)

Apply style changes to an uploaded video (e.g., convert to anime).

# Step 1: Upload your video
python -m cli.upload my_video.mp4
# → media_id saved to media-id.js

# Step 2: Edit with style prompt
python -m cli.edit "Transform into vibrant anime style, Studio Ghibli aesthetic" \
    --media-id 49f7d936-01e3-41ad-917a-2f9bb6ead00b \
    --video-file my_video.mp4 \
    --output output_anime/ \
    --merge

# Without local video file (specify duration manually)
python -m cli.edit "Make it look cyberpunk neon" \
    -m MEDIA_ID \
    --total-seconds 30 \
    -o output_cyber/

How it works: Long videos are automatically split into 10s segments, each processed in parallel, then merged with ffmpeg.
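
The splitting step can be pictured with a small sketch. It only shows how a total duration might be divided into 10-second windows; `plan_segments` is a hypothetical helper, and how the tool treats a shorter final segment is not documented here, so this version simply keeps the remainder.

```python
def plan_segments(total_seconds, segment_length=10):
    """Return (start, end) pairs covering the clip in fixed-size segments."""
    if total_seconds <= 0 or segment_length <= 0:
        raise ValueError("durations must be positive")
    segments = []
    start = 0
    while start < total_seconds:
        end = min(start + segment_length, total_seconds)
        segments.append((start, end))
        start = end
    return segments


print(plan_segments(30))  # [(0, 10), (10, 20), (20, 30)]
print(plan_segments(25))  # [(0, 10), (10, 20), (20, 25)]
```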

CLI Options:

| Flag | Short | Required | Description |
|---|---|---|---|
| --media-id | -m | Yes | Flow media ID (from upload) |
| --video-file | -v | No | Local file (auto-detects duration/fps) |
| --total-seconds | -t | No | Duration if no local file |
| --output | -o | No | Output directory (default: output/) |
| --aspect | -a | No | portrait or landscape |
| --merge | | No | Merge segments with ffmpeg |

Image → Video (I2V)

Animate a still image into a video. The CLI supports this through --start (see Text → Video above); the same flow can also be scripted in Python:

import asyncio
from omniflash import (
    ExtensionBridge, upload_image, generate_video_i2v,
    poll_status, download_video, ASPECTS, DEFAULT_PROJECT,
)

async def main():
    # Connect to extension
    bridge = ExtensionBridge()
    await bridge.start()
    try:
        await bridge.wait_for_extension(30)

        # Upload your image
        img_id = await upload_image(bridge, "my_image.png")
        print(f"Image uploaded: {img_id}")

        # Generate video from image
        media_ids = await generate_video_i2v(
            bridge,
            prompt="The character comes alive, dramatic movement, cinematic",
            aspect=ASPECTS['portrait'],       # or ASPECTS['landscape']
            project_id=DEFAULT_PROJECT,
            image_media_id=img_id,
            duration=8,                        # 4, 6, 8, or 10 seconds
        )

        # Stop early if nothing came back (e.g. a blocked prompt)
        if not media_ids:
            print("No media in response; check your prompt")
            return

        # Wait for video to finish, then download it
        await poll_status(bridge, media_ids[0], DEFAULT_PROJECT)
        await download_video(bridge, media_ids[0], "output_i2v.mp4")
    finally:
        # Close the bridge even if a step above raised
        await bridge.close()

asyncio.run(main())

API Sniffer

Capture all API requests made by the Flow UI. Useful for discovering new endpoints or debugging.

# Start sniffer, then use Flow UI normally
python -m cli.sniff

# Save captured requests to file
python -m cli.sniff --save sniffed.json

See SNIFFING.md for the full API discovery guide.


🐍 Python API (for developers)

Use the omniflash package directly in your own scripts:

from omniflash import (
    ExtensionBridge,          # WebSocket bridge to Chrome extension
    generate_video,           # T2V: text → video
    edit_video,               # V2V: video → video
    upload_image,             # Upload image → get media_id
    generate_video_i2v,       # I2V: image → video
    poll_status,              # Poll until video is ready
    download_video,           # Download finished video
    ASPECTS,                  # {'portrait': '...', 'landscape': '...'}
    DEFAULT_PROJECT,          # Your default project ID
)
from omniflash.upload import upload_video  # Upload video file
from omniflash import media_store          # Read/write media-id.js

# media_store examples:
media_store.save("video.mp4", "uuid-here")   # Save entry
mid = media_store.get("video.mp4")            # Get media_id
all_entries = media_store.read_entries()       # Get all entries

Backward compatible: from omni import ExtensionBridge, generate_video, ... still works.


📁 Project Structure

flow-agent/
├── omniflash/                  # Core Python package
│   ├── __init__.py             # Public API exports
│   ├── bridge.py               # ExtensionBridge (WS + HTTP + auto-retry)
│   ├── config.py               # All config hardcoded here
│   ├── media_store.py          # media-id.js read/write
│   ├── upload.py               # Video upload (GCS resumable)
│   ├── watermark.py            # Auto watermark removal (embedded assets)
│   └── generators/             # API functions
│       ├── common.py           # poll_status, download_video
│       ├── t2v.py              # Text → Video
│       ├── t2i.py              # Text → Image + I2I
│       ├── v2v.py              # Video → Video (edit)
│       └── i2v.py              # Image → Video + upload_image
├── cli/                        # CLI entry points
│   ├── generate.py             # python -m cli.generate
│   ├── image.py                # python -m cli.image
│   ├── upload.py               # python -m cli.upload
│   ├── edit.py                 # python -m cli.edit
│   └── sniff.py                # python -m cli.sniff
├── extension/                  # Chrome extension
│   ├── manifest.json           # Extension manifest
│   ├── background.js           # WS client, API proxy
│   ├── content.js              # Page ↔ background bridge
│   └── injected.js             # Fetch interceptor, reCAPTCHA
├── omni.py                     # Backward-compatible wrapper
├── .gitignore                  # Git ignore rules
├── media-id.js                 # Auto-updated filename → media_id
├── requirements.txt            # Python dependencies
├── SNIFFING.md                 # API discovery guide
└── README.md

⚙️ How It Works

┌─────────────────────────────────┐
│  Your Terminal / Python Script  │
│  python -m cli.generate "..."   │
└──────────┬──────────────────────┘
           │ import omniflash
           ▼
┌─────────────────────────────────┐
│  omniflash package              │
│  ExtensionBridge (WS + HTTP)    │
└──────────┬──────────────────────┘
           │ WebSocket (:9222)
           │ HTTP callback (:8100)
           ▼
┌─────────────────────────────────┐
│  Chrome Extension (Flow Agent)  │
│  Auth token + reCAPTCHA solving │
└──────────┬──────────────────────┘
           │ HTTPS (browser cookies)
           ▼
┌─────────────────────────────────┐
│  Google Omni API (aisandbox)    │
│  Video generation / editing     │
└─────────────────────────────────┘

  1. Python starts a WebSocket server + HTTP callback server
  2. Chrome extension auto-connects and provides authentication
  3. Script sends API requests through the extension
  4. Extension solves reCAPTCHA and proxies with browser cookies
  5. Script polls for completion, then downloads the result
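
Step 5 above is a standard poll-until-done loop. The sketch below shows the general shape with a stand-in status check; the status strings (`pending`, `done`, `failed`), the helper names, and the very short interval are all assumptions chosen so the demo runs instantly. The package's own `poll_status` may behave differently.

```python
import asyncio


async def poll_until_done(check, interval=0.01, timeout=1.0):
    """Call `check()` repeatedly until it returns a terminal status."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        status = await check()
        if status in ("done", "failed"):
            return status
        if loop.time() >= deadline:
            raise TimeoutError("generation did not finish in time")
        await asyncio.sleep(interval)


async def demo():
    # Stand-in for a real status request: pending twice, then done.
    statuses = iter(["pending", "pending", "done"])

    async def fake_check():
        return next(statuses)

    return await poll_until_done(fake_check)


print(asyncio.run(demo()))  # done
```

A real client would use an interval of several seconds and a timeout that matches the generation times listed under Features & Status.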

🎯 Models & Endpoints

| Model | Key | Duration | Type |
|---|---|---|---|
| Omni Flash T2V 4s | abra_t2v_4s | 4 sec | Text → Video |
| Omni Flash T2V 6s | abra_t2v_6s | 6 sec | Text → Video |
| Omni Flash T2V 8s | abra_t2v_8s | 8 sec | Text → Video |
| Omni Flash T2V 10s | abra_t2v_10s | 10 sec | Text → Video |
| Omni Flash Edit | abra_edit | 10 sec | Video → Video |

| Endpoint | Path |
|---|---|
| T2V | /v1/video:batchAsyncGenerateVideoText |
| I2V | /v1/video:batchAsyncGenerateVideoStartImage |
| V2V Edit | /v1/video:batchAsyncGenerateVideoEditVideo |
| Upload Image | /v1/flow/uploadImage |
| Poll Status | /v1/video:batchCheckAsyncVideoGenerationStatus |
| Get Media | /v1/video/media/{media_id} |
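
For scripts that build requests by hand, the model keys and paths above can be kept in one small lookup. This is only a sketch: the key strings and paths are copied from the listings above, while the dictionary labels and the two helper functions are hypothetical and not part of the package.

```python
# Values copied from the model and endpoint listings above.
T2V_DURATIONS = (4, 6, 8, 10)
ENDPOINTS = {
    "t2v": "/v1/video:batchAsyncGenerateVideoText",
    "i2v": "/v1/video:batchAsyncGenerateVideoStartImage",
    "v2v": "/v1/video:batchAsyncGenerateVideoEditVideo",
    "upload_image": "/v1/flow/uploadImage",
    "poll": "/v1/video:batchCheckAsyncVideoGenerationStatus",
    "media": "/v1/video/media/{media_id}",
}


def t2v_model_key(duration):
    """Return the T2V model key for a supported duration in seconds."""
    if duration not in T2V_DURATIONS:
        raise ValueError(f"duration must be one of {T2V_DURATIONS}")
    return f"abra_t2v_{duration}s"


def media_path(media_id):
    """Fill the media_id placeholder in the Get Media path."""
    return ENDPOINTS["media"].format(media_id=media_id)


print(t2v_model_key(6))       # abra_t2v_6s
print(media_path("abc-123"))  # /v1/video/media/abc-123
```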

🔧 Troubleshooting

| Problem | Solution |
|---|---|
| Extension not connecting | Make sure Flow tab is open and you're logged in |
| Address already in use | Another script is using port 9222/8100. Kill it first |
| TIMEOUT error | Extension may have disconnected. Reload Flow tab |
| reCAPTCHA failed | Reload Flow tab, wait a few seconds, try again |
| No media in response | Check your prompt. Some prompts get blocked |
| curl failed | Upload too large or network issue. Retry |
| Video quality poor | Use longer duration (10s) and detailed prompts |
| V2V merge fails | Install ffmpeg: brew install ffmpeg |
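
For the "Address already in use" case, a quick probe can tell you whether something is already listening on the two ports before you start a new run. This sketch assumes the servers listen on localhost (127.0.0.1), which the README does not state explicitly; `port_in_use` is a made-up helper.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(0.5)
        # connect_ex returns 0 when the connection succeeds.
        return probe.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for port in (9222, 8100):
        state = "in use" if port_in_use(port) else "free"
        print(f"port {port}: {state}")
```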

⚠️ Important Notes

  • Flow tab auto-opens — no manual tab management needed
  • Uses your Google account's free Flow credits (check remaining in Flow UI)
  • Extension auto-reconnects and auto-retries (3 attempts)
  • Watermark auto-removed on every generated video (~1s)
  • media-id.js auto-updates on every upload (video or image)
  • All generated videos save to the output/ directory by default
  • Old from omni import ... syntax still works (backward compatible)
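
The three-attempt retry mentioned above follows a common pattern, sketched here in generic form. The helper `with_retries` and the simulated failure are illustrative only; the bridge's actual retry logic (what it catches, whether it waits or refreshes the tab between attempts) is not shown in this README.

```python
import time


def with_retries(func, attempts=3, delay=0.0):
    """Call func() up to `attempts` times, re-raising the last error."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as exc:  # a real client would catch narrower errors
            last_error = exc
            if attempt < attempts:
                time.sleep(delay)
    raise last_error


calls = {"n": 0}


def flaky():
    # Fails twice, then succeeds on the third call.
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("extension not connected")
    return "ok"


print(with_retries(flaky))  # ok
```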

📄 License

MIT
