
Qwen-Image-Edit-Object-Manipulator

A Gradio-based interactive demonstration built on top of Qwen/Qwen-Image-Edit-2511, specifically optimized for high-precision object manipulation through lazy-loaded LoRA adapters. The system enables users to add, modify, or remove targeted elements such as logos, accessories, clothing items, or background objects, using either single-image or multi-image inputs. Each LoRA adapter is loaded on demand, minimizing memory overhead while allowing rapid switching between different editing capabilities. This design ensures scalability and efficient resource utilization, especially in multi-adapter workflows.

The demo emphasizes structural consistency and visual realism, preserving original lighting conditions, shadows, textures, perspective, and background context during edits. Fine-grained prompt control allows users to specify exact object attributes, placement, and transformations, reducing unwanted artifacts and improving edit fidelity. For performance, the pipeline is accelerated using Flash Attention 3, enabling low-latency inference even at higher resolutions and complex edit scenarios. The result is a responsive, production-ready image editing interface suitable for research, prototyping, and real-world creative applications.

This app uses Gradio v6.3.0.

Features

  • Object Addition/Removal: Add elements (e.g., "Add batman logo") or remove them (e.g., "Remove necklace") so that the edit blends naturally into the scene.
  • Multi-Image Support: Upload several images to the gallery for reference-based edits (e.g., transfer pose or style).
  • Lazy LoRA Loading: Two adapters (Object-Adder, Object-Remover) load on demand to reduce memory use.
  • Rapid Inference: Generations default to 4 steps, using bfloat16 and Flash Attention 3.
  • Auto-Resizing: Keeps the aspect ratio while limiting the longer edge to 1024 px; dimensions are rounded to multiples of 8.
  • Custom Theme: OrangeRedTheme with responsive layout.
  • Examples: 4 curated scenarios for quick testing.
  • Queueing: Up to 30 concurrent jobs.
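
The auto-resizing rule in the feature list can be illustrated with a small helper. This is a minimal sketch, not the app's actual code: the function name `fit_dimensions` is hypothetical, and it assumes that images smaller than the limit are not upscaled and that rounding goes down to the nearest multiple of 8.

```python
def fit_dimensions(width: int, height: int,
                   max_edge: int = 1024, multiple: int = 8) -> tuple[int, int]:
    """Return (width, height) with the longer edge capped at `max_edge`,
    the aspect ratio preserved, and both sides snapped to `multiple`."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")

    longest = max(width, height)
    if longest > max_edge:
        # Integer arithmetic avoids floating-point rounding surprises.
        # Assumption: smaller images are left at their original scale.
        width = width * max_edge // longest
        height = height * max_edge // longest

    # Round down to the nearest multiple, but never below one multiple.
    width = max(multiple, (width // multiple) * multiple)
    height = max(multiple, (height // multiple) * multiple)
    return width, height


if __name__ == "__main__":
    print(fit_dimensions(1920, 1080))  # (1024, 576)
    print(fit_dimensions(500, 300))    # (496, 296)
```

Rounding down rather than to the nearest multiple keeps the result within the 1024 px limit; the aspect ratio can shift slightly because each side is snapped independently.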

Note: Support for Qwen-Image-Edit-2511 is experimental; consider the 2509 version for stability.

Prerequisites

  • Python 3.10 or higher.
  • CUDA-compatible GPU (required for bfloat16 and Flash Attention 3).
  • pip >= 23.0.0 (see pre-requirements.txt).
  • Stable internet for initial model/LoRA downloads.
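
Before installing, the prerequisites above can be checked with a short script. This is a sketch under stated assumptions: the helper name `check_environment` is hypothetical and not part of the repository, and it only reports what it finds rather than enforcing anything. It uses `torch.cuda.is_available()` when PyTorch is already installed and otherwise reports CUDA as unavailable.

```python
import importlib.util
import sys


def check_environment(min_python: tuple = (3, 10)) -> dict:
    """Report whether the interpreter and (if installed) PyTorch meet the prerequisites."""
    report = {
        "python_ok": sys.version_info >= min_python,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        # Import lazily so the check also works before dependencies are installed.
        import torch
        report["cuda_available"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```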

Installation

  1. Clone the repository:

    git clone https://github.com/PRITHIVSAKTHIUR/Qwen-Image-Edit-Object-Manipulator.git
    cd Qwen-Image-Edit-Object-Manipulator
    
  2. Install pre-requirements: Create a pre-requirements.txt file with the following content, then run:

    pip install -r pre-requirements.txt
    

    pre-requirements.txt content:

    pip>=23.0.0
    
  3. Install dependencies: Create a requirements.txt file with the following content, then run:

    pip install -r requirements.txt
    

    requirements.txt content:

    git+https://github.com/huggingface/transformers.git@v4.57.3
    git+https://github.com/huggingface/accelerate.git
    git+https://github.com/huggingface/diffusers.git
    git+https://github.com/huggingface/peft.git
    huggingface_hub
    sentencepiece
    torchvision
    supervision
    kernels
    spaces
    gradio
    hf_xet
    torch
    numpy
    av
    
  4. Start the application:

    python app.py
    

    The demo launches at http://localhost:7860.

Usage

  1. Upload Images: Use the gallery to upload one or more images (e.g., base + reference).

  2. Select Manipulator: Choose "Object-Adder" or "Object-Remover".

  3. Enter Prompt: Describe the action (e.g., "Add the batman logo while preserving background lighting").

  4. Configure (Optional): Expand "Advanced Settings" to set the seed, guidance, and steps.

  5. Edit Image: Click "Edit Image" to generate the output.

Supported Manipulators

| Manipulator    | Use Case                                        |
|----------------|-------------------------------------------------|
| Object-Adder   | Add elements (logos, accessories) realistically |
| Object-Remover | Remove items (jewelry, goggles) seamlessly      |
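
The on-demand loading of these manipulators can be sketched as a small manager that loads each adapter the first time it is selected and only activates it on later selections. This is an illustrative sketch, not the app's implementation: the README names `ADAPTER_SPECS` but does not show its layout, so the dictionary shape, the placeholder repository strings, and the `LazyAdapterManager` class are all assumptions.

```python
from typing import Callable, Dict, Set

# Hypothetical layout; the real ADAPTER_SPECS in app.py may differ.
ADAPTER_SPECS: Dict[str, Dict[str, str]] = {
    "Object-Adder": {"repo": "<adder-lora-repo>", "adapter_name": "object-adder"},
    "Object-Remover": {"repo": "<remover-lora-repo>", "adapter_name": "object-remover"},
}


class LazyAdapterManager:
    """Load each adapter once, on first use, then only switch between them."""

    def __init__(self,
                 load_fn: Callable[[str, str], None],
                 activate_fn: Callable[[str], None]) -> None:
        self._load_fn = load_fn          # downloads/attaches the LoRA weights
        self._activate_fn = activate_fn  # makes one loaded adapter the active one
        self._loaded: Set[str] = set()

    def use(self, manipulator: str) -> str:
        if manipulator not in ADAPTER_SPECS:
            raise KeyError(f"Unknown manipulator: {manipulator!r}")
        spec = ADAPTER_SPECS[manipulator]
        if manipulator not in self._loaded:
            # First selection: pay the download/load cost exactly once.
            self._load_fn(spec["repo"], spec["adapter_name"])
            self._loaded.add(manipulator)
        self._activate_fn(spec["adapter_name"])
        return spec["adapter_name"]
```

With a diffusers pipeline `pipe`, `load_fn` could plausibly wrap `pipe.load_lora_weights(repo, adapter_name=name)` and `activate_fn` could wrap `pipe.set_adapters([name])`; whether the app wires it exactly this way is not stated in the README.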

Examples

| Input Images    | Prompt Example                                                                         | Manipulator    |
|-----------------|----------------------------------------------------------------------------------------|----------------|
| examples/D.jpg  | "Add the batman logo to the image while preserving background lighting and details."  | Object-Adder   |
| examples/A.jpg  | "Add slim rectangular transparent frame sunglasses while preserving lighting."        | Object-Adder   |
| examples/B.jpeg | "Remove the necklace and goggles while preserving background and details."            | Object-Remover |
| examples/C.png  | "Add the leather cowboy cap while preserving lighting and surroundings."              | Object-Adder   |

Troubleshooting

  • Manipulator Loading: The first use of each manipulator downloads its LoRA; check the console for progress.
  • Out of Memory (OOM): Reduce the steps or resolution, and clear the CUDA cache with torch.cuda.empty_cache().
  • Flash Attention Fails: Fall back to the default attention; Flash Attention 3 requires a compatible CUDA setup.
  • Gallery Input: Accepts file paths, tuples, or PIL image objects.
  • No Output: Ensure an image is uploaded and the prompt is specific.
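
The three gallery input shapes mentioned above (file paths, tuples, PIL objects) can be normalized with a helper along these lines. This is a hedged sketch: `extract_image` is a hypothetical name, the tuple is assumed to be an `(image_or_path, caption)` pair, and the `open_image` parameter exists only so the path-opening step can be swapped out; by default it falls back to `PIL.Image.open`.

```python
import os


def extract_image(item, open_image=None):
    """Return an RGB image from a gallery entry.

    Accepts a path, an image object, or an (image_or_path, caption) tuple.
    """
    if isinstance(item, (tuple, list)):
        # Assumption: the image (or its path) is the first element.
        item = item[0]

    if isinstance(item, (str, os.PathLike)):
        if open_image is None:
            # Imported lazily so the helper itself has no hard dependency.
            from PIL import Image
            open_image = Image.open
        item = open_image(item)

    if not hasattr(item, "convert"):
        raise TypeError(f"Unsupported gallery item: {type(item).__name__}")
    return item.convert("RGB")


def extract_images(gallery_items, open_image=None):
    """Normalize a whole gallery value; an empty or missing gallery yields []."""
    return [extract_image(item, open_image) for item in (gallery_items or [])]
```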

Contributing

Contributions welcome! Add new manipulators to ADAPTER_SPECS, improve prompts, or enhance multi-image support.

Repository: https://github.com/PRITHIVSAKTHIUR/Qwen-Image-Edit-Object-Manipulator.git

License

Apache License 2.0. See LICENSE for details.

Built by Prithiv Sakthi. Report issues via the repository.
