mitkeng/QRxVision

$\text{QR}{\color{red}\text{x}}\text{Vision}$

A multimodal approach to domain-specific drug-kinase assignment using QR code embedding and image-based transfer learning

🚀 1. Setup and Installation

To get started, you'll need the run_qrxvision.py script, the requirements.txt file, and the necessary data files.

Clone the Repository

git clone https://github.com/mitkeng/QRxVision.git
cd QRxVision

📂 Required Components

  • 📜 Scripts: run_qrxvision.py, requirements.txt
  • 📊 Datasets: Train_data.csv, Kinase_drug.csv, Total_dataset.csv

├── run_qrxvision.py          # Main execution script
├── requirements.txt          # Package dependencies
├── Train_data.csv            # Dataset 1
├── Kinase_drug.csv           # Dataset 2
└── Total_dataset.csv         # Dataset 3
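
Before running anything, it can help to confirm that all five files listed above sit in the working directory. The following is a minimal sketch of such a check. The helper name `missing_files` is hypothetical and not part of the repository; only the file names come from the list above.

```python
from pathlib import Path

# File names taken from the component list above.
REQUIRED_FILES = [
    "run_qrxvision.py",
    "requirements.txt",
    "Train_data.csv",
    "Kinase_drug.csv",
    "Total_dataset.csv",
]


def missing_files(directory="."):
    """Return the required file names that are not present in `directory`."""
    base = Path(directory)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All required files are present.")
```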

Install Dependencies

Open your terminal or command prompt, navigate to the directory where you placed the files, and install the required Python libraries using uv pip (or pip if uv is not installed):

# Using uv (recommended for speed)
uv pip install -r requirements.txt

# Or using standard pip
pip install -r requirements.txt

🛠 2. Running the run_qrxvision.py Script

The run_qrxvision.py script runs from the command line and uses argparse to handle two input scenarios: a single compound or a batch of compounds from a CSV file.

Command Structure: python run_qrxvision.py [arguments]

Available Arguments:

  • --smile <SMILE_STRING>: (Required for single compound processing) Provide a single SMILES string for analysis.
  • --name <COMPOUND_NAME>: (Optional, for single compound) Provide a name for the compound being analyzed.
  • --csv_file <PATH_TO_CSV>: (Required for batch processing) Provide the path to a CSV file containing compounds. This CSV must have a column named smile for SMILES strings and can optionally have a name column.
  • --output_file <OUTPUT_PATH.csv>: (Optional) Specify a path to save the similarity results to a CSV file. If not provided, results will only be printed to the console.
  • --top_n <NUMBER>: (Optional) Specify the number of top similar compounds to display/save. Defaults to 10.
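
For readers who want to see how these arguments fit together, here is a sketch of an argparse parser that mirrors the list above. It is not the script's actual source. In particular, treating --smile and --csv_file as mutually exclusive is an assumption, and `build_parser` is a hypothetical name. The SMILES string "CCO" (ethanol) is only a short placeholder input.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented arguments (illustrative only)."""
    parser = argparse.ArgumentParser(description="Sketch of the documented QRxVision CLI.")
    # Assumption: a run is either single-compound or batch, so exactly one source is required.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--smile", help="Single SMILES string to analyze")
    source.add_argument("--csv_file", help="CSV with a 'smile' column and an optional 'name' column")
    parser.add_argument("--name", help="Optional name for the single compound")
    parser.add_argument("--output_file", help="Optional CSV path for the similarity results")
    parser.add_argument("--top_n", type=int, default=10, help="Number of top similar compounds")
    return parser


# Parse a placeholder single-compound invocation; --top_n falls back to its default.
args = build_parser().parse_args(["--smile", "CCO", "--name", "Ethanol"])
print(args.top_n)  # 10
```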

Single Compound Processing Example:

To find the top 7 similar compounds for Ripretinib and save the results:

python run_qrxvision.py --smile "CCN1C2=CC(=NC=C2C=C(C1=O)C3=CC(=C(C=C3Br)F)NC(=O)NC4=CC=CC=C4)NC" --name "Ripretinib" --output_file single_compound_results.csv --top_n 7
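
For batch processing, the --csv_file argument expects a CSV with a smile column and an optional name column. The sketch below writes such a file with Python's csv module. The file names compounds.csv and batch_results.csv and the helper `write_batch_csv` are hypothetical, and the aspirin entry is only a placeholder second row.

```python
import csv

# Illustrative input rows: Ripretinib is taken from the example above,
# aspirin is a placeholder to show a second row.
compounds = [
    {"name": "Ripretinib",
     "smile": "CCN1C2=CC(=NC=C2C=C(C1=O)C3=CC(=C(C=C3Br)F)NC(=O)NC4=CC=CC=C4)NC"},
    {"name": "Aspirin", "smile": "CC(=O)OC1=CC=CC=C1C(=O)O"},
]


def write_batch_csv(path, rows):
    """Write rows to `path` with the column names the script expects: name, smile."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["name", "smile"])
        writer.writeheader()
        writer.writerows(rows)


write_batch_csv("compounds.csv", compounds)

# The file can then be passed to the script from the terminal:
#   python run_qrxvision.py --csv_file compounds.csv --output_file batch_results.csv --top_n 7
```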

📊 3. Understanding the Output

Console Output

The script will print the processing status and the top N similar compounds directly to your terminal.

CSV Output File

If you specify --output_file, a CSV file will be generated. This file will contain details for each query compound (Name, SMILE, Test Image filename) and its top similar reference compounds (Reference Compound filename without .png extension, Similarity Score).

Example structure for single_compound_results.csv:

TestCompound   TestCompoundSMILE   TestImage      ReferenceCompound   SimilarityScore
Ripretinib     CCN1C2=CC(...)      test_4.png     Cenerimod           0.9729
Ripretinib     CCN1C2=CC(...)      test_161.png   Erdafitinib         0.9711
...            ...                 ...            ...                 ...
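
Once the results file exists, it can be post-processed with the standard csv module. The sketch below assumes the file is comma-delimited with the header names shown in the example above. The helper `top_matches` and the 0.972 threshold are illustrative. The sample rows are copied from the example, with the SMILES left truncated as shown there.

```python
import csv
import io

# Sample rows copied from the example above; a real run would open the
# output file instead, e.g. open("single_compound_results.csv", newline="").
SAMPLE = """TestCompound,TestCompoundSMILE,TestImage,ReferenceCompound,SimilarityScore
Ripretinib,CCN1C2=CC(...),test_4.png,Cenerimod,0.9729
Ripretinib,CCN1C2=CC(...),test_161.png,Erdafitinib,0.9711
"""


def top_matches(handle, min_score=0.0):
    """Return (query, reference, score) tuples with score >= min_score, highest first."""
    rows = csv.DictReader(handle)
    matches = [
        (row["TestCompound"], row["ReferenceCompound"], float(row["SimilarityScore"]))
        for row in rows
        if float(row["SimilarityScore"]) >= min_score
    ]
    return sorted(matches, key=lambda match: match[2], reverse=True)


for query, reference, score in top_matches(io.StringIO(SAMPLE), min_score=0.972):
    print(f"{query} -> {reference}: {score:.4f}")
```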

Transitive Assignment Scheme

[Figure: transitive assignment scheme]

📖 Citation (Pending Publication)