overwrite00/NullifyPDF

🔒 NullifyPDF: AI Forensic Edition

NullifyPDF is a professional tool for forensic PDF anonymization. Designed for absolute privacy, it operates entirely locally using artificial intelligence to identify and permanently destroy sensitive data without ever uploading files to the cloud.

Tip

First time using NullifyPDF? Start with our User Guide: it takes 5 minutes.


📋 Quick Overview

NullifyPDF goes beyond simply covering text. It uses Natural Language Processing (NLP) engines to understand context and identify entities such as names, addresses, email addresses, IBANs, and credit card numbers. Unlike common PDF editors, it performs forensic scrubbing: it destroys metadata, hyperlinks, and hidden vector layers so that the redaction is irreversible.

🎯 Feature ✨ Benefit
🧠 AI-Powered Bilingual (EN/IT) automatic detection
πŸ” 100% Local No cloud uploads, complete privacy
⚑ Real-time Instant scanning with live preview
πŸ›‘οΈ Forensic-Grade Binary-level data destruction
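To make the detection of structured identifiers concrete, here is a minimal sketch that uses only the Python standard library. It is not NullifyPDF's detection engine; the regular expressions and the names `find_pii`, `iban_is_valid`, and `luhn_is_valid` are illustrative assumptions. It shows one common approach: match a candidate by pattern, then confirm it with a checksum (ISO 13616 mod-97 for IBANs, Luhn for card numbers) before flagging it.

```python
import re

# Illustrative patterns only; a production detector would be stricter.
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
IBAN_RE = re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def iban_is_valid(iban: str) -> bool:
    """ISO 13616 check: move the first 4 chars to the end, map A-Z to 10-35, test mod 97."""
    s = iban.replace(" ", "").upper()
    if not re.fullmatch(r"[A-Z]{2}\d{2}[A-Z0-9]{11,30}", s):
        return False
    rearranged = s[4:] + s[:4]
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1


def luhn_is_valid(number: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, test mod 10."""
    digits = [int(c) for c in number if c.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_pii(text: str) -> list[tuple[str, str]]:
    """Return (label, matched_text) pairs for emails, valid IBANs, and valid card numbers."""
    hits = [("EMAIL", m.group()) for m in EMAIL_RE.finditer(text)]
    hits += [("IBAN", m.group()) for m in IBAN_RE.finditer(text)
             if iban_is_valid(m.group())]
    hits += [("CREDIT_CARD", m.group().strip()) for m in CARD_RE.finditer(text)
             if luhn_is_valid(m.group())]
    return hits
```

The checksum step is what keeps an arbitrary 16-digit order number from being treated as a card number. Names and addresses have no checksum, which is why they need an NLP model rather than patterns.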

✨ Key Features

  • 🧠 AI-Powered Redaction β€” Automatic bilingual (EN/IT) detection of PII: names, locations, emails, phones, IBANs, credit cards, crypto addresses
  • πŸ—„οΈ Fluid UI & Thread-Safe β€” PySide6 modern dark-mode interface with zero UI freezing. Text extraction in worker thread with QMutex serialization
  • πŸ“– Intelligent Persistent Dictionaries β€” Blocklist and Allowlist synchronized to disk (~/.nullifypdf) with mutual exclusivity and anti-stacking logic. O(1) fast-path matching
  • πŸ›‘οΈ Forensic Scrubbing β€” Not just black boxes. Binary-level destruction of metadata, hidden links, and flattened interactive forms (AcroForms) at export
  • πŸ–ΌοΈ Blindfold Mode β€” One-click image/logo censoring with professional placeholder: [ IMAGE REMOVED ]
  • πŸ“¦ Native Cross-Platform β€” Automated build scripts generate .exe (Windows), .app bundles (macOS), and .deb/.rpm packages (Linux)
  • 🎯 Drag & Drop Support β€” Native file drag-and-drop on main window
  • πŸ“Š Logging & Diagnostics β€” Rotating file-based logging (~/.nullifypdf/logs/) with debug mode for advanced troubleshooting

⚠️ Tool Limitations

To keep NullifyPDF lightweight, 100% offline, and secure, be aware of these technical limits:

❌ Limitation πŸ’‘ Workaround
No Built-in OCR AI reads only digital text, not scanned images. Use Blindfold Mode to remove photo blocks entirely.
Handwritten Text NLP models cannot analyze non-digitized handwriting.
Password-Protected PDFs Encrypted documents are blocked at load. Decrypt before importing.
Digital Signatures Invalidated Forensic scrubbing destroys binary objects; cryptographic signatures (PAdES, notarized) become invalid.
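As an illustration of the "blocked at load" behaviour for encrypted files, the sketch below applies a crude standard-library heuristic: an encrypted PDF declares an `/Encrypt` entry in its trailer dictionary. This is not how NullifyPDF decides. A PDF library is the reliable way to ask (PyMuPDF documents expose `needs_pass` and `is_encrypted`), and the function names here are hypothetical. The byte search can also give a false positive if the token happens to appear in unrelated content.

```python
from pathlib import Path


def looks_encrypted(pdf_bytes: bytes) -> bool:
    """Heuristic only: report True if the file mentions an /Encrypt dictionary entry."""
    if b"%PDF-" not in pdf_bytes[:1024]:
        raise ValueError("not a PDF file")
    return b"/Encrypt" in pdf_bytes


def check_before_import(path: Path) -> None:
    """Refuse to continue with a file that appears to be password-protected."""
    if looks_encrypted(path.read_bytes()):
        raise PermissionError(
            f"{path.name} appears to be encrypted; decrypt it before importing"
        )
```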

Warning

Redaction invalidates digital signatures. Keep the unredacted, signed originals in a separate location for formal records.


🚀 Getting Started

📋 System Requirements

✅ Python 3.12 (required for PyMuPDF wheel compatibility)
✅ 2 GB disk space (dependencies + spaCy models)
✅ 4 GB RAM minimum (8 GB recommended for large PDFs)

Operating System Support:

  • ✅ Windows 10/11
  • ✅ macOS 11+
  • ✅ Linux (Ubuntu 20.04+, Fedora 33+)
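The requirements above can be checked up front with a short script. This is a sketch, not something shipped with the repository: the function names are hypothetical, "2 GB" is treated as 2 GiB, and the RAM requirement is left out because the standard library has no portable way to query installed memory.

```python
import shutil
import sys

REQUIRED_PYTHON = (3, 12)
REQUIRED_FREE_BYTES = 2 * 1024**3  # assumption: "2 GB" read as 2 GiB


def python_ok(version=None) -> bool:
    """True when the interpreter (or the given version tuple) is exactly 3.12.x."""
    major_minor = tuple((version or sys.version_info)[:2])
    return major_minor == REQUIRED_PYTHON


def disk_ok(path: str = ".", required: int = REQUIRED_FREE_BYTES) -> bool:
    """True when the filesystem holding `path` has at least `required` free bytes."""
    return shutil.disk_usage(path).free >= required


def preflight(path: str = ".") -> list[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    problems = []
    if not python_ok():
        found = ".".join(str(part) for part in sys.version_info[:3])
        problems.append(f"Python 3.12 is required, found {found}")
    if not disk_ok(path):
        problems.append("less than 2 GB of free disk space")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("Requirement not met:", problem)
```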

⚙️ Installation

👤 End Users: Use Pre-Built Executable

Download the latest pre-compiled executable from Releases:

  • Windows: NullifyPDF_v2.0.5_Windows.exe
  • macOS: NullifyPDF_v2.0.5_macOS.app
  • Linux: nullifypdf_2.0.5_amd64.deb or .rpm

No installation is needed on Windows/macOS: just run the downloaded file. Linux users can install the .deb package with: sudo dpkg -i nullifypdf_*.deb

👨‍💻 Developers: Install from Source
  1. Clone the repository

    git clone https://github.com/overwrite00/NullifyPDF.git
    cd NullifyPDF
  2. Verify Python 3.12

    # Windows
    py -3.12 --version
    
    # macOS/Linux
    python3.12 --version
  3. Run automated setup (recommended)

    python setup_env.py

    This script automatically:

    • Creates isolated virtual environment (.venv)
    • Installs all dependencies
    • Downloads spaCy language models (EN/IT)
  4. Activate environment & launch

    # Windows (PowerShell)
    .\.venv\Scripts\Activate.ps1
    python NullifyPDF.py
    
    # macOS/Linux (Bash)
    source .venv/bin/activate
    python3.12 NullifyPDF.py

🤖 Automation Scripts

The repository includes cross-platform Python scripts for developers:

🔧 setup_env.py: Environment Setup

Configures the development environment: Python 3.12, a virtual environment, and the NLP models.

python setup_env.py

What it does:

  • Detects OS (Windows/macOS/Linux)
  • Creates .venv with Python 3.12
  • Installs requirements.txt dependencies
  • Downloads spaCy models (English, Italian, or both)
  • Runs smoke tests to verify installation

Automatic OS detection:

  • Windows: Uses py -3.12 launcher
  • macOS/Linux: Uses python3.12 directly

🏗️ build_local.py: Build Executable

Compiles a standalone executable with PyInstaller.

python build_local.py

Features:

  • Cleans temporary directories
  • Auto-detects your OS
  • Reads version dynamically from code
  • Generates a versioned executable name, for example NullifyPDF_v2.0.5_Windows.exe

Linux bonus: On Ubuntu/Fedora, automatically builds .deb and .rpm packages in dist/
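Reading the version from the code and deriving the artifact name could look like the sketch below. It assumes the version is stored as a module-level `__version__ = "..."` assignment, which the README does not confirm, and the function names are hypothetical. The file name patterns follow the release names listed in the Installation section.

```python
import re

# Assumption: the version lives in a line such as  __version__ = "2.0.5"
VERSION_RE = re.compile(r"^__version__\s*=\s*['\"]([^'\"]+)['\"]", re.MULTILINE)


def read_version(source_text: str) -> str:
    """Extract the version string from Python source without importing the module."""
    match = VERSION_RE.search(source_text)
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


def artifact_name(version: str, system: str) -> str:
    """Map a platform.system() value to the release file name for that platform."""
    if system == "Windows":
        return f"NullifyPDF_v{version}_Windows.exe"
    if system == "Darwin":
        return f"NullifyPDF_v{version}_macOS.app"
    if system == "Linux":
        return f"nullifypdf_{version}_amd64.deb"
    raise ValueError(f"unsupported platform: {system}")
```

Parsing the text instead of importing the module means the build script does not need the GUI dependencies installed just to learn the version number.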

✓ Running Tests

Verify critical fixes with smoke tests:

# Activate venv first
source .venv/bin/activate  # or .venv\Scripts\Activate.ps1 on Windows

pytest tests/ -v

Test coverage:

  • PDFListManager (blocklist/allowlist persistence)
  • Input validation (path, range, language selection)
  • Resource path resolution

📚 Documentation

| 📄 Document | 📖 Purpose |
|---|---|
| USER_GUIDE.md | Step-by-step usage instructions |
| CONTRIBUTING.md | How to contribute code & report issues |
| ARCHITECTURE.md | System design & technical overview |
| TROUBLESHOOTING.md | Common issues & solutions |
| CHANGELOG.md | Release history & updates |

🔒 Security & Privacy

✅ 100% Local Processing: all analysis happens on your machine
✅ No Network Calls: the only exception is the GitHub release check
✅ Open Source: full code transparency
✅ No Telemetry: zero user tracking
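For context on the release check, the sketch below shows what a minimal update check against the public GitHub REST API could look like. It is an assumption that NullifyPDF uses this endpoint or the `vMAJOR.MINOR.PATCH` tag format; the constant and function names are hypothetical. Only the repository's release metadata is requested, and nothing about the user's documents is sent.

```python
import json
import urllib.request

CURRENT_VERSION = "2.0.5"
# Public GitHub REST endpoint for a repository's latest release.
LATEST_RELEASE_URL = "https://api.github.com/repos/overwrite00/NullifyPDF/releases/latest"


def parse_version(tag: str) -> tuple[int, ...]:
    """Turn "v2.0.5" or "2.0.5" into (2, 0, 5); pre-release suffixes are not handled."""
    return tuple(int(part) for part in tag.lstrip("vV").split("."))


def is_newer(latest_tag: str, current: str = CURRENT_VERSION) -> bool:
    """Compare numerically so that 2.0.10 counts as newer than 2.0.9."""
    return parse_version(latest_tag) > parse_version(current)


def fetch_latest_tag(url: str = LATEST_RELEASE_URL, timeout: float = 5.0) -> str:
    """Download the release metadata and return its tag name (performs one HTTPS GET)."""
    request = urllib.request.Request(url, headers={"Accept": "application/vnd.github+json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["tag_name"]
```

Keeping the comparison separate from the download lets the version logic be tested offline, and lets the application skip the check entirely when no network is available.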

Important

NullifyPDF performs binary-level data destruction. Always keep backups of original documents.

See SECURITY.md for responsible disclosure and privacy details.


🛠️ Tech Stack

| Technology | Purpose |
|---|---|
| Python 3.12 | Core language (required for PyMuPDF compatibility) |
| PySide6 (Qt6) | Modern dark-mode GUI with multi-threading |
| PyMuPDF (fitz) | High-performance PDF manipulation |
| Microsoft Presidio | PII (Personally Identifiable Information) detection |
| spaCy | NLP for entity recognition (bilingual EN/IT) |

License

MIT License: free to use, modify, and distribute

Copyright (c) 2026 Graziano Mariella

See LICENSE for full text.


🀝 Contributing

Want to help improve NullifyPDF? See CONTRIBUTING.md for guidelines.


Last updated: 2026-06-06
← README_IT | User Guide β†’
