Defensive notice: This project is provided for educational and defensive use by teachers and school IT staff. It is not a replacement for enterprise antivirus or endpoint protection.
To add your own walkthrough, drop a GIF at `docs/demo.gif` and update this link.
Teacher-Safe Local File Scanner is a Python-based, offline-friendly toolkit that helps educators quickly triage student-submitted files before opening them. It performs static checks only—no execution of untrusted code—and produces human-readable and machine-readable reports.
- Features
- Quickstart
- How scanning works
- Command reference
- Optional defensive plugins
- Workflow guidance for flagged files
- Safety, ethics, and limitations
- Cross-platform notes
- Reports and outputs
- Troubleshooting & FAQ
- Development
- Contributing
- License
If you are new to command-line tools, start with the Beginner Guide for a slower, step-by-step walkthrough.
- Static detectors for risky constructs in ZIP, Office, PDF, and image files
- Heuristic scoring with clear severity labels
- Console, JSON, and HTML reporting
- Optional directory watch mode using polling
- Quarantine helper that moves, never deletes, suspicious files
- Cross-platform (Windows, macOS, Linux) with standard library defaults
- Optional integrations with `python-magic` and `yara-python`
python -m venv .venv
source .venv/bin/activate # On Windows use: .venv\Scripts\activate
pip install -r requirements.txt
python examples/generate_benign_samples.py # Materialise demo files

python -m scanner scan ./examples/benign_samples --max-file-size 5000000 --threads 4
python -m scanner.main scan ./examples/benign_samples --max-file-size 5000000 --threads 4

- Exit code `0`: no suspicious findings
- Exit code `1`: caution or suspicious findings
- Exit code `2`: high severity findings
- Exit code `3`: internal scanner error
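
The exit codes listed above lend themselves to scripting. The following is a minimal sketch, assuming the `python -m scanner.main scan <path>` invocation shown in this README; the helper names `describe_exit_code` and `run_scan` are hypothetical and not part of the project.

```python
import subprocess
import sys

# Exit codes as documented in this README.
EXIT_MEANINGS = {
    0: "no suspicious findings",
    1: "caution or suspicious findings",
    2: "high severity findings",
    3: "internal scanner error",
}


def describe_exit_code(code):
    """Translate a scanner exit code into a short human-readable label."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_scan(target):
    """Run the scanner in a child process and return its exit code.

    Assumption: the scanner package is importable by the current interpreter.
    """
    result = subprocess.run([sys.executable, "-m", "scanner.main", "scan", target])
    return result.returncode
```

A wrapper script could call `describe_exit_code(run_scan("./submissions"))` and decide whether to notify IT based on the returned label.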
python -m scanner scan --watch ./incoming
python -m scanner.main scan --watch ./incoming

python -m scanner scan submissions --report-json scan_report.json --report-html scan_report.html
python -m scanner report scan_report.json --html --output scan_report.html
python -m scanner.main scan submissions --output scan_report.json
python -m scanner.main report scan_report.json --html --output scan_report.html

python -m scanner quarantine ./submissions/suspicious.docx --dest ./quarantine
python -m scanner.main quarantine ./submissions/suspicious.docx --dest ./quarantine

The quarantine command moves the file safely, sets read-only permissions, and leaves a `.meta.json` file with provenance details.
If you delete the generated examples or clone the repository fresh, run:

python examples/generate_benign_samples.py

The script recreates a harmless text file, a minimal PNG image, and a macro-free `.docx` document without storing binary fixtures in the repository.
Grab the latest release assets for Windows, macOS, or Linux to run the scanner without Python. Each bundle ships offline-first and collects no telemetry.
- Copy `TeacherSafeScanner.exe` to `C:\Program Files\TeacherSafe\`.
- Double-click `scripts/windows_add_context_menu.reg` to register a "Scan with Teacher-Safe" right-click option.
- On Python: run `python -m scanner.gui` and use the picker to select files or folders, then press Scan and Open Report.
- On packaged builds: launch `TeacherSafeScanner` from the extracted bundle and follow the same steps to save and open the HTML report.
The scanner combines lightweight type identification, static detectors, and heuristic scoring:
| Phase | What happens | Key modules |
|---|---|---|
| Discovery | Files are walked recursively (respecting `--max-file-size`) and hashed using streaming reads. | `scanner.utils` |
| Type sniffing | If `python-magic` is enabled, MIME detection is delegated; otherwise magic bytes are inspected. | `scanner.scanner_core` |
| Detection | Format-specific rules look for risky markers (e.g., macros, embedded executables, appended payloads). | `scanner.detectors` |
| Scoring | Each finding contributes a weighted score mapped to Safe/Caution/Suspicious/High labels. | `scanner.heuristics` |
| Reporting | Results are aggregated into JSON, console, or HTML outputs. | `scanner.reporters` |
The entire pipeline avoids running untrusted content and is safe to execute on offline, air-gapped devices.
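
To make the Detection and Scoring phases more concrete, here is a rough sketch of a static ZIP check followed by weighted scoring. It reads only the archive's member names and never extracts or runs anything. The weights, thresholds, suffix lists, and function names are assumptions chosen for illustration; the project's actual rules in `scanner.detectors` and `scanner.heuristics` may differ.

```python
import zipfile
from pathlib import PurePosixPath

# Hypothetical weights and thresholds, for illustration only.
WEIGHTS = {"exe_in_zip": 50, "double_extension": 25}
THRESHOLDS = [(80, "High"), (50, "Suspicious"), (20, "Caution")]
EXECUTABLE_SUFFIXES = {".exe", ".scr", ".bat", ".cmd", ".vbs"}
DOCUMENT_SUFFIXES = {".pdf", ".doc", ".docx", ".txt", ".jpg", ".png"}


def inspect_zip_names(archive):
    """Collect findings from member names only; nothing is extracted or executed."""
    issues = []
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            suffixes = [s.lower() for s in PurePosixPath(name).suffixes]
            if not suffixes or suffixes[-1] not in EXECUTABLE_SUFFIXES:
                continue
            issues.append({"code": "exe_in_zip", "evidence": name})
            # A document-like suffix right before the executable one
            # (e.g. report.pdf.exe) is a classic disguise.
            if len(suffixes) >= 2 and suffixes[-2] in DOCUMENT_SUFFIXES:
                issues.append({"code": "double_extension", "evidence": name})
    return issues


def score(issues):
    """Sum per-issue weights (capped at 100) and map the total to a severity label."""
    total = min(100, sum(WEIGHTS.get(issue["code"], 0) for issue in issues))
    for minimum, label in THRESHOLDS:
        if total >= minimum:
            return total, label
    return total, "Safe"
```

Keeping detection (what was found) separate from scoring (how serious it is) lets the weights be tuned without touching the format-specific rules.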
The CLI exposes three subcommands and several shared options.
Scan one file or a directory tree.
python -m scanner.main scan <path> [--output report.json] [--threads 8] [--max-file-size 200000000]

Useful flags:

- `--watch <folder>`: poll for new files while continuing to monitor previously scanned ones.
- `--report-json` / `--report-html`: save structured and teacher-friendly reports in one run.
- `--pdf-rules`, `--office-rules`, `--zip-rules`, `--image-rules`: choose `off`, `normal`, or `strict` for per-format heuristics.
- `--use-magic` / `--use-yara`: opt into external libraries when installed.
- `--max-file-size`: skip overly large submissions to save time.
- `--threads`: increase if you have many CPU cores and fast storage.
The scan command exits with a severity-driven code so it integrates well with CI or folder monitors.
Move suspicious files to a safe holding area without deleting them.
python -m scanner.main quarantine ./submissions/suspicious.docx --dest ./quarantine

The destination receives the moved file, marked read-only, plus a `.meta.json` file recording the original location, hash, and timestamp.
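
The quarantine behaviour described above (move, mark read-only, record provenance) can be sketched with the standard library alone. This is an illustrative approximation, not the project's implementation: the function names and the metadata field names (`original_path`, `sha256`, `quarantined_at`) are assumptions.

```python
import hashlib
import json
import os
import shutil
import stat
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so large files never load fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def quarantine(source, dest_dir):
    """Move `source` into `dest_dir`, mark it read-only, and write a .meta.json file."""
    source = Path(source)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / source.name
    if target.exists():
        raise FileExistsError(f"{target} already exists; refusing to overwrite")
    # Record provenance before the file leaves its original location.
    meta = {
        "original_path": str(source.resolve()),
        "sha256": sha256_of(source),
        "quarantined_at": datetime.now(timezone.utc).isoformat(),
    }
    shutil.move(str(source), str(target))
    os.chmod(target, stat.S_IREAD)  # owner read-only; the file is never deleted
    meta_path = target.with_name(target.name + ".meta.json")
    meta_path.write_text(json.dumps(meta, indent=2), encoding="utf-8")
    return target
```

Refusing to overwrite an existing target keeps earlier quarantined evidence intact if two submissions share a filename.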
Render previously generated JSON results into other formats.
python -m scanner.main report scan_report.json --html --output scan_report.html

Omit `--html` to stream a human-readable console summary instead.
Install optional packages only if your environment permits:
pip install -r requirements-optional.txt

- `python-magic`: richer MIME identification (`--use-magic`)
- `yara-python`: experimental pattern matching (`--use-yara`)
The CLI flags are opt-in, and the scanner gracefully degrades when the libraries are unavailable.
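
One common way to achieve this kind of graceful degradation is a guarded import with a standard-library fallback. The sketch below assumes the `magic` module from `python-magic` and falls back to checking a few well-known leading-byte signatures; the `identify` function and the signature table are hypothetical, not the scanner's own code.

```python
try:
    import magic  # provided by the optional python-magic package
except ImportError:
    magic = None  # the library (or libmagic itself) is unavailable

# Leading-byte signatures for a few common formats.
# Note: .docx and other modern Office files are ZIP containers.
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"PK\x03\x04", "application/zip"),
]


def identify(path, use_magic=False):
    """Return a MIME type, preferring python-magic only when requested and installed."""
    if use_magic and magic is not None:
        return magic.from_file(str(path), mime=True)
    with open(path, "rb") as handle:
        header = handle.read(16)
    for signature, mime in SIGNATURES:
        if header.startswith(signature):
            return mime
    return "application/octet-stream"
```

Because the fallback path needs nothing beyond the standard library, the same call works on machines where optional packages cannot be installed.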
- Do not open the file. Treat warnings as serious until reviewed by IT.
- Move the file to the quarantine folder for record keeping.
- Escalate to your IT or security team with the JSON/HTML report.
- Review in an isolated virtual machine if your institution allows it.
- When in doubt, collect additional context (e.g., student name, assignment) in a secure ticketing system.
- Static analysis only; no attempt is made to remove malware.
- Large or encrypted archives may hide malicious content the scanner cannot inspect.
- The heuristics prioritise minimising false negatives, so they may produce false positives; always confirm with professional tools.
- The tool never executes or modifies untrusted binaries beyond safe hashing and metadata reads.
Read more in SAFETY.md.
- Paths are managed with `pathlib`. When running on Windows, prefer PowerShell or CMD with UTF-8 enabled (`chcp 65001`).
- Quarantine sets read-only attributes; if you need to restore a quarantined file, manually adjust permissions via `attrib -r` on Windows or `chmod +w` on Unix.
- Polling-based watch mode relies on filesystem timestamps; on slow or networked drives expect a 10-second delay before changes are detected.
- For macOS Gatekeeper prompts, run `xattr -dr com.apple.quarantine <path>` only on files you trust and after verifying reports.
Reports follow a stable JSON schema so they can be ingested by help-desk systems:
{
"path": "submissions/homework1.zip",
"sha256": "abc123...",
"size": 34567,
"magic_type": "zip",
"issues": [
{"code": "exe_in_zip", "description": "Found executable file payload.exe inside archive", "evidence": "payload.exe"},
{"code": "double_extension", "description": "Filename uses double extension 'report.pdf.exe'", "evidence": "report.pdf.exe"}
],
"score": 75,
"severity": "Suspicious"
}

When exporting HTML, the report includes:
- A safety banner reminding readers not to open flagged files.
- A severity-coloured table summarising each item.
- Collapsible detail sections for detector evidence.
- Footer tips on next steps for educators.
Console output defaults to a clean table suitable for terminal screenshots. Use --verbose during scans for additional logging.
The scanner skips files larger than expected.
- Confirm the `--max-file-size` flag; the default is 100 MB. Some learning management systems export multi-gigabyte ZIPs that may need a higher limit.
python-magic or yara-python import errors appear.
- Ensure you installed `requirements-optional.txt`. On Windows you may need the Visual C++ Build Tools; on macOS install Homebrew `libmagic` first.
Watching a network share misses changes.
- Keep the watch directory local when possible. The default 10-second polling interval may drift on congested networks—re-run the command if scans appear delayed.
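
For context on why polling can lag, a timestamp-based watcher typically looks something like the sketch below: it rescans the directory at a fixed interval and compares modification times against the previous pass. The `poll_once` and `watch` names are hypothetical, and the 10-second default simply mirrors the interval mentioned above; the scanner's real watch loop may be structured differently.

```python
import time
from pathlib import Path


def poll_once(directory, seen):
    """Return files that are new or whose modification time changed since the last poll.

    `seen` maps each path to the last observed mtime and is updated in place.
    """
    changed = []
    for path in sorted(Path(directory).rglob("*")):
        if not path.is_file():
            continue
        mtime = path.stat().st_mtime_ns
        if seen.get(path) != mtime:
            seen[path] = mtime
            changed.append(path)
    return changed


def watch(directory, handle, interval=10.0):
    """Call `handle(path)` for each changed file, sleeping `interval` seconds between polls."""
    seen = {}
    while True:
        for path in poll_once(directory, seen):
            handle(path)
        time.sleep(interval)
```

A change is only noticed on the next pass, and only if the filesystem reports an updated timestamp, which is why slow or networked drives can appear to miss files.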
How do I update the benign sample files?
- Run `python examples/generate_benign_samples.py --force` to regenerate all fixtures. The script never overwrites files unless the hash changes, so it is safe to run repeatedly.
Can I integrate results into another system?
- Yes. The JSON report has a simple, flat structure. Use `jq`, Python, or your preferred language to parse the `issues` array per file. The exit code makes automation straightforward.
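
As a starting point for such an integration, the sketch below turns report entries into one-line summaries using the fields shown in the sample report (`path`, `score`, `severity`, `issues`). Whether the top level of a real report file is a single object or a list of per-file objects is an assumption here, so the hypothetical `summarise` helper accepts both.

```python
import json


def load_report(path):
    """Read a JSON report file produced by the scanner."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def summarise(report):
    """Build one summary line per file entry.

    Accepts either a single per-file object or a list of them.
    """
    entries = report if isinstance(report, list) else [report]
    lines = []
    for entry in entries:
        codes = ", ".join(issue["code"] for issue in entry.get("issues", []))
        lines.append(f'{entry["severity"]} ({entry["score"]}): {entry["path"]} [{codes}]')
    return lines
```

For example, `summarise(load_report("scan_report.json"))` yields lines that can be pasted into a help-desk ticket.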
pip install -r requirements.txt
pytest
ruff check .

Recommended editor settings:
- Enable `black`-style formatting at 88 columns.
- Turn on type checking (MyPy or Pyright) for early detection of annotation issues.
- Configure your IDE to respect `.editorconfig` if present.
We welcome defensive-minded contributions. See CONTRIBUTING.md for coding standards and submission guidelines.
MIT License © Teacher Safe Maintainers