wlo2/obsidian-auto-codeblock-language-detector

Auto Codeblock Language Detector

An Obsidian community plugin that automatically detects and adds the programming language to your pasted code blocks, powered by a bundled TensorFlow.js Guesslang model.

It runs entirely offline and detects over 50 programming languages. To reduce false positives, it skips prose-like text and any languages you have disabled in settings.

Features

  • Smart Paste Handling: Automatically detects the language of plain multi-line code pastes and wraps them in a labeled code block.
  • Fence Annotation: If you paste code that already has unlabeled fences (e.g., ```), it instantly adds the correct language tag (e.g., ```python).
  • Manual Detection Command: Run /detect_language to scan the current note and add language labels to any existing unlabeled fenced code blocks.
  • Customizable: Use the plugin settings to tweak the minimum confidence threshold, or completely disable languages you don't use (e.g., disable Batch to prioritize Bash on Mac/Linux).
  • Prose Rejection: A pre-detection filter skips text that statistically looks like natural language instead of code, reducing false positives on pasted notes and paragraphs.
  • Fast and Offline: Uses a bundled, optimized machine learning model. No API keys are required, and your code never leaves your device.
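
As a rough illustration of the Smart Paste Handling behavior, the sketch below wraps a plain multi-line paste in a labeled fence. It is not taken from the plugin's source: `wrapPlainPaste`, `DetectFn`, and `FENCE` are hypothetical names, and the `detect` callback stands in for whatever the plugin uses to return a language tag (or `null` when nothing qualifies).

```typescript
// Hypothetical stand-in for the plugin's detector: returns a language tag or null.
type DetectFn = (code: string) => string | null;

// Built with repeat() so this example does not contain a literal fence.
const FENCE = "`".repeat(3);

// Wrap a plain multi-line paste in a labeled code block.
// Single-line pastes, pastes that already contain a fence, and pastes the
// detector cannot label are returned unchanged.
function wrapPlainPaste(pasted: string, detect: DetectFn): string {
  const text = pasted.replace(/\r\n/g, "\n");
  const isMultiLine = text.trim().includes("\n");
  const hasFence = /^\s*`{3,}/m.test(text);
  if (!isMultiLine || hasFence) return pasted;

  const lang = detect(text);
  if (lang === null) return pasted;

  // Drop trailing newlines so the closing fence sits right after the code.
  return `${FENCE}${lang}\n${text.replace(/\n+$/, "")}\n${FENCE}`;
}
```

Returning the original text whenever a check fails means a paste is never altered unless a language was actually chosen.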

How detection works

When you paste text or run /detect_language, the plugin first finds unlabeled code fences or treats a plain multi-line paste as a candidate block.
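
The fence-scanning step could look roughly like the sketch below. This is an assumption about the approach, not the plugin's actual code: `labelUnlabeledFences` and `Detector` are hypothetical names, and the sketch only handles backtick fences.

```typescript
// Hypothetical stand-in for the plugin's detector: returns a language tag or null.
type Detector = (code: string) => string | null;

// Add a language tag to every fenced block whose opening fence has no info string.
// Blocks that are already labeled, or that never close, are left untouched.
function labelUnlabeledFences(note: string, detect: Detector): string {
  const lines = note.split("\n");
  const out: string[] = [];
  let i = 0;
  while (i < lines.length) {
    const line = lines[i];
    // Opening fence: optional indent, three or more backticks, optional info string.
    const open = /^(\s*)(`{3,})(.*)$/.exec(line);
    if (!open) {
      out.push(line);
      i++;
      continue;
    }
    const indent = open[1];
    const ticks = open[2];
    const info = open[3];

    // Closing fence: a line of only backticks, at least as many as the opener.
    let j = i + 1;
    while (j < lines.length) {
      const trimmed = lines[j].trim();
      if (/^`+$/.test(trimmed) && trimmed.length >= ticks.length) break;
      j++;
    }
    if (j >= lines.length) {
      // Unterminated fence: copy the rest of the note unchanged.
      out.push(...lines.slice(i));
      break;
    }

    const body = lines.slice(i + 1, j).join("\n");
    const lang = info.trim() === "" ? detect(body) : null;
    out.push(lang !== null ? `${indent}${ticks}${lang}` : line);
    out.push(...lines.slice(i + 1, j + 1));
    i = j + 1;
  }
  return out.join("\n");
}
```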

Before the Guesslang model runs, the plugin applies a lightweight prose filter. It counts common prose words, sentence-ending lines, punctuation, paragraph breaks, long lines, indented detail-list lines, code keywords, assignments, and code operators. A block is rejected as prose only when it has strong natural-language signals and weak code signals. This keeps ordinary paragraphs and paragraph-plus-detail-list notes from being labeled as languages while still allowing compact code samples through to the model.
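
A heavily simplified sketch of this kind of filter is shown below. It uses only a few of the signals listed above, and the word lists, regular expressions, and thresholds are all illustrative assumptions rather than the plugin's real values. `looksLikeProse` is a hypothetical name.

```typescript
// Illustrative word lists only; the plugin's real lists are not shown in this README.
const PROSE_WORDS = new Set(["the", "and", "of", "to", "is", "that", "with", "this", "are"]);
const CODE_KEYWORDS = new Set([
  "function", "return", "const", "let", "def", "class", "import", "if", "else", "for", "while",
]);

// Returns true only when prose signals are strong AND code signals are weak.
function looksLikeProse(text: string): boolean {
  const lines = text.split("\n").filter((l) => l.trim() !== "");
  const words = text.toLowerCase().match(/[a-z']+/g) ?? [];
  if (lines.length === 0 || words.length === 0) return false;

  // Prose signals: share of common prose words, share of lines ending a sentence.
  const proseWordRatio = words.filter((w) => PROSE_WORDS.has(w)).length / words.length;
  const sentenceLineRatio = lines.filter((l) => /[.!?]\s*$/.test(l)).length / lines.length;

  // Code signals: keywords, assignment-like lines, and code operators.
  const keywordCount = words.filter((w) => CODE_KEYWORDS.has(w)).length;
  const assignmentCount = lines.filter((l) => /[\w\])]\s*=\s*[^=]/.test(l)).length;
  const operatorCount = (text.match(/[{};]|=>|==|!=|\+=|->|::/g) ?? []).length;

  // Assumed thresholds, chosen for illustration.
  const strongProse = proseWordRatio >= 0.15 && sentenceLineRatio >= 0.5;
  const weakCode = keywordCount + assignmentCount + operatorCount <= 1;
  return strongProse && weakCode;
}
```

Requiring both conditions is what lets compact code through: a snippet with few code signals is still passed to the model unless it also reads strongly like natural language.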

If the prose filter allows the block, the bundled Guesslang model scores possible languages. The plugin then takes the highest-scoring language that is not disabled in settings and only applies it when the score is at or above your configured confidence threshold.
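
The selection rule described above can be sketched as follows. The `LanguageScore` shape and the `pickLanguage` name are assumptions made for illustration; the format of the model's real output is not documented here.

```typescript
// Assumed shape for one model prediction.
interface LanguageScore {
  languageId: string;
  confidence: number; // between 0 and 1
}

// Pick the highest-scoring language that is not disabled, and accept it only
// if its score is at or above the threshold. Returns null otherwise.
function pickLanguage(
  scores: LanguageScore[],
  disabled: Set<string>,
  threshold: number,
): string | null {
  let best: LanguageScore | null = null;
  for (const score of scores) {
    if (disabled.has(score.languageId)) continue;
    if (best === null || score.confidence > best.confidence) best = score;
  }
  return best !== null && best.confidence >= threshold ? best.languageId : null;
}
```

Under this rule, disabling a language does not raise any other language's score. It only removes a competitor, so the runner-up still has to clear the threshold on its own.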

Installation

You can install this plugin manually:

  1. Create a directory named auto-codeblock-language-detector inside your vault's plugins folder:
    <Vault>/.obsidian/plugins/auto-codeblock-language-detector/
    
  2. Copy main.js and manifest.json into the newly created directory.
  3. Reload Obsidian and enable Auto Codeblock Language Detector in Settings -> Community plugins.

Configuration

In the plugin settings, you can configure:

  • Confidence threshold: Minimum confidence score (0.05 to 0.95) required to label a code block. Lower values detect more blocks but may be less accurate on very short snippets.
  • Enabled / Disabled Languages: Click the language chips to move them between the enabled and disabled lists. Languages in the disabled list will never be emitted by the detector.
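
For orientation, the two settings might be modeled along these lines. The `DetectorSettings` shape and the helper names are hypothetical; only the 0.05 to 0.95 range comes from the description above.

```typescript
// Assumed settings shape; field names are illustrative.
interface DetectorSettings {
  confidenceThreshold: number;
  disabledLanguages: string[];
}

const MIN_THRESHOLD = 0.05;
const MAX_THRESHOLD = 0.95;

// Keep a requested threshold inside the documented range.
function clampThreshold(value: number): number {
  return Math.min(MAX_THRESHOLD, Math.max(MIN_THRESHOLD, value));
}

// Mirror a chip click: move a language between the enabled and disabled lists.
function toggleLanguage(settings: DetectorSettings, languageId: string): DetectorSettings {
  const disabled = new Set(settings.disabledLanguages);
  if (disabled.has(languageId)) {
    disabled.delete(languageId);
  } else {
    disabled.add(languageId);
  }
  return { ...settings, disabledLanguages: Array.from(disabled).sort() };
}
```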

Development

Install dependencies and start the development watcher:

npm install
npm run dev

Build a production bundle with:

npm run build

Credits

Uses the open-source Guesslang model via TensorFlow.js.
