NotAndrej/tgwordcounter

Telegram Word Counter

An open-source CLI utility for analyzing word frequency in Telegram chat exports.

Features

  • JSON Parsing: Automatically handles standard Telegram JSON export structures.
  • Stop-Word Filtering: Easily exclude filler words (like "the", "and", "is", "a") using a custom stopwords.txt file.
  • Command Line Interface: Fully customizable via flags for inputs, outputs, and result limits.
  • No Dependencies: Uses only modules from the Python standard library (json, re, collections, argparse).
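
As a rough illustration of how the modules listed above can fit together, here is a minimal sketch of the counting step. The helper name `count_words` and the `\w+` tokenization rule are assumptions made for this example; the actual script may split and normalize words differently.

```python
import re
from collections import Counter

# Assumed tokenization rule: runs of word characters, compared in lowercase.
WORD_RE = re.compile(r"\w+")


def count_words(texts, stopwords=frozenset()):
    """Count word occurrences across an iterable of strings (hypothetical helper)."""
    counts = Counter()
    for text in texts:
        for word in WORD_RE.findall(text.lower()):
            if word not in stopwords:
                counts[word] += 1
    return counts


messages = ["The cat and the dog", "A dog is a friend"]
counts = count_words(messages, stopwords={"the", "and", "is", "a"})
print(counts.most_common(2))
```

`Counter.most_common(n)` returns the `n` most frequent words, which is presumably what a result limit such as `-l` would map onto.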

Getting Started

1. Export your Telegram Data

  • Open Telegram Desktop.
  • Go to the settings of the channel/chat you want to analyze.
  • Click the three dots (menu) -> Export chat history.
  • Choose JSON as the format.
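
For orientation, the sketch below shows one way to pull message text out of such an export. It assumes the commonly seen layout: a top-level "messages" list whose entries carry a "text" field that is either a plain string or a list mixing strings with objects such as {"type": "link", "text": "..."}. The function names are hypothetical and the real script may handle the export differently.

```python
import json


def message_text(message):
    """Flatten one message's "text" field into a single plain string."""
    text = message.get("text", "")
    if isinstance(text, str):
        return text
    # Assumed mixed form: plain strings interleaved with formatting objects.
    parts = []
    for part in text:
        if isinstance(part, str):
            parts.append(part)
        elif isinstance(part, dict):
            parts.append(part.get("text", ""))
    return "".join(parts)


def load_texts(path):
    """Read an export file and return the text of every message in it."""
    with open(path, encoding="utf-8") as f:
        export = json.load(f)
    return [message_text(m) for m in export.get("messages", [])]


# In-memory sample shaped like the assumed export layout.
sample = {
    "messages": [
        {"text": "hello"},
        {"text": ["see ", {"type": "link", "text": "example.org"}]},
        {"type": "service"},
    ]
}
texts = [message_text(m) for m in sample["messages"]]
print(texts)
```

Entries without a "text" field, such as service messages, yield an empty string here and therefore contribute no words.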

2. Usage

  • Clone this repository and run the script from your terminal:
python3 tgwordcounter.py -i path/to/your/result.json -l 100 -s stopwords.txt

3. Arguments

Flag              Description                                       Default
-i, --input       Path to your result.json file.                    result.json
-o, --output      The filename for the outputted text file.         results.txt
-l, --limit       Number of most frequent words to display.         100
-s, --stopwords   Path to a text file containing words to ignore.   None
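
The flags above could be declared with argparse roughly as follows. This is a sketch built from the documented flags and defaults, not the script's actual source. The `build_parser` name, the description and the help strings are illustrative.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Count word frequency in a Telegram JSON export."
    )
    parser.add_argument("-i", "--input", default="result.json",
                        help="Path to your result.json file.")
    parser.add_argument("-o", "--output", default="results.txt",
                        help="The filename for the outputted text file.")
    parser.add_argument("-l", "--limit", type=int, default=100,
                        help="Number of most frequent words to display.")
    parser.add_argument("-s", "--stopwords", default=None,
                        help="Path to a text file containing words to ignore.")
    return parser


# Parse an explicit argument list so the example runs without a real command line.
args = build_parser().parse_args(["-i", "chat.json", "-l", "25"])
print(args.input, args.output, args.limit, args.stopwords)
```

Flags that are not passed fall back to the defaults from the table, so stop-word filtering stays off unless `-s` is given.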

Customizing Stop-Words

To filter out non-essential words, create a stopwords.txt file in the same folder as the script and pass it with the -s flag. Place one word per line:

a
is
in
are
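
A file in this format can be read into a set for fast membership checks. The sketch below assumes that blank lines are skipped and that words are compared in lowercase; both are choices made for this example and may not match the script's behavior.

```python
def parse_stopwords(lines):
    """Turn an iterable of lines into a set of lowercase stop-words."""
    return {line.strip().lower() for line in lines if line.strip()}


def load_stopwords(path):
    """Read a one-word-per-line file such as stopwords.txt (hypothetical helper)."""
    with open(path, encoding="utf-8") as f:
        return parse_stopwords(f)


# In-memory lines standing in for the contents of stopwords.txt.
words = parse_stopwords(["a\n", "is\n", "\n", "In\n", "are"])
print(sorted(words))
```

A set is used so that checking each word against the stop-word list takes constant time regardless of how long the list grows.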
