A full-stack extractive text summarization app built with Node.js + Express (backend) and Angular 17 (frontend), featuring TextRank, TF-IDF, and Word Frequency algorithms, plus Named Entity Recognition and Flesch-Kincaid Readability scoring.

    text-summarizer/
    ├── README.md
    │
    ├── backend/
    │   ├── package.json
    │   └── src/
    │       ├── server.js                      # Express app entry point
    │       ├── routes/
    │       │   └── summarize.routes.js        # POST /api/summarize, /api/summarize/stats
    │       ├── controllers/
    │       │   └── summarize.controller.js    # Request handlers
    │       ├── services/
    │       │   └── summarizer.service.js      # TextRank, TF-IDF, Frequency, NER, Readability
    │       └── middleware/
    │           ├── validateRequest.js         # express-validator middleware
    │           └── errorHandler.js            # Global error handler
    │
    └── frontend/
        ├── angular.json
        ├── tsconfig.json
        ├── tsconfig.app.json
        ├── package.json
        └── src/
            ├── main.ts                        # Bootstrap
            ├── index.html
            ├── styles.scss                    # Global styles
            ├── environments/
            │   └── environment.ts
            └── app/
                ├── app.component.ts
                ├── app.config.ts              # Standalone app config
                ├── app.routes.ts
                ├── models/
                │   └── summarizer.model.ts    # TypeScript interfaces (incl. Entities, Readability)
                ├── services/
                │   └── summarizer.service.ts  # HttpClient API service
                └── components/
                    └── summarizer/
                        ├── summarizer.component.ts    # Signals, computed, 5-tab output
                        ├── summarizer.component.html  # Full UI template
                        └── summarizer.component.scss  # Dark terminal styles

Backend:

    cd backend
    npm install
    npm run dev    # nodemon hot reload
    # API runs on http://localhost:3000

Frontend:

    cd frontend
    npm install
    npm start      # Angular dev server
    # App runs on http://localhost:4200

Request body for `POST /api/summarize`:

    {
      "text": "Your long text here...",
      "numSentences": 5,
      "method": "textrank"
    }

| Field | Type | Required | Description |
|---|---|---|---|
| `text` | string | ✅ | Min 100, max 50,000 chars |
| `numSentences` | number | ❌ | 1–20 (default: 5) |
| `method` | string | ❌ | `textrank`, `tfidf`, or `frequency` (default: `textrank`) |

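
The field rules in the table can be checked with a small library-free function. The backend itself uses `express-validator`, so this is only a sketch of the same constraints: the names `SummarizeRequest`, `ValidationResult` and `validateSummarizeRequest` are hypothetical, and requiring `numSentences` to be a whole number is an assumption the table does not state.

```typescript
// Hypothetical types; only the limits and defaults come from the table above.
type Method = "textrank" | "tfidf" | "frequency";

interface SummarizeRequest {
  text: string;
  numSentences: number;
  method: Method;
}

interface ValidationResult {
  ok: boolean;
  errors: string[];
  // Request with defaults applied; present only when ok is true.
  value?: SummarizeRequest;
}

const METHODS: Method[] = ["textrank", "tfidf", "frequency"];

function validateSummarizeRequest(body: {
  text?: unknown;
  numSentences?: unknown;
  method?: unknown;
}): ValidationResult {
  const errors: string[] = [];

  // text: required string, 100 to 50,000 characters
  const text = typeof body.text === "string" ? body.text : "";
  if (text.length < 100 || text.length > 50000) {
    errors.push("text must be a string of 100 to 50,000 characters");
  }

  // numSentences: optional, defaults to 5, range 1 to 20 (integer check is an assumption)
  const n = body.numSentences === undefined ? 5 : body.numSentences;
  if (typeof n !== "number" || Math.floor(n) !== n || n < 1 || n > 20) {
    errors.push("numSentences must be a whole number from 1 to 20");
  }

  // method: optional, defaults to "textrank"
  const requested = body.method === undefined ? "textrank" : body.method;
  const method = METHODS.filter((m) => m === requested)[0];
  if (method === undefined) {
    errors.push("method must be textrank, tfidf, or frequency");
  }

  if (errors.length > 0 || typeof n !== "number" || method === undefined) {
    return { ok: false, errors };
  }
  return { ok: true, errors, value: { text, numSentences: n, method } };
}
```

Collecting every error instead of stopping at the first one lets a client show all problems with a request at once.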
Response:

    {
      "success": true,
      "data": {
        "summary": "Combined summary text...",
        "summarySentences": ["Sentence 1.", "Sentence 2."],
        "scores": [{ "index": 0, "score": 0.0842 }],
        "stats": {
          "originalWords": 850,
          "originalSentences": 13,
          "summaryWords": 120,
          "summarySentences": 5,
          "compressionRatio": 61.5,
          "readingTimeOriginal": 5,
          "readingTimeSummary": 1
        },
        "keywords": [{ "word": "learning", "score": 4.12 }],
        "entities": {
          "people": ["Steve Jobs"],
          "places": ["California", "United States"],
          "organizations": ["Google", "Microsoft"],
          "numbers": ["2024", "42"]
        },
        "readability": {
          "original": {
            "readingEase": 28.4,
            "easeLabel": "Very Difficult",
            "gradeLevel": 14.2,
            "avgWordsPerSentence": 22.1,
            "avgSyllablesPerWord": 1.84
          },
          "summary": {
            "readingEase": 31.0,
            "easeLabel": "Difficult",
            "gradeLevel": 13.8,
            "avgWordsPerSentence": 20.4,
            "avgSyllablesPerWord": 1.79
          }
        },
        "method": "textrank"
      }
    }

The `/api/summarize/stats` endpoint returns text statistics without summarizing. Its response includes entities and readability.

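
On the frontend, the `POST /api/summarize` response shown above could be typed roughly as follows. The interface names are assumptions and may not match those in `summarizer.model.ts`. The field names are taken from the sample JSON, and the shape of an error response is not documented here, so it is left out. `countEntities` is a hypothetical helper added only to show the types in use.

```typescript
// Assumed interface names; fields mirror the sample response.
interface SentenceScore { index: number; score: number; }
interface Keyword { word: string; score: number; }

interface SummaryStats {
  originalWords: number;
  originalSentences: number;
  summaryWords: number;
  summarySentences: number;
  compressionRatio: number;
  readingTimeOriginal: number;
  readingTimeSummary: number;
}

interface Entities {
  people: string[];
  places: string[];
  organizations: string[];
  numbers: string[];
}

interface ReadabilityScores {
  readingEase: number;
  easeLabel: string;
  gradeLevel: number;
  avgWordsPerSentence: number;
  avgSyllablesPerWord: number;
}

interface SummaryData {
  summary: string;
  summarySentences: string[];
  scores: SentenceScore[];
  stats: SummaryStats;
  keywords: Keyword[];
  entities: Entities;
  readability: { original: ReadabilityScores; summary: ReadabilityScores };
  method: "textrank" | "tfidf" | "frequency";
}

interface SummarizeResponse {
  success: boolean;
  data: SummaryData;
}

// Hypothetical helper: total number of extracted entities across all four groups.
function countEntities(entities: Entities): number {
  return (
    entities.people.length +
    entities.places.length +
    entities.organizations.length +
    entities.numbers.length
  );
}
```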
TextRank:
- Each sentence is a node in a graph
- Edge weights = cosine similarity between sentence word vectors
- Similarity matrix is row-normalized
- Power iteration (PageRank, 30 rounds, damping=0.85) converges on sentence importance scores
- Top-N sentences selected and reordered to preserve original flow
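
The TextRank steps above can be sketched in TypeScript as follows. This is a minimal illustration, not the code in `summarizer.service.js`. The function names, the regex tokenizer, and the zero self-similarity on the diagonal are assumptions. The 30 rounds, the 0.85 damping factor, row normalization, and the final reordering come from the list.

```typescript
// Assumed tokenizer: lowercase alphanumeric tokens, no stop-word removal.
function tokenizeWords(sentence: string): string[] {
  return sentence.toLowerCase().match(/[a-z0-9']+/g) || [];
}

// Bag-of-words count vector for one sentence.
function wordVector(sentence: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const token of tokenizeWords(sentence)) {
    counts.set(token, (counts.get(token) || 0) + 1);
  }
  return counts;
}

function cosineSimilarity(a: Map<string, number>, b: Map<string, number>): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((value, word) => {
    dot += value * (b.get(word) || 0);
    normA += value * value;
  });
  b.forEach((value) => {
    normB += value * value;
  });
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

function textRankSummary(
  sentences: string[],
  topN: number,
  rounds: number = 30,
  damping: number = 0.85
): string[] {
  const n = sentences.length;
  if (n === 0) return [];
  const vectors = sentences.map(wordVector);

  // Edge weights: cosine similarity, zero on the diagonal, each row normalized to sum to 1.
  const weights: number[][] = [];
  for (let i = 0; i < n; i++) {
    const row: number[] = [];
    let rowSum = 0;
    for (let j = 0; j < n; j++) {
      const w = i === j ? 0 : cosineSimilarity(vectors[i], vectors[j]);
      row.push(w);
      rowSum += w;
    }
    weights.push(row.map((w) => (rowSum === 0 ? 0 : w / rowSum)));
  }

  // Power iteration: each sentence receives score from the sentences linking to it.
  let scores: number[] = sentences.map(() => 1 / n);
  for (let round = 0; round < rounds; round++) {
    const next: number[] = [];
    for (let i = 0; i < n; i++) {
      let incoming = 0;
      for (let j = 0; j < n; j++) incoming += weights[j][i] * scores[j];
      next.push((1 - damping) / n + damping * incoming);
    }
    scores = next;
  }

  // Keep the top-N by score, then restore the original sentence order.
  return scores
    .map((score, index) => ({ index, score }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topN)
    .sort((a, b) => a.index - b.index)
    .map((entry) => sentences[entry.index]);
}
```

A sentence that shares no words with any other has no incoming edges, so it keeps only the baseline `(1 - damping) / n` score and ranks last.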
TF-IDF:
- Each sentence is treated as a document
- TF-IDF score computed per word per sentence
- Sentence score = average TF-IDF of its significant words
- Rare but meaningful words are weighted higher
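
A hand-rolled sketch of this scoring, assuming `tf = count / sentence length` and `idf = ln(N / document frequency)`. The project relies on the `natural` library, whose exact weighting may differ. The stop-word list and the function names here are illustrative assumptions, and "significant words" is taken to mean non-stop-words.

```typescript
// Tiny illustrative stop-word list (an assumption, not the project's list).
const TFIDF_STOP_WORDS = ["the", "a", "an", "is", "are", "of", "in", "on", "and", "to", "it"];

function significantWords(sentence: string): string[] {
  const tokens: string[] = sentence.toLowerCase().match(/[a-z']+/g) || [];
  return tokens.filter((t) => TFIDF_STOP_WORDS.indexOf(t) === -1);
}

function tfidfSentenceScores(sentences: string[]): number[] {
  // Per-sentence word counts: each sentence acts as one document.
  const counts = sentences.map((s) => {
    const c = new Map<string, number>();
    for (const w of significantWords(s)) c.set(w, (c.get(w) || 0) + 1);
    return c;
  });

  // Document frequency: in how many sentences each word appears.
  const df = new Map<string, number>();
  counts.forEach((c) => c.forEach((_n, w) => df.set(w, (df.get(w) || 0) + 1)));

  return counts.map((c) => {
    let totalTokens = 0;
    c.forEach((n) => {
      totalTokens += n;
    });
    if (totalTokens === 0) return 0;

    let sum = 0;
    c.forEach((n, w) => {
      const tf = n / totalTokens;
      const idf = Math.log(sentences.length / (df.get(w) || 1)); // rarer words get a larger idf
      sum += tf * idf;
    });
    return sum / c.size; // average over the sentence's distinct significant words
  });
}
```

A word that appears in every sentence gets `idf = ln(1) = 0`, so it contributes nothing to any sentence's score.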
Word Frequency:
- Word frequency map built from entire text
- Stop words and numbers removed
- Frequencies normalized against the max
- Sentences scored by average normalized frequency of their words
- Best for news articles with repeated key terms
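
The frequency method is the simplest of the three and might look like this in TypeScript. The stop-word list, tokenizer, and function names are assumptions for illustration. The backend uses the `stopword` package for the filtering step.

```typescript
// Illustrative stop-word list (assumption).
const FREQ_STOP_WORDS = ["the", "a", "an", "is", "are", "of", "in", "on", "and", "to", "it"];

// Lowercase tokens with stop words and any token containing a digit removed.
function contentWords(text: string): string[] {
  const tokens: string[] = text.toLowerCase().match(/[a-z0-9']+/g) || [];
  return tokens.filter((t) => !/\d/.test(t) && FREQ_STOP_WORDS.indexOf(t) === -1);
}

function frequencySentenceScores(sentences: string[]): number[] {
  // Frequency map over the entire text.
  const freq = new Map<string, number>();
  for (const sentence of sentences) {
    for (const w of contentWords(sentence)) freq.set(w, (freq.get(w) || 0) + 1);
  }

  // Normalize against the most frequent word.
  let max = 0;
  freq.forEach((n) => {
    if (n > max) max = n;
  });

  // Sentence score = average normalized frequency of its words.
  return sentences.map((sentence) => {
    const words = contentWords(sentence);
    if (words.length === 0 || max === 0) return 0;
    let sum = 0;
    for (const w of words) sum += (freq.get(w) || 0) / max;
    return sum / words.length;
  });
}
```

For the three sentences "Solar power is growing fast.", "Solar power costs fell in 2024." and "Wind turbines are noisy.", the repeated terms "solar" and "power" lift the first two sentences to 0.75 while the third scores 0.5.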
Named Entity Recognition uses the `compromise` NLP library to extract:
| Entity Type | Example |
|---|---|
| People | Steve Jobs, Elon Musk |
| Places | California, United States, London |
| Organizations | Google, Microsoft, OpenAI |
| Numbers & Values | 2024, 42 billion, 15th |
Two readability scores are computed for both the original text and the summary.

Flesch Reading Ease:

    206.835 − (1.015 × avg words/sentence) − (84.6 × avg syllables/word)

| Score | Label |
|---|---|
| 90–100 | Very Easy |
| 70–89 | Easy |
| 60–69 | Standard |
| 50–59 | Fairly Difficult |
| 30–49 | Difficult |
| 0–29 | Very Difficult |

Flesch-Kincaid Grade Level:

    (0.39 × avg words/sentence) + (11.8 × avg syllables/word) − 15.59

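
Both formulas and the label table translate directly into code. The sketch below starts from the two averages and does not show how syllables are counted. The function names are hypothetical, and treating each band as "at or above its lower bound" (so 89.5 counts as Easy) is an assumption about scores that fall between the table's integer ranges.

```typescript
// Flesch Reading Ease: higher means easier to read.
function fleschReadingEase(avgWordsPerSentence: number, avgSyllablesPerWord: number): number {
  return 206.835 - 1.015 * avgWordsPerSentence - 84.6 * avgSyllablesPerWord;
}

// Flesch-Kincaid Grade Level: approximate US school grade.
function fleschKincaidGrade(avgWordsPerSentence: number, avgSyllablesPerWord: number): number {
  return 0.39 * avgWordsPerSentence + 11.8 * avgSyllablesPerWord - 15.59;
}

// Map a reading-ease score to the labels in the table above.
function easeLabel(score: number): string {
  if (score >= 90) return "Very Easy";
  if (score >= 70) return "Easy";
  if (score >= 60) return "Standard";
  if (score >= 50) return "Fairly Difficult";
  if (score >= 30) return "Difficult";
  return "Very Difficult";
}

// Example: 15 words per sentence and 1.5 syllables per word.
const exampleEase = fleschReadingEase(15, 1.5); // about 64.71
const exampleGrade = fleschKincaidGrade(15, 1.5); // about 7.96
console.log(exampleEase.toFixed(2), easeLabel(exampleEase), exampleGrade.toFixed(2));
```

Longer sentences and longer words both lower the ease score and raise the grade level, which is why the two numbers usually move in opposite directions.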
| Tab | Content |
|---|---|
| Summary | Extracted sentences numbered in order, copy button |
| Keywords | Top 10 TF-IDF keywords with score bar chart |
| Entities | Color-coded tags — People (purple), Places (green), Orgs (yellow), Numbers (blue) |
| Readability | Flesch-Kincaid ease meter + grade level for original vs summary |
| Stats | Word counts, sentence counts, compression ratio, reading time |
| Library | Purpose |
|---|---|
| `natural` | Tokenization, TF-IDF, sentence splitting |
| `compromise` | Named entity recognition (people, places, orgs) |
| `stopword` | Remove common stop words |
| `express` | REST API server |
| `express-validator` | Input validation |
| `helmet` | HTTP security headers |
| `express-rate-limit` | 100 req / 15 min rate limiting |
| `@angular/common/http` | HttpClient for API calls |
| `@angular/animations` | Fade-in & list stagger animations |

- Standalone components (Angular 17, no NgModule)
- Signals (`signal()`, `computed()`) for reactive state
- HttpClient with typed `Observable` responses
- Animations (`@fadeIn`, `@listStagger`)
- FormsModule with `ngModel` two-way binding
- RxJS `catchError`, `map` for error handling