Stefan Pavlovic serder8866

Stefan Pavlovic

MPhil in Speech and Language Processing · Trinity College Dublin (Distinction, 2025)

I work at the intersection of NLP, speech processing, and applied ML. My research focus is low-resource and multilingual NLP — building systems that work for languages that most of the field ignores.

Projects

🔤 Low-Resource NLP for Serbian

Fine-tuning XLM-R and BERTić for Serbian sentiment analysis using cross-lingual label projection from machine-translated English data. Investigates whether Google Translate can substitute for scarce human-labelled data — BERTić hits 90.2% accuracy, and the F1 gap between native and translated training conditions is under 2%.

transformers PyTorch Hugging Face cross-lingual NLP Serbian

🤖 AI Agent with Tool Use

Command-line LLM agent built on Gemini's API. Accepts a natural language prompt and iteratively reasons toward a solution by calling tools — reading files, writing code, executing Python — until it reaches a final answer. Implements the same agentic loop pattern that underpins modern AI coding assistants.

Python Gemini API function calling agent architecture

🔊 Formant Synthesizer

Cascade and parallel formant synthesizers for producing synthetic vowels using discrete-time IIR filters. Implements and compares two vocal tract models across four vowel types (/i/, /a/, /u/, schwa), with magnitude/phase response analysis.

Python scipy signal processing speech synthesis

📚 Lexical Simplification in Translation (in progress)

Corpus-based investigation of simplification as a translation universal. Compares lexical variety, density, and high-frequency word coverage across translated (German→English) and non-translated English literary fiction using NLTK, SciPy, and JASP. Built on the Standardized Project Gutenberg corpus.

Python NLTK corpus linguistics translation studies

Stack

Languages: Python (primary), some C and Go NLP / ML: Hugging Face Transformers, PyTorch, scikit-learn, NLTK, spaCy
Speech: scipy signal processing, formant synthesis, speech feature extraction
Data: pandas, numpy, matplotlib, JASP
Tools: Git, Jupyter, VS Code, Miniconda

Background

Before the MPhil I studied Psychology at Webster University, which means I approach language problems with a grounding in experimental design, statistics, and how humans actually process language — not just how models do. My coursework projects were all implemented from scratch in Python, covering the full pipeline from data collection and preprocessing through to model training, evaluation, and writeup. My thesis, on the other hand, was an exploration of a linking hypothesis between modern theories of syntax and recent work in neuroscience. As it turns out, ML has a lot of potential to further our understanding at the intersection of these fields.

I'm based in Dublin and currently looking for NLP / ML engineer roles. Feel free to reach out.

📫 LinkedIn

Provide feedback

Saved searches

Use saved searches to filter your results more quickly