Skip to content
View serder8866's full-sized avatar

Block or report serder8866

Block user

Prevent this user from interacting with your repositories and sending you notifications. Learn more about blocking users.

You must be logged in to block users.

Maximum 250 characters. Please don’t include any personal information such as legal names or email addresses. Markdown is supported. This note will only be visible to you.
Report abuse

Contact GitHub support about this user’s behavior. Learn more about reporting abuse.

Report abuse
serder8866/README.md

Stefan Pavlovic

MPhil in Speech and Language Processing · Trinity College Dublin (Distinction, 2025)

I work at the intersection of NLP, speech processing, and applied ML. My research focus is low-resource and multilingual NLP — building systems that work for languages that most of the field ignores.


Projects

Fine-tuning XLM-R and BERTić for Serbian sentiment analysis using cross-lingual label projection from machine-translated English data. Investigates whether Google Translate can substitute for scarce human-labelled data — BERTić hits 90.2% accuracy, and the F1 gap between native and translated training conditions is under 2%.

transformers PyTorch Hugging Face cross-lingual NLP Serbian


Command-line LLM agent built on Gemini's API. Accepts a natural language prompt and iteratively reasons toward a solution by calling tools — reading files, writing code, executing Python — until it reaches a final answer. Implements the same agentic loop pattern that underpins modern AI coding assistants.

Python Gemini API function calling agent architecture


Cascade and parallel formant synthesizers for producing synthetic vowels using discrete-time IIR filters. Implements and compares two vocal tract models across four vowel types (/i/, /a/, /u/, schwa), with magnitude/phase response analysis.

Python scipy signal processing speech synthesis


Corpus-based investigation of simplification as a translation universal. Compares lexical variety, density, and high-frequency word coverage across translated (German→English) and non-translated English literary fiction using NLTK, SciPy, and JASP. Built on the Standardized Project Gutenberg corpus.

Python NLTK corpus linguistics translation studies


Stack

Languages: Python (primary), some C and Go NLP / ML: Hugging Face Transformers, PyTorch, scikit-learn, NLTK, spaCy
Speech: scipy signal processing, formant synthesis, speech feature extraction
Data: pandas, numpy, matplotlib, JASP
Tools: Git, Jupyter, VS Code, Miniconda


Background

Before the MPhil I studied Psychology at Webster University, which means I approach language problems with a grounding in experimental design, statistics, and how humans actually process language — not just how models do. My coursework projects were all implemented from scratch in Python, covering the full pipeline from data collection and preprocessing through to model training, evaluation, and writeup. My thesis, on the other hand, was an exploration of a linking hypothesis between modern theories of syntax and recent work in neuroscience. As it turns out, ML has a lot of potential to further our understanding at the intersection of these fields.

I'm based in Dublin and currently looking for NLP / ML engineer roles. Feel free to reach out.

📫 LinkedIn

Pinned Loading

  1. serbian-sentiment-low-resource-nlp serbian-sentiment-low-resource-nlp Public

    Fine-tuning XLM-R and BERTić for Serbian sentiment analysis using cross-lingual label projection from machine-translated English data

    Python

  2. AI_Agent AI_Agent Public

    Command-line AI agent with tool use built on Gemini — reads, writes, and executes code iteratively via an LLM reasoning loop

    Python

  3. Formant-Synthesizer Formant-Synthesizer Public

    Cascade and Parallel Formant Synthesizers for producing synthetic vowels

    Python 1