muehlenbernd/README.md

Roland Mühlenbernd

ML Researcher · LLM Evaluation · Computational Pragmatics · Berlin

I develop methods to evaluate whether large language models genuinely understand social and pragmatic language — politeness, register, precision, and the fine-grained signals humans navigate effortlessly.

My background spans Computer Science (BSc), Interdisciplinary Media Studies (MSc), and Computational Linguistics (PhD), with 15+ years of formal modeling work in game theory, probabilistic NLP, and multi-agent systems.

Current focus:

  • Novel calibration metrics (ESR, CDS) for LLM evaluation on social meaning tasks
  • Benchmarking GPT-4, Claude, and Gemini on pragmatic phenomena
  • Probabilistic speaker models of politeness and register

Stack: Python · PyTorch · HuggingFace · scikit-learn · R

Links: muehlenbernd.net · LinkedIn

Pinned

  1. prisma-chatbot (Public)

    PRISMA: a chatbot that responds to you and forms impressions of how you write. Research demo on LLM social perception.

    Python

  2. llm-social-calibration (Public)

    Evaluation framework and LLM ratings for social meaning calibration (ESR, CDS metrics) — CMCL 2026

    Jupyter Notebook