
Infer audio encoding format from source instead of defaulting to MP3 #794

Open
jaredoconnell wants to merge 1 commit into vllm-project:main from jaredoconnell:fix/audio-formats

jaredoconnell (Collaborator) commented Jun 12, 2026

Summary

Uses the source's audio format when encoding, rather than defaulting to MP3.

Details

  • If the user does not provide a format, the format read from the dataset is used as-is.
  • If the user does not provide a format and none can be inferred from the source, the format defaults to WAV and the user is warned.
  • Includes a fix for a missing super call in the response handler.
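
The precedence described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `resolve_audio_format`, its parameters, and `FALLBACK_FORMAT` are hypothetical names, and the use of Python's `warnings` module is an assumption about how the warning might be surfaced.

```python
import warnings
from typing import Optional

# Hypothetical constant: the format used when nothing else is known.
FALLBACK_FORMAT = "wav"


def resolve_audio_format(
    user_format: Optional[str], source_format: Optional[str]
) -> str:
    """Pick an encoding format: user choice, then source format, then WAV."""
    if user_format:
        # An explicit user setting always wins.
        return user_format.lower()
    if source_format:
        # Otherwise reuse whatever format the dataset reported.
        return source_format.lower()
    # Nothing provided and nothing inferred: fall back and warn.
    warnings.warn(
        "No audio format was provided and none could be inferred; "
        f"falling back to {FALLBACK_FORMAT.upper()}.",
        stacklevel=2,
    )
    return FALLBACK_FORMAT
```

With this ordering, passing an explicit `"mp3"` reproduces the old behavior, while leaving the format unset keeps the dataset's own encoding and avoids an unnecessary transcode.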

Test Plan

Make sure you install vLLM with the audio features.

You can run this with multiple datasets. Some options include:

guidellm benchmark run \
  --target "http://localhost:8000" \
  --request-type /v1/audio/transcriptions \
  --profile kind=synchronous \
  --max-requests 10 \
  --data '{"kind": "huggingface", "source": "google/fleurs", "load_kwargs": {"name": "en_us", "split": "test"}}' \
  --data-column-mapper '{"column_mappings": {"audio_column": "audio"}}' \
  --disable-progress

guidellm benchmark run \
  --target "http://localhost:8000" \
  --request-type /v1/audio/transcriptions \
  --profile kind=synchronous \
  --max-requests 10 \
  --data '{"kind": "huggingface", "source": "openslr/librispeech_asr", "load_kwargs": {"name": "clean", "split": "test"}}' \
  --data-column-mapper '{"column_mappings": {"audio_column": "audio"}}' \
  --disable-progress

To explicitly encode as MP3, matching the previous default behavior, run:

guidellm benchmark run \
  --target "http://localhost:8000" \
  --request-type /v1/audio/transcriptions \
  --profile kind=synchronous \
  --max-requests 10 \
  --data '{"kind": "huggingface", "source": "openslr/librispeech_asr", "load_kwargs": {"name": "clean", "split": "test"}}' \
  --data-column-mapper '{"column_mappings": {"audio_column": "audio"}}' \
  --data-preprocessors '{"kind": "encode_media", "audio_kwargs": {"audio_format": "mp3"}}' \
  --disable-progress
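
The JSON passed to `--data-preprocessors` is easy to misquote by hand. One way to generate it is sketched below; `preprocessor_arg` is a hypothetical helper written for this example, and only the keys shown in the command above are taken from this PR.

```python
import json
import shlex


def preprocessor_arg(audio_format: str) -> str:
    """Return a shell-quoted JSON value for --data-preprocessors."""
    config = {
        "kind": "encode_media",
        "audio_kwargs": {"audio_format": audio_format},
    }
    # json.dumps yields valid JSON; shlex.quote makes it safe for a POSIX shell.
    return shlex.quote(json.dumps(config))


if __name__ == "__main__":
    print(f"--data-preprocessors {preprocessor_arg('mp3')}")
```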

Related Issues


  • "I certify that all code in this PR is my own, except as noted below."

Use of AI

  • Includes code generated or substantially modified by an AI agent
  • Includes tests generated or substantially modified by an AI agent

NOTE: the Generated-by or Assisted-by trailers should be used in git commit messages when code or tests were generated or substantially modified by an AI agent, as described in the project's DEVELOPING.md file.


git log

commit 6cc6e9f
Author: Jared O'Connell <joconnel@redhat.com>
Date: Thu Jun 11 15:00:08 2026 -0400

Infer audio encoding format from source instead of defaulting to MP3

Generated-by: Cursor AI Claude Opus 4.6
Signed-off-by: Jared O'Connell <joconnel@redhat.com>


Development

Successfully merging this pull request may close these issues.

Avoid default MP3 transcoding for transcription benchmarks
