Skip to content

SillyNickDev/browsinc

Repository files navigation

BrowSync

Inferred eyebrow tracking for VRChat — no Quest Pro required.

Estimates VRCFT Unified Expression brow parameters from eye tracking, lower face tracking, and microphone prosody data using a rule-based pipeline with a learned GRU residual model. Works with any OpenXR-compatible headset.


Requirements

  • Python 3.11+
  • SteamVR (required for head motion tracking; other runtimes fall back to no head tracking)
  • VRCFaceTracking (VRCFT) for eye/face input
  • Microphone (optional, improves prosody-driven brow movement)
pip install -r requirements.txt

Quick Start

# Default startup (will load the model at start)
python -m ws_server.server

# With ONNX model and terminal UI
python -m ws_server.server --model models/browsync.onnx

# Headless (log to console, no TUI)
python -m ws_server.server --model models/browsync.onnx --no-tui

# Custom port (default: 7720)
python -m ws_server.server --port 7721

Note: Head motion tracking uses SteamVR via OpenVR's background mode — BrowSync does not need a graphics context, but SteamVR must already be running. If SteamVR is not running when BrowSync starts, head tracking is disabled and the server falls back to a lower inference mode automatically.


VRCFT Module

BrowSyncModule/ contains the C# plugin that connects VRCFT to the Python server. Each frame it sends current eye/face expression data to the server and receives inferred brow values in return, writing them into VRCFT's UnifiedTracking shape buffer so VRChat picks them up via OSC.

The module deliberately only writes brow shapes — it does not claim eye tracking or lower-face data. This means it stacks cleanly with whatever eye/face tracker you already use in VRCFT.

Prerequisites

Build and install

cd BrowSyncModule
dotnet build -c Release

The post-build step automatically copies BrowSyncModule.dll to %APPDATA%\VRCFaceTracking\CustomLibs\. VRCFT discovers modules there on startup. To install manually, copy the DLL yourself:

BrowSyncModule\bin\Release\net7.0\BrowSyncModule.dll
  → %APPDATA%\VRCFaceTracking\CustomLibs\BrowSyncModule.dll

Usage

  1. Start the Python server (see Quick Start)
  2. Launch VRCFaceTracking — the module loads automatically and connects to ws://localhost:7720
  3. If the server isn't running yet, the module will retry in the background and connect once it's up

The module sends a reset on connect to clear the GRU buffer and pings every 5 seconds to keep the connection alive. If the Python server disconnects, brow shapes are zeroed and the module reconnects automatically with a 3-second backoff.

The host and port are hardcoded to localhost:7720. If you need to change them, edit BrowSyncModule.cs and rebuild.


Terminal UI

The TUI launches automatically unless --no-tui is passed. It shows live inference mode, FPS, per-AU bar meters, and source status.

Key Action
q Quit
r Recalibrate head tracking neutral pose
d Donate current session to HF Hub (Quest Pro users)
Ctrl+L Toggle dev log

Inference Modes

The server selects the best available mode automatically and degrades gracefully as sources become unavailable. VRCFT data is considered stale after 0.5 seconds. A source must be absent for 300ms before the mode degrades (hysteresis), and mode transitions blend over 150ms to prevent visible brow snapping.

Mode Eye + Face Mic Head ML Model
ml
rules_only
mic_head
head_only
noise_only

noise_only outputs procedural animation to prevent the avatar from freezing. In this mode the VRCFT module zeros all brow shapes rather than applying the noise to the avatar.


Input Features (52 total)

Group Count Features
Eye tracking 14 Openness, wide, squint, lid tightener, gaze X/Y, blink (per side)
Lower face 13 Jaw, lip corners, cheek raise, lip stretch, and others
Prosody 6 PitchNorm, PitchDelta, EnergyNorm, EnergyDelta, SpeechRate, IsSpeaking
Emotion 3 Valence, arousal, confidence (requires SpeechBrain, optional)
Temporal deltas 5 Computed from key signals each frame
Head motion 11 Pitch/roll/yaw, Y/Z translation, linear/angular velocity and acceleration

Missing features default to 0.0 — partial tracker setups are fully supported.


Output Features (8 total)

All outputs are VRCFT Unified Expression parameters in [0, 1]:

  • BrowInnerUpLeft / BrowInnerUpRight
  • BrowOuterUpLeft / BrowOuterUpRight
  • BrowLowererLeft / BrowLowererRight
  • BrowPinchLeft / BrowPinchRight

WebSocket Protocol

Default endpoint: ws://localhost:7720

Frame (client → server)

Send eye/face feature values from VRCFT. Only include features that are available — missing keys default to 0.0.

{
  "type": "frame",
  "ts": 1234567890.123,
  "inputs": {
    "EyeOpenessLeft": 0.82,
    "EyeOpenessRight": 0.79,
    "LipCornerPullLeft": 0.3
  }
}

Brow output (server → client)

Pushed at ~90fps regardless of whether the client sends frames.

{
  "type": "brow",
  "ts": 1234567890.123,
  "outputs": {
    "BrowInnerUpLeft": 0.34,
    "BrowInnerUpRight": 0.34,
    "BrowOuterUpLeft": 0.28,
    "BrowOuterUpRight": 0.31,
    "BrowLowererLeft": 0.05,
    "BrowLowererRight": 0.06,
    "BrowPinchLeft": 0.02,
    "BrowPinchRight": 0.02
  },
  "mode": "ml",
  "head_tracking": true,
  "mic_active": true
}

Control messages

Message Description Acknowledgement
{"type": "ping"} Keepalive {"type": "pong"}
{"type": "reset"} Clear GRU buffer and recalibrate head {"type": "reset_ack"}
{"type": "recalibrate_head"} Restart head neutral pose calibration {"type": "recalibrate_head_ack"}
{"type": "set_mode", "mode": "rules_only"} Force a specific inference mode {"type": "mode_ack", "mode": "..."}
{"type": "get_status"} Request current server status {"type": "status", ...}
{"type": "get_calibration"} Calibration phase + per-AU output variance (last 3s) {"type": "calibration", ...}

Architecture

Input sources (90fps polling loop):
  VRCFT client         →  14 eye + 13 face features   (WebSocket frames from client)
  MicrophoneProcessor  →  6 prosody features           (background thread, 16kHz audio)
  HeadMotionTracker    →  11 head motion features      (background thread, OpenVR poll)
  Emotion context      →  3 features                   (optional SpeechBrain SER)
  Delta features       →  5 features                   (computed per-frame)
                          ─────────────────────────────
                          52 inputs total

  → RuleBasedEstimator    deterministic baseline → 8 AU values [0, 1]
  → GRU residual          learned correction in (−1, 1), scaled by 0.4×
  → Spring-damper smoother  per-AU physics (asymmetric attack/decay)
  → 8 VRCFT brow AU outputs, pushed to all connected clients

The GRU learns residuals on top of the rule base, not the full mapping. This means rules-only mode is immediately usable, and the ML model improves on top incrementally.

Head tracking calibrates on the first 2.5 seconds of runtime, treating the average pose as neutral. All subsequent values are offsets from that baseline.


Data Collection

Three recording paths are available depending on your setup:

Live VRCFT recorder — captures eye/face data from the C# module while you play normally:

python data/record_session.py --subject alice
# Options: --no_mic  --no_head  --val  --port 7721

Runs as a drop-in replacement for the inference server on port 7720. Merges microphone prosody and head motion into full 52-feature frames. Sends dummy zero brow responses so the C# module stays connected.

Raw Quest recorder — reads face expression weights directly from Quest hardware via OpenXR, bypassing VRCFT's remapping entirely:

python data/record_quest_raw.py --subject alice
# Requires: Meta Quest Link running, Meta set as active OpenXR runtime

Quest Pro sessions include labeled brow ground truth (has_labels=True). Quest 3 sessions are unlabeled — rule pseudo-labels are applied at training time.

Offline converter — converts already-captured recordings to BrowSync .jsonl format:

python data/convert_raw.py raw_recording.jsonl --subject alice
python data/convert_raw.py path/to/dir/ --subject alice    # batch

Accepts WebSocket frame log .jsonl or feature .csv with schema.py column names.


Training

# Place .jsonl session files in:
#   data/sessions/train/
#   data/sessions/val/

python -m training.train
# Outputs: models/checkpoints/best_model.pt + models/browsync.onnx

Each line in a session file is a BrowFrame JSON object:

{"timestamp_ms": 123.4, "inputs": [52 floats], "targets": [8 floats], "has_labels": true, "session_id": "uuid"}

Sessions without Quest Pro ground truth omit targets and set has_labels: false. They are trained with rule-based pseudo-labels at a reduced loss weight (0.25).

Training config: 60 epochs, batch size 256, LR 3e-4, early stop patience 15. Loss = MSE + temporal smoothness penalty + asymmetric output weights (brow raises weighted 1.3× vs lowers). Rule estimates are pre-cached at dataset construction time so they are not recomputed per batch. torch.compile is enabled automatically on platforms where MSVC is available.


ONNX Model

The exported models/browsync.onnx is self-contained:

  • Normalisation stats (mean/std per feature) embedded in ONNX metadata
  • Feature names embedded in ONNX metadata
  • Input: float32[batch, 30, 52] — 30-frame window at 90fps (≈333ms context)
  • Output: float32[batch, 8] — Tanh residuals, scaled 0.4× at inference time

The companion models/browsync.onnx.data file holds the weights — both files must remain together.


Data Donation

Quest Pro users can donate sessions to improve the shared model. In the TUI, press d during a recording session to upload to HF Hub. Donated sessions provide real brow AU ground truth paired with the full input feature vector. They are:

  • Stored locally in data/sessions/donated/ (gitignored) before upload
  • Uploaded to HF Hub via huggingface_hub — requires the optional huggingface_hub dependency
  • Used to fine-tune future model releases
  • Opt-in only; never collected without the explicit d keypress

About

Raise an eyebrow without raising the price of entry.

Topics

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors