Skip to content

Service: Fix automount regression - escape JSON arg passed to child mount processes#2015

Merged
tyrielv merged 2 commits into
microsoft:masterfrom
tyrielv:tyrielv/fix-automount-json-quoting
Jun 11, 2026
Merged

Service: Fix automount regression - escape JSON arg passed to child mount processes#2015
tyrielv merged 2 commits into
microsoft:masterfrom
tyrielv:tyrielv/fix-automount-json-quoting

Conversation

@tyrielv

@tyrielv tyrielv commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Symptom

Automount on the latest prerelease (v2.0.26154.1) silently failed for every
registered repo. The service log showed each GVFS.exe mount child being
started, then a 60-second WaitUntilMounted timeout and
"Unable to mount because the GVFS.Mount process is not responding". No
mount-side log file was created — the child died before the log listener
was attached.

Root cause

GVFSMountProcess.CallGVFSMount built the spawn command line by
concatenating InternalVerbParameters.ToJson() straight in:

"…\GVFS.exe" mount D:\os1 --internal_use_only {"ServiceName":null,…}

Windows CommandLineToArgvW strips the embedded " characters when
re-parsing the child's argv, so the receiving process sees
{ServiceName:null,…}. Newtonsoft.Json was permissive about unquoted
keys, so the bug was latent for years. After the
System.Text.Json migration,
strict parsing throws JsonException, the verb hits ReportErrorAndExit
before AddLogFileEventListener runs, and the child exits with no log.

Confirmed by reproducing the parse with CommandLineToArgvW:

argv[4] = '{ServiceName:null,StartedByService:true,MaintenanceJob:null,PackfileMaintenanceBatchSize:null}'

and feeding that to JsonDocument.Parse:

'S' is an invalid start of a property name. Expected a '"'.

Scope: only the SYSTEM-service automount path. gvfs mount (user-run)
and gvfs service --mount-all invoke MountVerb in-process — no
command-line round-trip, no bug.

Fix

  • CommandLineEscaping.EscapeArgument — proper Windows command-line
    argument escaping per the CommandLineToArgvW rules (quote when
    needed, double trailing backslashes that precede a ", escape literal
    " as \").
  • CurrentUser.RunAsTryRunAs(string, string[], out int processId)
    take argv as an array, escape centrally, return the child PID so
    callers can track its liveness.
  • WindowsPlatform.StartBackgroundVFS4GProcess — route every arg
    through the new escaper (previously only space-containing args got
    naively wrapped, and embedded " were never escaped).
  • GVFSEnlistment.WaitUntilMounted — new overload accepting
    Func<MountProcessSnapshot>. When supplied, the wait loop polls the
    snapshot between short (500 ms) connect attempts and fails fast with
    the exit code if the child has already terminated, instead of blocking
    for the full 60-second pipe timeout.
  • GVFSMountProcess — pass the spawned PID into the new wait so
    logon automount detects an early GVFS.exe exit within seconds and
    logs "GVFS.Mount process (Id N) exited with code X before the named pipe was ready" instead of "not responding".

Tests

  • CommandLineEscapingTests — canonical edge cases (spaces, tabs,
    embedded quotes, trailing backslashes) plus the exact
    InternalVerbParameters JSON shape that triggered the regression. A
    round-trip test feeds the escaped form back through
    CommandLineToArgvW and verifies the original string pops out
    unchanged.
  • WaitUntilMountedProcessTrackingTests — verifies the wait
    short-circuits on an exit reported by the first snapshot poll, and
    also catches a late exit reported on a later poll, both well under
    the legacy 60-second timeout.

All 842 unit tests pass.

Validation checklist

  • dotnet build GVFS\GVFS.UnitTests\GVFS.UnitTests.csproj -c Debug — 0 errors
  • All unit tests green (842 passed, including 24 new)
  • Manual smoke: install local Debug build, register two repos with
    gvfs mount, restart GVFS.Service, sign out / sign in, confirm
    both repos automount.
  • Manual fault injection: temporarily make GVFS.exe exit non-zero
    in the mount verb and verify the service logs the new
    "GVFS.Mount process (Id N) exited with code X before the named pipe was ready" message within a few seconds rather than waiting
    60 s.

GVFS.Service uses CreateProcessAsUser to spawn 'GVFS.exe mount <repo>
--internal_use_only <json>' on each user logon.  The JSON arg
({"ServiceName":null,"StartedByService":true,...}) was concatenated
straight into the command line with no escaping.  Windows
CommandLineToArgvW strips the embedded " characters when re-parsing
argv in the child, so the receiving process sees
{ServiceName:null,StartedByService:true,...}.

Newtonsoft.Json was permissive enough to round-trip that corrupted
payload.  After commit c83beab ("Migrate from Newtonsoft.Json to
System.Text.Json"), System.Text.Json (strict — quoted property names
required) throws JsonException, the verb hits ReportErrorAndExit
*before* AddLogFileEventListener runs, and the child exits without ever
creating a mount log file.  The service then blocks for the full
60-second WaitUntilMounted timeout on each repo, fails automount, and
reports only "the GVFS.Mount process is not responding."

Fix:

  * Add CommandLineEscaping.EscapeArgument implementing the
    CommandLineToArgvW escaping rules (quote when needed, double
    backslashes that precede a quote terminator, escape literal "
    as \").

  * Replace CurrentUser.RunAs(string, string) with TryRunAs(string,
    string[], out int processId) so callers supply argv values and
    the escaping happens centrally.  Also surface the child PID so
    callers can track its liveness.

  * WindowsPlatform.StartBackgroundVFS4GProcess: route every arg
    through the new escaper (previously only space-containing args
    got naively quoted, embedded " was never escaped).

  * GVFSEnlistment.WaitUntilMounted: add an overload that takes a
    Func<MountProcessSnapshot>.  When supplied, the wait loop polls
    the snapshot between short (500 ms) connect attempts and fails
    fast with the exit code if the child has already terminated,
    instead of blocking for the full pipe timeout.

  * GVFSMountProcess: pass the spawned child's PID into the new
    WaitUntilMounted overload so logon automount detects an early
    GVFS.exe exit within seconds and logs a useful diagnostic
    ("GVFS.Mount process (Id N) exited with code X before the named
    pipe was ready") instead of a 60-second silent timeout.

Tests:

  * CommandLineEscapingTests covers the canonical edge cases
    (spaces, tabs, embedded quotes, trailing backslashes) and the
    exact InternalVerbParameters JSON shape that triggered the
    regression.  A round-trip test feeds the escaped form back
    through CommandLineToArgvW and verifies the original string
    pops out unchanged.

  * WaitUntilMountedProcessTrackingTests verifies the wait
    short-circuits on an exit reported by the first snapshot poll,
    and also catches a late exit reported on a later poll, both
    well under the legacy 60-second timeout.

Assisted-by: Claude Opus 4.7
Signed-off-by: Tyrie Vella <tyrielv@gmail.com>
@tyrielv tyrielv marked this pull request as ready for review June 10, 2026 23:25
Address code-review feedback on the prior commit.  The wait loop used
the spawned mount process's PID and re-opened it via
Process.GetProcessById on every poll, which has two problems:

1. The lookup races with the very failure mode the fast-fail path is
   meant to catch.  If the child exits between CreateProcessAsUser
   returning and GVFSMountProcess looking it up, GetProcessById
   throws ArgumentException, TryGetProcessById returns null, and
   WaitUntilMounted silently falls back to the legacy single 60-second
   blocking Connect — defeating the whole purpose.

2. Querying by PID is vulnerable to PID reuse: if Windows recycles
   the PID quickly enough, a snapshot poll could observe an
   unrelated process's exit state.

Both problems disappear if we keep the process handle that
CreateProcessAsUser already returned.  The kernel keeps the process
object alive for as long as we hold the handle, so liveness and
exit-code queries remain valid even after the child has exited, and
PID reuse is irrelevant because we never look up by PID again.

Changes:

  * CurrentUser.TryRunAs returns a SafeProcessHandle (owned) instead
    of just a PID.  Internally, the thread handle is still closed
    immediately; only the process handle is retained.

  * New ProcessHandleHelper exposes HasExited (non-blocking
    WaitForSingleObject(handle, 0)) and TryGetExitCode
    (GetExitCodeProcess) directly against the handle.  This avoids
    the STILL_ACTIVE (259) ambiguity in GetExitCodeProcess alone —
    we only ask for the exit code once the wait says signaled.

  * GVFSMountProcess.MountRepository owns the SafeProcessHandle in a
    using block spanning the entire WaitUntilMounted call and uses
    the handle for snapshot queries.

  * Removed TryGetProcessById and the Process.GetProcessById /
    Process.HasExited path entirely — no PID lookup remains in the
    automount code path.

Tests:

  * Build clean, 0 errors.
  * All 842 unit tests still pass (24 new + 818 existing).  The
    existing WaitUntilMountedProcessTrackingTests don't need
    changes because they exercise the snapshot delegate, not the
    handle plumbing.

Assisted-by: Claude Opus 4.7
Signed-off-by: Tyrie Vella <tyrielv@gmail.com>
@tyrielv tyrielv changed the title Service: escape JSON arg passed to child mount processes Service: Fix automount regression - escape JSON arg passed to child mount processes Jun 11, 2026
@tyrielv tyrielv merged commit 69afbc0 into microsoft:master Jun 11, 2026
51 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants