From 8349d7e93d66c59b80764dc2f0c33bf88651514c Mon Sep 17 00:00:00 2001 From: Nat Kershaw Date: Thu, 18 Jun 2026 18:00:27 -0700 Subject: [PATCH] Add vision (image understanding) web server samples for C#, JS, Rust MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Adds three new samples demonstrating image understanding via the Foundry Local web server (Responses API) using the qwen3.5-0.8b vision model: - samples/cs/foundry-local-web-server-responses-vision - samples/js/web-server-responses-vision-example - samples/rust/foundry-local-webserver-responses-vision Each sample initializes the SDK, downloads and loads a vision-capable model, starts the local web service, sends a streaming Responses API request with a base64-encoded image, and prints the streamed assistant output. Also updates the language-level READMEs and the root samples README to list the new samples, and adds the new crate to the Rust workspace members in samples/rust/Cargo.toml. Rust fix: removed reqwest `.timeout(Duration::from_secs(0))` because reqwest treats a zero Duration as 'expire immediately', not 'no timeout', which caused every request to fail with TimedOut. JS sample notes: - Removed `foundry-local-sdk-winml` from optionalDependencies so the standard installer ships the Core dylib on macOS/Linux (the winml installer has no payload for those platforms). - Use `model.isCached` (property getter) instead of `await model.isCached()`. Verified end-to-end on macOS (darwin-arm64) with two images. 
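Note on the Rust timeout fix: the sketch below illustrates why a zero timeout fails every request, assuming a client that converts its timeout into a deadline when the request is sent. `deadline_from` and `is_expired` are hypothetical helpers written only for this note. They are not reqwest APIs, and reqwest's internals may differ.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper (not a reqwest API): turn an optional timeout into an
/// optional deadline. `None` models "no timeout configured".
fn deadline_from(start: Instant, timeout: Option<Duration>) -> Option<Instant> {
    timeout.map(|t| start + t)
}

/// Hypothetical helper: a request counts as expired once `now` reaches its
/// deadline. Without a deadline it never expires.
fn is_expired(deadline: Option<Instant>, now: Instant) -> bool {
    match deadline {
        Some(d) => now >= d,
        None => false,
    }
}
```

Under that assumption, `Some(Duration::from_secs(0))` produces a deadline equal to the start time, so the request is expired before any bytes are read, while `None` never expires. In the Rust sample the equivalent of `None` is leaving `.timeout(...)` off the client builder, which is what this patch does.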
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- samples/README.md | 6 +- samples/cs/README.md | 1 + ...oundryLocalWebServerResponsesVision.csproj | 54 +++ .../FoundryLocalWebServerResponsesVision.sln | 34 ++ .../Program.cs | 232 ++++++++++++ .../test_image.jpg | Bin 0 -> 6828 bytes samples/js/README.md | 1 + .../app.js | 209 +++++++++++ .../package.json | 12 + .../test_image.jpg | Bin 0 -> 6828 bytes samples/rust/Cargo.toml | 1 + samples/rust/README.md | 1 + .../Cargo.toml | 16 + .../src/main.rs | 330 ++++++++++++++++++ .../test_image.jpg | Bin 0 -> 6828 bytes 15 files changed, 894 insertions(+), 3 deletions(-) create mode 100644 samples/cs/foundry-local-web-server-responses-vision/FoundryLocalWebServerResponsesVision.csproj create mode 100644 samples/cs/foundry-local-web-server-responses-vision/FoundryLocalWebServerResponsesVision.sln create mode 100644 samples/cs/foundry-local-web-server-responses-vision/Program.cs create mode 100644 samples/cs/foundry-local-web-server-responses-vision/test_image.jpg create mode 100644 samples/js/web-server-responses-vision-example/app.js create mode 100644 samples/js/web-server-responses-vision-example/package.json create mode 100644 samples/js/web-server-responses-vision-example/test_image.jpg create mode 100644 samples/rust/foundry-local-webserver-responses-vision/Cargo.toml create mode 100644 samples/rust/foundry-local-webserver-responses-vision/src/main.rs create mode 100644 samples/rust/foundry-local-webserver-responses-vision/test_image.jpg diff --git a/samples/README.md b/samples/README.md index ebd1afb8c..a9d680412 100644 --- a/samples/README.md +++ b/samples/README.md @@ -8,8 +8,8 @@ Explore complete working examples that demonstrate how to use Foundry Local — | Language | Samples | Description | |----------|---------|-------------| -| [**C#**](cs/) | 13 | .NET SDK samples including native chat, embeddings, audio transcription, tool calling, model management, web server, tutorials, and WinML EP 
verification. Uses WinML on Windows for hardware acceleration. | -| [**JavaScript**](js/) | 15 | Node.js SDK samples including native chat, embeddings, audio transcription, Electron desktop app, Copilot SDK integration, LangChain, tool calling, web server, tutorials, and WinML EP verification. | +| [**C#**](cs/) | 14 | .NET SDK samples including native chat, embeddings, audio transcription, tool calling, model management, web server, vision via Responses API, tutorials, and WinML EP verification. Uses WinML on Windows for hardware acceleration. | +| [**JavaScript**](js/) | 16 | Node.js SDK samples including native chat, embeddings, audio transcription, Electron desktop app, Copilot SDK integration, LangChain, tool calling, web server, vision via Responses API, tutorials, and WinML EP verification. | | [**Python**](python/) | 14 | Python samples using the OpenAI-compatible API, including chat, embeddings, audio transcription, LangChain integration, tool calling, web server, Responses API, tutorials, and WinML EP verification. | -| [**Rust**](rust/) | 11 | Rust SDK samples including native chat, embeddings, audio transcription, tool calling, web server, tutorials, and WinML EP verification. | +| [**Rust**](rust/) | 12 | Rust SDK samples including native chat, embeddings, audio transcription, tool calling, web server, vision via Responses API, tutorials, and WinML EP verification. | | [**C++**](cpp/) | 1 | C++ sample for live audio transcription. | diff --git a/samples/cs/README.md b/samples/cs/README.md index ad10a3c65..fb594717e 100644 --- a/samples/cs/README.md +++ b/samples/cs/README.md @@ -15,6 +15,7 @@ Both packages provide the same APIs, so the same source code works on all platfo | [embeddings](embeddings/) | Generate single and batch text embeddings using the Foundry Local SDK. | | [audio-transcription-example](audio-transcription-example/) | Transcribe audio files using the Foundry Local SDK. 
| | [foundry-local-web-server](foundry-local-web-server/) | Set up a local OpenAI-compliant web server. | +| [foundry-local-web-server-responses-vision](foundry-local-web-server-responses-vision/) | Stream a vision (image understanding) response from the local web server using the Responses API. | | [tool-calling-foundry-local-sdk](tool-calling-foundry-local-sdk/) | Use tool calling with native chat completions. | | [tool-calling-foundry-local-web-server](tool-calling-foundry-local-web-server/) | Use tool calling with the local web server. | | [model-management-example](model-management-example/) | Manage models, variant selection, and updates. | diff --git a/samples/cs/foundry-local-web-server-responses-vision/FoundryLocalWebServerResponsesVision.csproj b/samples/cs/foundry-local-web-server-responses-vision/FoundryLocalWebServerResponsesVision.csproj new file mode 100644 index 000000000..06e29a5d2 --- /dev/null +++ b/samples/cs/foundry-local-web-server-responses-vision/FoundryLocalWebServerResponsesVision.csproj @@ -0,0 +1,54 @@ + + + + Exe + enable + enable + + + + + net9.0-windows10.0.18362.0 + ARM64;x64 + None + false + + + + + net9.0 + + + + $(NETCoreSdkRuntimeIdentifier) + + + + + + + + + + + + + + + + + + + + + + PreserveNewest + + + + + + + + + diff --git a/samples/cs/foundry-local-web-server-responses-vision/FoundryLocalWebServerResponsesVision.sln b/samples/cs/foundry-local-web-server-responses-vision/FoundryLocalWebServerResponsesVision.sln new file mode 100644 index 000000000..ac1df4ebb --- /dev/null +++ b/samples/cs/foundry-local-web-server-responses-vision/FoundryLocalWebServerResponsesVision.sln @@ -0,0 +1,34 @@ + +Microsoft Visual Studio Solution File, Format Version 12.00 +# Visual Studio Version 17 +VisualStudioVersion = 17.0.31903.59 +MinimumVisualStudioVersion = 10.0.40219.1 +Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "FoundryLocalWebServerResponsesVision", "FoundryLocalWebServerResponsesVision.csproj", 
"{8B4D2C97-2B5D-4A4E-9D31-7C8A6E6F3F11}" +EndProject +Global + GlobalSection(SolutionConfigurationPlatforms) = preSolution + Debug|Any CPU = Debug|Any CPU + Debug|x64 = Debug|x64 + Debug|x86 = Debug|x86 + Release|Any CPU = Release|Any CPU + Release|x64 = Release|x64 + Release|x86 = Release|x86 + EndGlobalSection + GlobalSection(ProjectConfigurationPlatforms) = postSolution + {8B4D2C97-2B5D-4A4E-9D31-7C8A6E6F3F11}.Debug|Any CPU.ActiveCfg = Debug|ARM64 + {8B4D2C97-2B5D-4A4E-9D31-7C8A6E6F3F11}.Debug|Any CPU.Build.0 = Debug|ARM64 + {8B4D2C97-2B5D-4A4E-9D31-7C8A6E6F3F11}.Debug|x64.ActiveCfg = Debug|x64 + {8B4D2C97-2B5D-4A4E-9D31-7C8A6E6F3F11}.Debug|x64.Build.0 = Debug|x64 + {8B4D2C97-2B5D-4A4E-9D31-7C8A6E6F3F11}.Debug|x86.ActiveCfg = Debug|ARM64 + {8B4D2C97-2B5D-4A4E-9D31-7C8A6E6F3F11}.Debug|x86.Build.0 = Debug|ARM64 + {8B4D2C97-2B5D-4A4E-9D31-7C8A6E6F3F11}.Release|Any CPU.ActiveCfg = Release|ARM64 + {8B4D2C97-2B5D-4A4E-9D31-7C8A6E6F3F11}.Release|Any CPU.Build.0 = Release|ARM64 + {8B4D2C97-2B5D-4A4E-9D31-7C8A6E6F3F11}.Release|x64.ActiveCfg = Release|x64 + {8B4D2C97-2B5D-4A4E-9D31-7C8A6E6F3F11}.Release|x64.Build.0 = Release|x64 + {8B4D2C97-2B5D-4A4E-9D31-7C8A6E6F3F11}.Release|x86.ActiveCfg = Release|ARM64 + {8B4D2C97-2B5D-4A4E-9D31-7C8A6E6F3F11}.Release|x86.Build.0 = Release|ARM64 + EndGlobalSection + GlobalSection(SolutionProperties) = preSolution + HideSolutionNode = FALSE + EndGlobalSection +EndGlobal diff --git a/samples/cs/foundry-local-web-server-responses-vision/Program.cs b/samples/cs/foundry-local-web-server-responses-vision/Program.cs new file mode 100644 index 000000000..3048344b0 --- /dev/null +++ b/samples/cs/foundry-local-web-server-responses-vision/Program.cs @@ -0,0 +1,232 @@ +// +// +using System.Net.Http.Headers; +using System.Text; +using System.Text.Json; +using System.Text.Json.Nodes; +using Microsoft.AI.Foundry.Local; +// + +const int DefaultMaxOutputTokens = 8192; + +if (args.Length < 1) +{ + Console.Error.WriteLine("Usage: dotnet run -- 
MODEL_ALIAS_OR_VARIANT_ID [image_path]"); + Console.Error.WriteLine(" dotnet run -- --list-models"); + Console.Error.WriteLine(" Example: dotnet run -- qwen3.5-0.8b"); + Console.Error.WriteLine(" Example: dotnet run -- Qwen2.5-VL-7B-Instruct-generic-cpu"); + return 1; +} + +bool listModels = args[0] is "--list-models" or "-l"; +string? modelIdentifier = listModels ? null : args[0]; +string defaultImage = Path.Combine(AppContext.BaseDirectory, "test_image.jpg"); +string imagePath = !listModels && args.Length > 1 ? args[1] : defaultImage; + +// +var config = new Configuration +{ + AppName = "foundry_local_samples", + LogLevel = Microsoft.AI.Foundry.Local.LogLevel.Information, + Web = new Configuration.WebService + { + Urls = "http://127.0.0.1:52496" + } +}; + +await FoundryLocalManager.CreateAsync(config, Utils.GetAppLogger()); +var mgr = FoundryLocalManager.Instance; + +Console.WriteLine("\nDownloading execution providers:"); +var currentEp = ""; +await mgr.DownloadAndRegisterEpsAsync((epName, percent) => +{ + if (epName != currentEp) + { + if (currentEp != "") Console.WriteLine(); + currentEp = epName; + } + Console.Write($"\r {epName.PadRight(30)} {percent,6:F1}%"); +}); +if (currentEp != "") Console.WriteLine(); +// + +var catalog = await mgr.GetCatalogAsync(); + +if (listModels) +{ + var allModels = await catalog.ListModelsAsync(); + var visionModels = allModels + .Where(m => (m.Info?.Task ?? "").Contains("vision", StringComparison.OrdinalIgnoreCase)) + .OrderBy(m => m.Alias) + .ToList(); + + if (visionModels.Count == 0) + { + Console.WriteLine("\nNo vision models found in catalog."); + return 0; + } + + var totalVariants = visionModels.Sum(m => m.Variants.Count); + Console.WriteLine($"\nVision models in catalog ({visionModels.Count} aliases, {totalVariants} variants):"); + Console.WriteLine($" {"ALIAS",-32} {"INPUT MODALITIES",-20} {"OUTPUT MODALITIES",-20} {"TASK",-24} CAPABILITIES"); + foreach (var m in visionModels) + { + var task = m.Info?.Task ?? 
""; + var capabilities = m.Info?.Capabilities ?? ""; + var inMod = m.Info?.InputModalities ?? ""; + var outMod = m.Info?.OutputModalities ?? ""; + Console.WriteLine($" {m.Alias,-32} {inMod,-20} {outMod,-20} {task,-24} {capabilities}"); + + var variants = m.Variants + .OrderBy(v => v.Info?.Runtime?.DeviceType.ToString() ?? "") + .ThenBy(v => v.Info?.Runtime?.ExecutionProvider ?? "") + .ThenBy(v => v.Id) + .ToList(); + if (variants.Count == 0) continue; + + Console.WriteLine($" {"VARIANT ID",-54} {"DEVICE",-6} {"EXECUTION PROVIDER",-32} {"SIZE (MB)",10} CACHED"); + foreach (var v in variants) + { + var rt = v.Info?.Runtime; + var device = rt?.DeviceType.ToString() ?? ""; + var ep = rt?.ExecutionProvider ?? ""; + var size = v.Info?.FileSizeMb is int s ? s.ToString().PadLeft(10) : new string(' ', 10); + var cached = await v.IsCachedAsync() ? "yes" : "no"; + Console.WriteLine($" {v.Id,-54} {device,-6} {ep,-32} {size} {cached}"); + } + } + return 0; +} + +// +var model = await catalog.GetModelAsync(modelIdentifier!); +if (model is null) +{ + model = await catalog.GetModelVariantAsync(modelIdentifier!); +} +if (model is null) +{ + var available = (await catalog.ListModelsAsync()).Select(m => m.Alias); + Console.Error.WriteLine($"\nModel '{modelIdentifier}' not found in catalog (tried alias and variant id)."); + Console.Error.WriteLine($"Available aliases: {string.Join(", ", available)}"); + Console.Error.WriteLine("Run with --list-models to see variant ids."); + return 1; +} + +if (!await model.IsCachedAsync()) +{ + Console.WriteLine($"\nDownloading model {modelIdentifier}..."); + await model.DownloadAsync(progress => + { + Console.Write($"\rDownloading model: {progress:F2}%"); + if (progress >= 100f) Console.WriteLine(); + }); + Console.WriteLine("Model downloaded"); +} + +Console.WriteLine("\nLoading model..."); +await model.LoadAsync(); +Console.WriteLine("Model loaded"); +// + +// +Console.WriteLine("\nStarting web service..."); +await mgr.StartWebServiceAsync(); +var 
baseUrl = config.Web.Urls!.TrimEnd('/') + "/v1"; +Console.WriteLine($"Web service started on {baseUrl}"); +// + +// +Console.WriteLine($"\nPreparing image: {imagePath}"); +var (imageB64, mediaType) = EncodeImage(imagePath); + +// The Foundry Local Responses API accepts an array of message items with input_text / +// input_image content parts. The input_image part uses Foundry-specific `image_data` and +// `media_type` fields (in place of OpenAI's `image_url`). +var visionInput = new JsonArray +{ + new JsonObject + { + ["type"] = "message", + ["role"] = "user", + ["content"] = new JsonArray + { + new JsonObject { ["type"] = "input_text", ["text"] = "Describe this image." }, + new JsonObject + { + ["type"] = "input_image", + ["image_data"] = imageB64, + ["media_type"] = mediaType, + }, + }, + }, +}; + +var body = new JsonObject +{ + ["model"] = model.Id, + ["input"] = visionInput, + ["max_output_tokens"] = DefaultMaxOutputTokens, + ["stream"] = true, +}; + +using var http = new HttpClient { Timeout = Timeout.InfiniteTimeSpan }; +using var request = new HttpRequestMessage(HttpMethod.Post, $"{baseUrl}/responses") +{ + Content = new StringContent(body.ToJsonString(), Encoding.UTF8, "application/json"), +}; +request.Headers.Accept.Add(new MediaTypeWithQualityHeaderValue("text/event-stream")); +request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", "notneeded"); + +Console.WriteLine("\nStreaming vision response..."); +using var response = await http.SendAsync(request, HttpCompletionOption.ResponseHeadersRead); +response.EnsureSuccessStatusCode(); + +await using var stream = await response.Content.ReadAsStreamAsync(); +using var reader = new StreamReader(stream); + +Console.Write("[ASSISTANT]: "); +while (await reader.ReadLineAsync() is string line) +{ + if (string.IsNullOrEmpty(line) || !line.StartsWith("data: ", StringComparison.Ordinal)) + continue; + var data = line["data: ".Length..]; + if (data == "[DONE]") break; + try + { + using var doc = 
JsonDocument.Parse(data); + var root = doc.RootElement; + if (root.TryGetProperty("type", out var t) && + t.GetString() == "response.output_text.delta" && + root.TryGetProperty("delta", out var d)) + { + Console.Write(d.GetString()); + } + } + catch (JsonException) { /* ignore non-JSON keepalives */ } +} +Console.WriteLine(); +// + +await mgr.StopWebServiceAsync(); +await model.UnloadAsync(); +return 0; + +static (string Base64, string MediaType) EncodeImage(string path) +{ + var mediaTypes = new Dictionary(StringComparer.OrdinalIgnoreCase) + { + [".jpg"] = "image/jpeg", + [".jpeg"] = "image/jpeg", + [".png"] = "image/png", + [".gif"] = "image/gif", + [".bmp"] = "image/bmp", + [".webp"] = "image/webp", + }; + var ext = Path.GetExtension(path); + var mediaType = mediaTypes.TryGetValue(ext, out var m) ? m : "image/jpeg"; + var bytes = File.ReadAllBytes(path); + return (Convert.ToBase64String(bytes), mediaType); +} +// diff --git a/samples/cs/foundry-local-web-server-responses-vision/test_image.jpg b/samples/cs/foundry-local-web-server-responses-vision/test_image.jpg new file mode 100644 index 0000000000000000000000000000000000000000..73a4e8004db0fd82a2913bd14ad8b97672097ac5 GIT binary patch literal 6828 zcmc&&cT^MmwjOE#k)nx!lu)HAB1n;DIROy`K_EyK5D_?l^r8?20gV(1f*z`X2+|_b zM0)R{(v%{-w-8zqQr>XxJ?HAVuiX3IA8)euT9Y+1d;h-K-*4}4Htj2I0^qu&qo)JV z(E$J*_ycGozy*MXnHj>&!~%gpSXo)v*r5m64;*0UJ9L;6DtHtwBzROnKv?{Qq_C)* zn1FzkinQEG1tldVxTKn<>M4yAib|*UozSteva%mw=RJ6k_mqf$$f zpc4h?x#<|V>1fRW99$<8-R}qR_k)g}fsu(B!otdS0KB1$3!tZCV4!DYU}9oq1n&+8 zuLF$SOov1i&M_Z0vV(|vLQg%2%U}^ZU)%sQ{)`h>wD$^SWjn&d%Xd`bn52}njM8ak z6;-t}7k)wL=w8&jbj`%n3~7Gd!r`XlEvMVgF5W)Ae*OW0L17QWBO)I?Mq}ciB_uw7 z@mo@6)~oE-Ik|80-j$S=l~=s4{Lt9c+|t_C-qHD`uYX{0=xkPlAYPx`Yw@#N?MXjWe+_u4j#5;eh}NpKCDt**m%bxufL0ye8TH$; zEv`y?$%Y1Wj8lhne~31O`K&u6O9NCU!YGztp!p%{75UBo9)c8U7BxraLgfeAl1ZS; zY^qZss&Rz|ytzvQBvE_TL4FCrl$=-BZHXba!!Kjpl1nPQ!lHItU1x(x{*B{!soQni zlMw0?i`9mqH^x`4j1r3^I&m*#OlBh^W!=WjNP*?9MhMsfTnK7p6&pL%gmJJ8~q4kHU@WQ^ryZ;f0i 
zP=?pav9^7O3yZ~v)=w`fZqB1aL<#XNBU%0O_wp5KzzM(lHXhtIpG4)=5c@mcQAk-^ z=g>(eM>(tQRrn?ikl0kqenh=CewuTb$1ukCs7K_QyU0l5-Q@7_u$voY7eWZdOA_QUm!Ozg`3J9Ta`wGTc^qFZJFzo+b_llRVx#jeN+u2m z%!eY;w_j*@>9kzu6$oXS;kr9?ME2p^1lU4H17(>^0}f@`=zFg&$tZ-U^NEiUA}~B0YIK$fG@&aSCDa?JoedgLy4>M9SgVZc z=A?et-boES9$?-(-8D;u=i$c5=tb1XIO!U=Su11jNM)jdA5C^~`~a#wBa;RUJ%p2v zyUxk{%#p2A>uBB9jaZy?P+j|ACc1OuL=`0Bo(ig9v}j?~Uhc5Chj(MqS=BBs#Tdn; z9p^^VV;M|-TQ)iMDsBe4Q#sAbxLe7vv=r{Ly8{mV3XGO7r@LkQFe8Oy!nY=hg;g+AII3`d$ZQJ$wK+*QV3QUefQ|2W{qnUSpiWJNW{`-xyz5CY${# zzMu}*;Zt@0KJhfLQ&y(mdGPgOKk~`gLFLrXJFlbjEnJ6Bzs|ScD=f-9bnK#p_kaN_ z-Gg>94K9QYHi^m}(6a?UbmFm>ch1tgJo@Xg2`VC&9>i@5NKfzO2+M0MC zEqd_eh1lYh$w8@G^B;?#V9p%(bN2s`FDwILZ%QZKxEMEjB7)zS2gz@)u*olC66JW& zDy{l4M1aADXe&gRzvgD!W-+$-UjO0UjziK84Q9~>{S5-##mF>DAhve$vP^!IuVX2U z>VY7orwk=zIizQzWw+v^eq_@Ax$JUXabaou@7}-|J-pE>Tk;<$SSKp^M=dqug;j%5i_8f!; zc)<4zsuQc_oHl0hXq%ZsepKDcghv zt513mp-Q$STo?^N6bAA6w}?##v=N{+wll8CPFvWP@+YdOi)$GfNqF%Hf1+=p>z+yv_oZ*z(B&mnMg2j-^|~ zEwT?WJ#{cv$uK216HWPL@}HChNp|8`#Gn?mdSC{L>~}gKWDdJM)y@vg2(AjNQ5U8w zahm*kRYPUTS7cgLti~>hT@F1zpD#dv{sG276-e6S*ZJb6BzUy%Qt_4Y!D5y^VfwF2 zZ2*1qOni*qv0B|Ho_T3URUNBS$HB@M=24jn74!Gnh8o$X3EF`vgT&wal?J?kQ>sy` zXYdx=*y!|41{zQiOUbd(w>hFnz!O7sm#S50fanL?on7iPFfT<8qyLQJr+!OB*QVks z$mpZ;lz?+IAak&Wy=p3MquPPYW`o6_&QqH}&#?-pi6`=zyozhjNa&B^IyVujZt@IS zkXoi&50}2Dk5j5;+hFBhIb*%7k%%y#g>t`W7f&gg<(n4{8xr<3-SK!-t^S&jqm!&Z zAhIbOpWrw-<@x#3j$Qg3$2nng!!#-~sqVpibWx|(*UK$-RSyMU%}koZKQyyX#|<$~ zOm9_3YDAn!%D6rZzL$<_%3TUI#n;)ZsxU?eEror3@*fDXZUk@KEnBT|iwzSg5jKg) zD>w9y^a|Y(1L(p^Qsxgv!8BCKjSQVQ@|w@J^1W`9sRpn%S94 zgZjHkEHB{6YV}TxcOSHd++r94X)f93;NsxgSz|pu@j-sMe=W+y^eHGv+OHZRpma0i zKeD|xxAsfvs$th&aCqVP^VGC^vd2cJRokak;Rz-*;Gt#H_!iw48i3=_#$mN;*-T~B zaw>QDY!7{DuoH=V#_w5MVQq9bCI$@M59UmRBjJW)F+5O4X9V80gbT)M71UjI4dYm1Au87z@}Z2 zO>UsR|3m}Q4cB$MO^|(cKe~jX8jnw|lYm7O0Rkr&QBvjSIJa9P$s|;emVIU|s6T&N ze%4>Ztt_HWqKcHX=lk|P|IYxRC?|bl7{xOK)THU(V_)R|-1l$)C0e0sW(ti z^!ZHT##{jaSdh&>7~x~-mL?RKcCGtOBUhN2e}6F4$U3qoTvvnJszLm4Gp0%A8Ac6y z>O)*8iw2jj3kH3-JeN~o!?51 
z7;cL1wDr+B7Sju*ON?<+lDE4uWH3mr#`L+IP@gT{L=&p>Qgmd7j5SH_D=K= zov~U-K$(R*cy;U|J5<1z54JUyn(#W+C9z!j?TDpr>eB|^U#wD6=;~+U3H{qG=xMe- zvy3acJtZ)cq;BB+J@Q^D4hJPP* z#p*I8+gVLp(FbD&b`dj%e{8rPKeq}vVT5;GjbBe&E{mCG-QbNW_$v7nvSi|! z^7z>^)7fIh6gwVi&4VQAIDG-!t55A1glL5Bg@hY{jNJX#meBvXmdZ`1bhotS@ zf*hMvwbsLaULzleqmOM%fF0jxzy1DK#-3;1NmYu>gAVbu&|2?}w#TnF zR37D|I{FodNBKQ6?Bw!%&(40NbxkLTqp^m6suyi(&=-HcOf_vsbMw6%8zKczXc)b~ z(0lfrK$wlRjoF#9@e*6b=I&yikWNh%&F7OUXzI`{(m~2ccagc?Zg(3Eudvw3)}Z*X z`o^K0q0V8OY|pHv#NqHPx3uqPmpg#zTvg$Ts8ID#*TrZEGScSL%JxO({`aw-HU_N% zr)D0coD@@?ht;xGXaou^Ity;(Ut*3q)&9kdGh(=y=@mvUYMu2sb zqxx#C7;Xun6f+!-DL^tkAV#Xg#ZxdmX++3<1?%u&#$!1!iGEU&FYKap+bvL$80~l5 z--VLt+8o|a7@qnbW?#JKxcOR6YhfZ-wth;kYsp;aC55<9hs2YIPiJvBbA$bWXg8|I z_?jJ*DfadcKL$gb18dGWZV|^?d0XIZ)O2{H4=H?cYA0Z{90P|iOM)X zu#tU=Ir~FO|DN8P@7fsUmA;63>SU*mPMUD&iElyay1|J<%fZQ5j?(fkqDwpnXD#U& zZqR_k6KQARMwm+8)0>4Jgs$f3sdVluxr_m;jgJba@41_|6{%zN)}y8+VGnGM$W)Z2 zC5z1!zle3ab|tX^6o+c<`%26`K)=~Lz_yudyscYu*PpdiA>c97{SE;f%ZwLuKsz?u zK2dE_FC$JIjbM+7l&QzFh~;AA^S^3w2V~!@(BiLD#c1M(i)9$3)o!*KpOYISNWQdK zuOFO!UJ`70?Rc|S*EfBqiZ@Hoj1WP8=)ZHq^m_Gz3+}^jWo^1^hj*|XVHw|D;-fn}geIz^R{~7i zA$<1Co~^EP!Pku651!UfJ-=FE=qchi^HlK@JLV2EAb20v6kXUMgs39NswgzyRlTda z1`4%osFBUN2~Z3Ft%8}qt^3d?pHcHJb)qu7Li|ZTd`X=vKn#sVu#T804g2f%3keM! 
zzlX>iSyC=~c=4s)(elg2H{37O&jd5;aGsYnu9lqiKwIh#@#v+@^dX>P-e=1M$gF_Jz0i|q-PLDeSb8qcW$;LTLMOmf)w*s5&bQb?PhbmAB)EztTWkJy zb0RmeBw|`4ycKM86;}~~+fZb0f9{fANBVJ#{CpKPUyvs+yML@+vI~ZhHd40S1Ygl4 zUAOZcX0oC6OCwGQs%kI0LR|dl&1qqw zfzJ_zD>)%iOAS++(fT!M^L;)h$`x0?e{1k}-89@WIKiErqc z19&2{m}ie`uYRwaUFb#0%^S@Y6uQ-6MWbOJds?>Zl5?oW*cBqGI{i0rorLrijW+7N zlA)cqYbKe6G{C8gGDGeKC4aj7Zv3Y9r0$N811QVQ@0$C|7@5(4l36M~#EC3L?R~!2 zK}^SG?QtrADV5VKP*0G`M}5Q7fIw|UDk$mya1{9h9A+m94GMNXGp+kGu=byhnz~W* zU7W5Qv!?b1ym!vE7ruCPsJTcTo$<#K%-rlSKi=6?w7+kXIB{W3HF literal 0 HcmV?d00001 diff --git a/samples/js/README.md b/samples/js/README.md index d334555c3..0b7d677c4 100644 --- a/samples/js/README.md +++ b/samples/js/README.md @@ -19,6 +19,7 @@ These samples demonstrate how to use the Foundry Local JavaScript SDK (`foundry- | [langchain-integration-example](langchain-integration-example/) | LangChain.js integration for building text generation chains. | | [tool-calling-foundry-local](tool-calling-foundry-local/) | Tool calling with custom function definitions and streaming responses. | | [web-server-example](web-server-example/) | Start a local OpenAI-compatible web server and call it with the OpenAI SDK. | +| [web-server-responses-vision-example](web-server-responses-vision-example/) | Stream a vision (image understanding) response from the local web server using the Responses API. | | [tutorial-chat-assistant](tutorial-chat-assistant/) | Build an interactive multi-turn chat assistant (tutorial). | | [tutorial-document-summarizer](tutorial-document-summarizer/) | Summarize documents with AI (tutorial). | | [tutorial-tool-calling](tutorial-tool-calling/) | Create a tool-calling assistant (tutorial). 
| diff --git a/samples/js/web-server-responses-vision-example/app.js b/samples/js/web-server-responses-vision-example/app.js new file mode 100644 index 000000000..09d04f571 --- /dev/null +++ b/samples/js/web-server-responses-vision-example/app.js @@ -0,0 +1,209 @@ +// +// +import fs from 'node:fs'; +import path from 'node:path'; +import { fileURLToPath } from 'node:url'; +import { FoundryLocalManager } from 'foundry-local-sdk'; +// + +const __filename = fileURLToPath(import.meta.url); +const __dirname = path.dirname(__filename); + +const DEFAULT_MODEL_ALIAS = 'qwen3.5-0.8b'; +const DEFAULT_MAX_OUTPUT_TOKENS = 8192; +const endpointUrl = 'http://localhost:5765'; + +const argv = process.argv.slice(2); +if (argv.length < 1) { + console.error('Usage: node app.js MODEL_ALIAS_OR_VARIANT_ID [image_path]'); + console.error(' node app.js --list-models'); + console.error(' Example: node app.js qwen3.5-0.8b'); + console.error(' Example: node app.js Qwen2.5-VL-7B-Instruct-generic-cpu'); + process.exit(1); +} + +const listModels = argv[0] === '--list-models' || argv[0] === '-l'; +const modelIdentifier = listModels ? null : argv[0]; +const defaultImage = path.join(__dirname, 'test_image.jpg'); +const imagePath = !listModels && argv.length > 1 ? 
argv[1] : defaultImage; + +// +console.log('Initializing Foundry Local SDK...'); +const manager = FoundryLocalManager.create({ + appName: 'foundry_local_samples', + logLevel: 'info', + webServiceUrls: endpointUrl, +}); +console.log('✓ SDK initialized successfully'); + +console.log('\nDownloading execution providers:'); +let currentEp = ''; +await manager.downloadAndRegisterEps((epName, percent) => { + if (epName !== currentEp) { + if (currentEp !== '') process.stdout.write('\n'); + currentEp = epName; + } + process.stdout.write(`\r ${epName.padEnd(30)} ${percent.toFixed(1).padStart(5)}%`); +}); +if (currentEp !== '') process.stdout.write('\n'); +// + +if (listModels) { + const allModels = await manager.catalog.listModels(); + const visionModels = allModels + .filter((m) => (m.info?.task ?? '').toLowerCase().includes('vision')) + .sort((a, b) => a.alias.localeCompare(b.alias)); + + if (visionModels.length === 0) { + console.log('\nNo vision models found in catalog.'); + process.exit(0); + } + + const totalVariants = visionModels.reduce((sum, m) => sum + (m.variants?.length ?? 0), 0); + console.log(`\nVision models in catalog (${visionModels.length} aliases, ${totalVariants} variants):`); + console.log(` ${'ALIAS'.padEnd(32)} ${'INPUT MODALITIES'.padEnd(20)} ${'OUTPUT MODALITIES'.padEnd(20)} ${'TASK'.padEnd(24)} CAPABILITIES`); + for (const m of visionModels) { + const task = m.info?.task ?? ''; + const capabilities = m.info?.capabilities ?? ''; + const inMod = m.info?.inputModalities ?? ''; + const outMod = m.info?.outputModalities ?? ''; + console.log(` ${m.alias.padEnd(32)} ${inMod.padEnd(20)} ${outMod.padEnd(20)} ${task.padEnd(24)} ${capabilities}`); + + const variants = [...(m.variants ?? [])].sort((a, b) => { + const ad = a.info?.runtime?.deviceType ?? ''; + const bd = b.info?.runtime?.deviceType ?? ''; + if (ad !== bd) return ad.localeCompare(bd); + const ae = a.info?.runtime?.executionProvider ?? ''; + const be = b.info?.runtime?.executionProvider ?? 
''; + if (ae !== be) return ae.localeCompare(be); + return a.id.localeCompare(b.id); + }); + if (variants.length === 0) continue; + + console.log(` ${'VARIANT ID'.padEnd(54)} ${'DEVICE'.padEnd(6)} ${'EXECUTION PROVIDER'.padEnd(32)} ${'SIZE (MB)'.padStart(10)} CACHED`); + for (const v of variants) { + const device = v.info?.runtime?.deviceType ?? ''; + const ep = v.info?.runtime?.executionProvider ?? ''; + const size = v.info?.fileSizeMb != null ? String(v.info.fileSizeMb).padStart(10) : ''.padStart(10); + const cached = v.isCached ? 'yes' : 'no'; + console.log(` ${v.id.padEnd(54)} ${device.padEnd(6)} ${ep.padEnd(32)} ${size} ${cached}`); + } + } + process.exit(0); +} + +// +let model = await manager.catalog.getModel(modelIdentifier); +if (!model) { + model = await manager.catalog.getModelVariant(modelIdentifier); +} +if (!model) { + const allModels = await manager.catalog.listModels(); + console.error(`\nModel '${modelIdentifier}' not found in catalog (tried alias and variant id).`); + console.error(`Available aliases: ${allModels.map((m) => m.alias).join(', ')}`); + console.error('Run with --list-models to see variant ids.'); + process.exit(1); +} + +if (!model.isCached) { + console.log(`\nDownloading model ${modelIdentifier}...`); + await model.download((progress) => { + process.stdout.write(`\rDownloading model: ${progress.toFixed(2)}%`); + }); + console.log('\nModel downloaded'); +} + +console.log('\nLoading model...'); +await model.load(); +console.log('Model loaded'); +// + +// +console.log('\nStarting web service...'); +manager.startWebService(); +const baseUrl = endpointUrl.replace(/\/+$/, '') + '/v1'; +console.log(`Web service started on ${baseUrl}`); +// + +// +console.log(`\nPreparing image: ${imagePath}`); +const { base64: imageB64, mediaType } = encodeImage(imagePath); + +// The Foundry Local Responses API accepts an array of message items with input_text / +// input_image content parts. 
The input_image part uses Foundry-specific `image_data` and +// `media_type` fields (in place of OpenAI's `image_url`). +const visionInput = [ + { + type: 'message', + role: 'user', + content: [ + { type: 'input_text', text: 'Describe this image.' }, + { type: 'input_image', image_data: imageB64, media_type: mediaType }, + ], + }, +]; + +console.log('\nStreaming vision response...'); +const response = await fetch(`${baseUrl}/responses`, { + method: 'POST', + headers: { + 'Content-Type': 'application/json', + Accept: 'text/event-stream', + Authorization: 'Bearer notneeded', + }, + body: JSON.stringify({ + model: model.id, + input: visionInput, + max_output_tokens: DEFAULT_MAX_OUTPUT_TOKENS, + stream: true, + }), +}); +if (!response.ok) { + throw new Error(`Responses API error: ${response.status} ${await response.text()}`); +} + +process.stdout.write('[ASSISTANT]: '); +const decoder = new TextDecoder(); +let buf = ''; +readLoop: for await (const chunk of response.body) { + buf += decoder.decode(chunk, { stream: true }); + let nl; + while ((nl = buf.indexOf('\n')) !== -1) { + const line = buf.slice(0, nl).trimEnd(); + buf = buf.slice(nl + 1); + if (!line.startsWith('data: ')) continue; + const data = line.slice('data: '.length); + if (data === '[DONE]') break readLoop; + try { + const event = JSON.parse(data); + if (event.type === 'response.output_text.delta' && typeof event.delta === 'string') { + process.stdout.write(event.delta); + } + } catch { + // ignore keepalives or non-JSON lines + } + } +} +process.stdout.write('\n'); +// + +console.log('\nUnloading model and stopping web service...'); +await model.unload(); +manager.stopWebService(); +console.log('✓ Model unloaded and web service stopped'); + +function encodeImage(p) { + const mediaTypes = { + '.jpg': 'image/jpeg', + '.jpeg': 'image/jpeg', + '.png': 'image/png', + '.gif': 'image/gif', + '.bmp': 'image/bmp', + '.webp': 'image/webp', + }; + const ext = path.extname(p).toLowerCase(); + const mediaType = 
mediaTypes[ext] ?? 
'image/jpeg'; + const bytes = fs.readFileSync(p); + return { base64: bytes.toString('base64'), mediaType }; +} +// diff --git a/samples/js/web-server-responses-vision-example/package.json b/samples/js/web-server-responses-vision-example/package.json new file mode 100644 index 000000000..e02dde17e --- /dev/null +++ b/samples/js/web-server-responses-vision-example/package.json @@ -0,0 +1,12 @@ +{ + "name": "web-server-responses-vision-example", + "version": "1.0.0", + "type": "module", + "main": "app.js", + "scripts": { + "start": "node app.js" + }, + "dependencies": { + "foundry-local-sdk": "latest" + } +} diff --git a/samples/js/web-server-responses-vision-example/test_image.jpg b/samples/js/web-server-responses-vision-example/test_image.jpg new file mode 100644 index 0000000000000000000000000000000000000000..73a4e8004db0fd82a2913bd14ad8b97672097ac5 GIT binary patch literal 6828 zcmc&&cT^MmwjOE#k)nx!lu)HAB1n;DIROy`K_EyK5D_?l^r8?20gV(1f*z`X2+|_b zM0)R{(v%{-w-8zqQr>XxJ?HAVuiX3IA8)euT9Y+1d;h-K-*4}4Htj2I0^qu&qo)JV z(E$J*_ycGozy*MXnHj>&!~%gpSXo)v*r5m64;*0UJ9L;6DtHtwBzROnKv?{Qq_C)* zn1FzkinQEG1tldVxTKn<>M4yAib|*UozSteva%mw=RJ6k_mqf$$f zpc4h?x#<|V>1fRW99$<8-R}qR_k)g}fsu(B!otdS0KB1$3!tZCV4!DYU}9oq1n&+8 zuLF$SOov1i&M_Z0vV(|vLQg%2%U}^ZU)%sQ{)`h>wD$^SWjn&d%Xd`bn52}njM8ak z6;-t}7k)wL=w8&jbj`%n3~7Gd!r`XlEvMVgF5W)Ae*OW0L17QWBO)I?Mq}ciB_uw7 z@mo@6)~oE-Ik|80-j$S=l~=s4{Lt9c+|t_C-qHD`uYX{0=xkPlAYPx`Yw@#N?MXjWe+_u4j#5;eh}NpKCDt**m%bxufL0ye8TH$; zEv`y?$%Y1Wj8lhne~31O`K&u6O9NCU!YGztp!p%{75UBo9)c8U7BxraLgfeAl1ZS; zY^qZss&Rz|ytzvQBvE_TL4FCrl$=-BZHXba!!Kjpl1nPQ!lHItU1x(x{*B{!soQni zlMw0?i`9mqH^x`4j1r3^I&m*#OlBh^W!=WjNP*?9MhMsfTnK7p6&pL%gmJJ8~q4kHU@WQ^ryZ;f0i zP=?pav9^7O3yZ~v)=w`fZqB1aL<#XNBU%0O_wp5KzzM(lHXhtIpG4)=5c@mcQAk-^ z=g>(eM>(tQRrn?ikl0kqenh=CewuTb$1ukCs7K_QyU0l5-Q@7_u$voY7eWZdOA_QUm!Ozg`3J9Ta`wGTc^qFZJFzo+b_llRVx#jeN+u2m z%!eY;w_j*@>9kzu6$oXS;kr9?ME2p^1lU4H17(>^0}f@`=zFg&$tZ-U^NEiUA}~B0YIK$fG@&aSCDa?JoedgLy4>M9SgVZc 
z=A?et-boES9$?-(-8D;u=i$c5=tb1XIO!U=Su11jNM)jdA5C^~`~a#wBa;RUJ%p2v zyUxk{%#p2A>uBB9jaZy?P+j|ACc1OuL=`0Bo(ig9v}j?~Uhc5Chj(MqS=BBs#Tdn; z9p^^VV;M|-TQ)iMDsBe4Q#sAbxLe7vv=r{Ly8{mV3XGO7r@LkQFe8Oy!nY=hg;g+AII3`d$ZQJ$wK+*QV3QUefQ|2W{qnUSpiWJNW{`-xyz5CY${# zzMu}*;Zt@0KJhfLQ&y(mdGPgOKk~`gLFLrXJFlbjEnJ6Bzs|ScD=f-9bnK#p_kaN_ z-Gg>94K9QYHi^m}(6a?UbmFm>ch1tgJo@Xg2`VC&9>i@5NKfzO2+M0MC zEqd_eh1lYh$w8@G^B;?#V9p%(bN2s`FDwILZ%QZKxEMEjB7)zS2gz@)u*olC66JW& zDy{l4M1aADXe&gRzvgD!W-+$-UjO0UjziK84Q9~>{S5-##mF>DAhve$vP^!IuVX2U z>VY7orwk=zIizQzWw+v^eq_@Ax$JUXabaou@7}-|J-pE>Tk;<$SSKp^M=dqug;j%5i_8f!; zc)<4zsuQc_oHl0hXq%ZsepKDcghv zt513mp-Q$STo?^N6bAA6w}?##v=N{+wll8CPFvWP@+YdOi)$GfNqF%Hf1+=p>z+yv_oZ*z(B&mnMg2j-^|~ zEwT?WJ#{cv$uK216HWPL@}HChNp|8`#Gn?mdSC{L>~}gKWDdJM)y@vg2(AjNQ5U8w zahm*kRYPUTS7cgLti~>hT@F1zpD#dv{sG276-e6S*ZJb6BzUy%Qt_4Y!D5y^VfwF2 zZ2*1qOni*qv0B|Ho_T3URUNBS$HB@M=24jn74!Gnh8o$X3EF`vgT&wal?J?kQ>sy` zXYdx=*y!|41{zQiOUbd(w>hFnz!O7sm#S50fanL?on7iPFfT<8qyLQJr+!OB*QVks z$mpZ;lz?+IAak&Wy=p3MquPPYW`o6_&QqH}&#?-pi6`=zyozhjNa&B^IyVujZt@IS zkXoi&50}2Dk5j5;+hFBhIb*%7k%%y#g>t`W7f&gg<(n4{8xr<3-SK!-t^S&jqm!&Z zAhIbOpWrw-<@x#3j$Qg3$2nng!!#-~sqVpibWx|(*UK$-RSyMU%}koZKQyyX#|<$~ zOm9_3YDAn!%D6rZzL$<_%3TUI#n;)ZsxU?eEror3@*fDXZUk@KEnBT|iwzSg5jKg) zD>w9y^a|Y(1L(p^Qsxgv!8BCKjSQVQ@|w@J^1W`9sRpn%S94 zgZjHkEHB{6YV}TxcOSHd++r94X)f93;NsxgSz|pu@j-sMe=W+y^eHGv+OHZRpma0i zKeD|xxAsfvs$th&aCqVP^VGC^vd2cJRokak;Rz-*;Gt#H_!iw48i3=_#$mN;*-T~B zaw>QDY!7{DuoH=V#_w5MVQq9bCI$@M59UmRBjJW)F+5O4X9V80gbT)M71UjI4dYm1Au87z@}Z2 zO>UsR|3m}Q4cB$MO^|(cKe~jX8jnw|lYm7O0Rkr&QBvjSIJa9P$s|;emVIU|s6T&N ze%4>Ztt_HWqKcHX=lk|P|IYxRC?|bl7{xOK)THU(V_)R|-1l$)C0e0sW(ti z^!ZHT##{jaSdh&>7~x~-mL?RKcCGtOBUhN2e}6F4$U3qoTvvnJszLm4Gp0%A8Ac6y z>O)*8iw2jj3kH3-JeN~o!?51 z7;cL1wDr+B7Sju*ON?<+lDE4uWH3mr#`L+IP@gT{L=&p>Qgmd7j5SH_D=K= zov~U-K$(R*cy;U|J5<1z54JUyn(#W+C9z!j?TDpr>eB|^U#wD6=;~+U3H{qG=xMe- zvy3acJtZ)cq;BB+J@Q^D4hJPP* z#p*I8+gVLp(FbD&b`dj%e{8rPKeq}vVT5;GjbBe&E{mCG-QbNW_$v7nvSi|! 
z^7z>^)7fIh6gwVi&4VQAIDG-!t55A1glL5Bg@hY{jNJX#meBvXmdZ`1bhotS@ zf*hMvwbsLaULzleqmOM%fF0jxzy1DK#-3;1NmYu>gAVbu&|2?}w#TnF zR37D|I{FodNBKQ6?Bw!%&(40NbxkLTqp^m6suyi(&=-HcOf_vsbMw6%8zKczXc)b~ z(0lfrK$wlRjoF#9@e*6b=I&yikWNh%&F7OUXzI`{(m~2ccagc?Zg(3Eudvw3)}Z*X z`o^K0q0V8OY|pHv#NqHPx3uqPmpg#zTvg$Ts8ID#*TrZEGScSL%JxO({`aw-HU_N% zr)D0coD@@?ht;xGXaou^Ity;(Ut*3q)&9kdGh(=y=@mvUYMu2sb zqxx#C7;Xun6f+!-DL^tkAV#Xg#ZxdmX++3<1?%u&#$!1!iGEU&FYKap+bvL$80~l5 z--VLt+8o|a7@qnbW?#JKxcOR6YhfZ-wth;kYsp;aC55<9hs2YIPiJvBbA$bWXg8|I z_?jJ*DfadcKL$gb18dGWZV|^?d0XIZ)O2{H4=H?cYA0Z{90P|iOM)X zu#tU=Ir~FO|DN8P@7fsUmA;63>SU*mPMUD&iElyay1|J<%fZQ5j?(fkqDwpnXD#U& zZqR_k6KQARMwm+8)0>4Jgs$f3sdVluxr_m;jgJba@41_|6{%zN)}y8+VGnGM$W)Z2 zC5z1!zle3ab|tX^6o+c<`%26`K)=~Lz_yudyscYu*PpdiA>c97{SE;f%ZwLuKsz?u zK2dE_FC$JIjbM+7l&QzFh~;AA^S^3w2V~!@(BiLD#c1M(i)9$3)o!*KpOYISNWQdK zuOFO!UJ`70?Rc|S*EfBqiZ@Hoj1WP8=)ZHq^m_Gz3+}^jWo^1^hj*|XVHw|D;-fn}geIz^R{~7i zA$<1Co~^EP!Pku651!UfJ-=FE=qchi^HlK@JLV2EAb20v6kXUMgs39NswgzyRlTda z1`4%osFBUN2~Z3Ft%8}qt^3d?pHcHJb)qu7Li|ZTd`X=vKn#sVu#T804g2f%3keM! 
zzlX>iSyC=~c=4s)(elg2H{37O&jd5;aGsYnu9lqiKwIh#@#v+@^dX>P-e=1M$gF_Jz0i|q-PLDeSb8qcW$;LTLMOmf)w*s5&bQb?PhbmAB)EztTWkJy zb0RmeBw|`4ycKM86;}~~+fZb0f9{fANBVJ#{CpKPUyvs+yML@+vI~ZhHd40S1Ygl4 zUAOZcX0oC6OCwGQs%kI0LR|dl&1qqw zfzJ_zD>)%iOAS++(fT!M^L;)h$`x0?e{1k}-89@WIKiErqc z19&2{m}ie`uYRwaUFb#0%^S@Y6uQ-6MWbOJds?>Zl5?oW*cBqGI{i0rorLrijW+7N zlA)cqYbKe6G{C8gGDGeKC4aj7Zv3Y9r0$N811QVQ@0$C|7@5(4l36M~#EC3L?R~!2 zK}^SG?QtrADV5VKP*0G`M}5Q7fIw|UDk$mya1{9h9A+m94GMNXGp+kGu=byhnz~W* zU7W5Qv!?b1ym!vE7ruCPsJTcTo$<#K%-rlSKi=6?w7+kXIB{W3HF literal 0 HcmV?d00001 diff --git a/samples/rust/Cargo.toml b/samples/rust/Cargo.toml index 37a579a1b..7d528e0a1 100644 --- a/samples/rust/Cargo.toml +++ b/samples/rust/Cargo.toml @@ -1,6 +1,7 @@ [workspace] members = [ "foundry-local-webserver", + "foundry-local-webserver-responses-vision", "tool-calling-foundry-local", "native-chat-completions", "audio-transcription-example", diff --git a/samples/rust/README.md b/samples/rust/README.md index bc65306fa..f260df7d9 100644 --- a/samples/rust/README.md +++ b/samples/rust/README.md @@ -14,6 +14,7 @@ These samples demonstrate how to use the Rust binding for Foundry Local. | [embeddings](embeddings/) | Generate single and batch text embeddings using the native embedding client. | | [audio-transcription-example](audio-transcription-example/) | Audio transcription (non-streaming and streaming) using the Whisper model. | | [foundry-local-webserver](foundry-local-webserver/) | Start a local OpenAI-compatible web server and call it with a standard HTTP client. | +| [foundry-local-webserver-responses-vision](foundry-local-webserver-responses-vision/) | Stream a vision (image understanding) response from the local web server using the Responses API. | | [tool-calling-foundry-local](tool-calling-foundry-local/) | Tool calling with streaming responses, multi-turn conversation, and local tool execution. | | [tutorial-chat-assistant](tutorial-chat-assistant/) | Build an interactive multi-turn chat assistant (tutorial). 
| | [tutorial-document-summarizer](tutorial-document-summarizer/) | Summarize documents with AI (tutorial). | diff --git a/samples/rust/foundry-local-webserver-responses-vision/Cargo.toml b/samples/rust/foundry-local-webserver-responses-vision/Cargo.toml new file mode 100644 index 000000000..290022f2e --- /dev/null +++ b/samples/rust/foundry-local-webserver-responses-vision/Cargo.toml @@ -0,0 +1,16 @@ +[package] +name = "foundry-local-webserver-responses-vision" +version = "0.1.0" +edition = "2021" +description = "Vision (image understanding) example using the Foundry Local web service and the OpenAI Responses API" + +[dependencies] +foundry-local-sdk = { path = "../../../sdk/rust" } +tokio = { version = "1", features = ["rt-multi-thread", "macros"] } +serde_json = "1" +reqwest = { version = "0.12", features = ["json", "stream"] } +base64 = "0.22" +futures-util = "0.3" + +[target.'cfg(windows)'.dependencies] +foundry-local-sdk = { path = "../../../sdk/rust", features = ["winml"] } diff --git a/samples/rust/foundry-local-webserver-responses-vision/src/main.rs b/samples/rust/foundry-local-webserver-responses-vision/src/main.rs new file mode 100644 index 000000000..b0c4f831c --- /dev/null +++ b/samples/rust/foundry-local-webserver-responses-vision/src/main.rs @@ -0,0 +1,330 @@ +// +// Copyright (c) Microsoft Corporation. All rights reserved. +// Licensed under the MIT License. + +//! Foundry Local Web Server vision example (Responses API). +//! +//! Mirrors `samples/python/web-server-responses-vision`. Starts the local +//! Foundry web service, posts a multimodal request to `/v1/responses` with a +//! base64-encoded image, and streams the SSE response, printing each +//! `response.output_text.delta` event. 
+ +// +use std::io::{self, Write}; +use std::path::{Path, PathBuf}; + +use base64::Engine; +use futures_util::StreamExt; +use serde_json::{json, Value}; + +use foundry_local_sdk::{FoundryLocalConfig, FoundryLocalManager}; +// + +const DEFAULT_MODEL_ALIAS: &str = "qwen3.5-0.8b"; +const DEFAULT_MAX_OUTPUT_TOKENS: u64 = 8192; + +fn print_usage() { + eprintln!("Usage: cargo run -p foundry-local-webserver-responses-vision -- [image_path]"); + eprintln!(" cargo run -p foundry-local-webserver-responses-vision -- --list-models"); + eprintln!(" Example: ... -- {DEFAULT_MODEL_ALIAS}"); + eprintln!(" Example: ... -- Qwen2.5-VL-7B-Instruct-generic-cpu"); +} + +#[tokio::main] +async fn main() -> Result<(), Box<dyn std::error::Error>> { + let args: Vec<String> = std::env::args().skip(1).collect(); + if args.is_empty() { + print_usage(); + std::process::exit(1); + } + + let list_models = matches!(args[0].as_str(), "--list-models" | "-l"); + + // + println!("Initializing Foundry Local SDK..."); + let manager = FoundryLocalManager::create(FoundryLocalConfig::new("foundry_local_samples"))?; + println!("✓ SDK initialized"); + + println!("\nDownloading execution providers:"); + manager + .download_and_register_eps_with_progress(None, { + let mut current_ep = String::new(); + move |ep_name: &str, percent: f64| { + if ep_name != current_ep { + if !current_ep.is_empty() { + println!(); + } + current_ep = ep_name.to_string(); + } + print!("\r {:<30} {:5.1}%", ep_name, percent); + io::stdout().flush().ok(); + } + }) + .await?; + println!(); + // + + if list_models { + let all_models = manager.catalog().get_models().await?; + let mut vision_models: Vec<_> = all_models + .into_iter() + .filter(|m| { + m.info() + .task + .as_deref() + .map(|t| t.to_lowercase().contains("vision")) + .unwrap_or(false) + }) + .collect(); + vision_models.sort_by(|a, b| a.alias().cmp(b.alias())); + + if vision_models.is_empty() { + println!("\nNo vision models found in catalog."); + return Ok(()); + } + + let total_variants: usize = 
vision_models.iter().map(|m| m.variants().len()).sum(); + println!( + "\nVision models in catalog ({} aliases, {} variants):", + vision_models.len(), + total_variants + ); + println!( + " {:<32} {:<20} {:<20} {:<24} CAPABILITIES", + "ALIAS", "INPUT MODALITIES", "OUTPUT MODALITIES", "TASK" + ); + for m in &vision_models { + let info = m.info(); + println!( + " {:<32} {:<20} {:<20} {:<24} {}", + m.alias(), + info.input_modalities.as_deref().unwrap_or(""), + info.output_modalities.as_deref().unwrap_or(""), + info.task.as_deref().unwrap_or(""), + info.capabilities.as_deref().unwrap_or(""), + ); + + let mut variants = m.variants(); + if variants.is_empty() { + continue; + } + variants.sort_by(|a, b| { + let ad = a + .info() + .runtime + .as_ref() + .map(|r| format!("{:?}", r.device_type)) + .unwrap_or_default(); + let bd = b + .info() + .runtime + .as_ref() + .map(|r| format!("{:?}", r.device_type)) + .unwrap_or_default(); + ad.cmp(&bd) + .then_with(|| { + let ae = a + .info() + .runtime + .as_ref() + .map(|r| r.execution_provider.clone()) + .unwrap_or_default(); + let be = b + .info() + .runtime + .as_ref() + .map(|r| r.execution_provider.clone()) + .unwrap_or_default(); + ae.cmp(&be) + }) + .then_with(|| a.id().cmp(b.id())) + }); + + println!( + " {:<54} {:<6} {:<32} {:>10} CACHED", + "VARIANT ID", "DEVICE", "EXECUTION PROVIDER", "SIZE (MB)" + ); + for v in &variants { + let info = v.info(); + let device = info + .runtime + .as_ref() + .map(|r| format!("{:?}", r.device_type)) + .unwrap_or_default(); + let ep = info + .runtime + .as_ref() + .map(|r| r.execution_provider.as_str()) + .unwrap_or(""); + let size = match info.file_size_mb { + Some(s) => format!("{:>10}", s), + None => " ".repeat(10), + }; + let cached = if v.is_cached().await.unwrap_or(false) { "yes" } else { "no" }; + println!( + " {:<54} {:<6} {:<32} {} {}", + v.id(), + device, + ep, + size, + cached + ); + } + } + return Ok(()); + } + + let model_identifier = args[0].clone(); + let default_image = 
PathBuf::from(env!("CARGO_MANIFEST_DIR")).join("test_image.jpg"); + let image_path = if args.len() > 1 { + PathBuf::from(&args[1]) + } else { + default_image + }; + + // + let model = match manager.catalog().get_model(&model_identifier).await { + Ok(m) => m, + Err(_) => match manager.catalog().get_model_variant(&model_identifier).await { + Ok(m) => m, + Err(_) => { + let all = manager.catalog().get_models().await?; + let aliases: Vec<String> = all.iter().map(|m| m.alias().to_string()).collect(); + eprintln!( + "\nModel '{}' not found in catalog (tried alias and variant id).", + model_identifier + ); + eprintln!("Available aliases: {:?}", aliases); + eprintln!("Run with --list-models to see variant ids."); + std::process::exit(1); + } + }, + }; + + if !model.is_cached().await? { + print!("\nDownloading model {model_identifier}..."); + model + .download(Some(|progress: f64| { + print!("\rDownloading model: {progress:.2}%"); + io::stdout().flush().ok(); + })) + .await?; + println!(); + } + + print!("Loading model {model_identifier}..."); + model.load().await?; + println!("done."); + // + + // + print!("\nStarting web service..."); + manager.start_web_service().await?; + println!("done."); + + let urls = manager.urls()?; + let endpoint = urls + .first() + .expect("Web service did not return an endpoint"); + let base_url = format!("{}/v1", endpoint.trim_end_matches('/')); + println!("Web service listening on: {base_url}"); + // + + // + println!("\nPreparing image: {}", image_path.display()); + let (image_b64, media_type) = encode_image(&image_path)?; + + // The Foundry Local Responses API accepts an array of message items with input_text / + // input_image content parts. The input_image part uses Foundry-specific `image_data` and + // `media_type` fields (in place of OpenAI's `image_url`). + let vision_input = json!([ + { + "type": "message", + "role": "user", + "content": [ + { "type": "input_text", "text": "Describe this image." 
}, + { "type": "input_image", "image_data": image_b64, "media_type": media_type } + ] + } + ]); + + let body = json!({ + "model": model.id(), + "input": vision_input, + "max_output_tokens": DEFAULT_MAX_OUTPUT_TOKENS, + "stream": true, + }); + + println!("\nStreaming vision response..."); + // No request timeout: streamed vision responses can take a while to complete. + // (reqwest treats a zero Duration as "expire immediately" rather than "no timeout", + // so the timeout is simply left unset here.) + let client = reqwest::Client::builder().build()?; + let response = client + .post(format!("{base_url}/responses")) + .bearer_auth("notneeded") + .header("Accept", "text/event-stream") + .json(&body) + .send() + .await? + .error_for_status()?; + + print!("[ASSISTANT]: "); + io::stdout().flush().ok(); + + let mut stream = response.bytes_stream(); + let mut buf = String::new(); + 'outer: while let Some(chunk) = stream.next().await { + let chunk = chunk?; + buf.push_str(&String::from_utf8_lossy(&chunk)); + while let Some(nl) = buf.find('\n') { + let line = buf[..nl].trim_end().to_string(); + buf.drain(..=nl); + let Some(data) = line.strip_prefix("data: ") else { + continue; + }; + if data == "[DONE]" { + break 'outer; + } + if let Ok(event) = serde_json::from_str::<Value>(data) { + if event.get("type").and_then(|t| t.as_str()) + == Some("response.output_text.delta") + { + if let Some(delta) = event.get("delta").and_then(|d| d.as_str()) { + print!("{delta}"); + io::stdout().flush().ok(); + } + } + } + } + } + println!(); + // + + println!("\nStopping web service..."); + manager.stop_web_service().await?; + println!("Unloading model..."); + model.unload().await?; + println!("✓ Done."); + Ok(()) +} + +fn encode_image(path: &Path) -> Result<(String, &'static str), Box<dyn std::error::Error>> { + let media_type = match path + .extension() + .and_then(|e| e.to_str()) + .map(|s| s.to_ascii_lowercase()) + .as_deref() + { + Some("jpg") | Some("jpeg") => "image/jpeg", + Some("png") => "image/png", + Some("gif") 
=> "image/gif", + Some("bmp") => "image/bmp", + Some("webp") => "image/webp", + _ => "image/jpeg", + }; + let bytes = std::fs::read(path)?; + let b64 = base64::engine::general_purpose::STANDARD.encode(&bytes); + Ok((b64, media_type)) +} +// diff --git a/samples/rust/foundry-local-webserver-responses-vision/test_image.jpg b/samples/rust/foundry-local-webserver-responses-vision/test_image.jpg new file mode 100644 index 0000000000000000000000000000000000000000..73a4e8004db0fd82a2913bd14ad8b97672097ac5 GIT binary patch literal 6828 zcmc&&cT^MmwjOE#k)nx!lu)HAB1n;DIROy`K_EyK5D_?l^r8?20gV(1f*z`X2+|_b zM0)R{(v%{-w-8zqQr>XxJ?HAVuiX3IA8)euT9Y+1d;h-K-*4}4Htj2I0^qu&qo)JV z(E$J*_ycGozy*MXnHj>&!~%gpSXo)v*r5m64;*0UJ9L;6DtHtwBzROnKv?{Qq_C)* zn1FzkinQEG1tldVxTKn<>M4yAib|*UozSteva%mw=RJ6k_mqf$$f zpc4h?x#<|V>1fRW99$<8-R}qR_k)g}fsu(B!otdS0KB1$3!tZCV4!DYU}9oq1n&+8 zuLF$SOov1i&M_Z0vV(|vLQg%2%U}^ZU)%sQ{)`h>wD$^SWjn&d%Xd`bn52}njM8ak z6;-t}7k)wL=w8&jbj`%n3~7Gd!r`XlEvMVgF5W)Ae*OW0L17QWBO)I?Mq}ciB_uw7 z@mo@6)~oE-Ik|80-j$S=l~=s4{Lt9c+|t_C-qHD`uYX{0=xkPlAYPx`Yw@#N?MXjWe+_u4j#5;eh}NpKCDt**m%bxufL0ye8TH$; zEv`y?$%Y1Wj8lhne~31O`K&u6O9NCU!YGztp!p%{75UBo9)c8U7BxraLgfeAl1ZS; zY^qZss&Rz|ytzvQBvE_TL4FCrl$=-BZHXba!!Kjpl1nPQ!lHItU1x(x{*B{!soQni zlMw0?i`9mqH^x`4j1r3^I&m*#OlBh^W!=WjNP*?9MhMsfTnK7p6&pL%gmJJ8~q4kHU@WQ^ryZ;f0i zP=?pav9^7O3yZ~v)=w`fZqB1aL<#XNBU%0O_wp5KzzM(lHXhtIpG4)=5c@mcQAk-^ z=g>(eM>(tQRrn?ikl0kqenh=CewuTb$1ukCs7K_QyU0l5-Q@7_u$voY7eWZdOA_QUm!Ozg`3J9Ta`wGTc^qFZJFzo+b_llRVx#jeN+u2m z%!eY;w_j*@>9kzu6$oXS;kr9?ME2p^1lU4H17(>^0}f@`=zFg&$tZ-U^NEiUA}~B0YIK$fG@&aSCDa?JoedgLy4>M9SgVZc z=A?et-boES9$?-(-8D;u=i$c5=tb1XIO!U=Su11jNM)jdA5C^~`~a#wBa;RUJ%p2v zyUxk{%#p2A>uBB9jaZy?P+j|ACc1OuL=`0Bo(ig9v}j?~Uhc5Chj(MqS=BBs#Tdn; z9p^^VV;M|-TQ)iMDsBe4Q#sAbxLe7vv=r{Ly8{mV3XGO7r@LkQFe8Oy!nY=hg;g+AII3`d$ZQJ$wK+*QV3QUefQ|2W{qnUSpiWJNW{`-xyz5CY${# zzMu}*;Zt@0KJhfLQ&y(mdGPgOKk~`gLFLrXJFlbjEnJ6Bzs|ScD=f-9bnK#p_kaN_ z-Gg>94K9QYHi^m}(6a?UbmFm>ch1tgJo@Xg2`VC&9>i@5NKfzO2+M0MC 
zEqd_eh1lYh$w8@G^B;?#V9p%(bN2s`FDwILZ%QZKxEMEjB7)zS2gz@)u*olC66JW& zDy{l4M1aADXe&gRzvgD!W-+$-UjO0UjziK84Q9~>{S5-##mF>DAhve$vP^!IuVX2U z>VY7orwk=zIizQzWw+v^eq_@Ax$JUXabaou@7}-|J-pE>Tk;<$SSKp^M=dqug;j%5i_8f!; zc)<4zsuQc_oHl0hXq%ZsepKDcghv zt513mp-Q$STo?^N6bAA6w}?##v=N{+wll8CPFvWP@+YdOi)$GfNqF%Hf1+=p>z+yv_oZ*z(B&mnMg2j-^|~ zEwT?WJ#{cv$uK216HWPL@}HChNp|8`#Gn?mdSC{L>~}gKWDdJM)y@vg2(AjNQ5U8w zahm*kRYPUTS7cgLti~>hT@F1zpD#dv{sG276-e6S*ZJb6BzUy%Qt_4Y!D5y^VfwF2 zZ2*1qOni*qv0B|Ho_T3URUNBS$HB@M=24jn74!Gnh8o$X3EF`vgT&wal?J?kQ>sy` zXYdx=*y!|41{zQiOUbd(w>hFnz!O7sm#S50fanL?on7iPFfT<8qyLQJr+!OB*QVks z$mpZ;lz?+IAak&Wy=p3MquPPYW`o6_&QqH}&#?-pi6`=zyozhjNa&B^IyVujZt@IS zkXoi&50}2Dk5j5;+hFBhIb*%7k%%y#g>t`W7f&gg<(n4{8xr<3-SK!-t^S&jqm!&Z zAhIbOpWrw-<@x#3j$Qg3$2nng!!#-~sqVpibWx|(*UK$-RSyMU%}koZKQyyX#|<$~ zOm9_3YDAn!%D6rZzL$<_%3TUI#n;)ZsxU?eEror3@*fDXZUk@KEnBT|iwzSg5jKg) zD>w9y^a|Y(1L(p^Qsxgv!8BCKjSQVQ@|w@J^1W`9sRpn%S94 zgZjHkEHB{6YV}TxcOSHd++r94X)f93;NsxgSz|pu@j-sMe=W+y^eHGv+OHZRpma0i zKeD|xxAsfvs$th&aCqVP^VGC^vd2cJRokak;Rz-*;Gt#H_!iw48i3=_#$mN;*-T~B zaw>QDY!7{DuoH=V#_w5MVQq9bCI$@M59UmRBjJW)F+5O4X9V80gbT)M71UjI4dYm1Au87z@}Z2 zO>UsR|3m}Q4cB$MO^|(cKe~jX8jnw|lYm7O0Rkr&QBvjSIJa9P$s|;emVIU|s6T&N ze%4>Ztt_HWqKcHX=lk|P|IYxRC?|bl7{xOK)THU(V_)R|-1l$)C0e0sW(ti z^!ZHT##{jaSdh&>7~x~-mL?RKcCGtOBUhN2e}6F4$U3qoTvvnJszLm4Gp0%A8Ac6y z>O)*8iw2jj3kH3-JeN~o!?51 z7;cL1wDr+B7Sju*ON?<+lDE4uWH3mr#`L+IP@gT{L=&p>Qgmd7j5SH_D=K= zov~U-K$(R*cy;U|J5<1z54JUyn(#W+C9z!j?TDpr>eB|^U#wD6=;~+U3H{qG=xMe- zvy3acJtZ)cq;BB+J@Q^D4hJPP* z#p*I8+gVLp(FbD&b`dj%e{8rPKeq}vVT5;GjbBe&E{mCG-QbNW_$v7nvSi|! 
z^7z>^)7fIh6gwVi&4VQAIDG-!t55A1glL5Bg@hY{jNJX#meBvXmdZ`1bhotS@ zf*hMvwbsLaULzleqmOM%fF0jxzy1DK#-3;1NmYu>gAVbu&|2?}w#TnF zR37D|I{FodNBKQ6?Bw!%&(40NbxkLTqp^m6suyi(&=-HcOf_vsbMw6%8zKczXc)b~ z(0lfrK$wlRjoF#9@e*6b=I&yikWNh%&F7OUXzI`{(m~2ccagc?Zg(3Eudvw3)}Z*X z`o^K0q0V8OY|pHv#NqHPx3uqPmpg#zTvg$Ts8ID#*TrZEGScSL%JxO({`aw-HU_N% zr)D0coD@@?ht;xGXaou^Ity;(Ut*3q)&9kdGh(=y=@mvUYMu2sb zqxx#C7;Xun6f+!-DL^tkAV#Xg#ZxdmX++3<1?%u&#$!1!iGEU&FYKap+bvL$80~l5 z--VLt+8o|a7@qnbW?#JKxcOR6YhfZ-wth;kYsp;aC55<9hs2YIPiJvBbA$bWXg8|I z_?jJ*DfadcKL$gb18dGWZV|^?d0XIZ)O2{H4=H?cYA0Z{90P|iOM)X zu#tU=Ir~FO|DN8P@7fsUmA;63>SU*mPMUD&iElyay1|J<%fZQ5j?(fkqDwpnXD#U& zZqR_k6KQARMwm+8)0>4Jgs$f3sdVluxr_m;jgJba@41_|6{%zN)}y8+VGnGM$W)Z2 zC5z1!zle3ab|tX^6o+c<`%26`K)=~Lz_yudyscYu*PpdiA>c97{SE;f%ZwLuKsz?u zK2dE_FC$JIjbM+7l&QzFh~;AA^S^3w2V~!@(BiLD#c1M(i)9$3)o!*KpOYISNWQdK zuOFO!UJ`70?Rc|S*EfBqiZ@Hoj1WP8=)ZHq^m_Gz3+}^jWo^1^hj*|XVHw|D;-fn}geIz^R{~7i zA$<1Co~^EP!Pku651!UfJ-=FE=qchi^HlK@JLV2EAb20v6kXUMgs39NswgzyRlTda z1`4%osFBUN2~Z3Ft%8}qt^3d?pHcHJb)qu7Li|ZTd`X=vKn#sVu#T804g2f%3keM! zzlX>iSyC=~c=4s)(elg2H{37O&jd5;aGsYnu9lqiKwIh#@#v+@^dX>P-e=1M$gF_Jz0i|q-PLDeSb8qcW$;LTLMOmf)w*s5&bQb?PhbmAB)EztTWkJy zb0RmeBw|`4ycKM86;}~~+fZb0f9{fANBVJ#{CpKPUyvs+yML@+vI~ZhHd40S1Ygl4 zUAOZcX0oC6OCwGQs%kI0LR|dl&1qqw zfzJ_zD>)%iOAS++(fT!M^L;)h$`x0?e{1k}-89@WIKiErqc z19&2{m}ie`uYRwaUFb#0%^S@Y6uQ-6MWbOJds?>Zl5?oW*cBqGI{i0rorLrijW+7N zlA)cqYbKe6G{C8gGDGeKC4aj7Zv3Y9r0$N811QVQ@0$C|7@5(4l36M~#EC3L?R~!2 zK}^SG?QtrADV5VKP*0G`M}5Q7fIw|UDk$mya1{9h9A+m94GMNXGp+kGu=byhnz~W* zU7W5Qv!?b1ym!vE7ruCPsJTcTo$<#K%-rlSKi=6?w7+kXIB{W3HF literal 0 HcmV?d00001