Releases: askui/python-sdk
v0.32.1
v0.32.1
🎉 Overview
v0.32.1 fixes a bug that led to a crash if the optional "web" dependency group was not installed.
🐛 Bug Fixes
- fix: add missing import guard for PlaywrightBaseTool by @philipph-askui in #270
Full Changelog: v0.32.0...v0.32.1
v0.32.0
v0.32.0
🎉 Overview
v0.32.0 introduces the new WebAgent, a browser automation agent with native Playwright tools for mouse, keyboard, and screenshot interactions. The release also adds numpad key support across the AgentOS keyboard abstraction.
✨ New Features
WebAgent— a new browser automation agent with a full suite of Playwright tools (screenshot,move_mouse,mouse_click,mouse_scroll,mouse_hold_down,mouse_release,type,keyboard_tap,keyboard_pressed,keyboard_release) in addition to the existing navigation tools by @philipph-askui in #267- Numpad key support — added
numpad_lock,numpad_0–numpad_9,numpad_+,numpad_-,numpad_*,numpad_/, andnumpad_.toPcKeywith corresponding Playwright key mappings by @mlikasam-askui in #269
🔧 Improvements
- Set
is_cacheableflag onlist_process_toolfor improved caching by @philipph-askui in #267
⚠️ Breaking Changes
WebVisionAgentis deprecated — useWebAgentinstead.WebVisionAgentstill works but emits aDeprecationWarningWebAgentnow extendsAgentdirectly instead ofComputerAgent, with a new constructor signature that acceptscallbacksandtruncation_strategyparameters- Playwright navigation tools (
PlaywrightGotoTool,PlaywrightBackTool, etc.) now inherit fromPlaywrightBaseToolinstead ofTooland require aPlaywrightAgentOs(or compatible) instance as their agent OS
Full Changelog: v0.31.0...v0.32.0
v0.31.0
v0.31.0
🎉 Overview
v0.31.0 substantially improves the memory efficiency of askui. The SimpleHtmlReporter has been rearchitected to stream message rows (including base64-encoded screenshots) to a temporary file on disk instead of accumulating them in memory, significantly reducing memory usage during long-running sessions. Further, reporters are now wrapped with automatic error handling so that a failure in one reporter no longer crashes the agent.
✨ New Features
ReporterErrorHandler— a decorator that wraps anyReporterwith try/except error handling; on first failure the reporter is disabled for the rest of the session, preventing reporting errors from interrupting agent execution by @mlikasam-askui in #258
🔧 Improvements
SimpleHtmlReporternow streams HTML message rows to a temporary file as they arrive instead of holding all base64 image data in memory, reducing peak memory usage for screenshot-heavy sessions by @mlikasam-askui in #258CompositeReporternow automatically wraps all reporters inReporterErrorHandler, making error resilience the default behavior by @mlikasam-askui in #258
Full Changelog: v0.30.0...v0.31.0
v0.30.0
v0.30.0
🎉 Overview
v0.30.0 introduces a new infrastructure-error handling prompt that prevents agents from entering unfixable retry loops when the underlying controller, session, or RPC connection fails. It also enriches the HTML report's conversation breakdown with per-conversation step counts, durations, and cache token statistics, and quiets noisy tool-failure logs by demoting them from WARNING to INFO.
✨ New Features
- Infrastructure / tool error prompt added to the computer, Android, and multi-device agent capabilities — instructs agents to retry infrastructure failures (connection lost, session expired, RPC errors, stream closed, service unavailable, controller timeouts) at most once and otherwise stop immediately with a
BROKENreport status instead of looping on unfixable errors by @philipph-askui in #265 - Step count,
cache_creation_input_tokens, andcache_read_input_tokensadded to the per-conversation usage breakdown inSimpleHtmlReporterby @philipph-askui in #264 - Per-conversation duration added to the HTML report breakdown —
started_at/ended_attimestamps are captured on conversation summaries and rendered in a human-readable elapsed-time format by @philipph-askui in #266
🔧 Improvements
Tool failedlogs inToolCollectiondemoted fromWARNINGtoINFOto reduce log noise during normal agent operation by @philipph-askui in #264
⚠️ Breaking Changes
UsageTrackingCallbackrenamed toConversationStatisticsCallback
Full Changelog: v0.29.0...v0.30.0
v0.29.0
v0.29.0
🎉 Overview
v0.29.0 replaces the simple message-dropping truncation strategy with a new VLM-based SummarizingTruncationStrategy that summarizes older conversation history to preserve context while staying within token limits. It also fixes mouse scroll coordinate scaling issues, improves scroll tool descriptions with OS-specific guidance, removes get and locate from the default agent tools, hardens the move_mouse tool against malformed coordinate inputs, and makes base64 image truncation in html reports more robust.
✨ New Features
SummarizingTruncationStrategy— new default truncation strategy that uses the VLM to summarize older conversation history instead of dropping messages, with prompt caching support during summarization for cost efficiency by @philipph-askui in #257SlidingImageWindowSummarizingTruncationStrategy(experimental) — extends summarization with dynamic image removal from older messages to reduce network traffic and latencies while staying compatible with prompt caching by @philipph-askui in #257truncation_strategyinit parameter onComputerAgent,AndroidAgent, andAgent— allows passing a custom truncation strategy with auto-injection of conversation dependencies (vlm_provider,reporter,callbacks) by @philipph-askui in #257
🔧 Improvements
- Mouse scroll tool description now includes OS-dependent scroll guidance (start with
dy=150/dy=-150, macOS direction info) by @programminx-askui in #260 truncate_contentin reporting replaced bytruncate_base64_images— only base64 image data is replaced with placeholders, leaving all other content (prompts, tool outputs) untouched by @philipph-askui in #259move_mousetool now robustly parses coordinates when the agent passes them as strings or comma-separated values, with clearer tool description and improved error messages by @philipph-askui in #262
🐛 Bug Fixes
- Fix incorrect coordinate scaling on mouse scroll deltas —
ComputerAgentOsFacade.mouse_scrollno longer applies display scaling to scroll amounts (SOLENG-332) by @programminx-askui in #260
⚠️ Breaking Changes
SimpleTruncationStrategyandSimpleTruncationStrategyFactoryremoved — replaced bySummarizingTruncationStrategyas the new defaultConversationconstructor parametertruncation_strategy_factoryreplaced bytruncation_strategy(a strategy instance instead of a factory)getandlocatetools removed fromAgent's default tool list — they are no longer auto-added when anagent_osis providedmouse_scrollparameters renamed fromx/ytodx/dyacross allAgentOsimplementations (AskUiControllerClient,PlaywrightAgentOs,ComputerAgentOsFacade,ComputerAgent)truncate_contentfunction inreporting.pyremoved — replaced bytruncate_base64_images
Full Changelog: v0.28.0...v0.29.0
v0.28.0
v0.28.0
🎉 Overview
v0.28.0 integrates AgentOS as a Python package dependency (no more manual installation), adds a UIAutomator hierarchy tool for Android agents, improves support for Anthropic prompt caching to reduce inference cost, introduces Tool.from_mcp_tool() for wrapping FastMCP tools, and overhauls usage tracking with per-step and per-conversation cost breakdowns including cache token costs in the HTML reports.
✨ New Features
- AgentOS shipped as Python package (
askui-agent-os) — no manual installation needed by @mlikasam-askui in #246 - Anthropic prompt caching (auto strategy) with
cache_controlparameter by @philipph-askui in #253 AndroidGetUIAutomatorHierarchyTool— accessibility hierarchy dump for Android agents, providing structured UI element data (text, resource IDs, tap centers) as an alternative to screenshot-based inference by @mlikasam-askui in #251- Hierarchical usage tracking with per-step, per-conversation, and aggregate cost breakdowns including cache token costs in HTML reports by @mlikasam-askui in #253
🔧 Improvements
Tool.from_mcp_tool()to wrap FastMCP tools as AskUI Tools by @mlikasam-askui in #250markitdownandbsonmoved to optional dependencies (office-document) andpure-python-adbpromoted to core to streamline the installation by @mlikasam-askui in #255- Documented optional install extras (
office-document,bedrock,vertex,otel,web) in README by @mlikasam-askui in #255 - Workspace ID (
askui.workspace.id) added to OTEL trace resource attributes by @philipph-askui in #256 - Improved tracing structure with
_get_next_message()span for better observability by @philipph-askui in #256
🐛 Bug Fixes
- Fix prompt caching breakpoints to improve prompt caching efficiency by @philipph-askui in #253
- Fix report formatting and cache statistics accumulation by @philipph-askui in #253
- Constrain
grpcio<1.80.0to avoid compatibility issues by @philipph-askui in #250 - Clean up OTEL tracing: remove stale
cluster_nameconfig and unnecessary SQLAlchemy instrumentation by @philipph-askui in #256
⚠️ Breaking Changes
ASKUI_COMPONENT_REGISTRY_FILE,ASKUI_INSTALLATION_DIRECTORY, andASKUI_CONTROLLER_PATHenvironment variables are no longer recognized — AgentOS is now auto-discovered via theaskui-agent-ospackageOtelSettings.cluster_namefield andASKUI__OTEL_CLUSTER_NAMEenv var removed; replaced byworkspace_id/ASKUI_WORKSPACE_ID- Minimum
anthropicSDK version bumped from>=0.72.0to>=0.86.0 androidoptional extra removed —pure-python-adbis now a core dependency; useoffice-documentextra for MarkItDown features previously bundled by defaultbsonandmarkitdownremoved from default dependencies — installaskui[office-document]if you need Office file conversion
Full Changelog: v0.27.0...v0.28.0
v0.27.0
v0.27.0
🎉 Overview
v0.27.0 adds a MultiDeviceAgent that can operate android and computer devices simultaneously, improves the SDK structure by introducing default tool lists for the ComputerAgent, and AndroidAgent, and fixes a bug with single-display handling on android.
✨ New Features
- Execution Cost in Html Reports by @philipph-askui in #243
- Otel tracing @philipph-askui in #249
- max_steps parameter to stop the execution after a predefined number of steps by @philipph-askui in #244
🔧 Improvements
- format execution time in html reports as
hh:mm:ssby @philipph-askui in #247
🐛 Bug Fixes
Full Changelog: v0.26.1...v0.27.0
v0.26.1
What's Changed
- calls to the print_to_console_tool are now skipped during cached executions, to save time and tokens by @philipph-askui in #245
Full Changelog: v0.26.0...v0.26.1
v0.26.0
v0.26.0
🎉 Overview
v0.26.0 adds a MultiDeviceAgent that can operate android and computer devices simultaneously, improves the SDK structure by introducing default tool lists for the ComputerAgent, and AndroidAgent, and fixes a bug with single-display handling on android.
✨ New Features
- Add
MultiDeviceAgentby @philipph-askui in #241
🔧 Improvements
- Adds Default tool Lists for Agents by @philipph-askui in #242
🐛 Bug Fixes
- refactor(android): single-display handling and SingleAndroidDisplay by @mlikasam-askui in #240
Full Changelog: v0.25.1...v0.26.0
v0.25.1
What's Changed
the value of model_id for vlm_providers can now be set as env variable by @philipph-askui in #239
Full Changelog: v0.25.0...v0.25.1