Overview

Use the desktop-ubuntu template when an agent or operator needs a graphical environment. Desktop sandboxes expose a browser-accessible desktop URL (vncUrl) through the sandbox state, and the GUI can be driven with the shell-based automation tools installed in the image.

Create A Desktop Sandbox

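// Create a sandbox from the desktop template.
const sandbox = await rt.sandbox.create({
  template: 'desktop-ubuntu',
  name: 'browser-env',
});

// Wait until the sandbox itself reports ready.
await sandbox.waitForReady();

// Desktop access becomes ready separately; log the URL once it appears in the state.
sandbox.on('status', (state) => {
  if (state.vncUrl) {
    console.log(state.vncUrl);
  }
});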

Or create the same sandbox from the CLI:
runtools sandbox create --name browser-env --template desktop-ubuntu
runtools sandbox list

Define an agent that runs inside the sandbox, in agents/browser-agent.ts:
import { defineAgent } from '@runtools/sdk';

export default defineAgent({
  slug: 'browser-agent',
  name: 'Browser Agent',
  model: 'claude-sonnet-4',
  executionMode: 'in_sandbox',
  sandbox: 'browser-env',
  tools: ['exec_command', 'write_stdin', 'apply_patch', 'view_image', 'web_search', 'get_dev_url'],
  systemPrompt: `Use the desktop tools available through shell commands when a task requires browser or GUI inspection.`,
});

Deploy the agent and run a task:
runtools deploy --agents-only
runtools agent run browser-agent "Open a browser, inspect the page, and summarize what you find"

Browser Desktop URL

The SDK exposes a normalized sandbox.vncUrl when desktop access is ready:
sandbox.on('status', (state) => {
  console.log(state.vncReady, sandbox.vncUrl);
});
The REST GET /v1/sandboxes/{id} response can also include vncReady and vncUrl.
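
For callers that use the REST endpoint rather than the SDK, a polling loop is one way to wait for the desktop URL. The following is a minimal sketch, not the official client: only the path /v1/sandboxes/{id} and the fields vncReady and vncUrl come from this page. The names pollDesktopUrl, FetchJson, baseUrl, attempts, and delayMs are hypothetical, and authentication is left to the injected fetchJson function because it is not described here.

```typescript
// Assumed response shape: only these two fields are named in the docs above.
interface SandboxState {
  vncReady?: boolean;
  vncUrl?: string;
}

// Hypothetical fetch-like function. The caller supplies it and is responsible
// for authentication and JSON parsing.
type FetchJson = (url: string) => Promise<SandboxState>;

// Poll GET /v1/sandboxes/{id} until both vncReady and vncUrl are present.
async function pollDesktopUrl(
  fetchJson: FetchJson,
  baseUrl: string, // hypothetical: the API origin for your account
  sandboxId: string,
  attempts = 30,
  delayMs = 2000,
): Promise<string> {
  const url = `${baseUrl}/v1/sandboxes/${encodeURIComponent(sandboxId)}`;
  for (let i = 0; i < attempts; i++) {
    const state = await fetchJson(url);
    if (state.vncReady && state.vncUrl) {
      return state.vncUrl;
    }
    // Sleep between attempts, but not after the final one.
    if (i < attempts - 1) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error(`Desktop not ready after ${attempts} attempts`);
}
```

In practice fetchJson could wrap the platform's fetch with whatever credentials your account requires. Injecting it keeps the loop easy to exercise without a network.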

Automation Pattern

Agents currently use shell-accessible tools for desktop automation. For example:
await sandbox.exec('xdotool getmouselocation');
await sandbox.exec('scrot /tmp/screen.png');
Use a vision-capable model when your workflow depends on screenshot interpretation.
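
Because automation runs through shell commands, it can help to build those command strings in one place. The helpers below are a hypothetical sketch and not part of the SDK: the names shellQuote, clickAt, pressKey, and screenshot are illustrative. The sketch assumes xdotool and scrot are installed in the image, as the examples above suggest.

```typescript
// Hypothetical helpers: they only build command strings for sandbox.exec.

/** Quote a value so a POSIX shell treats it as a single argument. */
function shellQuote(value: string): string {
  return `'${value.replace(/'/g, "'\\''")}'`;
}

/** Move the pointer to (x, y) and click the left mouse button. */
function clickAt(x: number, y: number): string {
  if (!Number.isFinite(x) || !Number.isFinite(y)) {
    throw new Error('Coordinates must be finite numbers');
  }
  return `xdotool mousemove ${Math.round(x)} ${Math.round(y)} click 1`;
}

/** Send a key or key combination such as "Return" or "ctrl+l". */
function pressKey(combo: string): string {
  // Restrict input to keysym-style names so nothing else reaches the shell.
  if (!/^[A-Za-z0-9_+]+$/.test(combo)) {
    throw new Error(`Unsupported key combination: ${combo}`);
  }
  return `xdotool key ${combo}`;
}

/** Capture the screen to the given file path with scrot. */
function screenshot(path: string): string {
  return `scrot ${shellQuote(path)}`;
}
```

Each helper returns a string that can be passed to sandbox.exec, for example sandbox.exec(clickAt(640, 360)) followed by sandbox.exec(screenshot('/tmp/screen.png')).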

Best Practices

Use Tool Hub or direct APIs when a provider offers a reliable API. Desktop automation is best for web or GUI workflows that do not have a usable API.
Wait for both sandbox readiness and desktop readiness before starting browser tasks.
Give the agent specific page goals, selectors, or visual landmarks when possible.
Pause or destroy desktop sandboxes when you are done.
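
The advice to wait for desktop readiness can be sketched as a promise that resolves on the first status event carrying both vncReady and vncUrl. This is an illustrative helper, not an SDK function: waitForDesktop, StatusSource, DesktopState, and timeoutMs are hypothetical names, and the sketch assumes only the on('status', ...) event and the two state fields shown earlier on this page.

```typescript
// Assumed subset of the status payload used on this page.
interface DesktopState {
  vncReady?: boolean;
  vncUrl?: string;
}

// Minimal structural type for anything that emits 'status' events.
interface StatusSource {
  on(event: 'status', listener: (state: DesktopState) => void): void;
}

// Resolve with the desktop URL once it is ready, or reject after timeoutMs.
function waitForDesktop(source: StatusSource, timeoutMs = 60000): Promise<string> {
  return new Promise<string>((resolve, reject) => {
    let settled = false;
    const timer = setTimeout(() => {
      if (settled) return;
      settled = true;
      reject(new Error(`Desktop not ready within ${timeoutMs} ms`));
    }, timeoutMs);

    source.on('status', (state) => {
      // Ignore events after settling and events without a usable URL.
      if (settled || !state.vncReady || !state.vncUrl) return;
      settled = true;
      clearTimeout(timer);
      resolve(state.vncUrl);
    });
  });
}
```

A caller would typically await sandbox.waitForReady() first and then await waitForDesktop(sandbox) before starting browser tasks. The timeout keeps a stalled desktop from blocking a workflow indefinitely.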