trycua/cua

c/ua is the Docker Container for Computer-Use AI Agents.

GitHub (public): 8.8k stars, 393 forks, 57 open issues, 0 releases. Default branch: main. Active for 5 months.


c/ua ("koo-ah") is Docker for Computer-Use Agents - it enables AI agents to control full operating systems in virtual containers and deploy them locally or to the cloud.

More demos of the Computer-Use Agent in action:

  • MCP Server: Work with Claude Desktop and Tableau
  • AI-Gradio: Multi-app workflow with browser, VS Code and terminal
  • Notebook: Fix GitHub issue in Cursor

🚀 Quick Start with a Computer-Use Agent UI

Need to automate desktop tasks? Launch the Computer-Use Agent UI with a single command.

Option 1: Fully-managed install with Docker (recommended)

Docker-based guided install for quick use

macOS/Linux/Windows (via WSL):

bash
# Requires Docker
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/scripts/playground-docker.sh)"

This script will guide you through setup using Docker containers and launch the Computer-Use Agent UI.
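
Because the script requires Docker, it can help to confirm that the `docker` CLI is on your PATH before running it. The following is a minimal sketch using only the Python standard library; the helper name `have_tool` is hypothetical and not part of c/ua.

```python
import shutil


def have_tool(name: str) -> bool:
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


if __name__ == "__main__":
    if have_tool("docker"):
        print("docker found; the guided install should be able to start")
    else:
        print("docker not found; install Docker before running the script")
```

This only checks that the executable exists; it does not verify that the Docker daemon is running.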


Option 2: Dev Container

Best for contributors and development

This repository includes a Dev Container configuration that simplifies setup to a few steps:

  1. Install the Dev Containers extension (VS Code or WindSurf)
  2. Open the repository in the Dev Container:
    • Press Ctrl+Shift+P (or ⌘+Shift+P on macOS)
    • Select Dev Containers: Clone Repository in Container Volume... and paste the repository URL: https://github.com/trycua/cua.git (if not cloned) or Dev Containers: Open Folder in Container... (if git cloned).

    Note: On WindSurf, the post install hook might not run automatically. If so, run /bin/bash .devcontainer/post-install.sh manually.

  3. Open the VS Code workspace: Once post-install.sh has finished running, open the .vscode/py.code-workspace workspace and press Open Workspace.
  4. Run the Agent UI example: Click Run Agent UI to start the Gradio UI. If prompted to install debugpy (Python Debugger) to enable remote debugging, select 'Yes' to proceed.
  5. Access the Gradio UI: The Gradio UI will be available at http://localhost:7860 and is automatically forwarded to your host machine.
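
If the page at http://localhost:7860 does not load, a quick way to tell whether anything is listening on the forwarded port is a plain TCP connection attempt. This is a sketch with a hypothetical helper, `is_port_open`, that is not part of the repository; it only assumes the default port 7860 mentioned above.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 7860, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved host names.
        return False


if __name__ == "__main__":
    state = "reachable" if is_port_open() else "not reachable"
    print(f"Gradio UI port 7860 is {state}")
```

A successful connection only shows that some process accepts connections on that port, not that it is the Gradio UI.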

Option 3: PyPI

Direct Python package installation

bash
# Optional: create a dedicated environment first
# conda create -yn cua python==3.12

pip install -U "cua-computer[all]" "cua-agent[all]"
python -m agent.ui  # Start the agent UI

Or check out the Usage Guide to learn how to use our Python SDK in your own code.


Supported Agent Loops

🖥️ Compatibility

For detailed compatibility information including host OS support, VM emulation capabilities, and model provider compatibility, see the Compatibility Matrix.



🐍 Usage Guide

Follow these steps to use C/ua in your own Python code. See the Developer Guide for building from source.

Step 1: Install Lume CLI

bash
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh)"

Lume CLI manages high-performance macOS/Linux VMs with near-native speed on Apple Silicon.

Step 2: Pull the macOS CUA Image

bash
lume pull macos-sequoia-cua:latest

The macOS CUA image contains the default Mac apps and the Computer Server for easy automation.

Step 3: Install Python SDK

bash
pip install "cua-computer[all]" "cua-agent[all]"

Step 4: Use in Your Code

python
import asyncio

from computer import Computer
from agent import ComputerAgent, LLM

async def main():
    # Start a local macOS VM
    computer = Computer(os_type="macos")
    await computer.run()

    # Or, instead of the local VM above, use a C/ua Cloud Container
    computer = Computer(
        os_type="linux",
        api_key="your_cua_api_key_here",
        name="your_container_name_here"
    )

    # Example: Direct control of a macOS VM with Computer
    computer.interface.delay = 0.1  # Wait 0.1 seconds between kb/m actions
    await computer.interface.left_click(100, 200)
    await computer.interface.type_text("Hello, world!")
    screenshot_bytes = await computer.interface.screenshot()

    # Example: Create and run an agent locally using mlx-community/UI-TARS-1.5-7B-6bit
    agent = ComputerAgent(
        computer=computer,
        loop="uitars",
        model=LLM(provider="mlxvlm", name="mlx-community/UI-TARS-1.5-7B-6bit")
    )
    async for result in agent.run("Find the trycua/cua repository on GitHub and follow the quick start guide"):
        print(result)

if __name__ == "__main__":
    asyncio.run(main())
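
The example above keeps the screenshot in `screenshot_bytes` without using it. A minimal sketch for persisting such bytes to disk follows. It assumes the bytes are PNG-encoded, which the text above does not state, and `save_screenshot` is a hypothetical helper rather than part of the SDK.

```python
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def save_screenshot(data: bytes, path: str) -> Path:
    """Write screenshot bytes to `path`, refusing data that is not PNG.

    Assumption: the screenshot is PNG-encoded. Drop the signature check
    if your screenshots use another format.
    """
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("screenshot data does not look like a PNG image")
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return target
```

Inside `main()` this could be called as `save_screenshot(screenshot_bytes, "shots/step1.png")`.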

For ready-to-use examples, check out our Notebooks collection.

Lume CLI Reference

bash
# Install Lume CLI and background service
curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh | bash

# List all VMs
lume ls

# Pull a VM image
lume pull macos-sequoia-cua:latest

# Create a new VM
lume create my-vm --os macos --cpu 4 --memory 8GB --disk-size 50GB

# Run a VM (creates and starts if it doesn't exist)
lume run macos-sequoia-cua:latest

# Stop a VM
lume stop macos-sequoia-cua_latest

# Delete a VM
lume delete macos-sequoia-cua_latest
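
When scripting VM creation from Python, building the argument list explicitly avoids shell-quoting mistakes. The sketch below mirrors only the `lume create` flags shown above; `lume_create_args` is a hypothetical helper, and the default values are simply the ones from the example command.

```python
import shlex


def lume_create_args(name, os_type="macos", cpu=4, memory="8GB", disk_size="50GB"):
    """Build the argv list for `lume create`, mirroring the flags shown above."""
    return [
        "lume", "create", name,
        "--os", os_type,
        "--cpu", str(cpu),
        "--memory", memory,
        "--disk-size", disk_size,
    ]


if __name__ == "__main__":
    args = lume_create_args("my-vm")
    print(shlex.join(args))
    # To actually run it (requires Lume to be installed):
    # import subprocess; subprocess.run(args, check=True)
```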

Lumier CLI Reference

For advanced container-like virtualization, check out Lumier - a Docker interface for macOS and Linux VMs.

bash
# Install Lume CLI and background service
curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh | bash

# Run macOS in a Docker container
# (paths are quoted so that directories containing spaces still work)
docker run -it --rm \
    --name lumier-vm \
    -p 8006:8006 \
    -v "$(pwd)/storage:/storage" \
    -v "$(pwd)/shared:/shared" \
    -e VM_NAME=lumier-vm \
    -e VERSION=ghcr.io/trycua/macos-sequoia-cua:latest \
    -e CPU_CORES=4 \
    -e RAM_SIZE=8192 \
    -e HOST_STORAGE_PATH="$(pwd)/storage" \
    -e HOST_SHARED_PATH="$(pwd)/shared" \
    trycua/lumier:latest
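
The `docker run` command above repeats the storage and shared paths in both the volume mounts and the environment variables, which is easy to get out of sync. A sketch that derives both from one base directory follows. `lumier_docker_args` is a hypothetical helper; it uses only the flags, environment variable names, and image name shown above, and its argument names are invented for this example.

```python
from pathlib import Path


def lumier_docker_args(vm_name, image, cpu_cores=4, ram_size=8192, base_dir=".", port=8006):
    """Build the argv list for the Lumier `docker run` command shown above."""
    base = Path(base_dir).resolve()
    storage, shared = base / "storage", base / "shared"
    env = {
        "VM_NAME": vm_name,
        "VERSION": image,
        "CPU_CORES": str(cpu_cores),
        "RAM_SIZE": str(ram_size),
        "HOST_STORAGE_PATH": str(storage),
        "HOST_SHARED_PATH": str(shared),
    }
    args = [
        "docker", "run", "-it", "--rm",
        "--name", vm_name,
        "-p", f"{port}:{port}",
        "-v", f"{storage}:/storage",
        "-v", f"{shared}:/shared",
    ]
    for key, value in env.items():
        args += ["-e", f"{key}={value}"]
    args.append("trycua/lumier:latest")
    return args


if __name__ == "__main__":
    print(lumier_docker_args("lumier-vm", "ghcr.io/trycua/macos-sequoia-cua:latest"))
```

Passing the resulting list to `subprocess.run` avoids the need for shell quoting of the paths.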

Resources

Modules

| Module | Description | Installation |
|---|---|---|
| Lume | VM management for macOS/Linux using Apple's Virtualization.Framework | `curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh \| bash` |
| Lumier | Docker interface for macOS and Linux VMs | `docker pull trycua/lumier:latest` |
| Computer (Python) | Python interface for controlling virtual machines | `pip install "cua-computer[all]"` |
| Computer (TypeScript) | TypeScript interface for controlling virtual machines | `npm install @trycua/computer` |
| Agent | AI agent framework for automating tasks | `pip install "cua-agent[all]"` |
| MCP Server | MCP server for using CUA with Claude Desktop | `pip install cua-mcp-server` |
| SOM | Set-of-Mark library for Agent | `pip install cua-som` |
| Computer Server | Server component for Computer | `pip install cua-computer-server` |
| Core (Python) | Python core utilities | `pip install cua-core` |
| Core (TypeScript) | TypeScript core utilities | `npm install @trycua/core` |

Computer Interface Reference

For complete examples, see computer_examples.py or computer_nb.ipynb

python
# Shell Actions
result = await computer.interface.run_command(cmd)  # Run shell command
# result.stdout, result.stderr, result.returncode

# Mouse Actions
await computer.interface.left_click(x, y)  # Left click at coordinates
await computer.interface.right_click(x, y)  # Right click at coordinates
await computer.interface.double_click(x, y)  # Double click at coordinates
await computer.interface.move_cursor(x, y)  # Move cursor to coordinates
await computer.interface.drag_to(x, y, duration)  # Drag to coordinates
await computer.interface.get_cursor_position()  # Get current cursor position
await computer.interface.mouse_down(x, y, button="left")  # Press and hold a mouse button
await computer.interface.mouse_up(x, y, button="left")  # Release a mouse button

# Keyboard Actions
await computer.interface.type_text("Hello")  # Type text
await computer.interface.press_key("enter")  # Press a single key
await computer.interface.hotkey("command", "c")  # Press key combination
await computer.interface.key_down("command")  # Press and hold a key
await computer.interface.key_up("command")  # Release a key

# Scrolling Actions
await computer.interface.scroll(x, y)  # Scroll the mouse wheel
await computer.interface.scroll_down(clicks)  # Scroll down
await computer.interface.scroll_up(clicks)  # Scroll up

# Screen Actions
await computer.interface.screenshot()  # Take a screenshot
await computer.interface.get_screen_size()  # Get screen dimensions

# Clipboard Actions
await computer.interface.set_clipboard(text)  # Set clipboard content
await computer.interface.copy_to_clipboard()  # Get clipboard content

# File System Operations
await computer.interface.file_exists(path)  # Check if file exists
await computer.interface.directory_exists(path)  # Check if directory exists
await computer.interface.read_text(path, encoding="utf-8")  # Read file content
await computer.interface.write_text(path, content, encoding="utf-8")  # Write file content
await computer.interface.read_bytes(path)  # Read file content as bytes
await computer.interface.write_bytes(path, content)  # Write file content as bytes
await computer.interface.delete_file(path)  # Delete file
await computer.interface.create_dir(path)  # Create directory
await computer.interface.delete_dir(path)  # Delete directory
await computer.interface.list_dir(path)  # List directory contents

# Accessibility
await computer.interface.get_accessibility_tree()  # Get accessibility tree

# Delay Configuration
# Set default delay between all actions (in seconds)
computer.interface.delay = 0.5  # 500ms delay between actions

# Or specify delay for individual actions
await computer.interface.left_click(x, y, delay=1.0)  # 1 second delay after click
await computer.interface.type_text("Hello", delay=0.2)  # 200ms delay after typing
await computer.interface.press_key("enter", delay=0.5)  # 500ms delay after key press

# Python Virtual Environment Operations
await computer.venv_install("demo_venv", ["requests", "macos-pyxa"])  # Install packages in a virtual environment
# Run a shell command in a virtual environment (the URL is a Python string, so it needs real quotes)
await computer.venv_cmd("demo_venv", "python -c 'import requests; print(requests.get(\"https://httpbin.org/ip\").json())'")
await computer.venv_exec("demo_venv", python_function_or_code, *args, **kwargs)  # Run a Python function in a virtual environment and return the result / raise an exception

# Example: Use sandboxed functions to execute code in a C/ua Container
from computer.helpers import sandboxed

@sandboxed("demo_venv")
def greet_and_print(name):
    """Get the HTML of the current Safari tab"""
    import PyXA
    safari = PyXA.Application("Safari")
    html = safari.current_document.source()
    print(f"Hello from inside the container, {name}!")
    return {"greeted": name, "safari_html": html}

# When a @sandboxed function is called, it will execute in the container
result = await greet_and_print("C/ua")
# Result: {"greeted": "C/ua", "safari_html": "<html>...</html>"}
# stdout and stderr are also captured and printed / raised
print("Result from sandboxed function:", result)
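
The reference above notes that `run_command` returns an object exposing `stdout`, `stderr`, and `returncode`. A small sketch of turning a non-zero return code into an exception follows. `ensure_success` and `CommandFailed` are hypothetical names, and the example uses a `SimpleNamespace` stand-in rather than a real command result so that it runs without a VM.

```python
from types import SimpleNamespace


class CommandFailed(RuntimeError):
    """Raised when a command result carries a non-zero return code."""


def ensure_success(result):
    """Return result.stdout, or raise CommandFailed with stderr attached."""
    if result.returncode != 0:
        raise CommandFailed(
            f"command exited with {result.returncode}: {result.stderr.strip()}"
        )
    return result.stdout


if __name__ == "__main__":
    # Stand-in for the object returned by computer.interface.run_command(cmd)
    fake = SimpleNamespace(stdout="hello\n", stderr="", returncode=0)
    print(ensure_success(fake), end="")
```

With a real VM the call would look like `ensure_success(await computer.interface.run_command(cmd))`.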

ComputerAgent Reference

For complete examples, see agent_examples.py or agent_nb.ipynb

python
# Import necessary components
from agent import ComputerAgent, LLM, AgentLoop, LLMProvider

# UI-TARS-1.5 agent for local execution with MLX
ComputerAgent(loop=AgentLoop.UITARS, model=LLM(provider=LLMProvider.MLXVLM, name="mlx-community/UI-TARS-1.5-7B-6bit"))
# OpenAI Computer-Use agent using OPENAI_API_KEY
ComputerAgent(loop=AgentLoop.OPENAI, model=LLM(provider=LLMProvider.OPENAI, name="computer-use-preview"))
# Anthropic Claude agent using ANTHROPIC_API_KEY
ComputerAgent(loop=AgentLoop.ANTHROPIC, model=LLM(provider=LLMProvider.ANTHROPIC))

# OmniParser loop for UI control using Set-of-Marks (SOM) prompting and any vision LLM
ComputerAgent(loop=AgentLoop.OMNI, model=LLM(provider=LLMProvider.OLLAMA, name="gemma3:12b-it-q4_K_M"))
# OpenRouter example using OAICOMPAT provider
ComputerAgent(
    loop=AgentLoop.OMNI,
    model=LLM(
        provider=LLMProvider.OAICOMPAT,
        name="openai/gpt-4o-mini",
        provider_base_url="https://openrouter.ai/api/v1"
    ),
    api_key="your-openrouter-api-key"
)
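
The comments above say that the OpenAI and Anthropic loops read `OPENAI_API_KEY` and `ANTHROPIC_API_KEY`. A small pre-flight check can report a missing key before an agent is constructed. In this sketch the string labels `"openai"` and `"anthropic"` and the helper `missing_env` are invented for illustration, and loops not listed are assumed to need no environment variable.

```python
import os

# Environment variables named in the reference above. Loops that are not
# listed (for example local models) are assumed to need none.
REQUIRED_ENV = {
    "openai": ["OPENAI_API_KEY"],
    "anthropic": ["ANTHROPIC_API_KEY"],
}


def missing_env(loop_name, env=None):
    """Return the required variables that are unset or empty for `loop_name`."""
    env = os.environ if env is None else env
    return [var for var in REQUIRED_ENV.get(loop_name.lower(), []) if not env.get(var)]


if __name__ == "__main__":
    for loop in ("openai", "anthropic", "uitars"):
        missing = missing_env(loop)
        print(f"{loop}: {'ok' if not missing else 'missing ' + ', '.join(missing)}")
```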

Community

Join our Discord community to discuss ideas, get assistance, or share your demos!

License

Cua is open-sourced under the MIT License - see the LICENSE file for details.

Microsoft's OmniParser, which is used in this project, is licensed under the Creative Commons Attribution 4.0 International License (CC-BY-4.0) - see the OmniParser LICENSE file for details.

Contributing

We welcome contributions to CUA! Please refer to our Contributing Guidelines for details.

Trademarks

Apple, macOS, and Apple Silicon are trademarks of Apple Inc. Ubuntu and Canonical are registered trademarks of Canonical Ltd. Microsoft is a registered trademark of Microsoft Corporation. This project is not affiliated with, endorsed by, or sponsored by Apple Inc., Canonical Ltd., or Microsoft Corporation.

Stargazers

Thank you to all our supporters!
