The Golden Path: your SO-101, driven by a cloud robot brain

This is the recipe. Follow it top to bottom and a cloud-hosted MolmoAct2 model will drive your real SO-101 through a pick-and-place — smoothly, safely, in under 15 minutes including the GPU worker's cold start. No robotics PhD required.

No arm yet? Run the simulator instead: same brain, simulated LIBERO arm, nothing to plug in. From a source checkout, cd mvp and then run uv run --package dropbear --extra sim dropbear sim. This works on Linux or WSL2; on macOS Apple Silicon the upstream LIBERO/robosuite stack currently crashes at environment creation (a known limitation, verified 2026-06-12).

What you need

Item	Notes
SO-101 arm	assembled, motors connected, USB to your computer
A commodity webcam	any 1080p USB webcam or your laptop's built-in camera. NOT a RealSense or other depth camera — the model wants plain RGB, and the golden path is tuned and evaluated on cheap cameras
A computer	macOS or Linux, Python 3.12 (uv uses the workspace pin below), normal home network
A table	plain, uncluttered surface; a light solid-colour tablecloth or desk mat helps
Props	a red cube, a blue cube, a yellow cube (~3–4 cm sides), a bowl, a plate, masking tape
15 minutes	including the GPU waking up

1. Sign up and create an API key

Sign in at dropbear.dreamtimelabs.ai (GitHub or Google).
Dashboard → API keys → Create key. Copy it now — it is shown once.

2. Run from source

Until the first PyPI publish, use a checkout of this repository instead of package install commands. From the checkout root:

cd mvp
uv run --package dropbear dropbear quickstart

For simulation, include the workspace sim extra:

cd mvp
uv run --package dropbear --extra sim dropbear sim

The quickstart command asks for at most three things: paste your API key, confirm the auto-detected robot port, pick your camera from a live preview. It saves everything to ~/.dropbear/config.toml, then pre-warms your GPU worker — the cloud brain boots while you set up the scene below. Until PyPI packages are published, run SDK commands from mvp with the same uv run --package dropbear dropbear ... prefix.

Once PyPI packages are published, the hardware path can move to a normal dropbear console script. Sim users will need the published dropbear[sim] extra; that package path is not available yet.

3. Place the camera (the part most people get wrong)

The model was trained looking at the arm from the side. Reproduce that view with a third-person side camera:

TOP VIEW                                SIDE VIEW (what the camera sees)
┌────────────────────────┐
│   workspace   ███ arm  │              ┌──────────────────────────┐
│   (props)     ███ base │              │      ▒ arm reaching      │
│      ◄── 25 cm ──►     │              │  ▒▒▒▒▒  over workspace   │
└────────────────────────┘              │  █████  props in front   │
          ▲                             │ ────────────── table edge│
          │ 60–80 cm                    └──────────────────────────┘
          │
       [webcam]  ← 20–30 cm above the tabletop, level-ish, slight downward tilt

Position: 60–80 cm from the arm, at 90° to the arm's forward direction (side-on), lens 20–30 cm above the tabletop, tilted slightly down.
Framing: the whole arm AND the whole workspace are visible; the arm sits in the left or right third, props in the middle; nothing important touches the frame edge.
Lighting: even, diffuse room light. No window or lamp behind the scene (backlight), no hard shadows across the workspace.
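Lock white balance and exposure. Auto white balance and auto exposure drift mid-task and confuse the model. On Linux: v4l2-ctl -c white_balance_automatic=0 -c auto_exposure=1 (for auto_exposure, the value 1 selects manual mode; older kernels name these controls white_balance_temperature_auto and exposure_auto, so run v4l2-ctl -l to see which names your camera exposes). On macOS, use your camera utility's manual mode. If you can't lock them, keep the lighting constant and avoid wearing or moving bright colours near the frame.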
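On Linux, the lock step can be scripted so it runs before every session. The sketch below is not part of the SDK: it builds the v4l2-ctl command shown above and falls back to the control names used by older kernels (white_balance_temperature_auto and exposure_auto). The device path /dev/video0 is an assumption, and the fallback assumes v4l2-ctl exits non-zero when it does not recognise a control name.

```python
import shutil
import subprocess


def build_lock_command(device: str = "/dev/video0", legacy_names: bool = False) -> list[str]:
    """Return the v4l2-ctl argv that disables auto white balance and auto exposure."""
    if legacy_names:
        # Control names used by older kernels.
        controls = ["white_balance_temperature_auto=0", "exposure_auto=1"]
    else:
        controls = ["white_balance_automatic=0", "auto_exposure=1"]
    cmd = ["v4l2-ctl", "-d", device]
    for control in controls:
        cmd += ["-c", control]
    return cmd


def lock_camera(device: str = "/dev/video0") -> bool:
    """Try the current control names first, then the legacy ones; True on success."""
    if shutil.which("v4l2-ctl") is None:
        return False  # v4l-utils is not installed
    for legacy in (False, True):
        result = subprocess.run(build_lock_command(device, legacy), capture_output=True)
        if result.returncode == 0:
            return True
    return False
```

If lock_camera() returns False, the camera may simply not expose these controls; in that case fall back to constant lighting as described above.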

A labelled reference diagram ships with the SDK at dropbear/assets/camera_reference.svg — open it whenever you need the exact layout. uv run --package dropbear dropbear doctor shows a photographic version of this same setup side-by-side with your live frame and asks "does your frame look like this?" before any motion.

4. Calibrate

If your arm has never been calibrated (or dropbear doctor flags stale calibration), run:

uv run --package dropbear dropbear calibrate

It is a guided wrapper around LeRobot's SO-101 calibration: move each joint to the positions it shows, press enter, done. Calibration is the #1 first-run blocker — do not skip it when offered.

5. Run it

uv run --package dropbear dropbear doctor          # runs automatically before motion, too
uv run --package dropbear dropbear demo pick-place

The first demo run is a dry run: the arm does not move; you get a live "would move" readout instead. When it completes, it asks "Enable real motion? [y/N]" — answer y and it runs for real, reusing the pre-warmed worker. If the worker is still waking you'll see "Starting GPU worker… ~2–4 min" with live progress; that wait is already inside the 15-minute budget.

Safety is code, not vibes: per-tick joint-step clamping, spacebar/Ctrl-C e-stop, and freeze-on-disconnect are always on.
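To make "per-tick joint-step clamping" concrete, here is a minimal illustration of the idea. It is not the SDK's implementation: the function name, the units, and the limit are all hypothetical. On each control tick, every joint may move toward its commanded target by at most a fixed step, so a bad prediction from the model cannot turn into a sudden lunge.

```python
def clamp_joint_step(current: list[float], target: list[float], max_step: float) -> list[float]:
    """Move each joint toward its target by at most max_step (same units as the inputs)."""
    if max_step <= 0:
        raise ValueError("max_step must be positive")
    if len(current) != len(target):
        raise ValueError("current and target must have the same number of joints")
    clamped = []
    for cur, tgt in zip(current, target):
        # Limit the signed change to the range [-max_step, +max_step].
        delta = max(-max_step, min(max_step, tgt - cur))
        clamped.append(cur + delta)
    return clamped
```

For example, with a limit of 5.0, clamp_joint_step([0.0, 10.0, -5.0], [20.0, 9.0, -30.0], 5.0) yields [5.0, 9.0, -10.0]: the two large jumps are cut to one step each, and the small move passes through unchanged.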

6. The curated tasks

Start with these three — they are the tasks the golden path is evaluated on (high-contrast props, simple goals):

pick up the red cube and place it in the bowl
pick up the blue cube and place it on the plate
push the yellow cube onto the taped square

For the third task (the push), mark a ~10×10 cm square on the table with masking tape. Place props ~10–20 cm apart, none touching, all inside the camera frame.

Start Python from the source checkout:

cd mvp
uv run --package dropbear python

Then:

import dropbear
arm = dropbear.connect_so101()
arm.do("pick up the red cube and place it in the bowl")
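To run all three curated tasks back to back, a small loop is enough. The sketch below uses only the do() method shown above. It assumes do() blocks until the task ends and that Ctrl-C surfaces as KeyboardInterrupt; neither is confirmed by this guide. FakeArm is a hypothetical stand-in so the loop can be tried without hardware.

```python
CURATED_TASKS = [
    "pick up the red cube and place it in the bowl",
    "pick up the blue cube and place it on the plate",
    "push the yellow cube onto the taped square",
]


def run_tasks(arm, tasks=CURATED_TASKS) -> list[str]:
    """Send each instruction to arm.do() in order; return the ones that finished."""
    completed = []
    for task in tasks:
        try:
            arm.do(task)
        except KeyboardInterrupt:
            # Ctrl-C is the e-stop: stop the batch instead of moving on.
            print(f"stopped during: {task}")
            break
        completed.append(task)
    return completed


class FakeArm:
    """Stand-in with the same do() method, for trying the loop without hardware."""

    def do(self, instruction: str) -> None:
        print(f"[fake] {instruction}")


if __name__ == "__main__":
    run_tasks(FakeArm())
```

With a real arm, pass the object returned by dropbear.connect_so101() instead of FakeArm(), and reset the props between tasks.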

Troubleshooting

Symptom	Fix
`UDP blocked` from doctor	your network filters UDP/4433 — try a phone hotspot or a different network; corporate networks and some campus Wi-Fi block it
Camera not found / black preview	unplug-replug, close other apps using the camera (Zoom, browser tabs), re-run `uv run --package dropbear dropbear quickstart` to re-pick
Robot port not found / permission denied	check the USB cable seat; on Linux add yourself to the serial group: `sudo usermod -aG dialout $USER`, then log out and back in
Calibration mismatch error	run `uv run --package dropbear dropbear calibrate` — it migrates old LeRobot calibration formats too
"Starting GPU worker…" for more than ~4 min	the session will fail with a clear error and a retry offer; AWS GPU capacity can briefly run dry — retry in a few minutes or run `uv run --package dropbear --extra sim dropbear sim` meanwhile
Arm freezes mid-task	that is freeze-on-disconnect doing its job; the CLI explains and auto-reconnects, or prints the exact retry command
Motion is slower than usual	the status line says how many sessions share your GPU; it's a shared instance — try again later for solo speed
Cert-fingerprint mismatch error	do not reconnect; report it. This is the SDK refusing a worker it cannot verify
Arm dips toward the table	stop (spacebar), re-check camera placement against the reference image — a mis-framed camera is the usual cause
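For the "Robot port not found / permission denied" row on Linux, the following sketch separates the two causes. It is a diagnostic aid, not part of the SDK. The /dev/ttyACM* and /dev/ttyUSB* patterns are an assumption about how the motor bus enumerates, and the group name dialout comes from the fix above. The group check reads the current login session, which is why the fix only takes effect after logging out and back in.

```python
import glob
import grp
import os


def candidate_ports() -> list[str]:
    """List serial devices that a USB motor bus commonly appears as on Linux."""
    return sorted(glob.glob("/dev/ttyACM*") + glob.glob("/dev/ttyUSB*"))


def session_in_group(group_name: str = "dialout") -> bool:
    """True if the current login session already carries the group."""
    try:
        gid = grp.getgrnam(group_name).gr_gid
    except KeyError:
        return False  # the group does not exist on this system
    return gid == os.getegid() or gid in os.getgroups()


def diagnose(port: str) -> str:
    """Explain in one line why a serial port may be unusable."""
    if not os.path.exists(port):
        return "not found: check the USB cable"
    if os.access(port, os.R_OK | os.W_OK):
        return "ok"
    if session_in_group():
        return "permission denied even though this session is in dialout"
    return "permission denied: add yourself to dialout, then log out and back in"


if __name__ == "__main__":
    for port in candidate_ports() or ["/dev/ttyACM0"]:
        print(port, "->", diagnose(port))
```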

What "good" looks like

On a cheap webcam with this recipe, the bar we hold ourselves to (and measure publicly) is the Act-0 protocol: 10 trials per task, with two of the three tasks succeeding at 6/10 and 3/10 or better respectively, and zero unsafe motions. See mvp/docs/act0-protocol.md in the repo for the exact protocol.
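If you log your own trials, the bar can be checked mechanically. The sketch below encodes one reading of the sentence above: the best task reaches at least 6/10, the second-best at least 3/10, and no unsafe motion occurred. That reading is an assumption, and the function and task labels are hypothetical; mvp/docs/act0-protocol.md remains the authority.

```python
TRIALS_PER_TASK = 10  # from the protocol summary above


def passes_act0(successes: dict[str, int], unsafe_motions: int) -> bool:
    """One reading of the bar: best task >= 6/10, second-best >= 3/10, no unsafe motion."""
    if any(not 0 <= count <= TRIALS_PER_TASK for count in successes.values()):
        raise ValueError("success counts must be between 0 and 10")
    if unsafe_motions != 0:
        return False  # a single unsafe motion fails the run regardless of successes
    ranked = sorted(successes.values(), reverse=True)
    return len(ranked) >= 2 and ranked[0] >= 6 and ranked[1] >= 3
```

For example, passes_act0({"bowl": 7, "plate": 3, "push": 0}, unsafe_motions=0) returns True, while the same counts with one unsafe motion return False.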