The Golden Path: your SO-101, driven by a cloud robot brain
This is the recipe. Follow it top to bottom and a cloud-hosted MolmoAct2 model will drive your real SO-101 through a pick-and-place — smoothly, safely, in under 15 minutes including the GPU worker's cold start. No robotics PhD required.
No arm yet? From a source checkout, run cd mvp and then
uv run --package dropbear --extra sim dropbear sim (Linux or WSL2 only; on macOS
Apple Silicon the upstream LIBERO/robosuite stack currently crashes at env creation,
a known limitation verified 2026-06-12). You get the same brain driving a
simulated LIBERO arm, with nothing to plug in.
What you need
| Item | Notes |
|---|---|
| SO-101 arm | assembled, motors connected, USB to your computer |
| A commodity webcam | any 1080p USB webcam or your laptop's built-in camera. NOT a RealSense or other depth camera — the model wants plain RGB, and the golden path is tuned and evaluated on cheap cameras |
| A computer | macOS or Linux, Python 3.12 (uv uses the workspace pin below), normal home network |
| A table | plain, uncluttered surface; a light solid-colour tablecloth or desk mat helps |
| Props | a red cube, a blue cube, a yellow cube (~3–4 cm sides), a bowl, a plate, masking tape |
| 15 minutes | including the GPU waking up |
1. Sign up and create an API key
- Sign in at dropbear.dreamtimelabs.ai (GitHub or Google).
- Dashboard → API keys → Create key. Copy it now; it is shown once.
2. Run from source
Until the first PyPI publish, use a checkout of this repository instead of package install commands. From the checkout root:
cd mvp
uv run --package dropbear dropbear quickstart
For simulation, include the workspace sim extra:
cd mvp
uv run --package dropbear --extra sim dropbear sim
The quickstart command asks for at most three things: paste your API key, confirm the
auto-detected robot port, pick your camera from a live preview. It saves everything to
~/.dropbear/config.toml, then pre-warms your GPU worker — the cloud brain boots while
you set up the scene below. Until PyPI packages are published, run SDK commands from
mvp with the same uv run --package dropbear dropbear ... prefix.
Once PyPI packages are published, the hardware path can move to a normal dropbear
console script. Sim users will need the published dropbear[sim] extra; that package
path is not available yet.
3. Place the camera (the part most people get wrong)
The model was trained looking at the arm from the side. Reproduce that view with a third-person side camera:
        TOP VIEW              SIDE VIEW (what the camera sees)
┌────────────────────────┐
│  workspace    ███ arm  │    ┌──────────────────────────┐
│  (props)      ███ base │    │      ▒    arm reaching   │
│     ◄── 25 cm ──►      │    │    ▒▒▒▒▒  over workspace │
└────────────────────────┘    │    █████  props in front │
            ▲                 │ ────────────── table edge│
            │ 60–80 cm        └──────────────────────────┘
            │
        [webcam]  ← 20–30 cm above the tabletop, level-ish, slight downward tilt
- Position: 60–80 cm from the arm, at 90° to the arm's forward direction (side-on), lens 20–30 cm above the tabletop, tilted slightly down.
- Framing: the whole arm AND the whole workspace are visible; the arm sits in the left or right third, props in the middle; nothing important touches the frame edge.
- Lighting: even, diffuse room light. No window or lamp behind the scene (backlight), no hard shadows across the workspace.
- Lock white balance and exposure. Auto-WB and auto-exposure drift mid-task and confuse the model. On Linux: v4l2-ctl -c white_balance_automatic=0 -c auto_exposure=1 (auto_exposure=1 selects manual mode; control names vary by driver and kernel version, so list yours with v4l2-ctl --list-ctrls). On macOS, use your camera utility's manual mode. If you can't lock them, keep the lighting constant and avoid wearing or moving bright colours near the frame.
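On Linux the same lock can be scripted so it is easy to reapply after the camera is replugged. This sketch wraps the v4l2-ctl command from the bullet above. The device path /dev/video0 and the helper names build_lock_command and lock_camera are assumptions for illustration, and the control names may differ on your driver.

```python
import shutil
import subprocess


def build_lock_command(device: str = "/dev/video0") -> list[str]:
    """Build the v4l2-ctl invocation that turns off auto white balance and auto exposure."""
    return [
        "v4l2-ctl",
        "-d", device,                       # which camera to configure (assumed path)
        "-c", "white_balance_automatic=0",  # 0 = auto white balance off
        "-c", "auto_exposure=1",            # 1 = manual exposure mode
    ]


def lock_camera(device: str = "/dev/video0") -> bool:
    """Apply the lock; return False if v4l2-ctl is missing or exits with an error."""
    if shutil.which("v4l2-ctl") is None:
        return False
    result = subprocess.run(build_lock_command(device), capture_output=True, text=True)
    return result.returncode == 0
```

Calling lock_camera() before each session is cheap, and a False result tells you to fall back to constant lighting.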
A labelled reference diagram ships with the SDK at
dropbear/assets/camera_reference.svg — open it whenever you need the exact layout.
uv run --package dropbear dropbear doctor shows a photographic version of this same
setup side-by-side with your live frame and asks "does your frame look like this?" before
any motion.
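If you measure your setup with a tape measure, the position numbers from the bullets above can be turned into a quick sanity check. This is a sketch, not something the SDK ships: CameraPlacement and placement_problems are hypothetical names, and the 15 degree angle tolerance is an assumption, since the guide only says "90°".

```python
from dataclasses import dataclass


@dataclass
class CameraPlacement:
    distance_cm: float  # camera to arm, measured along the tabletop
    height_cm: float    # lens height above the tabletop
    angle_deg: float    # angle between the camera axis and the arm's forward direction


def placement_problems(p: CameraPlacement, angle_tolerance_deg: float = 15.0) -> list[str]:
    """Return human-readable problems; an empty list means the placement matches the guide."""
    problems = []
    if not 60 <= p.distance_cm <= 80:
        problems.append(f"distance {p.distance_cm} cm is outside 60-80 cm")
    if not 20 <= p.height_cm <= 30:
        problems.append(f"lens height {p.height_cm} cm is outside 20-30 cm")
    # The guide asks for 90 degrees (side-on); the tolerance is an assumption, not a spec.
    if abs(p.angle_deg - 90) > angle_tolerance_deg:
        problems.append(f"angle {p.angle_deg} deg is not side-on (90 deg)")
    return problems
```

For example, placement_problems(CameraPlacement(70, 25, 90)) returns an empty list, while a camera 40 cm away produces one distance problem. It does not replace the visual comparison in dropbear doctor, which also covers framing and lighting.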
4. Calibrate
If your arm has never been calibrated (or dropbear doctor flags stale calibration), run:
uv run --package dropbear dropbear calibrate
It is a guided wrapper around LeRobot's SO-101 calibration: move each joint to the positions it shows, press enter, done. Calibration is the #1 first-run blocker — do not skip it when offered.
5. Run it
uv run --package dropbear dropbear doctor # runs automatically before motion, too
uv run --package dropbear dropbear demo pick-place
The first demo run is a dry run: the arm does not move; you get a live "would move"
readout instead. When it completes, it asks "Enable real motion? [y/N]" — answer y and
it runs for real, reusing the pre-warmed worker. If the worker is still waking you'll see
"Starting GPU worker… ~2–4 min" with live progress; that wait is already inside the
15-minute budget.
Safety is code, not vibes: per-tick joint-step clamping, spacebar/Ctrl-C e-stop, and freeze-on-disconnect are always on.
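To make "per-tick joint-step clamping" concrete: on every control tick, each joint may move toward its commanded target by no more than a fixed step. The sketch below illustrates the general technique only. It is not the SDK's implementation, and the function name, units, and max_step value are assumptions.

```python
def clamp_joint_steps(current: list[float], target: list[float], max_step: float) -> list[float]:
    """Move each joint toward its target by at most max_step per control tick."""
    if len(current) != len(target):
        raise ValueError("current and target must have the same number of joints")
    if max_step <= 0:
        raise ValueError("max_step must be positive")
    clamped = []
    for cur, tgt in zip(current, target):
        delta = tgt - cur
        # Limit the per-tick change to the range [-max_step, +max_step].
        delta = max(-max_step, min(max_step, delta))
        clamped.append(cur + delta)
    return clamped
```

With a 5-unit limit, clamp_joint_steps([0.0, 10.0], [20.0, 9.0], 5.0) gives [5.0, 9.0]: the first joint is held to one step, and the second reaches its target because it was already within range. A wild command from the model can therefore only ever produce a small motion per tick.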
6. The curated tasks
Start with these three — they are the tasks the golden path is evaluated on (high-contrast props, simple goals):
- pick up the red cube and place it in the bowl
- pick up the blue cube and place it on the plate
- push the yellow cube onto the taped square
For the third task, mark a ~10×10 cm square on the table with masking tape. Place props ~10–20 cm apart, none touching, all inside the camera frame.
Start Python from the source checkout:
cd mvp
uv run --package dropbear python
Then:
import dropbear
arm = dropbear.connect_so101()
arm.do("pick up the red cube and place it in the bowl")
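To run the three curated tasks back to back, a small loop around arm.do is enough. This is a sketch under one stated assumption: this guide does not say what arm.do returns or how it reports failure, so the helper below assumes a failed task raises an exception. run_tasks and CURATED_TASKS are hypothetical names, not SDK API.

```python
CURATED_TASKS = [
    "pick up the red cube and place it in the bowl",
    "pick up the blue cube and place it on the plate",
    "push the yellow cube onto the taped square",
]


def run_tasks(arm, tasks=CURATED_TASKS) -> dict[str, bool]:
    """Run each instruction in order; stop at the first failure so the scene can be reset."""
    results = {}
    for task in tasks:
        try:
            arm.do(task)
        except Exception as exc:  # assumption: a failed task raises
            print(f"{task!r} failed: {exc}")
            results[task] = False
            break
        results[task] = True
    return results
```

From the Python session above you would call run_tasks(dropbear.connect_so101()). Stopping at the first failure is deliberate: a dropped cube usually means the scene needs resetting before the next task.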
Troubleshooting
| Symptom | Fix |
|---|---|
| UDP blocked from doctor | your network filters UDP/4433; try a phone hotspot or a different network. Corporate networks and some campus Wi-Fi block it |
| Camera not found / black preview | unplug-replug, close other apps using the camera (Zoom, browser tabs), re-run uv run --package dropbear dropbear quickstart to re-pick |
| Robot port not found / permission denied | check the USB cable seat; on Linux add yourself to the serial group: sudo usermod -aG dialout $USER, then log out and back in |
| Calibration mismatch error | run uv run --package dropbear dropbear calibrate — it migrates old LeRobot calibration formats too |
| "Starting GPU worker…" for more than ~4 min | the session will fail with a clear error and a retry offer; AWS GPU capacity can briefly run dry — retry in a few minutes or run uv run --package dropbear --extra sim dropbear sim meanwhile |
| Arm freezes mid-task | that is freeze-on-disconnect doing its job; the CLI explains and auto-reconnects, or prints the exact retry command |
| Motion is slower than usual | the status line says how many sessions share your GPU; it's a shared instance — try again later for solo speed |
| Cert-fingerprint mismatch error | do not reconnect; report it. This is the SDK refusing a worker it cannot verify |
| Arm dips toward the table | stop (spacebar), re-check camera placement against the reference image — a mis-framed camera is the usual cause |
What "good" looks like
On a cheap webcam with this recipe, the bar we hold ourselves to (and measure publicly)
is the Act-0 protocol: 10 trials per task, two of the three tasks succeeding at 6/10 and
3/10 or better, zero unsafe motions. See mvp/docs/act0-protocol.md in the repo for the
exact protocol.
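If you log your own trials, the bar can be checked with a few lines of arithmetic. The sketch below encodes one reading of the sentence above, namely that the best task reaches at least 6/10, the second best reaches at least 3/10, and no unsafe motions occur. That reading is an assumption, so treat mvp/docs/act0-protocol.md as authoritative. meets_act0_bar is a hypothetical helper name.

```python
def meets_act0_bar(successes: dict[str, int], unsafe_motions: int, trials: int = 10) -> bool:
    """Check results against one reading of the bar (see the assumption in the text)."""
    if len(successes) != 3:
        raise ValueError("expected results for exactly three tasks")
    if any(not 0 <= s <= trials for s in successes.values()):
        raise ValueError("success counts must be between 0 and the number of trials")
    if unsafe_motions != 0:
        return False  # any unsafe motion fails the bar regardless of success counts
    best, second, _ = sorted(successes.values(), reverse=True)
    # Assumed thresholds: best task >= 6 successes, second best >= 3.
    return best >= 6 and second >= 3
```

Under this reading, scores of 6, 3 and 0 with zero unsafe motions pass, while 10, 2 and 2 do not, because only one task clears a threshold.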