A Robot That Finds Waldo

Written by
Matt Reed
Updated on
May 29, 2024 6:34 PM
Finding Waldos with Google AutoML Vision.

Oh, hey. We built a little robot called "There's Waldo" to test the capabilities of Google's AutoML Vision service. We've found that new technologies can feel unapproachable (and, by extension, irrelevant) to many people. That's why we learn ahead of the curve and show our work in fun ways that demonstrate what's possible.

There's Waldo is a robot built to find Waldo and point at him. The robot arm, a uArm Metal, is controlled by a Raspberry Pi using the pyuarm Python library. Once initialized, the arm is instructed to extend and take a photo of the canvas below it. The robot then uses OpenCV to find and extract faces from the photo. The faces are sent to the Google AutoML Vision service, which compares each one against the trained Waldo model. If a match with a confidence of 95% (0.95) or higher is found, the arm is instructed to extend to the coordinates of the matching face and point at it. If there are multiple Waldos in a photo, the robot will point to each one.
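The matching step described above can be sketched in a few lines of Python. This is a minimal illustration, not the robot's actual code: `Face`, `find_waldos`, and `score_face` are hypothetical names, the face boxes stand in for whatever OpenCV detects, and the scoring function stands in for the confidence returned by the AutoML Vision model. Only the 0.95 threshold comes from the description above.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# From the article: a match needs a confidence of 95% (0.95) or higher.
CONFIDENCE_THRESHOLD = 0.95


@dataclass(frozen=True)
class Face:
    """Bounding box of a detected face, in photo pixel coordinates (hypothetical type)."""
    x: int
    y: int
    w: int
    h: int

    def center(self) -> Tuple[float, float]:
        # The point the arm would aim at: the middle of the bounding box.
        return (self.x + self.w / 2, self.y + self.h / 2)


def find_waldos(
    faces: List[Face],
    score_face: Callable[[Face], float],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> List[Tuple[float, float]]:
    """Return the center of every face whose Waldo score meets the threshold.

    `score_face` is an assumption: a stand-in for a call to the trained model
    that returns a confidence between 0 and 1 for one face.
    """
    targets = []
    for face in faces:
        if score_face(face) >= threshold:
            targets.append(face.center())
    return targets


# Usage with made-up detections and scores (no camera or cloud call needed).
faces = [Face(10, 20, 40, 40), Face(200, 80, 50, 50), Face(300, 300, 20, 20)]
fake_scores = {faces[0]: 0.40, faces[1]: 0.97, faces[2]: 0.95}
targets = find_waldos(faces, lambda face: fake_scores[face])
print(targets)  # [(225.0, 105.0), (310.0, 310.0)]
```

Because every face at or above the threshold is returned, a photo with several Waldos yields several targets, and the arm can visit them one at a time. Turning these pixel coordinates into arm positions would need a calibration step that this sketch leaves out.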

While only a prototype, There's Waldo has pointed out a match in as little as 4.45 seconds, which is better than most 5-year-olds. Here's a look at There's Waldo in action:
