It's Friday, so let's have some fun and once again push the boundaries of AI.
While waiting for a Qlik Answers knowledge base to index a few documents, I decided to test ChatGPT's ability to separate signal from noise and recognize patterns. I thought Waldo (or Wally in the UK, Brazil, and many other countries), the famous character from the puzzle book series created by British illustrator Martin Handford in 1987, would be a great challenge.
I also wanted to use this exercise to test the famous Moravec's Paradox, which holds that tasks easy for an average human, like communication, perception, and motor skills, are very hard for machines, while tasks hard for humans, like playing chess or writing good Python code, are easy for them. Hans Moravec presented this idea in the 1980s, decades before the "Transformers revolution" (insert an Optimus Prime joke here) of 2017.
But every rule has its exception. Could this be one? If Waldo puzzles are hard for humans (if they were easy, would people still be hooked after more than 30 years?), then they should be easy for a machine, right? Let's put it to the test with ChatGPT-4o.
I started with the image above, asking ChatGPT to find Waldo and describe his location. Can you, human, find him? He's right in the middle, behind a red and white striped towel.
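(For anyone who'd rather reproduce the experiment programmatically than through the chat interface, here's a minimal sketch using the OpenAI Python SDK. The file name and prompt are my own illustrative stand-ins, not what I actually typed, and you'd need your own API key.)

```python
# Minimal sketch: ask GPT-4o to locate Waldo in a local image.
# Assumes the OpenAI Python SDK (pip install openai) and an
# OPENAI_API_KEY in the environment. "waldo_beach.jpg" is a
# hypothetical file name.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Encode the puzzle image as base64 so it can be sent inline.
with open("waldo_beach.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Find Waldo in this image and describe his location."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        }
    ],
)

print(response.choices[0].message.content)
```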
ChatGPT pointed to someone who looked very similar, behind the beach tent on the left. It really did look like Waldo, but the colors of the clothes were different. Perhaps the model's vision encoder mangled the color channels somewhere along the way, but who knows what goes on inside the black box? So, while ChatGPT was close, it was still wrong. Score: 1-0 to Waldo.
Now, on to the hardest of the three challenges. This one took me a few minutes to solve. Can you find Waldo? Leave your answer in the comments.
Once again, ChatGPT couldn't find him. This time, it got lost in the colors, pointing to a spot near the striped dinosaur on the bottom left—not even remotely close. Another win for Waldo: 2-0.
For the final test, I chose an image that's not as easy as the first but not as hard as the previous one, with fewer elements. Waldo is in the center-right. Can you see him? How about ChatGPT?
When I tried to help it, things got even worse.
In this case, ChatGPT hallucinated once again (though, to be fair, who wouldn't with these psychedelic images? lol). It claimed Waldo was in the center-top, where he clearly isn't; no one even remotely similar appears in that part of the image. Another win for Waldo, bringing the score to 3-0, just like when Brazil plays Argentina in soccer (and to be clear, Brazil is Waldo).
All jokes aside, GPT-4o is a highly advanced generative AI capable of many things, but it still couldn't solve even a small sample of Waldo puzzles. Does this mean Moravec's Paradox is finally wrong? Only if you consider, as I do, that finding Waldo is a hard task. Maybe it's actually an easy one, a matter of pure visual perception, which would explain exactly why it's so difficult for AI. Or perhaps this is simply one of the exceptions to the paradox.
Regardless, this highlights a significant limitation of the technology for other use cases. Can it find a hidden tumor in a CT scan? Can it detect a threat in airport security footage or flag an anomaly in a satellite photo? These are some of the scenarios I still need to test.
If you enjoyed this, please share it. If you'd like me to conduct more tests like this, leave a comment with suggestions. In the meantime, have a great weekend!