🚗🤖 The Car Wash Test: When AI Misses Common Sense

I tried the famous "Car Wash Test" that's been circulating everywhere.

I ran it with ChatGPT 5.2.
I didn't test other AIs, since I've already seen plenty of comparisons.

Here's the chat:
https://chatgpt.com/share/69913359-a480-8006-a3b7-4f4dfbe5e2c7

What is the Car Wash Test?

It's a simple common-sense and causal reasoning test.

Miss one basic validation, and instead of solving the problem… you make it worse.

Because really...
What are we doing at the car wash without the car? 🤔

[Illustration: a car wash without a car]

The lesson

AI is powerful. It can generate code, write essays, solve complex problems.
But it can also confidently execute a plan that's fundamentally flawed.

Missing basic validations is not just an AI problem. It's a human problem too.
We get so focused on the solution that we forget to verify the premise.

  • Did we bring the car?
  • Did we check the requirements?
  • Are we solving the right problem?

Next time you're debugging, deploying, or designing something:
Don't skip the obvious checks.

Sometimes the biggest bug is the one you never thought to look for.
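
To make the checklist concrete, here is a minimal Python sketch of verifying premises before running a plan. Everything in it is an assumption made for illustration: the names `failed_premises`, `run_plan`, and `CAR_WASH_PREMISES`, and the context keys `car_present`, `requirements_checked`, and `goal`, are hypothetical and not part of any test or tool mentioned in this post.

```python
from typing import Callable

# A premise pairs a human-readable question with a check.
# The check returns True when the premise holds for the given context.
Premise = tuple[str, Callable[[dict], bool]]

# Hypothetical premises mirroring the three questions in the list above.
CAR_WASH_PREMISES: list[Premise] = [
    ("Did we bring the car?",
     lambda ctx: ctx.get("car_present", False)),
    ("Did we check the requirements?",
     lambda ctx: ctx.get("requirements_checked", False)),
    ("Are we solving the right problem?",
     lambda ctx: ctx.get("goal") == "wash the car"),
]


def failed_premises(context: dict, premises: list[Premise]) -> list[str]:
    """Return the questions whose checks do not hold for this context."""
    return [question for question, check in premises if not check(context)]


def run_plan(context: dict) -> str:
    """Refuse to proceed until every premise has been verified."""
    failed = failed_premises(context, CAR_WASH_PREMISES)
    if failed:
        return "Stop. Unverified premises: " + "; ".join(failed)
    return "Premises verified. Proceeding with the plan."


if __name__ == "__main__":
    # The car was left at home, so the plan should be stopped.
    print(run_plan({"car_present": False,
                    "requirements_checked": True,
                    "goal": "wash the car"}))
```

The point of the design is that the checks run before any work starts, and a missing key counts as a failed premise rather than a silent pass.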

Going one step further

After discussing this with the Latino .NET community and others, I noticed something interesting.

  • Reasoning models help when they explicitly step back and re-evaluate assumptions.
  • Claude (without reasoning) showed some flakiness: sometimes it caught the issue, sometimes it didn't.
  • Grok used more reasoning tokens but still failed. More tokens don't fix a flawed premise.
  • A structured prompt (clear role, objective, and goal) dramatically reduces failure rates.

The key insight: If the premise is wrong, better reasoning won't save the outcome.

In many cases, the real issue is not the model. It's how we frame the question.
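
As a rough sketch of what a structured prompt with a clear role, objective, and goal might look like, the Python helper below assembles one as plain text. The function name `build_structured_prompt`, the field labels, and the sample question are all hypothetical; the sample question is not the wording used in the linked chat, and the extra line asking the model to list its assumptions is my own addition, based on the observation above about re-evaluating assumptions.

```python
def build_structured_prompt(role: str, objective: str, goal: str, question: str) -> str:
    """Assemble a prompt with an explicit role, objective, and goal.

    The instruction to list assumptions is an illustrative addition,
    not a documented requirement of any particular model.
    """
    sections = [
        f"Role: {role}",
        f"Objective: {objective}",
        f"Goal: {goal}",
        "Before answering, list the assumptions the question depends on "
        "and confirm each one holds.",
        f"Question: {question}",
    ]
    return "\n".join(sections)


if __name__ == "__main__":
    # Hypothetical example values, chosen only to show the shape of the prompt.
    print(build_structured_prompt(
        role="You are a practical assistant who checks premises first.",
        objective="Help me decide how to get to the car wash.",
        goal="My car ends up clean.",
        question="I need to get my car washed. How should I get to the car wash?",
    ))
```

Stating the goal separately from the question is what makes the hidden premise visible: the car has to be at the car wash for the goal to be met.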

This idea has been widely discussed. One reference I had already seen is this article by Leonard Ng.

#AI #LLM #ChatGPT #CommonSense #Reasoning #Testing