Here's What Happens When AI Takes a Classic Psychology Test (And Why It's Kind of Embarrassing)

2026-06-10T14:11:17.695469+00:00

Okay, I have to share this with you because it's one of those findings that made me genuinely laugh out loud.

So there's this classic psychology test called the Stroop task. Even if the name doesn't ring a bell, there's a decent chance you've tried some version of it. Here's how it works: someone shows you the word "red" written in blue ink. Your job is to say the color of the ink, not read the word. Sounds easy, right?

Here's the thing: your brain really wants to read that word. Reading is automatic for most of us. So you have to actively fight against your own brain to focus on the ink color instead. Psychologists use this test to measure something called "executive control" — basically your brain's ability to stay focused, resist distractions, and not get derailed by competing information.

Simple. Except it's actually surprisingly hard.

But here's where it gets interesting. Researchers decided to give this same test to several leading AI systems — you know, the fancy language models powering chatbots and all that jazz. And the results? Let's just say our robot friends didn't exactly impress.

When the lists were short — just five color words — AI did okay. GPT-4o managed about 91% accuracy. Not bad, right?

But then things got longer. Ten words? Accuracy dropped to 57%. Forty words? A measly 15%.

That's not a gradual decline. That's a collapse.

Now, here's what really got me. Humans feel the exact same pull. Reading a word is faster and more automatic for us than naming a color. But we can still mostly keep our focus even when the lists get long and confusing. Our performance stays pretty stable.

AI? Not so much. When researchers mixed matching and mismatched words together in the same list, AI accuracy for the mismatched items basically fell to zero in some cases. Zero. That's not even "pretty bad." That's "forgot how to do the thing entirely."

The researchers noticed something telling: instead of following the instruction to identify ink colors, AI models kept defaulting back to reading the words. In other words, they couldn't suppress their most natural response — even when they were explicitly told not to do that thing.

And honestly? I think this is kind of wonderful.

Not because AI failed (though that's a little bit wonderful from a "science is cool" perspective), but because it reminds us that AI isn't actually thinking like we do. It's pattern-matching on a massive scale, which is genuinely impressive and useful. But "staying focused when things get messy"? That's apparently not in its wheelhouse yet.

The human brain has evolved over millions of years to filter out distractions, prioritize information, and keep our eyes on the prize even when competing signals are screaming at us. AI systems are incredibly sophisticated, but they're built completely differently. They can write poetry and debug code and explain quantum physics, but ask them to ignore a written word and focus on a color, and suddenly it's like asking a dog to fetch a specific shoe.

This doesn't mean AI is useless or that these systems aren't remarkable. They're absolutely remarkable. But it does mean we should probably cool it with the "AI is going to take over everything" panic. Right now, at least, our squishy human brains still have some tricks up their sleeves that these models haven't matched.

And personally, I find that oddly comforting. The next time you zone in on a task while your phone buzzes, your inbox fills up, and your brain keeps throwing random thoughts at you — give yourself a little credit. You're doing something that AI currently can't.

That's kind of a big deal.

Source: ScienceDaily

#artificial intelligence #psychology #brain science #ai research #cognitive science #human vs machine #attention research