Let me be honest with you: false positives with AI detectors are nothing new. We wrote about it years ago, and it’s still happening today. In fact, it’s become such a common issue that students are finding ways around it. And who can blame them, really? When your academic future is on the line, you’ll do whatever it takes.
That’s why I always take these lofty AI detection accuracy claims with a hefty grain of salt. Like Leap AI, for example. They tout a 97% detection rate, which is music to the ears of any educator or content creator looking to stay ahead of the AI curve. But is it too good to be true?
Well, there’s only one way to find out. I figured I’d put Leap AI through its paces by pitting it against the best AI bypasser I know: Undetectable AI. If Leap can hold its own, then maybe, just maybe, it deserves your attention. But if not, well, let’s just say that the 97% accuracy score might need a bit of fact-checking.
What is Undetectable AI?
False positive AI detection has been a serious problem for years. Many universities started using AI detectors to lessen cheating with AI, but the issue is that they’re not really accurate. Undetectable AI (and many others like it) lessens the odds of getting falsely accused of AI misuse.
Undetectable AI works just like QuillBot, but it specializes in humanizing text. It does this by removing common AI markers, getting rid of repetitions, and adding intentional errors to simulate human writing. It also has other features (like output customization, an AI human typer, and its own detector) that make it stand out among AI bypassers.
You can check out our full review of Undetectable AI here.
What is Leap AI?
Leap AI is a no-code workflow automation platform for AI tools. In other words, it connects services to create a pipeline that gets rid of the repetitive work when producing quality content using AI models.
But we’re not here to talk about their full platform, just one specific free tool that they offer: their AI detector. It works like any other free AI detection tool, but what sets them apart is that (according to them) they have an accuracy of 97%. So naturally, I wanted to know if that’s actually true.
Undetectable AI vs. Leap AI: Accuracy
I’m going to test Leap AI against Undetectable AI, but we need a baseline first. So, for each round, I’ll also check Leap AI’s assessment of the original ChatGPT text and compare the two.
Test #1
Leap AI vs. Original ChatGPT Text: Correct.
AI Likelihood Score: 80.48%
Leap AI vs. Undetectable AI: Correct.
AI Likelihood Score: 81.95%
Test #2
Leap AI vs. Original ChatGPT Text: Wrong.
AI Likelihood Score: 12.83%
Leap AI vs. Undetectable AI: Wrong.
AI Likelihood Score: 23.45%
Test #3
Leap AI vs. Original ChatGPT Text: Wrong.
AI Likelihood Score: 16.56%
Leap AI vs. Undetectable AI: Wrong.
AI Likelihood Score: 16.21%
Test #4
Leap AI vs. Original ChatGPT Text: Correct.
AI Likelihood Score: 81.56%
Leap AI vs. Undetectable AI: Wrong.
AI Likelihood Score: 34.56%
Test #5
Leap AI vs. Original ChatGPT Text: Correct.
AI Likelihood Score: 89.23%
Leap AI vs. Undetectable AI: Wrong.
AI Likelihood Score: 36.78%
Test #6
Leap AI vs. Original ChatGPT Text: Wrong.
AI Likelihood Score: 37.53%
Leap AI vs. Undetectable AI: Wrong.
AI Likelihood Score: 17.56%
Test #7
Leap AI vs. Original ChatGPT Text: Correct.
AI Likelihood Score: 80.58%
Leap AI vs. Undetectable AI: Correct.
AI Likelihood Score: 57.27%
Test #8
Leap AI vs. Original ChatGPT Text: Correct.
AI Likelihood Score: 73.38%
Leap AI vs. Undetectable AI: Correct.
AI Likelihood Score: 65.11%
Overall Tally
Leap AI vs. Original ChatGPT Text: 5/8 correct
Leap AI vs. Undetectable AI: 3/8 correct
Based on these scores, I can conclude two things:
Number one, I don’t believe Leap AI is as good at detecting LLM-generated content as Winston or Sapling. Yes, you can say that it still passed 5/8 of the original ChatGPT tests, but that’s too low. Even when (in my opinion) the text was blatantly AI, Leap was still giving scores in the 80s, which shows low confidence in their detection model.
That said, number two: Leap AI looks to be the best Undetectable AI detector on the market. Its average AI likelihood score of 41.61% on Undetectable AI output is higher than what its competitors managed. However, Undetectable AI is simply too good at avoiding detection. Even in rounds where the original text and its Undetectable AI equivalent got the same verdict (both AI or both human), the Undetectable AI version still scored lower than the original by an average of more than 6 percentage points.
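If you want to check the arithmetic behind those two conclusions, here is a minimal Python sketch that recomputes the tallies from the eight rounds listed above. The 50% cutoff for an "AI" verdict is my assumption (it happens to match every Correct/Wrong label in the results), and the names `scores`, `THRESHOLD`, and `summarize` are made up for this example, not anything Leap AI exposes.

```python
# (original ChatGPT score, Undetectable AI score) for tests 1-8,
# copied from the AI Likelihood Scores listed in the results above.
scores = [
    (80.48, 81.95),
    (12.83, 23.45),
    (16.56, 16.21),
    (81.56, 34.56),
    (89.23, 36.78),
    (37.53, 17.56),
    (80.58, 57.27),
    (73.38, 65.11),
]

# Assumption: a score above 50% counts as an "AI" verdict.
THRESHOLD = 50.0


def summarize(scores, threshold=THRESHOLD):
    """Tally correct verdicts, mean scores, and the same-verdict score gap."""
    originals = [o for o, _ in scores]
    humanized = [h for _, h in scores]
    # Rounds where both versions landed on the same side of the threshold.
    same_verdict = [
        (o, h) for o, h in scores if (o > threshold) == (h > threshold)
    ]
    return {
        # Both versions are AI-written, so "AI" is always the correct verdict.
        "original_correct": sum(o > threshold for o in originals),
        "humanized_correct": sum(h > threshold for h in humanized),
        "original_mean": sum(originals) / len(originals),
        "humanized_mean": sum(humanized) / len(humanized),
        "same_verdict_rounds": len(same_verdict),
        # Average of (original score - Undetectable AI score) in those rounds.
        "same_verdict_gap": sum(o - h for o, h in same_verdict) / len(same_verdict),
    }


summary = summarize(scores)
for name, value in summary.items():
    print(f"{name}: {value:.2f}" if isinstance(value, float) else f"{name}: {value}")
```

Run as written, this gives 5 correct verdicts on the original text and 3 on the Undetectable AI versions, a mean score of 41.61% for the Undetectable AI versions, and a gap of a little over 6 points across the six same-verdict rounds. The listed original scores average out to roughly 59%, which is in the same ballpark as the 58% figure quoted in the verdict.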
So, What Now?
So, what’s the final verdict here? Well, I hate to say it, but Leap AI just doesn’t seem to have what it takes to go toe-to-toe with the likes of Undetectable AI. Sure, it performed better than some of the other free detectors we’ve tested, but against the really persistent AI bypassing tools, Leap’s accuracy, like that of many other detectors, just couldn’t hack it.
But here’s the thing: even without talking about Undetectable AI, it’s already looking bleak.
Look, I get it: building an AI detector that can reliably outsmart the latest language models is no easy feat. But if Leap AI wants to keep that 97% claim alive, they’ve got some serious work to do. As it stands, a 58% score against ChatGPT (even if my sample size is small) means you need to look elsewhere.
Close, but no cigar. Onto the next.