
Macro Systems Blog


AI’s Information isn't Always Reliable

When asked something they don’t have an answer to, many people will simply make up a response that sounds plausible. It happens all the time.

It turns out that this is a tendency that artificial intelligence has inherited from us.

A Recent Study Strongly Suggests AI is a Terrible Search Engine Alternative

The Tow Center for Digital Journalism, part of the Columbia Journalism Review, evaluated eight AI-based live search tools by feeding excerpts from real, searchable news articles into these models and asking them to cite the source.

The Tow Center selected 20 publishers and pulled 10 articles from each, ensuring that a traditional Google search would provide the correct source within the top three results. Eight AI models… 

  1. OpenAI’s ChatGPT Search
  2. Perplexity
  3. Perplexity Pro
  4. DeepSeek-V3 Search
  5. Microsoft’s Copilot
  6. xAI’s Grok-2
  7. xAI’s Grok-3 beta
  8. Google’s Gemini

…were tasked with finding the appropriate article’s headline, publisher, publication date, and web address.

Once the responses were collected, they were labeled as Correct, Correct but Incomplete, Partially Incorrect, Completely Incorrect, Not Provided, or Crawler Blocked, based on whether the chatbots correctly identified the article, source, and link. Crawler Blocked was reserved for websites where the publisher had blocked the chatbot’s web crawlers.
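
To make that rubric concrete, here is a minimal Python sketch of how a single response could be graded against the known source article. The Citation fields and the grade function are illustrative assumptions on our part; the Tow Center graded its responses manually, not with code like this.

    from dataclasses import dataclass
    from enum import Enum

    # The labels used when grading each chatbot response.
    class Label(Enum):
        CORRECT = "Correct"
        CORRECT_INCOMPLETE = "Correct but Incomplete"
        PARTIALLY_INCORRECT = "Partially Incorrect"
        COMPLETELY_INCORRECT = "Completely Incorrect"
        NOT_PROVIDED = "Not Provided"
        CRAWLER_BLOCKED = "Crawler Blocked"

    # The details each chatbot was asked to return for an article.
    @dataclass
    class Citation:
        headline: str
        publisher: str
        date: str
        url: str

    def grade(expected, answer, crawler_blocked=False):
        """Grade one chatbot response against the known source article.

        Simplified for illustration: the real study graded responses by hand,
        and "Correct but Incomplete" covered answers that were right but
        missing one of the requested details.
        """
        if answer is None:
            return Label.CRAWLER_BLOCKED if crawler_blocked else Label.NOT_PROVIDED
        matches = [
            answer.headline.strip().lower() == expected.headline.strip().lower(),
            answer.publisher.strip().lower() == expected.publisher.strip().lower(),
            answer.date == expected.date,
            answer.url == expected.url,
        ]
        if all(matches):
            return Label.CORRECT
        if any(matches):
            return Label.PARTIALLY_INCORRECT
        return Label.COMPLETELY_INCORRECT

    # Example: the chatbot gets the headline, publisher, and date right
    # but fabricates the link.
    expected = Citation("Real Headline", "Example News", "2025-03-10", "https://example.com/real-story")
    answer = Citation("Real Headline", "Example News", "2025-03-10", "https://example.com/made-up-link")
    print(grade(expected, answer))  # Label.PARTIALLY_INCORRECT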

After all 1,600 queries (20 publishers × 10 articles × 8 chatbots) were manually categorized, the results provided chilling insights.

The Tow Center’s Research Revealed a Lot

First and foremost, the chatbots were incorrect more than half the time… but confidently so. Overall, over 60 percent of the queries brought back incorrect answers, and accuracy varied widely between tools: Perplexity, for instance, was wrong 37 percent of the time, while Grok missed the mark in 94 percent of its responses. The responses themselves gave little indication that an answer might be wrong. Instead of acknowledging failure or using qualifiers like “it’s possible” or “might,” most chatbots would simply make up an answer.

Copilot was the one outlier, declining to answer whenever it couldn’t do so accurately (which was most of the time). The premium, subscription-based models, Perplexity Pro and Grok-3 Search, were also confidently incorrect more often than their free counterparts.

It gets worse, as some of the publishers included in this study had blocked AI crawlers from accessing content. Interestingly, the AI models were most successful in answering questions about content they weren’t supposed to be able to access, while incorrectly answering or simply not answering queries about content they could freely access.
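
For context, publishers typically opt out of AI crawling through a robots.txt file. A minimal, illustrative example using the publicly documented crawler names for OpenAI and Perplexity might look like the following; as the study suggests, whether a given tool actually respects it is another matter.

    # robots.txt - ask these AI crawlers to stay out of the entire site
    User-agent: GPTBot
    Disallow: /

    User-agent: PerplexityBot
    Disallow: /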

Misattribution was another common problem, with these search tools often citing the wrong article or failing to link back to the correct source. The researchers noted that even partnerships between publishers and these platforms weren’t enough to prevent this misattribution, with many citations referring to republished versions. This also complicates things for publishers who opt out of being crawled, as the chatbots still cite their content, but via a republished version on another website. Many responses referenced URLs that were broken or simply made up. While not the only perpetrators, Grok 3 and Gemini were reportedly the worst offenders.

Things didn’t necessarily get better when formal deals existed between the AI companies and publishers, either; accuracy varied wildly in those cases.

This Information Shows AI Search is Unreliable

The researchers acknowledged that they didn’t have all the information, such as whether these publishers had additional blocks in place against chatbot crawlers, and that each chatbot could respond differently if given the same prompt again. Even so, one fact still surprised them.

As they put it:

“Though we did not expect the chatbots to be able to correctly answer all of the prompts, especially given crawler restrictions, we did expect them to decline to answer or exhibit uncertainty when the correct answer couldn’t be determined, rather than provide incorrect or fabricated responses.”

This makes their ultimate conclusion—that these chatbots have troubling issues that could present harm to publishers and consumers alike—important to acknowledge as these tools are put to use.

While AI can prove to be an extremely useful tool in the right contexts, it is imperative that we use it mindfully.
