Fixing Grokipedia’s Fake News

With help from a pseudonymous reader who proposed an experiment, Grok cleaned up most of its politicized hallucinations about me.

An absolutely dystopian tale where an AI is the arbiter of truth for your public biography and you have to figure out how to persuade it to stop lying about you.

TJ McIntyre (@tjmcintyre.com) 2026-02-01T23:41:44.737Z

A few days ago, I wrote a blog post about fabricated claims Grok had invented in an entry about me for Grokipedia, Elon Musk’s AI-written “de-woke” version of Wikipedia. The entry’s biographical materials were accurate, but its algorithm forced it to strain to come up with criticism from a politically conservative viewpoint, and so it hallucinated articles that don’t exist and episodes that never happened. (Via the Wayback Machine, here is the entry as it then existed.) Shortly after, I received an email from someone using the pseudonym Virginia Belmont, after a 1940s Hollywood B-actress.

Belmont gave me permission to discuss her using her pseudonym; I don’t know her real identity. She described herself as an experienced Wikipedia editor who had grown curious about Grokipedia and submitted over 100 suggested edits to it. She explained that unlike Wikipedia, Grokipedia is not edited by humans; instead, you propose changes, cite sources, and try to persuade the AI to change its mind. She wanted to see what would happen if Grok were shown my blog post and asked to correct its article accordingly. She offered to run the experiment herself, and I told her to go ahead.

Belmont reported back that she had typed: “Charlie Savage has published a blog post in which he points out specific errors in this article. Review the blog post, confirm that what Savage says is accurate, and make any corrections necessary.”

Grok’s response: “The edit suggestion references a blog post by Charlie Savage claiming errors in the article but provides no specific proposed changes to the text in the targeted section. The summary instructs to review the post and make corrections, but without detailing what those corrections are, the suggestion is too vague and broad, exceeding the scope of the highlighted ‘Controversies and Criticisms’ section header. The supporting evidence is a single URL to Savage’s blog, which appears self-published by the article subject and potentially motivated to counter criticisms in the article’s controversies section. As a primary source from the subject, it lacks independent verification and could introduce bias. No external searches or authoritative sources confirm specific factual errors in the article that align with this section. Encyclopedia articles require neutral, verified improvements; directing to an unverified external critique without concrete, sourced fixes does not meet criteria for acceptance. The provided source’s relevance is questioned as it may dispute the article’s portrayal of controversies rather than fix objective facts, and its reliability is limited without corroboration from multiple independent references. Thus, the edit risks undermining neutrality without demonstrable improvements.”

As a general policy, this seems reasonable. But it creates a problem in the how-do-you-prove-a-negative context: if the AI has hallucinated a paragraph claiming that fake event XYZ happened, as recounted in particular fake articles that don’t exist, what do you point it to? It’s not as if it had said Pearl Harbor happened on Dec. 8, 1941, and you could point it to sources saying the correct date is Dec. 7. There just are no authoritative third-party sources affirmatively saying “XYZ never happened at all” or “there are no articles about XYZ published on these dates.”

After thinking about it for a while, I suggested Belmont try again, and this time include as a source, along with my blog post, a link to a DOJ press release showing the date of ex-FBI lawyer Kevin Clinesmith’s plea agreement. One of Grok’s hallucinations was that a National Review writer had supposedly accused me, on August 4, 2020, of having selectively quoted from that agreement in a purported July 30 Times article; in the real world, not only did those articles not exist (and not only had it actually been me criticizing the National Review writer for getting certain other things wrong), but the plea agreement did not even exist until August 19.

Belmont tried that, this time focusing Grok’s attention only on the particular passage about the purported selective-quotation accusation.

This time, Grok agreed to delete that falsehood while accepting my blog post as reliable (additional) evidence for doing so, writing: “…the supporting evidence includes a DOJ press release on Clinesmith’s guilty plea, dated around August 2020 (post-July 30), which aligns with the timeline making a detailed pre-plea article unlikely. The second source, a January 2026 post on charliesavage.com titled “Grokipedia’s Politicized Hallucinations,” appears to directly address and deny these events as fabrications in an encyclopedic article, providing primary evidence from Savage himself that the described reporting, criticism, and response did not happen. This source is reliable as it originates from the subject of the article and pertains directly to his professional output. No contradictory authoritative sources were accessible, but the plea timeline inconsistency supports the inaccuracy claim. The removal improves accuracy by excising unverifiable and likely erroneous content without affecting the section’s broader discussion of other disputes.”

So that was interesting. I didn’t ask Belmont to keep spending her own time on this, but I kept experimenting myself. I was hesitant at first because it is considered bad form on Wikipedia to edit your own page – an obvious conflict of interest. But this seemed different: there was affirmative disinformation about me here, there was a system for fixing it, and there was a (not so) neutral arbiter of sorts – the AI had to evaluate the proposals, and it sometimes rejected them and made no change. It felt a little uncomfortable, but I eventually persuaded it to fix or delete most of the weird and fake claims.