Debunking AI-Generated Content and Fake Accounts

Update 2024-01-19: Updated the section at the end with more debunked content.

Not a day goes by without yet another announcement of some new AI tool or platform that creates content: text, images, audio, video, source code, and so on. New tools come and go. Some are here to stay; some are useful, some are questionable.
More annoying is the increasing volume of such content polluting social media and professional platforms like LinkedIn.

AI-generated image showing a printer throwing out posts.

I came across this posting, one random sample of many, talking about the most lucrative careers in the world. Let’s dissect this random piece of AI-created click-bait.

LinkedIn Posting 1
  • Off topic: it was posted in an Angular group. A reason to leave these groups if they are not moderated.
  • The author works for a marketing company, no surprise. The account’s only activities are postings that lead to similar articles.
  • The post is hosted on blogger.com, the free blogging service by Google/Alphabet, financed by Google AdSense advertising.
  • The same LinkedIn post is published by various other authors, all working for similar companies. Most likely these persons either do not exist and are just fake accounts, or they are low-paid workers from Mechanical Turk (Amazon’s crowdsourcing platform).
    You can’t mute these accounts fast enough, because new ones get created and start posting in the same groups. The only choice is to leave the group.
  • The image is an unrelated random lead picture, usually taken from free image repositories or stock images, or, more frequently now, AI-generated. In this case we see the Rose Bowl stadium during the 1994 World Cup final; the picture showed up on Reddit a few days earlier and cannot be found on TinEye.

…continued below.

LinkedIn Posting 2
LinkedIn Posting 3

The above links lead to the Blogspot post below.

Screenshot of the Blogger-hosted blog/post.

The whole blog contains nothing but these kinds of posts, with a lot of SEO-targeted labels to attract more clicks.

A tool detecting most of the text as AI-generated.

(Reference: https://gptzero.me/)
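Detectors like GPTZero rely on statistical signals such as perplexity and burstiness, i.e. the variation in sentence length and complexity. Human writing tends to mix short and long sentences, while LLM output is often more uniform. The following is only a toy illustration of the burstiness idea, not GPTZero’s actual scoring method:

```python
import re
import statistics

def burstiness(text: str) -> float:
    """Standard deviation of sentence lengths (in words).

    A crude burstiness heuristic: human prose tends to vary sentence
    length a lot (high score), machine-generated prose is often more
    uniform (low score). Not GPTZero's actual algorithm.
    """
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.stdev(lengths)

# Uniform, listicle-style text vs. varied human-style prose (made-up samples).
uniform = "The sky is blue. The sun is warm. The day is nice. The air is calm."
varied = "Wow. After hours of delays and endless rebooking queues, we finally boarded. Silence."

print(burstiness(uniform) < burstiness(varied))  # True: varied prose scores higher
```

Real detectors combine many such signals with a trained language model, which is also why different tools disagree so often on the same text.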

Conclusion: It is becoming harder to navigate the content landscape and to find and confirm human-created articles and images. Maybe social media and professional sites should introduce a verification mark for real content; I expect this feature soon. Or, in reverse, some platforms will ban AI content altogether.

What is going to happen in the future if less and less content is human-created? AI is not genuinely creative, and we will see similar content again and again.

Advertising Spam on LinkedIn

The second category of content that pollutes professional groups and makes it hard to find relevant content is marketing companies advertising aggressively for their expensive market reports.

A good sample is this one: the account is bombarding various aviation-related groups with advertising.

Located in the U.S., working for a company in Germany that does not exist.

The profile picture appears all over the internet on fashion-related websites.

The account’s activities are nothing but postings about reports on various fields in aviation and the military. They all link to expensive market or research reports.

Some of the research papers cover:
𝐀𝐯𝐢𝐚𝐭𝐢𝐨𝐧 𝐂𝐫𝐞𝐰 𝐌𝐚𝐧𝐚𝐠𝐞𝐦𝐞𝐧𝐭 𝐒𝐲𝐬𝐭𝐞𝐦 𝐈𝐧𝐝𝐮𝐬𝐭𝐫𝐲, 𝐆𝐥𝐨𝐛𝐚𝐥 𝐍𝐚𝐯𝐢𝐠𝐚𝐭𝐢𝐨𝐧 𝐒𝐚𝐭𝐞𝐥𝐥𝐢𝐭𝐞 𝐒𝐲𝐬𝐭𝐞𝐦 (𝐆𝐍𝐒𝐒) 𝐈𝐧𝐝𝐮𝐬𝐭𝐫𝐲, 𝐓𝐡𝐞 𝐅𝐮𝐭𝐮𝐫𝐞 𝐨𝐟 𝐂𝐚𝐫𝐠𝐨 𝐒𝐡𝐢𝐩𝐩𝐢𝐧𝐠 𝐈𝐧𝐝𝐮𝐬𝐭𝐫𝐲, 𝐀𝐢𝐫𝐜𝐫𝐚𝐟𝐭 𝐋𝐞𝐚𝐬𝐢𝐧𝐠 𝐈𝐧𝐝𝐮𝐬𝐭𝐫𝐲 and many more.

More Fake Content and Scam

Update 2024-01-19

LinkedIn post about a guide to turbofan engines. The text is 100% AI-generated and is click-bait leading to a page selling reports at ridiculous prices.

GenAI Imagery Increasing

No doubt, the generative AI space has been exploding since 2022. It left the research stage and became mainstream in 2023. It is getting harder and harder, at least for the human brain, to distinguish generated content (images, videos, text, audio) from genuinely created content nowadays. You won’t even know whether this blog post was created by me reflecting on the topic and actually writing the text, or by me throwing a text prompt at some text generation service and copying the result here.

It started with the first attempts to create images with Generative Adversarial Networks (GANs) in 2014 (Ian Goodfellow, University of Montreal). All of this became possible with the advent of deep learning and (convolutional) neural networks, utilizing massive GPU processing power and the vast amount of data available today. Most people do not understand what is going on under the hood or grasp the concept, unless they are really into computer science and AI. The results are not transparent at all; eventually this will improve due to regulation and legislation (to be proven!).

I enjoy using the tools, and a couple of my blog posts talk about GenAI or about using the tools for content creation, mostly images created with DALL-E, Midjourney, or a local Stable Diffusion installation.
I believe I can (still?) distinguish and spot generated images, especially the ones used unaltered straight out of the image generator, mostly created with simple text prompts. More and more images in my LinkedIn stream are artificially created. No problem with that if an image serves a decorative purpose as a design element; the problem starts when the image is supposed to document facts, actual events, or people.

A few hints on how to spot these images, using an actual, more or less random, website as an example. I am not picking on the content or the authors; I am just highlighting how you can identify generated images. Let’s have a look at this website about travel and mobility technology (TNMT). I came across this article about passenger frustration in my LinkedIn stream, and the lead image grabbed my attention (doing a good job, because that’s its purpose). See the screenshot of the post below.

At first sight it looks real: a passenger in a crowd or queue, a frustrating scene (delay, cancellation, …). The primary subject is very photorealistic, but the background reveals that this is not a real photo. (Quite often it is the background that gives it away.)

Usually generated images are not highlighted as such and there is no mention of the photographer or the image source.

Some artefacts in the image that support the suspicion:

  • The right eye of the person looks unnatural, as if they just stepped out of a boxing match.
  • Missing teeth, or teeth in need of a dentist.
  • Unreadable text on the t-shirt.
  • The out-of-scale ear of the person next to the subject.
  • A person in a green jacket with an arm pointing forward and either no head or a head facing backwards.

Another important tool is reverse image search: check whether the image is used anywhere else. In the past, editorial images were dominated by stock photos and you would get dozens of references, but not for generated images. Why bother to copy or steal a generated image if you can create a new one with five mouse-clicks? You can use TinEye or the respective Google service. For the image above:

TinEye result
Google image search result

I recommend doing this if you doubt the source of images in the news, especially coming from social media platforms.
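Reverse image search engines work by reducing each indexed image to a compact fingerprint and comparing fingerprints instead of raw pixels. As a rough illustration of the principle, here is a toy difference hash (dHash) in plain Python; real services like TinEye use far more robust features, and a production dHash would first resize the image to 9x8 grayscale pixels:

```python
def dhash(pixels):
    """Difference hash of a grayscale image given as a 2D list of rows.

    Compares each pixel with its right neighbour and packs the results
    into a bit string. Similar images yield similar hashes, so
    near-duplicates can be found via Hamming distance. Toy version:
    assumes the input is already a small grayscale matrix.
    """
    bits = []
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits.append("1" if left > right else "0")
    return "".join(bits)

def hamming(a: str, b: str) -> int:
    """Number of differing bits between two equal-length bit strings."""
    return sum(x != y for x, y in zip(a, b))

# A tiny 2x3 "image" and a slightly brightened near-duplicate.
img = [[10, 20, 30], [30, 20, 10]]
almost = [[11, 21, 31], [31, 19, 11]]

print(hamming(dhash(img), dhash(almost)))  # 0: recognized as a near-duplicate
```

This is also why freshly generated images come back with zero matches: there is simply no earlier copy with a similar fingerprint in the index.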

Additionally, you can try online tools to identify generated images, but be aware that you will get very different responses. Some tests, in no particular order:

Hive Moderation

Content at Scale

AI or Not

Let’s dissect a few more images from the same website. It seems they started using generated images instead of stock images in November 2022; before that, all images were stock photos.

  • Unreadable text
  • Some kind of fantasy engine
  • Missing landing gear
  • Broken aircraft physics
  • Frankenstein plugs
  • Floating and missing engine
  • Unreadable text
  • Obvious

My conclusion and recommendation: feel free to use generated images for decorative or editorial purposes, but consider marking them as generated, at least somewhere in the text. And avoid image rubbish and obvious artefacts.