“An instrument is only as good as its operator”
(This article was previously published on my Substack, Digital Scribbles.)
AI has been a recurring topic in tech news lately. Not so much for its continued rapid advancements, but for a number of bemusing hitches, errors and setbacks.
Firstly, both Google’s Gemini and Adobe’s Firefly AI tools were “exposed” for generating inaccurate portrayals of historical events, then subsequently accused of pushing an overly diverse, liberally charged equality agenda and of “rewriting the past”. This quickly became political, with accusations of “woke” and leftist behaviour from within the organisations, prioritising multiculturalism over factual evidence and reality. In truth, these inaccuracies were the unfortunate side effects of the companies’ attempts to reduce the scale of the opposite problem: the “whitewashing” of AI imagery. The fact is that the correction had overcompensated. In some cases, a LOT.
AI image generators have long suffered from a “default” ideal of appearance, with generated men and women being predominantly White unless specific ethnic groups or races were included as keywords within the prompts. This is, again, largely because the tools’ datasets were pooled from millions of images across the internet, and for many years marketing, advertising, TV and film have had a strong Caucasian bias. Unsurprisingly, this notable lack of representation drew a lot of backlash and outcry, and it didn’t take long to reach the wider general public, especially with viral posts of TikTok filters and Reface apps “whitening” their users, or AI platforms prompted to make users look “more professional” ending up changing their race.
However, I cannot help but ponder the subtextual “agenda” of those prompting images of World War II German soldiers, or the Founding Fathers of the United States, to “prove” that Gemini and Firefly were erasing the past with wokeism… Certainly interesting choices of material to prove a point. (Mind you, this also happened with generated images of the Pope, and of Vikings, amongst other things.) I suppose the method is: pick a subject you know to involve a particular group of people, and see if the generated outcome goes off-piste. But there were a number of almost convincing accusations that Gemini was on the verge of “erasing white people”, and that the correction had swung wildly in the opposite direction. Google has halted Gemini’s image generation for the time being, whilst they work on a fix…
Examples of this behaviour were also shown to occur on both DALL-E and Midjourney.
So first, AI wasn’t inclusive enough, and now it’s “too inclusive”. But this isn’t some autonomous, self-regulating sci-fi entity, no mysterious techno-being. It’s down to the datasets, the rules and the safeguards it’s programmed and updated with.
In the same way that an AI-generated image is only as good as the written prompt that instructed it, the training and outputs of AI tools are only as good as the programming, and the size and quality of the dataset they’re trained on.
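To make that concrete, here is a deliberately naive sketch (in Python) of the kind of prompt-rewriting safeguard widely reported to be behind these incidents. Every name and rule in it — the attribute list, the historical hints, the crude substring checks — is my own illustration, not anything from Google’s or Adobe’s actual systems. What it shows is how a blanket rule, applied even to historically specific prompts, produces exactly the overcorrection described above:

```python
import random

# Hypothetical list of diversity attributes a safeguard might inject.
# Purely illustrative; no vendor's real implementation is known here.
DIVERSITY_TERMS = ["South Asian", "Black", "East Asian", "Hispanic", "White"]

# Prompts like these arguably need historical accuracy, not rebalancing.
HISTORICAL_HINTS = ["1943", "founding fathers", "wehrmacht", "viking", "pope"]

def naive_rewrite(prompt: str) -> str:
    """Blindly prepend a random ethnicity to any prompt about people.
    The crude substring checks are part of the point."""
    if "person" in prompt or "soldier" in prompt or "man" in prompt:
        return f"{random.choice(DIVERSITY_TERMS)} {prompt}"
    return prompt

def safer_rewrite(prompt: str) -> str:
    """Only rebalance generic prompts; leave historical ones untouched."""
    if any(hint in prompt.lower() for hint in HISTORICAL_HINTS):
        return prompt  # historical context: accuracy beats rebalancing
    return naive_rewrite(prompt)

print(naive_rewrite("soldier in a 1943 german uniform"))   # overcorrects
print(safer_rewrite("soldier in a 1943 german uniform"))   # left alone
print(safer_rewrite("portrait of a man in an office"))     # rebalanced
```

Whatever fix Google eventually ships will presumably look more like the second function than the first: context-aware rules rather than blanket ones.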
Now, Adobe’s Firefly is once again under scrutiny, but for a different reason: the use of AI imagery within its supposedly “ethical” dataset of licensed Adobe Stock images.
Firefly stands out from other leading AI art generators in two key aspects. Firstly, it seamlessly integrates with creative workflows, enhancing tools within established software like Photoshop and Illustrator. Secondly, it is marketed as a commercially secure solution. However, recent revelations have shed light on some undisclosed details regarding its training data.
Adobe has emphasised that Firefly was trained using public domain materials and images sourced from Adobe Stock, the company’s licensed asset library. This training methodology was intended to portray Firefly as a more ethically sound option compared to AI image generators that harvested data by scraping the entire internet, potentially infringing on artists’ and photographers’ work. Hence, the disclosure that Firefly also utilised images from Midjourney adds a layer of complexity for the software giant.
This has caused some understandable unease among many of those who were originally attracted to Firefly for its staunch opposition to unregulated source material, particularly if as much as 5% of its dataset could actually consist of images generated from exactly that kind of unregulated source material…
However, this time it wasn’t an unfortunate error, but a conscious decision on Adobe’s part to include this 5%.
It’s unclear at this stage if there will be any major fallout from this development, but it does once again reinforce that the quality and ethical proficiency of a programmed AI tool relies upon the data it is fed by those in charge.
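For what it’s worth, the mitigation here is ultimately a data-curation problem. Adobe Stock does label generative-AI submissions, so in principle a curator can exclude them, or cap their share, at ingestion time. The sketch below is purely illustrative — the asset fields, the AI flag and the cap are my assumptions, not Adobe’s actual pipeline:

```python
from dataclasses import dataclass

@dataclass
class StockAsset:
    asset_id: str
    licensed: bool
    ai_generated: bool  # hypothetical flag mirroring Adobe Stock's AI label

def curate(assets: list[StockAsset], max_ai_fraction: float = 0.0) -> list[StockAsset]:
    """Keep licensed assets only, capping how many AI-generated ones get in.

    max_ai_fraction=0.0 excludes AI imagery outright; ~0.05 would roughly
    mirror the 5% figure reported for Firefly's training set.
    """
    licensed = [a for a in assets if a.licensed]
    human_made = [a for a in licensed if not a.ai_generated]
    ai_budget = int(max_ai_fraction * len(licensed))
    ai_made = [a for a in licensed if a.ai_generated][:ai_budget]
    return human_made + ai_made

corpus = [
    StockAsset("a1", licensed=True, ai_generated=False),
    StockAsset("a2", licensed=True, ai_generated=True),
    StockAsset("a3", licensed=False, ai_generated=False),
]
print([a.asset_id for a in curate(corpus)])  # ['a1']: the AI image is dropped
```

Which rather underlines the point above: the 5% wasn’t scraping debris, it was a dial someone chose to set.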
There is also the further question of what added benefit there is in training AI generators on AI images. Despite being an advocate for AI and AI art, I’ve often wondered how long it will be before we end up with a creative plateau of regurgitated imitations and digital androgyny, as the cycle repeats itself: more and more AI imagery being included in training data, simply due to the sheer volume of generations flooding the online space and the crawling nature of the datasets involved. Originality will always depend on the creator and how they use these tools in their wider, overall workflow. But it is still something to monitor long term.
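This feedback loop even has a name in the research literature: “model collapse”. You can get the flavour of it from a toy simulation — my own analogue, not a real generative model — in which each “generation” can only resample what the previous one produced. Diversity only ever goes down:

```python
import random

random.seed(0)

# Generation 0: two hundred distinct "works", stand-ins for original images.
population = [random.random() for _ in range(200)]

for gen in range(51):
    if gen % 10 == 0:
        print(f"generation {gen:2d}: {len(set(population)):3d} distinct works remain")
    # Each new generation "trains" on the previous one's output by
    # resampling it with replacement: popular works get duplicated,
    # others vanish, and nothing new is ever created.
    population = random.choices(population, k=len(population))
```

Real systems are far more complicated, but the direction of travel is the same: without fresh human-made material entering the pool, each generation can only remix the last one’s output.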