Brian Holmes on Sat, 24 Dec 2022 22:34:31 +0100 (CET)



Re: <nettime> Spamming the Data Space – CLIP, GPT and synthetic data


On Fri, Dec 23, 2022, Luke Munn wrote:
At the core of all this, I think, is the instinct that there's something unique about 'human' cultural production. [snip...] Terms like 'meaning', or 'intention', or 'autonomy' gesture to this desire, this hunch that something will be lost, that some ground will be ceded with the move to AI image models, large language models, and so on. 

These are old (maybe antiquated?) problems that were central to Continental philosophy from Heidegger to Gadamer, Levinas, Baudrillard and many others. Basically the questions are, Who am I and how do I guide my action amid a flood of normalizing or coercive cultural contents? How do I know and recognize the Other in his/her/their full otherness?

As time goes by I have got more interested in Gadamer's focus on interpretation as the process whereby an individual or community sets their ethical/political course with respect to the expressions and actions of others. That will always be necessary in any society - exactly because there is no reliable benchmark, no fully original expression, no pre-given authentic self - so the process of interpretation becomes a creative and always provisional act. However, with statistically generated images you are in a sense alone in the room, there is no one to evaluate or answer to. Baudrillard has a great quote on this, which I used in my work on Guattari's Schizoanalytic Cartographies:

"This is our destiny, subjected to opinion polls, information, publicity, statistics: constantly confronted with the anticipated statistical verification of our behavior, absorbed by this permanent refraction of our least movements, we are no longer confronted with our own will. We are no longer even alienated, because for that it is necessary for the subject to be divided in itself, confronted with the other, contradictory. Now, where there is no other, the scene of the other, like that of politics and of society, has disappeared. Each individual is forced despite himself into the undivided coherency of statistics. There is in this a positive absorption into the transparency of computers, which is something worse than alienation."

Now, AI brings a new twist to all this: computers are no longer transparent, we don't exactly know how neural networks function. Like Harun Farocki in his explorations of machine vision, some people are now interpreting the expressions of the inscrutable AIs. There's a chance that humans will learn something fundamental about the potentials of their own intelligence through this process. However, it is equally or far more likely that entire populations will be massively confronted with statistical transforms of previous generations of statistically generated images, in the scenario that Francis outlines. What's more, it's exceedingly likely that the whole process of statistical image production will be carried on coercively by states and corporations, whose intentions will be masked by the statistical operations. The Baudrillardean worst-case is getting a lot closer to fulfillment.

I would be glad to learn different perspectives on all this. It's why I joined this thread.

All the best, Brian 






On Fri, Dec 23, 2022 at 8:54 AM Francis Hunger <francis.hunger@irmielin.org> wrote:
Dear Luke, dear All
Interesting essay Francis, and always appreciate Brian's thoughtful comments. I think the historical angle Brian is pointing towards is important as a way to push against the claims of AI models as somehow entirely new or revolutionary. 

In particular, I want to push back against this idea that this is the last 'pure' cultural snapshot available to AI models, that future harvesting will be 'tainted' by automated content.

At no point did I allude to the 'pureness' of a cultural snapshot, as you suggest. Why should I? I was discussing this from a material perspective, where data for training diffusion models becomes the statistical material to inform these models. This data has never been 'pure'. I used the distinction of uncontaminated/contaminated to show the difference between a training process for machine learning which builds on a snapshot that is still uncontaminated by the outputs of CLIP or GPT, and one which includes text and images generated with these techniques on a large scale.

It is obvious, but maybe I should have made it clearer, that the training data in itself is already far from pure. Honestly, I'm a bit shocked that you would suggest I'd come up with a nostalgic argument about purity.

Francis' examples of hip hop and dnb culture, with sampling at their heart, already start to point to the problems with this statement. Culture has always been a project of cutting and splicing, appropriating, transforming, and remaking existing material. It's funny that AI commentators like Gary Marcus talk about GPT-3 as the 'king of pastiche'. Pastiche is what culture does. Indeed, we have whole genres (the romance novel, the murder mystery, etc) that are about reproducing certain elements in slightly different permutations, over and over again.
Maybe it is no coincidence that I included exactly this example.
Unspoken in this claim of machines 'tainting' or 'corrupting' culture is the idea of authenticity.
I didn't claim 'tainting' or 'corrupting' culture, not even unspoken. Who am I to argue against the productive forces?
It really reminds me of the moral panic surrounding algorithmic news and platform-driven disinformation, where pundits lamented the shift from truth to 'post-truth.' This is not to suggest that misinformation is not an issue, nor that veracity doesn't matter (e.g. Rohingya and Facebook). But the premise of some halcyon age of truth prior to the digital needs to get wrecked.
I agree. Only, I never equated 'uncontaminated' with a "truth prior to the digital"; I equated it with a snapshot that doesn't contain material created by transformer models.
Yes, large language models and other AI technologies do introduce new conditions, generating truth claims rapidly and at scale. But rather than hand-wringing about 'fake news,' it's more productive to see how they splice together several truth theories (coherence, consensus, social construction, etc) into new formations.

I was more interested in two points:

1.) Subversion: What I called in my original text the 'data space' (created through cultural snapshots as suggested by Eva Cetinic) is an already biased, largely uncurated information space where image data and language data are scraped and then mathematically-statistically merged together. The focal point here is the sheer scale on which this happens. GPT-3 and CLIP are techniques that both build on massive data scraping (compared for instance to GANs), so that it is only possible for well-funded organizations such as OpenAI or LAION to build these datasets. This data space could be spammed a) if you want to subvert it and b) if you want to advertise. The spam would need to be on a large scale in order to influence the next (contaminated) iteration of a cultural snapshot. Only in that sense did I use the un/contaminated distinction.

2.) In response to Brian I evoked a scenario that builds on what we already experience when it comes to information spamming. We all know that mis-information is a social and _not_ a machinic function. Maybe I should have made this clearer (I simply assumed it). I ignored Brian's comment on the decline of culture, whatever this would mean, and could have been more precise in this regard. I don't assume culture declines. Beyond this, there have been discussions about deepfakes, for instance, and we saw that deepfakes are not needed at all to create mis-information when one can just cut any video using standard video editing practices towards 'make-believe'. I wasn't 'hand-wringing' about fake news in my comment to Brian; instead I was quoting Langlois with the concept of 'real fakes'.
Further, I'm suggesting that CLIP and GPT make it easier to automate large-scale spamming, making online communities uninhabitable or moderation more difficult. Maybe I'm overestimating the effect. We can already observe GPT-3 automated comments appearing on Twitter, or the ban of ChatGPT posts on Stack Overflow (https://meta.stackoverflow.com/questions/421831/temporary-policy-chatgpt-is-banned), the latter already being a Berghain-no-photo-policy.

Finally, I'm interested in the question of bias and representation, and how a cultural snapshot that builds on a biased dataset (and no, I'm not saying there are unbiased datasets at all) can further deepen these biases with each future iteration, when these biases get statistically reproduced through 'AI' and then become the basis for the next dataset.

best

Francis


nga mihi / best,
Luke


On Tue, 20 Dec 2022 at 22:20, Francis Hunger <francis.hunger@irmielin.org> wrote:
Hi Brian,
On Mon, Dec 19, 2022 at 3:55 AM Francis Hunger <francis.hunger@irmielin.org> wrote:
While some may argue that generated text and images will save time and money for businesses, a data ecological view immediately recognizes a major problem: AI feeds into AI. To rephrase it: statistical computing feeds into statistical computing. In using these models and publishing the results online we are beginning to create a loop of prompts and results, with the results being fed into the next iteration of the cultural snapshots. That’s why I call the early cultural snapshots still uncontaminated, and I expect the next iterations of cultural snapshots will be contaminated.

Francis, thanks for your work, it's always totally interesting.

Your argumentation is impeccable and one can easily see how positive feedback loops will form around elements of AI-generated (or perhaps "recombined") images. I agree, this will become untenable, though I'd be interested in your ideas as to why. What kind of effects do you foresee, both on the level of the images themselves and their reception?

Foresight is a difficult field, as most estimates can extrapolate at most 7 years into the future and there are a lot of independent factors (such as e.g. OpenAI, the producer of CLIP, could go bankrupt etc.).

It's worth considering that similar loops have been in place for decades, in the area of market research, product design and advertising. Now, all of neoclassical economics is based on the concept of "consumer preferences," and discovering what consumers prefer is the official justification for market research; but it's clear that advertising has attempted, and in many cases succeeded, in shaping those preferences over generations. The preferences that people express today are, at least in part, artifacts of past advertising campaigns. Product design in the present reflects the influence of earlier products and associated advertising.

That's a great and interesting argument, because it plays into the cultural snapshot idea.

Obviously, language-wise, people already use translation tools such as DeepL and translate text from German to English and back to German in order to profit from the "clarity" and "orthographic correction" brought by the statistical analysis that feeds into the translator and seems to straighten the German text. We see the same features appearing in products like text editors, and thus widely employed for cultural production. That's one example. Automated forum posts using GPT-3, for instance on Reddit, are another, because we know that the CLIP model is also partly built on Reddit posts.

Another example is images generated using diffusion models and prompts building on cultural snapshots and being used as _cheap_ illustrations for editorial products, feeding off stock photography and to a certain extent replacing stock photography. This is more or less an economic motivation with cultural consequences. The question is what changes when there is not sufficient 'original' stock photography circulating, but the majority is synthetically generated? Maybe others want to join in, to speculate about it.

We could further look into 1980s HipHop or 1990s Drum'n Bass sample culture, which for instance took (and some argue: stole) one particular sound break, the Amen Break, from an obscure 1969 Soul music record by The Winstons and built a whole cultural genre from it. Cf. https://en.wikipedia.org/wiki/Amen_break Here the sample was refined over time, with generations of musicians cleaning the sample (compression, frequencies, deverbing, etc.) and providing many variations of it, then reusing it, because later generations did not build on the original sample, but on the published versions of it.

We can maybe distinguish two modi operandi where a) "the cultural snapshot" is understood as an automated feedback loop, operating on a large scale, mainly through automated scraping and publication of the derivatives of data, amplifying the already most visible representations of culture and b) "the cultural snapshot" is a feedback loop with many creative human interventions, be it through curatorial selection, prompt engineering or intended data manipulation.

Blade Runner vividly demonstrated this cultural condition in the early 1980s, through the figure of the replicants with their implanted memories.
I don't know if I get your point. I'd always say that Blade Runner is a cultural imaginary, one of the many phantasms about the machinisation of humans since at least 1900 if not earlier, and that's an entirely different discussion then. I would avoid this as a metaphor.
The intensely targeted production of postmodern culture ensued, and has been carried on since then with the increasingly granular market research of surveillance capitalism, where the calculation of statistically probable behavior becomes a good deal more precise. The effect across the neoliberal period has been, not increasing standardization or authoritarian control, but instead, the rationalized proliferation of customizable products, whose patterns of use and modification, however divergent or "deviant" they may be, are then fed back into the design process. Not only the "quality of the image" seems to degrade in this process. Instead, culture in general seems to degrade, even though it also becomes more inclusive and more diverse at the same time.

When looking for a plausible scenario regarding synthetic text and synthetic images, Steve Bannon's “The real opposition is the media. And the way to deal with them is to flood the zone with shit.” is sadly a good candidate. This ties in with what Ganaele Langlois posits:

„Therefore: communicative fascism posts that what is real is the opposite of social justice, and we now see the armies of ‚Social Injustice Warriors‘ as Sarah Sharma (2019) calls them, busy typing away at their keyboards to defend the rights to keep their fear of Others unchallenged and to protect their bigotry, misogyny, and racism from being debunked as inept constructions of themselves“ Langlois 2021:3

„The first aspect of this new communicative fascism is related to what can be called ‚real fakes‘, that is to say, the construction of a fictional and alternative reality where the paranoid position of fear and rage can find some validation … Real fakes are about what reality ought to be: they are virtual backgrounds on which fascists can find their validity and raison d’être.“ Langlois 2021:3f

So this is to be expected for both political and consumer marketing purposes.

AI is poised to do a lot of things - but one of them is to further accelerate the continual remaking of generational preferences for the needs of capitalist marketing. Do you think that's right, Francis?

That's one possible reading. I would insist, however, on not using an active verb with AI, rephrasing your point towards "AI may be used for a lot of things". Better still, replace 'AI' with the term 'statistical computation'.

Currently I would read 'AI' as a mixture of imaginations and phantasms about automation, of which some may become true, just in another way from what was expected or promoted. For certain, the inner logic of capital circulation commands the deployment of statistical computation to replace living, human labor. We already see how the job description of translators changes towards a human–statistical-computation entanglement, and how the repetitive parts of the illustrator's job, like coloring, get automated away and put people out of jobs. It is plausible to expect the consolidation of jobs like photo editor, news editor and author with prompt engineering. Since we are concentrating on the cultural sphere here, I'll limit the examples to this field. Human labor in production, logistics and care would need its own consideration.

What other consequences do you see? And above all, what to do in the face of a seemingly inevitable trend?

We are going to create separate data ecologies, which prohibit spamming the data space. These would be spaces, comparable to the no-photo-policy in clubs like Berghain or IFZ with a no-synthetics policy. While vast areas of the information space may be indeed flooded, these would be valuable zones of cultural exchange. (The answer would be much longer indeed, but we're not writing a book here).



best, Brian
-- 
Researcher at Training The Archive, HMKV Dortmund

Artistic Practice http://www.irmielin.org
Ph.D. at Bauhaus University Weimar http://databasecultures.irmielin.org

Daily Tweets https://twitter.com/databaseculture


Peter and Irene Ludwig guest professorship at the Hungarian University of Fine Arts in Budapest 2022/23
#  distributed via <nettime>: no commercial use without permission
#  <nettime>  is a moderated mailing list for net criticism,
#  collaborative text filtering and cultural politics of the nets
#  more info: http://mx.kein.org/mailman/listinfo/nettime-l
#  archive: http://www.nettime.org contact: nettime@kein.org
#  @nettime_bot tweets mail w/ sender unless #ANON is in Subject: