Prompted by @jjn1's article on AI and creative thinking, I've been reading a paper by some researchers comparing the creativity of ChatGPT against their students (at an elite university, no less).
What is interesting about this paper is not that ChatGPT is capable of producing large quantities of ideas
much more quickly than human students, but that the evaluation method used by the researchers rated the AI-generated ideas as being of higher quality. Of the 200 human-generated ideas and 200 algorithm-generated ideas, 35 of the top-scoring 40 were algorithm-generated.
So what was this evaluation method? They used a standard market research survey, conducted with college-age individuals in the United States, mediated via mTurk. Two dimensions of quality were considered: purchase intent (would you be likely to buy one?) and novelty. The paper explains the difficulty of evaluating economic value directly, and argues that purchase intent provides a reasonable indicator of relative value.
The paper discusses the production cost of ideas, but this doesn't tell us anything about what the ideas might be worth. If ideas were really a dime a dozen, as the paper title suggests, then neither the impressive productivity of ChatGPT nor the effort of the design students would be economically justified. But the production of the initial idea is only a tiny fraction of the overall creative process, and (with the exception of speculative bubbles) raw ideas have very little market value (hence a dime a dozen). So this research is not telling us much about creativity as a whole.
A footnote to the paper considers and dismisses the concern that some of these mTurk responses might have been generated by an algorithm rather than a human. But does that algo/human distinction even hold up these days? Most of us nowadays inhabit a socio-technical world that is co-created by people and algorithms, and perhaps this is particularly true of the Venn diagram intersection between college-age individuals in the United States and mTurk users. If humans and algorithms increasingly have access to the same information, and are increasingly judging things in similar ways, it is perhaps not surprising that their evaluations converge. And we should not be too surprised if it turns out that algorithms have some advantages over humans in achieving high scores in this constructed simulation.
(Note: Atari et al. recommend caution in interpreting comparisons between humans and algorithms, arguing that people from Western, Educated, Industrialized, Rich and Democratic (WEIRD) societies are not representative of humanity as a whole.)
A number of writers on algorithms have explored the entanglement between humans and technical systems, often invoking the concept of recursivity. This concept has been variously defined in terms of co-production (Hayles), second-order cybernetics and autopoiesis (Clarke), and being "outside of itself (ekstasis), which recursively extends to the indefinite" (Yuk Hui). Louise Amoore argues that "in every singular action of an apparently autonomous system, then, resides a multiplicity of human and algorithmic judgements, assumptions, thresholds, and probabilities".
(Note: I haven't read Yuk Hui's book yet, so his quote is taken from a 2021 paper.)
Of course, the entanglement includes not only the participants in the market research survey but also students and teachers of product design, yes, even those at an elite university. This is not to say that any of these human subjects were directly influenced by ChatGPT itself, since much of the content under investigation predated this particular system. What is relevant here is algorithmic culture in general, which, as Ted Striphas's new book makes clear, has long historical roots. (Or should I say rhizome?)
What does algorithmic culture entail for product design practice? For one thing, if a new product is to appeal to a market of potential consumers, it generally has to achieve this via digital media - recommended by algorithms and liked by people (and bots) on social media. Thus successful products have to submit to the discipline of digital platforms: being sorted, classified and prioritized by a complex sociotechnical ecosystem. So we might expect some anticipation of this (conscious or otherwise) to be built into the design heuristics (or what Peter Rowe, following Gadamer, calls enabling prejudices) taught in the product design programme at an elite university.
So we need to be careful not to interpret this research finding as indicating a successful invasion of the algorithm into a previously entirely human activity. Instead, it merely represents a further recalibration of algorithmic culture in relation to an existing sociotechnical ecosystem.
Update April 2024
As far as I can see, the evaluation method used in this study did not consider the question of feasibility. If students have a stronger sense of the possible than algorithms do, this may inhibit them from putting forward superficially attractive but practically ridiculous ideas, which might nevertheless score highly on the evaluation method used here. In my post ChatGPT and Entropy (April 2024), I look at the phenomenon of model collapse, which could lead to algorithms becoming increasingly disconnected from reality. But perhaps also able to generate increasingly outlandish ideas?
Louise Amoore, Cloud Ethics: Algorithms and the Attributes of Ourselves and Others (Durham and London: Duke University Press, 2020)
Mohammad Atari, Mona J. Xue, Peter S. Park, Damián E. Blasi and Joseph Henrich, Which Humans? (PsyArXiv, September 2023) HT @MCoeckelbergh
David Beer, The problem of researching a recursive society: Algorithms, data coils and the looping of the social (Big Data and Society, 2022)
Bruce Clarke, Rethinking Gaia: Stengers, Latour, Margulis (Theory, Culture & Society, 2017)
Karan Girotra, Lennart Meincke, Christian Terwiesch, and Karl T. Ulrich, Ideas are Dimes a Dozen: Large Language Models for Idea Generation in Innovation (10 July 2023)
N. Katherine Hayles, The Illusion of Autonomy and the Fact of Recursivity: Virtual Ecologies, Entertainment, and "Infinite Jest" (New Literary History, Vol. 30, No. 3, Summer 1999, pp. 675-697)
Yuk Hui, Problems of Temporality in the Digital Epoch, in Axel Volmar and Kyle Stine (eds) Media Infrastructures and the Politics of Digital Time (Amsterdam University Press 2021)
John Naughton, When it comes to creative thinking, it’s clear that AI systems mean business (Guardian, 23 September 2023)
Peter Rowe, Design Thinking (MIT Press 1987)
Ted Striphas, Algorithmic culture before the internet (New York: Columbia University Press, 2023)
Richard Veryard, As We May Think Now (Subjectivity 30/4, 2023)
See also: From Enabling Prejudices to Sedimented Principles (March 2013)