Sunday, November 3, 2024

Influencing the Habermas Machine

In my previous post Towards the Habermas Machine, I talked about a large language model (LLM) developed by Google DeepMind, named after Jürgen Habermas, for generating a consensus position from a collection of individual views.

Given that democratic deliberation relies on knowledge of various kinds, followers of Habermas might be interested in how knowledge is injected into discourse. Habermas argued that mutual understanding depends upon a background stock of cultural knowledge that is always already familiar to agents, but this clearly has to be supplemented by knowledge about the matter in question.

For example, we might expect a discussion about appropriate speed limits to be informed by reliable or unreliable beliefs about the effects of a given speed limit on journey times, accident rates, pollution, and so on. In traditional discussion forums, it is extremely common for people to present themselves as having some special knowledge or authority, which supposedly gives extra weight to their opinions, and we might expect something similar to happen in a tech-enabled version.

For many years, the Internet has been distorted by Search Engine Optimization (SEO), which means that the results of an internet search are largely driven by commercial interests of various kinds. Researchers have recently raised a similar issue in relation to large language models, namely Generative Engine Optimization (GEO). Meanwhile, other researchers have found that LLMs (like many humans) are more impressed by superficial jargon than by proper research.

So we might reasonably assume that various commercial interests (car manufacturers, insurers, oil companies, etc.) will be looking for ways to influence the outputs of the Habermas Machine on the speed limit question, by overloading the Internet with knowledge (a regime of truth) in the appropriate format. Meanwhile the background stock of cultural knowledge is now presumably co-extensive with the entire Internet.

Is there anything that the Habermas Machine can do to manage the quality of the knowledge used in its deliberations?


Footnote: Followers of Habermas can't agree on the encyclopedia entry, so there are two rival versions.

Footnote: The relationship between knowledge and discourse goes much wider than Habermas, so interest in this question is certainly not limited to his followers. I might need to write a separate post about the Foucault Machine.


Pranjal Aggarwal et al, GEO: Generative Engine Optimization (arXiv v3, 28 June 2024)

Callum Bains, The chatbot optimisation game: can we trust AI web searches? (Observer, 3 November 2024)

Alexander Wan, Eric Wallace, Dan Klein, What Evidence Do Language Models Find Convincing? (arXiv v2, 9 August 2024)

Stanford Encyclopedia of Philosophy: Jürgen Habermas (v1, 2007); Jürgen Habermas (v2, 2023)

Saturday, October 19, 2024

Towards the Habermas Machine

Google DeepMind has just announced a large language model which is claimed to generate a consensus position from a collection of individual views. The name of the model is a reference to Jürgen Habermas’s theory of communicative action.

An internet search for Habermas machine throws up two previous initiatives under the same name. Firstly, an art project by Kristopher Holland.

The Habermas Machine (2006–2012) is a conceptual art experience that both examines and promotes an experiential relation to Jürgen Habermas’ grand theory for understanding human interaction. The central claim is that The Theory of Communicative Action can be experienced, reflected upon and practised when encountered within arts-based research. Habermas’ description of how our everyday lives are founded by intersubjective experience, and caught up in certain normative, objective and subjective contexts is transformed through the method of conceptual art into a process of collaborative designing, enacting and articulating. This artistic reframing makes it possible to experience the communicative structure of knowledge and the ontological structure of intersubjectivity in a practice of non-discursive ‘philosophy without text’. (Feiten, Holland and Chemero)

And secondly, an approach to Dialogue Mapping, described as a device that all participants can climb into and converse with complete communicative rationality, in a book by @paulculmsee and Kailash Awati, and mentioned in this Reddit post: Why is Dialogue Mapping not widespread? Dialogue Mapping was developed by Jeff Conklin and others as an approach to addressing wicked problems. See also Issue Based Information Systems (IBIS).
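For readers unfamiliar with IBIS, the underlying grammar is simple: an issue (a question) gathers positions (candidate answers), and each position gathers arguments for or against it. Here is a minimal sketch of that structure in Python - purely illustrative, not Conklin's notation or any of the tools mentioned above; the class names and the example question are invented.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Argument:
    text: str
    supports: bool  # True = argues for the position, False = argues against

@dataclass
class Position:
    text: str
    arguments: List[Argument] = field(default_factory=list)

@dataclass
class Issue:
    question: str
    positions: List[Position] = field(default_factory=list)

# Invented example of a contested question mapped in IBIS form
issue = Issue("Should the urban speed limit be reduced to 20 mph?")
lower = Position("Yes, reduce it to 20 mph")
lower.arguments.append(Argument("Collisions at lower speeds cause less severe injuries", supports=True))
lower.arguments.append(Argument("Journey times for buses and deliveries would increase", supports=False))
issue.positions.append(lower)

for position in issue.positions:
    print(issue.question)
    print("  Position:", position.text)
    for argument in position.arguments:
        marker = "+" if argument.supports else "-"
        print("   ", marker, argument.text)
```

Dialogue Mapping then adds a facilitated process of capturing a live conversation in this form; the underlying grammar is no more complicated than this.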


Update

Christopher Summerfield, one of the authors of the DeepMind paper, spoke at the Royal Society on October 29th 2024. https://www.youtube.com/live/cW1Wq7_8v1Y?si=oqo8Lw7479x4QqKt&t=18890

All the examples shown in his talk were policy matters that could be reduced to Yes/No questions. Such questions would traditionally be surveyed by asking people to place themselves on a scale from Strongly Agree to Strongly Disagree, and it is easy to see how a language-based method such as the Habermas Machine offers some advantages over a numerical scale. But it is not clear how this works for more provocative questions, let alone wicked problems.

Someone in the audience asked if this method would work in what he called a compromised democracy, and Summerfield acknowledged that the method assumes what he called a good faith scaffold. Obviously all democracies in the real world are imperfect, and he didn't go into the question of how sensitive or vulnerable the method might be to such imperfections, but the method might conceivably help to overcome some of them under certain conditions: for example, Summerfield referred specifically to the tyranny of the majority.

While the performance of the Habermas Machine in their study compared favourably with that of human mediators, Summerfield suggested that we should move away from thinking about AI in these terms. The point is not to create AI-based agents that can behave like intelligent people, but to build intelligent institutions - tools for creating social order and fostering cooperation. As my regular readers will know, orgintelligence has long been an important theme for this blog. See for example my post On Organizations and Machines (January 2022).


Jeffrey Conklin, Dialogue Mapping: Building Shared Understanding of Wicked Problems (Wiley 2006). See also CogNexus website.

Paul Culmsee and Kailash Awati, The Heretic's Guide to Best Practices (2013)

Nicola Davis, AI mediation tool may help reduce culture war rifts, say researchers (Guardian, 17 October 2024)

Tim Elmo Feiten, Kristopher Holland and Anthony Chemero, Doing philosophy with a water-lance: art and the future of embodied cognition (Adaptive Behavior 2021) 

Michael Tessler et al, AI can help humans find common ground in democratic deliberation (Science, 18 October 2024)

Beyond the symbols vs signals debate (The Royal Society, 28-29 October 2024)

Wikipedia: Issue Based Information Systems (IBIS), Wicked Problem

See also Influencing the Habermas Machine (November 2024)

Thursday, June 6, 2024

All our eyes on the disgraceful Horizon

The scandal at the British Post Office, details of which are now emerging in the public inquiry, provides illustrations of many important aspects of organizational behaviour as discussed on this blog.

Willful blindness. A strong attachment to a false theory, despite mounting evidence to the contrary and despite the appalling human consequences.

Misplaced trust. Trusting a computer system (Horizon) above hundreds of ordinary people. And both the legal system and government ministers trusting the evidence presented by a public corporation, despite the fact that contrary evidence from expert witnesses had been accepted in a small number of cases (see below).

Defensive denial as one of the symptoms of organizational stupidity. In July 2013, Post Office boss Paula Vennells was told about faults in the Horizon system, and advised that denying these would be dangerous and stupid. The existence of such faults is something the Post Office denied for years. ITV March 2024

A detail that struck me yesterday was a failure to connect the dots. In 2011, the auditors (EY) raised concerns about data quality, warning that if Horizon was not accurate, then they would not be able to sign off the Post Office company accounts. Ms Perkins, the former Post Office chair, who was giving evidence at the inquiry into the scandal, said that at the time she did not make a link between the two. BBC June 2024. The pattern I'm seeing here is of treating the sole purpose of audit as satisfying some regulatory requirement, as if anything the auditors might find had zero operational (let alone ethical) implications. And of assuming the regulatory requirement itself to have no real purpose, being merely a stupid and meaningless piece of bureaucracy.

Another failure to connect the dots occurred after Julie Wolstenholme successfully challenged the Post Office in 2003 with the aid of an expert technical witness. Why didn't this prompt serious questions about all the other cases? When asked about this at the inquiry, David Mills said he had not properly assimilated the information and pleaded lack of intelligence, saying I wasn’t that clever. I’m sorry, I didn’t ask about it. ITV April 2024

In my other pieces about organizational intelligence, I have noted that stupid organizations may sometimes be composed of highly intelligent people. Now that's one pattern the Post Office doesn't seem to illustrate. Or have the Post Office bosses merely chosen to present themselves as naive and incompetent rather than evil?


Tom Espiner, Ex-Post Office chair was told of IT risks in 2011 (BBC 5 June 2024)

ITV, Secret tape shows Paula Vennells was told about problems with Horizon and warned not to cover it up (29 March 2024)

ITV, Former Post Office boss tells inquiry he was not 'clever' enough to question Horizon IT system (16 April 2024)

Other Sources: Post Office Horizon IT Inquiry, British Post Office scandal (Wikipedia), Post Office Project (University of Exeter)

Friday, May 31, 2024

Thinking Academically

At Goldsmiths University yesterday for a discussion on Paratactical Life with Erin Manning and Brian Massumi. Academic jobs at Goldsmiths are currently threatened by a so-called Transformation Programme, similar to management initiatives at many other universities, which gave critical urgency, for those in the room, to the primary task of the university in society and the double task of the academic. For the latter, Erin Manning advocates what she calls strategic duplicity.

This involves recognizing what works in the systems we work against. Which means: We don't just oppose them head on. We work with them, strategically, while nurturing an alien logic that moves in very different directions. One of the things we know that the university does well is that it attracts really interesting people. The university can facilitate meetings that can change lives. But systemically, it fails. And the systemic failure is getting more and more acute. Todoroff

One of the domains in which this duplicity is apparent is thinking itself. And this word thinking appears to have special resonance and meaning for academics - what academia calls thinking is not quite the same as what business calls thinking (which was the focus of my practitioner book on Organizational Intelligence) and certainly not the same as what tech calls thinking (the focus of Adrian Daub's book).

One of the observations that led to my work on Organizational Intelligence was the disconnect between the intelligence of the members of an organization and the intelligence of the organization itself. Universities are great examples of this, packed with clever people and yet the organization itself manifests multiple forms of stupidity. As of course do many other kinds of organization. I still believe that it is a worthwhile if often frustrating exercise to try to improve how a given organization collectively makes sense of and anticipates the demands placed on it by its customers and other stakeholders - in other words, how it thinks. However, any such improvements would be almost entirely at the micropolitical level; I don't have much idea how one would go about dismantling what Deleuze calls the economy of stupidity.

Although I think the concept of organizational intelligence is a reasonable one, and have defended it here against those who argue that organizational functions and dysfunctions can always be reduced to the behaviours and intentions of individual human actors, I don't imagine that an organization will ever think in quite the way a person thinks. There are some deficiencies in organizational thinking, just as there are deficiencies in algorithmic thinking. For example, there are some interesting issues in relation to temporality, raised in some of the contributions to Subjectivity's Special Issue on Algorithms which I guest-edited last year.

For Brian Massumi, the key question is what is thinking for. In an academic context, we might imagine the answer to be something to do with knowledge - universities being where knowledge is created and curated, and where students are supposed to acquire socioeconomic advantage based on their demonstrated mastery of selected portions of this knowledge. Therefore much of the work of an academic is taken up with a form of thinking known as judgment or sorting out - deciding, agreeing and explaining the criteria by which students will be evaluated, using these criteria to assess the work of each student, and helping those students who don't fit the expected pattern for whatever reason.

But what really gives a student any benefit in the job market as a result of their studies is not just a piece of paper but a sense of their potential - for both thinking and doing. The problem with students using chatbots to write their assignments is not that they are cheating - after all, the ability to cheat without being found out is highly valued in many organizations, if not essential. The real problem is if they are learning a deficient form of thinking.

(This is far from a complete report on the afternoon, merely picking out some elements of the discussion that resonated with me.)

 

Update: Comments have been added to the goodreads version of this post.


Philip Boxer, The Three Asymmetries necessary to describing agency in living biological systems (Asymmetric Leadership, November 2023)

Philip Boxer, The Doubling of the Double Task (Asymmetric Leadership, February 2024)

Adrian Daub, What Tech Calls Thinking (Farrar Straus and Giroux, 2020)

Benoît Dillet, What Is Called Thinking?: When Deleuze Walks along Heideggerian Paths (Deleuze Studies 7/2 2013)

Kenan Malik, The affluent can have their souls enriched at university, so why not the poor as well? (Observer, 2 June 2024)

Brent Dean Robbins, Joyful Thinking-Thanking: A Reading of Heidegger’s “What is Called Thinking?” (Janus Head 13/2, October 2014) 

Uriah Marc Todoroff, A Cryptoeconomy of Affect (New Inquiry, May 2018)

Richard Veryard, Building Organizational Intelligence (Leanpub, 2012)

Richard Veryard, As we may think now (Subjectivity December 2023)

Related posts: Symptoms of Organizational Stupidity (May 2010), On Organizations and Machines (January 2022), Reasoning with the majority - chatGPT (January 2023), Creativity and Recursivity (September 2023)

Saturday, February 24, 2024

Anticipating Effects

There has been much criticism of the bias and distortion embedded in many of our modern digital tools and platforms, including search. Google recently released an AI image generation model that over-compensated for this, producing racially diverse images even for situations where such diversity would be historically inaccurate. With well-chosen prompts, this feature was made to look either ridiculous or politically dangerous (aka "woke"), and the feature has been paused for further refinement and testing.

I've just been reading an extended thread from Yishan Wong. The bigger problem he identifies is the inability of the engineers to anticipate and constrain the behaviour of a complex intelligent system - as in many of Asimov's stories, where the robots often behave in dangerous ways.

Some writers on technology ethics have called for ethical principles to be embedded in technology, along the lines of Asimov's Laws. I have challenged this idea in previous posts, because as I see it the whole point of the Three Laws is that they don't work properly. Thus my reading of Asimov's stories is similar to Yishan's.

It looks like their testing didn't take context of use into account. 

Update: Or as Dame Wendy Hall noted later, This is not just safety testing, this is does-it-make-any-sense training.



Dan Milmo, Google pauses AI-generated images of people after ethnicity criticism (Guardian, 22 February 2024) 

Dan Milmo and Alex Hern, ‘We definitely messed up’: why did Google AI tool make offensive historical images? (Guardian, 8 March 2024)

Related posts: Reinforcing Stereotypes (May 2007), Purpose of Diversity (January 2010) (December 2014), Automation Ethics (August 2019), Algorithmic Bias (March 2021)

Tuesday, September 26, 2023

Creativity and Recursivity

Prompted by @jjn1's article on AI and creative thinking, I've been reading a paper by some researchers comparing the creativity of ChatGPT against their students (at an elite university, no less).

What is interesting about this paper is not that ChatGPT is capable of producing large quantities of ideas much more quickly than human students, but that the evaluation method used by the researchers rated the AI-generated ideas as being of higher quality. From 200 human-generated ideas and 200 algorithm-generated ideas, 35 of the top-scoring 40 were algo-generated.

So what was this evaluation method? They used a standard market research survey, conducted with college-age individuals in the United States, mediated via mTurk. Two dimensions of quality were considered: purchase intent (would you be likely to buy one) and novelty. The paper explains the difficulty of evaluating economic value directly, and argues that purchase intent provides a reasonable indicator of relative value.
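To make the mechanics of that evaluation concrete, here is a toy sketch of the ranking step - with invented ratings, not the authors' data or code - pooling 200 human and 200 LLM ideas, ranking them by mean purchase-intent score, and counting the LLM share of the top 40.

```python
import random
from statistics import mean

random.seed(0)

def simulate_ratings(n_raters=20, bias=0.0):
    """Invented purchase-intent ratings on a 1-5 scale for one idea."""
    return [min(5.0, max(1.0, random.gauss(3.0 + bias, 1.0))) for _ in range(n_raters)]

# Invented pool: 200 human ideas and 200 LLM ideas; the latter are given a
# small upward nudge purely to illustrate the counting, not as a claim
# about real relative quality.
ideas = (
    [("human", simulate_ratings()) for _ in range(200)]
    + [("llm", simulate_ratings(bias=0.3)) for _ in range(200)]
)

# Rank the combined pool by mean rating and inspect the top 40
ranked = sorted(ideas, key=lambda item: mean(item[1]), reverse=True)
top40 = ranked[:40]
llm_count = sum(1 for source, _ in top40 if source == "llm")
print(f"LLM-generated ideas in the top 40: {llm_count} of 40")
```

The study's headline figure (35 of the top-scoring 40 ideas being algorithm-generated) is the output of a ranking of this general kind, applied to real survey responses rather than simulated ones.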

The paper discusses the production cost of ideas, but this doesn't tell us anything about what the ideas might be worth. If ideas were really a dime a dozen, as the paper title suggests, then neither the impressive productivity of ChatGPT nor the effort of the design students would be economically justified. But the production of the initial idea is only a tiny fraction of the overall creative process, and (with the exception of speculative bubbles) raw ideas have very little market value (hence dime a dozen). So this research is not telling us much about creativity as a whole.

A footnote to the paper considers and dismisses the concern that some of these mTurk responses might have been generated by an algorithm rather than a human. But does that algo/human distinction even hold up these days? Most of us nowadays inhabit a socio-technical world that is co-created by people and algorithms, and perhaps this is particularly true of the Venn diagram intersection between college-age individuals in the United States and mTurk users. If humans and algorithms increasingly have access to the same information, and are increasingly judging things in similar ways, it is perhaps not surprising that their evaluations converge. And we should not be too surprised if it turns out that algorithms have some advantages over humans in achieving high scores in this constructed simulation.

(Note: Atari et al recommend caution in interpreting comparisons between humans and algorithms, as they argue that those from Western, Educated, Industrialized, Rich and Democratic societies - which they call WEIRD - are not representative of humanity as a whole.)

A number of writers on algorithms have explored the entanglement between humans and technical systems, often invoking the concept of recursivity. This concept has been variously defined in terms of co-production (Hayles), second-order cybernetics and autopoiesis (Clarke), and being outside of itself (ekstasis), which recursively extends to the indefinite (Yuk Hui). Louise Amoore argues that, in every singular action of an apparently autonomous system, then, resides a multiplicity of human and algorithmic judgements, assumptions, thresholds, and probabilities.

(Note: I haven't read Yuk Hui's book yet, so his quote is taken from a 2021 paper)

Of course, the entanglement doesn't only include the participants in the market research survey, but also students and teachers of product design, yes even those at an elite university. This is not to say that any of these human subjects were directly influenced by ChatGPT itself, since much of the content under investigation predated this particular system. What is relevant here is algorithmic culture in general, which as Ted Striphas's new book makes clear has long historical roots. (Or should I say rhizome?)

What does algorithmic culture entail for product design practice? For one thing, if a new product is to appeal to a market of potential consumers, it generally has to achieve this via digital media - recommended by algorithms and liked by people (and bots) on social media. Thus successful products have to submit to the discipline of digital platforms: being sorted, classified and prioritized by a complex sociotechnical ecosystem. So we might expect some anticipation of this (conscious or otherwise) to be built into the design heuristics (or what Peter Rowe, following Gadamer, calls enabling prejudices) taught in the product design programme at an elite university.

So we need to be careful not to interpret this research finding as indicating a successful invasion of the algorithm into a previously entirely human activity. Instead, it merely represents a further recalibration of algorithmic culture in relation to an existing sociotechnical ecosystem. 


Update April 2024

As far as I can see, the evaluation method used in this study did not consider the question of feasibility. If students have a stronger sense of the possible than algorithms do, this may inhibit their ability to put forward superficially attractive but practically ridiculous ideas, which might nevertheless score highly on the evaluation method used here. In my post ChatGPT and Entropy (April 2024), I look at the phenomenon of model collapse, which could lead to algorithms becoming increasingly disconnected from reality. But perhaps able to generate increasingly outlandish ideas?

 


Louise Amoore, Cloud Ethics: Algorithms and the Attributes of Ourselves and Others (Durham and London: Duke University Press 2020)

Mohammad Atari, Mona J. Xue, Peter S. Park, Damián E. Blasi and Joseph Henrich, Which Humans? (PsyArXiv, September 2023) HT @MCoeckelbergh

David Beer, The problem of researching a recursive society: Algorithms, data coils and the looping of the social (Big Data and Society, 2022)

Bruce Clarke, Rethinking Gaia: Stengers, Latour, Margulis (Theory Culture and Society 2017)

Karan Girotra, Lennart Meincke, Christian Terwiesch, and Karl T. Ulrich, Ideas are Dimes a Dozen: Large Language Models for Idea Generation in Innovation (10 July 2023)

N. Katherine Hayles, The Illusion of Autonomy and the Fact of Recursivity: Virtual Ecologies, Entertainment, and "Infinite Jest" (New Literary History, Vol. 30, No. 3, Summer 1999, pp. 675-697)

Yuk Hui, Problems of Temporality in the Digital Epoch, in Axel Volmar and Kyle Stine (eds) Media Infrastructures and the Politics of Digital Time (Amsterdam University Press 2021)

John Naughton, When it comes to creative thinking, it’s clear that AI systems mean business (Guardian, 23 September 2023) 

Peter Rowe, Design Thinking (MIT Press 1987)

Ted Striphas, Algorithmic culture before the internet (New York: Columbia University Press, 2023)

Richard Veryard, As We May Think Now (Subjectivity 30/4, 2023)

See also:  From Enabling Prejudices to Sedimented Principles (March 2013)

Thursday, March 9, 2023

Technology in use

In many blogposts I have mentioned the distinction between technology as designed/built and technology in use.

I am not sure when I first used these exact terms. I presented a paper to an IFIP conference in 1995 in which I used the terms technology-as-device and technology-in-its-usage. By 2002, I was using the terms "technology as built" and "technology in use" in my lecture notes for an Org Behaviour module I taught (together with Aidan Ward) at City University, with an explicit link to espoused theory and theory-in-use (Argyris).

Among other things, this distinction is important for questions of technology adoption and maturity. See the following posts 

I have also talked about system-as-designed versus system-in-use - for example in my post on Ecosystem SOA 2 (June 2010). See also Trusting the Schema (March 2023).

Related concepts include Inscription (Akrich) and Enacted Technology (Fountain). Discussion of these and further links can be found in the following posts:


And returning to the distinction between espoused theory and theory-in-use: in my post on the National Decision Model (May 2014) I also introduced the concept of theory-in-view, which (as I discovered more recently) is similar to Lolle Nauta's concept of the exemplary situation.



Richard Veryard, IT Implementation or Delivery? Thoughts on Assimilation, Accommodation and Maturity. Paper presented to the first IFIP WG 8.6 Working Conference, on the Diffusion and Adoption of Information Technology, Oslo, October 1995. 

Richard Veryard and Aidan Ward, Technology and Change (City University 2002)