Tuesday, July 14, 2020

Technology Mediating Relationships

In a May 2020 essay, @NaomiAKlein explains how Silicon Valley is exploiting the COVID19 crisis as an opportunity to reframe a long-standing vision of an app-driven, gig-fueled future. Until recently, Klein notes, this vision was being sold to us in the name of convenience, frictionlessness, and personalization. Today we are being told that these technologies are the only possible way to pandemic-proof our lives, the indispensable keys to keeping ourselves and our loved ones safe. Klein fears that this dubious promise will help to sweep away a raft of legitimate concerns about this technological vision.

In a subsequent interview with Katherine Viner, Klein emphasizes the importance of touch. In order to sell a touchless technology, touch has been diagnosed as the problem.

In his 1984 book, Albert Borgmann introduced the notion of the device paradigm. This means viewing technology exclusively as a device (or set of devices) that delivers a series of commodities, and evaluating the technical features and powers of such devices, without having any other perspective. A device is an artefact or instrument or tool or gadget or mechanism, which may be physical or conceptual (including hardware and software).

According to Borgmann, it is a general trend of technological development that mechanisms (devices) are increasingly hidden behind service interfaces. Technology is thus regarded as a means to an end, an instrument or contrivance, in German: Einrichtung. Technological progress increases the availability of a commodity or service, and at the same time pushes the actual device or mechanism into the background. Thus technology is either seen as a cluster of devices, or it isn't seen at all.

However, Klein suggests that COVID19 might have the opposite effect.

The virus has forced us to think about interdependencies and relationships. The first thing you are thinking about is: everything I touch, what has somebody else touched? The food I am eating, the package that was just delivered, the food on the shelves. These are connections that capitalism teaches us not to think about.

While Klein attributes this teaching to capitalism, where Borgmann and other followers of Heidegger would say technology, she appears to echo Borgmann's idea that we have a moral obligation not to settle mindlessly into the convenience that devices may offer us (via Stanford Encyclopedia).

Albert Borgmann, Technology and the Character of Contemporary Life: A philosophical inquiry (University of Chicago Press, 1984)

Naomi Klein, Screen New Deal (The Intercept, 8 May 2020). Reprinted as How big tech plans to profit from the pandemic (The Guardian, 13 May 2020)

Katherine Viner, Interview with Naomi Klein (The Guardian, 13 July 2020)

David Wood, Albert Borgmann on Taming Technology: An Interview (The Christian Century, 23 August 2003) pp. 22-25

Wikipedia: Technology and the Character of Contemporary Life

Stanford Encyclopedia of Philosophy: Phenomenological Approaches to Ethics and Information Technology - Technological Attitude

Sunday, July 12, 2020

Mapping out the entire world of objects

ImageNet is a large crowd-sourced database of coded images, widely used for machine learning. This database can be traced to an idea articulated by Fei-Fei Li in 2006: We’re going to map out the entire world of objects. In a blogpost on the Limitations of Machine Learning, I described this idea as naive optimism.

Such datasets raise both ethical and epistemological issues. One of the ethical problems thrown up by these image databases is that objects are sometimes also subjects. Bodies and body parts are depicted (often without consent) and labelled (sometimes offensively); people are objectified; and the objectification embedded in these datasets is then passed on to the algorithms that use them and learn from them. Crawford and Paglen argue convincingly that categorizing and classifying people is not just a technical process but a political act. And thanks to some great detective work by Vinay Prabhu and Abeba Birhane, MIT has withdrawn Tiny Images, another large image dataset widely used for machine learning.

But in this post, I'm going to focus on the epistemological and metaphysical issues - what constitutes the world, and how can we know about it. Li is quoted as saying Data will redefine how we think about models. The reverse should also be true, as I explain in my blogpost on the Co-Production of Data and Knowledge.

What exactly is meant by the phrase the entire world of objects and what would mapping this world really entail? Although I don't believe that philosophy is either necessary or sufficient to correct all of the patterns of sloppy thinking by computer scientists, even a casual reading of Wittgenstein, Quine and other 20th century philosophers might prompt people to question some simplistic assumptions about the relationships between Word and Object underpinning these projects.

The first problem with these image datasets is the assumption that images can be labelled according to the objects that are depicted in them. But as Prabhu and Birhane note, real-world images often contain multiple objects. Crawford and Paglen argue that images are laden with potential meanings, irresolvable questions, and contradictions and that ImageNet’s labels often compress and simplify images into deadpan banalities.

One photograph shows a dark-skinned toddler wearing tattered and dirty clothes and clutching a soot-stained doll. The child’s mouth is open. The image is completely devoid of context. Who is this child? Where are they? The photograph is simply labeled toy.
Crawford and Paglen

Implicit in the labelling of this photograph is some kind of ontological precedence - that the doll is more significant than the child. As for the emotional and physical state of the child, ImageNet doesn't seem to regard these states as objects at all. (There are other image databases that do attempt to code emotions - see my post on Affective Computing.)

Given that much of the Internet is funded by companies that want to sell us things, it would not be surprising if there is an ontological bias towards things that can be sold. (This is what the word everything means in the Everything Store.) So that might explain why ImageNet chooses to focus on the doll rather than the child. But similar images are also used to sell washing powder. Thus the commercially relevant label might equally have been dirt.

But not only do concepts themselves (such as toys and dirt) vary between different discourses and cultures (as explored by anthropologists such as Mary Douglas), the ontological precedence between concepts may vary. People from a different culture, or with a different mindset, will jump to different conclusions as to what is the main thing depicted in a given image.
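
To make the point about precedence concrete, here is a minimal Python sketch. The list of depicted things and the three precedence orderings are invented for illustration and are not taken from ImageNet or any real labelling scheme. It simply shows that a single-label scheme has to rank the things in a picture before it can name one of them.

```python
# Hypothetical example: the things an annotator might see in one photograph.
depicted = {"child", "doll", "dirt", "tattered clothes"}

# Three invented precedence orderings. None is drawn from any real dataset.
precedence_by_viewpoint = {
    "toy retailer": ["doll", "child", "tattered clothes", "dirt"],
    "detergent advertiser": ["dirt", "tattered clothes", "child", "doll"],
    "social worker": ["child", "tattered clothes", "dirt", "doll"],
}


def single_label(depicted, precedence):
    """Return the first item in the precedence list that appears in the image."""
    for candidate in precedence:
        if candidate in depicted:
            return candidate
    return None  # nothing this viewer cares about is depicted


labels = {
    viewpoint: single_label(depicted, order)
    for viewpoint, order in precedence_by_viewpoint.items()
}

for viewpoint, label in labels.items():
    print(f"{viewpoint}: {label}")
```

The image is the same in each case. Only the ordering changes, and with it the label.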

The American philosopher W.V.O. Quine argued that translation was indeterminate. If a rabbit runs past, and a speaker of an unknown language, Arunta, utters the word gavagai, we might guess that this word in Arunta corresponds to the word rabbit in English. But there are countless other things that the Arunta speaker might have been referring to. And although over time we may be able to eliminate some of these possibilities, we can never be sure we have correctly interpreted the meaning of the word gavagai. Quine called this the inscrutability of reference. Similar indeterminacy would seem to apply to our collection of images.

The second problem has to do with the nature of classification. I have talked about this in previous posts - for example on Algorithms and Governmentality - so I won't repeat all that here.

Instead, I want to jump to the third and final problem, arising from the phrase the entire world of objects - what does this really mean? How many objects are there in the entire world, and is it even a finite number? We can't count objects unless we can agree what counts as an object. What are the implications of what is included in everything and what is not included?

I occasionally run professional workshops in data modelling. One of the exercises I use is to display a photograph and ask the students to model all the objects they can see in the picture. Students who are new to modelling can always produce a simple model, while more advanced students can produce much more sophisticated models. There doesn't seem to be any limit to how many objects people can see in my picture.

ImageNet boasts 14 million images, but that doesn't seem a particularly large number from a big data perspective. For example, I guess there must be around a billion dogs in the world - so how many words and images do you need to represent a billion dogs?
Bruhl found some languages full of detail
Words that half mimic action; but
generalization is beyond them, a white dog is
not, let us say, a dog like a black dog.
Pound, Cantos XXVIII

Kate Crawford and Trevor Paglen, Excavating AI: The Politics of Images in Machine Learning Training Sets (19 September 2019)

Mary Douglas, Purity and Danger (1966)

Dave Gershgorn, The data that transformed AI research—and possibly the world (Quartz, 26 July 2017)

Vinay Uday Prabhu and Abeba Birhane, Large Image Datasets: A pyrrhic win for computer vision? (Preprint, 1 July 2020)

Katyanna Quach, MIT apologizes, permanently pulls offline huge dataset that taught AI systems to use racist, misogynistic slurs Top uni takes action after El Reg highlights concerns by academics (The Register, 1 July 2020)

Stanford Encyclopedia of Philosophy: Feminist Perspectives on Objectification, Quine on the Indeterminacy of Translation

Related posts: Co-Production of Data and Knowledge (November 2012), Have you got big data in your underwear (December 2014), Affective Computing (March 2019), Algorithms and Governmentality (July 2019), Limitations of Machine Learning (July 2020)

Saturday, July 4, 2020

Limitations of Machine Learning

In a recent discussion on Twitter prompted by some examples of erroneous thinking in Computing Science, I argued that you don't always need a philosophy degree to spot these errors. A thorough grounding in statistics would seem to settle some of them.

@DietrichEpp disagreed completely. If you want a machine to learn then you have to understand the difference between data and knowledge. Stats classes don’t normally cover this.

So there are at least two questions here. Firstly, how much do you really have to understand in order to build a machine? As I see it, getting a machine to do something (including learning) counts as engineering rather than science. Engineering requires two kinds of knowledge - practical knowledge (how to reliably, efficiently and safely produce a given outcome) and socio-ethical knowledge (whom shall the technology serve). Engineers are generally not expected to fully understand the scientific principles that underpin all the components, tools and design heuristics that they use, but they have a professional and ethical responsibility to have some awareness of the limitations of these tools and the potential consequences of their work.

In his book on Design Thinking, Peter Rowe links the concept of design heuristic to Gadamer's concept of enabling prejudice. Engineers would not be able to function without taking some things for granted.

So the second question is - which things can/should an engineer trust? Most computer engineers will be familiar with the phrase Garbage In Garbage Out, and this surely entails a professional scepticism about the quality of any input dataset. Meanwhile, statisticians are trained to recognize a variety of potential causes of bias. (Some of these are listed in the Wikipedia entry on statistical bias.) Most of the statistics courses I looked at on Coursera included material on inference. (Okay, I only looked at the first dozen or so, small sample.)

Looking for relevant material to support my position, I found some good comments by Ariel Guersenzvaig, reported by Derek du Preez.
Unbiased data is an oxymoron. Data is biased from the start. You have to choose categories in order to collect the data. Sometimes even if you don’t choose the categories, they are there ad hoc. Linguistics, sociologists and historians of technology can teach us that categories reveal a lot about the mind, about how people think about stuff, about society.
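
A small Python sketch may help to show what it means to say that categories come before data. The observations and both category schemes below are made up for illustration. The point is only that the same raw observations yield different datasets depending on which categories the collector has chosen.

```python
from collections import Counter

# Hypothetical raw observations: ages of eight survey respondents.
ages = [15, 17, 18, 24, 25, 39, 40, 64]


def scheme_a(age):
    """Invented scheme: a single cut-off at 18."""
    return "minor" if age < 18 else "adult"


def scheme_b(age):
    """Invented scheme: three marketing-style age bands."""
    if age < 25:
        return "youth"
    if age < 40:
        return "mid"
    return "older"


def collect(observations, categorize):
    """Record only the category of each observation, as a form might."""
    return Counter(categorize(x) for x in observations)


data_a = collect(ages, scheme_a)
data_b = collect(ages, scheme_b)

print(dict(data_a))
print(dict(data_b))
```

Once only the categories are stored, neither dataset can be converted into the other. The choice of scheme has already decided what can be learned from the data.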

And arriving too late for this Twitter discussion, two more stories of dataset bias were published in the last few days. Firstly, following an investigation by Vinay Prabhu and Abeba Birhane, MIT has withdrawn Tiny Images, a very large image dataset that has been widely used for machine learning, and asked researchers and developers to delete it. And secondly, FiveThirtyEight has published an excellent essay by Mimi Ọnụọha on the disconnect between data collection and meaningful change, arguing that it is impossible to collect enough data to convince people of structural racism.

Prabhu and Birhane detected significant quantities of obscene and offensively labelled material embedded in image datasets, which could easily teach a machine learning algorithm to deliver sexist or racist outcomes. They acknowledge the efforts made in the curation of image datasets, but insist that more could have been done, and will need to be done in future, to address some serious epistemological and ethical questions. With hindsight, it is possible to see the naive optimism of mapping out the entire world of objects in a rather different light.

Prabhu and Birhane mention Wittgenstein's remark in the Tractatus, ethics and aesthetics are one and the same. This thought brings me to the amazing work of Mimi Ọnụọha.
Classification.01 is a sculpture that consists of two neon brackets. When more than one viewer approaches to look at the piece, the brackets use a nearby camera to decide whether or not the two viewers have been classified as similar, according to a variety of algorithmic measures. The brackets only light up if the terms of classification have been met. The brackets do not share the code and the rationale behind the reason for the classification of the viewers. Just as with many of our technological systems, the viewers are left to determine on their own why they have been grouped, a lingering reminder that no matter how much our machines classify, ultimately classification is also a human process.

In summary, there are some critical questions about data and knowledge that affect the practice of machine learning, and some critical insights from artists and sociologists. As for philosophy, famous philosophers from Plato to Wittgenstein have spent 2500 years exploring a broad range of abstract ideas about the relationship between data and knowledge, so you can probably find a plausible argument to support any position you wish to adopt. So this is hardly going to provide any consistent guidance for machine learning.


Thanks to Jag Bhalla for drawing my attention to @BioengineerGM's article on accountability in models. So not just GIGO (Garbage-In-Garbage-Out) but also AIAO (Accountability-In-Accountability-Out).

Guru Madhavan, Do-It-Yourself Pandemic: It’s Time for Accountability in Models (Issues in Science and Technology, 1 July 2020)

Mimi Ọnụọha, When Proof Is Not Enough (FiveThirtyEight, 1 July 2020)

Vinay Uday Prabhu and Abeba Birhane, Large Image Datasets: A pyrrhic win for computer vision? (Preprint, 1 July 2020)

Derek du Preez, AI and ethics - ‘Unbiased data is an oxymoron’ (Diginomica, 31 October 2019)

Katyanna Quach, MIT apologizes, permanently pulls offline huge dataset that taught AI systems to use racist, misogynistic slurs Top uni takes action after El Reg highlights concerns by academics (The Register, 1 July 2020)

Peter Rowe, Design Thinking (MIT Press 1987)

Stanford Encyclopedia of Philosophy: Gadamer and the Positivity of Prejudice

Wikipedia: Algorithmic bias, All models are wrong, Bias (statistics), Garbage in garbage out

Further points and links in the following posts: Faithful Representation (August 2008), From Sedimented Principles to Enabling Prejudices (March 2013), Whom does the technology serve? (May 2019), Algorithms and Auditability (July 2019), Algorithms and Governmentality (July 2019), Naive Epistemology (July 2020), Mapping out the entire world of objects (July 2020)

Friday, July 3, 2020

Naive Epistemology

One of the things I learned from studying maths and philosophy is an appreciation of what things follow from what other things. This means identifying and understanding what assumptions are implicit in a given argument, and what axioms are required to establish a given proof.

So when I see or hear something that I disagree with, I feel the need to trace where the disagreement comes from - is there a difference in fact or value or something else? Am I missing some critical piece of knowledge or understanding, that might lead me to change my mind? And if I want to correct someone's error, is there some piece of knowledge or understanding that I can give them, that will bring them around to my way of thinking?

(By the way, this skill would seem important for teachers. If a child struggles with simple arithmetic, exactly which step in the process has the child failed to grasp? However, teachers don't always have time to do this.)

There is also an idea of the economy of argument. What is the minimum amount of knowledge or understanding that is needed in this context, and how can I avoid complicating the argument by bringing in a lot of other material that may be fascinating but not strictly relevant? (I acknowledge that I don't always follow this principle myself.) And when I'm wrong about something, how can other people help me see this without requiring me to wade through far more material than I have time for?

There was a thread on Twitter recently, prompted by some weak thinking by a certain computer scientist. @jennaburrell noted that computer science has never been very strong on epistemology – either recognizing that it implicitly has one, that there might be any other, or interrogating its weaknesses as a way of understanding the world.

Some people suggested that the solution involves philosophy.

I completely agree with Dietrich about the value of philosophy and other humanities in general. However, I felt it was overkill for addressing the specific weaknesses identified by Professor Burrell, as her argument against this particular fallacy didn't seem to require any non-STEM knowledge or understanding.

Of course, statistics is not the whole answer; but then neither is philosophy. I mentioned statistics as an example of a STEM discipline in which students should have the opportunity to unlearn naive epistemology; but of course any proper scientific discipline should include some understanding of scientific method. Although computing often calls itself a science, it is largely an engineering discipline; if you use the word methodology with computer people, they usually think you are talking about design methods. Social scientists (I believe Professor Burrell's PhD is in sociology) tend to have a much better understanding of research methodology.

And of course, it's not just epistemology but also ethics.

One of the problems with professional philosophy is that it can be quite compartmentalized. There are philosophers who promote themselves as experts on technology ethics, but their published papers don't reference any recent literature on the philosophy of science and technology, or reveal any deep understanding of the challenges faced by scientists and engineers.

So although there are undoubtedly good reasons for broader education in both directions, I'm sceptical about expecting clever people in one discipline to acquire a small but dangerous amount of expertise in some other discipline. I'm much more interested in promoting dialogue between disciplines. In his tribute to Steve Jobs, @jonahlehrer called this Consilience.

What set all of Steve Jobs’s companies apart ... was an insistence that computer scientists must work together with artists and designers—that the best ideas emerge from the intersection of technology and the humanities.
The final word should go to @abebab

Jonah Lehrer, Steve Jobs: “Technology Alone Is Not Enough” (New Yorker, 7 October 2011)

Related posts: From Convenience to Consilience - “Technology Alone Is Not Enough” (October 2011), The Habitual Vice of Epistemology (June 2019), Limitations of Machine Learning (July 2020), Mapping out the entire world of objects (July 2020)

Monday, June 29, 2020

Bold, Restless Experimentation

In his latest speech, invoking the spirit of Franklin Delano Roosevelt, Michael Gove calls for bold, restless experimentation.

Although one of Gove's best known pronouncements was his statement during the Brexit campaign that people in this country have had enough of experts ..., Fraser Nelson suggests he never intended this to refer to all experts: he was interrupted before he could specify which experts he meant.

Many of those who share Gove's enthusiasm for disruptive innovation also share his ambivalence about expertise. Joe McKendrick quotes Vala Afshar of DisrupTV: If the problem is unsolved, it means there are no experts.

Joe also quotes Michael Sikorsky of Robots and Pencils, who links talent, speed of decision and judgement, and talks about pushing as much of the decision rights as possible right to the edge of the organization. Meanwhile, Michael Gove also talks about diversifying the talent pool - not only a diversity of views but also a diversity of skills.

In some quarters, expertise means centralized intelligence - for example, clever people in Head Office. The problems with this model were identified by Harold Wilensky in his 1967 book on Organizational Intelligence, and explored more rigorously by David Alberts and his colleagues in CCRP, especially under the Power To The Edge banner.

Expertise also implies authority and permission; so rebellion against expertise can also take the form of permissionless innovation. Adam Thierer talks about the tinkering and continuous exploration that takes place at multiple levels, while Bernard Stiegler talks about disinhibition - a relaxation of constraints leading to systematic risk-taking.
Elevating individual talent over collective expertise is a risky enterprise. Malcolm Gladwell calls this the Talent Myth, while Stiegler calls it Madness. For further discussion and links, see my post Explaining Enron.

Michael Gove, The Privilege of Public Service (Ditchley Annual Lecture, 27 June 2020)

Henry Mance, Britain has had enough of experts, says Gove (Financial Times, 3 June 2016)

Fraser Nelson, Don't ask the experts (Spectator, 14 January 2017)

Bernard Stiegler, The Age of Disruption: Technology and Madness in Computational Capitalism (Polity Press, 2019). Review by John Reader (Postdigital Science and Education, 2019).

Adam Thierer, Permissionless Innovation (Mercatus Center, 2014/2016)

Related posts: Demise of the Superstar (August 2004), Power to the Edge (December 2005), Explaining Enron (January 2010), Enemies of Intelligence (May 2010), The Ethics of Disruption (August 2019)

Tuesday, January 28, 2020

The Algorithmic Child and the Anxious Parent

#OIILondonLecture An interesting lecture by @VickiNashOII of @oiioxford at @BritishAcademy_ this evening, entitled Connected cots, talking teddies and the rise of the algorithmic child.

Since the early days of the World Wide Web, people have been concerned about the risks to children. Initially, these were seen in terms of protecting children from unsuitable content and from contact with unsuitable strangers. Children also needed to be prevented from behaving inappropriately on the Internet.

In the days when a typical middle-class household had a single fixed computer in a downstairs room, it was relatively easy for parents to monitor their children's use of the Internet. But nowadays children in Western countries think themselves deprived if they don't have the latest smartphone, and even toddlers often have their own tablet computers. So much of the activity can be hidden in the bedroom, or even under the bedclothes after lights out.

Furthermore, connection to the Internet is not merely through computers, phones, tablets and games consoles, but also through chatbots and connected toys, as well as the Internet of Things. So there is increasing awareness of some additional threats to children, including privacy and security, and it is becoming increasingly difficult for parents to protect their children from all these threats. (Even confiscating the phones may not solve the problem: one resourceful Kentucky teenager managed to send messages from the family smartfridge.)

And as Dr Nash pointed out, it's no longer just about how children use the internet, but also how the internet uses children. Large-scale collection and use of data is not just being practised by the technology giants, but by an increasing number of consumer companies and other commercial enterprises. One of the most interesting developments here is the provision of surveillance tools to help parents monitor their children.

Parents are being told that good parenting means keeping your children safe, and keeping them safe means knowing where they are at all times, what they are doing, whom they are with, and so on. All thanks to various tracking apps that provide real-time information about your children's location and activity. And even when they are at home, asleep in their own beds, there are monitoring technologies to track their temperature or breathing, and alert the parents of any abnormal pattern.

Dr Nash argues that this expectation of constantly monitoring one's children contributes to a significant alteration in the parent-child relationship, and in our norms of parenthood. Furthermore, as children become teenagers, they will increasingly be monitoring themselves, in healthy or unhealthy ways. So how should the monitoring parents monitor the monitoring?

One of the problems with any surveillance technology is that it provides a single lens for viewing what is going on. Although this may be done with good intentions, and may often be beneficial, it is also selective in what it captures. It is so easy to fall into the fallacy of thinking that what is visible is important, and what is not visible is not important. Those aspects of a child's life and experience that can be captured by clever technology aren't necessarily those aspects that a parent should be paying most attention to.

Linda Geddes, Does sharing photos of your children on Facebook put them at risk? (The Guardian, 21 Sep 2014)

Victoria Nash, The Unpolitics of Child Protection (Oxford Internet Institute, 5 May 2013)

Victoria Nash, Connected toys: not just child’s play (Parent Info, May 2018)

Victoria Nash, Huw Davies and Allison Mishkin, Digital Safety in the Era of Connected Cots and Talking Teddies (Oxford Internet Institute, 25 June 2019)

Caitlin O'Kane, Teen goes viral for tweeting from LG smart fridge after mom confiscates all electronics (CBS News 14 August 2019)

Related posts: IOT is coming to town (December 2017), Shoshana Zuboff on Surveillance Capitalism (February 2019), Towards Chatbot Ethics (May 2019)

Thursday, November 7, 2019


Until the arrival of the motor car, the street belonged to humans and horses. The motor car was regarded as an interloper, and was generally blamed for collisions with pedestrians. Cities introduced speed limits and other safety measures to protect pedestrians from the motor car.

The motor industry fought back. Their goal was to shift the blame for collisions onto the foolish or foolhardy pedestrian, who had crossed the road in the wrong place at the wrong time, or showed insufficient respect to our new four-wheeled masters. A new crime was invented, known as jaywalking, and newspapers were encouraged to describe road accidents in these terms.

In March 2018, a middle-aged woman was killed by a self-driving car. This is thought to be the first recorded death by a fully autonomous vehicle. According to the US National Transportation Safety Board (NTSB), the vehicle failed to recognise her as a pedestrian because she was not at an obvious designated crossing. In other words, she was jaywalking.

As I've observed before, ethics professors like to introduce the Trolley Problem into the ethics of self-driving cars, often carrying out opinion surveys (whom shall the vehicle kill?) because these are easily published in peer-reviewed journals. A recent study at MIT found that many people thought law-abiding pedestrians had more right to safety than jaywalkers. Therefore, if faced with this unlikely choice, the car should kill the jaywalker and spare the others. You have been warned.

Jack Denton, Is the Trolley Problem Derailing the Ethics of Self-Driving Cars? (Pacific Standard 29 November 2018)

Aidan Lewis, Jaywalking: How the car industry outlawed crossing the road (BBC News, 12 February 2014)

Peter Norton, Street Rivals: Jaywalking and the Invention of the Motor Age Street (Technology and Culture, Vol 48, April 2007)

Katyanna Quach, Remember the Uber self-driving car that killed a woman crossing the street? The AI had no clue about jaywalkers (The Register, 6 November 2019)

Joseph Stromberg, The forgotten history of how automakers invented the crime of "jaywalking" (Vox, 4 November 2015)

Related posts: Whom Does The Technology Serve? (May 2019), The Game of Wits between Technologists and Ethics Professors (June 2019)