
Sunday, March 18, 2018

Security is downstream from strategy

Following her latest revelations about the misuse of personal data obtained from Facebook, @carolecadwalla has drawn a response from Alex Stamos, Facebook's Chief Security Officer.

So let's take a look at some of his hand-wringing Tweets.

I'm sure many security professionals would sympathize with this. Nobody listens to me. Strategy and innovation surge ahead, and security is always an afterthought.

According to his LinkedIn profile, Stamos joined Facebook in June 2015. Before that he had been Chief Security Officer at Yahoo!, which suffered a major breach on his watch in late 2014, affecting over 500 million user accounts. So perhaps a mere 50 million Facebook users having their data used for nefarious purposes doesn't really count as much of a breach in his book.

In a series of tweets he later deleted, Stamos argued that the whole problem was caused by the use of an API that everyone should have known about, because it was well-documented. As if his job was only to control the undocumented stuff.
Or as Andrew Keane Woods glosses the matter, "Don’t worry everyone, Cambridge Analytica didn’t steal the data; we were giving it out". By Monday night, Stamos was reported to be on his way out of Facebook.

In one of her articles, Carole Cadwalladr quotes the Breitbart doctrine:
"politics is downstream from culture, so to change politics you need to change culture"
And culture eats strategy. And security is downstream from everything else. So much, then, for "by design and by default".







Carole Cadwalladr ‘I made Steve Bannon’s psychological warfare tool’: meet the data war whistleblower (Observer, 18 Mar 2018) via @BiellaColeman

Carole Cadwalladr and Emma Graham-Harrison, How Cambridge Analytica turned Facebook ‘likes’ into a lucrative political tool (Guardian, 17 Mar 2018)

Jessica Elgot and Alex Hern, No 10 'very concerned' over Facebook data breach by Cambridge Analytica (Guardian, 19 Mar 2018)

Hannes Grassegger and Mikael Krogerus, The Data That Turned the World Upside Down (Motherboard, 28 Jan 2017) via @BiellaColeman

Justin Hendrix, Follow-Up Questions For Facebook, Cambridge Analytica and Trump Campaign on Massive Breach (Just Security, 17 March 2018)

Casey Johnston, Cambridge Analytica's leak shouldn't surprise you, but it should scare you (The Outline, 19 March 2018)

Nicole Perlroth, Sheera Frenkel and Scott Shane, Facebook Exit Hints at Dissent on Handling of Russian Trolls (New York Times, 19 March 2018)

Mattathias Schwartz, Facebook failed to protect 30 million users from having their data harvested by Trump campaign affiliate (The Intercept, 30 March 2017)

Andrew Keane Woods, The Cambridge Analytica-Facebook Debacle: A Legal Primer (Lawfare, 20 March 2018) via BoingBoing


Wikipedia: Yahoo data breaches

Related posts: Making the World more Open and Connected (March 2018), Ethical Communication in a Digital Age (November 2018), The Future of Political Campaigning (November 2018)


Updated 20 March 2018 with new developments and additional commentary

Thursday, November 3, 2016

Pay as you Share

Announced and rapidly withdrawn, Admiral's proposed collaboration with Facebook was supposed to give drivers a discount on their car insurance premiums if their Facebook posts indicated the right kind of personality. According to some reports, the idea was that people who were reckless with punctuation (too many exclamation marks, not enough full stops) might also be reckless in their driving habits.

The punctuation example is probably a red herring. The analysis of personality will undoubtedly draw on much richer signals than mere punctuation: Facebook is capable of far more sophisticated analysis, as well as selling data to other organizations for the same purpose.

For example, a Korean study in 2013 found that Facebook activities had predictive power in distinguishing depressed and nondepressed individuals. However, Facebook may not wish to draw too much public attention to such capabilities. (There are some important ethical issues in the use of algorithms to predict mental health issues, for example in recruitment screening, discussed at length by Cathy O'Neil.)

Meanwhile, insurance companies will wish to use any information and insight they can get their hands on to try to calculate risk more accurately. People may consent to sharing their data if they feel they will benefit personally, or if they are unaware of the possible uses and implications, but that could simply result in discrimination against the people who refuse to share their data. So privacy campaigners may not be reassured by the fact that this particular collaboration has been withdrawn.



Cathy O'Neil, How algorithms rule our working lives (Guardian, 1 Sept 2016)

Sungkyu Park et al, Activities on Facebook Reveal the Depressive State of Users (J Med Internet Res. 2013 Oct; 15(10): e217)

Graham Ruddick, Admiral to price car insurance based on Facebook posts (Guardian, 2 November 2016, 00.01 GMT)

Graham Ruddick, Facebook forces Admiral to pull plan to price car insurance based on posts (Guardian, 2 November 2016, 18.41 GMT)


Related posts

Pay as you drive (Oct 2006, June 2008, June 2009)
Weapons of Math Destruction (Oct 2016)
Insurance and the Veil of Ignorance (Feb 2019)

Monday, October 31, 2016

The Transparency of Algorithms

Algorithms have been getting a bad press lately, what with Cathy O'Neil's book and Zeynep Tufekci's TED talk. Now the German Chancellor, Angela Merkel, has weighed in, calling for major Internet firms (Facebook, Google and others) to make their algorithms more transparent.

There are two main areas of political concern. The first (raised by Mrs Merkel) is the control of the news agenda. Politicians often worry about the role of the media in the political system when people only pick up the news that fits their own point of view, but this is hardly a new phenomenon. Even in the days before the Internet, few people read more than one newspaper, and most people preferred the newspapers that confirmed their own prejudices. Furthermore, recent studies show that even when you give different people exactly the same information, they will interpret it in ways that reinforce their previous beliefs. So you can't blame the whole Filter Bubble thing on Facebook and Google.

But they undoubtedly contribute further to the distortion. People get a huge amount of information via Facebook, and Facebook systematically edits out the uncomfortable stuff. It aroused particular controversy recently when its algorithms decided to censor a classic news photograph from the Vietnam war.

Update: Further criticism from Tufekci and others immediately following the 2016 US Election


The second area of concern has to do with the use of algorithms to make critical decisions about people's lives. The EU regards this as (among other things) a data protection issue, and privacy activists are hoping for provisions within the new General Data Protection Regulation (GDPR) that will confer a "right to an explanation" upon data subjects. In other words, when people are sent to prison based on an algorithm, or denied a job or health insurance, it seems reasonable to allow them to know what criteria these algorithmic decisions were based on.

Reasonable but not necessarily easy. Many of these algorithms are not coded in the old-fashioned way, but developed using machine learning. So the data scientists and programmers responsible for creating the algorithm may not themselves know exactly what the criteria are. Machine learning is basically a form of inductive reasoning, using data about the past to predict the future. As Hume put it, this assumes that “instances of which we have had no experience resemble those of which we have had experience”.
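
To make the difficulty concrete, here is a minimal sketch (synthetic data, invented feature names, nothing to do with any real system) of the kind of "explanation" that can be extracted from a simple, interpretable model: rank the features by their contribution to one individual's score. For the more opaque machine-learned models used in practice, even this crude account is not directly available.

```python
# Illustrative sketch only: a toy credit-decision model on synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
features = ["income", "age", "postcode_risk", "prior_defaults"]  # invented names

# Hypothetical training data: 1000 applicants with binary approve/decline labels.
X = rng.normal(size=(1000, len(features)))
y = (X[:, 0] - 2 * X[:, 3] + rng.normal(scale=0.5, size=1000) > 0).astype(int)

model = LogisticRegression().fit(X, y)

def explain(applicant):
    """Rank features by their contribution to this applicant's score."""
    contributions = model.coef_[0] * applicant
    order = np.argsort(-np.abs(contributions))
    return [(features[i], round(float(contributions[i]), 3)) for i in order]

print(explain(X[0]))  # which features dominated this particular decision
```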

In a Vanity Fair panel discussion entitled “What Are They Thinking? Man Meets Machine,” a young black woman tried unsuccessfully to explain the problem of induction and biased reasoning to Sebastian Thrun, formerly head of Google X.
At the end of the panel on artificial intelligence, a young black woman asked Thrun whether bias in machine learning “could perpetuate structural inequality at a velocity much greater than perhaps humans can.” She offered the example of criminal justice, where “you have a machine learning tool that can identify criminals, and criminals may disproportionately be black because of other issues that have nothing to do with the intrinsic nature of these people, so the machine learns that black people are criminals, and that’s not necessarily the outcome that I think we want.”
In his reply, Thrun made it sound like her concern was one about political correctness, not unconscious bias. “Statistically what the machines do pick up are patterns and sometimes we don’t like these patterns. Sometimes they’re not politically correct,” Thrun said. “When we apply machine learning methods sometimes the truth we learn really surprises us, to be honest, and I think it’s good to have a dialogue about this.”

In other words, Thrun assumed that whatever the machine spoke was Truth, and he wasn't willing to acknowledge the possibility that the machine might latch onto false patterns. Even if the algorithm is correct, that doesn't remove the need for transparency; and if there is the slightest possibility that the algorithm might be wrong, the need for transparency is all the greater. And the evidence is that some of these algorithms are grossly wrong.


In this post, I've talked about two of the main concerns about algorithms - firstly the news agenda filter bubble, and secondly the critical decisions affecting individuals. In both cases, people are easily misled by the apparent objectivity of the algorithm, and are often willing to act as if the algorithm is somehow above human error and human criticism. Of course algorithms and machine learning are useful tools, but an illusion of infallibility is dangerous and ethically problematic.



Rory Cellan-Jones, Was it Facebook 'wot won it'? (BBC News, 10 November 2016)

Ethan Chiel, EU citizens might get a ‘right to explanation’ about the decisions algorithms make (5 July 2016)

Kate Connolly, Angela Merkel: internet search engines are 'distorting perception' (Guardian, 27 October 2016)

Bryce Goodman, Seth Flaxman, European Union regulations on algorithmic decision-making and a "right to explanation" (presented at 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016), New York, NY)

Mike Masnick, Activists Cheer On EU's 'Right To An Explanation' For Algorithmic Decisions, But How Will It Work When There's Nothing To Explain? (Tech Dirt, 8 July 2016)

Fabian Reinbold, Warum Merkel an die Algorithmen will [Why Merkel wants to get at the algorithms] (Spiegel, 26 October 2016)

Nitasha Tiku, At Vanity Fair’s Festival, Tech Can’t Stop Talking About Trump (BuzzFeed, 24 October 2016) HT @noahmccormack

Julia Carrie Wong, Mark Zuckerberg accused of abusing power after Facebook deletes 'napalm girl' post (Guardian, 9 September 2016)

New MIT technique reveals the basis for machine-learning systems’ hidden decisions (Kurzweil News, 31 October 2016) HT @jhagel

Video: When Man Meets Machine (Vanity Fair, 19 October 2016)

See Also
The Problem of Induction (Stanford Encyclopedia of Philosophy, Wikipedia)


Related Posts
The Shelf-Life of Algorithms (October 2016)
Weapons of Math Destruction (October 2016)

Updated 10 November 2016

Tuesday, November 27, 2007

Shakespeare on Identity Theft

On the Loss of Two CDs by Her Majesty's Revenue and Customs containing the Records of 25 Million Taxpayers and their Children.

Shall I compare thee to a string of digits?
Thou art more personal and more private.
Rough Humphreys doth quiz the Darling on Today,
And Gordon's lease hath all too short a date.
Sometime too close the eye of Google shines,
And oft is gold from banking accounts skimmed;
And every mother’s maiden name declines,
By chance, or nature's changing course untrimmed.
But thy perfect database shall not leak
Nor lose possession of that CD they sent;
Nor shall the hacker spam and phish and phreak,
When with eternal ID card thou went,
So long as cars have chips and streets have CCTV,
So long lives your identity, and this gives life to thee.


Sources: BBC News, The Register, Robin Wilton, Into the Machine.

Saturday, March 18, 2006

Network Privacy 2

Following on from my previous post on Network Privacy.

Privacy and data protection are primarily understood in terms of facts about one person. But most of the facts we are really interested in (gossip, political scandal, dastardly deeds and worse) involve more than one person.
This is particularly true if we are rigorous about including provenance. An allegation against person A by person B is a fact about B as well as a fact about A. B's credibility (and any other allegations made by B, as well as links between B and any other people making allegations against A) may be relevant to the veracity of the allegation.

(If someone made an unfounded allegation about me, I should perhaps feel slightly more comfortable if this was stored in some database as an allegation, with a defined provenance, rather than as unvarnished fact or vague probability. And I should want anyone reading the allegation to be automatically presented with my refutation as well. See my post on Google and Spin, which discusses the Prince Charles approach to news management.)

Why are we more interested in facts involving two or more people? One reason is that they are relevant to trust. If a politician has failed to disclose a loan, this may be relevant to his/her public duties. This is where conflict arises between privacy and the public interest.

Where does this leave Prince Charles and his diaries? The relationship between royalty and the newspapers has often been uncomfortable. In 1908, Kaiser Wilhelm II of Germany unwisely gave an interview to the London Daily Telegraph, in which he liberally insulted half the people of Europe. Surely the people (vox populi and all that) have a right to know if the Kaiser is an ass?

Network Privacy

In his post on Social Cartography - Mapping the Electorate, Scribe reminds us that it's not enough to have privacy and data protection at the individual level. We also need to consider the privacy of relationships between individuals.

There are many concerns about data protection and privacy at the individual level. (In his recent post on the Status of Privacy in the UK, Robin Wilton points out that Prince Charles used arguments based on confidentiality and copyright to protect his diaries, presumably because of a lack of adequate privacy legislation.)

But if we think about interpersonal privacy, this becomes much more complex, and raises some serious ontological and practical issues that privacy campaigners don't seem to be addressing. So I thought it might be useful to cross-post a few notes here.

Let's start with an incident that might be regarded as an example of breached privacy. John Major, former UK prime minister, was embarrassed by the publication of an autobiography by fellow (hrm hrm) politician Edwina Currie, in which she revealed details of a long-standing affair between them. His public response was ungracious and ungentlemanly. [BBC News, September 2002]

Privacy means that some data subject has some rights over some data.
  • What can the data subject do with the data? (e.g. publish, hide, preserve, alter, destroy)
  • What can other agents NOT do with the data? (e.g. publish, hide, preserve, alter, destroy)
  • What recompense is the data subject entitled to, in the event of any accidental or deliberate breach of these rights?
Data protection implies a set of mechanisms to support the rights of the data subject, to limit the actions of other agents, and to resolve any disputes. This raises a number of complex issues.

Ownership: Who ‘owns’ the data? Does a company own the data it has collected about a person? Does a person have any ownership rights over his/her ‘own’ data? What data (if any) are governed by the principles of data protection, and what data are not so governed?
Identity: There must be some reliable mechanism for matching the identity of the data subject with the identity referenced by the data. Furthermore, this mechanism should not itself represent an invasion of privacy.
Ontology: Many types of data reference multiple individuals. For example, data about a secret relationship between two individuals can be understood as belonging to the pair (which is a composite data subject). However, the very existence of this pair may be part of the secret.
Collaboration: If secret data belong collectively to multiple individuals, then any legitimate action over such data may require a collaboration between them. Of course, any individual named as a party to a secret relationship may seek individual recompense. It is not always clear what rights (if any) an individual has when details of a secret relationship are published unilaterally by one party.
Fiction / Libel: Reports of a secret relationship may sometimes be fabricated. Standing up for one's rights against libel or slander may involve reference to a pairing that was only brought into being by the libel.

Note - these issues apply to commercial relationships between organizations, as well as to sexual relationships between consenting adults.
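
To illustrate the ontology and collaboration points, here is a toy sketch (invented names and rules, far cruder than any real data-protection regime) in which a fact can have a composite data subject, carries its provenance, and can only be published with the consent of every person it names.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Fact:
    content: str
    subjects: frozenset   # every person the fact is "about" (possibly a composite subject)
    source: str           # provenance: who asserts the fact

@dataclass
class Register:
    facts: list = field(default_factory=list)

    def publish(self, fact: Fact, consenting: set) -> bool:
        # Unilateral publication is refused: every named subject must consent.
        if fact.subjects <= consenting:
            self.facts.append(fact)
            return True
        return False

register = Register()
affair = Fact("long-standing affair", frozenset({"Major", "Currie"}), source="Currie")
print(register.publish(affair, consenting={"Currie"}))           # False: one party cannot act alone
print(register.publish(affair, consenting={"Currie", "Major"}))  # True: collaboration required
```

Even this toy runs straight into the Fiction / Libel problem: the pair {Major, Currie} only comes into existence as a data subject once someone asserts the fact.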


Tuesday, November 29, 2005

Ambiguity

Alex Bosworth has just posted an excellent piece on Trust, Morality and Software Services, which cites some of the borderline ethical practices of (and potential threats posed by) leading internet and media companies - mentioning Friendster, Google, Microsoft, Sony, TiVo and Yahoo, among others.

In some cases, the threats come from ill-considered invasions of user rights by the company itself. In other cases, the threats may come simply from the accumulation of new forms of information, which become subject to official and unofficial snooping.

Some threats are pretty unambiguous, to the extent that we can identify their sources as hostile and malicious, bent on outcomes that are clearly criminal or worse. But it is the ambiguous threats (sometimes from companies that can afford to spend millions on legal fees and political lobbying) that may be more difficult to manage.

If a hacker tries to syphon a few dollars from my bank account this is clearly a criminal act. But if the bank itself syphons a few dollars from my account it is probably going to be hard for me to get this classified as a criminal act. (The bank can usually cite some vague service charge somewhere in the small print.) But the effect on my bank balance is much the same.

We often end up with a kind of shallow commodity trust. We accept the products and services of big companies because they are hard to avoid. But we have to remain wary of them.

Bosworth argues that "the only way to restore or create trust is by over time and repetition creating a pattern of ethical decisions". Yes, and this pattern must be clear and visible. Deep trust requires transparency and unambiguity. The ethical challenge for large companies is to maintain a strong trustworthy position across a diverse and complex marketplace.


Previous posts:

Unambiguous Threat (September 2005)
Intrusion and Immersion (November 2005)

Wednesday, September 7, 2005

Noticing Data Misuse

On information leakage, Bruce Schneier comments:

It's easy to say "we haven't seen any cases of fraud using our information," because there's rarely a way to tell where information comes from. ... Everyone thinks their data practices are good because there have never been any documented abuses stemming from leaks of their data and everyone is fooling themselves.

Many years ago, when I worked on some information systems for direct mail marketing, it was standard practice to include fictional entries in a mailing list, which allowed for the rapid detection of abuse. In this context, abuse generally means using the mailing list for a purpose not authorized by the mailing list owner/administrator, and/or without proper payment. The data owner has an incentive to control abuse, because abuse degrades the value of the data to the owner. The relationship between the data owner and the data user is one of provisional trust, with retrospective sanctions whenever abuses of trust come to light. This relationship works because of the detection mechanism. And the mechanism works because the data user cannot discriminate between the fictional entries and the real ones.

So why doesn't this work for the current spate of privacy violations and identity theft vulnerabilities, assuming that the fictional entries are properly constructed? There are some technical considerations and some social considerations (including regulation), but the value of such a mechanism should be obvious.
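
For concreteness, here is a toy sketch of the seeding mechanism described above (address format and details invented for illustration): plant fictional entries that look like ordinary subscribers, then treat any mail arriving at those addresses as evidence of unauthorised reuse. Only the list owner keeps the separate record of which entries are fictional, so the data user cannot tell them apart from real ones.

```python
import secrets

FIRST = ["anna", "bob", "carlos", "dee", "erik"]
LAST = ["smith", "jones", "osei", "tanaka", "novak"]

def seed_list(mailing_list, domain="example.com", n_canaries=3):
    """Return the list plus fictional entries indistinguishable from real subscribers."""
    canaries = {
        f"{secrets.choice(FIRST)}.{secrets.choice(LAST)}{secrets.randbelow(100)}@{domain}"
        for _ in range(n_canaries)
    }
    return sorted(set(mailing_list) | canaries), canaries

def detect_abuse(recipients, canaries):
    """Mail addressed to any fictional entry implies the list was used without authorisation."""
    return sorted(set(recipients) & canaries)

addresses, canaries = seed_list(["alice@example.org", "bob@example.org"])
# A third party mails the whole list verbatim; the owner monitors the fictional inboxes.
print(detect_abuse(addresses, canaries))
```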


Thursday, July 8, 2004

Security Threats

I have received several copies of an email inviting me to download some Big Brother software, in order to keep tabs on my loved ones. (Is my spouse cheating online? Are my kids talking to dangerous people on instant messenger?)

At one level, this product claims to provide me with a mechanism to invade another person's privacy - in other words to breach the security of some system. It is therefore selling (or at least promising) a security threat. At another level, it is sold as a way of protecting me and my family from various threats - spouse or kids innocently (or not so innocently) having contacts with dubious characters via the internet. There is an interesting tension between these two levels.

But perhaps the real implied danger (apart from running up excessive phone/ISP bills) is when the contacts cease to be mediated by the Internet - e.g. online cheating leads to offline cheating. Online cheating (whatever that may be) becomes not the primary offence/risk, but a clue towards some other offence/risk.

This illustrates a general point -- that there is a temptation (encouraged by technology) to measure and monitor what is easy to measure and monitor, even if this provides at best an indirect indication of what's really at issue.

The general point applied to trust and security is that security monitoring typically measures the wrong things. This may start with a valid observation, that there is a close correlation between X (which is the real threat) and Y (which is easy to measure). So by measuring Y, we get an indication of X. But this is vulnerable in two ways.

Firstly, the fact that X-threats are monitored via Y may leak out and become public knowledge - and therefore useless. Secondly, a determined investigator or hacker may be able to infer an internal connection between X and Y by observing (and testing) the behaviour of the system from the outside. The forced coupling between X and Y represents a simplification in behaviour, a reduction in requisite variety.
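
As a crude illustration of the second vulnerability (all numbers invented): a monitor that alerts purely on the proxy Y is blind to an attacker who has learned which proxy is watched and keeps it quiet.

```python
def monitor(y_proxy, threshold=10):
    """Alert purely on the easy-to-measure proxy Y, not on the real threat X."""
    return y_proxy > threshold

attacks = [
    {"name": "naive attack",    "x_threat": True, "y_proxy": 50},  # trips the proxy
    {"name": "informed attack", "x_threat": True, "y_proxy": 2},   # stays under it
]

for attack in attacks:
    alerted = monitor(attack["y_proxy"])
    print(f'{attack["name"]}: threat={attack["x_threat"]}, alert={alerted}')
# Both attacks carry the real threat X; only the naive one is detected.
```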

Note that indirect measurement is commonplace in quality management systems, as long as you have appropriate mechanisms to calibrate and control the metrics. Among other things, an indirect measurement may give earlier warning of an impending problem than waiting for the direct measurement. But the reasonable precautions that may be necessary and sufficient for quality management systems are grossly inadequate for security systems.

And this is not just a technological point, but a sociological one. People generally use weak and unreliable signals to make significant trust/mistrust decisions, and may flip catastrophically from blind trust to unremitting suspicion. Even in relation to their loved ones (as Shakespeare discovered).

Tuesday, May 18, 2004

Security and Warrants

In his latest newsletter, Bruce Schneier argues for the warrant as an essential (sociopolitical) control over certain security mechanisms (such as electronic surveillance).



Certain acts (for example, search or surveillance) are “security-charged”. By this I mean that these acts alter the geometry of risk for affected stakeholders. These acts are permitted under rules that are supposed to minimize the invasion of privacy while maximizing the effectiveness of crime prevention and criminal prosecution. Some of these acts (such as wire-tapping or domestic search) may require a warrant. A warrant is the authority to perform some security-charged act, granted by an independent body (such as a magistrate).



Schneier argues (and I agree with him) that if old methods of search and surveillance require a warrant, then new technological methods should also require a warrant. In other words, the rules should be expressed in technologically neutral terms, should not have technological loopholes, and should not be constantly lagging behind technological innovation.



But I have a more general concern about these security-charged acts. There is a need for a properly constituted governance function, which is much broader in its powers than simply granting or denying a warrant in a particular instance. Governance has to do with setting objectives and priorities, making appropriate judgements about security trade-offs, and deciding where to allocate security resources to achieve the best results with minimum social and economic cost.