The old principle that knowledge is power has proven true in the online space in a way precedented only by the invention of print. Free knowledge is designed to be shareable, and to be shared online. As the custodians of one of its flagship projects, Wikipedia, we should always ask whether we can afford to disengage from the conversation about the power that is created with it. This reflection is especially relevant in a global movement whose collective actions carry enough weight to make a difference worldwide.
A match made in (online) heaven
The emergence of Wikipedia, Wikimedia Commons and similar projects wouldn’t have been possible without reliable, standardised mechanisms for controlling creative outputs by ceding rights to them. Creative Commons licences are a societally recognised tool for doing just that. Of course, Wikipedia could have gone with any tool for releasing rights. But given the wide diffusion of Creative Commons licences, and a community willing to translate, improve and ultimately use them, it makes sense that CC licensing is present on Wikipedia to such an extent.
It is a joyous feedback loop – Wikipedia has many contributors, so the uptake of CC licensing is massive. The images and materials licensed in this way then start functioning in other contexts and projects. The two ideas – a tool and a knowledge-building practice – are mutually reinforcing. No wonder there is significant personal overlap between the two communities of contributors.
How is all this relevant to AI? Free knowledge is an ecosystem with many dependencies, and a focus only on our immediate surroundings is too narrow. If we care about free knowledge, we cannot dismiss major developments related to CC licences. That includes the use of openly licensed photos to train facial recognition AI in ways that harm society.
The neutrality perspective
Many conversations on systemic problems affecting the Wikimedia movement and its endeavours are routinely scrutinised through the lens of project neutrality. There is no doubt that neutrality is – and should be – the cornerstone of creating encyclopaedic content. Even then, one must account for implicit bias and ensure a plurality of well-founded perspectives to get close to the ideal. But the very act of presenting true information is a political act. Enabling people to share in the sum of human knowledge is an ideological choice to let everyone share in the power. As noble as it is, it is sadly not a “neutral” one, as there are many empowered proponents of the view that it is better if others are not able to make well-informed choices.
Artificial Intelligence is a neutral tool in the sense that the meaning of an elegantly designed algorithm can only be revealed through its intentional use by a human being. AI can be an empowering instrument of welfare or a weaponised tool of warfare. And, for that matter, so can the data used to train it. We cannot, on the one hand, claim that sharing in the sum of human knowledge is our end game, with all the liberation that it brings, and, on the other, pretend that uses of that knowledge which disempower, disenfranchise and dehumanise us humans are completely beyond our area of reflection.
“AI can be an empowering instrument of welfare or a weaponised tool of warfare.”
The thing is, we already care about the topic. We ensure the transparency of our projects, we devise ethical frameworks and collective oversight for building and testing our own AI models, and we don’t collect any personal user data, providing a safe harbour from ever-present algorithmically powered surveillance. Extending our attention to the damaging effects of the proliferation of facial recognition is a natural consequence not only of what we say we believe in but also of what we do.
What to do?
When discussing whether we should care that facial recognition training is built upon openly licensed photos, I often sense that people worry that showing interest will inevitably lead to the conclusion that the commons are the problem. Obviously they are not – I am convinced that their contribution to the world is overwhelmingly net positive. The resolution of the “to care or not to care” dilemma is not to stop contributing to the commons.
What is more, the fact that we have a noble mission has to be weighed against our capacity to tackle the complexity of systemic problems. Surely, we cannot be held responsible for all the bad that happens online whenever anyone decides to use openly licensed content. The thing is that we have a unique chance to tackle the problem in a climate favourable to regulating it. It is neither too early – we already have evidence of good and bad applications of AI in almost every area of life – nor too late, as the European Union has only begun debating a proposal on how to regulate AI.
“Wikimedians have an opportunity to frame the vocabulary of AI use and propagate the idea that it is not the technology that is good or bad – it is how we choose to use it that should be regulated.”
The general objectives are pretty clear. Wikimedians have an opportunity to frame the vocabulary of AI use and propagate the idea that it is not the technology that is good or bad – it is how we choose to use it that should be regulated. Even for face or voice recognition we can imagine an array of beneficial uses, for example assisting those of us who are vision-impaired. The problem is that the revenue lies in developing surveillance techniques, and that the data we provide to the world is employed to produce toxic results.
Regulating problematic AI uses
The European Commission has proposed an AI Act that comes in handy, as it determines which AI uses are unacceptable and sets rules for those considered high-risk. The first category includes several cases where facial recognition may play a part: evaluating trustworthiness based on social behaviour or personality, creating a social score that is then used in contexts unrelated to the primary data, and real-time biometric identification. Public authorities would be forbidden from using AI in all these cases, in which facial recognition is an indispensable tool. We need to discuss whether this is enough, or whether there are other uses of facial recognition AI that should make their way onto this list.
There is also a dimension of the prohibition that concerns our projects directly. The data on the activity of our editors and contributors could also become part of a social scoring system or an evaluation of trustworthiness. A well-vetted list of forbidden AI uses therefore also helps protect our communities.
Is AI a singularity point for open licensing?
On a different level, perhaps it is time to review the open licensing system from the perspective of fundamental rights, and to discuss whether there is a case for introducing protection of people’s personal rights unrelated to copyright. Granted, this is not an easy problem to solve, as such rights can be abused, for example to shield public information. But if this is where the world is going with AI, we need to follow by brainstorming remedies. When copyright was going from strength to strength, we didn’t just advise people to manage somehow – we gave them tools to do so.
Caring about, and actively devising, ways to protect people from the misuse of imagery that portrays them, their families and their children may become important for the future uptake of open licensing. If people start associating open licensing with a gateway to abuse of their rights and a tool aiding an oppressive system of control, they won’t use it. They may also object to others using it when they are the photo subjects.
Currently it seems that the only way to somewhat protect photos of people against use in facial recognition training is to publish them under full copyright. Then, at least, one can point to damages from a copyright breach when requesting removal from such a database. Surely we wouldn’t want to bring about that result through our refusal to engage with the issue.