Wikimedia Europe

Dimi Dimitrov

Data Act: A small step for databases, an even smaller step for the EU

Today, the European Commission has leaked its proposal for a “Data Act”, a piece of legislation that is supposed to include a revision of the Database Directive and the sui generis right for the creators of databases (SGR) it establishes. 

DSA: Parliament adopts position on EU Content Moderation Rules

Yesterday the European Parliament adopted its negotiation position on the EU’s new content moderation rules, the so-called Digital Services Act. The version of the text prepared by the Committee on Internal Market and Consumer Protection (IMCO) was mostly adopted, but a few amendments were added. 

The EU’s New Content Moderation Rules & Community Driven Platforms

The EU is working on universal rules on content moderation, the Digital Services Act (DSA). Its co-legislators, the European Parliament (EP) and the Council, have adopted their respective negotiating positions in breakneck time by Brussels standards. Next, they will negotiate a final version with each other.   
While the EP’s plenary vote on the DSA is up in January and amendments are still possible, most changes parliamentarians agreed upon will stay. We therefore feel that this is a good moment to look at what both houses are proposing and how it may reflect on community-driven projects like Wikipedia, Wikimedia Commons and Wikidata.

Editorial: The DSA debate after Haugen and before the trilogues

If the EU really wants to revamp the online world, it should start shaping legislation with the platform models in mind it likes to support, instead of just going after the ones it dislikes.

Whistleblowers are important. They often provide evidence and usually carry conversations forward. They might be able to open the debate to new audiences. I am grateful to Frances Haugen for having the courage to speak and the energy to do it over and over again across countries, as the discussion is indeed global.

On the other hand, the hearings didn’t reveal anything completely new; we didn’t learn anything we didn’t already know. We live in a time where the peer-to-peer internet has essentially been replaced by a network of platforms, which, in their overwhelming majority, are for-profit, data-collecting and indispensable in everyday life.

Meet “ClueBot NG”, an AI Tool to tackle Wikipedia vandalism

There are many bots on Wikipedia, computer-controlled “user accounts” that perform simple, repetitive, maintenance-related tasks. Most are simple, designed to fix typos or to detect vandalism using a list of blacklisted words. ClueBot NG uses a combination of different detection methods which use machine learning at their core.

Bots on Wikipedia

A bot (a common nickname for a software robot) is an automated tool that carries out repetitive and mundane tasks. Bots are used to maintain different Wikimedia projects across language versions. Bots are able to make edits very rapidly, but can disrupt Wikipedia if they are incorrectly designed or operated. False positives are an issue as well. For these reasons, a bot policy has been developed. There are currently 2,534 bot tasks approved for use on the English Wikipedia; however, not all approved tasks involve actively carrying out edits. Bots will leave messages on user talk pages if the action that the bot has carried out is of interest to that editor. There are 323 bots flagged with the “bot” flag right now (and over 400 former bots) on English Wikipedia. On Bulgarian Wikipedia, a much smaller language version, there are currently 106 bot accounts, but only some of them are active. Projects by smaller communities sometimes need to rely more on machines for page maintenance.

Wikimedia Projects & AI: Designing a “Section Recommendation” tool without reinforcing biases

There is an idea to use a “section recommendation” feature to help editors write articles by suggesting possible sections to be added. But it is possible that its recommendations inadvertently increase gender bias. Here’s how we could deal with it.

Wikimedia Projects & AI Tools: Vandalism Detection

There is a machine learning service available to interested Wikimedia projects and communities called ORES. It aims to recognise if an edit, for instance on Wikipedia, is damaging or done in good faith. Of course, false predictions cannot be avoided and thus remain a major risk. Here’s how we try to handle it.  

DSA in IMCO: Three amendments we like and one that surprised us

Just before the summer recess, the European Parliament’s Internal Market and Consumer Protection committee released over 1300 pages of amendments to the EU’s foremost content moderation law. We took the summer to delve into the suggestions and are ready to kick off the new Parliamentary season by sharing some thoughts on them. Our main focus remains on how responsible communities can continue to be in control of online projects like Wikipedia, Wikimedia Commons and Wikidata.

1. The Greens/EFA on “manifestly illegal content”

AM 691 by Alexandra Geese on behalf of the Greens/EFA Group

Article 2 – paragraph 1 – point g a (new)

‘manifestly illegal content’ means any information which has been subject of a specific ruling by a court or administrative authority of a Member State or where it is evident to a layperson, without any substantive analysis, that the content is not in compliance with Union law or the law of a Member State;

Almost any content moderation system will require editors or service providers to assess content and make ad-hoc decisions on whether something is illegal and therefore needs to be removed or not. Of course, things aren’t always black-and-white and sometimes it takes a while to make the right decision, like with leaked images of Putin’s Palace. Other times it is immediately clear that something is an infringement, like a verbatim copy of a hit song, for instance. In order to recognise these differences, the DSA rightfully uses the term “manifestly illegal”, but it fails to actually give a definition thereof. We agree with Alexandra Geese and the Greens/EFA Group that the wording of Recital 47 should make it into the definitions.

Data Governance Act: Good Intentions, Bad Definitions

The European Commission wants more European data (public, private and personal) to be shared for the purposes of innovation, research and business. It also wants to avoid a system where only a few large platforms control all the data. It thus wants to create mechanisms and tools to get there. That’s commendable! What the Commission proposes in the Data Governance Act (DGA), though, is at times very unclear.

Here is a breakdown of the European Commission proposals by sector, peppered with our take on some relevant aspects and support for some European Parliament and Council amendments. 

Public Sector Data

The DGA creates a mechanism for re-using public sector data that is protected (e.g. by privacy rules, statistical confidentiality or IP). Public sector bodies are to establish secure environments where data can be mined within the institution. Anonymised data could be provided outside of the institution if the re-use can’t happen within its infrastructure.

Takedown Notices and Community Content Moderation: Wikimedia’s Latest Transparency Report

In the second half of 2020 the Wikimedia Foundation received 380 requests for content alteration and takedown. Two were granted. This is because our communities do an outstanding job in moderating the sites. Something the Digital Services Act negotiators should probably have in mind.

See the organisational chart in full here

Wikipedia is a top-10 website globally that anyone can edit and upload content to. Its sister projects host millions of files uploaded by users. Yet, all these projects together triggered only 380 notices. How in the world is this possible?
