Equitable Governance in Multilingual Wikipedia

January 19, 2010, [MD]

I have been thinking about the issue of "equitable access to governance in globally distributed multilingual organizations" for several years now. That's a mouthful, but basically the idea is that you have organizations like Wikipedia, the KDE project (an open-source desktop) or iCommons. Although these kinds of organizations are often legally based in the US or another given country (KDE is in Germany), anyone is invited to contribute from around the world, and they often also have chapters based on language or nationality, where people can make huge contributions without speaking English (Wikipedia is the classic example).

However, even though each Wikipedia version has a lot of freedom to set its own policies, it is still part of a much larger movement. The discussions about movement-wide changes, about what the Wikimedia Foundation should work on, etc., are mostly conducted in English, whether on mailing lists, discussion pages, or at international conferences. This excludes many people from participating, but solving it is a very thorny problem. When I was invited to the Critical Point of View: WikiWars conference in Bangalore recently (see my tweets), I thought it would be a good chance for me to think more deeply and constructively about these issues, and see if I could come up with some suggestions.

This is doubly relevant because of the Peer2Peer University, which is planning to expand to offer courses in more languages. How can we ensure that the course-organizers and students in those courses still feel like empowered parts of the community?

With this presentation, I try to make two points. First, to make people aware that there is a problem. Second, to show that there is no simple solution, but that we might be able to mitigate the problem if we put our best minds to it.

I have embedded below the presentation I gave (15 minutes), with synced audio. You can also download only the audio MP3 (5MB). Video will be published later, and I'll link to it at that time. Below the presentation (mostly below the fold) I've also included my detailed "note of interest" for the conference. We are hoping to develop this further into a book chapter or part of a book chapter, so I would love any kind of feedback, more ideas, or criticism from people!

Equitable Governance in Multilingual Wikipedia

Detailed interest note for WikiWars 2010 by Stian Håklev

Wikipedia began as an English-language project, but rapidly became a very international project, with editions in 270+ languages. Initially, each language was hosted on the same website, but soon, individual wikis were set up for each language. Thus, each language fostered its own community, with all communication happening in the given language on a localized MediaWiki platform. When visiting different Wikipedias, it is interesting to see the different community norms and standards that have emerged, for example different criteria for featured articles, notability, gaining admin status, etc.

Thus we have communities around the world using Wikipedia in their own language, spending large amounts of time and energy contributing to articles, and also participating actively in governance and discussion in their community. However, the different Wikipedias are also impacted by decisions made at the central level, that cover all the different Wikipedias. To what extent is this discussion and governance process accessible to users who do not speak English well?

Methods of communication for central coordination of Wikipedia (Wikimedia)

Wikipedia is a part of the larger Wikimedia family of projects, which also includes projects like Wikiversity, Wiktionary, Commons, etc. Discussion for all of these projects, and across all languages, is supposed to happen on Meta. Meta is supposed to be multilingual; however, as of 2004, 85% of the material posted there was in English. I don’t have newer figures, but despite the fact that there are facilities for translating wiki pages on Meta-Wiki, and some projects and standards for promoting translation, anecdotal evidence suggests that translation is very spotty, if available at all.

As an example, if I enter the front page of http://meta.wikimedia.org, I will see a list of “Meta in many languages”. Let us assume that I am a Japanese user. Japanese is the second most popular Wikipedia in terms of hits, with 1.3 million hits per hour (as of Nov 30, 2009), and English knowledge among Japanese is often limited. I would choose to see Meta in Japanese, and I would receive a deceptively nice Japanese page with a number of links to ongoing discussions about new projects, requests for adminship, mailing lists and IRC channels, news, etc. However, with very few exceptions, when I click on any of these links, the resulting page is in English. Thus if I, as a Japanese user who has made large and important contributions to both the content and the governance of the Japanese Wikipedia, decide that I want to participate in the central coordination, I have no avenue except learning English.

As mentioned above, there are also IRC channels, and a large number of mailing lists, including foundation-l, which discusses the legal and organizational issues facing the Wikimedia Foundation, and wikitech-l, which is the main forum for coordinating development of the MediaWiki platform that all Wikipedias run on. These international mailing lists are all in English. There are also local mailing lists for each language, and in a few cases, regional mailing lists covering adjacent languages (there was a Scandinavian mailing list, for example, but it seems defunct).

Physical meetings

Every year, the Wikimedia Foundation hosts an international conference called Wikimania. This is a place for all the different projects to come together and discuss their experiences, learn from each other, and also discuss issues facing Wikipedia and other Wikimedia projects going forward. The official presentations at these conferences are usually all held in English, although sometimes there are parallel tracks in one other language, reflecting where the conference is hosted (for example, Spanish in Latin America, and Chinese in Taiwan).

The informal communications happening outside of the planned tracks can of course be in any language, but for monolinguals, that is not of much help. An added challenge is that listening comprehension and the ability to express ideas coherently in speech are typically much harder to attain than the ability to follow an argument in writing. This means that even people with medium English-language skills can feel excluded from these discussions, and may lack the self-confidence to speak up and contribute.

Problem statement

Thus, let me summarize. Wikipedia is a community project that strives to be democratic and participatory, both in the day-to-day minor decisions and management, and in the larger policy issues. It is also a thoroughly international community, with a large percentage of users who do not speak English, or do not speak English well. Currently those users are able to participate fully at the individual Wikipedia level, but global discussions around the project as a whole, whose impact may often be felt in the day-to-day operation of the individual Wikipedias as well, are almost entirely conducted in English.

Wikipedia management and governance is a complicated issue, and there are certainly many who speak English perfectly and still feel that participating is very difficult. It can be difficult to keep track of all the Talk pages and discussions going on at any point. Many cannot afford to travel to Wikimania, or just prefer to spend their time contributing to the actual contents rather than on long debates that might not lead to anything. Thus, this paper is not at all arguing that language skills present the only barrier to realizing full community participation. Much work should be done on discussing Wikipedia governance (and governance of large volunteer online projects) in general. However, language presents an additional obstacle that is almost insurmountable, even for the most savvy local administrator.

Solutions

It is not the intention of this paper to propose any finished solutions, because this is a thorny problem that can probably never be solved to a hundred percent satisfaction. Thus, the goals of this paper are two-fold: first, to raise awareness of this important issue, and second, to begin a discussion and propose some tentative ideas for ways to mitigate the issue.

To discuss this topic in a systematic fashion, it would be useful to have some kind of map of the different Wikipedia decision/participation processes. We would need to distinguish between day-to-day decisions (many of these occur at the individual wiki level, but there are others that may occur at a more global level, for example which new wiki to approve, who should be given admin status, etc.) and participation in the larger policy debate about Wikipedia’s direction (such as whether to shift to the Creative Commons license, whether to allow non-commercial pictures in Commons, whether to allow ads, etc.).

It would also be very helpful to understand more about the demographics of those active in the local Wikipedias. Some of the people active in non-English Wikipedias speak excellent English (for example many people editing the Hindi Wikipedia might be students and professors in the US, but even Norwegians living in Norway will typically have little problem speaking English). Another large group have some ability to read/write English, but might still have problems participating fully. A third group have extremely limited or no English skills at all.

Below I will try to sketch out the kinds of support that could be given to the last two groups, first for online communications, and then for conferences. This is very preliminary, and I am hoping for more feedback and ideas in this area.

Online communications

Crutches

For people who do have some knowledge of English, but still have problems fully participating, we might be able to provide some “crutches” to help them participate more fully. This is exactly my position in Chinese -- I speak Chinese, and can read/write it, but read very slowly, and sometimes have problems expressing myself, etc. What would these crutches look like -- what are the issues people with limited English skills face?

Reading comprehension is one, but the Internet has done a lot to alleviate this, with all kinds of online dictionaries, reading aids, etc. However, another problem that I constantly run into is that my reading speed in Chinese is so much lower than my reading speed in English. In addition, I am not able to “skim” in Chinese. Only when you lose this ability do you realize how much information you process habitually when scanning websites looking for interesting news, etc.

Thus, things that might help this category participate more actively online include providing summaries of discussions in different languages, translating headlines of debates, etc. If someone can quickly scan through a list in their own language to see if there are any issues that interest them, they can then spend the time it takes to work through the text in English on that topic, using the online tools at their disposal. But without such an “indexing” service, it can seem almost overwhelming to even begin reading through the page.

People with low or no English abilities

For these people, it’s not a question of helping them to read English, it’s a question of offering an alternative, i.e. translation. However, how can this possibly be done by an organization that depends on voluntary labor, given the number of languages to cover, and the amount of text that would need to be processed (much of which, it can seem, just consists of endless back-and-forths that might be less than inspiring for even a motivated translator)? There are a few possible ways of making this easier: we can reduce the number of languages, reduce the amount of text, or make the translation work itself less demanding.

Reducing number of languages

Although there might be thousands of languages in the world, and several hundred represented on Wikipedia, many of these languages are closely related. To take my own language Norwegian, most Norwegians would be quite comfortable reading in Norwegian, Swedish or Danish, but would only themselves be able to write in Norwegian.

If we could thus determine that these three languages never had to be translated into each other, then everybody could write in Norwegian, Swedish or Danish, and content in other languages would only need to be translated into one of these three. That already lightens the load considerably. There would need to be a lot of discussion about which languages this would be possible for. Some obvious candidates that come to mind are Lao and Thai, Malay and Indonesian, Serbian and Croatian, Finnish and Estonian.

The North Indian languages are written in different scripts, but using Google’s great transliteration engine you can now transliterate Bengali into the script used for Hindi (the words will still be somewhat different, but it’s possible that educated Indians would be comfortable with it). Perhaps Italian, Spanish and French (and Romanian?) could be folded into one, etc. As I said, this is something that would have to be determined in consultation with all the language communities, although one would have to try to avoid too much politics interfering, such as people who refuse to understand their neighbours’ language for nationalistic reasons.

One could probably also exclude a number of languages of countries where people are known to generally speak very good English (although we have to be very careful with this to avoid including only the elites). This is not a perfect or fair system, but we are not the UN and cannot hire thousands of translators. This is an attempt at ameliorating the existing sub-par situation.

Reducing amount of content

Even if there were people willing to contribute by translating a certain amount of material, how should they choose? How do we determine what is important and what is not? This is related to the above point on crutches, and figuring this out might make participation in governance easier for everyone. We could also be better at parceling out important discussion questions to the various linguistic communities for them to discuss among themselves, and then have a summary brought to the international stage (although this could also be criticized for not enabling direct access to the international discussion).

Conferences

Conferences are in many ways more difficult than online communications. However, there are certainly many possibilities for “crutches”, whether one gets the PowerPoint slides translated ahead of time and lets people download them (it’s much easier to follow along if you know what someone is talking about), or perhaps has someone typing summaries of what is being said on IRC in real time. There is a lot of room for experimentation here. A final possibility would be to better integrate all the different local and regional conferences that are held, by trying to have the results or summaries published and translated, having people blog in different languages about what is going on, etc.

Conclusion

I hope I have convinced you that this is a really important question, and that although it is very thorny and does not have an easy solution, there are many ways in which we could improve on the existing condition. However, it’s imperative that we do this in consultation with all those who need this facilitation, and don’t “assume” what their needs might be. I look forward to meeting everyone at the conference, and hearing all of your ideas and feedback!

Stian Håklev January 19, 2010 Toronto, Canada