Research:Newcomer quality

This page documents a completed research project.


Recent research suggests that Wikipedia has been experiencing a decline in active editors since 2007 and attributes the decline to a dramatically reduced rate of newcomer retention [1][2]. This decline is concerning because active editors are the lifeblood of Wikipedia. Without contributors, the encyclopedia would stagnate or, worse, its articles might lose value by succumbing to vandals.

In order to determine the cause of the decline and identify possible solutions, the Wikimedia Foundation organized the 2011 Wikimedia Summer of Research (WSOR11), a series of research "sprints" conducted by 8 visiting researchers and the Community Dept. staff. WSOR11 identified several organizational barriers to newcomers' entry into Wikipedia. These barriers include:

  • A lack of clear instructions about where to ask for help. [3][4]
  • The increasing difficulty for newcomers to successfully contribute to the encyclopedia as it becomes more complete. [5]
  • Rejection of newcomer contributions, which was shown to be a strong predictor of retention. [6]
  • The growing rate of bot-, tool-, and template-based communication with newcomers. [7]

Together, these results and others were taken to suggest that the decrease in newcomer retention was caused by the hostility and helplessness experienced by newcomers. However, one important confound was not explored: what if the decline was caused by a decrease in the proportion of worthwhile newcomers? The growth in the rate of rejection of newcomer contributions during the exponential growth period of the encyclopedia could be the result of a decrease in the quality of newcomers' contributions. If this were true, the decreasing retention rate could be explained as an efficient culling of bad editors. This research project is designed to address this possible confound by directly measuring the quality of newcomers to the encyclopedia over time and relating their quality to the rate at which their first contributions to articles are rejected.

Methods

In order to determine how the quality of new users has changed over time, the quality of editors' first contributions needed to be evaluated. Users from Wikipedia were randomly sampled such that each 6-month period between 2001 and 2011 contained 100 newcomers who performed their first article edit during that period. This method allows the users of one time period to be compared with those of another and long-term trends to be observed.
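The sampling step described above could be sketched roughly as follows. This is only an illustration, not the project's actual code: the function names, the `first_edits` mapping, and the choice of calendar half-years as the 6-month periods are all assumptions made for the example.

```python
import random
from collections import defaultdict
from datetime import datetime


def semester(ts):
    """Map a timestamp to a (year, half) pair; calendar half-years are an assumption."""
    return (ts.year, 1 if ts.month <= 6 else 2)


def sample_newcomers(first_edits, per_period=100, seed=0):
    """Draw up to `per_period` users at random from each 6-month period.

    `first_edits` is a hypothetical mapping of user name -> datetime of the
    user's first article edit.
    """
    by_period = defaultdict(list)
    for user, ts in first_edits.items():
        by_period[semester(ts)].append(user)

    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return {
        period: rng.sample(sorted(users), min(per_period, len(users)))
        for period, users in by_period.items()
    }
```

Sampling a fixed number of newcomers per period, rather than sampling uniformly over all newcomers, keeps the early (small) cohorts from being swamped by the much larger later ones.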

To sample the quality of new editors upon entrance to Wikipedia, their first set of contributions was captured in the form of an edit session. An edit session is defined as a sequence of edits performed via a single user account in which no two consecutive edits are separated by more than one hour. We use this metric under the assumption that, if an editor stops editing for over one hour, they have either left Wikipedia or transitioned from editing to reading. Using this metric, newcomers were observed to have performed an average of 3.67 edits in their first edit session.
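As a minimal sketch of the session definition above, the first edit session can be extracted from a user's edit timestamps by walking them in order and stopping at the first gap longer than one hour. The function and constant names are hypothetical; only the one-hour cutoff comes from the text.

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(hours=1)  # cutoff taken from the session definition


def first_edit_session(timestamps):
    """Return the timestamps that belong to a user's first edit session."""
    ordered = sorted(timestamps)
    if not ordered:
        return []

    session = [ordered[0]]
    for ts in ordered[1:]:
        # A gap of more than one hour ends the session.
        if ts - session[-1] > SESSION_GAP:
            break
        session.append(ts)
    return session
```

Note that the gap is measured between consecutive edits, so a session can last longer than one hour in total as long as the editor keeps saving edits.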

A screenshot of the newcomer quality evaluation tool. Newcomer editors are fed into the coding tool along with their first edit session. A coder could save a category for an editor by selecting 1 of 4 category buttons: vandal, bad-faith, good-faith or golden.

To quickly evaluate the first session quality of users, a qualitative evaluation tool (pictured on right) was built that streamlined the process of viewing an editor's contributions and saving an evaluation. The tool was designed to hide any information about when the edits were made to reduce the possibility of temporal bias among volunteer coders.

Five coders (3 Wikipedian WMF fellows and 2 Academic Wikipedians) categorized newcomers into four levels of quality based on what they saw in their first edit sessions:

  • Vandal: Editors who are trying to harm other editors, Wikipedia or the subject of BLPs. e.g.:
    • Racial slurs
    • Blatant spam
    • Libel
  • Bad-faith: Editors who are not trying to hurt anyone, but not being helpful either. e.g.:
    • Glamour article
    • Making fun of friends
    • Trying to be funny
    • Additions of nonsense
  • Good-faith: Editors who are trying to be some version of productive but not being successful. e.g.:
    • Test edits that explicitly say "test" or are self-reverted
    • Contributions of opinion
    • Creation of non-notable articles
  • Golden: Editors who are already contributing productively. e.g.:
    • Edits that make a valuable contribution to Wikipedia

A random sample of 100 newcomers was assessed by all coders to determine inter-rater reliability. Using Kendall's coefficient of concordance, the raters' assessments were found to be significantly concordant (W=0.413, p=0.002), but the concordance was lower than expected.
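For readers unfamiliar with the statistic, the sketch below shows one common way to compute Kendall's W with a correction for tied ranks. It assumes the four categories are coded as ordinal scores (for example vandal=0, bad-faith=1, good-faith=2, golden=3); that coding, the function names, and the use of the tie correction are assumptions of this example, since the page does not say how the reported value was computed. The significance test is not included.

```python
def average_ranks(scores):
    """Rank scores from 1..n, giving tied scores the mean of their ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # mean of 1-based positions pos+1..end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def kendalls_w(ratings):
    """Kendall's W, where ratings[j][i] is rater j's ordinal score for item i."""
    m = len(ratings)      # number of raters
    n = len(ratings[0])   # number of rated items
    ranked = [average_ranks(row) for row in ratings]

    totals = [sum(row[i] for row in ranked) for i in range(n)]
    mean_total = sum(totals) / n
    s = sum((t - mean_total) ** 2 for t in totals)

    # Tie correction: sum of (t^3 - t) over each rater's groups of tied scores.
    ties = 0
    for row in ratings:
        counts = {}
        for value in row:
            counts[value] = counts.get(value, 0) + 1
        ties += sum(t ** 3 - t for t in counts.values())

    denom = m ** 2 * (n ** 3 - n) - m * ties
    if denom == 0:
        raise ValueError("W is undefined when every rater gives all items the same score")
    return 12 * s / denom
```

W ranges from 0 (no agreement) to 1 (complete agreement). With only four categories and 100 newcomers, each rater's scores contain many ties, which is why the tie correction matters in a setting like this one.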

The edit ratings were therefore retroactively lumped together, and the two desirable categories (golden & other good-faith) were compared with the two undesirable categories (vandal & other bad-faith). The concordance between these lumped categories was much higher:[8]

  • 93.6% of new editors were rated consistently by all five established editors
  • 4.6% of new editors were rated more highly by one (or two?[unclear]) established editors
  • 1.8% of new editors were rated more poorly by one (or two?[unclear]) established editors

Some of this improvement in concordance is due to a mere reduction in the number of categories.

Results and Discussion

The quality of newcomers

The Newcomer quality over time plot shows that, as expected, nearly all newcomers to Wikipedia were doing productive work in their first edits until 2005. Between the first semester of 2005 and the first semester of 2006, we observe a substantial decline (61.2% to 40.9%) in the proportion of golden contributors, alongside a somewhat smaller increase in the proportion of good-faith contributors (24.3% to 42.0%). However, from the first semester of 2006 forward, the proportion of good-faith and golden editors appears to remain relatively consistent: on average, good-faith and golden newcomers represent 38.4% and 36.3% of newcomers, respectively. This result suggests that any increase in rejection or decline in participation from the middle of the exponential growth period forward could not be attributed to a decrease in the quality of new editors.

The results suggest that established editors (n=5) generally agree on the intentions of a new editor towards the encyclopedia, but often disagree on whether they are actually succeeding in improving it. Semi-automated revert tools were brought in to deal with exponentially-increasing numbers of undesirable newcomers;[9] the results suggest that using these tools to reject undesirable newcomers may have little effect on the desirable ones, but that using them to reject the undesirable edits of good-faith newcomers may decrease retention.

Increasing rate of rejection

To explore whether Wikipedia was becoming a more hostile environment to newcomers, we plot the proportion of newcomers with at least one edit rejected (reverted or deleted) in their first edit session. The Newcomer rejection over time plot shows that good-faith editors experienced a significant jump in the rate of rejection (15.7% to 48.6%) between 2005 and 2006 (the middle of the exponential growth period of Wikipedia). This result suggests that good-faith newcomers (editors who are trying, but not succeeding at being productive) who joined Wikipedia during and after the exponential growth period were more likely to have their work rejected than editors who joined the project earlier.

The source of the decline

Regardless of the cause of the editor decline, it is useful to know which editors are most substantially affected. After all, if the declining survival of editors could be accounted for by deviant editors, there would be little cause for concern. To explore the retention of the categorized groups of editors, we applied the "survival" metric used in [6]: an editor is considered to be surviving if they make at least one edit between two and six months after their first edit session. The six-month cutoff is used to discount a sunset effect on the more recent data points. The Newcomer survival over time figure plots the proportion of surviving editors over time for each of the four editor classes. While there appear to be no substantial changes in the survival of "vandal" and "bad-faith" editors, we observe a steady decline in the survival of "good-faith" editors from 35.7% in the first semester of 2004 to 11.3% in the second semester of 2010. "Golden" editors saw the most dramatic decrease in retention between the last semester of 2006 and the first semester of 2007, when the rate of survival went from 40.0% to 13.2% of newcomers.
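The survival metric can be illustrated with a short sketch. The function name, its parameters, and in particular the 30-day approximation of a month are assumptions of this example; the page does not state how months were counted.

```python
from datetime import datetime, timedelta

DAYS_PER_MONTH = 30  # assumption: the source does not say how months were counted


def is_surviving(session_end, later_edits, start_months=2, end_months=6):
    """True if any later edit falls two to six months after the first session."""
    window_start = session_end + timedelta(days=start_months * DAYS_PER_MONTH)
    window_end = session_end + timedelta(days=end_months * DAYS_PER_MONTH)
    return any(window_start <= ts <= window_end for ts in later_edits)
```

Because the window closes six months after the first session, newcomers who joined less than six months before the end of the data collection cannot be evaluated fairly, which is the sunset effect the cutoff is meant to control for.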

Newcomer quality over time - The proportion of new users falling into each of the four quality categories is plotted over time. Error bars represent standard error based on the normal approximation to the binomial distribution. The proportion of new editors entering Wikipedia as good-faith or golden has remained relatively consistent since 2006.
Newcomer rejection over time - The proportion of new users with at least one first-session edit rejected (reverted or deleted) is plotted over time for each of the four quality categories. Error bars represent standard error based on the normal approximation to the binomial distribution. The rate at which good-faith and golden editors are rejected increased substantially during the 2004-2007 growth period.
Newcomer survival over time - The proportion of new users that "survive" (make at least one edit 2-6 months after the first edit session). Error bars represent standard error based on the normal approximation to the binomial distribution. Between the end of 2006 and the beginning of 2007, the retention of golden editors sharply declined (40.0% to 13.2%).
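The error bars in the three figures are standard errors of a proportion. A minimal sketch of that calculation, using the usual formula sqrt(p(1 - p) / n), is given below; the function name and the counts in the usage line are hypothetical and are not taken from the study's data.

```python
import math


def proportion_with_se(successes, n):
    """Return a proportion and its standard error, sqrt(p * (1 - p) / n)."""
    p = successes / n
    se = math.sqrt(p * (1 - p) / n)
    return p, se


# Hypothetical counts: 40 of 100 sampled newcomers fall into a category.
p, se = proportion_with_se(40, 100)
```

Since each period's sample is split across four categories, the per-category counts behind the rejection and survival plots are smaller than 100, so those error bars are wider than the ones in the quality plot.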

Conclusion

This study refutes the hypothesis that the declining retention of new editors can be explained by a decline in the quality of newcomer contributions. The results show that, since the beginning of 2006 (the middle of the exponential growth period), the proportion of new editors who are trying to edit productively has not decreased, but that the rate of rejection has increased and the rate of survival among good-faith editors has decreased.

Given the predictive power of rejection on retention of new editors [6], the increased rate of rejection observed in this analysis for good editors supports the hypothesis that the decline in editors could be at least partially due to a change in how good-faith newcomers are received by the community. The increased rate of rejection appears to mainly affect those newcomers who are editing in good faith but are not productive the first time they edit using their registered account. However, the most pronounced decrease in survival has affected editors who were already productive in their first edit session. This result suggests that, during the decline, the Wikipedia community has been failing to retain productive editors at the levels seen before the exponential growth period.

Limitations

This analysis was performed on the English-language Wikipedia. The results may not generalize to Wikipedias in other languages.

References

  1. Bongwon Suh, Gregorio Convertino, Ed H. Chi, and Peter Pirolli. 2009. The singularity is not near: slowing growth of Wikipedia. In Proceedings of the 5th International Symposium on Wikis and Open Collaboration (WikiSym '09). ACM, New York, NY, USA, Article 8, 10 pages. doi:10.1145/1641309.1641322
  2. Research:New_user_help_requests
  3. Research:New_User_Participation_in_Help_Spaces
  4. Research:Newbie_reverts_and_article_length
  5. Research:First edit session
  6. "Aaron Halfaker". www-users.cs.umn.edu. Retrieved 2016-05-04.
  7. Halfaker, Aaron; Geiger, R. Stuart; Morgan, Jonathan; Riedl, John. "The Rise and Decline of an Open Collaboration System: How Wikipedia’s reaction to popularity is causing its decline" (PDF).
  8. "Creating, Destroying, and Restoring Value in Wikipedia" (PDF). doi:10.1145/1316624.1316663.