Wikipedia:Statistics server


There are many reasons to set up a dedicated statistics server, which exists only to pull down and analyse the latest project data dumps.

There may be enough demand for a variety of stats to have both an internal stats server (which includes non-public data such as raw referrer and other logs) and an external stats server (which sits outside the Wikimedia clusters and includes only public data, but always has the latest updates and stats scripts).

Most requested

  • Traffic stats : # of independent IPs, users, and sessions per day/week/month/year.
  • Page popularity : # of visits per page per day/week/month/year.
  • Page unpopularity : Orphaned, dead-end, or single-editor pages.
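As an illustration only, the first two requests could be computed from an access log along the following lines. This is a minimal sketch: the record layout `(iso_timestamp, ip, page)` and the function name `daily_traffic` are assumptions made for the example, not an existing Wikimedia log format or tool.

```python
from collections import defaultdict
from datetime import datetime


def daily_traffic(records):
    """Aggregate a hypothetical access log per day.

    records: iterable of (iso_timestamp, ip, page) tuples (assumed layout).
    Returns (unique_ips, visits) where
      unique_ips maps "YYYY-MM-DD" -> number of distinct IPs seen that day,
      visits maps ("YYYY-MM-DD", page) -> number of hits on that page that day.
    """
    ips_per_day = defaultdict(set)
    visits = defaultdict(int)
    for ts, ip, page in records:
        day = datetime.fromisoformat(ts).date().isoformat()
        ips_per_day[day].add(ip)      # a set keeps each IP once per day
        visits[(day, page)] += 1      # raw hit count per page per day
    unique_ips = {day: len(ips) for day, ips in ips_per_day.items()}
    return unique_ips, dict(visits)
```

Weekly, monthly, or yearly figures would follow the same pattern with a coarser key in place of the day. Counting users and sessions would need extra fields that this assumed layout does not carry.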

Broken special pages

  • Lonelypages (orphans), Ancientpages (not edited for years), &c -- many stop showing results past the first 1000, and have not kept pace with the growth of large Wikipedias.
  • Recentchanges -- hard to go back further than the last 5000 changes.
  • Longpages -- modify this page so that admins can flag particular long pages as "OK", so that the default view shows only long pages that should not be that long.


Current goings-on

  • Currently active topics : over 10 edits in the last 1|5|24|72 hours
  • Currently active blockers / deleters : over 3|10|30 blocks/deletes in the last 1|5|24|72 hours
  • Active newbies : new accounts with over 10|30|100 edits in the last 1|5|24|72 hours
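The "over N events in the last 1|5|24|72 hours" pattern in the list above could be sketched as a windowed count. The example below covers only the active-topics case and assumes edits arrive as `(page_title, datetime)` pairs; that input shape and the names `WINDOWS_HOURS` and `active_topics` are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta

# The four look-back windows named in the list above, in hours.
WINDOWS_HOURS = (1, 5, 24, 72)


def active_topics(edits, now, min_edits=10):
    """Return {hours: sorted titles with more than min_edits edits in that window}.

    edits: iterable of (page_title, datetime) pairs (assumed input shape).
    now:   the reference time the windows are measured back from.
    """
    edits = list(edits)  # allow several passes over a generator
    result = {}
    for hours in WINDOWS_HOURS:
        cutoff = now - timedelta(hours=hours)
        counts = Counter(title for title, when in edits if cutoff <= when <= now)
        # "over 10 edits" is read here as strictly more than min_edits
        result[hours] = sorted(t for t, n in counts.items() if n > min_edits)
    return result
```

The same function shape would presumably serve for blocks, deletes, or new-account edits by swapping the event stream and the threshold.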


New requests

  • # of pages on over 100 watchlists
  • # of pages on 0 watchlists [rather than worrying endlessly about 'letting spammers see [such a] list', it should be kept empty!]
  • Pages with 0 categories
    • Pages with over 10 categories
  • Pages that have been "temporarily" protected for over 3 days
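For illustration, the category and protection requests above amount to simple filters once the data is available. The sketch below assumes two hypothetical in-memory mappings (page title to list of categories, and page title to the time protection began); on a real stats server these would come from the database dump instead.

```python
from datetime import datetime, timedelta


def category_outliers(page_categories, max_categories=10):
    """Split out pages with no categories and pages with too many.

    page_categories: dict of title -> list of category names (assumed shape).
    Returns (uncategorised, overcategorised), both sorted lists of titles.
    """
    uncategorised = sorted(p for p, cats in page_categories.items() if len(cats) == 0)
    overcategorised = sorted(
        p for p, cats in page_categories.items() if len(cats) > max_categories
    )
    return uncategorised, overcategorised


def stale_protections(protections, now, max_days=3):
    """List pages whose "temporary" protection has lasted more than max_days.

    protections: dict of title -> datetime when protection started (assumed shape).
    """
    limit = timedelta(days=max_days)
    return sorted(p for p, since in protections.items() if now - since > limit)
```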


External research requests

Many researchers working on WP data outside the projects -- sociologists, economists, computer scientists, authors, &c. -- have statistical needs that are currently hard to meet. Some of their requests include:

  • A way to purchase a hard-drive with a recent full DB dump, to be shipped via snail- or air-mail.
  • A way to submit stats queries that can be queued to run on a central server, either once or regularly over a few weeks/months
  • A way to add questions for WP users to be answered by at least a few dozen participants, and ideally more, targeting a reasonably random cross-section of {the community, the active editors, the anon editors, [other]}.
