User:DFRBot

Hi, I'm DFRBot, a wiki bot run by DFRussia. I intend to become a multi-purpose bot that runs several automated and/or user-assisted algorithms for crawling and editing Wikipedia. If I start misbehaving, please post on my talk page and I will stop what I am doing. All my algorithms are open source, and DFRussia will request permission for each new algorithm as it is developed. Currently I am waiting for approval to run my first algorithm, listed below:

articleCheck

Current version: 0.1

Latest release date: November 1st, 2007

First release date: November 1st, 2007

Status: awaiting approval

This is a data mining algorithm that simply takes one or more files and reads them line by line, checking whether each line is the title of an article on Wikipedia. If it is, a link to the article is returned to the operator; if it is not, the operator is notified. This algorithm is meant for simple (but sometimes annoying) tasks such as checking for notable people in a long list of people (for instance, the faculty of a university).

This algorithm is written in Python, using the pywikipedia framework. The program can be run from the command line with any number of files given as arguments.

articleCheck.py

import sys
import wikipedia # pywikipedia framework

site = wikipedia.getSite()
for arg in sys.argv[1:]:
    try: # try to open the file
        f = open(arg)
    except IOError: # file cannot be opened
        print "The file (" + arg + ") could not be opened\n"
    else: # file has been opened
        print "STARTING " + arg + "\n"
        existing = [] # reset per file so earlier results are not repeated
        for line in f:
            line = line.strip() # strip leading and trailing whitespace, including the newline
            if not line: # skip blank lines
                continue
            if wikipedia.Page(site, line).exists():
                existing.append("[[" + line + "]]")
            else:
                print "!!" + line + " does not exist"
        f.close()
        print "\nEXISTING results:"
        for link in existing:
            print link
        print "\nFINISHED " + arg
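
The per-line check can also be tried without contacting Wikipedia. The following is only a sketch and is not part of articleCheck 0.1: the function name check_titles and its exists parameter are hypothetical, and the lookup is replaced by a small made-up set of titles standing in for wikipedia.Page(site, line).exists(). It is written for Python 3, unlike the script above.

```python
def check_titles(lines, exists):
    """Sort titles into (found, missing) using the supplied exists() callable.

    `exists` is a stand-in for the real page lookup (an assumption made
    for this sketch, so the logic can run offline).
    """
    found, missing = [], []
    for raw in lines:
        title = raw.strip()  # drop surrounding whitespace and the newline
        if not title:
            continue  # ignore blank lines
        if exists(title):
            found.append("[[" + title + "]]")  # wiki link for the operator
        else:
            missing.append(title)
    return found, missing


# Made-up data: a pretend set of existing articles and a pretend input file.
known = {"Alan Turing", "Ada Lovelace"}
sample = ["Alan Turing\n", "\n", "Jane Q. Example\n", "  Ada Lovelace \n"]

found, missing = check_titles(sample, known.__contains__)
for link in found:
    print(link)
for title in missing:
    print("!!" + title + " does not exist")
```

Passing the lookup in as an argument keeps the sorting logic separate from the framework call, so the same function could be handed a real lookup later.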
