33 Commits

Author SHA1 Message Date
33540344b0 Update jogobot submodule 2016-09-25 18:17:12 +02:00
1958ec222f Add a README.md
To have a basic description of this repo
2016-09-25 16:56:10 +02:00
f2d431ab84 Merge branch 'fs#67-more-detailed-logs' into test-v7 2016-09-25 15:10:27 +02:00
31d06224b0 Update file headers 2016-09-24 21:32:02 +02:00
51d8bb9da9 Read Edit Summary from config
To be able to change the Edit-Summary without touching the source code
2016-09-24 21:29:32 +02:00
f3635b2458 Log year change related actions
Improve logging related to automatically changed years in list title

[https://fs.golderweb.de/index.php?do=details&task_id=67|FS#67]
2016-08-22 16:45:54 +02:00
962e0cb4de Notice End of Task in Log
Showing end of task in log will help to detect unexpectedly terminated
runs

[https://fs.golderweb.de/index.php?do=details&task_id=67|FS#67]
2016-08-22 16:43:45 +02:00
8948fcc78d Output log each parsed page and revision
To improve quality of log

[https://fs.golderweb.de/index.php?do=details&task_id=67|FS#67]
2016-08-22 15:43:25 +02:00
2f022d9d30 Call pywikibot.handle_args before jogobot.status
To prevent pywikibot outputting a warning because of creating site
objects before handling args
2016-07-16 16:00:30 +02:00
56701107db Jogobot module updated 2016-07-11 23:41:26 +02:00
7ccfb90888 Updated jogobot submodule 2016-07-09 20:19:20 +02:00
22a2cc5799 Merge branch 'fs#33-charts.py-abords-with-error' into test-v6 2016-03-09 17:26:12 +01:00
9d471bee20 Bug in function to detect the year from Pagetitle, returning whole title
Missing param added
Explicit int casting will throw errors in future if regex fails
2016-03-09 17:24:00 +01:00
16a774fae5 Merge branch 'CountryList-Entry-Title-SortKeyName' into test-v6 2016-02-25 17:52:10 +01:00
038dd6e36a SortKeyName should be used for Interpret not for Title 2016-02-25 17:48:46 +01:00
e468260f7f Merge branch 'unittest-countrylist' into test-v6
Conflicts:
	countrylist.py
2016-02-25 17:08:28 +01:00
da99dee429 Merge branch 'CountryList-Entry-Title-SortKeyName' into test-v6 2016-02-25 17:05:52 +01:00
b96c5d4a33 Handle SortKeyName and SortKey Template in Title 2016-02-25 17:05:04 +01:00
73bf26b627 Merge branch 'jogobot-StatusAPI' into test-v6 2016-02-25 16:27:28 +01:00
df2f13fb66 Update jogobot 2016-02-25 16:26:10 +01:00
7b27577915 Remove provisional onwiki activation 2016-02-23 13:58:46 +01:00
d76f914615 Use JogoBot StatusAPI to check if Bot/Task is active 2016-02-23 13:57:56 +01:00
d9d385cfe8 Rename chartsbot.py to charts.py to get filename same as task_slug for jogobot-module 2016-02-23 11:40:15 +01:00
2076932cbf Merge branch 'improve-output' into test-v6
(@see https://fs.golderweb.de/index.php?do=details&task_id=20)
2016-02-23 11:35:12 +01:00
9fe1c36482 Merge branch 'test-v5' 2016-02-23 11:31:39 +01:00
c730d9ba9c Output diff also in verbose mode 2016-02-23 11:21:40 +01:00
3ed67431cf Use jogobot-framework as submodule to get a specific state (instead of directly use development dir as python module)
Use jogobot.output as wrapper for pywikibot outputs
2016-02-22 11:05:32 +01:00
287942e174 Merge branch 'remove-refs' into improve-output
Get recent changes before going on
2016-02-18 19:13:31 +01:00
4de2116717 Add possibility to manually check against any page in dewiki 2015-11-28 18:17:19 +01:00
3349c9f3d3 Add __str__-method to CountryList-class 2015-11-28 18:16:04 +01:00
a250074caa CountryList-module: Search current year via regex to also make parsing older lists possible 2015-11-28 17:26:27 +01:00
581e043255 Add unittest to CountryList module 2015-11-28 13:42:32 +01:00
e932303c40 improve-output: Only show diff in interactive mode without -always flag 2015-11-27 14:10:33 +01:00
8 changed files with 312 additions and 72 deletions

2
.gitignore vendored

@@ -62,3 +62,5 @@ target/
# Test # Test
test.py test.py
disabled

4
.gitmodules vendored Normal file

@@ -0,0 +1,4 @@
[submodule "jogobot"]
path = jogobot
url = git@github.com:golderweb/wiki-jogobot-core.git
branch = test-v1

21
README.md Normal file

@@ -0,0 +1,21 @@
# wiki-jogobot-charts
This is a [Pywikibot](https://www.mediawiki.org/wiki/Manual:Pywikibot) based [Wikipedia Bot](https://de.wikipedia.org/wiki/Wikipedia:Bots)
of [User:JogoBot](https://de.wikipedia.org/wiki/Benutzer:JogoBot) on the
[German Wikipedia](https://de.wikipedia.org/wiki/Wikipedia:Hauptseite).
A more detailed description can be found on [JogoBot's Wikipedia user page](https://de.wikipedia.org/wiki/Benutzer:JogoBot/Charts).
## Requirements
* Python 3.4+ (only tested with these versions)
* pywikibot-core 2.0
* [jogobot-core module](https://github.com/golderweb/wiki-jogobot-core) used as submodule
* [Isoweek module](https://pypi.python.org/pypi/isoweek)
## Bugs
[wiki-jogobot-charts on fs.golderweb.de (de)](https://fs.golderweb.de/proj20)
## License
GPLv3+
## Author Information
Copyright 2016 Jonathan Golder <jonathan@golderweb.de>


@@ -3,7 +3,7 @@
# #
# __init__.py # __init__.py
# #
# Copyright 2015 GOLDERWEB Jonathan Golder <jonathan@golderweb.de> # Copyright 2016 Jonathan Golder <jonathan@golderweb.de>
# #
# This program is free software; you can redistribute it and/or modify # This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by # it under the terms of the GNU General Public License as published by


@@ -1,7 +1,7 @@
#!/usr/bin/env python3 #!/usr/bin/env python3
# -*- coding: utf-8 -*- # -*- coding: utf-8 -*-
# #
# chartsbot.py # charts.py
# #
# original version by: # original version by:
# #
@@ -11,7 +11,7 @@
# #
# modified by: # modified by:
# #
# Copyright 2015 GOLDERWEB Jonathan Golder <jonathan@golderweb.de> # Copyright 2016 Jonathan Golder <jonathan@golderweb.de>
# #
# This program is free software; you can redistribute it and/or modify # This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by # it under the terms of the GNU General Public License as published by
@@ -46,10 +46,14 @@ The following parameters are supported:
import locale import locale
import os
import sys
import pywikibot import pywikibot
from pywikibot import pagegenerators from pywikibot import pagegenerators
import jogobot
from summarypage import SummaryPage from summarypage import SummaryPage
# This is required for the text that is shown when you run this script # This is required for the text that is shown when you run this script
@@ -87,31 +91,41 @@ class ChartsBot( ):
# Force parsing of countrylist # Force parsing of countrylist
self.force_reload = force_reload self.force_reload = force_reload
# Set the edit summary message # Output Information
jogobot.output( "Chartsbot invoked" )
# Save pywikibot site object
self.site = pywikibot.Site() self.site = pywikibot.Site()
self.summary = "Bot: Aktualisiere Übersichtsseite Nummer-eins-Hits"
# Define edit summary
self.summary = jogobot.config["charts"]["edit_summary"].strip()
# Make sure summary starts with "Bot:"
if not self.summary.startswith("Bot:"):
self.summary = "Bot: " + self.summary.strip()
# Set locale to 'de_DE.UTF-8' # Set locale to 'de_DE.UTF-8'
locale.setlocale(locale.LC_ALL, 'de_DE.UTF-8') locale.setlocale(locale.LC_ALL, 'de_DE.UTF-8')
# provisional-onwiki-activation
page_active = pywikibot.Page( self.site, "Benutzer:JogoBot/active" )
text_active = page_active.get()
if "true" not in text_active.lower():
pywikibot.output( "Bot ist deaktiviert!" )
return False
def run(self): def run(self):
"""Process each page from the generator.""" """Process each page from the generator."""
# Count skipped pages (redirect or missing)
skipped = 0
for page in self.generator: for page in self.generator:
self.treat(page) if not self.treat(page):
skipped += 1
if skipped:
jogobot.output( "Chartsbot finished, {skipped} page(s) skipped"
.format( skipped=skipped ) )
else:
jogobot.output( "Chartsbot finished successfully" )
def treat(self, page): def treat(self, page):
"""Load the given page, does some changes, and saves it.""" """Load the given page, does some changes, and saves it."""
text = self.load(page) text = self.load(page)
if not text: if not text:
return return False
################################################################ ################################################################
# NOTE: Here you can modify the text in whatever way you want. # # NOTE: Here you can modify the text in whatever way you want. #
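The hunk above reads the edit summary from the configuration and makes sure it starts with "Bot:". A minimal standalone sketch of that normalization follows; the helper name `normalize_summary` and its `prefix` parameter are illustrative assumptions and not part of the repository.

```python
def normalize_summary(raw, prefix="Bot:"):
    """Strip surrounding whitespace and ensure the summary starts with prefix.

    Hypothetical helper: the bot does this inline in ChartsBot.__init__.
    """
    summary = raw.strip()
    if not summary.startswith(prefix):
        summary = prefix + " " + summary
    return summary


# A configured value without the prefix gets it prepended.
print(normalize_summary("  Aktualisiere Übersichtsseite Nummer-eins-Hits "))
# A value that already carries the prefix is only stripped.
print(normalize_summary("Bot: Aktualisiere Übersichtsseite"))
```

Keeping the check in one small function would make it easy to test separately from the Pywikibot site setup.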
@@ -126,7 +140,9 @@ class ChartsBot( ):
text = sumpage.get_new_text() text = sumpage.get_new_text()
if not self.save(text, page, self.summary, False): if not self.save(text, page, self.summary, False):
pywikibot.output(u'Page %s not saved.' % page.title(asLink=True)) jogobot.output(u'Page %s not saved.' % page.title(asLink=True))
return True
def load(self, page): def load(self, page):
"""Load the text of the given page.""" """Load the text of the given page."""
@@ -134,27 +150,31 @@ class ChartsBot( ):
# Load the page # Load the page
text = page.get() text = page.get()
except pywikibot.NoPage: except pywikibot.NoPage:
pywikibot.output(u"Page %s does not exist; skipping." jogobot.output( u"Page %s does not exist; skipping."
% page.title(asLink=True)) % page.title(asLink=True), "ERROR" )
except pywikibot.IsRedirectPage: except pywikibot.IsRedirectPage:
pywikibot.output(u"Page %s is a redirect; skipping." jogobot.output( u"Page %s is a redirect; skipping."
% page.title(asLink=True)) % page.title(asLink=True), "ERROR" )
else: else:
return text return text
return None return False
def save(self, text, page, comment=None, minorEdit=True, def save(self, text, page, comment=None, minorEdit=True,
botflag=True): botflag=True):
"""Update the given page with new text.""" """Update the given page with new text."""
# only save if something was changed (and not just revision) # only save if something was changed (and not just revision)
if text != page.get(): if text != page.get():
# Show the title of the page we're working on.
# Highlight the title in purple. # Show diff only in interactive mode or in verbose mode
pywikibot.output(u"\n\n>>> \03{lightpurple}%s\03{default} <<<" if not self.always or pywikibot.config.verbose_output:
% page.title())
# show what was changed # Show the title of the page we're working on.
pywikibot.showDiff(page.get(), text) # Highlight the title in purple.
pywikibot.output(u'Comment: %s' % comment) jogobot.output( u">>> \03{lightpurple}%s\03{default} <<<"
% page.title())
# show what was changed
pywikibot.showDiff(page.get(), text)
jogobot.output(u'Comment: %s' % comment)
if self.always or pywikibot.input_yn( if self.always or pywikibot.input_yn(
u'Do you want to accept these changes?', u'Do you want to accept these changes?',
@@ -165,17 +185,17 @@ class ChartsBot( ):
page.save(summary=comment or self.comment, page.save(summary=comment or self.comment,
minor=minorEdit, botflag=botflag) minor=minorEdit, botflag=botflag)
except pywikibot.LockedPage: except pywikibot.LockedPage:
pywikibot.output(u"Page %s is locked; skipping." jogobot.output( u"Page %s is locked; skipping."
% page.title(asLink=True)) % page.title(asLink=True), "ERROR" )
except pywikibot.EditConflict: except pywikibot.EditConflict:
pywikibot.output( jogobot.output(
u'Skipping %s because of edit conflict' u'Skipping %s because of edit conflict'
% (page.title())) % (page.title()), "ERROR")
except pywikibot.SpamfilterError as error: except pywikibot.SpamfilterError as error:
pywikibot.output( jogobot.output(
u'Cannot change %s because of spam blacklist \ u'Cannot change %s because of spam blacklist \
entry %s' entry %s'
% (page.title(), error.url)) % (page.title(), error.url), "ERROR")
else: else:
return True return True
return False return False
@@ -190,43 +210,65 @@ def main(*args):
@param args: command line arguments @param args: command line arguments
@type args: list of unicode @type args: list of unicode
""" """
# Process global arguments to determine desired site # Process global arguments to determine desired site
local_args = pywikibot.handle_args(args) local_args = pywikibot.handle_args(args)
# This factory is responsible for processing command line arguments # Get the jogobot-task_slug (basename of current file without ending)
# that are also used by other scripts and that determine on which pages task_slug = os.path.basename(__file__)[:-len(".py")]
# to work on.
genFactory = pagegenerators.GeneratorFactory()
# The generator gives the pages that should be worked upon.
gen = None
# If always is True, bot won't ask for confirmation of edit (automode) # Before run, we need to check whether we are currently active or not
always = False try:
# Will throw Exception if disabled/blocked
jogobot.is_active( task_slug )
# If force_reload is True, bot will always parse Countrylist regardless of except jogobot.jogobot.Blocked:
# parsing is needed or not (type, value, traceback) = sys.exc_info()
force_reload = False jogobot.output( "\03{lightpurple} %s (%s)" % (value, type ),
"CRITICAL" )
# Parse command line arguments except jogobot.jogobot.Disabled:
for arg in local_args: (type, value, traceback) = sys.exc_info()
if arg.startswith("-always"): jogobot.output( "\03{red} %s (%s)" % (value, type ),
always = True "ERROR" )
elif arg.startswith("-force-reload"):
force_reload = True
else:
genFactory.handleArg(arg)
if not gen: # Bot/Task is active
gen = genFactory.getCombinedGenerator()
if gen:
# The preloading generator is responsible for downloading multiple
# pages from the wiki simultaneously.
gen = pagegenerators.PreloadingGenerator(gen)
bot = ChartsBot(gen, always, force_reload)
if bot:
bot.run()
else: else:
pywikibot.showHelp() # This factory is responsible for processing command line arguments
# that are also used by other scripts and that determine on which pages
# to work on.
genFactory = pagegenerators.GeneratorFactory()
# The generator gives the pages that should be worked upon.
gen = None
# If always is True, bot won't ask for confirmation of edit (automode)
always = False
# If force_reload is True, bot will always parse Countrylist regardless
# if parsing is needed or not
force_reload = False
# Parse command line arguments
for arg in local_args:
if arg.startswith("-always"):
always = True
elif arg.startswith("-force-reload"):
force_reload = True
else:
pass
genFactory.handleArg(arg)
if not gen:
gen = genFactory.getCombinedGenerator()
if gen:
# The preloading generator is responsible for downloading multiple
# pages from the wiki simultaneously.
gen = pagegenerators.PreloadingGenerator(gen)
bot = ChartsBot(gen, always, force_reload)
if bot:
bot.run()
else:
pywikibot.showHelp()
if( __name__ == "__main__" ): if( __name__ == "__main__" ):
main() main()
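The new `main()` derives the task slug from the script's file name and only builds the page generator in the `else` branch of the status check, so a blocked or disabled task never touches the wiki. The sketch below mirrors that control flow under stated assumptions: `Blocked`, `Disabled` and `is_active` are hypothetical stand-ins for the jogobot names used in the diff, and `os.path.splitext` is swapped in for the slice `[:-len(".py")]`.

```python
import os


class Blocked(Exception):
    """Hypothetical stand-in for jogobot.jogobot.Blocked."""


class Disabled(Exception):
    """Hypothetical stand-in for jogobot.jogobot.Disabled."""


def task_slug_from(path):
    """Return the file name without directory and extension."""
    return os.path.splitext(os.path.basename(path))[0]


def is_active(task_slug, disabled=(), blocked=()):
    """Hypothetical status check: raises instead of returning False."""
    if task_slug in blocked:
        raise Blocked("bot is blocked")
    if task_slug in disabled:
        raise Disabled("task %s is disabled" % task_slug)
    return True


def run_if_active(path, work, disabled=(), blocked=()):
    """Call work(task_slug) only when the status check raised nothing."""
    task_slug = task_slug_from(path)
    try:
        is_active(task_slug, disabled, blocked)
    except Blocked as error:
        return "CRITICAL: %s" % error
    except Disabled as error:
        return "ERROR: %s" % error
    else:
        # Reached only if no exception was raised above.
        return work(task_slug)


print(run_if_active("/srv/bot/charts.py", lambda slug: "ran " + slug))
print(run_if_active("charts.py", lambda slug: "ran " + slug,
                    disabled=("charts",)))
```

Catching the exception object with `as error` avoids the `sys.exc_info()` call and does not shadow the built-in `type`.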


@@ -3,7 +3,7 @@
# #
# countrylist.py # countrylist.py
# #
# Copyright 2015 GOLDERWEB Jonathan Golder <jonathan@golderweb.de> # Copyright 2016 Jonathan Golder <jonathan@golderweb.de>
# #
# This program is free software; you can redistribute it and/or modify # This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by # it under the terms of the GNU General Public License as published by
@@ -25,6 +25,7 @@
Provides a class for handling charts list per country and year Provides a class for handling charts list per country and year
""" """
import re
import locale import locale
from datetime import datetime from datetime import datetime
@@ -33,6 +34,8 @@ from isoweek import Week
import pywikibot import pywikibot
import mwparserfromhell as mwparser import mwparserfromhell as mwparser
import jogobot
class CountryList(): class CountryList():
""" """
@@ -97,15 +100,15 @@ class CountryList():
def find_year( self ): def find_year( self ):
""" """
Try to find the year related to CountryList Try to find the year related to CountryList using regex
""" """
self.year = datetime.now().year match = re.search( r"^.+\((\d{4})\)", self.page.title() )
# Check if year is in page.title, if not try last year # We matched something
if str( self.year ) not in self.page.title(): if match:
self.year -= 1 self.year = int(match.group(1))
# If last year does not match, raise YearError
if str( self.year ) not in self.page.title(): else:
raise CountryListError( "CountryList year is errorneous!" ) raise CountryListError( "CountryList year is errorneous!" )
def parse( self ): def parse( self ):
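The reworked `find_year` extracts the year from a parenthesized four-digit number in the page title instead of guessing from the current date. A minimal sketch of the same pattern outside the class follows; the function signature and the use of `ValueError` in place of `CountryListError` are assumptions made to keep the example self-contained.

```python
import re


def find_year(title):
    """Return the year given in parentheses in a list title.

    Uses the same pattern as the diff; raises ValueError (a stand-in for
    CountryListError) when the title carries no such year.
    """
    match = re.search(r"^.+\((\d{4})\)", title)
    if match:
        return int(match.group(1))
    raise ValueError("no four-digit year in parentheses: %r" % title)


print(find_year("Liste der Nummer-eins-Hits in Frankreich (2015)"))
```

Because `.+` is greedy, a title with several parenthesized years yields the last one.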
@@ -113,6 +116,9 @@ class CountryList():
Handles the parsing process Handles the parsing process
""" """
# Set revid
self.revid = self.page.latest_revision_id
# Parse page with mwparser # Parse page with mwparser
self.generate_wikicode() self.generate_wikicode()
@@ -127,6 +133,10 @@ class CountryList():
# For easy detecting whether we have parsed self # For easy detecting whether we have parsed self
self.parsed = True self.parsed = True
# Log parsed page
jogobot.output( "Parsed revision {revid} of page [[{title}]]".format(
revid=self.revid, title=self.page.title() ) )
def detect_belgian( self ): def detect_belgian( self ):
""" """
Detect whether current entry is one of the Belgian (Belgien/Wallonien) Detect whether current entry is one of the Belgian (Belgien/Wallonien)
@@ -350,6 +360,45 @@ missing!" )
for ref in self._interpret_raw.ifilter_tags(matches="ref"): for ref in self._interpret_raw.ifilter_tags(matches="ref"):
self._interpret_raw.remove( ref ) self._interpret_raw.remove( ref )
# Handle SortKeyName and SortKey
for template in self._interpret_raw.ifilter_templates(
matches="SortKey" ):
if template.name == "SortKeyName":
# Differing Link-Destination is provided as param 3
if template.has(3):
# Construct link out of Template, Params:
# 1 = Surname
# 2 = Name
# 3 = Link-Dest
interpret_link = mwparser.nodes.wikilink.Wikilink(
str(template.get(3).value),
str(template.get(1).value) + " " +
str(template.get(2).value) )
# Default Link-Dest [[Surname Name]]
else:
interpret_link = mwparser.nodes.wikilink.Wikilink(
str(template.get(1).value) + " " +
str(template.get(2).value) )
# Replace Template with link
self._interpret_raw.replace( template, interpret_link )
# SortKey
else:
# Replace SortKey with text from param 2 if present
if template.has(2):
self._interpret_raw.replace( template,
template.get(2).value)
# Else Remove SortKey (text should follow behind SortKey)
else:
self._interpret_raw.replace( template, None)
# Normally won't be needed as there should be only one
# SortKey-Template but ... it's a wiki
break
# Remove whitespace # Remove whitespace
self._interpret_raw = str(self._interpret_raw).strip() self._interpret_raw = str(self._interpret_raw).strip()
else: else:
@@ -409,6 +458,23 @@ missing!" )
else: else:
return str(keywords[0]) return str(keywords[0])
def __str__( self ):
"""
Returns str representation for object
"""
if self.parsed:
return ("CountryList( Link = \"{link}\", Revid = \"{revid}\", " +
"Interpret = \"{interpret}\", Titel = \"{titel}\", " +
"Chartein = \"{chartein}\" )").format(
link=repr(self.wikilink),
revid=self.revid,
interpret=self.interpret,
titel=self.titel,
chartein=repr(self.chartein))
else:
return "CountryList( Link = \"{link}\" )".format(
link=repr(self.wikilink))
class CountryListError( Exception ): class CountryListError( Exception ):
""" """
@@ -422,3 +488,98 @@ class CountryListEntryError( CountryListError ):
Handles errors occurring in class CountryList related to entries Handles errors occurring in class CountryList related to entries
""" """
pass pass
class CountryListUnitTest():
"""
Defines Test-Functions for CountryList-Module
"""
testcases = ( { "Link": mwparser.nodes.Wikilink( "Benutzer:JogoBot/Charts/Tests/Liste der Nummer-eins-Hits in Frankreich (2015)" ), # noqa
"revid": 148453827,
"interpret": "[[Adele (Sängerin)|Adele]]",
"titel": "[[Hello (Adele-Lied)|Hello]]",
"chartein": datetime( 2015, 10, 23 ) },
{ "Link": mwparser.nodes.Wikilink( "Benutzer:JogoBot/Charts/Tests/Liste der Nummer-eins-Hits in Belgien (2015)", "Wallonien"), # noqa
"revid": 148455281,
"interpret": "[[Nicky Jam]] & [[Enrique Iglesias (Sänger)|Enrique Iglesias]]", # noqa
"titel": "El perdón",
"chartein": datetime( 2015, 9, 12 ) } )
def __init__( self, page=None ):
"""
Constructor
Set attribute page
"""
if page:
self.page_link = mwparser.nodes.Wikilink( page )
else:
self.page_link = None
def treat( self ):
"""
Start testing either manually with page provided by cmd-arg page or
automatically with predefined test case
"""
if self.page_link:
self.man_test()
else:
self.auto_test()
def auto_test( self ):
"""
Run automatic tests with predefined test data from wiki
"""
for case in type(self).testcases:
self.countrylist = CountryList( case["Link"] )
if( self.countrylist.is_parsing_needed( case["revid"] ) or not
self.countrylist.is_parsing_needed( case["revid"] + 1 ) ):
raise Exception(
"CountryList.is_parsing_needed() does not work!" )
self.countrylist.parse()
for key in case:
if key == "Link":
continue
if not case[key] == getattr(self.countrylist, key ):
raise Exception( key + " " + str(
getattr(self.countrylist, key ) ))
def man_test( self ):
"""
Run manual test with page given in parameter
"""
self.countrylist = CountryList( self.page_link )
self.countrylist.parse()
print( self.countrylist )
print( "Since we have no data to compare, you need to manually " +
"check data above against given page to ensure correct " +
"working of module!" )
def main(*args):
"""
Handling direct calls --> unittest
"""
# Process global arguments to determine desired site
local_args = pywikibot.handle_args(args)
# Default: no page given, so the predefined test cases are used
page = None
# Parse command line arguments
for arg in local_args:
if arg.startswith("-page:"):
page = arg[ len("-page:"): ]
# Call unittest-class
test = CountryListUnitTest( page )
test.treat()
if __name__ == "__main__":
main()

1
jogobot Submodule

Submodule jogobot added at 9131235b7b


@@ -3,7 +3,7 @@
# #
# summarypage.py # summarypage.py
# #
# Copyright 2015 GOLDERWEB Jonathan Golder <jonathan@golderweb.de> # Copyright 2016 Jonathan Golder <jonathan@golderweb.de>
# #
# This program is free software; you can redistribute it and/or modify # This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by # it under the terms of the GNU General Public License as published by
@@ -30,6 +30,8 @@ from datetime import datetime, timedelta
# import pywikibot # import pywikibot
import mwparserfromhell as mwparser import mwparserfromhell as mwparser
import jogobot
from countrylist import CountryList, CountryListError from countrylist import CountryList, CountryListError
@@ -145,6 +147,9 @@ class SummaryPageEntry():
# If list is from last year, replace year # If list is from last year, replace year
if (current_year - 1) in self.countrylist_wikilink.title: if (current_year - 1) in self.countrylist_wikilink.title:
jogobot.output( "Trying to use new years list for [[{page}]]"
.format( page=self.countrylist_wikilink.title ) )
self.countrylist_wikilink.title.replace( (current_year - 1), self.countrylist_wikilink.title.replace( (current_year - 1),
current_year ) current_year )
@@ -159,6 +164,10 @@ class SummaryPageEntry():
# If list is from last year, replace year # If list is from last year, replace year
if (current_year ) in self.countrylist_wikilink.title: if (current_year ) in self.countrylist_wikilink.title:
jogobot.output( "New years list for [[{page}]] does not "
"exist, fall back to old list!".format(
page=self.countrylist_wikilink.title ) )
self.countrylist_wikilink.title.replace( current_year, self.countrylist_wikilink.title.replace( current_year,
(current_year - 1) ) (current_year - 1) )
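When a log message is split across several string literals and then formatted with `str.format`, operator precedence matters: a method call binds tighter than `+`, so only the literal directly in front of `.format` is formatted. The sketch below shows the difference; the variable names and the sample title are illustrative only.

```python
page = "Liste der Nummer-eins-Hits in Frankreich (2016)"

# "+" is applied after the method call, so only the second literal is
# formatted and the {page} placeholder in the first one stays as-is.
joined_with_plus = ("New years list for [[{page}]] does not " +
                    "exist, fall back to old list!".format(page=page))

# Adjacent literals are merged at compile time, so format() sees the
# whole message and fills in the placeholder.
adjacent_literals = ("New years list for [[{page}]] does not "
                     "exist, fall back to old list!".format(page=page))

print(joined_with_plus)
print(adjacent_literals)
```

Wrapping the concatenation in its own parentheses before calling `.format` has the same effect as using adjacent literals.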