32 Commits
v1.1 ... v1.2.2

Author SHA1 Message Date
39e98df7cd Merge branch 'hotfix-1.2.2' 2021-03-07 11:51:07 +01:00
a35546e53d Fix removed pywikibot config property db_hostname
https://phabricator.wikimedia.org/rPWBC2d73643f70a3f3289ff83e7ec142727d79d2649c
2021-03-07 11:48:07 +01:00
e6ffd7d14a Merge branch 'hotfix-1.2.1' 2019-06-08 13:40:09 +02:00
1b82de1eab Fix #72: Markpages terminates with error
Filtering templates with wikicode as match param does not work anymore,
explicitly cast to string
2019-06-08 13:38:25 +02:00
466b9da886 Merge branch 'release-1.2' 2018-10-05 12:37:23 +02:00
b80f5bd2c9 Prepare release v1.2 2018-10-05 12:36:49 +02:00
ff2421b63e Merge branch 'remove-jogobot-submodule' into develop 2018-10-05 11:42:08 +02:00
236ba6a870 Update requirements 2018-10-05 11:40:25 +02:00
0df2017387 Remove submodule jogobot
As it is now installable via pip
2018-10-05 11:37:56 +02:00
b9faed8847 Merge branch 'i#64-missingnotice' into develop 2018-10-05 11:35:50 +02:00
5cdccaeec6 missingnotice: Add RedFam counter output
Signal that the bot is working, since processing RedFams takes about
15 minutes without any output right now

Issue #64 (#64)
2018-09-25 17:51:01 +02:00
54d8b8ea3b missingnotice: Disable verbose logging of sqlalchemy
Issue #64 (#64)
2018-09-18 16:29:54 +02:00
dea5a393ad missingnotice: Call RedFamWorker.flush_db_cache
To write status changes to db

Issue #64 (#64)
2018-09-18 16:29:50 +02:00
f021a13202 missingnotice: Implement run()
The bot's working sequence, using previously implemented methods to
update the list of missing notices

Issue #64 (#64)
2018-09-18 16:29:45 +02:00
4c8ba95534 missingnotice: Implement update_page()
This method updates the content of the configured or given wikipage with
the generated lines

Issue #64 (#64)
2018-09-18 16:29:40 +02:00
9804db212f missingnotice: Implement format_row()
With this method, the links to redundancy discussions and articles
missing the notice are concatenated and formatted

Issue #64 (#64)
2018-09-18 16:29:34 +02:00
68b81b1111 missingnotice: Implement treat_redfam
For each redfam, we need to check whether the related redundancy discussion
exists and whether there are missing notices. For those redfams, return links
to the discussion and to the articles missing the notice.

Issue #64 (#64)
2018-09-18 16:29:28 +02:00
389c48605e redfam: Make get_disc_link() able to return wikilink
Issue #64 (#64)
2018-09-18 16:29:22 +02:00
95af95aca6 missingnotice: Implement article selection
Issue #64 (#64)
2018-09-18 16:29:17 +02:00
99adad873e missingnotice_test: Test article query
Issue #64 (#64)
2018-09-18 16:29:01 +02:00
dbcc2717d7 missingnotice: Implement article query
Issue #64 (#64)
2018-09-18 16:28:56 +02:00
e5a45fa692 tests: Add test script for missingnotice
Issue #64 (#64)
2018-09-18 16:28:44 +02:00
63d3f837e9 red.py: Introduce subtask missingnotice
Issue #64 (#64)
2018-09-18 16:28:38 +02:00
cfb3e8e37c bots: Add basic structure for MissingNoticeBot
Issue #64 (#64)
2018-09-18 16:28:29 +02:00
dfffe97200 redfam: Add method to check disc section
Sometimes disc sections disappear because the heading is changed
and the famhash changes, so we get a new redfam. Mark those as absent

Issue #64 (#64)
2018-09-18 16:28:22 +02:00
246e94c228 redfam: Add generator for open redfams to Worker
Issue #64 (#64)
2018-09-18 16:27:51 +02:00
181486c718 Merge branch 'release-1.1.1' back into develop 2018-09-17 17:23:12 +02:00
4f31b1a792 Merge branch 'release-1.1.1' 2018-08-12 11:48:17 +02:00
3fbfd4ccd7 Prepare release-1.1.1 2018-08-12 11:46:40 +02:00
50b0e142ec Merge branch 'i#71-moved-page-exists' into develop 2018-08-12 11:43:18 +02:00
14db996a43 redfam: Check if moved page exists
To prevent creation of orphaned discussion pages in case of special page-move
constructs

Issue #71 (#71)
2018-08-12 11:41:50 +02:00
110589cb5b Merge branch 'release-1.1' back into develop 2018-08-12 11:15:30 +02:00
11 changed files with 414 additions and 9 deletions

3
.gitmodules vendored

@@ -1,3 +0,0 @@
[submodule "jogobot"]
path = jogobot
url = ../jogobot


@@ -11,6 +11,7 @@ The libraries above need to be installed and configured manualy considering [doc
 * SQLAlchemy
 * PyMySQL
+* [jogobot-core module](https://git.golderweb.de/wiki/jogobot)
 Those can be installed using pip and the _requirements.txt_ file provided with this packet
@@ -18,6 +19,22 @@ Those can be installed using pip and the _requirements.txt_ file provided with t
 Versions
 --------
+* v1.2.2
+  - Fix removed pywikibot config property db_hostname
+* v1.2.1
+  - Fix [#72](https://git.golderweb.de/wiki/jogobot-red/issues/72)
+* v1.2
+  - Create a list of redfams/articles missing reddisc notice
+        python red.py -task:missingnotice -family:wikipedia
+  - jogobot module no longer included
+* v1.1.1
+  - Check if moved page exists
 * v1.1
   - Improved page filter
@@ -57,6 +74,10 @@ Versions
 * test-v1
+Bugs
+----
+[jogobot-red Issues](https://git.golderweb.de/wiki/jogobot-red/issues)
 License
 -------
 GPLv3
@@ -64,6 +85,6 @@ GPLv3
 Author Information
 ------------------
-Copyright 2017 Jonathan Golder jonathan@golderweb.de https://golderweb.de/
+Copyright 2018 Jonathan Golder jonathan@golderweb.de https://golderweb.de/
 alias Wikipedia.org-User _Jogo.obb_ (https://de.wikipedia.org/Benutzer:Jogo.obb)


@@ -293,7 +293,7 @@ class MarkPagesBot( CurrentPageBot ): # sets 'current_page' on each treat()
 # Iterate over Templates with same name (if any) to search equal
 # Link to decide if they are the same
 for present_notice in self.current_wikicode.ifilter_templates(
-        matches=self.disc_notice.name ):
+        matches=str(self.disc_notice.name) ):
     # Get reddisc page.title of notice to add
     add_notice_link_tile = self.disc_notice.get(

201
bots/missingnotice.py Normal file

@@ -0,0 +1,201 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# missingnotice.py
#
# Copyright 2018 Jonathan Golder <jonathan@golderweb.de>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
# MA 02110-1301, USA.
#
#
from sqlalchemy import create_engine
from sqlalchemy.engine.url import URL

import pywikibot

import jogobot

from lib.redfam import RedFamWorker


class MissingNoticeBot(pywikibot.bot.Bot):
    """
    """

    # MySQL-query to get articles with notice
    cat_article_query = """
SELECT `page_title`
FROM `categorylinks`
JOIN `category`
ON `cl_to` = `cat_title`
AND `cat_title` LIKE "{cat}\_%%"
JOIN `page`
ON `cl_from` = `page_id`
""".format(cat=jogobot.config["red.missingnotice"]["article_category"])

    def __init__( self, genFactory, **kwargs ):

        self.categorized_articles = list()
        self.page_content = list()

        super(type(self), self).__init__(**kwargs)

    def run( self ):
        # query articles containing notice
        self.categorized_articles = type(self).get_categorized_articles()

        fam_counter = 0

        # iterate open redfams
        for redfam in RedFamWorker.gen_open():
            fam_counter += 1
            links = self.treat_open_redfam(redfam)

            if links:
                self.page_content.append( self.format_row( links ) )

            if (fam_counter % 50) == 0:
                jogobot.output( "Processed {n:d} open RedFams".format(
                    n=fam_counter))

        else:
            # To write "absent" states to db
            RedFamWorker.flush_db_cache()

            # Update page content
            self.update_page()

    def treat_open_redfam( self, redfam ):
        """
        Works on current open redfam

        @param redfam Redfam to work on
        @type redfam.RedFamWorker

        @returns Tuple of disclink and list of articles missing notice or None
        @rtype ( str, list(str*) ) or None
        """

        # Check if related disc section exist
        if not redfam.disc_section_exists():
            return None

        # Get links for articles without notice
        links = self.treat_articles( redfam.article_generator(
            filter_existing=True, filter_redirects=True ) )

        # No articles without notice
        if not links:
            return None

        return ( redfam.get_disc_link(as_link=True), links )

    def treat_articles(self, articles):
        """
        Iterates over given articles and checks whether they are included in
        self.categorized_articles (contain the notice)

        @param articles Articles to check
        @type articles iterable of pywikibot.page() objects

        @returns Possibly empty list of wikitext links ("[[article]]")
        @rtype list
        """
        links = list()

        for article in articles:

            if article.title(underscore=True, with_section=False ) not in \
                    self.categorized_articles:
                links.append( article.title(as_link=True, textlink=True) )

        return links

    def format_row( self, links ):
        """
        Formats row for output on wikipage

        @param links Tuple of disc link and list of articles as returned by
                     self.treat_open_redfam()
        @type links ( str, list(str*) )

        @returns Formatted row text to add to page_content
        @rtype str
        """

        return jogobot.config["red.missingnotice"]["row_format"].format(
            disc=links[0],
            links=jogobot.config["red.missingnotice"]["link_sep"].join(
                links[1] ) )

    def update_page( self, wikipage=None):
        """
        Handles the updating process of the wikipage

        @param wikipage Wikipage to put text on, otherwise use configured page
        @type wikipage str
        """

        # if not given get wikipage from config
        if not wikipage:
            wikipage = jogobot.config["red.missingnotice"]["wikipage"]

        # Create page object for wikipage
        page = pywikibot.Page(pywikibot.Site(), wikipage)

        # Define edit summary
        summary = jogobot.config["red.missingnotice"]["edit_summary"]

        # Make sure summary starts with "Bot:"
        if not summary[:len("Bot:")] == "Bot:":
            summary = "Bot: " + summary.strip()

        # Concatenate new text
        new_text = "\n".join(self.page_content)

        # Save new text
        self.userPut( page, page.text, new_text, summary=summary )

    @classmethod
    def get_categorized_articles( cls ):
        """
        Queries all articles containing the notice based on category set by
        notice template. Category can be configured in
        jogobot.config["red.missingnotice"]["article_category"]

        @returns List of all articles containing notice
        @rtype list
        """

        # construct connection url for sqlalchemy
        url = URL( "mysql+pymysql",
                   username=pywikibot.config.db_username,
                   password=pywikibot.config.db_password,
                   host=jogobot.config["red.missingnotice"]["wikidb_host"],
                   port=jogobot.config["red.missingnotice"]["wikidb_port"],
                   database=jogobot.config["red.missingnotice"]["wikidb_name"],
                   query={'charset': 'utf8'} )

        # create sqlalchemy engine
        engine = create_engine(url, echo=False)

        # fire the query to get articles with notice
        result = engine.execute(cls.cat_article_query)

        # return list with articles with notice
        return [ row['page_title'].decode("utf-8") for row in result ]
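
The core of `treat_articles()` and `format_row()` can be tried without pywikibot or a database. The following is a minimal sketch, not the bot's actual code: `ROW_FORMAT`, `LINK_SEP`, and the function `links_missing_notice` are hypothetical stand-ins, since the real `row_format` and `link_sep` config values are not part of this diff.

```python
# Hypothetical stand-ins for jogobot.config["red.missingnotice"] values;
# the real row_format and link_sep are not shown in this diff.
ROW_FORMAT = "* {disc}: {links}"
LINK_SEP = ", "


def links_missing_notice(titles, categorized):
    """Return wikilinks for titles that are absent from `categorized`.

    Titles are compared in database form (underscores instead of spaces),
    mirroring article.title(underscore=True) in treat_articles().
    """
    categorized = set(categorized)  # set lookup instead of a list scan
    return ["[[{0}]]".format(title)
            for title in titles
            if title.replace(" ", "_") not in categorized]


def format_row(disc_link, links):
    """Join the discussion link and the article links into one row."""
    return ROW_FORMAT.format(disc=disc_link, links=LINK_SEP.join(links))


links = links_missing_notice(
    ["Deutschland", "Zoo Bremen", "Quodvultdeus"],
    ["Deutschland", "Max_Schlee"])
row = format_row("[[Some page#Some heading]]", links)
print(row)
```

Converting the categorized titles to a set is a design choice of this sketch only; the bot itself keeps them in a list.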

Submodule jogobot deleted from d69d873624


@@ -49,7 +49,7 @@ Base = declarative_base()
 url = URL( "mysql+pymysql",
            username=config.db_username,
            password=config.db_password,
-           host=config.db_hostname,
+           host=config.db_hostname_format.format('tools'),
            port=config.db_port,
            database=( config.db_username +
                       jogobot.config['redundances']['db_suffix'] ),
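
The change above swaps a fixed host name for a format template that is filled with a database name. A minimal sketch of that mechanism follows, assuming a made-up template value; the real `db_hostname_format` comes from the pywikibot configuration and is not shown in this diff.

```python
# Hypothetical template value, used for illustration only.
db_hostname_format = "{0}.db.example.org"

# Filling the placeholder with 'tools', as the patched line does.
host = db_hostname_format.format("tools")
print(host)
```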


@@ -366,6 +366,9 @@ class RedFamParser( RedFam ):
 - 3 and greater status was set by worker script, do not change it
 """
 
+# Since we have parsed it, the section can never be absent
+self.status.remove("absent")
+
 # No ending, discussion is running:
 # Sometimes archived discussions also have no detectable ending
 if not self.ending and not self.redpage.archive:
@@ -598,6 +601,11 @@ class RedFamWorker( RedFam ):
 if self.article_has_status( "redirect", title=article ):
     try:
         page = page.moved_target()
+
+        # Short circuit if movement destination does not exist
+        if not page.exists():
+            continue
     except pywikibot.exceptions.NoMoveTarget:
         pass
@@ -644,10 +652,13 @@ class RedFamWorker( RedFam ):
         self.status.remove("note_rej")
         self.status.add( "marked" )
 
-    def get_disc_link( self ):
+    def get_disc_link( self, as_link=False ):
         """
         Constructs and returns the link to Redundancy discussion
 
+        @param as_link If true, wrap link in double square brackets (wikilink)
+        @type as_link bool
+
         @returns Link to discussion
         @rtype str
         """
@@ -667,7 +678,42 @@ class RedFamWorker( RedFam ):
         anchor_code = mwparser.parse( anchor_code ).strip_code()
 
         # We try it without any more parsing as mw will do while parsing page
-        return ( self.redpage.pagetitle + "#" + anchor_code.strip() )
+        link = self.redpage.pagetitle + "#" + anchor_code.strip()
+
+        if as_link:
+            return "[[{0}]]".format(link)
+        else:
+            return link
+
+    def disc_section_exists( self ):
+        """
+        Checks whether the redundancy discussion still exists. Sometimes
+        it is absent, since the heading was changed and therefore we get a
+        different famhash, ergo a new redfam.
+        As a side effect, the method sets status "absent" for missing sections.
+
+        @returns True if it exists otherwise False
+        @rtype bool
+        """
+        # The redpage
+        discpage = pywikibot.Page(pywikibot.Site(), self.get_disc_link() )
+
+        # Parse redpage content
+        wikicode = mwparser.parse( discpage.get() )
+
+        # List fams
+        fams = wikicode.filter_headings(
+            matches=RedFamParser.is_section_redfam_cb )
+
+        # Check if current fam is in list of fams
+        # If not, set status absent and return False
+        if self.heading not in [ fam.title.strip() for fam in fams]:
+            self.status.remove("open")
+            self.status.add("absent")
+            return False
+
+        # The section exists
+        return True
 
     def generate_disc_notice_template( self ):
         """
@@ -745,6 +791,18 @@ class RedFamWorker( RedFam ):
             yield redfam
 
+    @classmethod
+    def gen_open( cls ):
+        """
+        Yield red_fams stored in db by given status which have an ending after
+        given one
+        """
+        for redfam in RedFamWorker.session.query(RedFamWorker).filter(
+                # NOT WORKING WITH OBJECT NOTATION
+                text("status LIKE '%open%'") ):
+            yield redfam
+
+
 class RedFamError( Exception ):
     """

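The `as_link` switch added to `get_disc_link()` and the heading check in `disc_section_exists()` can be illustrated in isolation. This is a simplified sketch under stated assumptions: `build_disc_link` and `section_exists` are hypothetical free functions, and plain strings replace the page objects and parsed headings used by the real methods.

```python
def build_disc_link(pagetitle, anchor, as_link=False):
    """Build 'Page#Anchor', optionally wrapped as a wikilink."""
    link = pagetitle + "#" + anchor.strip()
    if as_link:
        return "[[{0}]]".format(link)
    return link


def section_exists(heading, page_headings):
    """Check whether `heading` is among the stripped page headings."""
    return heading in [title.strip() for title in page_headings]


plain = build_disc_link("Example page", " Article A - Article B ")
wikilink = build_disc_link("Example page", "Article A - Article B",
                           as_link=True)
print(plain)
print(wikilink)
print(section_exists("Article A - Article B",
                     [" Article A - Article B ", " Other "]))
```

Keeping `as_link=False` as the default leaves existing callers of the plain link unaffected, which matches the signature shown in the diff.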
4
red.py

@@ -73,6 +73,10 @@ def prepare_bot( task_slug, subtask, genFactory, subtask_args ):
         # Import related bot
         from bots.markpages import MarkPagesBot as Bot
 
+    elif subtask == "missingnotice":
+        # Import related bot
+        from bots.missingnotice import MissingNoticeBot as Bot
+
     # Subtask error
     else:
         jogobot.output( (


@@ -21,3 +21,6 @@ PyMySQL>=0.7
 # Also needed, but not covered here, is a working copy of pywikibot-core
 # which also brings mwparserfromhell
+
+# jogobot
+git+https://git.golderweb.de/wiki/jogobot.git#egg=jogobot

28
tests/context.py Normal file

@@ -0,0 +1,28 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# missingnotice_tests.py
#
# Copyright 2018 Jonathan Golder <jonathan@golderweb.de>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
# MA 02110-1301, USA.
#
#
import os
import sys
sys.path.insert(
0, os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))


@@ -0,0 +1,94 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# missingnotice_tests.py
#
# Copyright 2018 Jonathan Golder <jonathan@golderweb.de>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
# MA 02110-1301, USA.
#
#
"""
Test module bot/missingnotice.py
"""
import unittest
from unittest import mock  # noqa

import pywikibot

import context  # noqa
from bots.missingnotice import MissingNoticeBot  # noqa


class TestMissingNoticeBot(unittest.TestCase):
    """
    Test class MissingNoticeBot
    """

    def setUp(self):
        genFactory = pywikibot.pagegenerators.GeneratorFactory()
        self.MissingNoticeBot = MissingNoticeBot(genFactory)
        self.MissingNoticeBot.categorized_articles = [ "Deutschland",
                                                       "Max_Schlee",
                                                       "Hodeng-Hodenger" ]

    @mock.patch( 'sqlalchemy.engine.Engine.execute',
                 return_value=( { "page_title": b"a", },
                                { "page_title": b"b", },
                                { "page_title": b"c", },
                                { "page_title": b"d", }, ) )
    def test_get_categorized_articles(self, execute_mock):
        """
        Test method get_categorized_articles()
        """
        self.assertFalse(execute_mock.called)

        result = MissingNoticeBot.get_categorized_articles()

        self.assertTrue(execute_mock.called)
        self.assertEqual(result, ["a", "b", "c", "d"] )

    def test_treat_articles( self ):
        """
        Test method treat_articles()
        """
        # articles with notice
        a = pywikibot.Page(pywikibot.Site(), "Deutschland" )
        b = pywikibot.Page(pywikibot.Site(), "Max_Schlee" )
        c = pywikibot.Page(pywikibot.Site(), "Hodeng-Hodenger#Test" )
        # articles without notice
        x = pywikibot.Page(pywikibot.Site(), "Quodvultdeus" )
        y = pywikibot.Page(pywikibot.Site(), "Zoo_Bremen" )
        z = pywikibot.Page(pywikibot.Site(), "Nulka#Test" )

        cases = ( ( ( a, b, c ), list() ),
                  ( ( x, y, z ), [ "[[Quodvultdeus]]",
                                   "[[Zoo Bremen]]",
                                   "[[Nulka#Test]]" ]),
                  ( ( a, b, y, z ), [ "[[Zoo Bremen]]",
                                      "[[Nulka#Test]]" ]), )

        for case in cases:
            res = self.MissingNoticeBot.treat_articles( case[0] )

            self.assertEqual( res, case[1] )


if __name__ == '__main__':
    unittest.main()
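
The mocking pattern used in `test_get_categorized_articles` can be reproduced with the standard library alone. The sketch below is an illustration under stated assumptions: `FakeEngine` and `fetch_titles` are hypothetical stand-ins for the SQLAlchemy engine and `get_categorized_articles()`, so no database or pywikibot installation is needed.

```python
from unittest import mock


class FakeEngine:
    """Hypothetical stand-in for a database engine."""

    def execute(self, query):
        raise RuntimeError("no database available in this sketch")


def fetch_titles(engine):
    """Decode the byte-string titles returned by the engine."""
    rows = engine.execute("SELECT page_title FROM page")
    return [row["page_title"].decode("utf-8") for row in rows]


# Rows shaped like the patched return value in the test above
rows = ({"page_title": b"a"}, {"page_title": b"Zoo_Bremen"})

# Replace execute() for the duration of the with-block only
with mock.patch.object(FakeEngine, "execute",
                       return_value=rows) as execute_mock:
    titles = fetch_titles(FakeEngine())

print(titles)
print(execute_mock.called)
```

Patching `execute` lets the decoding step be checked without touching a live database, which is the same idea the test applies to `sqlalchemy.engine.Engine.execute`.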