35 Commits

Author SHA1 Message Date
4f31b1a792 Merge branch 'release-1.1.1' 2018-08-12 11:48:17 +02:00
3fbfd4ccd7 Prepare release-1.1.1 2018-08-12 11:46:40 +02:00
50b0e142ec Merge branch 'i#71-moved-page-exists' into develop 2018-08-12 11:43:18 +02:00
14db996a43 redfam: Check if moved page exists
To prevent creation of orphaned talk pages in case of unusual page move
constructs

Issue #71 (#71)
2018-08-12 11:41:50 +02:00
110589cb5b Merge branch 'release-1.1' back into develop 2018-08-12 11:15:30 +02:00
5c277495a3 Merge branch 'release-1.1' 2018-05-17 12:41:35 +02:00
a466ab4e74 Prepare release-1.1 2018-05-17 12:41:06 +02:00
860a285ab0 Merge branch 'i#68-exclude-users' into develop 2018-05-17 12:36:37 +02:00
2c105336b0 RedFamWorker: Exclude users and user talkpages
Users can't be part of valid redundances

Issue #68 (#68)
2018-05-17 12:35:38 +02:00
ea85ca731f Merge branch 'i#69-already-talkpage' into develop 2018-05-17 12:28:17 +02:00
6e119ea98f RedFamWorker: Improve talkpagetoggling
Do not toggle to the main page if we already have a talkpage, and vice versa

Issue #69 (#69)
2018-05-17 12:26:37 +02:00
67aaf3cbbe Merge branch 'i#70-follow-moved-pages' into develop 2018-05-17 12:24:00 +02:00
fa13e2a5cf Follow moved pages
Keep notice together with content
https://de.wikipedia.org/w/index.php?title=Benutzer_Diskussion:Jogo.obb&oldid=176464377#Redundanzhinweis_zu_zwischenzeitlich_verschobenen_Artikeln

Issue #70 (#70)
2018-05-17 12:18:13 +02:00
562e689418 Merge branch 'release-1.0' back into develop 2017-11-05 12:30:18 +01:00
ae1ee7d6a5 Merge branch 'release-1.0' 2017-11-05 12:28:21 +01:00
93447d8dc6 Prepare release v1.0
Update Copyright Notices
Version information
2017-11-05 12:25:13 +01:00
1b6faf9e53 Use own db for red-task
Since we have several tables and sometimes need to create a copy on
replication servers.
2017-11-05 12:17:05 +01:00
b4c193eedc Disable echoing of SQLAlchemy Engine
We don't need this extensive output for production
2017-11-05 12:07:38 +01:00
788a3df0cd Update jogobot-submodule to v0.1 2017-11-05 12:00:28 +01:00
04f591b466 Merge branch 'fs#161-add-article-titles' into develop 2017-11-05 11:24:15 +01:00
9640467f69 markpages: Use redarticle attribute of Page
Instead of trying to reconstruct our db article title, use the one added
to Page-object by redfam.article_generator

Related Task: [FS#161](https://fs.golderweb.de/index.php?do=details&task_id=161)
2017-11-05 11:22:43 +01:00
bfec2abf98 markpages: Get rid of PageWithTalkPageGenerator
Since redfam.article_generator can yield talkpage with additional
information about redfam and current article from db, we do not need it
anymore.

Related Task: [FS#161](https://fs.golderweb.de/index.php?do=details&task_id=161)
2017-11-05 11:20:55 +01:00
20103d589d redfam: article_generator add redfam info to page
Add reference to redfam object and article title from db to Page object,
since Page.title() may differ (short namespaces, anchors, special chars)

Related Task: [FS#161](https://fs.golderweb.de/index.php?do=details&task_id=161)
2017-11-05 11:18:53 +01:00
e18aa96a84 redfam: article_generator can return talkpage
To make pywikibot.pagegenerators.PageWithTalkPageGenerator unnecessary
so we can manipulate the talkpage object directly

Related Task: [FS#161](https://fs.golderweb.de/index.php?do=details&task_id=161)
2017-11-05 11:15:04 +01:00
1dd4c7f87e Merge branch 'test-v7' back into develop 2017-11-02 18:57:59 +01:00
33b2e47312 Describe version test-v7 2017-10-28 22:43:53 +02:00
3bd17ce692 Merge branch 'fs#160-urlencoded-chars' into develop 2017-10-28 22:36:55 +02:00
5f4640d5ff Replace urlencoded chars with unicode equivalent
Otherwise we get value errors while marking since pwb replaces those

Related Task: [FS#160](https://fs.golderweb.de/index.php?do=details&task_id=160)
2017-10-28 22:35:25 +02:00
7e0456ae4f Merge branch 'test-v6' back into develop 2017-10-28 22:34:30 +02:00
108b7aa331 Describe version test-v6 2017-10-28 18:46:30 +02:00
a3adf31b89 Merge branch 'fs#86-activate-status-api' into develop 2017-10-28 18:44:42 +02:00
614f288bb9 Activate jogobot status api for onwiki disabling
Related Task: [FS#86](https://fs.golderweb.de/index.php?do=details&task_id=86)
2017-10-28 18:44:05 +02:00
c450a045bf Merge branch 'fs#159-space-before-anchor' into develop 2017-10-28 18:43:13 +02:00
84802cf521 Remove leading or trailing spaces in articles
Some articles contain spaces between title and anchor part which will
be stripped now

Related Task: [FS#159](https://fs.golderweb.de/index.php?do=details&task_id=159)
2017-10-28 18:41:06 +02:00
5f6c443ba8 Merge branch 'test-v5' back into develop 2017-10-28 18:17:01 +02:00
8 changed files with 74 additions and 28 deletions


@@ -18,6 +18,22 @@ Those can be installed using pip and the _requirements.txt_ file provided with t
 Versions
 --------
+* v1.1.1
+  - Check if moved page exists
+* v1.1
+  - Improved page filter
+* v1.0
+  - first stable release
+  - less debug output
+  - fixed problems with article title
+* test-v7
+  - Fixed problem with url encoded chars in article title
+* test-v6
+  - jogobot status API enabled (Bot can be disabled onwiki)
+  - Fixed problem with space between article title and anchor
 * test-v5
   - Feature _markpages_ working in full-automatic mode with _always_-flag


@@ -3,7 +3,7 @@
 #
 # markpages.py
 #
-# Copyright 2016 GOLDERWEB Jonathan Golder <jonathan@golderweb.de>
+# Copyright 2017 Jonathan Golder <jonathan@golderweb.de>
 #
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@@ -145,14 +145,10 @@ class MarkPagesBot( CurrentPageBot ): # sets 'current_page' on each treat()
         for redfam in self.redfams:
             # We need the talkpage (and only this) of each existing page
-            for talkpage in pagegenerators.PageWithTalkPageGenerator(
-                    redfam.article_generator(
-                        filter_existing=True,
-                        exclude_article_status=["marked"] ),
-                    return_talk_only=True ):
-                # Add reference to redfam to talkpages
-                talkpage.redfam = redfam
+            for talkpage in redfam.article_generator(
+                    filter_existing=True,
+                    exclude_article_status=["marked"],
+                    talkpages=True ):
                 yield talkpage
@@ -188,14 +184,8 @@ class MarkPagesBot( CurrentPageBot ): # sets 'current_page' on each treat()
         # None if change was not accepted by user
         save_ret = self.put_current( self.new_text, summary=summary )
-        # Normalize title with anchor (replace spaces in anchor)
-        article = self.current_page.toggleTalkPage().title(
-            asLink=True, textlink=True)
-        article = article.strip("[]")
-        article_parts = article.split("#", 1)
-        if len(article_parts) == 2:
-            article_parts[1] = article_parts[1].replace(" ", "_")
-        article = "#".join(article_parts)
+        # Get article as named in db
+        article = self.current_page.redarticle
         # Status
         if add_ret is None or ( add_ret and save_ret ):


@@ -3,7 +3,7 @@
 #
 # reddiscparser.py
 #
-# Copyright 2016 GOLDERWEB Jonathan Golder <jonathan@golderweb.de>
+# Copyright 2017 Jonathan Golder <jonathan@golderweb.de>
 #
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by

Submodule jogobot updated: 49ada2993e...d69d873624


@@ -3,7 +3,7 @@
 #
 # mysqlred.py
 #
-# Copyright 2015 GOLDERWEB Jonathan Golder <jonathan@golderweb.de>
+# Copyright 2017 Jonathan Golder <jonathan@golderweb.de>
 #
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@@ -51,10 +51,11 @@ url = URL( "mysql+pymysql",
            password=config.db_password,
            host=config.db_hostname,
            port=config.db_port,
-           database=config.db_username + jogobot.config['db_suffix'],
+           database=( config.db_username +
+                      jogobot.config['redundances']['db_suffix'] ),
            query={'charset': 'utf8'} )
-engine = create_engine(url, echo=True)
+engine = create_engine(url, echo=False)
 Session = sessionmaker(bind=engine)
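
The changed `database=` argument builds the database name from the account's user name plus a suffix stored in a task-specific section of the configuration. A minimal sketch of that lookup, assuming the configuration behaves like a nested dict; `database_name` is a hypothetical helper and the values `s12345` and `__redundances` are made up for illustration:

```python
def database_name(db_username, bot_config, task="redundances"):
    """Return the task-specific database name (hypothetical helper)."""
    # The suffix lives in the task's own config section, so every task
    # can point at its own database.
    return db_username + bot_config[task]["db_suffix"]


# Illustrative values only; the real ones come from the bot's configuration.
example_config = {"redundances": {"db_suffix": "__redundances"}}
example_name = database_name("s12345", example_config)
```

Keeping the suffix per task means a second task can use a different database without touching this module.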


@@ -3,7 +3,7 @@
 #
 # redfam.py
 #
-# Copyright 2017 GOLDERWEB Jonathan Golder <jonathan@golderweb.de>
+# Copyright 2018 Jonathan Golder <jonathan@golderweb.de>
 #
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@@ -28,6 +28,7 @@ Provides classes for working with RedFams
 import hashlib
 import locale
 import re
+import urllib.parse
 from datetime import datetime
 import mwparserfromhell as mwparser # noqa
@@ -291,12 +292,19 @@ class RedFamParser( RedFam ):
         # Make sure first letter is uppercase
         article = article[0].upper() + article[1:]
+        # Unquote possible url encoded special chars
+        article = urllib.parse.unquote( article )
         # Split in title and anchor part
         article = article.split("#", 1)
         # Replace underscores in title with spaces
         article[0] = article[0].replace("_", " ")
         if len(article) > 1:
+            # Strip both parts to prevent leading/trailing spaces
+            article[0] = article[0].strip()
+            article[1] = article[1].strip()
             # other way round, replace spaces with underscores in anchors
             article[1] = article[1].replace(" ", "_")
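
The normalization steps visible in this hunk can be tried in isolation. The following is a sketch only: `normalize_article` is a hypothetical standalone function, and rejoining the parts with `#` at the end is an assumption, since the hunk does not show what the parser does with the list afterwards.

```python
import urllib.parse


def normalize_article(article):
    """Mirror the title/anchor normalization steps shown in the hunk."""
    # Make sure first letter is uppercase
    article = article[0].upper() + article[1:]
    # Unquote possible url encoded special chars (e.g. %C3%A4 -> ä)
    article = urllib.parse.unquote(article)
    # Split in title and anchor part
    parts = article.split("#", 1)
    # Replace underscores in title with spaces
    parts[0] = parts[0].replace("_", " ")
    if len(parts) > 1:
        # Strip both parts to prevent leading/trailing spaces
        parts[0] = parts[0].strip()
        parts[1] = parts[1].strip()
        # Other way round: replace spaces with underscores in anchors
        parts[1] = parts[1].replace(" ", "_")
    # Assumption: join the parts again to get a single title string
    return "#".join(parts)


example = normalize_article("beispiel_Artikel%C3%A4 #Ein Abschnitt ")
```

As in the hunk, stripping only happens when an anchor is present, and an empty input string is not handled.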
@@ -506,7 +514,8 @@ class RedFamWorker( RedFam ):
     def article_generator(self, # noqa
                           filter_existing=None, filter_redirects=None,
                           exclude_article_status=[],
-                          onlyinclude_article_status=[] ):
+                          onlyinclude_article_status=[],
+                          talkpages=None ):
         """
         Yields pywikibot pageobjects for articles belonging to this redfams
         in a generator
@@ -520,6 +529,8 @@ class RedFamWorker( RedFam ):
                                 set to False to get only redirectpages,
                                 unset/None results in not filtering
         @type filter_redirects bool/None
+        @param talkpages Set to True to get Talkpages instead of article page
+        @type talkpages bool/None
         """
@@ -583,6 +594,34 @@ class RedFamWorker( RedFam ):
             except Break:
                 break
+            # Follow moved pages
+            if self.article_has_status( "redirect", title=article ):
+                try:
+                    page = page.moved_target()
+                    # Short circuit if movement destination does not exist
+                    if not page.exists():
+                        continue
+                except pywikibot.exceptions.NoMoveTarget:
+                    pass
+            # Exclude Users & User Talkpage
+            if page.namespace() == 2 or page.namespace() == 3:
+                self.article_add_status( "user", title=article )
+                continue
+            # Toggle talkpage
+            if talkpages and not page.isTalkPage() or\
+                    not talkpages and page.isTalkPage():
+                page = page.toggleTalkPage()
+            # Add reference to redfam to pages
+            page.redfam = self
+            # Keep article title from db with page object
+            page.redarticle = article
             # Yield filtered pages
             yield page
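
The toggle condition added above is an exclusive-or: a page is toggled only when what the caller asked for (`talkpages`) differs from what the page currently is. A minimal sketch of that rule together with the user-namespace exclusion, using a hypothetical `StubPage` stand-in instead of a real pywikibot `Page` (MediaWiki namespaces 2 and 3 are User and User talk):

```python
from dataclasses import dataclass


@dataclass
class StubPage:
    """Stand-in for a pywikibot Page; only what the sketch needs."""
    title: str
    ns: int  # even = subject namespace, odd = its talk namespace

    def namespace(self):
        return self.ns

    def isTalkPage(self):
        return self.ns % 2 == 1

    def toggleTalkPage(self):
        # Talk namespace is subject namespace + 1, and vice versa
        new_ns = self.ns - 1 if self.isTalkPage() else self.ns + 1
        return StubPage(self.title, new_ns)


def select_pages(pages, talkpages=None):
    """Yield pages filtered and toggled as in the hunk above (sketch)."""
    for page in pages:
        # Exclude user pages and user talk pages
        if page.namespace() in (2, 3):
            continue
        # Toggle only if requested kind and actual kind differ (XOR)
        if bool(talkpages) != page.isTalkPage():
            page = page.toggleTalkPage()
        yield page


pages = [StubPage("A", 0), StubPage("B", 1), StubPage("Some user", 2)]
talk_namespaces = [p.ns for p in select_pages(pages, talkpages=True)]
subject_namespaces = [p.ns for p in select_pages(pages)]
```

With `talkpages=True` both remaining pages end up in the talk namespace; with the default `None` both end up in the subject namespace, so a page that is already of the requested kind is never toggled away from it.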


@@ -3,7 +3,7 @@
 #
 # redpage.py
 #
-# Copyright 2015 GOLDERWEB Jonathan Golder <jonathan@golderweb.de>
+# Copyright 2017 Jonathan Golder <jonathan@golderweb.de>
 #
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by

red.py

@@ -3,7 +3,7 @@
 #
 # reddiscparser.py
 #
-# Copyright 2016 GOLDERWEB Jonathan Golder <jonathan@golderweb.de>
+# Copyright 2017 Jonathan Golder <jonathan@golderweb.de>
 #
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@@ -124,8 +124,8 @@ def main(*args):
     # Disabled until [FS#86] is done
     # Before run, we need to check whether we are currently active or not
-    # if not jogobot.bot.active( task_slug ):
-    #     return
+    if not jogobot.bot.active( task_slug ):
+        return
     # Parse local Args to get information about subtask
     ( subtask, genFactory, subtask_args ) = jogobot.bot.parse_local_args(