Commit Graph

229 Commits

Author SHA1 Message Date
ec2b84df2a Add requirements
To make setup of environment for this module easier
test-v4
2017-09-23 21:09:58 +02:00
88848cb084 Prepare Version test-v4 for release
Add a README.md file for this project
2017-09-23 20:32:13 +02:00
5057aed0d3 Merge branch 'fs#157-lowercase-title' into develop 2017-09-09 21:47:03 +02:00
02e53475f1 Prevent lowercase article titles in Parser
Since article titles starting with a lowercase letter are not allowed,
make sure to convert the first letter of every article title to uppercase.
This is necessary since pywikibot returns article titles this way.

Related Task: [FS#157](https://fs.golderweb.de/index.php?do=details&task_id=157)
2017-09-09 21:35:36 +02:00
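The first-letter normalisation described in 02e53475f1 could look roughly like the following sketch. The helper name `ucfirst_title` is hypothetical and not taken from the repository; it only illustrates the idea of uppercasing the first character while leaving the rest of the title untouched.

```python
def ucfirst_title(title):
    """Return title with only its first character converted to uppercase."""
    # Slicing (instead of indexing) keeps this safe for empty strings.
    # str.capitalize() is avoided on purpose: it would lowercase the rest.
    return title[:1].upper() + title[1:]
```

Called as `ucfirst_title("iPhone")`, this yields `"IPhone"`, while a title that already starts with an uppercase letter is returned unchanged.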
d6f9b460c9 Merge branch 'fs#156-dbapi-charset' into develop 2017-09-02 22:13:20 +02:00
ff03ca8f13 Explicitly set charset for PyMySQL-Connection
Since the PyMySQL connection otherwise uses charset 'latin-1', explicitly
set the connection charset to 'utf8'.

http://docs.sqlalchemy.org/en/rel_1_0/dialects/mysql.html#charset-selection
http://docs.sqlalchemy.org/en/rel_1_0/core/engines.html?highlight=url#sqlalchemy.engine.url.URL

Related Task: [FS#156](https://fs.golderweb.de/index.php?do=details&task_id=156)
2017-09-02 22:10:25 +02:00
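One way to set the charset, as described in the linked SQLAlchemy documentation, is a `charset` query parameter on the database URL. The sketch below only builds such a URL string with the standard library; the function name `build_db_url` and the credentials are placeholders, and the actual commit may well construct the URL differently.

```python
from urllib.parse import quote_plus


def build_db_url(user, password, host, database, charset="utf8"):
    """Build a SQLAlchemy-style URL for the mysql+pymysql dialect."""
    # quote_plus() escapes characters such as '@' or '/' in credentials.
    return (
        f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}/{database}?charset={charset}"
    )


# Placeholder values, for illustration only.
url = build_db_url("bot", "s3cr3t", "localhost", "redfam")
```

The resulting string would then be passed to `sqlalchemy.create_engine()`.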
88692ca678 Merge branch 'fs#155-article-surouding-space' into develop 2017-09-02 22:08:31 +02:00
d9b4fcc0bd Strip spaces before adding articles to redfam
Some article links have surrounding spaces in their link text. Remove them
before adding the article to the RedFam to get a canonical title.

Related Task: [FS#155](https://fs.golderweb.de/index.php?do=details&task_id=155)
2017-09-02 22:06:30 +02:00
22ff78ea98 Merge branch 'fs#154-categorie-colons-missing' into develop 2017-09-02 16:02:45 +02:00
b3cfcdc259 Improve title detection to get correct behaviour
Make sure that category links start with a colon and that non-article
pages are returned with their namespace.

Related Task: [FS#154](https://fs.golderweb.de/index.php?do=details&task_id=154)
2017-09-02 15:59:34 +02:00
b3e0ace2f4 Merge branch 'fs#153-nested-templates' into develop 2017-09-02 14:25:21 +02:00
f8002c85da Do not search for templates recursively
Since nested templates do not get an index in the global wikicode object,
searching for the index of a nested template results in a ValueError.

Related Task: [FS#153](https://fs.golderweb.de/index.php?do=details&task_id=153)
2017-09-02 14:23:25 +02:00
49bc05d29b Merge branch 'fs#151-normalize-article-titles-anchor' into develop 2017-09-02 13:36:17 +02:00
8a26b6d92a Normalize article titles with anchors
In our db, article titles with anchors are stored with underscores in
the anchor string. Therefore we need to replace spaces in the anchor
string returned by pywikibot.Page.title().

Related Task: [FS#151](https://fs.golderweb.de/index.php?do=details&task_id=151)
2017-08-25 18:11:41 +02:00
49a8230d76 Merge branch 'fs#141-place-notice-after-comment' into develop 2017-08-25 17:11:28 +02:00
31c10073a2 Prevent index errors searching for comments
Make sure not to exceed the existing indexes of the wikicode object while
searching for comments.

Related Task: [FS#141](https://fs.golderweb.de/index.php?do=details&task_id=141)
2017-08-25 17:09:38 +02:00
642a29b022 Improve regex for blank lines
Do not match consecutive linebreaks as one

Related Task: [FS#141](https://fs.golderweb.de/index.php?do=details&task_id=141)
2017-08-24 18:47:18 +02:00
2f90751dc2 Merge branch 'fs#146-famhash-generator' into develop 2017-08-24 12:27:54 +02:00
024be69fe1 Use famhash as generator
If famhash is defined, fetch exactly that redfam from the db and work
only on it.

Related Task: [FS#146](https://fs.golderweb.de/index.php?do=details&task_id=146)
2017-08-24 12:27:13 +02:00
b6d7268a7f select by famhash: Add methods to get param in bot
We need a method as callback to get bot specific params passed through
to our bot class.
Introduce -famhash parameter to work on specific famhash

Related Task: [FS#146](https://fs.golderweb.de/index.php?do=details&task_id=146)
2017-08-24 12:27:13 +02:00
526184c1e1 Merge branch 'fs#148-articles-mixed-up' into develop 2017-08-24 12:26:53 +02:00
3aa6c5fb1c Disable PreloadingGenerator temporarily
PreloadingGenerator mixes up yielded Pages. This is very inconvenient
for the semi-automatic workflow with manual checks, as the articles of a
RedFam no longer follow each other.

Related Task: [FS#148](https://fs.golderweb.de/index.php?do=details&task_id=148)
2017-08-24 12:23:17 +02:00
ec8f459db5 Merge branch 'fs#138-marked-articles-shown-again' into develop 2017-08-24 12:19:24 +02:00
3b2cb95f36 Do not fetch marked redfams from db
Exclude marked Redfams from DB-Query to prevent marking them again

Related Task: [FS#138](https://fs.golderweb.de/index.php?do=details&task_id=138)
2017-08-24 12:09:43 +02:00
41e5cc1a9d Merge branch 'fs#141-place-notice-after-comment' into develop 2017-08-24 12:06:03 +02:00
9b9d50c4d2 Improve detection of empty lines
Search with RegEx as empty lines could also contain spaces

Related Task: [FS#141](https://fs.golderweb.de/index.php?do=details&task_id=141)
2017-08-24 12:04:45 +02:00
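A minimal sketch of the idea behind 9b9d50c4d2: treat a line as empty when it contains nothing or only spaces and tabs. The pattern and the helper name `blank_line_numbers` are assumptions for illustration; the regex used in the repository is not shown in this log.

```python
import re

# A line counts as blank if it is empty or holds only spaces/tabs.
BLANK_LINE = re.compile(r"[ \t]*")


def blank_line_numbers(text):
    """Return the zero-based indexes of all blank lines in text."""
    return [
        number
        for number, line in enumerate(text.split("\n"))
        if BLANK_LINE.fullmatch(line)
    ]
```

Because every line is tested on its own, two consecutive blank lines are reported as two separate lines rather than being merged into one match.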
a755288700 Merge branch 'fs#147-templates-in-heading' into develop 2017-08-23 14:55:43 +02:00
14ec71dd09 Rewrite get_disc_link to handle special cases
Use methods of the pywikibot site object and mwparser to get rid of any
special elements like templates or links in headings when constructing
our disc link.
Replace &nbsp; by hand, as it would otherwise occur as a normal space and
won't work.

Related Task: [FS#147](https://fs.golderweb.de/index.php?do=details&task_id=147)
2017-08-23 14:53:22 +02:00
e283eb78ac Merge branch 'fs#140-also-mark-redirects' into develop 2017-08-22 21:59:22 +02:00
cc02006fd2 Do not exclude redirects from being marked
In accordance with Zulu55, redirect discussion pages should also get
a notice; therefore do not exclude redirects.

Related Task: [FS#140](https://fs.golderweb.de/index.php?do=details&task_id=140)
2017-08-22 21:59:07 +02:00
37b0cbef08 Merge branch 'fs#138-marked-articles-shown-again' into develop 2017-08-22 21:58:22 +02:00
4137d72468 Look for existing notice by simple in-check
To detect notices that are already present, possibly uncommented, check
for them using a simple Python 'x in y' check over the whole wikicode.

Related Task: [FS#138](https://fs.golderweb.de/index.php?do=details&task_id=138)
2017-08-22 21:56:43 +02:00
cd87d1c2bb Fix bug where already marked articles were shown again
Since we search for matching states for articles to include or exclude
in an inner loop, we could not control the outer loop via plain break/
continue. The Python docs suggest using exceptions and try/except
structures as the most convenient way to achieve this.

https://docs.python.org/3/faq/design.html#why-is-there-no-goto

Related Task: [FS#138](https://fs.golderweb.de/index.php?do=details&task_id=138)
2017-08-22 21:45:58 +02:00
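The pattern referred to in cd87d1c2bb, using an exception to continue the outer loop from inside an inner loop, might be sketched as follows. The exception name `SkipArticle`, the function `filter_articles`, and the `(title, states)` data shape are all hypothetical and not taken from the repository.

```python
class SkipArticle(Exception):
    """Raised inside the inner loop to skip the current article."""


def filter_articles(articles, exclude_states):
    """Return titles of articles that carry none of the exclude_states.

    articles is an iterable of (title, states) pairs (assumed shape).
    """
    kept = []
    for title, states in articles:
        try:
            for state in states:
                if state in exclude_states:
                    # A plain 'continue' here would only affect the
                    # inner loop, so leave it via an exception instead.
                    raise SkipArticle
        except SkipArticle:
            continue  # proceed with the next article in the outer loop
        kept.append(title)
    return kept
```

With `exclude_states={"marked"}`, an article whose states include `"marked"` is skipped entirely, while all others are kept in their original order.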
456b2ba3d4 Merge branch 'fs#141-place-notice-after-comment' into develop 2017-08-21 22:11:51 +02:00
47b85a0b5e Add missing line break if there is no template
To make sure our notice template sits on its own line in every case

Related Task: [FS#141](https://fs.golderweb.de/index.php?do=details&task_id=141)
2017-08-21 22:09:59 +02:00
a6fdc974bd Merge branch 'fs#144-PyMySQL-instead-oursql' into develop 2017-08-21 13:58:34 +02:00
30de2a2e12 Replace oursql with PyMySQL
Since this is preferred on Toolforge and works out of the box after
installing via pip, replace oursql, which caused some problems.
In particular, oursql was not able to connect to the db via an ssh tunnel.

Related Task: [FS#144](https://fs.golderweb.de/index.php?do=details&task_id=144)
2017-08-21 13:55:33 +02:00
4a6855cf7b Merge branch 'fs#141-place-notice-after-comment' into develop 2017-08-21 13:51:32 +02:00
8422d08cb6 Keep comments and leading templates together
Prevent splitting up existing comments and templates, as those often
document the behaviour of archive templates

Related Task: [FS#141](https://fs.golderweb.de/index.php?do=details&task_id=141)
2017-08-21 13:49:34 +02:00
ed78501821 Merge branch 'fs#115-lsec-no-template' into test-v3 test-v3 2017-08-21 13:16:28 +02:00
34e7e0d3be Prevent IndexError if no template in leadsec
Check if there is a template in leadsec before accessing the list item to
prevent IndexErrors

Related Task: [https://fs.golderweb.de/index.php?do=details&task_id=115 FS#115]
2017-03-11 12:22:10 +01:00
f9f081d072 Merge branch 'fs#114-remove-underscores-in-articles' into test-v3 2017-03-11 11:43:49 +01:00
0f930082b4 Also canonicalise anchor parts of articles
Replace spaces in anchors with underscores as spaces are not correct
there

Related Task: [https://fs.golderweb.de/index.php?do=details&task_id=114 FS#114]
2017-03-11 11:40:41 +01:00
80c94ccf4f Replace underscores in article titles
Remove underscores in article titles and replace them with spaces to have
a canonical state for all articles.
Therefore we need to split title and possible anchors in the heading parser.

Related Task: [https://fs.golderweb.de/index.php?do=details&task_id=114 FS#114]
2017-03-11 11:30:19 +01:00
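Taken together, the two FS#114 commits describe a canonical form with spaces in the title part and underscores in the anchor part. A rough sketch of that rule, using a hypothetical helper `canonicalise_title` that is not taken from the repository:

```python
def canonicalise_title(raw):
    """Return raw with spaces in the title and underscores in the anchor."""
    # partition() splits at the first '#'; sep and anchor are empty
    # strings when the title has no anchor.
    title, sep, anchor = raw.partition("#")
    title = title.replace("_", " ")
    anchor = anchor.replace(" ", "_")
    return title + sep + anchor
```

For example, `canonicalise_title("Foo_Bar#Some section")` gives `"Foo Bar#Some_section"`.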
37704c6661 Replace pywikibot.showDiff with patched version
Pywikibot.bot.userPut does not support setting the value of diff context
so it is always zero. Therefore we need to patch either userPut or
showDiff to get some context.

Related Task: [https://fs.golderweb.de/index.php?do=details&task_id=113 FS#113]
2017-03-11 10:39:31 +01:00
4e4be1c6d0 Merge branch 'fs#110-markpages-status-problems' into test-v3 2017-03-11 00:06:00 +01:00
3e69a1c77e Remove problem-indicating statuses when setting marked
Remove states that indicate problems in previous runs once an article,
and also the whole RedFam, has been marked successfully

[https://fs.golderweb.de/index.php?do=details&task_id=112 FS#112]

Related Task: [https://fs.golderweb.de/index.php?do=details&task_id=110 FS#110]
2017-03-11 00:03:42 +01:00
56f326b568 Fix error where all current redfams were marked on quit
Restructure update_status to make sure marked is only set when all
articles are marked or gone (i.e. deleted or redirected)

[https://fs.golderweb.de/index.php?do=details&task_id=111 FS#111]

Related Task: [https://fs.golderweb.de/index.php?do=details&task_id=110 FS#110]
2017-03-10 23:45:48 +01:00
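The rule stated in 56f326b568 can be illustrated with a small sketch. The state names, the data shape (one set of state strings per article), and the function name `redfam_is_marked` are assumptions; the real `update_status` is not shown in this log.

```python
# Assumed names for the states that mean an article is "gone".
GONE_STATES = {"deleted", "redirect"}


def redfam_is_marked(article_states):
    """Return True only if every article is marked or gone.

    article_states is a list with one set of state strings per article.
    """
    # An empty RedFam is never reported as marked.
    return bool(article_states) and all(
        "marked" in states or bool(states & GONE_STATES)
        for states in article_states
    )
```

Under this rule, quitting after only some articles have been handled leaves the RedFam unmarked, since at least one article is neither marked nor gone.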
868894a38b Format fixes
Set locale to de_DE.utf-8 for whole Task

Make sure Template is added in own source line
2017-03-10 23:28:24 +01:00
65de6decb2 markpages: Filter redirects
Do not mark discussion pages of redirects
2017-03-10 21:51:59 +01:00