Article titles that begin with a lowercase letter should be converted to begin with an uppercase letter during parsing.
DB query approach (still needs to be extended to all articleX columns):
<code>
SELECT *
FROM `dewikipedia_redfams`
WHERE CAST( `article0` as BINARY ) REGEXP '^[a-z]'
</code>
Steps to fix:
- Adjust the parser
- Clear "parsed" on the affected RedPages via a subselect
- Run the parser
- Spot-check the new RedFams (new Famhash)
- Delete the old RedFams
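
The parser adjustment from the first step could look roughly like the sketch below. It assumes article titles are available as plain Python strings at parse time; the function name `normalize_title` is hypothetical and not taken from the actual parser code.

```python
def normalize_title(title: str) -> str:
    """Return the title with only its first character uppercased.

    Hypothetical helper: str.capitalize() is deliberately not used,
    because it would also lowercase the rest of the title
    (e.g. "iPhone" would become "Iphone").
    """
    # Slicing keeps this safe for empty strings as well.
    return title[:1].upper() + title[1:]


if __name__ == "__main__":
    for raw in ("iPhone", "ärzte", "Berlin", ""):
        print(repr(raw), "->", repr(normalize_title(raw)))
```

Because the Famhash is presumably derived from the article titles, normalized titles would yield new RedFams with new hashes, which is why the old entries are deleted in the last step.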
----
Imported from https://fs.golderweb.de/task/157 via GOLDERWEB FS->GITEA TICKETIMPORTER
Originally opened: Sat Sep 2 21:42:57 2017
<code>SELECT *
FROM `dewikipedia_redfams`
WHERE CAST( `article0` as BINARY ) REGEXP '^[a-z]' OR
CAST( `article1` as BINARY ) REGEXP '^[a-z]' OR
CAST( `article2` as BINARY ) REGEXP '^[a-z]' OR
CAST( `article3` as BINARY ) REGEXP '^[a-z]' OR
CAST( `article4` as BINARY ) REGEXP '^[a-z]' OR
CAST( `article5` as BINARY ) REGEXP '^[a-z]' OR
CAST( `article6` as BINARY ) REGEXP '^[a-z]' OR
CAST( `article7` as BINARY ) REGEXP '^[a-z]'</code>
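
Instead of writing the condition out by hand for each column, the query above could also be generated. This is a minimal sketch, assuming the columns are named `article0` through `article7` as in the query; the function name `build_lowercase_query` and its parameters are hypothetical.

```python
def build_lowercase_query(table: str = "dewikipedia_redfams", columns: int = 8) -> str:
    """Build the SELECT that finds rows where any articleX starts lowercase.

    Hypothetical helper; table and column names follow the query above.
    """
    # One binary (case-sensitive) REGEXP condition per articleX column.
    conditions = [
        f"CAST( `article{i}` as BINARY ) REGEXP '^[a-z]'" for i in range(columns)
    ]
    return f"SELECT *\nFROM `{table}`\nWHERE " + " OR\n".join(conditions)


if __name__ == "__main__":
    print(build_lowercase_query())
```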
----
Imported from https://fs.golderweb.de/task/157#comment35 via GOLDERWEB FS->GITEA TICKETIMPORTER
Originally added: Sat Sep 23 17:25:09 2017
Closed as https://fs.golderweb.de/task/157#taskclosed via GOLDERWEB FS->GITEA TICKETIMPORTER
Originally closed: Sat Sep 23 17:25:26 2017