Page MenuHomePhabricator

MIMEsearchPage::reallyDoQuery failing on the logs due to taking too long to query
Closed, ResolvedPublic

Description

Setting it as low because I think this is a very specific query which very specific parameters that only few users executed. It is failing on commons and has been observed just twice failing over 2344 times.

{
  "_index": "logstash-2015.07.29",
  "_type": "mediawiki",
  "_id": "AU7Y3YNDQ7NQKcKbniwk",
  "_score": null,
  "_source": {
    "message": "MIMEsearchPage::reallyDoQuery\t10.64.16.31\t2013\tLost connection to MySQL server during query (10.64.16.31)\tSELECT  6 AS namespace,img_name AS title,img_name AS value,img_size,img_width,img_height,img_user_text,img_timestamp  FROM `image`   WHERE img_major_mime = 'image' AND img_media_type IN ('BITMAP','DRAWING','AUDIO','VIDEO','MULTIMEDIA','UNKNOWN','OFFICE','TEXT','EXECUTABLE','ARCHIVE')  AND img_minor_mime = 'jpeg'  LIMIT 3180500,501  ",
    "@version": 1,
    "@timestamp": "2015-07-29T08:11:12.447Z",
    "type": "mediawiki",
    "host": "mw1252",
    "level": "ERROR",
    "tags": [
      "syslog",
      "es",
      "es",
      "normalized_message_trimmed"
    ],
    "channel": "wfLogDBError",
    "url": "/w/index.php?title=Special:MIMESearch/image/jpeg&limit=500&offset=3180500",
    "ip": "10.64.0.104",
    "http_method": "GET",
    "server": "commons.wikimedia.org",
    "referrer": null,
    "uid": "80bb6f5",
    "process_id": 13345,
    "wiki": "commonswiki",
    "db_server": "10.64.16.31",
    "db_name": "commonswiki",
    "db_user": "wikiuser",
    "method": "DatabaseBase::reportQueryError",
    "errno": 2013,
    "error": "Lost connection to MySQL server during query (10.64.16.31)",
    "sql1line": "SELECT  6 AS namespace,img_name AS title,img_name AS value,img_size,img_width,img_height,img_user_text,img_timestamp  FROM `image`   WHERE img_major_mime = 'image' AND img_media_type IN ('BITMAP','DRAWING','AUDIO','VIDEO','MULTIMEDIA','UNKNOWN','OFFICE','TEXT','EXECUTABLE','ARCHIVE')  AND img_minor_mime = 'jpeg'  LIMIT 3180500,501  ",
    "fname": "MIMEsearchPage::reallyDoQuery",
    "normalized_message": "MIMEsearchPage::reallyDoQuery\t10.64.16.31\t2013\tLost connection to MySQL server during query (10.64.16.31)\tSELECT  6 AS namespace,img_name AS title,img_name AS value,img_size,img_width,img_height,img_user_text,img_timestamp  FROM `image`   WHERE img_major_"
  },
  "sort": [
    1438157472447
  ]
}

Event Timeline

jcrespo raised the priority of this task from to Low.
jcrespo updated the task description. (Show Details)
jcrespo added subscribers: jcrespo, Springle.

This basically does a full table scan. The query should either banned or paged by an index:

MariaDB PRODUCTION s4 localhost commonswiki > EXPLAIN SELECT  6 AS namespace,img_name AS title,img_name AS value,img_size,img_width,img_height,img_user_text,img_timestamp  
FROM `image`   
WHERE img_major_mime = 'image' AND 
img_media_type IN ('BITMAP','DRAWING','AUDIO','VIDEO','MULTIMEDIA','UNKNOWN','OFFICE','TEXT','EXECUTABLE','ARCHIVE')  AND 
img_minor_mime = 'jpeg'  
LIMIT 3180500,501\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: image
         type: range
possible_keys: img_media_mime
          key: img_media_mime
      key_len: 37
          ref: NULL
         rows: 19843763
        Extra: Using index condition; Rowid-ordered scan
1 row in set (0.00 sec)

On a trial, this takes 1 min 34.25 to execute.

From varnish:

{"hostname":"cp1068.eqiad.wmnet","sequence":14321379515,"dt":"2015-07-29T16:24:30","time_firstbyte":60.163419247,
"ip":"10.64.0.105","cache_status":"miss","http_status":"500","response_size":14602,"http_method":"GET",
"uri_host":"commons.wikimedia.org","uri_path":"/w/index.php","uri_query":"?title=Special:MIMESearch/image/jpeg&limit=500&offset=8443500",
"content_type":"text/html; charset=UTF-8","referer":"-","x_forwarded_for":"[ERASED]","user_agent":"-","accept_language":"-",
"x_analytics":"https=1","range":"-","x_cache":"cp1055 miss (0), cp1068 frontend miss (0)"}

x-forwarded-for (wmf-restricted): https://phabricator.wikimedia.org/P1161

Both several ops and devs have agreed on IRC this is a very low priority task, and that the main issue would be to make them fail faster so they do not take many resources AND the noise on the logs.

Searching on https://logstash.wikimedia.org/#/dashboard/elasticsearch/hhvm shows slow timer alerts for MIMEsearchPage::reallyDoQuery starting to trend around 2015-07-29T05:45 and continuing steadily after that at a rate of ~1200 per hour (20 req/minute).

Continuing on db1042 right now and queries regularly hitting 5min limit. The LIMIT 10405000,501 is most of the problem here; shouldn't be possible to generate such a search.

Also, all bot driven, eg: tide543.microsoft.com

Quick plug: make this particular special page noindex, nofollow?

Why is a bot indexing /w/index.php?title=Special:MIMESearch if /w/ is disabled on robots.txt ? Why not having a proper user agent and referrer? I do not thing think this is a "legit" crawler.

Krenair raised the priority of this task from Low to High.Jul 30 2015, 8:45 PM
Krenair subscribed.

This is the majority of the warnings in fluorine:/a/mw-log/hhvm.log at the moment

jcrespo lowered the priority of this task from High to Medium.Jul 30 2015, 8:46 PM

Proxies do not read an url each second monotonically, nor have agents like ".net Framework test", that I think I spotted. It is getting annoying on the logs.

jcrespo raised the priority of this task from Medium to High.Jul 30 2015, 8:46 PM

Change 228159 had a related patch set uploaded (by Ori.livneh):
Make Special:MIMESearch expensive again

https://gerrit.wikimedia.org/r/228159

Change 228165 had a related patch set uploaded (by Ori.livneh):
Make Special:MIMESearch expensive again

https://gerrit.wikimedia.org/r/228165

ori claimed this task.

This page is now marked as expensive, so it is disabled on large wikis.

This page is now marked as expensive, so it is disabled on large wikis.

I believe this statement is inaccurate: Special:MIMESearch is now disabled on all—not just large—wikis. For example, I checked test.wikipedia.org and it's disabled there. It would be easy to think that $wgMiserMode is disabled on small wikis, but T48098 remains unresolved. :-(

In general, Special:MIMESearch is kind of a goofy feature. I think the intention was to have a means of filtering Special:ListFiles (or Special:Search...) by file extension/type. Currently MediaWiki/Wikimedia's multimedia search is pretty terrible.

This page is now marked as expensive, so it is disabled on large wikis.

So if we just made it so that users weren't allowed to set an offset higher than 5000, would it be enable-able again?

This doesn't seem Mimesearch specific. All non-cached non-expensive subclasses of QueryPage probably have the same problem.

Change 250877 had a related patch set uploaded (by Brian Wolff):
Make query pages not accept offset > 10000 in miser mode

https://gerrit.wikimedia.org/r/250877

Change 250877 merged by jenkins-bot:
Make query pages not accept offset > 10000 in miser mode

https://gerrit.wikimedia.org/r/250877

Change 313238 had a related patch set uploaded (by Bartosz Dziewoński):
Make Special:MIMESearch non-expensive, again

https://gerrit.wikimedia.org/r/313238

Change 313238 merged by jenkins-bot:
Make Special:MIMESearch non-expensive, again

https://gerrit.wikimedia.org/r/313238