InstantCommons broken by switch to HTTPS
Closed, Resolved, Public

Description

InstantCommons has been broken by the switch to HTTPS-only mode - see e.g. beta enwiki. Fixed in master, needs backport to supported branches. HTTPS redirect has been temporarily disabled on Commons to give users time to upgrade.

Patch for people running older versions of MediaWiki: 8517b3

Alternatively, you can put this code snippet in your LocalSettings.php:

$wgUseInstantCommons = false;
$wgForeignFileRepos[] = array(
	'class' => 'ForeignAPIRepo',
	'name' => 'wikimediacommons',
	'apibase' => 'https://commons.wikimedia.org/w/api.php',
	'hashLevels' => 2,
	'fetchDescription' => true,
	'descriptionCacheExpiry' => 43200,
	'apiThumbCacheExpiry' => 86400,
);

If that does not help, the root certificate bundle of your server might be missing the certificate authority used by Wikimedia (GlobalSign), in which case it is probably badly outdated and you should update it. If you have shell access, you can check with this command (look at the "Server certificate" block):

curl -vso /dev/null 'https://upload.wikimedia.org/wikipedia/commons/7/70/Example.png' && echo success || echo failed
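
If the check above fails, repeating it against an explicit CA bundle can help separate "the bundle is missing or outdated" from other network problems. A minimal sketch, assuming a Debian-style bundle path; the check_https_with_bundle helper name is made up for illustration, and the path should be adjusted to your distribution:

```shell
#!/bin/sh
# Hypothetical helper: fetch a URL while trusting only the given CA bundle.
# Returns 2 if the bundle file is missing or empty, otherwise curl's exit status.
check_https_with_bundle() {
    url=$1
    bundle=$2
    if [ ! -s "$bundle" ]; then
        echo "CA bundle not found or empty: $bundle" >&2
        return 2
    fi
    # --cacert makes curl verify the server certificate against this file.
    curl --silent --show-error --output /dev/null --cacert "$bundle" "$url"
}

# Example (assumed Debian/Ubuntu bundle location):
# check_https_with_bundle \
#     'https://upload.wikimedia.org/wikipedia/commons/7/70/Example.png' \
#     /etc/ssl/certs/ca-certificates.crt && echo success || echo failed
```

If this succeeds while the plain command fails, the default bundle curl picks up is the likely culprit.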

Event Timeline

I have now installed php5-curl, but InstantCommons still isn't working properly. What next?

Most likely, follow the instructions at https://snippets.webaware.com.au/howto/stop-turning-off-curlopt_ssl_verifypeer-and-fix-your-php-config/ to set curl.cainfo in php.ini

I downloaded 2 php.ini files via FileZilla from directories /etc/php5/apache2/ and /etc/php5/cli/.

I changed ;curl.cainfo = to curl.cainfo = "/etc/ssl/certs/ca-certificates.crt" in both files, uploaded them to the server, and replaced the original files with the new ones.

Then I restarted the server with sudo apache2ctl graceful and sudo service apache2 restart.
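
A quick way to confirm that such an edit took effect is to look for an uncommented curl.cainfo line in each php.ini. A minimal sketch; the active_cainfo helper name is made up for illustration, and the two paths are the ones mentioned above:

```shell
#!/bin/sh
# Hypothetical helper: print the active (uncommented) curl.cainfo value
# from a php.ini file. Returns non-zero if the directive is absent, empty,
# or still commented out with a leading ';'.
active_cainfo() {
    ini=$1
    # Strip trailing whitespace, then keep only lines that start with
    # optional whitespace followed by curl.cainfo (so ';curl.cainfo =' is ignored).
    value=$(sed -n \
        -e 's/[[:space:]]*$//' \
        -e 's/^[[:space:]]*curl\.cainfo[[:space:]]*=[[:space:]]*//p' \
        "$ini" | tail -n 1)
    # Remove surrounding double quotes, if any.
    value=${value#\"}
    value=${value%\"}
    [ -n "$value" ] && printf '%s\n' "$value"
}

# Example:
# active_cainfo /etc/php5/apache2/php.ini || echo "curl.cainfo is not set"
# active_cainfo /etc/php5/cli/php.ini     || echo "curl.cainfo is not set"
```

The web server and the command line read different php.ini files here, so both are worth checking.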

Still nothing ... Any further recommendations??

Make sure you have logging enabled then grep for ForeignAPIRepo: ERROR on GET:.
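
As a sketch of that search, assuming the debug log is a plain text file whose location depends on your $wgDebugLogFile setting (the foreign_repo_errors helper name and the example path are illustrative only):

```shell
#!/bin/sh
# Hypothetical helper: list ForeignAPIRepo fetch errors recorded in a
# MediaWiki debug log, with their line numbers.
foreign_repo_errors() {
    log=$1
    # -F treats the pattern as a fixed string rather than a regular expression.
    grep -n -F 'ForeignAPIRepo: ERROR on GET' "$log"
}

# Example (assumed log location):
# foreign_repo_errors /var/log/mediawiki/debug-my_wiki.log
```

No output means either no such error was logged or the log is not being written at all, so it is worth checking that the file is growing first.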

So, where are we at on removing the redirection workarounds here? I'd still like to get these removed ASAP. Have we released new software with https:// URLs? Does MW's http-fetching stuff in general always allow protocol-only redirects (this may affect internal cases of fetches from meta, too).

Have we released new software with https:// URLs?

No. The InstantCommons patch still needs to be merged to 1.24 and 1.23, and all supported versions (1.23, 1.24, 1.25) need a new release. And presumably some grace period after that.

The blocking tasks also need some work.

Does MW's http-fetching stuff in general always allow protocol-only redirects (this may affect internal cases of fetches from meta, too).

It does not allow redirects at all, unless explicitly configured. If redirects are enabled, there is no way to restrict them in any way.
See also T103043 (not sure if it should be a blocker).

So, where are we at on removing the redirection workarounds here?

Patches have been merged to all supported branches.

I'd still like to get these removed ASAP. Have we released new software with https:// URLs?

No, no release yet. 1.25.2 (T93267) is supposed to be a security release but it's not quite ready yet. If this is super urgent, I suppose we could do a release and just push back the security one to 1.25.3...

Does MW's http-fetching stuff in general always allow protocol-only redirects (this may affect internal cases of fetches from meta, too).

We don't follow redirects by default in our http-fetching stuff. It'd be up to an explicit codepath to turn that on. When it is on, it doesn't make any sort of assumptions about protocols, etc. We could probably improve things there...

Just to clear up confusion with @Tgr's comment as well: are we planning to release for 1.23 and 1.24 as well, or just 1.25?

This whole thing doesn't really fit my definition of "Super Urgent", but on the other hand it's now been about a month, I was expecting more like 2-3 weeks before pulling the exception out of Varnish, and there's no real end in sight yet. If the plan is to take another month or two, then yeah, we need to do something about fixing this sooner. As far as I'm concerned, the cause of this is our own broken software. We've been harder on external breakage than we're being with ourselves here...

Does MW's http-fetching stuff in general always allow protocol-only redirects (this may affect internal cases of fetches from meta, too).

We don't follow redirects by default in our http-fetching stuff. It'd be up to an explicit codepath to turn that on. When it is on, it doesn't make any sort of assumptions about protocols, etc. We could probably improve things there...

Perhaps this part needs to be a separate task, but my feeling is that we'll continue to see breakage internally and externally so long as we don't fix the general case here. Anything that acts as HTTP[S]-fetching code in MediaWiki should always follow a protocol redirect (as in, nothing about the URL changes except the protocol switch from http to https), regardless of any sort of security-focused "don't follow redirects" flag.
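
To make the "protocol redirect" definition above concrete, here is a minimal sketch of such a check. It is an illustration only, not MediaWiki's code: the function name is hypothetical, and the comparison is a plain string match that ignores case differences in the scheme or host.

```shell
#!/bin/sh
# Hypothetical check: succeed only if the redirect target ($2) is the
# original URL ($1) with nothing changed except the scheme, http -> https.
is_protocol_only_redirect() {
    from=$1
    to=$2
    case $from in
        http://*) ;;        # only plain-http origins can be upgraded
        *) return 1 ;;
    esac
    # Rebuild the expected target by swapping the scheme and compare.
    [ "$to" = "https://${from#http://}" ]
}

# Example:
# is_protocol_only_redirect \
#     'http://commons.wikimedia.org/w/api.php' \
#     'https://commons.wikimedia.org/w/api.php' && echo "safe to follow"
```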

Just to clear up confusion with @Tgr's comment as well: are we planning to release for 1.23 and 1.24 as well, or just 1.25?

I figured all 3 since patches were made for all of them...but I don't really care tbh

This whole thing doesn't really fit my definition of "Super Urgent", but on the other hand it's now been about a month, I was expecting more like 2-3 weeks before pulling the exception out of Varnish, and there's no real end in sight yet.

Sorry if we set a bad expectation here...releases never happen on time :)

If the plan is to take another month or two, then yeah, we need to do something about fixing this sooner. As far as I'm concerned, the cause of this is our own broken software. We've been harder on external breakage than we're being with ourselves here...

To be honest, even if I released the software tomorrow, we'll still see a long tail of people upgrading. You're still looking at months...

If we don't really care, then why not just remove the exceptions today? InstantCommons is an optional feature, off by default, and the broken behavior can be configured around.

Perhaps this part needs to be a separate task, but my feeling is that we'll continue to see breakage internally and externally so long as we don't fix the general case here. Anything that acts as HTTP[S]-fetching code in MediaWiki should always follow a protocol redirect (as in, nothing about the URL changes except the protocol switch from http to https), regardless of any sort of security-focused "don't follow redirects" flag.

Yeah. The whole behavior around redirection here is wonky. We should definitely fix it up. Filed T105765 for it.

Change 224557 had a related patch set uploaded (by BBlack):
HTTPS redirects: remove InstantCommons exception

https://gerrit.wikimedia.org/r/224557

Change prepped so it's easy. I'm open to debate on timing (I tend to think we should at least have a software release available).

I don't see anything needing change on any of those.

Probably because Bawolff fixed them three weeks ago.

Still nothing ... Any further recommendations??

Make sure you have logging enabled then grep for ForeignAPIRepo: ERROR on GET:.

Did Tau find the solution to their problem?

I created a mediawiki folder in /var/log/ (rwx permissions for both user and group), then added the line $wgDebugLogFile = "/var/log/mediawiki/debug-{$wgDBname}.log"; to LocalSettings.php. I made a few edits on a wiki page, but the debug file was not created. I then manually created a text file debug-my_wiki.log (rwx permissions for both user and group) in /var/log/mediawiki/, made more edits and restarted the server, but I still can't get the debug info saved into that file. What am I doing wrong?

Manual:$wgDebugLogFile recommends checking open_basedir. Also, are you sure the user MediaWiki runs under (probably www-data) is in the right group to write the file?

In both php.ini files the line reads ";open_basedir =" (commented out, no value). Is that okay? Is this setting configured anywhere else besides these two php.ini files?

I typed "groups www-data" on the command line and got "www-data : www-data". Is that okay?

In both php.ini files the line reads ";open_basedir =" (commented out, no value). Is that okay? Is this setting configured anywhere else besides these two php.ini files?

If you are running your own webserver (not some cheap shared host) and you didn't set it explicitly, it's not enabled.

I typed "groups www-data" on the command line and got "www-data : www-data". Is that okay?

That means for a file to be writable, one of these must be true:

  • the owner of the file is www-data
  • the group of the file is www-data
  • the file is world-writable (ie. rw-rw-rw or something like that)
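
To see which of these conditions holds, it helps to print the mode, owner and group of both the log file and its directory. A minimal sketch; the show_owner_and_mode helper name is made up for illustration, and the paths are the ones from this thread:

```shell
#!/bin/sh
# Hypothetical helper: print "mode owner group" for a file or directory.
show_owner_and_mode() {
    # ls -ld prints e.g. "-rw-rw-r-- 1 www-data www-data ..."; fields 1, 3
    # and 4 are the mode string, the owner and the group.
    ls -ld -- "$1" | awk '{ print $1, $3, $4 }'
}

# Example:
# show_owner_and_mode /var/log/mediawiki
# show_owner_and_mode /var/log/mediawiki/debug-my_wiki.log
```

If sudo is available, sudo -u www-data test -w /var/log/mediawiki/debug-my_wiki.log && echo writable tests the same thing directly as the web server user. Note that creating a new log file also needs write permission on the directory itself, not just on an existing file.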

Should open_basedir be enabled then? Which directory should I set it to?

If I change the owner of the /var/log/mediawiki folder to www-data, will it have permission to write this file?

Or what exactly should I do?

If it is not a large production website, I would just do chmod -R a+rw /var/log/mediawiki.

Finally I managed to get the logging enabled.

  1. I did chmod -R a+rw /var/log/mediawiki - still nothing
  2. Created manually log file /var/log/mediawiki/debug-mywiki.log - still nothing
  3. sudo chown www-data:www-data /var/log/mediawiki - manually created log file disappeared
  4. sudo chown www-data:www-data /var/log/mediawiki/debug-mywiki.log - still the same situation
  5. sudo chown admin:admin /var/log/mediawiki - the manually created log file appeared again, but with a size of 50 982 bytes instead of 0 bytes, so the log was finally being written.

But the phrase ForeignAPIRepo does not appear in it. How do I get it logged?

Have you tried loading and purging some pages with remote images, to force thumbnail creation?

I have tried purging but with no success. Can turning ImageMagick on/off affect this issue? I will try some maintenance scripts next week.

I tried several maintenance scripts (purgeList, checkImages, rebuildImages etc.) but none of them helped. It's getting quite annoying already...

You could try to cherry-pick https://gerrit.wikimedia.org/r/#/c/223518/ and set $wgDebugLogGroups['http'] = <some custom log file>.

  1. Do I run it from the command line? git fetch https://gerrit.wikimedia.org/r/mediawiki/core refs/changes/18/223518/4 && git cherry-pick FETCH_HEAD
  2. In which directory should I run it?
  3. Does $wgDebugLogGroups['http'] = <some custom log file> go into LocalSettings.php?

The easiest way is probably to run curl -s 'https://git.wikimedia.org/patch/mediawiki%2Fcore/64446397925576210c50baedc77becb470df84e2' | patch -p1 in the MediaWiki main directory (what you wrote would work too if you installed MediaWiki via git). $wgDebugLogGroups should go into LocalSettings.php.

Have some problems with running this patch...

It sounds like you already had that commit locally. How did you install MediaWiki, tarballs or a git checkout?

Apparently git.wikimedia.org patch pages are HTML, not plaintext. How fun.

So here's a command that works:

curl -s 'https://github.com/wikimedia/mediawiki/commit/64446397925576210c50baedc77becb470df84e2.patch' | patch -p1

You might have to restore the files includes/HttpFunctions.php and includes/filerepo/ForeignAPIRepo.php to their original version first.

What is the easiest way to restore the files includes/HttpFunctions.php and includes/filerepo/ForeignAPIRepo.php to their original version if I don't have a backup of them?

I saved HttpFunctions.php and ForeignAPIRepo.php as text files to my computer, then copied content from these files and replaced content in server files.

Still have trouble with this patch:

/var/www/html/mediawiki$ curl -s 'https://github.com/wikimedia/mediawiki/commit/64446397925576210c50baedc77becb470df84e2.patch' | patch -p1
patching file includes/HttpFunctions.php
Hunk #2 FAILED at 75.
1 out of 2 hunks FAILED -- saving rejects to file includes/HttpFunctions.php.rej
patching file includes/filerepo/ForeignAPIRepo.php
Hunk #2 FAILED at 523.
1 out of 2 hunks FAILED -- saving rejects to file includes/filerepo/ForeignAPIRepo.php.rej
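
One way to avoid half-applied patches like the one above is to do a dry run first and only apply the patch when every hunk fits. A minimal sketch, assuming GNU patch (which provides --dry-run); the apply_patch_if_clean helper name is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: apply a patch file (with -p1, from the current
# directory) only if all of its hunks would apply cleanly.
apply_patch_if_clean() {
    patchfile=$1
    # --dry-run checks every hunk without modifying any file (GNU patch).
    if patch -p1 --dry-run < "$patchfile" >/dev/null; then
        patch -p1 < "$patchfile"
    else
        echo "patch does not apply cleanly; files left untouched" >&2
        return 1
    fi
}

# Example, run from the MediaWiki main directory:
# curl -s 'https://github.com/wikimedia/mediawiki/commit/64446397925576210c50baedc77becb470df84e2.patch' > /tmp/fix.patch
# apply_patch_if_clean /tmp/fix.patch
```

Saving the patch to a file first also makes it easy to inspect which hunks are expected to change which lines.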

I'll just upload the correct files then:

So, I see new releases a week ago for 1.2[345] containing the InstantCommons fix. Also, it's been about a month since the last time I complained about this issue. I'm leaning towards merging https://gerrit.wikimedia.org/r/#/c/224557/ sometime this week, unless anyone else has objections.

I'll just upload the correct files then:

I replaced old files with these files but still the log file is blank (tried changing user permissions and owners etc.).

Can this issue be fixed if I upgrade from 1.23.3 to 1.23.10?

Possibly. You might be using an old branch where certain logging changes were not backported yet.

@BBlack: there are two blocking tasks with unmerged patches, and they fix pretty serious issues with fopen. Do we know how many people use fopen vs. curl?

@Tgr Do we have any immediate plans to fix those anyways, or a sane plan to fix them that would apply to the bulk of users?

Answering my own question: there's mw/core patches attached to both, last activity about a month ago, with comments indicating that they seem to test well. Is there some unknown blocking them?

Nothing apart from lack of reviewers, I think. I can review them in the next couple days, but that only makes sense if you are willing to wait for the next tarball so they can go out. It looks like fopen would be broken badly without these patches, but I have no idea if that is a big deal or not. Are there environments where curl is not available? (Shared hosting?) Do we maybe log user agents so we can tell whether PHP fopen is used frequently?

To answer myself, we log them in api-feature-usage but I have no idea how to tell curl vs. fopen from that. Curl uses MediaWiki/<version> and fopen uses whatever is configured in php.ini.
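
Given a plain list of user agents, one per line (an assumed export format, not something api-feature-usage is known to provide as-is), the MediaWiki/<version> prefix could be counted roughly like this. The helper name is hypothetical, and the result is only an estimate, since a php.ini user agent could be set to the same string:

```shell
#!/bin/sh
# Hypothetical helper: count user agents on stdin (one per line) that
# start with "MediaWiki/", i.e. that look like the curl-based client.
count_mediawiki_agents() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps the function's status at success.
    grep -c '^MediaWiki/' || true
}

# Example (user-agents.txt is an assumed file name):
# count_mediawiki_agents < user-agents.txt
```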

Are there environments where curl is not available? (Shared hosting?)

Yes. (According to users.)

Pinged both of those tasks. I'm pretty much out of patience with waiting for PHP to suddenly become a less-horrible platform for making requests over the Internet.

Both tasks have patches written by @Bawolff and tested by me, waiting for reviewers and +2.

@BBlack: the good news is PHP 5.6 correctly speaks HTTPS without curl (SAN and CA repo), the nightmare is stopping :-/

So, now we're pending on merge of those 3 and a new sec release of those versions?

Bryan just merged them all, so just a new release (this probably does not count as a security issue, as PhpHttpRequest wasn't insecure before, it just did not work).

I have added the point to the Release-Engineering-Team weekly meeting on 2015-09-17.

So, now we're pending on merge of those 3 and a new sec release of those versions?

Yerp. @csteipp and I are talking about getting a release out soon.

^ Was the software released? We still haven't removed the exception itself...

Yes, a couple hours ago. We should write to mediawiki-announce, wait a week or so as a courtesy, and then drop Commons/uploads HTTP support.

Can someone take on composing and sending this email? I'm probably not the best person for it.

Question: wouldn't it be possible to ship the certificate as a parameter to $wgForeignXXXRepos and not depend on whatever the installed curl default is? It will be a nightmare to track this down on every possible OS and distribution...

The problem with that is that you cannot define a certificate bundle in any "additive" way, and we don't want to override the server default which might be restrictive (or non-restrictive) for a good reason. Providing it as a patch might be a good idea though (if you have one, feel free to add it to the task description).

Right, might be difficult due to the way OpenSSL usually wants to have certificates. Will try to figure out!

It really shouldn't be hard for an OS/distribution/platform/language/whatever to have working TLS with a standard CA bundle these days. This almost feels like trying to ship a bundled TCP/IP stack and set of ethernet hardware drivers with an application in the days of Trumpet Winsock :P I haven't followed the low-level details too hard, but I think most of those with problems are on older PHP installs and/or Windows?

This almost feels like trying to ship a bundled TCP/IP stack and set of ethernet hardware drivers with an application in the days of Trumpet Winsock :P

Why not? Remember, we also provide wiki infrastructure to villages without electricity.

I haven't followed the low-level details too hard, but I think most of those with problems are on older PHP installs and/or Windows?

PHP without the cURL module (and not a very recent version - I wouldn't call 5.5 old) and/or Windows.

php-curl is pretty basic functionality nowadays, I doubt there are many shared hosts not supporting it. And for installations where you have root, adding php-curl is very easy. I don't think we need to do anything further in MediaWiki proper, although a manual workaround (patch, configuration instructions or whatever) would be nice. There is some information at InstantCommons#HTTPS already, I'll add a link to this bug.

It's been over a week since the email, which ended up going out a bit later after the releases than expected anyways. Merging the removal of the exception today!

Change 224557 merged by BBlack:
HTTPS redirects: remove InstantCommons exception

https://gerrit.wikimedia.org/r/224557

BBlack claimed this task.

Hello!

I asked for help in this task about a year ago because InstantCommons stopped working on my wiki after the switch to HTTPS. I was unable to find the cause or solve the issue, and there wasn't much time afterwards to dig into it further. About a week ago I found out that the new LTS release 1.27 had come out, and there was information that was a big relief for me:

InstantCommons made easier and cheaper
InstantCommons will now truly work out of the box, as long as your users can connect to upload.wikimedia.org: thumbnails will be served from that domain instead of requiring local generation (gerrit:251556).

So I thought my problem would finally be solved if I just upgraded from 1.23.13 to 1.27.0. I did, but nothing changed: thumbnails of Commons images are not generated, and only red links leading to the upload page are shown.

So again I am asking the same question: do you have any idea what's wrong?

I have MediaWiki 1.27.0 with curl installed.

For example, I have a file c.jpg on my server. It is shown properly in my wiki when ImageMagick is off. When I turn ImageMagick on, the thumbnail in my wiki is no longer the thumbnail of my server's file. Instead it shows the thumbnail of the Commons file, and when I click it, it leads to my wiki's original file, not to the Commons file. That's the only way I get anything from Commons to appear in my wiki.

Also, when I ran the maintenance script update.php after upgrading, I got the following error:

Failed to set the local repo temp zone container to be private.
Purging caches...done.

Maybe I have made a very simple mistake, for example in LocalSettings.php, or have some wrong permissions, and the solution is easier than expected? Anyway, I would be very glad if someone could help me resolve this issue. I have tried several things but with no success.

First steps when debugging InstantCommons issues are usually to check the http log channel and the value of $wgLocalFileRepo and $wgForeignFileRepos.

How do I check the http log channel?

In /mediawiki/LocalSettings.php I have the following lines:

## To enable image uploads, make sure the 'images' directory
## is writable, then set this to true:
$wgEnableUploads = true;
# $wgUseImageMagick = true;
# $wgImageMagickConvertCommand = "/usr/bin/convert";

# InstantCommons allows wiki to use images from http://commons.wikimedia.org
$wgUseInstantCommons = true;

## If you use ImageMagick (or any other shell command) on a
## Linux server, this will need to be set to the name of an
## available UTF-8 locale
$wgShellLocale = "en_US.utf8";

In /mediawiki/includes/Setup.php I have the following lines:

/**
 * Initialise $wgLocalFileRepo from backwards-compatible settings
 */
if ( !$wgLocalFileRepo ) {
	$wgLocalFileRepo = [
		'class' => 'LocalRepo',
		'name' => 'local',
		'directory' => $wgUploadDirectory,
		'scriptDirUrl' => $wgScriptPath,
		'scriptExtension' => '.php',
		'url' => $wgUploadBaseUrl ? $wgUploadBaseUrl . $wgUploadPath : $wgUploadPath,
		'hashLevels' => $wgHashedUploadDirectory ? 2 : 0,
		'thumbScriptUrl' => $wgThumbnailScriptPath,
		'transformVia404' => !$wgGenerateThumbnailOnParse,
		'deletedDir' => $wgDeletedDirectory,
		'deletedHashLevels' => $wgHashedUploadDirectory ? 3 : 0
	];
}
/**
 * Initialise shared repo from backwards-compatible settings
 */
if ( $wgUseSharedUploads ) {
	if ( $wgSharedUploadDBname ) {
		$wgForeignFileRepos[] = [
			'class' => 'ForeignDBRepo',
			'name' => 'shared',
			'directory' => $wgSharedUploadDirectory,
			'url' => $wgSharedUploadPath,
			'hashLevels' => $wgHashedSharedUploadDirectory ? 2 : 0,
			'thumbScriptUrl' => $wgSharedThumbnailScriptPath,
			'transformVia404' => !$wgGenerateThumbnailOnParse,
			'dbType' => $wgDBtype,
			'dbServer' => $wgDBserver,
			'dbUser' => $wgDBuser,
			'dbPassword' => $wgDBpassword,
			'dbName' => $wgSharedUploadDBname,
			'dbFlags' => ( $wgDebugDumpSql ? DBO_DEBUG : 0 ) | DBO_DEFAULT,
			'tablePrefix' => $wgSharedUploadDBprefix,
			'hasSharedCache' => $wgCacheSharedUploads,
			'descBaseUrl' => $wgRepositoryBaseUrl,
			'fetchDescription' => $wgFetchCommonsDescriptions,
		];
	} else {
		$wgForeignFileRepos[] = [
			'class' => 'FileRepo',
			'name' => 'shared',
			'directory' => $wgSharedUploadDirectory,
			'url' => $wgSharedUploadPath,
			'hashLevels' => $wgHashedSharedUploadDirectory ? 2 : 0,
			'thumbScriptUrl' => $wgSharedThumbnailScriptPath,
			'transformVia404' => !$wgGenerateThumbnailOnParse,
			'descBaseUrl' => $wgRepositoryBaseUrl,
			'fetchDescription' => $wgFetchCommonsDescriptions,
		];
	}
}
if ( $wgUseInstantCommons ) {
	$wgForeignFileRepos[] = [
		'class' => 'ForeignAPIRepo',
		'name' => 'wikimediacommons',
		'apibase' => 'https://commons.wikimedia.org/w/api.php',
		'url' => 'https://upload.wikimedia.org/wikipedia/commons',
		'thumbUrl' => 'https://upload.wikimedia.org/wikipedia/commons/thumb',
		'hashLevels' => 2,
		'transformVia404' => true,
		'fetchDescription' => true,
		'descriptionCacheExpiry' => 43200,
		'apiThumbCacheExpiry' => 86400,
	];
}

Hi! Finally InstantCommons is working in my wiki again. I don't know exactly what caused the error, but after updating from PHP5 to PHP7 and following the instructions here it is working again. Thanks to all who helped me!