Create ability to add / remove tags from edits / actions
Closed, ResolvedPublic

Description

Currently there's no way to remove tags from edits or actions. If an AbuseFilter filter tags a legitimate edit as (spam) or (possible civility issue) or something similar when it really isn't, there needs to be a way to remove the tag from the edit or action, especially given that the tags are permanent.


See Also:
T30213: AbuseFilter should let users to mark log entries as false positives

Event Timeline

Tags generally should be permanent. They need to be more easily applied, though. Twinkle should be able to tag the edits in which it adds maintenance tags or csd/prod/xfd nominations; AWB should be able to record its name as a tag instead of as part of the edit summary; and Snuggle, my ACC welcoming script, OneClickArchiver, the PER responder script, etc. should be able to tag instead of relying on putting tags in edit summaries. So this needs a threefold ability to manipulate tags: a special page to allow trusted users (Rollbackers and up) to remove inappropriate or wrong tags, an API hook to use while posting to a page to add a tag, and a JavaScript mw.addTag()/mw.removeTag() to allow scripts like Lupin's antivandal script to add and remove tags as well.

Aklapper lowered the priority of this task from Medium to Low.Dec 26 2014, 8:23 PM

[Feel free to increase the priority value if you plan to work on this. Changing back to more realistic "Low" value.]

While tags should be generally permanent (they show up in the history of the page, and the User contributions), it should be possible to remove them when clearly false positives; and the ability to add them would help deal with certain Wikipedia issues (see here for a list of some).

Change 181958 had a related patch set uploaded (by Parent5446):
Allow sysops to delete change tags

https://gerrit.wikimedia.org/r/181958

So, what is the current situation here?

  • All defined tags [1] are defined by extensions. The other approach for defining tags (by adding a row to the valid_tag table) is not used, partly because it is not exposed anywhere in the UI or API.
  • It's possible to apply a tag without it being defined. This is mainly due to the silly way in which tags were implemented (see T20672) and I don't propose that we do anything about it here.
    • Notably, the OAuth extension doesn't bother to define the tags it uses. Impolite, if you ask me!
  • The only way for tags to be applied is by an extension or custom hook.
  • Tags cannot be removed from individual revisions once applied.
  • Once a tag is used by at least one revision (or log entry [2]), the wiki is stuck with it forever. It will stay on Special:Tags until the end of days (or until a shell user comes along and deletes it manually - T58237).

Here's what I propose to implement here:

  • Add a tags= parameter to API action=edit, allowing users to add specific change tags to the edits they make. This will be of particular value to bots and user scripts (e.g. Twinkle could stop adding its little ([[WP:TW|TW]]) "ad" to the edit summary, and use a tag instead). To prevent nonsense/test/spam tags from proliferating, only those tags listed in the valid_tag table would be allowed to be added this way. (3)
    • Question: Does this need to be restricted by a user right (addchangetags perhaps), or can we allow it for all users who can edit?
  • Implement a changetags user right, which would give the user access to additional UI on ?diff pages? ?old revisions? etc. to do the following:
    • Add a tag to a particular revision. (4)
      • Question: Should this be restricted to valid_tags only? Or can any tag be added after the fact?
    • Remove a tag from a particular revision. (4)
  • Implement a managechangetags user right, which would give the user access to additional UI on Special:Tags (as well as API modules) to do the following:
    • Create a new tag. This would add a row to the valid_tag table, allowing the tag to be applied manually by users and bots. (2)
    • Delete a tag. This would remove it from all revisions and log entries where it is in use, as well as removing it from the valid_tag table (if present). Limited to 5000 rows, due to DB performance concerns. (1)
    • Add an existing (extension-defined) tag to the valid_tag table, to allow it to be manually applied by users. (2)
    • Remove an extension-defined tag from the valid_tag table, if present, to prevent it from being manually applied. (2)
    • Merge two tags into one. Yet to determine how this would work, but it would be nice to have. (mysterious future)

The numbers in parentheses indicate the order in which I propose to do the work.

Please comment with suggestions, criticisms or answers to the questions.


[1] The "active" column on Special:Tags tells whether a tag is defined.
[2] AFAIK no extensions currently apply tags to log entries, although such functionality is definitely possible.
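
As a rough illustration of the first proposal item, here is a minimal sketch of how a bot or user script might build an action=edit request that carries change tags. It assumes the proposed tags= parameter takes pipe-separated values like other multi-value API parameters; `VALID_TAGS` and `build_edit_request` are hypothetical names standing in for the valid_tag table and a client helper, and a real request would also need an edit token.

```python
from urllib.parse import urlencode

# Hypothetical stand-in for the rows of the valid_tag table.
VALID_TAGS = {"twinkle", "awb"}


def build_edit_request(title, text, summary, tags, valid_tags=VALID_TAGS):
    """Return URL-encoded POST data for an action=edit call carrying change tags.

    Mirrors the proposed rule that only tags listed in valid_tag may be
    applied this way; anything else is rejected before the request is built.
    """
    unknown = sorted(set(tags) - set(valid_tags))
    if unknown:
        raise ValueError("tags not in valid_tag: " + ", ".join(unknown))
    params = {
        "action": "edit",
        "format": "json",
        "title": title,
        "text": text,
        "summary": summary,  # no "([[WP:TW|TW]])" ad needed any more
        "tags": "|".join(tags),  # assumed pipe-separated, like other multi-value parameters
    }
    return urlencode(params)
```

With this, a tool such as Twinkle could call `build_edit_request("Sandbox", "new text", "cleanup", ["twinkle"])` and leave its name out of the summary.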

  • Add a tags= parameter to API action=edit, allowing users to add specific change tags to the edits they make. This will be of particular value to bots and user scripts (e.g. Twinkle could stop adding its little ([[WP:TW|TW]]) "ad" to the edit summary, and use a tag instead). To prevent nonsense/test/spam tags from proliferating, only those tags listed in the valid_tag table would be allowed to be added this way. (3)
    • Question: Does this need to be restricted by a user right (addchangetags perhaps), or can we allow it for all users who can edit?

It will be a lot easier for tools (Twinkle, AWB, WPCleaner, ...) to use the tags if they are not restricted by any right other than edit. Otherwise, how do they know if they can remove the ad from comments?

True, but what about IPs? They can edit, but they can't use Twinkle or AWB (don't know about WPCleaner).

I'm not entirely comfortable letting IPs (and non-autoconfirmed users, on WMF) add tags to their edits, mainly because I can't see anything good they could do with that ability, and I can imagine bad (at least, annoying/disruptive) things they could do.

Same for WPCleaner, login is required to edit. I don't mind if tags are restricted to registered users.

If tags are available only through the API and not in the UI, tags will only be added by the tools themselves, not by a user (performing an edit through the API without a tool is not easy: if a user has the knowledge and time to do that, he could already modify the tools to do nasty things).

In T20670#950159, @TTO wrote:
  • Notably, the OAuth extension doesn't bother to define the tags it uses. Impolite, if you ask me!

Defining every possible tag would flood the list at Special:Tags with tags for a large number of apps that are never going to make logged actions, and also there's no way to "define" a tag without also marking it "active". See further comments at T60312.

As part of this, we probably need more categories of tags:

  • "defined" (shows on Special:Tags even if 0), returned by ChangeTags::listDefinedTags()
  • "active" (says 'yes' on Special:Tags), which really shouldn't be identical to "defined"
  • "can be added by users", which IMO should be "valid_tag + a new hook"
  • "can be removed by users", which IMO should be "valid_tag + !defined ± a new hook"

All these categories should be indicated in the API action=query&list=tags output, BTW.
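
The four categories above can be modelled as simple set membership checks. This is only a sketch under stated assumptions: `TagRegistry` and its field names are hypothetical, the two hook fields stand in for whatever the "new hook" would return, "active" for valid_tag entries is assumed to follow from being listed there, and "valid_tag + !defined" is read as "listed in valid_tag, or not defined anywhere".

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class TagRegistry:
    valid_tag: Set[str] = field(default_factory=set)          # rows of the valid_tag table
    extension_defined: Set[str] = field(default_factory=set)  # tags defined by extensions
    extension_active: Set[str] = field(default_factory=set)   # tags extensions still apply
    hook_addable: Set[str] = field(default_factory=set)       # hypothetical hook: extra addable tags
    hook_unremovable: Set[str] = field(default_factory=set)   # hypothetical hook: vetoed removals

    def is_defined(self, tag: str) -> bool:
        return tag in self.valid_tag or tag in self.extension_defined

    def is_active(self, tag: str) -> bool:
        # Assumption: manual tags are active while listed in valid_tag.
        return tag in self.valid_tag or tag in self.extension_active

    def can_add(self, tag: str) -> bool:
        # "valid_tag + a new hook"
        return tag in self.valid_tag or tag in self.hook_addable

    def can_remove(self, tag: str) -> bool:
        # "valid_tag + !defined ± a new hook"
        if tag in self.hook_unremovable:
            return False
        return tag in self.valid_tag or not self.is_defined(tag)
```

An API listing such as action=query&list=tags could then report one boolean per category for each tag.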

  • Add a tags= parameter to API action=edit, allowing users to add specific change tags to the edits they make. This will be of particular value to bots and user scripts (e.g. Twinkle could stop adding its little ([[WP:TW|TW]]) "ad" to the edit summary, and use a tag instead). To prevent nonsense/test/spam tags from proliferating, only those tags listed in the valid_tag table would be allowed to be added this way. (3)
    • Question: Does this need to be restricted by a user right (addchangetags perhaps), or can we allow it for all users who can edit?

It depends on the tag, IMO. Some tags might be OK for all users, while others might be open for abuse if they can be added by anyone. Perhaps we should add a column to valid_tag that has a pipe-separated list of rights to be checked with User::isAllowedAny()?

We might want to make that "valid_tag plus tags returned by some new hook", so for example the tag proposed in T52393 could be allowed to be added by users without the extension having to mess with valid_tag.

We'd probably also want to add "tags=" to other actions that can generate a log entry, although perhaps not all at once.

  • Implement a changetags user right, which would give the user access to additional UI on ?diff pages? ?old revisions? etc. to do the following:

Also API. "action=tag", possibly. Ideally it should allow for input of multiple values for revid, logid, or rcid, and allow for both adding and removing tags in the same request. This should, of course, generate a log entry for each revid/logid/rcid.
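
A minimal sketch of what a client-side request for such an "action=tag" module might look like. The revid, logid and rcid names come from the comment above; the `add`, `remove` and `reason` parameter names and the `build_tag_request` helper are assumptions made for illustration, and a real request would also need a token.

```python
def build_tag_request(revids=(), logids=(), rcids=(), add=(), remove=(), reason=""):
    """Build the parameter dict for one action=tag request.

    Accepts several targets of each kind and allows adding and removing
    tags in the same request, as suggested above.
    """
    if not (revids or logids or rcids):
        raise ValueError("at least one revid, logid or rcid is required")
    if not (add or remove):
        raise ValueError("nothing to add or remove")

    def join(values):
        # Multi-value parameters are assumed to be pipe-separated.
        return "|".join(str(value) for value in values)

    params = {"action": "tag", "format": "json"}
    for name, values in (
        ("revid", revids),
        ("logid", logids),
        ("rcid", rcids),
        ("add", add),
        ("remove", remove),
    ):
        if values:
            params[name] = join(values)
    if reason:
        params["reason"] = reason
    return params
```

For example, `build_tag_request(revids=[101, 102], add=["twinkle"], remove=["possible spam"])` describes a single request that would be expected to produce one log entry per revision.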

  • Add a tag to a particular revision. (4)
    • Question: Should this be restricted to valid_tags only? Or can any tag be added after the fact?

Same as above, I'd say: valid_tag plus a hook.

  • Remove a tag from a particular revision. (4)

This could be less restricted, but we still need to allow for extensions to prevent the removal of their tags, e.g. the "VisualEditor" tag would be somewhat useless for its purposes if people could go around deleting it.

A good default would probably be to allow removing any tag in valid_tag, any tag that isn't defined at all, plus a hook to allow extensions to adjust that list.

  • Delete a tag. This would remove it from all revisions and log entries where it is in use, as well as removing it from the valid_tag table (if present). Limited to 5000 rows, due to DB performance concerns. (1)

Now that I think of it, I wonder whether the "remove it from all revisions and log entries where it is in use" part should be optional. I could see a case where a tag should be removed from valid_tag (meaning no one can add it to new revisions/log entries) but the existing backlog should be kept to be manually cleaned up.

We could get around the 5000-row limit using the job queue; the basic process there is to create a subclass of Job that does a piece of the work and then resubmits itself, put it in $wgJobClasses, and then submit it when necessary. This job from SecurePoll might be a good starting point, but in this case you could just do a select with LIMIT and not worry about manually paging it.
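
The self-resubmitting pattern can be sketched outside MediaWiki as follows. This is not the Job API itself: `DeleteTagJob`, `run_queue`, the in-memory list of rows and the deque standing in for the job queue are all hypothetical, and the loop over rows plays the role of a select with LIMIT.

```python
from collections import deque


class DeleteTagJob:
    """Removes one batch of change_tag rows for a tag, then resubmits itself."""

    def __init__(self, tag, batch_size=5000):
        self.tag = tag
        self.batch_size = batch_size

    def run(self, change_tag_rows, queue):
        deleted = 0
        kept = []
        for row in change_tag_rows:
            # Equivalent of "SELECT ... LIMIT batch_size": stop deleting once the batch is full.
            if row["tag"] == self.tag and deleted < self.batch_size:
                deleted += 1
            else:
                kept.append(row)
        change_tag_rows[:] = kept
        # Resubmit only if rows with this tag are still left.
        if any(row["tag"] == self.tag for row in change_tag_rows):
            queue.append(DeleteTagJob(self.tag, self.batch_size))
        return deleted


def run_queue(queue, change_tag_rows):
    """Drain the queue and return how many jobs ran."""
    runs = 0
    while queue:
        job = queue.popleft()
        job.run(change_tag_rows, queue)
        runs += 1
    return runs
```

Because each run only looks for "the next batch_size rows with this tag", no paging state has to be carried from one job to the next.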

The attempt to actually remove uses should use the same "is this tag allowed to be removed?" check as above.

  • Add an existing (extension-defined) tag to the valid_tag table, to allow it to be manually applied by users. (2)

This should not be allowed, IMO. We don't want people adding "VisualEditor" to non-VE edits, for example. So I'd say we shouldn't allow creating any already-defined tag, and we need a hook to allow extensions to prevent creating a tag even if it's not defined (e.g. OAuth would blacklist any tag with prefix "OAuth CID:").

  • Merge two tags into one. Yet to determine how this would work, but it would be nice to have. (mysterious future)

"How" is reasonably simple: delete the old tag from valid_tag (so it can't be added anymore), then iterate over all revisions with the old tag to remove the old tag while adding the new one. Again, the iteration would probably best be done via job queue.

It should also check whether the old tag is removable and the new is addable, of course.
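
The merge steps described above might look like this in outline. The function name, the in-memory `valid_tag` set and `change_tag` mapping, and the two permission callbacks are hypothetical; a real implementation would iterate in batches, probably via the job queue.

```python
def merge_tags(old, new, valid_tag, change_tag, can_remove, can_add):
    """Replace tag `old` with tag `new` everywhere and retire `old`.

    valid_tag:  set of manually defined tag names
    change_tag: dict mapping a revision id to the set of tags on it
    can_remove / can_add: callables implementing the permission checks
    """
    # Check permissions first, before valid_tag is modified.
    if not can_remove(old):
        raise PermissionError("tag cannot be removed: " + old)
    if not can_add(new):
        raise PermissionError("tag cannot be added: " + new)

    # Step 1: make sure the old tag can no longer be applied.
    valid_tag.discard(old)

    # Step 2: swap the tag on every revision that carries it.
    moved = 0
    for tags in change_tag.values():
        if old in tags:
            tags.discard(old)
            tags.add(new)
            moved += 1
    return moved
```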

[2] AFAIK no extensions currently apply tags to log entries, although such functionality is definitely possible.

OAuth does, for one. I also see the possibility to do so in AbuseFilter, TorBlock, MobileApp, and MobileFrontend, although I don't know for sure whether any code paths allow it.

Very helpful as always, Anomie. I agree with just about all that.

I would rather not use the job queue initially, if it's alright with you. I think the limit of 5000 rows should be enough for most purposes. Although being able to delete stuff like HHVM without having to concoct hackish scripts would be nice -- I think that is best left for a later patch -- perhaps written by someone other than me :)

Two lingering questions:

  1. I don't see any benefit in adding additional "active" semantics to manual tags - if a manual tag is inactive, that implies it's not supposed to be used anymore, so it may as well be un-defined. I would suggest to keep the existing semantics (defined === active) for stuff in the valid_tag table, and add a ListActiveTags hook for extensions to nominate which of their tags are active.
  2. "Perhaps we should add a column to valid_tag that has a pipe-separated list of rights to be checked with User::isAllowedAny()?" That feels a bit over-complicated. I can't see a clear need for this, but I'm open to being convinced.

Here's an example of what Special:Tags could look like. In order, we have:

  • an active, non-deletable, extension-defined tag
  • an active, deletable, extension-defined tag
  • an inactive, non-deletable, extension-defined tag
  • an inactive, deletable, extension-defined tag
  • an active, extension-defined, manually-applicable tag
  • an active manually-defined tag
  • and a "lost" tag (i.e. not in valid_tag and not defined by an extension, but still in change_tag)

| Tag name | Appearance | Description | Source | Active? | Tagged changes | Actions |
| --- | --- | --- | --- | --- | --- | --- |
| VisualEditor | VisualEditor | Editing made sexy | Defined by VisualEditor [1] | Yes | 12,345 changes | |
| section blanking | Section blanking | Don't do it | Defined by edit filter $1 | Yes | 1,234 changes | Delete |
| OAuth CID: 23 | OLD OAuth tool | No longer working | Defined by OAuth consumer $1 | No | 123 changes | |
| seciotn blanking | Section blanking | Misspelt tag | Defined by the edit filter | No | 123 changes | Delete |
| no ping | No ping | Go away, Echo | Defined by Echo; manually applied by users and bots | Yes | 12 changes | |
| Twinkle | [[WP:TW|Twinkle]] | User script | Manually defined; manually applied by users and bots | Yes | 12 changes | Delete / Deactivate |
| michael jackson | Michael Jackson | Testing | No longer in use | No | 12 changes | Delete / Activate |

[1] It could be helpful if we allow extensions to choose an extension-specific message for this field. e.g. Defined by VisualEditor / Defined by the abuse filter (enwiki could change it to "Defined by the edit filter")

I don't see the option to merge seciotn blanking with section blanking... Would that be part of this interface or another?

Sorry about that; I replied to the email instead of commenting directly.

@Technical13: Could you please edit your comment to remove the repeat of my comment? :)

I imagine the "merge" interface could be implemented as a fieldset above the table at Special:Tags, below the fieldset that would be provided for creating a tag.

Change 182563 had a related patch set (by TTO) published:
[WIP] Creation, activation and improved management of change tags

https://gerrit.wikimedia.org/r/182563

In T20670#951637, @TTO wrote:
  1. I don't see any benefit in adding additional "active" semantics to manual tags - if a manual tag is inactive, that implies it's not supposed to be used anymore, so it may as well be un-defined. I would suggest to keep the existing semantics (defined === active) for stuff in the valid_tag table, and add a ListActiveTags hook for extensions to nominate which of their tags are active.

I agree with that.

  2. "Perhaps we should add a column to valid_tag that has a pipe-separated list of rights to be checked with User::isAllowedAny()?" That feels a bit over-complicated. I can't see a clear need for this, but I'm open to being convinced.

At least let's design things so we don't have to replace all the hooks and so on if we decide to do this later.

Change 182563 abandoned by TTO:
[WIP] Creation, activation and improved management of change tags

Reason:
Never fear, this work is continuing at I77f476c8d0f32c80f720aa2c5e66869c81faa282

https://gerrit.wikimedia.org/r/182563

What is the relation between this task and T77964?

Change 188543 had a related patch set uploaded (by TTO):
Allow users to apply change tags as they edit using the API

https://gerrit.wikimedia.org/r/188543

Change 181958 merged by jenkins-bot:
Creation, deletion and improved management of change tags

https://gerrit.wikimedia.org/r/181958

A few comments on applying change tags. I think we should distinguish between three categories of tags that can be added: those added by site scripts, by bot users, and by non-bot users. Tag managers (users with the managechangetags permission, possibly less restricted than deletechangetags) would specify the category, and permissions would be checked like this:

  1. none (default): cannot be applied
  2. bot: can be applied exclusively by users with both the 'addchangetags' userright and the 'bot' userright
  3. user: can be applied exclusively by users with the 'addchangetags' userright and without the 'bot' userright
  4. script: can be applied exclusively by registered users through a site script

A tag may be in only one of those categories, so it may not be added by both bots and non-bots (except through scripts); this way users cannot add bot tags. If we want tags that can be added by both bot and non-bot users beyond script tags, then we need another category (bot+user), but in my opinion this should be discouraged: a non-script tag should be either reserved for bot use or manually applied.
However, I don't know whether it's possible for a tag to be applied when instructed by site scripts but not in any other way, such as with a user script. If not, we would need a specific userright (addchangetags-script, probably restricted to autoconfirmed users).
Tag managers may be prevented from changing the availability status of certain tags.
I guess that this could be implemented with an 'availability' field, vt_availability, added to the valid_tag table and checked when attempting to apply tags and at Special:Tags.

I think we should distinguish between three categories of tags that can be added : those added by site scripts, by bot users, and by non-bot users.

Why should there be tags that can only be added by non-bot users?

If by "site scripts" you're talking about gadgets or MediaWiki:Common.js, there's no way to determine that.

A tag may be in only one of those categories,

Why limit it so? IMO, a better model would be to be able to restrict modification of individual tags to users with some other arbitrary user right besides "addchangetag" (or whatever it ends up being called).

OTOH, the current implementation looks like it's going to be just the one "addchangetag" right, with no more complicated rights checking.

Why limit it so? IMO, a better model would be to be able to restrict modification of individual tags to users with some other arbitrary user right besides "addchangetag" (or whatever it ends up being called).
OTOH, the current implementation looks like it's going to be just the one "addchangetag" right, with no more complicated rights checking.

We need a system of permission checking that is robust enough to protect against mistakes and malicious tagging, which may be hard to spot and hard to fix. It should also allow for future expansion, since people have vastly different use cases in mind. I've rethought this a bit; here's what I think. The system I suggest is to check each attempt to apply a user-defined tag against a list of tags that tag managers have allowed for that particular use case. It's easy to manage (having several small lists to maintain is much better than one big one) and allows for finer permission checking. Each list could be updated at a subpage of Special:Tags (plain text like the raw watchlist is fine). After an update, added tags would automatically be defined, while removed tags would no longer be (activation would remain at Special:Tags). For each new use case that is implemented in MediaWiki core or an extension, a new list can be created just for it. I'll list examples of use cases and the corresponding lists.

This use case is bot tagging of edits, as discussed at Wikipedia:Village pump (proposals)/Archive 117#Bot tagging of edits:

  • list named "any-bot": listed tags can be applied to any revision, log or recentchanges item; the bot userright is checked
    • list items: possible copyvio (bot), possible cut and paste move (bot), possible vandalism (bot), possible spam (bot), etc.

This use case is for tags applied to one's own edits when saving them (scripts, etc.); in that case a new userright is needed:

  • list named "self-edit-bot": needs both the addchangetags-self and bot userrights
    • list items: AWB (bot), formatting fixes (bot), unreferenced (bot), ...
  • list named "self-edit-nonbot": needs addchangetags-self, and bot must not be among the user's rights (see below for why)
    • list items: AWB, WP:TW, HG, WikiLove, ...

This is for T88771, tagging of admin actions (four use cases):

  • list named "self-pagedelete": no userright check needed (already checked when performing the 'delete' action)
    • list items: CSD:A1, CSD:G3, WP:AFD, WP:PROD, ...
  • same for the lists "self-revisiondelete", "self-protect" and "self-block"

That's six use cases, yet we only need one more userright, addchangetags-self. If we also want to implement manual tagging of other users' edits or actions, then we may need another userright specifically for that use case: addchangetags.
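
To make the per-use-case lists concrete, here is a small sketch of the check. The list names, tags and userrights are taken from the examples above; the `TAG_LISTS` structure and the `may_apply` function are hypothetical, and only a few of the lists are filled in.

```python
# Each use case has its own list of allowed tags plus the rights it
# requires or forbids.
TAG_LISTS = {
    "any-bot": {
        "required": {"bot"},
        "forbidden": set(),
        "tags": {"possible copyvio (bot)", "possible spam (bot)"},
    },
    "self-edit-bot": {
        "required": {"addchangetags-self", "bot"},
        "forbidden": set(),
        "tags": {"AWB (bot)", "formatting fixes (bot)"},
    },
    "self-edit-nonbot": {
        "required": {"addchangetags-self"},
        "forbidden": {"bot"},  # bots must use the bot list instead
        "tags": {"AWB", "WP:TW", "HG", "WikiLove"},
    },
    "self-pagedelete": {
        "required": set(),  # the 'delete' right was already checked by the action
        "forbidden": set(),
        "tags": {"CSD:A1", "CSD:G3", "WP:AFD", "WP:PROD"},
    },
}


def may_apply(tag, use_case, user_rights):
    """Return True if `tag` is on the list for `use_case` and the rights match."""
    entry = TAG_LISTS.get(use_case)
    if entry is None or tag not in entry["tags"]:
        return False
    rights = set(user_rights)
    return entry["required"] <= rights and not (entry["forbidden"] & rights)
```

A tag placed on the wrong list shows up quickly under this scheme: a bot asking for "self-edit-nonbot" is refused outright instead of silently succeeding.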

Why should there be tags that can only be added by non-bot users?

If we have two lists, one only for bots and one for everyone, then if a tag manager mistakenly adds a tag meant only for bots to the list for everyone, there will be no warning sign and the tag may be improperly applied for quite a while. If instead we have two lists, one only for bots and one only for non-bots, any mistake will become visible quickly, since the bot won't be able to perform any tagging. If, however, a tag absolutely must be shared by bots and non-bots, then it may be added to both lists at the same time (in this regard it differs from my first suggestion, which was exclusive), but at least this way it's less prone to mistakes.

It may be a good idea to save the id of the list checked for the tagging in the change_tag table. A log built from it would be useful for use cases where tags are added after the fact, as we may want to know when the tagging occurred.

At some point we'll also have to implement a system for patrolling tags, since many of those are intended to draw attention for review, yet the current system is unwieldy. For this, we may use what we currently have: patrolled edits. Whenever an edit is marked as patrolled, the tags attached to the new and previous revisions of the page would automatically be marked as patrolled. On wikis where patrolling is disabled, we may add an option to enable patrolling only on pages where an unpatrolled tag exists. Not all tags have to be patrolled; some are intended for maintenance or information purposes, so there should be a way for tag managers to define at Special:Tags which tags need patrolling. The feature may be disabled entirely, such as on wikis with FlaggedRevs on all pages.

Change 188543 merged by Anomie:
Allow users to add, remove and apply change tags using the API

https://gerrit.wikimedia.org/r/188543

Change 204341 had a related patch set uploaded (by Anomie):
Allow users to add, remove and apply change tags using the API

https://gerrit.wikimedia.org/r/204341

Change 204341 merged by jenkins-bot:
Allow users to add, remove and apply change tags using the API

https://gerrit.wikimedia.org/r/204341

Question: I saw someone on one of the help boards asking about the changes to the history pages and am wondering if anyone can add/remove tags or if there are required permission sets?

Question: I saw someone on one of the help boards asking about the changes to the history pages and am wondering if anyone can add/remove tags or if there are required permission sets?

First, the tag needs to be created or activated for user editing on Special:Tags (this prevents people messing with tags like "mobile-edit" used by extensions). This requires the 'managechangetags' right, assigned to sysops by default.

Then, when making an edit (and in the future possibly when taking other logged actions) a user with 'applychangetags' can add those tags to the revision they're creating. A user with 'changetags' can add these tags to existing revisions and log entries, and can remove these tags and tags that are completely undefined from existing revisions and log entries. By default both 'applychangetags' and 'changetags' are given to all logged-in users, but T97013 / Gerrit change 208088 would change that on WMF wikis.
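
The rights described above can be summarised in a short sketch. The right names and default grants come from this comment; the function names, the `user_editable` set (tags created or activated for user editing on Special:Tags) and the `defined` set are hypothetical stand-ins, and this is a model of the rules as explained here, not the actual core code.

```python
from typing import Iterable, Set

# Default grants as described above; "user" means any logged-in account.
DEFAULT_GROUP_RIGHTS = {
    "sysop": {"managechangetags"},
    "user": {"applychangetags", "changetags"},
}


def rights_for(groups: Iterable[str]) -> Set[str]:
    """Collect the rights granted by a user's groups."""
    rights: Set[str] = set()
    for group in groups:
        rights |= DEFAULT_GROUP_RIGHTS.get(group, set())
    return rights


def can_apply_on_edit(rights, tag, user_editable):
    """Tagging the revision the user is creating right now."""
    return "applychangetags" in rights and tag in user_editable


def can_add_to_existing(rights, tag, user_editable):
    """Adding a tag to an existing revision or log entry."""
    return "changetags" in rights and tag in user_editable


def can_remove_from_existing(rights, tag, user_editable, defined):
    """Removing a user-editable tag, or one that is completely undefined."""
    return "changetags" in rights and (tag in user_editable or tag not in defined)
```

Under these rules an extension tag such as "mobile-edit" can be neither added nor removed by users unless someone with 'managechangetags' opens it up on Special:Tags.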

The fact that the checkboxes and buttons are showing up even when there are no user-editable tags is T97773: Only show edit-change-tags UI if editable change tags are defined, now fixed and backported.

@Anomie, @TTO: I think it should be possible to activate / deactivate extension-defined tags for usage by users. For some tags this could be useful, although it isn't a good idea in some cases, like VisualEditor or mobile edit. In these cases the tags can stay deactivated.