Wikipedia:Edit filter noticeboard
Welcome to the edit filter noticeboard |
---|
Filter 174 — Actions: disallow
Filter 1325 — Pattern modified
Filter 614 — Pattern modified
Filter 782 — Pattern modified
Filter 833 — Pattern modified
Filter 1330 (new) — Actions: none; Flags: private; Pattern modified
Filter 1329 (new) — Actions: showcaptcha; Flags: enabled,private; Pattern modified
Filter 484 (restored) — Flags: enabled; Pattern modified
This is the edit filter noticeboard, for coordination and discussion of edit filter use and management. If you wish to request an edit filter, please post at Wikipedia:Edit filter/Requested. If you would like to report a false positive, please post at Wikipedia:Edit filter/False positives. Private filters should not be discussed in detail here; please email an edit filter manager if you have specific concerns or questions about the content of hidden filters. There are currently 333 enabled filters and 48 stale filters with no hits in the past 30 days. Filter condition use is ~1050, out of a maximum of 2000. See also the profiling data and edit filter graphs. |
Index 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 |
This page has archives. Sections older than 20 days may be automatically archived by Lowercase sigmabot III. |
Filter 812 didn't disallow edits
Hatebread (contribs) (a non-autoconfirmed user) made 5 edits, each of which increased the page size by 2.7 million bytes, and for some reason filter 812 didn't disallow any of them. Is there any reason why? Do filters sometimes miss edits? —MRD2014 Talk • Edits • Help! 00:37, 14 September 2017 (UTC)
- If I had to guess, it was because the edit was commented out? Other than that, I have no idea. Seems to be working properly everywhere else. Primefac (talk) 00:48, 14 September 2017 (UTC)
- It's enabled and hasn't been changed since December 2016. —MRD2014 Talk • Edits • Help! 01:18, 14 September 2017 (UTC)
- Sorry, I meant that the edits themselves were inside of comments. My second comment was regarding the fact that the filter was tripped 2-3 days ago, and about once a week since it was implemented. In other words, "it's working, and this is my only theory why it missed these edits". Though I do notice that they were all "new section" edits - maybe that threw it off? Primefac (talk) 01:22, 14 September 2017 (UTC)
- According to the filter, it doesn't matter what the edit summary says. —MRD2014 Talk • Edits • Help! 02:42, 14 September 2017 (UTC)
- Another major AbuseFilter bug...??? It should definitely 100% have stopped this, and indeed I can test the filter against that user at Special:AbuseFilter/test and it matches. I think there was some breaking change that happened a while back, because we've had several filters malfunction, where apparently the variables being read have the wrong values, are for the wrong edits, other weirdness. I'll create a task and propose rolling back the extension to a stable version — MusikAnimal talk 16:30, 14 September 2017 (UTC)
Looks like a new filter is needed. I'll go and request one. TomBarker23 (talk) 10:40, 19 October 2017 (UTC)
- Nothing wrong with the filter as far as I know. There is a bug with this filter that is being tracked at phab:T175933, as shown at the top of this thread. —MRD2014 Talk • Edits • Help! 02:56, 23 October 2017 (UTC)
Request for Edit Filter Helper - Dane
The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.
- Closure available for this EFH nomination.
- Dane (t · th · c · del · cross-wiki · SUL · edit counter · pages created (xtools · sigma) · non-automated edits · BLP edits · undos · manual reverts · rollbacks · logs (blocks · rights · moves) · rfar · spi · cci)
Hello,
I am requesting the Edit Filter Helper right so I can view hidden logs. I am a Tool Administrator at the ACC tool and run across tripped filters occasionally that I am unable to view/evaluate in the course of my work there currently and this would help me in processing those requests. Thank you! -- Dane talk 03:50, 4 October 2017 (UTC)
- Q: Can you give a specific (even hypothetical) scenario where seeing the logs would be useful? CrowCaw 15:01, 4 October 2017 (UTC)
- A: Yesterday, I worked a request for example that showed an Edit Filter was tripped but only had a number specified. When we defer requests for CheckUser, we've been asked by CU to supply as much information as possible to assist in their processing of the requests. Because I did not have that information available I wasn't able to properly defer even as a tool administrator and I had to find an on-wiki administrator to research the specific request and write up the summary for CU which can add a delay in processing. In some cases, having the EFH right may even allow a bypass for CU if it's clear that it's the target for a filter created for specific LTAs. -- Dane talk 19:43, 4 October 2017 (UTC)
- Expanding on the above, I don't believe being an ACC tool administrator is a sufficient reason to grant, and given the fact you were able to find an administrator to assist there's no real issue. However, do you possess sufficient understanding of regex/AbuseFilter syntax to be able to reliably make the call on whether a specific LTA filter was indeed targeting the request IP you're looking at? -- There'sNoTime (to explain) 19:47, 4 October 2017 (UTC)
- It is true that I was in one instance able to find an administrator who holds sysop on en-wiki as well to evaluate it, however there have been times where there is a delay in that. In asking for the right I'm looking to be able to avoid this delay and resolve the issue myself. I do have a basic understanding of regex and I would defer any case that was in question or unclear to me per the existing operating guidelines. I do not intend to author any filters now or in the future. -- Dane talk 20:20, 4 October 2017 (UTC)
Oppose No real on-wiki need for the tool if other, more trusted editors (clarified: administrators and EFMs) are easily available via the mailing list or on IRC. There's a reason we have private filters and granting access solely for off-wiki tools is not a good precedent to set, given there is a real risk of sharing details of private filters with unauthorised parties on mailing lists we would not have access to. EFH was not created for this purpose -- There'sNoTime (to explain) 20:59, 4 October 2017 (UTC)
This was the wrong place to bring up my concerns about criteria #1 -- There'sNoTime (to explain) 08:47, 5 October 2017 (UTC)
|
---|
This isn't a hill I want to die on, but my point is being misunderstood. Dane is perfectly trusted, I think he should have passed RfA and that isn't the issue I have. Granting criteria #1 highlights a requirement for "need". I don't see a need here when other trusted editors (read "Administrators, EFMs" as I clarified above - that was my poor wording) are available on IRC and through a mailing list. Perhaps I am using this as a point to highlight the inadequate definition of "need" (the major concerns from the RfC were creep), and that's not fair on Dane, so I apologise there. Consider my oppose struck, but a discussion needs to happen over things like this. This was the wrong place to have that discussion -- There'sNoTime (to explain) 08:41, 5 October 2017 (UTC)
|
- I'm still not sure that ACC yields enough of an ongoing need for the access. The logs won't tell you why a filter was tripped, just that it was... you'd need to dig into the regex to determine why. And since the number of EFs dealing with ACC is small compared to the total (Meta prefers to handle name blocks now), it seems like an email with the regex would cover 99% of the issues, with that 1% being new additions to the filter. CrowCaw 22:33, 4 October 2017 (UTC)
- @Crow: I think this argument can be reversed and be equally as valid - even if there isn't a hard ongoing need, if there aren't any specific issues that lead to the candidate user being considered untrusted despite their long-time status on Wikipedia, as long as they are competent in regular expressions and know not to break things, and as long as there are even potential gains to be had, is there harm in granting the user group? To me, the potential usefulness of this access is at least some kind of positive factor, and there don't seem to be any negative factors beyond lack of precedent, which seems like it should be a relatively minor factor - though this is admittedly very much an outsider perspective (with respect to edit filters). --FastLizard4 (talk•contribs) 07:51, 5 October 2017 (UTC)
- Support - this is a perfectly reasonable request IMO. This is such an inconsequential right that I can't imagine why we would turn away anyone with a potential use for it. -- Ajraddatz (talk) 00:19, 5 October 2017 (UTC)
- As for this being an "inconsequential right", I draw your attention to the closing caveats, namely "community and/or the granting administrator are adviced to carefully evaluate the candidates before granting the flag due to the huge fallouts of any misuse." -- There'sNoTime (to explain) 07:23, 5 October 2017 (UTC)
- There is almost no fallout for misuse. The absolute worst-case scenario is that an LTA is able to figure out what the filter conditions are, so they can bypass it a few minutes earlier than they could by trial and error. And when any sort of trusted editor is the one requesting the right, we can be pretty sure that the worst-case scenario won't happen. -- Ajraddatz (talk) 08:31, 5 October 2017 (UTC)
- Pinging BU Rob13. Any comments? I am sure he could address your point a lot better than me! Winged Blades of GodricOn leave 17:34, 5 October 2017 (UTC)
- Support - Per Ajraddatz. Perhaps the demonstrated need (#1) is not strong enough, but I would be satisfied if there are potential uses (indicated in the response above) by a trusted editor. Alex ShihTalk 06:44, 5 October 2017 (UTC)
- @Dane: Can you comment on how your use case exceeds that of a typical vandal fighter who could perhaps be benefited by viewing private logs? We decided rather resoundingly that such editors should not receive access to this right, and I am very worried about precedents being set with this user right. ~ Rob13Talk 10:19, 6 October 2017 (UTC)
- I hope to hear from Dane before committing one way or another, but put me down as oppose until I'm convinced further. I'm not sold on this use case, mostly because there are many admin ACC volunteers and this isn't frequent in the ACC workflow. (I hope no admin will close this until I get a response.) ~ Rob13Talk 19:32, 6 October 2017 (UTC)
- @BU Rob13: Sorry for the delay, I've been mobile most of the day. In regards to my usage vs. a typical vandal fighter, my usage of EFH will allow me to identify potential issues prior to giving someone a way to access en-wiki via an ACC request. On requests that do get deferred, our CUs have requested as much information as possible when we defer to them (they typically have a large backlog at ACC and a very detailed comment is helpful to their research) and this will allow me to present more robust comments to them regarding tripped filters. In my most recent example, I would have deferred a request without any idea why I was deferring and thus been unable to give additional information to the CU - whereas with this right I could've supplied an explanation that would be helpful. -- Dane talk 20:22, 6 October 2017 (UTC)
- @Dane: At the risk of sounding like a judge: Is it your contention that all ACC volunteers automatically qualify for this right? If not, how are you distinct from the general group? If so, I have other questions. PLEASE DON'T CLOSE THIS FOR ANOTHER 24 HOURS. I think it's important to get the ACC precedent right. ~ Rob13Talk 01:37, 7 October 2017 (UTC)
- @BU Rob13: I am not attempting to set any sort of "precedent" here and think that the discussion regarding whether ACC volunteers do or do not automatically qualify for the right is best fit for an RfC and not my individual request for the permission. Until any such RfC happens, I think requests should be evaluated based on individual need as demonstrated, and I believe I've covered my individual need above a few times. -- Dane talk 02:37, 7 October 2017 (UTC)
- @Dane: Well, at least in my opinion, your need does not exceed that of a typical ACC volunteer. It's hard to separate your case from a potential precedent, as if we say yes to you, I see us saying yes to all ACC volunteers. That's a somewhat large group – certainly larger than who I anticipated getting this right. While I may not be opposed to expanding the use cases to include ACC, I think we need a larger discussion surrounding that with more time to ask questions and probe your recruiting process to determine if we can justify giving this highly-sensitive right to a potentially large group of people. Until such a discussion is held, and without seeing how your need exceeds that of a typical ACC volunteer, I must oppose. (I would support an RfA, as a side note. You are clearly qualified.) ~ Rob13Talk 03:20, 7 October 2017 (UTC)
- Unsure Crow thoughts and seeking more discussion. On the positive side, I see zero issues with trust, and the Nonpublic data agreement is a huge plus. On the flip side, I'm still having trouble with the ongoing need for the right. This may very well be due to my not understanding the ACC workflow, which is (to me): a user sends a request in to ACC because they were blocked from creating an account. They likely will state the name they tried to submit. ACC volunteer looks at the EF log for their IP (is it always going to be the same?) and sees that an EF blocked it. Without EFH that's as far as you can go. With EFH, you check the specific filter, pore through a huge ugly regex, see the match and say "Yep, you were blocked alright." There's not going to be much more context than that to tell you what to say to a CU other than "This guy tripped a filter and might be a sock of any of a hundred different people or could just be a troll, or might just be an innocent user." (Please elaborate or correct me on any of this).
- So even with that degree of utility, the ongoing need still eludes me. Those few EFs don't change often, and (again from my own POV), it would seem easier to copy the huge ugly regex to a text document and sort/alpha/annotate/de-regex/etc to make it an easier lookup than opening the huge ugly regex every time. (I keep saying that, you'll see what I mean!). In fact that's what I do to a couple of the larger aggregated ones... I often get lost parsing the H.U.R. so keep offline copies sorted into a much easier format. Thus my point above about a periodic dump of the handful of filters that would be of any use in ACC would be just as meaningful.
- I did support the EFH permission creation (and may have had some part in kickstarting the RFC), but I respect and concur with the concerns expressed during it. As Rob alluded to, "ease of access" was a huge concern, as was "real ongoing need", which is where I'm stuck at the moment. I don't think access to the data itself is problematic, but with that data being fairly static, the ongoing need for the tool is the hangup. Free discussion welcome! CrowCaw 15:33, 6 October 2017 (UTC)
- @Crow: Just a comment regarding "a user sends a request in to ACC because they were blocked from creating an account. They likely will state the name they tried to submit. ACC volunteer looks at the EF log for their IP (is it always going to be the same?) and sees that an EF blocked it": I've handled a few ACC requests and I haven't come across one that was disallowed due to 527. 579 isn't disallow and afaik there are no other account creation-related filters. Dat GuyTalkContribs 16:16, 6 October 2017 (UTC)
- There is 102 that has the H.U.R. which is sort of related. CrowCaw 16:29, 6 October 2017 (UTC)
Regretful oppose In the absence of further discussion to my concern, and with the timer expired. Striking default qualifier as discussion followed. As DatGuy mentioned, there don't seem to be EFs of concern to ACC that are both Private and Disallow. Since Meta handles username blacklisting now, it looks like we let them to free the cycles. To the initial use case given, if an action only gave a filter number then that filter was already public as are all its logs. So I land as oppose due to still not seeing the ongoing need wrt ACC. I'm still open to discussion to help me see something that I'm not understanding. Crow On The Go! Caw 16:22, 8 October 2017 (UTC)
- @Crow: In regards to the ACC use need, providing checkusers with a summary similar to "Request XXXXXX had an IP that tripped filter X correctly" is where this comes in handy to me. Just yesterday I was speaking with a CU who reiterated this would be way more helpful than "IP X tripped X", as we can actually confirm whether it was a positive (correct) trip of the filter. -- Dane talk 16:33, 8 October 2017 (UTC)
- Thanks for replying. The thing in that scenario is, all you're going to get with EFH is the regex that tripped the filter. Without getting too beansy, you're unlikely to get much guidance as to what SPI it is from. I know you can't use a real world case for privacy reasons but can you give an example of an actual filter trip leaving out the PII? As I mentioned I think most of the ACC filters are public already. Crow On The Go! Caw 19:38, 8 October 2017 (UTC)
- @Crow: Abuse Filter 855 is an example of one that has tripped that I was unable to view/evaluate properly to give additional information. As you said, I know all about the WP:BEANS involved in this so I don't want to say much more - but viewing the filter would let me give the additional data behind what it's stopping and whether it appears to be a valid or invalid trip. Without that, basically my only comment can be "IP X tripped filter XXX" instead of "IP X tripped filter XXX; appears to be a valid trip according to the details of XXX". -- Dane talk 19:51, 8 October 2017 (UTC)
- Interesting, 855 has nothing to do with account creation. It covers specific editor(s) against a small set of pages (and not their own) so should not be showing up at ACC. Unfortunately I'm still not seeing the need vis. ACC given the lack of private-&-disallow filters here, and that Meta now handles username blacklisting so we're likely to defer any future EF requests along those lines to Meta. — Preceding unsigned comment added by Crow (talk • contribs) 16:22, 9 October 2017 (UTC)
- @Crow: Without sharing too much, I can tell you that the data from 855 is relevant to our checks at ACC when deferring to a CU as previously discussed. -- Dane talk 16:43, 9 October 2017 (UTC)
- I can probably guess why, but in this particular case the only question is whether the filter is too broad, which only a CU can answer anyway. EFH/EFM can only add the specific consideration which brings it to CU's attention in the first place, and leave it to them to see if it's a FP/Valid. It saves a CU or SPI clerk a click or 2, and while I'm not one to push more work on them, an SPI clerk needs to endorse before a CU will look, and that is the main use-case that was driving the EFH creation. CrowCaw 17:01, 9 October 2017 (UTC)
- Support - Dane is a trusted user and I can see how this would be beneficial for ACC purposes and having more eyes on things will be useful from an SPI standpoint. CHRISSYMAD ❯❯❯¯\_(ツ)_/¯ 03:51, 10 October 2017 (UTC)
- Oppose per lack of demonstrated need and per Crow. Users above who voted merely because a user is trusted are reminded of the concerns raised during the EFH RfC. This cavalier attitude to hat collecting and ease of access is problematic. Nihlus 04:03, 10 October 2017 (UTC)
- @Nihlus: With all due respect, Chrissymad is active on the ACC tool and an existing edit filter helper, so she likely is able to relate to the statements I've made above for need/usage of this right. Your comment overlooks her statement regarding how she sees the benefit for use for the ACC tool and implies she voted simply because she sees me as a "trusted user". -- Dane talk 04:29, 10 October 2017 (UTC)
- My comment was not directed at her, but rather, all the users above me who voted support. Nihlus 08:58, 10 October 2017 (UTC)
- I would counter by saying that the whole concept of requiring a "need" for the tools is problematic. Nobody needs any advanced permissions here. Nobody needs to be here at all. Instead, we have thousands of volunteers spending their time here, and the least we can do is give them the technical access necessary to fully do the jobs they've signed up for. We should look for whether a candidate has a potential use for the tools, without going near language as strong as "need". The candidate has clearly defined how he will use this permission - looking at filter tripping by IPs when evaluating certain ACC requests. As someone who formerly volunteered at ACC, I can attest to giving these users as much view access as possible so they can have the full picture when handling requests. All that remains is figuring out whether I trust the candidate to use the tools properly. In this case, proper use means looking at private filter logs from time to time, and not revealing the contents of the filters to long-term abusers who would then be able to bypass the filters 10 seconds faster than their usual trial-and-error method. I think I can make that leap of faith here. -- Ajraddatz (talk) 08:51, 10 October 2017 (UTC)
- The reason "need" keeps coming up is that it is the first consideration when considering the grant. See also the archives at RFPERM where "no need for" is often given when declining. And I won't even bring up RFA (though I guess I just did). Need was an important consideration and sticking point during the RFC as well. Now that all said, thanks to Dane for continuing to explain... it's not obvious to one outside the ACC environment, and it's really hard to explain things completely with a mouthful of WP BEANS, but I do now see the relevant use case where this is useful outside of action=accountcreate in EFs. So one more question if I may: when you defer a request to CU, how is that done? A button in the tool, mailing-list, SPI, an "on-call CU", etc? CrowCaw 14:57, 10 October 2017 (UTC)
- There's basically a 'defer to checkusers' button, that when pressed moves the request to a CU queue. Dat GuyTalkContribs 15:19, 10 October 2017 (UTC)
- @Crow: We don't have an SPI process (like clerks) and in my time at ACC we haven't ever discussed a CU case in the mailing list. We have a CU queue in ACC that we defer to using the tool and when we get a request that requires CU intervention, we make a note in the tool (a note can be set to "Tool User" or "Tool Admin" for visibility). The note usually contains a description of why we are deferring the request for CU (e.g. "What sock does it match? Is the IP CU Blocked without an ACC Ignore note? Did an Edit Filter trip? Is there something suspicious that needs to be evaluated?") and then CU processes the requests. The feedback from CUs has been that the descriptions with more information cut down the time for the CU to process (especially when the CU queue on ACC is frequently in the 40s to 60s with a couple of weeks' backlog). In evaluating a normal request that doesn't appear to be directly CU, for example, recent vandalism and edits within a certain timeframe go into the CU queue. Filter trips are looked at as well when we evaluate requests, so this doesn't limit the use case to just CU-Blocked IPs. I hope this helps clarify a bit more. -- Dane talk 15:32, 10 October 2017 (UTC)
- Thanks and yes this does fill in a huge gap in my understanding about the use case and associated need. I shall ponder this, but for now I've struck through my oppose. CrowCaw 15:40, 10 October 2017 (UTC)
- Comment I fear this request is derailing slightly into overall comments on ACC volunteers having EFH. I've discussed this extensively off-wiki with Dane, and having recently become a checkuser I can appreciate where his comments and need comes from. Not having even a basic "looking at the filter I can see this was caught because X" adds significant time onto processing a request (not to mention trying to weigh up if a check is a good idea). I'm mindful of seeming insincere if I now support after my heated oppose above, but my sentiment remains one of supporting this request. Regardless of how this request ends, I'd like to propose we have a proper discussion on EFH for ACC (perhaps it could be dealt with in a similar way to SPI clerks?). Additionally, I would like to offer Dane mentorship should he be granted this right, as a compromise against the wholly valid concerns brought up regarding access and understanding of syntax -- There'sNoTime (to explain) 15:47, 10 October 2017 (UTC)
Filter 384 2
Can we add !(added_lines rlike "[A-Z][a-z]\s\bDick\b") & to 384 to prevent this type of false positive? Nihlus 02:53, 9 October 2017 (UTC)
- Done – I assume you meant "\b[A-Z][a-z]+\sDick\b". Κσυπ Cyp 12:48, 9 October 2017 (UTC)
- Yep. Lazy copy paste on my part. :/ Thank you! Nihlus 13:13, 9 October 2017 (UTC)
- @Cyp: Another false positive. Can we exclude cases of "Dick's Sporting" as it's both a franchise name and on some stadiums/arenas?
!(added_lines rlike "\bDick\s[A-Z][a-z]|\b[A-Z][a-z]+\sDick\b|\bDick\'s\s[A-Za-z][a-z]") &
Should also catch false positives of the possessive Dick's when referring to someone's name. Nihlus 20:25, 15 October 2017 (UTC)
- @Nihlus: Ok, added ('s)?, which should be equivalent. Κσυπ Cyp 05:44, 16 October 2017 (UTC)
- That will work for the "Dick's Sporting" but won't work for any possessive form of "Dick's". See examples such as "Dick's works" and "Philip K. Dick's poetic". Nihlus 14:05, 16 October 2017 (UTC)
- Sorry, hadn't seen the a-z in your [A-Za-z]. How about now; John Q. Dick and Dick Q. Johnson should be able to take Dick's kitten Mr. Dick to Dick's Vet to be checked by Dr. Dick, without triggering 384. Κσυπ Cyp 08:58, 17 October 2017 (UTC)
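For anyone wanting to try the patterns from this thread outside the filter, here is a minimal sketch in Python. It assumes Python's `re` module treats `\b`, `\s` and the character classes the same way AbuseFilter's `rlike` does for these particular expressions. The function name `looks_like_name` and the sample strings are illustrative only, and the rest of filter 384's conditions are not reproduced here.

```python
import re

# First proposal as posted: exactly one lowercase letter after the capital.
AS_POSTED = re.compile(r"[A-Z][a-z]\s\bDick\b")

# Cyp's reading of it: a whole capitalised word before "Dick".
CORRECTED = re.compile(r"\b[A-Z][a-z]+\sDick\b")

# The later three-way proposal from the thread. The \' escape is dropped
# because an apostrophe needs no escaping in a Python raw string.
NAME_EXEMPTION = re.compile(
    r"\bDick\s[A-Z][a-z]|\b[A-Z][a-z]+\sDick\b|\bDick's\s[A-Za-z][a-z]"
)


def looks_like_name(added_lines: str) -> bool:
    """Return True when the text matches the proposed name exemption."""
    return NAME_EXEMPTION.search(added_lines) is not None


if __name__ == "__main__":
    samples = [
        "Moby Dick",
        "Dick Johnson",
        "Dick's Sporting Goods",
        "Philip K. Dick's poetic",
        "Dick Q. Johnson",
        "you are a dick",
    ]
    for sample in samples:
        print(f"{sample!r}: exempt={looks_like_name(sample)}")
```

Under these assumptions the pattern as first posted misses "Moby Dick" (it only allows a two-letter word before the space), while the corrected form matches it. The three-way proposal as quoted does not cover middle initials such as "Dick Q. Johnson"; that case is only said to be handled by the later revision, whose text is not quoted in this thread.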
New function available for use
We have added ccnorm_contains_any. It is a convenience function because it literally translates to contains_any(ccnorm(param1), ccnorm(param2), ...).
It can be used when we need to find multiple strings within another string and we want to use their canonical representation for comparison.
Here are some examples:
Code | Result |
---|---|
ccnorm_contains_any( "w1k1p3d14", "wiKiP3D1A", "foo", "bar" ) | true |
ccnorm_contains_any( "w1k1p3d14", "foo", "bar", "baz" ) | false |
ccnorm_contains_any( "w1k1p3d14 is 4w3s0me", "bar", "baz", "some" ) | true |
DMaza (talk) 02:40, 14 October 2017 (UTC)
- @DMaza (WMF): for the examples above, what is the reference string? — xaosflux Talk 02:56, 14 October 2017 (UTC)
- @Xaosflux: You would use it like this: ccnorm_contains_any(added_lines, 'foo', 'bar'). Just like contains_any it will search for 'foo', 'bar' in added_lines, only that in this case it will ccnorm everything. Makes sense? DMaza (talk) 05:08, 14 October 2017 (UTC)
- @DMaza (WMF): In the example table above - why is row 2 false? What was the example added line? — xaosflux Talk 09:32, 14 October 2017 (UTC)
- @Xaosflux: ccnorm_contains_any("w1k1p3d14", "foo", "bar", "baz") is equal to contains_any("WIKIPEDIA", "FOO", "BAR", "BAZ"), which is always false (for any edit) since "WIKIPEDIA" doesn't contain any of "FOO", "BAR" or "BAZ". And similarly, if someone adds a line saying "w1k1p3d14" so that added_lines == "w1k1p3d14", then ccnorm_contains_any(added_lines, "foo", "bar", "baz") would be false for that particular edit, but would be true for an edit where someone adds a line saying "tw0 ducks w3nt t0 th3 b4r!!!!". Κσυπ Cyp 14:17, 14 October 2017 (UTC)
- @Cyp: Thank you, so in the examples above the first value is compared against all subsequent values correct? That is what I was missing. — xaosflux Talk 14:31, 14 October 2017 (UTC)
- @Xaosflux: Yes, that's correct. Κσυπ Cyp 14:56, 14 October 2017 (UTC)
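For readers who want to try the semantics described in this thread outside of AbuseFilter, here is a rough Python sketch. It is not the extension's implementation: the `ccnorm` below is a simplified stand-in that only uppercases and maps a handful of digits to look-alike letters (an assumed mapping, chosen so that "w1k1p3d14" becomes "WIKIPEDIA" as in Cyp's explanation), whereas the real function uses a much larger table of confusable characters.

```python
# Simplified stand-in for AbuseFilter's ccnorm(). ASSUMPTION: this tiny
# look-alike table is illustrative only; the real function covers far more.
_LOOKALIKES = str.maketrans({"0": "O", "1": "I", "3": "E", "4": "A"})


def ccnorm(text: str) -> str:
    """Uppercase the text and replace a few digits with look-alike letters."""
    return text.upper().translate(_LOOKALIKES)


def contains_any(haystack: str, *needles: str) -> bool:
    """True if any needle occurs as a substring of haystack."""
    return any(needle in haystack for needle in needles)


def ccnorm_contains_any(haystack: str, *needles: str) -> bool:
    """Normalise the first argument and every needle, then search."""
    return contains_any(ccnorm(haystack), *(ccnorm(n) for n in needles))


# The three rows of the example table, plus Cyp's "b4r" example.
row1 = ccnorm_contains_any("w1k1p3d14", "wiKiP3D1A", "foo", "bar")
row2 = ccnorm_contains_any("w1k1p3d14", "foo", "bar", "baz")
row3 = ccnorm_contains_any("w1k1p3d14 is 4w3s0me", "bar", "baz", "some")
ducks = ccnorm_contains_any("tw0 ducks w3nt t0 th3 b4r!!!!", "foo", "bar", "baz")
print(row1, row2, row3, ducks)
```

The first argument plays the role of the string being searched (for example added_lines in a real filter), and every later argument is a candidate substring.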
What can edit filters on en.wp know about Commons images?
In a discussion (not on Wikipedia) about images used for vandalism, it was noted that edit filters are a good way of preventing some vandalism. This got me wondering what edit filters here can know about Commons images? Obviously they can know the filename of the image added, but is that it? Can it know what categories it is in? What other pages it is used on? Anything else?
This is not a request for an edit filter, it is a request to learn what an edit filter could theoretically do. Any questions about how it would be used, and whether using it for that is a good idea or not are for much later. Thryduulf (talk) 15:24, 14 October 2017 (UTC)
- As far as I know, local edit filters only get information from the local wiki. So Commons categories wouldn't be included, nor usage. Jo-Jo Eumerus (talk, contributions) 15:28, 14 October 2017 (UTC)
- This. We can also usually know the display size. To explain a bit further, we can detect uploads, and we can detect page edits. When you upload an image locally we can check file size, height, MIME, etc. When you make an edit we know basically nothing about anything linked in that edit, including local files, only the text being added. -- zzuuzz (talk) 16:05, 14 October 2017 (UTC)
Filter 680 (again)
Could someone exempt ✰ (shadowed white star)? The jpop stars demand it. FP: https://en.wikipedia.org/wiki/Special:AbuseLog/19529628
What's the deal with stars tho... --QEDK (愛 • 海) 16:36, 18 October 2017 (UTC)
- @Cyp, MusikAnimal, and Rich Farmbrough: If it's alright. --QEDK (愛 • 海) 05:41, 20 October 2017 (UTC)
- Done All the best: Rich Farmbrough, 11:55, 20 October 2017 (UTC).
Request for EFH bit for Nihlus
The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.
- This EFH nomination has started.
- Nihlus
I'd like to have Edit Filter Helper assigned to Nihlus. They clearly meet the "requirements for granting" #2 through #5. For criterion #1, need, I would like to highlight his work in finding false positives over at WP:EF/FP and the improvements he has suggested (such as the threads above). The Edit Filter Helper page describes one of the groups of editors the right will be useful to as "Those interested in helping with edit filters but who do not meet the thresholds required to be able to modify them", and Nihlus shows a clear interest in helping out with our filters. His contributions to the noticeboard, mailing list, the regex used in his bot (NihlusBOT (talk · contribs · count)) and countless personal conversations with myself (which I can attest to) have shown he has the required competence with syntax and regex. -- There'sNoTime (to explain) 20:57, 20 October 2017 (UTC)
- @Nihlus: do you "want" this? — xaosflux Talk 21:46, 20 October 2017 (UTC)
- @Xaosflux: I believe it will allow me to be more productive while also saving the EFM's time, so I accept the nomination. Nihlus 22:30, 20 October 2017 (UTC)
Slow filters
Community Tech set up a new log to record cases where specific filters take longer than half a second to execute. It's only been running for a few hours, but here are some early results:
Filter | # of times it's taken longer than 500 ms |
---|---|
Filter 12 | 1 |
Filter 80 | 5 |
Filter 149 | 2 |
Filter 180 | 4 |
Filter 755 | 22 |
Filter 765 | 15 |
Filter 777 | 2 |
Filter 871 | 4 |
Clearly, filters 755 and 765 are the most laggy. Looks like MusikAnimal has already disabled 765 :) Anything we can do to optimize 755? Kaldari (talk) 20:03, 25 October 2017 (UTC)
- 755 has been disabled as well. Nihlus 20:21, 25 October 2017 (UTC)
- @Kaldari: Is it feasible to build a "test" feature into the abuse filter extension that takes the most recent 500 edits to the wiki, runs them through the filter (possibly with pauses in between to prevent performance issues, if that's a concern), and returns the average, median, and maximum time the filter took to process for those edits? This may help edit filter managers (especially new ones) to ensure their filters aren't too expensive before enabling them. ~ Rob13Talk 22:39, 25 October 2017 (UTC)
- Way back when we had something like this, but it was ironically removed because it was hurting performance. Times have changed, and I think we might soon get it back. The Anti-Harassment Tools team is currently testing this on ptwiki (phab:T177641) and if all goes well, it hopefully will make its return to enwiki (phab:T177017). As for the data you see above (logging of slow filters), there are plans to surface this within AbuseFilter, too (phab:T176895) — MusikAnimal talk 02:50, 26 October 2017 (UTC)
- Sorry I missed the "test" part. That's a good idea, and probably feasible, since the current batch testing tool essentially does the same thing minus performance measuring. As a workaround, once per-filter profiling is back you could just put your filter in log-only and test it that way — MusikAnimal talk 02:56, 26 October 2017 (UTC)
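As an illustration of the kind of "test" feature suggested in this thread, the following Python sketch times a filter against a batch of edits and reports the mean, median and maximum duration, plus how many runs crossed the 500 ms threshold used in the slow-filter log. Everything here is hypothetical: `profile_filter`, `example_filter` and the sample edits are made-up names and data, not part of the AbuseFilter extension.

```python
import statistics
import time
from typing import Callable, Dict, Iterable


def profile_filter(
    filter_fn: Callable[[str], bool],
    edits: Iterable[str],
    slow_threshold_s: float = 0.5,
) -> Dict[str, float]:
    """Run filter_fn over each edit and summarise how long each run took."""
    durations = []
    for edit in edits:
        start = time.perf_counter()
        filter_fn(edit)
        durations.append(time.perf_counter() - start)
    if not durations:
        raise ValueError("no edits to profile")
    return {
        "edits": len(durations),
        "mean_s": statistics.mean(durations),
        "median_s": statistics.median(durations),
        "max_s": max(durations),
        # Number of runs slower than the threshold (500 ms by default).
        "slow_hits": sum(d > slow_threshold_s for d in durations),
    }


def example_filter(edit_text: str) -> bool:
    """Hypothetical cheap filter: flags edits containing 'foo'."""
    return "foo" in edit_text.lower()


# Hypothetical stand-in for "the most recent 500 edits to the wiki".
sample_edits = [f"edit number {i}" for i in range(500)]
report = profile_filter(example_filter, sample_edits)
print(report)
```

A real tool would also need the pauses between runs mentioned above and would measure against actual recent changes rather than synthetic strings.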
Filter 384 3
This filter seems to be catching a few false positives with words that contain the word bitch in them. Some songs and anime have it in their title. Can we add something like \b([A-z]{2,3}bitch|bitch[A-z]{3,4})\b
to permit false positives such as this and this? That should still disallow words like bitches. Nihlus 21:10, 27 October 2017 (UTC)
- We could just wrap it in \b and that would help. I should mention filters like this one will never be free of false positives. It is simply the nature of the disruption it is designed to prevent, where there will always be some good-faith addition that would trigger it. E.g. many songs outright have the word "bitch" in them. I would first investigate how much vandalism would get through as a result of this proposed change, and also note that each clause added to prevent false positives adds more complexity to the filter and affects its runtime performance. I suspect many new editors are not surprised their edits containing profanity don't go through. Another thing to consider is showing a more informative warning that explains how to request the edit be made on their behalf — MusikAnimal talk 17:50, 30 October 2017 (UTC)
- That might be the preferred method. Perhaps we should guide them to make an edit request with a warning since, as you said, it should make sense to new editors that it would be filtered out. My only hesitation is that spammers would then just go to the talk page and make a mess there, but we can cross that bridge when we get to it. I can work on the message as I think that would be more productive (since more people watch the talk pages of the articles and are more likely to have familiarity with the topic). Nihlus 18:02, 30 October 2017 (UTC)
- @MusikAnimal: See a drafted warning message at my sandbox. I would recommend MediaWiki:Abusefilter-profanity as the sensible choice for its location. However, I don't know if we should expand WP:EDITREQ to include an "edit filter" variant or if a simple request on the talk page would suffice. Nihlus 20:01, 30 October 2017 (UTC)
- MediaWiki:Abusefilter-warning-profanity, you mean, I think. --QEDK (愛 • 海) 18:25, 31 October 2017 (UTC)
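To make the two regex ideas in this thread easier to compare, here is a small Python sketch. The filter's actual pattern is not shown in the discussion, so `BARE` is an assumed, heavily simplified stand-in for it; `WRAPPED` is that stand-in wrapped in \b, and `EXEMPT` is the exemption pattern exactly as proposed above. The test word "xybitch" is invented for illustration. Note also that the range [A-z] covers more than letters: it includes the ASCII characters between "Z" and "a", such as "_" and "^".

```python
import re

# ASSUMPTION: a simplified stand-in for the filter's real (unseen) pattern.
BARE = re.compile(r"bitch", re.IGNORECASE)
# The same stand-in wrapped in word boundaries.
WRAPPED = re.compile(r"\bbitch\b", re.IGNORECASE)
# The exemption pattern as proposed in the thread, copied verbatim.
EXEMPT = re.compile(r"\b([A-z]{2,3}bitch|bitch[A-z]{3,4})\b")


def classify(word: str) -> dict:
    """Report which of the three patterns match the given text."""
    return {
        "bare": bool(BARE.search(word)),
        "wrapped": bool(WRAPPED.search(word)),
        "exempt": bool(EXEMPT.search(word)),
    }


for sample in ("bitch", "bitches", "xybitch"):
    print(sample, classify(sample))

# [A-z] spans ASCII 65 to 122, which includes "_" (95) as well as letters.
underscore_in_range = re.fullmatch(r"[A-z]", "_") is not None
print(underscore_in_range)
```

With these stand-ins, "bitches" is not covered by the exemption (only two letters follow the stem), which matches the intent stated in the proposal, but the \b-wrapped stand-in no longer matches it either, so a real filter would need to handle plurals separately.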
Weird filter trigger
The filter log for this user shows a trigger that I haven't seen before - "CheckUser Sock block" with no filter number shown. Is this normal? Home Lander (talk) 21:55, 30 October 2017 (UTC)
- @Home Lander: That filter is hidden, so the log won't tell you which filter number it hit, but that's the name of the filter. Sam Walton (talk) 22:22, 30 October 2017 (UTC)
- @Samwalton9: OK, should have figured as much. Is there any particular reason why that edit triggered the filter, or is that not publicly releasable either? (Just trying to determine whether triggering of that filter should generally be reported or not.) Home Lander (talk) 22:25, 30 October 2017 (UTC)
- That filter is being used by a specific admin/checkuser - who will already be monitoring it as needed. — xaosflux Talk 23:08, 30 October 2017 (UTC)
874 changed and disallowing again
Pretty evident from the filter + comments as to why. I believe it's a good temporary alternative; hopefully you can agree given the issues at hand. Pinging Crow as the primary editor of the filter before I changed it -- There'sNoTime (to explain) 08:32, 1 November 2017 (UTC)
- I agree with that completely. CrowCaw 16:40, 1 November 2017 (UTC)