Wikidata July 2020

wikidata@lists.wikimedia.org

39 participants
24 discussions

Weekly Summary #338
by Léa Lacroix 10 Oct '20

*Here's your quick overview of what has been happening around Wikidata over the last week.* Discussions - Closed request for comments: Political alliance vs P4100 <https://www.wikidata.org/wiki/Wikidata:Requests_for_comment/Political_allia…>, Changes to P2737 and P2738 <https://www.wikidata.org/wiki/Wikidata:Requests_for_comment/Changes_to_P273…>, Why do we have an item for dogs and another one …

4 4

Differences in label searching with SPARQL and MediaWiki API
by Thad Guidry 07 Aug '20

This query times out:

    SELECT ?item ?label WHERE {
      ?item wdt:P31 ?instance ;
            rdfs:label ?label ;
            rdfs:label ?enLabel .
      FILTER(CONTAINS(lcase(?label), "Soriano")).
      FILTER(?instance != wd:Q5).
      SERVICE wikibase:label {bd:serviceParam wikibase:language "en".}
    }
    LIMIT 100

I have this feeling that it's not actually using an index or even asking the right question and so is slow and times out? However the MediaWiki wbsearchentities API does seem to use an index and is performant for …
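For comparison, here is a minimal Python sketch of how a request to the wbsearchentities module mentioned above could be built. The helper name `build_search_url` and its default values are illustrative assumptions, not something from the thread; the parameter names are those of the standard MediaWiki/Wikibase API. The sketch only constructs the URL and does not send the request.

```python
from urllib.parse import urlencode

# Public Wikidata API endpoint.
WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_search_url(term, language="en", limit=50):
    """Return a wbsearchentities URL for `term` (hypothetical helper)."""
    params = {
        "action": "wbsearchentities",  # Wikibase entity search module
        "search": term,                # text to look up in labels and aliases
        "language": language,          # language to search in
        "type": "item",                # restrict results to items (Q-ids)
        "limit": limit,                # maximum number of results to return
        "format": "json",
    }
    return WIKIDATA_API + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url("Soriano"))
```

Two caveats when comparing the approaches. As far as I know, wbsearchentities matches on labels and aliases rather than on arbitrary substrings, so its results are not guaranteed to equal those of a CONTAINS filter. Also, in the query as posted, lcase(?label) is compared against the capitalised string "Soriano", which a lowercased label can never contain.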

2 7

Inquiry on Linked Data Content Managers
by Brian M. Watson 31 Jul '20

Hello all, I'm writing at the recommendation of Mairelys Lemus-Rojas after I approached her with the below inquiry and exchanged some emails about it. I was wondering if anyone was familiar with a semantic/linked data capable content management system or blog that has autofill or annotation capabilities. What I mean by that is, say I'm writing a blog post about Paris, I'm looking for something that would autofill linked data 'under the hood' by either a dropdown (a la Omeka's Value Suggest &…

7 8

2 million queries against a Wikidata instance
by Adam Sanchez 31 Jul '20

Hi, I have to launch 2 million queries against a Wikidata instance. I have loaded Wikidata in Virtuoso 7 (512 RAM, 32 cores, SSD disks with RAID 0). The queries are simple, just 2 types:

    select ?s ?p ?o { ?s ?p ?o. filter (?s = ?param) }
    select ?s ?p ?o { ?s ?p ?o. filter (?o = ?param) }

If I use a Java ThreadPoolExecutor, it takes 6 hours. How can I speed up the query processing even more? I was thinking: a) to implement a Virtuoso cluster to distribute the queries or b) to load Wikidata …
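One option that is not among those listed in the message is to cut the number of round trips by looking up many parameters per query with a SPARQL VALUES clause. The Python sketch below only builds the query strings; the helper names and the batch size are hypothetical, it handles IRIs only (object literals would need different quoting), and whether this actually helps on the Virtuoso setup described above has not been measured.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_subject_query(iris):
    """One query returning all triples whose subject is any of `iris`."""
    values = " ".join(f"<{iri}>" for iri in iris)
    return f"SELECT ?s ?p ?o WHERE {{ VALUES ?s {{ {values} }} ?s ?p ?o . }}"


def build_object_query(iris):
    """One query returning all triples whose object is any of `iris`."""
    values = " ".join(f"<{iri}>" for iri in iris)
    return f"SELECT ?s ?p ?o WHERE {{ VALUES ?o {{ {values} }} ?s ?p ?o . }}"


if __name__ == "__main__":
    # Illustrative parameters; a real run would read the 2 million IRIs.
    params = [f"http://www.wikidata.org/entity/Q{n}" for n in range(1, 6)]
    for batch in chunked(params, 2):  # batch size of 2 is only for display
        print(build_subject_query(batch))
```

Because each result row still carries `?s` (or `?o`), the rows can be grouped back by parameter on the client side after a batched query returns.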

5 7

Wikimedia Commons Query Service (WCQS)
by Guillaume Lederrey 29 Jul '20

Hello all!

We are happy to announce the availability of Wikimedia Commons Query Service (WCQS): https://wcqs-beta.wmflabs.org/.

This is a beta SPARQL endpoint exposing the Structured Data on Commons (SDoC) dataset. This endpoint can federate with WDQS. More work is needed as we iterate on the service, but feel free to begin using the endpoint. Known limitations are listed below:

* The service is a beta endpoint that is updated via weekly dumps. Some caveats include limited performance, expected downtimes, and no interface, naming, or backward compatibility stability guarantees.
* The service is hosted on Wikimedia Cloud Services, with limited resources and limited monitoring. This means there may be random unplanned downtime. The data will be reloaded weekly from dumps. The service will be down during data reload. With the current amount of SDoC data, downtime will last approximately 4 hours, but this may increase as SDoC data grows.
* Due to an issue with the dump format, the data currently only dates back to July 5th. We’re working on getting more up-to-date data and hope to have a solution soon. (https://phabricator.wikimedia.org/T258507 and https://phabricator.wikimedia.org/T258474)
* The MediaInfo concept URIs (e.g. http://commons.wikimedia.org/entity/M37200540) are currently HTTP; we may change these to HTTPS in the near future. Please comment on T258590 if you have concerns about this change.
* The service is restricted behind OAuth authentication, backed by Commons. You will need an account on Commons to access the service. This is so that we can contact abusive bots and/or users and block them selectively as a last resort if needed.
* Please note that to correctly logout of the service, you need to use the logout link in WCQS - logging out of just Wikimedia Commons will not work for WCQS. This limitation will be lifted once we move to production.
* No documentation on the service is available yet. In particular, no examples are provided yet. You can add your own examples at https://commons.wikimedia.org/wiki/Commons:SPARQL_query_service/queries/exa… following the format at https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples .
* Please use the SPARQL template. Note that while there is currently a bug that doesn’t allow us to change the “Try it!” link endpoint, the examples will be displayed correctly on the WCQS GUI.
* WCQS is a work in progress and some bugs are to be expected, especially related to generalizing WDQS to fit SDoC data. For example, current bugs include:
  * URI prefixes specific for SDoC data don’t yet work - you need to use full URIs if you want to query using them. Relations and Q items are defined by Wikidata’s URI prefixes, so they work correctly.
  * Autocomplete for SDoC items doesn’t work - without prefixes they’d be unusable anyway, but additional work will be required after we inject SDoC URI prefixes into WCQS GUI.
* If you find any additional bugs or issues, please report them via Phabricator with the tag wikidata-query-service.
* We do plan to move the service to production, but we don’t have a timeline on that yet.

We want to emphasize that while we do expect a SPARQL endpoint to be part of a medium to long-term solution, it will only be part of that solution. Even once the service is production-ready, it will still have limitations in terms of timeouts, expensive queries, and federation. Some use cases will need to be migrated, over time, to better solutions - once those solutions exist.

Have fun!

Guillaume

--
Guillaume Lederrey
Engineering Manager, Search Platform
Wikimedia Foundation
UTC+1 / CET
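Since SDoC-specific prefixes do not work yet according to the announcement, queries have to spell out MediaInfo URIs in full. The Python sketch below shows one way to build such a query string. The helper names are hypothetical, and the use of wdt:P180 ("depicts") is an assumption about how SDoC statements are exposed; the announcement only states that Wikidata's own prefixes work. The example M-id is the one quoted in the announcement.

```python
# Base of the MediaInfo concept URIs; currently HTTP per the announcement.
MEDIAINFO_BASE = "http://commons.wikimedia.org/entity/"


def mediainfo_uri(mid):
    """Return the full, angle-bracketed URI for a MediaInfo id such as 'M123'."""
    if not (mid.startswith("M") and mid[1:].isdigit()):
        raise ValueError(f"not a MediaInfo id: {mid!r}")
    return f"<{MEDIAINFO_BASE}{mid}>"


def build_depicts_query(mid):
    """Build a query for what a file depicts (assumes wdt:P180 is available)."""
    return f"SELECT ?depicted WHERE {{ {mediainfo_uri(mid)} wdt:P180 ?depicted . }}"


if __name__ == "__main__":
    print(build_depicts_query("M37200540"))
```

If the concept URIs later move to HTTPS, as the announcement says they may, only `MEDIAINFO_BASE` would need to change in this sketch.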

12 16

Call for anyone interested in Linked Data Content management wiki.js plugin development
by Brian M. Watson 27 Jul '20

Hello! Following up on my previous email request for information on linked data content managers (collected here https://www.wikidata.org/wiki/User:Sj/LDCM ). A number of folks expressed interest in the possibility of working on a plugin for wiki.js that would use JavaScript and such. Additionally, I discovered that the MSUL digital repository created an AJAX script to fetch wiki information for the subject headings in their content manager. MSUL has …

1 0

Weekly Summary #426
by Mohammed Sadat Abdulai 27 Jul '20

Here's your quick overview of what has been happening around Wikidata over the last week. Events <https://www.wikidata.org/wiki/Special:MyLanguage/Wikidata:Events> - The *Wikidata track* of the LD4 conference on Linked Data in Libraries <https://ld42020.sched.com/overview/type/Wikidata> takes place on Thursday 30 and Friday 31 July. Free, upon registration. - Seeking feedback on a possible *WikiCite …

1 0

Re: [Wikidata] WDQS outage - 2020/07/23
by Kingsley Idehen 27 Jul '20

On 7/26/20 1:44 PM, Egon Willighagen wrote: > > it's at the end of the page. Hard to miss, I thought :/ > > SELECT DISTINCT ?government_governmental_jurisdiction_governing_officials ?government_governmental_jurisdiction_governing_officials ?government_government_position_held_office_holder_inverse ?government_government_position_held_appointed_by ?government_government_position_held_basic_title_inverse > WHERE { > VALUES ?…

2 1

WDQS outage - 2020/07/23
by Ryan Kemper 26 Jul '20

Hi all, We experienced WDQS service disruptions on 2020/07/23. As a result there was a full outage (inability to respond to all queries) for a period of several minutes, and a more extended period of intermittently degraded service (inability to respond to a subset of queries) for 1-2 hours. The full incident report is available here: https://wikitech.wikimedia.org/wiki/Incident_documentation/20200723-wdqs-ou… Ultimately, we traced the proximate cause to a …

4 4

Call for Participation--Program for Cooperative Cataloging Wikidata Pilot
by Hilary K Thorsen 24 Jul '20

Greetings everyone,

The PCC<https://www.loc.gov/aba/pcc/>, an international cooperative cataloging effort for library collections, is launching a Wikidata Pilot to further advance the movement toward identity management. Stated broadly in its Strategic Directions document, the PCC hopes to “Accelerate the movement toward ubiquitous identifier creation and identity management at the network level … attain an environment where identity management work activity is characterized by much greater proportions and numbers of entities receiving identifiers … strategic partnerships and collaboration existing among cultural heritage organizations, rights management agencies, Wikidata, and others … collaborate with other identity management communities to facilitate and promote the use of unique identifiers.”

More specifically, this Pilot is anticipated to involve

• Comparing ease of use and benefits of Wikidata to other registries (LCNAF, ISNI)
• Assessing the productivity and quality assurance tools that exist (or should exist)
• Learning about the culture of the Wikidata community

The upcoming Pilot was featured in the LD4 Wikidata Affinity Group meeting of June 16 and more background information and discussion can be found in the presentation recording<https://stanford.zoom.us/rec/share/_eAtNuzb_HNLcK_97GzcBJ95MN2-T6a8hHRI-PYO…>, slides<https://docs.google.com/presentation/d/1NpkAQdGGft1Wi2vX0zgMtIxwXWjPq96NtXx…>, and notes<https://docs.google.com/document/d/1z1SSAp4c4tftOGW3BbJ6Fxfd8oRIhfzveh0zjeb…>.

Participants can choose to experiment in a range of focus areas based on what is of interest to their own institution, sharing their findings without each being required to delve into all the areas that are covered by the pilot. Projects of any size, however small or large, and at any stage of progress are welcome.

The PCC invites interested institutions (both PCC and non-PCC) to participate by completing a short survey<https://forms.gle/5VEHS8sbQbG1JyQa9> describing their project and the issues of interest to them. Initial expressions of interest by the end of July will allow the Pilot to get underway with a kick off meeting in early August. We will solicit firm commitments for ongoing participation at a later date. The pilot is anticipated to last about 12 months.

If you have questions, please write to John Riemer<mailto:jriemer@library.ucla.edu> or Michelle Durocher<mailto:durocher@fas.harvard.edu>.

Hilary Thorsen, on behalf of the PCC Task Group on Identity Management in NACO

Hilary Thorsen
Resource Sharing Librarian
Stanford Libraries
thorsenh(a)stanford.edu
650-285-9429

2 1
