Property talk:P637

From Wikidata

Documentation

Description: Specifies the Protein ID. Use qualifier RefSeq (P656) to indicate Protein or RNA.
Applicable "stated in" value: RefSeq (Q7307074)
Data type: External identifier
Template parameter: en:Template:GNF_Protein_box = Hs_RefseqProtein
Domain:
  According to this template: gene (Q7187)
  According to statements in the property: protein (Q8054) or peptide (Q172847)
  When possible, data should only be stored as statements
Allowed values: [NYXW]P_(\d{6}|\d{9})(\.\d{1,2})?
Examples:
  reelin (Q13561329): NP_005036
  Chromogranin B (Q63398): NP_001810
Format and edit filter validation: Characters + underscore + Digits
Source: http://www.ncbi.nlm.nih.gov
Formatter URL: https://www.ncbi.nlm.nih.gov/protein/$1
Lists
Proposal discussion
Current uses
  Total: 772,428
  Main statement: 771,550 out of 234,520,053 (0% complete), 99.9% of uses
  Qualifier: 2, <0.1% of uses
  Reference: 876, 0.1% of uses
Format “[NYXW]P_(\d{6}|\d{9})(\.\d{1,2})?”: value must be formatted using this pattern (PCRE syntax). (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P637#Format, SPARQL
Type “protein (Q8054), peptide (Q172847)”: item must contain property “instance of (P31), subclass of (P279)” with classes “protein (Q8054), peptide (Q172847)” or their subclasses (defined using subclass of (P279)). (Help)
List of violations of this constraint: Database reports/Constraint violations/P637#Type Q8054, Q172847, hourly updated report, SPARQL
Allowed entity types are Wikibase item (Q29934200): the property may only be used on a certain entity type (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P637#Entity types
Scope is as main value (Q54828448), as reference (Q54828450): the property must be used only in the specified way (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P637#Scope, SPARQL
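
The format constraint above can be checked locally before a value is added. The following is a minimal sketch, assuming that the constraint requires the whole value to match the pattern; the helper name is hypothetical, and the versioned and malformed sample IDs are made up for illustration (only NP_005036 and NP_001810 come from this page).

```python
import re

# Pattern copied from the "Allowed values" entry on this page.
REFSEQ_PROTEIN_ID = re.compile(r"[NYXW]P_(\d{6}|\d{9})(\.\d{1,2})?")


def is_valid_refseq_protein_id(value: str) -> bool:
    """Return True if the whole value matches the format constraint.

    Assumption: the constraint is a full match, so fullmatch() is used
    instead of search() or match().
    """
    return REFSEQ_PROTEIN_ID.fullmatch(value) is not None


if __name__ == "__main__":
    # The first two IDs are the examples listed on this page;
    # the remaining ones are illustrative only.
    for candidate in ["NP_005036", "NP_001810", "NP_005036.1", "NP_12345"]:
        print(candidate, is_valid_refseq_protein_id(candidate))
```

Values that fail this check would be expected to appear in the constraint violation report unless they are listed as exceptions via exception to constraint (P2303).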

Please notify projects that use this property before big changes (renaming, deletion, merge with another property, etc.)

AMR format

Besides the many IDs with the format [NYX]P_(\d{6}|\d{9})(\.\d{1,2})?, there are also some IDs of the form AMR\d+. Are these also acceptable, and should we change the format constraint? --Pasleim (talk) 08:49, 4 October 2016 (UTC)
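
To make the question concrete, here is a rough sketch of what widening the constraint could look like. The widened pattern is only one possible option and is not an agreed change; the sample AMR value and the function name are hypothetical.

```python
import re

# Pattern quoted in the comment above.
QUOTED_PATTERN = re.compile(r"[NYX]P_(\d{6}|\d{9})(\.\d{1,2})?")

# Hypothetical widened pattern: the quoted pattern OR an AMR-style ID.
WIDENED_PATTERN = re.compile(r"[NYX]P_(\d{6}|\d{9})(\.\d{1,2})?|AMR\d+")


def accepts(pattern: re.Pattern, value: str) -> bool:
    """Return True if the whole value matches the given pattern."""
    return pattern.fullmatch(value) is not None


if __name__ == "__main__":
    # "AMR12345" is a made-up value that only illustrates the AMR\d+ shape.
    for value in ["NP_005036", "AMR12345"]:
        print(value, accepts(QUOTED_PATTERN, value), accepts(WIDENED_PATTERN, value))
```

The alternative to widening the pattern would be to leave the constraint unchanged and record the AMR values as exceptions.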

Remove Distinct Value Constraint

Due to NCBI's prokaryotic RefSeq reannotation project, many prokaryotic RefSeq protein IDs are no longer unique. The new type of identifier, the non-redundant RefSeq ID, begins with the WP prefix and indicates that the protein is shared among many strains. To remove redundancy, NCBI now combines identical prokaryotic proteins into a single ID, which unfortunately invalidates the distinct value constraint of this property. For instance, Chlamydia muridarum Str. Nigg uses the new non-redundant RefSeq ID format, and its genes are annotated with non-redundant RefSeq IDs that are assigned to all strains of Chlamydia muridarum (see pmp). In particular, the comment section of the record states: "This record represents a single, non-redundant, protein sequence which may be annotated on many different RefSeq genomes from the same, or different, species." Although it will no longer be possible to query for a distinct protein using this property alone, the same behavior can be retained by combining the query with the tax ID of the organism. I believe that we should stay consistent with NCBI and remove the distinct value constraint from this property so that we can continue to properly annotate new genes with non-redundant RefSeq IDs in Wikidata. Djow2019 (talk) 21:28, 7 August 2018 (UTC)
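
The suggestion to combine the query with the tax ID could look roughly like the sketch below, which only builds a SPARQL query string and does not send it anywhere. It assumes that protein items are linked to their organism via found in taxon (P703) and that the taxon item carries an NCBI taxonomy ID (P685); the function name and the sample values are hypothetical placeholders, not real records.

```python
import re

# Format pattern for this property, as documented on this page.
REFSEQ_PROTEIN_ID = re.compile(r"[NYXW]P_(\d{6}|\d{9})(\.\d{1,2})?")


def build_protein_query(refseq_id: str, tax_id: str) -> str:
    """Build a SPARQL query for one protein in one organism.

    Assumptions (not stated on this page): proteins use P703
    (found in taxon) and taxa use P685 (NCBI taxonomy ID).
    """
    # Validate both inputs so nothing unexpected ends up in the query text.
    if REFSEQ_PROTEIN_ID.fullmatch(refseq_id) is None:
        raise ValueError(f"not a RefSeq protein ID: {refseq_id!r}")
    if re.fullmatch(r"[0-9]+", tax_id) is None:
        raise ValueError(f"not a numeric tax ID: {tax_id!r}")

    return f"""PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?protein WHERE {{
  ?protein wdt:P637 "{refseq_id}" ;
           wdt:P703 ?taxon .
  ?taxon wdt:P685 "{tax_id}" .
}}"""


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(build_protein_query("WP_000000000", "12345"))
```

With the RefSeq ID alone, such a query may return one item per strain; adding the taxon filter narrows it to the item for a single organism.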