Help: About data
Wikidata is a free knowledge base that can be read and edited by humans and machines alike. Wikidata is one of many wiki projects hosted by the Wikimedia Foundation, a not-for-profit organization best known for the Wikipedia project. Each Wikimedia Foundation project has its own purpose: Wikipedia provides encyclopedic content, Wikimedia Commons supports images and other types of media files, and Wiktionary provides lexical information about words, such as definitions and synonyms. Wikidata's own purpose is structured data.
This page is intended as an introduction to the world of structured data. If you are already familiar with structured data and would like to find out more about specific uses of Wikidata, how to access the data in Wikidata, or how to donate your own project's data to Wikidata, please skip to the section on linked data.
Understanding Wikidata
Structured data refers to data that has been organized and stored in a defined way, with the intention of encoding meaning and preserving the relationships between different data points within a dataset.
But what is data? And why should you care about structured data in particular?
Defining data
Big data, data mining, open data, metadata: you have likely heard some or all of these terms before.
Each of these terms has its own distinct meaning, but all were created with a shared understanding of data and its potential to reveal and advance our understanding of the world around us.
As an abstract concept, data can be thought of as the forerunner to information, where information is what can be interpreted or derived from data.
This is because data, when broken down to its core, is really just a set of values about things. Those values can be numerical or quantitative, like measurements or stock quotes. Data can also have qualitative characteristics, like descriptions or observations. For example, we can say that "8,848 m (29,029 ft)" is a data value about height for Mount Everest and "red" is a data value about colour for a car.
As mentioned above, information is not the same as data but the result of collecting and processing data. For example, "8,848" (data) is a number that has no particular meaning by itself, even if we know it is the height of a mountain; we can say "Mount Everest is the highest mountain in the world with a height of 8,848 m" (information) only if we know the standard unit of height and the heights of other mountains. More becomes possible once we make an inference like this, adding insight and knowledge and establishing facts, as long as the data is organized. But we are getting ahead of ourselves here.
Where is data?
Data is everywhere around us. There are many different sources of data, including scientific, biological, and social data. This page also has data in it! For example, there is the total number of words, the date created and the date last modified, a topic or theme, some figure for the number of page views, and whatever words are included on the page.
However, while everything is a potential source of data, data that is not collected and organized does not really exist. Without an underlying structure, the data represented has no meaning and is not useful as data.
By organized, we mean categorized in a standard and unambiguous way. Data that is organized and categorized is known as structured data.
Where is structure?
On the web, structure is king. The great majority of web pages are created with HTML, a markup language that provides the underlying scaffolding, or structure, of a web page.
Markup languages are also used for tagging and describing page content so that search engines, bots, and applications like RSS feeds can easily process and "understand" it. For example, <title> tags tell machines what the name of a website is.
Instead of supporting the structure and common elements of a web page, Wikidata provides structure for all the information stored in Wikipedia and on the other Wikimedia projects. Like any other Wikimedia project, Wikidata is based on the MediaWiki software, extended by Wikibase, the software which powers Wikidata and is designed to manage large amounts of structured data. Structure is not added directly to the content of Wikipedia or other Wikimedia site pages, as in tables or lists, nor do Wikidata users need any knowledge of markup languages, data schemas, object notation, or other special syntax; instead, data is added to and edited in Wikidata through user-friendly input forms.
All of the data stored in Wikidata can be used to automatically generate lists, tables, or other structured pages on any Wikimedia site, as well as anywhere else.
Table 1: Data for mountains
Mountain | Property | Value |
Mount Everest | height | 8,848 m |
K2 | hauteur | 8,611 m |
Kanchenjunga | height | 8,586 m |
Lhotse | height | 27940 ft |
Structuring data
For an example of the importance of structure, let's look at Table 1. In this table we can see data for the four highest mountains on Earth. If we want to know a particular piece of information, such as the height of the second highest mountain in the world, we should be able to look at the provided data and find the correct value. However, only three of the four mountains have their data categorized as a height value, and only two of those three mountains have values in metres. While we know that height and hauteur (French for height) can be understood as equal to each other, and how to convert metres to feet or vice versa, a machine, such as a bot or a computer program, may not.
It would be much easier for both humans and machines to process the information and answer the original question about the second highest mountain when all underlying data is recorded in a similar way even if the presentation differs.
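As a minimal sketch of what recording the data "in a similar way" could look like, the Python snippet below normalizes the rows of Table 1 before answering the question about the second highest mountain. The alias map, the unit table, and the function names are assumptions made for illustration; they are not part of Wikidata.

```python
# Rows copied from Table 1: (mountain, property label, raw value).
ROWS = [
    ("Mount Everest", "height", "8,848 m"),
    ("K2", "hauteur", "8,611 m"),
    ("Kanchenjunga", "height", "8,586 m"),
    ("Lhotse", "height", "27940 ft"),
]

# Hypothetical alias map: every label that names the same property.
PROPERTY_ALIASES = {"height": "height", "hauteur": "height"}

# Conversion factors to metres (1 ft is exactly 0.3048 m).
TO_METRES = {"m": 1.0, "ft": 0.3048}


def normalize(row):
    """Return (mountain, canonical property, value rounded to whole metres)."""
    mountain, prop, raw = row
    number, unit = raw.split()
    metres = float(number.replace(",", "")) * TO_METRES[unit]
    return mountain, PROPERTY_ALIASES[prop], round(metres)


def nth_highest(rows, n):
    """Return the name of the n-th highest mountain (1-based)."""
    ranked = sorted((normalize(r) for r in rows), key=lambda r: r[2], reverse=True)
    return ranked[n - 1][0]


print(nth_highest(ROWS, 2))  # prints "K2"
```

Once every row uses the same property label and the same unit, the ranking is a plain sort; without the alias map and the conversion table, the program would have no way to compare the K2 and Lhotse rows with the others.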
Data model
Collections of structured data, like Wikidata, are organized according to a data model. Data models are machine-readable, meaning they can be understood by a computer. While computers are powerful, they are often not as smart as us when it comes to simple reasoning. For instance, in the example above, a machine would not know that height and hauteur are the same unless it was explicitly told that this was the case.
item: Earth
property: highest point
value: Everest
Table 2: Data for mountains
Mountain | Property | Value |
Mount Everest | continent | Asia |
K2 | continent | Asia |
Kanchenjunga | continent | Asia |
Lhotse | continent | Asia |
Data models vary based on the analysis needs, scope and conceptual framework of the dataset, and the technical requirements of a system. However, all data models typically will specify what kind of data can be supported by a system and what relationships between values can be understood and represented. For example, a data model could specify that height and hauteur be mapped to each other so that both terms represent one concept, or that measurements in feet be automatically converted into metres. The Wikidata data model shapes the way that data can be edited and added to the system by users. It is also a work in progress, with new data types being added to the model over time.
The data model also essentially translates human natural language patterns into something that can be processed by machines. For example, in English we might say:
- "Mount Everest is the highest mountain in the world"
This is also the raw, unstructured format of content currently on Wikipedia and all other Wikimedia sites.
On Wikidata, this would be represented by a statement, which consists of a property-value pair about an item, in this case Earth:
Earth (Q2) (item) → highest point (P610) (property) → Mount Everest (Q513) (value)
Additionally, Wikidata would also hold a statement about the item for Mount Everest (indicating it is a mountain):
Mount Everest (Q513) (item) → instance of (P31) (property) → mountain (Q8502) (value)
Note that because other items can be used as the values for statements, and all items have their own unique page on Wikidata, this means that all items in the system can be linked together through a series of statements. Because Wikidata uses a machine-readable format, this interlinking of data allows new relationships and connections to be discovered and processed by machines. For example, in Table 2 we see new data for our mountains, this time about their geographical location by continent but nothing about their heights. Assuming this continent data was linked to the mountain height data, we would feel more confident making predictions or drawing certain conclusions about it, like saying that Asia is home to the world's highest mountains.
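The interlinking described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Statement` tuple and the `values` and `follow` helpers are hypothetical names, and labels are used in place of Wikidata identifiers to keep the example readable.

```python
from collections import namedtuple

# A statement is a property-value pair attached to an item.
Statement = namedtuple("Statement", ["item", "prop", "value"])

# The two statements from the text plus the continent rows of Table 2.
STATEMENTS = [
    Statement("Earth", "highest point", "Mount Everest"),
    Statement("Mount Everest", "instance of", "mountain"),
    Statement("Mount Everest", "continent", "Asia"),
    Statement("K2", "continent", "Asia"),
    Statement("Kanchenjunga", "continent", "Asia"),
    Statement("Lhotse", "continent", "Asia"),
]


def values(item, prop):
    """Return every value recorded for the given item and property."""
    return [s.value for s in STATEMENTS if s.item == item and s.prop == prop]


def follow(item, *props):
    """Follow a chain of properties, using each value as the next item."""
    current = [item]
    for prop in props:
        current = [v for c in current for v in values(c, prop)]
    return current


# Which continent holds Earth's highest point?
print(follow("Earth", "highest point", "continent"))  # prints ['Asia']
```

Because the value of one statement ("Mount Everest") is itself an item with its own statements, a program can hop from Earth to Asia without either fact mentioning the other.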
Linked data
Besides being a collection of structured data, Wikidata also supports linked data. Linked data refers to the practice of publishing structured data so that it can be interlinked.
For Wikidata this means that volunteer-contributed data can also be linked to other datasets, databases, and data sources from all around the web and from diverse initiatives outside of the Wikimedia family. For example, Wikidata currently allows interlinking with datasets and databases as diverse as Google Books, Canmore (one of the Historic Environment Scotland databases), the Vatican Library, OmegaWiki, and MusicBrainz.
By following the principles and practices of linked data, Wikidata can also support and be used in other projects.
Linked data principles
Wikidata uses unique identifiers, or URIs (uniform resource identifiers), for all items, as per linked data standards.
Although Wikidata uses a unique data model, its content can be exported to RDF, a format used for linking data. In Wikidata terms, a statement consists of an item and a property-value pair. For those who are familiar with linked data concepts, an item can be seen as the subject of a triple, the property represents the triple's predicate, and the value represents the triple's object.
However, Wikidata statements also contain elements beyond subject-predicate-object, such as sources and qualifiers (for more information see Help:Statements). This makes it difficult to fully represent the content of Wikidata using the RDF language. Further information on this matter can be found in the document "Introducing Wikidata to the Linked Data Web".
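To make the mapping concrete, here is a minimal Python sketch that writes the Earth statement as one N-Triples line and reports which parts of the statement do not fit into that single triple. The dictionary layout, the function names, and the placeholder reference are assumptions made for illustration; the two URI prefixes are the ones Wikidata's RDF export uses for entities and for simple ("direct") claims.

```python
ENTITY = "http://www.wikidata.org/entity/"
DIRECT = "http://www.wikidata.org/prop/direct/"

# Hypothetical in-memory form of a statement. The reference entry is a
# placeholder, not real Wikidata content.
STATEMENT = {
    "item": "Q2",        # Earth
    "property": "P610",  # highest point
    "value": "Q513",     # Mount Everest
    "qualifiers": {},
    "references": ["(placeholder for a source)"],
}


def to_ntriple(statement):
    """Serialize item, property, and value as subject, predicate, object."""
    return "<{}{}> <{}{}> <{}{}> .".format(
        ENTITY, statement["item"],
        DIRECT, statement["property"],
        ENTITY, statement["value"],
    )


def dropped_parts(statement):
    """List the non-empty parts of a statement that a single triple cannot carry."""
    return [key for key in ("qualifiers", "references") if statement[key]]


print(to_ntriple(STATEMENT))
print(dropped_parts(STATEMENT))  # prints ['references']
```

A single triple has room for exactly three terms, which is why sources and qualifiers need additional RDF constructs rather than fitting into the basic subject-predicate-object form.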
Donating data
If you have datasets that you would like to donate to Wikidata, see Wikidata:Data donation.
Getting data
Data in Wikidata is published under the Creative Commons Public Domain Dedication 1.0, which permits free reuse of the data. You can copy, modify, distribute, and perform the data, even for commercial purposes, all without asking for permission.
See Data access for details on the various ways to access Wikidata data programmatically.
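As one hedged illustration of programmatic access, the Python sketch below builds a Special:EntityData URL for an item and reads the English label out of the returned JSON. The function names and the User-Agent string are hypothetical, and `SAMPLE` is a hand-written, heavily trimmed stand-in for a real response, so the example runs without a network connection.

```python
import json
import urllib.request

BASE = "https://www.wikidata.org/wiki/Special:EntityData/"


def entity_data_url(item_id):
    """Build the JSON export URL for one item, e.g. Q513."""
    return BASE + item_id + ".json"


def fetch_entity(item_id):
    """Download and decode the JSON for one item (requires network access)."""
    request = urllib.request.Request(
        entity_data_url(item_id),
        headers={"User-Agent": "about-data-example/0.1"},  # hypothetical client name
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def english_label(payload, item_id):
    """Read the English label of an item from a decoded payload."""
    return payload["entities"][item_id]["labels"]["en"]["value"]


# Hand-written, trimmed stand-in for what fetch_entity("Q513") returns.
SAMPLE = {
    "entities": {
        "Q513": {"labels": {"en": {"language": "en", "value": "Mount Everest"}}}
    }
}

print(entity_data_url("Q513"))
print(english_label(SAMPLE, "Q513"))  # prints "Mount Everest"
```

With a network connection, `english_label(fetch_entity("Q513"), "Q513")` would perform the same lookup against live data.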
