The elementary work on the dm4-wikidata-toolkit module is now done and allows for easy adaptation and variation. Variation and revision are in fact two of the many good reasons for posting about this early and releasing things as they are. It gives me time to hold back for a moment, re-focus and reflect on what has been done and learnt. After this infokitchen-dinner I would like to think and talk with you about the following: What if many different parties were to expose varying aspects of the full and public Wikidata as a service (API)? What data covering which aspects of the world is in Wikidata, and which do we want to cultivate there? And ultimately, does this look like an approach worth investigating more deeply?
The dm4-wikidata-toolkit module got started as a branch of the wikidata search mode for DeepaMehta 4 when I started to work with the so-called "wikidata dumps" (complete, point-in-time database exports) instead of performing GET requests against the HTTP API endpoint of wikidata.org. After that switch of methods last December, I decided to spin that branch off into a dedicated plugin which builds solely on top of the work done by Markus Kroetzsch (and others) who developed the Wikidata Toolkit. This plugin integrates the WDTK into the OSGi environment of DeepaMehta 4. For developers it may be noteworthy that DeepaMehta 4 uses Neo4j as its current storage layer and allows for the integration of other Neo4j modules, so our storage layer already has built-in basic support for spatial and time-range queries over the transformed data. Once the data is within DeepaMehta 4, we can quite easily write custom queries (through JAX-RS) and expose them as a REST API endpoint, provided we know a bit of Java and how to traverse a graph.
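To make that a bit more concrete, here is a minimal sketch of what such a JAX-RS endpoint could look like inside a plugin; the resource path, method name and helper are placeholders of my own, not the plugin's actual code:

```java
import java.util.List;

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

// Minimal sketch (not the plugin's actual code) of a JAX-RS resource
// exposing a custom query over the imported data as a REST endpoint.
@Path("/wdtk")
@Produces(MediaType.APPLICATION_JSON)
public class WikidataQueryResource {

    @GET
    @Path("/list/{propertyId}")
    public List<String> itemsUsingProperty(@PathParam("propertyId") String propertyId) {
        // Here the plugin would traverse the graph of imported Wikidata topics
        // (e.g. via the DeepaMehta core service) and return matching items as JSON.
        return fetchItemsUsingProperty(propertyId);
    }

    // Hypothetical helper standing in for the actual graph traversal.
    private List<String> fetchItemsUsingProperty(String propertyId) {
        throw new UnsupportedOperationException("traversal sketch only");
    }
}
```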
In the example running live here, I focused on analyzing and exposing certain aspects of Wikidata to answer the question (among others): Who was employed by the BBC while being a citizen of Germany? The custom endpoint at http://wikidata-topics.beta.wmflabs.org/wdtk/list/ is able to serve JSON topics and inform your application about Cities, Countries, Persons and Institutions stored, described and related to each other in Wikidata. Adapting this and writing another custom endpoint is, with dm4-wikidata-toolkit, a pretty straightforward job. The complete transformation process (analysis, import and mapping) is outlined at the end of this post. The implementation and documentation for the four currently supported custom queries can be found here from line 160 onwards.
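For a quick impression of what consuming such an endpoint looks like, here is a small, self-contained Java client. Only the base path is taken from this post; the concrete query routes and parameters would need to be looked up in the linked source code:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// Fetches the raw JSON response from the public query endpoint.
// Only the base path below is taken from the post; concrete query routes
// (which property, which items) are documented in the linked source.
public class WdtkEndpointExample {

    public static void main(String[] args) throws Exception {
        URL url = new URL("http://wikidata-topics.beta.wmflabs.org/wdtk/list/");
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestProperty("Accept", "application/json");
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), "UTF-8"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line); // raw JSON topics
            }
        }
    }
}
```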
Usage scenarios and research on use-cases
When surfing around the topic of "wikidata query service" I found a few pages proposing "user stories", and such examples (in my experience) always prove very insightful for developers trying to meet concrete needs. As described for Phase 3 of the Wikidata project, which is the current stage, the latest and broadest goal is the deployment and roll-out of listings on Wikipedia pages. The advantage of this for Wikipedians seems to be that these future lists are not going to be manually edited and maintained in n-hundred versions anymore, but maintained in just one place, which then gets integrated into all (language-specific) Wikipedia pages. To further inform yourself about the kinds of questions a Wikidata query endpoint should be able to answer, you might want to read this page.
In plain terms, for the correspondingly transformed Wikidata, our endpoint can currently answer simple queries (all items using a certain property, all items using a certain property involving a specific item) and a query "walking along" two properties (employee of, citizen of) involving two specific items (BBC, Germany) – called a "traversing with" query. A special feature of our approach is that we can also respond with a list of all claims for a given Wikidata propertyId while naming both involved players (items) for the respective claim. One disadvantage of the latter ("claim listing query") is that the result set will be very large and that we (currently) cannot skim or page through it.
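To illustrate the "traversing with" query, here is a rough sketch in plain Java of the two-step walk. The helper methods are hypothetical stand-ins for the graph traversal, and the Wikidata IDs in the comments (P108 "employer", P27 "country of citizenship", Q9531 "BBC", Q183 "Germany") are my own assumptions for illustration, not the plugin's actual implementation:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the "traversing with" query: which items are related to one
// specific item via property A and to another specific item via property B?
// Both helper methods are hypothetical stand-ins for the graph traversal.
public class TraversingWithSketch {

    // e.g. employedByWhileCitizenOf("Q9531" /* BBC */, "Q183" /* Germany */)
    public List<String> employedByWhileCitizenOf(String employerId, String countryId) {
        List<String> result = new ArrayList<>();
        // Step 1: all items claiming P108 ("employer") towards the given employer
        for (String personId : itemsWithClaim("P108", employerId)) {
            // Step 2: keep only those also claiming P27 ("country of citizenship")
            // towards the given country
            if (claimValues(personId, "P27").contains(countryId)) {
                result.add(personId);
            }
        }
        return result;
    }

    // Hypothetical helper: all items having a claim with this property and value.
    private List<String> itemsWithClaim(String propertyId, String valueItemId) {
        throw new UnsupportedOperationException("traversal sketch only");
    }

    // Hypothetical helper: all value items of this item's claims with this property.
    private List<String> claimValues(String itemId, String propertyId) {
        throw new UnsupportedOperationException("traversal sketch only");
    }
}
```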
After searching a bit more I came across the following page, created as research documentation by the developers (Wikibase Indexing), and was a bit struck by how much of its content I could relate to. It was on that page that I first heard about WikiGrok, a few weeks ago. It states that a query service could be valuable for detecting, and thus filling up, niche areas in Wikidata by engaging mobile users through a mobile frontend/web app called WikiGrok.
Having connected the dots between the contents of these wiki pages, I might very well aim the next-but-one release of this plugin at the goals set out on this page. Read on to learn more about what this effort covers.