Matches in Ruben’s data for { ?s <http://schema.org/abstract> ?o }
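Such matches can be reproduced with an ordinary SPARQL query over the data. Below is a minimal sketch using the Comunica query engine; the source URL is a placeholder for wherever the data is published, and the streaming details may differ slightly per Comunica version.

```typescript
// Minimal sketch: evaluating { ?s <http://schema.org/abstract> ?o } with Comunica.
// The source URL is a placeholder; point it at any document, TPF interface,
// or SPARQL endpoint that exposes the data.
import { QueryEngine } from '@comunica/query-sparql';

const engine = new QueryEngine();

async function listAbstracts(source: string): Promise<void> {
  const bindingsStream = await engine.queryBindings(
    'SELECT ?s ?o WHERE { ?s <http://schema.org/abstract> ?o }',
    { sources: [source] },
  );
  const bindings = await bindingsStream.toArray();
  for (const binding of bindings) {
    console.log(binding.get('s')?.value, '->', binding.get('o')?.value);
  }
}

listAbstracts('https://example.org/data').catch(console.error);
```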
- publication abstract "While hyperlinks are absolutely crucial to the Web’s success, they are currently uni-directional, as information is augmented with controls from the perspective of the information publisher. However, it is the user who needs those links to navigate—and the publisher cannot know how any user might want to interact with the information. Therefore, the most relevant links for a user might be omitted, severely limiting the applicability of hypertext. In this paper, I outline a plan to tackle this problem as part of my doctoral research, by explaining the research questions, the underlying hypotheses, and the approach, in which semantic technologies play a crucial role.".
- publication abstract "For publishers of Linked Open Data, providing queryable access to their dataset is costly. Those that offer a public SPARQL end-point often have to sacrifice high availability; others merely provide non-queryable means of access such as data dumps. We have developed a client-side query execution approach for which servers only need to provide a lightweight triple-pattern-based interface, enabling queryable access at low cost. This paper describes the implementation of a client that can evaluate SPARQL queries over such triple pattern fragments of a Linked Data dataset. Graph patterns of SPARQL queries can be solved efficiently by using metadata in server responses. The demonstration consists of SPARQL client for triple pattern fragments that can run as a standalone application, browser application, or library.".
- publication abstract "Proceedings of the ISWC Developers Workshop 2014, co-located with the 13th International Semantic Web Conference (ISWC 2014)".
- publication abstract "Functionality makes APIs unique and therefore helps humans and machines decide what service they need. However, if two APIs offer similar functionality, quality attributes such as performance and ease-of-use might become a decisive factor. Several of these quality attributes are inherently subjective, and hence exist within a social context. These social parameters should be taken into account when creating personalized mashups and service compositions. The Web API description format RESTdesc already captures functionality in an elegant way, so in this paper we will demonstrate how it can be extended to include social parameters. We indicate the role these parameters can play in generating functional compositions that fulfill specified quality attributes. Finally, we show how descriptions can be personalized by exploring a user’s social graph. This ultimately leads to a more focused, on-demand use of Web APIs, driven by functionality and social parameters.".
- publication abstract "The success or failure of Semantic Web services is non-measurable: many different formats exist, none of them standardized, and few to no services actually use them. Instead of trying to retrofit Web APIs to our models, building APIs in a different way makes them usable by generic clients. This paper argues why we should create Web APIs out of reusable building blocks whose functionality is self-descriptive through hypermedia controls. The non-functional aspects of such APIs can be measured on the server and client side, bringing us to a more scientific vision of agents on the Web.".
- publication abstract "“Web 1.0 connected humans with machines. Web 2.0 connected humans with humans. Web 3.0 connects machines with machines.” On the one hand, an incredible amount of valuable data is described by billions of triples, machine-accessible and interconnected thanks to the promises of Linked Data. On the other hand, rest is a scalable, resource-oriented architectural style that, like the Linked Data vision, recognizes the importance of links between resources. Hypermedia APIs are resources, too—albeit dynamic ones—and unfortunately, neither Linked Data principles, nor the rest-implied self-descriptiveness of hypermedia APIs sufficiently describe them to allow for long-envisioned realizations like automatic service discovery and composition. We argue that describing inter-resource links—similarly to what the Linked Data movement has done for data—is the key to machine-driven consumption of APIs. In this paper, we explain how the description format RESTdesc captures the functionality of APIs by explaining the effect of dynamic interactions, effectively complementing the Linked Data vision.".
- publication abstract "To unlock the full potential of Linked Data sources, we need flexible ways to query them. Public SPARQL endpoints aim to fulfill that need, but their availability is notoriously problematic. We therefore introduce Linked Data Fragments, a publishing method that allows efficient offloading of query execution from servers to clients through a lightweight partitioning strategy. It enables servers to maintain availability rates as high as any regular HTTP server, allowing querying to scale reliably to much larger numbers of clients. This paper explains the core concepts behind Linked Data Fragments and experimentally verifies their Web-level scalability, at the cost of increased query times. We show how trading server-side query execution for inexpensive data resources with relevant affordances enables a new generation of intelligent clients.".
- publication abstract "This masters thesis discusses the application of Semantic Web technologies to the automatic metadata, generation and annotation process for multimedia data. It describes the architecture of a generic semantic problem solving platform that uses independent algorithms to accomplish subtasks. A holistic vision on the metadata request harnesses more advanced problems, enabling the algorithms to interact with the solution as it is generated.".
- publication abstract "Publishing legacy data as Linked Data is too technical for most people, so this process needs expensive IT consulting. This high cost, balanced against sometimes underestimated benefits, might hold back governments from publishing Linked Data. To cut back on these costs, we illustrate how people with a non-technical background can use Google Refine to link their data to the Linked Data Cloud. Our straightforward method involves three steps: cleaning, reconciliation, and linking. In tests with several large data sets, over 80% of records could be connected to the Semantic Web in an almost automated process. In the end, the data set is in far better shape to either publish directly on the Web, or to pass to a higher level of processing, which will now be far less expensive. This way, we provide a substantial amount of enhancements that non-experts can perform on their own data, facilitating publication of Linked Data. Our belief is that, given the right tools, Linked Open Government Data can become a realistic goal even under tight schedules and limited budgets.".
- publication abstract "Data is supposed to be the new gold, but how can you unlock the value in your data? Managing large datasets used to be a task for specialists, but you don’t have to worry about inconsistencies or errors anymore. OpenRefine lets you clean, link, and publish your dataset in a breeze. Using OpenRefine takes you on a practical tour of all the handy features of this well-known data transformation tool. It is a hands-on recipe book that teaches you data techniques by example. Starting from the basics, it gradually transforms you into an OpenRefine expert. This book will teach you all the necessary skills to handle any large dataset and to turn it into high-quality data for the Web. After you learn how to analyze data and spot issues, we’ll see how we can solve them to obtain a clean dataset. Messy and inconsistent data is recovered through advanced techniques such as automated clustering. We’ll then show extract links from keyword and full-text fields using reconciliation and named-entity extraction. Using OpenRefine is more than a manual: it’s a guide stuffed with tips and tricks to get the best out of your data.".
- publication abstract "The REST architectural style assumes that client and server form a contract with content negotiation, not only on the data format but implicitly also on the semantics of the communicated data, i.e., an agreement on how the data have to be interpreted. In different application scenarios such an agreement requires vendor-specific content types for the individual services to convey the meaning of the communicated data. The idea behind vendor-specific content types is that service providers can reuse content types and service consumers can make use of specific processors for the individual content types. In practice however, we see that many RESTful APIs on the Web simply make use of standard non-specific content types, e.g., text/xml or application/json. Since the agreement on the semantics is only implicit, programmers developing client applications have to manually gain a deep understanding of several APIs from multiple providers.".
- publication abstract "Multimedia processing algorithms in various domains often communicate with different proprietary protocols and representation formats, lacking a rigorous description. Furthermore, their capabilities and requirements are usually described by an informal textual description. While sufficient for manual and batch execution, these descriptions lack the expressiveness to enable automated invocation. The discovery of relevant algorithms and automated information exchange between them is virtually impossible. This paper presents a mechanism for accessing algorithms as SPARQL endpoints, which provides a formal protocol and representation format. Additionally, we describe algorithms using OWL-S, enabling automated discovery and information exchange. As a result, these algorithms can be applied autonomously in varying contexts. We illustrate our approach by a use case in which algorithms are employed automatically to solve a complex multimedia annotation problem.".
- publication abstract "Web APIs are becoming an increasingly popular alternative to the more heavy-weight Web services. Recently, they also have been used in the context of sensor networks. However, making different Web APIs (and thus sensors) cooperate often requires a significant amount of manual configuration. Ideally, we want Web APIs to behave like Linked Data, where data from different sources can be combined in a straightforward way. Therefore, in this paper, we show how Web APIs, semantically described by the lightweight format RESTdesc, can be composed automatically based on their functionality. Moreover, the composition process does not require specific tools, as compositions are created by generic Semantic Web reasoners as part of a proof. We then indicate how the composition in this proof can be executed. We describe our architecture and implementation, and validate that proof-based composition is a feasible strategy on a Web scale. Our measurements indicate that current reasoners can integrate compositions of more than 200 Web APIs in under one second. This makes proof-based composition a practical choice for today’s Web APIs.".
- publication abstract "It is difficult for publishers to include the right links in documents, because they cannot predict all actions their users might want to perform. Existing adaptive navigation systems can generate relevant links, but doing this on a Web scale is non-trivial, especially if the targets are dynamic actions. As a result, adaptation often happens in a centralized way on a limited or closed document and action set. Distributed affordance is a technology to automatically generate links from any Web resource to matching actions from an open set of Web services, based on semantic annotations. In this paper, we indicate how this technology can be applied to adaptive navigation. We investigate how the generated links can be represented and how their relevance can be guaranteed. Based on that, we conclude that semantic technologies are an enabler to perform adaptive navigation to dynamic actions in a distributed way.".
- publication abstract "A tremendous amount of machine-interpretable information is available in the Linked Open Data Cloud. Unfortunately, much of this data remains underused as machine clients struggle to use the Web. We believe this can be solved by giving machines interfaces similar to those offered to humans, instead of separate interfaces such as SPARQL endpoints. We discuss the Linked Data Fragments vision on machine access to the Web of Data, and indicate how this impacts usage analysis of the LOD Cloud. A lot can be learned from how humans access the Web, and those strategies can be applied to querying and analysis. In particular, we encourage to focus first on solving queries that humans can answer easily, before attempting more difficult challenges.".
- publication abstract "Queryable Linked Data is available through several interfaces, including SPARQL endpoints and Linked Data documents. Recently, the popular DBpedia dataset was made available through a Triple Pattern Fragments interface, which proposes to improve query availability by dividing query execution between clients and servers. In this paper, we present an initial usage analysis of this interface so far. In 4 months time, the server had an availability of 99.999%, handling 4,455,813 requests, more than a quarter of which were served from cache. These numbers provide promising evidence that Triple Pattern Fragments are a viable strategy for live applications on top of public queryable datasets.".
- publication abstract "Linked Data has become an integral part of the Web. Like any other web resource, Linked Data changes over time. Typically, only the most recent version of a Linked Data set can be accessed via Subject-URIs and queried by means of SPARQL. Sometimes, select archived versions are made available for bulk download. This archive access approach is cheap for the publisher but, unfortunately, very expensive for consumers. The entire data dump must be downloaded and ingested into infrastructure that supports subject-URI and/or SPARQL access. Comparing data across different archived versions is even harder. To address this publisher/consumer imbalance, we propose a solution for publication of archived Linked Data that is affordable for publishers and functional for consumers. It consists of two components: a static storage approach for archived Linked Data that exposes a lightweight RDF interface, and the subsequent extension of that interface to versioned data.".
- publication abstract "Autonomous intelligent agents are advanced pieces of software that can consume Web data and services without being preprogrammed for a specific domain. In this paper, we look at the current state of the Web for agents and illustrate how the current diversity in formats and differences between static data and dynamic services limit the possibilities of such agents. We then explain how solutions that strive to provide a united interface to static and dynamic resources provide an answer to this problem. The relevance of current developments in research on semantic descriptions is highlighted. At every point in the discussion, we connect the technology to its impact on communication. Finally, we argue that a strong cooperation between resource providers and developers will be necessary to make the Web for agents emerge.".
- publication abstract "Thousands of APIs exist and their number is growing tremendously, while use cases become increasingly complex. In contrast to the human-oriented part of the Web, Web APIs are designed to be used by machines. This makes issues surrounding their design and integration significantly different from those of the traditional Web. However, this also enables the provisioning of new added-value automated solutions on the Web.".
- publication abstract "The Web is changing at a tremendous speed, and Web APIs play an important part in that. In fact, the number of Web APIs is growing so quickly that we face many challenges. The WS-REST workshop series aim to connect Web researchers and engineers to tackle the issues we are facing.".
- publication abstract "Despite numerous outstanding results, highly complex and specialized multimedia algorithms have not been able to fulfill the promise of fully automated multimedia interpretation. An essential problem is that they are insufficiently aware of the context they operate in. Algorithms that do take a form of context in consideration, often function in a domain-specific environment. The generic framework proposed in this paper stimulates algorithm collaboration on an interpretation task by continuously actualizing the context of the multimedia item under interpretation. Semantic Web knowledge, combined with reasoning methods, forms the corner stone of the integration of these various interacting agents. We believe that this framework will enable an advanced interpretation of multimedia data that goes beyond the capabilities of individual algorithms. A basic platform implementation already indicates the potential of the concept, clearing the path for even more complex interpretation scenarios.".
- publication abstract "Proceedings of the 3rd International Workshop on Geospatial Linked Data and the 2nd Workshop on Querying the Web of Data co-located with 15th Extended Semantic Web Conference (ESWC 2018)".
- publication abstract "As the building industry is rapidly catching up with digital advancements, and Web technologies grow in both maturity and security, a data- and Web-based construction practice comes within reach. In such an environment, private project information and open online data can be combined to allow cross-domain interoperability at data level, using Semantic Web technologies. As construction projects often feature complex and temporary networks of stakeholder firms and their employees, a property-based access control mechanism is necessary to enable a flexible and automated management of distributed building projects. In this article, we propose a method to facilitate such mechanism using existing Web technologies: RDF, SHACL, WebIDs, nanopublications and the Linked Data Platform. The proposed method will be illustrated with an extension of a custom Node.js Solid server. The potential of the Solid ecosystem has been put forward earlier as a basis for a Linked Data-based Common Data Environment: its decentralised setup, connection of both RDF and non-RDF resources and fine-grained access control mechanisms are considered an apt foundation to manage distributed digital twins.".
- componentsjs.readthedocs.io abstract "Components.js is a dependency injection framework for JavaScript applications that allows components to be instantiated and wired together declaratively using semantic configuration files. The advantage of these semantic configuration files is that software components can be uniquely and globally identified using URIs. As an example, this documentation has been made self-instantiatable using Components.js. This makes it possible to print the HTML version of any page to the console, or serve it via HTTP on a local webserver.".
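As a small illustration of the declarative wiring described above, the sketch below instantiates a component from a semantic configuration file with the ComponentsManager; the module path, config path, and instance URI are placeholders, and the exact API should be checked against the Components.js documentation for the version in use.

```typescript
// Sketch of Components.js usage (API as documented for recent releases; verify
// against the version you target). Paths and the instance URI are placeholders.
import { ComponentsManager } from 'componentsjs';

async function instantiateFromConfig(): Promise<unknown> {
  const manager = await ComponentsManager.build({
    mainModulePath: __dirname,                              // where components are resolved from
  });
  await manager.configRegistry.register('./config/config.jsonld'); // declarative wiring file
  // Instances are identified by URIs, so the same component can be
  // referenced unambiguously across configuration files.
  return manager.instantiate('urn:my-app:myInstance');
}

instantiateFromConfig().then((instance) => console.log(instance));
```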
- Article-Live-Open-Data-Interfaces abstract "There are two mechanisms for publishing live changing resources on the Web: a client can pull the latest state of a resource or the server pushes updates to the client. In the state of the art, it is clear that pushing delivers a lower latency compared to pulling; however, this has not been tested for an Open Data usage scenario where 15k clients are not an exception. Also, there are no general guidelines on when to use a polling or push-based approach for publishing live changing resources on the Web. We performed (i) a field report of live Open datasets on the European and U.S. Open Data portal and (ii) a benchmark between HTTP polling and Server-Sent Events (SSE) under a load of 25k clients. In this article, we compare the scalability and latency of updates on the client between polling and pushing. For the scenario where users want to receive an update as fast as possible, we found that SSE excels above polling in three aspects: lower CPU usage on the server, lower latency on the client and more than double the number of clients that can be served. However, considering that users can perceive a certain maximum latency on the client (MAL) of an update acceptable, we describe in this article at which MAL a polling interface can serve a higher number of clients than pushing. Open Data publishers can use these insights to determine which mechanism is the most cost-effective for the usage scenario they foresee of their live updating resources on the Web.".
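To make the two compared mechanisms concrete, here is a minimal client-side sketch of both, using the standard EventSource and fetch APIs; the URLs and the 5-second polling interval are illustrative only.

```typescript
// Push: the server keeps the connection open and sends updates as they happen.
const stream = new EventSource('https://example.org/live/resource/events');
stream.onmessage = (event) => {
  console.log('pushed update:', event.data);
};

// Pull: the client re-requests the resource at a fixed interval.
// With conditional requests (ETag / If-None-Match), unchanged states stay cheap and cacheable.
let etag: string | null = null;
setInterval(async () => {
  const response = await fetch('https://example.org/live/resource', {
    headers: etag ? { 'If-None-Match': etag } : {},
  });
  if (response.status === 200) {
    etag = response.headers.get('ETag');
    console.log('polled update:', await response.text());
  } // 304 Not Modified: nothing changed since the last poll
}, 5000);
```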
- Article-Predicting-traffic-light-phases abstract "Dynamic traffic lights change their current phase duration according to the situation on the intersection, such as crowdedness. In Flanders, only the minimum and maximum duration of the current phase is published. When route planners want to reuse this data they have to predict how long the current phase will take in order to route over these traffic lights. We tested for a live Open Traffic Lights dataset of Antwerp how frequency distributions of phase durations (i) can be used to predict the duration of the current phase and (ii) can be generated client-side on-the-fly with a demonstrator. An overall mean average error (MAE) of 5.1 seconds is reached by using the median for predictions. A distribution is created for every day with time slots of 20 minutes. This result is better than expected, because phase durations can range between a few seconds and over two minutes. When taking the remaining time until phase change into account, we see an MAE of around 10 seconds when the remaining time is less than a minute, which we still deem valuable for route planning. Unfortunately, the MAE grows linearly for phases longer than a minute, making our prediction method useless when this occurs. Based on these results, we wish to present two discussion points during the workshop.".
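A minimal sketch of the prediction idea, keeping a history of observed phase durations per 20-minute time slot and predicting with the median, is shown below; the data shapes are illustrative, and the per-day split mentioned in the abstract is omitted for brevity.

```typescript
// Predict a traffic-light phase duration with the median of past observations
// in the matching 20-minute time slot (illustrative sketch).
type Observation = { start: Date; durationSeconds: number };

function slotOf(date: Date, slotMinutes = 20): number {
  return Math.floor((date.getHours() * 60 + date.getMinutes()) / slotMinutes);
}

function median(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

function predictDuration(history: Observation[], now: Date): number {
  const slot = slotOf(now);
  const durations = history
    .filter((o) => slotOf(o.start) === slot)
    .map((o) => o.durationSeconds);
  return durations.length ? median(durations) : NaN;
}

// Mean absolute error of the predictions against the actual durations.
function meanAbsoluteError(predicted: number[], actual: number[]): number {
  const errors = predicted.map((p, i) => Math.abs(p - actual[i]));
  return errors.reduce((sum, e) => sum + e, 0) / errors.length;
}
```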
- Article-Using-an-existing-website-as-a-queryable-low-cost-LOD-publishing-interface abstract "Maintaining an Open Dataset comes at an extra recurring cost when it is published in a dedicated Web interface. As there is not often a direct financial return from publishing a dataset publicly, these extra costs need to be minimized. Therefore we want to explore reusing existing infrastructure by enriching existing websites with Linked Data. In this demonstrator, we advised the data owner to annotate a digital heritage website with JSON-LD snippets, resulting in a dataset of more than three million triples that is now available and officially maintained. The website itself is paged, and thus Hydra partial collection view controls were added in the snippets. We then extended the modular query engine Comunica to support following page controls and extracting data from HTML documents while querying. This way, a SPARQL or GraphQL query over multiple heterogeneous data sources can power automated data reuse. While the query performance on such an interface is visibly poor, it becomes easy to create composite data dumps. As a result of implementing these building blocks in Comunica, any paged collection and enriched HTML page now becomes queryable by the query engine. This enables heterogeneous data interfaces to share functionality and become technically interoperable.".
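For illustration only, a JSON-LD snippet of the kind described, combining collection members with Hydra partial-collection-view paging controls, could look roughly like the constant below; the IRIs and property choices are placeholders rather than the snippet actually deployed on the heritage website.

```typescript
// Illustrative JSON-LD snippet (shown as a TypeScript constant) of the kind a
// paged collection page could embed; IRIs and property choices are placeholders.
const snippet = {
  '@context': {
    schema: 'http://schema.org/',
    hydra: 'http://www.w3.org/ns/hydra/core#',
  },
  '@id': 'https://example.org/collection',
  '@type': 'hydra:Collection',
  'hydra:member': [
    {
      '@id': 'https://example.org/objects/123',
      '@type': 'schema:CreativeWork',
      'schema:name': 'Example heritage object',
    },
  ],
  'hydra:view': {
    '@id': 'https://example.org/collection?page=3',
    '@type': 'hydra:PartialCollectionView',
    'hydra:previous': { '@id': 'https://example.org/collection?page=2' },
    'hydra:next': { '@id': 'https://example.org/collection?page=4' },
  },
};

// A query engine that understands these controls can follow hydra:next
// across pages while evaluating a SPARQL or GraphQL query.
console.log(JSON.stringify(snippet, null, 2));
```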
- Article-ISWC2018-Demo-GraphQlLD abstract "The Linked Open Data cloud has the potential of significantly enhancing and transforming end-user applications. For example, the use of URIs to identify things allows data joining between separate data sources. Most popular (Web) application frameworks, such as React and Angular have limited support for querying the Web of Linked Data, which leads to a high-entry barrier for Web application developers. Instead, these developers increasingly use the highly popular GraphQL query language for retrieving data from GraphQL APIs, because GraphQL is tightly integrated into these frameworks. In order to lower the barrier for developers towards Linked Data consumption, the Linked Open Data cloud needs to be queryable with GraphQL as well. In this article, we introduce a method for transforming GraphQL queries coupled with a JSON-LD context to SPARQL, and a method for converting SPARQL results to the GraphQL query-compatible response. We demonstrate this method by implementing it into the Comunica framework. This approach brings us one step closer towards widespread Linked Data consumption for application development.".
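The sketch below illustrates the described approach with the graphql-ld library on top of a Comunica-based engine: a JSON-LD context maps GraphQL fields to RDF predicates, and the GraphQL query is compiled to SPARQL behind the scenes. Package names and the client API are assumptions to verify against the graphql-ld documentation.

```typescript
// Sketch of GraphQL-LD usage (package names and API are assumptions; see the
// graphql-ld documentation for the exact, current interface).
import { Client } from 'graphql-ld';
import { QueryEngineComunica } from 'graphql-ld-comunica';

// The JSON-LD context maps GraphQL field names to RDF predicates.
const context = {
  '@context': {
    label: 'http://www.w3.org/2000/01/rdf-schema#label',
    abstract: 'http://schema.org/abstract',
  },
};

const client = new Client({
  context,
  queryEngine: new QueryEngineComunica({ sources: ['https://example.org/data'] }),
});

// This GraphQL query is compiled to SPARQL behind the scenes,
// and the SPARQL results are reshaped into the GraphQL tree below.
client.query({ query: '{ label abstract }' })
  .then(({ data }) => console.log(data));
```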
- Article-ISWC2018-Demo abstract "Linked Data sources can appear in a variety of forms, going from SPARQL endpoints to Triple Pattern Fragments and data dumps. This heterogeneity among Linked Data sources creates an added layer of complexity when querying or combining results from those sources. To ease this problem, we created a modular engine, Comunica, that has modules for evaluating SPARQL queries and supports heterogeneous interfaces. Other modules for other query or source types can easily be added. In this paper we showcase a Web client that uses Comunica to evaluate federated SPARQL queries through automatic source type identification and interaction.".
- Article-ISWC2018-Resource abstract "Query evaluation over Linked Data sources has become a complex story, given the multitude of algorithms and techniques for single- and multi-source querying, as well as the heterogeneity of Web interfaces through which data is published online. Today’s query processors are insufficiently adaptable to test multiple query engine aspects in combination, such as evaluating the performance of a certain join algorithm over a federation of heterogeneous interfaces. The Semantic Web research community is in need of a flexible query engine that allows plugging in new components such as different algorithms, new or experimental SPARQL features, and support for new Web interfaces. We designed and developed a Web-friendly and modular meta query engine called Comunica that meets these specifications. In this article, we introduce this query engine and explain the architectural choices behind its design. We show how its modular nature makes it an ideal research platform for investigating new kinds of Linked Data interfaces and querying algorithms. Comunica facilitates the development, testing, and evaluation of new query processing capabilities, both in isolation and in combination with others.".
- Article-SSWS2020-AMF abstract "Depending on the HTTP interface used for publishing Linked Data, the effort of evaluating a SPARQL query can be redistributed differently between clients and servers. For instance, lower server-side CPU usage can be realized at the expense of higher bandwidth consumption. Previous work has shown that complementing lightweight interfaces such as Triple Pattern Fragments (TPF) with additional metadata can positively impact the performance of clients and servers. Specifically, Approximate Membership Filters (AMFs)—data structures that are small and probabilistic—in the context of TPF were shown to reduce the number of HTTP requests, at the expense of increasing query execution times. In order to mitigate this significant drawback, we have investigated unexplored aspects of AMFs as metadata on TPF interfaces. In this article, we introduce and evaluate alternative approaches for server-side publication and client-side consumption of AMFs within TPF to achieve faster query execution, while maintaining low server-side effort. Our alternative client-side algorithm and the proposed server configurations significantly reduce both the number of HTTP requests and query execution time, with only a small increase in server load, thereby mitigating the major bottleneck of AMFs within TPF. Compared to regular TPF, average query execution is more than 2 times faster and requires only 10% of the number of HTTP requests, at the cost of at most a 10% increase in server load. These findings translate into a set of concrete guidelines for data publishers on how to configure AMF metadata on their servers.".
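The client-side core of this optimization is a cheap, probabilistic membership test before each HTTP request, roughly as in the sketch below; the filter interface is left abstract, and how its bits are obtained from the fragment metadata is not shown.

```typescript
// Sketch: skip an HTTP request when an Approximate Membership Filter (AMF)
// obtained from fragment metadata proves a term cannot occur.
// Parsing the filter out of the metadata is left out of this sketch.
interface ApproximateMembershipFilter {
  // May return false positives, never false negatives.
  mightContain(term: string): boolean;
}

async function resolveTriple(
  fragmentUrl: string,
  subject: string,
  filter: ApproximateMembershipFilter,
): Promise<string | null> {
  if (!filter.mightContain(subject)) {
    // Definitely absent: the request (and a round trip) can be skipped.
    return null;
  }
  // Possibly present: only now pay for the actual HTTP request.
  const response = await fetch(`${fragmentUrl}?subject=${encodeURIComponent(subject)}`);
  return response.text();
}
```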
- dokieli-rww abstract "Decentralising the creation, publication, and annotation of hypertext documents provides authors with a technological guarantee for independence of any publication authority. While the Web was designed as a decentralised environment, individual authors still lack the ability to conveniently author and publish documents, and to engage in social interactions with documents of others in a truly decentralised fashion. We present dokieli, a fully decentralised, browser-based authoring and annotation platform with built-in support for social interactions, through which people retain the ownership of and sovereignty over their data. The resulting “living” documents are interoperable and independent of dokieli since they follow standards and best practices, such as HTML+RDFa for a fine-grained semantic structure, Linked Data Platform for personal data storage, and Linked Data Notifications for updates. This article describes dokieli’s architecture and implementation, demonstrating advanced document authoring and interaction without a single point of control. Such an environment provides the right technological conditions for independent publication of scientific articles, news, and other works that benefit from diverse voices and open interactions.".
- asi.22763 abstract "The concept of Linked Data has made its entrance in the cultural heritage sector due to its potential use for the integration of heterogeneous collections and deriving additional value out of existing metadata. However, practitioners and researchers alike need a better understanding of what outcome they can reasonably expect of the reconciliation process between their local metadata and established controlled vocabularies which are already a part of the Linked Data cloud. This paper offers an in-depth analysis of how a locally developed vocabulary can be successfully reconciled with the Library of Congress Subject Headings (LCSH) and the Arts and Architecture Thesaurus (AAT) through the help of a general-purpose tool for interactive data transformation (Google Refine). Issues negatively affecting the reconciliation process are identified and solutions are proposed in order to get a maximum value from existing metadata and controlled vocabularies in an automated manner.".
- 978-3-030-01379-0_2 abstract "Scholarly publishing enhances the meaning of publications by enriching them with metadata. To achieve that, ad-hoc solutions were established so far for generating Linked Data from scholarly data, entailing non-negligible implementation and maintenance costs. Therefore, even though the same or complementary data may be published by different data owners of scholarly data, existing ad-hoc solutions cannot be reused, whereas general purpose solutions were neglected. In this paper, we propose a Linked Data publishing workflow which can be (i) easily adjusted and thus reused by different data owners to generate and publish Linked Data from their data sources, and (ii) used to align scholarly data repositories with publications content. As a proof-of-concept, the proposed workflow was applied to the iMinds research institute data warehouse, which was aligned with publications derived from Ghent University’s digital repository. Moreover, a user interface, which is easily adjustable and extensible, was developed to help lay users explore semantically annotated data, such as those of the iLastic Linked Data set.".
- 978-3-030-21395-4_12 abstract "Knowledge graphs are often generated using rules that apply semantic annotations to data sources. Software tools then execute these rules and generate or virtualize the corresponding RDF-based knowledge graph. RML is an extension of the W3C-recommended R2RML language, extending support from relational databases to other data sources, such as data in CSV, XML, and JSON format. As part of the R2RML standardization process, a set of test cases was created to assess tool conformance to the specification. In this work, we generated an initial set of reusable test cases to assess RML conformance. These test cases are based on R2RML test cases and can be used by any tool, regardless of the programming language. We tested the conformance of two RML processors: the RMLMapper and CARML. The results show that the RMLMapper passes all CSV, XML, and JSON test cases, and most test cases for relational databases. CARML passes most CSV, XML, and JSON test cases. Developers can determine the degree of conformance of their tools, and users can rely on the conformance results to determine the most suitable tool for their use cases.".
- 978-3-030-32327-1 abstract "This book constitutes the thoroughly refereed post-conference proceedings of the Satellite Events of the 16th Extended Semantic Web Conference, ESWC 2019, held in Portorož, Slovenia, in June 2019. The volume contains 38 poster and demonstration papers, 2 workshop papers, 5 PhD symposium papers, and 3 industry track papers, selected out of a total of 68 submissions. They deal with all areas of Semantic Web research, semantic technologies on the Web and Linked Data.".
- 978-3-030-32327-1_7 abstract "Functions are essential building blocks of any (computer) information system. However, development efforts to implement these functions are fragmented: a function has multiple implementations, each within a specific development context. Manual effort is needed to handle various search interfaces and access methods to find the desired function, its metadata (if any), and associated implementations. This laborious process inhibits discovery, and thus reuse. Uniform, implementation-independent access is needed. We demo the Function Hub, available online at https://fno.io/hub: a Web application using a semantic interoperable model to map function descriptions to (multiple) implementations. The Function Hub allows editing and discovering function description metadata, and adding information about alternative implementations. This way, the Function Hub enables users to discover relevant functions independently of their implementation, and to link to original published implementations.".
- 978-3-030-39296-3_26 abstract "Governments typically store large amounts of personal information on their citizens, such as a home address, marital status, and occupation, to offer public services. Because governments consist of various governmental agencies, multiple copies of this data often exist. This raises concerns regarding data consistency, privacy, and access control, especially under recent legal frameworks such as GDPR. To solve these problems, and to give citizens true control over their data, we explore an approach using the decentralised Solid ecosystem, which enables citizens to maintain their data in personal data pods. We have applied this approach to two high-impact use cases, where citizen information is stored in personal data pods, and both public and private organisations are selectively granted access. Our findings indicate that Solid allows reshaping the relationship between citizens, their personal data, and the applications they use in the public and private sector. We strongly believe that the insights from this Flemish Solid Pilot can speed up the process for public administrations and private organisations that want to put the users in control of their data.".
- 978-3-030-50578-3_21 abstract "One of the guiding principles of open data is that anyone can use the raw data for any purpose. Public transit operators often publish their open data as a single data dump, but developers with limited computational resources may not be able to process all this data. Existing work has already focused on fragmenting the data by departure time, so that data consumers can be more selective in the data they process. However, each fragment still contains data from the entire operator’s service area. We build upon this idea by fragmenting geospatially as well as by departure time. Our method is robust to changes in the original data, such as the deletion or the addition of stops, which is crucial in scenarios where data publishers do not control the data itself. In this paper we explore popular clustering methods such as k-means and METIS, alongside two simple domain-specific methods of our own. We compare the effectiveness of each for the use case of client-side route planning, focusing on the ease of use of the data and the cacheability of the data fragments. Our results show that simply clustering stops by their proximity to 8 transport hubs yields the most promising results: queries are 2.4 times faster and download 4 times less data. More than anything though, our results show that the difference between clustering methods is small, and that engineers can safely choose practical and simple solutions. We expect that this insight also holds true for publishing other geospatial data such as road networks, sensor data, or points of interest.".
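The best-performing method reported here, assigning each stop to its nearest transport hub, is simple enough to sketch; the flat squared-distance measure below is an illustrative simplification of a real geodesic distance. Because the assignment depends only on the hub locations, adding or deleting stops does not disturb the existing fragments, which is the robustness property the abstract highlights.

```typescript
// Illustrative sketch: cluster stops by their nearest hub.
type Point = { id: string; lat: number; lon: number };

// Flat approximation of distance; fine for a sketch, not for production routing.
function squaredDistance(a: Point, b: Point): number {
  return (a.lat - b.lat) ** 2 + (a.lon - b.lon) ** 2;
}

function clusterByNearestHub(stops: Point[], hubs: Point[]): Map<string, Point[]> {
  if (hubs.length === 0) throw new Error('At least one hub is required');
  const clusters = new Map<string, Point[]>(hubs.map((h) => [h.id, []]));
  for (const stop of stops) {
    let nearest = hubs[0];
    for (const hub of hubs) {
      if (squaredDistance(stop, hub) < squaredDistance(stop, nearest)) nearest = hub;
    }
    clusters.get(nearest.id)!.push(stop);
  }
  return clusters;
}
```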
- 978-3-030-59833-4_6 abstract "A key source of revenue for the media and entertainment domain is ad targeting: serving advertisements to a select set of visitors based on various captured visitor traits. Compared to global media companies such as Google and Facebook that aggregate data from various sources (and the privacy concerns these aggregations bring), local companies only capture a small number of (high-quality) traits and retrieve an unbalanced small amount of revenue. To increase these local publishers’ competitive advantage, they need to join forces, whilst taking the visitors’ privacy concerns into account. The EcoDaLo consortium, located in Belgium and consisting of Adlogix, Pebble Media, and Roularta Media Group as founding partners, aims to combine local publishers’ data without requiring these partners to share this data across the consortium. Usage of Semantic Web technologies enables a decentralized approach where federated querying allows local companies to combine their captured visitor traits, and better target visitors, without aggregating all data. To increase potential uptake, technical complexity to join this consortium is kept minimal, and established technology is used where possible. This solution was showcased in Belgium which provided the participating partners valuable insights and suggests future research challenges. Perspectives are to enlarge the consortium and provide measurable impact in ad targeting to local publishers.".
- 978-3-030-62327-2_10 abstract "Rewarding people is common in several contexts, such as human resource management and crowdsourcing applications. However, designing a reward strategy is not straightforward, as it should consider different parameters. These parameters include, for example, rewarding task management and identifying critical features, like the type of rewards and gamification. Towards designing a reward scheme, an ontology can help in the communication among domain experts and management of complex concepts that are applied. Apart from that an ontology can also help in the interrelationship and integration among different reward schemes employed by different service providers. In this paper, we present REWARD, a general-purpose ontology for capturing various common features of diverse reward schemes. This ontology is a result of the CAP-A European project and its application to the crowdsourcing domain, but it is designed to cover different needs and domains.".
- 978-3-030-62466-8_13 abstract "Many Web developers nowadays are trained to build applications with a user-facing browser front-end that obtains predictable data structures from a single, well-known back-end. Linked Data invalidates such assumptions, since data can combine several ontologies and span multiple servers with different APIs. Front-end developers, who specialize in creating end-user experiences rather than back-ends, need an abstraction layer to the Web of Data that matches their existing mindset and workflow. We have developed LDflex, a domain-specific language that exposes common Linked Data access patterns as concise JavaScript expressions. In this article, we describe the design and embedding of the language, and discuss its daily usage within two companies. LDflex succeeds in eliminating a dedicated data layer for common and straightforward data access patterns, without striving to be a replacement for more complex cases. Our experiences reveal that designing a Linked Data developer experience—analogous to a user experience—is crucial for adoption by the target group, who can create Linked Data applications for end users. Crucially, simple abstractions require research to hide the underlying complexity.".
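The kind of expression LDflex enables looks roughly like the sketch below, where a JavaScript path such as person.friends.firstName is resolved to queries behind the scenes; the setup follows the pattern from the LDflex ecosystem (ldflex with an @ldflex/comunica engine), but package names and details should be verified against its documentation.

```typescript
// Sketch of LDflex-style path expressions (setup details are assumptions;
// consult the ldflex / @ldflex/comunica documentation for the exact API).
import { PathFactory } from 'ldflex';
import ComunicaEngine from '@ldflex/comunica';
import { namedNode } from '@rdfjs/data-model';

// Example source and subject; any Linked Data document would work.
const queryEngine = new ComunicaEngine('https://ruben.verborgh.org/profile/');
const context = {
  '@context': {
    friends: 'http://xmlns.com/foaf/0.1/knows',
    firstName: 'http://xmlns.com/foaf/0.1/givenName',
  },
};
const paths = new PathFactory({ context, queryEngine });

const person = paths.create({
  subject: namedNode('https://ruben.verborgh.org/profile/#me'),
});

// Concise JavaScript expressions instead of hand-written SPARQL queries:
(async () => {
  console.log(`${await person.firstName}`);
  for await (const name of person.friends.firstName) console.log(`- ${name}`);
})();
```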
- 978-3-030-74296-6_3 abstract "Fostering interoperability, Public Sector Bodies (PSBs) maintain datasets that should become queryable as an integrated Knowledge Graph (KG). While some PSBs allow querying part of the KG on their servers, others favor publishing data dumps, allowing the querying to happen on third-party servers. As the budget of a PSB to publish their dataset on the Web is finite, PSBs need guidance on what interface to offer first. A core API can be designed that covers the core tasks of Base Registries, which is a well-defined term in Flanders for the management of authoritative datasets. This core API should be the basis on which an ecosystem of data services can be built. In this paper, we introduce the concept of a Linked Data Event Stream (LDES) for datasets like air quality sensors and observations or a registry of officially registered addresses. We show that extra ecosystem requirements can be built on top of the LDES using a generic fragmenter. By using hypermedia for describing the LDES as well as the derived datasets, agents can dynamically discover their best way through the KG, and server administrators can dynamically add or remove functionality based on costs and needs. This way, we allow PSBs to prioritize API functionality based on three tiers: (i) the LDES, (ii) intermediary indexes and (iii) querying interfaces. While the ecosystem will never be feature-complete, based on the market needs, PSBs as well as market players can fill in gaps as requirements evolve.".
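Conceptually, a client consumes such a Linked Data Event Stream by fetching a page, collecting its members, and following the relations the page advertises; the sketch below abstracts a parsed page into a small structure, whereas a real client would interpret the TREE/LDES terms in the RDF of each page.

```typescript
// Rough sketch of an LDES/TREE client loop. The page structure here is an
// abstraction: a real client parses the RDF in each page and interprets
// terms such as tree:relation and tree:node itself.
interface StreamPage {
  members: unknown[];            // the events/versions published on this page
  relations: { node: string }[]; // links to further pages
}

async function replayEventStream(
  entryUrl: string,
  fetchAndParsePage: (url: string) => Promise<StreamPage>,
  onMember: (member: unknown) => void,
): Promise<void> {
  const visited = new Set<string>();
  const queue = [entryUrl];
  while (queue.length > 0) {
    const url = queue.shift()!;
    if (visited.has(url)) continue;  // avoid re-fetching already processed pages
    visited.add(url);
    const page = await fetchAndParsePage(url);
    page.members.forEach(onMember);
    for (const relation of page.relations) queue.push(relation.node);
  }
}
```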
- 978-3-030-74296-6_5 abstract "Text-fields that need to look up specific entities in a dataset can be equipped with autocompletion functionality. When a dataset becomes too large to be embedded in the page, setting up a full-text search API is not the only alternative. Alternate API designs that balance different trade-offs such as archivability, cacheability and privacy, may not require setting up a new back-end architecture. In this paper, we propose to perform prefix search over a fragmentation of the dataset, enabling the client to take part in the query execution by navigating through the fragmented dataset. Our proposal consists of (i) a self-describing fragmentation strategy, (ii) a client search algorithm, and (iii) an evaluation of the proposed solution, based on a small dataset of 73k entities and a large dataset of 3.87M entities. We found that the server cache hit ratio is three times higher compared to a server-side prefix search API, at the cost of a higher bandwidth consumption. Nevertheless, an acceptable user-perceived performance has been measured: assuming 150 ms as an acceptable waiting time between keystrokes, this approach allows 15 entities per prefix to be retrieved in this interval. We conclude that an alternate set of trade-offs has been established for specific prefix search use cases: having added more choice to the spectrum of Web APIs for autocompletion, a file-based approach enables more datasets to afford prefix search.".
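The client-side part of this idea reduces to navigating a prefix-organized fragmentation and finishing the filtering locally; the sketch below assumes, purely for illustration, one JSON fragment per normalized prefix, whereas the actual fragmentation is self-describing and discovered through hypermedia.

```typescript
// Schematic sketch of client-side prefix search over a fragmented dataset.
// The fragment layout (one file per normalized prefix, falling back to shorter
// prefixes when a deeper fragment does not exist) is assumed for illustration.
type Entity = { id: string; label: string };

async function autocomplete(baseUrl: string, prefix: string, limit = 15): Promise<Entity[]> {
  const normalized = prefix.toLowerCase();
  // Try increasingly shallow fragments until one exists on the server.
  for (let length = normalized.length; length > 0; length--) {
    const response = await fetch(`${baseUrl}/${normalized.slice(0, length)}.json`);
    if (!response.ok) continue;              // fragment not published at this depth
    const entities: Entity[] = await response.json();
    // The client finishes the query: filter the fragment on the full prefix.
    return entities
      .filter((e) => e.label.toLowerCase().startsWith(normalized))
      .slice(0, limit);
  }
  return [];
}
```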
- 978-3-030-77385-4 abstract "This book constitutes the refereed proceedings of the 18th Extended Semantic Web Conference, ESWC 2021, held virtually in June 2021. The 41 full papers and 2 short papers presented were carefully reviewed and selected from 167 submissions. The papers were submitted to three tracks: the research track, the resource track and the in-use track. These tracks showcase research and development activities, services and applications, and innovative research outcomes making their way into industry. The research track caters to both long-standing and emerging research topics in the form of the following subtracks: ontologies and reasoning; knowledge graphs (understanding, creating, and exploiting); semantic data management, querying and distributed data; data dynamics, quality, and trust; matching, integration, and fusion; NLP and information retrieval; machine learning; science data and scholarly communication; and problems to solve before you die.".
- 978-3-030-80418-3 abstract "This book constitutes the proceedings of the satellite events held at the 18th Extended Semantic Web Conference, ESWC 2021, in June 2021.".
- 978-3-030-81242-3_15 abstract "Digital applications typically describe their privacy policy in lengthy and vague documents (called PrPs), but these are rarely read by users, who remain unaware of privacy risks associated with the use of these digital applications. Thus, users need to become more aware of digital applications’ policies and, in turn, more content about their choices. To raise privacy awareness, we implemented the CAP-A portal, a crowdsourcing platform which aggregates knowledge as extracted from PrP documents and motivates users in performing privacy-related tasks. The Rewarding Framework is one of the most critical components of the platform. It enhances user motivation and engagement by combining features from existing successful rewarding theories. In this work, we describe this Rewarding Framework, and show how it supports users in increasing their privacy knowledge level by engaging them in privacy-related tasks, such as annotating PrP documents in a crowdsourcing environment. The proposed Rewarding Framework was validated by pilots run in the frame of the European project CAP-A and by a user evaluation focused on its impact in terms of engagement and raising privacy awareness. The results show that the Rewarding Framework improves engagement and motivation, and increases users’ privacy awareness.".
- 978-3-030-91167-6_5 abstract "Link Traversal–based Query Processing (LTQP), in which a SPARQL query is evaluated over a web of documents rather than a single dataset, is often seen as a theoretically interesting yet impractical technique. However, in a time where the hypercentralization of data has increasingly come under scrutiny, a decentralized Web of Data with a simple document-based interface is appealing, as it enables data publishers to control their data and access rights. While LTQP allows evaluating complex queries over such webs, it suffers from performance issues (due to the high number of documents containing data) as well as information quality concerns (due to the many sources providing such documents). In existing LTQP approaches, the burden of finding sources to query is entirely in the hands of the data consumer. In this paper, we argue that to solve these issues, data publishers should also be able to suggest sources of interest and guide the data consumer towards relevant and trustworthy data. We introduce a theoretical framework that enables such guided link traversal and study its properties. We illustrate with a theoretic example that this can improve query results and reduce the number of network requests.".
- 978-3-031-16802-4_11 abstract "This paper addresses the growing dissatisfaction with the established scholarly communication system, which is heavily centralised and vertically integrated. In order to provide an alternative, we assume in this paper a decentralized and decoupled network of data and service nodes that want to capture the digital scholarly record, make it accessible, and preserve it over time. In such a Value-Adding Network, a range of agreements is required for the nodes to interact, discover artifacts and request services for them. The first result of this work, the “Notifications in Value-Adding Networks” specification, details interoperability requirements for the exchange of real-time life-cycle information pertaining to artifacts using Linked Data Notifications. In an experiment, we applied our specification to one particular use-case: distributing Scholix data-literature links to a network of Belgian institutional repositories by a national service node. The results of our experiment confirm the potential of our approach and provide the framework to create a network of interacting nodes that provide the core scholarly functions (registration, certification, awareness and archiving) in a decentralized and decoupled way.".
- 978-3-319-11964-9_12 abstract "As the Web of Data is growing at an ever increasing speed, the lack of reliable query solutions for live public data becomes apparent. SPARQL implementations have matured and deliver impressive performance for public SPARQL endpoints, but poor availability—especially under high loads—prevents their use in real-world applications. We propose to tackle this availability problem with basic Linked Data Fragments, a concept and related techniques to publish and consume queryable data by moving intelligence from the server to the client. This paper formalizes the concept, introduces a client-side query processing algorithm using a dynamic iterator pipeline, and verifies its availability under load. The results indicate that, at the cost of lower performance, query techniques with basic Linked Data Fragments lead to high availability, thereby allowing for reliable applications on top of public, queryable Linked Data.".
- 978-3-319-16462-5_16 abstract "Web resources can be linked directly to their provenance, as specified in W3C PROV-AQ. On its own, this solution places all responsibility on the resource’s publisher, who hopefully maintains and publishes provenance information. In reality, however, most publishers lack incentives to publish the resources’ provenance, even if the authors would like such information to be published. Currently, it is impossible to link existing resources to new provenance information, either provided by the author or a third party. In this paper, we present a solution for this problem, by implementing a lightweight, read/write provenance query service, integrated with a pingback mechanism, following PROV-AQ.".
- 978-3-319-18818-8_29 abstract "Ad-hoc querying is crucial to access information from Linked Data, yet publishing queryable RDF datasets on the Web is not a trivial exercise. The most compelling argument to support this claim is that the Web contains hundreds of thousands of data documents, while only 260 queryable SPARQL endpoints are provided. Even worse, the SPARQL endpoints we do have are often unstable, may not comply with the standards, and may differ in supported features. In other words, hosting data online is easy, but publishing Linked Data via a queryable API such as SPARQL appears to be too difficult. As a consequence, in practice, there is no single uniform way to query the LOD Cloud today. In this paper, we therefore combine a large-scale Linked Data publication project (LOD Laundromat) with a low-cost server-side interface (Triple Pattern Fragments), in order to bridge the gap between the Web of downloadable data documents and the Web of live queryable data. The result is a repeatable, low-cost, open-source data publication process. To demonstrate its applicability, we made over 650,000 data documents available as data APIs, consisting of 30 billion triples.".
- 978-3-319-24258-3_65 abstract "Interest in eLearning environments is constantly increasing, as well as in digital textbooks and gamification. The advantages of gamification in the context of education have been proven. However, gamified educational material, such as digital textbooks and digital systems, is scarce. As an answer to the need for such material, the framework GEL (Gamification for EPUB using Linked Data) has been developed. GEL allows incorporating gamification concepts in a digital textbook, using EPUB 3 and Linked Data. As part of GEL, we created the ontology GO (Gamification Ontology), representing the different gamification concepts, and a JavaScript library. Using GO allows discovering other gamified systems, sharing gamification concepts between applications, and separating the processing and representation of the gamification concepts. Our library is interoperable with any JavaScript-based e-reader, which promotes its reusability.".
- 978-3-319-25518-7_14 abstract "In this paper, we present our solution for the first task of the second edition of the Semantic Publishing Challenge. The task requires extracting and semantically annotating information regarding CEUR-WS workshops, their chairs and conference affiliations, as well as their papers and their authors, from a set of HTML-encoded workshop proceedings volumes. Our solution builds on last year’s submission, while we address a number of shortcomings, assess the generated dataset for its quality and publish the queries as SPARQL query templates. This is accomplished using the RDF Mapping Language (RML) to define the mappings, RMLProcessor to execute them, RDFUnit to both validate the mapping documents and assess the generated dataset’s quality, and The DataTank to publish the SPARQL query templates. This results in an overall improved quality of the generated dataset that is reflected in the query results.".
- 978-3-319-25639-9_13 abstract "Data Catalog Vocabulary (DCAT) is a W3C specification to describe datasets published on the Web. However, these catalogs are not easily discoverable based on a user’s needs. In this paper, we introduce the Node.js module "dcat-merger", which allows a user agent to download and semantically merge different DCAT feeds from the Web into one DCAT feed, which can be republished. Merging the input feeds is followed by enriching them. Besides determining the subjects of the datasets, using DBpedia Spotlight, two extensions were built: one categorizes the datasets according to a taxonomy, and the other adds spatial properties to the datasets. These extensions require the use of information available in DBpedia’s SPARQL endpoint. However, public SPARQL endpoints often suffer from low availability, so a Triple Pattern Fragments alternative is used. Finally, the need for DCAT Merger sparks a discussion about more high-level functionality to improve a catalog’s discoverability.".
- 978-3-319-25639-9_32 abstract "In order to reduce the server-side cost of publishing queryable Linked Data, Triple Pattern Fragments (TPF) were introduced as a simple interface to RDF triples. They allow for SPARQL query execution at low server cost, by partially shifting the load from servers to clients. The previously proposed query execution algorithm provides a solution that is highly inefficient, often requiring a number of HTTP calls that is orders of magnitude larger than the optimal solution. We have proposed a new query execution algorithm with the aim of solving this problem. Our solution significantly improves on the current work by maintaining a complete overview of the query instead of just looking at local optima. In this paper, we describe a demo that allows a user to easily compare the results of both implementations. We show both the query results and the number of executed HTTP calls, providing a clear picture of the difference between the two algorithms.".
- 978-3-319-25639-9_54 abstract "Queryable Linked Data is published through several interfaces, including SPARQL endpoints and Linked Data documents. In October 2014, the DBpedia Association announced an official Triple Pattern Fragments interface to its popular DBpedia dataset. This interface proposes to improve the availability of live queryable data by dividing query execution between clients and servers. In this paper, we present a usage analysis between November 2014 and July 2015. In 9 months time, the interface had an average availability of 99.99%, handling 16,776,170 requests, 43.0% of which were served from cache. These numbers provide promising evidence that low-cost Triple Pattern Fragments interfaces provide a viable strategy for live applications on top of public, queryable datasets.".
- 978-3-319-25639-9_6 abstract "Algorithmic storytelling over Linked Data on the Web is a challenging task in which many graph-based pathfinding approaches experience issues with consistency regarding the resulting path that leads to a story. In order to mitigate arbitrariness and increase consistency, we propose to improve the semantic relatedness of concepts mentioned in a story by increasing the relevance of links between nodes through additional domain delineation and refinement steps. On top of this, we propose the implementation of an optimized algorithm controlling the pathfinding process to obtain a more homogeneous search domain and retrieve more links between adjacent hops in each path. Preliminary results indicate the potential of the proposal.".
- 978-3-319-27036-4_10 abstract "Nowadays, many people use e-books, having high expectations with respect to their reading experience. In the case of digital storytelling, enhanced e-books can connect story entities and emotions to real-world elements. In this paper, we present the novel concept of a Hybrid Book, a generic Interactive Digital Narrative (IDN) artifact that requires seamless collaboration between content and smart devices. To that end, we extract data from a story and broadcast these data in RDF as Linked Data. Smart device services can then receive and process these data in order to execute corresponding actions. By following open standards, a Hybrid Book can also be seen as an interoperable and sustainable IDN artifact. In addition, according to our user-based evaluation, a Hybrid Book makes it possible to provide human sensible feedback while flipping pages, enabling a more enjoyable reading experience. Finally, the participants also showed a positive willingness to pay, thus making it possible to generate more revenue for authors and publishers.".
- 978-3-319-33245-1_10 abstract "Semantic Web reasoning can be a complex task: depending on the amount of data and the ontologies involved, traditional OWL DL reasoners can be too slow to face problems in real time. An alternative is to use a rule-based reasoner together with the OWL RL/RDF rules as stated in the specification of the OWL 2 language profiles. In most cases this approach actually improves reasoning times, but due to the complexity of the rules, not as much as it could. In this paper we present an improved strategy: based on the TBoxes of the ontologies involved in a reasoning task, we create more specific rules which then can be used for further reasoning. We make use of the EYE reasoner and its logic Notation3. In this logic, rules can be employed to derive new rules which makes the rule creation a reasoning step on its own. We evaluate our implementation on a semantic nurse call system. Our results show that adding a pre-reasoning step to produce specialized rules improves reasoning times by around 75%.".
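A minimal sketch of the pre-reasoning idea described above, under two stated assumptions: the rule text is simplified N3-style syntax rather than the paper's exact rules, and the example.org terms are hypothetical. Specialized rules are derived from TBox domain axioms ahead of time, so the generic OWL RL rule no longer has to match arbitrary properties at run time.

```typescript
// Minimal sketch of deriving specialized rules from TBox statements ahead of time.
// Generic OWL RL rule (prp-dom): { ?p rdfs:domain ?c. ?s ?p ?o } => { ?s a ?c }.
// Specialization: substitute each concrete property and class found in the TBox.
// Rule syntax below is simplified N3-style text for illustration only.

interface DomainAxiom { property: string; domainClass: string; }

function specializeDomainRules(tbox: DomainAxiom[]): string[] {
  return tbox.map(
    ({ property, domainClass }) =>
      `{ ?s <${property}> ?o. } => { ?s a <${domainClass}>. }.`
  );
}

// Hypothetical nurse-call TBox axiom; the IRIs are illustrative, not the paper's.
const rules = specializeDomainRules([
  { property: "http://example.org/ns#assignedNurse", domainClass: "http://example.org/ns#NurseCall" },
]);
console.log(rules.join("\n"));
// { ?s <http://example.org/ns#assignedNurse> ?o. } => { ?s a <http://example.org/ns#NurseCall>. }.
```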
- 978-3-319-34129-3_43 abstract "Although several tools have been implemented to generate Linked Data from raw data, users still need to be aware of the underlying technologies and Linked Data principles to use them. Mapping languages make it possible to detach the mapping definitions from the implementation that executes them. However, no thorough research has been conducted on how to facilitate the editing of mappings. We propose the RMLEditor, a visual graph-based user interface, which allows users to easily define the mappings that deliver the RDF representation of the corresponding raw data. Neither knowledge of the underlying mapping language nor the used technologies is required. The RMLEditor aims to facilitate the editing of mappings, and thereby lowers the barriers to create Linked Data. The RMLEditor is developed for use by data specialists who are partners of (i) a companies-driven pilot and (ii) a community group. The current version of the RMLEditor was validated: participants indicate that it is adequate for its purpose and that the graph-based approach enables users to conceive the linked nature of the data.".
- 978-3-319-34129-3_5 abstract "In this paper, we investigate the Normalized Semantic Web Distance (NSWD), a semantics-aware distance measure between two concepts in a knowledge graph. Our measure advances the Normalized Web Distance, a recently established distance between two textual terms, to be more semantically aware. In addition to the theoretic fundamentals of the NSWD, we investigate its properties and qualities with respect to computation and implementation. We investigate three variants of the NSWD that make use of all semantic properties of nodes in a knowledge graph. Our performance evaluation based on the Miller-Charles benchmark shows that the NSWD is able to correlate with human similarity assessments on both Freebase and DBpedia knowledge graphs with values up to 0.69. Moreover, we verified the semantic awareness of the NSWD on a set of 20 unambiguous concept-pairs. We conclude that the NSWD is a promising measure with (1) a reusable implementation across knowledge graphs, (2) sufficient correlation with human assessments, and (3) awareness of semantic differences between ambiguous concepts.".
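For reference, the Normalized Web Distance that the NSWD builds on is usually stated in the following NGD-style form; per the abstract, the NSWD variants replace the raw occurrence counts with counts derived from the semantic features of the two nodes in the knowledge graph, so the symbols below should be read under that assumption.

```latex
% Normalized Web Distance (the measure the NSWD advances), in its usual NGD-style form.
% f(x), f(y): number of items (pages, or graph features for the NSWD) associated with x and y;
% f(x, y): number of items shared by both; N: total number of items considered.
\mathrm{NWD}(x, y) =
  \frac{\max\{\log f(x), \log f(y)\} - \log f(x, y)}
       {\log N - \min\{\log f(x), \log f(y)\}}
```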
- 978-3-319-38791-8_41 abstract "Biodiversity is essential to life on Earth and motivates many efforts to collect data about species. These data are collected in different places and published in different formats. Researchers use them to extract new knowledge about living things, but it is difficult to retrieve, combine and integrate data sources from different places. This work will investigate how to integrate biodiversity information from heterogeneous sources using Semantic Web technologies. Its main objective is to propose an architecture to link biodiversity data using mainly their spatiotemporal dimension, effectively search these linked data sets and test them using real use cases, defined with the help of biodiversity experts. It is also an important objective to propose a suitable provenance model that captures not only data origin but also temporal information. This architecture will be tested on a set of representative data from important Brazilian institutions that are involved in studies of biodiversity.".
- 978-3-319-40349-6_23 abstract "Path-based storytelling with Linked Data on the Web provides users the ability to discover concepts in an entertaining and educational way. Given a query context, many state-of-the-art pathfinding approaches aim at telling a story that coincides with the user’s expectations by investigating paths over Linked Data on the Web. By taking into account serendipity in storytelling, we aim at improving and tailoring existing approaches towards better fitting user expectations so that users are able to discover interesting knowledge without feeling unsure or even lost in the story facts. To this end, we propose to optimize the link estimation between (and the selection of facts in) a story by increasing the consistency and relevancy of links between facts through additional domain delineation and refinement steps. In order to address multiple aspects of serendipity, we propose and investigate combinations of weights and heuristics in paths forming the essential building blocks for each story. Our experimental findings with stories based on DBpedia indicate the improvements when applying the optimized algorithm.".
- 978-3-319-46565-4_18 abstract "Searching for relationships between Linked Data resources is typically interpreted as a pathfinding problem: looking for chains of intermediary nodes (hops) forming the connection or bridge between these resources in a single dataset or across multiple datasets. Linked Open Data, linked datasets available via the Web, introduce challenges for pathfinding algorithms. In many cases centralizing all needed linked data in a certain (specialized) repository or index to be able to run the algorithm is not possible or at least not desired. To address this, we propose an approach to top-k shortest pathfinding, which optimally translates a pathfinding query into sequences of triple pattern fragment requests. Triple Pattern Fragments were recently introduced as a solution to address the availability of data on the Web and the scalability of linked data client applications, preventing data processing bottlenecks on the server. The results are streamed to the client, thus allowing clients to process the top-k shortest paths asynchronously. We explain how this approach behaves using a training dataset with 10 million triples and show the trade-offs compared to a SPARQL approach where all the data is gathered in a single triple store on a single machine.".
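A minimal sketch, not the paper's actual algorithm, of how pathfinding can be driven by on-demand triple pattern requests: a breadth-first search fetches a node's neighbours only when that node is expanded, one request per expansion. It finds a single shortest path over outgoing links only; a top-k variant would keep searching after the first hit. The endpoint URL, query parameters, and JSON fields are assumptions.

```typescript
// Minimal sketch: breadth-first shortest path where neighbours are fetched lazily
// with one triple pattern request per expanded node. Names are illustrative.

async function neighbours(server: string, node: string): Promise<string[]> {
  const url = new URL(server);
  url.searchParams.set("subject", node); // outgoing links only, for brevity
  const response = await fetch(url, { headers: { accept: "application/json" } });
  const fragment = await response.json();
  // Assumes triples are exposed as an array of { subject, predicate, object } objects.
  return (fragment.triples ?? []).map((t: { object: string }) => t.object);
}

async function shortestPath(server: string, from: string, to: string): Promise<string[] | null> {
  const previous = new Map<string, string | null>([[from, null]]);
  let frontier = [from];
  while (frontier.length > 0) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const n of await neighbours(server, node)) {
        if (previous.has(n)) continue;
        previous.set(n, node);
        if (n === to) {
          // Reconstruct the path by walking back through the predecessors.
          const path = [n];
          let cur: string | null = node;
          while (cur !== null) { path.unshift(cur); cur = previous.get(cur) ?? null; }
          return path;
        }
        next.push(n);
      }
    }
    frontier = next;
  }
  return null; // no path within the reachable part of the graph
}
```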
- 978-3-319-47075-7_50 abstract "In order to present and communicate the condition of monitored environments to supervising experts, a dashboard is needed to present the status of all sensors. The heterogeneity and vast number of sensors, as well as the difficulty of creating interesting sensor data combinations, hinder the deployment of fixed-structure dashboards, as they are unable to cope with the accordingly vast number of required mappings. Therefore, in this paper, the development of a dynamic dashboard is presented, able to visualize any particular, user-defined data and sensor composition. By implementing the heterogeneous sensors as semantically annotated Web APIs, a dynamic sensor composition and visualization is enabled. The resulting condition monitoring dashboard provides a clear overview of the system KPIs within acceptable time and provides helpful tools to detect anomalies in system behaviour.".
- 978-3-319-47602-5_10 abstract "Applications built on top of the Semantic Web are emerging as a novel solution in different areas, such as – among others – decision making and route planning. However, to connect results of these solutions – i.e., the semantically annotated data – with real-world applications, this semantic data needs to be connected to actionable events. A lot of work has been done (both semantically and non-semantically) to describe and define Web services, but there is still a gap on a more abstract level, i.e., describing interfaces independent of the technology used. In this paper, we present a data model, specification, and ontology to semantically declare and describe functions independently of the used technology. This way, we can declare and use actionable events in semantic applications, without restricting ourselves to programming language-dependent implementations. The ontology allows for extensions, and is proposed as a possible solution for semantic applications in various domains.".
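A minimal sketch of what such a technology-independent function declaration could look like as a plain data model, separating the abstract function from its concrete implementations; the interfaces and field names below are illustrative and do not reproduce the ontology's actual terms.

```typescript
// Minimal sketch: a function declaration (what it expects and returns) kept separate
// from its implementations (where and in which language the code lives).
// All names are illustrative assumptions, not the ontology's vocabulary.

interface ParameterDescription {
  id: string;            // e.g. an IRI identifying the parameter
  expectedType: string;  // e.g. a datatype or class IRI
  required: boolean;
}

interface FunctionDescription {
  id: string;                        // IRI of the abstract function
  name: string;
  expects: ParameterDescription[];   // inputs, independent of any language
  returns: ParameterDescription[];   // outputs
}

interface ImplementationDescription {
  functionId: string;    // IRI of the FunctionDescription it realizes
  language: string;      // e.g. "JavaScript", "Java"
  location: string;      // package, repository, or endpoint holding the code
}

// An executor can look up an implementation for a declared function at run time,
// so callers depend on the declaration rather than on a specific language binding.
```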
- 978-3-319-47602-5_1 abstract "Traditional RDF stream processing engines work completely server-side, which contributes to a high server cost. For allowing a large number of concurrent clients to do continuous querying, we extend the low-cost Triple Pattern Fragments (TPF) interface with support for time-sensitive queries. In this poster, we give an overview of a client-side RDF stream processing engine on top of TPF. Our experiments show that our solution significantly lowers the server load while increasing the load on the clients. Preliminary results indicate that our solution moves the complexity of continuously evaluating real-time queries from the server to the client, which makes real-time querying much more scalable for a large number of concurrent clients when compared to the alternatives.".
- 978-3-319-47602-5_25 abstract "Linked Data is in many cases generated from (semi-)structured data. This generation is supported by several tools, a number of which use a mapping language to facilitate the Linked Data generation. However, knowledge of this language and other used technologies is required to use the tools, limiting their adoption by non-Semantic Web experts. We demonstrate the RMLEditor: a graphical user interface that utilizes graphs to easily visualize the mappings that deliver the RDF representation of the original data. The required amount of knowledge of the underlying mapping language and the used technologies is kept to a minimum. The RMLEditor lowers the barriers to create Linked Data by aiming to also facilitate the editing of mappings by non-experts.".
- 978-3-319-47602-5_44 abstract "Existing solutions to query dynamic Linked Data sources extend the SPARQL language, and require continuous server processing for each query. Traditional SPARQL endpoints already accept highly expressive queries, so extending these endpoints for time-sensitive queries increases the server cost even further. To make continuous querying over dynamic Linked Data more affordable, we extend the low-cost Triple Pattern Fragments (TPF) interface with support for time-sensitive queries. In this paper, we introduce the TPF Query Streamer that allows clients to evaluate SPARQL queries with continuously updating results. Our experiments indicate that this extension significantly lowers the server complexity, at the expense of an increase in the execution time per query. We prove that by moving the complexity of continuously evaluating queries over dynamic Linked Data to the clients and thus increasing bandwidth usage, the cost at the server side is significantly reduced. Our results show that this solution makes real-time querying more scalable for a large number of concurrent clients when compared to the alternatives.".
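A minimal sketch of the client-side idea in its simplest form, periodic re-evaluation with change detection; the actual TPF Query Streamer relies on time-annotated data to decide when re-evaluation is needed, so the fixed interval and the names below are only illustrative assumptions.

```typescript
// Minimal sketch: periodically re-evaluate a query on the client and report only
// changed results. This approximates continuous querying over a low-cost interface;
// it is not the TPF Query Streamer's actual mechanism.

type Bindings = Record<string, string>;

async function streamQuery(
  evaluate: () => Promise<Bindings[]>,   // e.g. a SPARQL evaluation over TPF
  onChange: (results: Bindings[]) => void,
  intervalMs = 5000
): Promise<void> {
  let previous = "";
  for (;;) {
    const results = await evaluate();
    const serialized = JSON.stringify(results);
    if (serialized !== previous) {       // only notify when the results changed
      previous = serialized;
      onChange(results);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```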
- 978-3-319-58451-5_11 abstract "Data science increasingly employs cloud-based Web application programming interfaces (APIs). However, automatically discovering and connecting suitable APIs for a given application is difficult due to the lack of explicit knowledge about the structure and datatypes of Web API inputs and outputs. To address this challenge, we conducted a survey to identify the metadata elements that are crucial to the description of Web APIs and subsequently developed the smartAPI metadata specification and associated tools to capture their domain-related and structural characteristics using the FAIR (Findable, Accessible, Interoperable, Reusable) principles. This paper presents the results of the survey, provides an overview of the smartAPI specification and a reference implementation, and discusses use cases of smartAPI. We show that annotating APIs with smartAPI metadata is straightforward through an extension of the existing Swagger editor. By facilitating the creation of such metadata, we increase the automated interoperability of Web APIs. This work is done as part of the NIH Commons Big Data to Knowledge (BD2K) API Interoperability Working Group.".
- 978-3-319-58451-5_3 abstract "Mapping languages allow us to define how Linked Data is generated from raw data, but only if the raw data values can be used as is to form the desired Linked Data. Since complex data transformations remain out of scope for mapping languages, these steps are often implemented as custom solutions, or with systems separate from the mapping process. The former data transformations remain case-specific, often coupled with the mapping, whereas the latter are not reusable across systems. In this paper, we propose a methodology where data transformations (i) are defined declaratively and (ii) are aligned with the mapping languages. We employ an alignment of data transformations described using the Function Ontology (FnO) and mapping of data to Linked Data described using the RDF Mapping Language (RML). We validate that our approach can map and transform DBpedia in a declaratively defined and aligned way. Our approach is not case-specific: data transformations are independent of their implementation and thus interoperable, while the functions are decoupled and reusable. This allows developers to improve the generation framework, whilst contributors can focus on the actual Linked Data, as there are no more dependencies, neither between the transformations and the generation framework nor their implementations.".
- 978-3-319-58694-6_1 abstract "The process of extracting, structuring, and organizing knowledge from one or multiple data sources and preparing it for the Semantic Web requires a dedicated class of systems. They enable processing large and originally heterogeneous data sources and capturing new knowledge. Offering existing data as Linked Data increases its shareability, extensibility, and reusability. However, using Linked Data as a means to represent knowledge can be easier said than done. In this tutorial, we elaborate on the importance of semantically annotating data and how existing technologies facilitate their mapping to Linked Data. We introduce [R2]RML languages to generate Linked Data derived from different heterogeneous data sources (databases, XML, JSON, …) from different interfaces (documents, Web APIs, …). Those who are not Semantic Web experts can annotate their data with the RMLEditor, whose user interface hides all underlying Semantic Web technologies from data owners. Lastly, we show how to easily publish Linked Data on the Web as Triple Pattern Fragments. As a result, participants, independently of their knowledge background, can model, annotate and publish data on their own.".
- 978-3-319-58694-6_29 abstract "Linked Datasets typically change over time, and knowledge of this historical information can be useful. This makes the storage and querying of Dynamic Linked Open Data an important area of research. With the current versioning solutions, publishing Dynamic Linked Open Data at Web-Scale is possible, but too expensive. We investigate the possibility of using the low-cost Triple Pattern Fragments (TPF) interface to publish versioned Linked Open Data. In this paper, we discuss requirements for supporting versioning in the TPF framework, on the level of the interface, storage and client, and investigate which trade-offs exist. These requirements lay the foundations for further research in the area of low-cost, Web-Scale dynamic Linked Open Data publication and querying.".
- 978-3-319-60131-1_26 abstract "While some public transit data publishers only provide a data dump – which only a few reusers can afford to integrate within their applications – others provide a use-case-limiting origin-destination route planning API. The Linked Connections framework instead introduces a hypermedia API, over which the extendable base route planning algorithm “Connections Scan Algorithm” can be implemented. We compare the CPU usage and query execution time of a traditional server-side route planner with the CPU time and query execution time of a Linked Connections interface by evaluating query mixes with increasing load. We found that, at the expense of a higher bandwidth consumption, more queries can be answered using the same hardware with the Linked Connections server interface than with an origin-destination API, thanks to an average cache hit rate of 78%. The findings from this research show a cost-efficient way of publishing transport data that can bring federated public transit route planning to the fingertips of anyone.".
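The base algorithm mentioned above can be illustrated with a minimal earliest-arrival version of the Connection Scan Algorithm: scan connections in departure-time order and relax arrival times. The field names and the in-memory connection list are assumptions made for brevity; a Linked Connections client would instead page through connections over HTTP, and the full algorithm also handles transfers and footpaths.

```typescript
// Minimal earliest-arrival sketch of the Connection Scan Algorithm.
// Transfers, footpaths, and journey extraction are deliberately left out.

interface Connection {
  departureStop: string;
  arrivalStop: string;
  departureTime: number; // e.g. seconds since midnight
  arrivalTime: number;
}

function earliestArrival(
  connections: Connection[],   // must be sorted by departureTime
  from: string,
  to: string,
  departAt: number
): number | null {
  const arrival = new Map<string, number>([[from, departAt]]);
  for (const c of connections) {
    const reachableAt = arrival.get(c.departureStop);
    // A connection is usable if we can be at its departure stop before it leaves.
    if (reachableAt !== undefined && reachableAt <= c.departureTime) {
      const best = arrival.get(c.arrivalStop);
      if (best === undefined || c.arrivalTime < best) {
        arrival.set(c.arrivalStop, c.arrivalTime);
      }
    }
  }
  return arrival.get(to) ?? null;
}
```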
- 978-3-319-61252-2_3 abstract "The success of the Semantic Web highly depends on its ingredients. If we want to fully realize the vision of a machine-readable Web, it is crucial that Linked Data are actually useful for machines consuming them. Against this background, it is not surprising that (Linked) Data validation is an ongoing research topic in the community. However, most approaches so far either do not consider reasoning, and thereby miss the chance of detecting implicit constraint violations, or they base themselves on a combination of different formalisms, e.g. Description Logics combined with SPARQL. In this paper, we propose using Rule-Based Web Logics for RDF validation focusing on the concepts needed to support the most common validation constraints, such as Scoped Negation As Failure (SNAF), and the predicates defined in the Rule Interchange Format (RIF). We prove the feasibility of the approach by providing an implementation in Notation3 Logic. As such, we show that rule logic can cover both validation and reasoning if it is expressive enough.".
- 978-3-319-68204-4_28 abstract "DBpedia EF, the generation framework behind one of the Linked Open Data cloud’s central interlinking hubs, has limitations with regard to quality, coverage and sustainability of the generated dataset. DBpedia can be further improved both on schema and data level. Errors and inconsistencies can be addressed by amending (i) the DBpedia EF; (ii) the DBpedia mapping rules; or (iii) Wikipedia, from which it extracts information. However, even though the DBpedia EF and mapping rules are continuously evolving and several changes were applied to both of them, there have been no significant improvements to the DBpedia dataset since its limitations were identified. To address these shortcomings, we propose adapting a different semantic-driven approach that decouples, in a declarative manner, the extraction, transformation and mapping rule execution. In this paper, we discuss the new DBpedia EF, its architecture, its technical implementation, and extraction results. The extraction time remains within the same magnitude, but the resulting extraction process is more sustainable. This way, we achieve an enhanced data generation process that can be broadly adopted, and which improves DBpedia’s quality, coverage, and sustainability.".
- 978-3-319-70407-4_13 abstract "The popularity of digital comic books keeps rising, causing an increase in interest from traditional publishers. Digitizing existing comic books can require much work though, since older comic books were made when digital versions were not taken into account. Additions such as digital panel segmentation and semantic annotation, which increase the discoverability and functionality, were only introduced at a later point in time. To this end, we made ComSem: a tool to support publishers in this task by automating certain steps in the process and making others more accessible. In this paper we present our demo and how it can be used to easily detect comic book panels and annotate them with semantic metadata.".
- 978-3-319-70407-4_14 abstract "Linked Datasets often evolve over time for a variety of reasons. While typical scenarios rely on the latest version only, useful knowledge may still be contained within or between older versions, such as the historical information of biomedical patient data. In order to make this historical information cost-efficiently available on the Web, a low-cost interface is required for providing access to versioned datasets. For our demonstration, we set up a live Triple Pattern Fragments interface for a versioned dataset with queryable access. We explain different version query types of this interface, and how it communicates with a storage solution that can handle these queries efficiently.".
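A minimal sketch of the version query types such an interface can distinguish, using the terminology commonly found in work on versioned triple pattern querying (materializing one version, materializing the differences between two versions, or querying across all versions); the request shape below is an illustrative assumption, not the demo's API.

```typescript
// Minimal sketch of three kinds of versioned triple pattern requests.
// Names and shapes are illustrative assumptions.

type VersionQuery =
  | { kind: "version-materialization"; version: number }        // pattern in one version
  | { kind: "delta-materialization"; from: number; to: number }  // differences between two versions
  | { kind: "version-query" };                                   // pattern across all versions

interface PatternRequest {
  subject?: string;
  predicate?: string;
  object?: string;
  query: VersionQuery;
}
```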
- 978-3-319-70407-4_18 abstract "In 2015, Flanders Information started the OSLO² project, aimed at easing the exchange of data and increasing the interoperability of Belgian government services. RDF ontologies were developed to break apart the government data silos and stimulate data reuse. However, ontology design still encounters a number of difficulties. Since domain experts were generally unfamiliar with RDF, a design process was needed that allowed these experts to understand the model as it was being developed. We designed the ontologies using UML, a modeling language well known within the government, as a single source specification. From this source, the ontology and other relevant documents were generated. This paper describes the conversion tooling and the pragmatic approaches that were taken into account in its design. While this tooling is somewhat focused on the design principles used in the OSLO² project, it can serve as the basis for a generic conversion tool. All source code and documentation are available online.".
- 978-3-319-70407-4_24 abstract "Clients of Triple Pattern Fragments (TPF) interfaces demonstrate how a SPARQL query engine can run within a browser and re-balance the load from the server to the clients. Imagine connecting these browsers using a browser-to-browser connection, sharing bandwidth and CPU. This builds a fog of browsers where end-user devices collaborate to process SPARQL queries over TPF servers. In this demo, we present Ladda: a framework for query execution in a fog of browsers. Thanks to client-side inter-query parallelism, Ladda reduces the makespan of the workload and improves the overall throughput of the system.".
- 978-3-319-70407-4_32 abstract "DBpedia data is largely generated from extracting and parsing the wikitext from the infoboxes of Wikipedia. This generation process is handled by the DBpedia Extraction Framework (DBpedia EF). This framework currently consists of data transformations, a series of custom hard-coded steps which parse the wikitext, and schema transformations, which model the resulting RDF data. Therefore, applying changes to the resulting RDF data requires both Semantic Web expertise and development within the DBpedia EF. As such, the current DBpedia data is being shaped by a small number of core developers. However, by describing both schema and data transformations declaratively, we shape and generate the DBpedia data using solely declarations, splitting the concerns between implementation and modeling. The development of the parsing functions is decoupled from the DBpedia EF, and other data transformation functions can easily be integrated during DBpedia data generation. This demo showcases an interactive Web application that allows non-technical users to (re-)shape the DBpedia data and use external data transformation functions, solely by editing the mapping document via HTML controls.".
- 978-3-319-98192-5_28 abstract "The road to publishing public streaming data on the Web is paved with trade-offs that determine its viability. The cost of unrestricted query answering on top of data streams may not be affordable for all data publishers. Therefore, public streams need to be funded in a sustainable fashion to remain online. In this paper, we introduce an overview of possible query answering features for live time series as multidimensional interfaces. For example, from a live parking availability data stream, pre-calculated time-constrained statistical indicators or geographically classified data can be provided to clients on demand. Furthermore, we demonstrate the initial developments of a Linked Time Series server that supports such features through an extensible modular architecture. Benchmarking the costs associated with each of these features makes it possible to weigh the trade-offs inherent to publishing live time series and establishes the foundations to create a decentralized and sustainable ecosystem for live data streams on the Web.".
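One of the pre-calculated indicator features mentioned above could, for instance, boil down to summarizing samples per fixed time window so that a server can answer coarse-grained requests from a cheap materialized view; the names and the choice of statistics in this minimal sketch are assumptions, not the server's actual feature set.

```typescript
// Minimal sketch: aggregate a live time series (e.g. parking availability samples)
// into per-window summaries that can be served to clients on demand.

interface Sample { time: number; value: number; }  // e.g. epoch millis, free spots
interface WindowSummary { windowStart: number; mean: number; min: number; max: number; count: number; }

function summarize(samples: Sample[], windowMs: number): WindowSummary[] {
  const byWindow = new Map<number, Sample[]>();
  for (const s of samples) {
    const windowStart = Math.floor(s.time / windowMs) * windowMs;
    const bucket: Sample[] = byWindow.get(windowStart) ?? [];
    bucket.push(s);
    byWindow.set(windowStart, bucket);
  }
  return [...byWindow.entries()]
    .sort(([a], [b]) => a - b)
    .map(([windowStart, bucket]) => {
      const values = bucket.map((s) => s.value);
      return {
        windowStart,
        mean: values.reduce((sum, v) => sum + v, 0) / values.length,
        min: Math.min(...values),
        max: Math.max(...values),
        count: values.length,
      };
    });
}
```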
- 978-3-319-98192-5_40 abstract "Linked Data is often generated based on a set of declarative rules using languages such as R2RML and RML. These languages are built with machine-processability in mind. It is thus not always straightforward for users to define or understand rules written in these languages, preventing them from applying the desired annotations to the data sources. In the past, graphical tools were proposed. However, next to users who prefer a graphical approach, there are users who desire to understand and define rules via a text-based approach. For the latter, we introduce an enhancement to their workflow. Instead of requiring users to manually write machine-processable rules, we propose writing human-friendly rules, and generate machine-processable rules based on those human-friendly rules. At the basis is YARRRML: a human-readable text-based representation for declarative generation rules. We propose a novel browser-based integrated development environment called “Matey”, showcasing the enhanced workflow. In this work, we describe our demo. Users can experience first hand how to generate triples from data in different formats by using YARRRML’s representation of the rules. The actual machine-processable rules remain completely hidden when editing. Matey shows that writing human-friendly rules enhances the workflow for a broader range of users. As a result, more desired annotations will be added to the data sources which leads to more desired Linked Data.".
- 978-3-319-98192-5_8 abstract "Data management increasingly demands transparency with respect to data processing. Various stakeholders need information tailored to their needs, e.g. data management plans (DMP) for funding agencies or privacy policies for the public. DMPs and privacy policies are just two examples of documents describing aspects of data processing. Dedicated tools to create both already exist. However, creating each of them manually or semi-automatically remains a repetitive and cognitively challenging task. We propose a data-driven approach that semantically represents the data processing itself as workflows and serves as a base for different kinds of result-sets, generated with SPARQL, i.e. DMPs. Our approach is threefold: (i) users with domain knowledge semantically represent workflow components; (ii) other users can reuse these components to describe their data processing via semantically enhanced workflows; and, based on the semantic workflows, (iii) result-sets are automatically generated on-demand with SPARQL queries. This paper demonstrates our tool that implements the proposed approach, based on a use-case of a researcher who needs to provide a DMP to a funding agency to approve a proposed research project.".
- 978-3-662-46641-4 abstract "If we want automated agents to consume the Web, they need to understand what a certain service does and how it relates to other services and data. The shortcoming of existing service description paradigms is their focus on technical aspects instead of the functional aspect—what task does a service perform, and is this a match for my needs? This paper summarizes our recent work on RESTdesc, a semantic service description approach that centers on functionality. It has a solid foundation in logics, which enables advanced service matching and composition, while providing elegant and concise descriptions, responding to the demands of automated clients on the future Web of Agents.".
- 978-3-662-46641-4_20 abstract "In an often retweeted Twitter post, entrepreneur and software architect Inge Henriksen described the relation of Web 1.0 to Web 3.0 as: “Web 1.0 connected humans with machines. Web 2.0 connected humans with humans. Web 3.0 connects machines with machines.” On the one hand, an incredible amount of valuable data is described by billions of triples, machine-accessible and interconnected thanks to the promises of Linked Data. On the other hand, REST is a scalable, resource-oriented architectural style that, like the Linked Data vision, recognizes the importance of links between resources. Hypermedia APIs are resources, too—albeit dynamic ones—and unfortunately, neither Linked Data principles, nor the REST-implied self-descriptiveness of hypermedia APIs sufficiently describe them to allow for long-envisioned realizations like automatic service discovery and composition. We argue that describing inter-resource links—similarly to what the Linked Data movement has done for data—is the key to machine-driven consumption of APIs. In this paper, we explain how the description format RESTdesc captures the functionality of APIs by explaining the effect of dynamic interactions, effectively complementing the Linked Data vision.".
- s00799-014-0136-9 abstract "In this article, we present PREMIS OWL. This is a semantic formalisation of the PREMIS 2.2 data dictionary of the Library of Congress. PREMIS 2.2 are metadata implementation guidelines for digitally archiving information for the long term. Nowadays, the need for digital preservation is growing. A lot of the digital information produced merely a decade ago is in danger of getting lost as technologies are changing and getting obsolete. This also threatens a lot of information from heritage institutions. PREMIS OWL is a semantic long-term preservation schema. Preservation metadata are actually a mixture of provenance information, technical information on the digital objects to be preserved and rights information. PREMIS OWL is an OWL schema that can be used as data model supporting digital archives. It can be used for dissemination of the preservation metadata as Linked Open Data on the Web and, at the same time, for supporting semantic web technologies in the preservation processes. The model incorporates 24 preservation vocabularies, published by the LOC as SKOS vocabularies. Via these vocabularies, PREMIS descriptions from different institutions become highly interoperable. The schema is approved and now managed by the Library of Congress. The PREMIS OWL schema is published at http://www.loc.gov/premis/rdf/v1.".
- s10619-017-7211-3 abstract "Fast, massive, and viral data diffused on social media affects a large share of the online population, and thus, the (prospective) information diffusion mechanisms behind it are of great interest to researchers. The (retrospective) provenance of such data is equally important because it contributes to the understanding of the relevance and trustworthiness of the information. Furthermore, computing provenance in a timely way is crucial for particular use cases and practitioners, such as online journalists that promptly need to assess particular pieces of information. Social media currently provide insufficient mechanisms for provenance tracking, publication and generation, while state-of-the-art on social media research focuses mainly on explicit diffusion mechanisms (like retweets in Twitter or reshares in Facebook). The implicit diffusion mechanisms remain understudied due to the difficulties of being captured and properly understood. From a technical side, the state of the art for provenance reconstruction evaluates small datasets after the fact, sidestepping requirements for scale and speed of current social media data. In this paper, we investigate the mechanisms of implicit information diffusion by computing its fine-grained provenance. We prove that explicit mechanisms are insufficient to capture influence and our analysis unravels a significant part of implicit interactions and influence in social media. Our approach works incrementally and can be scaled up to cover a truly Web-scale scenario like major events. The results show that (on a single machine) we can process datasets consisting of up to several millions of messages at rates that cover bursty behaviour, without compromising result quality. By doing that, we provide to online journalists and social media users in general, fine-grained provenance reconstruction which sheds light on implicit interactions not captured by social media providers. These results are provided in an online fashion which also allows for fast relevance and trustworthiness assessment.".
- s11042-010-0709-6 abstract "Automatic generation of metadata, facilitating the retrieval of multimedia items, potentially saves large amounts of manual work. However, the high specialization degree of feature extraction algorithms makes them unaware of the context they operate in, which contains valuable and often necessary information. In this paper, we show how Semantic Web technologies can provide a context that algorithms can interact with. We propose a generic problem-solving platform that uses Web services and various knowledge sources to find solutions to complex requests. The platform employs a reasoner-based composition algorithm, generating an execution plan that combines several algorithms as services. It then supervises the execution of this plan, intervening in case of errors or unexpected behavior. We illustrate our approach by a use case in which we annotate the names of people depicted in a photograph.".
- s11042-014-2445-9 abstract "Due to the ubiquitous Web-connectivity and portable multimedia devices, it has never been so easy to produce and distribute new multimedia resources such as videos, photos, and audio. This ever increasing production leads to an information overload for consumers, which calls for efficient multimedia retrieval techniques. Multimedia can be efficiently retrieved using its metadata, but the multimedia analysis methods that can automatically generate this metadata are currently not reliable enough for highly diverse multimedia content. A reliable and automatic method for analyzing general multimedia content is needed. We introduce a domain-agnostic framework that annotates multimedia resources using currently available multimedia analysis methods. By using a three-step reasoning cycle, this framework can assess and improve the quality of multimedia analysis results, by consecutively (1) combining analysis results effectively, (2) predicting which results might need improvement, and (3) invoking compatible analysis methods to retrieve new results. By using semantic descriptions for the Web services that wrap the multimedia analysis methods, compatible services can be automatically selected. By using additional semantic reasoning on these semantic descriptions, the different services can be repurposed across different use cases. We evaluated this problem-agnostic framework in the context of video face detection, and showed that it is capable of providing the best analysis results regardless of the input video. The proposed methodology can serve as a basis to build a generic multimedia annotation platform, which returns reliable results for diverse multimedia analysis problems. This allows for better metadata generation, and improves the efficient retrieval of multimedia resources.".
- s11761-018-0234-4 abstract "Over the last decade, Web services composition has become a thriving area of research and development endeavors for application integration and interoperability. Although Web services composition has been heavily investigated, several issues still need to be addressed. In this paper, we mainly discuss two major bottlenecks in the current process of modeling compositions. The first bottleneck is related to the level of expertise required to achieve a composition process. Typical procedural styles of modeling, inspired by the workflow/business process paradigm, do not provide the required abstractions. Therefore, they fail to support dynamic and self-managed compositions able to adapt to unpredictable changes. The second bottleneck in current service compositions concerns their life cycle and their management, also called their governance. In this context, we propose a declarative proof-based approach to Web service composition. Based on the three stages of pre-composition, abstraction, and composition, our solution provides an easy way to specify functional and non-functional requirements of composite services in a precise and declarative manner. It guides the user through the composition process while allowing detection and recovery of violations at both design and run-time using proofs and planning. Experiment results clearly show the added value of the proof-based solution as a viable strategy to improve the composition process.".
- j.future.2019.10.006 abstract "Functions are essential building blocks of information retrieval and information management. However, efforts implementing these functions are fragmented: one function has multiple implementations, within specific development contexts. This inhibits reuse: metadata of functions and associated implementations need to be found across various search interfaces, and implementation integration requires human interpretation and manual adjustments. An approach is needed, independent of development context and enabling description and exploration of functions and (automatic) instantiation of associated implementations. In this paper, after collecting scenarios and deriving corresponding requirements, we (i) propose an approach that facilitates functions’ description, publication, and exploration by modeling and publishing abstract function descriptions and their links to concrete implementations; and (ii) enable implementations’ automatic instantiation by exploiting those published descriptions. This way, we can link to existing implementations, and provide a uniform detailed search interface across development contexts. The proposed model (the Function Ontology) and the publication method following the Linked Data principles using standards, are deemed sufficient for this task, and are extensible to new development contexts. The proposed set of tools (the Function Hub and Function Handler) are shown to fulfill the collected requirements, and the user evaluation proves them being perceived as a valuable asset during software retrieval. Our work thus improves developer experience for function exploration and implementation instantiation.".
- j.jbi.2017.05.006 abstract "The volume and diversity of data in biomedical research has been rapidly increasing in recent years. While such data hold significant promise for accelerating discovery, their use entails many challenges including: the need for adequate computational infrastructure, secure processes for data sharing and access, tools that allow researchers to find and integrate diverse datasets, and standardized methods of analysis. These are just some elements of a complex ecosystem that needs to be built to support the rapid accumulation of these data. The NIH Big Data to Knowledge (BD2K) initiative aims to facilitate digitally enabled biomedical research. Within the BD2K framework, the Commons initiative is intended to establish a virtual environment that will facilitate the use, interoperability, and discoverability of shared digital objects used for research. The BD2K Commons Framework Pilots Working Group (CFPWG) was established to clarify goals and work on pilot projects that would address existing gaps toward realizing the vision of the BD2K Commons. This report reviews highlights from a two-day meeting involving the BD2K CFPWG to provide insights on trends and considerations in advancing Big Data science for biomedical research in the United States.".
- j.websem.2016.03.003 abstract "Billions of Linked Data triples exist in thousands of RDF knowledge graphs on the Web, but few of those graphs can be queried live from Web applications. Only a limited number of knowledge graphs are available in a queryable interface, and existing interfaces can be expensive to host at high availability. To mitigate this shortage of live queryable Linked Data, we designed a low-cost Triple Pattern Fragments interface for servers, and a client-side algorithm that evaluates SPARQL queries against this interface. This article describes the Linked Data Fragments framework to analyze Web interfaces to Linked Data and uses this framework as a basis to define Triple Pattern Fragments. We describe client-side querying for single knowledge graphs and federations thereof. Our evaluation verifies that this technique reduces server load and increases caching effectiveness, which leads to lower costs to maintain high server availability. These benefits come at the expense of increased bandwidth and slower, but more stable query execution times. These results substantiate the claim that lightweight interfaces can lower the cost for knowledge publishers compared to more expressive endpoints, while enabling applications to query the publishers’ data with the necessary reliability.".
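A minimal sketch of consuming one such fragment on the client: follow next-page links until all matches of a single triple pattern have been retrieved. The JSON field names (triples, next) are assumptions; actual responses are hypermedia documents whose controls and metadata the client must interpret.

```typescript
// Minimal sketch: retrieve all matches of one triple pattern by paging through a
// fragment. Field names are illustrative assumptions.

interface Triple { subject: string; predicate: string; object: string; }

async function fetchAllMatches(firstPageUrl: string): Promise<Triple[]> {
  const triples: Triple[] = [];
  let page: string | undefined = firstPageUrl;
  while (page) {
    const response = await fetch(page, { headers: { accept: "application/json" } });
    const fragment: { triples?: Triple[]; next?: string } = await response.json();
    triples.push(...(fragment.triples ?? []));
    page = fragment.next; // undefined on the last page ends the loop
  }
  return triples;
}
```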
- j.websem.2017.12.003 abstract "Visual tools are implemented to help users in defining how to generate Linked Data from raw data. This is possible thanks to mapping languages which enable detaching mapping rules from the implementation that executes them. However, no thorough research has been conducted so far on how to visualize such mapping rules, especially if they become large and require considering multiple heterogeneous raw data sources and transformed data values. In the past, we proposed the RMLEditor, a visual graph-based user interface, which allows users to easily create mapping rules for generating Linked Data from raw data. In this paper, we build on top of our existing work: we (i) specify a visual notation for graph visualizations used to represent mapping rules, (ii) introduce an approach for manipulating rules when large visualizations emerge, and (iii) propose an approach to uniformly visualize data fraction of raw data sources combined with an interactive interface for uniform data fraction transformations. We perform two additional comparative user studies. The first one compares the use of the visual notation to present mapping rules to the use of a mapping language directly, which reveals that the visual notation is preferred. The second one compares the use of the graph-based RMLEditor for creating mapping rules to the form-based RMLx Visual Editor, which reveals that graph-based visualizations are preferred to create mapping rules through the use of our proposed visual notation and uniform representation of heterogeneous data sources and data values.".
- j.websem.2019.04.001 abstract "Since the invention of Notation3 Logic, several years have passed in which the theory has been refined and applied in different reasoning engines like cwm, EYE, and FuXi. But despite these developments, a clear formal definition of Notation3’s semantics is still missing. This does not only form an obstacle for the formal investigation of that logic and its relations to other formalisms, it has also practical consequences: in many cases the interpretations of the same formula differ between reasoning engines. In this paper we tackle one of the main sources of that problem, namely the uncertainty about implicit quantification. This refers to Notation3’s ability to use bound variables for which the universal or existential quantifiers are not explicitly stated, but implicitly assumed. We provide a tool for clarification through the definition of a core logic for Notation3 that only supports explicit quantification. We specify an attribute grammar which maps Notation3 formulas to that logic according to the different interpretations and thereby define the semantics of Notation3. This grammar is then implemented and used to test the impact of the differences between interpretations on practical cases. Our dataset includes Notation3 implementations from former research projects and test cases developed for the reasoner EYE. We find that 31% of these files are understood differently by different reasoners. We further analyse these cases and categorize them in different classes of which we consider one most harmful: if a file is manually written by a user and no specific built-in predicates are used (13% of our critical files), it is unlikely that this user is aware of possible differences. We therefore argue the need to come to an agreement on implicit quantification, and discuss the different possibilities.".
- S1471068416000016 abstract "Machine clients are increasingly making use of the Web to perform tasks. While Web services traditionally mimic remote procedure calling interfaces, a new generation of so-called hypermedia APIs works through hyperlinks and forms, in a way similar to how people browse the Web. This means that existing composition techniques, which determine a procedural plan upfront, are not sufficient to consume hypermedia APIs, which need to be navigated at runtime. Clients instead need a more dynamic plan that allows them to follow hyperlinks and use forms with a preset goal. Therefore, in this article, we show how compositions of hypermedia APIs can be created by generic Semantic Web reasoners. This is achieved through the generation of a proof based on semantic descriptions of the APIs’ functionality. To pragmatically verify the applicability of compositions, we introduce the notion of pre-execution and post-execution proofs. The runtime interaction between a client and a server is guided by proofs but driven by hypermedia, allowing the client to react to the application’s actual state indicated by the server’s response. We describe how to generate compositions from descriptions, discuss a computer-assisted process to generate descriptions, and verify reasoner performance on various composition tasks using a benchmark suite. The experimental results lead to the conclusion that proof-based consumption of hypermedia APIs is a feasible strategy at Web scale.".
- iet-its.2016.0269 abstract "The European Data Portal shows a growing number of governmental organisations opening up transport data. As end users need traffic or transit updates on their day-to-day travels, route planners need access to this government data to make intelligent decisions. Developers, however, will not integrate a dataset when the cost for adoption is too high. In this paper, the authors study the internal and technological challenges to publish data from the Department of Transport and Public Works in Flanders for maximum reuse. Using the qualitative Engage STakeholdErs through a systEMatic toolbox (ESTEEM) research approach, they interviewed 27 governmental data owners and organised both an internal workshop and a matchmaking workshop. In these workshops, data interoperability was discussed on four levels: legal, syntactic, semantic and querying. The interviews were summarised in ten challenges to which possible solutions were formulated. The effort needed to reuse existing public datasets today is high, yet they see the first evidence of datasets being reused in a legally and syntactically interoperable way. Publishing data so that it is reusable in an affordable way is still challenging.".
- 10494820.2017.1343191 abstract "An e-TextBook can serve as an Interactive Learning Environment (ILE), facilitating more effective teaching and learning processes. In this paper, we propose the novel concept of an EPUB 3-based Hybrid e-TextBook, which allows for interaction between the digital and the physical world. In that regard, we first investigated the gap between the expectations of teachers with respect to e-TextBook functionalities, on the one hand, and the ILE functionalities offered by e-TextBooks, on the other hand. Next, together with teachers, we co-designed and developed prototype EPUB 3-based Hybrid e-TextBooks that make it possible to connect their learning content to smart devices in classrooms, leveraging both digital publishing and Semantic Web tools. Based on experimentation with our prototype Hybrid e-TextBooks, we can argue that a semantically enriched EPUB 3-based Hybrid e-TextBook is able to act as a comprehensive ILE, providing the tools needed by teachers in smart classrooms. Furthermore, expert observations and Smiley o’meter results demonstrate an effective impact on student cognition and motivation.".