Matches in Ruben’s data for { ?s <http://schema.org/abstract> ?o }
- fine-grained-content-negotiation abstract "Information resources can be expressed in different representations along many dimensions such as format, language, and time. Through content negotiation, HTTP clients and servers can agree on which representation is most appropriate for a given piece of data. For instance, interactive clients typically indicate they prefer HTML, whereas automated clients would ask for JSON or RDF. However, labels such as “JSON” and “RDF” are insufficient to negotiate between the rich variety of possibilities offered by today’s languages and data models. This position paper argues that, despite widespread misuse, content negotiation remains the way forward. However, we need to extend it with more granular options in order to serve different current and future Web clients sustainably.".
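To make the coarse-grained status quo concrete, here is a minimal sketch of how a client negotiates today with the Fetch API; the resource URL is hypothetical, and the quality values illustrate the limited vocabulary this abstract wants to extend:

```ts
// Sketch: coarse-grained content negotiation with the Fetch API.
// The resource URL is hypothetical; any negotiating server will do.
const response = await fetch('https://example.org/people/ruben', {
  headers: {
    // Quality values (q=) rank the client's preferred representations.
    Accept: 'text/turtle, application/ld+json;q=0.8, text/html;q=0.5',
  },
});
// The server announces which representation it actually chose.
console.log(response.headers.get('Content-Type'));
```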
- incentivized-collaboration abstract "Personal data is being centralized at an unprecedented scale, and this comes with widely known and far-reaching consequences, considering the recent data scandals with companies such as Equifax and Facebook. Decentralizing personal data storage allows people to take back control of their data, and Semantic Web technologies can facilitate data integration at runtime. However, such data processing over decentralized data requires far more expensive algorithms, while at the same time, less processing power is available in individual stores compared to large-scale data centers. This article presents a vision in which nodes in decentralized networks are incentivized to collaborate on data processing using a distributed ledger. By leveraging the collective processing capacity of all nodes, we can provide a sustainable alternative to the current generation of centralized solutions, and thereby put people back in control without compromising on functionality.".
- queryable-research-data abstract "Publishing research on the Web accompanied by machine-readable data is one of the aims of Linked Research. Merely embedding metadata as RDFa in HTML research articles, however, does not solve the problems of accessing and querying that data. Hence, I created a simple ETL pipeline to extract and enrich Linked Data from my personal website, publishing the result in a queryable way through Triple Pattern Fragments. The pipeline is open source, uses existing ontologies, and can be adapted to other websites. In this article, I discuss this pipeline, the resulting data, and its possibilities for query evaluation on the Web. More than 35,000 RDF triples of my data are queryable, even with federated SPARQL queries because of links to external datasets. This proves that researchers do not need to depend on centralized repositories for readily accessible (meta-)data, but instead can—and should—take matters into their own hands.".
- redecentralizing-the-web abstract "Originally designed as a decentralized ecosystem, the Web has undergone a significant centralization in recent years. In order to regain control over our digital self, over the digital aspects of our lives, we need to understand how we arrived at this point and how we can get back on track. This chapter explains the history of decentralization in a Web context, and details Tim Berners-Lee’s role in the continued battle for a free and open Web. The challenges and solutions are not purely technical in nature, but rather fit into a larger socio-economic puzzle, to which all of us are invited to contribute. Let us take back the Web for good, and leverage its full potential as envisioned by its creator.".
- the-semantic-web-identity-crisis abstract "For a domain with a strong focus on unambiguous identifiers and meaning, the Semantic Web research field itself has a surprisingly ill-defined sense of identity. Started at the end of the 1990s at the intersection of databases, logic, and Web, and influenced along the way by all major tech hypes such as Big Data and machine learning, our research community needs to look in the mirror to understand who we really are. The key question amid all possible directions is pinpointing the important challenges we are uniquely positioned to tackle. In this article, we highlight the community’s unconscious bias toward addressing the Paretonian 80% of problems through research—handwavingly assuming that trivial engineering can solve the remaining 20%. In reality, that overlooked 20% could actually require 80% of the total effort and involve significantly more research than we are inclined to think, because our theoretical experimentation environments are vastly different from the open Web. As it turns out, these formerly neglected “trivialities” might very well harbor those research opportunities that only our community can seize, thereby giving us a clear hint of how we can orient ourselves to maximize our impact on the future. If we are hesitant to step up, more pragmatic minds will gladly reinvent technology for the real world, only covering a fraction of the opportunities we dream of.".
- web-api-ecosystem abstract "The fast-growing Web API landscape brings clients more options than ever before—in theory. In practice, they cannot easily switch between different providers offering similar functionality. We discuss a vision for developing Web APIs based on reuse of interface parts called features. Through the introduction of 5 design principles, we investigate the impact of feature-based reuse on Web APIs. Applying these principles enables a granular reuse of client and server code, documentation, and tools. Together, they can foster a measurable ecosystem with cross-API compatibility, opening the door to a more flexible generation of Web clients.".
- selling-a-story-in-one-minute abstract "In my hometown Ghent, an exciting contest took place: PhD students could send in a one-minute video about their research. Winners get to give a talk at TEDxGhent, a local edition of the famous TED conferences. I badly wanted to participate, so I had to find an original and effective way of selling my message in one minute. My goals: tease the audience, entertain the audience, and, ultimately, activate them to vote.".
- get-doesnt-change-the-world abstract "Recently, I wanted to offer my visitors the option to add any of my publications to their Mendeley paper library. When creating the “add to Mendeley” links, I noticed that papers got added without asking the visitor for a confirmation. Then I wondered: could I exploit this to trick people into adding something to their Mendeley library without their consent? Turns out I could, and here is why: Mendeley did not honor the safeness property of the HTTP GET method.".
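For illustration, a minimal Node.js sketch of a server that does honor GET's safeness; the /library/add endpoint is hypothetical:

```ts
import { createServer } from 'node:http';

// Minimal sketch of honoring GET's safeness property; the /library/add
// endpoint is hypothetical. State may only change via unsafe methods.
createServer((req, res) => {
  if (req.url?.startsWith('/library/add')) {
    if (req.method !== 'POST') {
      // GET and HEAD must never have side effects, so we refuse.
      res.writeHead(405, { Allow: 'POST' });
      res.end('Use POST to add a paper to your library.');
      return;
    }
    // ... perform the actual state change here ...
    res.writeHead(201);
    res.end('Paper added.');
  } else {
    res.writeHead(404);
    res.end();
  }
}).listen(8080);
```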
- rest-wheres-my-state abstract "HTTP, the Hypertext Transfer Protocol, has been designed under the constraints of the REST architectural style. One of the well-known constraints of this REpresentational State Transfer style is that communication must be stateless. Why was this particular constraint introduced? And who is in charge then of maintaining state, since it is clearly necessary for many Web applications? This post explains how statelessness works on today’s Web, explaining the difference between application state and resource state.".
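A small sketch of what statelessness means in practice, assuming a hypothetical paged API: application state (where the client is in its flow) travels with every request, while resource state stays on the server:

```ts
// Application state lives on the client; the API below is hypothetical.
// Because every request is self-describing, the server needs no session
// and any replica can answer.
let page: string | undefined = 'https://example.org/articles?page=1';

while (page) {
  const response = await fetch(page, {
    headers: { Authorization: 'Bearer <token>' }, // credentials every time
  });
  const body = await response.json();
  console.log(body.items);
  // Resource state came from the server; the client alone tracks
  // its position by following the link to the next page.
  page = body.nextPage;
}
```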
- perl-and-the-preikestolen abstract "Whether I wanted to join the Oslo Perl Mongers for an RDF hackathon, Kjetil Kjernsmo asked me two months ago. We had met at the LAPIS workshop in Greece, where he showed me the open source work he had been doing. “Sure, I’d love to join”, I replied, “but there’s only a minor problem—I don’t know Perl!” Turns out there was nothing to worry about: learning Perl is easy, and the community embraces newcomers. Plus, the hackathon was located near a beautiful mountain landscape in Norway. Needless to say, I had a splendid week.".
- the-object-resource-impedance-mismatch abstract "Most programmers are not familiar with resource-oriented architectures, and this unfamiliarity makes them resort to things they know. This is why we often see URLs that have action names inside of them, while they actually shouldn’t. Indeed, URLs are supposed to identify resources, and HTTP defines the verbs we can use to view and manipulate the state of those resources. Evidently, there is quite a mismatch between imperative (object-oriented) languages and HTTP’s resources-and-representations model. What would happen if we think the other way round and model HTTP methods in an imperative programming language?".
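As a thought experiment in that direction, here is a hedged TypeScript sketch in which an object exposes only HTTP's uniform interface, so state manipulation replaces action-named URLs; the order resource is hypothetical:

```ts
// Sketch: thinking the other way round. Instead of URLs with verbs
// inside them, one object per resource whose only methods are HTTP's.
class Resource<T> {
  constructor(private readonly url: string) {}

  async get(): Promise<T> {
    return (await fetch(this.url)).json();
  }
  async put(state: T): Promise<void> {
    await fetch(this.url, {
      method: 'PUT',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(state),
    });
  }
  async delete(): Promise<void> {
    await fetch(this.url, { method: 'DELETE' });
  }
}

// No /orders/123/cancel URL: we manipulate the state of the resource.
const order = new Resource<{ status: string }>('https://example.org/orders/123');
await order.put({ status: 'cancelled' });
```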
- social-media-as-spotlight-on-your-research abstract "As researchers, communication is arguably the most important aspect of our job, but unfortunately not always the most visible. Sometimes, our work is so specific that it seems impossible to share it as a story with the outside world. Surprisingly, day-to-day social media such as Facebook and Twitter can be highly effective to give your work the attention it deserves. To achieve this, researchers must become conscious social media users who engage in every social network with a purpose—and a plan.".
- everything-is-connected-in-strange-ways abstract "What’s the connection between the Eiffel Tower and the Big Ben? How are you related to Mickey Mouse? Or Elvis Presley? Today, there’s a fun way to find out: Multimedia Lab’s new Web app Everything is Connected allows you to see how any two topics in this world connect. Choose a start topic (this might be you!) and watch an on-the-fly video that takes you to any destination topic you select. You’ll be amazed to discover how small the world we live in really is. In this post, I’ll take you behind the scenes of this fascinating app.".
- asynchronous-error-handling-in-javascript abstract "Anything that can go wrong will go wrong, so we better prepare ourselves. The lessons we’ve been taught as programmers to nicely throw and catch exceptions don’t apply anymore in asynchronous environments. Yet asynchronous programming is on the rise, and things still can and therefore will go wrong. So what are your options to defend against errors and graciously inform the user when things didn’t go as expected? This post compares different asynchronous error handling tactics for JavaScript.".
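A compact comparison of the main tactics the post weighs, using Node.js file reading as the running example:

```ts
import { readFile } from 'node:fs';
import { readFile as readFilePromised } from 'node:fs/promises';

// Tactic 1: error-first callbacks. Errors travel as the first argument.
readFile('config.json', 'utf8', (error, contents) => {
  if (error) return console.error('could not read:', error.message);
  console.log(contents);
});

// Tactic 2: promises. Errors propagate along the chain to .catch().
readFilePromised('config.json', 'utf8')
  .then((contents) => console.log(contents))
  .catch((error) => console.error('could not read:', error.message));

// Tactic 3: async/await. try/catch works again, as in synchronous code.
try {
  console.log(await readFilePromised('config.json', 'utf8'));
} catch (error) {
  console.error('could not read:', (error as Error).message);
}
```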
- what-web-agents-want abstract "The iPhone’s Siri has given the world a glimpse of the digital personal assistant of the future. “Siri, when is my wife’s birthday?” or “Siri, remind me to pick up flowers when I leave here” are just two examples of things you don’t have to worry about anymore. However cool that is, Siri’s capabilities are not unlimited: unlike a real personal assistant, you can’t teach her new tricks. If you had a personal agent that could use the whole Web as its data source—instead of only specific parts—there would be no limits to what it could do. However, the Web needs some adjustments to make it agent-ready.".
- programming-is-an-art abstract "People who have programmed with me or have seen my open-source work on GitHub know that I put a lot of effort in my coding style. I indeed consider programming a creative act, which necessarily involves aesthetics. And then, some people consider aesthetics the enemy of the pragmatic: “don’t spend time writing beautiful code when you can write effective code”. However, I argue that my sense of beauty serves pragmatism much better, because it leads to more concise and maintainable code, and is thereby far more effective.".
- affordances-weave-the-web abstract "What makes the Web more fascinating to read than any book? It’s not that the information is more reliable or people have become tired of the smell of paper. The exciting thing about consuming information on the Web is that you can keep clicking through for more. Hyperlinks have always been a source of endless curiosity. Few people realize that the hypertext concept actually far predates the Web. The idea that information itself could become an actionable entity has revolutionized our world and how we think.".
- lightning-fast-rdf-in-javascript abstract "Node.js has spawned a new, asynchronous generation of tools. Asynchronous thinking is different from traditional stream processing: instead of actively waiting for data in program routines, you write logic that acts when data arrives. JavaScript is an ideal language for that, because callback functions are lightweight. I have written a parser for Turtle, an RDF serialisation format, that uses asynchrony for maximal performance.".
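As an illustration, this is roughly how such a callback-driven Turtle parser is used, assuming the current API of N3.js, the library this parser grew into:

```ts
import { Parser } from 'n3';

const turtle = `
  @prefix foaf: <http://xmlns.com/foaf/0.1/>.
  <https://ruben.verborgh.org/profile/#me> foaf:name "Ruben".
`;

// The callback fires as soon as each triple is parsed, with no waiting
// for the whole document. A null quad signals the end of the stream.
new Parser().parse(turtle, (error, quad, prefixes) => {
  if (error) console.error(error);
  else if (quad) console.log(quad.subject.value, quad.predicate.value, quad.object.value);
  else console.log('done; prefixes:', prefixes);
});
```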
- towards-serendipitous-web-applications abstract "Hyperlinks are the door handles of the Web, as they afford going to the next place you want to be. However, in a space as large as the Web, there is an awful lot of possible next places, so the webpage might not offer the door handle you are looking for. Luckily of course, there’s a thing called Google, but wouldn’t it be much more awesome if the links you need were already there on the page? Because right now, the author of the webpage has to make the decision where you can go, as he is the architect of the information. But should he also be the architect of the navigation or should that be you, the person surfing the Web?".
- scientific-posters-are-ineffective abstract "Dreaded scientific posters—if you attend conferences, you’ve definitely seen them. They’re boring and ugly. On purpose. Because that’s what everybody does, right? The adjective scientific seems to imply that we should restrict our creativity. After all, content is king, and too much fanciness won’t get you anywhere? And the term poster is just because “abstract of 84cm × 119cm where you choose the colors” is too long? It’s this kind of reasoning that gets us nowhere.".
- one-hammer-for-a-thousand-nails abstract "“When all you have is a hammer, every problem starts to look like a nail” is but one of the many wordings of the infamous Law of the Instrument. Many of us are blinkered by our tools, instantaneously choosing what we know best to solve a problem—even though it might not be the best solution to that problem. It doesn’t take long to end up with complex solutions for simple things. Fortunately, the more tools you master, the higher the chance you choose the right one. Thus, an extensive toolbox is exactly what I recommend.".
- using-openrefine-data-are-diamonds abstract "Data is often dubbed the new gold, but no label could be more wrong. It makes more sense to think about data as diamonds: highly valuable, but before they are of any use, they need intensive polishing. OpenRefine, the latest incarnation of Google Refine, is specifically designed to help you with this job. Until recently, getting started with OpenRefine was rather hard because the amount of functionality can overwhelm you. This prompted Max De Wilde and me to write a book that will turn you into an OpenRefine expert.".
- can-i-sparql-your-endpoint abstract "SPARQL, the query language of the Semantic Web, allows clients to retrieve answers to complex questions. It’s a core technology in the Semantic Web Stack, as it enables flexible querying of Linked Data. If the Google search box is the entry to the human Web, a SPARQL query field is the entry to the machine Web. There’s only one slight problem: nobody seems able to keep a SPARQL endpoint up. Maybe the issue is so fundamental that more processing power cannot solve it.".
- research-is-teamwork abstract "Research is a rewarding job. You get to work on a cool thing, communicate about it, travel around the world to demonstrate it to others… But most of all, you get the opportunity to work together with highly talented people, in ways that are impossible in industry. The International Semantic Web Conference reunited people working on future Web technology for the 12th year in a row, and I was very lucky to be there. Moreover, our MMLab team, together with the Web & Media Group of the VU, set a new record by winning the Best Demo Award two consecutive years. I’ve come to realize how important communicating and collaborating with people are for good research—simply invaluable.".
- the-lie-of-the-api abstract "Really, nobody takes your website seriously anymore if you don’t offer an API. And that’s what everybody did: they got themselves a nice API. An enormous amount of money and energy is wasted on developing APIs that are hard to create and even harder to use. This is wonderful news for developers, who get paid to build two pieces of software—a server and a client—that were actually never needed in the first place. The API was there already: it’s your website itself. Shockingly, a majority of developers seems unable to embrace the Web and the important role URLs and hypermedia play on it. The lie called “API” has trapped many publishers, including the Digital Public Library of America and Europeana.".
- promiscuous-promises abstract "Promises allow you to asynchronously return a value from a synchronous function. If the return value is not known when the function exits, you can return a promise that will be fulfilled with that value later on. This comes in handy for JavaScript, which is often used for Web programming and thus has to rely on asynchronous return values: downloads, API calls, read/write operations, … In those cases, promises can make your code easier and thus better. This post briefly explains promises and then zooms in on the methods I used to create promiscuous, a small Promise implementation for JavaScript.".
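A minimal sketch of that pattern: the function returns a promise immediately, and consumers attach their logic for when the value arrives (the URL is hypothetical):

```ts
// The function returns at once; the promise it hands back is fulfilled
// with the value later on. The URL is hypothetical.
function download(url: string): Promise<string> {
  return fetch(url).then((response) => {
    if (!response.ok) throw new Error(`HTTP ${response.status}`);
    return response.text();
  });
}

// Consumers chain on the promise instead of blocking on the value.
download('https://example.org/data.ttl')
  .then((body) => console.log(`received ${body.length} characters`))
  .catch((error) => console.error('download failed:', error.message));
```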
- apologies-for-cross-posting abstract "Apologizing is a polite and functional act of communication: it helps people to let go of any negative sentiments you may have caused. However, communication is only effective when it is actually meant to help others, not to help yourself. We sometimes send messages out of habit, which strangely can give them the opposite effect than was intended by adopting that habit. Therefore, always think before you communicate to ensure you convey the right message.".
- my-phd-on-semantic-hypermedia abstract "More than three years of research and several hundred pages of text later, I’m finally ready to defend my PhD. Why did I start this whole endeavor again? Well, I was—and still am—fascinated by the possibilities the Web has to offer, and working as a PhD student gives you the opportunity and the freedom to dive into the things you love. I wanted to make the Web more accessible for machines, so they can perform tasks in a more autonomous way. This brought me to the crossroads of Semantic Web and REST APIs: semantic hypermedia.".
- towards-web-scale-web-querying abstract "Most public SPARQL endpoints are down for more than a day per month. This makes it impossible to query public datasets reliably, let alone build applications on top of them. It’s not a performance issue, but an inherent architectural problem: any server offering resources with an unbounded computation time poses a severe scalability threat. The current Semantic Web solution to querying simply doesn’t scale. The past few months, we’ve been working on a different model of query solving on the Web. Instead of trying to solve everything at the server side—which we can never do reliably—we should build our servers in a way that enables clients to solve queries efficiently.".
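To make the model concrete: a Triple Pattern Fragments server only answers single-pattern, paged requests such as the one below, leaving query planning to the client; the fragment URL is illustrative:

```ts
// Sketch of the client/server split: the server answers only cheap,
// bounded requests (one triple pattern, one page); the client plans
// and executes the rest of the query itself.
const pattern = new URLSearchParams({
  predicate: 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
  object: 'http://dbpedia.org/ontology/Person',
});
// The fragment URL follows the Triple Pattern Fragments interface;
// the host below is illustrative.
const fragment = await fetch(
  `https://fragments.dbpedia.org/2016-04/en?${pattern}`,
  { headers: { Accept: 'text/turtle' } },
);
console.log(await fragment.text()); // one page of matches + count + controls
```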
- www2014-and-25-years-of-web abstract "The yearly World Wide Web conferences are highlights for my research: every time again, the world’s most fascinating people meet to discuss novel ideas. This year’s edition moved to Seoul, and I happily represented Ghent University for the third time, together with my colleagues. In addition to hosting the WS‑REST2014 workshop, I presented Linked Data Fragments at LDOW2014. The combination of these workshops represents for me what is important to move the Web forward: flexible data and API access for automated clients.".
- the-pragmantic-web abstract "As in any technological or scientific community, optimism in the beginning years of the Semantic Web was high. Artificial intelligence researchers in the 1960s believed it would be a matter of years before machines would become better at chess than humans, and that machines would seamlessly translate texts from one language into another. Semantic Web researchers strongly believed in the intelligent agents vision, but along the way, things turned out more difficult. Yet people still seem to focus on trying to solve the complex problems, instead of tackling simple ones first. Can we be more pragmatic about the Semantic Web? As an example, this post zooms in on the SemWeb’s default answer to querying and explains why starting with simple queries might just be a better idea.".
- a-hands-on-linked-data-book-for-people abstract "The Linked Data hype is surrounded by questions, and most of those questions are only answered from the technology perspective. Such answers often insufficiently address the needs of people who just want to publish their data. Practitioners from libraries, archives and museums all over the world have very valuable data that they would love to share, but they often don’t find the right practical guidance to do this. Our new handbook Linked Data for Libraries, Archives and Museums changes that. We wrote it for non-technical people, by combining clear explanations with hands-on case studies.".
- reviewers-shouldnt-hide-their-name abstract "Peer review is research’s most powerful instrument. Having your manuscript reviewed by independent researchers in your own field improves the odds that your published work is valid—and valuable. The drawback of this mechanism is that many researchers are often on reviewer duty; I find myself reviewing several papers a month. It’s not hard to imagine that sloppiness can creep in sometimes… And sadly, there are not a lot of ways to prevent this: reviews in the research community remain largely anonymous. This means that, if a reviewer has a bad day or doesn’t want to read a paper with their full attention, they cannot be held accountable for that. If you have written a grounded opinion, why don’t you put your name on it?".
- writing-a-sparql-parser-in-javascript abstract "If we want to make the Semantic Web more webby, we need software in the Web’s main language JavaScript. The support for the SemWeb’s data format RDF has been quite good in JavaScript, with several libraries supporting Turtle and JSON-LD. However, proper support for the SPARQL query language has been lacking so far, especially for the latest version SPARQL 1.1. Since I need SPARQL for several of my projects, such as Linked Data Fragments, I wrote a proper SPARQL parser myself. It is created through the Jison parser generator and converts SPARQL into JSON. Its main design goals are completeness, maintainability, and small size.".
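Usage is deliberately simple, assuming the published sparqljs interface: SPARQL text in, JSON out:

```ts
import { Parser } from 'sparqljs'; // the parser this post introduces

// SPARQL text goes in; a plain JSON structure comes out, ready for
// a query engine (or your own tooling) to consume.
const query = new Parser().parse(`
  SELECT ?person WHERE {
    ?person a <http://dbpedia.org/ontology/Person>.
  }
  LIMIT 10
`);
console.log(JSON.stringify(query, null, 2)); // { queryType: "SELECT", ... }
```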
- bringing-fast-triples-to-nodejs-with-hdt abstract "Reading a selection from a large dataset of triples is an important use case for the Semantic Web. But files in textual formats such as Turtle become too slow as soon as they contain a few thousand triples, and triple stores are often too demanding, since they need to support write operations. The HDT (Header Dictionary Triples) binary RDF format offers fast, read-only access to triples in large datasets. Until recently, this power was only available in Java and C++, so I decided it was high time to port it to Node.js as well ;-)".
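A hedged sketch of what that looks like from Node.js, assuming the documented API of the resulting hdt package and a hypothetical dataset file:

```ts
import hdt from 'hdt'; // the Node.js port this post announces

// HDT files are read-only and indexed, so triple pattern lookups stay
// fast even on datasets with billions of triples. The file is hypothetical.
const document = await hdt.fromFile('dataset.hdt');
const { triples, totalCount } = await document.searchTriples(
  null, 'http://schema.org/abstract', null, { limit: 5 });
console.log(`${totalCount} matches; showing ${triples.length}`);
for (const triple of triples)
  console.log(triple.subject, triple.object);
await document.close();
```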
- the-year-of-the-developers abstract "The Semantic Web is plagued by various issues, one rather prominent fact being that few people have actually heard of it. If you ask me, it’s because we have been focusing almost exclusively on research lately, which is quite odd. After all—no matter how good the research is—eventually, code is at the core of all Web systems. Why is it that we have been selectively deaf and blind to those who build what we need most: actual applications that use Linked Data in the real world? Fortunately, the first Semantic Web Developers Workshop found a very passionate audience. We need more of this, and we need it now.".
- distinguishing-between-frank-and-nancy abstract "Ever looked up a person in an encyclopedia without knowing whether it was a man or a woman? And if you did, was it explicitly mentioned in the article? I’m guessing the answer to both questions is “no”. Gender is of course not that important; we’re interested in people for what they do. Yet at the same time, this particular piece of information is so trivial and obvious that we often just don’t mention it. This means that machines, which require explicit instruction, have no way to determine this elementary fact. Therefore, it’s hard to study even simple statistics in an automated way. This is why the Dutch DBpedia chapter had asked me to experiment with gender extraction for people, based on their Wikipedia pages.".
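For flavor, a deliberately naive heuristic sketch, not necessarily the method the experiment used: grammatical gender tends to leak through pronouns, so counting them over an article's text already gives a usable baseline:

```ts
// Naive baseline sketch (not necessarily the experiment's method):
// count gendered pronouns across the article text and compare.
function guessGender(articleText: string): 'male' | 'female' | 'unknown' {
  const male = (articleText.match(/\b(he|him|his)\b/gi) ?? []).length;
  const female = (articleText.match(/\b(she|her|hers)\b/gi) ?? []).length;
  if (male > 2 * female) return 'male';
  if (female > 2 * male) return 'female';
  return 'unknown';
}

console.log(guessGender('She recorded her first album when she was 17.')); // female
```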
- thank-you-for-your-attention abstract "Talks at academic conferences seldom feature a high knowledge per minute ratio. Speakers often talk for themselves, unwittingly spawning facts that are not directly useful to their audience. For me, the most symptomatic aspect is the obligatory “thank you for your attention” at the end of a talk. Think about what you’re saying. Was your talk so bad that people had to do you an actual favor by paying attention? We’ve got this whole thing backwards. You are one of the people the audience paid for to see. They should be thanking you for doing a great job—provided of course, that you really do the best you can to help them understand.".
- 600000-queryable-datasets-and-counting abstract "What good is a Web full of Linked Data if we cannot reliably query it? Whether we like to admit it or not, queryable data is currently the Semantic Web’s Achilles’ heel. The Linked Data cloud contains several high-quality datasets with a total of billions of triples, yet most of that data is only available in downloadable form. Frankly, this doesn’t make any sense on the Web. After all, would you first download Wikipedia in its entirety just to read a single article? Probably not! We combined the power of the LOD Laundromat, a large-scale data cleansing apparatus, with the low-cost Triple Pattern Fragments interface so you can once and for all query the Web.".
- fostering-intelligence-by-enabling-it abstract "In a couple of months, 15 years will have passed since Tim Berners-Lee, Jim Hendler, and Ora Lassila wrote the Scientific American article “The Semantic Web”. It’s hard to imagine that, another 15 years before this, the Web didn’t even exist. The article talks heavily about agents, which would use the Web to do things for people. Somehow, somewhere, something went terribly wrong: the same time needed for the Web to liberate the world has hardly been sufficient for the Semantic Web to reach any adoption. And still, there are no agents, nor are there any signs that we will see them in the near future. Where should we even start?".
- federated-sparql-queries-in-your-browser abstract "Querying multiple sources reveals the full potential of Linked Data by combining data from heterogeneous origins into a consistent result. However, I have to admit that I had never executed a federated query before. Executing regular SPARQL queries is relatively easy: if the endpoint is up, you can just post your query there. But where do I post my query if there are multiple endpoints, and will they communicate to evaluate that query? Or do I have to use a command-line tool? We wanted federated queries to be as accessible as anything else on the Web, so our federated Triple Pattern Fragments engine runs in your browser. At last, multiple Linked Data sources can be queried at once, at very low server-side cost.".
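As a hedged sketch of the same idea with a present-day engine in this line of work, Comunica, one query runs over several sources at once while all planning happens on the client:

```ts
import { QueryEngine } from '@comunica/query-sparql';

// One query, several heterogeneous sources; the engine decomposes the
// query on the client, so servers only answer simple requests.
const engine = new QueryEngine();
const bindingsStream = await engine.queryBindings(`
  SELECT ?name WHERE { ?person <http://xmlns.com/foaf/0.1/name> ?name }
  LIMIT 10`, {
  sources: [
    'https://fragments.dbpedia.org/2016-04/en',
    'https://ruben.verborgh.org/profile/',
  ],
});
bindingsStream.on('data', (bindings) => console.log(bindings.get('name')?.value));
```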
- turtles-all-the-way-down abstract "How can we ever talk about intelligent clients if we don’t provide them with opportunities to be intelligent? The current generation of RDF APIs is patronizing its clients by only describing its data in RDF. This contrasts to websites for humans, where data would be quite useless if it were not accompanied by context and controls. By omitting these, we withhold basic information from clients, like “what’s in this response?” and “where can I go next?”. This post proposes to extend the power of self-descriptiveness from data to API responses as a whole. Using RDF graphs, we can combine data, context, and controls in one response. RDF APIs need to become like websites, explaining clients where they are and what they can do.".
- querying-history-with-linked-data abstract "Data on the World Wide Web changes at the speed of light—today’s facts are tomorrow’s history. This makes the ability to look back important: how do facts grow and change over time? It gets even more interesting when we zoom out beyond individual facts: how do answers to questions evolve when data ages? With Linked Data, we are used to querying the latest version of information, because updating a SPARQL endpoint is easier than maintaining every historical version. With the lightweight Triple Pattern Fragments interface, it becomes very easy for a server to host multiple versions. Using the Memento framework to switch between versions based on a timestamp, your browser can evaluate SPARQL queries over any point in time. We tried this with DBpedia—and so can you!".
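The version switching relies on Memento's datetime negotiation (RFC 7089); a minimal sketch with an illustrative TimeGate URL:

```ts
// Memento datetime negotiation (RFC 7089): ask a TimeGate for the
// version of a resource closest to a moment in time. URL illustrative.
const response = await fetch('https://example.org/timegate/dbpedia', {
  headers: {
    'Accept-Datetime': 'Tue, 15 Sep 2015 00:00:00 GMT',
    Accept: 'text/turtle',
  },
});
// The server answers with the selected version and labels its datetime.
console.log(response.headers.get('Memento-Datetime'));
```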
- use-the-web-instead abstract "Few things annoy me more than a random website asking me: “do you want to use the app instead?” Of course I don’t want to—that’s why I use your website. There are people who like apps and those who don’t, but regardless of personal preferences, there’s a more important matter. The increasing cry of apps begging to invade—literally—our personal space undermines some of the freedoms for which we have long fought. The Web is the first platform in the history of mankind that allows us to share information and order services through a single program: a browser. Apps gladly circumvent this universal interface, replacing it with their own custom environment. Is it really the supposedly better user experience that pushes us towards native apps, or are there other forces at work?".
- truth-takes-time abstract "Newspapers everywhere were quick to blame social media for some of 2016’s more surprising political events. However, filter bubbles, echo chambers, and unsubstantiated claims are as old as humanity itself, so Facebook and friends have at most acted as amplifiers. The real mystery is that, given our access to unprecedented technological means to escape those bubbles and chambers, we apparently still prefer convenient truths over a healthy diet of various information sources. Paradoxically, in a world where the Web connects people more closely than ever, its applications are pushing us irreconcilably far apart. We urgently need to re-invest in decentralized technologies to counterbalance the monopolization of many facets of the Web. Inevitably, this means trading some of the omnipresent luxuries we’ve grown accustomed to for forgotten basic features we actually need most. This is a complex story about the relationship between people and knowledge technology, the eye of the beholder, and how we cannot let a handful of companies act as the custodians of our truth.".
- paradigm-shifts-for-the-decentralized-web abstract "Most Web applications today follow the adage “your data for my services”. They motivate this deal from both a technical perspective (how could we provide services without your data?) and a business perspective (how could we earn money without your data?). Decentralizing the Web means that people gain the ability to store their data wherever they want, while still getting the services they need. This requires major changes in the way we develop applications, as we migrate from a closed back-end database to the open Web as our data source. In this post, I discuss three paradigm shifts a decentralized Web brings, demonstrating that decentralization is about much more than just controlling our own data. It is a fundamental rethinking of the relation between data and applications, which—if done right—will accelerate creativity and innovation for the years to come.".
- designing-a-linked-data-developer-experience abstract "While the Semantic Web community was fighting its own internal battles, we failed to gain traction with the people who build apps that are actually used: front-end developers. Ironically, Semantic Web enthusiasts have failed to focus on the Web; whereas our technologies are delivering results in specialized back-end systems, the promised intelligent end-user apps are not being created. Within the Solid ecosystem for decentralized Web applications, Linked Data and Semantic Web technologies play a crucial role. Working intensely on Solid the past year, I realized that designing a fun developer experience will be crucial to its success. Through dialogue with front-end developers, I created a couple of JavaScript libraries for easy interaction with complex Linked Data—without having to know RDF. This post introduces the core React components for Solid along with the LDflex query language, and lessons learned from their design.".
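A hedged taste of LDflex, following its documented examples: property paths on plain JavaScript objects are resolved as Linked Data queries behind the scenes:

```ts
import data from '@solid/query-ldflex'; // default export per its docs

// A property path on a plain object is resolved as a query over
// Linked Data: front-end code without visible RDF.
const ruben = data['https://ruben.verborgh.org/profile/#me'];
console.log(`Hello, ${await ruben.firstName}!`);

// Paths may hop across documents; each step follows links on the fly.
for await (const name of ruben.friends.firstName)
  console.log(`friend: ${name}`);
```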
- shaping-linked-data-apps abstract "Ever since Ed Sheeran’s 2017 hit, I just can’t stop thinking about shapes. It’s more than the earworm though: 2017 is the year in which I got deeply involved with Solid, and also when the SHACL recommendation for shapes was published. The problem is a very fundamental one: Solid promises the separation of data and apps, so we can choose our apps independently of where we store our data. The apps you choose will likely be different from mine, yet we want to be able to interact with each other’s data. Building such decentralized Linked Data apps necessitates a high level of interoperability, where data written by one app needs to be picked up by another. Rather than relying on the heavier Semantic Web machinery of ontologies, I believe that shapes are the right way forward—without throwing the added value of links and semantics out of the window. In this post, I will expand on the thinking that emerged from working with Tim Berners-Lee on the Design Issue on Linked Data shapes, and sketch the vast potential of shapes for tackling crucial problems in flexible ways.".
- a-data-ecosystem-fosters-sustainable-innovation abstract "We’re living in a data-driven economy, and that won’t change anytime soon. Companies, start-ups, organisations, and governments all require some of our data to provide us with the services we want and need. Unfortunately, decades of Big Data thinking have led many companies to a consequential fallacy: the belief that they need to harvest and maintain that personal data themselves in order to deliver their services and thus survive in the data-driven economy. This prompted a never-ending rat race, dominated by a handful of large players and driven by a deeply flawed notion of “winning”, with the result that most people and companies collectively end up losing much more than they put in. Pointless data greed has distorted competition and stifled innovation from the moment data collection became more important than quality of experience. A way out of this dead end is to put people fully in control of their own data by equipping them with a personal data vault. Vaults enable us to break the standstill, as they re-level the playing field by giving all parties equal chances to access data under people’s control. Halting data harvesting is, paradoxically, how companies can leverage more data towards their services instead of less. Yet they won’t own that data—and in a sustainable ecosystem, there’s no need to. In this post, I dive into the surprising economics of an overdue data revolution.".
- reflections-of-knowledge abstract "Web services emerged in the late 1990s as a way to access specific pieces of remote functionality, building on the standards-driven stability brought by the universal protocol that HTTP was readily becoming. Interestingly, the Web itself has drastically changed since. During an era of unprecedented centralization, almost all of our data relocated to remote systems, which appointed Web APIs as the exclusive gateways to our digital assets. While the legal and socio-economic limitations of such Big Data systems began painfully revealing themselves, the window of opportunity for decentralized data ecosystems opened up wider than ever before. The knowledge graphs of the future are already emerging today, and they’ll be so massively large and elusive that they can never be captured by any single system—and hence cannot possibly be exposed through any single API. This raises the question of how servers can provide flexible entry points into this emerging Web-shaped knowledge ecosystem, and how clients can sustainably interact with them. This blog post describes the upcoming shift from API integration to data integration, why we assume the latter is the easier problem, and what it fundamentally means to channel abstract knowledge through concrete Web APIs.".
- lets-talk-about-pods abstract "Who decides what your Solid pod looks like? For a long time, the answer has been the first one who writes, decides. That is, the first app to sculpt documents and containers in your pod determines where other apps need to look for data. Unfortunately, this creates an undesired dependency between apps, which now have to agree amongst each other on how to store things. Yet Solid promises apps that will seamlessly and independently reuse data in order to provide us with better and safer experiences. At the heart of this contradiction is that the mental model we’re using for Solid pods no longer works. This model restricts our solution space and is a main reason why apps struggle to reuse each other’s data. In this blog post, I argue why we should stop thinking of a pod as a set of documents, and start treating it as the hybrid graph it actually is. By adjusting our perspective, Solid apps can become more independent of variations in data—and thus more powerful for us.".
- phd abstract "Ever since its creation at the end of the 20th century, the Web has profoundly shaped the world’s information flow. Nowadays, the Web’s consumers no longer consist of solely people, but increasingly of machine clients that have been instructed to perform tasks for people. Lacking the ability to interpret natural language, machine clients need a more explicit means to decide what steps they should take. This thesis investigates the obstacles for machines on the current Web, and provides solutions that aim to improve the autonomy of machine clients. In addition, we will enhance the Web’s linking mechanism for people, to enable serendipitous reuse of data between Web applications that were not connected previously.".
- publication abstract "Nowadays, the Web has become one of the main sources of biodiversity information. An increasing number of biodiversity research institutions add new specimens and their related information to their biological collections and make this information available on the Web. However, mechanisms which are currently available provide insufficient provenance of biodiversity information. In this paper, we propose a new biodiversity provenance model extending the W3C PROV Data Model. Biodiversity data is mapped to terms from relevant ontologies, such as Dublin Core and GeoSPARQL, stored in triple stores and queried using SPARQL endpoints. Additionally, we provide a use case using our provenance model to enrich collection data.".
- publication abstract "Factories of the future will autonomously deal with the ever increasing amount of available data. Processes will be planned automatically. Computers will keep track of machine parameters, product quality and workforce activities. But, how powerful these systems might become, the resulting new "Smart Factories" will always rely on experienced human workers and their skills. Therefore, such systems should support the worker where possible. They should intelligently enable him to further develop his skills, to learn new things and to fully use his innovation capacities. Our project Worker-Centric Workplaces in Smart Factories focuses on the factory worker. We do not understand the worker as a rather insignificant component of the modern factory, but as its center. Our ambition is to create "FACTorieS for WORKERS" (FACTS4WORKERS).".
- publication abstract "Since the development of Notation3 Logic, several years have passed in which the theory has been refined and used in practice by different reasoning engines such as cwm, FuXi or EYE. Nevertheless, a clear model-theoretic definition of its semantics is still missing. This leaves room for individual interpretations and renders it difficult to make clear statements about its relation to other logics such as DL or FOL or even about such basic concepts as correctness. In this paper we address one of the main open challenges: the formalization of implicit quantification. We point out how the interpretation of implicit quantifiers differs in two of the above mentioned reasoning engines and how the specification, proposed in the W3C team submission, could be formalized. Our formalization is then put into context by integrating it into a model-theoretic definition of the whole language. We finish our contribution by arguing why universal quantification should be handled differently than currently prescribed.".
- publication abstract "Traditionally, nurse call systems in hospitals are rather simple: patients have a button next to their bed to call a nurse. Which specific nurse is called cannot be controlled, as there is no extra information available. This is different for solutions based on semantic knowledge: if the state of care givers (busy or free), their current position, and for example their skills are known, a system can always choose the best suitable nurse for a call. In this paper we describe such a semantic nurse call system implemented using the EYE reasoner and Notation3 rules. The system is able to perform OWL-RL reasoning. Additionally, we use rules to implement complex decision trees. We compare our solution to an implementation using OWL-DL, the Pellet reasoner, and SPARQL queries. We show that our purely rule-based approach gives promising results. Further improvements will lead to a mature product which will significantly change the organization of modern hospitals.".
- publication abstract "In modern factories different machines and devices offering their services such as producing parts or simply providing information become more and more important. The number and diversity of such devices is increasing and the task of combining available resources into workflows becomes a challenge which can hardly be handled by a human user. In this paper we describe how we use RESTdesc, a formalism to semantically describe possible actions of RESTful Web APIs via existential rules to automatically generate and execute such workflows. Our approach makes use of Notation3 reasoners and their ability to produce proofs. These proofs are interpreted as workflow descriptions which can be easily executed and updated. The latter makes our approach very adaptable to unforeseen situations. By using one rule per possible API call, our system is very modular and easy to maintain; services can be readily added or removed. Our implementation shows how the use of rule based reasoning can significantly improve the daily work in today’s factories.".
- publication abstract "Many data scientists make use of Linked Open Data (LOD) as a huge interconnected knowledge base represented in RDF. However, the distributed nature of the information and the lack of a scalable approach to manage and consume such Big Semantic Data makes it difficult and expensive to conduct large-scale studies. As a consequence, most scientists restrict their analyses to one or two datasets (often DBpedia) that contain – at most – hundreds of millions of RDF triples. LOD-a-lot is a dataset that integrates a large portion (over 28 billion triples) of the LOD Cloud into a single ready-to-consume file that can be easily downloaded, shared and queried, locally or online, with a small memory footprint. This paper shows there exists a wide collection of Data Science use cases that can be performed over such a LOD-a-lot file. For these use cases LOD-a-lot significantly reduces the cost and complexity of conducting Data Science.".
- publication abstract "Intelligent and automatic overlays for video streams know an increasing demand in broadcasting and conference systems. These overlays provide additional information regarding the broadcast or the conference to better engage the end users. In this paper, a platform is presented that employs Linked Data to intelligently determine the content of the overlay data, based on the current context. Semantic reasoning is utilized to decide which overlays should be shown and how the cameras should be automatically controlled to capture all important events.".
- publication abstract "People increasingly rely on multi-modal (car, train, bus, e-bike, etc.) transport. However, current route planning does not allow the flexibility to seamlessly integrate users’ preferences and goals, different transport types, or adapt to disruptive traffic events. In this demonstrator, we showcase a context-aware semantic route planner that combines users’ individualized goals with geographical route planning using multi-modal transport modes. This means that the route planning can incorporate various goals of the user, e.g. visiting a friend or picking up an item, while the planner figures out the best order of visiting and the most suited transport modes. As transport is very dynamic, Stream Reasoning techniques are employed for real-time traffic monitoring. This allows the planner to automatically adapts the route when interruptions or obstructions, resulting in delays, are detected.".
- publication abstract "This paper evaluates the performance of the OWL 2 reasoners Pellet and HermiT in an eHealth context where most of the ABox is considered static and discrete transient events describing the environment are incrementally added and processed. The considered use case is the assignment of tasks and calls to nurses. To provide personalized and optimized care, the selection process utilizes reasoning to make intelligent assignment decisions based on the available information. This has been implemented using multiple SPARQL queries to enable easy adaptation of the assignment algorithm. Since limited time is available to perform the assignments, the decision should be made in at most five seconds. An analysis of the performance and scalability of the reasoners is presented. To deal with the limited time frame, several optimizations are suggested, which exploit that most of the ABox is considered static.".
- publication abstract "Qualitative research is based on an iterative methodology. Is it possible to develop algorithms for handling large amounts of data that are principally based on such iteration? This question will be addressed in this article in the context of using the so-called Semantic Web as the source of Big Data for automatically answering questions.".
- publication abstract "Base registries are trusted authentic information sources controlled by an appointed public administration or organization appointed by the government. Maintaining a base registry comes with extra maintenance costs to create the dataset and keep it up to date. In this paper, we study the possibility to entangle the maintenance of base registries at the core of existing administrative processes and to reduce the cost of maintaining a new data source. We demonstrate a method to manage Local Council Decisions as Linked Data, which creates a new base registry for mandates. We found that no extra effort was needed in the process by local administrations. We show that an end-to-end approach for Local Council Decisions as Linked Data is feasible. Furthermore, using this proof of concept, we established a momentum to roll out these ideas for the region of Flanders in Belgium.".
- publication abstract "Calculating a public transit route involves taking into account user preferences: e.g., one might prefer trams over buses, one might prefer a slight detour to pass by their favorite coffee bar or one might only be interested in wheelchair accessible journeys. Traditional route planning interfaces do not expose enough features for these kind of questions to be answered. In previous work, we proposed a Linked Data interface, called Linked Connections, which allows user-agents to evaluate the route planning queries on the client-side, and thus allow for extra features to be implemented by data reusers. In this work, we study how and where these new features can be added to the Linked Connections framework. We researched this by adding the feature of wheelchair-accessibility both on server and client, and comparing these two solution on query execution time, cache performance and CPU usage on server and client. We found that for the use case of wheelchair-accessibility, there is no advantage of adding this feature on the server: the query execution time does not improve, while the cache hit rate lowers.".
- publication abstract "If we want a broad adoption of Linked Data, the barrier to conform to the Linked Data principles need to be as low as possible. One of the Linked Data principles is that URIs should be dereferenceable. This demonstrator shows how to set up The DataTank and configure a Linked Data repository, such as a turtle file or SPARQL endpoint, in it. Different content-types are acceptable and the response in the right format is generated at request time.".
- publication abstract "Route planning providers manually integrate different geo-spatial datasets before offering a Web service to developers, thus creating a closed world view. In contrast, combining open datasets at runtime can provide more information for user-specific route planning needs. For example, an extra dataset of bike sharing availabilities may provide more relevant information to the occasional cyclist. A strategy for automating the adoption of open geo-spatial datasets is needed to allow an ecosystem of route planners able to answer more specific and complex queries. This raises new challenges such as (i) how open geo-spatial datasets should be published on the Web to raise interoperability, and (ii) how route planners can discover and integrate relevant data for a certain query on the fly. We republished Open Street Map’s road network as “Routable Tiles” to facilitate its integration into open route planners. To achieve this, we use a Linked Data strategy and follow an approach similar to vector tiles. In a demo, we show how client-side code can automatically discover tiles and perform a shortest path algorithm. We provide four contributions: (i) we launched an open geo-spatial dataset that is available for everyone to reuse at no cost, (ii) we published a Linked Data version of the Open Street Map ontology, (iii) we introduced a hypermedia specification for vector tiles that extends the Hydra ontology, and (iv) we released the mapping scripts, demo and routing scripts as open source software.".
- publication abstract "Ever since public transit agencies have found their way to the Web, they inform travelers using route planning software made available on their website. These travelers also need to be informed about other modes of transport, for which they have to consult other websites, or for which they have to ask the transit agency’s server maintainer to implement new functionalities. In this demo, we introduce an affordable publishing method for transit data, called Linked Connections, that can be used for intermodal route planning, by allowing user agents to execute the route planning algorithm. We publish paged documents containing a stream of hops between transit stops sorted by departure time. Using these documents, clients are able to perform intermodal route planning in a reasonable time. Furthermore, such clients are fully in charge of the algorithm, and can now also route in different ways by integrating datasets of a user’s choice. When visiting our demo, conference attendees will be able to calculate intermodal routes by querying the Web of data using their phone’s browser, without expensive server infrastructure.".
- publication abstract "The Public Sector Information directive has made Open Data the default within European Public Sector Bodies. End-user multimodal planners need access to government data to make intelligent route planning decisions. We studied both the needs of the market and the vision of the department of Mobility and Public Works in Flanders by interviewing 6 market players and 27 governmental data owners and by organising 2 workshops. We found a moderate willingness to publishing open data, amongst others, thanks to the European PSI and ITS directives. Furthermore, we found evidence of existing open data reuse among commercial multimodal route planners. We identified 3 caveats: not every dataset will be reused as there is a cost for adoption, data quality needs to be high enough and metadata is crucial. We formulate opportunities that lie within the Web’s principles to reduce cost of publishing and reusing, and raise the data quality.".
- publication abstract "Thanks to architectural constraints adopted by its stakeholders, the World Wide Web was able to scale up to its current size. To realize the ITS directive, which stimulates sharing data on large scale between different parties across Europe, a large-scale information system is needed as well. We discuss three constraints which lie at the basis of the success of the Web, and apply these to transport data publishing: stateless interaction, cacheability and a uniform interface. The city of Ghent implemented these constraints for publishing the dynamic capacity of the parking sites. The information system, allowing federated queries from the browser, achieves a good user perceived performance, a good network efficiency, achieves a scalable server infrastructure, and enables a simple to reuse dataset. To persist such a transport data information system, still a well-maintained Linked Data vocabulary is needed. We propose to add these URIs into the DATEX2 specification.".
- publication abstract "In this paper, we provide a novel semantic workflow system, based on semantic functional service descriptions and a rule file. The workflow engine follows a three-step process. First, it determines for all the resources in its knowledge base the functionality they need to progress in the workflow. This uses a phase-functionality rule file which binds phases of the workflow to functionalities. During a second phase, the functionalities are mapped to REST service calls using RESTdesc functional descriptions. During the third step, the engine executes the generated service calls and pushes the resource it acted on to the next phase in the workflow using a phase-transition rule file. The main advantage of this approach is that each step can be influenced by external information from the Linked Open Data cloud. It exploits the fact that Linked Open Data and RESTful Web services and APIs are resource-oriented. Moreover, the workflow rule file makes the system easily adaptable and extensible to achieve new functionalities or to obey changing company policies. Finally, the separation between functional descriptions and service descriptions supports easy management over the fast-changing services at hand.".
- publication abstract "In this paper, we provide a view on the future Web as a Semantic read–write Web. Given a number of prerequisites for enabling a fully read–write Web for machines, we predict the following. First, datasets and data in general will become self-organizing, driven by local updates of machines. Second, human-created machine-readable schemas will become obsolete. Third, Web services will no longer be visible on the Web. Fourth, reasoning on the Web stays monotonic and will be optimized for versioned data. Additionally, two application areas are discussed where the read–write Semantic Web will have a deep impact: the Web of Things and advanced personalisation.".
- publication abstract "Until now, the SPARQL query language was restricted to simple entailment. Now SPARQL is being extended with more expressive entailment regimes. This allows to query over inferred, implicit knowledge. However, in this case the SPARQL endpoint provider decides which inference rules are used for its entailment regimes. In this paper, we propose an extension to the SPARQL query language to support remote reasoning, in which the data consumer can define the inference rules. It will supplement the supported entailment regimes of the SPARQL endpoint provider with an additional reasoning step using the inference rules defined by the data consumer. At the same time, this solution offers possibilities to solve interoperability issues when querying remote SPARQL endpoints, which can support federated querying frameworks. These frameworks can then be extended to provide distributed, remote reasoning.".
- publication abstract "Querying the Linked Open Data cloud as a whole still remains problematic. Prior knowledge is required to federate the queries to the appropriate datasets: each dataset provides its own SPARQL endpoint to query that dataset, and each dataset uses its own vocabulary to describe their information. In this paper, we propose a federated and asynchronous SPARQL framework to query the global data space. Our query federation is based on the fact that owl:sameAs relationships are symmetrical and transitive and, hence, that such linked resources are interchangeable. The provenance of the generated owl:sameAs links during the enrichment process is processed to resolve the SPARQL endpoints of linked datasets and their vocabulary mappings. This information is used during the query federation. Asynchronous SPARQL processing makes the query federation more scalable and exploits better Web caching.".
- publication abstract "In this paper, we provide a novel semantic workflow system, based on semantic functional service descriptions and a rule file that can be scheduled on a grid. The workflow engine follows a three-step process. First, it determines for all the resources in its knowledge base the functionality they need to progress in the workflow. This uses a phase-functionality rule file which binds phases of the workflow to functionalities. During a second phase, the functionalities are mapped to REST service calls using RESTdesc functional descriptions. During the third step, the engine executes the generated service calls and pushes the resource it acted on to the next phase in the workflow using a phase-transition rule file. On a grid, each step will be fulfilled by another type of node. The grid architecture of the workflow engine guarantees its scalability.".
- publication abstract "We present the implementation of the smartAPI ecosystem to facilitate the creation of FAIR (Findable, Accessible, Interoperable, Reusable) APIs.".
- publication abstract "There is an abundance of services and applications that find the most efficient route between two places, but people are not always interested in efficiency; sometimes we just want a pleasant route. Such routes are subjective though, and may depend on contextual factors that route planners are oblivious to. One possible solution is to automatically learn what a user wants, but this requires behavioral data, leading to a cold start problem. Moreover, this approach falls flat when someone wants to try something new, as users become locked in filter bubbles. An alternative approach is to let the user express their desires explicitly, effectively helping them create the most pleasant route themselves. In this paper we provide a proof of concept of a client-side route planner that does exactly that. We aggregated the Point of Interest information from OpenStreetMap into Regions of Interest, and published the results on the Web. These regions are described semantically, enabling the route planner to align the user’s input to what is known about their environment. Planning a 3km long pedestrian route through a city center takes 5 seconds, but subsequent adjustments to the route require less than a second to compute. These execution times imply that our approach is feasible, although further optimizations are needed to bring this to the general public.".
- publication abstract "Travelers have higher expectations than current route planning providers can fulfill, yet new solutions struggle to break through. Matching user experience from existing applications is already challenging without the large-scale infrastructure most of them have at their disposal; additionally integrating datasets such as the road network, public transportation schedules, or even real time air quality data is an even more laborious endeavour. The OSM road network has recently been published as routable Linked Open Data, following a similar approach to vector tiles. We explore several ways of preprocessing the tiles to improve the user-perceived performance of query evaluation. We integrated the results into a route planner for public transportation.".
- publication abstract "Automatic analysis of multimedia resources has become a necessity due to an ever increasing multimedia production. In this paper, we introduce a novel framework that integrates multiple web services in an abduction-deduction-induction reasoning cycle. By intelligently combining multiple services, better analysis results are achieved than by using a single analysis service. The use of semantic service descriptions and semantic reasoning enables automatic service selection and repurposing of the different services. We evaluated the proposed framework for the use case of face detection, showing promising results.".
- publication abstract "Learning analytics can provide adaptive learning and performance support by analyzing user tracking logs. However, data-driven learning is usually confined to a specific context (e.g., learning English within one application), and thus not interoperable across systems or domains. In this paper, we investigate ways to improve integration of data across applications and educational domains, by means of Linked Data, using existing standards such as the Experience API. Using JSON-LD, existing Learning Record Store tools can be used to store the tracking logs, which are then interpreted and aligned as Linked Data. We have applied the solution in an initial data capture resulting in more than two million statements spanning two different applications. This way, we aim to enrich adaptive learning and performance support across contexts.".
- publication abstract "Finding relevant content automatically is still not straightforward due to the unstructured nature of large text corpora. Moreover, traditional techniques to extract structured information out of these corpora are mostly very fine-grained, which deteriorates the needed high-level overview to be able to compare publications. Also, publishing this information as Linked Data can provide for very important context information. This demo paper describes StoryBlink, a Web application that enables the discovery of stories through linked books. By finding paths between compact semantic summaries of stories, it provides the user with relevant stories, based on previously selected publications. By also returning the semantic similarities between these stories, it gives the user a quick insight into how certain stories are connected. As such, StoryBlink enables an automatic content-based discovery journey between linked stories.".
- publication abstract "The DBpedia Extraction Framework, the generation framework behind one of the Linked Open Data cloud’s central hubs, has limitations which lead to quality issues with the DBpedia dataset. Therefore, we provide a new take on its Extraction Framework that allows for a sustainable and general-purpose Linked Data generation framework by adapting a semantic-driven approach. The proposed approach decouples, in a declarative manner, the extraction, transformation, and mapping rules execution. This way, among others, interchanging different schema annotations is supported, instead of being coupled to a certain ontology as it is now, because the DBpedia Extraction Framework allows only generating a certain dataset with a single semantic representation. In this paper, we shed more light to the added value that this aspect brings. We provide an extracted DBpedia dataset using a different vocabulary, and give users the opportunity to generate a new DBpedia dataset using a custom combination of vocabularies.".
- publication abstract "Data has been made reusable and machine-interpretable by publishing it as Linked Data. However, Linked Data automatic processing is not fully achieved yet, as manual effort is still needed to integrate existing tools and libraries within a certain technology stack. To enable automatic processing, we propose exposing functions and methods as Linked Data, publishing it in different programming languages, using content negotiation to cater to different technology stacks, and making use of common, technology-independent identifiers to make them discoverable. As such, we can enable automatic processing of Linked Data across formats and technology stacks. By using discovery endpoints, similar to those used to discover vocabularies and ontologies, the publication of these functions can remain decentralized.".
- publication abstract "RDF generation processes are becoming more interoperable, reusable, and maintainable due to the increased usage of mapping languages: languages used to describe how to generate an RDF graph from (semi-)structured data. This leads to a rise of new mapping languages, each with different characteristics. However, it is not clear which mapping language can be used for a given task. Thus, a comparative framework is needed. In this paper, we investigate a set of mapping languages that inhibit complementary characteristics, and present an initial set of comparative characteristics based on requirements put forward by those mapping languages. Initial investigation found 9 broad characteristics, classified in 3 categories. To further formalize and complete the set of characteristics, further investigation is needed, requiring a joint effort of the community.".
- publication abstract "More and more users obtain new knowledge using e-learning systems, and often assess their understanding of this new knowledge using corresponding assessment items. However, the distribution of content items and assessment items in a learning object is tightly bound. To publish assessment items, independently of the corresponding content items, it is required to wrap these assessment items into separate learning objects, which introduces a large overhead. Moreover, current learning objects are closely coupled with their execution environment. A stand-alone and lightweight format to describe assessment items is needed. This way their publication is facilitated and their discoverability can be increased. This paper proposes some important features for such a format and introduces SERIF: a Semantic ExeRcise Interchange Format, whose underlying data model is based on the QTI data model. SERIF was applied successfully in three proof-of-concept applications, where we assessed how SERIF is (i) decoupled from the execution environment, and (ii) extendable to other content types and interaction types. Its machine-interpretability can allow for automatic discovery and combination of relevant assessment items.".
- publication abstract "Digital publications can be packaged, distributed, and viewed via the Open Web Platform using the EPUB 3 format. Meanwhile, the increased amount of mobile clients and the advent of HTML5’s Geolocation have opened a whole range of possibilities for digital publications to interact with their readers. However, EPUB3 files often remain closed silos of information, no longer linked with the rest of the Web. In this paper, we propose a solution that addresses the difficulties of reconnecting digital publications with the Web using the spatial location of the concepts mentioned in the publication. We enrich digital publications by connecting the detected concepts to their URIs on, e.g., DBpedia, and use an algorithm that uses these URIs to retrieve or approximate the coordinates of these concepts. The evaluation of the approximation algorithm showed that almost any concept can be linked to a coordinate, but that the errors can be very high when context information is limited. This means relevant locations for a user can be shown, based on the content he or she is reading, and based on his or her location. This methodology can be used to reconnect digital publications with the online world, to entice readers, and ultimately, as a novel location-based recommendation technique.".
- publication abstract "Digital publications host a large amount of data that currently is not harvested, due to its unstructured nature. However, manually annotating these publications is tedious. Current tools that automatically analyze unstructured text are too fine-grained for larger amounts of text such as books. A workable machine-interpretable version of larger bodies of text is thus necessary. In this paper, we therefore suggest a workflow to automatically create and publish a machine-interpretable version of digital publications as linked data via DBpedia Spotlight. Furthermore, we make use of the Everything is Connected Engine on top of this published linked data to link digital publications using a Web application dubbed “StoryBlink”. StoryBlink shows the added value of publishing machine-interpretable content of unstructured digital publications by finding relevant books that are connected to selected classic works. Currently, the time to find a connecting path can be quite long, but this can be overcome by using caching mechanisms, and the relevancy of found paths can be improved by better denoising the DBpedia Spotlight results, or by using alternative disambiguation engines.".
- publication abstract "Ontologies and reasoning algorithms are considered a promising approach to create decision making applications. Rule-based reasoning systems have the advantage that rule sets can be managed and applied separately, which facilitates the custom configuration of those systems. However, current implementations of rule-based reasoning systems usually introduce a trade-off between expressiveness and performance, which either deteriorates the configurability of the application, or limits its performance in an event-driven system. In this paper, we devise an event-driven rule-based reasoning system that preserves its expressiveness. We devise an automatic nurse call system that is able to handle hard time constraints without limiting the possibilities of the reasoner, and list the encountered problems together with their suggested solutions. We achieve reasonable performance in small-scale environments when evaluating this system using N3 rules and the EYE reasoner. We however observe that a large dynamic database limits the performance of the system, because of the file-based nature of the EYE reasoner. As long as no in-memory reasoning is supported, the performance of the resulting system cannot compete with the state of the art. However, the linear scaling of the proposed expressive solution is promising.".
- publication abstract "A large part of scientific output entails computational experiments, e.g., processing data to generate new data. However, this generation process is only documented in human-readable form or as a software repository. This inhibits reproducibility and comparability, as current documentation solutions do not provide detailed metadata and rely on the availability of specific software environments. This paper proposes an automatic capturing mechanism for interchangeable and implementation independent metadata and provenance that includes data processing. Using declarative mapping documents to describe the computational experiment, term-level provenance can be automatically captured, for both schema and data transformations, and storing both the used software tools as the input-output pairs of the data processing executions. This approach is applied to mapping documents described using RML and FnO, and implemented in the RMLMapper. The captured metadata can be used to more easily share, reproduce, and compare the dataset generation process, across software environments.".
- publication abstract "Data quality is an important factor for the success of the envisaged Semantic Web. As machines are inherently intolerant at the interpretation of unexpected input, low quality data produces low quality results. Recently, constraint languages such as SHACL were proposed to assess the quality of data graphs, decoupled from the use case and the implementation. However, these constraint languages were designed with machine-processability in mind. Defining data shapes requires knowledge of the language’s syntax – usually RDF – and specification, which is not straightforward for domain experts, as they are not Semantic Web specialists. The notion of constraint languages is very recent: the W3C Recommendation for SHACL was finalized in 2017. Thus, user interfaces that enable domain experts to intuitively define such data shapes are not thoroughly investigated yet. In this paper, we present a non-exhaustive list of desired features to be supported by a user interface for editing data shapes. These features are applied to unSHACLed: a prototype interface with SHACL as its underlying constraint language. For specifying the features, we aligned existing work of ontology editing and linked data generation rule editing with data shape editing, and applied them using a drag-and-drop interface that combines data graph and data shape editing. This work can thus serve as a starting point for data shape editing interfaces.".
- publication abstract "Data is scattered across service providers, heterogeneously structured in various formats. By lack of interoperability, data portability is hindered, and thus user control is inhibited. An interoperable data portability solution for transferring personal data is needed. We demo PROV4ITDaTa: a Web application, that allows users to transfer personal data into an interoperable format to their personal data store. PROV4ITDaTa leverages the open-source solutions RML.io, Comunica, and Solid: (i) the RML.io toolset to describe how to access data from service providers and generate interoperable datasets; (ii) Comunica to query these and more flexibly generate enriched datasets; and (iii) Solid Pods to store the generated data as Linked Data in personal data stores. As opposed to other (hard-coded) solutions, PROV4ITDaTa is fully transparent, where each component of the pipeline is fully configurable and automatically generates detailed provenance trails. Furthermore, transforming the personal data into RDF allows for an interoperable solution. By maximizing the use of open-source tools and open standards, PROV4ITDaTa facilitates the shift towards a data ecosystem wherein users have control of their data, and providers can focus on their service instead of trying to adhere to interoperability requirements.".
- publication abstract "In recent years, the concept of machine-interpretable annotations – for example using RDFa – has been gaining support in the Web community. Websites are increasingly adding these annotations to their content, in order to increase their discoverability and visibility to external agents and services. This paper highlights two problems with current annotations and offers potential solutions. The first problem is that annotations largely remain centralized: the party responsible for publishing the content is also in control of the annotations. This limits the available annotation sources; for instance, note, comments, and links offered by third parties can only appear with the explicit approval of the content publisher. The second problem is that digital books have not undergone the same evolution yet, and the vast amount of useful information contained within them remains siloed and unaccessible for machines.".
- publication abstract "Data provenance is defined as information about entities, activities and people producing or modifying a piece of data. On the Web, the interchange of standardized provenance of (linked) data is an essential step towards establishing trust. One mechanism to track (part of) the provenance of data, is through the use of version control systems (VCS), such as Git. These systems are widely used to facilitate collaboration primarily for both code and data. Here, we describe a system to expose the provenance stored in VCS in a new standard Web-native format: W3C PROV. This enables the easy publication of VCS provenance on the Web and subsequent integration with other systems that make use of PROV. The system is exposed as a RESTful Web service, which allows integration into user-friendly tools, such as browser plugins.".
- publication abstract "A popular way to log learning processes is by using the Experience API (abbreviated as xAPI), also referred to as Tin Can. While Tin Can is great for developers who need to log learning experiences in their applications, it is more challenging for data processors to interconnect and analyze the resulting data. An interoperable data model is missing to raise Tin Can to its full potential. We argue that in essence, these learning process logs are provenance. Therefore, the W3C PROV model can provide the much-needed interoperability. In this paper, we introduce a method to expose PROV using Tin Can statements. To achieve this, we made the following contributions: (1) a formal ontology of the xAPI vocabulary, (2) a context document to interpret xAPI statements as JSON-LD, (3) a mapping to convert xAPI JSON-LD statements into PROV, and (4) a tool implementing this mapping. We preliminarily evaluate the approach by converting 10 xAPI statements taken from the public Tin Can Learning Record Store to valid PROV without loss of information, therefore ensuring that the conversion process is reversible.".
- publication abstract "Assessing the trustworthiness of a dataset is of crucial importance on the Web of Data and depends on different factors. In the case of Linked Data derived from (semi-)structured data, the trustworthiness of a dataset can be assessed partly through their mappings. The accuracy with which schema(s) are combined and applied to semantically annotate data – as described by its custom mapping definitions – plays a determinant role to the dataset’s overall potential. The inherent value of mapping definitions is often neglected, as they are considered part of the implementation executing them. However, an approach was proposed to assess and refine such mapping definitions, which was proven to be more effective than assessing and refining the quality of a dataset directly. In this paper, we derive important metadata from mappings quality assessment and refinement in the form of provenance information. The provenance of these mappings enables us to assess the relative trustworthiness of the datasets they generate.".
- publication abstract "In this paper, we propose a semantically enabled news exploration method to aid journalists in overcoming the information overload in today’s news streams. To achieve this, our approach semantically tags news articles, calculates their relatedness through their similarity based on these tags, and creates an article graph to be browsed by an end-user. Based on related work, the Jaccard metric seemed very suitable for this task. However, when we evaluated this similarity measure through crowdsourcing on a set of 120 article pairs, the results were only acceptable in the lower levels of relatedness, with unpredictable errors elsewhere. This reveals a need for better ground-truth data, and calls for clarification of the semantics of relatedness and similarity, and their relation.".
- publication abstract "The European textile and clothing (henceforth T&C) sector is forced to heavily invest in research and development in order to fight against global competition focused on cheap, fast fashion products. Micro businesses or individuals have trouble keeping up to date with these innovations. TCBL is a Horizon 2020 funded innovation action that started in July 2015 and will run for 4 years. TCBL aims to help the European T&C industry find new business models aimed at sustainability and innovation. Local business labs will help bring people together and offer a place with access to specialized machinery. An online community will be founded and act as a cross-boundary meeting place. A knowledge repository, the Knowledge Spaces, will support people eager to learn or share knowledge, thereby stimulating innovation. Semantic interpretation of the user interactions with the Knowledge Spaces will form the basis for an analytic tool which will help anticipate user needs.".
- publication abstract "Semantic Web reasoners are powerful tools that allow the extraction of implicit information from RDF data. This information is reachable through the definition of ontologies and/or rules provided to the reasoner. To achieve this, various algorithms are used by different reasoners. In this paper, we explain how state space search can be applied to perform backward-chaining rule-based reasoning. State space search is an approach used in the Artificial Intelligence domain that solves problems by modeling them as a graph and searching (using diverse algorithms) for solutions within this graph. State space search offers inherent proof generation and the ability to plug in different search algorithms to determine the characteristics of the reasoner such as: speed, memory or ensuring shortest proof generation.".
- publication abstract "Semantically annotating and interlinking Open Data results in Linked Open Data which concisely and unambiguously describes a knowledge domain. However, the uptake of the Linked Data depends on its usefulness to non-Semantic Web experts. Failing to support data consumers understanding the added-value of Linked Data and possible exploitation opportunities could inhibit its diffusion. In this paper, we propose an interactive visual workflow for discovering and exploring Linked Open Data. We implemented the workflow considering academic library metadata and carried out a qualitative evaluation. We assessed the workflow’s potential impact on data consumers which bridges the offer as published Linked Open Data, and the demand as requests for: (i) higher quality data; and (ii) more applications that re-use data. More than 70% of the test users agreed that the workflow fulfills its goal: it facilitates non-Semantic Web experts to understand the potential of Linked Open Data.".
- publication abstract "Linked Data offers an entity-based infrastructure to resolve indirect relations between resources, expressed as chains of links. If we could benchmark how effective retrieving chains of links from these sources is, we can motivate why they are a reliable addition for exploratory search interfaces. A vast number of applications could reap the benefits from encouraging insights in this field. Especially all kinds of knowledge discovery tasks related for instance to ad-hoc decision support and digital assistance systems. In this paper, we explain a benchmark model for evaluating the effectiveness of associating chains of links with keyword-based queries. We illustrate the benchmark model with an example case using academic library and conference metadata where we measured precision involving targeted expert users and directed it towards search effectiveness. This kind of typical semantic search engine evaluation focusing on information retrieval metrics such as precision is typically biased towards the final result only. However, in an exploratory search scenario, the dynamics of the intermediary links that could lead to potentially relevant discoveries are not to be neglected.".