
Recently Sebastian Trüg gave a presentation about Nepomuk-KDE and kindly provided me with the slides. This post is therefore a follow-up to the post State and Plans of Nepomuk-KDE.
More than just files
After the last post about Nepomuk-KDE many people discussed the pros and cons of a central storage of meta data. Additionally, alternative solutions like using file system capabilities (extended attributes, etc.) were often mentioned and discussed.
However, most of these discussions missed what meta data in the context of the semantic desktop is about: it is not only about files. Instead, the main goal of the Semantic Desktop idea is to also gather all the data which cannot be mapped to a single physical file. Think of bookmarks, e-mails (in mbox format, for example, where many messages share one file) and similar things. The other way around is also possible: a project often consists of an entire set of files which all belong together. Also, in cases like address cards it doesn’t make sense to save the meta data in the attributes of the physical file, because in the end you would have to replicate the entire file content.
Soprano and Strigi
As already mentioned, the meta data will be stored in an RDF store. Meet Soprano:
Soprano is a library which provides a Qt wrapper API for different RDF storage solutions. It features named graphs (contexts) and has a modular plug-in structure which allows the use of backends implemented on top of different RDF stores.
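To make the idea of named graphs a bit more concrete, here is a minimal sketch in Python. It is not Soprano's API: the class `InMemoryStore`, its methods, the `ex:` predicates and all example URIs are made up for illustration. It only shows the general principle that each statement is a triple plus the graph (context) it belongs to, and that subjects do not have to be files.

```python
from collections import namedtuple

# A statement is an RDF-style triple plus the named graph (context) it lives in.
Statement = namedtuple("Statement", ["subject", "predicate", "obj", "context"])


class InMemoryStore:
    """Hypothetical toy backend for illustration; not Soprano's real API."""

    def __init__(self):
        self._statements = set()

    def add(self, subject, predicate, obj, context):
        self._statements.add(Statement(subject, predicate, obj, context))

    def find(self, subject=None, predicate=None, obj=None, context=None):
        # None acts as a wildcard for that position of the pattern.
        pattern = (subject, predicate, obj, context)
        return sorted(
            st for st in self._statements
            if all(p is None or p == v for p, v in zip(pattern, st))
        )

    def remove_context(self, context):
        # Drop every statement of one named graph, leaving the others alone.
        self._statements = {
            st for st in self._statements if st.context != context
        }


store = InMemoryStore()

# Statements written by an application go into one graph (assumed names) ...
store.add("mailto:alice@example.org", "ex:knows",
          "mailto:bob@example.org", "graph:addressbook")

# ... statements produced by an indexer go into another.
store.add("file:///home/user/report.pdf", "ex:author",
          "mailto:alice@example.org", "graph:indexer")
store.add("file:///home/user/report.pdf", "ex:partOf",
          "project:thesis", "graph:indexer")

# Query across all graphs: every statement with the PDF as subject.
about_pdf = store.find(subject="file:///home/user/report.pdf")

# Throw away one graph, for example before re-indexing.
store.remove_context("graph:indexer")
remaining = store.find()
```

Keeping the context next to each triple is what makes it possible to tell apart where a piece of meta data came from and to replace one source's statements without touching the rest.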
As the central meta data storage, Soprano will be accessible to all applications through the KDE application framework.
The storage itself will be filled in different ways. First, there is KMetaData: it provides easy-to-use functions for developers to create and read meta data in the storage. Think of applications where meta data is an essential part of the program: Digikam and Amarok are typical examples.
Second, Strigi, KDE 4’s desktop search engine, will walk through the data available on the hard disk and extract the file meta data as well as the content of the files (where it makes sense). For example, audio files often carry information about the artist in their meta data, while PDF files can contain meta data about the author but of course also the text of the PDF itself.
KDE integration
Anyway, back to Nepomuk-KDE: to give a better picture of how all the pieces like Nepomuk-KDE, Soprano, KMetaData and Strigi work together, Sebastian created a chart showing them:

Integrated into the programs this could look like this mockup:

This window shows additional information about the sender of an e-mail: some of the information is aggregated directly from the contact data, like the e-mail address, the phone number and the web page, but the window also displays files which the user has received from this contact, as well as other people who are related to this person.
This is just a mockup but it gives a pretty good impression of what you can expect in the future – with much more to come.
KNepomuk, KMetaData and Nepomuk – about names
A last word about the naming: although I used the term KMetaData in this post (and particularly in the graphic), this name is actually not valid anymore: libnepomuk now contains KMetaData (together with Konto) and uses only one single namespace, “Nepomuk”. The old knepomuk became the new “Nepomuk::Middleware”.
The original plan was to rename KMetaData to Braid. However, this plan was dropped in favour of the restructuring, and because the term Nepomuk is already out there and pretty well known.