KulturNav. A collaborative approach to the management of shared vocabularies

Sara Kayser, Ulf Bodin

The expansion of the world wide web has revolutionised the way in which museums communicate with one another. While attempts to align cataloguing through common nomenclatures and authorities go back several decades, it is only with the emergence of the semantic web that it has become truly feasible to deploy common terminologies. Widely used collaborative authorities such as the Getty Art and Architecture Thesaurus (Getty AAT) are no longer static resources, and the digital revolution has removed the need for costly revisions of paper publications.

More recently, the advent of open data has enabled users to link information across geographies through open sources like Wikidata and DBpedia. However, a key challenge remains: as most catalogues are in languages other than English and contain art and cultural objects of mostly local interest, how do we connect the large array of local terminologies to these more widely adopted authorities so that all institutions can benefit from the semantic web, regardless of size, resources or location? And how do we keep vocabularies relevant and up-to-date for a diverse community of institutions?

KulturNav, a platform for common cultural heritage authorities, developed by KulturIT since 2011, strives to provide a solution to these and other challenges facing museums in Norway and Sweden during their current digital transition. In this paper, we will demonstrate how the local can be connected to the national and beyond through online collaboration around the development and maintenance of common authorities. We will also highlight some of the challenges that we have faced in trying to create a tool that will support this task.

The homepage
About KulturNav

The platform is the result of Norwegian museums’ requirement for a space where common authorities can be stored and maintained. Financial and political support from Arts Council Norway has made it possible for KulturIT and the participating museums to develop KulturNav into a collaborative space where museums can develop the authorities they need. Information can be provided as both human-readable and machine-readable data, so that it can be shared and reused by as many institutions as possible.

While the content on KulturNav is free for anyone to use, the development of the platform is directly linked to that of the collections management system Primus and its online platform DigitaltMuseum, both of which are also developed by KulturIT and its partner museums. On DigitaltMuseum, the authorities are used to link objects from different institutions.

KulturNav is a Software as a Service (SaaS) platform built on open source technology. The authority lists are published as Linked Open Data in standard formats such as RDF/XML and JSON-LD. The data model is inspired by the CIDOC CRM and Europeana EDM. Data on the platform is free to use and can be integrated through an Open API or exported as a CSV or Excel file. KulturNav metadata is compliant with the DCAT Application Profile for data portals in Europe (DCAT-AP).
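As an illustration of what machine-readable publication makes possible, the following minimal sketch reads a JSON-LD record and collects its preferred labels by language. The record itself, its identifier and its property layout are assumptions made for the example; they are not taken from actual KulturNav output, and the function name `labels_by_language` is hypothetical.

```python
import json

# Hypothetical JSON-LD record. The identifier and the exact property layout
# are assumptions for this sketch, not KulturNav's actual serialisation.
RECORD = """
{
  "@context": {"skos": "http://www.w3.org/2004/02/skos/core#"},
  "@id": "https://example.org/authority/0001",
  "@type": "skos:Concept",
  "skos:prefLabel": [
    {"@value": "seilbåt", "@language": "no"},
    {"@value": "segelbåt", "@language": "sv"},
    {"@value": "sailing boat", "@language": "en"}
  ]
}
"""


def labels_by_language(jsonld_text: str) -> dict[str, str]:
    """Return a mapping of language code to preferred label."""
    doc = json.loads(jsonld_text)
    labels = doc.get("skos:prefLabel", [])
    # JSON-LD allows a single value object instead of a list.
    if isinstance(labels, dict):
        labels = [labels]
    return {item["@language"]: item["@value"] for item in labels}


labels = labels_by_language(RECORD)
```

A consuming system could use such a mapping to display the label that matches the language of its own catalogue.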

Terminologies as datasets

KulturNav handles all types of controlled vocabularies as datasets, whether they consist of terms, names of persons or places. There are currently five types of datasets: Agent, Concept, Place, Named object and Time span.

The datasets function as administrative containers and are not strictly facets of the kind found in a true vocabulary, such as the Getty AAT. However, the system is versatile: a dataset can contain a list of Swedish photographers, for example, or it can establish a thesaurus of materials used in documenting conservation interventions.

KulturNav is also capable of creating datasets of great depth. It supports the SKOS data model but also has some bespoke features. A concept can be defined with more than 20 different types of relations and a term label can be recorded in an unlimited number of languages. The platform also supports complex person records, although most entries focus on names and biographical information (birth, death, life role, etc.).
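The combination of typed relations and multilingual labels can be pictured with a small data structure. This is a sketch only: the class `Concept`, its field names, the fallback rule and the sample identifiers are assumptions for illustration and do not describe KulturNav's internal model.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """Hypothetical concept record with multilingual labels and typed relations."""

    identifier: str
    # Language code -> label; any number of languages may be recorded.
    labels: dict[str, str] = field(default_factory=dict)
    # Relation type (e.g. "broader") -> identifiers of the related concepts.
    relations: dict[str, list[str]] = field(default_factory=dict)

    def add_relation(self, relation_type: str, target_id: str) -> None:
        self.relations.setdefault(relation_type, []).append(target_id)

    def label(self, language: str, fallback: str = "en") -> str:
        """Return the label in the requested language, else the fallback language."""
        if language in self.labels:
            return self.labels[language]
        return self.labels.get(fallback, self.identifier)


wood = Concept("concept-wood", {"no": "tre", "sv": "trä", "en": "wood"})
oak = Concept("concept-oak", {"no": "eik", "en": "oak"})
oak.add_relation("broader", wood.identifier)
```

Here `oak.label("sv")` falls back to the English label because no Swedish label has been recorded yet, which is the kind of gap a collaborating institution could fill.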

Examples of a concept entry to the left and a person entry to the right

Uploading or creating a dataset requires a user account on https://KulturNav.org. Once this is set up, KulturIT can import an existing vocabulary, or the user can create the dataset by adding concepts. Each dataset is attached to an official owner – usually an institution with expertise within the subject that the dataset covers. The owner institution appoints an administrator who is responsible for the maintenance of the dataset. The administrator can also nominate editors and reviewers. These roles can be filled by users from other institutions so that networks of experts can cooperate around a dataset. We will discuss collaboration in more detail in a later section.

Users of the collections management system (CMS) Primus can search directly among all the datasets on KulturNav from the database or import complete datasets. It is also possible for them to create their own “bundles” of concepts and synchronise these with their local vocabularies. When users choose to utilise one of the KulturNav names or concepts instead of a local one, the object in question will be published on DigitaltMuseum, with a link to all other objects using the same record from KulturNav. Consequently, Primus users who choose to publish data on DigitaltMuseum are able to make connections to objects in their own collection, as well as in more than 300 other collections online.

A search in KulturNav from Primus and the KulturNav authority record displayed on DigitaltMuseum with linked objects

Anyone with a CMS other than Primus can select the terms they want to use and export them for use in their own systems.

KulturNav also contains datasets that are not editable, for example Europeana Fashion. Such a dataset consists of so-called placeholders: static records that have been uploaded from another source and therefore cannot be edited in KulturNav. Any changes must be made in the original source, which avoids creating new UUIDs for objects that already have one. This feature enables users to draw on the content of these terminologies and match them with local terms without the need to store them twice.

Making connections

KulturNav exploits the potential of the semantic web in several ways. Users can manually create a record and link it to external web resources by adding URLs with a SameAs relationship for an object, or with a concept-matching relation based on SKOS. Names and concepts can be fetched from the Virtual International Authority File (VIAF) or the Getty AAT and used to create the entry in KulturNav using the data in the linked source. Once a record is created, an improvement panel will make suggestions for enhancements that can be made to the record. A bot searches through KulturNav and a predefined set of external sites and proposes additional links.
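The manual linking step can be sketched as follows. The relation names are the standard `owl:sameAs` property and the SKOS mapping properties; the function `add_link`, the validation rules and the example URLs are assumptions for illustration and are not KulturNav's actual implementation.

```python
# owl:sameAs plus the SKOS mapping properties defined in the SKOS Reference.
ALLOWED_RELATIONS = {
    "owl:sameAs",
    "skos:exactMatch",
    "skos:closeMatch",
    "skos:broadMatch",
    "skos:narrowMatch",
    "skos:relatedMatch",
}


def add_link(links: list[tuple[str, str]], relation: str, url: str) -> bool:
    """Attach an external link to a record; return False if it already exists."""
    if relation not in ALLOWED_RELATIONS:
        raise ValueError(f"unsupported relation: {relation}")
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"not a web address: {url}")
    pair = (relation, url)
    if pair in links:
        return False  # avoid storing the same link twice
    links.append(pair)
    return True


# Hypothetical record links; the example.org addresses are placeholders.
record_links: list[tuple[str, str]] = []
add_link(record_links, "owl:sameAs", "https://example.org/person/123")
add_link(record_links, "skos:closeMatch", "https://example.org/concept/456")
```

Restricting the relation to a known set keeps the links interpretable by other systems, which is the point of using shared properties rather than free-text notes.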

Concepts linked to a Wikidata entry or an entry in Store Norske Leksikon (the Norwegian national encyclopaedia) automatically receive a snippet related to that term.

A Person record with the improvement panel and Wikipedia snippet

In summary, the KulturNav bot ensures that linking to related terms within the platform and to sources on the semantic web is made easy. It also completes records with information from external sources to increase the quality of the available data.

KulturNav offers further ways of enhancing data quality, most notably by encouraging users from different institutions to refine shared information. Such collaboration sometimes takes a formal approach, perhaps in the form of a project where we work with several institutions with expertise within an area to develop an authority. This type of initiative often creates large datasets with detailed records that are widely used by the collaborating institutions. However, KulturNav also allows for a more informal and efficient way for institutions to collaborate, one that requires less organisation and can take place entirely online.

This works in the following way: on the platform, an owner can open a dataset for suggestions from other users. Where this option is chosen, any user with an account can propose a change directly in the dataset. The administrator receives a notification when a suggestion has been made, and has the option to accept, reject, edit or delete it completely. By applying such ‘crowdsourcing’ functionality to datasets, users are given an opportunity to ensure that vocabularies contain the concepts they need, while also creating a forum for collaboration.
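The suggestion workflow described above can be sketched as a small state model. The class names, the status values and the rule that an edited suggestion remains pending are assumptions made for this example; the text only states that the administrator can accept, reject, edit or delete a suggestion.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    ACCEPTED = "accepted"
    REJECTED = "rejected"


@dataclass
class Suggestion:
    author: str
    proposed_label: str
    status: Status = Status.PENDING


class Dataset:
    """Hypothetical dataset whose owner may open it for suggestions."""

    def __init__(self, open_for_suggestions: bool) -> None:
        self.open_for_suggestions = open_for_suggestions
        self.labels: list[str] = []
        self.queue: list[Suggestion] = []

    def suggest(self, author: str, label: str) -> Suggestion:
        if not self.open_for_suggestions:
            raise PermissionError("dataset is not open for suggestions")
        suggestion = Suggestion(author, label)
        self.queue.append(suggestion)  # the administrator would be notified here
        return suggestion

    def review(self, suggestion: Suggestion, action: str, edited_label: str = "") -> None:
        """Apply the administrator's decision: accept, reject, edit or delete."""
        if suggestion.status is not Status.PENDING:
            raise ValueError("suggestion has already been reviewed")
        if action == "accept":
            self.labels.append(suggestion.proposed_label)
            suggestion.status = Status.ACCEPTED
        elif action == "reject":
            suggestion.status = Status.REJECTED
        elif action == "edit":
            if not edited_label:
                raise ValueError("an edited label is required")
            suggestion.proposed_label = edited_label  # assumed to stay pending
        elif action == "delete":
            self.queue.remove(suggestion)
        else:
            raise ValueError(f"unknown action: {action}")


boats = Dataset(open_for_suggestions=True)
proposal = boats.suggest("user@regional-museum", "snekke")
boats.review(proposal, "edit", "snekke (boat type)")
boats.review(proposal, "accept")
```

Only accepted suggestions reach the published list, so the dataset stays under the control of its owner while remaining open to contributions.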

The dialogue tool in action

KulturNav’s collaborative tools help to keep datasets relevant to many different types of institutions, while ensuring that they are gradually extended and improved upon. This is particularly important in the context of national and regional linkages.

Incorporating both national and regional institutions means that KulturNav is able to boost both the accessibility and relevance of any given dataset. Regional institutions often possess more detailed information on local materials, artists or musicians, but tend to have fewer resources, and may lack the digital competence of the larger national institutions. Hence, access to a common platform where both types of institutions can collaborate around, for example, a national list of artists, is mutually beneficial. In this way, a little-known artist added to a dataset on KulturNav could be placed in a wider global context through links to national and international data. For users without access to the local information, the datasets on KulturNav are likely the only source on individuals and topics not covered by VIAF, the Getty AAT or even Wikipedia.

Even specialist datasets, such as that of Norwegian canning factories, have wider use than one might expect. Where else would a museum with an obscure Norwegian tin can in its collection find everything it needs to know about the factory that canned it, so readily available?

Addressing Challenges

Making connections between data sources and among our partner museums is not the only advantage of this approach. Collaboratively produced information also makes for more relevant datasets. The possibility of adding local and dialectal synonyms to terms in the vocabularies also increases relevance, as it ensures that the terminology used is understood both internally and externally.

In our experience, increased relevance leads to better and more frequent usage. However, to ensure that the authorities are productively utilised, other challenges must also be addressed, many of which are not unique to KulturNav. While the ‘ownership’ of datasets by official cultural heritage institutions creates confidence in the data, ensures quality and relevance, and provides a contact point for anyone with questions or comments, other aspects of the approach can sometimes produce undesirable results.

These include: datasets left unfinished or becoming static as the responsible organisation prioritises other tasks; some areas being comprehensively addressed while others have very limited data; and features such as synonyms and relationships being inconsistently applied, leading to data of uneven quality.

Moreover, many institutions still prefer to use local vocabularies for some catalogues as these contain terminology which is well understood internally. KulturNav addresses these challenges in a number of ways. The in-built improvement panel, for example, helps to reduce discrepancies in areas such as synonym usage and the ordering of hierarchical relationships.

KulturNav also provides the option to use common, more accessible and widely linked vocabularies at the same time as a local vocabulary, and in the same database. A user of Primus can choose to save a local copy of an authority record so that it can be enriched with information that pertains to that specific collection but may have little or no significance for the wider museum community, or that has to remain internal for security reasons.

Allowing this mix of local and shared authorities in the same database ensures that authorities remain relevant for local, specialist users as well as for public use. Importantly, by allowing dataset editors to accept or reject proposals to lists, we can enjoy the benefits of collaboration while ensuring vocabularies remain scholarly and authoritative. To remedy the uneven coverage of subjects and stagnant datasets we must first address the underlying cause: a lack of resources within the organisations to set aside the time necessary to complete and maintain datasets.

Recently, the Norwegian government has reinforced its commitment to the KulturNav project in the form of an Arts Council Norway grant. Under the terms of the grant, cultural institutions are being asked to take responsibility for datasets of national importance within their area of expertise and collaborate with other institutions with similar collections. The aim of the support is also to establish a framework for a sustainable model for common management of cultural heritage vocabularies. This is a step in the right direction, and we look forward to seeing the results of the grant’s implementation.

Progress to Date

As we have tried to demonstrate above, the KulturNav platform encourages collaboration around vocabularies that can open up local collections to a wider audience through links to other datasets within KulturNav and external vocabularies. Our aim is also to make this process as effortless as possible. So, how are we doing so far? In many respects, the collaborative approach has already proved a success, as is evident in the large number of published datasets that are regularly in use.

A dataset with the ‘lodometer’ showing the number of organisations that use the dataset (49), how many links there are to objects on DigitaltMuseum (295,394) and the percentage of entries that have links (25%). This information is important in establishing the datasets’ “authority”.
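The three figures shown by the ‘lodometer’ can be derived from simple counts. The sketch below assumes that each entry in a dataset carries a list of (organisation, object) links; the function name, the data layout and the sample values are hypothetical, and how KulturNav actually computes the figures is not described here.

```python
def lodometer(entries: dict[str, list[tuple[str, str]]]) -> dict[str, int]:
    """Summarise usage of a dataset.

    `entries` maps an entry identifier to its links, each link being an
    (organisation, object identifier) pair. This layout is an assumption.
    """
    organisations = {org for links in entries.values() for org, _ in links}
    total_links = sum(len(links) for links in entries.values())
    linked_entries = sum(1 for links in entries.values() if links)
    percent_linked = round(100 * linked_entries / len(entries)) if entries else 0
    return {
        "organisations": len(organisations),
        "links": total_links,
        "percent_linked": percent_linked,
    }


# Hypothetical dataset: four entries, of which one has been linked to objects.
sample = {
    "entry-a": [("museum-1", "object-1"), ("museum-2", "object-9")],
    "entry-b": [],
    "entry-c": [],
    "entry-d": [],
}
summary = lodometer(sample)
```

In this sample one entry out of four has links, so the share of linked entries is 25%, the same proportion as in the dataset shown in the figure.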

In our experience, the most promising examples of successful collaboration are those in which the initiative comes from the institutions that will use the authorities. In most cases a need is identified, but the institution concerned lacks the full competency to cover all aspects of a theme, place or type of object, and therefore seeks the help of museums with similar collections.

As an example, eight maritime museums in Sweden, Norway and Finland (Åland) identified a common need for coherent terminologies, and decided to form a working group. The collaboration produced three datasets covering watercraft types, which are widely used by the institutions that created them. Similarly, we have found that working with established regional and national networks, such as the Norwegian Institute of Bunad and Folk Costume (NBF), on lists common to a large number of organisations has created sustainable results.

Conclusion

Today’s platform is a specialised tool for managing cultural heritage vocabularies and authorities, and publishing them as linked open data. KulturNav as a project also provides the space for dialogue around how authorities should be maintained and developed for future use. Our goal has been, and continues to be, to provide a practical means for cultural heritage institutions in Norway, Sweden, and beyond, to turn their locally used terminologies into established common authorities – as well as to link the information in their catalogues to the global network of resources that make up the semantic web. There is a great opportunity to use the collaborative approach to expand the network geographically through semantic links to other open resources. We already work with Europeana to include some of the datasets on KulturNav among the platform’s dereferenceable vocabularies. We also continue to exchange links with Wikidata – currently there are 18,000 links between KulturNav and the Wikimedia project.

Finally, the many examples of collaboration around common vocabularies in Norway, Sweden and elsewhere suggest that institutions and their governing bodies agree that this is a useful way to ensure that data is interoperable across the sector, as well as accessible to a wider group of users.

However, some work remains when it comes to demonstrating the value of linked open data to those organisations that are asked to take responsibility for it, as well as to the policy makers who determine the levels of investment in it. A number of well-defined impact goals would be useful in this regard, as a means to inspire organisations to continue their collaborative efforts and encourage smaller institutions to take advantage of the possibilities offered to them in the digital age.

To us, the benefits of collaboration are obvious: data from many different sources are brought productively together, and sharing the work of development among organisations reduces the burden on any one individual. Perhaps more importantly, the act of collaboration results in a form of ‘certification’, helping institutions to ensure that the concepts they deploy are accepted within the sector. KulturNav is only one way of facilitating such collaboration. We look forward to linking up with many more.

References

Brenden Hansen, Y., D. Hensten, G.B. Pedersen, M. Bognerud (2018) Norwegian Artist Names: Authority list of artists in Norwegian art collections. CIDOC 2018. [Available at: https://bit.ly/2TDuT1P (Accessed 2 September 2020)].

DBpedia (2021). [Available at: https://wiki.dbpedia.org/ (Accessed 29 April 2021)].

Europeana (2021). [Available at: https://www.europeana.eu/en (Accessed 29 April 2021)].

SKOS (2021). [Available at: https://www.w3.org/2001/sw/wiki/SKOS (Accessed 29 April 2021)].

Store Norske Leksikon online (2021). [Available at: https://snl.no/ (Accessed 20 September 2021)].

The J. Paul Getty Trust (2021) Getty Art and Architecture Thesaurus Online. [Available at: https://www.getty.edu/research/tools/vocabularies/aat/ (Accessed 29 April 2021)].

VIAF (2021). [Available at: https://viaf.org (Accessed 29 April 2021)].

Wikidata (2021). [Available at: https://www.wikidata.org/wiki/Wikidata:Main_Page (Accessed 29 April 2021)].

Wikipedia (2021). [Available at: https://en.m.wikipedia.org/wiki/Main_Page (Accessed 29 April 2021)].


Sara Kayser

She is a museum consultant with KulturIT AS, where she works with partner museums to facilitate the integration of digital systems and provides training and support to users. She worked on and supervised several digitisation projects in Egypt from 2005 to 2018, most recently as project curator for the British Museum. Her professional interests include the digital documentation and dissemination of cultural heritage.


Ulf Bodin

He is an archaeologist who has developed solutions for digital cultural heritage since the 1990s, when he worked at the Swedish National Heritage Board and the National Museums of History in Stockholm. He has been a consultant at KulturIT AS since 2011 and is responsible for the development of KulturNav.org. His other interests are information architecture, vocabularies, authority management and Linked Open Data (LOD).