Blog

Improving Access to Archives: Navigating the RES Index

By Elliot Smith

The focus of the Research & Education Space (RES) is to improve access to public archives for use in education. RES does this by indexing the data in those archives, then exposing that index as a single, aggregated point of entry to that data.

Rather than providing end user applications, RES instead supplies a service for developers, which enables them to include archive materials in their own Powered by RES applications. Most of these applications are intended for teachers and learners; typically, those applications help users find media resources stored in public archives. Some example use cases are:

  • A geography lecturer searches for a video of glacial movement.
  • A secondary school pupil needs an image of tadpoles to include in a presentation.
  • A physics teacher looks for a web page which explains the different states of matter.


Note that the media isn’t stored in the index; only the metadata about where it is, what it’s about, who contributed to it, when it was created etc. Also note that, at the moment, most of the data in the index comes from the BBC; but we’re working with external partners to add their data, too.

The Acropolis API is the gateway to finding data in the RES index. There is a general technical document which explains how to use this API in detail. The rest of this article builds on that document, giving examples of the kind of data returned and how to interpret it.

Using the Acropolis API

The Acropolis API is developer-focused and not intended for end users. However, it does have a simple interface you can experiment with through a web browser. As an example of how to use and make sense of the API without having to write any code, we’ll walk through one of the use cases above: finding an image of tadpoles for a presentation. The steps below take you through the process.

1. Search for a keyword

Go to http://acropolis.org.uk/?q=tadpole in your browser. Note the q=tadpole part of the URL. This asks Acropolis to search its index for the term tadpole and return matching resources. Note that the term “resource” here has a formal definition from RDF, the data format used by Acropolis. By this definition, a resource can represent anything, real or imagined: a person, a place, an event, a novel, a play, a web page, a video, an image, an idea etc.

The results are information resources which contain your keyword(s), displayed as a list of titles and descriptions (if available):

image

Each link points to a web page describing a proxy resource (see below).
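The same search can be issued from code. Here is a minimal sketch in Python (standard library only; `search_url` is a hypothetical helper for illustration, not part of the Acropolis API):

```python
from urllib.parse import urlencode

# Public Acropolis endpoint, as used throughout this article.
ACROPOLIS = "http://acropolis.org.uk/"

def search_url(*keywords):
    """Build a keyword-search URL such as http://acropolis.org.uk/?q=tadpole."""
    return ACROPOLIS + "?" + urlencode({"q": " ".join(keywords)})

print(search_url("tadpole"))  # → http://acropolis.org.uk/?q=tadpole
```

Multi-word searches are encoded automatically; `search_url("Carnival", "of", "Souls")` yields `?q=Carnival+of+Souls`.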

2. Fetch the proxies

When a search is run against Acropolis, the result contains links to proxy resources (proxies). Proxies are created by Acropolis and are effectively aggregates of resources from different archives. For example, if both BBC Images and BBC Teach have data about the concept tadpole, Acropolis combines that data into a single proxy which stands for both tadpole in BBC Images and tadpole in BBC Teach. Media from both of those archives are then associated with the proxy, as you can see in the real Acropolis proxy for tadpole.

A client (whether person or program) using the index can follow links from tadpole proxies to data and media about the concept tadpole in all of the archives indexed by Acropolis. This simplifies the task of finding media across multiple archives: a client doesn’t have to individually search each archive, and can instead just search Acropolis.

For more details about how Acropolis generates proxy resources, see this blog post which explains aggregation.

In the case of our example search, clicking the first link takes us to the tadpole proxy page:

image

Notice that one section of the page is expanded. This section of the page contains data about the resource which exactly matches the URL in the browser address bar (http://acropolis.org.uk/4c56607b4e1b4fe8ac29a1ad1b245246#id); i.e. the proxy representing the concept of tadpole, mostly derived from data extracted from DBPediaLite.

The two columns inside the expandable section display statements about the proxy. For example:

dct:description "larva of amphibians" [en]
rdfs:seeAlso /15f8690cf16148f88db57bd79fe00595#id

You can read these statements as <proxy> has a <left-hand column value> property with a value <right-hand column value>, e.g.

the tadpole proxy has a dct:description property with a value '"larva of amphibians" [en]'

The property dct:description refers to the term description from the dct vocabulary; clicking on dct:description will take you to the web page which describes that vocabulary (DCMI Metadata Terms). So what the statement really says is:

the tadpole proxy has a description property (as defined in the DCMI Metadata Terms vocabulary) with a value '"larva of amphibians" [en]'

The reason for specifying the vocabulary for each property is to reduce ambiguity. Because the DCMI Metadata Terms vocabulary formally defines the meaning of description, a machine reading dct:description can differentiate this from other meanings of description in other vocabularies.
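Mechanically, a prefixed name such as dct:description is just shorthand for a full URI: the prefix expands to a namespace, and the local part is appended. A toy sketch (the two mappings below are the standard namespaces for dct and rdfs):

```python
# Prefix-to-namespace mappings for two vocabularies mentioned in this article.
PREFIXES = {
    "dct": "http://purl.org/dc/terms/",              # DCMI Metadata Terms
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}

def expand(qname):
    """Expand a prefixed name, e.g. 'dct:description', to its full URI."""
    prefix, _, local = qname.partition(":")
    return PREFIXES[prefix] + local

print(expand("dct:description"))  # → http://purl.org/dc/terms/description
```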

Given a set of result proxies, a client can use their properties to find media players and assets for the concepts they represent, as described in the next section.

3. Find the media

Finding the media linked to a resource requires some understanding of how Acropolis organises its index. Acropolis uses two main properties to associate resources with media:

  • mrss:player

This links a resource to a player: an HTML page (typically) which embeds some media, and which may provide controls for playing it and/or (optionally) some metadata about it (e.g. caption, copyright declaration, links to alternative sizes or formats). An iPlayer page is an example of a player, as are a YouTube video page and a Flickr image page.

The player for a piece of media may prompt for authentication, to prevent access by unauthorised users. (Note that Acropolis doesn’t require any authentication and relies on the server hosting the player to do this.)


  • mrss:content

This links a resource to a digital asset (e.g. an audio, image or video file). The asset can be directly embedded into an application through its URI. For example, this resource has an mrss:content statement in the index:

mrss:content http://remarc-assets.pilots.bbcconnectedstudio.co.uk/images/1980_TV&Radio_CrackerjackA.jpg

The URI is a direct link to a JPEG image asset which can be embedded directly in an application. As with players, Acropolis doesn’t enforce any access control; but a client can expect that an asset linked to using mrss:content has no access controls and can be freely used (with the caveat that its copyright must be respected).

For the purposes of finding media, a client can look for mrss:player and mrss:content statements attached to any proxies it has retrieved. Unfortunately, the tadpole proxy we have at the moment has no mrss:player or mrss:content statements. However, in such cases, we can fetch more proxies which are related to the resources we already have, in the hope that they will have mrss:player or mrss:content properties.

The most relevant are likely to be those associated with the proxy through rdfs:seeAlso statements: these resources have some kind of relationship with the proxy, such as representing the same entity, having the proxy as a subject, or being about the proxy.

For example, the first rdfs:seeAlso link for the tadpole proxy goes to this resource…

image

…which is a photograph of tadpoles.

The photo relates to our original tadpole proxy via rdfs:seeAlso. This is because the photo has the DBPedia resource for Tadpole as its subject…

dct:subject http://dbpedia.org/resource/Tadpole

…and our original tadpole proxy is the same as (owl:sameAs) that DBPedia resource…

http://acropolis.org.uk/4c56607b4e1b4fe8ac29a1ad1b245246#id owl:sameAs http://dbpedia.org/resource/Tadpole

In diagrammatic form, the relationships look like this:

image

As you can see, there is an mrss:player statement on the photo proxy:

mrss:player http://bbcimages.acropolis.org.uk/14605105/player

This URI is for a player page which displays an image of tadpoles (you may need to accept a licence agreement before you can see it).

And, finally, we’ve achieved our goal of finding an image of some tadpoles.
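In code, the walk we just did by hand (look for media statements on a proxy, then fall back to its rdfs:seeAlso neighbours) might look like the sketch below. The triples are hand-written stand-ins for parsed RDF; a real client would fetch and parse Turtle from Acropolis.

```python
MEDIA_PROPS = ("mrss:player", "mrss:content")

def find_media(triples, subject):
    """Return media links on `subject`; if none, try its rdfs:seeAlso targets."""
    direct = [o for s, p, o in triples if s == subject and p in MEDIA_PROPS]
    if direct:
        return direct
    related = [o for s, p, o in triples if s == subject and p == "rdfs:seeAlso"]
    return [o for r in related
            for s, p, o in triples if s == r and p in MEDIA_PROPS]

# Hand-written triples mirroring the tadpole example above.
triples = [
    ("tadpole-proxy", "rdfs:seeAlso", "photo-proxy"),
    ("photo-proxy", "dct:subject", "http://dbpedia.org/resource/Tadpole"),
    ("photo-proxy", "mrss:player",
     "http://bbcimages.acropolis.org.uk/14605105/player"),
]
print(find_media(triples, "tadpole-proxy"))
# → ['http://bbcimages.acropolis.org.uk/14605105/player']
```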

Conclusions

The above only covers a small part of the capabilities of the Acropolis API, but should provide some clues to finding useful media before you write any code. The same steps can be used programmatically to find media, the main difference being that your code would fetch and parse RDF rather than HTML. But that’s a topic for another article.
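As a first step down that programmatic route, a client can ask Acropolis for Turtle rather than HTML when requesting a resource. A sketch using only the standard library (that the server honours an Accept: text/turtle header is an assumption here, in line with common linked data practice; appending .ttl to a resource URL is an alternative):

```python
from urllib.request import Request

def rdf_request(url):
    """Prepare a request asking the server for Turtle rather than HTML."""
    return Request(url, headers={"Accept": "text/turtle"})

req = rdf_request("http://acropolis.org.uk/?q=tadpole")
# from urllib.request import urlopen
# body = urlopen(req).read().decode("utf-8")  # then parse with an RDF library
```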

Some developers have already built applications which use data from Acropolis, using techniques similar to these. You can find out more about them on the RES website.

If you need any assistance getting started with the Acropolis API, please contact the RES team.

RES: A Presentation & Demonstration Event

By Alison Kelly and Aisling Jarvis

Last month we held an event for senior leaders from the education, cultural and developer communities to present the achievements of the Research & Education Space (RES) so far and demonstrate some of the educational products currently being powered by RES. As we gathered in the majestic rooms of The Royal Society, where some of the greatest pioneering luminaries of the past four centuries have been Fellows, it felt like a seminal moment for RES.

image

We wanted to offer the delegates an opportunity to listen to the experiences stakeholders have had with RES across the board; within the edtech community, within schools, from the product developers and also the influencers within cultural organisations. We asked the eight product developer organisations who have been working with RES to come and demonstrate their products, and asked others to present their experiences of working with RES. We managed to host an interesting and eclectic mix of products including demonstrations from Learning on Screen, Imagen, Gooii, Amplify Design, the BBC Shakespeare Archive Resource and Planet eStream. We also heard presentations from Craig Smith (Business Development Manager, Gooii), Richard Light (Principal at Richard Light Consulting), Alex Morris (Communications Officer, Learning on Screen), Professor John Ellis (Professor of Media Arts, Royal Holloway University), Andrew Milburn (Co-Founder at Planet eStream), Eberhard Frank (Director, Amplify Design), English Teacher Anita Ark and our very own RES Product Manager Jake Berger.

Anita Ark, Head of English at Eastbury Community School in Barking gave an inspiring talk on how the BBC’s Shakespeare Archive Resource, powered by RES, has helped her students to better understand Shakespeare:

“Having more visual-audio resources has definitely changed the way we teach Shakespeare. It’s a really good starting point for planning your lessons and planning schemes of learning. It’s easy to access and you can use the resources whenever you like.”

image

For RES to be as successful as it can be, we want to make as much content available to students and educators as possible, and we have sought support from UK cultural organisations to work with them to get their digital archives into RES to be made available to the education sector. After attending the event Gill Webber (Executive Director, Content & Programmes at the Imperial War Museum) commented:

“Giving access to Imperial War Museums' fantastic content to more students, teachers and academics is high on our agenda. The RES seminar gave real insight into how RES can help us with this ambition.  We will be working with RES to ensure our content is in a format compatible with the platform to ensure easy access to all.”

Two of the companies we asked to present and demonstrate their products at the event, Gooii and Amplify Design, developed RES prototypes following a BBC Connected Studio workshop in December 2014. Craig Smith from Gooii demonstrated one of the first products to be built using RES, RESBuilder, and he had this to say of his experience:

“Just a quick word of thanks and to say how much we enjoyed attending the RES event and to thank you for the opportunity to present RESBuilder, one of the first products to be built for the open RES platform. Both RESBuilder and the platform were well received by the attendees and we enjoyed presenting the product with its unique approach to Concept based mining.”

image

The keynote speaker, Matthew Postgate (Chief Technology & Product Officer at the BBC) speaking at the event said:

“One of the duties of the BBC is to develop enabling, open technology. I am very proud of what the Research & Education Space has achieved, working in partnership with educational, cultural and other UK public service institutions. We want to continue to put our technology and digital capabilities at their service, and at the service of the wider industry, to benefit all.”

image

RES is still evolving – the team are continuing to work with education, cultural partners and product developers to grow the platform. If you would like to join us, please contact Jake Berger at jake.berger@bbc.co.uk

image

RES – where now?

By Hilary Bishop, Director for RES, BBC

I feel I haven’t posted for a while - we’ve been busy, so time for an update from me.

The RES project phase ended in March and the team are now part of business as usual. We’re still working with education, cultural partners and product developers to develop the RES open platform to make connections between different sets of linked open data. 

We’re also working with three companies – Planet eStream, Gooii and Amplify – to make archive material available via their online educational products – more of that later.

New BBC Collections
The BBC collections and catalogues now available in the platform include:

  • 50,000 images from the BBC photo library, which have been handpicked for use by learners and teachers and include images of many of the most important people, places and events from the last 90 years.
  • More than 1,500 video clips from BBC Teach which offers inspiring primary and secondary video teaching resources covering all of the curriculum subjects, across all of the Key Stages. We have published enriched metadata about each clip to make it easier to find clips relevant to a particular subject area or lesson plan.
  • 1,500 video clips, audio clips and images, covering each decade from the 1930s to the 2000s from BBC RemArc, a Reminiscence Archive developed to benefit dementia patients and their carers. Much of this content, which includes ‘Events’, ‘People’, ‘Leisure’, ‘Sport’ and ‘Childhood’, hasn’t been seen or heard for many decades, and offers useful historical and cultural context for many curriculum subjects.

Collections coming soon

In the next few months we’ll have:

  • The British Library’s Discovering Literature – featuring over 8,000 pages of collection items about more than 20 authors
  • BBC Wildlife – 3,000 video clips and information covering 1,700 species of animal, from the BBC Wildlife Finder Website
  • BBC Domesday - 147,000 articles and 23,000 images from the 1986 BBC Domesday project, which offered a snapshot of the life and times of the people of Britain, 900 years after the original Domesday Book.

These are in addition to the library of over 1 million BBC TV and Radio programmes which ERA authorised providers of off-air capture (like BoB, Planet eStream Connect, screenacademy and ClickView) offer schools, colleges and universities. The BBC Shakespeare Archive Resource of hundreds of TV and radio programmes from the BBC’s Shakespeare collection also lives on.

Developers

One of the most exciting areas of progress this year has been to see how companies are making use of the RES platform in the educational tools they make. For instance, Planet eStream have included additional functionality to their product which allows users to search the BBC image archive via RES to enhance lesson plans.

image

Two of the companies, Gooii and Amplify, who developed RES prototypes following a BBC Connected Studio in December 2014, have also re-imagined their educational products using the new content in the RES aggregator. RESBuilder, built by Gooii, uses an innovative PDF search function that enables users to select text and images from any document, with the system displaying all related media based on these organic searches. 

image

Amplify has developed Tunnels, a tool developed around the concept of journeys to compile and explore content along defined themes. In addition, True Teach, a search engine for teachers is currently surfacing BBC Shakespeare content for educational use, and is looking into adding other RES media to their search.

image

Internally the BBC is having a good look at data and using some of the lessons learnt in, and technology built for the RES project, to look at how it uses its own data. Progress feels tangible and if you want to find out more about educational products, how to take part with your collection or how to build on the RES platform – please visit res.space.

Planet eStream – Our Journey with RES

By John Jackson, Technical Services Director, Planet eStream Team

The Research & Education Space (RES) project first came onto our radar in 2013, when we were exhibiting at the annual Standing Conference for Heads of Media Services (SCHOMS) conference at the University of Aberdeen. During the conference, the RES team presented the project to digital leaders in Further and Higher Education. The simple concept of providing an aggregator that lets educators search for openly accessible content they can reliably use in their learning resources was well received and, from our perspective, thought provoking.

We could foresee even at this early stage in the project, that some form of integration with our own secure media platform would further expand the potential for digital learning provision for our educators.

Since 2013, we tracked the progress of RES with interest, attending RES development events and promoting the project to our educational client base. We wanted to raise awareness of the educationally valuable work that the BBC were working on and gain some insights into how educators felt that they could ultimately make use of this resource, in particular, in conjunction with their Planet eStream platform.

Our journey with RES started during 2015 when we collaborated with the BBC RES team to provide our educational customers with free access to the BBC Digital archive of TV and radio programmes dating from 2007. In addition, we obtained access to the fantastic broadcast content of over 500 Shakespeare-related TV and radio programmes from the BBC Shakespeare Archive dating back to the 1950s, all accessible to UK education, alongside the Freeview TV archives of our user group via our Planet eStream Connect Service. This service was launched in April 2016 and has been a great success; our educators have been thrilled to be able to integrate this high quality, educationally relevant content with their learning resources. This was just phase one in our journey and in preparation for phase two, we started to think about how RES could be used to complement both the BBC broadcast video resources and the user generated content from our educators.

Immediately, our team identified some initial features within Planet eStream that lent themselves to being expanded and enhanced by the power of RES…

Slideshows

The type of content that can be found via RES firmly dictates how that information can be presented. One of the first collections available was the BBC Image Archive of over 47,000 still images. Planet eStream already enables users to create slideshows associated with video content from images and PowerPoint presentations, so it made sense to integrate RES to enhance this feature and deliver RES powered slideshows as an option.

We previewed a prototype of this integration at the BETT show in January 2017 and raised awareness of the potential for RES in education; it was a great platform to showcase the potential of RES to educators and there was a real buzz around the possibilities that our integration offered.

image

We are continuing our work on this prototype with the aim to introduce the feature formally in our Q2 general release.

Related Media and Interactivity

As the collections from other RES partner organisations such as The British Museum, Natural History Museum, Wellcome Trust, British Library, The National Archives and others are added to RES, the scope for its use expands exponentially, as does our integration journey.

As a secure media platform, Planet eStream enables educators and students to use video content in an environment tailored for learning. With RES, we will be able to introduce the ability to search RES and add associated content to Planet eStream videos via our related media feature. This provides a list of associated and relevant media from various sources that relate to a particular video, offering a very comprehensive overall learning resource that an end user can access at their own pace.

The ability to search RES for content via Planet eStream means we can introduce this to other elements of Planet eStream like lesson plans and, more excitingly, our interactive videos and quizzes.

image

Adding interactivity to video content allows for educators to turn passive viewing into effective learning, as it requires end users to answer questions or look at related media before progressing through the video material itself. Using the timeline of a video to provide a workflow of learning can be a useful way of making learning more appealing and engaging for students. Educators will be able to expand the information available via this feature by adding content found via RES to augment what is being communicated and taught.

Phase two of our roadmap with RES will provide our educators with an even more fantastic toolset to create comprehensive learning resources, supporting all teaching methods including flipped and blended learning. Once exposed to the education community we look forward to the feedback that will allow us to progress the development of this feature to work for educators.

We see almost endless possibilities to harness the power of RES to complement Planet eStream and we look forward to the next legs of the journey!

A Review of the Year

by Hilary Bishop, BBC Project Director for RES

image

2016, a tricky year in many respects, draws to a close, but on the Research and Education Space (RES) project it’s time to reflect on quite a bit of progress. The partnership initiative between the BBC, Jisc and Learning on Screen, which uses technology innovatively to improve access to our public archives in order to enrich the materials available for teaching and study in the UK, has definitely had some highlights: we’ve worked closely with educational product developers, engaged more with digital teams from galleries, libraries and museums (GLAM), made more BBC content available and won an award!

Our Tech team have been busy preparing data for indexing by the RES platform both from the BBC’s own programme library and the fantastic digitised collections held by other UK public institutions. The BBC collections and datasets that will be made available in RES in the New Year include:

  • 47,000 images from the BBC library
  • 1,650 classroom clips from BBC Teach 
  • 1,500 media items from BBC RemArc (a Reminiscence Archive developed to benefit dementia patients and their carers)
  • 10,000 BBC Playable programmes, including BBC World Service content.

This is in addition to the library of over 1 million BBC TV and Radio programmes that can be permanently served into educational establishments via ERA authorised providers of off-air capture in to schools, colleges and universities, such as BoB, Planet eStream Connect and ClickView.

We’ve also made improvements to the BBC Shakespeare Archive Resource launched a year ago to mark 400 years since Shakespeare’s death, which provides schools, colleges and universities across the UK with access to hundreds of BBC television and radio broadcasts of Shakespeare’s plays, sonnets and documentaries about Shakespeare. We’ve added subtitles to all the television programmes, which means students, teachers and academics can now watch the plays and sonnets of Shakespeare together with their corresponding transcript. As much of the content pre-dates the use of subtitles, we are really excited to be offering users the chance to watch many BBC adaptations, dating from the 1950s, with subtitles for the very first time.

We’ve also been working closely this year with teams from the GLAM sector, who are busy digitising their collections, to help make their data compatible with RES.

We’ve also had lots of meetings with product developers to see how they can use the RES aggregator to enrich the educational tools they make. Several companies are already building educational products ‘powered by RES’, including the RES prototypes developed by commercial companies Amplify and Gooii, following a BBC Connected Studio in December 2014; and True Teach, a search engine for teachers which is currently surfacing BBC Shakespeare content  for educational use. We expect other companies to begin work in 2017 and we’re looking forward to demonstrating how you can make use of semantically structured open data in RES at BETT 2017 in January. 

The biggest surprise of the year was definitely winning the European Linked Data Contest award for Linked Enterprise Data at this year’s Semantics Conference in Leipzig in recognition of our innovative work on building an open semantic web platform that uses linked open data to connect cultural heritage collections for use in UK education and research. In awarding the prize, the international jury of ambassadors said: 

“RES demonstrates the suitability and potential impact of Linked Data as a medium for delivering large-scale educational resources”.

image

Lots of smiles all around and a very nice glass trophy for the office! 

Looking ahead to 2017, there is widespread support both within the BBC and with our education, cultural and heritage partners for continuing to develop RES as an open platform for publishing linked open data. The BBC will use the lessons learnt from the RES project to look at how it models its own data and continue to explore the benefits of developing education propositions powered by RES with 3rd parties.

Of course none of the year’s achievements would have been possible without the support, commitment and enthusiasm of the many people who have contributed along the way: our project partners Learning on Screen (who have had an exciting year with their re-brand and launch of the new BoB) and Jisc; the collection holder organisations who have shared their digitised collections as linked open data; all those who have helped champion an understanding of what linked open data means to public sector organisations; and the project team, whose hard work, unerring belief, passion and technical wizardry have made this year the success it has been.

I wish you all a joyful festive break and I look forward to working with you again in 2017. In the meantime please enjoy the RES Advent Calendar which showcases some of the digitised collections now being indexed by the platform.  And don’t forget to keep visiting our blog and follow @RES_Project on Twitter for all the latest updates.

What does RES do with your data?

by Elliot Smith

In the third of a series of posts from members of the RES Technical Team, Data Architect Dr Elliot Smith explains in more detail what RES does with your data.

Since joining the RES team a couple of months ago I have been preparing data for indexing by RES. I thought it would be useful to share with you what I have learnt. This post therefore describes what happens to RDF when it is indexed by RES. It doesn’t go into the detail of the indexing pipeline itself; instead, it explains how RES produces proxy resources which embellish and aggregate data from multiple sources.

The RDF samples used are broadly similar to real data in RES. They come from two (pretend) online movie datasets available as Linked Open Data. Each dataset contains a description of the movie Carnival of Souls, but with slightly different data:

  1. The Monster Movie database (http://minl.co.uk/)
    This site uses the BBC Programmes Ontology to describe movies, and includes links to movies on archive.org where they are available. The site metadata is published under the OGL.
  2. The Horror Movies database (http://scary.townx.org/)
    This site uses the schema.org ontology to describe movies, with links to YouTube clips of the movies where they are available. The site metadata is published under a public domain licence.

In the rest of this article, examples of the data are given in Turtle format. The following prefixes are used in these examples:

@prefix dct: <http://purl.org/dc/terms/> .
@prefix dcmitype: <http://purl.org/dc/dcmitype/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix formats: <http://www.w3.org/ns/formats/> .
@prefix frbr: <http://purl.org/vocab/frbr/core#> .
@prefix mrss: <http://search.yahoo.com/mrss/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix po: <http://purl.org/ontology/po/> .
@prefix powder: <http://www.w3.org/2007/05/powder-s#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix schema: <http://schema.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

Note that the examples contain URLs with the host name res to represent resources created by RES. These were produced by running a development version of RES cloned from the Acropolis git repo. On the live RES server, these URLs would have the host name acropolis.org.uk, e.g. http://res/3405331994fc4fe890fc12c4823137d9#id on my development server would be http://acropolis.org.uk/3405331994fc4fe890fc12c4823137d9#id on the live server.

If you are unfamiliar with RDF and Linked Open Data, Inside Acropolis provides some background.

Ingesting data from minl.co.uk

The minl.co.uk site has a resource at the URI http://minl.co.uk/carnival_of_souls#id, representing the movie “Carnival of Souls”:

<http://minl.co.uk/carnival_of_souls#id>
  a po:Episode ;
  dct:title "Carnival of Souls"@en ;
  po:director <http://dbpedia.org/resource/Herk_Harvey> ;
  owl:sameAs <http://dbpedia.org/resource/Carnival_of_Souls> ;
  dct:license <http://creativecommons.org/publicdomain/zero/1.0/> ;
  rdfs:seeAlso <http://www.imdb.com/title/tt0055830/> ;
  mrss:player <https://archive.org/details/CarnivalofSouls> .

This resource represents data about the movie, with links to:

  • The DBPedia resource for the movie (owl:sameAs).
  • The DBPedia resource for the movie’s director (po:director).
  • The IMDB page for the movie (rdfs:seeAlso).
  • A player for the movie on archive.org (mrss:player).

However, note that minl.co.uk doesn’t have all the data about the movie: for example, it is missing the running time and the year it was released.

As this data is available on the web, the minl.co.uk site requests that RES index it. The RES team then queues the minl.co.uk site for crawling; as the crawl progresses, the resources on the site are fetched and “ingested” by RES, becoming part of its index. (Contact Jake Berger, RES Product Manager at jake.berger@bbc.co.uk if you have a Linked Open Data site you’d like included in RES’s index.)

After being ingested by RES, how would someone get at the movie data? The main entry point is a search, using a URL like http://acropolis.org.uk/?q=Carnival+of+Souls, where the q= querystring variable specifies a search string (“Carnival of Souls”). On my development server, this brings up a list of results, one of which is a Creative Work corresponding to the movie:

image
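The same search can be issued programmatically. Here is a minimal sketch using only Python's standard library; the host name and the q= querystring variable are taken from the URL above, and the rest is glue:

```python
from urllib.parse import urlencode

def res_search_url(query, host="http://acropolis.org.uk"):
    """Build a RES search URL; the q= querystring variable carries the search string."""
    return f"{host}/?{urlencode({'q': query})}"

print(res_search_url("Carnival of Souls"))
# → http://acropolis.org.uk/?q=Carnival+of+Souls
```

Fetching that URL in a browser returns the results page shown above; as we'll see shortly, appending .ttl to a result's own URL requests the underlying Turtle instead.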

This is the first example of how RES indexes the data it ingests: a resource is put into one or more categories (as shown on the RES home page) if it is of a particular type. In this case, because the resource for the movie was marked as a po:Episode, it was added to RES’s “Creative Works” category. The resource now appears under that category in RES’s web user interface. For example, here’s the page shown for Creative Works on my development server:

image

Clicking on the link for “Carnival of Souls” on this page takes us to the explorer view for it:

image

This is a human-readable view of the RDF data about the movie. Note that the address for this view is http://res/3405331994fc4fe890fc12c4823137d9#id. This URI is important, as it’s the URI of the proxy resource RES created for the movie “Carnival of Souls”. When RES indexes RDF, it creates one of these proxy resources for each resource in the original RDF. The proxy is RES’s own version of the original data: a slimmed-down copy of the original, merged with and linked to other data already in the index. RES maintains the proxy as well as a copy of the original RDF linked to it.

To see the full RDF for the proxy and the original resource, we can go to the Turtle link for the “Carnival of Souls” Creative Work (http://res/3405331994fc4fe890fc12c4823137d9.ttl). Here’s an extract:

<http://res/3405331994fc4fe890fc12c4823137d9#id>
  a po:Episode, frbr:Work ;
  rdfs:label "Carnival of Souls"@en ;
  mrss:player <https://archive.org/details/CarnivalofSouls> ;
  dct:rights <http://creativecommons.org/publicdomain/zero/1.0/> .

<http://minl.co.uk/carnival_of_souls#id>
  a po:Episode ;
  dct:license <http://creativecommons.org/publicdomain/zero/1.0/> ;
  dct:title "Carnival of Souls"@en ;
  mrss:player <https://archive.org/details/CarnivalofSouls> ;
  rdfs:seeAlso <http://www.imdb.com/title/tt0055830/> ;
  owl:sameAs <http://dbpedia.org/resource/Carnival_of_Souls>,
    <http://res/3405331994fc4fe890fc12c4823137d9#id> .

RES creates the proxy resource as a locus for collating data from disparate datasets. At the moment, the proxy only stands for one resource (http://minl.co.uk/carnival_of_souls#id), so it’s not really an aggregate (yet). But note that the proxy has inherited the RDF type po:Episode from the ingested resource. In addition, RES has added its first embellishment: an additional type, frbr:Work, which RES has inferred and added to the proxy.

This embellishment came about via rules in RES which map from known types and properties to a smaller group of types and properties which have special meaning for RES. One of these rules states that anything which is a po:Episode should also be stored as a frbr:Work in RES’s index. Once a resource references this special set of types and properties, RES can index and present that resource more effectively, e.g. by including it in a RES category like “Creative Works”.

In addition, where an ingested resource has properties from this special set, those properties are mirrored onto the proxy. In this case, we can see that the proxy has the same mrss:player statements as the original. Other statements may be mapped between the original and the proxy: the value retained, but the property type changed. One example here is the dct:rights statement in the proxy, which was mapped from the dct:license statement in the original. Again, this is done via RES’s mapping rules, which specify that a dct:license statement on an ingested resource should be mapped to a dct:rights statement on its proxy. Similarly, dct:title in the original has been mapped to rdfs:label on the proxy.

Finally, when RES stores the original resource, it only retains statements which it considers “useful”; any other statements are stripped from the original resource. In the case of the minl.co.uk resource, the only statement which is stripped is the po:director one.
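The four behaviours just described (inferring, mirroring, mapping and stripping) can be modelled as a small rule table applied to the ingested statements. This is purely an illustrative sketch of the mapping behaviour, not the actual Acropolis implementation:

```python
# Illustrative sketch of RES-style mapping rules; not the real Acropolis code.
# Type rule: an ingested type implies an extra, inferred type on the proxy.
TYPE_RULES = {"po:Episode": "frbr:Work"}
# Property rules: map a property to its proxy equivalent; a property that
# maps to itself is simply mirrored. Unlisted properties (e.g. po:director)
# are stripped.
PROPERTY_RULES = {
    "dct:license": "dct:rights",   # mapped
    "dct:title": "rdfs:label",     # mapped
    "mrss:player": "mrss:player",  # mirrored
}

def build_proxy(original):
    """Derive proxy statements from an ingested resource (predicate -> values)."""
    proxy = {"rdf:type": []}
    for t in original.get("rdf:type", []):
        proxy["rdf:type"].append(t)                  # mirror the type...
        if t in TYPE_RULES:
            proxy["rdf:type"].append(TYPE_RULES[t])  # ...and add the inferred one
    for pred, values in original.items():
        if pred in PROPERTY_RULES:
            proxy.setdefault(PROPERTY_RULES[pred], []).extend(values)
    return proxy

minl = {
    "rdf:type": ["po:Episode"],
    "dct:title": ['"Carnival of Souls"@en'],
    "dct:license": ["http://creativecommons.org/publicdomain/zero/1.0/"],
    "mrss:player": ["https://archive.org/details/CarnivalofSouls"],
    "po:director": ["http://dbpedia.org/resource/Herk_Harvey"],  # will be stripped
}
proxy = build_proxy(minl)
```

Running build_proxy over the minl.co.uk statements reproduces the proxy shown in the Turtle extract: po:Episode gains frbr:Work, dct:license becomes dct:rights, dct:title becomes rdfs:label, mrss:player is carried across unchanged, and po:director disappears.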

The diagram below shows how the original resource and its proxy are related, and how RES has inferred, mirrored, mapped and stripped the RDF statements:

image

RES also uses aggregation to track relationships between resources, especially owl:sameAs relationships. This relationship specifies that two resources are equivalent to each other. For example, the data we ingested had this statement in it:

<http://minl.co.uk/carnival_of_souls#id>
  ...
  owl:sameAs <http://dbpedia.org/resource/Carnival_of_Souls> ;

This states that the resource identified by http://minl.co.uk/carnival_of_souls#id is the same as the resource identified by http://dbpedia.org/resource/Carnival_of_Souls. However, because RES has added a proxy resource for the movie, the original resource gets an additional owl:sameAs statement to show its relationship to the proxy:

<http://minl.co.uk/carnival_of_souls#id>
  a po:Episode ;
  ...
  owl:sameAs <http://dbpedia.org/resource/Carnival_of_Souls>,
    <http://res/3405331994fc4fe890fc12c4823137d9#id> .

This becomes even more useful when RES ingests resources from other datasets; in the next section, we’ll see what happens in this situation.

More data: “Carnival of Souls” from scary.townx.org

After ingesting the data from minl.co.uk, another site, scary.townx.org, requests that its data be ingested by RES. When RES gets round to crawling scary.townx.org, it finds another (slightly different) resource for “Carnival of Souls”:

<http://scary.townx.org/b08e1de8#movie>
  a schema:Movie, frbr:Work ;
  schema:name "Carnival of Souls"@en-gb ;
  schema:duration "PT1H24M" ;
  schema:dateCreated "1962"^^schema:Date ;
  owl:sameAs <http://dbpedia.org/resource/Carnival_of_Souls> ;
  mrss:player <https://www.youtube.com/watch?v=7JIt_pNW_3w> .

This is similar to the resource from minl.co.uk, but with the duration and creation date of the movie and no director. It also has a link to a player for the movie on YouTube.

If we return to the Creative Works view after this new resource has been ingested, RES will still only have a single resource for the movie. This is because when RES ingested the resource from scary.townx.org, it already “knew”:

<http://minl.co.uk/carnival_of_souls#id> owl:sameAs
  <http://dbpedia.org/resource/Carnival_of_Souls> .

Now, because the new RDF from scary.townx.org states…

<http://scary.townx.org/b08e1de8#movie> owl:sameAs
  <http://dbpedia.org/resource/Carnival_of_Souls> .

…RES is able to infer that http://minl.co.uk/carnival_of_souls#id and http://scary.townx.org/b08e1de8#movie are also owl:sameAs each other by the logical rule:

if A is the same as C
and B is the same as C
then A is the same as B
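This rule is just the transitivity of equivalence, and can be sketched as a tiny union-find over owl:sameAs pairs. This is an illustration only, not how Acropolis is actually implemented:

```python
# Toy owl:sameAs reasoner: resources linked by any chain of sameAs
# statements end up in the same equivalence class (union-find).
parent = {}

def find(x):
    """Return the canonical representative of x's equivalence class."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x

def same_as(a, b):
    """Record an owl:sameAs statement between a and b."""
    parent[find(a)] = find(b)

A = "http://minl.co.uk/carnival_of_souls#id"
B = "http://scary.townx.org/b08e1de8#movie"
C = "http://dbpedia.org/resource/Carnival_of_Souls"

same_as(A, C)  # stated by minl.co.uk
same_as(B, C)  # stated by scary.townx.org

print(find(A) == find(B))  # → True: A owl:sameAs B is inferred
```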

This means that RES doesn’t need to create a new resource for the data from scary.townx.org; instead, it can just add new statements to the proxy resource it already has, describing the equivalence between the new resource http://scary.townx.org/b08e1de8#movie and the existing proxy resource. In practice, this is done via owl:sameAs statements which connect the resource to the proxy. We can see these by returning to the proxy at http://res/3405331994fc4fe890fc12c4823137d9#id:

<http://res/3405331994fc4fe890fc12c4823137d9#id>
  a po:Episode, frbr:Work, schema:Movie ;
  rdfs:label "Carnival of Souls"@en, "Carnival of Souls"@en-gb ;
  dct:rights <http://creativecommons.org/publicdomain/zero/1.0/> ;
  mrss:player <https://archive.org/details/CarnivalofSouls>,
    <https://www.youtube.com/watch?v=7JIt_pNW_3w> .

<http://minl.co.uk/carnival_of_souls#id>
  a po:Episode ;
  dct:title "Carnival of Souls"@en ;
  dct:license <http://creativecommons.org/publicdomain/zero/1.0/> ;
  mrss:player <https://archive.org/details/CarnivalofSouls> ;
  rdfs:seeAlso <http://www.imdb.com/title/tt0055830/> ;
  owl:sameAs <http://dbpedia.org/resource/Carnival_of_Souls>,
    <http://res/3405331994fc4fe890fc12c4823137d9#id> .

<http://scary.townx.org/b08e1de8#movie>
  a frbr:Work, schema:Movie ;
  schema:name "Carnival of Souls"@en-gb ;
  mrss:player <https://www.youtube.com/watch?v=7JIt_pNW_3w> ;
  owl:sameAs <http://dbpedia.org/resource/Carnival_of_Souls>,
    <http://res/3405331994fc4fe890fc12c4823137d9#id> .

Note that the two indexed resources, http://minl.co.uk/carnival_of_souls#id and http://scary.townx.org/b08e1de8#movie, now reference the same DBPedia resource via owl:sameAs; they both also reference the proxy resource, http://res/3405331994fc4fe890fc12c4823137d9#id.

RES has done the other types of embellishment it did for the minl.co.uk resource (statements have been inferred, mirrored, and mapped). Statements have been added to the proxy which reference the new scary.townx.org resource, in particular:

  • The proxy now has two mrss:player statements, one mirrored from each aggregated resource:
    • <https://archive.org/details/CarnivalofSouls> - from minl.co.uk
    • <https://www.youtube.com/watch?v=7JIt_pNW_3w> - from scary.townx.org
  • The proxy has two rdfs:label statements, one from each aggregated resource:
    • “Carnival of Souls”@en - from minl.co.uk
    • “Carnival of Souls”@en-gb - from scary.townx.org
  • In the case of the scary.townx.org resource, RES mapped the schema:name statement to an rdfs:label statement on the proxy resource. The resulting label is treated as separate from the “Carnival of Souls”@en label because it carries a different language tag, “en-gb”, so the proxy ends up with two labels.
  • RES has mirrored the schema:Movie type from the scary.townx.org resource onto the proxy.

This demonstrates how RES becomes more useful as it ingests more data: the proxy is now an aggregate of the useful data from the two original resources. A user interested in finding media related to the film “Carnival of Souls” could just consult the proxy resource to find it, rather than the two sites which provided the original data.

The proxy also contains data about the copyright on the movie, as derived from one of the aggregated resources, and labels in any languages it knows about: all of which can help when displaying data about a movie. In addition, the original resources are returned with the proxy, so the user can see the resources which were aggregated into it.

This article has glossed over a lot of the subtlety of RES, and some of its major features, such as filtering search results by audience. But it has hopefully explained how RES enriches the data added to its index, by embellishing and aggregating it into accessible, useful RDF resources. The richness of these resources will only improve as more datasets are processed and added to RES’s index.

RES Advent Calendar: Connecting Christmas

image

We are celebrating the start of December with the first ever Research and Education Space (RES) Advent Calendar.

For the next 24 days, discover how RES is connecting the past to inspire new ways of learning. Every day a new snowflake will float onto one of the linked balls which, when opened, will reveal one of the digital treasures from the digitised collections of the UK’s publicly-held archives.

We hope you enjoy our connected Christmas countdown at res.space/advent

Subtitling Shakespeare

By Mark Macey, Education Engagement Manager, RES

image

When we launched the BBC Shakespeare Archive Resource last year to mark the 400th anniversary of Shakespeare’s death, our ambition was to add subtitles to all the television programmes featured on the site.  We are delighted to announce that this piece of work is now complete. This means students, teachers and academics can now watch the plays and sonnets of Shakespeare together with their corresponding transcript. Not only does this help deaf and hard-of-hearing viewers to follow the action, but it also allows students to see the words coming alive as they are performed. Take a look at this short Teach Shakespeare film to see how two schools are using this resource to help students better understand Shakespeare in the classroom.

As much of the content pre-dates the use of subtitles, we are really excited to be offering users the chance to watch many BBC adaptations, dating from the 1950s, with subtitles for the very first time. This includes part of the 1955 production of The Merry Wives of Windsor starring Anthony Quayle, the RSC’s production of As You Like It, broadcast in 1963 starring a young Vanessa Redgrave as Rosalind, the earliest surviving production of A Midsummer Night’s Dream from 1958 and Maggie Smith and Robert Stephens in a 1967 production of Much Ado About Nothing.

image

In addition, we have subtitled all the documentaries about Shakespeare and entertainment programmes containing references to the Bard. In the factual category you will find Paul Robeson talking in 1959 about acting Shakespeare and the difficulty of English accents, and the series of Shakespeare in Perspective which accompanied the major BBC series of Shakespeare’s works, transmitted between 1978 and 1985. On the lighter side, there is Blackadder punching Shakespeare ‘for every schoolboy and schoolgirl for the next four hundred years!’, as well as sketches from The Morecambe and Wise Show.

The BBC Shakespeare Archive Resource is only available to those in formal UK education and is free at the point of use. We have tried to make access as simple as possible; schools in Wales, Scotland and Northern Ireland can log in using their Hwb+, Glow or C2K credentials and schools in Scotland and England can also access the site using RM Education’s Launchpad. If you are unable to automatically view and play the media, you will need to email us at shakespeareaccess@bbc.co.uk to establish that you are part of a formal educational institution in the UK so we can provide you with access credentials.

Seven Solutions to Seven Problems

By Jake Berger, Product Manager for the Research and Education Space (RES)

Having been away from RES for a couple of years, it’s been a pleasant surprise to see how far it has come during my absence. Perhaps I should leave projects more often…

I was involved in shaping the original proposal for what we then called the ‘Digital Learning Network’, and helped turn an idea into a proposal and then onwards into a project. Coming back as RES Product Manager, it is heartening to find that RES is now a usable (and award-winning!) product, albeit one that still has plenty of room to grow and mature.

During my first few weeks in the new job, I spent a while reminding myself of the fundamental purpose of RES: why we decided to build it, why we built it the way we did, and who we were building it for.

Having a clear vision of the demand for, purpose of and benefits of a product is surely vital if one is to successfully manage that product. It is also vital to understand the suppliers of your data and the users of your data.

Our suppliers are those who publish data that can be indexed by RES and our users are those who consume data made available through RES.  

The publishers of data might be individuals, organisations, museums, archives, libraries, galleries or broadcasters. If a person or organisation publishes their data in the right way, their data will automatically become available in RES – it’s not an ‘opt in or out’ system.

The consumers of data via RES could be loosely defined as ‘people who learn or teach’.  They consume our data through products made by third parties, such as Virtual Learning Environments, or other software systems used in education and research. There are certainly many other potential consumers of RES data – any person or machine that can make use of semantically structured open data – but our focus so far has been those who learn and those who teach.

After considerable ponderance (is that a word?) I boiled my thoughts down to seven key ‘problems’ that RES tries to solve on behalf of those who publish and consume data on the internet for teaching and learning, together with how RES addresses each problem, and why that matters:

Problem 1 – there is too much clutter and ambiguity on the Internet, meaning that huge quantities of potential learning resources are never discovered.

Solution 1 – RES understands the difference between ‘classes’ of entities – ‘Bastille’ the place versus ‘Bastille’ the band – which pushes more relevant results to the top.

How does that help? This means that you will find more, and more relevant material and can be certain that the thing you are looking at is the thing you want.

Problem 2 – there are too many websites to visit to find all resources relating to a subject.

Solution 2 – RES brings together all relevant resources it knows about for a subject/concept.

How does that help? This means that you can see every relevant item in a single place.

Problem 3 – it’s difficult to know who is publishing information.

Solution 3 – provenance of items in RES is clear.  We only include material if we can say where it’s from.

How does that help?  You can be completely confident that you understand your sources and citations.

Problem 4 – different institutions describe the same things in different ways, leading to too many confusing ways of describing the world.

Solution 4 – RES transforms diverse ‘ontologies’ into a simple unified ontology, but retains the original definitions and relationships.

How does that help?  You can explore a diverse set of specialist material without having to know all of the jargon.

Problem 5 – there is not enough simple certainty of what you can use and how.

Solution 5 – RES only contains items with machine-readable licensing, describing how the resource can be used, and if there is any cost to use it.

How does that help?  You can filter out stuff that doesn’t fit your usage needs or budget.

Problem 6 – computers want to help us, but people still do most of the hard work.

Solution 6 – RES encourages cultural institutions to describe their assets semantically – in a structured and specific way – so that machines can read and understand them.

How does that help?  Finding truly relevant material, and other material related to it, becomes easier. The solution is scalable.

Problem 7 – it is difficult to share our insights into cultural and learning resources.

Solution 7 – RES enables learners’ and educators’ knowledge and insight to be shared with others.

How does that help?  You can find material that has been tagged by other educators and learners as relevant to a particular subject, course or age group. 

Now, I must be clear that RES hasn’t YET fully solved all of these problems. However, we are confident that the architecture of the system IS capable of doing so.

What we need now is more data published in the right way and more people developing products and services on the platform, in order to begin the process of creating things that benefit both the publishers of data and people who teach and learn.

If you think you can help make this happen, please get in contact with me at: jake.berger@bbc.co.uk.

If you want to know more about the process of making your data available via RES, or how you can build products and services on the RES platform, check out the Guide To Acropolis.

Why publish Modes records as Linked Data?

By Richard Light

Richard Light works as a Technical Partner with the Modes Users Association.

Modes is the most widely-used museum cataloguing software in the UK. It is mostly used by smaller museums, which do not have the luxury of in-house IT support staff. So we think it is important that Modes provides as complete a solution as possible to the challenge of recording, and then publishing, museum collections.

These days, a popular method of publishing a museum catalogue is to set up a searchable website. Over the years we’ve offered our users a variety of ways to do that, most recently by developing a WordPress plugin which uses ‘shortcodes’ to embed collections search into standard WordPress pages. However, we don’t think this is the only, or even the best, way to publish a museum collection.

A common frame of reference …

If a member of the public searches your collection, and finds a really interesting object, they may want to share that object with their friends, or with people who share a specialist interest. How do they do that? The normal method of sharing information on the Web is to use the URL of the page as a link or a bookmark. However, the pages returned by a web search will usually have URLs which reflect the search, not the result. Look at this example from the Rutland County Museum: 

http://rutlandcountymuseum.org.uk/wordpress/?modes_query={Object%20classifications}=*{Home%20and%20family%20life,%20Domestic%20equipment}&page_id=350&start_rutland=1

If you unpick this URL, it says that it is returning the first record found in a search for ‘Home and family life, Domestic equipment’ in the Object classifications index. That could be any one of a number of objects from the collection: there are currently 44 of them. At present, it is the object with accession number OAKRM: 1968.53, but this could change as new objects are catalogued and existing records are updated.

So, if you want to refer unambiguously to that glass jar butter churn, you need a different form of URL: one that explicitly identifies that particular object. A Linked Data identifier can do that:

http://collections.wordsworth.org.uk/Object/WTcoll/id/GRMDC.C144.9

This URL acts as a unique, persistent identifier for one particular object in the Wordsworth Trust’s collection: a print of Keswick Lake from Applethwaite by William Westall. If used as a link or a bookmark, it will return a page describing that object and providing a useful image:

… and a general-purpose API (Application Programming Interface)

However, a Linked Data URL can do much more than that. It can also provide a version of the object’s data which can be processed by software. By default, this will be RDF:

But it could instead be XML:

image

Our Linked Data implementation also includes a simple search facility, so you can find, for example, all records mentioning Applethwaite:

http://collections.wordsworth.org.uk/Object/WTcoll/search/?q=applethwaite

The ability to find specific records, and to return them in a machine-processible format, makes our Linked Data solution into an API. What’s more, it is an API based on open standards, rather than being a one-off with its own set of access rules.
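As an illustration, a client that wants a machine-readable representation rather than an HTML page can ask for one through standard HTTP content negotiation. The sketch below uses Python's standard library and the search URL from above; the assumption that the endpoint honours the Accept header follows the usual Linked Data convention:

```python
from urllib.request import Request

# Ask the Linked Data endpoint for a machine-readable representation by
# setting the Accept header. Passing this Request to urllib.request.urlopen
# would then return RDF (or XML) rather than an HTML page, assuming the
# server supports content negotiation.
uri = "http://collections.wordsworth.org.uk/Object/WTcoll/search/?q=applethwaite"
req = Request(uri, headers={"Accept": "text/turtle"})  # or "application/xml"

print(req.get_header("Accept"))  # → text/turtle
```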

Stepping up to the RES challenge

When making our Linked Data framework work with RES, one issue we needed to address was that of licensing. RES is (reasonably) flexible about which licence you associate with your metadata, but it does insist that you include a licence statement with each packet of data that is returned. Without this information, the RES indexer refuses to look at your data. Other aggregation projects, such as Europeana, have found the same need to adopt an Open Data licence, and to state this licence explicitly.  

The other development required by RES was to produce a machine-processible ‘site map’, so that the RES indexer could work its way through every object in the collection. We did this using VoID.
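For readers unfamiliar with VoID: it is the W3C vocabulary for describing RDF datasets, and a minimal ‘site map’ can be a single dataset description. Here is a sketch in Turtle, using placeholder URIs invented for illustration rather than the real Modes ones:

```turtle
@prefix void: <http://rdfs.org/ns/void#> .
@prefix dct:  <http://purl.org/dc/terms/> .

# Placeholder dataset description; the URIs are invented for illustration.
<http://collections.example.org/#dataset>
  a void:Dataset ;
  dct:title "Example museum collection"@en ;
  dct:license <http://creativecommons.org/publicdomain/zero/1.0/> ;
  void:rootResource <http://collections.example.org/Object/example/id/1> .
```

A crawler such as the RES indexer can start from the void:rootResource (or a void:dataDump, where one is published) and work its way through every object in the collection.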

Adding licence information and a site map will both make our Linked Data more useful for projects other than RES, so we were delighted to carry out these developments.

We look forward to seeing our Modes data being deployed in RES-based educational applications in the near future!