Big Data and Nollywood

Below I detail how I used Big Data and Natural Language Processing techniques to query the Global Graph for Nigerian Movies and Actors.

Using the Virtuoso SPARQL editor, I sent the following queries to the LinkedMDB and DBpedia SPARQL endpoints as federated SPARQL queries (note the SERVICE keyword in each one).
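Every query in this post has the same federated shape: a list of PREFIX declarations, a SELECT clause, and a graph pattern wrapped in a SERVICE block that names the remote endpoint. As a minimal sketch of that shape, the Python helper below assembles such a query string. The function name `federated_query` and its parameters are my own illustrative choices, not part of Virtuoso or of any SPARQL library.

```python
def federated_query(prefixes, select, endpoint, pattern):
    """Assemble a SPARQL query whose graph pattern runs on a remote endpoint.

    prefixes: dict mapping a prefix label to its namespace IRI
    select:   the projection, e.g. "?s ?Title ?Date"
    endpoint: IRI of the remote SPARQL endpoint named in the SERVICE clause
    pattern:  the triple patterns to evaluate remotely (one per line)
    """
    prefix_lines = [f"PREFIX {label}: <{iri}>" for label, iri in prefixes.items()]
    # Indent the remote pattern so the nesting is easy to read.
    body = "\n".join("    " + line.strip() for line in pattern.strip().splitlines())
    return "\n".join(
        prefix_lines
        + [
            f"SELECT {select}",
            "WHERE {",
            f"  SERVICE <{endpoint}> {{",
            body,
            "  }",
            "}",
        ]
    )


# Illustrative call: the simplest form of the "Nigerian movies" pattern.
query = federated_query(
    prefixes={
        "movie": "http://data.linkedmdb.org/resource/movie/",
        "country": "http://data.linkedmdb.org/resource/country/",
    },
    select="?s",
    endpoint="http://data.linkedmdb.org/sparql",
    pattern="?s movie:country country:NG .",
)
print(query)
```

The printed string can be pasted into any SPARQL editor that supports federated queries.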

Nigerian Movies on LinkedMDB

The following query lists all the Nigerian movies in LinkedMDB's RDF store. In SPARQL terms: show me every movie whose movie:country is country:NG, along with its dc:title and dc:date if it has them:

## SPARQL Query
 PREFIX owl: <http://www.w3.org/2002/07/owl#>
 PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
 PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
 PREFIX dc: <http://purl.org/dc/terms/>
 PREFIX country: <http://data.linkedmdb.org/resource/country/>

 select ?s ?Title ?Date

 where {
 SERVICE     <http://data.linkedmdb.org/sparql> {
 ?s movie:country   country:NG .
 optional { ?s dc:title ?Title. }
 optional { ?s dc:date ?Date. }
  }

 } 

The answer looks like this when you run the query using the URIBurner endpoint.

Wow! They have only eight Nigerian movies in their dataset.
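If you request the result as JSON rather than as an HTML table, an endpoint can return it in the W3C SPARQL 1.1 Query Results JSON Format, in which a variable left unbound by an optional clause is simply absent from that row. The sketch below flattens such a document into plain Python dictionaries. The sample payload is made up for illustration and is not real LinkedMDB output.

```python
import json

# Hypothetical response body, shaped like the SPARQL 1.1 JSON results format.
# The film IRIs and the title are placeholders, not real LinkedMDB data.
SAMPLE = """
{
  "head": {"vars": ["s", "Title", "Date"]},
  "results": {"bindings": [
    {"s": {"type": "uri", "value": "http://example.org/film/1"},
     "Title": {"type": "literal", "value": "Example Film"}},
    {"s": {"type": "uri", "value": "http://example.org/film/2"}}
  ]}
}
"""


def rows(document):
    """Yield one dict per result row; unbound (optional) variables become None."""
    variables = document["head"]["vars"]
    for binding in document["results"]["bindings"]:
        yield {var: binding.get(var, {}).get("value") for var in variables}


table = list(rows(json.loads(SAMPLE)))
for row in table:
    print(row)
```

Mapping unbound variables to None keeps every row the same shape, which makes the rows easy to load into a table or data frame later.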

Who acted in the movie "2 Rats"?

Let us find out who acted in the movie "2 Rats". In SPARQL terms: show me each movie:actor of .../film/16131 and tell me his or her ?Name.

To do that we run the following query:

## SPARQL Query
 PREFIX owl: <http://www.w3.org/2002/07/owl#>
 PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
 PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
 PREFIX dc: <http://purl.org/dc/terms/>
 PREFIX country: <http://data.linkedmdb.org/resource/country/>

 select ?Name

 where {
 SERVICE     <http://data.linkedmdb.org/sparql> {
 <http://data.linkedmdb.org/resource/film/16131>  movie:actor ?actor .
 ?actor movie:actor_name ?Name .

  }

 } 

The answer:

--------
| Name |
========
--------

Wow!

LinkedMDB does not have that information.

Let me see what else they have on the movie. In SPARQL terms: show me everything you have on the subject .../film/16131:

## SPARQL Query
 PREFIX owl: <http://www.w3.org/2002/07/owl#>
 PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
 PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
 PREFIX dc: <http://purl.org/dc/terms/>
 PREFIX country: <http://data.linkedmdb.org/resource/country/>
 PREFIX dbpedia: <http://dbpedia.org/ontology/>
 PREFIX dbprop: <http://dbpedia.org/property/>
 PREFIX dbowl: <http://dbpedia.org/ontology/>

 select *

 where {
 SERVICE     <http://data.linkedmdb.org/sparql> {
 <http://data.linkedmdb.org/resource/film/16131> ?PropertyName ?Value


  }

 } 

You can view the answer here.

This just shows the release date, country, language, and title of the movie. This is not much.

Let us check whether DBpedia, the very nucleus of the Linked Open Data web, has more data than LinkedMDB.

"2 Rats" at DBpedia

Let us see if we can catch "2 Rats" in DBpedia. In SPARQL terms: show us the ?moviename and ?actor of any dbpedia:Film whose name starts with "2 Rats" and whose dbprop:country is Nigeria:

## SPARQL Query
 PREFIX owl: <http://www.w3.org/2002/07/owl#>
 PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
 PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
 PREFIX dc: <http://purl.org/dc/terms/>
 PREFIX country: <http://data.linkedmdb.org/resource/country/>
 PREFIX dbpedia: <http://dbpedia.org/ontology/>
 PREFIX dbprop: <http://dbpedia.org/property/>

 select (?s as ?Movie) (?moviename as ?Name) (?actor as ?Actor)

 where {
 SERVICE <http://dbpedia.org/sparql> {
 ?s a dbpedia:Film .
 ?s dbprop:country ?country.

 ?s dbpedia:starring ?dbname .
 ?s rdfs:label ?moviename.
 ?dbname dbprop:name ?actor.
 filter regex(?moviename,"^2 Rats") .
 filter regex(?country,"^nigeria","i") .


  }

 } 

The answer.

Now we can see that DBpedia has the actors' names.
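The two filter regex(...) clauses do the matching in this query: ^ anchors the pattern at the start of the string, and the "i" flag makes the match case-insensitive. The Python below is a rough analogue, offered only as an illustration. SPARQL uses XPath regular expressions, which I am assuming behave like Python's `re` module for simple patterns such as these; the helper name `sparql_like_regex` is hypothetical.

```python
import re


def sparql_like_regex(text, pattern, flags=""):
    """Mimic SPARQL's regex(text, pattern, flags) for simple patterns.

    Only the "i" (case-insensitive) flag is handled in this sketch.
    """
    re_flags = re.IGNORECASE if "i" in flags else 0
    # Like SPARQL's regex, re.search matches anywhere unless the pattern is anchored.
    return re.search(pattern, text, re_flags) is not None


print(sparql_like_regex("2 Rats", "^2 Rats"))          # anchored match on the title
print(sparql_like_regex("Nigeria", "^nigeria", "i"))   # case-insensitive match
print(sparql_like_regex("Nigeria", "^nigeria"))        # case-sensitive: no match
```

This also shows why the country filter needs the "i" flag: the stored value is capitalised, while the pattern is written in lower case.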

What else do they have on the movie?

Let us check for the movie's director and distributor with the following query.

In SPARQL terms: show me the ?director and ?distributor of any dbpedia:Film whose name starts with "2 Rats" and whose dbprop:country is Nigeria:

## SPARQL Query
 PREFIX owl: <http://www.w3.org/2002/07/owl#>
 PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
 PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
 PREFIX dc: <http://purl.org/dc/terms/>
 PREFIX country: <http://data.linkedmdb.org/resource/country/>
 PREFIX dbpedia: <http://dbpedia.org/ontology/>
 PREFIX dbprop: <http://dbpedia.org/property/>

 select distinct (?s as ?Movie) ?director ?distributor

 where {
 SERVICE <http://dbpedia.org/sparql> {
 ?s a dbpedia:Film .
 ?s dbprop:country ?country.

 ?s dbpedia:starring ?dbname .
 ?s rdfs:label ?moviename.
 ?s rdfs:comment ?description .
 ?s dbprop:director ?director.
 ?s dbprop:distributor ?distributor.
 ?dbname dbprop:name ?actor.
 filter regex(?moviename,"^2 Rats") .
 filter regex(?country,"^nigeria","i") .


  }

 } 

The result shows that DBpedia has much more information on this movie.

We can get more information like the movie description with the following query:

## SPARQL Query
 PREFIX owl: <http://www.w3.org/2002/07/owl#>
 PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
 PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
 PREFIX dc: <http://purl.org/dc/terms/>
 PREFIX country: <http://data.linkedmdb.org/resource/country/>
 PREFIX dbpedia: <http://dbpedia.org/ontology/>
 PREFIX dbprop: <http://dbpedia.org/property/>

 select distinct (?description as ?Summary)

 where {
 SERVICE <http://dbpedia.org/sparql> {
 ?s a dbpedia:Film .
 ?s dbprop:country ?country.

 ?s dbpedia:starring ?dbname .
 ?s rdfs:label ?moviename.
 ?s rdfs:comment ?description .
 ?s dbprop:director ?director.
 ?s dbprop:distributor ?distributor.
 ?dbname dbprop:name ?actor.
 filter regex(?moviename,"^2 Rats") .
 filter regex(?country,"^nigeria","i") .


  }

 } 

It returned:

| "2 Rats is a 2003 Nigeria film. Nollywood's highest paid actors, Osita Iheme (A-boy) and Chinedu Ikedieze (Bobo) are two young boys whose father has been murdered by their uncle. In a selfish move, Amaechi Muonagor wants them to work as house boys in their father's own house. A-boy and Bobo have other plans. The film features performances by Aki na Pawpaw and can be dubbed as Nollywood's Home Alone."@en |

Chinedu Ikedieze's filmography

Which other movies did Chinedu Ikedieze star in?

Let us ask DBpedia. In SPARQL terms: show us any dbpedia:Film with dbpedia:starring dbr:Chinedu_Ikedieze, giving us the ?director and ?distributor if available:

## SPARQL Query
 PREFIX owl: <http://www.w3.org/2002/07/owl#>
 PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
 PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
 PREFIX dc: <http://purl.org/dc/terms/>
 PREFIX country: <http://data.linkedmdb.org/resource/country/>
 PREFIX dbpedia: <http://dbpedia.org/ontology/>
 PREFIX dbprop: <http://dbpedia.org/property/>
 PREFIX dbr: <http://dbpedia.org/resource/>

 select distinct (?s as ?Movie) ?director ?distributor

 where {
 SERVICE <http://dbpedia.org/sparql> {
 ?s a dbpedia:Film .
 ?s dbpedia:starring dbr:Chinedu_Ikedieze .
 optional {?s dbprop:director ?director. }
 optional {?s dbprop:distributor ?distributor. }

  }

 } 

...and the result showed only three films.

Chinedu Ikedieze and Osita Iheme

Which films in DBpedia have starred both Chinedu and Osita together?

Let us ask with the following query:

## SPARQL Query
 PREFIX owl: <http://www.w3.org/2002/07/owl#>
 PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
 PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
 PREFIX dc: <http://purl.org/dc/terms/>
 PREFIX country: <http://data.linkedmdb.org/resource/country/>
 PREFIX dbpedia: <http://dbpedia.org/ontology/>
 PREFIX dbprop: <http://dbpedia.org/property/>
 PREFIX dbr: <http://dbpedia.org/resource/>

 select distinct (?s as ?Movie) ?director ?distributor

 where {
 SERVICE <http://dbpedia.org/sparql> {
 ?s a dbpedia:Film .
 ?s dbpedia:starring dbr:Chinedu_Ikedieze .
 ?s dbpedia:starring dbr:Osita_Iheme .
 optional {?s dbprop:director ?director. }
 optional {?s dbprop:distributor ?distributor. }

  }

 } 

The answer shows only two movies.
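Repeating the dbpedia:starring pattern with the same ?s is what restricts the result to films featuring both actors: a film must satisfy every triple pattern in the block to be returned. In set terms this is an intersection, which the Python sketch below spells out. The film names are placeholders, not DBpedia results, and `films_with_all` is a hypothetical helper.

```python
# Placeholder data: which films each actor appears in (not real DBpedia output).
films_by_actor = {
    "Chinedu_Ikedieze": {"Film A", "Film B", "Film C"},
    "Osita_Iheme": {"Film B", "Film C", "Film D"},
}


def films_with_all(actors, films_by_actor):
    """Return the films in which every listed actor appears."""
    film_sets = [films_by_actor.get(actor, set()) for actor in actors]
    # With no actors given there is nothing to intersect, so return an empty set.
    return set.intersection(*film_sets) if film_sets else set()


together = films_with_all(["Chinedu_Ikedieze", "Osita_Iheme"], films_by_actor)
print(sorted(together))
```

Adding a third actor to the list corresponds to adding a third dbpedia:starring line to the query.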

Nollywood in DBpedia

Let us ask DBpedia which Nigerian movies it has. In SPARQL terms: show us every dbpedia:Film whose dbprop:country is Nigeria, and give us its name in English.

## SPARQL Query
 PREFIX owl: <http://www.w3.org/2002/07/owl#>
 PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
 PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
 PREFIX dc: <http://purl.org/dc/terms/>
 PREFIX country: <http://data.linkedmdb.org/resource/country/>
 PREFIX dbpedia: <http://dbpedia.org/ontology/>
 PREFIX dbprop: <http://dbpedia.org/property/>
 PREFIX dbr: <http://dbpedia.org/resource/>

 select distinct (?s as ?Movie) (?moviename as ?Name) (?country as ?Country)

 where {
 SERVICE <http://dbpedia.org/sparql> {
 ?s a dbpedia:Film .
 ?s dbprop:country ?country.
 optional {?s rdfs:label ?moviename. }

 filter regex(?country,"^nigeria","i") .
 filter (lang(?moviename) ='en') .
  }

 } 

The answer: only 21 movies!
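The lang(?moviename) = 'en' filter is there because DBpedia labels are language-tagged, and a film may carry labels in several languages. In the SPARQL 1.1 JSON results format the tag appears as an xml:lang key on each literal, so the same selection can be sketched on the client side as below. The bindings are invented placeholders, and `labels_in` is a hypothetical helper.

```python
# Invented bindings in the shape of SPARQL 1.1 JSON results (not DBpedia output).
bindings = [
    {"moviename": {"type": "literal", "xml:lang": "en", "value": "Example Film"}},
    {"moviename": {"type": "literal", "xml:lang": "fr", "value": "Film d'exemple"}},
    {"s": {"type": "uri", "value": "http://example.org/film/3"}},  # no label bound
]


def labels_in(bindings, var="moviename", lang="en"):
    """Keep only the values of `var` whose language tag equals `lang`."""
    return [
        row[var]["value"]
        for row in bindings
        if var in row and row[var].get("xml:lang") == lang
    ]


print(labels_in(bindings))
```

Rows with no label at all are dropped here too, which mirrors what the query does when the filter sits outside the optional block.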

Conclusion

The web of linked data or Semantic Web is a global information space consisting of inter-linked data.

SPARQL enables applications to query this global graph or database for information about resources or entities.

I have shown how easy it is to query these graphs for information on Nollywood.

We have also seen that not enough information on Nollywood exists in the global graph, so there is a need to publish more.

Why?

Applications can consume the results of these kinds of queries to create a rich experience for users interacting with these resources, e.g. a movie recommendation app, an actor app, etc.
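As a rough sketch of the recommendation idea, the Python below takes (film, actor) pairs, such as the rows a dbpedia:starring query would return, and suggests other films that share at least one actor with a given film. The function name `recommend` and the sample pairs are hypothetical.

```python
from collections import defaultdict

# Placeholder (film, actor) pairs; a real app would fill this from query results.
pairs = [
    ("Film A", "Actor X"), ("Film A", "Actor Y"),
    ("Film B", "Actor X"), ("Film B", "Actor Y"),
    ("Film C", "Actor Y"),
    ("Film D", "Actor Z"),
]


def recommend(film, pairs):
    """Rank other films by how many actors they share with `film`."""
    actors_in = defaultdict(set)   # film  -> actors appearing in it
    films_of = defaultdict(set)    # actor -> films they appear in
    for f, actor in pairs:
        actors_in[f].add(actor)
        films_of[actor].add(f)

    shared = defaultdict(int)      # candidate film -> number of shared actors
    for actor in actors_in.get(film, set()):
        for other in films_of[actor]:
            if other != film:
                shared[other] += 1
    # Most shared actors first; ties broken alphabetically for stable output.
    return sorted(shared, key=lambda f: (-shared[f], f))


print(recommend("Film A", pairs))
```

With so few Nollywood films in the global graph today, a ranking like this would have very little to work with, which is one more reason to publish more data.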

Cover Image Credit: Paul Keller

This article originally appeared on Emeka's Linked Open Data Nigeria blog.
