Freebase-SPARQL-server-on-AWS

Tutorial on setting up a Freebase SPARQL server on an AWS instance

Intro

Since Freebase was shut down in 2015, very few resources about it can be found on the Internet. But for those who are interested in semantic parsing or KBQA (question answering over knowledge bases), having a Freebase SPARQL server (something like the ones used by Wikidata https://query.wikidata.org/ or DBpedia https://dbpedia.org/sparql) at hand can really relieve your headache. It can be used to explore the KB and also to annotate new datasets. I have followed some of the existing tutorials myself, e.g., sempre, this and this. However, they are either out of date (an older Freebase version, or an older Virtuoso that requires an out-of-date OpenSSL) or incomplete. After some trial and error, I found a simple pipeline that works well, and that's what this tutorial is about. For those who, like me, work mostly in NLP and do not have much experience with Virtuoso and SPARQL, this tutorial might be a good starting point.

Why AWS and which instance to choose

Freebase is huge! You need a machine with around 300GB of RAM. If you don't want to use the server constantly, AWS might be the cheapest solution. Make sure you select an instance that has enough RAM and storage. I used an r5.12xlarge with additional SSD storage and the Ubuntu 18.04 AMI.
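To see why r5.12xlarge is a natural pick for the ~300GB requirement, here is a small sketch that selects the smallest r5 size with enough memory. The memory figures are the ones I recall from AWS's published r5 specs, and the helper name `smallest_instance_with` is made up for illustration, so double-check the numbers against the current EC2 documentation before relying on them.

```python
# Memory (GiB) per r5 instance size, as recalled from AWS's published specs.
# Assumption: verify against the current EC2 instance type documentation.
R5_MEMORY_GIB = {
    "r5.large": 16,
    "r5.xlarge": 32,
    "r5.2xlarge": 64,
    "r5.4xlarge": 128,
    "r5.8xlarge": 256,
    "r5.12xlarge": 384,
    "r5.16xlarge": 512,
    "r5.24xlarge": 768,
}


def smallest_instance_with(min_gib, catalog=R5_MEMORY_GIB):
    """Return the smallest instance type with at least min_gib of RAM, or None."""
    # Sort candidates by memory so min() picks the cheapest size that fits.
    candidates = [(mem, name) for name, mem in catalog.items() if mem >= min_gib]
    return min(candidates)[1] if candidates else None


if __name__ == "__main__":
    print(smallest_instance_with(300))
```

With a 300GB target, r5.8xlarge (256 GiB) falls short, so the next size up is the one that fits.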

Before you launch your instance, make sure you modify the Security Group rules so that the SPARQL server can be accessed from your browser. The inbound rule should look something like the one below:

<p align="center"><img width="85%" src="inbound_rule.png" /></p>
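If you prefer to script the rule instead of clicking through the console, a rough sketch with boto3 might look like the following. Everything specific here is an assumption on my part: port 8890 is Virtuoso's default HTTP port (use whatever port your server actually listens on), the helper names `sparql_ingress_rule` and `open_sparql_port` are made up, and the security group ID and CIDR in the usage note are placeholders.

```python
def sparql_ingress_rule(client_cidr, port=8890):
    """Build one EC2 IpPermissions entry allowing TCP traffic to the SPARQL port.

    Assumption: 8890 is Virtuoso's default HTTP port; change it if yours differs.
    """
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [{"CidrIp": client_cidr, "Description": "SPARQL endpoint"}],
    }


def open_sparql_port(group_id, client_cidr, port=8890):
    """Add the inbound rule to an existing security group (needs AWS credentials)."""
    import boto3  # third-party dependency, imported lazily: pip install boto3

    ec2 = boto3.client("ec2")
    return ec2.authorize_security_group_ingress(
        GroupId=group_id,
        IpPermissions=[sparql_ingress_rule(client_cidr, port)],
    )
```

For example, `open_sparql_port("sg-0123456789abcdef0", "203.0.113.7/32")` would open the port only to a single address. Restricting the CIDR to your own IP is safer than `0.0.0.0/0`, since the endpoint has no authentication by default.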

Instructions

After loading, you will be able to access the SPARQL endpoint via your local browser: