Apache Rya is a tool for storing and querying triples at scale. It is not used much and
Run Zookeeper and HDFS
Zookeeper and HDFS need to be running in order for Rya to work. To start Zookeeper run the following command
$ZOOKEEPER_HOME/bin/zkServer.sh start
To run HDFS execute the following commands
cd $HADOOP_HOME
bin/hdfs namenode -format
sbin/start-dfs.sh
bin/hdfs dfs -mkdir /user
bin/hdfs dfs -mkdir /user/$USERNAME
bin/hdfs dfs -mkdir input
bin/hdfs dfs -put etc/hadoop/*.xml input
If you have run HDFS previously, then you might only need to execute sbin/start-dfs.sh.
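Before moving on, it can help to confirm that both services are actually listening. The following is a minimal sketch, not part of Rya, Hadoop, or Zookeeper: the class name PortCheck is hypothetical, and the ports are assumptions (2181 for Zookeeper and 9000 for the HDFS NameNode, matching the addresses used in the Accumulo configuration in this post). Adjust them if your setup differs.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Rya, Hadoop, or Zookeeper.
class PortCheck {
    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused or timed out: nothing is accepting connections there.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed ports: 2181 (Zookeeper client port), 9000 (HDFS NameNode).
        System.out.println("Zookeeper:     " + isListening("127.0.0.1", 2181, 2000));
        System.out.println("HDFS NameNode: " + isListening("127.0.0.1", 9000, 2000));
    }
}
```

A result of false only tells you that nothing accepted a connection on that port. It does not say why, so check the service logs next.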
Install Accumulo
You can get the installation instructions for Accumulo here, but I will go into more detail. Download the version 1.9.3 binary here.
Type the following
cd <install_location>
tar xzf accumulo-1.9.3-bin.tar.gz
cd accumulo-1.9.3
./bin/build_native_library.sh
./bin/bootstrap_config.sh
You will then be asked some questions about your desired configuration. I chose 3) 3GB, 2) Native, and 5) Hadoop 3. Additional configurations will need to be set manually. Change the first property in conf/accumulo-site.xml to
<property>
  <name>instance.volumes</name>
  <value>hdfs://127.0.0.1:9000/accumulo</value>
  <description>comma separated list URIs for volumes. example: hdfs://localhost:9000/accumulo</description>
</property>
You might need to set the value that is appropriate for you. Change the next property similarly.
<property>
  <name>instance.zookeeper.host</name>
  <value>127.0.0.1:2181</value>
  <description>comma separated list of zookeeper servers</description>
</property>
Change instance.secret, which is the next property, to whatever you want. Also change trace.token.property.password, which is farther down. Add the following line to the last property
$HADOOP_PREFIX/share/hadoop/common/lib/[^.].*.jar
Set HADOOP_PREFIX, JAVA_HOME, and ZOOKEEPER_HOME in conf/accumulo-env.sh. I also set HADOOP_HOME, ZOOKEEPER_HOME, and ACCUMULO_HOME in .bashrc. You should now be able to run the following command
$ACCUMULO_HOME/bin/accumulo init
It will ask you for an instance name and a password. Use any instance name and password you like. Next, start the Accumulo master, tserver, monitor, and gc.
$ACCUMULO_HOME/bin/accumulo master
$ACCUMULO_HOME/bin/accumulo tserver
$ACCUMULO_HOME/bin/accumulo monitor
$ACCUMULO_HOME/bin/accumulo gc
Install Rya
You can find the quickstart for Rya at . Rya can be downloaded from here. To install Rya execute the following commands
unzip rya-project-3.2.12-incubating-source-release.zip
cd rya-project-3.2.12-incubating
mvn clean install
If Rya was successfully installed, there will be a war file at web/web.rya/target/web.rya.war. Copy the contents of $RYA_HOME/web/web.rya/target to Tomcat’s webapps directory.
cp -r $RYA_HOME/web/web.rya/target/* $TOMCAT_HOME/webapps
You can also obtain openrdf-sesame.war and openrdf-workbench.war and put them in $TOMCAT_HOME/webapps. Next create a file named environment.properties in $RYA_HOME with the following contents
instance.name=<instance_name>
instance.zk=localhost:2181
instance.username=root
instance.password=<instance_password>
rya.tableprefix=rya_
rya.displayqueryplan=true
Replace <instance_name> and <instance_password> with the instance name and instance password you entered when you ran accumulo init. You might also want to change instance.zk. You need to tell Tomcat where it can find this file. In $TOMCAT_HOME/conf/catalina.properties set shared.loader="$RYA_HOME/environment.properties". Now start Tomcat.
$TOMCAT_HOME/bin/startup.sh
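A typo or an unreplaced placeholder in environment.properties is easy to miss. The sketch below checks a properties file for the six keys listed in this post and flags any that are absent, empty, or still start with "<". The class name EnvCheck and the placeholder test are my own assumptions for illustration. I am not claiming that Rya itself requires exactly these keys.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of Rya.
class EnvCheck {
    // The keys used in the environment.properties example in this post.
    static final String[] EXPECTED_KEYS = {
        "instance.name", "instance.zk", "instance.username",
        "instance.password", "rya.tableprefix", "rya.displayqueryplan"
    };

    // Returns the expected keys that are absent, empty, or still a <placeholder>.
    static List<String> missingKeys(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new IllegalStateException("Could not parse properties text", e);
        }
        List<String> missing = new ArrayList<>();
        for (String key : EXPECTED_KEYS) {
            String value = props.getProperty(key);
            boolean placeholder = value != null && value.trim().startsWith("<");
            if (value == null || value.trim().isEmpty() || placeholder) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to environment.properties
        String text = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        System.out.println("Missing or unfilled keys: " + missingKeys(text));
    }
}
```

Run it with the path to your environment.properties as the only argument. An empty list means every key from the example has a value.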
Load Triples
You can use the following code to load triples into Rya
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.URL;
import java.net.URLConnection;

public class LoadDataServletRun {
    public static void main(String[] args) {
        try {
            String inputFile = args[0];
            String format = args[1];

            // Look up the input file on the classpath (for example src/main/resources).
            final InputStream resourceAsStream = Thread.currentThread().getContextClassLoader()
                    .getResourceAsStream(inputFile);
            if (resourceAsStream == null) {
                throw new IllegalArgumentException("Resource not found on classpath: " + inputFile);
            }

            URL url = new URL("http://localhost:8080/web.rya/loadrdf?format=" + format);
            URLConnection urlConnection = url.openConnection();
            urlConnection.setRequestProperty("Content-Type", "text/plain");
            urlConnection.setDoOutput(true);

            // Stream the file contents to Rya as the request body.
            final OutputStream os = urlConnection.getOutputStream();
            int read;
            while ((read = resourceAsStream.read()) >= 0) {
                os.write(read);
            }
            resourceAsStream.close();
            os.flush();

            // Print the server's response.
            BufferedReader rd = new BufferedReader(new InputStreamReader(
                    urlConnection.getInputStream()));
            String line;
            while ((line = rd.readLine()) != null) {
                System.out.println(line);
            }
            rd.close();
            os.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
As an example you could put the following contents in src/main/resources/ntriples.ntrips
<http://mynamespace/ProductType1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://mynamespace/ProductType> .
<http://mynamespace/ProductType1> <http://www.w3.org/2000/01/rdf-schema#label> "Thing" .
<http://mynamespace/ProductType1> <http://purl.org/dc/elements/1.1/publisher> <http://mynamespace/Publisher1> .
You can then load the triples by running
sbt "runMain LoadDataServletRun ntriples.ntrips N-triples"
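If the load step fails with an exception, the stack trace does not always make the HTTP status obvious. The sketch below sends a plain GET request and prints the status code, which helps tell "Tomcat is not running" (connection refused) apart from "the web application is not deployed at that path" (404). The class name HttpStatus is hypothetical, and the default address http://localhost:8080/web.rya/ is an assumption based on the host, port, and path prefix used in the code above.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper for diagnosing deployment problems; not part of Rya.
class HttpStatus {
    // Sends a GET request and returns the HTTP status code (for example 200 or 404).
    static int of(String address) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(address).openConnection();
        connection.setRequestMethod("GET");
        connection.setConnectTimeout(5000);
        connection.setReadTimeout(5000);
        try {
            return connection.getResponseCode();
        } finally {
            connection.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed default address; pass a different URL as the first argument.
        String address = args.length > 0 ? args[0] : "http://localhost:8080/web.rya/";
        System.out.println(address + " -> HTTP " + of(address));
    }
}
```

An IOException from this program usually means nothing is listening at that host and port, whereas a printed 404 means Tomcat answered but found no application at that path.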
Query Triples in Rya
You can use the following Java code to query triples in Rya
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class QueryDataServletRun {
    public static void main(String[] args) {
        try {
            // Read the SPARQL query from the file given as the first argument.
            String queryFile = args[0];
            String query = new String(Files.readAllBytes(Paths.get(queryFile)), StandardCharsets.UTF_8);
            String queryenc = URLEncoder.encode(query, "UTF-8");

            URL url = new URL("http://localhost:8080/web.rya/queryrdf?query.infer=true&query=" + queryenc);
            URLConnection urlConnection = url.openConnection();
            urlConnection.setDoOutput(true);

            // Print the query results returned by Rya.
            BufferedReader rd = new BufferedReader(new InputStreamReader(
                    urlConnection.getInputStream()));
            String line;
            while ((line = rd.readLine()) != null) {
                System.out.println(line);
            }
            rd.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
As an example you can put the following in a file, query.txt
select * where { <http://mynamespace/ProductType1> ?p ?o. }
and then run the following command
sbt "runMain QueryDataServletRun query.txt"
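When a query returns nothing or fails, it can be useful to see the exact URL being requested so that you can paste it into a browser or another HTTP client. The sketch below isolates the URL-building step from the query code above. The class name QueryUrl is hypothetical; the endpoint and the query.infer parameter are taken from that code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper that mirrors the URL construction in QueryDataServletRun.
class QueryUrl {
    // Appends the URL-encoded SPARQL query to the endpoint.
    static String build(String endpoint, String sparql) {
        try {
            return endpoint + "?query.infer=true&query=" + URLEncoder.encode(sparql, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String sparql = "select * where { <http://mynamespace/ProductType1> ?p ?o. }";
        System.out.println(build("http://localhost:8080/web.rya/queryrdf", sparql));
    }
}
```

For the example query, this prints http://localhost:8080/web.rya/queryrdf?query.infer=true&query=select+*+where+%7B+%3Chttp%3A%2F%2Fmynamespace%2FProductType1%3E+%3Fp+%3Fo.+%7D. URLEncoder turns spaces into + and percent-encodes the braces, angle brackets, colons, slashes, and question marks.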
Hopefully you are now able to store and query triples using Rya. If not, please leave a comment with the error you encountered.
Hello and thanks for the instructions. I was looking for something like this to solve a problem that I have been facing for several weeks now. However, I still have the same problems I faced before (when I used the versions suggested by the VagrantFile in the v4.0.0-incubating-SNAPSHOT). I am using Debian GNU/Linux 9.9, the HDFS, Zookeeper, and Accumulo versions from this guide, Tomcat v8.5.42, sesame-http-server-4.1.2.war and sesame-http-workbench-4.1.2.war. The Sesame Server and Workbench load properly, but Web-Rya gives the error “HTTP Status 404 – Not Found” and a more detailed one, “The origin server did not find a current representation for the target resource or is not willing to disclose that one exists.”. However, the Accumulo shell shows that the following tables have been created:
rya_ns
rya_osp
rya_po
rya_prospects
rya_spo
Any suggestion would be appreciated.
Thanks
Theofilos