Hive defines a simple SQL-like query language, called HiveQL (HQL), for querying and managing large datasets. It is easy to use if you are already familiar with SQL. Hive also allows programmers who are familiar with the MapReduce framework to plug in their own custom mappers and reducers to perform more sophisticated analysis.
In this blog, we will learn how to create a Hive client in Scala that executes basic HQL commands. First, create a Scala project using Scala 2.12.
Now add the following settings to your build.sbt file:
    name := "hive_cli_client"
    version := "1.0"
    scalaVersion := "2.12.2"

    libraryDependencies += "org.apache.hive" % "hive-exec" % "1.2.1" excludeAll ExclusionRule(organization = "org.pentaho")
    libraryDependencies += "org.apache.hadoop" % "hadoop-common" % "2.7.3"
    libraryDependencies += "org.apache.httpcomponents" % "httpclient" % "4.3.4"
    libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.6.0"
    libraryDependencies += "org.apache.hive" % "hive-service" % "1.2.1"
    libraryDependencies += "org.apache.hive" % "hive-cli" % "1.2.1"
    libraryDependencies += "org.scalatest" % "scalatest_2.12" % "3.0.3"
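If you expect to switch Hive versions, the same build can be written with the version factored into a single value. The following is only a sketch of that idea: `hiveVersion` is a hypothetical helper name, not part of the original build, and scoping ScalaTest to `test` assumes the test class lives under src/test/scala.

```scala
// build.sbt (sketch). hiveVersion is a hypothetical helper val introduced for illustration.
val hiveVersion = "1.2.1"

name := "hive_cli_client"
version := "1.0"
scalaVersion := "2.12.2"

libraryDependencies ++= Seq(
  // Same Hive artifacts as above, all sharing one version string
  "org.apache.hive" % "hive-exec" % hiveVersion excludeAll ExclusionRule(organization = "org.pentaho"),
  "org.apache.hive" % "hive-service" % hiveVersion,
  "org.apache.hive" % "hive-cli" % hiveVersion,
  "org.apache.hadoop" % "hadoop-common" % "2.7.3",
  "org.apache.hadoop" % "hadoop-client" % "2.6.0",
  "org.apache.httpcomponents" % "httpclient" % "4.3.4",
  // %% appends the Scala binary version (_2.12) to the artifact name;
  // "test" keeps ScalaTest off the main classpath (assumes tests are in src/test/scala)
  "org.scalatest" %% "scalatest" % "3.0.3" % "test"
)
```

With this layout, changing the Hive version means editing one line instead of three.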
The build above pins the Hive artifacts at 1.2.1. In my case I am using Hive 2.1.1, and you can use any version. Let the dependencies resolve, then add a Scala class named HiveClient to your project:
    package cli

    import java.io.IOException

    import scala.util.Try

    import org.apache.hadoop.hive.cli.CliSessionState
    import org.apache.hadoop.hive.conf.HiveConf
    import org.apache.hadoop.hive.ql.Driver
    import org.apache.hadoop.hive.ql.session.SessionState

    /**
     * Hive meta API client for testing purposes.
     *
     * @author Anubhav
     */
    class HiveClient {

      val hiveConf = new HiveConf(classOf[HiveClient])

      /**
       * Get the Hive QL driver used to execute DDL or DML.
       * Every call creates a new driver and starts a new CLI session.
       *
       * @return a driver bound to hiveConf
       */
      private def getDriver: Driver = {
        val driver = new Driver(hiveConf)
        SessionState.start(new CliSessionState(hiveConf))
        driver
      }

      /**
       * Execute a single HQL statement.
       *
       * @param hql the statement to execute
       * @throws Exception if the driver itself throws (for example CommandNeedRetryException)
       * @throws java.io.IOException if Hive returns a non-zero response code
       * @return the response code, which is 0 on success
       */
      def executeHQL(hql: String): Int = {
        val response = Try(getDriver.run(hql)).toEither match {
          case Right(result) => result
          // Keep the original exception as the cause so the stack trace is not lost
          case Left(exception) => throw new Exception(exception.getMessage, exception)
        }
        val responseCode = response.getResponseCode
        if (responseCode != 0) {
          val err: String = response.getErrorMessage
          throw new IOException("Failed to execute hql [" + hql + "], error message is: " + err)
        }
        responseCode
      }
    }
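The class has one public method, executeHQL, which calls the private method getDriver to obtain a Hive Driver instance and runs the HQL statement with it. The method returns the response code, which is 0 on success. A non-zero code is never returned: it is reported as an IOException carrying Hive's error message.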
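To see the control flow of executeHQL without a Hive installation, here is a minimal sketch of the same pattern with the driver replaced by a plain function. `StubResponse`, `ExecuteSketch`, and the `run` parameter are hypothetical names introduced only for illustration; they stand in for Hive's response object and `Driver.run` and are not part of any Hive API.

```scala
import java.io.IOException
import scala.util.Try

// Hypothetical stand-in for the response object returned by Hive's driver
final case class StubResponse(responseCode: Int, errorMessage: String)

object ExecuteSketch {

  // `run` is an assumed placeholder for Driver.run: it takes a statement and returns a response
  def execute(hql: String)(run: String => StubResponse): Int = {
    // Step 1: capture anything thrown while running the statement
    val response = Try(run(hql)).toEither match {
      case Right(result)   => result
      case Left(exception) => throw new Exception(exception.getMessage, exception)
    }
    // Step 2: turn a non-zero response code into an IOException
    if (response.responseCode != 0) {
      throw new IOException(
        s"Failed to execute hql [$hql], error message is: ${response.errorMessage}")
    }
    // Step 3: only a zero code ever reaches the caller
    response.responseCode
  }
}
```

Calling `ExecuteSketch.execute("SELECT 1")(_ => StubResponse(0, ""))` returns 0, while a stub that returns a non-zero code or throws makes `execute` throw. This is why the test assertions of the form `== 0` can only pass or throw. They can never fail on a non-zero value.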
Now write a test case for this Hive client:
    import cli.HiveClient
    import org.scalatest.FunSuite

    class HiveClientTest extends FunSuite {

      val hiveClient = new HiveClient

      test("testing for the hql query") {
        assert(hiveClient.executeHQL("DROP TABLE IF EXISTS DEMO") == 0)
        assert(hiveClient.executeHQL("CREATE TABLE IF NOT EXISTS DEMO(id int)") == 0)
        assert(hiveClient.executeHQL("INSERT INTO DEMO VALUES(1)") == 0)
        assert(hiveClient.executeHQL("SELECT * FROM DEMO") == 0)
        assert(hiveClient.executeHQL("SELECT COUNT(*) FROM DEMO") == 0)
      }
    }
Now run the test cases, for example with sbt test from the project root.
I hope this blog is helpful. Happy coding!