Create Your Own MetastoreEvent Listeners in Hive With Scala


Hive metastore event listeners detect every event that takes place in the Hive metastore. If you want some action to take place for an event, you can extend MetaStorePreEventListener, override its onEvent method, and provide your own implementation.

In this article, we will learn how to create our own metastore event listeners in Hive using Scala and sbt.

So let's get started. First, add the following dependencies to your build.sbt file:

libraryDependencies += "org.apache.hive" % "hive-exec" % "1.2.1" excludeAll
  ExclusionRule(organization = "org.pentaho")

libraryDependencies += "org.apache.hadoop" % "hadoop-common" % "2.7.3"

libraryDependencies += "org.apache.httpcomponents" % "httpclient" % "4.3.4"

libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.6.0"

libraryDependencies += "org.apache.hive" % "hive-service" % "1.2.1"

unmanagedJars in Compile += file("/usr/lib/hive/lib/hive-exec-1.2.1.jar")

assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case x => MergeStrategy.first
}
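
Since the finished jar is copied into $HIVE_HOME/lib, where the Hive and Hadoop classes are already present, one optional variation is to mark those dependencies as "provided" so that sbt-assembly leaves them out of the fat jar. This is only a sketch of that idea, not part of the original build; it also assumes the unmanagedJars line is dropped, because unmanaged jars would otherwise still be bundled.

```scala
// Sketch: "provided" dependencies are compiled against but not packaged by sbt-assembly.
libraryDependencies += "org.apache.hive" % "hive-exec" % "1.2.1" % "provided" excludeAll
  ExclusionRule(organization = "org.pentaho")

libraryDependencies += "org.apache.hadoop" % "hadoop-common" % "2.7.3" % "provided"
```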

Now create your first class. You can name it anything; I named it OrcMetastoreListener. This class must extend Hive's MetaStorePreEventListener class and take a Hadoop Configuration as its constructor argument.

package metastorelisteners

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.hive.metastore.MetaStorePreEventListener
import org.apache.hadoop.hive.metastore.events.PreEventContext.PreEventType._
import org.apache.hadoop.hive.metastore.events._

// The package must match the class name registered in hive-site.xml.
class OrcMetastoreListener(conf: Configuration) extends MetaStorePreEventListener(conf) {

  // Called by the metastore before each event is applied.
  override def onEvent(preEventContext: PreEventContext): Unit = {
    preEventContext.getEventType match {
      case CREATE_TABLE =>
        // Force the ORC formats on the table that is about to be created.
        val table = preEventContext.asInstanceOf[PreCreateTableEvent].getTable
        table.getSd.setInputFormat("org.apache.hadoop.hive.ql.io.orc.OrcInputFormat")
        table.getSd.setOutputFormat("org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat")
      case ALTER_TABLE =>
        // Apply the same formats to the new definition of an altered table.
        val newTable = preEventContext.asInstanceOf[PreAlterTableEvent].getNewTable
        newTable.getSd.setInputFormat("org.apache.hadoop.hive.ql.io.orc.OrcInputFormat")
        newTable.getSd.setOutputFormat("org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat")
      case _ => // do nothing
    }
  }
}

The PreEventContext carries the metastore event that is about to be processed. In my case, I want every table created in Hive to use the ORC input format and output format, and the same for ALTER TABLE commands.
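
The listener above changes only the input and output formats. Tables that Hive itself stores as ORC also use org.apache.hadoop.hive.ql.io.orc.OrcSerde, so a listener that forces the ORC formats may want to switch the SerDe as well. The following is a minimal sketch under that assumption; the OrcStorage object and its applyTo method are hypothetical names, not part of the original listener.

```scala
import org.apache.hadoop.hive.metastore.api.Table

// Hypothetical helper: sets the ORC formats and, as an added assumption, the ORC SerDe.
object OrcStorage {
  val InputFormat  = "org.apache.hadoop.hive.ql.io.orc.OrcInputFormat"
  val OutputFormat = "org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat"
  val Serde        = "org.apache.hadoop.hive.ql.io.orc.OrcSerde"

  def applyTo(table: Table): Unit = {
    // Guard against a missing storage descriptor instead of failing with a NullPointerException.
    Option(table.getSd).foreach { sd =>
      sd.setInputFormat(InputFormat)
      sd.setOutputFormat(OutputFormat)
      // Only touch the SerDe when the table already carries SerDe information.
      Option(sd.getSerdeInfo).foreach(_.setSerializationLib(Serde))
    }
  }
}
```

With such a helper, both the CREATE_TABLE and ALTER_TABLE branches could call OrcStorage.applyTo on the table they extract from the event.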

The best use case for this listener is when somebody wants to query a data source, such as Spark or any other, through its own custom input format without altering the schema of the Hive table to use that custom input format.
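
For the custom input format use case, the format class names could come from configuration instead of being hard-coded. Here is a minimal sketch, assuming the Configuration handed to the listener exposes properties set in hive-site.xml; the property names and the StorageFormats type are hypothetical and chosen only for illustration.

```scala
import org.apache.hadoop.conf.Configuration

// Hypothetical holder for the format class names the listener should apply.
final case class StorageFormats(inputFormat: String, outputFormat: String)

object StorageFormats {
  // Made-up property names; they are not Hive settings.
  val InputKey  = "metastorelisteners.input.format"
  val OutputKey = "metastorelisteners.output.format"

  // Falls back to the ORC formats used in this article when a property is not set.
  def fromConf(conf: Configuration): StorageFormats =
    StorageFormats(
      conf.get(InputKey, "org.apache.hadoop.hive.ql.io.orc.OrcInputFormat"),
      conf.get(OutputKey, "org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat")
    )
}
```

Inside onEvent, the listener could call StorageFormats.fromConf(conf) and pass the resulting names to setInputFormat and setOutputFormat.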

Now let's build a jar from the code and use it in Hive.

First, add the sbt-assembly plugin to your project/plugins.sbt file:

addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.5")

Now go to your project root and build the jar with the command sbt assembly.

This builds the jar. Copy it into your $HIVE_HOME/lib directory.

Inside the $HIVE_HOME/conf folder, add the following contents to hive-site.xml:
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby:metastore_db;create=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
</property>
<property>
<name>hive.metastore.pre.event.listeners</name>
<value>metastorelisteners.OrcMetastoreListener</value>
</property>
</configuration>

Now create a table in Hive and describe it:

hive> CREATE TABLE HIVETABLE(ID INT);
OK
Time taken: 2.742 seconds
hive> DESC EXTENDED HIVETABLE
    > ;
OK
id                  	int                 	                    
	 	 
Detailed Table Information Table(tableName:hivetable, dbName:default, owner:hduser,e, inputFormat:org.apache.hadoop.hive.ql.io.orc.OrcInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat, compressed:false, 
Time taken: 0.611 seconds, Fetched: 3 row(s)

