How to send GitHub commits and PR logs to Elasticsearch using a custom script

background

Hello Readers!! In this blog, we will see how we can send GitHub commits and PR logs to Elasticsearch using a custom script. Here we will use a bash script that sends GitHub logs to Elasticsearch: it creates an index in Elasticsearch and pushes the logs there.

After sending logs to Elasticsearch, we can visualize the following GitHub events in Kibana:

  • Track details of commits made to the GitHub repository
  • Track PR-related events in the GitHub repository over time
  • Analyze other relevant information related to the GitHub repository
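
For reference, each commit ends up in Elasticsearch as a small JSON document built by the script shown later in this blog; the field names come from that script, and the values here are only illustrative:

{
  "commit_sha": "3f2a1c9e...",
  "branch_name": "main",
  "author_name": "NaincyKumariKnoldus",
  "commit_message": "Update README",
  "commit_html_url": "https://github.com/NaincyKumariKnoldus/Github_logs/commit/3f2a1c9e...",
  "commit_time": "2022-11-28T10:15:30Z"
}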
workflow

1. GitHub User: The user who performs actions in the GitHub repository, such as commits and pull requests.

2. GitHub Repository: The source code management system on which users perform these actions.

3. GitHub Actions: The continuous integration and continuous delivery (CI/CD) platform; a workflow runs each time a GitHub user pushes a commit or raises a pull request.

4. Bash Script: The custom script, written in Bash, that ships GitHub logs to Elasticsearch.

5. Elasticsearch: Stores all of the logs in the indices created.

6. Kibana: Web interface for searching and visualizing logs.

Steps for sending logs to Elasticsearch using a bash script:

1. GitHub users will make commits and raise pull requests in the GitHub repository. Here is the GitHub repository which I have created for this blog:

https://github.com/NaincyKumariKnoldus/Github_logs

github repo

2. Create two GitHub Actions workflows in this repository. These workflows will get triggered by the events performed by the GitHub user.
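
For orientation, here is roughly how the files created in the next steps are laid out in the repository (workflow files must live under .github/workflows/, and the scripts sit at the repository root so the runner can call them as ./git_commit.sh and ./git_pr.sh):

Github_logs/
├── .github/
│   └── workflows/
│       ├── commit_workflow.yml
│       └── pr_workflow.yml
├── git_commit.sh
└── git_pr.sh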

github actions

GitHub Actions workflow file that gets triggered on commit (push) events:

commit_workflow.yml:

# The name of the workflow
name: CI
#environment variables
env:
    GITHUB_REF_NAME: $GITHUB_REF_NAME
    ES_URL: ${{ secrets.ES_URL }}
 
# Controls when the workflow will run
on: [push]
#A job is a set of steps in a workflow
jobs:
    send-push-events:
        name: Push Logs to ES
        #The job will run on the latest version of an Ubuntu Linux runner.
        runs-on: ubuntu-latest
        steps:
           #This is an action that checks out your repository onto the runner, allowing you to run scripts
           - uses: actions/checkout@v2
           #The run keyword tells the job to execute a command on the runner
           - run: ./git_commit.sh
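
Note that ES_URL is read from the repository secrets, so it has to be created beforehand (Settings → Secrets and variables → Actions), and the scripts need the executable bit set for the run step to work. A minimal sketch of that setup using the GitHub CLI, assuming your Elasticsearch endpoint is https://your-es-host:9200:

# store the Elasticsearch endpoint as a repository secret
gh secret set ES_URL --body "https://your-es-host:9200"

# make the scripts executable and commit the permission change
chmod +x git_commit.sh git_pr.sh
git add git_commit.sh git_pr.sh
git commit -m "Make log-shipping scripts executable"
git push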

GitHub Actions workflow file that gets triggered on pull request events:

pr_workflow.yml:

name: CI
 
env:
  GITHUB_REF_NAME: $GITHUB_REF_NAME
  ES_URL: ${{ secrets.ES_URL }}
 
on: [pull_request]
jobs:
  send-pull-events:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - run: ./git_pr.sh
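
By default, on: [pull_request] fires for the opened, synchronize, and reopened activity types. If you also want closed or merged pull requests to be re-indexed as soon as they happen, you could widen the trigger, for example:

on:
  pull_request:
    types: [opened, reopened, synchronize, closed]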

3. Create two files inside your GitHub repository to hold the bash scripts. Following are the bash scripts for shipping GitHub logs to Elasticsearch. These scripts will get executed by the GitHub Actions workflows mentioned above.

git_commit.sh will be triggered by the GitHub Actions workflow file commit_workflow.yml:

#!/bin/bash

# get github commits
getCommitResponse=$(
   curl -s \
      -H "Accept: application/vnd.github+json" \
      -H "X-GitHub-Api-Version: 2022-11-28" \
      "https://api.github.com/repos/NaincyKumariKnoldus/Github_logs/commits?sha=$GITHUB_REF_NAME&per_page=100&page=1"
)

# get commit SHA
commitSHA=$(echo "$getCommitResponse" |
   jq '.[].sha' |
   tr -d '"')

# get the loop count based on number of commits
loopCount=$(echo "$commitSHA" |
   wc -w)
echo "loopcount= $loopCount"

# get data from ES
getEsCommitSHA=$(curl -H "Content-Type: application/json" -X GET "$ES_URL/github_commit/_search?pretty" -d '{
                  "size": 10000,                                                                  
                  "query": {
                     "wildcard": {
                           "commit_sha": {
                              "value": "*"
                           }}}}' |
                  jq '.hits.hits[]._source.commit_sha' |
                  tr -d '"')

# store ES commit sha in a temp file
echo $getEsCommitSHA | tr " " "\n" > sha_es.txt

# looping through each commit detail
for ((count = 0; count < $loopCount; count++)); do
   
   # get commitSHA
   commitSHA=$(echo "$getCommitResponse" |
      jq --argjson count "$count" '.[$count].sha' |
      tr -d '"')

   # match result for previous existing commit on ES
   matchRes=$(grep -o "$commitSHA" sha_es.txt)
   echo $matchRes | tr " " "\n" >> match.txt

   # filtering and pushing unmatched commit sha details to ES
   if [ -z "$matchRes" ]; then
      echo "Unmatched SHA: $commitSHA"
      echo $commitSHA | tr " " "\n" >> unmatch.txt
      
      # get author name
      authorName=$(echo "$getCommitResponse" |
         jq --argjson count "$count" '.[$count].commit.author.name' |
         tr -d '"')

      # get commit message
      commitMessage=$(echo "$getCommitResponse" |
         jq --argjson count "$count" '.[$count].commit.message' |
         tr -d '"')

      # get commit html url
      commitHtmlUrl=$(echo "$getCommitResponse" |
         jq --argjson count "$count" '.[$count].html_url' |
         tr -d '"')

      # get commit time
      commitTime=$(echo "$getCommitResponse" |
         jq --argjson count "$count" '.[$count].commit.author.date' |
         tr -d '"')

      # send data to es
      curl -X POST "$ES_URL/github_commit/commit" \
         -H "Content-Type: application/json" \
         -d "{ \"commit_sha\" : \"$commitSHA\",
            \"branch_name\" : \"$GITHUB_REF_NAME\",
            \"author_name\" : \"$authorName\",
            \"commit_message\" : \"$commitMessage\",
            \"commit_html_url\" : \"$commitHtmlUrl\",
            \"commit_time\" : \"$commitTime\" }"
   fi
done

# removing temporary file
rm -rf sha_es.txt
rm -rf match.txt
rm -rf unmatch.txt
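
A couple of notes on this script: it posts to the github_commit index with a commit mapping type, which works on Elasticsearch versions that still support mapping types; on Elasticsearch 7+ the document endpoint would be $ES_URL/github_commit/_doc instead. To quickly check that documents are landing, you can count them, for example:

# count the commit documents indexed so far
curl -s "$ES_URL/github_commit/_count?pretty"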

git_pr.sh will be triggered by the GitHub Actions workflow file pr_workflow.yml:

#!/bin/bash

# get github PR details
getPrResponse=$(curl -s \
  -H "Accept: application/vnd.github+json" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  "https://api.github.com/repos/NaincyKumariKnoldus/Github_logs/pulls?state=all&per_page=100&page=1")

# get number of PR
totalPR=$(echo "$getPrResponse" |
  jq '.[].number' |
  tr -d '"')

# get the loop count based on number of PRs
loopCount=$(echo "$totalPR" |
  wc -w)
echo "loopcount= $loopCount"

# get data from ES
getEsPR=$(curl -H "Content-Type: application/json" -X GET "$ES_URL/github_pr/_search?pretty" -d '{
                  "size": 10000,                                                                  
                  "query": {
                     "wildcard": {
                           "pr_number": {
                              "value": "*"
                           }}}}' |
                  jq '.hits.hits[]._source.pr_number' |
                  tr -d '"')

# store ES PR number in a temp file
echo $getEsPR | tr " " "\n" > sha_es.txt

# looping through each PR detail
for ((count = 0; count < $loopCount; count++)); do

  # get PR_number
  totalPR=$(echo "$getPrResponse" |
    jq --argjson count "$count" '.[$count].number' |
    tr -d '"')
  
  # match result for previously existing PR on ES
  matchRes=$(grep -ow "$totalPR" sha_es.txt)
  echo $matchRes | tr " " "\n" >>match.txt

  # filtering and pushing unmatched PR number details to ES
  if [ -z "$matchRes" ]; then
    # get PR html url
    PrHtmlUrl=$(echo "$getPrResponse" |
      jq --argjson count "$count" '.[$count].html_url' |
      tr -d '"')

    # get PR Body
    PrBody=$(echo "$getPrResponse" |
      jq --argjson count "$count" '.[$count].body' |
      tr -d '"')

    # get PR Number
    PrNumber=$(echo "$getPrResponse" |
      jq --argjson count "$count" '.[$count].number' |
      tr -d '"')

    # get PR Title
    PrTitle=$(echo "$getPrResponse" |
      jq --argjson count "$count" '.[$count].title' |
      tr -d '"')

    # get PR state
    PrState=$(echo "$getPrResponse" |
      jq --argjson count "$count" '.[$count].state' |
      tr -d '"')

    # get PR created at
    PrCreatedAt=$(echo "$getPrResponse" |
      jq --argjson count "$count" '.[$count].created_at' |
      tr -d '"')

    # get PR closed at
    PrCloseAt=$(echo "$getPrResponse" |
      jq --argjson count "$count" '.[$count].closed_at' |
      tr -d '"')

    # get PR merged at
    PrMergedAt=$(echo "$getPrResponse" |
      jq --argjson count "$count" '.[$count].merged_at' |
      tr -d '"')

    # get base branch name
    PrBaseBranch=$(echo "$getPrResponse" |
      jq --argjson count "$count" '.[$count].base.ref' |
      tr -d '"')

    # get source branch name
    PrSourceBranch=$(echo "$getPrResponse" |
      jq --argjson count "$count" '.[$count].head.ref' |
      tr -d '"')

    # send data to es
    curl -X POST "$ES_URL/github_pr/pull_request" \
      -H "Content-Type: application/json" \
      -d "{ \"pr_number\" : \"$PrNumber\",
            \"pr_url\" : \"$PrHtmlUrl\",
            \"pr_title\" : \"$PrTitle\",
            \"pr_body\" : \"$PrBody\",
            \"pr_base_branch\" : \"$PrBaseBranch\",
            \"pr_source_branch\" : \"$PrSourceBranch\",
            \"pr_state\" : \"$PrState\",
            \"pr_creation_time\" : \"$PrCreatedAt\",
            \"pr_closed_time\" : \"$PrCloseAt\",
            \"pr_merge_at\" : \"$PrMergedAt\"}"
  fi
done

# removing temporary file
rm -rf sha_es.txt
rm -rf match.txt
rm -rf unmatch.txt
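
As with the commit script, you can sanity-check the PR index with a quick query, for example:

# fetch one pull request document from the index
curl -s "$ES_URL/github_pr/_search?size=1&pretty"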

4. Now make a push to the GitHub repository. After you make a commit, the GitHub Actions workflow on push will run and send the commit logs to Elasticsearch.

commit action

Move to your Elasticsearch instance to see the GitHub commit logs there.

es_data

We are now getting the GitHub commit logs here.

5. Now raise a pull request in your GitHub repository. This will run the GitHub Actions workflow on pull request, which will trigger the bash script that pushes the pull request logs to Elasticsearch.

pull

The GitHub Actions workflow got executed on the pull request:

github action

Now, move to Elasticsearch and you will find the pull request logs there.

es_pull data

6. We can also visualize these logs in Kibana.
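
To get the documents into Kibana, create index patterns (data views in newer versions) for github_commit and github_pr, using commit_time and pr_creation_time as the time fields, assuming Elasticsearch's dynamic mapping has detected those ISO timestamps as date fields. You can also confirm the data from the Dev Tools console with a quick query like:

GET github_commit/_search
{
  "size": 5,
  "query": { "match_all": {} }
}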

GitHub commit logs in Kibana:

kibana data

GitHub pull request logs in Kibana:

kibana

This is how we can analyze our GitHub logs in Elasticsearch and Kibana using a custom script.

We are all done now!!

Conclusion:

Thank you for sticking around to the end. In this blog, we have learned how to send GitHub commits and PR logs to Elasticsearch using a custom script. It is quick and simple to set up. If you like this blog, please share it and show your appreciation with a thumbs-up, and don't forget to send me suggestions on how I can improve future blogs to better suit your needs.

HAPPY LEARNING! 

Written by 

Naincy Kumari is a DevOps Consultant at Knoldus Inc. She is always ready to learn new technologies and tools. She loves painting and dancing.
