Hello Readers! In this blog, we will see how to send GitHub commit and PR logs to Elasticsearch using a custom bash script. An index is created in Elasticsearch and the logs are pushed to it.
After sending the logs to Elasticsearch, we can visualize the following GitHub events in Kibana:
- Track commit details made to the GitHub repository
- Track events related to PRs in the GitHub repository over time
- Analyze relevant information related to the GitHub repository
The setup consists of the following components:
1. GitHub User: Users perform actions in a GitHub repository, such as commits and pull requests.
2. GitHub Repository: Source code management system on which users perform actions.
3. GitHub Action: Continuous integration and continuous delivery (CI/CD) platform that runs each time a GitHub user commits a change or raises a pull request.
4. Bash Script: The custom script, written in bash, that ships GitHub logs to Elasticsearch.
5. Elasticsearch: Stores all of the logs in the index created.
6. Kibana: Web interface for searching and visualizing logs.
Steps for sending logs to Elasticsearch using a bash script:
1. GitHub users make commits and raise pull requests to the GitHub repository. Here is the GitHub repository I created for this blog:
https://github.com/NaincyKumariKnoldus/Github_logs
2. Create two GitHub Actions workflows in this repository. They are triggered by the events performed by the GitHub user.
GitHub Actions workflow file triggered on commit (push) events:
commit_workflow.yml:
# The name of the workflow
name: CI
# Environment variables
env:
  GITHUB_REF_NAME: $GITHUB_REF_NAME
  ES_URL: ${{ secrets.ES_URL }}
# Controls when the workflow will run
on: [push]
# A job is a set of steps in a workflow
jobs:
  send-push-events:
    name: Push Logs to ES
    # The job will run on the latest version of an Ubuntu Linux runner.
    runs-on: ubuntu-latest
    steps:
      # This is an action that checks out your repository onto the runner, allowing you to run scripts
      - uses: actions/checkout@v2
      # The run keyword tells the job to execute a command on the runner
      - run: ./git_commit.sh
GitHub Actions workflow file triggered on pull request events:
pr_workflow.yml:
name: CI
env:
  GITHUB_REF_NAME: $GITHUB_REF_NAME
  ES_URL: ${{ secrets.ES_URL }}
on: [pull_request]
jobs:
  send-pull-events:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - run: ./git_pr.sh
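The `run: ./git_commit.sh` and `run: ./git_pr.sh` steps above only work if both scripts are committed with the executable bit set, because the checkout step restores the file modes stored in Git. A minimal pre-flight sketch follows; `check_executable` is a hypothetical helper name, not something from the original repository.

```shell
# Hypothetical pre-flight helper (not part of the original repository):
# report every file passed as an argument that lacks the executable bit.
check_executable() {
  ce_status=0
  for ce_file in "$@"; do
    if [ ! -x "$ce_file" ]; then
      echo "not executable: $ce_file" >&2
      ce_status=1
    fi
  done
  return "$ce_status"
}

# Typical fix, run once from the repository root and then committed:
#   chmod +x git_commit.sh git_pr.sh
#   git update-index --chmod=+x git_commit.sh git_pr.sh
```

Running `check_executable git_commit.sh git_pr.sh` locally before pushing returns a non-zero status if either script would fail with "Permission denied" on the runner.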
3. Create two files inside your GitHub repository to hold the bash scripts. The following bash scripts ship GitHub logs to Elasticsearch and are executed by the GitHub Actions workflows mentioned above.
git_commit.sh is triggered by the GitHub Actions workflow file commit_workflow.yml:
#!/bin/bash

# get github commits
getCommitResponse=$(
  curl -s \
    -H "Accept: application/vnd.github+json" \
    -H "X-GitHub-Api-Version: 2022-11-28" \
    "https://api.github.com/repos/NaincyKumariKnoldus/Github_logs/commits?sha=$GITHUB_REF_NAME&per_page=100&page=1"
)

# get commit SHA
commitSHA=$(echo "$getCommitResponse" |
  jq '.[].sha' |
  tr -d '"')

# get the loop count based on number of commits
loopCount=$(echo "$commitSHA" |
  wc -w)
echo "loopcount= $loopCount"

# get data from ES
getEsCommitSHA=$(curl -H "Content-Type: application/json" -X GET "$ES_URL/github_commit/_search?pretty" -d '{
  "size": 10000,
  "query": {
    "wildcard": {
      "commit_sha": {
        "value": "*"
      }}}}' |
  jq '.hits.hits[]._source.commit_sha' |
  tr -d '"')

# store ES commit sha in a temp file (one SHA per line)
echo $getEsCommitSHA | tr " " "\n" > sha_es.txt

# looping through each commit detail
for ((count = 0; count < $loopCount; count++)); do

  # get commitSHA
  commitSHA=$(echo "$getCommitResponse" |
    jq --argjson count "$count" '.[$count].sha' |
    tr -d '"')

  # match result for previous existing commit on ES
  matchRes=$(grep -o "$commitSHA" sha_es.txt)
  echo $matchRes | tr " " "\n" >> match.txt

  # filtering and pushing unmatched commit sha details to ES
  # (quoted so the test stays valid when grep returns nothing or several lines)
  if [ -z "$matchRes" ]; then
    echo "Unmatched SHA: $commitSHA"
    echo "$commitSHA" | tr " " "\n" >> unmatch.txt

    # get author name
    authorName=$(echo "$getCommitResponse" |
      jq --argjson count "$count" '.[$count].commit.author.name' |
      tr -d '"')

    # get commit message
    commitMessage=$(echo "$getCommitResponse" |
      jq --argjson count "$count" '.[$count].commit.message' |
      tr -d '"')

    # get commit html url
    commitHtmlUrl=$(echo "$getCommitResponse" |
      jq --argjson count "$count" '.[$count].html_url' |
      tr -d '"')

    # get commit time
    commitTime=$(echo "$getCommitResponse" |
      jq --argjson count "$count" '.[$count].commit.author.date' |
      tr -d '"')

    # send data to es
    curl -X POST "$ES_URL/github_commit/commit" \
      -H "Content-Type: application/json" \
      -d "{ \"commit_sha\" : \"$commitSHA\",
            \"branch_name\" : \"$GITHUB_REF_NAME\",
            \"author_name\" : \"$authorName\",
            \"commit_message\" : \"$commitMessage\",
            \"commit_html_url\" : \"$commitHtmlUrl\",
            \"commit_time\" : \"$commitTime\" }"
  fi
done

# removing temporary files
rm -rf sha_es.txt
rm -rf match.txt
rm -rf unmatch.txt
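The script above avoids duplicates by downloading every stored `commit_sha` and comparing locally. One alternative, sketched here and not taken from the original repository, is to use the commit SHA as the Elasticsearch document ID and write through the `_create` endpoint, which refuses to overwrite an existing ID. This assumes a cluster that accepts the typeless `/<index>/_create/<id>` form (Elasticsearch 7 or later); `es_create_url` and `index_once` are hypothetical helper names.

```shell
# Build the URL that creates a document only if its ID is not already taken.
es_create_url() {
  # $1 = Elasticsearch base URL, $2 = index name, $3 = document ID
  printf '%s/%s/_create/%s\n' "${1%/}" "$2" "$3"
}

# Send one JSON document and print the HTTP status code
# (201 = created, 409 = a document with this ID already exists).
index_once() {
  # $1 = base URL, $2 = index name, $3 = document ID, $4 = JSON body
  curl -s -o /dev/null -w '%{http_code}\n' \
    -X PUT "$(es_create_url "$1" "$2" "$3")" \
    -H "Content-Type: application/json" \
    -d "$4"
}

# Example (not run here; $payload is a JSON document built elsewhere):
#   index_once "$ES_URL" github_commit "$commitSHA" "$payload"
```

With this design the search request, the temporary files, and the `grep` comparison are no longer needed, at the cost of one rejected request per commit that is already indexed.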
git_pr.sh is triggered by the GitHub Actions workflow file pr_workflow.yml:
#!/bin/bash

# get github PR details
getPrResponse=$(curl -s \
  -H "Accept: application/vnd.github+json" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  "https://api.github.com/repos/NaincyKumariKnoldus/Github_logs/pulls?state=all&per_page=100&page=1")

# get number of PR
totalPR=$(echo "$getPrResponse" |
  jq '.[].number' |
  tr -d '"')

# get the loop count based on number of PRs
loopCount=$(echo "$totalPR" |
  wc -w)
echo "loopcount= $loopCount"

# get data from ES
getEsPR=$(curl -H "Content-Type: application/json" -X GET "$ES_URL/github_pr/_search?pretty" -d '{
  "size": 10000,
  "query": {
    "wildcard": {
      "pr_number": {
        "value": "*"
      }}}}' |
  jq '.hits.hits[]._source.pr_number' |
  tr -d '"')

# store ES PR number in a temp file (one number per line)
echo $getEsPR | tr " " "\n" > sha_es.txt

# looping through each PR detail
for ((count = 0; count < $loopCount; count++)); do

  # get PR_number
  totalPR=$(echo "$getPrResponse" |
    jq --argjson count "$count" '.[$count].number' |
    tr -d '"')

  # match result for previously existing PR on ES
  # (-x matches whole lines only, so PR 1 is not mistaken for PR 10 or 21)
  matchRes=$(grep -x "$totalPR" sha_es.txt)
  echo $matchRes | tr " " "\n" >> match.txt

  # filtering and pushing unmatched PR number details to ES
  if [ -z "$matchRes" ]; then

    # get PR html url
    PrHtmlUrl=$(echo "$getPrResponse" |
      jq --argjson count "$count" '.[$count].html_url' |
      tr -d '"')

    # get PR Body
    PrBody=$(echo "$getPrResponse" |
      jq --argjson count "$count" '.[$count].body' |
      tr -d '"')

    # get PR Number
    PrNumber=$(echo "$getPrResponse" |
      jq --argjson count "$count" '.[$count].number' |
      tr -d '"')

    # get PR Title
    PrTitle=$(echo "$getPrResponse" |
      jq --argjson count "$count" '.[$count].title' |
      tr -d '"')

    # get PR state
    PrState=$(echo "$getPrResponse" |
      jq --argjson count "$count" '.[$count].state' |
      tr -d '"')

    # get PR created at
    PrCreatedAt=$(echo "$getPrResponse" |
      jq --argjson count "$count" '.[$count].created_at' |
      tr -d '"')

    # get PR closed at
    PrCloseAt=$(echo "$getPrResponse" |
      jq --argjson count "$count" '.[$count].closed_at' |
      tr -d '"')

    # get PR merged at
    PrMergedAt=$(echo "$getPrResponse" |
      jq --argjson count "$count" '.[$count].merged_at' |
      tr -d '"')

    # get base branch name
    PrBaseBranch=$(echo "$getPrResponse" |
      jq --argjson count "$count" '.[$count].base.ref' |
      tr -d '"')

    # get source branch name
    PrSourceBranch=$(echo "$getPrResponse" |
      jq --argjson count "$count" '.[$count].head.ref' |
      tr -d '"')

    # send data to es
    curl -X POST "$ES_URL/github_pr/pull_request" \
      -H "Content-Type: application/json" \
      -d "{ \"pr_number\" : \"$PrNumber\",
            \"pr_url\" : \"$PrHtmlUrl\",
            \"pr_title\" : \"$PrTitle\",
            \"pr_body\" : \"$PrBody\",
            \"pr_base_branch\" : \"$PrBaseBranch\",
            \"pr_source_branch\" : \"$PrSourceBranch\",
            \"pr_state\" : \"$PrState\",
            \"pr_creation_time\" : \"$PrCreatedAt\",
            \"pr_closed_time\" : \"$PrCloseAt\",
            \"pr_merge_at\" : \"$PrMergedAt\"}"
  fi
done

# removing temporary files
rm -rf sha_es.txt
rm -rf match.txt
rm -rf unmatch.txt
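Both scripts assemble the JSON payload by string interpolation after stripping double quotes, so a PR body or commit message that contains a newline or a backslash can produce invalid JSON. A possible alternative, shown only as a sketch, is to let `jq` build each document directly from the API response so that escaping is handled for you. The field names are the ones used in git_pr.sh; `pr_docs` is a hypothetical function name. Note one difference from the script above: empty timestamps are emitted as JSON `null` rather than the string "null".

```shell
# Read a GitHub "list pull requests" response on stdin and print one
# compact JSON document per pull request, one per line.
pr_docs() {
  jq -c '.[] | {
    pr_number: (.number | tostring),
    pr_url: .html_url,
    pr_title: .title,
    pr_body: .body,
    pr_base_branch: .base.ref,
    pr_source_branch: .head.ref,
    pr_state: .state,
    pr_creation_time: .created_at,
    pr_closed_time: .closed_at,
    pr_merge_at: .merged_at
  }'
}

# Example (not run here): send each document to Elasticsearch.
#   printf '%s\n' "$getPrResponse" | pr_docs |
#     while IFS= read -r doc; do
#       curl -X POST "$ES_URL/github_pr/pull_request" \
#         -H "Content-Type: application/json" -d "$doc"
#     done
```

This replaces the ten separate `jq` calls per pull request with a single pass over the response.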
4. Now push a commit to the GitHub repository. Once the commit is pushed, the GitHub Actions workflow for push events runs and sends the commit logs to Elasticsearch.
Go to your Elasticsearch instance to see the GitHub commit logs there.
We are now getting GitHub commits here.
5. Now raise a pull request in your GitHub repository. This runs the GitHub Actions workflow for pull requests, which triggers the bash script that pushes the pull request logs to Elasticsearch.
GitHub action got executed on the pull request:
Now go to Elasticsearch and you will find the pull request logs there.
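The checks in steps 4 and 5 can also be done from a terminal. The sketch below uses the Elasticsearch count API to report how many documents each index holds; it assumes `ES_URL` points at the same cluster the workflows use and that `curl` and `jq` are installed. `es_count_url` and `es_count` are hypothetical helper names.

```shell
# Build the URL of the count API for one index.
es_count_url() {
  # $1 = Elasticsearch base URL, $2 = index name
  printf '%s/%s/_count\n' "${1%/}" "$2"
}

# Print how many documents an index currently holds.
es_count() {
  curl -s "$(es_count_url "$1" "$2")" | jq '.count'
}

# Example (not run here):
#   es_count "$ES_URL" github_commit
#   es_count "$ES_URL" github_pr
```

If a count does not grow after a push or a new pull request, the log of the workflow run in the repository's Actions tab is the first place to look.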
6. We can also visualize these logs in Kibana.
GitHub commit logs in Kibana:
GitHub pull request logs in Kibana:
This is how we can analyze our GitHub logs in Elasticsearch and Kibana using the custom script.
We are all done now!!
Conclusion:
Thank you for sticking to the end. In this blog, we learned how to send GitHub commit and PR logs to Elasticsearch using a custom script. The setup is quick and simple. If you liked this blog, please share it, show your appreciation with a thumbs-up, and don't forget to send me suggestions on how I can improve my future blogs to suit your needs.
HAPPY LEARNING!