Audit DC/OS Service via Prometheus/Filebeat

In our previous blog post, How to audit DC/OS Services?, we learned how to audit a service/app in DC/OS locally via dcos-adminrouter.service. This post continues from there and explores how to audit DC/OS services via Prometheus or Filebeat.

Quick Recap

We saw that DC/OS doesn't provide any web interface to track changes in service/app configs, so we parsed the logs of dcos-adminrouter.service running on the master nodes and extracted the relevant information from the PUT requests.

Let’s track again… !!!

In this blog we will use our all-time favourite tools to gather metrics and logs and present them in a better format.

As we have seen, the logs may be scattered across the masters, so to track all the changes we need to audit each master.

Audit via Prometheus

The idea here is to expose the audit results as metrics and export them, so that they can later be used for auditing or shown on a Grafana dashboard as required.
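To make the shape of these metrics concrete, here is a minimal sketch of how one audit record could be rendered as a line in the Prometheus text format. The metric and label names are the ones the audit script in this post writes; the helper names escape_label and emit_audit_metric are made up for illustration, and the escaping step is an extra precaution that the original script does not perform.

```shell
# Escape a value for use inside a Prometheus label (backslash and double quote).
escape_label() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Print one sample line for a single observed change.
# Arguments: cluster, service, user (the three fields the audit script extracts).
emit_audit_metric() {
    printf 'service_change_audit_metric{cluster="%s",service_changed="%s",user="%s"} 1\n' \
        "$(escape_label "$1")" "$(escape_label "$2")" "$(escape_label "$3")"
}
```

For example, emit_audit_metric prod my-app alice prints a single sample with value 1, labelled with that cluster, service and user.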
Steps (repeat on all masters):

  • first, you need node_exporter up and running.
  • create the first script as `preflight.sh` with the following lines.
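#!/bin/bash
# Run with sudo: sets up the audit directory, the node_exporter flag and the cron job.

DIR=/var/log/dcos/service_audit
UNIT=/etc/systemd/system/node_exporter.service
mkdir -p "$DIR"
cp dcos_service_audit_prometheus.sh "$DIR/"
chmod u+x "$DIR/dcos_service_audit_prometheus.sh"
# Append the textfile collector flag to ExecStart, unless it is already there.
grep -q -- '--collector.textfile.directory' "$UNIT" || \
    sed -i "/^ExecStart/ s|\$| --collector.textfile.directory=$DIR/|" "$UNIT"
# systemd has to re-read the edited unit file before the restart picks it up.
systemctl daemon-reload
systemctl restart node_exporter
# Run the audit script every minute as root.
echo "* * * * * root $DIR/dcos_service_audit_prometheus.sh" > /etc/cron.d/dcos_service_audit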
  • now create the second script as dcos_service_audit_prometheus.sh with the following content.
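#!/bin/bash
# Turns the last minute of adminrouter audit entries into Prometheus textfile metrics.

SVC_AUDIT=/var/log/dcos/service_audit
mkdir -p "$SVC_AUDIT"
# Start from an empty temp file; node_exporter only reads files ending in .prom.
> "$SVC_AUDIT/audit_file.prom.tmp"
# Keep only audit entries for PUT requests, i.e. service/app changes.
journalctl -u "dcos-adminrouter.service" -r --since "1 min ago" | grep type=audit | grep PUT > "$SVC_AUDIT/audit_tmp_file.log"
while read -r log
do
    # $log is left unquoted on purpose: runs of spaces collapse before cut splits on them.
    CLUSTER=$(echo $log | cut -d"," -f5 | cut -d"\"" -f2 | cut -d"." -f1)
    USER=$(echo $log | awk -F'uid=' '{print $2}' | cut -d" " -f1 | cut -d"," -f1)
    SVC=$(echo $log | awk -F'apps/' '{print $2}' | cut -d" " -f1 | cut -d"?" -f1)
    MASTER=$(echo $log | cut -d" " -f4 | cut -d"." -f1)
    echo "service_change_audit_metric{cluster=\"$CLUSTER\",service_changed=\"$SVC\",user=\"$USER\"} 1" >> "$SVC_AUDIT/audit_file.prom.tmp"
    # Sample values must be numeric, so the master hostname is exported as a label.
    echo "service_change_observed_at_master{cluster=\"$CLUSTER\",master=\"$MASTER\"} 1" >> "$SVC_AUDIT/audit_file.prom.tmp"
done < "$SVC_AUDIT/audit_tmp_file.log"
# Drop duplicate series, then publish with a rename so a half-written file is never read.
sort -u -o "$SVC_AUDIT/audit_file.prom.tmp" "$SVC_AUDIT/audit_file.prom.tmp"
mv "$SVC_AUDIT/audit_file.prom.tmp" "$SVC_AUDIT/audit_file.prom"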
  • now run preflight.sh with sudo privileges; it will do the following for you:
    • create the standard directory structure for the DC/OS auditing.
    • copy the auditing script there.
    • update the node_exporter service file so that it reads the audit results as Prometheus metrics.
    • create a cron job that runs the audit script every minute.
  • your metrics should be visible now; make sure to test with curl localhost:9100/metrics
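Since node_exporter only exposes what it can parse, a quick sanity check of the generated file can save some head scratching. The following is a rough sketch, not an official validator: validate_prom_file is a hypothetical helper that only checks that each line looks like a metric name, an optional label set and a plain decimal value. It ignores special values such as NaN and label values that contain a closing brace.

```shell
# Return 0 if every non-empty, non-comment line looks like: name{labels} <number>
validate_prom_file() {
    awk '
        /^[ \t]*$/ || /^#/ { next }
        # metric name, optional {label set}, whitespace, then a decimal number
        !/^[a-zA-Z_:][a-zA-Z0-9_:]*(\{[^}]*\})?[ \t]+-?[0-9]+(\.[0-9]+)?$/ {
            printf "line %d is not a valid sample: %s\n", NR, $0
            bad = 1
        }
        END { exit bad }
    ' "$1"
}
```

A typical call would be validate_prom_file /var/log/dcos/service_audit/audit_file.prom; a non-zero exit status points to a line with, for instance, a hostname where a number is expected.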

Audit via Filebeat

The idea here is to get the audit results as logs, which are then shipped to the ELK stack.
Steps (repeat on all masters):

  • this time you need filebeat running on your master node.
  • create the first script as preflight.sh with the following lines.
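#!/bin/bash
# Run with sudo: sets up the audit directory, restarts filebeat and installs the cron job.

DIR=/var/log/dcos/service_audit
mkdir -p "$DIR"
cp dcos_service_audit_filebeat.sh "$DIR/"
chmod u+x "$DIR/dcos_service_audit_filebeat.sh"
# Restart filebeat so that it picks up the updated filebeat.yml.
systemctl restart filebeat
# Run the audit script every minute as root.
# Note: the Prometheus preflight writes the same cron file, so this replaces that entry.
echo "* * * * * root $DIR/dcos_service_audit_filebeat.sh" > /etc/cron.d/dcos_service_audit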
  • now create the second script as dcos_service_audit_filebeat.sh with the following content.
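#!/bin/bash
# Appends the last minute of adminrouter audit entries to a log file that filebeat ships.

SVC_AUDIT=/var/log/dcos/service_audit
mkdir -p "$SVC_AUDIT"
# Start from an empty temp file for this run.
> "$SVC_AUDIT/audit_log.tmp"
# Keep only audit entries for PUT requests, i.e. service/app changes.
journalctl -u "dcos-adminrouter.service" -r --since "1 min ago" | grep type=audit | grep PUT > "$SVC_AUDIT/audit_tmp_file.log"
while read -r log
do
    # $log is left unquoted on purpose: runs of spaces collapse before cut splits on them.
    CLUSTER=$(echo $log | cut -d"," -f5 | cut -d"\"" -f2 | cut -d"." -f1)
    USER=$(echo $log | awk -F'uid=' '{print $2}' | cut -d" " -f1 | cut -d"," -f1)
    SVC=$(echo $log | awk -F'apps/' '{print $2}' | cut -d" " -f1 | cut -d"?" -f1)
    MASTER=$(echo $log | cut -d" " -f4 | cut -d"." -f1)
    echo "cluster: $CLUSTER, service_changed: $SVC, user: $USER" >> "$SVC_AUDIT/audit_log.tmp"
done < "$SVC_AUDIT/audit_tmp_file.log"
# Append this run's entries to the log file that filebeat tails.
cat "$SVC_AUDIT/audit_log.tmp" >> "$SVC_AUDIT/dcos_service_audit.log"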
  • update your filebeat.yml by adding the following entry under prospectors or inputs (depending on your Filebeat version; on Filebeat 6.0 and later the key is type: log rather than input_type: log).
- input_type: log
  paths: [/var/log/dcos/service_audit/dcos_service_audit.log]
  fields: {type: dcos-service-audit}
  • now run preflight.sh with sudo privileges; it will do the following for you:
    • create the standard directory structure for the DC/OS auditing.
    • copy the auditing script there.
    • restart the filebeat service so that it runs with the updated config file.
    • create a cron job that runs the audit script every minute.
  • now check your logs; they should be available in Kibana (if you use ELK).
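If you want a quick look at the audit log on the master itself, before or without Kibana, a small summary helps. This sketch assumes the log lines have exactly the format written by the audit script above (cluster: ..., service_changed: ..., user: ...); changes_per_user is a hypothetical helper name.

```shell
# Print "<count> <user>" for each user found in the audit log, busiest first.
changes_per_user() {
    awk -F', ' '
        {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^user: /) {
                    u = $i
                    sub(/^user: /, "", u)   # strip the field name, keep the value
                    count[u]++
                }
            }
        }
        END { for (u in count) print count[u], u }
    ' "$1" | sort -rn
}
```

On a master node this would be called as changes_per_user /var/log/dcos/service_audit/dcos_service_audit.log.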
