More (Advanced) Querying CIF Data With Splunk

My last post on querying Collective Intelligence Framework (CIF) data with Splunk showed you how to enable CIF as part of Splunk workflows for ad-hoc lookups. But what if you want to take it a step further and actually cross-reference events against CIF? The easiest method I’ve found is to run periodic CIF queries, save the results to a comma-delimited (.csv) file, and then use file-based lookups in Splunk to cross-reference events. It’s not very difficult to set up and you’ll quickly see the value. All you’ll need outside of Splunk is access to a CIF server to create the .csv files.

First we need to export the query results from CIF:

cif -q infrastructure -s medium -c 85 -p csv -O infrastructure.csv

This command queries CIF for any “infrastructure” results with a severity rating of medium or higher and a confidence rating of 85 or higher. Infrastructure results are IP-based threats related to botnet command and control (CnC) and infection sources. You’ll then have an infrastructure.csv file that looks something like this:

We need to remove the “#” and following space from the first row fields so it looks like this:

If you don’t remove the “#<space>” from the first row, Splunk will not parse the field names. (In other words, it won’t work.)
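If you would rather not edit the header by hand, a few lines of Python can do it. This is only a sketch: the function names are my own, and the sample row in the comment is made up rather than real CIF output. It touches the first line only, so any “#” that appears later in the data is left alone.

```python
from pathlib import Path


def strip_header_comment(csv_text: str) -> str:
    """Drop a leading "# " from the first line; leave every other line as-is."""
    header, newline, rest = csv_text.partition("\n")
    if header.startswith("# "):
        header = header[2:]
    return header + newline + rest


def clean_file(path: str) -> None:
    """Rewrite the file at `path` in place with the cleaned header."""
    p = Path(path)
    p.write_text(strip_header_comment(p.read_text()))


# Hypothetical example: a header of "# address,confidence" becomes
# "address,confidence". To clean a real export, call:
#   clean_file("infrastructure.csv")
```

Running it twice is harmless, because a header without the “# ” prefix is returned unchanged.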

Fire up Splunk and head to Manager > Lookups > Lookup table files and create a new lookup file. Specify which app you would like the lookup table associated with, then browse to the file and specify what you want it to be named when it gets copied up to the Splunk server. The Search app is probably the best default option. Once saved, you can change the permissions to make the lookup file available in other apps.

At this point you can already run some very handy queries, for example:

sourcetype=bro_conn [|inputlookup infrastructure.csv | rename address as dest_ip | fields + dest_ip]

This query searches our Bro IDS connection log, using a subsearch that runs the “inputlookup” command against our infrastructure.csv file. The “rename address as dest_ip” tells Splunk to rename the address field to dest_ip, and the “fields + dest_ip” keeps only that field, so the subsearch hands back just the dest_ip values for the outer search to match on.
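It can help to picture what the subsearch turns into. Roughly speaking, Splunk expands the returned rows into a chain of OR’d field comparisons that filters the outer search. The Python below imitates that expansion for illustration only; the function name and sample addresses are made up, and Splunk’s real formatting of the expanded search differs in the details.

```python
import csv
import io


def subsearch_filter(csv_text, csv_field="address", event_field="dest_ip"):
    """Build an OR'd filter from one CSV column, renamed to the event field."""
    rows = csv.DictReader(io.StringIO(csv_text))
    terms = [f'({event_field}="{row[csv_field]}")' for row in rows]
    return " OR ".join(terms)


# Made-up lookup contents using documentation-only IP ranges.
sample = "address,confidence\n192.0.2.10,85\n198.51.100.7,95\n"
print("sourcetype=bro_conn " + subsearch_filter(sample))
```

Each row in the lookup file adds one more term, which is why only the dest_ip column is kept: any extra field returned by the subsearch would become an extra condition the event has to satisfy.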

If you just want to gauge how many connections from your network go to IPs listed in CIF, add the stats command:

sourcetype=bro_conn [|inputlookup infrastructure.csv | rename address as dest_ip | fields + dest_ip] | stats count by dest_ip

This will show you all the dest_ip addresses in bro_conn that match CIF and provide a count of the number of those events for each IP. You can repeat this process for CIF domains and URLs quite easily to perform the same types of queries against DNS or HTTP logs (like Bro’s dns.log and http.log). We can, however, take it a step further with automated lookups.

Go back to Splunk Manager > Lookups and add a new Lookup definition. Specify the app, give it a name, make sure the “Type” is set to “File-based” and then select infrastructure.csv in the “Lookup file” drop-down. Save it and adjust the permissions accordingly. Then go back to Lookups, but this time add a new “Automatic lookup.”

Once again we choose the app, give it a name and choose the lookup table. We then specify which sourcetype we want to check against CIF. The “Lookup input fields” tell Splunk to check the address field in the infrastructure.csv file against the dest_ip field for our bro_conn sourcetype. We then have to specify which fields we want to output from the lookup if there is a match. Here you can define every field in the infrastructure.csv file or just the ones that you want to be able to reference in the events.

All we are doing is telling Splunk that when it finds a match on the “address” field in the .csv file, it should display that field in the event as cif_address. (I recommend you add a prefix to avoid potential field name conflicts.) The fields defined in the above screenshot should give you plenty to get started. Save it and you’re ready to query.
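Conceptually, the automatic lookup behaves like a keyed join between each event and the .csv file. The sketch below mimics that in Python so the input and output mapping is easier to see. It is an illustration, not how Splunk implements lookups; the function names and sample values are made up, and it prefixes every column where the Splunk dialog lets you pick the output fields individually.

```python
import csv
import io


def load_lookup(csv_text, key="address"):
    """Index the lookup rows by the lookup input field (the CSV 'address' column)."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(csv_text))}


def enrich(event, table, input_field="dest_ip", prefix="cif_"):
    """Return a copy of the event, adding prefixed lookup fields when dest_ip matches."""
    enriched = dict(event)
    match = table.get(event.get(input_field))
    if match is not None:
        for name, value in match.items():
            enriched[prefix + name] = value
    return enriched


# Made-up lookup row and events, for illustration only.
table = load_lookup("address,impact,confidence\n192.0.2.10,botnet,85\n")
hit = enrich({"src_ip": "10.0.0.5", "dest_ip": "192.0.2.10"}, table)
miss = enrich({"src_ip": "10.0.0.5", "dest_ip": "203.0.113.9"}, table)
print(hit)   # gains cif_address, cif_impact, cif_confidence
print(miss)  # unchanged: no cif_ fields at all
```

Events whose dest_ip is not in the file simply gain no cif_ fields, so the presence of any cif_ field is what marks an event as a CIF match.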

Doing a simple search against the bro_conn sourcetype won’t provide any CIF results on its own. In order to get the automatic lookup to kick in, we have to tell Splunk that we want one of our CIF output fields as a result.

sourcetype=bro_conn cif_impact=*

Running that query with the automatic lookup properly configured will now get you the CIF matches for any dest_ip addresses, and you’ll also see the output fields we defined in the field picker.

So now you can run queries like this:

sourcetype=bro_conn cif_impact=* | table src_ip dest_ip cif_confidence cif_severity cif_impact cif_portlist cif_cc

To get results like this:

If you want to turn it into a dashboard, you might end up with something like this:

You can pretty easily automate the whole process: exporting the CIF query to .csv, removing the “#<space>” from the first row, and copying the file to the Splunk lookup folder. Once you’ve created the lookup file in Splunk, you can update it at any time without having to let Splunk know. The path to the file on your Splunk server would be something like /opt/splunk/etc/apps/<app name>/lookups/infrastructure.csv.
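Here is one way that automation might look in Python, suitable for running from cron. Treat it as a sketch under a few assumptions: the script runs on a host that has both the cif client and write access to the Splunk lookups folder, the cif command is the same one shown earlier in this post, and the function names are my own. The cleaned file is written to a temporary name and then swapped into place, so Splunk never reads a half-written lookup.

```python
import os
import subprocess
import tempfile

# Same query as the manual export earlier in the post; "-O <file>" is appended below.
CIF_CMD = ["cif", "-q", "infrastructure", "-s", "medium", "-c", "85", "-p", "csv"]


def export_cif(out_path):
    """Run the CIF export; raises CalledProcessError if cif exits non-zero."""
    subprocess.run(CIF_CMD + ["-O", out_path], check=True)


def install_lookup(export_path, lookup_path):
    """Strip "# " from the header row, then atomically replace the lookup file."""
    with open(export_path, newline="") as src:
        lines = src.readlines()
    if lines and lines[0].startswith("# "):
        lines[0] = lines[0][2:]
    lookup_dir = os.path.dirname(os.path.abspath(lookup_path))
    # The temp file lives in the destination folder so os.replace stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=lookup_dir, suffix=".tmp")
    with os.fdopen(fd, "w", newline="") as tmp:
        tmp.writelines(lines)
    os.chmod(tmp_path, 0o644)  # mkstemp creates 0600; Splunk may run as another user
    os.replace(tmp_path, lookup_path)


def refresh(lookup_path):
    """Export from CIF into a scratch folder, then install the cleaned file."""
    with tempfile.TemporaryDirectory() as workdir:
        export_path = os.path.join(workdir, "infrastructure.csv")
        export_cif(export_path)
        install_lookup(export_path, lookup_path)


# Example call (substitute your own app name in the path):
#   refresh("/opt/splunk/etc/apps/<app name>/lookups/infrastructure.csv")
```

If the CIF server and the Splunk server are different machines, the copy step would need scp or similar in place of the local file replace.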

I plan to get some CIF dashboards built into the Security Onion for Splunk app over the next few months. (If you can’t tell, I’ve already been working on them.) They’ll only benefit users who have access to a CIF server to perform the export queries for the file-based lookups, but it’s not that difficult to stand up your own CIF server, and being able to correlate your network activity with current community-based intelligence is more than worth the effort.
