You may have heard that Python is extremely prevalent in security engineering/automation and that you must learn it to work effectively as a SecEng. Although there are probably some exceptions to the rule, I would generally agree with this statement.

Overall, it is an extremely effective tool to get things done in this field. I don't claim to be a Python expert, but I know a little bit more than enough to be dangerous.

So what does typical, practical Python even look like in security engineering? Where do people use it?

In my experience, writing Python as a SecEng is almost always for:

  • Sending logs to a SIEM
    • You need to collect data from an app/service and send it to your SIEM for alerting or reporting purposes.
    • For this, using Splunk as the example SIEM, you would use your favorite HTTP library (e.g., requests) to pull the data from the app/service and then send it to Splunk using an HTTP Event Collector endpoint.
  • Security automation workflows
    • You need to respond to some triggered event (a webhook) originating from an app/service.
    • For this, you might create a route within a Flask app that can receive requests from said app/service and then do something with that data. Like... send a Slack message, send a lock command to someone's machine, or order 20 pizzas.

Keep in mind that these are just examples, and there are countless, nuanced ways to accomplish each of them (e.g., you might use a SOAR tool, AWS Lambda, or AWS API Gateway for certain parts of the process).
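Here is a rough sketch of what the security automation workflow could look like in Flask. It assumes the sending service signs each request body with HMAC-SHA256 and a shared secret, which many webhook providers do but yours may not. The X-Signature header, the /webhook path, and the event types are all made up for illustration, and @app.post assumes Flask 2.0 or later.

```python
import hashlib
import hmac


def signature_is_valid(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Check an HMAC-SHA256 hex signature computed over the raw request body."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(expected.encode(), signature_hex.encode())


def choose_action(payload: dict) -> str:
    """Map a webhook payload to an action name (hypothetical event types)."""
    actions = {
        "device.lost": "lock_device",
        "login.suspicious": "notify_slack",
    }
    return actions.get(payload.get("event_type"), "ignore")


def create_app(secret: bytes):
    """Build the Flask app; Flask is imported here so the helpers above work without it."""
    from flask import Flask, abort, jsonify, request

    app = Flask(__name__)

    @app.post("/webhook")  # hypothetical route
    def webhook():
        signature = request.headers.get("X-Signature", "")  # hypothetical header name
        if not signature_is_valid(secret, request.get_data(), signature):
            abort(401)
        payload = request.get_json(silent=True) or {}
        # A real handler would now send the Slack message, lock the device, etc.
        return jsonify({"action": choose_action(payload)})

    return app
```

Keeping the signature check and the decision logic in plain functions means you can unit test them without spinning up a web server.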

Python Example in SecEng

Anecdotally, the prototypical use of Python in security engineering is the first example above, sending logs to a SIEM: you need to collect data from an app/service and send it to your SIEM for alerting or reporting purposes.

The script below pulls the alerts updated in the last 20 minutes from Atlassian OpsGenie and sends them to Splunk.

import requests
import json
import os
import pendulum

# Get API keys
opsgenie_api_key = os.environ.get("YOUR_OPSGENIE_API_KEY")
splunk_hec_token = os.environ.get("YOUR_HEC_TOKEN")
splunk_hec_url = os.environ.get("YOUR_SPLUNK_HEC_URL")

# Get the current time in UTC
now = pendulum.now('UTC')

# Get the Unix timestamp
unix_timestamp = now.int_timestamp

# Calculate the time 20 min ago
time_20_min_ago = unix_timestamp - 1200

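# Fetch alerts from OpsGenie updated in the past 20 min
opsgenie_url = 'https://api.opsgenie.com/v1/logs'
# Let requests build and URL-encode the query string
params = {'limit': 1000, 'order': 'desc', 'query': f'updatedAt>{time_20_min_ago}'}
headers = {'Authorization': f'GenieKey {opsgenie_api_key}'}
response = requests.get(opsgenie_url, headers=headers, params=params, timeout=30)
response.raise_for_status()  # Stop on a 4xx/5xx instead of parsing an error body
logs = response.json()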

# Send logs to Splunk
headers = {
    'Authorization': f'Splunk {splunk_hec_token}',
    'Content-Type': 'application/json'
}

for log in logs.get('data', []):
    # Parse the updatedAt time using Pendulum
    updated_at_time = pendulum.parse(log['updatedAt'])

    event = {
        "event": log,
        "sourcetype": "_json",
        "time": updated_at_time.int_timestamp,
        "host": "opsgenie",
        "source": "opsgenie_log"
    }
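    # Send one event per request and fail loudly if Splunk rejects it
    hec_response = requests.post(splunk_hec_url, headers=headers, data=json.dumps(event), timeout=30)
    hec_response.raise_for_status()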

Usually, a script like this would be run every 20 minutes using a task scheduler. For example, in AWS you can run it every 20 minutes as a Lambda function from an Amazon EventBridge Scheduler.
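If you go the Lambda route, the script needs an entry point that Lambda can call. The sketch below assumes you have moved the fetch-and-forward logic into a function; sync_alerts is a hypothetical stand-in for it, and the return value is just something readable for the invocation logs. On the scheduling side, an EventBridge Scheduler schedule expression of rate(20 minutes) would match the 20-minute window.

```python
def sync_alerts() -> int:
    """Hypothetical stand-in for the OpsGenie-to-Splunk logic; returns events sent."""
    return 0


def lambda_handler(event, context):
    """Entry point that Lambda invokes each time the schedule fires."""
    sent = sync_alerts()
    return {"status": "ok", "events_sent": sent}
```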

To harden this script a bit, you could add error handling: wrap the HTTP calls in try/except blocks and, if you like, send the resulting error logs to a different Splunk index.
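A minimal sketch of that idea, under a few assumptions: send is any callable that raises on failure (for example, a small wrapper around requests.post followed by raise_for_status), and script_errors is a made-up index name. The "index" key is standard HEC event metadata, but your HEC token has to be allowed to write to that index.

```python
import time


def send_all(send, events, error_index="script_errors"):
    """Send each event; return HEC-style error events for any that failed."""
    error_events = []
    for event in events:
        try:
            send(event)
        except Exception as exc:  # in the real script, narrow this to requests.RequestException
            error_events.append({
                "event": {"error": str(exc), "failed_source": event.get("source")},
                "sourcetype": "_json",
                "index": error_index,  # hypothetical index name
                "time": int(time.time()),
            })
    return error_events
```

The returned error events can then be posted to Splunk in a second pass, so one bad request does not stop the rest of the batch.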

This also doesn't account for pagination (i.e., you need to make multiple requests to get all the logs that exist in that timeframe/query), but if you have more than 1000 updated alerts in OpsGenie every 20 minutes, that sounds like a problem in itself.
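If you did need pagination, the general shape is a loop that keeps following a "next" link. The sketch below assumes each page is a dict with a "data" list and an optional "paging" -> "next" URL; check the actual response from your API, since that layout is an assumption here. fetch_page is a hypothetical callable that takes a URL and returns the parsed JSON.

```python
def fetch_all(fetch_page, first_url, max_pages=50):
    """Follow 'next' links until none remain, collecting every record."""
    records = []
    url = first_url
    pages_fetched = 0
    # max_pages is a safety cap so a misbehaving API cannot loop forever
    while url and pages_fetched < max_pages:
        page = fetch_page(url)
        records.extend(page.get("data", []))
        url = page.get("paging", {}).get("next")
        pages_fetched += 1
    return records
```

In the script above, fetch_page could be something like a function that calls requests.get with the OpsGenie headers and returns response.json().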

Shout out to the Pendulum library which makes dealing with datetimes less depressing: https://pendulum.eustace.io/

💡
Pro-tip: Don't put hardcoded API keys/tokens in your code.
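One small habit that goes with this: os.environ.get quietly returns None when a variable is missing, so the failure shows up later as a confusing HTTP error. A tiny helper like the one below (require_env is my own name for it, not a library function) fails fast instead.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail fast if it is unset or empty."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value
```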

Python in Security Engineering Interviews

How does this differ from what you might come across in a coding interview for a security engineering position? Ideally, it shouldn't, and I wouldn't deviate from it myself if I were interviewing someone for a SecEng position.

However, it is totally possible that your interviewer is going to make you go through some completely impractical LeetCode exercise. I'd wager this is more prevalent at large software companies.

If I were interviewing for a SecEng position today, I would brush up on the following before starting any interview rounds:

  • Grind some easy Python LeetCode exercises on YouTube or some other platform. Personally, I think these are really dumb but there is a good chance you will run into them.
  • Using the regex module (re) to write a function that parses an array of mock log data and filters out certain events, status codes, etc.
  • Using an HTTP library (e.g., requests) to take data from an app/service and send it to a SIEM.
  • Using the subprocess module to run random shell commands.
  • Opening and writing files using open().
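To make the last few items concrete, here is a small practice sketch that touches re, open(), and subprocess. The log layout is an assumption (a common web access-log style), and the function names are just mine.

```python
import re
import subprocess

# Matches a common access-log layout; the format itself is an assumption
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+$'
)


def filter_by_status(lines, statuses):
    """Return parsed entries whose HTTP status code is in `statuses`."""
    matches = []
    for line in lines:
        parsed = LOG_PATTERN.match(line.strip())
        if parsed and int(parsed["status"]) in statuses:
            matches.append(parsed.groupdict())
    return matches


def write_report(path, entries):
    """Write one 'ip status path' line per entry and return the line count."""
    with open(path, "w", encoding="utf-8") as handle:
        for entry in entries:
            handle.write(f'{entry["ip"]} {entry["status"]} {entry["path"]}\n')
    return len(entries)


def run_command(args):
    """Run a command given as a list (no shell involved) and return its stdout."""
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout
```

Passing the command as a list rather than a single string with shell=True is worth mentioning out loud in an interview, since it sidesteps shell injection.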

I don't know Python... how can I get started?

Someone recently asked me this, so I am just going to regurgitate word-for-word what I told them:

That's awesome that you want to learn how to code. For me it has been such a great thing in my life so I would recommend it to anyone.

Not sure how much you've researched already but the basic advice that you will see is to either try out Python or JavaScript. I will speak from experience and recommend that you check out Python first and go through this course https://automatetheboringstuff.com/ which was super helpful when I was getting started. I think it is free these days.

After knowing the super basic stuff, the main thing that made me actually start making faster progress is finding a project or something that I wanted to do. And then just so much trial and error until it clicks. It's kind of like learning a musical instrument. I sucked at coding for like 3 years until it started clicking with me.