Once set up, a security operations team can quickly be overwhelmed by investigating large numbers of false positive alerts, owing to a lack of readily available information and a reliance on manual processes.
To help with this, teams should be divided into level one, level two and level three analysts. Level one should identify and triage alerts before they are passed to level two analysts, who identify false positives and confirm incidents. More complex issues are passed to the level three experts, who may also act as incident responders.
Typically, a ticketing or workflow management system would be used to automate the operational processes of passing incidents between the analysts and to keep other stakeholders, or resolver groups (for example, the IT department), informed. This also helps to ensure all the relevant information is passed to the analysts, and that issues are formally closed and documented.
The main prerequisite for automating a process is that the process itself is already efficient and effective. There is no point in automating a broken process.
Automation can be done at either the task level or the process level. Which you choose will depend on the scale of the security operations team, and how it interacts with the IT team, the wider organisation and any external service providers. It is also important to think about the system as a whole and which automation will be most effective.
Typically, the main objectives are to minimise mundane tasks, automatically triage events, and provide all the information required by the analysts to investigate an alert, or respond to an incident. However, any dependencies should also be taken into account. For example, automation of log collection can be very effective, but interpretation of the logs is dependent on identifying the endpoints, or hosts, to which they relate.
In a large dynamic environment, relying on a spreadsheet list of IP addresses to identify a host may not be sufficient, and the use of automated asset detection tools can avoid a lot of work tracking down unknown assets during an investigation.
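As a rough illustration of why asset data matters, the sketch below attaches host details to an alert before an analyst sees it. It assumes the asset detection tool can export an inventory keyed by IP address; the `Asset` fields, the `INVENTORY` contents and the alert field names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Asset:
    """One inventory record (hypothetical fields)."""
    hostname: str
    owner: str
    role: str


# Hypothetical inventory, standing in for an export from an asset detection tool.
INVENTORY: Dict[str, Asset] = {
    "10.0.5.17": Asset("web-01", "IT operations", "web server"),
    "10.0.9.4": Asset("mgmt-01", "IT operations", "management server"),
}


def enrich_alert(alert: dict, inventory: Dict[str, Asset]) -> dict:
    """Return a copy of the alert with asset details attached, if known."""
    asset: Optional[Asset] = inventory.get(alert["src_ip"])
    enriched = dict(alert)
    enriched["asset_known"] = asset is not None
    if asset is not None:
        enriched["hostname"] = asset.hostname
        enriched["owner"] = asset.owner
        enriched["role"] = asset.role
    return enriched
```

An alert that comes back with `asset_known` set to `False` is itself worth attention, because it points to a host the inventory has not yet seen.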
Collecting logs and the output of security monitoring devices is probably the most obvious activity to automate, as detection and investigation both rely on having such information. As a minimum, the operational team will rely on the collection of logs from the system (web proxy logs, host logs and so on), but there may also be security monitoring devices on both the host and network.
Although much of this can be done by simply configuring syslog, or other agents built into the assets, to send logs to a central server, it is surprising how often teams are left to search multiple log sources. Other regular activities can also be automated.
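For applications that do not already log through the system's syslog agent, forwarding can be done from the application itself. The following is a minimal sketch using the `SysLogHandler` class from Python's standard library. The collector address, the logger name and the choice of the `auth` facility are assumptions for illustration, and syslog over UDP does not guarantee delivery.

```python
import logging
import logging.handlers


def build_forwarder(name: str, host: str, port: int = 514) -> logging.Logger:
    """Return a logger that sends its records to a central syslog server over UDP."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records out of any local root handlers
    if not logger.handlers:  # avoid adding a duplicate handler on repeat calls
        handler = logging.handlers.SysLogHandler(
            address=(host, port),
            facility=logging.handlers.SysLogHandler.LOG_AUTH,
        )
        handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
        logger.addHandler(handler)
    return logger


# Example use (the collector address is hypothetical):
#   log = build_forwarder("web-01", "192.0.2.10")
#   log.warning("failed login for user alice")
```

Once every source sends to the same collector, analysts have a single place to search.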
Many tools, some of them free or open source, are also available to automate parts of incident triage. For example, these can automatically upload hashes of all the startup executables, or all the running processes on a host, to VirusTotal. This can identify any known malware, while malware analysis tools can be used to identify and analyse unknown malware.
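The hashing step of such a tool might look something like the sketch below. It computes a SHA-256 digest for each executable and builds the lookup address for it. The URL pattern reflects the VirusTotal version 3 file report endpoint as commonly documented, but should be checked against the current API documentation; the request itself, which needs an API key, is deliberately left out. The function names are illustrative.

```python
import hashlib
from pathlib import Path
from typing import Dict, Iterable

# Assumed pattern for the VirusTotal v3 file report endpoint; verify before use.
VT_FILE_URL = "https://www.virustotal.com/api/v3/files/{sha256}"


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so that large executables are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def lookup_urls(paths: Iterable[Path]) -> Dict[str, str]:
    """Map each distinct file hash to the URL at which its report would be requested."""
    urls: Dict[str, str] = {}
    for path in paths:
        file_hash = sha256_of(Path(path))
        urls[file_hash] = VT_FILE_URL.format(sha256=file_hash)
    return urls
```

Keying the result by hash rather than by path means an executable that appears in several places on a host is only looked up once.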
A security information and event management (SIEM) or log analysis tool can be configured to detect potential incidents by defining use cases: sets of event combinations that indicate a potential intrusion. These may be a complex combination of events from several sources, or something as simple as detecting a Windows Management Instrumentation (WMI) connection coming from somewhere other than a management server. Such use cases are based on current threat intelligence and the operational domain, and need to be maintained, but they can significantly increase the detection rate and speed of detection.
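The simple WMI use case can be expressed in a few lines. The sketch below is not written in any particular product's rule language; it assumes events have already been normalised into dictionaries with `protocol` and `src_ip` fields, and the list of management servers is hypothetical.

```python
from typing import FrozenSet, Iterable, Iterator

# Hypothetical allow-list of hosts that are expected to make WMI connections.
MANAGEMENT_SERVERS: FrozenSet[str] = frozenset({"10.0.9.4", "10.0.9.5"})


def unexpected_wmi_connections(
    events: Iterable[dict],
    allowed_sources: FrozenSet[str] = MANAGEMENT_SERVERS,
) -> Iterator[dict]:
    """Yield WMI connection events whose source is not a known management server."""
    for event in events:
        if event.get("protocol") != "wmi":
            continue  # this use case only covers WMI traffic
        if event.get("src_ip") not in allowed_sources:
            yield event
```

The allow-list is the part that needs maintaining: if a new management server is added and the list is not updated, the rule becomes a source of false positives.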
These approaches to automation should help to reduce the load on the team, and provide the information necessary for analysts to start investigating the more advanced incidents.