How attackers evade your EDR/XDR system — and what you can do about it | CSO Online

Attackers make a living evading your EDR/XDR systems. Here’s how they get around or go undetected by your defenses at three key points along the way.

https://www.csoonline.com/article/3476179/how-your-xdr-is-evaded.html?utm_date=20240725185324&utm_campaign=CSO%20US%20First%20Look&utm_content=Slot%20One%20Title%3A%20How%20attackers%20evade%20your%20EDR%2FXDR%20system%20%E2%80%94%20and%20what%20you%20can%20do%20about%20it&utm_term=CSO%20US%20Editorial%20Newsletters&utm_medium=email&utm_source=Adestra&huid=54d80eb2-6e53-425b-a38c-de5bda17a48a

A recent global survey noted that CISOs and their organizations may be too reliant on endpoint detection and response (EDR) and extended detection and response (XDR) systems, as attackers are increasingly evading them.

That’s due in part to the fact that evading EDR/XDR systems has been and will continue to be a fundamental requirement for most modern adversaries. “Evasion” has been used generically to describe instances where a defensive response has not been observed. While technically accurate, this lack of specificity hinders cybersecurity professionals from accurately targeting remediation efforts. For example, the fix for faulty detection logic differs greatly from instances where telemetry is missing in an XDR platform.

To better understand how attackers evade EDR/XDR systems and, more importantly, what to do after it happens, we need to understand the three areas where evasion occurs: observation, detection, and response and prevention.

How attackers evade XDR during observation

At its core, an XDR system consumes events from various sources, such as an endpoint’s operating system or a cloud provider. These events, commonly referred to as telemetry, form the foundation of detections. An XDR can only detect what the system(s) can provide and collect. The first type of evasion occurs when the XDR does not receive the events it needs to detect malicious behavior.

There are a few common causes for this type of evasion. First, and what I believe to be the “purest” technical evasion, is that the adversary’s actions created no relevant telemetry. I say “relevant” because every action taken on a system generates some amount of telemetry, but those events might not be useful for creating good detections. Think of this as a missing event source in the system rather than a deficiency in the XDR.

Next are instances where the telemetry is produced by the system but not consumed by the XDR. There are thousands of event sources to which an XDR can subscribe, and each vendor’s job is to decide which ones it needs to meet its detection requirements. For example, if an XDR vendor is particularly interested in detecting behaviors related to Active Directory, they would prioritize collecting events from AD over something like network traffic. Not collecting certain types of events, whether by choice or ignorance, creates an exploitable gap in the XDR’s coverage of certain techniques.
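A coverage gap of this kind can be made visible by comparing the event sources a detection would need against the sources the agent actually collects. The following is a minimal sketch in Python; the technique names, event source names, and the REQUIRED_SOURCES mapping are hypothetical placeholders, not any vendor's schema.

```python
# Hypothetical mapping: technique -> event sources a detection would need.
REQUIRED_SOURCES = {
    "lsass_credential_dumping": {"process_access", "process_creation"},
    "kerberoasting": {"ad_kerberos_service_ticket"},
    "dns_tunneling": {"dns_query"},
}


def coverage_gaps(required, collected):
    """Return, per technique, the event sources that are needed but not collected."""
    gaps = {}
    for technique, sources in required.items():
        missing = sources - collected
        if missing:
            gaps[technique] = missing
    return gaps


# Hypothetical set of sources this XDR agent subscribes to (AD-focused, no network events).
collected_sources = {"process_creation", "process_access", "ad_kerberos_service_ticket"}

gaps = coverage_gaps(REQUIRED_SOURCES, collected_sources)
for technique, missing in gaps.items():
    print(f"{technique}: missing {sorted(missing)}")
```

In this example the AD-focused collection leaves the DNS-based technique with no usable telemetry, which is the kind of gap only the vendor can close.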

Last, an adversary may actively interfere with an XDR agent such that events are not sent to the centralized server in charge of collection and correlation. This interference comes in many forms, including stopping or uninstalling the agent, blocking communication with the server (e.g., via host-based firewall modifications), or tampering with sensors (e.g., disabling AMSI).

Generally speaking, these represent failures of the XDR developer. If an attack is missed because the agent does not collect the relevant telemetry, the vendor is the only entity that can remediate the failure as it involves adding new telemetry sources or extending/enriching existing ones. In the event of agent interference, the vendor should implement anti-tampering measures to prevent what they can and detect interference that can’t reasonably be prevented. These issues are the most difficult to address for every security team because there’s nothing they can realistically do other than appeal to their XDR vendor.
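One way interference might be detected rather than prevented is by noticing that an agent has gone quiet. The sketch below assumes the collection server keeps a last-seen timestamp per host; the host names, the 15-minute threshold, and the silent_agents helper are illustrative assumptions, not a feature of any particular product.

```python
from datetime import datetime, timedelta, timezone


def silent_agents(last_seen, now, max_silence=timedelta(minutes=15)):
    """Return hosts whose agent has not reported within max_silence, sorted by name."""
    return sorted(host for host, ts in last_seen.items() if now - ts > max_silence)


# Hypothetical snapshot of when each agent last sent an event or heartbeat.
now = datetime(2024, 7, 25, 12, 0, tzinfo=timezone.utc)
last_seen = {
    "ws-001": now - timedelta(minutes=2),
    "ws-002": now - timedelta(hours=3),
    "srv-db01": now - timedelta(minutes=16),
}

quiet = silent_agents(last_seen, now)
print(quiet)
```

A silent agent is not proof of tampering (the host may simply be powered off), so a check like this is a prompt for investigation rather than a detection in its own right.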

How attackers evade XDR during detection

When most people talk about evading XDR, they’re almost always referring to subverting the detection logic in the XDR. Detections themselves are simply ways of evaluating an event, or group of events, to determine if some condition that may be indicative of malicious behavior is present. These detection queries or rules can be precise, meaning they target a specific attribute usually unique to a piece of malware or offensive tool (such as the command line arguments for Mimikatz), or robust, meaning that they target behaviors shared by more than one malware sample or tool.

Both types of detections have their faults. Precise detections are prone to evasion because they are often overly specific, meaning that any modification to the target sample would result in a false negative. An example of this would be an attacker patching Mimikatz’s argument strings to turn “sekurlsa::logonpasswords” into “nothings::happening_here,” breaking the brittle detection logic targeting the attacker-controlled string.
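The brittleness described above is easy to see in a toy rule. This is a minimal sketch, assuming a hypothetical event dictionary with a command_line field; real XDR rule languages and schemas differ.

```python
def precise_rule(event):
    """Brittle rule: fires only when a known Mimikatz command string is present."""
    return "sekurlsa::logonpasswords" in event["command_line"].lower()


# Unmodified tool: the attacker-controlled string matches the rule.
original = {"command_line": "mimikatz.exe sekurlsa::logonpasswords exit"}

# Patched tool: same behavior, but the string the rule depends on is gone.
patched = {"command_line": "m.exe nothings::happening_here exit"}

print(precise_rule(original))
print(precise_rule(patched))
```

The rule keys on a value the attacker fully controls, so a trivial rename turns a true positive into a false negative.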

Robust detections, despite appearing less susceptible to evasion at face value, are notorious for their false positives, which lead to exclusions in the rule that become exploitable by attackers. An example of this that I’ve seen in practice is excluding the Chrome update process, “GoogleUpdate.exe,” from credential dumping detections because its normal operation involves opening a privileged handle to the local security authority subsystem service (LSASS) process. This exclusion allows an attacker to either masquerade as the update helper or inject into it to extract credentials without detection, despite all the behavioral patterns being present in the events collected by the XDR.
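The exclusion problem can be sketched as follows. The event fields (source_image, target_image, granted_access, signer, signature_valid), the expected install path, and the signer name are assumptions made for illustration; PROCESS_VM_READ (0x0010) is the Windows access right for reading another process's memory.

```python
import ntpath

PROCESS_VM_READ = 0x0010  # Windows access right needed to read another process's memory

# Assumed default install location, lowercased for comparison.
EXPECTED_PATH = r"c:\program files (x86)\google\update\googleupdate.exe"


def touches_lsass(event):
    """Behavioral condition: a handle to LSASS that allows memory reads."""
    is_lsass = ntpath.basename(event["target_image"]).lower() == "lsass.exe"
    return is_lsass and bool(event["granted_access"] & PROCESS_VM_READ)


def loose_exclusion(event):
    """Excludes anything *named* GoogleUpdate.exe, wherever it runs from."""
    return ntpath.basename(event["source_image"]).lower() == "googleupdate.exe"


def tight_exclusion(event):
    """Excludes only the expected path with a valid, expected signature."""
    return (
        event["source_image"].lower() == EXPECTED_PATH
        and event.get("signer") == "Google LLC"
        and event.get("signature_valid") is True
    )


def alert(event, exclusion):
    return touches_lsass(event) and not exclusion(event)


masquerade = {
    "source_image": r"C:\Users\Public\GoogleUpdate.exe",
    "target_image": r"C:\Windows\System32\lsass.exe",
    "granted_access": 0x1010,
    "signer": None,
    "signature_valid": False,
}
legit = {
    "source_image": r"C:\Program Files (x86)\Google\Update\GoogleUpdate.exe",
    "target_image": r"C:\Windows\System32\lsass.exe",
    "granted_access": 0x1010,
    "signer": "Google LLC",
    "signature_valid": True,
}

print(alert(masquerade, loose_exclusion))  # name-only exclusion hides the masquerade
print(alert(masquerade, tight_exclusion))  # narrower exclusion still alerts
print(alert(legit, tight_exclusion))       # the genuine updater stays quiet
```

Narrowing the exclusion addresses masquerading only. An attacker who injects into the genuine, signed updater process would still satisfy the tighter exclusion, so that gap needs other telemetry to cover it.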

These evasions exploit logical issues in the detections built into the XDR, whether they are supplied by the vendor or written by internal detection engineers. Imperfect understanding of the technique or procedure, detection chokepoints, compromises made to make the detection viable for production use, and suboptimal telemetry in the XDR leading to weak detection logic are commonplace in the world of endpoint protection.

The fix for these issues is to close these logical gaps, but unfortunately, that’s not always possible in real environments. Sometimes we have to accept a certain level of brittleness to generate detections for emerging threats quickly, but these precise detections should be supplemented with robust counterparts to catch the false negatives that inevitably fall through the cracks.

Robust detections almost always require some level of exclusions to avoid inundating security teams with alerts, but those exclusions should be limited as much as possible and continuously evaluated to determine if they need to remain in production. Realistically, there is a never-ending tuning and tinkering phase of detection engineering that begins as soon as a detection enters production, where the goal is solely to make the detection as resilient as possible while also remaining within the bounds of what your team can tolerate, both in terms of false positives and false negatives.
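Continuous evaluation of exclusions is easier when each one carries a review date. The sketch below flags exclusions that have not been revisited recently; the record fields, the second exclusion entry, and the 90-day window are hypothetical choices, not a recommended standard.

```python
from datetime import date

# Hypothetical exclusion records attached to a detection rule.
exclusions = [
    {"rule": "lsass_access", "match": "GoogleUpdate.exe", "last_reviewed": date(2023, 1, 10)},
    {"rule": "lsass_access", "match": "backup_agent.exe", "last_reviewed": date(2024, 6, 1)},
]


def overdue(exclusions, today, max_age_days=90):
    """Return the match strings of exclusions not reviewed within max_age_days."""
    return [
        e["match"]
        for e in exclusions
        if (today - e["last_reviewed"]).days > max_age_days
    ]


stale = overdue(exclusions, today=date(2024, 7, 25))
print(stale)
```

Each flagged exclusion is a candidate for removal or narrowing, depending on whether the false positives it suppressed still occur.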

How attackers evade XDR during response and prevention

The last type of evasion centers on a fault in the investigative process that should follow a true positive alert. The process of responding to an alert varies from organization to organization, but all generally include triage, investigation, and response stages. The complexity of this process opens many different failure points.

Working through the pipeline of an average SOC, the first opportunity for evasion to occur is in the triage stage, where a level 1 analyst receives the alert and mistakenly tags it as a false positive. This causes the behavior to go unnoticed despite the XDR doing its job. This failure can stem from alert fatigue, leading to incorrect suppression just to fight the alert queue down, or from a general lack of understanding of what the purpose of the detection is and what the information means. Fixing this failure point typically involves queue and fatigue management through the reduction of false positives (which has its issues, as detailed in the previous section), and through better documentation and education about the detections analysts encounter.

Next, the investigative stage, which occurs after a true positive alert is raised, involves secondary information collection to more concretely determine if an alert is worthy of promotion to a full-on incident. This process is typically manual, requiring a skilled analyst to interrogate the system(s) in question and extract supporting information, such as information about artifacts left on the filesystem.

There are many failure points here relating to both the skill of the investigator and the adversary. What happens if the analyst needs to check a file on disk, but the adversary has removed it pre-emptively? What if memory forensics are needed, but the adversary has rebooted the system? What if the adversary employed a technique with which the investigator is unfamiliar, causing them to miss tracks left behind by the attacker? Resolving these failure points requires strong supporting documentation, such as what should be collected in the event of a suspected true positive alert and what that information means.

Finally, the response stage, which happens after the alert has been confirmed to be a true positive and an incident has been declared, involves the eviction of the threat actor. After determining the scope of the incident (how many systems, users, etc. are involved), security teams have many options to clear the attacker out, ranging from simply rebooting the host to clear out memory-resident malware to drastic measures like burning down their entire environment. Ultimately, success is binary here — either the adversary was fully evicted or not.

The biggest mistake I’ve encountered in this stage while on a red team was a defense team improperly scoping the incident, which led to incomplete eviction and allowed us to persist in the environment for nearly 18 months. We were eventually kicked out only when the server on which we persisted was decommissioned by their IT team as part of a tech lifecycle upgrade process. Improving the response process to reduce an adversary’s chances of evading eviction comes down to having solid processes that have been rehearsed, the ability to identify the whole scope of the compromise, and the ability to validate the complete eradication of the adversary.
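Scoping can be thought of as expanding outward from a known-compromised host through the accounts and hosts connected to it. The following is a rough sketch, assuming a hypothetical list of (account, host) logon records; it produces an upper bound on what to investigate, not a verdict on what is compromised.

```python
from collections import deque


def scope(logons, seed_host):
    """Breadth-first expansion from seed_host over shared accounts and hosts."""
    hosts_by_account = {}
    accounts_by_host = {}
    for account, host in logons:
        hosts_by_account.setdefault(account, set()).add(host)
        accounts_by_host.setdefault(host, set()).add(account)

    hosts, accounts = {seed_host}, set()
    queue = deque([seed_host])
    while queue:
        host = queue.popleft()
        for account in accounts_by_host.get(host, ()):
            if account in accounts:
                continue
            accounts.add(account)
            # Every host this account touched becomes a candidate for review.
            for other in hosts_by_account[account]:
                if other not in hosts:
                    hosts.add(other)
                    queue.append(other)
    return hosts, accounts


# Hypothetical logon records: (account, host).
logons = [
    ("alice", "ws-001"),
    ("svc-backup", "ws-001"),
    ("svc-backup", "srv-files"),
    ("bob", "srv-files"),
    ("carol", "ws-009"),
]

hosts, accounts = scope(logons, "ws-001")
print(sorted(hosts), sorted(accounts))
```

In a real environment this expansion grows quickly, so it would need to be bounded by time window and pruned by evidence. The point of the sketch is that stopping at the first host, as in the engagement described above, leaves the shared service account and the server it reaches unexamined.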

Documentation

Describing XDR evasion with sufficient granularity allows us to better identify which component of our detection pipeline failed and, more importantly, what we can do to fix it. Most evasions can be grouped into either observation (whether the XDR saw the malicious behavior), detection (whether the XDR positively identified the behavior as malicious), or response (whether the behavior led to an adequate response by the security team). During your next encounter with evasion, push for more descriptive language to be used and see what improvements to your remediation process can be made.
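The three-way grouping can be expressed as a simple decision order: check observation first, then detection, then response. This is a minimal sketch with hypothetical names; the three boolean inputs stand in for answers a post-incident review would have to establish.

```python
from enum import Enum


class Stage(Enum):
    OBSERVATION = "observation"
    DETECTION = "detection"
    RESPONSE = "response"


def classify_evasion(telemetry_collected, alert_fired, adequately_handled):
    """Return the first stage of the pipeline that failed, or None if none did."""
    if not telemetry_collected:
        return Stage.OBSERVATION  # the XDR never saw the behavior
    if not alert_fired:
        return Stage.DETECTION    # events were present, but no rule flagged them
    if not adequately_handled:
        return Stage.RESPONSE     # an alert fired, but triage or eviction fell short
    return None


# Example: the events were collected and an alert fired, but it was closed as a false positive.
print(classify_evasion(True, True, False))
```

Naming the failed stage this way points directly at who can fix it: the vendor for observation, detection engineers for detection, and the SOC for response.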