Fighting the (f)IR(e): Bucket Brigade, Not More Matches
Hey, you in enterprise IT… having a pretty crappy week this week, huh?!
I get it, I’ve been there – multiple times – before. If it’s not because of this one thing, it’s because of this other thing.
Why don’t we just turn on our heels, throw our hands up and walk out, never to return?
Probably because we like the hunt… the discovery. I sure do. Other times I feel we are all just masochists.
But the problem du jour is just another outcome from, as the pundits will spend time and column inches explaining, something that was entirely preventable.
However, the point they tend to miss is that it was also entirely predictable.
I took a tack a few months ago covering what went wrong with the Iowa caucus application development and deployment from the points of view of a C-level tech exec, a developer, and a security person. In short, a top-down, bottom-up, side-to-side analysis of why the failure was inevitable, how it could have been mitigated, and how we keep getting ourselves into these situations where something fails and bad stuff happens.
Today's discussion also revolves around software development, but unlike the Iowa application, we'll touch a bit on incident handling and response, security operations, and some really weird public sector and policy stuff that puts us in the spot SolarWinds and their customers find themselves in as of this past Sunday, and probably well beyond the shelf life of this post.
First, let me say I will not punch down, dunk on anyone, or pontificate on the merits of one tool or service over another, because we all know that today's commentator is probably tomorrow's target. Instead, I am going to take a holistic view of where some common mistakes were made, which may or may not exist in your own organization, how they may be fixed or mitigated, and if you find yourself in one of these spots in the future, maybe this is an article to reference to get everybody on the same page.
So, a quick intro for those who may not have found this post through knowing me: I've been a techie since I got my hands on my first Commodore 64 and Timex Sinclair as a kid, through my last role as Chief Technology Officer and Digital Services Director for a Federal law enforcement agency. Half of my career was in or supporting the public sector and the other half the private sector, so I'm an equal opportunity observer.
I've developed my own software and internet-facing services since 1993, have led teams of over 100 developers, and have worked to secure various aspects of our critical infrastructure. I've helped direct Federal policy and even got a few words into a major cybersecurity law that would have gone missing if I hadn't been eagle-eyed enough to notice a massive exclusion. I've had my fair share of time engineering and architecting enterprise systems and working with startups, so my point of view comes from, hopefully, a well-intentioned and well-informed place.
Some of this article is based on current knowledge, and some on educated inferences from prior experience and exposure. That experience tells me that while the names or systems may differ, the methods used to resolve or address these problems are common, or at least easily transferrable to most cases.
Now on with the show.
Early Sunday, December 13, 2020, the software company SolarWinds disclosed that they had distributed trojaned software to customers, who numbered nearly 18,000 and comprised about 85% of the Fortune 500 and a good portion of US Federal agencies in one form or another. The same flaw was attributed as the cause of the intrusion reported by FireEye barely a week prior, which ended up exposing a library of the offensive tools they use during "red team" activities as part of their business operations.
While not directly disclosed at first, it was inferred that APT29 ("Cozy Bear"), a Russian cyber-operations group, was responsible for the exploitation and intrusions. When I was supporting cybersecurity operations for the Defense Department at the Defense Cyber Crime Center (DC3) as part of Carnegie Mellon's CERT/CC, that same group carried the monikers "Grizzly Steppe" and "The Dukes", so they have been around a while in some form or another. The benefit here is that their TTPs, or "tactics, techniques and procedures", are very well documented, as is their expected targeting.
This same group had the Democratic National Committee (DNC) "hack" from 2016 attributed to them, but their targeting is less financial cybercrime and more intelligence focused, like their handler agencies, Russia's FSB (internal) and SVR (external), the latter being the one this latest incident is currently attributed to.
Why is this particularly important? Because when operations are handled by the SVR, the targets are usually for intelligence gathering rather than offensive disruption. The more "active" or "energetic" groups tend to be seen as conducting attacks, versus what you could call "cyber-spy craft" here (please don't hate me for using "cyber" as the prefix). Not that the FSB or the GRU won't perform similar duties, which they do, but the focus here is primarily intel gathering and other espionage.
How does this play out in what we've seen reported by targets already this week, especially in the order of who has disclosed that indicators of compromise (IOCs) associated with a successful breach were found in their infrastructure? The first two, the US Department of Commerce's National Telecommunications and Information Administration (NTIA) and the US Treasury Department, have significant international policy and enforcement interests concerning Russia. Oddly, one subject we will discuss in this article, software supply chain security, is spearheaded out of a part of the NTIA, which seems not only a gut punch but also a raspberry to the face of those efforts, since exactly such a flaw was leveraged to infiltrate that agency component. The Treasury Department, an agency I used to work for out of their Government Security Operations Center (GSOC), has been in the news regarding financial sanctions against Russian interests and individuals, especially by FinCEN, the Financial Crimes Enforcement Network, which tracks suspicious movements of money around the world.
Later, on Tuesday, DHS and the DoD disclosed similar discoveries of known IOCs from a breach, as did the National Institutes of Health, out of which a portion of our nation's research and response to the COVID-19 pandemic is run and managed. The latter is an operating division (OpDiv) of the US Department of Health and Human Services, another agency I worked for, so the commonality of technical infrastructure there is known to me.
So, why can this be seen as important, and why is it also dangerous? I remember reading a seminal statement on how technical homogeneity is one of the largest possible vulnerabilities to an organization, driven not by a fault of the technology but by market forces and policy. It came in a paper co-authored by Dan Geer while working at the firm @stake in 2003, titled "CyberInsecurity: The Cost of Monopoly", which was critical of Microsoft's near monopoly in software infrastructure for organizations, arguing that it posed a national security threat. This condition often existed not because Microsoft dominated the market through technical superiority, capability, or even cost, but because of marketing and the "hook" of complexity that made "un-adopting" those products and services difficult: they became so tightly intertwined with business operations that the install base grew and grew, making it difficult for alternate technologies and services to be adopted.
Today, one could see some parallels to cloud providers, and I've written and spoken on the subject of having an exit strategy once you go in, informed by this nearly 20-year-old article. The loosely coupled technology available to address scaling, efficiency, and additional features should prevent such lock-in, but that same diversity also creates other problems when developing services and applications on these platforms. Tracking dependencies, source libraries, and the provenance of that code back to trusted repositories, let alone what happens when you "add the lime and the coconut" together, provides even more challenges to developers and systems teams.
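As a rough illustration of what "tracking provenance" can look like in practice, here is a minimal sketch that checks vendored third-party artifacts against a pinned manifest of hashes. The directory name and manifest format are hypothetical, purely for illustration; real projects would lean on their package manager's lockfile and signature support instead.

```python
"""Minimal sketch: verify vendored third-party packages against a pinned
manifest of SHA-256 hashes. File names and manifest format are hypothetical."""
import hashlib
import json
from pathlib import Path

VENDOR_DIR = Path("vendor")              # where third-party archives are stored
MANIFEST = Path("vendor-manifest.json")  # {"somelib-1.2.3.tar.gz": "<sha256>", ...}

def sha256_of(path: Path) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_vendor_tree() -> bool:
    pinned = json.loads(MANIFEST.read_text())
    clean = True
    for artifact in sorted(VENDOR_DIR.iterdir()):
        if not artifact.is_file():
            continue
        expected = pinned.get(artifact.name)
        actual = sha256_of(artifact)
        if expected is None:
            print(f"UNPINNED  {artifact.name} ({actual})")  # new, unreviewed material
            clean = False
        elif expected != actual:
            print(f"MISMATCH  {artifact.name}")             # raw material changed under you
            clean = False
    return clean

if __name__ == "__main__":
    raise SystemExit(0 if verify_vendor_tree() else 1)
```

It's crude, but it captures the idea: you want to notice when the raw materials change before they reach the assembly line, not after.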
As I was writing this, we learned more details about the initial vector of compromise of SolarWinds' infrastructure: a novel exploitation of multi-factor authentication peculiarities in Outlook Web Access (OWA) identity token generation and management. So, beyond the chain of parts, the "factory" where these tools are built also needs protection, and is itself chained dependently to other service providers and suppliers.
And what of this "factory"? What does it mean in relation to this discussion, the spy craft and targeting, the methods used to gain footholds in certain organizations of interest, and their role in the exfiltration of data and other information? As noted before, we suspect the SVR as the agency primarily directing APT29 and associated teams in executing these operations. For those who read FireEye's analysis of the SolarWinds vulnerability and exploit methods, this was highly sophisticated, novel and non-trivial.
These types of infiltrations are long-tail, sometimes lasting years and ensnaring several targets at different times for different reasons to reach a goal. They are quiet and stealthy and avoid detection even by the best analysts without a mistake by an offensive operator, an intended "burn" by the intruder, or just dumb luck by somebody noticing something out of the ordinary. It sounds like a well-crafted spy thriller for computer nerds, and, well, it sort of is, just more in the virtual domain than the physical.
Exploiting a physical factory has many parallels to exploiting a software-based one, as you have raw materials, an assembly and finishing line, packaging, distribution, and sales and marketing. Of course, the best way to ensure that at least some of what you want ends up in the finished product is to compromise the raw materials, in this case any imported libraries or other methods from outside your organization, especially ones you have no direct control or influence over.
Yes, like a real factory, you can do some quality control analysis of your suppliers and require every piece to be examined and analyzed before inclusion in an assembly line or finished product, but sometimes the volume or complexity of the operation makes that model difficult to implement or overly resource intensive. This is where risk modeling comes in: understanding your organization's tolerance for introducing faulty or damaged products to consumers. This is, sometimes, why defense and critical infrastructure are a bit more persnickety about the tolerances and quality of goods and services sold into them, whether components or finished goods. They are also slowly adopting the same mentality used for physical goods for software assurance. Oddly, the organization with the best track record here is NASA, as the testing and quality assurance built into mission systems is baked into the culture of the agency and has carried forward for decades, even as it has transformed and modernized.
If you control the means of production, including the raw materials (in this case, all the code you utilize being created in house by your staff), you can maintain tighter controls, impose standards, and integrate quality control over the use of those raw materials. Oddly, this was one of the "vertical integrations" that prompted the government to act on antitrust in the early part of the 20th century, made famous by Rockefeller, Frick and Carnegie, among others, to keep things "in the family".
This isn't exactly workable nowadays for software, as the rapidity of technological advancement and modern methods of software development and delivery have moved to an "as a service" model, lowering costs and increasing adaptability to change by running on and consuming a common framework or infrastructure. You can think of this as the public utility model, where we don't care who provides the electricity or water to our home or business, as long as it's reliable, affordable, and of acceptable quality for everyday consumption. It also relies on a ton of trust.
Since vertical integration in software development is resource intensive, expensive, and somewhat impractical, you have to find other methods to compensate for using externally supplied raw materials in your "factory". Sometimes this can be policy: using certified code components, or in some cases stripping out extraneous methods or even imports from packages (a rough sketch of this idea follows a bit further down). This has the added benefit of making the assembly line and packaging move faster and smoother, without some of those materials gumming up the works. This might be like making a package of fruit cocktail and only using fruit that is of a certain size and color, and that is peeled and de-seeded.
Say you're presented with a new or better supplier and you only changed to a different variant of apple, from McIntosh to Fuji. Even if it has the same characteristics as the original apple in that fruit cocktail, you'd have to observe the effects it had on the end product for a while to make sure the substitution was good and accepted by the factory machinery and, eventually, the consumer. This may be like changing to another JavaScript library that has a few more pieces of useful functionality, or does an operation a bit more efficiently or precisely than the former one, or maybe the old one forked or wasn't being maintained regularly. There are lots of reasons; however, predictability and regularity help with assurance.
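To make that "strip out extraneous imports" policy a bit more concrete, here is a minimal sketch, assuming a Python codebase, that walks a vendored package and flags import statements outside an approved allowlist. The allowlist and paths are hypothetical, not any vendor's actual tooling.

```python
"""Minimal sketch: flag imports in a vendored package that fall outside an
approved allowlist, one crude way to notice extraneous or unexpected raw
materials before they hit the build. Allowlist and paths are hypothetical."""
import ast
from pathlib import Path

ALLOWED_IMPORTS = {"json", "hashlib", "logging", "requests"}  # hypothetical allowlist
VENDOR_DIR = Path("vendor/some_package")                      # hypothetical path

def imports_in(source_file: Path) -> set:
    tree = ast.parse(source_file.read_text(), filename=str(source_file))
    found = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module.split(".")[0])
    return found

def audit() -> None:
    for py_file in VENDOR_DIR.rglob("*.py"):
        unexpected = imports_in(py_file) - ALLOWED_IMPORTS
        if unexpected:
            # Something the package pulls in that nobody reviewed or approved.
            print(f"{py_file}: unexpected imports {sorted(unexpected)}")

if __name__ == "__main__":
    audit()
```

It won't catch a determined attacker, but it will catch a surprising amount of drift in what your suppliers quietly start shipping you.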
During your software engineering you've gone through architecting and planning what you're delivering, the expected capabilities, and how and where it's going to run. Modern software development is never "one and done"; as we move more to Agile delivery, it's iterative, and even after release, additional features, fixes and enhancements will always be added, especially if you are delivering a product or service to customers. AWS and Azure weren't built in a day, and they're definitely not the same services they were at launch, as goes most development efforts. But as much as you iterate, contentment sometimes births complacency, and with that, going back and checking that the factory itself is growing and changing along with your development needs gets neglected.
Of course, the factory is housed somewhere; for software development, it's most likely on infrastructure you're responsible for. Just as you'd probably not leave the front door or loading dock open or unlocked, you wouldn't do the same to your build and deploy environment. As details come out about the missteps SolarWinds made in securing their environment, one was the use of a rather simple password on a code repository. This is nearly the equivalent of having a key on a string outside a locked door, where the security and access control is merely for looks and not for actual protection. Then again, they say locks are just there to keep honest people honest, but insider threat discussions are for another post.
If you've adequately secured your own supply chain and your software manufacturing through authorship, compiling and even some testing, next up is the packaging undertaken before it's shipped out to customers. Here too we have assurance mechanisms in place, such as code signing, hash checks, and even some levels of testing. Code signing is often more akin to a foil security seal on an application, attesting that the software that left your factory is the same the customer expected to receive.
(via Twitter: @vinodsparrow)
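On the customer side of that foil seal, the basic hash check is simple enough to sketch. The installer name and published digest below are placeholders, not real SolarWinds values; and as the next paragraph explains, in this incident such a check would have passed anyway, because the trojan was sealed in at the factory.

```python
"""Minimal sketch of a customer-side hash check before installing a vendor
package. The file name and published digest are placeholders, not real
SolarWinds artifacts or values."""
import hashlib
import sys

PACKAGE = "vendor-agent-2020.2.msi"  # hypothetical downloaded installer
PUBLISHED_SHA256 = "0000000000000000000000000000000000000000000000000000000000000000"

def sha256_file(path: str) -> str:
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()

if __name__ == "__main__":
    actual = sha256_file(PACKAGE)
    if actual != PUBLISHED_SHA256.lower():
        print(f"REFUSING INSTALL: digest mismatch for {PACKAGE}")
        sys.exit(1)
    print("Digest matches the vendor-published value; proceeding with install.")
```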
However, as noted earlier, if anything prior to this step was compromised (raw materials, factory machinery, etc.), you've just sealed up a package with potentially faulty, compromised, or worse contents, and given your customers a false sense of security by exploiting their trust that you did your due diligence. Here, SolarWinds shipped, and continued to ship, compromised signed code, and by getting customers to install it, the attackers essentially socially engineered the supplier-to-consumer trust relationship.
So, while it was a technical exploit once delivered, its primary success came from exploiting that trust. That's not as uncommon an occurrence as you'd expect; several applications, services and code bases have had similar techniques leveraged against them. It was the nature of the end product and its use that made this issue so nefarious for customers and for those who were the ultimate targets of the attacker.
Switching back to what was compromised before we leave the subject of trust between vendor and customer: FireEye's analysis of the compromise introduced into the SolarWinds Orion platform showed a clever use of covert channels over a trusted mechanism. It's often hard to bury that kind of channel in standalone malware, let alone in a compromised commercial application, and remain undetected for so long. In this case the Orion Improvement Program, which under normal conditions reports debugging and performance data to assist with product enhancements and improvements, was used as a backdoor for command and control (C2) operations as well as exfiltration routes.
The novel and amazing leveraging of such a channel shows that the attacker knew that in most cases this is usually turned on, goes generally unmonitored, and the route itself could see a trickle or burst of data that would look like normal operations. In short, brilliant use of a trojaned system for intelligence gathering. If anybody is to take away a lesson here, it's that unless you're contractually bound, or are actively debugging your environment with such tools, always turn off these vendor data channels.
Another clever attribute of the trojaned modules within SolarWinds was code used to detect forensics and analysis tools that might be used to analyze, alert on, and possibly remove malware such as this from a system. It used a hashing algorithm to fingerprint running processes, compare them against known tools, and then hide from detection. While these are somewhat standard anti-forensics techniques, this depth of self-protection helps show that this tool and operation were a major technical effort by those using the platform.
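To illustrate the general technique (not SUNBURST's exact algorithm or blocklist), here is a sketch of hashing process names and comparing them against pre-computed values, so the tool names never appear in the binary as plain strings. Defenders can flip the same idea around when hunting for it.

```python
"""Illustrative sketch of the general anti-analysis technique: hash running
process names and compare against a blocklist of pre-computed hashes. The
hash function and blocklist entries are illustrative, not SUNBURST's actual
values. Requires the third-party psutil package (pip install psutil)."""
import psutil

def fnv1a_64(data: bytes) -> int:
    h = 0xCBF29CE484222325
    for byte in data:
        h ^= byte
        h = (h * 0x100000001B3) & 0xFFFFFFFFFFFFFFFF
    return h

# Pre-computed hashes of hypothetical analysis tool names.
BLOCKLISTED_HASHES = {fnv1a_64(b"procmon.exe"), fnv1a_64(b"wireshark.exe")}

def analysis_tools_running() -> bool:
    for proc in psutil.process_iter(["name"]):
        name = (proc.info["name"] or "").lower().encode()
        if fnv1a_64(name) in BLOCKLISTED_HASHES:
            return True   # malware goes quiet here; a defender would alert instead
    return False

if __name__ == "__main__":
    print("analysis tooling detected:", analysis_tools_running())
```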
Now, I will not dig into more technical depth on the exploit, as there are better and more in-depth writeups on the subject coming out regularly, but the other thing to be aware of regarding this operation is how widespread the install base was, how that was leveraged, and what kind of tool SolarWinds Orion is, which makes this quite the "bad day for everybody".
I’m taking off my developer hat for the time being and putting on my operations one, the kind a systems administrator may wear when looking at their data center and wondering how their small team is going to manage hundreds of servers and applications. Of course, there’s plenty of tools out there that can be leveraged to instrument and gather data from these farms of machines and code, but to maximize your resources, especially at a mature, enterprise level, you will lean on an NMS, or network management system, or sometimes, even a managed provider now that cloud and “as a service” tools have become more prevalent, and this is where Orion comes in.
For those not familiar with SolarWinds Orion, it is an NMS, which allows system administrators and network operations center staff to get an idea of the health and well-being of their computing environment. To get the full picture it needs access to, and in most cases to be installed on, every system that is of interest to watch, which usually means everywhere. And to get good, useful, complete data, which is often protected by design by the host operating system, it needs administrative or privileged access, especially if you take the next step and allow it to provide remediation services when something goes wrong.
Stop me if you’ve seen this horror story before, but I’m sure you can sense why SolarWinds provided such an appetizing target to exploit.
If you had a chance to gain access to a system, that by design, is to have privileged, wide-ranging access to an organization’s infrastructure, that deals in sending telemetry and other data as normal operations, wouldn’t you think it would be a perfect target to exploit?
Right now, as it appears, most incident response teams and system operators are dealing not only with identifying systems Orion had nearly unfettered access to, but also with expiring credentials and searching for IOCs to help narrow the scope of systems they will have to forensically investigate. The last major incident I ran had 33 systems involved, and 10 that eventually needed deeper forensic analysis. That took months, but we eventually were able to come to a conclusion about what happened, who did it, and what, if anything, was exfiltrated. Magnify this to the pool of 18,000 customers disclosed by SolarWinds, the size of their own networks, and the level of access the tool has, and you can grasp why this is potentially orders of magnitude worse.
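One small, sketchable piece of that scoping work is sweeping file systems for files whose hashes match published IOCs. The hash values and install path below are placeholders; substitute the real ones from your vendor's or CISA's advisories.

```python
"""Minimal sketch: sweep a directory tree and flag files whose SHA-256 matches
known-bad hashes. IOC values and the install path are placeholders."""
import hashlib
from pathlib import Path

KNOWN_BAD_SHA256 = {
    # placeholder entries; substitute published SUNBURST DLL hashes
    "replace-with-ioc-hash-1",
    "replace-with-ioc-hash-2",
}

def sha256_of(path: Path) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()

def sweep(root: Path) -> None:
    for candidate in root.rglob("*.dll"):
        try:
            if sha256_of(candidate) in KNOWN_BAD_SHA256:
                print(f"IOC HIT: {candidate}")
        except OSError:
            pass  # locked or unreadable files get handled out-of-band

if __name__ == "__main__":
    sweep(Path("C:/Program Files (x86)/SolarWinds"))  # hypothetical install path
```

Multiply even that simple sweep across every host an NMS could touch and you start to see why the triage alone eats teams alive.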
As I've stated in a few talks I have given, IT personnel don't scale the way the systems they manage do, and security people even less so, due to the scarcity of the talent and the specialized skills required to handle certain tasks. Responding to this event just in the public sector will require nearly all IT people to scrub in to help triage, and most likely investigate, how badly they have been compromised.
One of the major footholds for any attacker, regardless of outcome, is not only getting in but often also having the ability to execute lateral movement. In chess terms, those movements are bound by the relationship of the systems (pieces) to other systems and where they exist in the environment (board). You have to get creative with pawns, rooks and knights because of their limited movements, but the scope of access granted by the SolarWinds compromise essentially turns every accessed system into a queen.
Returning to how SolarWinds initially got compromised, it appears to come in two particularly interesting "flavors". One points to a vector via a very novel attack on multi-factor peculiarities in OWA, noted above, which is technical in nature and requires a high level of expertise to pull off. The other is a weak password, which would have been preventable with a good policy; that alone reduces your attack vectors by 50% if it's followed by the people responsible for maintaining it. The former sort of spits in the face of everything experts recommend to organizations about secure access to systems and data. So how could this have been prevented?
Reporting at this point has noted that the OWA issue was not initially developed or exploited at SolarWinds, but rather at a think tank. Volexity had encountered the same actors leveraging the OWA and Microsoft Exchange Control Panel vulnerability used against SolarWinds in late 2019 and early 2020 at this think tank. That same attacker had been "ejected" from that network after having a persistent foothold in it for several years; this was just another pivot for them, via a new tool and weakness, to maintain that foothold and keep it persistent. The reasons for persistence in an environment like a think tank, by a categorized group of attackers with known types of targets, help loosely explain why they may be in certain locations and not others.
Sometimes intelligence and other data found at one target changes future targeting, and if targets are related, prior intelligence can be leveraged to exploit access at new ones. As noted above, trust between organizations is both a technical and a social engineering surface to exploit. Much like a vendor's relationship with a customer, think tank employees are often officials and academics who interact with government agencies, and those relationships can be vectors to leverage for new footholds.
While I'm not saying this is necessarily how and why APT29 and Volexity's "Dark Halo" actors used the same OWA MFA flaw to get into SolarWinds' environment, poor secrets management in both environments made this more trivial than in environments where secrets are managed more aggressively. Add the now widely shared fact that an easily guessable password "secured" internal code repositories, and the chained access provided a perfect avenue to trojan SolarWinds' code base. From outward appearances, once technical details are clarified on exactly which flaw was exploited, there are plenty that sync to known vulnerabilities in the involved systems, including CVE-2020-0688, which may be a reasonable culprit. The Zero Day Initiative has a surprisingly good write-up on it that lends weight to the theory I propose, or is at least similar enough to have me thinking about it.
If it is the latter, and not a unique, new or novel flaw in OWA, then this speaks to patch hygiene, which is almost always trotted out in cases like these. However, that same "chasing the patch" is also what allowed the SolarWinds infrastructure in organizations to be compromised, and to continue to be compromised, because of the long tail on the software supply chain. Two posts on Twitter cover a curious analysis of differences in "cleanliness" based on patching cycles, but also how long the trojaned code was made available to customers by SolarWinds. At the time of my initial authorship of this article, several individuals confirmed that compromised code still existed in those same update channels, so it's unclear where we stand on how bad this infection vector really is now, though supposedly clean code via a new hotfix is available.
As an ops person, this is good news to a point, because the typical knee-jerk reaction to an incident by management, and even some incident handlers, is to blind, disconnect or turn off potentially affected systems. Provided you've imaged, scanned, or otherwise captured possibly affected systems and infrastructure, returning such a critical monitoring system to a reasonable operating state provides more help than harm in ensuring the resiliency of your operations. You may have to come to some sort of truce with your incident handling team: they need, at the least, to know which systems were affected, but operational and business demands may require more immediate service restoration for continuity of operations.
We now know roughly where the problem arose, how to detect and respond, and how to make some initial return to operations. However, we've yet to touch exactly on whether and why your organization may be a target, and how to avoid the hype/scare train some vendors may try to capitalize on, given the "ambulance chasing" that often serves as a sales cycle after these types of incidents.
From the initial disclosure, it's curious to note that FireEye released a notice that they had been breached and that the "red team" tools used on assessment engagements by their professional services team had been taken. I was an employee of Mandiant, a subsidiary of FireEye, many, many years ago; firms like theirs have specialized tools to help model and use tactics similar to those real-world attackers use, but under more controlled circumstances such as a penetration testing engagement. Some may use novel flaws if the job calls for them, but often they are tools designed to help automate the attack, as "time is money" and the quicker and more comprehensive an assessment is, the better the picture of the customer's environment.
It was shared that their compromise came through SolarWinds itself and not the OWA flaw, so this was a clear vendor supply chain attack, and it shows the impact that can be had from that vector of compromise alone. While the theft of the tools is equivalent, say, to breaking into the gun safe, FireEye decided to lessen their usefulness by releasing countermeasures and detection content to a public GitHub repository so protection and detection could be developed by the community, a massive display of transparency. That's not to say you won't see these tools reused the way the ShadowBrokers leveraged the NSA's tools, but as infrastructure becomes inoculated through patching, detection and prevention tools, it will reduce the impact.
However, this is only a useful tactic when the stolen information and data can be valuated and written down. Government systems, especially those holding pre-deliberative policy work, intelligence data, and other confidential information, often can't afford radical transparency without throwing off geopolitical balances. For nation states, therefore, they are not only valuable targets, but the non-disclosure of the details of such thefts plays into the way governments treat their relationship with the public and the global community. Unless compelled by Congress or laws and regulations already on the books, the magnitude and subject of the espionage efforts will never be fully known unless willfully or accidentally disclosed.
Since SolarWinds noted that upwards of 18,000 customers may be affected by the trojaning of their service and software, customers who are not the government are also asking, "should this worry me?" The answer is multi-faceted. Much like the Doctor Who meme where severity depends on the context of the question, this falls along similar lines: "it depends." If you are of interest to the attacker (in this case, a think tank, defense contractor, an organization that holds significant caches of interesting data, or one with policy interests that overlap with Russia's), definitely a "yes". If you are smaller and niche with no such overlap, targeting you may not be of interest, as it rarely fits the motives and TTPs of these threat actors. If this were a Chinese APT, the targeting would be around intellectual property, so manufacturers, universities, R&D facilities and other organizations of similar alignment would be the likely targets.
When we look at the chain of disclosed targets so far, it's been policy oriented. Even those Fortune 500 companies listed as customers may disclose that they saw intrusions, but unlike the government, they potentially have more legal hurdles to review and overcome, as they may have fiduciary and filing requirements that must be followed before anything is shared publicly, if at all. Much like SolarWinds releasing an 8-K detailing the event, as required by their status as a publicly traded company regulated by the Securities and Exchange Commission, similar filings would need to be formulated by other companies in the same situation.
This also brings up other events related to this incident that look curious to those who aren't technical but see the processes required for information sharing as glaringly suspicious. It's been reported that there was some stock sale movement by executives and others before the disclosure of the breach, which, if done with knowledge of the breach, is insider trading and illegal. We saw this prior to the Equifax breach disclosure, and a few others; however common it appears, it is still illegal and questionable. If it comes out that executives were aware of the breach and sold prior to disclosure, they may get sued, and in extreme cases arrested and sent to jail. We will see how that shakes out.
But these details, as you stitch them together, help tell the story and build a timeline of who knew what when, and what they did to react and address the issues. Twitter user @chrismerkel created the chart below of the hotfixes published for Orion, and they indicate possible internal awareness of issues and analysis of the code containing the markers of the trojaned code.
Notably, the last hotfix before public disclosure was October 29, 2020, with the prior confirmed malicious code existing in the version 2020.2 release of June 24. As the numbering scheme changed, we may assume that sometime after that June date, developers noticed issues, audited their code, and removed known trojaned code. If leadership was notified of this and its severity, the coincidence of the trades seems tied more to public disclosure than to the private realization that there was an undisclosed issue with their product.
What is also interesting to note is that the release on March 26 was confirmed malicious, which puts the potential intrusion and modification after February 5, the last documented clean version. This, by the most liberal calculation, gives a window of over 200 days during which code containing the trojan was distributed to and available for customers to use. That is a relatively long window in development time, but given the processes organizations use to download, test, and update systems, the attacker had a potentially smaller window in which to take advantage of their newfound avenue into their targets. Whereas the think tank mentioned above hosted these actors' foothold for several years, if the intruders weren't already in these organizations prior to this, they would have had to act quickly once they confirmed their vectors were active. But, as public disclosure lagged private awareness, this still left over 300 days in which those intruders could operate stealthily, nearly a year of a foothold.
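For the skeptical, a quick back-of-the-envelope check of those windows, using the dates above and measuring to the December 13 public disclosure:

```python
"""Back-of-the-envelope check of the exposure windows, using the dates above."""
from datetime import date

last_clean = date(2020, 2, 5)    # last documented clean release
first_bad  = date(2020, 3, 26)   # first confirmed malicious release
disclosure = date(2020, 12, 13)  # public disclosure

print("trojaned code available:", (disclosure - first_bad).days, "days")   # 262
print("max possible foothold:  ", (disclosure - last_clean).days, "days")  # 312
```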
If you are an incident handler today and are assigned the response to this in your organization, 300 days of logs and other data is a lot of information to search through and make sense of, and when you factor in that an NMS accesses everything, everywhere, that scope isn't just a few machines but potentially hundreds or thousands. For those who aren't in this role or industry, this is why those of us who are feel that this was one of those "black swan" events that, while described as an edge case, became very real, very quickly. If some of us in your organization ask for time off after the response work is done, be nice and grant it.
Since you have gotten this far, done some cleaning, restored service, and completed your initial response, it's now time to do some threat hunting. You could have done this, or some of it, during your initial response given the IOCs provided by various individuals and vendors, but as noted, this is a threat actor that didn't conduct this as their sole operation. As this intrusion has a fancy name tied to the exploit, SUNBURST (Solorigate), there are plenty of other incidents now tied to this threat actor that are also well documented.
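As one concrete hunting example, here is a minimal sketch that greps exported DNS query logs for lookups of the widely reported SUNBURST C2 domain. The CSV layout is an assumption on my part; adapt it to whatever your resolver or SIEM actually exports.

```python
"""Minimal threat-hunting sketch: search exported DNS query logs for lookups
of the widely reported SUNBURST C2 domain. The CSV column layout (timestamp,
client, query) is hypothetical; adapt to your resolver or SIEM export."""
import csv

SUSPECT_SUFFIXES = ("avsvmcloud.com",)   # widely reported SUNBURST C2 domain

def hunt(dns_log_csv: str) -> None:
    with open(dns_log_csv, newline="") as fh:
        for row in csv.DictReader(fh):   # expects columns: timestamp, client, query
            query = row.get("query", "").rstrip(".").lower()
            if query.endswith(SUSPECT_SUFFIXES):
                print(f"{row.get('timestamp')}  {row.get('client')}  {query}")

if __name__ == "__main__":
    hunt("dns_queries.csv")  # hypothetical export file
```

The same skeleton works for any other domain, IP, or hash indicator your providers publish; the hard part is having the logs to run it against.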
This, though, helps point out some holes in the security industry and community, as this information is scattered, and at different levels of detail, among individuals, vendors, researchers, organizations, service providers and a host of others. It's possibly incomplete, but it also suffers from a lack of common nomenclature and formatting. MITRE's ATT&CK framework is always a good start, but when we call the same actor APT29, CozyBear, UNC2452, and Dark Halo, it serves only those playing the "first post" game that was so often derided on Slashdot; it only annoys.
Personally, in creating this article and trying to grasp both the details and the bigger picture for my own edification, I appreciate the work of folks like Ean Meyer who post roundups and sources that I, reporters, and others can reference.
But now we dive into the past behavior and indicators used by these variously named actors and see if we can pick them up farther back than this incident. This can be eye opening for some who've never looked retroactively and broadly at their environment, but it may help tell a larger story that can be useful in justifying resources for a security team, better processes for acquisitions, or changed policies for some organizational activities, like strong identity and access management capabilities. If any good can come from a potential bad, this is where you will see it. Not quite a silver lining, but enough to spin some bad news into something actionable. Plus, if you don't do it around this event, memories fade, and it'll take another one before attention is focused on it again.
So, take indicators from your favorite provider regarding these attackers and see what turns up; you may find this wasn't a one-time occurrence, and you may also fill in blanks from other incidents you responded to where you missed or didn't connect something. I used to create chained reporting at DC3 when I was helping with an analysis cell there, and while it was intended for unclassified use at that level, I felt the context and look-back were useful for those who may have only seen the last report but could then reference others to see the bigger picture. Having worked out of the Treasury GSOC, I know that in 2016, when DHS released IOCs related to Russian activity around the election, they could put them into their system, see related incidents over time, and determine whether something had been addressed earlier or was related to something new that needed handling. It's a good reason Treasury was the first cabinet agency to report a connection to the SolarWinds incident. I was proud to work with that team, and it'd be nice to see the same level of competency and skill elsewhere, supported by executive leadership. I know other agencies have a bunch of great folks working in their SOCs as well; I just wanted to highlight Treasury being so quick with their release.
Finally, we need to consider this at the management level. If you are a consumer or implementer of complex IT systems, responsible for the protection of sensitive data, and serve an industry sector or the public sector, step back and consider where you are. Don't feel bad if you chose anybody mentioned here as one of your vendors; this could literally have happened to any and all vendors you choose. Nobody is perfect, but the idea is to respond and mitigate things quickly and efficiently (and correctly) when something goes awry. Do you think folks drop AWS and Google if there's an outage? Probably not, unless it becomes more of a habit than a few incidents. At the very least, you engineer and architect for resiliency, and things such as "multi-cloud architectures" and tools have made enabling this a lot easier. Is it fool-proof? No, but it makes handling issues as they arise a lot more graceful, especially if you understand what you're doing and how to properly use those tools and services.
Speaking of resiliency, did the disaster and continuity planning for your organization include exercises with a scenario where your NMS is compromised or knocked out of service? If not, it definitely should be added as a scenario, even if you weren't affected by this one. Do you have a mature enough information security program in your organization, or relationships with vendors and service providers, that can scale to support a wide-scale incident? Did it also include your legal, public affairs and investor relations teams? Now is probably a good time to do that update and get those folks involved as well. This isn't just a learning experience for those directly affected by the events of this week; it should be a signal to others to tighten up their own work.
I'm reminded of arriving at Disney a few days after the Sony hack. Instead of serious schadenfreude from the security team at Disney, there was a lot of introspection: "it could have been us, and are we prepared if it ever ends up being us?" That is the proper response. The same goes for any other industry or business: consider your risk appetite and your threat profile, think about things such as your supply chain and operating model, and maybe we can improve things in case this happens again, which it inevitably will.
I'll be updating this as I get new data and corrections, and as more details get released that either contradict or confirm some of the assumptions and hypotheses here. I'm human, I make mistakes, but I'm also dealing with an active event and a lot of disparate sources of information. If you have something to add, contribute, comment on, or critique, please message me or comment.
UPDATE 12/16/20 21:20
Found an updated list/chart from SolarWinds regarding their own assessment of their compromised releases. https://www.solarwinds.com/securityadvisory/faq
UPDATE 12/16/20 22:18
Killswitch deployed for C2 domains.
Threat hunting for SolarWinds IOCs
FireEye Countermeasures
DevOps DevSecOps Incident Handling Incident Response Policy SBOM Security Software Supply Chain