> If you use cloud services that offer automated builds you can push the trust onto the provider by building things in a standard (docker/ami) image with scripts in the same repository as the code, cloned directly to the build environment.
And I guess, for those super-critical builds, don't rely on anything but the distro repos or upstream downloads for tooling?
Because if you deploy your own build tools from your own infra, you are at risk to taint the chain of trust with binaries from your own tainted infra again. I'm aware of the trusting trust issue, but compromising the signed gcc copy in debians repositories would be much harder than some copy of a proprietary compiler in my own (possibly compromised) binary repository.
That cannot be the right lesson, because there's no inherent reason "personal machine" is any less safe than "building cluster" or whatever you have around. Yes, in practice it often is less secure to a degree, so it's not a useless rule, but it's not a solution either.
If it's solved some way, it's by reproducible builds and automatic binary verification. People are doing a lot of work on the first, but I think we'll need both.
> there's no inherent reason "personal machine" is any less safe than "building cluster" or whatever you have around
Sure there is! I browse the internet a lot on my dev machines, and this exposes me to bugs in browsers and document viewers. And if I do get compromised, my desktop is so complex and runs so many services that the compromise is unlikely to be detected. So all an attacker needs is one zero day, once.
Compare this to a CI with infrastructure-as-code, like GitHub Actions. If the build process gets compromised, it only matters until the next re-build. Even if you get hit by a supply chain attack once (for example), all the attacker's footholds disappear once it is discovered! And even if they got the developers' keys, it is not easy to persist -- they have to make commits, and those can be noticed and undone.
(Of course if your "building cluster" is a bunch of traditional machines which are never reformatted and which many developers have root access to, then they are not that much more secure. But you don't have to do it that way.)
Use a PaaS like Heroku or Google App Engine, with builds deployed from CI. All the infrastructure-level attack surface is defended by professionals who at least have a fighting chance.
I feel reasonably competent at defending my code from attackers. The stuff that runs underneath it, no way.
Build everything on a secured CI/CD system, keep things patched, monitor traffic egress (especially anything carrying PII), and manually review code changes, especially for sensitive things.
This is truly the stuff of nightmares, and I'm definitely going to review our CI/CD infrastructure with this in mind. I'm eagerly awaiting learning what the initial attack vector was.
If people didn't allow macros in Excel, stayed in read-only mode in Word and only opened sandboxed PDFs (convert to images in sandbox, OCR result, stitch back together), we would see a sharp decline in successful breaches. But that would be boring.
This is the kind of content I come to HN for! I don't get to do a lot of low level stuff these days, and my forensics skills are almost non-existent, so it's really nice to see the process laid out. Heck, just learning of binwalk and scapy (which I'd heard of, but never looked into) was nice.
Consider the possibility that it's fiction. Would you be upset? I wouldn't, perhaps a bit disappointed not to learn more. This certainly fits into "worthy of itself".
Please change the posting title to match the article title and disambiguate between APT (Advanced Persistent Threats, the article subject) and Apt (the package manager).
Thanks, I don’t work in security but I use APT a lot.
I thought it was an unfunny joke? Like ... APT provides some of those packages?
Ok. That makes more sense.
The author did a good job at making that readable. Is it often like that?
You're right... what an annoying namespace collision. On the other hand, stylizing software as Initial Caps is much more acceptable than stylizing non-software acronyms that way, so it would still be less misleading to change the capitalization.
Poster here.
Do you think I need to edit the title?
This title was funny to me, but probably just because I am a security guy and I know what an APT is.
Who is such a hot target and can take such an independent attitude, even to allowing this to be published? If this had been a bank, they'd have had to report to regulators and likely we'd have heard none of these details for years if ever. Same for most anything else big enough to be a target I can think of offhand.
Idk. While banks have to report on this, they are (as far as I know) still free to publicize details.
We normally don't hear about these things not because they can't speak about them but because they don't want to (bad press).
My guess is that it's a company which takes security relatively seriously, but isn't necessarily very big.
> hot target [..] else big enough to be a target
I don't think you need to be that big to be a valid target for an attack of this kind, nor do I think this attack is on a level where "only the most experienced/best hackers" could have pulled it off.
I mean, we don't know how the dev laptop was infected, but given that it took them 3 months to reinfect it I would say it most likely wasn't a state actor or similar.
I think you're right that it's medical. The author calls out that PII was the target. Sure, there's PII in Defense/Fintech/Government, but it's probably not the target in those sectors and PII doesn't have the same spotlight on it as in the medical world (e.g. HIPAA & GDPR).
Not just vaccines, but basically all your data, including billing and disease history. Perfect for both scamming and extortion.
Keep in mind that you actually want your medical provider to have that data, so they can treat you with respect to your medical history, without killing you in the process.
One thing I didn't get is this magical PII thing. How does the author look at a random network packet -- nay, just packet headers -- and assign a PII:true/false label? I think many corporations would sacrifice the right hand of a sysadmin if that was the way to get this tech.
The article just says:
> I wrote a small python program to scan the port 80 traffic capture and create a mapping from each four-tuple TLS connection to a boolean - True for connection with PII and False for all others.
Is it just matching against a list of source IPs? And perhaps the source port, to determine whether it comes from e.g. a network drive (NFS in this case)? Not sure what he uses the full four-tuple for, if this is the answer in the first place. It's very hand-wavy for what is an integral part of finding the intrusion and kind of a holy grail in other situations as well.
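For anyone unfamiliar with the term: a "four-tuple" is just (source IP, source port, destination IP, destination port), which identifies one TCP connection. Here is a minimal sketch of what a "four-tuple to boolean" mapping could look like. To be clear, the article does not say how PII was recognized, so the `has_pii` predicate, the `Packet` record, and the sample data below are all assumptions made up for illustration, not the author's method.

```python
from typing import Callable, Dict, Iterable, NamedTuple, Tuple


class Packet(NamedTuple):
    """Hypothetical, simplified packet record (not a real capture format)."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    payload: bytes


FourTuple = Tuple[str, int, str, int]


def flow_key(pkt: Packet) -> FourTuple:
    """Direction-independent key, so both halves of a connection share one entry."""
    lo, hi = sorted([(pkt.src_ip, pkt.src_port), (pkt.dst_ip, pkt.dst_port)])
    return (lo[0], lo[1], hi[0], hi[1])


def label_flows(
    packets: Iterable[Packet],
    has_pii: Callable[[bytes], bool],
) -> Dict[FourTuple, bool]:
    """Map each connection to True if any of its packets satisfies has_pii."""
    labels: Dict[FourTuple, bool] = {}
    for pkt in packets:
        key = flow_key(pkt)
        labels[key] = labels.get(key, False) or has_pii(pkt.payload)
    return labels


# Made-up sample traffic; the predicate is a placeholder assumption.
packets = [
    Packet("10.0.0.1", 40000, "10.0.0.2", 80, b"GET /profile"),
    Packet("10.0.0.2", 80, "10.0.0.1", 40000, b"ssn=123-45-6789"),
    Packet("10.0.0.3", 40001, "10.0.0.2", 80, b"GET /health"),
]
labels = label_flows(packets, lambda payload: b"ssn=" in payload)
```

The hard part is obviously the predicate, which is exactly what the article leaves out; the bookkeeping around it is trivial.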
Amazon and Microsoft also have their own offerings, but can be quite expensive for network packets (and pretty slow).
Most projects / teams will use some basic regular expressions to capture basics like SSN, credit card numbers or phone numbers. They’re typically just strings of a specific length. More difficult if you’re doing addresses, names, etc.
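As a rough sketch of that regex approach (the patterns and function names here are illustrative assumptions, not anything the article's author describes using): match candidate strings by shape, then filter card numbers with the Luhn checksum to cut down on false positives.

```python
import re

# Hypothetical patterns for illustration; real detectors need far more care.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_pii_candidates(text: str) -> dict:
    """Return SSN-shaped strings and Luhn-valid card-shaped digit runs."""
    cards = []
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if 13 <= len(digits) <= 19 and luhn_valid(digits):
            cards.append(digits)
    return {"ssn": SSN_RE.findall(text), "card": cards}


sample = "card 4111111111111111 ssn 123-45-6789 order 4111111111111112"
result = find_pii_candidates(sample)
```

Note that this only works on plaintext, and the Luhn check still lets roughly one in ten random digit runs of the right length through.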
That's great for you but... how's that relevant to the article? The author never speaks of using this sort of thing.
I saw these regex matchers in school but don't understand them. They go off all day long because one in a dozen numbers match a valid credit card number, even in the lab environment the default setup was clearly unusable. But perhaps more my point: who'd ever upload the stolen data plaintext anyhow? Unencrypted connections have not been the default for stolen data since... the 80s? If your developers are allowed to do rsync/scp/ftps/sftp/https-post/https-git-smart-protocol then so can I, and if they can't do any of the above then they can't do their work. Adding a mitm proxy is, aside from a SPOF waiting to happen, also very easily circumvented. You'd have to reject anything that looks high in entropy (so much for git clone and sending PDFs) and adding a few null bytes to avoid that trigger is also peanuts.
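To make the entropy point concrete, here is a minimal sketch of the kind of Shannon-entropy check such a filter might apply, and of why padding defeats it. The function name `shannon_entropy` and the sample data are assumptions for illustration, not taken from any product discussed here.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())


# Stand-in for encrypted or compressed data: every byte value exactly once.
high = bytes(range(256))
# The same payload padded with an equal number of null bytes.
padded = high + b"\x00" * len(high)

print(shannon_entropy(high))    # the maximum, 8 bits per byte
print(shannon_entropy(padded))  # clearly lower, same payload inside
```

Any threshold that would catch the first blob but let ordinary compressed files through is one cheap padding step away from missing the second.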
These appliances are snakeoil as far as I've seen. But then I very rarely see our customers use this sort of stuff, and when I do it's usually trivial to circumvent (as I invariably have to to do my work).
Now the repository you linked doesn't use regexes, it uses "a cutting edge pre-trained deep learning model, used to efficiently identify sensitive data". Cool. But I don't see any stats from real world traffic, and I also don't see anyone adding custom python code onto their mitm box to match this against gigabits of traffic. Is this a product that is relevant here, or more of a tech demo that works on example files and could theoretically be adapted? Either way, since it's irrelevant to what the author did, I'm not even sure if this is just spam.
My guess was that traffic containing PII was flagged in some way such that it was visible in the pre-GW traffic the researcher had access to. That was the point of linking up the pre-gateway and post-gateway packets. I'm not sure how common such setups are.
What's even more incredible to me is that the researcher somehow recreated exactly the same / correct traffic pattern on their local testing setup, so that they were able to compare the traffic with the production environment to detect that there was a problem. How would you do this?
I'm not even sure what the "time" variable is on the graphs. Response time? (It also seems weird that there's any PII on port 80, but that's an unrelated issue.)
> What's even more incredible to me is that the researcher somehow recreated exactly the same / correct traffic pattern on their local testing setup, so that they were able to compare the traffic with the production environment to detect that there was a problem.
Yeah, that's another thing that has me confused, but I figured one thing at a time...
Thanks for the response, that pre-set PII flag does sound plausible, though it's odd that they'd never mention it and mention a 'four-tuple' instead (sounds like they're trying to use terms not everyone knows? Idk, maybe it's more well-known than it seems to me).
Yes, that was the part where I got lost. It seems he skipped some details about that so it's not clear from the article how that was done. I can't imagine capturing the encrypted data got him that.
This observation is way too casual imo:
"We noticed a 3 month gap about 5 month ago, and it corresponded with the guy moving the kernel build from a Linux laptop to a new Windows laptop with a VirtualBox VM in it for compiling the kernel. It looks as if it took the attackers three months to gain access back into the box and into the VM build."
If the attackers have access to brute force OS engineers' / sysadmins' work PCs then that should probably be the headline. The rest is just about not being found.
Maybe if you are a business oriented person. But reading through the analysis, I felt like the researcher seriously enjoyed the hunt and the "not being found" part.
> On March 21, 2021, CNA determined that it sustained a sophisticated cybersecurity attack. The attack caused a network disruption and impacted certain CNA systems, including corporate email. Upon learning of the incident, we immediately engaged a team of third-party forensic experts to investigate and determine the full scope of this incident, which is ongoing.
One of the most fascinating breach analyses I've ever read.
Reading between the lines, I sense the client didn't 100% trust Mr. Bogdanov in the beginning, and certainly knew there was exfiltration of some kind. Perhaps they had done a quick check of the same stats they guided the author toward. "Check for extra bits" seems like a great place to start if you don't know exactly what you're looking for.
Their front-end architecture seemed quite locked down and security-conscious: just a kernel + Go binary running as init, plain ol' NFS for config files, firewalls everywhere, bastion hosts for internal networks, etc. So already the client must have suspected the attack was of significant sophistication. Who was better equipped to do this than their brilliant annual security consultant?
Which is completely understandable to me, as this hack is already of such unbelievable sophistication that it resembles a Neal Stephenson plot. Since the author did not actually commit the crime, and in fact is a brilliant security researcher, everything worked out.
> So already the client must have suspected the attack was of significant sophistication. Who was better equipped to do this than their brilliant annual security consultant?
If you suspected your security consultant, what would be the point of slipping them tiny hints about what you've found? If they're the source of the intrusion, they already know. If they're not the source of the intrusion, why fear them when you've already been compromised? Also, if you suspected the consultant, why hire them to do the security review?
I suspect the real reason is probably simpler: they have strong personal or financial incentives to "not have known" about the intrusion before the researcher discovered it.
I agree there's nothing to rule out your theory. Likely we will never know. But then why authorize sharing the story?
Specifically I don't think the owner thought it was likely, just a concern he couldn't shake. Probably he relaxed as soon as the consultant didn't make excuses, and tackled the job—extracting the binary from an unlinked inode is definitely not showing reluctance. Pure speculation, of course.
I don't know if this is realistic in any way, but I've seen lots of Murder, She Wrote episodes where the criminal only gets caught because they become involved in the investigation some way and accidentally reveal knowledge that only the attacker could possibly know. This strategy necessitates hiding secret information so it can be revealed later by the attacker.
> just a kernel + Go binary running as init
This is hardly reducing the attack surface compared to a good distro with the usual userspace.
It's been decades since attackers relied on a shell, or unix tools in general, or on being able to write to disk and so on: it's risky and ineffective.
Many attack tools run arbitrary code inside the same process that has been breached and extract data from its memory.
They don't try to snoop around or write to disk and so on. Rather, they move to another host.
The only good mitigation is to split your own application in multiple processes based on the type of risk and sandbox each of them accordingly.
> This is hardly reducing the attack surface compared to a good distro with the usual userspace.
Run `tcpdump -n 'tcp and port 80'` on your frontend host and you'll still see PHP exploit attempts from 15 years ago. Not every ghost who knocks is an APT. A singleton Go binary running on a Linux kernel with no local storage is objectively a smaller attack surface than a service running in a container with /bin/sh, running on a vhost with a full OS, running on a physical host with thousands of sleeping VMs—the state of many, many websites and APIs today.
Superb work. The "who" of attribution is more likely related to the actual PII they were after than any signature you'll get in the code. Seems like a lot of effort and risk of their malware being discovered for PII instead of being an injection point into those users' machines. I rarely hear security people talk about why a system was targeted, and once you have that, you can know what to look for, inject canaries to test, etc.
From Twitter chatter, this appears to be Chinese APT malware, something related to PlugX
>Chinese APT
Wow, surprising!
> Chinese APT malware,
Why is it necessary to point out the foreign origin? Doesn't that just encourage our innate xenophobia?
I think the old “Mossad is gonna Mossad” thing is still true. Good security practices are mandatory, and will keep you safe 99% of the time.
But when you have what appear to be state level actors using 0 day exploits... you will not stop them.
Thanks for making me look up "Mossad is gonna Mossad" -> Schneier -> Mickens' essay titled "This World of Ours".
https://www.usenix.org/system/files/1401_08-12_mickens.pdf
Thank you for this. Helps put my career choices into perspective. (I just quit security work to be a stay at home dad.)
Thanks, this is such good writing. Reminds me a little of Douglas Adams.
That was very entertaining, thank you.
No 0-day here, more of a supply chain attack, but your point stands. This actor was determined.
More like Chinese state sponsored hackers
Conspiracy theory: the fact the POC insisted on the writer checking out the traffic suggests they knew about (or were suspicious of) the fact that PII was being leaked.
Probably, but is that a conspiracy theory so much as an insurance policy? Being able to competently complete that sort of nightmare investigation is probably why the investigator was re-hired annually.
A packet capture of the config files would show something was up to anyone suspicious, but knowing what to do about it is a completely different story.
The 'conspiracy' part of my conspiracy theory is not that they hired a security consultant, but that they explicitly guided him to the exact hardware[1] with the correct metric to detect it[2] asking him to test for a surprisingly accurate hypothetical[3], even going so far as to temporarily deny the suggestion of the person they're paying to do this work[4]. This is weirdly specific assuming they had no knowledge of the compromise.
Of course, I have no non-circumstantial evidence and this could all be a coincidence, which is why my comment is prefixed with "conspiracy theory".
1: "However, he asked me to first look at their cluster of reverse gateways / load balancers"
2: Active analysis would probably have been less likely to find the issue, given the self-destruct feature
3: "Specifically he wanted to know if I could develop a methodology for testing if an attacker has gained access to the gateways and is trying to access PII"
4: "I couldn't SSH into the host (no SSH), so I figured we will have to add some kind of instrumentation to the GO app. Klaus still insisted I start by looking at the traffic before (red) and after the GW (green)"
from Igor:
>I think he had some suspicions, but he is denying that vehemently ;)
https://twitter.com/IgorBog61650384/status/13753134251323146...
Is POC being used as "point of contact"? I've not come across the acronym before.
https://en.wikipedia.org/wiki/POC
Yes. That would be the person with the org handling the organization's relationship with the contractor, setting up their access, answering questions, guiding, propagating results, etc.
I was thinking the same.
I'm trying to find the lesson in here about how to prevent this kind of incident in the first place. The nearest I can find is: don't build any production binaries on your personal machine.
Reproducible builds can go a long way, along with a diverse set of build servers which are automatically compared. Whether you use your personal machine or a CI system there's still the risk of it being compromised (though your personal machine is probably at a little more risk of that since personal machines tend to have a lot more software running on them than CI systems or production machines).
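A minimal sketch of what "automatically compared" could mean, assuming the build is bit-for-bit reproducible and each builder hands back its artifact as bytes. The builder names and the `compare_builds` helper are hypothetical, and the majority vote only means something if the builders are unlikely to be compromised in the same way.

```python
import hashlib
from collections import Counter
from typing import Dict, List


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of an artifact."""
    return hashlib.sha256(data).hexdigest()


def compare_builds(artifacts: Dict[str, bytes]) -> List[str]:
    """Return the builders whose artifact digest differs from the most common one."""
    if not artifacts:
        return []
    digests = {name: sha256_hex(blob) for name, blob in artifacts.items()}
    # On a tie, Counter keeps first-seen order, so a tie needs a human anyway.
    majority, _count = Counter(digests.values()).most_common(1)[0]
    return sorted(name for name, digest in digests.items() if digest != majority)


# Made-up artifacts standing in for real build outputs.
suspects = compare_builds({
    "ci-a": b"kernel image",
    "ci-b": b"kernel image",
    "dev-laptop": b"kernel image + implant",
})
```

Any non-empty result should block deployment until someone explains the difference.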
I'm paranoid, and I'd have considered the efforts described here to be pretty secure. I'll say the only counter to this grade of threat is constant monitoring, by a varied crew of attentive, inventive, and interested people. Even then, there's probably going to be a lot of luck needed.
Traffic analysis and monitoring will detect signs of intrusion, and also exfiltration, almost in real time. The network never lies.
One sensible mitigation to this grade of threat: avoid running Windows, even as a VM host as the dev did. It's a dumpster fire.
I would go further and say:
"Developer systems are often the weakest link."
(Assuming that the system in itself is designed with security in mind.)
The reasons are manifold but include:
- attacks against developer systems are often not considered, or considered less, in security planning
- many of the techniques you can use to harden a server conflict with development workflows
- there are a lot of tools you likely run on dev systems which add a large (supply chain) attack surface (you can avoid this by always running everything in a container, including your language server/the core of your IDE's auto-completion features).
Some examples:
- docker group members having pseudo root access
- dev user has sudo rights so a key logger can gain root access
- build scripts of more or less any build tool (e.g. npm, maven plugins, etc.)
- locking down code execution on writable hard drives is not feasible (or is bypassed by python, node, java, bash).
- various selinux options messing up dev or debug tools
- various kernel hardening flags preventing certain debugging tools/approaches
- preventing LD_PRELOAD breaking applications and/or test suites
...
I think a big difference between build machines and dev machines, at least in principle, is that you can lock down the network access of the build machine, whereas developers are going to want to access arbitrary sites on the internet.
A build machine may need to download software dependencies, but ideally those would come from an internal mirror/cache of packages, which should be not just more secure but also quicker and more resilient to network failures.
Interestingly, this is water on mills we are currently thinking about. We're in the process of scaling up security and compliance procedures, so we have a lot of things on the table, like segregation of duties, privileged access workstations, build and approval processes.
Interestingly, the way with the least overall headaches is to fully de-privilege all systems humans have access to during regular, non-emergency situations. One of those principles would be that software compiled on a workstation is automatically disqualified from deployment, and no human should even be able to deploy something into a repository the infra can deploy from.
Maybe I should even push container-based builds further and put up a possible project to just destroy and rebuild CI workers every 24 hours. But that will make a lot of build engineers sad.
Do note that "least headaches" does not mean "easy".
This is why I always insist on branches being protected at the VCS server level so that no code can sneak in without other's approval - the idea is that even if your machine is compromised, the worst it can do is commit malicious code to a branch and open a PR where it'll get caught during code review, as opposed to sneakily (force?) pushing itself to master.
In this case no CI was involved so that wouldn't have helped.
(The CI was not compromised but a dev laptop which was used to manually build+deploy the kernel, without any CI involved).
Though generally I agree with you.
If you use cloud services that offer automated builds you can push the trust onto the provider by building things in a standard (docker/ami) image with scripts in the same repository as the code, cloned directly to the build environment.
If you roll your own build environment then automate the build process for it and recreate it from scratch fairly often. Reinstall the OS from a trusted image, only install the build tools, generate new ssh keys that only belong to the build environment each time, and if the build is automated enough just delete the ssh keys after it's running. Rebuild it again if you need access for some reason. Don't run anything but the builds on the build machines to reduce the attack surface, and make it as self contained as possible, e.g. pull from git, build, sign, upload to a repository. The repository should only have write access from the build server. Verify signatures before installing/running binaries.
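The "verify before installing" step can be sketched roughly as below. This is a simplification under stated assumptions: a pinned SHA-256 digest stands in for a real signature check (GPG or similar, which needs tooling outside the standard library), the artifact bytes are made up, and the digest is assumed to reach the installer over a channel the attacker cannot write to.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of an artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, expected_hex: str) -> bool:
    """Accept the artifact only if its digest matches the pinned value."""
    # compare_digest does a constant-time comparison of the two hex strings.
    return hmac.compare_digest(sha256_hex(data), expected_hex.lower())


# Hypothetical artifact and the digest the build server published for it.
artifact = b"example build output"
published_digest = sha256_hex(artifact)

print(verify_artifact(artifact, published_digest))            # untouched artifact
print(verify_artifact(artifact + b"\x00", published_digest))  # one extra byte appended
```

A digest only proves the bytes match what was published; it says nothing about who published it, which is why the signing key has to live only on the build machine.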
> If you use cloud services that offer automated builds you can push the trust onto the provider by building things in a standard (docker/ami) image with scripts in the same repository as the code, cloned directly to the build environment.
And I guess, for those super-critical builds, don't rely on anything but the distro repos or upstream downloads for tooling?
Because if you deploy your own build tools from your own infra, you are at risk of tainting the chain of trust with binaries from your own tainted infra again. I'm aware of the trusting trust issue, but compromising the signed gcc copy in Debian's repositories would be much harder than some copy of a proprietary compiler in my own (possibly compromised) binary repository.
Did you mean read access from the build server? I’m confused.
Hum... On what machine do you build them?
That cannot be the right lesson, because there's no inherent reason "personal machine" is any less safe than "building cluster" or whatever you have around. Yes, in practice it often is less secure to a degree, so it's not a useless rule, but it's not a solution either.
If it's solved some way, it's by reproducible builds and automatic binary verification. People are doing a lot of work on the first, but I think we'll need both.
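To make the "automatic binary verification" half concrete, here is a minimal sketch of what such a check might look like once builds are reproducible: several independent builders produce the same artifact, and anything that disagrees with the majority gets flagged. The builder names and byte strings are invented for illustration, and a real setup would compare signed attestations rather than raw bytes held in one process.

```python
import hashlib
from collections import Counter


def digest(data: bytes) -> str:
    """Hex SHA-256 of one build output."""
    return hashlib.sha256(data).hexdigest()


def builds_agree(outputs: dict[str, bytes]) -> bool:
    """True only if every independent builder produced identical bytes."""
    return len({digest(blob) for blob in outputs.values()}) == 1


def odd_ones_out(outputs: dict[str, bytes]) -> list[str]:
    """Name the builders whose output differs from the most common digest."""
    digests = {name: digest(blob) for name, blob in outputs.items()}
    majority, _ = Counter(digests.values()).most_common(1)[0]
    return sorted(name for name, d in digests.items() if d != majority)


# Hypothetical builders; the laptop build carries extra bytes.
outputs = {
    "ci": b"kernel-v1",
    "laptop": b"kernel-v1 plus implant",
    "rebuilder": b"kernel-v1",
}
print(builds_agree(outputs))
print(odd_ones_out(outputs))
```

This only works if the build really is bit-for-bit reproducible; otherwise every builder is an "odd one out" and the check tells you nothing.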
> there's no inherent reason "personal machine" is any less safe than "building cluster" or whatever you have around
Sure there is! I browse the internet a lot on my dev machines, and this exposes me to bugs in browsers and document viewers. And if I do get compromised, my desktop is so complex and runs so many services that the compromise is unlikely to be detected. So all an attacker needs is one zero day, once.
Compare this to a CI with infrastructure-as-code, like GitHub Actions. If the build process gets compromised, it only matters until the next re-build. Even if you get a supply chain attack once (for example), if this is discovered all your footholds disappear! And even if you got the developers' keys, it is not easy to persist -- you have to make commits and those can be noticed and undone.
(Of course if your "building cluster" is a bunch of traditional machines which are never reformatted and which many developers have root access to, then they are not that much more secure. But you don't have to do it that way.)
Use a PaaS like Heroku or Google App Engine, with builds deployed from CI. All the infrastructure-level attack surface is defended by professionals who at least have a fighting chance.
I feel reasonably competent at defending my code from attackers. The stuff that runs underneath it, no way.
The lesson is in the essay from James Mickens, above.
But isn't that just defeatist? Can't we continue to ratchet up our defenses?
In the SolarWinds hack their CI/CD systems were hacked, so I don't see a lesson here.
Build everything on a secured CI/CD system, keep things patched, monitor traffic egress especially with PII, manual review of code changes, especially for sensitive things
This is truly the stuff of nightmares, and I'm definitely going to review our CI/CD infrastructure with this in mind. I'm eagerly awaiting learning what the initial attack vector was.
9 times out of 10, through the front door. Some shit in a .doc, .html or .pdf. The Google-China hack started with targeted PDFs.
If people didn't allow macros in Excel, stayed in read-only mode in Word and only opened sandboxed PDFs (convert to images in sandbox, OCR result, stitch back together), we would see a sharp decline in successful breaches. But that would be boring.
How is such an attack even possible? A bug in LibreOffice, the browser, or Evince?
This is the kind of content I come to HN for! I don't get to do a lot of low level stuff these days, and my forensics skills are almost non-existent, so it's really nice to see the process laid out. Heck, just learning of binwalk and scapy (which I'd heard of, but never looked into) was nice.
Consider the possibility that it's fiction. Would you be upset? I wouldn't, perhaps a bit disappointed not to learn more. This certainly fits into "worthy of itself".
Please change the posting title to match the article title and disambiguate between APT (Advanced Persistent Threats, the article subject) and Apt (the package manager).
Thanks, I don’t work in security but I use APT a lot. I thought it was an unfunny joke? Like ... APT provides some of those packages? OK. That makes more sense.
The author did a good job at making that readable. Is it often like that?
Most security analyses are easily readable (the ones I read were), but there may be outliers.
Especially where the article doesn't define it or even use the term "APT" except in the title.
FWIW, the package manager is also spelled APT.
You're right... what an annoying namespace collision. On the other hand, stylizing software as Initial Caps is much more acceptable than stylizing non-software acronyms that way, so it would still be less misleading to change the capitalization.
And I thought it was the common English word "apt". It's a trivial ambiguity, not click-bait.
I thought we couldn't edit titles?
Poster here. Do you think I need to edit the title? This title was funny to me, but probably just because I am a security guy and I know what an APT is.
Yes, the poster can for a limited time, 2 hours I think.
Call me naïve, but who is such a hot target to warrant so much effort to exfiltrate PII? Defense? FinTech? Government?
Who is such a hot target and can take such an independent attitude, even to allowing this to be published? If this had been a bank, they'd have had to report to regulators and likely we'd have heard none of these details for years if ever. Same for most anything else big enough to be a target I can think of offhand.
Idk. While banks have to report on this, they are (as far as I know) still free to publicize details.
We normally don't hear about these things not because they can't speak about it but because they don't want to speak about it (bad press).
My guess is that it's a company which takes security relatively seriously, but isn't necessarily very big.
> hot target [..] else big enough to be a target
I don't think you need to be that big to be a valid target for an attack of this kind, nor do I think this attack is on a level where "only the most experienced/best hackers" could have pulled it off.
I mean we don't know how the dev laptop was infected, but given that it took them 3 months to reinfect it I would say it most likely wasn't a state actor or similar.
Hotel or b2b travel agencies also have PII that can be very useful to intelligence agencies.
Based on how outlandish the GW setup is, this is definitely a bank.
It could conceivably belong to a defense organization, but if it did, they wouldn't be able to write up a blog about their findings.
Sounds like a non-conventional bank, given how many details were allowed to be posted. Perhaps something crypto?
I'd add medical to that list. Vaccine test results are hot stuff.
I think you're right that it's medical. The author calls out that PII was the target. Sure, there's PII in Defense/Fintech/Government, but it's probably not the target in those sectors and PII doesn't have the same spotlight on it as in the medical world (e.g. HIPAA & GDPR).
Not just vaccines, but basically all your data, including billing and disease history. Perfect for both scamming and extortion.
Keep in mind that you actually want your medical provider to have that data, so they can treat you with respect to your medical history, without killing you in the process.
If they can get to you, they can get to your clients, who have clients they're now better able to get to, etc...
HVAC company working in a building where a subcontractor of a major financial firm has an office, for a random example...
One thing I didn't get is this magical PII thing. How does the author look at a random network packet -- nay, just packet headers -- and assign a PII:true/false label? I think many corporations would sacrifice the right hand of a sysadmin if that was the way to get this tech.
The article just says:
> I wrote a small python program to scan the port 80 traffic capture and create a mapping from each four-tuple TLS connection to a boolean - True for connection with PII and False for all others.
Is it just matching against a list of source IPs? And perhaps the source port, to determine whether it comes from e.g. a network drive (NFS in this case)? Not sure what he uses the full four-tuple for, if this is the answer in the first place. It's very hand-wavy for what is an integral part of finding the intrusion and kind of a holy grail in other situations as well.
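For what it's worth, the data structure itself is easy to sketch, even though the article never says how the True/False decision was made. In the hedged example below, the `has_pii` predicate is a pure placeholder, the `Packet` type and the addresses are invented, and nothing here is claimed to be the author's method. The one thing the full four-tuple buys you over a source-IP list is that the source port separates concurrent connections from the same client to the same server.

```python
from typing import Callable, Iterable, NamedTuple


class FourTuple(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class Packet(NamedTuple):
    conn: FourTuple
    payload: bytes


def label_connections(
    packets: Iterable[Packet],
    has_pii: Callable[[bytes], bool],
) -> dict[FourTuple, bool]:
    """Map each connection to True if any of its packets matched has_pii."""
    labels: dict[FourTuple, bool] = {}
    for pkt in packets:
        labels[pkt.conn] = labels.get(pkt.conn, False) or has_pii(pkt.payload)
    return labels


# Two connections from the same client, told apart only by source port.
conn_a = FourTuple("192.0.2.10", 40001, "198.51.100.5", 80)
conn_b = FourTuple("192.0.2.10", 40002, "198.51.100.5", 80)
packets = [
    Packet(conn_a, b"GET /static/logo.png"),
    Packet(conn_b, b"POST /profile name=Jane"),
    Packet(conn_b, b"GET /static/style.css"),
]

# Placeholder predicate: how PII was actually recognised is not stated.
labels = label_connections(packets, lambda payload: b"name=" in payload)
print(labels[conn_a], labels[conn_b])
```

So the mapping is the trivial part; the question of what goes into the predicate still stands.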
We just open sourced one of our libraries used to detect PII:
https://github.com/capitalone/DataProfiler
Amazon and Microsoft also have their own offerings, but can be quite expensive for network packets (and pretty slow).
Most projects / teams will use some basic regular expressions to capture basics like SSN, credit card numbers or phone numbers. They’re typically just strings of a specific length. More difficult if you’re doing addresses, names, etc.
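A minimal sketch of that "basic regular expressions" approach, assuming US-style SSN formatting and 13 to 19 digit card numbers; the patterns and the sample text are illustrative only and not taken from any particular product. The Luhn checksum is the usual way to cut down card-number false positives, though it only rejects about nine in ten random digit strings.

```python
import re

# Illustrative patterns: SSN as 3-2-4 digits, cards as 13-19 digits
# with optional single spaces or hyphens between digits.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_ok(number: str) -> bool:
    """Luhn checksum over the digits of the string, ignoring separators."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_pii(text: str) -> dict[str, list[str]]:
    """Return SSN-shaped strings and Luhn-valid card-shaped strings."""
    cards = [m.group() for m in CARD_RE.finditer(text) if luhn_ok(m.group())]
    return {"ssn": SSN_RE.findall(text), "card": cards}


sample = "card 4111 1111 1111 1111, typo 4111 1111 1111 1112, ssn 123-45-6789"
print(find_pii(sample))
```

Names and addresses have no such fixed shape or checksum, which is where this approach stops working.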
That's great for you but... how's that relevant to the article? The author never speaks of using this sort of thing.
I saw these regex matchers in school but don't understand them. They go off all day long because one in a dozen numbers match a valid credit card number, even in the lab environment the default setup was clearly unusable. But perhaps more my point: who'd ever upload the stolen data plaintext anyhow? Unencrypted connections have not been the default for stolen data since... the 80s? If your developers are allowed to do rsync/scp/ftps/sftp/https-post/https-git-smart-protocol then so can I, and if they can't do any of the above then they can't do their work. Adding a mitm proxy is, aside from a SPOF waiting to happen, also very easily circumvented. You'd have to reject anything that looks high in entropy (so much for git clone and sending PDFs) and adding a few null bytes to avoid that trigger is also peanuts.
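The null-byte trick is easy to demonstrate. The sketch below computes Shannon entropy per byte; a perfectly uniform byte distribution stands in for encrypted or compressed data, and the amount of padding is an arbitrary choice for illustration, not a measurement against any real filtering appliance.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


# Stand-in for ciphertext: every byte value equally frequent.
high = bytes(range(256)) * 16
# Same data with an equal number of null bytes appended.
padded = high + b"\x00" * len(high)

print(round(shannon_entropy(high), 2))    # 8.0
print(round(shannon_entropy(padded), 2))  # 4.98
```

Doubling the transfer size drops the score from the maximum to roughly what ordinary text or binaries score, so a simple threshold on whole-stream entropy is not much of an obstacle.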
These appliances are snakeoil as far as I've seen. But then I very rarely see our customers use this sort of stuff, and when I do it's usually trivial to circumvent (as I invariably have to to do my work).
Now the repository you linked doesn't use regexes, it uses "a cutting edge pre-trained deep learning model, used to efficiently identify sensitive data". Cool. But I don't see any stats from real world traffic, and I also don't see anyone adding custom python code onto their mitm box to match this against gigabits of traffic. Is this a product that is relevant here, or more of a tech demo that works on example files and could theoretically be adapted? Either way, since it's irrelevant to what the author did, I'm not even sure if this is just spam.
> YOLO rename label
Love the most recent commit
My guess was that traffic containing PII was flagged in some way such that it was visible in the pre-GW traffic the researcher had access to. That was the point of linking up the pre-gateway and post-gateway packets. I'm not sure how common such setups are.
What's even more incredible to me is that the researcher somehow recreated exactly the same / correct traffic pattern on their local testing setup, so that they were able to compare the traffic with the production environment to detect that there was a problem. How would you do this?
I'm not even sure what the "time" variable is on the graphs. Response time? (It also seems weird that there's any PII on port 80, but that's an unrelated issue.)
> What's even more incredible to me is that the researcher somehow recreated exactly the same / correct traffic pattern on their local testing setup, so that they were able to compare the traffic with the production environment to detect that there was a problem.
Yeah, that's another thing that has me confused, but I figured one thing at a time...
Thanks for the response, that pre-set PII flag does sound plausible, though it's odd that they'd never mention it and mention a 'four-tuple' instead (sounds like they're trying to use terms not everyone knows? Idk, maybe it's more well-known than it seems to me).
Yes, that was the part where I got lost. It seems he skipped some details about that so it's not clear from the article how that was done. I can't imagine capturing the encrypted data got him that.
Wow! What an amazing write-up and find!
It’s also amazing that they noticed the subtle difference in the NFS packet capture.
I can’t wait for the rest to be published.
Bookmarked
This observation is way too casual imo: "We noticed a 3 month gap about 5 month ago, and it corresponded with the guy moving the kernel build from a Linux laptop to a new Windows laptop with a VirtualBox VM in it for compiling the kernel. It looks as if it took the attackers three months to gain access back into the box and into the VM build."
If the attackers have the access to brute-force their way into OS engineers' / sysadmins' work PCs, then that should probably be the headline. The rest is just about not being found.
Maybe if you are a business oriented person. But reading through the analysis, I felt like the researcher seriously enjoyed the hunt and the "not being found" part.
All I learned is (a reminder that) I actually have no idea how computers work. :-)
Started slow to get me hooked, then bam... slapped you in the face with a wild ride.
Reading this, I know of places that have no hope against someone half as decent as this APT. The internet is a scary place
"I think they let some intern fresh out of college write that one." - I think it was intentional; They probably had a tool to generate that code.
Wow. They don't pay you enough.
CNA?
> On March 21, 2021, CNA determined that it sustained a sophisticated cybersecurity attack. The attack caused a network disruption and impacted certain CNA systems, including corporate email. Upon learning of the incident, we immediately engaged a team of third-party forensic experts to investigate and determine the full scope of this incident, which is ongoing.
+ [CNA suffers sophisticated cybersecurity attack](https://www.cna.com/)
No, CNA was hit by ransomware.