Comment by jsiepkes
18 hours ago
> Each Lambda invocation executed a simple TruffleHog scan command with concurrency set to 1000. This setup allowed me to complete the scan of 5,600,000 repositories in just over 24 hours.
GitLab must have been thrilled about a bot cloning 5.6 million repos in 24 hours. That doesn't really sound responsible to me.
That's 64 clones per second. That's quite a lot but it seems like something that a forge operating at the scale of GitHub can handle, especially if they were --depth=1 (which might have missed some secrets if someone was lazy about clearing their git history).
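A quick back-of-the-envelope check of that figure, as a rough sketch. It assumes exactly 24 hours (the post says "just over", so the real rate is slightly lower) and that clones were spread evenly across the 1000 concurrent Lambda invocations; the variable names are just illustrative.

```python
# Numbers taken from the quoted post.
repos = 5_600_000
concurrency = 1000

# Assumption: exactly 24 hours of wall-clock time.
seconds = 24 * 60 * 60

# Aggregate clone rate seen by the forge.
rate = repos / seconds

# Assumption: load is spread evenly, so each invocation
# starts a new clone roughly this often.
seconds_per_clone_per_worker = seconds * concurrency / repos

print(f"{rate:.1f} clones per second overall")
print(f"one clone every {seconds_per_clone_per_worker:.1f} s per invocation")
```

So the aggregate is about 65 clones per second, while each individual invocation is only cloning once every 15 seconds or so.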
Provided someone told GitLab Support, this was likely fine. GitLab can handle this much load. The platform as a whole has grown and improved over the years as new customers have been added.
Think about this… every CI/CD job runs a clone. That’s a lot.
Gitlab*
If they don’t like it, they can apply rate limiting? Assuming the scanner was well behaved (user agent, IPs).
Assuming bog-standard Lambda, they'd have to rate limit a whole AWS region's Lambda IP range, which would risk affecting legit usage. Bit of an arse way to behave towards a service.
I also thought the sleep(0.03) was cute. Some well deserved rest for the server to avoid hammering it.