Tell HN: OpenBB scraping GitHub emails for marketing spam

4 years ago

A recent post[0] went around about OpenBB. I starred their GitHub[1] repository because I was interested and now I've just received an email[2] thanking me for signing up for their newsletter.

Look, I hate to call the project out but I'm tired of people scraping GitHub for emails and blasting out marketing spam. And I'm confident they scraped my email from GitHub because that's the only place 'github@mydomain.com' is used.

[0]: https://news.ycombinator.com/item?id=30854451

[1]: https://github.com/OpenBB-finance/OpenBBTerminal

[2]:

  Hi,

  We, as OpenBB, would like to personally thank you for signing up to our newsletter.

  We are on a mission to make investment research effective, powerful and accessible to everyone.

  The team would love to hear what you think of our OpenBB Terminal and if there is anything we can do to improve it.

  Feel free to e-mail us at hello@openbb.co or reach us through our Discord to get involved with the community.

Received the same email.

Here is the GitHub link[0] to report abuse. I sent this short text[2]. Generally I'm OK with being contacted by email, and I think I'm quite tolerant about it, as long as I don't receive spam claiming that I subscribed to a newsletter when I have not.

[2]: OpenBB (https://github.com/OpenBB-finance/OpenBBTerminal) is spamming GitHub users that have starred their repository. I never subscribed to their newsletter.

  • I would 1) report it to Github, 2) flag it as spam in your web mail client, 3) report it as spam to whatever mailing service the company may be using (Sendgrid, Mailchimp, whatever).

  • I starred their repo. If they send me anything I will report them to whoever they’re using.

I just don't understand how companies can think this kind of privacy invading marketing can be effective. In which situation do they believe they are adding affinity for their product by scraping your email?

  • It's not "companies" as a whole. It's a small department who comes up with these ideas and uses misleading, short-term metrics such as open rate or "cooking the books" by misattributing future revenue to their efforts ("this user who we spammed 3 years ago suddenly signed up - see, my marketing strategy is great, now give me a raise!") to justify their salaries. Their incentives are rarely correlated to those of the company, so every short-term trick that they can use to justify their salary is good even if it will be damaging to the company in the long term (by that time they'll already be on their next project with a successful achievement on their resume).

    • > It's not "companies" as a whole. It's a small department who comes up with these ideas

      Ah yes, the "one bad nerd-apple" hypothesis.

  • The cost of negative responses is close to zero while the benefit of positive replies is high. Usual spam/growth hack accounting.

  • It's your public email address on GitHub (which you may choose to hide, if you prefer). It's also public what you starred on GitHub. This email is obnoxious and annoying, and it's possibly even against the law to sign people up without them explicitly consenting, but it's not "privacy invading".

Surely it's some kind of trademark infringement or something to call your Bloomberg competitor "OpenBB". That seems obviously designed to mislead people into thinking it's something to do with Bloomberg.

If it were just some meme project fine, but they are apparently taking investment [1], and investors generally expect to make money. Making money from misleading people into thinking you're backed by the market leader...uhh...seems risky.

[1]: https://openbb.co/company#investors

  • I spent half a decade in finance and know my way around a Bloomberg terminal, and I don't think I've ever heard “BB” used as shorthand for Bloomberg. When I see it, I think BlackBerry.

    • Sure, but when you see "financial terminal" you're already thinking Bloomberg. Then you see "OpenBB" and by that point what else could "BB" stand for?

      I mean that as a legitimate question, what are they even claiming "BB" stands for? I actually can't find it.

  • They are also using stock data from Yahoo Finance and IEX Cloud as their data sources. It is misleading to use IEX because you are only getting 2-3% of volume, given that IEX is a very small stock exchange. Using Yahoo Finance may be more accurate, but the exchanges will not take kindly to using scraped data like that. There are very strict compliance and audit requirements for dealing with any of the exchanges, and the data costs a ton. I have seen a lot of projects pull together a bunch of random "free" data sources without regard for their licensing or terms. I can't understand how this is even comparable to Bloomberg when there is no live data feed and the data source is questionable at best. I hope their investors aren't blindly believing their marketing.

    Being a free and open project I'm not sure how they would ever use a paid data feed though, the market data industry is just not compatible with that model unfortunately.

    • Under which authority could Yahoo Finance legally restrict access to the stock data if, on the technical side, the website and information are public and don't require authentication?


It's definitely not the first time I've seen emails scraped from stars on, or commits to, a public repo on GitHub.

I agree with others in this thread: if you didn't explicitly sign up for the mailing list, mark it as spam. They should realize the damage they're inflicting on their mail sending reputation by sending unsolicited communications pretty quickly.

  • > They should realize the damage they're inflicting ...

    Of course they should, but this is by no means a new marketing strategy ...

That sucks and I've always expected unscrupulous, shameless groups to do this at some point.

I use the users.noreply.github.com alias when developing on GitHub

https://docs.github.com/en/account-and-profile/setting-up-an...

And on GitLab, the users.noreply.gitlab.com email

https://docs.gitlab.com/ee/user/profile/index.html#use-an-au...
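
For anyone scripting this, here is a minimal Python sketch of the GitHub alias format, assuming the ID-based form ID+USERNAME@users.noreply.github.com (the older USERNAME@users.noreply.github.com form is accepted too). The function names are made up for illustration and are not part of any GitHub API.

```python
import re

# Matches both the ID-based and the older username-only noreply formats.
# Assumption: usernames contain only letters, digits and hyphens.
_NOREPLY_RE = re.compile(
    r"^(?:\d+\+)?[a-z0-9-]+@users\.noreply\.github\.com$"
)


def github_noreply(user_id: int, username: str) -> str:
    """Build the ID-based noreply address for a GitHub account."""
    return f"{user_id}+{username}@users.noreply.github.com"


def is_github_noreply(email: str) -> bool:
    """Return True if the address looks like a GitHub noreply alias."""
    return _NOREPLY_RE.match(email.strip().lower()) is not None
```

The resulting address is what you would pass to `git config user.email` so that new commits carry the alias instead of a real mailbox. Existing commits keep whatever address they were made with.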

  • I treat GitHub as a social coding network and I want individuals to be able to email me. I’ve received genuine questions, compliments and friendship by providing an email address on my GitHub profile. And if that’s not the product manager’s dream over at GitHub, I’m not sure what is!

    So I think the solution to the problem is not to blackhole my GitHub profile, but to put systems in place that prevent the spamming techniques reported by the OP.

> We, as OpenBB, would like to personally thank you...

What dishonest fake-friendly BS. An automated email isn't personal.

Hi all,

I'm the Didier Lopes from the e-mail.

Sorry for this. This was not intentional. The goal was to let you all know about the rebrand of the project, ONLY.

We would have removed every single one of you from the newsletter.

See this thread: https://github.com/OpenBB-finance/OpenBBTerminal/issues/1625

  • > This was not intentional. The goal was to let you all know about the rebrand of the project, ONLY.

    That is not the defense you think it is, given that it is also a violation of anti-spam and privacy laws in many jurisdictions, and of GitHub's ToS. One unsolicited e-mail is still unsolicited e-mail. What you intended to do was also wrong.

    • I understand this now.

      And believe me, if we had known how people would react, even to one single e-mail about us completely changing the name of the org and repo, we would not have done it.

      It's midnight here and I am still apologising to everyone for this. It's frustrating, but it's the only thing we can do now.


  • I think people are probably curious how you got a list of emails of people who have starred your repo(s) on GitHub. Personally I don't think of "please add my email to a database" as being included in starring a repo. Maybe there's something in the TOS or a community understanding I'm not aware of.

    I'm not trying to pile on or push negativity to you and the project; if this is what you intended to do (and I don't have too much reason to doubt) then it's a different flavor of spam than recruiters/blogspam contacting us or signing us up for stuff without our consent.

  • "Letting you all know about the rebrand of the project" is still unwanted marketing that no project should engage in. Whether it's going to be continuous or not doesn't change that.

  • The irony is you're using a public platform to apologise for this, one where people chose to come; it wasn't foisted upon them. I'd imagine it gained you quite some interest in the project at first. You could have used the same means with some extra content to make the click worth it. This just looks sketchy and you may have alienated a number of potential users. Yes, it's only an email, but the demographic here is a little different from the norm.

Well at least you starred the project. I get recruitment spam and unwanted surveys from “researchers” on my GitHub emails all the time, having starred nothing related. The worst offender is Turing.com; they spammed emails of all my GitHub accounts repeatedly before I blackholed them, and their unsubscribe button seemed to be a “here’s a real human, please spam more” button. Shame on them and all their investors. https://www.crunchbase.com/organization/turing/company_finan...

Thanks for posting this. I just received their spam email an hour ago. If enough people report it as spam, it should have deleterious effects on their mail sender.

Yes, these kinds of marketing tactics are disgusting and despicable, but the broader issue is why does Github facilitate them by making user emails discoverable? There are tools like github-email [1], which allow you to:

> Retrieve a GitHub user's email even if it's not public. Pulls info from Github user, NPM, activity commits, owned repo commit activity.

Why does Github have so many ways to exfiltrate a user's email address?

[1]: https://github.com/paulirish/github-email
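
To make concrete why commit activity leaks addresses: every commit object stores an author and a committer identity in its headers, in the shape printed by `git cat-file -p <commit>`. Below is a minimal Python sketch that reads those two headers. The sample commit text, the name and the example.com address are invented for illustration.

```python
import re
from typing import Dict

# Invented sample in the raw commit-object layout (headers, blank line, message).
SAMPLE_COMMIT = """\
tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
author Jane Doe <jane@example.com> 1648771200 +0000
committer Jane Doe <jane@example.com> 1648771200 +0000

Initial commit
"""

# Identity header: role, name, <email>, unix timestamp, timezone offset.
_IDENT_RE = re.compile(r"^(author|committer) (.*) <([^>]*)> (\d+) ([+-]\d{4})$")


def commit_emails(raw_commit: str) -> Dict[str, str]:
    """Return the email recorded for the author and committer of a commit."""
    emails: Dict[str, str] = {}
    for line in raw_commit.splitlines():
        if line == "":
            break  # the first blank line ends the header section
        match = _IDENT_RE.match(line)
        if match:
            emails[match.group(1)] = match.group(3)
    return emails
```

Whatever address was configured when the commit was made is the one stored here, which is why hiding the email on a profile page does nothing for commits that are already public.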

  • Given that having an email address attached to commits is a fairly standard component of git, is there a way to prevent this?

    Short of using burner/fake email addresses of course

    • The best way, at least when committing to GH, is to use your GH email itself like so

        8601934+judge2020@users.noreply.github.com

      You can find this in email settings https://github.com/settings/emails under “Keep my email addresses private”

      You can’t receive email at this address, so, hopefully, anyone that needs to contact you can find your real email elsewhere.


    • The only (remotely) effective strategy I've seen is to use burners that only show up in commits and then report all companies using them to GitHub itself in the hopes that eventually those companies will somehow be reprimanded.

      Of course, the problems with this strategy are that the manual work scales poorly, there are two or more points of failure, and the best outcome is only marginally positive, doing nothing to deter future abuse.


  • Because the email is part of a Git commit (author and committer information) and your Github repo has public Git commits. "man git-commit" and search for "email" for details.

    • Yes, I'm aware of how commits work, my point is that this kind of practice goes hand in hand with making it easy for spammers to harvest user emails.


  • GitHub behaves similarly to git, which uses emails as a weak form of identity. They’re not meant to be private in git’s model, and having them be public is not a serious problem absent bad actors (who should be banned for violating GitHub’s TOS).

Update: I was honestly hoping to forget about this and move on. And now I receive another email! And while they closed with an apology for the unsolicited spam and promise to unsubscribe me from whatever I didn't subscribe to, they couldn't help but shill a bit more:

  I wanted to give you an update about our name change and share with you our exciting journey, which you can find here: https://openbb.co/blog/gme-didnt-take-me-to-the-moon-but-gamestonk-terminal-did

  TL;DR. We are on a mission to make investment research effective, powerful and accessible for everyone.

  If you are interested in learning more about us, you can subscribe https://openbb.co/newsletter.

This is adding insult to injury. /rant

I was just contemplating this age-old battle of light vs dark when it comes to shady marketing tactics vs everyday consumers. My phone number was somehow picked up by some spammer who sends me 2-3 spam SMS messages per day, all using the same shady tactic ("You have an unsent package", "claim your free gift", etc.) to try to send me to the domains below [1] with a tracking code attached.

How do we, as the people building the platforms these perpetrators ride on, stop them? I reported this one to Cloudflare's abuse form because they're all on Cloudflare's nameservers, and almost guaranteed to be the same owner, but, it took me 5 minutes to fill out the form and they only accept one URL at a time. It's just too time consuming to fight back as an individual consumer.

Every one of us has thought at some point, "there has got to be a better way".... right? So?

[1]

http://needthecbd.com

http://wantafreetv.com

http://careforgreatskin.com

http://valuedcust.com
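
Since the abuse form takes one URL at a time, it helps at least to collapse the links from the messages into a deduplicated host list first. A minimal Python sketch, assuming the links have already been copied out of the SMS messages; the function name is made up for illustration.

```python
from typing import Iterable, List
from urllib.parse import urlsplit


def unique_hosts(urls: Iterable[str]) -> List[str]:
    """Return each distinct hostname once, in first-seen order."""
    seen = set()
    hosts: List[str] = []
    for url in urls:
        # urlsplit().hostname is lowercased and drops path, query and port,
        # so differing tracking codes collapse to the same host.
        host = urlsplit(url.strip()).hostname
        if host and host not in seen:
            seen.add(host)
            hosts.append(host)
    return hosts
```

This only trims the manual work. It does not change the fact that each host still needs its own report.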

  • > How do we, as the people building the platforms these perpetrators ride on, stop them?

    We don't, because there is more money to be made with the way things are set up as-is. And since builders are generally not the ones making the money decisions, the platforms keep being built (technically) badly.

    Take phone-number-based scams: since there is no trust in identity, but money is made per usage, everyone except the end user benefits from keeping everything as-is.

    Scammers get money if a scam succeeds, network providers get money regardless, transit agreements stay up since traffic being passed at volume keeps the lines open, maintained and at size. End-users don't disconnect since they'll need it for legitimate use, so they keep being profitable as well.

    If the money were to stop being made at any point in the chain it would suddenly all be over. But since legitimate and scam usage is mixed that will never happen.

    Replace phones and SS7 and telcos with any other transportation and information system for comparison. Email spam keeps 'working' because there is no real way to identify the sender in such a way that the identity can be barred. Postal spam has the same anonymous sender problem.

    Heck, the best way that does work is having to physically hand flyers to people since you can simply not take the flyer, and since you (as a person) can't be handing out flyers without physically being there you can also be identified and barred.

  • Follow the chain.

    1. Confirm domains are known for phishing and spam.

    2. Figure out who registered said domains.

    3. Add those people to known blacklists so they can't register any more domains ever again. Likewise block all domains already owned by them.

    4. Get domain registrars and email servers to block said domains too.

    5. Rinse and repeat every time it happens.

    6. Find similar accountability chains as above and make sure to close the loop on them each time. "Sorry we can't give out emails and personal details. Fuck you, stop enabling illegal activity." And fight for legislation and tech solutions to enable the above.

    If you can't move to a better spot after identifying bad patterns, then it's just a giant game of useless whack-a-mole.

    • > 3. Add those people to known blacklists so they can't register any more domains ever again. Likewise block all domains already owned by them.

      You generally can't know who operates a given domain automatically. whois is almost always redacted now.

      > 4. Get domain registrars and email servers to block said domains too.

      Good luck with that. They make money from spammers, and don't have any incentive to stop.

      I tried Namecheap twice and provided them with spam carrying valid DKIM signatures for domains registered with them (generally on TLDs on sale for $1, which must be sold at a loss, right?). They refused to do anything about it.

  • > How do we, as the people building the platforms these perpetrators ride on, stop them?

    Every now and then, I get mad while I'm bored and have a bit of free time, and I'll write a script to make requests with randomised tracking codes. I've got about 30 available VPN end points easily available, and I'll cycle through them all sending requests with random ID in whatever the format looks like. It _probably_ makes no difference, but _maybe_ it'll make their data less useful (and if nothing else, I get a bit of satisfaction from doing it.)

    • I stew on this idea all the time and conclude that only scale (lots of users?) would make this effective. But then I ponder the remedy itself being wielded against legitimate parties and I get slightly sad and move on to something else to worry about.


The exact same thing happened to me -- and this is very, very frustrating and makes me want to have nothing to do with OpenBB. Easy enough to block the email, but what a gross thing to do.

When I read a spam e-mail, I add that company to my list of companies never to do business with.