Comment by Retr0id

7 hours ago

I don't have the links handy, but I believe there are some comments from staff on social media that give more details.

Edit: https://hackaday.social/@tindie/116427447318102919

https://hackaday.social/@tindie/116436988752373293

The maker people I know have been migrating away from Tindie because it has felt like a sinking ship for a long time.

I really like the idea of Tindie, so I hope they can succeed. I don't understand what sequence of events led to a problem so large that they can't even keep their site online. The post says something vague about the engineering team hoping the migration work is close to finished, but I can't remember the last time an engineering team knocked an entire site out for days during a failed migration without being able to restore it. Are they outsourcing dev work to the type of agency that bills by the hour and perpetually churns out cheap work to make its money in volume fixing its own code?

  • > The maker people I know have been migrating away from Tindie

    To what? The only alternative I know of is Lectronz.

    • Shopify, Etsy, Crowd Supply, a custom website. All have their problems; I'm not endorsing any of them. I sell on Tindie. Well, I don't sell much there, but I list on Tindie. Most of my sales come through my own store site.


  • It can be as simple as a terraform apply wiping out huge swaths of the backend infra. Getting that back can, depending on how disciplined you are, take on the order of days or weeks.
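
    Not Tindie's actual setup, just a minimal sketch of the discipline that avoids the worst of it, assuming a Terraform-managed backend (the resource names here are hypothetical):

      # A rename in the .tf files reads to Terraform as destroy-and-recreate.
      # Move the state entry instead, so the real resource is never touched:
      terraform state mv aws_db_instance.main aws_db_instance.primary

      # Review a saved plan and look for destroys before applying anything:
      terraform plan -out=tfplan
      terraform show tfplan | grep 'will be destroyed'

      # Apply exactly the plan you reviewed, not a freshly computed one:
      terraform apply tfplan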

You have to wonder why it's so hard to put that on their 503 error page. I suspect something's much more broken than they're letting on.

  • This would indicate that wherever they were hosting their site no longer exists. 503s even on pages that should mostly be static suggest the backend is gone, or whatever ingress they're using in front of it disappeared. As far as I can tell, every single page on their site is 503'ing.

    Example of a response I see:

      < x-cache: Error from cloudfront
      < via: 1.1 bdf85d6d4811ab08c57841855a848f8a.cloudfront.net (CloudFront)
      < x-amz-cf-pop: LAX54-P11
      < x-amz-cf-id: nTQ-y1Ut3F-04jUCDM09ordCtj0CMkVmmtZTe__BtzEr1sMJu7rKaw==
      < age: 76773
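
    For what it's worth, that dump is reproducible with plain curl against any URL on the site; -v prints the response headers to stderr with the same '< ' prefix:

      curl -sv -o /dev/null https://www.tindie.com/ 2>&1 | grep '^< '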

They are putting out a lot of statements where, reading between the lines, it's very obvious to me what led to this, because I've been brought in to clean up messes like this before:

>The goal of the current maintenance is to fix a lot of long-standing issues with the site. The underlying infrastructure was getting very fragile as technical debt accumulated over time. A team is working very hard right now to make sure that once the site is back up, it's on much better footing and will be solid and reliable for the long term. Despite the unfortunate amount of time this is taking, it will be a major benefit to the site in the long run.

They are saying it was "spring cleaning", or a migration that took out the site for days. "Infrastructure getting very fragile" reeks of bad or nonexistent ops practices, and probably very little or unreliable IaC, if any. I've seen shops get by for 10+ years by just clicking things in the console, until unfortunately it gets to this point.
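
The usual way out of that hole, for what it's worth, is to adopt the console-created resources into IaC retroactively rather than rebuilding them. A hedged sketch with Terraform (the resource address and instance ID are made up):

    # Write a matching resource block first, then adopt the existing
    # console-created resource into state instead of recreating it:
    terraform import aws_instance.web i-0abc123def456789a

    # The goal is an empty diff between the config and reality:
    terraform plan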

This, though, rubs me the wrong way:

> We want to offer a much better quality of service going forward. We understand that the lack of communication has been frustrating, and I have been closely watching social media and reporting the community's feelings up the chain, so your voices are being heard. The plan was not to have a long outage like this, but due to factors beyond the dev team's control, things have taken much longer than anticipated. Please be patient with us - I will keep updating here and on our other social media.

"Factors beyond the dev teams control." Sorry, no. If you have an ops team, you don't get to toss blame over the wall like that, and if you don't, you have no one to blame but yourselves. I feel bad for whoever the unofficial official ops dude is right now. These kind of infrastructure "tech debt" woopsies come from years of people just not giving a crap to doing things properly, it's never seen as important until it suddenly is. Hope they learn a lesson and hire an infrastructure guy properly. There's long been a persistent delusion in the pure dev world that they should be able to be completely agnostic to the hardware lying underneath their beautiful code - ideally yes, in practice almost never, unless you come from a place that has the significant resources to make something nice like that, or are willing to pay out the azz for managed cloud services or licenses.

  • It is entirely possible, especially in small companies in my experience, that "factors beyond the dev team's control" means a technical founder with severe myopia and decision fatigue who prevents "complexity" as they see it, which for them means everything you describe here as necessary.

  • I didn't take "the dev team" to exclude ops. Ops folks are usually devs, too.

    • Often, but there are a lot of shops that make them entirely separate, siloed teams, and the symptoms are usually what I'm describing here.

      Most ops guys can do dev; the inverse is absolutely not true, IME.

    • How big of an operation is Tindie? Founder plus one other dev/ops/everything else guy?