Comment by datavirtue
3 months ago
Seriously wondering what you guys experienced with Azure. Never had an issue and prefer it over AWS.
Same here, I prefer Azure to AWS and I’ve spent multiple years with each
I suppose it depends on what you do with it and what you need.
Most of it is not an individual experience or 'event', just bad design with bad results. I'll try to describe some global ones:
One of the most bizarre things is the crazy bad resource hierarchy. There are multiple overlapping and incompatible ones. Resources, networks, storage, IAM, billing and org, none of it in a single universal hierarchy. It seems to mirror the idiosyncrasies of legacy enterprise organisations with their fiefdoms, instead of a cloud.
The next useless thing is how you just cannot use what you need when you need it in whatever way you want it. Almost all services are hyper-segmented requiring various premium tiers instead of them being universally available.
I get it, it's a great way to bundle things people don't want and extract as much money out of them as possible, but that only really works if people have no alternative. And those two form the bad architecture/bad technology trifecta with this third one: a lot of services, maybe most of them, seem like some sort of 2005 model where a resource is backed by nothing more than some random managed VM in the backend, with all the problems (failure modes, inconsistent behaviour etc.) that come with that model.
Perhaps the reason for those things is simple: Microsoft wanted a way to extract more money from their customers and lock them in even more. Moving workloads to Azure meant something different for them than it did for the rest of the world: you used to have a large army of common Windows sysadmin jobs where there was a lot of local control and local management loops, but when you move that to a common template in someone else's datacenter (Azure, essentially) you can ditch most of those loops and people. Granted, they created those local controls/loops themselves to get a school-to-work Microsoft client pipeline (same as say, Cisco or Oracle), but I doubt there are any new markets to cater to in that way. Since people tend to be the most expensive and most risky part of a business, being able to take more of them out of the loop, making more of them optional or making them more expendable/remote is a (short-term) positive thing in the spreadsheets of most MBAs, which is who most large companies cater to after all. This did of course backfire and we now have the same quantity of jobs but instead of sysadmin you get 'azure engineer' which is more of a crossover between operational helpdesk and technical application manager. But everyone wins: during your exodus you can sell it as modernisation, when you remove that on-prem burden you can shift your CAPEX and OPEX around, your quarter looks better when you can reduce headcount, and once your bonus is in, you can put some job postings out for the people you are now missing.
Technology-wise, the only thing that really changed was the ways in which people could cut corners. Some corners are pre-cut, while others are uncuttable. API-wise, it's a crapshoot, a turd attempted to be polished by a webui that hides the maelstrom of questionable residue below the surface.
Re: "There are multiple overlapping and incompatible ones. Resources, networks, storage, IAM, billing and org, none of it in a single universal hierarchy." - hierarchy is based on subscription / resource group. Billing is usually done with tags (you can add a tag like "CostCenter": "Online Marketing CostCenter1234")
Re: "hyper-segmented requiring various premium tiers instead of them being universally available" - premium tier usually means your service runs on its own Azure VMs; while the other tiers your service shares a VM with other customers. The first choice is more expensive obviously and I prefer to pay for that service only if I need it.
BTW - Azure supports bare metal Linux and Windows. So if these pesky Azure services get in your way you can always go back to your on-prem version, where instead of running your workload on your own VMs you run it on Azure VMs.
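To make the tag-based billing idea concrete, here is a minimal sketch of rolling costs up by a "CostCenter" tag. It is plain Python over made-up records, not a call into any Azure API: the resource names, the second cost center and all amounts are hypothetical, and `cost_by_tag` is just an illustrative helper.

```python
from collections import defaultdict

# Hypothetical cost records: names, tag values and amounts are invented
# for illustration and do not come from any real billing export.
resources = [
    {"name": "web-vm-01", "tags": {"CostCenter": "Online Marketing CostCenter1234"}, "cost": 120.0},
    {"name": "blob-store", "tags": {"CostCenter": "Online Marketing CostCenter1234"}, "cost": 30.5},
    {"name": "etl-vm-01", "tags": {"CostCenter": "Data CostCenter5678"}, "cost": 200.0},
    {"name": "orphan-disk", "tags": {}, "cost": 9.5},
]


def cost_by_tag(records, tag_key, untagged_label="(untagged)"):
    """Sum the cost of each record under the value of its `tag_key` tag."""
    totals = defaultdict(float)
    for record in records:
        # Resources without the tag are collected under a separate label
        # so that unattributed spend stays visible.
        label = record["tags"].get(tag_key, untagged_label)
        totals[label] += record["cost"]
    return dict(totals)


totals = cost_by_tag(resources, "CostCenter")
for label, amount in sorted(totals.items()):
    print(f"{label}: {amount:.2f}")
```

The untagged bucket is the part that tends to matter in practice: tag-based billing only works as well as the tagging discipline behind it.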
Preface: don't worry, this is not a rant aimed at you, I just enjoy off-the-cuff writing sometimes ;-)
For your first Re:
That would have been great, but that is just more inconsistency. Some resources exist in resource groups, but some don't and you cannot nest them. IAM has the same problem, you always have to create elements on two sides since Entra is not really an Azure resource, it's parallel to your tenant. Policies for Azure don't exist in Entra, but in MGs and Subscriptions and RGs they do. Those don't affect Entra of course, so now you have two different non-interacting policy trees, except you can reference Entra principals. But not if you want to target STS instead. But you can't always target STS, because that would mean you wouldn't have to buy a premium version of IAM (be it P1 or P2 or PAM). Technically RGs would have never needed to exist if they had their tagging and policy system available from day one.
For your second Re:
Instead of having 1 class of groups or containers, there are many non-interoperable versions. You know who doesn't do that? Everyone else. Same for say, IAM. Principals are principals. Tokens are tokens. Want to authorise something? One universal policy language that can target principals, tokens or a combination. Want to use metadata? That's available too, including tags. Applies on all resources the same way as well. Sure, you'll still not find perfect consistency (looking at you, S3, with a 3rd extra policy option), but there is no artificial distinction or segmentation. There is no 'conditional access' product since we would just call that a policy. There is no 'PAM' product since again, that's just a policy. There is no 'premium' because all features are always available, to everyone. And you know the best part? It's not a parallel tenant construction, it's just part of the same namespace of all other resources. Even Google's weird identity setup treats it all as the same organisational namespace.
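A toy sketch of what I mean by "it's just a policy": one statement type that targets principals, actions and resource tags, evaluated the same way for every resource. This is not any real cloud's policy language; `Statement`, `is_allowed`, the principal name and the action strings are all made up for illustration, and the default-deny plus explicit-deny-wins rule is an assumption of the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    effect: str                  # "allow" or "deny"
    principals: set              # who the statement applies to
    actions: set                 # which actions it covers
    required_tags: dict = field(default_factory=dict)  # tag conditions on the resource


def is_allowed(statements, principal, action, resource_tags):
    """Default deny; any matching deny statement overrides matching allows."""

    def matches(statement):
        return (
            principal in statement.principals
            and action in statement.actions
            and all(resource_tags.get(k) == v for k, v in statement.required_tags.items())
        )

    matched = [s for s in statements if matches(s)]
    if any(s.effect == "deny" for s in matched):
        return False
    return any(s.effect == "allow" for s in matched)


# Hypothetical policy: the deployer may read/write dev storage,
# but never write to anything tagged as frozen.
policy = [
    Statement("allow", {"ci-deployer"}, {"storage:read", "storage:write"}, {"env": "dev"}),
    Statement("deny", {"ci-deployer"}, {"storage:write"}, {"frozen": "true"}),
]

print(is_allowed(policy, "ci-deployer", "storage:write", {"env": "dev"}))
print(is_allowed(policy, "ci-deployer", "storage:write", {"env": "dev", "frozen": "true"}))
```

Something like "conditional access" is then just another condition on a statement rather than a separate product.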
It's not like Microsoft is unaware of all of this, they are digging Azure-flavoured graves (for legacy setups) faster than Google can expand their own graveyard, and some features that were really late to the party (like MGs, RBAC, PIM, tagging scope with policies as well) are not surprising to see. But repairing a large fractured product like Azure is iffy at best. Time will tell.
For the BTW: yeah, everyone can in the end run virtual machines, but a cloud just to run some VMs is a real good way to burn money. The value proposition of a cloud is elasticity and consistent API-driven resources (which includes IAM, policy language and tagging). A web UI that starts and stops a hidden VM is essentially just a VPS and plesk straight out of 2005.
From the way persistence is implemented on Azure, you can pretty much tell it's all just personal templated VMs underneath, which is exactly what I don't want. I don't want a "storage account" that configures a bunch of inflexible subresources. Say I want to store some blobs, I'd want to make a bucket for that and on that bucket I'll do my tagging, policies and parameters (availability, durability etc). And then I want to do it again, but with slightly different parameters. And then I want to do it 100 times again with various parameters. So now I need 100+ storage accounts too? Who thought it would be a good idea to add a storage account as an intermediary? Probably nobody. But the technology wasn't ready, so instead of going with a good idea, they went with "this will fit on the spreadsheet of the sales department" and released it. Below the surface somewhere hidden from the public API, this reserves some SAN for you, as if we're playing datacenter-for-hire in 2005...
You might wonder: why does it matter? It matters when you do a lot of changes every day, not just deployments or cookie cutter rollouts, but many different applications, services and changes to existing resources. Entire environments are created and destroyed with hundreds of resources many times per day per team, and we can't sit around waiting because Azure wants to stop and clean up an instance that they run under the hood, and we definitely don't want to pay (6 to 7 figures) for such a construction. We want to make use of fast public services that provision and scale in seconds and have APIs that will actually do the job instead of timing out and returning internal errors. If a cloud isn't really a cloud, but behaves like a datacenter with Windows PCs in it, it doesn't do enough for us.
I'll admit, after migrating the last users off of Azure, the only remaining ones are not doing anything cloud-native anyway, it's all just staff-type SaaS (think: Intune, M365 and some Dynamics), so the amount of new Azure knowledge and experience for me over the past 6 months is a lot less than it used to be. The period around 2017 was when most stuff in Azure became a bit more usable with RBAC and AZ Policies, but that was like 6 years too late and to this day it is a split world with Entra, yet completely dependent on Entra. Even external identities cannot use STS directly and will have to use static SP credentials. A cursory look at the current docs shows that's still the case (and premium for secure uses). I get it, that's how Microsoft can make more money, but it is technically a bunch of nonsense as other clouds have shown.
This reads like you learned one cloud platform and expected all others to be the same.
Well, regardless of how it reads, that is not the case.