It's hard for me to know whether to feel bad for ES in this case. Did they bring it on themselves? Is Amazon too big and a bully?
From my perspective, Amazon has made most of its profit price gouging consumers on bandwidth after vendor locking them into their ecosystem, where they bootstrap new services by wrapping open source software with some provisioning scripts, management dashboards and cookie-cutter API / console templates. Indeed, most of this is templated -- AFAIU, for example, each AWS service autogenerates its Boto bindings and parts of its console frontend via code generators. Amazon has really mastered the factory process of churning out new services, and when they find a popular one, they can invest more resources into developing it than the original team ever could.
And therein lies the rub. If Amazon is improving the software in a way that the original team couldn't, it's hard to say that the community isn't benefiting. I think what strikes me the wrong way is that Amazon is not doing it for any altruistic reason. In fact, Amazon contributes very little to open source in general, considering how much they take from it. Compare them to Facebook (React, etc) or Google (tons of dev tools) or Microsoft (VSC, TypeScript). What does Amazon have? Firecracker, kind of? And now a fork of ES because that's the only way they could continue making money off it without violating the license a small startup put in place to stop them?
Well, good for Amazon, I suppose, but I find myself instinctively disliking them for this. I'm not sure what the solution is. Hopefully technologies like Kubernetes and Terraform will encourage big customers to become at least cloud-agnostic, if not cloud-independent. At the very least it would be great if Amazon / Google / Microsoft stopped gouging bandwidth at such absurd margins. Or not. Maybe it will be their downfall as startups differentiate along those lines. That would be ironic, coming from the originators of "your margin is my opportunity."
Personally I'm doing my part by not building anything with vendor lock-in. It's great to be able to deploy to any cloud, if you value either robustness or flexibility.
One thing which surprised me: Elastic has a market capitalization of ~$11B.
I think that changes some of the more floaty ethical concerns. This is not a David vs Goliath situation. This is Goliath vs Super-Goliath.
At this point, I'm much less interested in the drama of which mega-corp is screwing over the other. I'm more interested in: how does it affect me? When the titans are done trampling over the rest of us, which side benefits me the most?
It's too early to tell, but it seems like it'll be Amazon. The product is more open. They have a demonstrated history of great support. Yeah, they gouge us on networking and everything else, but at least they're the devil we know, and buying into the OpenSearch ecosystem has a greater probability of being the more open solution into the next decade.
Uhmmm, I’m pretty sure David vs Goliath is about the difference in scale between competitors. Saying that $11B is Goliath just because you’re sitting at $1M doesn’t mean they’re not in a crazy mismatched fight against a $2T company. In the same way, you could be in a David vs Goliath situation yourself if you, with $1M in wealth, tried to sue someone with $25K of wealth. Everything is relative. That doesn’t mean it isn’t a crazy, unfair mismatch that deserves sympathy and regret.
This argument is part of Amazon's PR campaign to tell devs not to feel sorry for Elastic because it's now a big company that makes money in the market. So, if you build successful OSS and start to make money, then it's ethical to clone any OSS and push projects out of the market because now it is "Goliath vs Super-Goliath".
I mean, Elastic was successful because of license arbitrage; to complain about said arbitrage when Amazon does it is ... well, it's hard to feel a lot of sympathy.
Quick point, as yours is valid: these days an $11Bn market cap does not represent cash in the bank for development and R&D. It just reflects what the market thinks it's worth. R&D, of which there has been a lot, and which continues, is hugely expensive.
> they can invest more resources into developing it than the original team ever could.
I know this is a popular narrative, but as someone who works on AWS, I think you would be shocked by how small the individual dev teams are that build and maintain the services that everyone uses.
I'm not going to downplay the network effects involved. Of course AWS has a tremendous advantage in being able to standardize the customer billing, IAM, and EC2 Usage.
And there are economies of scale.
But individual AWS service teams:
* are incredibly lean and focused
* still have to make a profit on their own terms based on the infrastructure they build and the fees they charge customers
* are customer obsessed, laser-focused on solving people's (developers') direct needs.
I understand the community's concern about AWS investment and approach to OSS. But I can assure you (though you have no reason to believe me) that the goal is never to embrace, extend, then extinguish. It's all in the service of going where the customers are, and solving problems that they tell us they have. The profits are a byproduct. The "working backwards" process is no joke. We spend a lot of time figuring out what is the right thing for customers to build, start building it, and THEN we think about how do we make money from it.
> and THEN we think about how do we make money from it
Do you really need to think? Looking at the on-demand pricing in US East, a m5.4xlarge.elasticsearch instance costs $1.133 an hour, while a m5.4xlarge instance costs $0.768 an hour. That's a 47.53% premium. And like you said, it only requires a small team to build and maintain the service.
It is no coincidence that all cloud providers are trying to ramp up their hosted services for open source software, even GCP, who historically only focused on their own proprietary stack. There's a lot of money to be made.
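The premium quoted above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the two hourly prices are the ones given in the comment, while the 730-hour month and the function names are assumptions made for illustration.

```python
# Hourly on-demand prices as quoted in the comment above (US East).
MANAGED_ES_PRICE = 1.133  # m5.4xlarge.elasticsearch
PLAIN_EC2_PRICE = 0.768   # m5.4xlarge

# Assumption: an average month of 730 hours, used only for a rough monthly figure.
HOURS_PER_MONTH = 730


def markup_percent(managed: float, base: float) -> float:
    """Return how much more the managed instance costs, as a percentage of the base price."""
    return (managed - base) / base * 100


def monthly_premium(managed: float, base: float, hours: int = HOURS_PER_MONTH) -> float:
    """Return the extra dollars paid per instance per month for the managed service."""
    return (managed - base) * hours


if __name__ == "__main__":
    print(f"markup: {markup_percent(MANAGED_ES_PRICE, PLAIN_EC2_PRICE):.2f}%")
    print(f"premium per instance-month: ${monthly_premium(MANAGED_ES_PRICE, PLAIN_EC2_PRICE):.2f}")
```

Note that this only compares list prices; it says nothing about what the service costs to operate.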
they have as many people dedicated to the capability as needed. at least the capabilities have owners, which from my experience is not trivial for an organization to achieve.
curious if others have noticed this as well (capabilities without clear owners)? what does this depend on? time? company size? both?
Could you please shed some light on how many people would be behind a product like AWS Lambda or AWS CloudWatch?
As an outsider, I would guess huge swaths of developers with a massive hierarchy. Buildings full of folks working on AWS services. I have no idea and am extremely curious.
Interesting. I wonder if the market can only support managed services when they’re provided directly by a cloud provider like AWS. I assume Elastic has to add margins to cover for their existence, which makes them less competitive with AWS.
This sounds a little unfair, even if I agree with the argument that they’re free to fork OSS software and do whatever the fuck they want.
It sounds trivial to "wrap open source software", but surprisingly it is a big value-add to thousands of companies. We can't just look at successful companies like Netflix to downplay the challenges of operating a service. Not every company knows how to operate complex systems at a manageable cost. How many companies can really manage a Kafka cluster, let alone scale it, for instance? Indeed, even companies that people deem powerful may screw up if they don't get their culture or process right. Take Uber, for example: for five goddamn years, they couldn't offer a service like EC2, let alone support persistent volumes. They still couldn't provision databases on demand via an API. Their MySQL-based NoSQL solution was still based on FriendFeed's architecture and the APIs were hard to use. Yet they spent millions building a k8s replacement, building a GPU database, switching from MySQL to Postgres and back to MySQL, etc. So, yes, cloud companies like AWS build mere control planes to wrap open-source software, yet such a seemingly mundane offering does bring value to many customers.
A key reason for Netflix to have an easy-to-operate infrastructure is that Netflix prioritizes productivity and scalability. They specifically did three things:
1. No fixed deadline, with a few exceptions of course, for platform-related projects.
2. Promotion/salary negotiation was not tied directly to release of external features.
3. A single engineer could be responsible for more than one service for the entire company, with 24x7 oncall.
With Netflix establishing such incentives, engineers naturally focus on getting infrastructure right, to the point that oncall 24x7 is a non-issue.
So, yeah, culture matters, big time.
Edit: another incentive was that a service was measured by its adoption. The more people praised it, the more successful the service would be. Requiring meetings to get buy-in for a new service was considered a sign of potential failure. As a result, every single team focused on making the value proposition of their services obvious. Path of least resistance was a given instead of a debated topic.
This is the most important point, IMO. Amazon's value add is not the software itself, it's the operation of the software. That includes a LOT of stuff, not just making sure it's running. It's security modeling and patching, compliance, DDOS protection, etc. Amazon's product is an army of ops engineers working 24/7 to keep your stuff secure and online.
With that in mind, their behavior here makes a lot more sense, and comparing it with companies who have dramatically different products, like Facebook and Google, takes a lot of effort to understand the differences and what impact they have.
90% of companies do not know how to manage software. They have weird dogmas, no KPIs, and no ability to measure performance or debug problems. This is why they hire external consultants and cloud vendors. What is really funny is how they think internally about these issues. If Netflix and Amazon published efficiency numbers and we were able to compare them with the bottom 95% of tech users, people would be shocked. The difference between the numbers I am aware of (number of computers per engineer, for example) is 100x.
As a longtime ElasticSearch cluster admin/developer and Elastic Cloud customer, I don't feel bad for Elastic in the slightest and I'm psyched about this fork.
The way they operate their cloud service leaves a lot to be desired and encourages maximum spend if you end up wanting to use it for anything demanding in production.
This is really a shame to hear. There was once an Elastic SaaS company from Norway called found.io that was pretty sharp and customer-centric. They were acquired by Elastic pretty early on[1]. I believe Elastic Cloud was built from this. I guess found.io's culture of delivering a good product didn't survive?
This! The company I work for spends over $750,000 with Elastic Cloud every year, and the quality of service we get from them leaves a lot to be desired. I don't feel bad for Elasticsearch corp; they have done this to themselves.
> In fact, Amazon contributes very little to open source in general, considering how much they take from it
I don’t think this is a fact. Amazon seems to contribute pretty significantly according to the pages [0,1] they put out that describes their contributions. Not to mention their membership in OSS foundations like Linux Foundation. [2]
You add the caveat that it's relative to the benefit they gain, but that’s pretty hard to measure, and I don’t think it’s really a good measure.
I’d like to learn more about why you make such an absolute claim and maybe you have some better measure.
I remember back in the 90s when big orgs (Microsoft, IBM) didn’t contribute to open source and can’t even think of any big orgs today that don’t contribute to open source. Even Oracle has big open source projects.
The absolute claim, aside from being at home in a rant on HN, comes from a cursory glance at https://github.com/amzn, weighted by contributors and popularity, and compared to companies of similar size. Google, Microsoft and Facebook all build and maintain multiple open source projects that are hugely popular with people who use them outside of the company sandboxes. For example, people benefit from React without Facebook gaining much directly. (Facebook! If Facebook has any redeeming qualities, it's their open source contributions to the frontend ecosystem, although I promise you I could ascribe malicious intent to those as well...) Contrast that with Amazon. On their GitHub page, I see a few obscure projects amongst a bulbous array of AWS SDKs.
To the sibling comment that asked about Firecracker -- I think Firecracker is awesome, and I did mention that in my original complaint. They even created it themselves! Well, a team of amazing engineers in Romania did. I have no personal insight into the matter, but it seems like they operate relatively independently from the AWS profit machine. Good for them too, it's incredible software. But I'm sure if they were to tell the story of how they got buy-in at Amazon to open source it, the same themes would come up -- how does Amazon benefit from this? In the case of Firecracker, the more people test it / harden it / run Doom on it, the more value Amazon can provide on its serverless platform. So again, unlikely to be purely altruistic intentions... but that's not to say there's anything wrong with that. I just find it all a bit distasteful in aggregate.
I have no idea if Amazon does this or not, but maintaining forks of projects is no fun, so it's in a company's best interest to contribute bug fixes and improvements that aren't part of their secret sauce.
Did the BSDs do it wrong, too? Apple uses a lot of FreeBSD software that they turn around and sell for profit. How about PostgreSQL, didn't Amazon fork that as well? My point is, there's nothing wrong with forks nor companies forking if the license allows for it. It's up to the developers to choose an appropriate license to not be forked/ripped off, if they so desire.
I personally am against modern day corporate America, but I can't blame them for this. The software is given away free/libre/gratis to be forked by whomever.
Perhaps to combat this, one should choose not an "Open Source(TM)" license but a source-available license, e.g. https://mariadb.com/bsl11/ (not my personal favorite, just an example).
Also, I definitely agree with/do the same:
> Personally I'm doing my part by not building anything with vendor lock-in. It's great to be able to deploy to any cloud, if you value either robustness or flexibility.
My favorite "conspiracy theory" is that AWS intentionally creates stupidly verbose and numerous headers in all of their APIs just to up the bandwidth usage a few bytes per request at a time.
People want to pay for services, not software and licenses. They want turn-key solutions that are available via API and GUI, instantly and on-demand. This is the fundamental reason why AWS is so successful and the demand is constantly proven with every new product launch.
Elastic (and other vendors) complaining about this instead of using it for their own success is a problem of their own making. At least a few companies are finally learning.
What ES wants of course is for Amazon to give them a cut of revenue from hosting ES.
We already know Amazon isn't interested in doing that (either at all, or at whatever price ES wanted, we don't know that).
They had no legal requirement to when ES was open source. So ES changed the licensing to no longer be open source.
So, Amazon could... a) decide to give ES a cut after all, b) decide to stop hosting ES, or c) fork the last open source version.
I don't think anyone is surprised they chose c? Presumably ES isn't either? Maybe ES thinks this will be good for them/bad for Amazon anyway, because they are hoping potential customers will abandon the Amazon fork and stay on the original ES fork?
Not sure why they'd be confident in that exactly. Maybe they know what they're doing.
As users/customers, we would rather have a choice of hosted vendors/platforms, and that it remain un-forked (so we can use/write software compatible with either vendor/platform). Competition is good for us as users/customers, that's in fact one of the reasons we choose open source, so no one vendor can set the hosting price all on their own without competition. We want to be able to choose among competitors for hosting, based on price, customer service, performance, uptime, whatever.
But ES didn't want that, they didn't want hosting competition to exist -- at least not without permission and agreed upon cut for them -- because, I guess, hosting was how they planned to make money as a company to fund development as well as profits for investors etc. So they changed their license to no longer allow it. So of the possible outcomes remaining... this one seems as good as any for the user/customer, I guess?
So, when you say "I'm doing my part by not building anything with vendor lock-in" -- I'm not sure which course you are suggesting. In fact, between ElasticSearch and the new OpenSearch fork.. it's OpenSearch that is the one without vendor lock-in, right? OpenSearch is Apache licensed, and can be hosted by any vendor and still forked by anyone. It's ElasticSearch that has a license limiting what vendors can host it (without permission of ES), it's the one with vendor lock-in, right? So not building anything with vendor lock-in means... ?
Good, interesting points. Now Amazon will be the good guy because they will run the open-source version, whereas ElasticSearch will not, if I understand you correctly.
No single capitalist wants competitive markets. They want monopoly, for themselves. It is only when they don't have the monopoly or an easy way to get it that they cry for competitive markets. And that is good of course.
How is it price gouging if the price is on the tin? It isn't like there's a "surprise" as to how much they charge, and it isn't like there aren't a dozen alternatives including DIY.
I'm the first to say that AWS is too expensive, and I vote with my wallet (and the company I work for by proxy). But I'll never claim that there's any gouging involved.
Price gouging is the practice of using outsized leverage in a particular market to charge excessive prices. Like snow shovels doubling in price after a snow storm. Or $10 water bottles after a hurricane.
So for AWS the term is arguably correctly applied.
But I'd be more worried about the market if AWS was artificially undercutting pricing because it would kill the incentive to create competitors or innovation in the space.
"without violating the license a small startup put in place to stop them?"
Elasticsearch was first released over a decade ago. ElasticSearch, now just Elastic, the company was founded over 9 years ago and now is public. Are they still a "small startup"? If so when does a company graduate from that status?
> Amazon is not doing it for any altruistic reason
The beauty of OSS is that motives don't matter. If Amazon contributes and it's not detrimental in some way to the code, then it's a plus for anyone else who wants to use it.
Precisely! While their business itself may need to be broken up, a community governed OSS project isn't bad for OSS when the alternative is a proprietary license that gives a single corp the ability to not contribute back or be exposed to virality.
All this being said, progressive corporate taxes seem more enticing year after year.
When a product that was previously Open Source changes to a non-open license, it's not uncommon for someone to pick up the last Open Source version and fork it, and release that for others to use and collaborate on. That's always going to be a good thing; that means people who care about Open Source licensing continue to have a version to use and collaborate on.
Is it really price gouging for bandwidth? Or is bandwidth just really expensive in general? I honestly don't know. I would assume if it was actually much cheaper one of the clouds would undercut the others to get customers.
It's absolutely price gouging. I'm not going to rant about this for the 100th time, but at least I'm in good company [0]. Do the math on the cost you pay if you saturate 1gbps for a month vs. the cost you pay for 1gbps IP transit at basically any colocation provider.
Really this is the secret sauce of the cloud. Create new abstraction layers where you can charge for logical separation on a physical basis. First VMs, then containers, then serverless... Would be cool if somebody did it with bandwidth (looking at you, Cloudflare). Why can't I buy an elastically sized pipe? Why do I need to pay for the stuff I put through it instead of reserving a size for the time I'll need it?
I'm kind of surprised that people are this upset about how much AWS charges for bandwidth. They may charge more for bandwidth than a colo would, but they're not a colo. With a colo you get a network port and that's it: you provide everything else yourself, with its attendant cost, and you roll that up into your total bill.
If a colo provides you a 1 Gbps connection if you use less, you don't get a refund. And most of the time you don't get 24/7 saturation, or you get charged on some 95th percentile billing, and their networks are almost always oversubscribed anyways.
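For anyone unfamiliar with the 95th percentile billing mentioned above, here is a minimal Python sketch of the usual idea: usage is sampled at regular intervals, the top 5% of samples are discarded, and the highest remaining sample is the billable rate. The nearest-rank method, the sample data and the `PRICE_PER_MBPS` rate are assumptions for illustration; actual contracts differ in sampling interval and pricing.

```python
from typing import Sequence

# Hypothetical rate, chosen only for illustration; real colo contracts vary.
PRICE_PER_MBPS = 0.50  # dollars per Mbps per month (assumption)


def billable_mbps(samples_mbps: Sequence[float]) -> float:
    """Return the 95th percentile of the usage samples (nearest-rank method)."""
    if not samples_mbps:
        raise ValueError("need at least one usage sample")
    ordered = sorted(samples_mbps)
    # Nearest rank = ceil(0.95 * n), computed with integers to avoid float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def monthly_bill(samples_mbps: Sequence[float], price_per_mbps: float = PRICE_PER_MBPS) -> float:
    """Return the monthly charge for the billable (95th percentile) rate."""
    return billable_mbps(samples_mbps) * price_per_mbps


if __name__ == "__main__":
    # 95 quiet intervals plus 5 bursts: the bursts fall inside the discarded top 5%.
    quiet_month = [100.0] * 95 + [1000.0] * 5
    # One more burst interval pushes the burst rate into the billable 95%.
    busy_month = [100.0] * 94 + [1000.0] * 6
    print("quiet month bill:", monthly_bill(quiet_month))
    print("busy month bill:", monthly_bill(busy_month))
```

The point of the scheme is that short bursts are free, but sustained bursts are billed as if they were the baseline.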
AWS is trying to disincentivize using it as a dumb pipe. They want you to use it smartly and if you just want to push static data there are much more cost effective ways to do it, such as CDNs, which are more cost effective for both you AND AWS.
Comparing AWS bandwidth costs and Colo and even other clouds like Oracle isn't fair because different things are associated with that cost.
It really is price gouging - bandwidth is actually cheap.
A couple of comparisons:
Oracle Cloud give you 10TB of bandwidth for free, with overage charged at around €7.5/TB.
You can rent a VPS from the likes of Hetzner, and they will throw in 2-20TB of bandwidth for free, with overage charged at something like €1/TB - AWS charge an eye-watering €125 for each TB!
I think the reason the big 3 (AWS, Azure, GCP) still charge such huge amounts is that they profit so greatly from it, and there is more than enough business to go round.
Why would players in an oligopoly undercut each other when their implicit agreement around pricing makes all of them richer? Also, second tier cloud providers like Oracle give deep discounts and still can't compete with AWAZGO so pricing isn't necessarily a main competitive advantage.
Keep in mind bandwidth gets cheaper as AWS gets bigger. If you are some random tiny colo provider, people don't necessarily care to peer with you unless you pay them for the privilege. If you are originating 20% of internet traffic, now people need to peer with you or their customers won't have a great experience.
IP transit costs something like $350-$700/mo for a Gbit. Amazon are certainly getting better rates, so even with equipment costs I doubt they're spending much more than $0.005/GiB. Their pricing starts at $0.15/GiB. (Not to single out AWS, the other big providers are much the same.)
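The per-GiB figure above can be sanity-checked with a small Python sketch. The $350-$700 transit prices and the $0.15/GiB cloud price are the ones quoted in the comment; the 30-day month, the `utilization` parameter and the function names are assumptions for illustration, and equipment costs are not modeled.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: a 30-day month
GIB = 2 ** 30                       # bytes per GiB

CLOUD_PRICE_PER_GIB = 0.15          # egress price quoted in the comment above


def gib_per_month(gbps: float) -> float:
    """Return how many GiB a link of the given speed moves in a month if fully saturated."""
    bytes_per_second = gbps * 1e9 / 8
    return bytes_per_second * SECONDS_PER_MONTH / GIB


def transit_cost_per_gib(monthly_price: float, gbps: float = 1.0, utilization: float = 1.0) -> float:
    """Return the effective transit cost per GiB at a given average link utilization (0 to 1]."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return monthly_price / (gib_per_month(gbps) * utilization)


if __name__ == "__main__":
    for price in (350, 700):
        for utilization in (1.0, 0.5):
            cost = transit_cost_per_gib(price, utilization=utilization)
            ratio = CLOUD_PRICE_PER_GIB / cost
            print(f"${price}/mo at {utilization:.0%} utilization: "
                  f"${cost:.4f}/GiB, cloud price is {ratio:.0f}x higher")
```

Even at half utilization of the more expensive pipe, the result stays under the $0.005/GiB estimate, which is why that estimate looks conservative for raw transit alone.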
There are also ways out of vendor lock-in. Alternator comes to mind as a way of migrating DynamoDB workloads out to other cloud vendors or your own servers: https://docs.scylladb.com/using-scylla/alternator/
The main thing I look at in this situation is their approaches to security. ES decided authentication is a paid-only feature implemented via closed source proprietary code, and the result has been countless PII leaks; now sure you could say that’s the developers’ fault for making the endpoints internet-accessible, but when the system has been designed from the start to both be insecure and hold PII, you have to place some blame on the provider as well.
Amazon on the other hand developed a free and open source auth plugin and anyone is able to deploy it no problem.
There is absolutely nothing “cloud agnostic” about using Terraform. Every provisioner is specific to a cloud provider. If you are at any scale, moving a k8s cluster is the least of your issues.
It seems reductionist to say Amazon primarily wraps around open source. What about EC2? S3? Glue? DynamoDB? Many of the services that provide the most value are services Amazon has built out.
On top of these, many of the core services that AWS themselves rely on, like SQS, SNS, Kinesis, Lambda, Cloudfront, ECS, Fargate, Elastic Beanstalk are mostly homemade.
This is Amazon's playbook. Make a direct competitor and squeeze the originals out. They did that with jewelry early on, and then anyone they couldn't buy out they would undercut until they capitulated, like diapers.com. The Everything Store by Brad Stone goes over this in detail. Clearly anti-competitive, monopolistic actions are taken constantly by Amazon. The only reason they aren't trust-busted is that the common line of reasoning is that consumers pay less for goods, but this is being looked at, because shouldn't competition be lowering prices? I.e., if Amazon hadn't killed diapers.com, wouldn't diapers be cheaper overall? And the answer is they should be, but the government hasn't caught up. Once they start getting into the weeds, they'll see example after example of monopoly behavior destroying competitors and ultimately raising prices on consumers.
You described the cloud lock in model very well, especially the bandwidth part. What they charge for outbound is nuts, and the other clouds are not much better. Inbound is free of course to make it easy to send your data in but costly to get it out.
As someone who tried buying services from ES and had to deal with their smug sales people, who had total disdain for those who wanted to give ES money, I am happy I will never need to deal with them in future.
Coincidentally, AWS hasn't open-sourced anything that they use internally. Zilch. Nada. And yet they are using 'open source' developed by another firm (smaller is inconsequential here) to market themselves.
IMO, FOSS licensing is completely broken. Its definitions (of what is free/open) are from a boomer's era that is no longer sustainable. At least I wouldn’t want any of my FOSS projects to become the “corporate strategy” of any particular proto-nation.
That made me wonder which megacorp did? Google published a lot of papers about their architecture, but no code. They have many open source projects modeled after internal tools (eg. bazel), but there's no search/GFS/mapreduce/monorepo OSS project from them. MS has VSCode, that's great, and they even open sourced .NET. But Azure is a full black box, just like (almost) every Windows component. (Finally calc.exe is OSS!)
What's worse is the second order repercussions. Future open core/open source SaaSes will go straight to something like the Business Source License or the MongoDB license instead of traditional libre licenses. Amazon has done an incredible amount of damage to the open ecosystem.
"having subsumed the opensource version of ES, we are now relicensing, calling it our own, and would be really happy if the opensource community would like to contribute, because actually we don't totally understand how this product works. Many thanks to all who help us"
The issues with the AWS version are manifold, but the main one is that it forces extra usage of expensive EC2 units, for the following reasons:
1. Blue/Green updates -> Start a new cluster with new version, lock current cluster, copy over all the data (can take over a day), at the same time write to both clusters, when finished, lock both clusters while endpoints are swapped over, unlock new cluster, trash old cluster.
During this process Customer pays for both.
- Solution, done properly, fire up new node with new version, swap it in, wait for 'green', take out old... wait for 'green'.. rinse and repeat. Result.. system never goes down, endpoints remain the same, less cost to customer.
2. There are a minimum of the following node types:
- Master (small) x3
- Data (heavy on storage, medium on memory)
- Hot Data (very heavy on memory as shards have to be held in memory)
- Coordinating (query) nodes, heavy on memory, light on storage (cos there is no real storage)
- Ingest (same as Query).
- voting only.. tiny
- AI/ML heavy on memory and storage.. cos they do real work
In the AWS world they have:
- Masters
- Data
- Hot Data (cos they are really pricey)
The Data nodes do all the functionality of the data, ingest and query nodes. The main query always goes to a data node, which passes it to other data nodes to get the data out, then aggregates locally.. so it's incorrect usage of the system. A data node should only ever deal with its own data, and pass the results it finds back to a node away from the data.
As your system is squeezed.. you add data nodes... not coordinating/Query or Ingest nodes, which you probably need. But that's more money into the coffers.
3. Their user hashing function is also old (a 2011 version of bcrypt), and fails. Any static password produces different results every time you use it.. at least on the current opendistro. So you are forced into either not using security, or using proper security, which can be cumbersome across the cluster (if rolling your own).
However.. on the cloud version they have base level security working, so that's OK.. they aren't using their own 'open' software.
I could go on.. but it's all flaky, and misunderstood at the core developer level. To put on record, I have spoken twice to the core development team to describe how updates should happen.. the last time 18 months ago (by 'spoken' I mean a face to face video call).. so they know how it should be done.. but don't do it.
4. I have noticed that there are some functions/methods available on the setup in Amazon Linux that don't exist on other Linux versions (CentOS/Debian) that are security related. AWS Linux is 'lifted' from Red Hat.. so another piece of software that they didn't write.. but obviously Red Hat are happy with this. Maybe they got the licensing deal sorted.. who knows.
Basically goes like this:
"Who would actually install, in production, what essentially is very close to pirated software?"
Roll your own.. it's a bit of grunt work up front, then 50% the cost of the cloud version.
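To make the cost argument in point 1 above concrete, here is a toy Python simulation of the node-by-node rolling replacement described there, compared with a blue/green copy. It is a sketch under loud assumptions: the `Node` class, the `cluster_is_green` check, the node names and the version strings are all hypothetical, no real Elasticsearch or AWS API is called, and shard relocation is not modeled at all.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    version: str


def cluster_is_green(nodes: List[Node], required: int) -> bool:
    """Hypothetical health check: 'green' here just means enough nodes are present."""
    return len(nodes) >= required


def rolling_upgrade(nodes: List[Node], new_version: str) -> int:
    """Replace nodes one at a time in place and return the peak node count paid for."""
    required = len(nodes)
    peak = required
    for old in list(nodes):  # iterate over a snapshot of the original nodes
        nodes.append(Node(f"{old.name}-new", new_version))  # fire up new node, swap it in
        peak = max(peak, len(nodes))
        assert cluster_is_green(nodes, required)  # wait for 'green'
        nodes.remove(old)                         # take out the old node
        assert cluster_is_green(nodes, required)  # wait for 'green' again
    return peak


def blue_green_peak(node_count: int) -> int:
    """A blue/green upgrade runs a full second cluster alongside the first."""
    return 2 * node_count


if __name__ == "__main__":
    cluster = [Node(f"data-{i}", "old") for i in range(3)]
    print("rolling peak nodes:", rolling_upgrade(cluster, "new"))
    print("blue/green peak nodes:", blue_green_peak(3))
```

In this toy model the rolling approach never pays for more than one extra node at a time, while blue/green pays for double the cluster for as long as the copy takes.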
So VSCode, which is basically slomo Sublime Text for retards and Typescript which is usable only for prototyping are a good enough compensation for having most of their income from Linux (on Win and Azure) after all the almost criminal FUD they poured on Linux[0]? Come on, get real.
Is Starbucks too big and a bully? Sure, they force Mom and Pop shops to close by out-competing them, and that sucks. But bullying? I believe that's just Capitalism.
Or say you're 6'6 and weigh 230LBS and you join a football team full of people who are 5'9 and 175lbs. Are you a bully just because you're bigger?
ES basically handed them a platter with a goose laying golden eggs and a sign that read "Free Goose" and hoped they wouldn't try to make money off the eggs.
> Is Starbucks too big and a bully? Sure, they force Mom and Pop shops to close by out-competing them, and that sucks. But bullying? I believe that's just Capitalism.
I'd say it's more like a mom-and-pop coffee shop giving free coffee to patrons hoping to make money on cookies, and Starbucks coming in, taking the free coffee, opening a nice stand right next to the shop and selling the coffee they got for free.
I think the claim that Amazon is winning through "vendor lock-in" is pretty silly. Honestly anyone who can't quickly migrate the stuff they're hosting on AWS onto one of the many other cloud platforms is pretty bad at DevOps. If you're using K8S/Docker/etc it should be trivial. But even if you're not, the vast majority of AWS offerings were either built to be API-compatible with other existing tools (e.g. postgres-compat Aurora RDS), are literally identical to other services you can self-host (e.g. ElasticSearch) or others have built services compatible with AWS services (e.g. DigitalOcean's "Spaces" aim to be API identical to AWS S3 – you can literally use Amazon's S3 client libs to interact with various S3-compat services from other clouds).
It's not "lock-in", it's providing a great all-in-one solution. You can host everything you want to host on AWS, which has good stability, good latency, etc. People are locked in because the DX around using AWS for everything your platform needs is just better than other platforms / having different services on different cloud providers (at least for many people).
If you're not using any of the AWS services, that might be true but then you're also leaving a lot of potential on the table.
If you're "cloud-agnostic" and could migrate away from AWS in the blink of an eye then you're paying for an overpriced VM offering and should probably migrate to a cheaper hosting provider immediately.
I buy the main reason being the all-in-one solution; the comprehensiveness is attractive. However, I think you're underplaying the lock-in: migrating clouds is non-trivial, mostly due to stuff that's not running in k8s/docker/etc: stateful apps (Postgres, etc.) and/or just static data like S3. This takes time, careful planning and sometimes downtime, and is mostly avoided because it's hard.
Has Amazon done any good to support OSS? Like, ever? They started with cloning MongoDB and now Elastic, with actually zero contribution to the community regardless of their insane profits. This is a clear signal. Amazon can always clone and redistribute any open source software, then lock it in for AWS. If we've started to witness a decline in OSS, well, at least we now know who started the wave.
Do people assume that a company as large as AWS would automatically have a lot to contribute to OSS? Maybe most Amazon teams don't have much to open source yet. Contributing to OSS is a bottom-up effort. An engineer needs to be motivated to generalize her project, to peel the code from Amazon's vast internal infrastructure, and to go through an approval process to open source her project. Given that many teams have razor-sharp focus on delivering features, for good or for bad, I wonder how many engineers are really motivated enough to open source something internal.
I don't think it's a bottom-up process. I feel it's often a top-down process. Most companies with lots of open source activity normally have management that has decided it's something they want to encourage, and then it comes down to people making their code open-sourceable.
You don't need to assume. Amazon's Open Source team regularly talks about how much they do for open source in my company's Slack. It's their own words.
Maybe a dual-company structure like Mozilla's would at least make it clearer that the big guys who are just taking advantage of the free lunch are doing more harm than good?
Elastic should have a for-profit company and a non-profit one taking donations, which would actually control the open source code and hire the core part of the team working on it.
I mean, we know how badly Amazon is behaving here, but at least they should have an option where they could realistically invest.
Asking a company to invest in a competitor that can grow and eat its lunch with that same money is not realistic, not least because the company investing would want to know that the money is actually going back into the open source software and not being used by a competing company.
If, given this choice, they didn't invest back in the foundation, it would be much clearer that they are acting in bad faith.
I like to think, or at least I hope, that OSS isn't going to decline, it's just going to evolve. The existing OSI licenses weren't meant for a world of clouds and SAAS. I believe we'll have to find the next-generation of licenses that can succeed where AGPL tried and failed.
As far as I know DocumentDB is closed source. Btw: I can name zero OSS projects from Amazon and this is not the same for Microsoft (VScode, TS) and FB (React, Jest).
A start might be to open source some technology they use that could benefit and help other startups. Maybe open source their own version of "React". That would be a good start.
I have mixed feelings about this server side license stuff that MongoDB started. Imagine where the internet would be today if the creators of Apache and MySQL had tried to prevent shared hosting providers in the early days of the web from using their software.
The time when Apache or MySQL started out was very different. Imagine where the internet would be if cloud computing itself didn't take off.
Do you remember a time when there were hundreds of hosting providers? Do you remember WebHostingTalk where admins would go to check hosting offers from suppliers around the world?
The monopolies finished that era. So I don't think that software companies trying to adapt now can be seen through the lens of what was 15 years ago.
There are more hosting providers today than in the era you are talking about. AWS has a "monopoly" simply because large companies are using it, and back in the day those companies would have run their own datacenters not used a shared PHP host. For a personal site or startup you have a thousand other options.
Yes, but would we have TimescaleDB, CockroachDB, ElasticSearch, Docker (containers at all) and projects like these if there wasn't any money at the table?
Not saying you're wrong, but it's a multidimensional problem. One could argue AWS is nothing like shared hosting providers (compare scale), and a webserver is essentially "stateless", which means it's easier to build than, say, a database holding sensitive information that doesn't break. I assume this is why Postgres HA and horizontal scaling still really isn't a thing, while CockroachDB, funded by VC, "solved" this problem.
I think it's fair to let companies monetize the service they built, while allowing people to run it on their own if they can. A problem here though is that the company's incentives are mismatched with the open source project they're running: CockroachDB enterprise has killer features that the open source version doesn't have and that no one will be able to PR, because the company will reject it.
TimescaleDB went ahead and opensourced all their features, aligning their incentives with the project, but I don't know of anyone else who has done this.
The server side license stuff doesn't prevent shared hosting providers from using software. It just requires hosting providers to open source their infrastructure.
I think the internet would be even better today, if shared hosting providers had been sharing infrastructure technology since 25 years ago.
I think that's missing the forest for the trees. The license is designed to prevent hosting providers from selling the software as a service to their customers. The requirement to open source their entire infrastructure and operations is just a means to do that.
Why? Dockerized versions of countless server-side software are there for free. In a lot of cases the value is that someone maintains it and operates it for you.
I feel Amazon took the feedback from the DocumentDB/MongoDB fiasco to heart and made positive change in their approach.
DocumentDB is a closed source proprietary database created by Amazon to emulate the MongoDB API. Think Google's Dalvik runtime vs Sun/Oracle's JVM.
This time around we have an open source fork of ES with big backers all contributing and very permissive licensing.
In both cases, Amazon gets to implement AWS-specific upgrades to management to depend heavily on EBS replication rather than application-layer replication. Would it be nice to have that secret sauce that makes Aurora/DocumentDB so nice to use compared to self-hosting or RDS? Of course. Do we have to have it to consider using or contributing to the open source software? No.
On the other hand, MongoDB was already sort of obsolete and trending towards death by the time that all ended up happening, while ElasticSearch is hot and "new".
Where do you get the impression MongoDB is trending towards death? It seems to be growing by some metrics; the stock price has more than doubled in the last year. Not a fan myself, but it still seems a long way from death to me, and they seem to be doing something right in the enterprise market.
It's been said that the best way to fortify your business is to use your clout to make the world inhospitable for adjacent businesses.
As someone who is not currently in the cloud, that idea strikes me as being very pertinent to what's happening here. Increasingly many technologies are becoming cloud-only, or have non-cloud offerings that are decidedly second-class. Elastic offers on-prem support. I doubt Amazon will be doing the same with OpenSearch.
It may be a subtle effect, but it's pushing the world in a direction that makes me uncomfortable. If it's harder for non-cloud-based companies to maintain non-cloud-based offerings, then that will push the industry even more toward being dominated by SaaS products. And these products often leave clients and users locked in, with limited control over their own data, and, by extension, reduced ability to control their own fates. What I worry about is that we may be witnessing a return of Embrace, Extend, Extinguish, only in a new form that's even more dangerous because it's harder to see.
I appreciate the discomfort people have about the SSPL. It is a departure from the original ideas behind FOSS. But, at least as I see it, those open source principles were never an end in and of themselves. They're a means to a greater end: digital autonomy. To the extent that very large companies seem to be learning how to co-opt FOSS in order to re-assert control, FOSS's ability, in its current incarnation, to serve that end may be waning.
Elastic's on prem support amounts to little more than an onsite where they explain to you how the Java Garbage Collector works. EVERY detail about tuning your clusters derives from keeping the Java GC from ruining your day.
There's some index template optimizations but any semi-competent engineer or dba should be able to figure all that out (it's literally all in the documentation about what not to do).
You'll still be able to pay a consultant to come help you -- they don't have to be from Elastic.
In fact, it seems like Amazon just created an industry for third party consultants here.
From the announcement: "You should consider the initial code to be at an alpha stage — it is not complete, not thoroughly tested, and not suitable for production use. We are planning to release a beta in the next few weeks, and expect it to stabilize and be ready for production by early summer (mid-2021)."
Given that Amazon announced the fork in January and they don't expect it to be production-ready until summer, I'm guessing they've underestimated the amount of work required to package and distribute a product as complex as Elasticsearch. Given that, I doubt they will be well-equipped to keep pace with new feature development.
I would question the assumption that this is “not suitable for production use” means “everything is broken and we're way behind” rather than, say, “we are being extremely conservative because our customers will expect support as soon as we say it's production ready and we need to test every upgrade scenario for our large number of existing customers”. The AWS-managed ElasticSearch seems to be pretty popular and I would expect them to be as conservative about new offerings as they are with, say, RDS.
> they've underestimated the amount of work required to package and distribute a product as complex as Elasticsearch
The bulk of the work thus far has been to strip out the non-OSS components ("X-pack") and the many references to it, nothing to do with packaging, distributing, or even maintaining and developing features.
I for one will be happy when those are taken out. So many headaches trying to get bloated Kibana to start as a docker container before realizing that some random x-pack-disable flag needs to be set for it to start without a random error.
I'm not sure I agree with that assessment. Now that the fork is publicly available, others can contribute to get it ready, which wasn't possible until now.
Yes, others can contribute, but significant feature development on large-scale OSS projects tends to be driven by developers paid to work on the project full-time and coordinated by an organized steering committee with clear governance (or company if the product is owned by a single company). I don't see any of that in place for OpenSearch and getting that all started up is not at all a trivial endeavor.
The fork announcement was announced as a response to the Elastic stuff. I don't think they made any predictions about when it'd be ready in that blog post, so I'm not sure why they would've underestimated anything?
I have been perfectly happy with ES cloud services. Is this done by honest intentions from AWS or is it simply based on the fact that ES are making a lot of money of the cloud services?
Well, now nobody can provide a competitor to ES cloud services for newer versions. If you upgrade to v7.11 or above, you're locking in your choice for 'managed hosting for ES/Kibana' to ES cloud services.
> Is this done by honest intentions from AWS or is it simply based on the fact that ES are making a lot of money of the cloud services?
Is wanting to make money considered honest intentions? ES released v7.10 under the Apache license on purpose, so they knew that any form of license change would mean people can still use earlier releases without having to adhere to the SSPL, and anyone could legally fork it or run it on non-ES hardware without having to pay the piper.
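To make the version cutoff concrete, here is a minimal sketch in Python. It assumes, going only on what's said in this thread, that 7.10.x is the last Apache-licensed line and that 7.11 is the first release under the new license. The names `parse_version`, `FIRST_RELICENSED` and `is_apache_licensed` are made up for illustration and are not part of any Elastic or AWS tooling.

```python
def parse_version(version: str) -> tuple:
    """Turn '7.10.2' (or '7.11.0-SNAPSHOT') into a comparable tuple of ints."""
    core = version.split("-", 1)[0]  # drop any pre-release suffix
    return tuple(int(part) for part in core.split("."))


# Assumption taken from the thread: 7.11 is the first relicensed release.
FIRST_RELICENSED = (7, 11)


def is_apache_licensed(version: str) -> bool:
    """True if the version predates the assumed license change."""
    return parse_version(version) < FIRST_RELICENSED


if __name__ == "__main__":
    for v in ("6.8.13", "7.10.2", "7.11.0", "8.0.0-alpha1"):
        print(v, is_apache_licensed(v))
```

Something like this could sit in a dependency check if you want to avoid upgrading past the cutoff by accident, though you'd want to confirm the exact license of the specific distribution you run.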
Will be interesting to see the resources that AWS will throw at this. You can get a sense of the resource that elastic.co is throwing at elasticsearch at
I scrolled back through the commits. It looks like they've been removing traces of x-pack, Elastic branding, licensing checks, etc. since the beginning of March. So far it looks like one person is doing the bulk of all that work.
If there are new features, I haven't seen any. The real question is do they have a team for new feature work that they are putting together or is this just a fork that is doomed to fall behind as Elastic's huge team continues to develop their code base fixing bugs that will never get fixed on the Amazon fork, adding features that will never get fixed, eventually releasing the 8.0 release that has been in the works for two years, etc.
I don't see any evidence that they have that team so far. They're paying a few people to go through the moves of forking but I don't really see a grand vision beyond that so far.
I really need to add a compare feature to my tool as it would make analysis a lot easier. Having said that, there is no denying there is a huge difference in work being done in both projects over the past 30 days.
Amazon does have 16 open pull requests though, with about 7 having 20 or more file changes, but I didn't dig into them to understand their significance. Maybe it's another feature I'll need to work on.
If you look at the one year window for elasticsearch
its churn and activity has been extremely consistent and I'm not sure if this is an investment Amazon can and/or is willing to make.
However, knowing enterprise, I'm not sure if this will make a huge difference as those making the decisions might not really care and they'll just accept whatever Amazon tells them.
Elastic spends so much time and effort on making sure that their search is performant (and they are not shy about deprecating and removing features that are slow). I think this is where Elastic will continue to shine. It's one thing to add features, it's another to make it so they work well and make sure the integrating product team doesn't shoot themselves in the foot.
What may hurt them though, is the number of customers that currently feel things are currently "good enough". I don't know what their sales engagement looks like, so I'm not sure if this will really hurt them or not.
I think even with their "permissive usage guidelines" of the OpenSearch trademark, their own way of how they've been doing Amazon ElasticSearch would not be allowed... For example, you can't do: `Microsoft OpenSearch`, you have to do "for OpenSearch" or "with OpenSearch compatibility".
From their "permissive" trademark policy [1]
> You may also use the “OpenSearch” word mark to make accurate statements about compatibility and interoperability using relational phrases such as “works with,” “runs on,” “compatible with,” and the like (e.g., “Foocorp Software powered by OpenSearch” or “Foocorp Software for OpenSearch” or “Foocorp Software with OpenSearch compatibility”).
Hey All, if you're interested in getting a good understanding of this vs Elasticsearch, we invited the team to give a Haystack LIVE talk where they outlined the details and goals of the project: https://www.youtube.com/watch?v=J_6U1luNScg
OpenSearch was once an initiative founded at A9, an Amazon subsidiary, to create a personalized, cross-service search engine: https://archive.is/PCKWq
OpenSearch is from an era when Amazon and Google were covertly competitive. Google didn't get anywhere with Froogle and AppEngine; whilst Alexa and A9 didn't move any mountains.
I'm happy to see a couple of good choices made here:
- Sticking with Apache 2.0
- Asking for a Developer Certificate-of-Origin rather than a copyright assignment
This bodes well for the future of this fork. Amazon also has the resources necessary to keep up consistent and quality maintenance of a project on this scale.
Elastic would definitely like you to view AWS as the Big Bad here, but their response to the Elastic betrayal is very good, and I would like to see more like this in the future.
I think this thread is much about shared source licenses like SSPL vs. "orthodox" open source licenses like GPL.
Based on the link below it seems to me the difference is that SSPL etc. have a clause which prevents me from making money by selling the use of the licensed software over the network for instance.
GPL puts some rather strict rules on users of copyleft software, mainly that you MUST distribute your modifications with the same license.
What I don't quite get is why adding a rule that says "if you make this software usable over the network you must make it usable for free" would be considered categorically less ethical than GPL.
GPL says you must give out your modifications for free.
SSPL says you must also give out the rights to use that software for free as well.
Isn't SSPL more ethical in the sense that it requires you to give out more for free?
Ethics requires a framework. Just because something is free it doesn't become good. For example free heroin samples!
It's a complex problem to even phrase the question of what we mean by having a healthy software/IT ecosystem. Do we simply count the number of users? GDP of the Internet? Number of git repos? Naturally those don't even begin to capture the self-balancing dynamics we are after. We want to encourage folks to start new ventures, but also to give back. But what if, by giving back, they eliminate old ventures? (E.g. Google "giving back" Chrome might make the Firefox venture non-viable.) How can we describe healthy competition? (It'd be good if the browser market weren't cross-financed from ads, but, let's say, every user told their ISP to direct some of their subscription fee to one of the browser vendors.) Okay, but what does this have to do with licenses!? Yeah, it's a fairly hard problem.
I feel this is a very scary trend starting. I have not come across a single founder in the last 5-6 years who does not start with AWS credits or is not craving for them.
AWS is a monopoly and they use their cash to buy early customers. Initially it was Amazon's money, but now AWS has enough cash of their own to push whatever they wish to. The same goes for Google and Microsoft.
AWS directly building up the software side of what started out as IaaS (Infrastructure as a Service) is only going to hurt software vendors. We can only expect new software players or ones with low capital to restrict their licenses even more.
Open source licenses are not only for ideological freedom, but very necessary for companies (end users) to integrate and modify products on their own. We will migrate more toward source-available licenses instead since big giants are going to corner the small companies.
I find it fine for software to not be open source. Source-available but closed-source (in terms of freedom) is perfectly fine from a commercial standpoint and should be the gold-standard for mission-critical software in the backend. The problem comes from companies touting their software as open source purely for the marketing aspect in order to bring in customers and get free work done by people who don't get paid.
OpenDistro for ES is the surrounding tools for ES => plugins, index-state-management, basically a suite clone of ES Enterprise offerings (X-pack) because AWS can't ship AWS ES with X-Pack.
Open Distro for ElasticSearch was not a fork but rather an Apache 2.0-licensed distribution of Elasticsearch enhanced with enterprise security, alerting, SQL, etc. OpenSearch is a community-driven, open source search and analytics suite derived from Apache 2.0 licensed Elasticsearch 7.10.2 & Kibana 7.10.2.
I want to love elasticsearch and they keep making it harder. On top of this open source backflip, their sales staff would put Oracle to shame. Pressure tactics, making clients pay 5 figures for a few hours of ES consultant time, very expensive training/certification (and the cert expires every 2 years), to name a few.
AWS gave us consultant time for no charge, guess who we picked to run our ES load.
Yeah the original OpenSearch project is a different enough domain that I think confusion will be minimal. We have talked to the maintainer and he is supportive. We have also posted disambiguation in case anyone does get confused. https://opensearch.org/disambiguation.html
In general I'm supportive of an Amazon open source fork here.
But the name re-use is unfortunate.
Amazon's argument seems to be "Don’t worry, we own the trademark for ‘OpenSearch’ cause it came out of Amazon originally, so it’s cool!”
That is really poor stewardship of the trademark on a name that was part of a standard meant to be an open, multi-vendor standard. Amazon owned the trademark to protect its use under that standard, not to re-use it for something totally different, harming the standard further.
But it's just another indication that the original Opensearch, like the era of believing in open web standards for inter-operability that it was part of, is dead.
Turns out that was developed by Amazon according to Wikipedia. So maybe they’re merging that usage into this offering (since that is a spec for search results)?
In all three cases, Lucene is used as a "low level" (Java) API which provides search capabilities. OS, ES and Solr turn Lucene into a server, with features like horizontal scaling (ES Cluster, Solr Cloud). The major differences are in how well that all works, how easy it is to administer, how much caching and optimization is done on top of Lucene, etc.
I haven't extensively used ES, but I've used Solr a lot (and contributed to), and I can say that it's a mess. The community is not one of the better ones I've seen. Bugs and stability issues are often ignored. Patches sit around gathering dust. There are some gems and very clever people in the community, of course, but it seems like there are too few of them to cope with the large beast that Solr has become. If I were starting a new project in the search space, Solr would not make my shortlist.
ElasticSearch seems to have more mindshare. It can be easier to find resources online to help solve your problems, though ES moves through versions fast (and they do break backwards compatibility on major version bumps), so sometimes this can still be an issue.
Another benefit is that you don't have to rely on Zookeeper if you're horizontally scaling.
I don't have a ton of experience with Solr but they seem pretty comparable.
The real vendor lock-in for ES (and now AWS OS) is the REST query API; if Solr implemented ES's API, I bet $1 a lot more people would have moved or at least considered moving over
IIRC Solr also has some weird stuff about schemaless indices, whereas ES took the very Mongo-y approach of "yeah, just throw content at the index, don't worry about it" but then separately the approach of "I am angry with your new conflicting field in that document" and throws an exception; so you don't have to worry about the schema right up until you do have to worry about it
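The "don't worry about the schema right up until you do" behavior is easier to see in a toy model. The Python below is not Elasticsearch code and doesn't call any ES API; `ToyIndex` and `MappingConflict` are hypothetical names, and it compares plain Python types where the real thing uses its own field types and coerces some values. It only illustrates the idea that the first document to use a field fixes its type and later conflicting documents get rejected.

```python
class MappingConflict(Exception):
    """Raised when a document disagrees with the type already mapped for a field."""


class ToyIndex:
    def __init__(self):
        self.mapping = {}  # field name -> type name of the first value seen
        self.docs = []

    def index(self, doc: dict) -> None:
        new_fields = {}
        # Validate the whole document before changing any state.
        for field, value in doc.items():
            seen = type(value).__name__
            expected = self.mapping.get(field, seen)
            if expected != seen:
                raise MappingConflict(
                    f"field {field!r} is mapped as {expected}, got {seen}"
                )
            new_fields[field] = seen
        self.mapping.update(new_fields)
        self.docs.append(doc)


if __name__ == "__main__":
    idx = ToyIndex()
    idx.index({"status": 200})                 # first use maps "status" as int
    idx.index({"status": 404, "path": "/a"})   # new field "path" is added silently
    try:
        idx.index({"status": "ok"})            # conflicting type is rejected
    except MappingConflict as err:
        print("rejected:", err)
```

The design point is that nothing forces you to declare `status` up front, so the conflict only shows up at write time, on whichever document happens to disagree.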
Not familiar with Solr but I believe the analogy to linux would be choosing the preferred distro while they all use the same kernel. There are a bunch of long comparison lists if you search for it.
It's nice that they announced it and that there's some sort of future effort promised. From my perspective, we might hold off on upgrading the elastic-stack (with current Elastic projects) too far, so we don't accidentally become incompatible in case we want to make a switch.
One of the more discouraging aspects of being an OSS developer is that successful companies that use your software never consider contributing to the OSS developers. I suppose that is the nature of business though. Take what you can get.
Are people actually required to use ELK? What are your use cases?
The interface is completely cluttered and it takes loads of resource and it feels like it's waiting to be replaced with lighter and more focused products.
Graylog (though it uses Elasticsearch internally) does a decent job at log handling and creating all the visual items out of logs and Grafana/Loki can do quite good at it as well with a very small memory footprint.
Besides, most of the "business intelligences" aren't actionable but just some visual arts you wouldn't need but to stare at when you're bored.
I wonder if ES had originally been AGPL licensed, would that have helped them? If Amazon adapts AGPL code to integrate it with their own infrastructure, doesn't that in fact mean that all of Amazon's software-based infrastructure would become AGPL as well, and thus easily reproduced by Google Cloud, MS Cloud, Oracle Cloud etc.? Or even in-house? In other words, wouldn't it mean it would be easy to replicate the Amazon Cloud business (on a smaller scale)?
Amazon just wouldn't do that. They would either not offer it as a service, or make a clone from the beginning like they did with MongoDB. In general none of the cloud providers are actually willing to comply with the AGPL license.
Am I missing something here? Elastic says this is a free sw, which you can install and use, but if you want someone else to manage the hosting, we are practically the only option.
How did that become an 'ethical position'? If I am an OSS dev, why should I contribute to them vs OpenSearch?
> and we don’t ask for a contributor license agreement (CLA)
Makes me wonder what these are for (copyright transfer) and why they decided it’s not needed. It also makes me wonder if this sort of thing has ever been taken/tested in court or if it’s paranoid friction with little value add.
> Makes me wonder what these are for (copyright transfer) and why they decided it’s not needed. It also makes me wonder if this sort of thing has ever been taken/tested in court or if it’s paranoid friction with little value add.
Some companies/projects might use them purely to avoid possible future legal headaches (I think GNU does this), and I'm not sure to what degree that has actually been tested, but they can also allow re-licensing under a different license which is more clear cut and I think that's more the issue here
Amazon is trying to say that they'll never relicense the code, so they have no need to take ownership over contributions.
Indeed that is likely, but I wonder why they didn't at least require a Developer's Certificate of Origin [0] that kernel.org uses. This is really lightweight (just append one line to the git commit message) and supposedly provides a minimum legal base for the change. IANAL.
Eric S. Raymond is against them, but also argues that they are harmful--as opposed to just useless--because if they ever got to court, a jurist would look at the practices of the community to decide whether such a thing is common enough that they should be required. [1]
I know GNU does it (at least for Emacs) under the reason that the FSF can go after any GPL violation only if it is the clear copyright holder, but no such case exists, to my knowledge.
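For anyone who hasn't seen a DCO sign-off, the "one line" is a `Signed-off-by:` trailer at the end of the commit message, which `git commit -s` adds for you. Here is a rough Python sketch of what that amounts to; `add_signoff` is a hypothetical helper written for this comment, not something from git or from the OpenSearch repo, and the name and email are placeholders.

```python
def add_signoff(message: str, name: str, email: str) -> str:
    """Append a DCO-style Signed-off-by trailer unless it is already present."""
    trailer = f"Signed-off-by: {name} <{email}>"
    body = message.rstrip("\n")
    lines = body.split("\n")
    if trailer in lines:
        return body + "\n"  # already signed off by this person
    # Trailers form one block: only add a blank line before the first one.
    separator = "\n" if lines[-1].startswith("Signed-off-by: ") else "\n\n"
    return body + separator + trailer + "\n"


if __name__ == "__main__":
    print(add_signoff("Fix typo in README", "Jane Doe", "jane@example.com"))
```

That's the whole mechanism: no copyright changes hands, the contributor just attests to where the change came from.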
A copyright transfer would easily smell to people "Amazon is going to change the license at some point in the future to duck us over Elasticsearch-style". They're trying to avoid that smell.
Anyone know if OpenSearch still uses "/" as a special character? It's the largest PITA when trying to use ES for logging web applications and, quite frankly, made it near unusable.
If Amazon fixed that, I would be firmly on their side. Also, any improvement over Kibana would be welcome.
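In case it helps anyone hitting the same thing: the usual workaround is to escape user-supplied text before it goes into a `query_string` query. The Python below is a sketch under assumptions, not an official client helper. `escape_query_string` is a made-up name, and the character list is my reading of the reserved characters in the Elasticsearch query string syntax docs, where `<` and `>` can't be escaped at all and so are dropped. I haven't checked whether OpenSearch behaves any differently.

```python
# Characters that get a backslash prefix (assumed list, see note above).
ESCAPE = set('\\+-=!(){}[]^"~*?:/&|')
# Characters that cannot be escaped and are simply removed.
STRIP = set("<>")


def escape_query_string(text: str) -> str:
    """Backslash-escape reserved query_string characters in user input."""
    out = []
    for ch in text:
        if ch in STRIP:
            continue
        if ch in ESCAPE:
            out.append("\\")
        out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(escape_query_string("/api/v1/users"))
    print(escape_query_string("GET /index.html?x=1"))
```

Keep in mind the escaped string still has to be JSON-encoded when it goes into the request body, which adds its own layer of backslashes.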
They took over FreeRTOS for good, CBMC for good, with Xen they were a bit unlucky, but it still has much better security than KVM, and now they take over ElasticSearch.
Good Open Source efforts, much better than until a few years ago.
ElasticSearch will need to relatively quickly come out with a feature the OpenSearch doesn’t replicate or people will just use the minimum that both support (see MySQL vs MariaDB).
Just want to mention that "OpenSearch" is/was also an AWS^H^H^HAmazon initiative for websites to expose a search URL to browsers in HTML metadata, similar to exposing an RSS feed URL. They may want to consider renaming it to avoid complete and utter confusion, like searching "OpenSearch" (no the other one) using "OpenSearch" (no the ES fork).
Atlas is a virtual monopoly for Mongo solely due to SSPL, and it has created a ridiculously overpriced ecosystem for hosted and managed services, and tooling around it.
Parking the technical merits to one side, considering the sheer number of devs and early-stage products that are built on Mongo, I'd love for someone to go after them next.
Amazon already has DocumentDB, which clones the Mongo API. I don't think it's forked though; they just use a barely Mongo-compatible wrapper around their own db engine.
True, but it's not quite the same as what they've done with OpenSearch/Elastic. Also, from what I've read, despite claims, the compatibility isn't complete, esp with stuff like aggregations.
There are a few use-cases where you'd want the ability to have a managed/hosted vanilla Mongo setup vs an emulated experience.
Perfect case for a megacorp destroying open source plus business models. I start to hate amazon with a passion. Craziest thing is they are not paying taxes in Europe though they dominate the market.
Amazon needs to be broken up. It's too big and too mighty.
Isn't this an example of an open source business model? Amazon is supporting development of this Apache 2.0 licensed OSS, which they plan to make money off of...
Indeed. But they are not the company that is supposed to make money off that project. They are starting to dominate several markets by cross-financing and therefore need to be broken up.
What is the sell for ES over something like the fulltext search built into Postgres, considering that the cost of adding another dependency is not insignificant?
Maybe Amazon treating open source developers like it treats its blue collar workers will open people's eyes about working conditions in 21st century American capitalism.
On a different note, recently I was looking to learn AWS concepts through online courses. After a lot of research I finally found this e-book on Gumroad, written by Daniel Vassallo, who worked on the AWS team for 10+ years. I found this e-book very helpful as a beginner.
This book covers most of the topics that you need to learn to get started:
You're starting from the wrong place if you're comparing Elasticsearch with a database. And you're also arriving at the wrong place if you think that any database can be distributed.
"Elasticsearch is a distributed, free and open search and analytics engine for all types of data, including textual, numerical, geospatial, structured, and unstructured. Elasticsearch is built on Apache Lucene and was first released in 2010..."
I suggest understanding what it is first before comparing it to other databases.
Elastic gives you a lot of the fancy stuff that SQL kinda needs extras and hard work for... but it's just a document store with fancy weighting features.
Elasticsearch is a no-sql database that optimizes for full-text searches, built atop Apache Lucene. If you're doing any kind of full-text search, for example, if you're trying to index a university library and make it searchable, then elasticsearch is for you. If you're not, I'd look elsewhere.
It's hard for me to know whether to feel bad for ES in this case. Did they bring it on themselves? Is Amazon too big and a bully?
From my perspective, Amazon has made most of its profit price gouging consumers on bandwidth after vendor locking them into their ecosystem, where they bootstrap new services by wrapping open source software with some provisioning scripts, management dashboards and cookie-cutter API / console templates. Indeed, most of this is templated -- AFAIU, for example, each AWS service autogenerates its Boto bindings and parts of its console frontend via code generators. Amazon has really mastered the factory process of churning out new services, and when they find a popular one, they can invest more resources into developing it than the original team ever could.
And therein lies the rub. If Amazon is improving the software in a way that the original team couldn't, it's hard to say that the community isn't benefiting. I think what strikes me the wrong way is that Amazon is not doing it for any altruistic reason. In fact, Amazon contributes very little to open source in general, considering how much they take from it. Compare them to Facebook (React, etc) or Google (tons of dev tools) or Microsoft (VSC, TypeScript). What does Amazon have? Firecracker, kind of? And now a fork of ES because that's the only way they could continue making money off it without violating the license a small startup put in place to stop them?
Well, good for Amazon, I suppose, but I find myself instinctively disliking them for this. I'm not sure what the solution is. Hopefully technologies like Kubernetes and Terraform will encourage big customers to become at least cloud-agnostic, if not cloud-independent. At the very least it would be great if Amazon / Google / Microsoft stopped gouging bandwidth at such absurd margins. Or not. Maybe it will be their downfall as startups differentiate along those lines. That would be ironic, coming from the originators of "your margin is my opportunity."
Personally I'm doing my part by not building anything with vendor lock-in. It's great to be able to deploy to any cloud, if you value either robustness or flexibility.
One thing which surprised me: Elastic has a market capitalization of ~$11B.
I think that changes some of the more floaty ethical concerns. This is not a David vs Goliath situation. This is Goliath vs Super-Goliath.
At this point, I'm much less interested in the drama of which mega-corp is screwing over the other. I'm more interested in: how does it affect me? When the titans are done trampling over the rest of us, which side benefits me the most?
It's too early to tell, but it seems like it'll be Amazon. The product is more open. They have a demonstrated history of great support. Yeah, they gouge us on networking and everything else, but at least they're the devil we know, and buying into the OpenSearch ecosystem has a greater probability of being the more open solution into the next decade.
Uhmmm I’m pretty sure David vs Goliath is talking about scale between competition. Saying that $11B is Goliath just because you’re sitting at $1M doesn’t mean they’re not in a crazy mismatched fight against a $2T company. In the same way you could be in a David vs Goliath situation yourself if you with $1M in wealth tried to sue someone with $25K of wealth. Everything is relative. Doesn’t mean it’s not a crazy unfair mismatch that doesn’t deserve sympathy and regret.
This argument is part of Amazon's PR campaign to tell devs not to feel sorry for Elastic because it's now a big company and they make money in the market. So, if you built a successful OSS project and started to make money, then it's ethical to clone any OSS and push projects out of the market because now it is "Goliath vs Super-Goliath".
I mean, Elastic was successful because of license arbitrage; to complain about said arbitrage when Amazon does it is ... well, it's hard to feel a lot of sympathy.
Quick point, as yours is valid: these days an $11Bn market cap does not represent cash in the bank for development and R&D. It just reflects what the market thinks it's worth. R&D, of which there has been a lot and which continues, is hugely expensive.
After a decade and a third of QE, 11 bil is not really serious money. These days the unit of power on the capital markets is a trillion.
> they can invest more resources into developing it than the original team ever could.
I know this is a popular narrative, but as someone who works on AWS, I think you would be shocked by how small the individual dev teams are that build and maintain the services that everyone uses.
I'm not going to downplay the network effects involved. Of course AWS has a tremendous advantage in being able to standardize the customer billing, IAM, and EC2 Usage.
And there are economies of scale.
But individual AWS service teams are:
* incredibly lean and focused
* still have to make a profit on their own terms based on the infrastructure they build and the fees they charge customers
* laser customer obsessed to solve people's (developer's) direct needs.
I understand the community's concern about AWS investment and approach to OSS. But I can assure you (though you have no reason to believe me) that the goal is never to embrace, extend, then extinguish. It's all in the service of going where the customers are, and solving problems that they tell us they have. The profits are a byproduct. The "working backwards" process is no joke. We spend a lot of time figuring out what is the right thing for customers to build, start building it, and THEN we think about how do we make money from it.
I was nodding along until this:
> and THEN we think about how do we make money from it
Do you really need to think? Looking at the on-demand pricing in US East, a m5.4xlarge.elasticsearch instance costs $1.133 an hour, while a m5.4xlarge instance costs $0.768 an hour. That's 47.53% of extra money. And like you said, it only requires a small team to build and maintain the service.
It is no coincidence that all cloud providers are trying to ramp up their hosted services for open source software, even GCP, who historically only focused on their own proprietary stack. There's a lot of money to be made.
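For anyone who wants to check the markup figure quoted above, here is a minimal sketch in Python. The two hourly prices are the ones given in the comment (US East on-demand at the time) and are taken on trust, not re-checked; the function name `markup_percent` is made up for illustration.

```python
def markup_percent(managed_price: float, raw_price: float) -> float:
    """Return how much more the managed instance costs, as a percentage of the raw price."""
    return (managed_price / raw_price - 1.0) * 100.0


# Hourly prices as quoted in the comment above (assumed, not re-checked).
ES_HOURLY = 1.133   # m5.4xlarge.elasticsearch
EC2_HOURLY = 0.768  # m5.4xlarge

premium = markup_percent(ES_HOURLY, EC2_HOURLY)
print(f"premium over the plain instance: {premium:.2f}%")  # prints 47.53%
```

This only compares list prices; it says nothing about what it costs AWS to operate the managed service.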
In the world of OSS, code speaks. A small team may not help the cause of Open Search. This article nails how code influences leadership in an OSS project: http://hypercritical.co/2013/04/12/code-hard-or-go-home
they have as many people dedicated to the capability as needed. at least the capabilities have owners, which from my experience is not trivial for an organization to achieve.
curious if others have noticed this as well (capabilities without clear owners)? what does this depend on? time? company size? both?
Could you please shed some light on how many people would be behind a product like AWS Lambda or AWS CloudWatch?
As an outsider, I would guess huge swaths of developers with a massive hierarchy. Buildings full of folks working on AWS services. I have no idea and am extremely curious.
Interesting. I wonder if the market can only support managed services when it’s provided directly by the cloud provider like AWS. I assume Elastic has to add margins to cover for their existence, which make them less competitive with AWS.
This sounds a little unfair, even if I agree with the argument that they’re free to fork OSS software and do whatever the fuck they want.
It sounds trivial to "wrap open source software", but surprisingly it is a big value-add to thousands of companies. We can't just look at successful companies like Netflix to downplay the challenges of operating a service. Not every company knows how to operate complex systems at a manageable cost. How many companies can really manage a Kafka cluster, let alone scale it, for instance? Indeed, even companies that people deem powerful may screw up if they don't get their culture or process right. Take Uber for example: for five god damn years, they still couldn't offer a service like EC2, let alone support persistent volumes. They still couldn't make their database provisioning on demand via an API. Their MySQL-based NoSQL solution was still based on FriendFeed's architecture and the APIs were hard to use. Yet they spent millions building a k8s replacement, building a GPU database, switching from MySQL to Postgres and back to MySQL, etc. So, yes, cloud companies like AWS build mere control planes to wrap open-source software, yet such a seemingly mundane offering does bring value to many customers.
A key reason for Netflix to have an easy-to-operate infrastructure is that Netflix prioritizes productivity and scalability. They specifically did three things:
1. No fixed deadline, with a few exceptions of course, for platform-related projects.
2. Promotion/salary negotiation was not tied directly to release of external features.
3. A single engineer could be responsible for more than one service for the entire company, with 24x7 oncall.
With Netflix establishing such incentives, engineers naturally focus on getting infrastructure right, to the point that oncall 24x7 is a non-issue.
So, yeah, culture matters, big time.
Edit: another incentive was that a service was measured by its adoption. The more people praised it, the more successful the service would be. Requiring meetings to get buy-in for a new service was considered a sign of potential failure. As a result, every single team focused on making the value proposition of their services obvious. Path of least resistance was a given instead of a debated topic.
This is the most important point, IMO. Amazon's value add is not the software itself, it's the operation of the software. That includes a LOT of stuff, not just making sure it's running. It's security modeling and patching, compliance, DDOS protection, etc. Amazon's product is an army of ops engineers working 24/7 to keep your stuff secure and online.
With that in mind, their behavior here makes a lot more sense, and comparing it with companies who have dramatically different products, like Facebook and Google, takes a lot of effort to understand the differences and what impact they have.
90% of companies do not know how to manage software. They have weird dogmas, no KPIs, no ability to measure performance or debug problems. This is why they have external consultants and cloud vendors. What is really funny is how they think internally about these issues. If Netflix and Amazon were publishing efficiency numbers and we were able to compare them with the bottom 95% of tech users, people would be shocked. The difference between the numbers I am aware of (number of computers per engineer, for example) is 100x.
As a longtime ElasticSearch cluster admin/developer and Elastic Cloud customer, I don't feel bad for Elastic in the slightest and I'm psyched about this fork.
The way they operate their cloud service leaves a lot to be desired and encourages maximum spend if you end up wanting to use it for anything demanding in production.
Not to mention the support is pretty slow to respond and rarely responds adequately.
At my current client we started with their cloud and in a few months deployed our own in Kubernetes using the official operator.
It's a lot cheaper and gives us fewer headaches this way.
This is really a shame to hear. There was once an Elastic SaaS company from Norway called found.io that was pretty sharp and customer-centric. They were acquired by Elastic pretty early on[1]. I believe Elastic Cloud was built from this. I guess found.io's culture of delivering a good product didn't survive?
https://www.elastic.co/about/press/elastic-acquires-elastics...
> The way they operate their cloud service leaves a lot to be desired and encourages maximum spend
Pretty sure this is the definition of SAAS and IAAS
This! The company I work for spends over $750,000 with Elastic Cloud every year and the quality of service we get from them leaves a lot to be desired. I don't feel bad for Elasticsearch corp, they have done this to themselves.
> In fact, Amazon contributes very little to open source in general, considering how much they take from it
I don’t think this is a fact. Amazon seems to contribute pretty significantly according to the pages [0,1] they put out that describes their contributions. Not to mention their membership in OSS foundations like Linux Foundation. [2]
You have the caveat about in relation to benefit they gain, but that’s pretty hard to measure. And I think isn’t really a good measure.
I’d like to learn more about why you make such an absolute claim and maybe you have some better measure.
I remember back in the 90s when big orgs (Microsoft, IBM) didn’t contribute to open source and can’t even think of any big orgs today that don’t contribute to open source. Even Oracle has big open source projects.
[0] https://amzn.github.io/ [1] https://aws.amazon.com/opensource/ [2] https://www.linuxfoundation.org/en/join/members/
The absolute claim, aside from being at home in a rant on HN, comes from a cursory glance at https://github.com/amzn, weighted by contributors and popularity, and compared to companies of similar size. Google, Microsoft and Facebook all build and maintain multiple open source projects that are hugely popular with people who use them outside of the company sandboxes. For example, people benefit from React without Facebook gaining much directly. (Facebook! If Facebook has any redeeming qualities, it's their open source contributions to the frontend ecosystem, although I promise you I could ascribe malicious intent to those as well...) Contrast that with Amazon. On their GitHub page, I see a few obscure projects amongst a bulbous array of AWS SDKs.
To the sibling comment that asked about Firecracker -- I think Firecracker is awesome, and I did mention that in my original complaint. They even created it themselves! Well, a team of amazing engineers in Romania did. I have no personal insight into the matter, but it seems like they operate relatively independently from the AWS profit machine. Good for them too, it's incredible software. But I'm sure if they were to tell the story of how they got buy-in at Amazon to open source it, the same themes would come up -- how does Amazon benefit from this? In the case of Firecracker, the more people test it / harden it / run Doom on it, the more value Amazon can provide on its serverless platform. So again, unlikely to be purely altruistic intentions... but that's not to say there's anything wrong with that. I just find it all a bit distasteful in aggregate.
To me it looks like they mostly contribute to projects that are SDKs to use AWS.
Yeah and also what about projects like Firecracker?
https://github.com/firecracker-microvm/firecracker/
I have no idea if Amazon does this or not, but maintaining forks of projects is no fun, so it's in a company's best interest to contribute bug fixes and improvements that aren't part of their secret sauce.
Did the BSDs do it wrong, too? Apple uses a lot of FreeBSD software that they turn around and sell for profit. How about PostgreSQL, didn't Amazon fork that as well? My point is, there's nothing wrong with forks nor companies forking if the license allows for it. It's up to the developers to choose an appropriate license to not be forked/ripped off, if they so desire.
I personally am against modern day corporate America, but I can't blame them for this. The software is given away free/libre/gratis to be forked by whomever.
Perhaps to combat this, one should choose not an "Open Source(TM)" license, but a source-available license. E.g. https://mariadb.com/bsl11/ (not my personal favorite, just an example).
Also, I definitely agree with/do the same:
> Personally I'm doing my part by not building anything with vendor lock-in. It's great to be able to deploy to any cloud, if you value either robustness or flexibility.
My favorite "conspiracy theory" is that AWS intentionally creates stupidly verbose and numerous headers in all of their APIs just to up the bandwidth usage a few bytes per request at a time.
You don’t pay for the bandwidth of API calls to AWS
It is quite funny: Elasticsearch is also kind of wrapping an open source library (Apache Lucene) and selling it as their own product.
Very good point.
Could Elastic's business model survive if lucene adopted the SSPL license that Elastic has, saying it's the "spirit" of open source?
That is an interesting question.
People want to pay for services, not software and licenses. They want turn-key solutions that are available via API and GUI, instantly and on-demand. This is the fundamental reason why AWS is so successful and the demand is constantly proven with every new product launch.
Elastic (and other vendors) complaining about this instead of using it for their own success is a problem of their own making. At least a few companies are finally learning.
What ES wants of course is for Amazon to give them a cut of revenue from hosting ES.
We already know Amazon isn't interested in doing that (either at all, or at whatever price ES wanted, we don't know that).
They had no legal requirement to when ES was open source. So ES changed the licensing to no longer be open source.
So, Amazon could... a) decide to give ES a cut after all, b) decide to stop hosting ES, or c) fork the last open source version.
I don't think anyone is surprised they chose c? Presumably ES isn't either? Maybe ES thinks this will be good for them/bad for Amazon anyway, because they are hoping potential customers will abandon the Amazon fork and stay on the original ES fork?
Not sure why they'd be confident in that exactly. Maybe they know what they're doing.
As users/customers, we would rather have a choice of hosted vendors/platforms, and that it remain un-forked (so we can use/write software compatible with either vendor/platform). Competition is good for us as users/customers, that's in fact one of the reasons we choose open source, so no one vendor can set the hosting price all on their own without competition. We want to be able to choose among competitors for hosting, based on price, customer service, performance, uptime, whatever.
But ES didn't want that, they didn't want hosting competition to exist -- at least not without permission and agreed upon cut for them -- because, I guess, hosting was how they planned to make money as a company to fund development as well as profits for investors etc. So they changed their license to no longer allow it. So of the possible outcomes remaining... this one seems as good as any for the user/customer, I guess?
So, when you say "I'm doing my part by not building anything with vendor lock-in" -- I'm not sure which course you are suggesting. In fact, between ElasticSearch and the new OpenSearch fork, it's OpenSearch that is the one without vendor lock-in, right? OpenSearch is Apache licensed, and can be hosted by any vendor and still forked by anyone. It's ElasticSearch that has a license limiting which vendors can host it (without permission of ES); it's the one with vendor lock-in, right? So not building anything with vendor lock-in means... ?
Good interesting points. Now Amazon will be the good guy because they will run open-source version, whereas ElasticSearch is not, if I understand you correctly.
No single capitalist wants competitive markets. They want monopoly, for themselves. It is only when they don't have the monopoly or an easy way to get it that they cry for competitive markets. And that is good of course.
How is it price gouging if the price is on the tin? It isn't like there's a "surprise" as to how much they charge, and it isn't like there aren't a dozen alternatives including DIY.
I'm the first to say that AWS is too expensive, and I vote with my wallet (and the company I work for by proxy). But I'll never claim that there's any gouging involved.
Price gouging is the practice of using outsized leverage in a particular market to charge excessive prices. Like snow shovels doubling in price after a snow storm. Or $10 water bottles after a hurricane.
So for AWS the term is arguably correctly applied.
But I'd be more worried about the market if AWS was artificially undercutting pricing because it would kill the incentive to create competitors or innovation in the space.
"without violating the license a small startup put in place to stop them?"
Elasticsearch was first released over a decade ago. ElasticSearch, now just Elastic, the company was founded over 9 years ago and now is public. Are they still a "small startup"? If so when does a company graduate from that status?
> Amazon is not doing it for any altruistic reason
The beauty of OSS is that motives don't matter. If Amazon contributes and it's not detrimental in someway to the code, then it's a plus for anyone else who wants to use it.
Precisely! While their business itself may need to be broken up, a community governed OSS project isn't bad for OSS when the alternative is a proprietary license that gives a single corp the ability to not contribute back or be exposed to virality.
All this being said, progressive corporate taxes seem more enticing year after year.
When a product that was previously Open Source changes to a non-open license, it's not uncommon for someone to pick up the last Open Source version and fork it, and release that for others to use and collaborate on. That's always going to be a good thing; that means people who care about Open Source licensing continue to have a version to use and collaborate on.
Is it really price gouging for bandwidth? Or is bandwidth just really expensive in general? I honestly don't know. I would assume that if it was actually much cheaper, one of the clouds would undercut the others to get customers.
It's absolutely price gouging. I'm not going to rant about this for the 100th time, but at least I'm in good company [0]. Do the math on the cost you pay if you saturate 1gbps for a month vs. the cost you pay for 1gbps IP transit at basically any colocation provider.
Really this is the secret sauce of the cloud. Create new abstraction layers where you can charge for logical separation on a physical basis. First VMs, then containers, then serverless... Would be cool if somebody did it with bandwidth (looking at you, Cloudflare). Why can't I buy an elastically sized pipe? Why do I need to pay for the stuff I put through it instead of reserving a size for the time I'll need it?
[0] https://twitter.com/eastdakota/status/1371252709836263425
I'm kind of surprised that people are this upset about how much AWS charges for bandwidth. They may charge more for bandwidth than a colo would, but they're not a colo. At a colo you get a network port and that's it: you provide everything else yourself, with its attendant cost, and you roll that up into your total bill.
If a colo provides you a 1 Gbps connection and you use less, you don't get a refund. And most of the time you don't get 24/7 saturation, or you get charged on some 95th percentile billing, and their networks are almost always oversubscribed anyways.
AWS is trying to disincentivize using it as a dumb pipe. They want you to use it smartly and if you just want to push static data there are much more cost effective ways to do it, such as CDNs, which are more cost effective for both you AND AWS.
Comparing AWS bandwidth costs and Colo and even other clouds like Oracle isn't fair because different things are associated with that cost.
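Since "95th percentile billing" comes up above, here is a rough sketch of how that kind of burstable billing is usually described: sort the month's throughput samples, discard the top 5%, and bill at the highest remaining sample. The sampling interval, the nearest-rank rule and the function name `billable_mbps_95th` are all assumptions for illustration; real colo contracts vary in the details.

```python
from typing import Sequence


def billable_mbps_95th(samples_mbps: Sequence[float]) -> float:
    """Return the 95th percentile sample (nearest-rank), i.e. the rate that gets billed."""
    if not samples_mbps:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_mbps)
    # Nearest-rank percentile: ceil(0.95 * n), done in integers to avoid float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Hypothetical month reduced to 100 samples: mostly 100 Mbps, with five bursts to 900 Mbps.
samples = [100.0] * 95 + [900.0] * 5
print(billable_mbps_95th(samples))  # the five bursts fall in the discarded top 5%
```

With one more burst sample in the same month, the billed rate would jump to the burst rate, which is why short spikes are free under this scheme but sustained load is not.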
It really is price gouging - bandwidth is actually cheap.
A couple of comparisons:
Oracle Cloud give you 10TB of bandwidth for free, with overage charged at around €7.5/TB.
You can rent a VPS from the likes of Hetzner, and they will throw in 2-20TB of bandwidth for free, with overage charged at something like €1/TB - AWS charge an eye-watering €125 for each TB!
I think the reason the big 3 (AWS, Azure, GCP) still charge such huge amounts is that they profit so greatly from it, and there is more than enough business to go round.
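To put the per-TB figures above side by side, here is a small sketch using only the numbers quoted in this comment (free allowance plus overage rate). The 50 TB workload is a made-up example, Hetzner is taken at the upper end of the quoted 2-20TB allowance, and actual price lists may differ.

```python
def egress_cost_eur(tb_used: float, free_tb: float, eur_per_tb: float) -> float:
    """Cost of outbound traffic: only the part above the free allowance is billed."""
    billable_tb = max(0.0, tb_used - free_tb)
    return billable_tb * eur_per_tb


# (free allowance in TB, overage in EUR per TB), as quoted in the comment above.
PLANS = {
    "Oracle Cloud": (10.0, 7.5),
    "Hetzner": (20.0, 1.0),   # assumption: upper end of the quoted 2-20TB range
    "AWS": (0.0, 125.0),
}

MONTHLY_TB = 50.0  # hypothetical workload
for name, (free_tb, rate) in PLANS.items():
    print(f"{name}: EUR {egress_cost_eur(MONTHLY_TB, free_tb, rate):.2f}")
```

At 50 TB a month the quoted rates work out to EUR 300 on Oracle Cloud, EUR 30 on Hetzner and EUR 6250 on AWS.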
Why would players in an oligopoly undercut each other when their implicit agreement around pricing makes all of them richer? Also, second tier cloud providers like Oracle give deep discounts and still can't compete with AWAZGO so pricing isn't necessarily a main competitive advantage.
Keep in mind bandwidth gets cheaper as AWS gets bigger. If you are some random tiny colo provider, people don't necessarily care to peer with you unless you pay them for the privilege. If you are originating 20% of internet traffic, now people need to peer with you or their customers won't have a great experience.
If you think AWS is expensive, give Azure a go and be appalled.
IP transit costs something like $350-$700/mo for a Gbit. Amazon are certainly getting better rates, so even with equipment costs I doubt they're spending much more than $0.005/GiB. Their pricing starts at $0.15/GiB. (Not to single out AWS, the other big providers are much the same.)
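Here is the "do the math" exercise as a quick sketch, using only the figures from this comment: $350-$700/mo for a Gbit of transit and $0.15/GiB for cloud egress. It assumes a 30-day month and a fully saturated link, which is the best case for transit; the names are made up for illustration.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: 30-day month
GIB = 2**30


def gib_per_month(gbit_per_s: float) -> float:
    """Data volume in GiB moved by a link saturated at the given rate for a month."""
    bytes_per_s = gbit_per_s * 1e9 / 8
    return bytes_per_s * SECONDS_PER_MONTH / GIB


volume = gib_per_month(1.0)
cloud_bill = volume * 0.15            # at the quoted $0.15/GiB
transit_low = 350 / volume            # $/GiB if the port costs $350/mo
transit_high = 700 / volume           # $/GiB if the port costs $700/mo

print(f"volume: {volume:,.0f} GiB")
print(f"cloud egress bill: ${cloud_bill:,.0f}")
print(f"transit: ${transit_low:.4f} to ${transit_high:.4f} per GiB")
```

That is roughly 300,000 GiB a month, so about $45,000 at the quoted egress price against a few hundred dollars for the transit port. The per-GiB transit figure comes out well under half a cent, which leaves room for the equipment costs mentioned above.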
There are also ways out of vendor lock-in. Alternator comes to mind as a way of migrating DynamoDB workloads out to other cloud vendors or your own servers: https://docs.scylladb.com/using-scylla/alternator/
The main thing I look at in this situation is their approaches to security. ES decided authentication is a paid-only feature implemented via closed source proprietary code, and the result has been countless PII leaks; now sure you could say that’s the developers’ fault for making the endpoints internet-accessible, but when the system has been designed from the start to both be insecure and hold PII, you have to place some blame on the provider as well.
Amazon on the other hand developed a free and open source auth plugin and anyone is able to deploy it no problem.
There is absolutely nothing “cloud agnostic” about using Terraform. Every provisioner is specific to a cloud provider. If you are at any scale, moving a k8s cluster is the least of your issues.
It seems reductionist to say Amazon primarily wraps around open source. What about EC2? S3? Glue? DynamoDB? Many of the services that provide the most value are services Amazon has built out.
On top of these, many of the core services that AWS themselves rely on, like SQS, SNS, Kinesis, Lambda, CloudFront, ECS, Fargate and Elastic Beanstalk, are mostly homemade.
EC2 uses the KVM hypervisor.
This is Amazon's playbook. Make a direct competitor and squeeze the originals out. They did that with jewelry early on, and then anyone they couldn't buy out they would undercut until they capitulated, like diapers.com. The Everything Store by Brad Stone goes over this in detail. Clearly anti-competitive monopolistic actions are taken constantly by Amazon. The only reason they aren't trust-busted is that the common line of reasoning is that consumers pay less for goods, but this is being looked at, because shouldn't competition be lowering prices? I.e., if Amazon hadn't killed diapers.com, wouldn't diapers be cheaper overall? And the answer is they should be, but the government hasn't caught up. Once they start getting into the weeds, they'll see example after example of monopoly behavior destroying competitors and ultimately raising prices on consumers.
Here are the main reasons to make open source software:
To provide something for free, public use
To get the world to help you maintain your software
A reason to not make open source software:
You want the exclusive right to offer that software as part of a paid service
This is not the first instance of a company not understanding the last point there, and it won't be the last.
You described the cloud lock in model very well, especially the bandwidth part. What they charge for outbound is nuts, and the other clouds are not much better. Inbound is free of course to make it easy to send your data in but costly to get it out.
As someone who tried buying services from ES and had to deal with their smug sales people that had a total disdain to those who wanted to give ES money, I am happy I will never need to deal with them in future.
Coincidentally, AWS hasn't open-sourced anything that they use internally. Zilch. Nada. And yet they are using 'open source' developed by another firm (smaller is inconsequential here) to market themselves.
IMO, FOSS licensing is completely broken. Its definitions (of what is free/open) are from a boomer's era that is no longer sustainable. At least I wouldn’t want any of my FOSS projects become “corporate strategy” of any particular proto-nation.
That made me wonder which megacorp did? Google published a lot of papers about their architecture, but no code. They have many open source projects modeled after internal tools (eg. bazel), but there's no search/GFS/mapreduce/monorepo OSS project from them. MS has VSCode, that's great, and they even open sourced .NET. But Azure is a full black box, just like (almost) every Windows component. (Finally calc.exe is OSS!)
> Coincidentally, AWS hasn't open-sourced anything that they use internally. Zilch. Nada.
Firecracker. s2n.
Not much, I agree
gplv3 was supposed to fix this though no?
What's worse is the second order repercussions. Future open core/open source SaaSes will go straight to something like the Business Source License or the MongoDB license instead of traditional libre licenses. Amazon has done an incredible amount of damage to the open ecosystem.
Good. So the VCs will go back to playing like in the old days of shareware software licenses. It'll be good.
Only good one I can think of is Amazon Java, not sure how essential it is though
I am with you. The release could read:
"having subsumed the opensource version of ES, we are now relicensing, calling it our own, and would be really happy if the opensource community would like to contribute, because actually we don't totally understand how this product works. Many thanks to all who help us"
The issues with the AWS version are manifold, but the main one is that it forces extra usage of expensive EC2 units, for the following reasons:
1. Blue/Green updates -> Start a new cluster with new version, lock current cluster, copy over all the data (can take over a day), at the same time write to both clusters, when finished, lock both clusters while endpoints are swapped over, unlock new cluster, trash old cluster. During this process Customer pays for both. - Solution, done properly, fire up new node with new version, swap it in, wait for 'green', take out old... wait for 'green'.. rinse and repeat. Result.. system never goes down, endpoints remain the same, less cost to customer.
2.There are a minimum of the folloowing node types: - Master (small) x3 - Data (heavy on storage, medium on memory) - Hot Data (very heavy on memory as shards have to be held in memory) - Coordinating (query) nodes, heavy on memory, light on storage (cos there is no real storage) - Ingest (same as Query). - voting only.. tiny - AI/ML heavy on memory and storage.. cos they do real work In the AS world they have: - Masters - Data - Hot Data (cos they are really pricey) The Data nodes do all teh functionality of the data, ingest and query nodes. Th emain query always gores to a data node, while it passes to other data nodes to get the data out, then aggregates locally to it... so its incorrect usage of the system. A data node should only ever deal with its own data, and pass the results it finds back to a node away from the data.
As your system is squeezed.. you add data nodes... not coordinating/Query or Ingest nodes, which you probably need. But thats more money into the coffers.
3. Their userhasging function is also old (2011 vesrion of bcrypt) , and fails. Any static password produces different results everytime you use it.. at least on the current opendistro. So you are forced into either not using security, or using proper security, which can be cumbersome across the cluster (if rolling your own). however.. on the cloud version they have base level security working, so thats Ok.. they arent using their own 'open' software.
I could go on.. but its all flaky, and misunderstood at the core developer level. Toput on record, I have spoken twice to teh core development team to describe how updates should happen.. the last time 18 months ago (by 'spoken' I mean a face to face video call) .. so they know hoe it should be done.. but dont do it.
4. I have noticed that there are some functions/methods available on the setup in Amazon Linux, that dont exist on other Linux versions (centOs/Debian) that are security related. AWS Linux is 'lifted' from Redhat.. so another piece of software that they didnt write.. but obviously Redhat are happy with this. Maybe they got the licensing deal sorted .. who knows.
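The node-by-node upgrade described in point 1 can be sketched as a toy simulation. This is not AWS's or Elastic's tooling: the `Node` class, the `wait_for_green` stub and the version strings are all made up for illustration, and a real upgrade would poll cluster health and wait for shards to relocate.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    version: str


def wait_for_green(nodes):
    # Hypothetical stand-in: a real implementation would poll cluster health
    # until every shard is allocated. This toy version only checks that the
    # cluster still has at least one node.
    if not nodes:
        raise RuntimeError("cluster has no nodes")


def rolling_upgrade(nodes, new_version):
    """Replace nodes one at a time; return (upgraded_nodes, peak_node_count)."""
    current = list(nodes)
    peak = len(current)
    for old in list(nodes):  # iterate over a snapshot of the original nodes
        if old.version == new_version:
            continue
        # Fire up a replacement on the new version and swap it in.
        current.append(Node(old.name + "-new", new_version))
        peak = max(peak, len(current))
        wait_for_green(current)
        # Take the old node out, then wait for 'green' again.
        current.remove(old)
        wait_for_green(current)
    return current, peak


cluster = [Node("data-1", "7.9"), Node("data-2", "7.9"), Node("data-3", "7.9")]
upgraded, peak = rolling_upgrade(cluster, "7.10")
print([n.version for n in upgraded], "peak nodes:", peak)
```

In this model a three-node cluster never runs more than four nodes at once, whereas a blue/green swap would run six for the duration of the copy.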
Basically goes like this:
"Who would actually install, in production, what essentially is very close to pirated software?"
Roll your own: it's a bit of grunt work up front, then 50% of the cost of the cloud version.
So VSCode, which is basically slow-motion Sublime Text, and TypeScript, which is usable only for prototyping, are good enough compensation for having most of their income from Linux (on Win and Azure) after all the almost criminal FUD they poured on Linux[0]? Come on, get real.
0. https://en.wikipedia.org/wiki/Halloween_documents
No need to feel bad. The CEO sold a lot of his shares prior to the license change.
> Is Amazon too big and a bully?
Is Starbucks too big and a bully? Sure, they force Mom and Pop shops to close by out-competing them, and that sucks. But bullying? I believe that's just Capitalism.
Or say you're 6'6" and weigh 230 lbs and you join a football team full of people who are 5'9" and 175 lbs. Are you a bully just because you're bigger?
ES basically handed them a platter with a goose laying golden eggs and a sign that read "Free Goose" and hoped they wouldn't try to make money off the eggs.
> Is Starbucks too big and a bully? Sure, they force Mom and Pop shops to close by out-competing them, and that sucks. But bullying? I believe that's just Capitalism.
I'd say it's more like a mom-and-pop coffee shop giving free coffee to patrons hoping to make money on cookies, and Starbucks coming in, taking the free coffee, opening a nice stand right next to the shop and selling the coffee they got for free.
I'm having trouble justifying this ethically.
I think the claim that Amazon is winning through "vendor lock-in" is pretty silly. Honestly anyone who can't quickly migrate the stuff they're hosting on AWS onto one of the many other cloud platforms is pretty bad at DevOps. If you're using K8S/Docker/etc it should be trivial. But even if you're not, the vast majority of AWS offerings were either built to be API-compatible with other existing tools (e.g. postgres-compat Aurora RDS), are literally identical to other services you can self-host (e.g. ElasticSearch) or others have built services compatible with AWS services (e.g. DigitalOcean's "Spaces" aim to be API identical to AWS S3 – you can literally use Amazon's S3 client libs to interact with various S3-compat services from other clouds).
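As a rough illustration of the S3-compatibility point: the usual approach is to keep the same client library and change only the endpoint. The sketch below builds the keyword arguments you would pass to `boto3.client("s3", ...)`. The provider table, the endpoint pattern and the `nyc3` region are assumptions for illustration, so check your provider's documentation for the real values.

```python
# Hypothetical provider table: the endpoint pattern is an assumption,
# verify it against your provider's documentation before use.
S3_ENDPOINTS = {
    "aws": None,  # None lets the client library pick the default AWS endpoint
    "digitalocean": "https://{region}.digitaloceanspaces.com",
}


def s3_client_kwargs(provider, region):
    """Return keyword arguments for an S3 client pointed at the given provider."""
    template = S3_ENDPOINTS[provider]
    kwargs = {"region_name": region}
    if template is not None:
        kwargs["endpoint_url"] = template.format(region=region)
    return kwargs


print(s3_client_kwargs("digitalocean", "nyc3"))
# With boto3 installed, the same application code can then target either provider:
#   import boto3
#   s3 = boto3.client("s3", **s3_client_kwargs("digitalocean", "nyc3"))
#   s3.list_buckets()
```

Keeping the provider choice in one table like this means the rest of the code never mentions a specific cloud, which is the portability the comment is describing.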
It's not "lock-in", it's providing a great all-in-one solution. You can host everything you want to host on AWS, which has good stability, good latency, etc. People are locked in because the DX around using AWS for everything your platform needs is just better than other platforms / having different services on different cloud providers (at least for many people).
If you're not using any of the AWS services, that might be true but then you're also leaving a lot of potential on the table.
If you're "cloud-agnostic" and could migrate away from AWS in the blink of an eye then you're paying for an overpriced VM offering and should probably migrate to a cheaper hosting provider immediately.
I buy that the main reason is the all-in-one solution; the comprehensiveness is attractive. However, I think you're underplaying the lock-in: migrating clouds is non-trivial, mostly due to stuff that's not running in k8s/docker/etc: stateful apps (Postgres, etc.) and/or just static data like S3. This takes time, careful planning and sometimes downtime, and is mostly avoided because it's hard.
Do you know whether I can easily migrate my sophisticated security setup over? Including users, groups, roles, policies and instance policies.
The lock-in is real, but they are also the best cloud option and it's not even close in my opinion.
Is there any good done by Amazon to support OSS? Like, ever. They started with cloning MongoDB and now Elastic, with actually zero contribution to the community regardless of their insane profits. This is a clear signal. Amazon can always clone and redistribute any open source software, then lock it in for AWS. If we've started to witness a decline in OSS, well, at least we now know who started the wave.
Do people assume that a company as large as AWS would automatically have a lot to contribute to OSS? Maybe most Amazon teams don't have much to open source yet. Contributing to OSS is a bottom-up effort. An engineer needs to be motivated to generalize her project, to peel the code from Amazon's vast internal infrastructure, and to go through an approval process to open source her project. Given that many teams have a razor-sharp focus on delivering features, for good or for bad, I wonder how many engineers are really motivated enough to open source something internal.
I don't think it's a bottom-up process. I feel it's often a top-down process. Most companies with lots of open source activity normally have management that has decided it's something they want to encourage, and then it comes down to people making their code open-sourceable.
I wish this long, exhausting process at Amazon also applied when it comes to cloning open source projects for more profit.
You don't need to assume. Amazon's Open Source team regularly talks about how much they do for open source in my company's Slack. It's their own words.
https://aws.amazon.com/opensource/
Yes exactly, you actually have to google it, and maybe you'll be able to find a useful link that makes Amazon look good.
Maybe a dual-company structure like Mozilla's would at least make it clearer that the big guys who are just taking advantage of the free lunch are doing more harm than good?
Elastic should have a for-profit company and a non-profit taking donations, where the non-profit would actually control the open source code and hire the core part of the team working on it.
I mean, we know how badly Amazon is behaving here, but at least they should have an option where they could realistically invest.
Asking a company to invest in a competitor that can grow and eat its lunch with that money is not realistic, not least because the company investing the money would want to know that it is actually being invested back into the open source software and not used by a competing company.
If, given this choice, they didn't invest back in the foundation, it would be much clearer that they are acting in bad faith.
I like to think, or at least I hope, that OSS isn't going to decline, it's just going to evolve. The existing OSI licenses weren't meant for a world of clouds and SaaS. I believe we'll have to find the next generation of licenses that can succeed where AGPL tried and failed.
You're asserting that they have forked OSS and then not provided back the source code for their own improvements.
I guess we can check their GitHub repos to see if that's the case.
As far as I know DocumentDB is closed source. Btw: I can name zero OSS projects from Amazon and this is not the same for Microsoft (VScode, TS) and FB (React, Jest).
For some companies there are some (very partial) statistics, for instance: https://www.openhub.net/orgs/Google
This seems like a much more positive response than the one they took with MongoDB. I agree, Amazon hasn't done much, but maybe this could be a start?
The start may be open-sourcing some technology they use that can benefit and help other startups. Maybe open source their own version of "React". That would be a good start.
I have mixed feelings about this server side license stuff that MongoDB started. Imagine where the internet would be today if the creators of Apache and MySQL had tried to prevent shared hosting providers in the early days of the web from using their software.
The time when Apache or MySQL started out was very different. Imagine where the internet would be if cloud computing itself didn't take off.
Do you remember a time when there were hundreds of hosting providers? Do you remember WebHostingTalk where admins would go to check hosting offers from suppliers around the world?
The monopolies finished that era. So I don't think that software companies trying to adapt now can be seen through the lens of how things were 15 years ago.
There are more hosting providers today than in the era you are talking about. AWS has a "monopoly" simply because large companies are using it, and back in the day those companies would have run their own datacenters, not used a shared PHP host. For a personal site or startup you have a thousand other options.
Imagine where Elastic would be, as the whole success of Elastic is based on an Apache-licensed project (Lucene).
In the same place, as it would require Elastic's changes to Lucene to be made open source also?
Yes, but would we have TimescaleDB, CockroachDB, ElasticSearch, Docker (containers at all) and projects like these if there wasn't any money on the table?
Not saying you're wrong, but it's a multidimensional problem. One could argue AWS is nothing like shared hosting providers (compare the scale), and a webserver is essentially "stateless", which means it is easier to build than, say, a database holding sensitive information that doesn't break. I assume this is why Postgres HA and horizontal scaling still isn't really a thing, while CockroachDB, funded by VC, "solved" this problem.
I think it's fair to let companies monetize the service they built, while allowing people to run it on their own if they can. A problem here, though, is that the company's incentives don't match those of the open source project they're running: CockroachDB enterprise has killer features that the open source version doesn't have and that no one will be able to PR, because the company will reject it.
TimescaleDB went ahead and open-sourced all their features, aligning their incentives with the project, but I don't know of anyone else who has done this.
The server side license stuff doesn't prevent shared hosting providers from using software. It just requires hosting providers to open source their infrastructure.
I think the internet would be even better today, if shared hosting providers had been sharing infrastructure technology since 25 years ago.
I think that's missing the forest for the trees. The license is designed to prevent hosting providers from selling the software as a service to their customers. The requirement to open source their entire infrastructure and operations is just a means to do that.
I don't think the SSPL is realistic to comply with. This clause is ridiculously wide:
> Corresponding Source for all programs that you use to make the Program or modified version available as a service
All programs you use
Every single one of them.
There is literally no limit on what sources you would need to publish. You don't need to think very long to realize how impossible that is.
> Imagine where the internet would be today if the creators of apache and mysql had tried to prevent shared hosting
PHP + MySQL was the foundation of the Internet not so long ago, and WordPress is still the backbone of lots of platforms.
With the SS License this would definitely not have been possible.
Why? Dockerized versions of countless server-side software are there for free. In a lot of cases the value is that someone maintains it and operates it for you.
I feel Amazon took the feedback from the DocumentDB/MongoDB fiasco to heart and made positive change in their approach.
DocumentDB is a closed source proprietary database created by Amazon to emulate the MongoDB API. Think Google's Dalvik runtime vs Sun/Oracle's JVM.
This time around we have an open source fork of ES with big backers all contributing and very permissive licensing.
In both cases, Amazon gets to implement AWS-specific management upgrades that depend heavily on EBS replication rather than application-layer replication. Would it be nice to have that secret sauce that makes Aurora/DocumentDB so nice to use compared to self-hosting or RDS? Of course. Do we have to have it to consider using or contributing to the open source software? No.
On the other hand, MongoDB was already sort of obsolete and trending towards death by the time that all ended up happening, while ElasticSearch is hot and "new".
Where do you get the impression MongoDB is trending towards death? Seems to be growing by some metrics; the stock price has more than doubled in the last year. Not a fan myself, but it still seems a long way from death to me, and they seem to be doing something right in the enterprise market.
MongoDB is not trending towards death and is actually still growing by almost any metric.
https://db-engines.com/en/ranking
I like to check this site every couple months for stats on DB popularity
The new name clashes with the Open Search Foundation.
https://opensearchfoundation.org/
Amazon was using the term "OpenSearch" themselves (via former subsidiary A9) back in 2012. https://github.com/dewitt/opensearch/
Like they care
That's extremely frustrating
It's been said that the best way to fortify your business is to use your clout to make the world inhospitable for adjacent businesses.
As someone who is not currently in the cloud, that idea strikes me as being very pertinent to what's happening here. Increasingly many technologies are becoming cloud-only, or have non-cloud offerings that are decidedly second-class. Elastic offers on-prem support. I doubt Amazon will be doing the same with OpenSearch.
It may be a subtle effect, but it's pushing the world in a direction that makes me uncomfortable. If it's harder for non-cloud-based companies to maintain non-cloud-based offerings, then that will push the industry even more toward being dominated by SaaS products. And these products often leave clients and users locked in, with limited control over their own data, and, by extension, reduced ability to control their own fates. What I worry about is that we may be witnessing a return of Embrace, Extend, Extinguish, only in a new form that's even more dangerous because it's harder to see.
I appreciate the discomfort people have about the SSPL. It is a departure from the original ideas behind FOSS. But, at least as I see it, those open source principles were never an end in and of themselves. They're a means to a greater end: digital autonomy. To the extent that very large companies seem to be learning how to co-opt FOSS in order to re-assert control, FOSS's ability, in its current incarnation, to serve that end may be waning.
Elastic's on prem support amounts to little more than an onsite where they explain to you how the Java Garbage Collector works. EVERY detail about tuning your clusters derives from keeping the Java GC from ruining your day.
There are some index template optimizations, but any semi-competent engineer or DBA should be able to figure all that out (it's literally all in the documentation about what not to do).
You'll still be able to pay a consultant to come help you -- they don't have to be from Elastic.
In fact, it seems like Amazon just created an industry for third party consultants here.
From the announcement: "You should consider the initial code to be at an alpha stage — it is not complete, not thoroughly tested, and not suitable for production use. We are planning to release a beta in the next few weeks, and expect it to stabilize and be ready for production by early summer (mid-2021)."
Given that Amazon announced the fork in January and they don't expect it to be production-ready until summer, I'm guessing they've underestimated the amount of work required to package and distribute a product as complex as Elasticsearch. Given that, I doubt they will be well-equipped to keep pace with new feature development.
I would question the assumption that “not suitable for production use” means “everything is broken and we're way behind” rather than, say, “we are being extremely conservative because our customers will expect support as soon as we say it's production ready and we need to test every upgrade scenario for our large number of existing customers”. The AWS-managed ElasticSearch seems to be pretty popular and I would expect them to be as conservative about new offerings as they are with, say, RDS.
6 months from start to prod is... not bad at all? You must be a wizard programmer if that is your typical turnaround time.
I don't remember AWS saying something like "it will be ready in weeks" in Jan...
Given how poor a job Elastic themselves did of keeping the full ecosystem of tools working in lockstep for YEARS, I'm sure Amazon will do fine.
I remember all through Elasticsearch 5 where none of their packaged Kibana dashboards flippin' worked.
> they've underestimated the amount of work required to package and distribute a product as complex as Elasticsearch
The bulk of the work thus far has been to strip out the non-OSS components ("X-pack") and the many references to it, nothing to do with packaging, distributing, or even maintaining and developing features.
https://discuss.opendistrocommunity.dev/t/preparing-opensear...
I for one will be happy when those are taken out. So many headaches trying to get bloated Kibana to start as a docker container before realizing that some random x-pack-disable flag needs to be set for it to start without a random error.
I'm not sure I agree with that assessment. Now that the fork is publicly available, others can contribute to get it ready, which wasn't possible until now.
Yes, others can contribute, but significant feature development on large-scale OSS projects tends to be driven by developers paid to work on the project full-time and coordinated by an organized steering committee with clear governance (or company if the product is owned by a single company). I don't see any of that in place for OpenSearch and getting that all started up is not at all a trivial endeavor.
I would assume they’re doing more than just packaging.
A beginning is the time for taking the most delicate care that the balances are correct
The fork was announced as a response to the Elastic stuff. I don't think they made any predictions about when it'd be ready in that blog post, so I'm not sure why they would've underestimated anything?
If this project were governed by the Apache Software Foundation, I'd attach more credibility to this effort.
I have been perfectly happy with ES cloud services. Is this done with honest intentions from AWS or is it simply based on the fact that ES are making a lot of money off the cloud services?
Well, now nobody can provide a competitor to ES cloud services for newer versions. If you upgrade to v7.11 or above, you're locking in your choice for 'managed hosting for ES/Kibana' to ES cloud services.
> Is this done with honest intentions from AWS or is it simply based on the fact that ES are making a lot of money off the cloud services?
Is wanting to make money considered honest intentions? ES released v7.10 under the Apache license on purpose, so they knew that any form of license change would mean people can still use earlier releases without having to adhere to the SSPL, and anyone could legally fork it or run it on non-ES hardware without having to pay the piper.
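The version cutoff discussed here (7.10 still Apache-licensed, 7.11 and above under the new terms) can be turned into a small guard for upgrade scripts. This is only a sketch based on the versions named in the thread: `is_apache_licensed` is a made-up helper, and it assumes plain numeric `major.minor.patch` version strings.

```python
# Per the thread: 7.10.x is the last Apache-licensed line, 7.11+ is not.
LAST_APACHE_MINOR = (7, 10)


def parse_version(version):
    """Turn '7.10.2' into (7, 10, 2); assumes purely numeric dotted versions."""
    return tuple(int(part) for part in version.split("."))


def is_apache_licensed(version):
    """True if this Elasticsearch version predates the license change."""
    return parse_version(version)[:2] <= LAST_APACHE_MINOR


for v in ("7.9.3", "7.10.2", "7.11.0", "8.0.0"):
    print(v, is_apache_licensed(v))
```

Comparing integer tuples rather than raw strings matters here, since a plain string comparison would sort "7.9" after "7.10".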
It will be interesting to see the resources that AWS will throw at this. You can get a sense of the resources that elastic.co is throwing at elasticsearch at
https://public-001.gitsense.com/insights/github/repos?r=gith...
I'm currently indexing the fork, so in about an hour or two, I'll provide the insights for the fork as well.
I scrolled back through the commits. It looks like they've been removing traces of x-pack, Elastic branding, licensing checks, etc. since the beginning of March. So far it looks like one person is doing the bulk of all that work.
If there are new features, I haven't seen any. The real question is whether they are putting together a team for new feature work, or whether this is just a fork that is doomed to fall behind as Elastic's huge team continues to develop their code base: fixing bugs that will never get fixed on the Amazon fork, adding features that will never get added, eventually releasing the 8.0 release that has been in the works for two years, etc.
I don't see any evidence that they have that team so far. They're paying a few people to go through the motions of forking but I don't really see a grand vision beyond that so far.
I really need to add a compare feature to my tool as it would make analysis a lot easier. Having said that, there is no denying there is a huge difference in work being done in both projects over the past 30 days.
Amazon does have 16 open pull requests though, with about 7 having 20 or more file changes, but I didn't dig into them to understand their significance. Maybe it's another feature I'll need to work on.
If you look at the one year window for elasticsearch
https://public-001.gitsense.com/insights/github/repos?q=wind...
its churn and activity has been extremely consistent and I'm not sure if this is an investment Amazon can and/or is willing to make.
However, knowing enterprise, I'm not sure if this will make a huge difference as those making the decisions might not really care and they'll just accept whatever Amazon tells them.
They have done extensive work in the Open Distro modules which I assume they will carry over. See: https://github.com/opendistro-for-elasticsearch/
Elastic spends so much time and effort on making sure that their search is performant (and they are not shy about deprecating and removing features that are slow). I think this is where Elastic will continue to shine. It's one thing to add features, it's another to make it so they work well and make sure the integrating product team doesn't shoot themselves in the foot.
What may hurt them, though, is the number of customers that feel things are currently "good enough". I don't know what their sales engagement looks like, so I'm not sure if this will really hurt them or not.
I agree, I trust there will be value for some in OpenSearch not using SSPL and value for others in Elastic's performant/scalable tendencies.
I can't edit the comment anymore, but you can find the fork at https://public-001.gitsense.com/insights/github/repos?r=gith...
I think even with their "permissive usage guidelines" of the OpenSearch trademark, their own way of how they've been doing Amazon ElasticSearch would not be allowed... For example, you can't do: `Microsoft OpenSearch`, you have to do "for OpenSearch" or "with OpenSearch compatibility".
From their "permissive" trademark policy [1]
> You may also use the “OpenSearch” word mark to make accurate statements about compatibility and interoperability using relational phrases such as “works with,” “runs on,” “compatible with,” and the like (e.g., “Foocorp Software powered by OpenSearch” or “Foocorp Software for OpenSearch” or “Foocorp Software with OpenSearch compatibility”).
1: https://opensearch.org/trademark-usage.html
Hey All, if you're interested in getting a good understanding of this vs Elasticsearch, we invited the team to give a Haystack LIVE talk where they outlined the details and goals of the project: https://www.youtube.com/watch?v=J_6U1luNScg
OpenSearch was once an initiative founded at A9, an Amazon subsidiary, to create a personalized, cross-service search engine: https://archive.is/PCKWq
OpenSearch is from an era when Amazon and Google were covertly competitive. Google didn't get anywhere with Froogle and AppEngine; whilst Alexa and A9 didn't move any mountains.
Code: https://github.com/dewitt/opensearch
I'm happy to see a couple of good choices made here:
- Sticking with Apache 2.0
- Asking for a Developer Certificate-of-Origin rather than a copyright assignment
This bodes well for the future of this fork. Amazon also has the resources necessary to keep up consistent and quality maintenance of a project on this scale.
Elastic would definitely like you to view AWS as the Big Bad here, but AWS's response to the Elastic betrayal is very good, and I would like to see more like this in the future.
I think this thread is much about shared source licenses like SSPL vs. "orthodox" open source licenses like GPL.
Based on the link below it seems to me the difference is that SSPL etc. have a clause which prevents me from making money by selling the use of the licensed software over the network for instance.
GPL puts some rather strict rules on users of copyleft software, mainly that you MUST distribute your modifications with the same license.
What I don't quite get is why adding a rule that says "if you make this software usable over the network you must make it usable for free" would be considered categorically less ethical than GPL.
GPL says you must give out your modifications for free. SSPL says you must also give out the rights to use that software for free as well.
Isn't SSPL more ethical in the sense that it requires you to give out more for free?
https://techcrunch.com/2018/11/29/the-crusade-against-open-s...
Ethics requires a framework. Just because something is free it doesn't become good. For example free heroin samples!
It's a complex problem to even phrase the question of what we mean by a healthy software/IT ecosystem. Do we simply count the number of users? GDP of the Internet? Number of git repos? Naturally those don't even begin to capture the self-balancing dynamics we are after. We want to encourage folks to start new ventures, but also to give back. But what if, by giving back, they eliminate old ventures? (E.g. Google "giving back" Chrome might make the Firefox venture non-viable.) How can we describe healthy competition? (It'd be good if the browser market weren't cross-financed from ads, but instead, let's say, every user told their ISP to direct some of their subscription fee to one of the browser vendors.) Okay, but what does this have to do with licenses!? Yeah, it's a fairly hard problem.
Good questions for discussion. There seems to be a commonly held or propagated assumption that the GPL is good and other licenses are unethical.
I feel this is a very scary trend starting. I have not come across a single founder in the last 5-6 years who does not start with AWS credits or is not craving them.
AWS is a monopoly and they use their cash to buy early customers. Initially it was Amazon's money, but now AWS has enough cash of their own to push whatever they wish to. The same goes for Google and Microsoft.
AWS directly building up the software side of what started out as IaaS (Infrastructure as a Service) is only going to hurt software vendors. We can only expect new software players or ones with low capital to restrict their licenses even more.
Open source licenses are not only for ideological freedom, but very necessary for companies (end users) to integrate and modify products on their own. We will migrate more toward source-available licenses instead since big giants are going to corner the small companies.
I find it fine for software to not be open source. Source-available but closed-source (in terms of freedom) is perfectly fine from a commercial standpoint and should be the gold-standard for mission-critical software in the backend. The problem comes from companies touting their software as open source purely for the marketing aspect in order to bring in customers and get free work done by people who don't get paid.
Okay... and what's the difference here from Open Distro for ElasticSearch? I guess it's just a rebranding, isn't it?
OpenDistro for ES is the surrounding tools for ES => plugins, index-state-management, basically a suite clone of ES Enterprise offerings (X-pack) because AWS can't ship AWS ES with X-Pack.
OpenSearch is the ES (core) itself.
If I'm not mistaken.
Open Distro for ElasticSearch was not a fork but rather an Apache 2.0-licensed distribution of Elasticsearch enhanced with enterprise security, alerting, SQL etc. OpenSearch is a community-driven, open source search and analytics suite derived from Apache 2.0 licensed Elasticsearch 7.10.2 & Kibana 7.10.2.
I want to love elasticsearch and they keep making it harder. On top of this open source backflip, their sales staff would put Oracle to shame. Pressure tactics, making clients pay 5 figures for a few hours of ES consultant time, very expensive training/certification (and the cert expires every 2 years), to name a few. AWS gave us consultant time for no charge; guess who we picked to run our ES load.
What about the thing that is already named OpenSearch?
I’m the co-author and maintainer of the OpenSearch syndication protocol and I posted in support of reusing the name here: https://groups.google.com/g/opensearch/c/gi-iVJZgfdA
Ah, wow. Good to know! :D
Yeah the original OpenSearch project is a different enough domain that I think confusion will be minimal. We have talked to the maintainer and he is supportive. We have also posted disambiguation in case anyone does get confused. https://opensearch.org/disambiguation.html
In general I'm supportive of an Amazon open source fork here.
But the name re-use is unfortunate.
Amazon's argument seems to be "Don't worry, we own the trademark for 'OpenSearch' because it came out of Amazon originally, so it's cool!"
That is really poor stewardship of the trademark of a name that was part of a standard meant to be an open multi-vendor standard. Amazon owned the trademark to protect its use under that standard, not to re-use it for something totally different, harming the standard further.
But it's just another indication that the original OpenSearch, like the era of believing in open web standards for interoperability that it was part of, is dead.
There is also https://www.opensearchserver.com/ which is the first result when you search "opensearch docker".
Turns out that was developed by Amazon according to Wikipedia. So maybe they’re merging that usage into this offering (since that is a spec for search results)?
It’s what Chrome’s tab to search utilizes. If you implement the protocol users can tab-to-search your site even with autocomplete.
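For context, the older OpenSearch protocol mentioned here works by having a site publish a small XML description document that tells the browser how to build a search URL. Below is a minimal sketch of generating one with the standard library. The site name and URL template are placeholders, `build_description` is a made-up helper, and a real site would also need to advertise the document from its pages so browsers can discover it.

```python
import xml.etree.ElementTree as ET

# Namespace of the OpenSearch 1.1 description document format.
OPENSEARCH_NS = "http://a9.com/-/spec/opensearch/1.1/"


def build_description(short_name, description, url_template):
    """Build a minimal OpenSearch 1.1 description document as an XML string."""
    ET.register_namespace("", OPENSEARCH_NS)  # emit it as the default namespace
    root = ET.Element(f"{{{OPENSEARCH_NS}}}OpenSearchDescription")
    ET.SubElement(root, f"{{{OPENSEARCH_NS}}}ShortName").text = short_name
    ET.SubElement(root, f"{{{OPENSEARCH_NS}}}Description").text = description
    # {searchTerms} is the placeholder the client replaces with the user's query.
    ET.SubElement(
        root,
        f"{{{OPENSEARCH_NS}}}Url",
        {"type": "text/html", "template": url_template},
    )
    return ET.tostring(root, encoding="unicode")


# example.com is a placeholder site.
print(build_description(
    "Example",
    "Search example.com",
    "https://example.com/search?q={searchTerms}",
))
```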
Naive aside, but why would I want to use ElasticSearch or OpenSearch over Solr? Are ES and Solr not both based on Lucene?
In all three cases, Lucene is used as a "low level" (Java) API which provides search capabilities. OS, ES and Solr turn Lucene into a server, with features like horizontal scaling (ES Cluster, Solr Cloud). The major differences are in how well that all works, how easy it is to administer, how much caching and optimization is done on top of Lucene, etc.
I haven't extensively used ES, but I've used Solr a lot (and contributed to), and I can say that it's a mess. The community is not one of the better ones I've seen. Bugs and stability issues are often ignored. Patches sit around gathering dust. There are some gems and very clever people in the community, of course, but it seems like there are too few of them to cope with the large beast that Solr has become. If I were starting a new project in the search space, Solr would not make my shortlist.
I've used both and cannot think of a reason why I would use Solr again, besides licensing.
ElasticSearch seems to have more mindshare. It can be easier to find resources online to help solve your problems, though ES moves through versions fast (and they do break backwards compatibility on major version bumps), so sometimes this can still be an issue.
Another benefit is that you don't have to rely on Zookeeper if you're horizontally scaling.
I don't have a ton of experience with Solr but they seem pretty comparable.
The real vendor lock-in for ES (and now AWS OS) is the REST query API; if Solr implemented ES's API, I bet $1 a lot more people would have moved or at least considered moving over.
IIRC Solr also has some weird stuff about schemaless indices, whereas ES took the very Mongo-y approach of "yeah, just throw content at the index, don't worry about it" but then separately the approach of "I am angry with your new conflicting field in that document" and throws an exception; so you don't have to worry about the schema right up until you do have to worry about it
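To make the "schemaless until it isn't" point concrete, here is a toy Python model of that behaviour. It is not the Elasticsearch API; `ToyIndex` and `MappingConflict` are made-up names, and it ignores the type coercion a real cluster may apply. The first document to mention a field fixes its type, and a later document that disagrees is rejected.

```python
class MappingConflict(Exception):
    """Raised when a document disagrees with a field type seen earlier."""


class ToyIndex:
    """Toy model of dynamic mapping: first sighting of a field fixes its type."""

    def __init__(self):
        self.mapping = {}  # field name -> type name inferred on first sight
        self.docs = []

    def index(self, doc):
        new_fields = {}
        # Validate the whole document before changing any state.
        for field, value in doc.items():
            seen = type(value).__name__
            expected = self.mapping.get(field)
            if expected is None:
                new_fields[field] = seen
            elif expected != seen:
                raise MappingConflict(
                    f"field {field!r} is mapped as {expected}, got {seen}"
                )
        self.mapping.update(new_fields)
        self.docs.append(doc)


if __name__ == "__main__":
    idx = ToyIndex()
    idx.index({"status": 200})                 # mapping inferred: status -> int
    idx.index({"status": 404, "path": "/a"})   # new field added without fuss
    try:
        idx.index({"status": "ok"})            # conflicts with the int mapping
    except MappingConflict as err:
        print("rejected:", err)
```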
Elasticsearch has done a good job of focusing on the log analytics space, and Kibana is a great tool.
I'd probably use ES for any log analytics and Solr for things like website and ecommerce search.
Of course, you could do both with either.
Not familiar with Solr, but I believe the analogy to Linux would be choosing the preferred distro while they all use the same kernel. There are a bunch of long comparison lists if you search for it.
Great blog post on this subject: https://opensourceconnections.com/blog/2019/02/28/stop-worry...
It's nice that they announced it and that there's some sort of future effort promised. From my perspective, we might hold off on upgrading the elastic-stack (with current Elastic projects) too far, so that we don't accidentally become incompatible in case we want to make a switch.
One of the more discouraging aspects of being an OSS developer is that successful companies that use your software never consider contributing to the OSS developers. I suppose that is the nature of business though. Take what you can get.
Will be interesting if other cloud providers (Google, Azure) offer this or you see other software companies offer support for it.
Will also be an interesting case study if the community shifts to this project and it dwarfs elastic for features.
Are people actually required to use ELK? What are your use cases?
The interface is completely cluttered, it takes loads of resources, and it feels like it's waiting to be replaced with lighter and more focused products.
Graylog (though it uses Elasticsearch internally) does a decent job at log handling and creating all the visual items out of logs, and Grafana/Loki can do quite well at it too with a very small memory footprint.
Besides, most "business intelligence" output isn't actionable; it's just visual art you don't need except to stare at when you're bored.
I recently learned that Graylog changed their license and it's now one of those vanity licenses: https://github.com/Graylog2/graylog2-server/blob/master/LICE...
I wonder if ES had originally been AGPL licensed, would that have helped them? If Amazon adapts AGPL code to integrate it with their own infrastructure, doesn't that in fact mean that all of Amazon's software-based infrastructure would become AGPL as well, and thus easily reproduced by Google Cloud, MS Cloud, Oracle Cloud etc.? Or even in-house? In other words, wouldn't it mean it would be easy to replicate the Amazon Cloud business (on a smaller scale)?
Amazon just wouldn't do that. They would either not offer it as a service, or make a clone from the beginning like they did with MongoDB. In general none of the cloud providers are actually willing to comply with the AGPL license.
Offering the unaltered software as a service, or forking it and releasing all of your changes under AGPL does comply with AGPL.
Am I missing something here? Elastic says this is a free sw, which you can install and use, but if you want someone else to manage the hosting, we are practically the only option.
How did that become an 'ethical position'? If I am an OSS dev, why should I contribute to them vs OpenSearch?
> and we don’t ask for a contributor license agreement (CLA)
Makes me wonder what these are for (copyright transfer) and why they decided it’s not needed. It also makes me wonder if this sort of thing has ever been taken/tested in court or if it’s paranoid friction with little value add.
> Makes me wonder what these are for (copyright transfer) and why they decided it’s not needed. It also makes me wonder if this sort of thing has ever been taken/tested in court or if it’s paranoid friction with little value add.
Some companies/projects might use them purely to avoid possible future legal headaches (I think GNU does this), and I'm not sure to what degree that has actually been tested. But they can also allow relicensing under a different license, which is more clear-cut, and I think that's more the issue here.
Amazon is trying to say that they'll never relicense the code, so they have no need to take ownership over contributions.
Indeed, that is likely, but I wonder why they didn't at least require a Developer's Certificate of Origin [0] like the one kernel.org uses. It is really lightweight (just append one line to the git commit message) and supposedly provides a minimum legal basis for the change. IANAL.
[0]: https://blog.chef.io/introducing-developer-certificate-of-or...
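For reference, the "one line" is a `Signed-off-by:` trailer, which `git commit -s` adds for you. A minimal Python sketch of what that amounts to is below; `add_signoff` is a hypothetical helper, the name and email are placeholders, and real git handles existing trailer blocks more carefully than this does.

```python
def add_signoff(message: str, name: str, email: str) -> str:
    """Append a DCO-style Signed-off-by trailer to a commit message.

    Hypothetical helper for illustration: it always separates the trailer
    from the body with a blank line and skips it if already present.
    """
    trailer = f"Signed-off-by: {name} <{email}>"
    body = message.rstrip("\n")
    if trailer in body.splitlines():
        return body + "\n"  # already signed off, leave the message alone
    return f"{body}\n\n{trailer}\n"


if __name__ == "__main__":
    print(add_signoff("Fix typo in README", "Jane Doe", "jane@example.com"))
```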
Eric S. Raymond is against them, but also argues that they are harmful--as opposed to just useless--because if they ever got to court, a jurist would look at the practices of the community to decide whether such a thing is common enough that they should be required. [1]
I know GNU does it (at least for Emacs) under the reason that the FSF can go after any GPL violation only if it is the clear copyright holder, but no such case exists, to my knowledge.
[1] http://esr.ibiblio.org/?p=8287
A copyright transfer would easily smell to people "Amazon is going to change the license at some point in the future to duck us over Elasticsearch-style". They're trying to avoid that smell.
Anyone know if OpenSearch still uses "/" as a special character? Largest PITA when trying to use ES for logging web applications and, quite frankly, it made it near unusable.
If Amazon fixed that, I would be firmly on their side. Also, any improvement over Kibana would be welcome.
They took over FreeRTOS for good, CBMC for good, with Xen they were a bit unlucky, but it still has much better security than KVM, and now they take over ElasticSearch.
Good Open Source efforts, much better than until a few years ago.
Does that mean ElasticSearch documentation won't be relevant soon?
ElasticSearch will need to come out relatively quickly with a feature that OpenSearch doesn't replicate, or people will just use the minimum that both support (see MySQL vs MariaDB).
I'm glad they changed the name. Although it's probably too much to ask that the AWS service be renamed to OpenSearch Service as well.
> We plan to rename our existing Amazon Elasticsearch Service to Amazon OpenSearch Service.
Oh, I missed that :)
I suspect this is mostly AWS trying to stop using "ElasticSearch" in the title of something, probably for trademark reasons.
One of the best parts of Elasticsearch is its excellent documentation; I hope this project can replicate that well.
Just want to mention that "OpenSearch" is/was also an AWS^H^H^HAmazon initiative for websites to expose a search URL to browsers in HTML metadata, similar to exposing an RSS feed URL. They may want to consider renaming it to avoid complete and utter confusion, like searching "OpenSearch" (no the other one) using "OpenSearch" (no the ES fork).
Imagine if they went after Mongo next?
Atlas is a virtual monopoly for Mongo solely due to SSPL, and it has created a ridiculously overpriced ecosystem for hosted and managed services, and tooling around it.
Parking the technical merits to one side, considering the sheer number of devs and early-stage products that are built on Mongo, I'd love for someone to go after them next.
Amazon already have DocumentDB, which clones the Mongo API. I don't think it's forked though; they just use a barely Mongo-compatible wrapper around their own db engine.
It's not nearly as compatible as you might think. Interestingly enough, MongoDB's CTO managed RDS at AWS.
This already exists
https://aws.amazon.com/documentdb/
True, but it's not quite the same as what they've done with OpenSearch/Elastic. Also, from what I've read, despite claims, the compatibility isn't complete, esp with stuff like aggregations.
There are a few use-cases where you'd want the ability to have a managed/hosted vanilla Mongo setup vs an emulated experience.
Kinda blocked on the compatibility front after the 4.0 API though, eh?
Off topic, is there a lightweight alternative to ES? Preferably one that can run on a small-RAM VPS.
Have a look at Meilisearch.
I love it for its simplicity. No schemas, no clusters, no transformations, no calculations.
Just documents and search over HTTP.
There is Fuse.js if you're in Node.
Perfect case of a megacorp destroying open source plus business models. I'm starting to hate Amazon with a passion. Craziest thing is they are not paying taxes in Europe though they dominate the market. Amazon needs to be broken up. It's too big and too mighty.
Isn't this an example of an open source business model? Amazon is supporting development of this Apache 2.0 licensed OSS, which they plan to make money off of...
Indeed. But they are not the company that is supposed to make money off that project. They are starting to dominate several markets by cross-financing and therefore need to be broken up.
Cancel Amazon Prime?
It's not too difficult to wait an extra day or two for products.
On a tangential note, how is Meilisearch compared to Elasticsearch?
Meilisearch is a new niche engine for instant-search experiences, and is far less flexible and less mature than Elasticsearch.
What is the sell for ES over something like the fulltext search built into Postgres, considering that the cost of adding another dependency is not insignificant?
"If you're going to shoot the king, you'd better be god damn sure you kill him." - Jack Barker
Irrespective of how big Elastic is, this is an escalation of Amazon's behaviour of suppressing the tough weeds of competition.
> First they came for the socialists, and I did not speak out—because I was not a socialist.
So what will happen to Open Distro now?
How is Beats compatibility preserved?
Embrace extend extinguish
TLDR: It’s all Apache 2.0.
I bet this will only really work in AWS. :D
Maybe Amazon treating open source developers like it treats its blue collar workers will open people's eyes about working conditions in 21st century American capitalism.
Interesting!
On a different note, I was recently looking to learn AWS concepts through online courses. After a lot of research I finally found this e-book on Gumroad, written by Daniel Vassallo, who has worked on the AWS team for 10+ years. I found this e-book very helpful as a beginner.
This book covers most of the topics that you need to learn to get started.
If someone is interested, here is the link :)
https://gumroad.com/a/238777459/MsVlG
Oh this will not end well.
This is wonderful
Algolia has won the search race, move on
Eh, far from it. Their per record / per query pricing is extremely prohibitive for any real production use.
This is just sad, Amazon playing the victim.
RIP Elastic
They also could sponsor Elasticsearch alternatives in Rust - Sonic[1] and Toshi[2]. Even more, integration[3] with Vector.
[1] https://github.com/valeriansaliou/sonic
[2] https://github.com/toshi-search/Toshi
[3] https://github.com/timberio/vector/issues/988
So as someone who has heard about Elasticsearch for years and years, and seen all this, right this moment I've decided to see what it really is.
On their home page, "Why use Elastic search?", the reasons are basically:
* It's fast!
* It does a lot of stuff!
* It has some tools to visualize data!
* It's distributed!!
I have to say this is not very appealing to me since it sounds like something any database could do.
You're starting from the wrong place if you're comparing Elasticsearch with a database. And you're also arriving at the wrong place if you think that any database can be distributed.
The Elastic website has a dedicated page to explain: https://www.elastic.co/what-is/elasticsearch
"Elasticsearch is a distributed, free and open search and analytics engine for all types of data, including textual, numerical, geospatial, structured, and unstructured. Elasticsearch is built on Apache Lucene and was first released in 2010..."
I suggest understanding what it is first before comparing it to other databases.
A lot of databases handle many types of data.
Elastic gives you a lot of the fancy stuff that SQL kinda needs extras and hard work for... but it's just a document store with fancy weighting features.
> just a document store
With a particular profile of efficiency choices and interfaces that might appeal to your project.
Elasticsearch is a NoSQL database that optimizes for full-text searches, built atop Apache Lucene. If you're doing any kind of full-text search, for example, if you're trying to index a university library and make it searchable, then Elasticsearch is for you. If you're not, I'd look elsewhere.
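If "optimizes for full-text searches" sounds abstract, the core idea is an inverted index: a map from each term to the documents containing it. A toy Python sketch follows, under heavy assumptions: the function names and sample documents are made up, and real Lucene adds analyzers, relevance scoring, on-disk segments and much more.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every term in the query (AND)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


if __name__ == "__main__":
    library = {
        1: "Introduction to Information Retrieval",
        2: "Retrieval of rare manuscripts",
        3: "Introduction to Algorithms",
    }
    idx = build_index(library)
    print(search(idx, "introduction retrieval"))
```

Looking up a term is a dictionary access instead of a scan over every document, which is why this structure suits the library-catalogue case.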
Devil is in the details.