It could use a section on high level justification / inspiration.
For example, what inspired this over a typical paginated API that lets you sort old to new with an afterId parameter?
Because the client requests pagination by lastEventId (a UUID), the server needs to remember every event forever in order to correctly catch up clients.
If instead the client paginated by lastEventTimestamp, then a server that for any reason no longer had a particular event UUID could at least start at the following one.
That’s why the article suggests using a UUIDv6, which is time-orderable, or prefixing with an incrementing DB id. So indeed, if you intend to delete events, you'll want to make sure you have orderable IDs of some sort.
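To make that concrete, here is a minimal sketch (hypothetical names throughout) of why time-orderable IDs help: because the IDs sort chronologically, a server that has deleted the exact lastEventId can still resume at the first surviving event after it.

    // Hypothetical sketch: with time-orderable ids (UUIDv6/v7 hex strings
    // sort chronologically), a deleted lastEventId is not fatal; we can
    // resume at the first surviving event that sorts after it.
    interface FeedEvent { id: string; data: unknown }

    // `events` is assumed sorted ascending by id.
    function eventsAfter(events: FeedEvent[], lastEventId?: string): FeedEvent[] {
      if (!lastEventId) return events; // no cursor: replay from the start
      // Strict greater-than works even if lastEventId itself was deleted;
      // with random UUIDv4s there would be nothing to compare against.
      return events.filter((e) => e.id > lastEventId);
    }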
Previously discussed (April 2022; 95 comments): https://news.ycombinator.com/item?id=30904220
I think that HTTP is not the best way to do it, and that JSON is also not the best way to do it. (HTTP may work reasonably when you only want to download existing events and do not intend to continue polling.)
I also think using UUID alone isn't the best way to make the ID number. If events only come from one source, then just using autoincrementing will work (like NNTP does for article numbers within a group); being able to request by time might also work (which is also something that NNTP does).
What happens if you need to catch up? You keep calling in a loop with a new lastEventId?
What is the intention there, though? Is this for social-media-type feeds, or is it meant for synchronising data (at the extreme, DB replication, for example)?
What, if anything, is expected of the producer in terms of how long to store events?
Sounds like it. But the compaction section has more details: basically, you can discard events that are overwritten by later ones.
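A minimal sketch of that catch-up loop, assuming (per the polling model described above) that the endpoint returns a JSON array of events and an empty array means the client is caught up; the URL shape and field names are illustrative:

    // Illustrative catch-up loop: poll with the last seen event id until
    // the server returns an empty batch (i.e. we are caught up).
    interface FeedEvent { id: string; type: string; data?: unknown }

    function handle(event: FeedEvent): void {
      // apply the event to local state; with compaction, superseded
      // events may simply never show up here
    }

    async function catchUp(feedUrl: string, lastEventId?: string): Promise<string | undefined> {
      for (;;) {
        const url = lastEventId
          ? `${feedUrl}?lastEventId=${encodeURIComponent(lastEventId)}`
          : feedUrl;
        const res = await fetch(url);
        if (!res.ok) throw new Error(`feed returned ${res.status}`);
        const batch: FeedEvent[] = await res.json();
        if (batch.length === 0) return lastEventId; // caught up
        for (const event of batch) {
          handle(event);
          lastEventId = event.id; // advance the cursor as we go
        }
      }
    }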
Never heard of "CloudEvents" before. How do people feel about those?
Here’s a nice comparison between CloudEvents and AsyncAPI from 2019. You can combine them. In the end, being able to version and wrap events is useful, although, amusingly, it reminds me of SOAP.
https://www.asyncapi.com/blog/asyncapi-cloud-events
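For anyone who hasn't seen one, a CloudEvents envelope looks roughly like this; the type, source, and data values below are made up:

    // A CloudEvents 1.0 envelope in its JSON representation. Required
    // attributes are specversion, id, source, and type; the rest are
    // optional.
    const event = {
      specversion: "1.0",
      id: "9c2f4f6e-0d2a-4b6e-9a1d-3f1c2b4d5e6f",
      source: "/example/orders",
      type: "com.example.order.created",
      time: "2022-04-05T12:00:00Z",
      datacontenttype: "application/json",
      data: { orderId: 42 }, // the wrapped domain event
    };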
I enjoy anything that drives down NIH: something that an existing library could possibly support, or something that I could take to my next job (or could possibly hire for).
I believe CloudEvents are most common in Kafka-adjacent or event-driven architectures, but I think they're used in some GCP serverless things, too.
They're also documented for AWS https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-... and argo-events https://github.com/argoproj/argo-events/blob/master/docs/con...
I guess you mean "not invented here..."
Yep, any opportunity to not reinvent stuff is a big win.
But I'm wary of layers upon layers. I'm thinking of how this could be combined with MQTT... it doesn't seem totally redundant.
Good
Did someone just reinvent a GET API with cursor-based pagination?
Sure looks like it. I'm not getting what's new or interesting here.
Cursors are actually better because you can put any kind of sort order in there. This "lastEventId" seems to be strictly chronological.
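To illustrate the difference, here is a toy opaque cursor (Node; field names are hypothetical): it can carry the whole sort position, not just a chronological id.

    // Toy opaque cursor: it encodes the sort field as well as the
    // position, so the server can resume any ordering, not just time.
    interface Cursor {
      sort: "createdAt" | "price"; // hypothetical sort fields
      after: string;               // last seen value of that field
    }

    function encodeCursor(c: Cursor): string {
      return Buffer.from(JSON.stringify(c)).toString("base64url");
    }

    function decodeCursor(s: string): Cursor {
      return JSON.parse(Buffer.from(s, "base64url").toString("utf8"));
    }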
This is an astonishingly bad idea. Don't do this.
Use HTTP server-sent events instead. Those can keep the connection open so you don't have to poll to get real-time updates and they will also let you resume from the last entry you saw previously.
https://developer.mozilla.org/en-US/docs/Web/API/Server-sent...
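For reference, the browser side is just this; the browser reconnects automatically and sends the id of the last event it saw in a Last-Event-ID request header:

    // Browser-side SSE: the browser reconnects on its own and resends
    // the last seen `id:` value in a Last-Event-ID request header.
    const es = new EventSource("/feed"); // illustrative endpoint
    es.onmessage = (e: MessageEvent<string>) => {
      console.log(e.lastEventId, e.data); // raw `data:` payload
    };
    es.onerror = () => {
      // fired on connection loss; the browser retries by itself unless
      // the server ended the stream with a non-retriable response
    };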
Yeah, but in real life, SSE error events are not robust, so you still have to do manual heartbeat messages (see the sketch further down) and tear down and re-establish the connection when the user changes networks, etc. In the end, long-polling with batched events is not actually all that different from SSE with ping/pong heartbeats, and with long-polling you get the benefit of normal load balancing and other standard HTTP things.
Never had to use ping/pong with SSE. The reconnect is reliable. What probably happened was your proxy or server returning a 4XX or 5XX, which cancels the retry. Don't do that and you'll be fine.
SSE works with normal load balancing the same as regular request/response. It's only stateful if you make your server stateful.
Correct. In the end, mechanically, nothing beats long polling. Everything ends up converging at which point you may as well just long poll.
But SSE is a standard HTTP thing. Why would you not be able to do "normal load balancing"?
I would also rather not have a handful of long-polling loops pollute the network tab.
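For what the heartbeat approach discussed above looks like in practice, a minimal Node sketch (the endpoint, event, and 15-second interval are illustrative): SSE lines starting with a colon are comments that clients ignore, but they keep intermediaries from timing out the idle connection.

    import { createServer } from "node:http";

    // Minimal SSE endpoint with heartbeats: `:` lines are SSE comments
    // that clients ignore, but they keep proxies from killing the idle
    // connection.
    createServer((req, res) => {
      res.writeHead(200, {
        "Content-Type": "text/event-stream",
        "Cache-Control": "no-cache",
      });
      res.write("id: 1\ndata: hello\n\n"); // one illustrative event
      const heartbeat = setInterval(() => res.write(": ping\n\n"), 15_000);
      // a real server would also replay events after
      // req.headers["last-event-id"] on reconnect
      req.on("close", () => clearInterval(heartbeat));
    }).listen(8080);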
Or use Braid-HTTP, which gives you both options.
(Details in the previous thread on HTTP Feeds: https://news.ycombinator.com/item?id=30908492 )
Isn't SSE limited to like 12 tabs or something? I vividly remember reading about a hard limit like that.
Six is the browser limit on concurrent SSE connections per domain over HTTP/1.1, so roughly six tabs. In my opinion, Server-Sent Events as a concept is therefore not usable in real-world scenarios because of this limitation and the lack of error detection around it. Just use WebSockets instead.