Comment by catapart
2 years ago
This is awesome! Love to see more local-first, security-focused apps! Especially ones that are free.
Asking because I'm currently working on a local-first app that I am gaming out how to sync to non-local networks: The docs state that the app syncs to local P2P networks. Does that mean it doesn't sync across the internet? If so, is that because of the cost of maintaining/renting a TURN server? Or is there a technical limitation? If it does sync across the open internet, are there restrictions to that?
Sorry for the grilling! It's just an area of interest at the right time. In trying to come up with a WebRTC solution, I think I've settled on defeat. The local network syncing works like a charm, but I haven't had any luck in trying to get around a TURN server by using an API endpoint to provide routing data. It would be nice if there were other prebuilts than COTURN (something deployable to a deno/node server would be ideal), but with what we currently have, I think I've settled on just pushing the encrypted data through my api server, letting users encrypt and decrypt with well-known hashing, based on some key they provide in each client (exporting client and importing client; they never exchange keys).
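A minimal sketch of the "each client derives the key from something the user provides" idea, assuming a passphrase plus a non-secret salt that travels with the ciphertext. The name `derive_key`, the iteration count, and the salt handling are illustrative assumptions, not details of any app in this thread. Python's standard library has no authenticated cipher, so the encryption step itself (for example AES-GCM from a third-party library, or Web Crypto in a browser) is deliberately left out.

```python
import hashlib
import os

# Illustrative work factor (an assumption); tune it for your target devices.
PBKDF2_ITERATIONS = 200_000

def derive_key(passphrase: str, salt: bytes) -> bytes:
    """Stretch a user-supplied passphrase into a 32-byte symmetric key."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, PBKDF2_ITERATIONS, dklen=32
    )

# Exporting client: pick a random salt and send it alongside the ciphertext.
salt = os.urandom(16)
export_key = derive_key("correct horse battery staple", salt)

# Importing client: the user types the same passphrase; the salt is not secret.
import_key = derive_key("correct horse battery staple", salt)
```

Both clients end up with the same key without ever sending it over the wire; the server only sees the salt and the ciphertext.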
Anyway, my pet projects aside, this thing looks really full-featured! Great work here. I use Notion daily, and this could easily replace it for me. In fairness, I don't take full advantage of Notion, but I doubt most people do either. Seems like you've covered enough of the bases to provide a real product for all but some edge cases, so that's pretty awesome! I'll echo others in lamenting the lack of a web app; that's what will keep me from switching from Notion, because it's just not portable enough for the work I do. But, otherwise, I'm happy with it so far. Gonna try to plan some stuff in it and see if there are any creaky wheels!
> Does that mean it doesn't sync across the internet?
They currently host their own backup node, so it does sync across the internet, but you only get 1GB of storage (or 10GB if you were in the alpha) on their backup node. Once that's exceeded, additional files are only synced P2P.
They're planning on providing extra storage for a cost and soon anyone will be able to self-host their own backup node.
P2P sync over the internet is on our roadmap. We plan to release it later this year. Our protocol already supports it, but we need to implement peer discovery.
Right; should have been more specific, thanks!
I meant to ask if the app syncs P2P over the internet, rather than just through their provided storage.
My solution is to have short-lived data so that they can't reach more than a few MBs of memory on my server, but if there's a straightforward way to do P2P across the globe, I'd very much prefer that option. Asking people to believe that I can't look at their (encrypted) data is fine, but preventing me from having their data at all is ideal.
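A rough sketch of that short-lived storage idea, assuming an in-memory store on the API server. The class name `ShortLivedMailbox`, the 60-second TTL, and the 4 MB cap are all hypothetical choices for illustration. The server only ever handles opaque bytes that the clients encrypted themselves.

```python
import time
from collections import OrderedDict

class ShortLivedMailbox:
    """In-memory store for encrypted blobs with a TTL and a total byte cap."""

    def __init__(self, ttl_seconds=60.0, max_total_bytes=4 * 1024 * 1024,
                 clock=time.monotonic):
        self.ttl = ttl_seconds
        self.max_total = max_total_bytes
        self.clock = clock
        self.total = 0
        # id -> (expires_at, blob); the TTL is constant, so insertion order
        # is also expiry order and the oldest entry is always first.
        self._items = OrderedDict()

    def _evict_expired(self):
        now = self.clock()
        while self._items:
            key, (expires_at, blob) = next(iter(self._items.items()))
            if expires_at > now:
                break
            del self._items[key]
            self.total -= len(blob)

    def put(self, mailbox_id, blob):
        """Store an opaque blob; return False if it would exceed the cap."""
        self._evict_expired()
        if mailbox_id in self._items:
            # Replacing an entry: release its bytes and re-insert at the end.
            self.total -= len(self._items.pop(mailbox_id)[1])
        if self.total + len(blob) > self.max_total:
            return False
        self._items[mailbox_id] = (self.clock() + self.ttl, blob)
        self.total += len(blob)
        return True

    def take(self, mailbox_id):
        """Return and delete the blob, or None if it is missing or expired."""
        self._evict_expired()
        item = self._items.pop(mailbox_id, None)
        if item is None:
            return None
        self.total -= len(item[1])
        return item[1]
```

The exporting client would `put` a ciphertext under some agreed ID and the importing client would `take` it; anything not collected within the TTL is dropped, which bounds how much data the server can ever hold.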
Given what you pointed out, I definitely figure they're not doing WWW P2P syncing, but you'd be surprised at the technical stuff you can figure out by just asking someone who spent way too long building it. They're usually happy to tell you all kinds of interesting details (especially when it comes to 'interesting' bugs/workarounds/hacks)!
> In trying to come up with a WebRTC solution, I think I've settled on defeat. The local network syncing works like a charm, but I haven't had any luck in trying to get around a TURN server by using an API endpoint to provide routing data. It would be nice if there were other prebuilts than COTURN (something deployable to a deno/node server would be ideal),
The bad news is that if you want something that works in all instances, you need a relay of _some_ sort, because p2p isn't possible/feasible in all cases. BitTorrent (for instance) works around that limitation by simply having many-to-many peering and relying on large numbers to be reliable. But that doesn't work for 1:1.
The good news is that maintaining a relay (or TURN, with WebRTC) need not be expensive. Yes, you need a server, and perhaps some IP-based rate limiting, but a single server can handle a LOT of connections carrying small amounts of data.
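To make "IP-based rate limiting" concrete, here is a minimal token-bucket sketch keyed by client IP. The class name `PerIpRateLimiter` and the default rate and burst values are assumptions for illustration, not something any project in this thread is known to use.

```python
import time

class PerIpRateLimiter:
    """Token bucket per client IP: `rate` requests/second, bursts up to `burst`."""

    def __init__(self, rate_per_second=5.0, burst=10.0, clock=time.monotonic):
        self.rate = rate_per_second
        self.burst = burst
        self.clock = clock
        # ip -> (tokens, last_seen). Never pruned in this sketch; a real
        # server would drop idle entries to bound memory.
        self._buckets = {}

    def allow(self, ip):
        """Return True and spend one token if this IP is under its limit."""
        now = self.clock()
        tokens, last = self._buckets.get(ip, (self.burst, now))
        # Refill according to the time elapsed, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens < 1.0:
            self._buckets[ip] = (tokens, now)
            return False
        self._buckets[ip] = (tokens - 1.0, now)
        return True
```

A relay would call `allow(client_ip)` before accepting each new connection or message and reject the request when it returns `False`.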
I created https://github.com/betamos/rdv for this purpose, an extremely light-weight alternative to WebRTC, but for TCP only (BYO identity, auth and encryption). The p2p success rate is, anecdotally, very high. However, you cannot use it from a web browser.
Feel free to reach out (see profile), happy to chat about p2p whether or not you use this project.
Ah, nice! That looks like a method I had tried, for sure. My first setup was to use .NET core to build an API that registered users with session variables and broadcast that session to new users. It required a "host" because they initiated the session and then all other users were given the host's address to negotiate their P2P connection.
That all worked fine but it required a token api, and then having to choose between "polling" and "websockets"; not a particularly appealing choice for me. So TURN servers seemed to be the ticket, but I just couldn't find someone to say "this is what data a TURN server receives; this is how it determines its response; this is what the response data looks like". Without that, I was having a very hard time building my own TURN server. The only option seemed to be "use COTURN", which is not something I'm interested in. Not because I think it doesn't work, but because I haven't ripped it apart to know how to fix it if something goes wrong. As this is a side-project, I have the luxury of forcing myself to deeply learn each part of the data flow, so it doesn't make much sense to offload something I don't totally understand to someone's library (conversely, I've written an IndexedDB handler, so I use the Dexie library because if it completely falls apart in novel ways, I can still troubleshoot it). So I figured I would wait (it's been a couple of years now), and if no one got around to dumbing it down, I would start inspecting the data flow myself and try to replicate whatever COTURN is doing.
So if you can shed any light on how to make a TURN server, I am ALL ears! I do think WebRTC is the way to go, so while I appreciate a suggestion for an alternative (and one that you personally use is great, but one that you personally build AND use is pretty much the gold standard for recs, so I really do appreciate it!), I don't think it's helpful for my intentions. My project is also web-based (first), so while I'm not averse to a wasm library implementation of TCP connections (mad stuff recently with the curl support!), it's a bridge too far, to start with. In the meantime, like you said, I'm not concerned about the cost of a TURN server. I know that most apps can get by on free-tier cloud hosting limits, when you're talking about those little chirps of data. So I'm happy to build and pay for what it takes, I just honestly don't know how to do it.
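As a partial illustration of the "what data does it receive, what does the response look like" question: TURN (RFC 8656) reuses the STUN message format (RFC 8489), and the simplest exchange is a STUN Binding Request answered with the address the server observed. The sketch below encodes only that exchange, for IPv4, and assumes XOR-MAPPED-ADDRESS is the first attribute in the response. The function names are made up for this example. A real TURN server additionally needs authentication and the Allocate, CreatePermission, Send/Data, and ChannelBind machinery, none of which is shown.

```python
import ipaddress
import os
import struct

MAGIC_COOKIE = 0x2112A442           # fixed value in every STUN header
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
ATTR_XOR_MAPPED_ADDRESS = 0x0020
HEADER = "!HHI12s"                  # type, length, cookie, transaction ID

def build_binding_request(transaction_id=None):
    """Client side: a bare 20-byte header with no attributes."""
    tid = transaction_id or os.urandom(12)
    return struct.pack(HEADER, BINDING_REQUEST, 0, MAGIC_COOKIE, tid)

def build_binding_response(request, client_ip, client_port):
    """Server side: echo the transaction ID and report the observed address."""
    msg_type, _length, cookie, tid = struct.unpack(HEADER, request[:20])
    if msg_type != BINDING_REQUEST or cookie != MAGIC_COOKIE:
        raise ValueError("not a STUN Binding Request")
    # Port and address are XORed with the magic cookie so that NATs which
    # rewrite raw addresses inside payloads do not corrupt them.
    x_port = client_port ^ (MAGIC_COOKIE >> 16)
    x_addr = int(ipaddress.IPv4Address(client_ip)) ^ MAGIC_COOKIE
    # Attribute value: reserved byte, family (0x01 = IPv4), X-Port, X-Address.
    value = struct.pack("!BBHI", 0, 0x01, x_port, x_addr)
    attr = struct.pack("!HH", ATTR_XOR_MAPPED_ADDRESS, len(value)) + value
    header = struct.pack(HEADER, BINDING_SUCCESS, len(attr), MAGIC_COOKIE, tid)
    return header + attr

def parse_binding_response(response):
    """Client side: recover the public (ip, port) that the server saw."""
    msg_type, _length, cookie, _tid = struct.unpack(HEADER, response[:20])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        raise ValueError("not a STUN Binding Success Response")
    attr_type, _attr_len = struct.unpack("!HH", response[20:24])
    if attr_type != ATTR_XOR_MAPPED_ADDRESS:
        raise ValueError("sketch expects XOR-MAPPED-ADDRESS as first attribute")
    _reserved, family, x_port, x_addr = struct.unpack("!BBHI", response[24:32])
    if family != 0x01:
        raise ValueError("sketch handles IPv4 only")
    ip = str(ipaddress.IPv4Address(x_addr ^ MAGIC_COOKIE))
    return ip, x_port ^ (MAGIC_COOKIE >> 16)
```

On a UDP server, `client_ip` and `client_port` would simply be the source address returned by `recvfrom`; that "tell the client what you saw" step is the core of what a STUN server does.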
> [...] I don't think it's helpful for my intentions. My project is also web-based (first)
Yes, then it's a no-go. No excuses needed.
WebRTC is the only way to reasonably do p2p on the web. This is because of the lack of "raw" sockets and their options (just regular TCP/UDP with SO_REUSEPORT).
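For context, this is roughly what native code can do and a web page cannot: open a listening socket and an outgoing socket on the same local port, which is a common prerequisite for TCP hole punching. This is a generic sketch, not a description of how any particular library does it. The helper name `reusable_tcp_socket` is made up, and it binds to loopback purely so the example is safe to run; `SO_REUSEPORT` is not defined on every platform.

```python
import socket

def reusable_tcp_socket(local_port=0):
    """Create a TCP socket that may share its local port with another socket."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # SO_REUSEPORT is missing on some platforms (for example Windows).
    if hasattr(socket, "SO_REUSEPORT"):
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind(("127.0.0.1", local_port))
    return s

# Listen on an OS-chosen port...
listener = reusable_tcp_socket()
listener.listen()
port = listener.getsockname()[1]

# ...and bind a second socket to the same port for the outgoing dial.
dialer = reusable_tcp_socket(port)
# A real client would now call dialer.connect((peer_ip, peer_port)) while
# the listener keeps accepting on the same port.
```

Because both sockets share one local port, a NAT mapping created by the outgoing dial also covers the address the remote peer is trying to reach.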
> So if you can shed any light on how to make a TURN server, I am ALL ears!
You'd know better in a few minutes of reading the docs, sorry. I will say, pay attention to the requirements - for instance, listening on arbitrary ports, getting the "real" port from the client, even stateful connections etc, aren't always possible on cloud services (the more "managed" the marketing, the more restricted).
> Not because I think it doesn't work, but because I haven't ripped it apart to know how to fix it if something goes wrong. As this is a side-project, I have the luxury of forcing myself to deeply learn each part of the data flow, so it doesn't make much sense to offload something I don't totally understand to someone's library
I empathize deeply with this. In fact, that's the main reason why I built rdv. WebRTC in particular is, imo, a nasty piece of over-engineering. It basically throws together a bunch of separate "protocols" into one. It has way, way too many knobs, and many things are stateful. If you do go down the route of reimplementing even a subset of WebRTC, I recommend getting insurance for your keyboard, one that covers physical violence (but if you do, I would be very curious about your learnings).
Finally, my words of wisdom would be to avoid p2p if you have the luxury of having a small number of messages. Just running a simple "relay" of encrypted messages is reliable and cheap. Users are still protected by e2ee.
Looking forward to seeing if it syncs across ZeroTier networks. They emulate a LAN including multicast so it might work out of the box.