Public post federation

I can understand why you (and others) tend to feel like this is a bug. However, I disagree. In my opinion, it is neither a bug nor a feature; it is both a design choice and a design limitation. We made the decision to federate content on demand, and we decided against having a single data storage for obvious reasons. One of the limitations of this approach is, indeed, that some content is not globally available. One can argue about whether it should be available without any preconditions, but I have a different point of view.

Our approach and our goal is to deliver posts to everyone who has signaled they are interested in someone’s posts. People can identify and categorize other people by having a look at their tags and their profile description (which can be public these days). If they are interested, they can subscribe to a person’s posts and they will receive them from that point on. In addition to that, each pod also displays tagged content from strangers if it has received it. That’s a nice thing, but, in my opinion[1], not the main feature.

Regardless of my opinion, I spent a lot of time thinking about how we could, in theory, resolve this “issue”. Let’s ignore everything @supertux88 already mentioned, which is the never-ending database growth and the issues with fetching content in certain cases. Let’s ignore possible issues on that level and just talk about broad approaches.

One suggested solution is somehow implementing a system that allows “following a tag on a different pod”. Frankly, I have no idea what that would look like or how it might work. If I understand the suggestion correctly, it would enable people to say “I want to see all posts with #kittens from joindiaspora”. If I get that right, that would be a horribly confusing approach, which would probably make diaspora even less usable than it is right now.

Some time ago, Jason did some interesting and good work on sketching and implementing a relay-like thingy, which is the second possible solution to this issue: a central entity that, on an opt-in basis, relays public content to all pods that decided to subscribe. In theory, that sounds like a great idea, and I was already planning on implementing a high-traffic relay in Rust[2], but I quickly realized that this is not feasible either.

Currently, we exclusively relay public posts to other pods, not their interactions. This by itself is a huge issue and the main reason I have not enabled publishing or subscribing to the relay on Geraspora. Not only is a lot of context lost for people discovering new posts, it can also quickly become very annoying for post authors, since they might receive questions multiple times, or have people randomly jumping into an existing discussion simply because they were not aware that a discussion was already going on.

The obvious reply to this is a simple “well, just relay interactions as well”, but this is not possible without running into at least one major issue. How do we determine the set of subscribers we deliver an interaction to? Do we simply throw everything at all subscribers? That would turn the relay into a DoSing machine that kills all small pods by both traffic volume and database size - not an option. The idea behind the relay was to let pods subscribe to a set of tags and have them receive only that subset of posts. We could also use that to limit the subscribers for interactions, right? Well, no. The relay does not know what post a comment actually belongs to[3], and we would have to store all posts we ever relayed in order to be able to check a comment’s parent hashtags. This would turn the job of designing and running the relay into a task comparable to running big-data jobs.

As you can see, even without talking about database sizes, I see no solution to this[4].


[1]: I am aware that the truthfulness of this opinion is debatable, but that’s how I see diaspora.
[2]: Mainly as an experiment, since I am really into writing applications that send messages at high throughput.
[3]: Yes, the comment includes the parent GUID, but unless there is an object to look up via that GUID, it’s useless.
[4]: JFTR, if anyone has a solution for the relay issues, I am more than happy to tackle the task of designing, implementing, and maintaining a relay that could work on a network-wide scale.

1 Like

This is true, but let’s be honest: we’re currently very far away from Facebook and the “I share pictures with my IRL friends” usage in diaspora*. People here don’t know each other; they do not share IRL moments, they share opinions. We help thousands of people think and debate about everything, from staying informed to making the world a better place. This is the heart of diaspora* and this is awesome. So not being able to follow a topic (= a tag) instead of a user is a huge gap.

I wrote a wiki page about this suggestion. (Looks like it has been moved under my username, please tell me if you can access it).
To summarize, from a user’s point of view nothing changes: a user clicks the follow button on the tag page as they already do. If at least one user follows a tag, then the pod is considered to be following that tag. Each pod then keeps a table with the list of pods following each tag, and when a user writes a public post which contains a tag, the pod checks whether there are pods following that tag. If so, it sends the post to them. It’s that simple: it only uses the diaspora* federation protocol, so interactions are not broken, and it should be enough to avoid DB overflow.
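A minimal sketch of the bookkeeping this implies (hypothetical names and structure, not diaspora*’s actual schema or API):

```python
# Hypothetical sketch of per-pod tag-following bookkeeping; names are
# illustrative only, not diaspora*'s actual schema or API.

# tag -> set of remote pods that told us at least one of their users follows it
tag_followers = {
    "kittens": {"pod-b.example.org", "pod-c.example.org"},
}

def remote_pod_follows_tag(pod_host, tag):
    """Called when a remote pod signals that one of its users now follows #tag."""
    tag_followers.setdefault(tag, set()).add(pod_host)

def extra_recipients_for(post_tags):
    """Pods that should also receive a public post because they follow one of its tags."""
    recipients = set()
    for tag in post_tags:
        recipients |= tag_followers.get(tag, set())
    return recipients

remote_pod_follows_tag("pod-d.example.org", "diaspora")
print(extra_recipients_for({"kittens", "diaspora"}))
# The author's pod would deliver the post to its normal audience plus these pods,
# via the existing federation protocol, so interactions keep working as usual.
```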

I totally agree with you here; the relay is, in my opinion, not the right approach, and that is why I didn’t activate it on diaspora-fr.org and framasphere.org.

3 posts were split to a new topic: Unable to interact on [please fill in pod]

I really tried to stay away from this thread, but since there is some false information here, here we go.

The relay does not know what post a comment actually belongs to[3], and we would have to store all posts we ever relayed in order to be able to check a comment’s parent hashtags. This would turn the job of designing and running the relay into a task comparable to running big-data jobs.

This has already been implemented in the relays since May 18th, 2016 (https://github.com/jaywink/social-relay/releases/tag/1.1.0). When a relay receives a post, it saves two pieces of information: 1) what the GUID is and 2) which nodes it got sent to. When the relay later receives a comment/like/retraction with this target GUID, it looks up where the original was sent and sends only to those nodes. So there is absolutely no need to store the post or the hashtags, just the information about where to send the reactions. That is not at all expensive compared to storing the whole object. Since diaspora hasn’t sent reactions yet, I haven’t implemented the planned syncing of information between relays. That would be required if a comment goes to another relay, for example.
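In other words, the bookkeeping boils down to a GUID-to-recipients map; a simplified sketch of the idea (not the actual social-relay code):

```python
# Simplified sketch of the bookkeeping described above; not the actual
# social-relay implementation, just the idea: store GUID -> recipients, nothing else.

sent_to = {}  # post GUID -> list of nodes the post was relayed to

def record_post_delivery(post_guid, subscriber_nodes):
    """Remember which nodes received a relayed post."""
    sent_to[post_guid] = list(subscriber_nodes)

def recipients_for_reaction(target_guid):
    """A comment/like/retraction goes only to nodes that got the parent post."""
    return sent_to.get(target_guid, [])

record_post_delivery("abcd1234-guid", ["pod-b.example.org", "pod-c.example.org"])
print(recipients_for_reaction("abcd1234-guid"))  # -> the two pods above
print(recipients_for_reaction("unknown-guid"))   # -> [] (relay never saw the parent)
```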

Regarding performance, my relay has been relaying, at best, over 4K payloads a day with roughly 50 deliveries each (because half of the subscribers subscribe to tags only). There has been no issue with performance so far. If the network were to grow (which it hasn’t in the last two years), performance could obviously become an issue at some point. That would be a positive problem, I would say. I think the best way to tackle this would be to spread the load over more relays - currently there are only two.

Diaspora could easily start sending reactions to the relays, just like Hubzilla and Socialhome are already doing (I don’t know the status of Friendica). This would fix the broken-threads issue caused (in part, but not entirely) by the current relay delivery. It has been 7 years since the project started, 5 years since the community took over, 4 years since the relay idea, and 2 years since the relay system went live - and in all that time there has not been another solution for letting users reach a wider range of public content around the network than the relay system. We can wait forever for a perfect solution, or we can take what is available and go with that until a perfect solution comes up. The relays are already there, they work, they’re tested, and the only reason there is a broken-threads problem is that support for the relay system is not finalized in the project creating most of the content in the network.

I blame myself, really: when I left the project, I didn’t feel like pushing the relay system, because I knew the only place the idea had support was outside the actual project team. But I’ve read many comments from users who come from Mastodon, for example, and wonder why “The Federation” feels more complete and why content is more readily available even on small instances. I think we shouldn’t lose that competitive edge. On the contrary, we should build on it and actively work towards solutions that guarantee users can reach the content they are interested in.

About the database problem. Subscribing to all content is optional; subscribing to tags is the default in diaspora’s settings. I don’t really think anyone in the project team should be worried if some podmins (there are over 40 currently) want to fill their databases with all the possible content they can get their hands on. If it becomes too much, they only need to switch the subscription to tags, which will keep the interesting content flowing. If the network grows, this would be necessary anyway, since there would be too much content for any single node to store. But the point of the relay system was never that everybody has to sync all the content to their node. The point was that there is the possibility to get more content. This is especially important for new nodes, which are pretty much empty. I can’t imagine how many new pods have been lost because their podmins struggled to find anyone to interact with and then shrugged and turned their pod off.

Long story short: I really don’t see any other technical issues stopping the implementation of the remaining relay delivery in diaspora. Doing it would make things better, not worse. If there are such issues, I would gladly hear and discuss them (assuming the discussion stays civil). I’ve asked multiple times for opinions on how to improve the relay system, but unfortunately there has not been much response yet. Since I am going to be federating with the ActivityPub network too (basically Mastodon and Hubzilla), I have some plans to widen the relay to that side as well, where there is a lot more content.

Btw, regarding statistics: in the last 30 days, the two relays (relay.iliketoast.net and relay.diasp.org) have processed almost 100K payloads, most of which are probably posts, because AFAIK only Hubzilla and Socialhome send reactions/retractions.

1 Like

Thanks for your insights and ideas, much appreciated.

I kinda disagree; it would make things much less predictable and delivery issues almost impossible to trace. Having more content just for the sake of having more content is totally pointless if you cannot be sure you can adequately interact with posts, or if you cannot be sure you receive interactions.

As you rightly pointed out, a solution to some of the issues could be to simply broadcast both interactions and retractions. I dislike that. In fact, I dislike that so much that I will actively block any effort that implements something like that.

Multiple concerns here. The first and most obvious is database size. If we send all interactions in addition to all posts, small pods would not only have to store posts, but also all interactions, and that gets messy if we consider the amount of these things (especially if we consider re-adding liking on comments, which we will do eventually).

A common case which will make trouble: Alice posts something, Alice’s pod does not publish to a relay and is also not subscribed. Alice shares with Bob. Bob is publishing to the relay and is also subscribing to the relay. Bob reshares a post he has received via the relay, so Alice’s pod now shows the post as well. Interactions happen. Alice will never see any interaction on the post. Similar situations can also arise from timing issues: what happens if some interactions are received before the post? They are lost forever.

One idea of the relay was to increase the reach of tagged posts, and you support subscribing to individual tags. That’s cool, and it would be even cooler to use a list of subscribed tags (i.e. tags that users on a pod have subscribed to) to decide which tags a pod should subscribe to. That would actually solve the “small pod needs to store all posts ever made” issue, but it makes the entire reaction situation even worse. Unless the relay stores all posts ever written, the relay cannot really keep track of which subscriber received which posts, and so it cannot decide which interactions need to be sent to which subscribers. So you’d end up broadcasting interactions again, which is meh.

However, there is actually a nice solution (disclaimer: “nice” as in “I had discussions with some people on some evenings while playing games, there might be some conceptual issues left”; second disclaimer: that was actually mainly @supertux88’s idea, but he is too lazy to type): giving pods the ability to subscribe to posts directly.

Short and incomplete flow outline:

  1. Pod A has users who subscribed to #diaspora, #cat, and #photography.
  2. Pod A subscribes to these tags on a relay.
  3. Pod A receives posts tagged with the aforementioned tags via the relay.
  4. Pod A looks at the posts, and figures out the originating pod.
  5. Pod A talks to these pods and says “Hey friend, I just got [GUID], mind sending me all existing interactions, as well as subscribing me to it so I get all future interactions, please?”

If we have something like that, it would be a win-win-win. We can increase the reach of posts, we can provide interesting content for small pods, and we can be sure all interactions are received by everyone who is interested in them. Bonus points for not spamming everyone with uninteresting stuff.
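A rough sketch of what step 5 of the outline could look like on the wire; the endpoint, payload shape, and use of plain JSON over HTTPS are purely illustrative assumptions, not part of the actual diaspora* federation protocol:

```python
# Purely hypothetical sketch of the "subscribe me to this post" request from the
# outline above; the endpoint and payload shape are made up for illustration.
import json
import urllib.request

def subscribe_to_post(origin_pod, post_guid, my_pod):
    """Ask the post's origin pod for all existing interactions and future updates."""
    request = urllib.request.Request(
        url=f"https://{origin_pod}/federation/post_subscriptions",  # made-up endpoint
        data=json.dumps({"post_guid": post_guid, "subscriber": my_pod}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        # In this sketch the origin pod replies with every interaction it already
        # has, so the subscribing pod can backfill the whole thread.
        return json.load(response)

# interactions = subscribe_to_post("pod-origin.example.org", "abcd1234-guid", "pod-a.example.org")
```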

As a little side effect, we would also fix a bunch of other issues with “unreliable” interactions. For example: if you receive a post via a reshare that would not have arrived at your pod otherwise, this wouldn’t be an issue, since the pod would simply subscribe to that post.

Still, someone has to think this through and implement it.

1 Like

@denschub @supertux88 that’s an interesting approach, thanks for developing it. I’m personally still sure this is actually the biggest issue in diaspora*, so I will gladly support anyone working on it, including with money :wink:

If we send all interactions in addition to all posts, small pods would not only have to store posts, but also all interactions, and that gets messy if we consider the amount of these things (especially if we consider re-adding liking on comments, which we will do eventually).

This is no different from what you are proposing at the end. It doesn’t matter whether the interaction comes via the relay or directly from the source - it is the same size either way. I think you skipped my previous answer regarding interaction delivery, btw. Your proposal and the current relay are equal in database size. The only difference is delivery.

A common case which will make trouble: Alice posts something, Alice’s pod does not publish to a relay and is also not subscribed. Alice shares with Bob. Bob is publishing to the relay and is also subscribing to the relay. Bob reshares a post he has received via the relay, so Alice’s pod now shows the post as well. Interactions happen. Alice will never see any interaction on the post.

This is why nodes should subscribe automatically to reshared posts. Doesn’t that already happen?

Similar situations can also arise from timing issues: what happens if some interactions are received before the post? They are lost forever.

That isn’t possible with the relay. An interaction cannot be delivered before the post, since the relay would not know where to deliver it. Please read my previous reply again.

That’s cool, and it would be even cooler to use a list of subscribed tags (i.e. tags that users on a pod have subscribed to) to decide which tags a pod should subscribe to

I’m pretty sure this was in the original diaspora implementation already. I wrote it, and I’m fairly sure it was already done that way. Selecting tags wouldn’t be very useful otherwise.

Unless the relay stores all posts ever written, the relay cannot really keep track of which subscriber received which posts, and so it cannot decide which interactions need to be sent to which subscribers. So you’d end up broadcasting interactions again, which is meh.

Again, you skipped reading my reply and don’t understand how the relay distributes interactions. Please read the reply.

  5. Pod A talks to these pods and says “Hey friend, I just got [GUID], mind sending me all existing interactions, as well as subscribing me to it so I get all future interactions, please?”

I assumed diaspora already had this for reshares, but if not, sure, this would make a lot of sense. For reshares. I really don’t think it’s a good idea for “normal delivery”, i.e. what the relay does. If you auto-subscribe each receiver to each post they receive, you will just increase the load on each sender in a way that makes it even harder for the network to scale up. Right now a sender does a delivery to each of their contacts AND one to the relay. What you are proposing is that the sender does a delivery to their contacts AND to each node in the network. In current numbers, for popular tags, this would mean small pods doing around 100 more deliveries PER reaction.

The point of the relay network was not just to make content available, but to make it available without causing too much extra load on nodes. Your proposal would cause exactly that extra load. A model where everybody delivers to everybody just will not scale.

I realize I’m just wasting time here since my reply will just be skipped, but just trying to correct false information again.

No, diaspora doesn’t do that yet, but that’s what I want to add. And it will do that for all posts, not only for reshares, because a reshare is not the only way a post can be received. A post can also be fetched for various other reasons. And as @flaburgan wrote here: “If we solve that problem without central points of failure, that’s better.” So we don’t need to rely on the relay for interactions/retractions; we can do this directly in diaspora, and it will solve many other problems that are not solvable with the relay.

The owner-pod of a post can keep track of which other servers have the post and send interactions and retractions to them. And if a small pod has one post (originating there) with a popular tag and many reactions, it is able to handle the many reactions for that one post; a small pod will not have hundreds of posts with hundreds of interactions each. But a small pod is not able to keep track of which other pods are interested in which tags - that’s what the relay currently does.

Diaspora was designed to work decentralized (everybody delivers to everybody); if that doesn’t work, we have failed and we can stop developing diaspora. But for now it seems to work pretty well, so let’s continue with that. The relay is currently only a workaround to solve the “follow tags” problem, and since you can only follow tags on posts, posts are the only thing that needs to be sent via the relay.

This works fine for social interactions. But if you have for example 10K servers:

  • Social interactions will still work fine, because the number of deliveries still equals the number of social interactions
  • Delivering public content and its interactions around the network will not work, because you can’t honestly expect a single server to do 10K deliveries and still do all the other things it needs to do.

The relay system on the other hand can be optimized and scaled to do only one thing well: deliver and deliver fast.

There are reasons for separating services and not baking everything into one application, whether you accept those or not.

And as @flaburgan wrote here: “If we solve that problem without central points of failure, that’s better.”

The relay system would not be a central point of failure if it were developed further as originally planned, but I don’t think there is any point if diaspora is not on board. The plan was that nodes would deliver to a random relay, which would mean that any single relay going down would not stop deliveries from happening.

I will only address the claim of better scaling now, and agree to disagree on everything else.

The worker distributing public entities generates the payload exactly once, then leverages typhoeus to actually do the requests. typhoeus internally calls libcurl-multi, multithreaded, with a concurrency we define in the config. If you want, you can say that this thing does one job: deliver, and deliver fast. Delivering requests is not a bottleneck anymore, and it hasn’t been for quite a while. In reality, database requests are the true bottleneck, so federating things in and out is a totally insignificant part of the chain right now.

If we have a look at the current runtimes, we have to agree that incoming federation is totally irrelevant in terms of processing times, but outgoing federation takes some time if we have a large reach (like, for example, when the dhq account posts something public).

So let’s ignore everything else and focus on delivering payloads, since that is a concern of yours. I don’t like the 10k pods example much, because if we talk about scaling the network, one should think ahead. Let’s get an order of magnitude bigger and deal with 100k pods, just to make my point clear.

It’s easy for me to say “you are wrong”, since I worked as a lead engineer on real high-traffic messaging systems in the past, but that might be a bit simplistic and surely will not get the point across.

Let me start by stating something: you totally ignore the fact that federated systems are actually more scalable than centralized ones, and you casually forget the explosive growth of messages you get with a central pub/sub node.

To prove my point, I wrote a simple Rust application that loads a list of domains and executes HTTP requests similar to the way typhoeus does it. I tried to code without fancy Rustness and commented the code a bit so people can understand what I am doing. This benchmark is not entirely perfect, as…

  • I am using GET instead of POST. Although it does not make a difference for HTTP, delivering large payloads can add a few extra milliseconds, since sending the payload takes time. However, in my tests I found that almost the entire runtime is spent connecting to the host in the first place, so this is negligible.
  • For hostnames, I used the top 1 million websites according to Alexa and extracted a random 100k of them. This isn’t really a good simulation of a large diaspora network, but it’s the best approximation I can get. I also cannot provide the dataset I am using, since the top-domain list is now commercial property and making it public would be illegal. You should be able to find an old version archived on the internet, if you want to follow along.
  • Once again, one very important note: I assume that everything gets sent to everyone. Every post gets delivered to every pod, every like is delivered to every pod, and so on. In reality, this is highly unlikely, but it’s still a fair comparison, since I assume the same for both current federation and the relay.
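For reference, a rough Python analogue of that approach (the actual benchmark was a Rust program; `domains.txt` here is a placeholder for the extracted host list, and the concurrency value is an assumption):

```python
# Rough Python analogue of the benchmark described above (the original was Rust).
# Reads one hostname per line from a placeholder domains.txt and measures how long
# it takes to attempt a request to each host, with bounded concurrency.
import asyncio
import time
import aiohttp

CONCURRENCY = 50       # assumed value, mirrors a typhoeus-style concurrency limit
CONNECT_TIMEOUT = 5    # seconds, as in the test described above

async def hit(session, semaphore, host):
    async with semaphore:
        try:
            # GET instead of POST, as in the original benchmark; connection setup
            # dominates the runtime either way.
            async with session.get(f"http://{host}/") as response:
                await response.read()
        except Exception:
            pass  # timeouts and refused connections still count as attempted deliveries

async def main():
    with open("domains.txt") as f:
        hosts = [line.strip() for line in f if line.strip()]
    timeout = aiohttp.ClientTimeout(connect=CONNECT_TIMEOUT, total=30)
    semaphore = asyncio.Semaphore(CONCURRENCY)
    start = time.monotonic()
    async with aiohttp.ClientSession(timeout=timeout) as session:
        await asyncio.gather(*(hit(session, semaphore, h) for h in hosts))
    elapsed = time.monotonic() - start
    print(f"{len(hosts)} deliveries attempted in {elapsed:.0f}s "
          f"({len(hosts) / elapsed:.0f} requests/s)")

if __name__ == "__main__":
    asyncio.run(main())
```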

Preparations/environment:

  • My main benchmark system was a virtual machine with 500MB RAM and “1 virtual CPU core” (whatever that means, it’s kinda slow) that I bought for 3 euros per month. Probably one of the lower-end machines diaspora will ever run on.
  • For comparison, I also ran it on a Raspberry Pi 2. I should note that run times were almost identical, but that did not really surprise me: most of the time is spent waiting for the connection to be established, and there is nothing CPU-intensive involved at all.
  • I measured the average route length and found that almost 30% of my simulated pods had a route length of more than 10 hops. That’s mainly because my random set of domains has a very significant portion of Asian domains, and peering towards China is horrible.
  • Response times of these websites are surely not the same as response times of diaspora nodes. Some might be faster, some might be slower; I assume it averages out. To avoid issues with Chinese servers not responding to me, I set the connection timeout to five seconds. That’s shorter than you would have in a production system, but it matches the average response time in my sidekiq logs.
  • I ran the test 10 times to smooth out errors and to get a more realistic scenario. In the real world, pods would have DNS caches for most other pods, so that’s something I wanted to include. I took a simple average to get the final result.

Long story short: submitting payloads to 100k pods in my test took, on average, 416 seconds, if we only look at the time actually spent distributing payloads. As a base rate, we can say diaspora can handle roughly 865k requests per hour.

So what does this mean? For interactions that get sent to all pods (in reality, this would probably be mainly public posts, but as before, we have some disagreement on that, so let’s just call it “interaction” as a generic term), we actually do have a limit on what we can handle. With that processing time, pods would be limited to 8.6 interactions per hour before we get backlogged. Actually, that could work out for smaller pods (which they would probably be if there were that many pods), since they only have a few users. Even if we get backlogged, people do not spend 24 hours a day on diaspora, so we should be able to handle that. Incoming messages, as mentioned earlier, are way faster, so we can be somewhat certain this would be the maximum we can ever achieve. Reality is probably a bit different, and my result should not be taken as a test of whether diaspora can or cannot handle such traffic; it is meant as a comparison between our current system and a relay-based one.

Let’s call 100k pods with 8 interactions per hour each the limit of what native diaspora could “somewhat” handle right now. It won’t be perfect, but eh… we could be fine, maybe.

Let’s compare that with the relay. All pods would deliver all their stuff to the relay, as suggested. Assuming all pods deliver at their maximum rate, the relay would have to distribute ~800k payloads per hour to 100k-1 pods each. That means that, in total, you would somehow have to manage almost 80 billion (80 × 10^9, just to avoid any confusion) requests per hour.

We established that the main delay in sending payloads is caused by connecting to the host, so that is nothing a faster and better server can fix. With my Rust script as a delivery machine, you would end up with a whopping 92,000-plus hours of work for just one hour of network traffic. That’s more than ten years.
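To make the arithmetic explicit, a quick back-of-the-envelope check using only the figures above:

```python
# Back-of-the-envelope check of the relay scenario, using only the figures above.
requests_per_hour = 100_000 / 416 * 3600   # ≈ 865k deliveries/hour per server (benchmark)
relay_inbound_per_hour = 800_000           # ~8 interactions/hour from each of 100k pods
relay_outbound_per_hour = relay_inbound_per_hour * (100_000 - 1)  # ≈ 8 * 10^10 deliveries

hours_of_sending = relay_outbound_per_hour / requests_per_hour    # per hour of traffic
print(f"{relay_outbound_per_hour:.1e} deliveries/hour, "
      f"~{hours_of_sending:,.0f} hours of sending (~{hours_of_sending / 8766:.1f} years)")
# -> 8.0e+10 deliveries/hour, ~92,444 hours of sending (~10.5 years)
```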

Now, uh, okay, that’s… impossible.

To get rid of the delay caused by opening connections, you could think about a solution where each subscriber keeps a persistent connection to the relay, or the relay keeps the connections open itself (TCP has keepalive, as we all know). The thing is, you cannot really have more than 10k connections open on a single server. So, uh, you would need 10 servers just to keep the connections open, and “10 servers” is optimistic, since managing the sockets would use a lot of CPU, so it’s probably more like 20 or so. Not to mention that you would have to distribute the messages between those nodes somehow…

You can work around the C10K problem fairly well with asynchronous I/O. By “fairly”, I mean there have been people who managed 60k concurrent WebSocket connections on a very customized machine. However, those connections were idle, and you’d run into CPU bottlenecks if you send messages over those channels. In our example, there would be a lot of traffic on those channels, so based on previous projects, I think you could handle something like 5k connections per host, probably far fewer.

In the end, you would be building something similar to what the folks at Twitter and Facebook use to deliver their “realtime” streams (spoiler: they still have up to 15 minutes of delay, as you know if you have a lot of transatlantic friends whose messages and posts take forever to arrive), which means hundreds of servers distributed worldwide, connected via a private fibre network. Good luck building that. And, eh, you would end up with a federated system to handle the load for a relay you built because you thought a federated system could not handle the load.

Edit 1, 4:30 UTC:

I’d like to point out, just for the sake of pointing it out, that short-term caching of messages (for example, 15 minutes) followed by bulk delivery is a solution that may come to mind. In the outlined scenario, we’d deal with an average stream of 25 million (outgoing) messages per second, created by only ~250 messages per second inbound (actually more like 220, but I rounded for nicer numbers in the example above). Even when storing only the incoming payloads, if we assume ~10KiB per payload (which is accurate for posts, since signatures blow the size up a bit…) and decide to bulk-send every 15 minutes, we’d have to keep about 2.1GiB in a constant buffer. Now, that would actually be small enough to store in memory, but keep in mind the 25 million outgoing messages per second will still be there. Even if there is enough space in memory, memory bandwidth (yes, I know, we usually don’t think about such specs) becomes the bottleneck.
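For reference, that buffer estimate spelled out (same assumptions as above):

```python
# Quick check of the 15-minute buffer estimate, using the assumptions above.
messages_per_second = 250
buffer_seconds = 15 * 60
payload_kib = 10
buffer_gib = messages_per_second * buffer_seconds * payload_kib / (1024 * 1024)
print(f"{buffer_gib:.1f} GiB")  # -> 2.1 GiB
```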

Edit 1.1:

There actually was a nice case study back in 2014 by RabbitMQ (an insanely awesome message broker, built to deliver messages to subscribers in highly scalable environments). They managed to deliver one million messages per second. To achieve that, they had to use 32 servers (30 workers, 2 managing) with 8 CPU cores and 30GB RAM each. So, in theory, we would need around 750 servers to handle our traffic. Totally the scale of large social networks!

Edit 1.2:

I feel like I should say a bit more about that insanely high number of messages, which might confuse people if they read the aforementioned blog post and see that “Apple processes about 40 billion iMessages per day”, which comes down to only ~460k/second, much lower than our 25 million/second. One might think I made some math errors here.

When Apple delivers an iMessage, they get input from one node and have to deliver it to another node: 1 incoming, 1 outgoing. We would only be processing 250 incoming messages per second, but since we would have to deliver each message to 100k nodes, for every incoming message there will be 100k outgoing messages. And suddenly you have to deal with huge numbers.

[End of Edits]

Summary: No, relays do not scale better than pod-to-pod federation. In fact, the opposite is true.

There is no room for discussion here. This is a simple, provable fact.

2 Likes

The problem in your calculation is that it is based on the idea that a relay has to send content to all servers out there. I think that we can learn from the past and could do better.

I don’t know if you remember FidoNet. This was a mailbox network that was most popular in the 80s and 90s, and all communication was done via modem or ISDN. It would have been impossible for every mailbox to phone every other mailbox in the world, so they used a system that could be called a relay system as well.

You simply sent your stuff to your uplink. The uplink then looked at the nodelist and did not distribute the message directly to the destination server, but to the responsible uplink (possibly over several levels), which then sent the message to the mailbox.

For example: I have the address 2:2437/44.2 and want to send something. My message goes to 2:2437/44. That server checks whether the message belongs to another system directly connected to 2:2437/44. If not, it is sent to the system responsible for 2:2437. If the message is meant for some system under 2:2437, it is transferred to that system; in any other case it is sent to the system responsible for “2”.

What could this mean for a powerful relay system? You post to your local relay. It transfers the content to its directly connected systems and to its uplink relay. That uplink relay passes the content on to its other local relays and to its own uplink as well. And this can be repeated over many levels.
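A toy sketch of that tree-shaped fan-out (purely illustrative structure and names):

```python
# Toy sketch of the tree-shaped relay fan-out described above (illustrative only).
# Each relay knows its uplink and its directly connected downlinks, and forwards
# a payload to every neighbour except the one it received it from.

class Relay:
    def __init__(self, name, uplink=None):
        self.name = name
        self.uplink = uplink        # parent relay, None at the top of the tree
        self.downlinks = []         # child relays (or directly connected pods)
        if uplink:
            uplink.downlinks.append(self)

    def forward(self, payload, received_from=None):
        print(f"{self.name} delivers {payload!r} to its local pods")
        neighbours = self.downlinks + ([self.uplink] if self.uplink else [])
        for neighbour in neighbours:
            if neighbour is not received_from:
                neighbour.forward(payload, received_from=self)

# Three levels: one top relay, two regional relays, four local relays.
top = Relay("relay-top")
eu, na = Relay("relay-eu", top), Relay("relay-na", top)
for name, parent in [("relay-de", eu), ("relay-fr", eu), ("relay-us", na), ("relay-ca", na)]:
    Relay(name, parent)

# A pod posts to its local relay; the payload reaches every relay exactly once.
eu.downlinks[0].forward("public post with #kittens")
```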

Since every relay only has a few receivers, you don’t need that many TCP connections. And between the relays you could even pack single messages into packets to reduce the overhead per message - or the relay servers could use keepalive to keep the connection open - since a relay uplink would possibly speak to only 100 servers or so.

Doing so, you could easily scale the whole traffic to a level where sending is no longer the problem - receiving is.

I did indicate that, yes. I assumed it because it was both the initial proposal and the current implementation, given that you set the scope to “all” in the config or publish to a tag like #diaspora with a huge reach. Even for lesser-known tags, you do not solve the issue; you just make the message flow less predictable, which is even more dangerous.

Thanks for outlining FidoNet. I knew of its existence, but never looked into the implementation details. Basically, what you described is just the way IP routing works: a client has an address, knows what its “own network” is, and has a fallback route for everything else.

And yes, one could build a relay system that way, but you would have to design a system that works, that works reliably, that makes error tracking possible; and in the end you’d still create bottlenecks, and you’d probably end up with a “which relay should I connect my pod to?” kind of issue. Diaspora nodes do not have hierarchical addresses, and they are not supposed to, so even with such a thing, one would still have to distribute a “routing table”, pods would need to pick a prefix, and… urgh.

If we take the RabbitMQ example as a baseline, you could probably solve the issue by having 50 or so nodes in a tree structure, as you suggested, but somebody still has to design, implement, and maintain that. Also, 50 or so people have to run relays. And all of that is only needed to handle traffic diaspora is able to handle itself anyway. Which leads to the conclusion: why bother? As mentioned earlier, you would end up building a federated system to avoid using a federated system.

Don’t get me wrong, I’m not opposed to having a relay as a supporting service, to relay posts (and posts only) based on something like the idea I outlined earlier in this thread. It could also be cool to distribute public profiles via the relay to increase their discoverability, or maybe we could use a relay thingy mainly as a node for discovering other pods. Who knows. But suggesting we throw all posts and all interactions at a relay and saying “this scales better, you can always optimize and scale the relay to make it work” isn’t a solution.

We can do better with the system we have right now; there is no need to replace it, since it looks like it can scale up quite a bit. I mean, it would be cool to hit a limit where we have so many pods we simply can’t keep up and have to think about something less… meshed, but if I compare the probability of us hitting that cap with the probability of us running into issues caused by a design mistake in a relay, the situation is pretty clear.

Fair enough. Your example is simplified, but I’ll take it. It does leave out that the server handling the deliveries also has to do other things, like serving web requests and handling database connections. Most servers will also run their own database, redis, and other services, plus a bunch of other web apps. A server doing the delivery you highlight in your example would, in the end, really just be a relay, since it would be locked out of doing anything else. It’s fun talking about high-performance networking when 99% of nodes are run by hobbyists.

Of course the numbers highlighted are ultra-unrealistic, as was 10K, seeing that the network has not grown at all in the last two years. Thus, it is probable you are correct that nodes can deliver all that they need to.

What I don’t understand, then: if you think nodes have all this unused power, why not just replace the whole relay with nodes doing the job themselves? Why deliver only interactions directly? You could easily make all nodes poll all other nodes for subscription preferences and deliver posts directly. This would be more efficient in terms of delivery (since right now we sometimes get double deliveries), but it would increase the amount of work for the nodes themselves. According to you this shouldn’t be a problem, so I think that would make much more sense than having a relay at all.

1 Like

I guess we don’t really disagree that much. I wouldn’t want to replace the “server to server” communication with a relay based one.

The approach I described would only be an enhancement of the current relay system, one that helps small servers get content for followed tags and helps people discover interesting people in the public posts.

I also like your idea of possibly relaying profiles that way. This would really help people find each other.

Concerning the routing: that would only have to be handled by the relay admins. A server admin would choose a relay like a user chooses a diaspora server, and when a relay is full, the admin could close registration.

The good thing about my idea is that further levels of distribution would only be added when we actually need them.

That’s already possible. At least you can enable this in the config file.

Let me just quote @jaywink here.


New pods don’t know any other pods, so servers like relays help with discovering other pods in the network. With our current implementation it should be “easy” to subscribe to a post (basically “Hey there, could you add me to your list of subscribers and send me all interactions?”), but asking a pod to send all public posts (maybe only those that contain certain hashtags) is something completely new. And you would still want to have relays for things like pod discovery.

1 Like

@denschub I’m actually interested in that: would it be possible to run your test once more on a server that is actually running a diaspora* instance? Just out of curiosity.

I would like to see real live data about the outbound speed when transmitting posts (comments, likes, …)

I did some work for Friendica on speed improvements when delivering content. My system is extremely slow in that regard, but fast systems manage about 500 to 1,000 messages per minute.

I feel like we’re blocking ourselves on a problem that is quite far away in the future. This discussion is technically very interesting, and good choices have to be made to avoid being stuck later. Nevertheless, the network is not growing at the moment. Let’s move on by improving the federation to allow public post pulling, with interactions and tag following; we have time to see a possible performance bottleneck coming.

5 Likes

I agree with this. Sad to see another technical discussion turn unnecessarily sour.

2 Likes

I agree, and I deleted the three off-topic comments here (@denschub already wrote that it’s OK to do so, and since @jaywink extracted the responses to them into their own comments, I think it’s OK too).

The server has enough resources to do this in parallel. Sending doesn’t block the server; it is mostly waiting on the connections. It all happens in the background, and sending doesn’t need any database queries.

Incoming federation would be a much bigger problem, because it uses the same unicorn workers that are also used for the frontend. So when all workers are busy receiving messages, there is nothing left to serve the frontend.

Nodes don’t know all other nodes; that’s why we currently need the relays. If we find a better/cleaner solution for that, we can replace the relays.

As I already wrote: we should keep track of interactions anyway, because relays are not the only way a post can find its way to another pod. And when we handle interactions/retractions directly within diaspora, we don’t need to do it via the relays in addition.

And the tests from @denschub showed that we won’t run into trouble with sending messages anytime soon (we will probably have trouble with something else before that), so thanks for the tests.

4 Likes