Before I start, let's setup a sample use case for my question. Our chat application is rapidly growing in users and we have to expand in running instances of our server application. Language implementation is irrelevant but is likely done in Java or C++.
Having one server accepting all clients and sending messages to users is very easily. You can basically have a pool of users and then send the message to the right connected users. Works great. However like I said, the userbase is growing rapidly and we're forced to expand. Then let's say that we make our server aware of other server nodes. So you'd now have node1 and node2. Connected to each other and communicating with each other using their own protocol ontop of TCP/IP.
Node1 and node2 can ask each other if a certain user is connected, if yes, then the message will be forwarded to the other node so the message will end up at the right user if the user can not be found on its own node. Works perfect and almost similar to EMQ has implemented with no single point of failure.
Now let's say that we're growing even bigger to the size of Facebook. Not realistic but let's say we do. Obviously we'd have many experts in our team that do know how to implement and design this architecture and would probably love to explain it to me but that's not the case now.
We decided to setup clusters of nodes. For example, each cluster has like 20 nodes. Each node is aware of each other within its own cluster. Let's say we have three clusters and 'Fred' is connected to cluster1 and 'Fish' is connected to cluster2.
Now Fred sends a message to Fish. How does cluster1 know about cluster2 and it's users?
I came up with a few things. 1) I could create a redis instance and store each user session in there with the cluster and node that the user is connected with and then send the message to the right cluster and node OR 2) ask each cluster if one of its nodes has a user so called Fish OR 3) store the message in some database and poll if there are incoming messages from other clusters.
I like option 1 but I'm not certain if that's the right thing to do. Option 2 sounds a little heavy for load balancers of clusters. Option 3 doesn't quite suite the use case because we wish to have this message deliverd in real time.
I suggest using gateways for user connections. These gateways don't your messaging channels but know how to find them.
You also have node which hold the channels, they only talk to gateways.
Each client can connect to one gateway. Each gateway has a connection to every channel server. This way, one client connection can access the channels across any number of servers. If you have really high internal fan out you can have a reliable UDP protocol to broadcast the message.
This NxM arrangement allows you to scale both client connections and channels.
BTW This is how most financial exchange work. These are design for sub-milli-second latencies and can have very high message rates e.g. 10 Million per second.