Clustering: Add clustering support with ToCloud9 integration. #504
Conversation
Implement overriding of configuration from the .conf file with environment variables. Environment variable keys are autogenerated based on the keys defined in the .conf file. Usage example:

    $ export CM_DATA_DIR=/usr
    $ CM_WORLD_SERVER_PORT=8080 ./mangosd

The new env key format: Mangosd_Rate_Health, Mangosd_DataDir
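As a rough illustration of the idea (not the PR's actual implementation), the env key for a .conf entry could be derived by prefixing the binary name and replacing dots with underscores; the function names and the "Mangosd" prefix below are hypothetical:

```cpp
#include <cstdlib>
#include <string>

// Hypothetical sketch: derive an environment variable key from a .conf key
// by prefixing a name and replacing '.' with '_',
// e.g. "Rate.Health" -> "Mangosd_Rate_Health".
std::string EnvKeyFromConfKey(const std::string& prefix, const std::string& confKey)
{
    std::string key = prefix + "_" + confKey;
    for (char& c : key)
        if (c == '.')
            c = '_';
    return key;
}

// Return the environment override if set, otherwise the value read from the
// .conf file. This mirrors the override behaviour described above.
std::string GetConfigValue(const std::string& prefix, const std::string& confKey,
                           const std::string& fileValue)
{
    if (const char* env = std::getenv(EnvKeyFromConfKey(prefix, confKey).c_str()))
        return env;
    return fileValue;
}
```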
I'll edit this comment as I go along; first question I have:

    player->SaveToDB();
    // TODO: unfortunately mangos doesn't provide API to track saving progress :(
We have async DB queries; we can probably make some changes to allow SaveToDB to run a callback on completion.
Yeah, it would be nice, but at the moment there is a workaround: walkline/ToCloud9@fd906b5#diff-cf09120953fd248dcecc81b97b5a684f7ecba5ec076f31f6cf776dac99c9b27cR130
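A minimal sketch of the completion-callback idea discussed above, assuming a save request could carry a hook that the async DB layer invokes once the write finishes; SaveRequest and ProcessSaveRequest are hypothetical names, not part of the core API:

```cpp
#include <functional>

// Hypothetical sketch: a save request carrying a completion callback that
// the async DB query layer could invoke once the transaction commits.
struct SaveRequest
{
    std::function<bool()> doSave;         // performs the actual DB write
    std::function<void(bool)> onComplete; // invoked with the save result
};

void ProcessSaveRequest(const SaveRequest& req)
{
    // In a real core this would run on the DB worker thread after the async
    // transaction commits; here it runs inline for illustration only.
    bool ok = req.doSave();
    req.onComplete(ok);
}
```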
This warrants a longer discussion, but I will try to give at least a brief response. First I need to state that I am inherently against clustering in cmangos in principle, due to the debugging overhead and complication it introduces until most systems are finalized (which they aren't by any means). Luckily for everyone involved, I can be outvoted. Now that that is out of the way: we know the proper official clustering structure we are supposed to follow, and sadly, this is not it. The official structure is this: (diagram: login server, session server, realm server, world server, and instance servers)
To achieve this, a bigger restructuring of the codebase is needed, and it is actually something I have been working towards, thread-context-wise, for several years now. And it would be insane to maintain it as an open-source project where I am doing 50% of the core work on my own (hence the hard pass from me). @cyberium and I have been considering adding a protobuf link between realmd and mangosd for a long time, possibly even gRPC + protobuf, but as an initial step just to send session keys instead of doing it through the DB. However, that is an incredibly scaled-down version of this proposal.

And now onto the PR. It contains some ARM stuff that should probably be separate. Appending packets on top of the existing packet handlers is heresy; it would make maintaining several versions difficult. Each communication space needs to have its own opcode set to keep the blizzlike stuff perfectly pristine. (Unlike Acore, this is a reverse-engineering project with that as its #1 priority, not a commercial core.) Lastly, there is a lot of stuff that this PR ignores, like for example the group code, which needs to be properly re-engineered to be thread-safe to begin with.

That being said, your work is commendable to pull this off, and don't let my negativity discourage you from what you are trying to achieve.
Just out of curiosity, in this structure, what does each of the components do?
Login server, I guess, is what realmd currently does, and world server + instance server is what mangosd is doing, but what do the Session and Realm servers do? Also: boo gRPC >:(
Congrats on this awesome work and the results! However, I am also negative on this PR. Your work may benefit those who want it now, or even those who want to rewrite it in C++ using more of what we have in mind as a microservices architecture. So to be clear, I do not expect this form of the PR to be merged any time soon. It would require a complete rethink that, I suppose, you are not ready for.
Session server would hold our object (session), and all the other nodes would send it packets to be forwarded to the client. (My LFG code in WotLK has to emulate this because the session is not thread-safely accessible from the async LFG thread.) Realm is our World object, World is our Map object, and instance is also our Map objects, but those of type Instanceable (not static world maps like Kalimdor, EK, Outland, Northrend and the DK zones).
Ahhh, OK. So the Session server is basically the router / main interface to the internet, and then the Realm server is basically the singleton for GUIDs/consistency etc.?
I think gRPC is the first option that comes to mind when you think about synchronous communication in distributed systems (not over the internet).
That doesn't necessarily mean that we should follow it.
Yeah, I agree that this PR is messy, but the goal of this PR is not to be merged, but to give you an idea of how the integration works. Nonetheless, I agree that changes like the ARM changes don't help with that, sorry 😅
You're referring to these changes: walkline@23d8a1a#diff-1c6279373626ee45cc739c22846da733265493a1f45f702d59140bae1f3d3abcR1349 ?
Indeed, some implementations are a little bit naive and don't cover all cases, like groups (also, I didn't touch instance binding/saves; the code is completely different from TC). But I don't think that a big re-engineering of groups is required to make it work. Also, please don't forget that the goal is to make this clustering optional (with a config or CMake flag option). Making it optional gives you the ability to not care much about clustering, since >90% of users will continue to use the current setup.
Well, currently the best-case scenario for our infrastructure is a high-cache, medium-core-count bare-metal CPU, like, let's say, a 5950X, which will make it run better than any clustering (without rewrites) or any cloud provider. (We are single-thread bound, not multithreading bound.)

And here is the thing: I run my own server on vengeance, but the things I want on vengeance are kept in a separate repo, because this project is dedicated exactly to recreating the original vision of the game with its own bugs. For example, having LFG clustered like official did reproduces the exact level-up bugs Blizzard had. I have no problem with someone patching them in their own repo in 5 minutes; however, this is a blizzlike project, not a commercial venture, as I said. So on this part, even if I agreed with you on the point of this being good for server hosting (which I do not, and I strongly disagree with even the leader of Acore on this), that still would not mean I would want it in cmangos. (Something something purity of principle.) (Also, note to insunaa: gRPC can run on HTTP/3 (QUIC), which is even more efficient.)
@cyberium Thanks for the feedback.
@killerwife That’s true, the current setup works fine for most users. However, there are several things to consider:
Depending on the person, it can bring some new knowledge and fun (maybe this is only relevant for me 😁). But I understand your concerns regarding maintenance and complexity (maybe with a bit of an overreaction), and since there are no maintainers interested in this (which is totally fine), I think it's better to close this PR (though the discussion may continue). Anyway, if someone is interested in this (or changes their mind), I will be glad to discuss/cooperate.
Thank you for your understanding and for considering the feedback. While we may have different views on the architecture and direction of the project, I truly appreciate your contribution and the effort you've put into this PR. Although this specific PR may not align with our current plans for the repository, your insights and suggestions are valuable for our future development efforts. We're always open to hearing your thoughts and ideas, and I hope we can continue to benefit from your expertise and contributions moving forward. Thank you again for your work on this, and I look forward to collaborating with you on future endeavors. Best regards
🍰 Pullrequest
The purpose of this PR is to evaluate possible required changes to the core and discuss possible integration with ToCloud9 project. The changes don't provide full clustering support and would probably need refactoring, possibly with hiding the implementation behind a CMake flag.
A demo with these changes is available here: https://www.youtube.com/watch?v=XnehuGA7Jjc
So basically, TC9 integration brings a bunch of microservices (currently 9) that run alongside mangosd and enable clustering support, allowing the distribution of players between several mangosd instances.
How does everything work? In short, microservices (and mangosd using libsidecar) use gRPC + NATS to communicate with each other. Players connect to game-load-balancer(s) instead of directly to mangosd. This game-load-balancer offloads encryption, analyzes packets, and in some cases, uses other services to handle packets or sends raw packets to mangosd via a TCP connection (the same connection that player <-> mangosd uses without clustering).
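The routing decision described above can be sketched as follows. This is a simplification under the assumption that the balancer keeps a set of opcodes it intercepts for sidecar services; the names and opcode values are placeholders, not ToCloud9's actual code:

```cpp
#include <cstdint>
#include <set>

// Hypothetical sketch of the choice a game-load-balancer makes after
// decrypting a packet header: a small set of opcodes is handled via
// microservices, everything else is forwarded untouched to mangosd over
// the existing TCP connection.
enum class Route { HandleViaService, ForwardToWorldServer };

Route RoutePacket(uint16_t opcode, const std::set<uint16_t>& interceptedOpcodes)
{
    if (interceptedOpcodes.count(opcode))
        return Route::HandleViaService;   // e.g. handled by a sidecar service
    return Route::ForwardToWorldServer;   // raw bytes relayed to mangosd
}
```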
There is a lot to tell about the architecture, but unfortunately, at the moment, there is only this reading (skip to the "ToCloud9 Architecture" section) that provides some architecture details (sorry, I'm a bad writer). But feel free to ask any questions.
How2Test
The installation guide will be available later.
Windows is not supported (only via WSL).
You can still build the project and use it on Windows with clustering mode disabled, as you do now. However, once you enable it in the config, you will receive a corresponding error message.
But you can still use WSL/Docker/etc. to run it.
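The startup behaviour described above could look roughly like this sketch; the function name, parameters, and error message are hypothetical, not the PR's actual code:

```cpp
#include <string>

// Hypothetical sketch: reject clustering mode at startup on unsupported
// platforms while leaving the default (non-clustered) path untouched.
bool ValidateClusteringConfig(bool clusteringEnabled, bool platformSupported,
                              std::string& error)
{
    if (clusteringEnabled && !platformSupported)
    {
        error = "Clustering mode is not supported on this platform; "
                "disable it in the config or run under WSL/Docker.";
        return false; // startup aborts with the error message above
    }
    return true;      // non-clustered setups are unaffected
}
```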
How to add support for Windows?
Three options are available: