Allow Stream Creation from Ingestors in Distributed Deployments #825
By the above do you mean, checking that:
Is there a way to run the binary in query mode with a local store? I see
Ideally, I'd like to be able to run the distributed setup on my machine to ensure a faster feedback loop.
You'll need to run a MinIO instance locally and then run Parseable with s3-store: https://min.io/docs/minio/linux/operations/install-deploy-manage/deploy-minio-single-node-single-drive.html#download-the-minio-server
Problems
Query Server Getting Schema

Lazy Creation
The query server can lazily create the schema and metadata for a stream on demand if they don't exist, the same way an ingestion server does now.
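Roughly, the lazy path could look like the sketch below; the `StreamMetadata`, `ObjectStore`, and `QueryServer` types are hypothetical stand-ins, not Parseable's actual internals:

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical stand-in for Parseable's per-stream schema/metadata.
#[derive(Clone, Default)]
struct StreamMetadata { /* schema fields, created_at, ... */ }

/// Hypothetical view of the object store.
trait ObjectStore {
    /// Fetch metadata for `stream` from the bucket, if an ingestor already created it.
    fn fetch_metadata(&self, stream: &str) -> Option<StreamMetadata>;
}

struct QueryServer<S: ObjectStore> {
    store: S,
    streams: RwLock<HashMap<String, StreamMetadata>>,
}

impl<S: ObjectStore> QueryServer<S> {
    /// Lazily resolve a stream on first query: check memory, then object
    /// storage, and only create a fresh entry if neither has it.
    fn get_or_create_stream(&self, name: &str) -> StreamMetadata {
        if let Some(meta) = self.streams.read().unwrap().get(name) {
            return meta.clone();
        }
        let meta = self.store.fetch_metadata(name).unwrap_or_default();
        self.streams
            .write()
            .unwrap()
            .insert(name.to_string(), meta.clone());
        meta
    }
}
```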
Regular Scans
Have the query server regularly scan the relevant bucket to get the diff of streams created, and create the schema & metadata files for itself that aren't already present. (Discussed on call)
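A rough sketch of such a scan loop, assuming a tokio runtime and a hypothetical `list_streams_in_bucket` helper (the 60-second interval is arbitrary):

```rust
use std::collections::HashSet;
use std::time::Duration;

/// Hypothetical helper listing stream prefixes currently in the bucket.
fn list_streams_in_bucket() -> HashSet<String> {
    HashSet::new() // placeholder
}

/// Periodically diff the bucket against what the query server already knows,
/// and register any streams created by ingestors in the meantime.
async fn scan_loop(known: &mut HashSet<String>) {
    loop {
        let in_bucket = list_streams_in_bucket();
        for stream in in_bucket.difference(known).cloned().collect::<Vec<_>>() {
            // load the stream's schema & metadata into memory here
            known.insert(stream);
        }
        tokio::time::sleep(Duration::from_secs(60)).await;
    }
}
```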
Event Mechanism
Set up AWS (as S3 seems to be the only object store supported) to send an event via SNS when a new stream is created so it becomes a reactive system. Linux fs can use
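On the S3 side, the consumer would parse the standard S3 event notification payload and register the new stream. A minimal sketch using `serde_json`; the key layout (`<stream>/...`) and the `register` callback are assumptions:

```rust
use serde_json::Value;

/// Hypothetical handler for an S3 event notification (delivered via SNS/SQS).
/// Extracts the stream name from the object key of a newly created metadata
/// file and hands it to `register` for the in-memory map.
fn handle_s3_event(
    event_body: &str,
    register: &mut dyn FnMut(String),
) -> serde_json::Result<()> {
    let event: Value = serde_json::from_str(event_body)?;
    if let Some(records) = event["Records"].as_array() {
        for record in records {
            if let Some(key) = record["s3"]["object"]["key"].as_str() {
                // assume keys look like "<stream>/..."; the layout is illustrative
                if let Some(stream) = key.split('/').next() {
                    register(stream.to_string());
                }
            }
        }
    }
    Ok(())
}
```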
(Side note: Does the query server even need to create its own schema & metadata files in object storage? From what I can see the schema files are identical & the metadata files are the same except for additional fields on the ingestion server's metadata file. So, why not just load directly from that into the query server's in-memory map and not create additional files in object storage?)

Ingestion Servers Creating Schema

Redirect To Query Server
The simplest solution to the problem is to have ingestion servers redirect the request to the query server. From the docs I can see that the architecture expects a single query server which doubles as a leader, so why not just use it as a leader? I know we discussed on the call about keeping them de-coupled, but if it's designated as the leader of the setup, why not use it as that? At Tensorlake we had a similar approach with ingestion servers and coordinators.
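A sketch of the forwarding path using `reqwest`; the query server URL would come from configuration, and the `PUT /api/v1/logstream/{name}` path mirrors the documented stream-creation endpoint but is treated here as an assumption:

```rust
use reqwest::Client;

/// On an ingestion server, forward a stream-creation request to the query
/// server (acting as the de facto leader) instead of writing the schema and
/// metadata to object storage directly.
async fn forward_stream_creation(
    client: &Client,
    query_server_url: &str,
    stream_name: &str,
) -> Result<(), reqwest::Error> {
    client
        .put(format!("{query_server_url}/api/v1/logstream/{stream_name}"))
        .send()
        .await?
        .error_for_status()?;
    Ok(())
}
```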
S3 Strong Consistency For Locking
I came across this answer on SO and it tickled my brain. It uses S3's recent strong consistency to place a lock file and do a List-after-Put to determine whether the writer that placed the lock file has acquired the lock. This guarantees that only a single ingestion server will create a stream. All other ingestion servers will be expected to remove their locks.
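A rough sketch of the List-after-Put idea with `aws-sdk-s3` (recent SDK versions); the lock key layout and the "lexicographically smallest lock wins" rule are assumptions for illustration, not an existing Parseable mechanism:

```rust
use aws_sdk_s3::primitives::ByteStream;
use aws_sdk_s3::Client;

/// Try to acquire a per-stream creation lock: write a lock object named after
/// this node, then list all lock objects and treat the lexicographically
/// smallest key as the winner. Relies on S3's strong read-after-write
/// consistency for newly written objects.
async fn try_acquire_stream_lock(
    client: &Client,
    bucket: &str,
    stream: &str,
    node_id: &str,
) -> Result<bool, aws_sdk_s3::Error> {
    let my_lock = format!("{stream}/.locks/{node_id}");
    client
        .put_object()
        .bucket(bucket)
        .key(&my_lock)
        .body(ByteStream::from_static(b"lock"))
        .send()
        .await?;

    let listing = client
        .list_objects_v2()
        .bucket(bucket)
        .prefix(format!("{stream}/.locks/"))
        .send()
        .await?;

    let winner = listing
        .contents()
        .iter()
        .filter_map(|obj| obj.key())
        .min()
        .map(|key| key.to_string());

    // Only the winner creates the stream; everyone else removes their lock and backs off.
    Ok(winner.as_deref() == Some(my_lock.as_str()))
}
```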
Designated Leader/Writer
Have a designated leader/writer among the ingestion servers. All writes must be done via the leader. The simple end of this is to have a config file demarcating one server as the writer, leaving it up to the operator to ensure uptime and a backup for the leader/writer. The complex end of this is a Raft cluster of ingestion servers (overkill).
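The simple end of this option could be as small as a config field naming the writer, e.g. with `serde`; the field names here are made up for illustration:

```rust
use serde::Deserialize;

/// Hypothetical deployment config marking one ingestion server as the sole
/// stream-creation writer; all other nodes refuse or forward creation requests.
#[derive(Deserialize)]
struct ClusterConfig {
    /// Node id of the designated writer, e.g. "ingestor-0".
    writer_node: String,
    /// This node's own id.
    node_id: String,
}

impl ClusterConfig {
    fn is_writer(&self) -> bool {
        self.node_id == self.writer_node
    }
}
```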
Registry Service
Run a separate, dedicated registry service which will be responsible for stream creation. This can be made strongly consistent via sync mechanisms.
(Side note: Does it matter if the last writer wins with regards to stream creation from concurrent ingestion servers? That seems like a simple approach, and then this concurrency problem goes away.)

Conclusion
I favor the approach of forwarding the requests from the ingestion servers to the query server for the reasons I've outlined above. Query servers can create/fetch schemas lazily and store them in internal memory. It's minimal code changes and achieves the outcome. Alternatively, "locking" via S3 would work for multiple concurrent ingestion server writers if connecting to the query server will not work.
@redixhumayun Thank you for this well-articulated and wonderful write-up.
Do let us know if you need a follow-up call to discuss further. Thanks! CC: @nitisht
@nikhilsinhaparseable The proposed approach works and considers minimizing the changes to the current setup. Also, there might be other constraints I am not aware of. But, if there is a way to invest time/effort in a registry service (or metadata service), it will be beneficial in the long run IMHO. A single-node registry service with a fallback node can still handle a large cluster. Also, given this is the APM space, if there is a necessity for a multi-node registry service, it can be designed as eventually consistent (to maximize throughput) as opposed to something strongly consistent like Raft.
Current Behaviour -
In a distributed deployment, stream creation is allowed only from the querier node.
At the time of ingestion, ingestors check if the stream has been created in storage; if yes, they sync it to memory.
Enhancement Required -
Allow stream creation from the ingestor node via the Ingestion API, so that the user does not have to call the stream creation endpoint on the query node and then the ingest endpoint on the ingestor node for ingestion (see the sketch after the doc link below).
Then, verify that the querier node and other ingestor nodes are able to sync the new stream into memory and that all functionality works well.
Doc link for reference: https://www.parseable.com/docs/api/log-ingestion
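A hedged sketch of the enhancement: the ingest handler on an ingestor node creates the stream if it is not found in storage, instead of rejecting the request. The handler and type names are illustrative, not Parseable's actual code:

```rust
/// Hypothetical error type for the ingest path.
enum IngestError {
    StreamCreationFailed,
}

/// Hypothetical stand-in for an ingestor node's storage and memory handles.
struct Ingestor;

impl Ingestor {
    fn stream_exists_in_storage(&self, _stream: &str) -> bool { false }
    fn create_stream(&self, _stream: &str) -> Result<(), IngestError> { Ok(()) }
    fn sync_to_memory(&self, _stream: &str) {}
    fn write_events(&self, _stream: &str, _body: &[u8]) {}

    /// Ingest API entry point: create the stream on first ingest so the user
    /// never has to call the query node's creation endpoint beforehand.
    fn ingest(&self, stream: &str, body: &[u8]) -> Result<(), IngestError> {
        if !self.stream_exists_in_storage(stream) {
            self.create_stream(stream)?; // new behaviour: create from the ingestor
        }
        self.sync_to_memory(stream);
        self.write_events(stream, body);
        Ok(())
    }
}
```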