Collect & visualise spark-stats API usage based on access tokens #206
Comments
Are we currently using any tools for monitoring and collecting logs? Also, what service is responsible for managing API keys and setting rate limits? |
Logs are being collected in Papertrail, which I believe Miro gave you access to. I believe at the moment API keys are only parsed in Cloudflare, @bajtos correct me if I'm wrong. |
I don't have access to Papertrail yet. |
I believe if we want to track both cached and non-cached requests we'd have to use cloudflare workers. In that case we'd have two options:
|
Instead of pushing logs to papertrail, where we then would parse them (I assume), can we make the cloudflare worker submit information about the request into a DB, via an HTTP API that we create? It could also feed this data into InfluxDB, for time series processing |
Yeah, I would also prefer something like that over Papertrail. It will be much easier to aggregate and visualize the data. |
Cool, let's either add a route to |
Can we put the new route to |
Writing from Cloudflare worker directly to InfluxDB may be the simplest option, as we don't need to write any new REST API 💪🏻 |
Good point, yes should be in |
It also has the advantage of automatic retention policies |
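For illustration, here is a minimal sketch of what writing a usage point from a worker to InfluxDB could look like. It assumes the InfluxDB v2 HTTP write endpoint (`/api/v2/write`) with token authentication. The measurement name, tag name, field name and config values are hypothetical and not taken from the actual worker. Escaping is simplified: the measurement name is assumed to contain no special characters.

```typescript
// Hypothetical shapes: tag and field names are illustrative only.
type Tags = Record<string, string>
type Fields = Record<string, number>

interface InfluxConfig {
  baseUrl: string // assumed base URL of the InfluxDB instance
  org: string
  bucket: string
  token: string
}

// Commas, equals signs and spaces are special in line protocol tag keys/values.
function escapeTag(value: string): string {
  return value.replace(/([,= ])/g, '\\$1')
}

// Serialise one point as: measurement,tag=value field=value timestamp
function toLineProtocol(
  measurement: string,
  tags: Tags,
  fields: Fields,
  timestampMs: number
): string {
  const tagPart = Object.keys(tags)
    .sort()
    .map((key) => `,${escapeTag(key)}=${escapeTag(tags[key])}`)
    .join('')
  const fieldPart = Object.keys(fields)
    .map((key) => `${escapeTag(key)}=${fields[key]}`)
    .join(',')
  return `${measurement}${tagPart} ${fieldPart} ${timestampMs}`
}

// Describe the HTTP request; a worker would pass `url` and `init` to fetch().
function buildWriteRequest(config: InfluxConfig, line: string) {
  const query =
    `org=${encodeURIComponent(config.org)}` +
    `&bucket=${encodeURIComponent(config.bucket)}` +
    '&precision=ms'
  return {
    url: `${config.baseUrl}/api/v2/write?${query}`,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Token ${config.token}`,
        'Content-Type': 'text/plain; charset=utf-8'
      },
      body: line
    }
  }
}
```

In a worker this would be used roughly as `const { url, init } = buildWriteRequest(config, line); await fetch(url, init)`. Storing a hashed or shortened token identifier in the tag, rather than the raw token, is probably the safer choice.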
Do we already have InfluxDB deployed? If so I'd appreciate access to the instance 🙏🏻 |
Invite sent 👌 |
I have deployed a Cloudflare worker that should report data to InfluxDB, but I have yet to configure it properly. I am proposing adding another CNAME record (i.e. After we make sure that the Cloudflare worker is working correctly, we can set up another CNAME record (i.e. |
This would be a kind of staging environment, right? We can ourselves make requests against that endpoint, and verify they end up in InfluxDB correctly. As long as the Cloudflare worker doesn't negatively affect requests, I don't see an immediate problem with just using it in prod. Can we just make all requests to api.filspark.com go through your worker? |
Yeah, it's kind of a separate environment, but in this case I didn't use Cloudflare worker environments. Generally it shouldn't affect request performance, as metrics are reported after the request has been executed. I have created a repository, which is itself a fork of another repository that I used as a starter template. In the end I didn't use much of the code from that repository. I am wondering whether we should fork my repository into the organization, or whether I could create a new repository and push the code there. Given that it's an MIT license, I guess it isn't legally binding even if we copy code into a new repo? |
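To illustrate the "metrics are reported after the request has been executed" point, here is a minimal sketch of the pattern, not the code from the repository mentioned above. It assumes the worker's execution context exposes `waitUntil`, as Cloudflare Workers do. `respondThenReport`, `forward` and `report` are hypothetical names, and the types are kept generic so the sketch runs outside the Workers runtime.

```typescript
// Minimal structural type for the part of the execution context we need.
interface WaitUntilContext {
  waitUntil(promise: Promise<unknown>): void
}

// Forward the request first, then schedule reporting without awaiting it,
// so the client never waits for the metrics write.
async function respondThenReport<Req, Res>(
  request: Req,
  ctx: WaitUntilContext,
  forward: (request: Req) => Promise<Res>,
  report: (request: Req, response: Res) => Promise<void>
): Promise<Res> {
  const response = await forward(request)
  // Swallow reporting errors: a failed metrics write must not break the API.
  ctx.waitUntil(report(request, response).catch(() => undefined))
  return response
}
```

With this shape, a slow or failing metrics backend delays only the background task, not the response returned to the API consumer.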
@juliangruber I've tried adding new CNAME records, but it seems like there's some issue with TLS certificates (I get error 525 when trying to send a request). Are the certificates managed by Cloudflare, or is that done outside of Cloudflare? |
Ok, let's just attach the worker to the production route then (with manual monitoring that nothing goes wrong), for simplicity.
Are you able to create new repos in the Filecoin-station org? I think we can then transfer your repository into it. Yes we could just copy the code over, but forks are convenient because you see where the code came from, and that makes it easier to pull in upstream changes. |
Can we use the existing domain name instead? |
Yes we could do that, I'm only worried that we're going to miss the cache. I've switched the worker for a few minutes to |
Does this mean the API cache will be inactive with workers, or that it will be invalidated for every worker update? |
If the |
I don't understand. We want spark-stats to stay proxied, so that we can use Cloudflare's caching behavior. We want that proxy to also submit usage data to the API, through a Cloudflare worker. What am I missing? |
Got it, I wasn't aware that the worker needs to sit in front of the proxy. This makes sense then. Your graphic helped clarify this 👏 I will take a look at the certificate setup and get back to you |
The problem is that the fly.io deployment requires the hostname to be |
Thinking about this more, are you sure this is the right way to go about things? I.e., is it a documented pattern to put the worker in front of the proxy? Or are there alternatives, like proxying from inside the worker? |
FYI, https://stats-api.filspark.com works now |
We could proxy inside the worker, sure. I guess it would also be more appropriate and would require far fewer DNS resolutions. I'll see how to use a basic Cloudflare cache setup with workers. |
@juliangruber I have added caching inside the Cloudflare worker here. With that, I guess there's no need for additional CNAME records, so I'm going to delete them. I've yet to add tests and alter the docs, but it has been deployed to Cloudflare and seems to be working. |
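As a rough illustration of caching inside a worker, here is a sketch of a cache-then-origin lookup. It is not the linked implementation. It assumes a cache object with `match` and `put` methods, which is the shape of `caches.default` in Cloudflare Workers. `fetchThroughCache`, `fetchOrigin` and the `cacheHit` flag are hypothetical, and the types are generic so the logic can be exercised without the Workers runtime.

```typescript
// Structural types covering only what the sketch needs.
interface WaitUntilContext {
  waitUntil(promise: Promise<unknown>): void
}

interface CacheLike<Req, Res> {
  match(request: Req): Promise<Res | undefined>
  put(request: Req, response: Res): Promise<void>
}

// Serve from cache when possible; otherwise hit the origin and store a clone.
// The cacheHit flag lets the caller report cached and non-cached requests alike.
async function fetchThroughCache<Req, Res extends { clone(): Res }>(
  request: Req,
  cache: CacheLike<Req, Res>,
  ctx: WaitUntilContext,
  fetchOrigin: (request: Req) => Promise<Res>
): Promise<{ response: Res; cacheHit: boolean }> {
  const cached = await cache.match(request)
  if (cached !== undefined) {
    return { response: cached, cacheHit: true }
  }
  const response = await fetchOrigin(request)
  // Store a clone in the background so the original body stays readable.
  ctx.waitUntil(cache.put(request, response.clone()))
  return { response, cacheHit: false }
}
```

Whether an explicit cache lookup like this is needed at all depends on how subrequests made by the worker interact with Cloudflare's cache, which is the question the following comments work through.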
(EDITED) ^^^ That's no longer relevant based on what I learned while writing this comment. ^^^ Here is some documentation that may be relevant: Auth with headers:
Using Cloudflare cache from workers:
Quoting from https://developers.cloudflare.com/workers/reference/how-the-cache-works/
If I am understanding this topic correctly, then we should have the following architecture:
In other words:
|
@bajtos I think you might be right about this. To be honest I was confused by this 👇🏻
As I understood it, a request is cached if the server is proxied behind Cloudflare, hence me asking for another CNAME that we could put our server behind. |
I have added token to data source on https://spacemeridian.grafana.net/ that reads spark-stats API. Are the dashboards built there are using that data source also published on https://grafana.filstation.app/? If that's the case, if I'd to edit API token directly on the dashboard I am afraid we would risk leaking API token. |
I don't understand this sentence |
Woah, I made a mistake. 🤦 I wanted to ask if dashboards built on https://spacemeridian.grafana.net/ are published on https://grafana.filstation.app/ and use data sources defined in https://spacemeridian.grafana.net/? |
No, these two Grafana instances are completely independent. If we want to use the same datasource from both instances, then we need to configure it twice. I believe we don't need to expose API usage data publicly yet (if ever), so there is no need to touch https://grafana.filstation.app/. |
All requests to |
We started to hand out API keys to people who signed up with their email for Spark Data offering. Now we would like to understand which user is using Spark API and how frequently.
Docs which the users see: https://filspark.com/api-docs
Notes:
api-key header & api-key query string in addition to the standard Authorization header in Bearer format.
TODO:
Update our internal spark-stats consumers to send an access token:
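For the token locations listed in the Notes above, here is a minimal sketch of how a worker might pull the access token out of a request. The function name and the precedence order (Authorization header first, then the api-key header, then the api-key query string) are assumptions, not something this issue specifies. The `Getter` interface is structurally compatible with the standard `Headers` and `URLSearchParams` objects.

```typescript
// Anything with a get(name) method, e.g. Headers or URLSearchParams.
interface Getter {
  get(name: string): string | null
}

// Return the access token from one of the accepted locations, or null.
// Precedence is an assumption: Authorization, api-key header, api-key query.
function extractAccessToken(headers: Getter, searchParams: Getter): string | null {
  const authorization = headers.get('authorization')
  const bearer = authorization
    ? /^Bearer\s+(\S+)$/i.exec(authorization.trim())
    : null
  if (bearer) return bearer[1]
  return headers.get('api-key') || searchParams.get('api-key') || null
}
```

In a worker this could be called as `extractAccessToken(request.headers, new URL(request.url).searchParams)`, and the result (or a hash of it) used to attribute each request to a user.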