Support channel ID as an input of the "Invite to channel" operation #2695
Comments
Why should the connector not just handle rate limiting as documented in the Slack API? Given that we keep state in the connector, wouldn't it be more useful to resume iteration instead of starting over and performing all the same lookups for user IDs and channel IDs again (usually resulting in the same rate limit error)?
@furudean that would be a great option, but the issue is that it would result in a potentially unbounded execution time for the connector task, which conflicts with Zeebe's concept of a job timeout. Job workers must return a result synchronously, or else Zeebe will consider the job stuck and make it available to workers again, which can lead to request duplication. Depending on the job timeout configuration, the behavior of a connector could even differ between otherwise similar situations. As I understand it, it is not guaranteed that the request to the Slack API will succeed after waiting the time period specified in the rate-limit response.
Alternatively, we could think of supporting exponential retry backoff strategies in the connector's retry configuration.
Okay, as I was thinking about it, I realized we have a way to support this. I think this would be a great use case for a retryable connector exception. On rate-limiting errors, the connector will throw this exception, setting the retry backoff accordingly.
@chillleader Could you explain how using this exception would work from the process perspective?
@furudean when a connector throws a retryable exception with a backoff duration, Zeebe fails the job and re-schedules it only after that backoff has elapsed. This will not require any manual error handling on your side (process execution will recover automatically); the only thing to keep in mind is that the default number of retries is 3, and you might want to increase this value in the connector properties should you still see rate-limiting issues.
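To make the retry semantics concrete, here is a minimal, self-contained Python sketch (a toy model, not the Camunda SDK) of a fail-with-backoff loop: each rate-limited attempt decrements the retry counter and delays the next attempt, and the job becomes an incident once retries are exhausted.

```python
import time

class RateLimited(Exception):
    """Stand-in for a Slack HTTP 429 response."""

def run_with_retries(handler, retries=3, backoff_s=0.01):
    """Toy model of a fail-with-backoff loop: each failure decrements
    the retry counter and delays the next attempt; when retries are
    exhausted the job turns into an incident."""
    attempt = 0
    while retries > 0:
        attempt += 1
        try:
            return handler(attempt)
        except RateLimited:
            retries -= 1
            if retries == 0:
                return "incident"
            time.sleep(backoff_s)  # the broker re-dispatches after the backoff

def invite(attempt):
    """Handler that is rate-limited twice, then succeeds."""
    if attempt < 3:
        raise RateLimited()
    return "invited"

print(run_with_retries(invite))             # → invited (3rd attempt succeeds)
print(run_with_retries(invite, retries=2))  # → incident (retries exhausted)
```

This also illustrates why the default of 3 retries can matter: with only 2 retries the same handler never gets its successful third attempt.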
@chillleader That sounds helpful in some cases, but this won't help with the issue when there are too many channels/users in a workspace and we're using the "Invite" method, right? The connector would query the user and channel endpoints until it hits a rate limit, and then the context is thrown away with the connector job. If it retries at that point, it basically starts over from zero.
Ah, I see what you mean, thanks for explaining! For some reason I assumed this lookup was performed in bulk, but I see now that it's not the case 🤦 I agree that it would not be a complete solution for your problem. Let me just gather more opinions on this from the team and get back to you asap.
Our suggestion is to use the Slack connector in multi-instance mode. This way, Slack connector invocations will contain only one user per job, and the context of the batch processing will be managed by Zeebe. Please check the examples below. To prevent rate-limiting issues with this setup, it's best to use sequential multi-instance (not parallel) with a reasonable retry backoff. This can already be configured with the current implementation of the Slack connector. As an improvement, I think it will still be valuable to throw a dedicated, retryable error on rate-limiting responses so that the retry backoff can kick in.
@chillleader I tried out the linked process in our instance, switching out the variables to values relevant to our Slack workspace. I seem to just get the same error. I'm not sure what the multi-instance setup gained us here.
In this example, multi-instance lets us avoid losing the context of already resolved user IDs and wasting retries on re-resolving all of them every time the job is retried. This is achieved by breaking the bulk job down into a number of instances, each responsible for resolving a single user ID, resolving a channel ID, and then inviting that user into the channel. With multi-instance, when Slack returns a rate-limiting error, Zeebe handles retries for each user separately (vs. attempting to resolve N users together and failing with a rate-limiting issue again).

But multi-instance itself is only part of the solution, because retries also need to be configured so that the job is not retried immediately (use the retry backoff property for this). In this process, sequential multi-instance is used, so users are invited one by one; if the connector is rate-limited while inviting the first user, it will not attempt to invite the rest of them until the first issue is cleared.

Regarding the error, I'm not sure what the cause was in this case. Unfortunately, we don't expose any details in the error message (but that will be solved when we resolve this GitHub issue!), so ideally we'd have to take a look at the connector runtime logs. If it's the rate-limiting error again, you would need to configure the retry parameters of the connector to handle it.

I'm also happy to organize a call and support you synchronously with this issue, if you like - the support team would be able to help you with scheduling 🙂 But I wonder if the suggested solution makes sense and whether it could work for you?
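The difference between the bulk job and sequential multi-instance can be sketched with a toy model (plain Python, with a fake lookup API standing in for Slack): the bulk job loses all resolved IDs on every retry, while per-user instances keep their completed work.

```python
class RateLimited(Exception):
    pass

def make_api(budget):
    """Fake Slack lookup API: 'budget' requests per rate-limit window."""
    state = {"left": budget}
    def lookup(user):
        if state["left"] == 0:
            raise RateLimited()
        state["left"] -= 1
        return f"id-{user}"
    def new_window():
        state["left"] = budget  # stands in for the rate limit resetting
    return lookup, new_window

users = ["a", "b", "c", "d", "e"]

# Bulk job: one job resolves all users; a retry restarts from zero, so
# with a budget of 3 every attempt re-resolves a, b, c and fails at d.
lookup, new_window = make_api(3)
bulk_failures = 0
for _ in range(3):  # three job retries
    try:
        [lookup(u) for u in users]
        break
    except RateLimited:
        bulk_failures += 1
        new_window()
print(bulk_failures)  # → 3: every retry hit the limit again

# Sequential multi-instance: one job per user; completed instances are
# kept by the engine, so a retry only redoes the single failed user.
lookup, new_window = make_api(3)
resolved = []
for u in users:
    try:
        resolved.append(lookup(u))
    except RateLimited:
        new_window()                # wait out the backoff...
        resolved.append(lookup(u))  # ...and retry just this one user
print(len(resolved))  # → 5: all users resolved
```

The bulk variant never finishes under this budget, while the per-user variant completes after waiting out the limit twice.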
Sorry about the delay in getting back to you. Here's the error trace:
I'm getting a rate limit response trying to invite just one user to one channel, so unfortunately the multi-instance does not help for our large Slack workspace. It will always hit the rate limit, given how quickly the connector hammers the API, and then fail. Let's continue our conversation in the support ticket, as it's a little off topic. The general sentiment of this issue is good and will help with debugging, but it will unfortunately not solve our problem.
We have specific logic where we retrieve 100 channels, check whether the target channel is present, and if not, retrieve the next 100 channels and repeat. Wouldn't it be better to retrieve 1000 channels at a time, reducing the number of requests by 10x, in addition to adding the option of directly entering the channel ID?
☝️ I think this is worth trying |
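A rough sketch of why the larger page size helps (the `fetch_page` function below is a hypothetical stand-in for `conversations.list` cursor pagination, not the real Slack client): the number of requests needed to reach a given channel drops roughly tenfold with a limit of 1000.

```python
# Fake channel directory: 2,500 channels named chan-0 … chan-2499.
all_channels = [{"id": f"C{i}", "name": f"chan-{i}"} for i in range(2500)]

def fetch_page(cursor, limit):
    """Fake cursor pagination over the channel list (a stand-in for
    Slack's conversations.list; the cursor here is just an offset)."""
    start = cursor or 0
    end = min(start + limit, len(all_channels))
    next_cursor = end if end < len(all_channels) else None
    return all_channels[start:end], next_cursor

def find_channel(name, page_size):
    """Page through channels until the name matches; return the ID and
    how many requests it took."""
    cursor, requests = None, 0
    while True:
        page, cursor = fetch_page(cursor, page_size)
        requests += 1
        for ch in page:
            if ch["name"] == name:
                return ch["id"], requests
        if cursor is None:
            return None, requests

print(find_channel("chan-2400", page_size=100))   # → ('C2400', 25)
print(find_channel("chan-2400", page_size=1000))  # → ('C2400', 3)
```

Accepting the channel ID directly, as the issue proposes, would of course reduce this to zero lookup requests.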
Is your feature request related to a problem? Please describe.
When using the Slack API via the Slack OOTB connector, it is possible to hit the rate limit. The connector currently doesn't distinguish rate limit errors from other types of errors: it shows a generic error message to the user instead of a specific one, while the logs may contain further details. It is also impossible to work around the rate limit or implement custom retry behavior for such errors.
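A minimal sketch of the error classification being requested (the response shape and error class names here are hypothetical, not the connector's actual API): an HTTP 429 with a `Retry-After` header becomes a distinct, inspectable error instead of the generic message.

```python
class SlackRateLimitError(Exception):
    """Raised on HTTP 429 so callers can apply a tailored retry."""
    def __init__(self, retry_after_s):
        super().__init__(f"Slack rate limit hit; retry after {retry_after_s}s")
        self.retry_after_s = retry_after_s

class ChannelNotFoundError(Exception):
    """Raised when the lookup genuinely found no matching channel."""

def classify_response(status, headers, channel_found):
    """Distinguish rate limiting from a real 'not found' instead of
    collapsing both into one generic error message."""
    if status == 429:
        # Slack reports the wait time in the Retry-After header.
        raise SlackRateLimitError(int(headers.get("Retry-After", "30")))
    if not channel_found:
        raise ChannelNotFoundError("Unable to find conversation")
    return "ok"

try:
    classify_response(429, {"Retry-After": "30"}, channel_found=False)
except SlackRateLimitError as e:
    print(e)  # → Slack rate limit hit; retry after 30s
```

With a distinct error type, the process model can attach rate-limit-specific handling (longer backoff, alerting) that a "not found" error should never trigger.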
Describe the solution you'd like
The "Invite to channel" operation uses the `conversations.list` method of the Slack API. Given a large enough workspace, iterating through the channel list may never succeed. It would be beneficial for the "Invite to channel" operation to support the channel ID as an input. Oftentimes the channel ID is already known from the previous steps of the workflow; if we could supply it directly to the connector, this would help avoid the costly conversations lookup.
Additionally, the error message that is returned when we hit the rate limit must be changed to something more representative (currently it just says "Unable to find conversation with name X" for any errors during this step).
Describe alternatives you've considered
The Slack connector could support a customized retry mechanism tailored specifically to rate-limiting errors, but this is somewhat difficult given the stateless nature of outbound connectors. We may consider further investigation in this direction.
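One possible shape for such a mechanism, sketched in Python under the assumption that Slack's `Retry-After` value should act as a floor for every wait (the function and parameter names are illustrative, not an existing connector API):

```python
def backoff_schedule(retry_after_s, retries=5, base_s=1.0, factor=2.0, cap_s=300.0):
    """Exponential backoff where the server-supplied Retry-After value
    is a lower bound for every wait, so we never retry before Slack
    allows it, and a cap bounds the total delay."""
    return [min(cap_s, max(retry_after_s, base_s * factor ** i))
            for i in range(retries)]

# Retry-After of 5s dominates the early attempts; the exponential
# schedule takes over once it exceeds the floor.
print(backoff_schedule(5.0))  # → [5.0, 5.0, 5.0, 8.0, 16.0]
```

Because the schedule is a pure function of the response, it needs no state between job attempts, which fits the stateless outbound-connector model.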
https://jira.camunda.com/browse/SUPPORT-22188