forked from uber-go/cadence-client
-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extracting domain client to a separate file and add code coverage #1
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
* Improved error messages in getValidatedActivityOptions, also removed redundant call to getActivityOptions * Added tests for the error messages in getValidatedActivityOptions * Rolled back to the previous error messages and transformed the three new tests into a table test
The only test that fails with go 1.20 is this one. The reason is workerOptions.WorkerActivitiesPerSecond == 0, and it is not augmented in newActivityWorker() - this supposed to be done in callers. This breaks task-dispatching: polled tasks are never executed since it is always above the limit(0).
What changed? The file was named as reply_test earlier so I changed it to replay_test. Why? The file name was confusing so I updated it to a more meaningful one. How did you test it? locally Potential risks NA
Docs are currently a bit vague on what time exactly it represents, which brings up questions like "is this the same as the cron schedule start time" or "what time does it represent if my workers are down for an hour". So this description was gathered experimentally. It may be a good idea to track exactly what updates it and when, but I suspect that the "recorded event" is a complete description. Or it probably should be.
…t suite (uber-go#1227) * Fixed the spelling of replay_test file. * Added one scenario for greeting workflow.changed activity signature to something incompatible with arguments * Added all the scenarios for CDNC-4316 * Added all the scenarios for CDNC-4316 * Added supporting documentation for the tests * Added supporting documentation for the workflow
Resolving this "two kinds of finished-timeout / heartbeat-timeout" issue with ErrResultPending will need some new semantics on the server side... but at least there's a workaround until then. The general concept of a "pending activity" does seem worth supporting, particularly because it can respond with an "already completed" error when a second "complete" request is received. It just has unfortunate quirks at the moment.
… test suite (uber-go#1231) * Added the Activity Registration required failure scenario to replayer test suite * Renamed files according to go convention * added copyright info to the files * Skipped the test case that panics at incompatible activity returns. This is an expected behaviour and test is included for documentation purposes.
This allows us to move to `//go:embed` or use hard-coded strings (or any other data-source). Short-term plans are to use go:embed to read files once / not require runtime access to files, only build.
This is necessary to use worker-instance-registered activities, as otherwise they are unrecognized and may not be able to replay successfully (as a changed activity name is a non-backwards-compatible change).
…ber-go#1225) * Merged the timeout loging for the tests in internal_workers_test.go * Fixed the import order and fixed the name of the function
What changed? To start a new workflow correctly, we actually need more fields: - serialized input - header info Potential risks low risk to add NEW receive methods on public struct
…s into the replayersuite. (uber-go#1237) * Added tests for parallel workflow * Added the test for coroutine where removal of branch should have caused failure * Added copyright info
…1236) * Fixed the spelling of replay_test file. * Added the Activity Registration required failure scenario to replayer test suite * Documentation change * Added test cases for branch workflow * Added change in number of branches scenario to replayer * Added copyright info * Update replay_test.go * Fixed merge issues * Added assertion * Updated assretion
…o#1240) At least, I finally ran it more than 5x in a row without failure. yay! We should probably just ban bare atomics. There are occasionally test races between test-completion and zap logging, but those are sorta un-resolvable unless we reliably stop everything in tests. Which would be excellent, but I think we're fairly far from achieving that at the moment. There also seem to still be some flaky runs of these locally-dispatched tests, but I'm not sure what the causes is yet. Still, seems improved.
* Switched to revive, goimports, re-formatted everything Works pretty well! Using latest tools/go.mod versions (except thrift), and using the server's revive.toml for starting out. Revive is a bit noisy due to some `Id` -> `ID` recommendations, but we probably should actually do that for v2.
…fication in current behavior) in the switch case has no effect on replayer. (uber-go#1238) * Added the conflicting activity registration bug * Added copyright info * Resolved git comments and split the test cases into two * Cleaned up the code * Added more comments nit
…ister (uber-go#1242) * Fixed the spelling of replay_test file. * Added the Activity Registration required failure scenario to replayer test suite * Documentation change * Removed extra file * Replaced Activity.RegisterWithOptions with replayers oen acitivty register * Removed activity from imports: nit
* [cadence-client] produce a log when activities time out logging when activities time out will help with debugging * Update internal_task_handlers.go
…#1245) * Added a better log message for nondeterministic errors * Upped log level to Error as this is an error * Created test for the logging update * Fixed imports
These just snuck through, possibly we have a gap in our CI.
…ber-go#1247) The added test hopefully reveals the issue - prior to this fix, that `err := f.Get(...)` -> `errors.Is(err, ...)` check passed, because the error value was not ever encoded into a `*GenericError`. Thankfully this was only a test environment bug, and a user discovered it due to writing tests. --- The non-test behavior can be seen split between these two locations: - encoding error reason / data: https://github.com/uber-go/cadence-client/blob/ad91fc718330cb40ff26627fa1d1a6c78c2ceaca/internal/internal_event_handlers.go#L1159-L1163 - constructing the generic error in handleLocalActivityMarker: https://github.com/uber-go/cadence-client/blob/ad91fc718330cb40ff26627fa1d1a6c78c2ceaca/internal/internal_event_handlers.go#L1132-L1135 ... which also show incorrect / pointless string/byte conversions. I've left comments there just to prevent confusion, but these fields are probably worth changing at some point.
…eadability. (uber-go#1244) * Fixed the spelling of replay_test file. * Added the Activity Registration required failure scenario to replayer test suite * Documentation change * Removed extra file * Extracting the replayer specific utilities into a separate file for readability. * Pulling more helper functions * Separated all the matching cases
It is very expected to have this field, but customers have to walk through the stack trace, which is not always convinient especially with wrappers.
uber-go#1252) * Adding in additional header as a config option to determine a more stable isolation-group
uber-go#1260) * Add hardware monitoring of CPU and RAM metrics on workers * Use cpu.counts over runtime.numcpu() * Add host machine tag injection for metrics * add Sync.Once to ensure metrics is emitted once per worker host
…uber-go#1222) Bumps [golang.org/x/net](https://github.com/golang/net) from 0.0.0-20220121210141-e204ce36a2ba to 0.7.0. - [Release notes](https://github.com/golang/net/releases) - [Commits](https://github.com/golang/net/commits/v0.7.0) --- updated-dependencies: - dependency-name: golang.org/x/net dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…o#1264) Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.4.1 to 1.11.1. - [Release notes](https://github.com/prometheus/client_golang/releases) - [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md) - [Commits](prometheus/client_golang@v1.4.1...v1.11.1) --- updated-dependencies: - dependency-name: github.com/prometheus/client_golang dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* replay test sequential * Add replay test case for problematic continue-as-new case * fix * Made minor comment fixes and expected error handling --------- Co-authored-by: taylan isikdemir <taylan@uber.com>
* Release v1.2.7 * Release v1.2.8 * Empty commit to trigger github CI --------- Co-authored-by: Tim Li <ltim@uber.com> Co-authored-by: Tim Li <47233368+timl3136@users.noreply.github.com>
* Revert "Making Workflow and Activity registration optional when they are mocked (uber-go#1256)" This reverts commit 2e73362. * Revert "Addressing difference in workflow interceptors when using the testkit (uber-go#1257)" This reverts commit e31ba38.
* Release v1.2.7 * Release v1.2.8 * Empty commit to trigger github CI * add retract directive for v1.2.8 * Update version and changelog --------- Co-authored-by: Tim Li <ltim@uber.com> Co-authored-by: Tim Li <47233368+timl3136@users.noreply.github.com> Co-authored-by: Jakob Haahr Taankvist <jht@uber.com>
…1312) Clarify behavior on return of error Document the whole execution sequence Docs added to both internal package and exposed type alias due to: golang/go#44905 Signed-off-by: Alexander Shopov <ashopov@uber.com> Co-authored-by: David Porter <contact@davidporter.id.au>
Seems like this has been left un-pinned and un-documented for a long time. Might as well fix that, so we can address the new async APIs in e.g. uber-go#1327 in a standardized way.
…#1329) There's more work to be done, but it's a major step towards forcing stability. Oddly, missing the safe.directory git setting leads to Go failing to get version info, leading to errors building, but for some reason it doesn't show the error that git's reporting (missing safe directory). Just stuff like this with no other output: ``` error obtaining VCS status: Use -buildvcs=false to disable VCS stamping. ``` Once I fixed that, I learned that `go test -exec nonexistent ./... >/dev/null` errors, but the "FAIL: command nonexistent not found" error is.... reported on stdout, so it's hidden. And so is the "FAIL the/package/name" message. But if you have a failing build, the failure *is* printed to stderr: ``` ❯ go test -exec true ./... >/dev/null # go.uber.org/cadence/evictiontest evictiontest/workflow_cache_eviction_test.go:58:4: missing ',' in composite literal ``` I have no idea why `-exec nonexistent` isn't on stderr, but I left a comment in the makefile anyway. Quite a large amount of weird stuff condensed into a seemingly simple goal.
Followups from uber-go#1327: 1. simplifying the memo/search-attr encoding tests 2. making sure the dataconverter tests can detect correct vs incorrect encoder use 1 is pretty simple: memos and search attrs are ALWAYS JSON because the server must interpret them, so the exact bytes should not change in the future. A hardcoded string is easy to verify, and strengthens the guarantee that "this should not change even if other encoding changes". 2 is more interesting: as originally written, the test would still pass if the custom dataconverter was *not* saved and used. The type-assertion to check the internal field somewhat prevents that, but it's unnecessary and a disjointed check. So it has been rewritten: now the serialized request bytes must clearly come from the test encoder, not the default encoder, and they must be detectably different.
…o#1333) * Migrate CI from AWS queues to Google Kubernetes Engine queues - Add necessary k8s plugin config to the pipeline for use with GKE + Buildkite's agent-stack-k8s helm stack - Remove docker agent tag filter - Install bk agent in Dockerfile for use in the code coverage step - Add comment on source of buildkite-agent installation commands - Add comment on yq explode usage to the top of pipeline.yml
Mockery is currently failing on go 1.22 on Linux due to golang/go#64812 tl;dr, changes in go 1.22 break the popular `x/tools/go/packages.Load` tool, which Mockery uses. Upgrading our tools' `x/tools` (to latest) resolves it, and appears to work fine for go 1.19 and above on both Linux and OSX, and has zero effect on our users.
…er-go#1341) Apparently these also failed on the original PR, but due to missing a `\` in uber-go#1303 it wasn't failing in CI because the env var was gone by the time `exit` ran. I should really take that as a hint to finally rewrite that chunk of the makefile, so it just does a normal `go test ./...` :\ but not today. Thankfully this seems to be the only set of tests that snuck past, and they were just incorrect. I'm not sure why I apparently didn't run them or something while writing them in uber-go#1331 but it seems pretty clear I didn't since the old pre-merge SHA fails too. Bleh. --- To correct the tests' original flaws: 1. memos *are* supposed to be encoded by custom dataconverters, and they are. we don't search them so they're not JSON like search attrs are, they're just blobs we expose in list APIs (and in workflow metadata). 2. `Equal(string, []byte)` just doesn't work, and the default dataconverter adds a trailing newline to separate args. 3. the `[]byte`-heavy copypasta meant the Input-checking tests were odd, so I changed them, and fixed the args to the dataconverter. Internally we splat the args in here, so it's not `[]interface{..}`-packed: https://github.com/uber-go/cadence-client/blob/6decfc78571a9d91d943815ae3a445a3bc115fa8/internal/internal_worker.go#L573-L579
…r-go#1342) Introduce the following methods to the WorkflowRegistry and ActivityRegistry interfaces: GetRegisteredWorkflows GetWorkflowAlias GetWorkflowFn GetRegisteredActivities GetActivityAlias GetActivityFn The logic already exists in the internal implementation of the registry but not exposed to the public API. Also implement these methods for WorkflowReplayer and WorkflowShadower. Update unit tests so that they now call the top level methods. Why? To expose on Uber internal debug page How did you test it? Unit tests Tested on staging environment Potential risks Worst case: these methods return unexpected result
…ber-go#1343) Update TestActivityEnvironment with new APIs so that it implements the ActivityRegistry interface Update TestWorkflowEnvironment with new APIs so that it implements the Registry interface
* Partial fix for Continue as new case * Partial fix for Continue as new case
It has nothing to do with workflow_client - they don't depend on each other. Also extract retryWhileTransientError as the same code-block used everywhere in internal package. // gonna replace it later
Made it against my fork by mistake. Re-creating against the original |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Extracting domain client to a separate file
Also extract retryWhileTransientError as the same code-block used
everywhere in internal package
Adding code-coverage + more structure
unit tests
No risks