Extracting domain client to a separate file and add code coverage #1

dkrotx · 2024-06-21T15:50:41Z

Extracting domain client to a separate file
Also extract retryWhileTransientError as the same code-block used
everywhere in internal package

Adding code-coverage + more structure

unit tests

No risks

* Improved error messages in getValidatedActivityOptions, also removed redundant call to getActivityOptions * Added tests for the error messages in getValidatedActivityOptions * Rolled back to the previous error messages and transformed the three new tests into a table test

The only test that fails with go 1.20 is this one. The reason is workerOptions.WorkerActivitiesPerSecond == 0, and it is not augmented in newActivityWorker() - this supposed to be done in callers. This breaks task-dispatching: polled tasks are never executed since it is always above the limit(0).

What changed? The file was named as reply_test earlier so I changed it to replay_test. Why? The file name was confusing so I updated it to a more meaningful one. How did you test it? locally Potential risks NA

Docs are currently a bit vague on what time exactly it represents, which brings up questions like "is this the same as the cron schedule start time" or "what time does it represent if my workers are down for an hour". So this description was gathered experimentally. It may be a good idea to track exactly what updates it and when, but I suspect that the "recorded event" is a complete description. Or it probably should be.

…t suite (uber-go#1227) * Fixed the spelling of replay_test file. * Added one scenario for greeting workflow.changed activity signature to something incompatible with arguments * Added all the scenarios for CDNC-4316 * Added all the scenarios for CDNC-4316 * Added supporting documentation for the tests * Added supporting documentation for the workflow

Resolving this "two kinds of finished-timeout / heartbeat-timeout" issue with ErrResultPending will need some new semantics on the server side... but at least there's a workaround until then. The general concept of a "pending activity" does seem worth supporting, particularly because it can respond with an "already completed" error when a second "complete" request is received. It just has unfortunate quirks at the moment.

… test suite (uber-go#1231) * Added the Activity Registration required failure scenario to replayer test suite * Renamed files according to go convention * added copyright info to the files * Skipped the test case that panics at incompatible activity returns. This is an expected behaviour and test is included for documentation purposes.

This allows us to move to `//go:embed` or use hard-coded strings (or any other data-source). Short-term plans are to use go:embed to read files once / not require runtime access to files, only build.

This is necessary to use worker-instance-registered activities, as otherwise they are unrecognized and may not be able to replay successfully (as a changed activity name is a non-backwards-compatible change).

…ber-go#1225) * Merged the timeout loging for the tests in internal_workers_test.go * Fixed the import order and fixed the name of the function

What changed? To start a new workflow correctly, we actually need more fields: - serialized input - header info Potential risks low risk to add NEW receive methods on public struct

…s into the replayersuite. (uber-go#1237) * Added tests for parallel workflow * Added the test for coroutine where removal of branch should have caused failure * Added copyright info

…1236) * Fixed the spelling of replay_test file. * Added the Activity Registration required failure scenario to replayer test suite * Documentation change * Added test cases for branch workflow * Added change in number of branches scenario to replayer * Added copyright info * Update replay_test.go * Fixed merge issues * Added assertion * Updated assretion

…o#1240) At least, I finally ran it more than 5x in a row without failure. yay! We should probably just ban bare atomics. There are occasionally test races between test-completion and zap logging, but those are sorta un-resolvable unless we reliably stop everything in tests. Which would be excellent, but I think we're fairly far from achieving that at the moment. There also seem to still be some flaky runs of these locally-dispatched tests, but I'm not sure what the causes is yet. Still, seems improved.

* Switched to revive, goimports, re-formatted everything Works pretty well! Using latest tools/go.mod versions (except thrift), and using the server's revive.toml for starting out. Revive is a bit noisy due to some `Id` -> `ID` recommendations, but we probably should actually do that for v2.

…fication in current behavior) in the switch case has no effect on replayer. (uber-go#1238) * Added the conflicting activity registration bug * Added copyright info * Resolved git comments and split the test cases into two * Cleaned up the code * Added more comments nit

…ister (uber-go#1242) * Fixed the spelling of replay_test file. * Added the Activity Registration required failure scenario to replayer test suite * Documentation change * Removed extra file * Replaced Activity.RegisterWithOptions with replayers oen acitivty register * Removed activity from imports: nit

* [cadence-client] produce a log when activities time out logging when activities time out will help with debugging * Update internal_task_handlers.go

…#1245) * Added a better log message for nondeterministic errors * Upped log level to Error as this is an error * Created test for the logging update * Fixed imports

These just snuck through, possibly we have a gap in our CI.

…ber-go#1247) The added test hopefully reveals the issue - prior to this fix, that `err := f.Get(...)` -> `errors.Is(err, ...)` check passed, because the error value was not ever encoded into a `*GenericError`. Thankfully this was only a test environment bug, and a user discovered it due to writing tests. --- The non-test behavior can be seen split between these two locations: - encoding error reason / data: https://github.com/uber-go/cadence-client/blob/ad91fc718330cb40ff26627fa1d1a6c78c2ceaca/internal/internal_event_handlers.go#L1159-L1163 - constructing the generic error in handleLocalActivityMarker: https://github.com/uber-go/cadence-client/blob/ad91fc718330cb40ff26627fa1d1a6c78c2ceaca/internal/internal_event_handlers.go#L1132-L1135 ... which also show incorrect / pointless string/byte conversions. I've left comments there just to prevent confusion, but these fields are probably worth changing at some point.

…eadability. (uber-go#1244) * Fixed the spelling of replay_test file. * Added the Activity Registration required failure scenario to replayer test suite * Documentation change * Removed extra file * Extracting the replayer specific utilities into a separate file for readability. * Pulling more helper functions * Separated all the matching cases

It is very expected to have this field, but customers have to walk through the stack trace, which is not always convinient especially with wrappers.

uber-go#1252) * Adding in additional header as a config option to determine a more stable isolation-group

uber-go#1260) * Add hardware monitoring of CPU and RAM metrics on workers * Use cpu.counts over runtime.numcpu() * Add host machine tag injection for metrics * add Sync.Once to ensure metrics is emitted once per worker host

…uber-go#1222) Bumps [golang.org/x/net](https://github.com/golang/net) from 0.0.0-20220121210141-e204ce36a2ba to 0.7.0. - [Release notes](https://github.com/golang/net/releases) - [Commits](https://github.com/golang/net/commits/v0.7.0) --- updated-dependencies: - dependency-name: golang.org/x/net dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

…o#1264) Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.4.1 to 1.11.1. - [Release notes](https://github.com/prometheus/client_golang/releases) - [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md) - [Commits](prometheus/client_golang@v1.4.1...v1.11.1) --- updated-dependencies: - dependency-name: github.com/prometheus/client_golang dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* replay test sequential * Add replay test case for problematic continue-as-new case * fix * Made minor comment fixes and expected error handling --------- Co-authored-by: taylan isikdemir <taylan@uber.com>

* Release v1.2.7 * Release v1.2.8 * Empty commit to trigger github CI --------- Co-authored-by: Tim Li <ltim@uber.com> Co-authored-by: Tim Li <47233368+timl3136@users.noreply.github.com>

* Revert "Making Workflow and Activity registration optional when they are mocked (uber-go#1256)" This reverts commit 2e73362. * Revert "Addressing difference in workflow interceptors when using the testkit (uber-go#1257)" This reverts commit e31ba38.

* Release v1.2.7 * Release v1.2.8 * Empty commit to trigger github CI * add retract directive for v1.2.8 * Update version and changelog --------- Co-authored-by: Tim Li <ltim@uber.com> Co-authored-by: Tim Li <47233368+timl3136@users.noreply.github.com> Co-authored-by: Jakob Haahr Taankvist <jht@uber.com>

…art APIs (uber-go#1321)

…1312) Clarify behavior on return of error Document the whole execution sequence Docs added to both internal package and exposed type alias due to: golang/go#44905 Signed-off-by: Alexander Shopov <ashopov@uber.com> Co-authored-by: David Porter <contact@davidporter.id.au>

Seems like this has been left un-pinned and un-documented for a long time. Might as well fix that, so we can address the new async APIs in e.g. uber-go#1327 in a standardized way.

…#1329) There's more work to be done, but it's a major step towards forcing stability. Oddly, missing the safe.directory git setting leads to Go failing to get version info, leading to errors building, but for some reason it doesn't show the error that git's reporting (missing safe directory). Just stuff like this with no other output: ``` error obtaining VCS status: Use -buildvcs=false to disable VCS stamping. ``` Once I fixed that, I learned that `go test -exec nonexistent ./... >/dev/null` errors, but the "FAIL: command nonexistent not found" error is.... reported on stdout, so it's hidden. And so is the "FAIL the/package/name" message. But if you have a failing build, the failure *is* printed to stderr: ``` ❯ go test -exec true ./... >/dev/null # go.uber.org/cadence/evictiontest evictiontest/workflow_cache_eviction_test.go:58:4: missing ',' in composite literal ``` I have no idea why `-exec nonexistent` isn't on stderr, but I left a comment in the makefile anyway. Quite a large amount of weird stuff condensed into a seemingly simple goal.

Followups from uber-go#1327: 1. simplifying the memo/search-attr encoding tests 2. making sure the dataconverter tests can detect correct vs incorrect encoder use 1 is pretty simple: memos and search attrs are ALWAYS JSON because the server must interpret them, so the exact bytes should not change in the future. A hardcoded string is easy to verify, and strengthens the guarantee that "this should not change even if other encoding changes". 2 is more interesting: as originally written, the test would still pass if the custom dataconverter was *not* saved and used. The type-assertion to check the internal field somewhat prevents that, but it's unnecessary and a disjointed check. So it has been rewritten: now the serialized request bytes must clearly come from the test encoder, not the default encoder, and they must be detectably different.

…o#1333) * Migrate CI from AWS queues to Google Kubernetes Engine queues - Add necessary k8s plugin config to the pipeline for use with GKE + Buildkite's agent-stack-k8s helm stack - Remove docker agent tag filter - Install bk agent in Dockerfile for use in the code coverage step - Add comment on source of buildkite-agent installation commands - Add comment on yq explode usage to the top of pipeline.yml

Mockery is currently failing on go 1.22 on Linux due to golang/go#64812 tl;dr, changes in go 1.22 break the popular `x/tools/go/packages.Load` tool, which Mockery uses. Upgrading our tools' `x/tools` (to latest) resolves it, and appears to work fine for go 1.19 and above on both Linux and OSX, and has zero effect on our users.

…er-go#1341) Apparently these also failed on the original PR, but due to missing a `\` in uber-go#1303 it wasn't failing in CI because the env var was gone by the time `exit` ran. I should really take that as a hint to finally rewrite that chunk of the makefile, so it just does a normal `go test ./...` :\ but not today. Thankfully this seems to be the only set of tests that snuck past, and they were just incorrect. I'm not sure why I apparently didn't run them or something while writing them in uber-go#1331 but it seems pretty clear I didn't since the old pre-merge SHA fails too. Bleh. --- To correct the tests' original flaws: 1. memos *are* supposed to be encoded by custom dataconverters, and they are. we don't search them so they're not JSON like search attrs are, they're just blobs we expose in list APIs (and in workflow metadata). 2. `Equal(string, []byte)` just doesn't work, and the default dataconverter adds a trailing newline to separate args. 3. the `[]byte`-heavy copypasta meant the Input-checking tests were odd, so I changed them, and fixed the args to the dataconverter. Internally we splat the args in here, so it's not `[]interface{..}`-packed: https://github.com/uber-go/cadence-client/blob/6decfc78571a9d91d943815ae3a445a3bc115fa8/internal/internal_worker.go#L573-L579

…r-go#1342) Introduce the following methods to the WorkflowRegistry and ActivityRegistry interfaces: GetRegisteredWorkflows GetWorkflowAlias GetWorkflowFn GetRegisteredActivities GetActivityAlias GetActivityFn The logic already exists in the internal implementation of the registry but not exposed to the public API. Also implement these methods for WorkflowReplayer and WorkflowShadower. Update unit tests so that they now call the top level methods. Why? To expose on Uber internal debug page How did you test it? Unit tests Tested on staging environment Potential risks Worst case: these methods return unexpected result

…ber-go#1343) Update TestActivityEnvironment with new APIs so that it implements the ActivityRegistry interface Update TestWorkflowEnvironment with new APIs so that it implements the Registry interface

* Partial fix for Continue as new case * Partial fix for Continue as new case

It has nothing to do with workflow_client - they don't depend on each other. Also extract retryWhileTransientError as the same code-block used everywhere in internal package. // gonna replace it later

dkrotx · 2024-06-21T15:51:44Z

Made it against my fork by mistake. Re-creating against the original

jakobht and others added 30 commits March 9, 2023 11:21

Fixed the spelling of replay_test file. (uber-go#1226)

23fec5c

What changed? The file was named as reply_test earlier so I changed it to replay_test. Why? The file name was confusing so I updated it to a more meaningful one. How did you test it? locally Potential risks NA

Shift replayer to prefer io.Reader rather than filenames (uber-go#1234)

6ad9a67

This allows us to move to `//go:embed` or use hard-coded strings (or any other data-source). Short-term plans are to use go:embed to read files once / not require runtime access to files, only build.

Expose activity registry on the workflow replayer (uber-go#1232)

5454503

This is necessary to use worker-instance-registered activities, as otherwise they are unrecognized and may not be able to replay successfully (as a changed activity name is a non-backwards-compatible change).

Merged the timeout logic for the tests in internal_workers_test.go (u…

342c383

…ber-go#1225) * Merged the timeout loging for the tests in internal_workers_test.go * Fixed the import order and fixed the name of the function

[error] surface more fields in ContinueAsNew error (uber-go#1235)

ebb27dd

What changed? To start a new workflow correctly, we actually need more fields: - serialized input - header info Potential risks low risk to add NEW receive methods on public struct

Add and emulate the issues found in the workflows involving coroutine…

f53f72d

…s into the replayersuite. (uber-go#1237) * Added tests for parallel workflow * Added the test for coroutine where removal of branch should have caused failure * Added copyright info

[activity/logging] produce a log when activities time out (uber-go#1243)

18c143a

* [cadence-client] produce a log when activities time out logging when activities time out will help with debugging * Update internal_task_handlers.go

Better logging when getting some nondeterministic behaviours (uber-go…

ad91fc7

…#1245) * Added a better log message for nondeterministic errors * Upped log level to Error as this is an error * Created test for the logging update * Fixed imports

make fmt fix (uber-go#1246)

e9ed409

These just snuck through, possibly we have a gap in our CI.

Adding WorkflowType to "Workflow panic" log-message (uber-go#1259)

407a703

It is very expected to have this field, but customers have to walk through the stack trace, which is not always convinient especially with wrappers.

Adding in additional header to determine a more stable isolation-group (

58f9746

uber-go#1252) * Adding in additional header as a config option to determine a more stable isolation-group

Bump version strings for 1.0 release (uber-go#1261)

a072981

Bumped Go version (uber-go#1262)

225289a

We now log when the automatic heartbeating fails (uber-go#1263)

32f02a1

ketsiambaku and others added 28 commits February 21, 2024 13:07

Expose method to get default worker options (uber-go#1311)

6decfc7

Support two-legged OAuth flow (uber-go#1304)

10ebf6b

Tests for Continue as New and one additional scenario (uber-go#1313)

9cc8453

* replay test sequential * Add replay test case for problematic continue-as-new case * fix * Made minor comment fixes and expected error handling --------- Co-authored-by: taylan isikdemir <taylan@uber.com>

Replace JWT library for signing tokens (uber-go#1308)

4985100

Release v1.2.8 (uber-go#1314)

bc2753a

* Release v1.2.7 * Release v1.2.8 * Empty commit to trigger github CI --------- Co-authored-by: Tim Li <ltim@uber.com> Co-authored-by: Tim Li <47233368+timl3136@users.noreply.github.com>

Added a unit test for workflow replayer utils (uber-go#1316)

0d1f929

Enable codecov and generate metadata file as artifact (uber-go#1320)

00ff688

Update idl and add wrapper implementaton for async start/signalwithst…

ba7fa67

…art APIs (uber-go#1321)

Fix coverage metadata commit info (uber-go#1323)

81a37e4

Enforce 85% new line coverage (uber-go#1325)

d17db37

Pin mockery and regenerate everything (uber-go#1328)

a55fc13

Seems like this has been left un-pinned and un-documented for a long time. Might as well fix that, so we can address the new async APIs in e.g. uber-go#1327 in a standardized way.

Update client wrappers with new async APIs (uber-go#1327)

08b284a

Added an option to exclude the list of workflows by Type (uber-go#1335)

73b12d5

Update compability adapter to support new enum value (uber-go#1337)

f5ec359

Implement the registered workflows and activities APIs in testsuite (u…

9a7e9b7

…ber-go#1343) Update TestActivityEnvironment with new APIs so that it implements the ActivityRegistry interface Update TestWorkflowEnvironment with new APIs so that it implements the Registry interface

Added test for TerminateWorkflow in the internal package (uber-go#1345)

92b8307

Test for GetWorkflowHistory (uber-go#1346)

afa1a4c

Partial fix for Continue as new case (uber-go#1347)

d3b79f7

* Partial fix for Continue as new case * Partial fix for Continue as new case

Extracting domain client to a separate file

38480e6

It has nothing to do with workflow_client - they don't depend on each other. Also extract retryWhileTransientError as the same code-block used everywhere in internal package. // gonna replace it later

dkrotx closed this Jun 21, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Extracting domain client to a separate file and add code coverage #1

Extracting domain client to a separate file and add code coverage #1

dkrotx commented Jun 21, 2024

dkrotx commented Jun 21, 2024

Extracting domain client to a separate file and add code coverage #1

Extracting domain client to a separate file and add code coverage #1

Conversation

dkrotx commented Jun 21, 2024

dkrotx commented Jun 21, 2024