feat(x/bank): Replace regex parsing of denom validation to generated parsing #19511

mattverse · 2024-02-21T09:14:32Z

Description

Replaces Regex Usage when parsing denom for validation. Instead uses generated code for the validation using ragel( https://en.wikipedia.org/wiki/Ragel).

As linked in original issue, using regex for denom validation has a huge performance overhead, thus replacing it with direct matching methods.

Author Checklist

All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.

I have...

included the correct type prefix in the PR title
confirmed ! in the type prefix if API or client breaking change
targeted the correct branch (see PR Targeting)
provided a link to the relevant issue or specification
reviewed "Files changed" and left comments if necessary
included the necessary unit and integration tests
added a changelog entry to CHANGELOG.md
updated the relevant documentation or specification, including comments for documenting Go code
confirmed all CI checks have passed

Reviewers Checklist

All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.

I have...

confirmed the correct type prefix in the PR title
confirmed all author checklist items have been addressed
reviewed state machine logic, API design and naming, documentation is accurate, tests and test coverage

Summary by CodeRabbit

New Features
- Introduced a ValueCodec for enhanced type handling in collections.
- Added a new comparison method IsGT for types.Coin instances.
- Implemented core/router.Service to improve runtime service handling.
Refactor
- Updated coin denomination validation to use direct matching methods, enhancing validation accuracy.
- Enhanced ParseDecCoin function for more efficient parsing of decimal coins.
Tests
- Modified integration tests to use a custom regular expression for coin denominations, ensuring broader test coverage.

coderabbitai · 2024-02-21T09:15:02Z

Walkthrough

The recent update focuses on enhancing the efficiency and reliability of coin denomination validation and parsing within the Cosmos SDK. By replacing regex parsing with direct matching methods and introducing a new codec and comparison method, the changes aim to optimize runtime performance and improve the codebase's maintainability.

Changes

Files	Change Summary
`types/coin.go` `types/coin_regex.go`	Replaced regex parsing with direct matching methods for denomination validation.
`types/coin_regex.go` `types/.../coin_regex.rl`	Added functionality for parsing and matching coin denominations using regular expressions and Ragel.
`types/dec_coin.go`	Enhanced `ParseDecCoin` function for better parsing accuracy.
`.../bank/keeper/deterministic_test.go`	Updated denomination regex for testing purposes.
`core/router.Service`	Implemented in runtime for service routing.
`collections/maps.go`	Added `ValueCodec` for `math.Uint` type.

Assessment against linked issues

Objective	Addressed	Explanation
Enhance coin denom matching speed and shift work to compile time rather than runtime. (#17221)	✅
Replace regexp parsing with more direct methods that are functionally equivalent. (#17221)	✅
Validate that denom strings adhere to a specified regex pattern when writing metadata. (#12026)	❌	The changes focus on parsing and validation mechanisms, but do not address validation in `InitGenesis` specifically.

Related issues

Issue Only allow valid denoms in bank's denom Metadata #12026 could be linked to this PR because:
- The PR introduces improvements in denomination validation which is a step towards ensuring only valid denominations are allowed, addressing part of the issue's objectives regarding validation of denom strings.

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share

Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>.
- Generate unit-tests for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit tests for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai generate interesting stats about this repository and render them as a table.
- @coderabbitai show all the console.log statements in this repository.
- @coderabbitai read src/utils.ts and generate unit tests.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
The JSON schema for the configuration file is available here.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.v2.json

CodeRabbit Discord Community

Join our Discord Community to get help, request features, and share feedback.

coderabbitai

Review Status

Actionable comments generated: 0

Configuration used: .coderabbit.yml

Commits

Files that changed from the base of the PR and between 0cf0c28 and 2c040a4.

Files selected for processing (6)

types/coin.go (2 hunks)
types/coin_regex.go (1 hunks)
types/coin_regex.rl (1 hunks)
types/coin_test.go (1 hunks)
types/dec_coin.go (1 hunks)
types/dec_coin_test.go (1 hunks)

Files not summarized due to errors (1)

types/coin_regex.go: Error: Message exceeds token limit

Additional comments: 54

types/coin_regex.rl (2)

3-21: The MatchDenom function is well-implemented using Ragel, providing a more efficient alternative to regex-based parsing for coin denominations. The use of character classes and patterns is clear and concise. However, it's important to ensure comprehensive test coverage for various valid and invalid denomination strings to validate the correctness of this parsing logic.

24-66: The MatchDecCoin function introduces a more complex parsing logic to handle decimal coin amounts and denominations. The use of actions to mark the start and end of amounts and denominations is a good practice, allowing for precise extraction. However, similar to MatchDenom, it's crucial to verify the correctness through extensive testing, especially for edge cases and invalid inputs.

types/dec_coin.go (1)

634-642: The update to ParseDecCoin to use MatchDecCoin for parsing is a significant improvement in terms of performance. By avoiding regex and using specific byte positions for amount and denomination extraction, the function becomes more efficient. Ensure that this new parsing logic is thoroughly tested, particularly for handling edge cases and ensuring that the extracted positions accurately reflect the intended parts of the input string.

types/coin.go (1)

846-852: The replacement of regex-based validation with a Ragel-generated function (MatchDenom) in the ValidateDenom function is a significant change aimed at improving performance by reducing runtime overhead. This approach aligns well with the PR's objectives of enhancing efficiency and maintainability by moving away from regex for denom validation.

Performance: The shift to a compile-time generated parsing method should indeed offer substantial performance improvements over the regex-based approach, especially in scenarios where denom validation is a frequent operation.

Maintainability: While the performance benefits are clear, it's important to consider the maintainability of the Ragel-generated code. Generated code can sometimes be harder to read or modify than hand-written code, depending on how it's structured and documented. It would be beneficial to ensure that there's sufficient documentation on how to regenerate this code if the validation rules need to change in the future.

Overall, this change appears to be a positive step towards optimizing the Cosmos SDK's performance and codebase clarity. However, ensuring the maintainability of the Ragel-generated code is crucial for future modifications and understanding of the validation logic.

types/coin_regex.go (2)

5-448: The MatchDenom function is well-structured and follows the expected pattern for a Ragel-generated state machine. It's designed to validate coin denominations efficiently. However, it's crucial to ensure that the generated code aligns with the intended patterns for denominations. Since this is machine-generated, it's assumed to be correct based on the Ragel specification provided. The function returns a boolean indicating whether the input matches the denomination pattern, which is a straightforward and expected behavior.

451-963: The MatchDecCoin function is designed to parse decimal coin amounts and extract the amount and denomination parts. It's more complex than MatchDenom due to the need to capture specific parts of the input. The function returns four values: amountStart, amountEnd, denomEnd, and isValid, which are used to extract the amount and denomination from the input.

The use of -1 as the initial value for amountStart, amountEnd, and denomEnd is a good practice for indicating that these positions have not been found.

The isValid flag is correctly set based on the state machine's final state, ensuring that only valid decimal coin strings are accepted.

The adjustments made in the EOF actions (e.g., adjusting denomEnd to exclude space if present) are thoughtful and necessary for correct parsing.

Overall, the function appears to be correctly implemented according to the Ragel specification. It's important to have comprehensive tests to ensure that all edge cases are handled correctly, especially given the complexity of parsing decimal coin amounts.

types/dec_coin_test.go (1)

380-382: The added test case for parsing decimal coins with a value of 5.5 atoms is correctly implemented and follows the established testing pattern in the file. It's good to see that the test case explicitly checks for the expected outcome, ensuring that the parsing logic handles decimal values correctly. This test is crucial for validating the changes made to the parsing logic in the Cosmos SDK.

types/coin_test.go (47)

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [1-1]

The package declaration and imports are correctly structured, adhering to Go conventions. It's good practice to group standard library imports separately from third-party libraries, which is followed here.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [11-11]

The global variables testDenom1 and testDenom2 are well-named, indicating their purpose as denominations for test coins. This enhances readability and maintainability of the test suite.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [13-13]

The coinTestSuite struct is well-organized, containing fields for various test coins and a field for empty coins. This setup facilitates the reuse of test data across different test cases, promoting the DRY (Don't Repeat Yourself) principle.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [15-15]

The TestCoinTestSuite function correctly initializes the test suite using the suite.Run function from the testify/suite package. This approach leverages the setup and teardown hooks provided by the testify suite, allowing for more organized and manageable test code.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [17-17]

The SetupSuite method is efficiently setting up test coins for use in the suite's test cases. The use of math.NewInt and sdk.NewCoin ensures that the test data is accurately represented and easily modifiable.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [21-21]

The TestCoin method correctly tests the creation of coins with various denominations and amounts, including negative amounts to ensure that the coin creation logic correctly panics in such cases. This is crucial for maintaining the integrity of coin operations within the SDK.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [23-23]

The TestCoin_String method effectively tests the string representation of a coin, ensuring that it matches the expected format. This is important for user interfaces and logging, where a clear and accurate representation of coins is necessary.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [25-25]

The TestIsEqualCoin method thoroughly tests the equality comparison between coins, covering various scenarios including different denominations and amounts. This ensures the reliability of coin comparisons, which are fundamental to many operations in the SDK.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [27-27]

The TestCoinIsValid method tests the validity of coins with various denominations, including edge cases like very long denominations and special characters. This comprehensive testing is essential for ensuring that only valid coins are accepted by the system.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [29-29]

The TestCoinsDenoms method correctly tests the extraction of denominations from a set of coins, ensuring that the functionality works as expected even in edge cases. This is important for operations that need to work with the denominations of coins directly.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [31-31]

The TestAddCoin method provides a thorough test of the addition operation for coins, including checks for overflows. This is crucial for ensuring the accuracy and safety of coin arithmetic operations within the SDK.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [33-33]

The TestAddCoinAmount method tests the addition of a raw amount to a coin, covering both positive and zero addition cases. This ensures that coin amounts can be accurately and safely modified.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [35-35]

The TestSubCoin method tests the subtraction operation between coins, including cases that should panic due to invalid operations. This ensures the robustness and safety of coin subtraction within the SDK.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [37-37]

The TestSubCoinAmount method tests the subtraction of a raw amount from a coin, including cases that should panic due to resulting in negative amounts. This is important for maintaining the integrity of coin values.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [39-39]

The TestMulIntCoins method tests the multiplication of a set of coins by an integer, including a case that should panic due to overflow. This ensures that coin values can be accurately scaled up while maintaining system integrity.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [41-41]

The TestQuoIntCoins method tests the division of a set of coins by an integer, including cases that should panic due to division by zero. This ensures that coin values can be accurately scaled down in a safe manner.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [43-43]

The TestIsGTCoin method tests the greater-than comparison between coins, including cases that should panic due to invalid comparisons. This ensures the reliability of coin comparisons, which are essential for decision-making logic in the SDK.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [45-45]

The TestIsGTECoin method tests the greater-than-or-equal-to comparison between coins, including cases that should panic due to invalid comparisons. This ensures the accuracy of coin comparisons, supporting various operations and logic within the SDK.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [47-47]

The TestIsLTECoin method tests the less-than-or-equal-to comparison between coins, including cases that should panic due to invalid comparisons. This is important for ensuring the correctness of comparisons, which are fundamental to many SDK operations.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [49-49]

The TestIsLTCoin method tests the less-than comparison between coins, including cases that should panic due to invalid comparisons. This ensures the robustness of coin comparison logic, which is crucial for various SDK functionalities.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [51-51]

The TestCoinIsZero method correctly tests whether a coin's amount is zero, covering both true and false cases. This functionality is essential for operations that need to distinguish between zero and non-zero coin amounts.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [53-53]

The TestCoinIsNil method tests the nil-check functionality for coins, ensuring that both uninitialized coins and coins with a denomination but no amount are considered nil. This is important for handling uninitialized or partially initialized coins correctly.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [55-55]

The TestFilteredZeroCoins method tests the filtering of zero-amount coins from a set, ensuring that only coins with non-zero amounts are retained. This is crucial for operations that should ignore zero-amount coins.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [57-57]

The TestCoins_String method tests the string representation of a set of coins, ensuring that it matches the expected format. This is important for displaying coin sets in user interfaces and logs.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [59-59]

The TestIsZeroCoins method tests whether a set of coins contains only zero-amount coins, covering various scenarios. This functionality is essential for operations that need to treat zero-amount coin sets differently from non-zero ones.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [61-61]

The TestEqualCoins method thoroughly tests the equality comparison between sets of coins, ensuring that the comparison logic is accurate and reliable. This is fundamental to many operations within the SDK that rely on comparing coin sets.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [63-63]

The TestAddCoins method tests the addition of two sets of coins, including cases that should result in zero coins being removed from the result. This ensures the accuracy and safety of adding coin sets.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [65-65]

The TestCoinsAddCoalescesDuplicateDenominations method tests the addition of coin sets with duplicate denominations, ensuring that duplicates are correctly coalesced. This is crucial for maintaining the integrity of coin sets.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [67-67]

The TestSubCoins method tests the subtraction of one set of coins from another, including cases that should panic due to invalid operations. This ensures the robustness and safety of coin set subtraction within the SDK.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [69-69]

The TestSafeSubCoin method tests the safe subtraction of one coin from another, including cases that should result in an error due to negative results. This is important for ensuring that coin subtraction is performed safely and accurately.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [71-71]

The TestCoins_Validate method tests the validation logic for sets of coins, including various invalid scenarios. This comprehensive testing is essential for ensuring that only valid coin sets are accepted by the system.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [73-73]

The TestMinMax method tests the calculation of the minimum and maximum between two sets of coins, ensuring that the functionality works as expected. This is important for operations that need to compare coin sets to determine bounds.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [75-75]

The TestCoinsGT method tests the greater-than comparison between sets of coins, ensuring that the comparison logic is accurate. This is crucial for decision-making logic within the SDK that relies on comparing coin sets.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [77-77]

The TestCoinsLT method tests the less-than comparison between sets of coins, ensuring that the comparison logic is accurate. This is important for various SDK operations that depend on comparing coin sets.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [79-79]

The TestCoinsLTE method tests the less-than-or-equal-to comparison between sets of coins, ensuring that the comparison logic is accurate. This supports the correctness of operations that rely on comparing coin sets.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [81-81]

The TestParseCoins method tests the parsing of coin strings into sets of coins, covering various scenarios including invalid inputs. This ensures that coin strings can be accurately and safely converted into coin sets.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [83-83]

The TestSortCoins method tests the sorting functionality for sets of coins, ensuring that coins are correctly sorted and that invalid coins are handled appropriately. This is crucial for maintaining the order and integrity of coin sets.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [85-85]

The TestSearch method tests the search functionality within sets of coins, ensuring that coins can be accurately found or determined to be absent. This supports operations that need to query coin sets for specific denominations.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [87-87]

The TestCoinsIsAnyGTE method tests whether any coin in one set is greater than or equal to any coin in another set, covering various scenarios. This is important for operations that need to compare coin sets in a non-strict manner.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [89-89]

The TestCoinsIsAllGT method tests whether all coins in one set are strictly greater than all coins in another set, ensuring the accuracy of this comparison logic. This supports the correctness of operations that rely on strict comparisons between coin sets.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [91-91]

The TestCoinsIsAllGTE method tests whether all coins in one set are greater than or equal to all coins in another set, ensuring the accuracy of this comparison logic. This is crucial for operations that rely on non-strict comparisons between coin sets.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [93-93]

The TestNewCoins method tests the creation of new coin sets, including cases that should panic due to invalid inputs. This ensures that coin sets are correctly initialized and validated upon creation.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [95-95]

The TestCoinsIsAnyGT method tests whether any coin in one set is strictly greater than any coin in another set, covering various scenarios. This is important for operations that need to compare coin sets in a specific manner.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [97-97]

The TestCoinsIsAnyNil method tests whether any coin in a set is considered nil, ensuring that sets with uninitialized or partially initialized coins are correctly identified. This is crucial for handling coin sets safely.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [99-99]

The TestMarshalJSONCoins method tests the JSON marshaling and unmarshaling of coin sets, ensuring that coin sets can be accurately represented in JSON format and reconstructed from it. This is important for data serialization and communication.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [101-101]

The TestCoinValidate method tests the validation logic for individual coins, including various invalid scenarios. This comprehensive testing is essential for ensuring that only valid coins are accepted by the system.

106-111: > 📝 NOTE

This review was outside the diff hunks, and no overlapping diff hunk was found. Original lines [103-103]

The TestCoinAminoEncoding method tests the Amino encoding and decoding of individual coins, ensuring that coins can be accurately serialized and deserialized using the Amino codec. This is important for data persistence and communication.

coderabbitai

Review Status

Actionable comments generated: 0

Configuration used: .coderabbit.yml

Commits

Files that changed from the base of the PR and between 2c040a4 and d493fc6.

Files selected for processing (1)

types/coin_regex.go (1 hunks)

Files not summarized due to errors (1)

types/coin_regex.go: Error: Message exceeds token limit

Additional comments: 2

types/coin_regex.go (2)

8-455: The MatchDenom function is well-structured and follows the generated pattern from Ragel for state machine-based parsing. The function aims to validate coin denominations using a finite state machine, which is a more efficient approach compared to regex parsing. The transition arrays and logic are generated by Ragel, ensuring correctness as long as the .rl file is correctly defined. The function returns a boolean indicating whether the input matches the denom pattern, which aligns with the PR's objective to improve performance and maintainability by replacing regex-based validation.

458-972: The MatchDecCoin function introduces a more complex state machine to parse decimal coin amounts, capturing the start and end positions of the amount and denomination. This function is a significant improvement in terms of performance and precision over regex-based parsing for decimal coins. The use of action codes to mark the start and end of the amount and to adjust the end position of the denomination is a clever use of Ragel's capabilities. However, it's crucial to ensure that the .rl file accurately defines the expected patterns to avoid any parsing inaccuracies. The function returns the start and end positions for the amount and denomination, along with a validity flag, which is a practical approach for downstream processing.

types/coin.go

facundomedica · 2024-02-21T09:49:33Z

I'm not familiar with Ragel, could we get a small benchmark to compare? Also, how is the code generated? Thank you!

mattverse · 2024-02-21T10:58:56Z

@facundomedica of course! I'll get the benchmark tomorrow and reply in the PR

types/coin.go

types/coin_regex.rl

types/coin_regex.go

types/coin.go

coderabbitai

Review Status

Actionable comments generated: 3

Configuration used: .coderabbit.yml

Commits

Files that changed from the base of the PR and between d493fc6 and fac684e.

Files selected for processing (4)

CHANGELOG.md (1 hunks)
types/coin.go (2 hunks)
types/coin_regex.go (1 hunks)
types/coin_regex.rl (1 hunks)

Files not summarized due to errors (1)

types/coin_regex.go: Error: Message exceeds token limit

Files skipped from review as they are similar to previous changes (2)

types/coin.go
types/coin_regex.rl

Additional comments: 4

CHANGELOG.md (4)
45-45: The entry for replacing regex parsing with direct matching methods for denom validation accurately reflects the PR's main objective. It's clear and directly links to the PR, facilitating easy reference.

45-45: The addition of a ValueCodec for the math.Uint type in collections maps is mentioned. This change seems unrelated to the main focus of the PR but is a valuable addition to the SDK. Ensure that this change has been documented in the appropriate sections of the SDK documentation for developers to understand its usage and benefits.

Verification successful

The search results indicate that the documentation, specifically within the collections/README.md file, has been updated to mention "math.Uint" multiple times. This suggests that the addition of a ValueCodec for the math.Uint type in collections maps, as mentioned in the review comment, is likely documented, albeit not with the explicit use of the term "ValueCodec". Given the context and the findings, it appears that the documentation has been updated in relation to the change, but the specific terminology used in the review comment may not have been directly matched in the documentation.
Scripts Executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Search for documentation updates related to ValueCodec for math.Uint
rg "ValueCodec for math.Uint" docs/
Length of output: 37

Script:
#!/bin/bash
# Broaden the search for documentation updates related to ValueCodec and math.Uint separately
rg "ValueCodec" -or "math.Uint" docs/
# Additionally, search for any mention of ValueCodec or math.Uint in README files or developer guides that might not be in the docs/ directory
rg "ValueCodec" -or "math.Uint" --glob "*README.md" --glob "*developer_guide*"
Length of output: 400
* 45-45: The introduction of the `IsGT` method for `types.Coin` is correctly documented. This method enhances the SDK's functionality by providing a straightforward way to compare coin values. It's important to ensure that examples of using `IsGT` are included in the SDK documentation to help developers understand its application. * 45-45: Adding the `--qrcode` flag to the `keys show` command is a user-friendly feature that enhances the usability of the SDK. This change should be highlighted in the SDK's user guide or command-line tool documentation to ensure users are aware of how to use this new feature.

types/coin_regex.go

mattverse · 2024-02-22T08:17:24Z

Finished benchmarking! Bench marked using existing benchmarking test https://github.com/cosmos/cosmos-sdk/blob/main/types/bench_test.go#L18 on parsing coins.

Here are the results:

$ benchstat old.txt new.txt

goos: darwin
goarch: arm64
pkg: github.com/cosmos/cosmos-sdk/types
             │   old.txt   │              new.txt               │
             │   sec/op    │   sec/op     vs base               │
ParseCoin-12   4.173µ ± 2%   3.846µ ± 1%  -7.83% (p=0.000 n=10)

             │   old.txt    │               new.txt                │
             │     B/op     │     B/op      vs base                │
ParseCoin-12   2.156Ki ± 0%   1.719Ki ± 0%  -20.29% (p=0.000 n=10)

             │  old.txt   │              new.txt               │
             │ allocs/op  │ allocs/op   vs base                │
ParseCoin-12   71.00 ± 0%   63.00 ± 0%  -11.27% (p=0.000 n=10)

We can see

7% decrease in operation time
average of approx 20% improvement on memory usage
approx 11% improvement on heap allocation !

coderabbitai

Review Status

Actionable comments generated: 2

Configuration used: .coderabbit.yml

Commits

Files that changed from the base of the PR and between fac684e and 54e29ff.

Files selected for processing (2)

types/coin_regex.go (1 hunks)
types/coin_regex.rl (1 hunks)

Files skipped from review as they are similar to previous changes (1)

types/coin_regex.rl

Additional comments: 2

types/coin_regex.go (2)

2-5: The file header clearly states that this is generated code and should not be edited manually. It also specifies the source file and disables certain linter warnings. This is a good practice for generated code to ensure maintainability and clarity about its generation and purpose.

8-15: The comments explaining the previous regex patterns provide valuable context for understanding the transition from regex to Ragel. Including a brief comment about the choice of Ragel over regex, as suggested in the existing comments, is beneficial for future maintainers.

types/coin_regex.go

ValarDragon · 2024-02-22T14:53:48Z

I think the speed improvement is actually way higher. The IBC hashes are going to be much longer, but this isn't in the coin benchmarks

tac0turtle

What is the difference between using regal over a simple go loop since we know the input. Bez mentioned that it is recommended to move away from regular expression once there is understanding of the format.

@mattverse would love to better understand the reasoning behind using regal over native golang

ValarDragon · 2024-02-22T15:52:36Z

Ragel is native golang, its just code generated. I think its worth benchmarking against the for loop I put in the GH issue as well though

facundomedica · 2024-02-22T16:15:07Z

I just benchmarked only the ValidateDenom, first regex vs loop:

benchstat 20runs_regex.txt 20runs_loop.txt
goos: darwin
goarch: arm64
pkg: github.com/cosmos/cosmos-sdk/types
                │ 20runs_regex.txt │           20runs_loop.txt           │
                │      sec/op      │   sec/op     vs base                │
ValidateDenom-8       1913.5n ± 0%   484.4n ± 1%  -74.69% (p=0.000 n=20)

                │ 20runs_regex.txt │        20runs_loop.txt         │
                │       B/op       │    B/op     vs base            │
ValidateDenom-8         0.000 ± 0%   0.000 ± 0%  ~ (p=1.000 n=20) ¹
¹ all samples are equal

                │ 20runs_regex.txt │        20runs_loop.txt         │
                │    allocs/op     │ allocs/op   vs base            │
ValidateDenom-8         0.000 ± 0%   0.000 ± 0%  ~ (p=1.000 n=20) ¹
¹ all samples are equal

Now loop vs ragel:

benchstat 20runs_loop.txt 20runs_ragel.txt 
goos: darwin
goarch: arm64
pkg: github.com/cosmos/cosmos-sdk/types
                │ 20runs_loop.txt │          20runs_ragel.txt           │
                │     sec/op      │   sec/op     vs base                │
ValidateDenom-8       484.4n ± 1%   124.8n ± 1%  -74.25% (p=0.000 n=20)

                │ 20runs_loop.txt │       20runs_ragel.txt       │
                │      B/op       │    B/op     vs base          │
ValidateDenom-8          0.0 ± 0%   160.0 ± 0%  ? (p=0.000 n=20)

                │ 20runs_loop.txt │       20runs_ragel.txt       │
                │    allocs/op    │ allocs/op   vs base          │
ValidateDenom-8        0.000 ± 0%   2.000 ± 0%  ? (p=0.000 n=20)

Used these denoms:

var denoms = []string{
	"ATM",
	"AMX",
	"XXX",
	"BTC",
	"ibc/27394FB092D2ECCD56123C74F36E4C1F926001CEADA9CA97EA622B25F41E5EB2",
	"ibc/498A0751C798A0D9A389AA3691123DADA57DAA4FE165D5C75894505B876BA6E4",
	"uatom",
}

coderabbitai

Review Status

Actionable comments generated: 0

Configuration used: .coderabbit.yml

Commits

Files that changed from the base of the PR and between 54e29ff and fe75550.

Files selected for processing (3)

types/coin_regex.go (1 hunks)
types/coin_regex.rl (1 hunks)
types/dec_coin.go (1 hunks)

Files skipped from review as they are similar to previous changes (3)

types/coin_regex.go
types/coin_regex.rl
types/dec_coin.go

mattverse · 2024-02-22T16:20:42Z

@facundomedica Amazing! Thanks so much 🙇 So is the claim loop shows approx 75% decrease then regel showing another 75% decrease from there? Also curious about other stats!

coderabbitai

Review Status

Actionable comments generated: 0

Configuration used: .coderabbit.yml

Commits

Files that changed from the base of the PR and between 8bc1443 and e81f4c9.

Files selected for processing (1)

CHANGELOG.md (1 hunks)

Files skipped from review as they are similar to previous changes (1)

CHANGELOG.md

ValarDragon · 2024-03-05T09:09:27Z

types/coin.go

-		return fmt.Errorf("invalid denom: %s", denom)
+	if reDnm == nil || reDecCoin == nil {
+		// Convert the string to a byte slice as required by the Ragel-generated function.
+		denomBytes := []byte(denom)


In the future we should do internal.unsafeconvstringtobytes

That should fix the allocations in this fn

ValarDragon

Lgtm! Nice job! Looking forward to this dropping out of profiles

kocubinski · 2024-03-05T16:50:24Z

types/coin_regex.rl

@@ -0,0 +1,39 @@
+// Code generated by regel using `ragel -Z coin_regex.rl`.


Comment is misleading, this code wasn't generated right? It was written and then coin_regex.go was generated from it?

The comment here goes directly to the generated code( coin_regex.go ). Changed in the most recent commit to be more clear!

tac0turtle · 2024-03-06T12:15:26Z

once linting passes we can merge. Thank you spending time on this

coderabbitai

Review Status

Actionable comments generated: 0

Configuration used: .coderabbit.yml

Commits

Files that changed from the base of the PR and between e81f4c9 and 7018ae4.

Files selected for processing (3)

types/coin.go (2 hunks)
types/coin_regex.go (1 hunks)
types/coin_regex.rl (1 hunks)

Files skipped from review as they are similar to previous changes (3)

types/coin.go
types/coin_regex.go
types/coin_regex.rl

coderabbitai

Review Status

Actionable comments generated: 0

Configuration used: .coderabbit.yml

Commits

Files that changed from the base of the PR and between 7018ae4 and c4e1abe.

Files selected for processing (3)

CHANGELOG.md (1 hunks)
tests/integration/bank/keeper/deterministic_test.go (1 hunks)
tests/integration/staking/keeper/deterministic_test.go (1 hunks)

Files skipped from review as they are similar to previous changes (3)

CHANGELOG.md
tests/integration/bank/keeper/deterministic_test.go
tests/integration/staking/keeper/deterministic_test.go

coderabbitai

Review Status

Actionable comments generated: 0

Configuration used: .coderabbit.yml

Commits

Files that changed from the base of the PR and between c4e1abe and 1a87ce7.

Files selected for processing (1)

types/coin_regex.go (1 hunks)

Files skipped from review as they are similar to previous changes (1)

types/coin_regex.go

coderabbitai

Review Status

Actionable comments generated: 0

Configuration used: .coderabbit.yml

Commits

Files that changed from the base of the PR and between 1a87ce7 and 97b691f.

Files selected for processing (1)

.golangci.yml (1 hunks)

Files skipped from review due to trivial changes (1)

.golangci.yml

…parsing (cosmos#19511)

)

* feat(x/bank): Replace regex parsing of denom validation to generated parsing (cosmos#19511) * fix: fix ragel denom validation introduced in cosmos#19511 (cosmos#19701) * fix: use loop instead of ragel (cosmos#19710) * Fix test * Fix integration test as well * Try linting * Fix lint * Fix lint * add changelog --------- Co-authored-by: Marko <marbar3778@yahoo.com> Co-authored-by: Adam Tucker <adam@osmosis.team> Co-authored-by: Adam Tucker <adamleetucker@outlook.com>

* feat(x/bank): Replace regex parsing of denom validation to generated parsing (cosmos#19511) * fix: fix ragel denom validation introduced in cosmos#19511 (cosmos#19701) * fix: use loop instead of ragel (cosmos#19710) * Fix test * Fix integration test as well * Try linting * Fix lint * Fix lint * add changelog --------- Co-authored-by: Marko <marbar3778@yahoo.com> Co-authored-by: Adam Tucker <adam@osmosis.team> Co-authored-by: Adam Tucker <adamleetucker@outlook.com> (cherry picked from commit ab4bc05)

* feat(x/bank): Replace regex parsing of denom validation to generated parsing (cosmos#19511) * fix: fix ragel denom validation introduced in cosmos#19511 (cosmos#19701) * fix: use loop instead of ragel (cosmos#19710) * Fix test * Fix integration test as well * Try linting * Fix lint * Fix lint * add changelog --------- Co-authored-by: Marko <marbar3778@yahoo.com> Co-authored-by: Adam Tucker <adam@osmosis.team> Co-authored-by: Adam Tucker <adamleetucker@outlook.com> (cherry picked from commit ab4bc05) Co-authored-by: Matt, Park <45252226+mattverse@users.noreply.github.com>

* feat(x/bank): Replace regex parsing of denom validation to generated parsing (cosmos#19511) * fix: fix ragel denom validation introduced in cosmos#19511 (cosmos#19701) * fix: use loop instead of ragel (cosmos#19710) * Fix test * Fix integration test as well * Try linting * Fix lint * Fix lint * add changelog --------- Co-authored-by: Marko <marbar3778@yahoo.com> Co-authored-by: Adam Tucker <adam@osmosis.team> Co-authored-by: Adam Tucker <adamleetucker@outlook.com>

Replace regex

2c040a4

mattverse requested a review from a team as a code owner February 21, 2024 09:14

coderabbitai bot reviewed Feb 21, 2024

View reviewed changes

Add changelog

d493fc6

coderabbitai bot reviewed Feb 21, 2024

View reviewed changes

Add changelog

60b4b47

facundomedica reviewed Feb 21, 2024

View reviewed changes

types/coin.go Outdated Show resolved Hide resolved

tac0turtle assigned kocubinski, alexanderbez and facundomedica Feb 21, 2024

alexanderbez approved these changes Feb 21, 2024

View reviewed changes

types/coin.go Outdated Show resolved Hide resolved

types/coin.go Outdated Show resolved Hide resolved

types/coin_regex.rl Show resolved Hide resolved

ValarDragon reviewed Feb 21, 2024

View reviewed changes

types/coin_regex.go Outdated Show resolved Hide resolved

ValarDragon reviewed Feb 21, 2024

View reviewed changes

types/coin.go Show resolved Hide resolved

mattverse added 2 commits February 22, 2024 15:33

Valars comment

fb0a26f

Apply comments

fac684e

mattverse force-pushed the mattverse/use-ragel branch from 5f24000 to fac684e Compare February 22, 2024 06:43

coderabbitai bot reviewed Feb 22, 2024

View reviewed changes

types/coin_regex.go Show resolved Hide resolved

types/coin_regex.go Outdated Show resolved Hide resolved

types/coin_regex.go Outdated Show resolved Hide resolved

Fix test

54e29ff

coderabbitai bot reviewed Feb 22, 2024

View reviewed changes

types/coin_regex.go Outdated Show resolved Hide resolved

types/coin_regex.go Outdated Show resolved Hide resolved

tac0turtle reviewed Feb 22, 2024

View reviewed changes

Fix tests

fe75550

coderabbitai bot reviewed Feb 22, 2024

View reviewed changes

mattverse added 2 commits March 4, 2024 13:09

Merge branch 'main' into mattverse/use-ragel

19aad48

Merge branch 'main' into mattverse/use-ragel

e81f4c9

coderabbitai bot reviewed Mar 5, 2024

View reviewed changes

ValarDragon reviewed Mar 5, 2024

View reviewed changes

ValarDragon approved these changes Mar 5, 2024

View reviewed changes

kocubinski reviewed Mar 5, 2024

View reviewed changes

Make lint happy, make comment more clear

7018ae4

coderabbitai bot reviewed Mar 7, 2024

View reviewed changes

Merge branch 'main' into mattverse/use-ragel

c4e1abe

coderabbitai bot reviewed Mar 7, 2024

View reviewed changes

Make lint happy

1a87ce7

coderabbitai bot reviewed Mar 7, 2024

View reviewed changes

Exclude from golangci

97b691f

github-actions bot added the Type: CI label Mar 7, 2024

coderabbitai bot reviewed Mar 7, 2024

View reviewed changes

facundomedica added this pull request to the merge queue Mar 7, 2024

Merged via the queue into cosmos:main with commit a1e3a85 Mar 7, 2024
57 of 60 checks passed

mattverse added a commit to osmosis-labs/cosmos-sdk that referenced this pull request Mar 8, 2024

feat(x/bank): Replace regex parsing of denom validation to generated …

e9dd2d8

…parsing (cosmos#19511)

mattverse mentioned this pull request Mar 8, 2024

fix: fix ragel denom validation introduced in #19511 #19701

Merged

12 tasks

github-merge-queue bot pushed a commit that referenced this pull request Mar 8, 2024

fix: fix ragel denom validation introduced in #19511 (#19701)

2bf7e9f

mattverse added a commit to osmosis-labs/cosmos-sdk that referenced this pull request Mar 13, 2024

feat(x/bank): Replace regex parsing of denom validation to generated …

f154204

…parsing (cosmos#19511)

mattverse added a commit to osmosis-labs/cosmos-sdk that referenced this pull request Mar 13, 2024

fix: fix ragel denom validation introduced in cosmos#19511 (cosmos#19701

7fc05b3

)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(x/bank): Replace regex parsing of denom validation to generated parsing #19511

feat(x/bank): Replace regex parsing of denom validation to generated parsing #19511

mattverse commented Feb 21, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Feb 21, 2024 •

edited

Loading

Walkthrough

Changes

Assessment against linked issues

Related issues

Chat

CodeRabbit Commands (invoked as PR comments)

CodeRabbit Configration File (`.coderabbit.yaml`)

CodeRabbit Discord Community

coderabbitai bot left a comment

coderabbitai bot left a comment

facundomedica commented Feb 21, 2024

mattverse commented Feb 21, 2024

coderabbitai bot left a comment

mattverse commented Feb 22, 2024 •

edited

Loading

coderabbitai bot left a comment

ValarDragon commented Feb 22, 2024

tac0turtle left a comment

ValarDragon commented Feb 22, 2024

facundomedica commented Feb 22, 2024

coderabbitai bot left a comment

mattverse commented Feb 22, 2024

coderabbitai bot left a comment

ValarDragon Mar 5, 2024 •

edited

Loading

ValarDragon left a comment

kocubinski Mar 5, 2024

mattverse Mar 7, 2024

tac0turtle commented Mar 6, 2024

coderabbitai bot left a comment

coderabbitai bot left a comment

coderabbitai bot left a comment

coderabbitai bot left a comment

		@@ -0,0 +1,39 @@
		// Code generated by regel using `ragel -Z coin_regex.rl`.

feat(x/bank): Replace regex parsing of denom validation to generated parsing #19511

feat(x/bank): Replace regex parsing of denom validation to generated parsing #19511

Conversation

mattverse commented Feb 21, 2024 • edited by coderabbitai bot Loading

Description

Author Checklist

Reviewers Checklist

Summary by CodeRabbit

coderabbitai bot commented Feb 21, 2024 • edited Loading

Walkthrough

Changes

Assessment against linked issues

Related issues

Chat

CodeRabbit Commands (invoked as PR comments)

CodeRabbit Configration File (.coderabbit.yaml)

CodeRabbit Discord Community

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

facundomedica commented Feb 21, 2024

mattverse commented Feb 21, 2024

coderabbitai bot left a comment

Choose a reason for hiding this comment

mattverse commented Feb 22, 2024 • edited Loading

coderabbitai bot left a comment

Choose a reason for hiding this comment

ValarDragon commented Feb 22, 2024

tac0turtle left a comment

Choose a reason for hiding this comment

ValarDragon commented Feb 22, 2024

facundomedica commented Feb 22, 2024

coderabbitai bot left a comment

Choose a reason for hiding this comment

mattverse commented Feb 22, 2024

coderabbitai bot left a comment

Choose a reason for hiding this comment

ValarDragon Mar 5, 2024 • edited Loading

Choose a reason for hiding this comment

ValarDragon left a comment

Choose a reason for hiding this comment

kocubinski Mar 5, 2024

Choose a reason for hiding this comment

mattverse Mar 7, 2024

Choose a reason for hiding this comment

tac0turtle commented Mar 6, 2024

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

mattverse commented Feb 21, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Feb 21, 2024 •

edited

Loading

CodeRabbit Configration File (`.coderabbit.yaml`)

mattverse commented Feb 22, 2024 •

edited

Loading

ValarDragon Mar 5, 2024 •

edited

Loading