Correctly convert `SCREAMING_SNAKE_CASE` #1884

Saalvage · 2024-12-20T15:32:22Z

The problem of normalizing casings is a very interesting one.
I've decided to reimplement the function, partly to make it more readable.

It's trivial to implement a function that converts consistent input to consistent output. The difficulty lies in converting nonsensical input (weird casings and case exceptions) to sensible output, especially while acting purely locally.
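To illustrate the distinction, here is a minimal sketch of the "trivial" approach. It is not the code from this PR, and the function name `naive_to_pascal` is purely hypothetical. It handles consistent snake-case input, but it flattens existing casing and drops leading underscores on mixed input:

```python
def naive_to_pascal(name: str) -> str:
    """Split on underscores and capitalize each non-empty piece.

    Hypothetical helper for illustration only; not the PR's implementation.
    """
    return "".join(
        part[:1].upper() + part[1:].lower()
        for part in name.split("_")
        if part  # empty pieces come from leading, trailing, or repeated underscores
    )


# Consistent input converts as expected.
print(naive_to_pascal("my_cool_string"))        # MyCoolString
print(naive_to_pascal("SCREAMING_SNAKE_CASE"))  # ScreamingSnakeCase

# Inconsistent input exposes the problem: casing and underscores are lost.
print(naive_to_pascal("MyCoolString"))          # Mycoolstring
print(naive_to_pascal("_d_"))                   # D
```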

These are the test cases that would be different for the original implementation:

| Test Value | Original Implementation | New Implementation |
| --- | --- | --- |
| `SCREAMING` | `SCREAMING` | `Screaming` |
| `Still_Not___Screaming` | `StillNotScreaming` | `Still_Not___Screaming` |
| `Still_Not___screaming` | `StillNot_screaming` | `Still_Not___screaming` |
| `still_not___screaming` | `StillNotScreaming` | `StillNot___screaming` |
| `_D_` | `D_` | `_D_` |
| `_d_` | `D` | `_d_` |
| `MyCool2d_string` | `MyCool2d_string` | `MyCool2dString` |
| `MyCool2_d_String` | `MyCool2_dString` | `MyCool2D_String` |
| `MyC_de_cool` | `MyC_de_cool` | `MyC_deCool` |
| `m_d_d_d` | `MDDD` | `M_dD_d` |
| `m_d_d_d` (`LowerCamelCase`) | `mDDD` | `mD_dD` |
| `__m_d_d_d` | `MDDD` | `__mD_dD` |

The design questions are:

- How to deal with leading underscores? I've decided to keep them, because I consider them a prefix rather than part of the name's casing.
- How to deal with multiple consecutive underscores? I've decided to leave them untouched, as I doubt they merely indicate casing.
- Should the output be allowed to contain more capitals in a row than the input? I have decided to strictly prevent this, whereas the previous implementation did not. This is notable for the `m_d_d_d` test case, which previously became `MDDD`. I recognize that the previous handling of this case was superior; however, this case is highly synthetic (in general, the test cases are not solely intended to represent real-world scenarios, but to capture the design decisions made).
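The first two decisions can be sketched roughly as follows. This is an illustrative sketch under assumptions, not the PR's code: the names `split_leading_underscores`, `SINGLE_SEPARATOR`, and `separator_positions` are hypothetical, and the capital-run policy from the third point is deliberately not modeled, so this does not reproduce the table above.

```python
import re


def split_leading_underscores(name: str) -> tuple[str, str]:
    """Return (prefix, rest), where prefix is the run of leading underscores."""
    rest = name.lstrip("_")
    return name[: len(name) - len(rest)], rest


# A single underscore with a non-underscore character on both sides.
# Runs of two or more underscores never match, so they are left alone.
SINGLE_SEPARATOR = re.compile(r"(?<=[^_])_(?=[^_])")


def separator_positions(name: str) -> list[int]:
    """Indices in `name` of underscores that would count as word separators."""
    prefix, rest = split_leading_underscores(name)
    return [len(prefix) + m.start() for m in SINGLE_SEPARATOR.finditer(rest)]


print(split_leading_underscores("__m_d_d_d"))        # ('__', 'm_d_d_d')
print(separator_positions("still_not___screaming"))  # [5]
print(separator_positions("_D_"))                    # []
```

Only the underscore between `still` and `not` qualifies as a separator; the triple underscore and the leading and trailing underscores are kept as they are.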

In conclusion, I believe the new implementation is generally more stable, consistent, and sensible in realistic edge cases.

Saalvage · 2024-12-20T16:23:54Z

A general issue with the new approach is that abbreviations such as `UTF` get converted into `Utf`, which is clearly undesirable.
There should be ways to remedy this, e.g. by analyzing blocks of characters separated by `_`. However, all of this would massively complicate the code, and improper handling of screaming snake case appears to be a rather niche problem, as it has not been mentioned in any issue yet. This leads me to believe it is better solved at the client level.
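A client-level workaround might look roughly like the sketch below. It is an assumption-laden illustration, not part of this PR: `KNOWN_ACRONYMS` and `screaming_to_pascal` are hypothetical names, and the allow-list would have to be maintained by the client. It analyzes the `_`-separated blocks and keeps listed abbreviations in upper case:

```python
# Hypothetical, client-maintained allow-list of abbreviations to keep upper-case.
KNOWN_ACRONYMS = frozenset({"UTF", "HTTP", "ID"})


def screaming_to_pascal(name: str, acronyms: frozenset[str] = KNOWN_ACRONYMS) -> str:
    """Convert SCREAMING_SNAKE_CASE to PascalCase, preserving known acronyms."""
    blocks = [block for block in name.split("_") if block]
    return "".join(
        block if block in acronyms else block[:1].upper() + block[1:].lower()
        for block in blocks
    )


print(screaming_to_pascal("UTF_ENCODING"))      # UTFEncoding
print(screaming_to_pascal("MAX_HTTP_RETRIES"))  # MaxHTTPRetries
print(screaming_to_pascal("SCREAMING"))         # Screaming
```

Without such a list there is no local way to tell an abbreviation from an ordinary upper-case word, which is why this kind of logic fits better in client code than in the generic conversion.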

Saalvage added 3 commits December 20, 2024 15:18

Correctly convert SCREAMING_SNAKE_CASE

f417073

Additional test case

9a43e76

More test cases and comments

92a5f74

Saalvage marked this pull request as draft December 20, 2024 15:43

Saalvage closed this Dec 20, 2024
