Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API doc directives: accept full character range in package names #431

Merged
merged 1 commit into from
Apr 7, 2022

Conversation

poWer4aiX
Copy link
Contributor

According to the Java Language Specification member of packages, e.g. subpackages are identifiers. Identifiers itself is

...an unlimited-length sequence of Java letters and Java digits, the first of which must be a Java letter.

As a Java letter includes more than what is expressed by the regex \b[a-z][a-z0-9_]*)\., package names which include uppercase or non ASCII characters cannot be processed. I know that there are some naming conventions flying around but something like aBc.DE.fg is still a valid package name in my opinion.

I have changed the regex in packageDotsToSlash() to (\b\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*)\.. Unfortunately this breaks with the changes applied for issue #397, #395, #98, #86 to handle inner classes. In order to support this notation, the regex can be relaxed to (\b\p{javaLowerCase}\p{javaJavaIdentifierPart}*)\..

I have added the variable 'scaladoc.strictPackageIdent' (which defaults to false) by which you can switch between the two regexp. Once set to true, the package name can conform to an identifier now.

@lightbend-cla-validator
Copy link
Collaborator

At least one pull request committer is not linked to a user. See https://help.github.com/en/articles/why-are-my-commits-linked-to-the-wrong-user#commits-are-not-linked-to-any-user

@poWer4aiX
Copy link
Contributor Author

Stupid question, is there anything I have to do in order to pass the check? I don't understand why "the committer (me) should not linked to a user"...

@poWer4aiX poWer4aiX closed this Apr 21, 2020
@poWer4aiX poWer4aiX reopened this Apr 21, 2020
@lightbend-cla-validator
Copy link
Collaborator

Hi @poWer4aiX,

Thank you for your contribution! We really value the time you've taken to put this together.

Before we proceed with reviewing this pull request, please sign the Lightbend Contributors License Agreement:

http://www.lightbend.com/contribute/cla

@poWer4aiX
Copy link
Contributor Author

CLA signed and replied by email...

Copy link
Member

@ennru ennru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's OK to accept non-ASCII chars in package and class names, but that should be applied for both @javadoc and @scaladoc.

@raboof
Copy link
Contributor

raboof commented Apr 23, 2020

Your CLA signature somehow didn't make it into our system yet. That's probably a problem on our side - how did you submit it?

@poWer4aiX
Copy link
Contributor Author

poWer4aiX commented Apr 23, 2020 via email

@poWer4aiX
Copy link
Contributor Author

@raboof, I simply filled out the PDF and attached it by replying to the email I received. Just noticed that this was addressed to notifications@github.com which is obviously the wrong address, I guess.

Resent it some minutes ago to reply+..., hope this is working now. Sorry for any inconveniences.

@lightbend-cla-validator
Copy link
Collaborator

Hi @poWer4aiX,

Thank you for your contribution! We really value the time you've taken to put this together.

Before we proceed with reviewing this pull request, please sign the Lightbend Contributors License Agreement:

http://www.lightbend.com/contribute/cla

@ennru
Copy link
Member

ennru commented Apr 24, 2020

Your CLA still hasn't shown up in our system, I'm sorry.
Can you try and sign it with the Github integration?

@poWer4aiX
Copy link
Contributor Author

Can you please explain how to "sign it with the Github integration" or provide me a link with a description? BTW, I just sent it to "cla@lightbend.com".

@ennru
Copy link
Member

ennru commented Apr 24, 2020

Sorry for all this trouble, on https://www.lightbend.com/contribute/cla you'll find the "Confirm with your GitHub account" link.

@ennru
Copy link
Member

ennru commented Apr 24, 2020

Ok, all good now. I checked the CLA manually.

@poWer4aiX
Copy link
Contributor Author

Thanks for your support.

Well in the latest version I have applied the change for @scaladoc and @javadoc. In addition I followed I added a setting for [javadoc|scaladoc].<package-prefix>.package_name_style, similar to jadadoc...link_style. Defaults to startWithLowercase, can be set to startWithAnycase.

Copy link
Contributor

@raboof raboof left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

javadoc needs a tweak, otherwise seems OK to me

* Converts package dot notation to a path, separated by '/'
* Allow all valid java characters and java numbers to be used, according to the java lang spec. In order to
* be allow to convert `package.Outer.Inner` in scala to the `packge/Outer$$Inner`, `strictPackageIdent` has
* be set to `true`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this is no longer accurate. Good to give an example: startWithLowercase will get it wrong when a package name starts with an uppercase letter or when an inner class starts with a lowercase character, while startWithAnycase will derive the wrong path whenever an inner class is encountered. Does that summarize the trade-off correctly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, it's the old style of setting. Will shorten this block and add your description to the @param description.

Suggested change
* be set to `true`.
* Converts package dot notation to a path, separated by '/'
* Allow all valid java characters and java numbers to be used, according to the java lang spec.
*
* @param s package or full qualified class name to be converted.
* @param packageNameStyle Setting `startWithLowercase`` will get it wrong when a package name
* starts with an uppercase letter or when an inner class starts with
* a lowercase character, while `startWithAnycase` will derive the wrong
* path whenever an inner class is encountered.

}

case class ScaladocDirective(ctx: Writer.Context)
extends ApiDocDirective("scaladoc") {

import ScaladocDirective._

val defaultPackageNameStyle = variables.getOrElse("scaladoc.package_name_style", "startWithLowercase")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 to keep the default

@poWer4aiX
Copy link
Contributor Author

If you think the change in general is fine, I will also update the documentation.

@poWer4aiX
Copy link
Contributor Author

@raboof in @javadoc the cases with inner classes

  • having a subpackage starting with an uppercase character (org.example.Lib.Outer.Inner)
  • having an outer class starting with a lowercase character (org.example.lib.outer.Inner)
  • having the inner class starting with a lowercase character (org.example.lib.Outer.inner)

are not addressed so far. @javadoc will resolve to an incorrect URL in all cases. To fix this I propose to make use of the opposite from scala and resolve $$ in the FQCN to a ..

I have implemented it and added the corresponding tests as well.

@poWer4aiX poWer4aiX requested a review from raboof April 28, 2020 07:58
Copy link
Contributor

@raboof raboof left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kind of weird to introduce yet another notation with the $$, but OK


private[markdown] def url(link: String, baseUrl: Url, linkStyle: LinkStyle, packageNameStyle: String): Url = {
val url = Url(link).base
val path = dollarDollarToClassDot(ApiDocDirective.packageDotsToSlash(url.getPath, packageNameStyle)) + ".html"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So any names actually containing $$ are now broken - I suppose that should be exotic enough that this is OK.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I also think that this might be a very theoretical case. If we want to get rid of it, we can also think of using a non java character, e.g. a .. (two dots) instead of $$. If I', right then this is invalid for any full qualified name.

@mdedetrich
Copy link
Contributor

Is there a plan to merge this PR? This is mainly in relation to #515 but in general its been approved but sitting here unmerged for 2 years.

@raboof raboof self-requested a review April 7, 2022 11:36
@raboof raboof self-assigned this Apr 7, 2022
@raboof raboof force-pushed the strictPackageIdent branch from f8a96ed to 6f49d2b Compare April 7, 2022 12:35
@raboof
Copy link
Contributor

raboof commented Apr 7, 2022

Is there a plan to merge this PR? This is mainly in relation to #515 but in general its been approved but sitting here unmerged for 2 years.

I guess it was because it conflicted with master and wasn't a trivial merge/rebase ;). I squashed and rebased now, kept the old commits for future reference in https://github.com/raboof/paradox/pull/new/strictPackageIdent-bak .

According to the [Java Language Specification](https://docs.oracle.com/javase/specs/jls/se8/html/index.html)
member of packages, e.g. subpackages are identifiers. Identifiers itself is

    ...an unlimited-length sequence of Java letters and Java digits, the first of which must be a Java letter.

As a `Java letter` includes more than what is expressed by the regex `\b[a-z][a-z0-9_]*)\.`, package names which include uppercase or non ASCII characters cannot be processed. I know that there are some _naming conventions_ flying around but something like `aBc.DE.fg` is still a valid package name in my opinion.

I have changed the regex in `packageDotsToSlash()` to `(\b\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*)\.`. Unfortunately this breaks with the changes applied for issue lightbend#397, lightbend#395, lightbend#98, lightbend#86 to handle inner classes. In order to support this notation, the regex can be relaxed to `(\b\p{javaLowerCase}\p{javaJavaIdentifierPart}*)\.`.

I have added the variable 'scaladoc.strictPackageIdent' (which defaults to `false`) by which you can switch between the two regexp. Once set to true, the package name can conform to an identifier now.
@raboof raboof force-pushed the strictPackageIdent branch from 6f49d2b to 76946fb Compare April 7, 2022 12:43
@raboof raboof merged commit 8d43654 into lightbend:master Apr 7, 2022
@mdedetrich
Copy link
Contributor

Perfect, thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants