Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add option to scan and register HTML anchors #20

Closed
wants to merge 8 commits into from

Conversation

tvdboom
Copy link
Contributor

@tvdboom tvdboom commented Aug 9, 2022

I noticed autorefs only scans the toc for anchors. This PR enhances the library by also searching for all html tags with the id attribute. The user might want to use autorefs to link to any arbitrary html tag that is not part of the toc.

Use case: When documenting a Python class, every method can be preceded by a tag <a id="method_name"></a> to refer to it. Using markdown headers is not desired in this case because you don't want all the methods listed in the toc.

@pawamoy pawamoy self-assigned this Aug 18, 2022
@oprypin
Copy link
Member

oprypin commented Aug 18, 2022

Could you please share the repository where you're using this approach of manually adding tags?
I'm currently looking for inspiration in the topic of documenting individual members one by one, for example here https://github.com/mkdocs/mkdocs/pull/2934/files#diff-eed24d79e27242a1b1f070ca802528052c5366b8eebf1cd5f7959a39dbf665cf

Also seeing your use case would help justify the fact that we're parsing HTML with regex here, which usually is to be avoided at all costs 😬

On a separate note - are you aware that scan_toc: bool = True is not a setting that's possible for the plugin's user to set? It is made only to be overridden in code: https://github.com/mkdocstrings/mkdocstrings/search?q=scan_toc
so if you want to actually use this option you're adding, you'd have to add it to the plugin's config schema.

And we'd have to at least make this option false by default, as it's kind of a breaking change and also it's likely to greatly impact performance.

@tvdboom
Copy link
Contributor Author

tvdboom commented Aug 19, 2022

The repo I am working on is: https://github.com/tvdboom/ATOM
Documentation is here: https://tvdboom.github.io/ATOM

See here for example how I create the <a name="fit-transform"></a> tag to refer to the method in the toc above (I should have used id, but used name because my html is not that good).

Up until now I made all these pages manually, which was a lot of work and not very efficient. I am currently working on an automated way to make the same documentation. See the development branch. Here the same file, for example.
This page is rendered here and converted into (almost) the same html as currently in master. You can run mkdocs serve yourself if you clone the repo it to test it.

Let me know if you have any questions.

On a separate note - are you aware that scan_toc: bool = True is not a setting that's possible for the plugin's user to set? It is made only to be overridden in code: https://github.com/mkdocstrings/mkdocstrings/search?q=scan_toc
so if you want to actually use this option you're adding, you'd have to add it to the plugin's config schema.
And we'd have to at least make this option false by default, as it's kind of a breaking change and also it's likely to greatly impact performance.

I am aware that scan_toc was only an internal setting but I tried to follow the "style" of the existing code. I do agree that this should be an option that defaults to False since it has indeed an impact on performance (although not as great as you might expect) based on my current tests (I already implemented this on my local download of autorefs). I'll change the code in the PR as soon as I have time for it.

P.S. I am not sure what the ci/quality test should do to pass

@oprypin
Copy link
Member

oprypin commented Aug 19, 2022

Sorry um this is off-topic, but... is there no way at all to make mkdocstrings work for your usecase? You could probably even subclass the Python handler, replace all templates and such...

@tvdboom
Copy link
Contributor Author

tvdboom commented Aug 20, 2022

I tried to make it work with mkdocstrings, but it was bugged. Not sure what went wrong but the formatting was all off. Furthermore, the support for numpydoc style is limited. I also didn't know you could overwrite the templates. If it's easy, it would be nice to see that in the docs somewhere.

But, to be honest, I don't think I can achieve the level of customization I want with mkdocstrings. Have a look at this page for example. Here, I want to list specific methods and properties at every section (some of them inherited from a parent class and some not), add examples right from the docstring, list properties that are not directly inherited by the class, add a toc at every section, etc... I figured that it would be easier to write my own parser.

@oprypin
Copy link
Member

oprypin commented Aug 20, 2022

Sigh. Yeah it'd be some work to wrangle mkdocstrings to do this.
But this is exactly the type of usage that I always envisioned for it. Instead, everyone just adds 1 directive to dump the entire class and leaves it at that.
There's a ton of cool tech in it and a ton of solutions to various problems, including the one from this pull request... So just makes me sad that indeed none of that matters...

@tvdboom
Copy link
Contributor Author

tvdboom commented Aug 20, 2022

If you can achieve all I just mentioned with mkdocstring, then the problem is the documentation. It needs way more examples on how to achieve all these things.

@oprypin
Copy link
Member

oprypin commented Aug 20, 2022

No, I don't think one can achieve all of that. Maybe one day 😔

@tvdboom
Copy link
Contributor Author

tvdboom commented Aug 22, 2022

I added the scan_html_tags as an option in config

@pawamoy
Copy link
Member

pawamoy commented Nov 22, 2022

Hi @tvdboom, thanks for your patience. I understand your use case and I'm not against adding this feature. However registering all HTML IDs of all pages seems like a good way to shoot yourself in the foot. Depending on the set of activated plugins and extensions, there might be lots of those IDs, overlapping with each other, and worse, overlapping with actual API objects. Would it make more sense to only search for actual anchors <a>? Or to make it configurable?

plugins:
- autorefs:
    register_tags: [a, img, p]

@tvdboom
Copy link
Contributor Author

tvdboom commented Nov 23, 2022

Yes, that makes sense. For simplicity I would say to only register actual anchors. I updated the PR

@pawamoy
Copy link
Member

pawamoy commented Mar 7, 2023

@oprypin would you like to review this again? I don't have a strong opinion on this. We'd just need you to document the new option @tvdboom, with an explanation of your use-case for example (simplified to what the option allows you to do), before we merge.

Also, sorry for taking so long to answer 😫

@tvdboom
Copy link
Contributor Author

tvdboom commented Oct 11, 2023

Incredibly late response, but I sort of forgot about this PR. I updated it now. Can someone indicate me where exactly to update the documentation?

@pawamoy
Copy link
Member

pawamoy commented Feb 5, 2024

The PR with the slowest pace on GitHub 😂
Interest for such a feature expressed here: mkdocs/mkdocs#3566 (reply in thread). I might take some time in the coming days/week to finish this PR, as it would unlock some interesting use-cases (creating cross-ref'able anchors without headings).

@pawamoy pawamoy changed the base branch from master to main February 15, 2024 16:51
@pawamoy pawamoy changed the title added scan html tags feat: Add option to scan and register HTML anchors Feb 16, 2024
@pawamoy pawamoy closed this in #39 Feb 27, 2024
pawamoy added a commit that referenced this pull request Feb 27, 2024
Replaces-PR-#20: #20
Related-to-issue-#25: #25
Related-to-issue-#35: #35
Co-authored-by: Oleh Prypin <oleh@pryp.in>
Co-authored-by: tvdboom <m.524687@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants