
Use repl language tag for sample #1107

Merged: 26 commits merged into JuliaAI:dev from patch-1 on May 19, 2024
Conversation

@abhro (Contributor, PR author) commented Apr 22, 2024

No description provided.

Review thread on a docs code sample (diff context: the unprompted line is replaced by a prompted version):

```julia
W = @node selectrows(Xs, @node shuffle(rs))

julia> Xs = source(X)
julia> rs = @node rows(Xs)
julia> W = @node selectrows(Xs, @node shuffle(rs))
```

Member:

Generally, the pattern has been to only include julia> before input when corresponding output is to follow. Are you finding this confusing?
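For context, a minimal illustrative sketch of that convention (the code here is arbitrary, not from the docs): the prompt marks only the line whose printed output follows it.

```julia
x = [1, 2, 3]    # no prompt: this line's output is not shown

julia> sum(x)    # prompt: the REPL output follows immediately
6
```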

@abhro (Contributor, PR author):

Yes, quite a bit. There are two reasons for this. One is that prompted and prompt-less code are mixed together, and it's not always immediately clear what is input, what is input whose output follows, and what is output. This is especially pronounced when objects' reprs are shown, since these are made to look like Julia constructor calls, so it gets even harder to tell output from code. The other is syntax highlighting, which can be a huge boon when skimming online sources for a quick reference.
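As an illustration of the repr point (a generic Julia sketch, not taken from MLJ), the default printing of a struct instance reads exactly like a constructor call, so without prompts it can pass for input:

```julia
julia> struct Point
           x::Int
           y::Int
       end

julia> Point(1, 2)
Point(1, 2)    # output, but it looks just like the constructor call above
```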

@ablaom (Member) commented Apr 26, 2024:

Good to have this feedback, thank you.

Unfortunately, this convention is pretty widespread, at least in the MLJ ecosystem. I wonder if others are also finding it confusing. @OkonSamuel @EssamWisam What's your view on this? Do we need to include julia> prompts on every single line when a docstring sometimes includes REPL output? Often we only add julia> before lines whose output immediately follows.

@EssamWisam (Collaborator) commented Apr 26, 2024:

My preference is to minimize the use of julia>, as it clutters the code (and has to be stripped before copy-pasting into .jl files), which, as you said, is consistent with the ongoing convention for MLJ. That said, I do find @abhro's point valid; it would not be immediately obvious to a beginner that the printed object is not an actual function being called, and for that reason I think we could expose to readers the convention that whatever immediately follows a julia>-prompted line is the output of the code on that line.

In any case, in the future we can explore the possibility of printing the outputs in a differently styled cell, as in the Imbalance tutorials and DataScienceTutorials, which would allow dropping all julia> prompts and would make code reuse as easy as possible.

@abhro (Contributor, PR author):

Yes, I think the way the @example macro does it also provides a good separation between output and code.
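For reference, a hedged sketch of a Documenter @example block (the transformer and data here are illustrative, not the docstring's actual content); when the docs are built, Documenter executes the block and renders the value of the last expression as a separate, styled output cell:

```@example transformer
using MLJ
X = (height = [1.85, 1.67, 1.50], weight = [80.0, 65.0, 60.0])
mach = machine(Standardizer(), X) |> fit!
transform(mach, X)
```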

Review thread on another docs code sample (diff context: the comment line gains a julia> prompt):

```julia
# transforming:
Xsmall = transform(mach);
selectrows(Xsmall, 1:4) |> pretty

julia> # transforming:
```
@ablaom (Member) commented Apr 23, 2024:

This is really weird: a comment at the julia> prompt, here and later. Let's just skip all the prompts that don't precede a block of output (see previous comment). Here and below in this file.

@abhro (Contributor, PR author):

Yes, I'm not particularly happy with this change either; I just did it for consistency. Maybe there's a better way to write this?

@abhro (Contributor, PR author):

To be clear, I meant the prompted lines that only hold comments. I still hold that for any code sample that mixes code and output, the code should be prompted.

@abhro (Contributor, PR author):

Maybe a better way to do this would be to split the code sample into three: one for setup, one for transforming, and one for predicting?

Member:

Yes, you can re-organize so that the code comments become ordinary markdown text between separate fenced blocks.
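A rough sketch of that re-organization, reusing the snippet under discussion (the surrounding prose and Xnew are illustrative):

````markdown
Transforming:

```julia
Xsmall = transform(mach);
selectrows(Xsmall, 1:4) |> pretty
```

Predicting:

```julia
yhat = predict(mach, Xnew)
```
````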

Review thread on docs/src/transformers.md (outdated, resolved).
@codecov-commenter commented Apr 23, 2024:

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 57.89%. Comparing base (6f46257) to head (8e45385).
Report is 5 commits behind head on dev.

❗ Current head 8e45385 differs from pull request most recent head ae28151. Consider uploading reports for the commit ae28151 to get more accurate results.

Additional details and impacted files
@@           Coverage Diff           @@
##              dev    #1107   +/-   ##
=======================================
  Coverage   57.89%   57.89%           
=======================================
  Files           2        2           
  Lines          38       38           
=======================================
  Hits           22       22           
  Misses         16       16           


Co-authored-by: Anthony Blaom, PhD <anthony.blaom@gmail.com>
Review thread on docs/Project.toml (outdated, resolved).
@ablaom (Member) commented May 6, 2024:

@abhro My question regarding ParallelKMeans notwithstanding, are you ready for a final review?

@abhro (Contributor, PR author) commented May 7, 2024:

Sure!

codecov bot commented May 12, 2024:

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 57.89%. Comparing base (e341344) to head (650ebbd).
Report is 13 commits behind head on dev.

Additional details and impacted files
@@           Coverage Diff           @@
##              dev    #1107   +/-   ##
=======================================
  Coverage   57.89%   57.89%           
=======================================
  Files           2        2           
  Lines          38       38           
=======================================
  Hits           22       22           
  Misses         16       16           


@ablaom (Member) left a comment:

Thanks for this work @abhro 🙏🏾

@ablaom (Member) commented May 12, 2024:

@EssamWisam Can you please go over the proposed changes to the cheatsheet?

@abhro Can you please address the ParallelKMeans comment?

@abhro (Contributor, PR author) commented May 13, 2024:

Hello, I think I addressed it in the original comment thread. But to reiterate: this is to support the change in docs/src/transformers.md, where I changed the predicting-transformers code samples to an @example block (lines 196–209), which loads the KMeans model from the ParallelKMeans package.

@EssamWisam (Collaborator) left a comment:

Reviewed the cheatsheet. All is okay. However, I wonder why only some of the bare backtick fences were replaced with ```julia and not others. For the purpose of viewing the cheatsheet on the website, it's better to use ```julia whenever the line is long enough.

Otherwise, the cheatsheet, viewed on the website, would look like this:
[screenshot: rendered cheatsheet with unhighlighted code blocks]

where many lines are not syntax-highlighted (e.g., look below the Learning Curves heading).
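For instance (an illustrative fragment, not a line copied from the cheatsheet), it is the julia tag on the fence, rather than a bare fence, that enables highlighting on the site:

````markdown
```julia
curve = learning_curve(mach; range=r, resampling=CV(), measure=LogLoss())
```
````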

Other changes that would help the cheatsheet render well in the browser:

  • No overly long sections (chunk the tuning section into three ## headlines: ## Tuning model wrapper, ## Ranges for tuning (range=...), ## Tuning strategies)
  • No overly long lines of code. For instance, some comment lines are too long to view on the website:
[screenshot: a code line with an over-long comment overflowing the page]

@abhro, let me know if you would be willing to revisit the cheatsheet accordingly. If not, I can take that on and add a commit.

For how the sheet renders on the website after the change in the first bullet above, see the attached MLJ-Sheet.pdf.

Thank you for the effort.

@ablaom (Member) commented May 13, 2024:

> ... this is to support the change in docs/src/transformers.md, where I changed the predicting transformers code samples to an @example block (lines 196–209), and it loads the KMeans model from the ParallelKMeans package

Thanks for the explanation.

ParallelKMeans is not a regularly maintained package. Can we please change pkg=ParallelKMeans to pkg=NearestNeighborModels? That pkg is already in the Project and both provide KMeans.

@abhro (Contributor, PR author) commented May 15, 2024:

Hello, @EssamWisam, thank you for the feedback! I'm working on updating the cheatsheet to make it a little more consistent. Can I ask how you generated the pdf? It would help a lot in checking the changes I'm making. Thanks again!

@abhro (Contributor, PR author) commented May 15, 2024:

@ablaom I can't find the KMeans model in NearestNeighborModels. I could use the one in Clustering/MLJClusteringInterface. Does that work?

@ablaom (Member) commented May 15, 2024:

> @ablaom I can't find the KMeans model in NearestNeighborModels. I could use the one in Clustering/MLJClusteringInterface. Does that work?

Oops 😳 Yes, use the MLJClusteringInterface version.
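For the record, a minimal sketch of the corresponding load call under that resolution (standard MLJ model-loading macro; the exact docstring context differs):

```julia
using MLJ
KMeans = @load KMeans pkg=Clustering   # interface provided by MLJClusteringInterface
```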

@EssamWisam (Collaborator):

> Hello, @EssamWisam, thank you for the feedback! I'm working on updating the cheatsheet to make it a little more consistent. Can I ask how you generated the pdf? It would help a lot in checking the changes I'm making. Thanks again!

Well, the MLJ website is currently under development and, as of now, it's not straightforward to generate it yourself unless you have some web-development framework expertise.

That said, avoiding overly long lines, using ```julia more often, and avoiding overly long sections are pretty much all that's needed for the cheatsheet to render nicely.

@abhro (Contributor, PR author) commented May 15, 2024:

Alrighty! I think I've made those changes. If there's something I missed, or ways it could be made better, please don't hesitate to let me know! Thanks!

@EssamWisam (Collaborator):

> Alrighty! I think I've made those changes. If there's something I missed, or ways it could be made better, please don't hesitate to let me know! Thanks!

Thank you for the valuable contribution, @abhro. I made some minor comments.

@ablaom merged commit 2745563 into JuliaAI:dev on May 19, 2024. 5 checks passed.
@abhro deleted the patch-1 branch on May 19, 2024 at 22:56.