Switch to gengo #1305

spenserblack · 2024-04-08T16:56:13Z

See #1152 for previous conversation. With the drastic API changes to gengo it started to feel like more work to modify the existing PR instead of starting over.

To Do

Gengo needs to support the remaining languages (or at least have reasons to not support them) (Onefetch languages checklist spenserblack/gengo#34)
Change pretty much everything that previously matched on Language and tokei::LanguageType enums.
Drop the "blob size" stat (Switch to gengo #1305 (comment))

Resolves #26
Closes #1152

spenserblack · 2024-04-08T19:09:27Z

@o2sh Before this moves any further we need to determine if we want to move forward with analyzing a git ref (HEAD) or a folder (I think preferably we don't support both). I've started listing reasons for each here: #1303

src/info/blob_size.rs

o2sh · 2024-04-09T09:55:24Z

src/info/langs/mod.rs

-    languages.get_statistics(&[&dir], &ignored, &tokei_config);
-    languages
+) -> Result<gengo::Analysis, Box<dyn Error>> {
+    // TODO Determine best way to ignore files (and if that should continue to be handled by onefetch)


It would be nice to have language type and glob filtering for V1. Or is this going to be handled via .gitattributes?

That depends on the "file source." Right now the Git file source is highly opinionated and would use .gitattributes for ignoring files, overrides, etc.

I plan on the Directory file source being much less opinionated (can't get much more generic than just reading a folder 🙂). The functionality isn't there yet, but I was considering adding ways to configure excluding/including files. Right now .gitignore and .ignore files would ignore files via the ignore crate's default behavior. I was hoping to discuss the usage sometime, but it would probably either be like this:

let file_source = Directory::with_config("./", Config { ignored_files: Some(vec![]) });

or like this:

let file_source = Directory::new("./").extend_ignored_files(&[]);
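To compare the two shapes side by side, here is a minimal self-contained sketch. The `Directory` and `Config` types below are hypothetical stand-ins written for this comparison, not gengo's actual API; only the names `with_config`, `new`, `extend_ignored_files`, and `ignored_files` come from the snippets above, and the `ignored_files()` accessor is added purely for illustration.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical configuration for a directory file source.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Config {
    /// Extra patterns to ignore; `None` means "defaults only".
    pub ignored_files: Option<Vec<String>>,
}

/// Hypothetical stand-in for a directory file source (not gengo's type).
#[derive(Debug, Clone, PartialEq)]
pub struct Directory {
    root: PathBuf,
    ignored_files: Vec<String>,
}

impl Directory {
    /// Option 1: everything is passed up front in a config struct.
    pub fn with_config(root: impl AsRef<Path>, config: Config) -> Self {
        Self {
            root: root.as_ref().to_path_buf(),
            ignored_files: config.ignored_files.unwrap_or_default(),
        }
    }

    /// Option 2: start from the defaults...
    pub fn new(root: impl AsRef<Path>) -> Self {
        Self::with_config(root, Config::default())
    }

    /// ...and refine them with chained builder calls.
    pub fn extend_ignored_files(mut self, patterns: &[&str]) -> Self {
        self.ignored_files
            .extend(patterns.iter().map(|p| p.to_string()));
        self
    }

    /// Illustrative accessor for the collected patterns.
    pub fn ignored_files(&self) -> &[String] {
        &self.ignored_files
    }
}
```

In this sketch both routes end up with the same value, so the choice is mostly about ergonomics: the config struct keeps all options in one place, while the builder lets callers add options one at a time without naming the ones they don't care about.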

Very clear. Does it also apply to the language type filter?

By default Gengo marks Programming, Markup, and Query as detectable, and Prose and Data as not detectable. detectable is basically a boolean that answers "should this factor into the stats?" Also, documentation, generated files, and vendored files are not detectable by default. Assuming we use the Git file source with .gitattributes, this is example usage:

# Markdown is Prose, but we still want it in the stats
*.md gengo-detectable
# JavaScript is Programming, but we want to exclude it
*.js -gengo-detectable
# The contents of dist/ are not generated, so they will be detectable
dist/* -gengo-generated
# OR
# Even though dist/* is generated, it should still be included in the stats
dist/* gengo-detectable

By default any file that is not detectable is excluded from the Summary.
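The defaults and overrides described above can be modeled roughly as follows. This is a sketch of the rules as stated in this thread, not gengo's implementation: `Category`, `FileFlags`, and `is_detectable` are hypothetical names, and the precedence (an explicit detectable override wins over the generated/vendored/documentation flags) is inferred from the `dist/* gengo-detectable` example.

```rust
/// Language categories named in the discussion above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Category {
    Programming,
    Markup,
    Query,
    Prose,
    Data,
}

/// Per-file facts (hypothetical struct for illustration).
#[derive(Debug, Clone, Copy, Default)]
pub struct FileFlags {
    pub documentation: bool,
    pub generated: bool,
    pub vendored: bool,
    /// `Some(true)` models `gengo-detectable`,
    /// `Some(false)` models `-gengo-detectable`,
    /// `None` means no override was given.
    pub detectable_override: Option<bool>,
}

/// Answers "should this file factor into the stats?"
pub fn is_detectable(category: Category, flags: FileFlags) -> bool {
    // Assumption: an explicit override beats every default.
    if let Some(forced) = flags.detectable_override {
        return forced;
    }
    // Documentation, generated, and vendored files are excluded by default.
    if flags.documentation || flags.generated || flags.vendored {
        return false;
    }
    // Otherwise fall back to the category default.
    matches!(
        category,
        Category::Programming | Category::Markup | Category::Query
    )
}
```

With this model, `*.md gengo-detectable` corresponds to a Prose file with `detectable_override: Some(true)`, and `dist/* -gengo-generated` corresponds to clearing the `generated` flag so the category default applies again.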

Again, the Git file source is highly opinionated and tries to behave a lot like github-linguist. The usage with the Directory file source is still undecided, and right now it doesn't implement any overrides. Or we could write our own file source (#1303 (reply in thread)) if we want very specific behavior that wouldn't be suitable for the gengo crate.

But, however we implement it, IMO the best way would be to inform gengo what files are and aren't detectable, and then make use of either the Analysis (detailed results) or Summary (simplified results).

tl;dr Yes, gengo would be the one handling included/excluded types, and we'd just pass configuration to it somehow.

o2sh · 2024-04-09T10:02:35Z

src/info/langs/mod.rs

-    languages
+) -> Result<gengo::Analysis, Box<dyn Error>> {
+    // TODO Determine best way to ignore files (and if that should continue to be handled by onefetch)
+    let file_source = Git::new(dir, "HEAD")?;


Are the pending changes taken into account for the language analysis?

#1303 (comment)

spenserblack · 2024-10-22T23:06:56Z

Alright, after a long time procrastinating, I implemented the last of onefetch's languages (besides combining Bash, Zsh, and Sh into Shell).

One thing to note is that the MSRV has been bumped.

spenserblack · 2024-10-23T21:38:02Z

Bumping gix resulted in compiler errors (new usage). Just a moment.

spenserblack · 2024-10-24T16:33:54Z

After thinking about this a bit, I think an issue with tokei and other line-counting tools is that they specialize in counting lines, and, while language ID is a priority, it is not the top priority. This leads to e.g. tokei having poor out-of-the-box support for Verilog and V (their *.v extensions were already taken). So line-counting tools are not great for properly identifying a language.
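To make the `*.v` problem concrete, here is a small sketch of why an extension lookup alone is not enough and how a content check can break the tie. Everything here is illustrative: the `Lang` enum, both functions, and the two keyword heuristics (`endmodule` for Verilog, `fn ` for V) are assumptions for this example, not how gengo or tokei identify languages.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Lang {
    Verilog,
    V,
}

/// Extension lookup alone is ambiguous: `v` maps to more than one language.
pub fn candidates(extension: &str) -> &'static [Lang] {
    match extension {
        "v" => &[Lang::Verilog, Lang::V],
        _ => &[],
    }
}

/// Narrow the candidates down using the file contents.
pub fn identify(extension: &str, contents: &str) -> Option<Lang> {
    match candidates(extension) {
        [] => None,
        [only] => Some(*only),
        _ => {
            // Toy heuristics, chosen only for this sketch.
            if contents.contains("endmodule") {
                Some(Lang::Verilog)
            } else if contents.contains("fn ") {
                Some(Lang::V)
            } else {
                None // still ambiguous; a real tool needs more signals
            }
        }
    }
}
```

A tool that maps each extension to exactly one language has to pick a winner up front, which is the situation described above.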

So I'd like to add line counts back at some point, possibly by using gengo as a dependency of some crate and doing something vaguely like

match language {
    Language::JupyterNotebook => my_super_special_jupyter_notebook_counter(),
    // would use community-provided data similar to tokei's languages.json
    _ => generic_counter(language),
}
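A slightly fuller version of that dispatch idea, as a self-contained sketch. None of this exists in gengo or onefetch: the `Language` enum here is a three-variant stand-in, the comment-prefix table is a tiny hard-coded substitute for community-provided data, and `notebook_counter` is only a placeholder (a real one would parse the notebook JSON and count lines inside code cells).

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Language {
    JupyterNotebook,
    Python,
    Rust,
}

/// Data-driven part: line-comment prefix per language (illustrative table).
fn line_comment(language: Language) -> Option<&'static str> {
    match language {
        Language::Python => Some("#"),
        Language::Rust => Some("//"),
        Language::JupyterNotebook => None,
    }
}

/// Generic counter: non-blank lines that are not line comments.
pub fn generic_counter(language: Language, source: &str) -> usize {
    let prefix = line_comment(language);
    source
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .filter(|line| prefix.map_or(true, |p| !line.starts_with(p)))
        .count()
}

/// Placeholder for a special-cased counter: counts non-blank lines only.
pub fn notebook_counter(source: &str) -> usize {
    source.lines().filter(|line| !line.trim().is_empty()).count()
}

/// Dispatch: special cases first, generic data-driven counter otherwise.
pub fn count_lines(language: Language, source: &str) -> usize {
    match language {
        Language::JupyterNotebook => notebook_counter(source),
        _ => generic_counter(language, source),
    }
}
```

The point of the split is that language identification stays in one crate, while the counting crate only needs a language value and the file contents.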

And on that topic, I'd recommend looking into scc. It's a Go project similar to tokei, but also includes some really fun stats like "Estimated Cost to Develop".

Switch tokei with gengo in dependencies

ce680d5

vercel bot deployed to Preview April 8, 2024 16:56 View deployment

DELETEME Comment out most languages

d601ee7

To get this working with one language and then expand from there.

vercel bot deployed to Preview April 8, 2024 17:40 View deployment

spenserblack added 3 commits April 8, 2024 17:59

DELETEME Get this to compile for one language

05c823a

This doesn't do much but make the parts that used to use tokei panic.

Revert "DELETEME Comment out most languages"

6e9e960

This reverts commit d601ee7.

Comment-out/rename languages

541fb89

Languages that aren't yet supported are commented out.

vercel bot deployed to Preview April 8, 2024 18:25 View deployment

spenserblack added 2 commits April 8, 2024 19:01

Analyze stats

bb3e92a

Make info labels more accurate

e84787b

vercel bot deployed to Preview April 8, 2024 19:07 View deployment

o2sh reviewed Apr 9, 2024

src/info/blob_size.rs (outdated)

o2sh reviewed Apr 9, 2024

This was referenced Apr 9, 2024

Bare repository support #1299

Open

Added the V programming language #1252

Open

feat: add Twig language support with ASCII art and color scheme #1257

Merged

Make gengo's gix features explicit

135004f

vercel bot deployed to Preview April 11, 2024 13:05 View deployment

spenserblack mentioned this pull request Apr 21, 2024

Automatically detect if no programming language is present to show overview in data-only repository #1311

Open

spenserblack added 2 commits April 30, 2024 12:56

Remove LoC/Size stat

ebfe340

o2sh#1305 (comment)

Bump gengo

9a1a8cd

vercel bot deployed to Preview April 30, 2024 12:59 View deployment

Remove filtering by language type

7c7c29a

This can be handled by gengo.

vercel bot deployed to Preview April 30, 2024 13:07 View deployment

Remove unused helper functions

2c2dc5c

vercel bot deployed to Preview April 30, 2024 13:10 View deployment

This was referenced Apr 30, 2024

Add astro framework support #1317

Open

lang: Adding Oz programming language #1280

Merged

vercel bot deployed to Preview June 5, 2024 14:13 View deployment

spenserblack mentioned this pull request Oct 10, 2024

Add ability to ignore git submodules #31

Closed

Handle merge conflicts

940dce2

vercel bot had a problem deploying to Preview October 22, 2024 21:00 Failure

Bump devcontainer OS version

df3d730

vercel bot had a problem deploying to Preview October 22, 2024 21:07 Failure

Update dependencies

72f0c5d

vercel bot had a problem deploying to Preview October 22, 2024 21:25 Failure

TEMP

3afec59

vercel bot deployed to Preview October 22, 2024 21:37 View deployment

Add back missing chip

e65882f

vercel bot deployed to Preview October 22, 2024 21:42 View deployment

squash! TEMP

6c31e9f

Remap languages

vercel bot deployed to Preview October 22, 2024 21:46 View deployment

Rename test arg

4ec196d

vercel bot deployed to Preview October 22, 2024 23:02 View deployment

spenserblack marked this pull request as ready for review October 22, 2024 23:04

vercel bot deployed to Preview October 23, 2024 21:31 View deployment

spenserblack marked this pull request as draft October 23, 2024 21:37

vercel bot deployed to Preview October 23, 2024 21:53 View deployment

vercel bot deployed to Preview October 23, 2024 22:11 View deployment

Bump gix and gengo

47136f7

spenserblack force-pushed the chore/26/switch-to-gengo-2 branch from afa0c1a to 47136f7 Compare October 23, 2024 22:11

vercel bot deployed to Preview October 23, 2024 22:11 View deployment

spenserblack marked this pull request as ready for review October 23, 2024 22:12

spenserblack mentioned this pull request Oct 24, 2024

Language Request: COBOL #1443

Open

1 task

spenserblack mentioned this pull request Nov 4, 2024

Bump the gix group with 2 updates #1446

Merged
