Consider taking shebang line into account when identifying files #189
Comments
Or, for an even more sophisticated solution, maybe something like https://github.com/github/linguist could be used.
I just moved this over to use the same list that http://github.com/boyter/scc/ uses. However, it is based entirely on file extensions. There used to be some logic in there to guess the file type, but that only applied in the case of duplicate extensions. It was very slow and inaccurate, hence its removal. This looks like a reasonable compromise.
Have updated based on … As for dealing with shebang lines, that might be better as a pure searchcode implementation, as it would needlessly slow down scc. @sschuberth I don't suppose you know of some sort of list of these? If I can get them all in one go it would save some time.
No, and I don't believe there can be such an official / complete list, because you can use the path to any arbitrary interpreter after
And only do the above as a fallback if the language hasn't been identified yet by other means, like the file extension. The above could probably be implemented as a FileTypeDetector to power probeContentType.
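To make the fallback idea concrete, here is a rough sketch in Python (not searchcode's actual code, which is Java). The two lookup tables and both function names are made up for illustration, and the interpreter table is deliberately tiny since, as noted, no complete list can exist. It checks the file extension first and only looks at the shebang line if that yields nothing.

```python
import posixpath
import re

# Hypothetical lookup tables for illustration only; these are NOT the
# lists that searchcode or scc actually use.
EXTENSION_LANGUAGES = {".py": "Python", ".sh": "Shell", ".rb": "Ruby", ".pl": "Perl"}
INTERPRETER_LANGUAGES = {
    "python": "Python",
    "sh": "Shell",
    "bash": "Shell",
    "ruby": "Ruby",
    "perl": "Perl",
}


def interpreter_from_shebang(first_line):
    """Return the interpreter's base name from a shebang line, or None."""
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].split()
    if not parts:
        return None
    name = posixpath.basename(parts[0])
    if name == "env":
        # For "#!/usr/bin/env python3", the real interpreter is the first
        # argument that is neither an option nor a VAR=value assignment.
        args = [p for p in parts[1:] if not p.startswith("-") and "=" not in p]
        if not args:
            return None
        name = posixpath.basename(args[0])
    # Drop a trailing version suffix, so "python3" and "python2.7" both
    # become "python".
    return re.sub(r"[\d.]+$", "", name) or None


def identify_language(filename, first_line):
    """Identify by extension first; fall back to the shebang line."""
    extension = posixpath.splitext(filename)[1].lower()
    language = EXTENSION_LANGUAGES.get(extension)
    if language is not None:
        return language
    return INTERPRETER_LANGUAGES.get(interpreter_from_shebang(first_line))
```

With this, `identify_language("build", "#! /usr/bin/env python3")` gives `"Python"`, while a file with a known extension never has its first line inspected, so the common case stays cheap. Unknown interpreters simply return `None`.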
Alternatively, maybe you can find a way to use the GNU |
I had a feeling that was the case, but was hoping it not to be. There are some pretty neat ideas in file. I shamelessly steal ideas from the GNU tools so I might have a look in there as well. Thanks for the pointers.
Currently, searchcode does not seem to take shebang lines into account when identifying the language. I.e. a file starting with
#! /usr/bin/env python3
should be identified as Python even if the .py file extension is missing.