Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

1703: Introducing MapExtensionToUnknown MappingTarget #1889

Merged
merged 25 commits into from
Oct 25, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
25 commits
Select commit Hold shift + click to select a range
a5e8c6a
1703: Add support for tmux
cbolgiano Oct 3, 2021
da77407
1703: Implemented the concept of MapToUnknownUnlessExactFileNameMatch
cbolgiano Oct 6, 2021
0266bbe
Merge branch 'master' into add-tmux
cbolgiano Oct 6, 2021
b6256ce
Merge branch 'add-tmux' of github.com:cbolgiano/bat into add-tmux
cbolgiano Oct 6, 2021
5f2d2a2
1703: Ran cargo fmt
cbolgiano Oct 6, 2021
86ed2f9
1703: Removed tmux files from this branch.
cbolgiano Oct 6, 2021
74d7a34
Reintroduce racket submodule
cbolgiano Oct 6, 2021
d59de01
Revert "Reintroduce racket submodule"
cbolgiano Oct 6, 2021
aeecad0
Reintroduce racket submodule
cbolgiano Oct 6, 2021
134a1e3
Re-added new line at end of .gitmodules
cbolgiano Oct 6, 2021
bdcb492
Merge branch 'master' of github.com:cbolgiano/bat into issue-1703
cbolgiano Oct 6, 2021
ff1022b
1703: Reworking generic handling of .conf files
cbolgiano Oct 8, 2021
9dd580b
1703: Fixing issues with regression tests by keeping the logic of get…
cbolgiano Oct 8, 2021
3d57f3f
1703: Regression tests are now happy with the changes made for generi…
cbolgiano Oct 9, 2021
a7ccb01
1703: Updated changelog
cbolgiano Oct 11, 2021
d2fb7e1
1703: Documented MapExtensionToUnknown
cbolgiano Oct 11, 2021
77409aa
1703: Clean up
cbolgiano Oct 11, 2021
017b157
1703: Refactored to only call get_extension_syntax_by_file_name once.
cbolgiano Oct 13, 2021
2cc1206
1703: Fixed issues found in regression and integration tests.
cbolgiano Oct 14, 2021
1b1c2db
1703: Addressing linting issues
cbolgiano Oct 14, 2021
c1a1d17
1703: Refactored based on PR suggestions
cbolgiano Oct 16, 2021
8ec79bd
1703: Added integration test, updated changelog, and updated comment
cbolgiano Oct 19, 2021
d28b9ea
Merge branch 'master' into issue-1703
cbolgiano Oct 19, 2021
8e2577c
Add more docs and make the new test more elaborate
Enselic Oct 20, 2021
04d5599
specifc -> specific
Enselic Oct 25, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@

- Python syntax highlighting no longer suffers from abysmal performance in specific scenarios. See #1688 (@keith-hall)
- First line not shown in diff context. See #1891 (@divagant-martian)
- Do not ignore syntaxes that handle file names with a `*.conf` extension. See #1703 (@cbolgiano)

## Performance

Expand Down Expand Up @@ -39,6 +40,8 @@
- Deprecate `HighlightingAssets::syntaxes()` and `HighlightingAssets::syntax_for_file_name()`. Use `HighlightingAssets::get_syntaxes()` and `HighlightingAssets::get_syntax_for_path()` instead. They return a `Result` which is needed for upcoming lazy-loading work to improve startup performance. They also return which `SyntaxSet` the returned `SyntaxReference` belongs to. See #1747, #1755, #1776, #1862 (@Enselic)
- Remove `HighlightingAssets::from_files` and `HighlightingAssets::save_to_cache`. Instead of calling the former and then the latter you now make a single call to `bat::assets::build`. See #1802 (@Enselic)
- Replace the `error::Error(error::ErrorKind, _)` struct and enum with an `error::Error` enum. `Error(ErrorKind::UnknownSyntax, _)` becomes `Error::UnknownSyntax`, etc. Also remove the `error::ResultExt` trait. These changes stem from replacing `error-chain` with `thiserror`. See #1820 (@Enselic)
- Add new `MappingTarget` enum variant `MapExtensionToUnknown`. Refer to its docummentation for more information. Clients are adviced to treat `MapExtensionToUnknown` the same as `MapToUnknown` in exhaustive matches. See #1703 (@cbolgiano)




Expand Down
96 changes: 78 additions & 18 deletions src/assets.rs
Original file line number Diff line number Diff line change
Expand Up @@ -153,9 +153,12 @@ impl HighlightingAssets {
}

/// Detect the syntax based on, in order:
/// 1. Syntax mappings (e.g. `/etc/profile` -> `Bourne Again Shell (bash)`)
/// 1. Syntax mappings with [MappingTarget::MapTo] and [MappingTarget::MapToUnknown]
/// (e.g. `/etc/profile` -> `Bourne Again Shell (bash)`)
/// 2. The file name (e.g. `Dockerfile`)
/// 3. The file name extension (e.g. `.rs`)
/// 3. Syntax mappings with [MappingTarget::MapExtensionToUnknown]
/// (e.g. `*.conf`)
/// 4. The file name extension (e.g. `.rs`)
///
/// When detecting syntax based on syntax mappings, the full path is taken
/// into account. When detecting syntax based on file name, no regard is
Expand All @@ -165,9 +168,9 @@ impl HighlightingAssets {
///
/// Returns [Error::UndetectedSyntax] if it was not possible detect syntax
/// based on path/file name/extension (or if the path was mapped to
/// [MappingTarget::MapToUnknown]). In this case it is appropriate to fall
/// back to other methods to detect syntax. Such as using the contents of
/// the first line of the file.
/// [MappingTarget::MapToUnknown] or [MappingTarget::MapExtensionToUnknown]).
/// In this case it is appropriate to fall back to other methods to detect
/// syntax. Such as using the contents of the first line of the file.
///
/// Returns [Error::UnknownSyntax] if a syntax mapping exist, but the mapped
/// syntax does not exist.
Expand All @@ -177,20 +180,31 @@ impl HighlightingAssets {
mapping: &SyntaxMapping,
) -> Result<SyntaxReferenceInSet> {
let path = path.as_ref();
match mapping.get_syntax_for(path) {
cbolgiano marked this conversation as resolved.
Show resolved Hide resolved
Some(MappingTarget::MapToUnknown) => {
Err(Error::UndetectedSyntax(path.to_string_lossy().into()))
}

Some(MappingTarget::MapTo(syntax_name)) => self
let syntax_match = mapping.get_syntax_for(path);

if let Some(MappingTarget::MapToUnknown) = syntax_match {
return Err(Error::UndetectedSyntax(path.to_string_lossy().into()));
}

if let Some(MappingTarget::MapTo(syntax_name)) = syntax_match {
return self
.find_syntax_by_name(syntax_name)?
.ok_or_else(|| Error::UnknownSyntax(syntax_name.to_owned())),
.ok_or_else(|| Error::UnknownSyntax(syntax_name.to_owned()));
}

None => {
let file_name = path.file_name().unwrap_or_default();
self.get_extension_syntax(file_name)?
.ok_or_else(|| Error::UndetectedSyntax(path.to_string_lossy().into()))
let file_name = path.file_name().unwrap_or_default();

match (self.get_syntax_for_file_name(file_name)?, syntax_match) {
(Some(syntax), _) => Ok(syntax),

(_, Some(MappingTarget::MapExtensionToUnknown)) => {
Err(Error::UndetectedSyntax(path.to_string_lossy().into()))
cbolgiano marked this conversation as resolved.
Show resolved Hide resolved
}

_ => self
.get_syntax_for_file_extension(file_name)?
.ok_or_else(|| Error::UndetectedSyntax(path.to_string_lossy().into())),
}
}

Expand Down Expand Up @@ -263,14 +277,24 @@ impl HighlightingAssets {
.map(|syntax| SyntaxReferenceInSet { syntax, syntax_set }))
}

fn get_extension_syntax(&self, file_name: &OsStr) -> Result<Option<SyntaxReferenceInSet>> {
fn get_syntax_for_file_name(&self, file_name: &OsStr) -> Result<Option<SyntaxReferenceInSet>> {
let mut syntax = self.find_syntax_by_extension(Some(file_name))?;
if syntax.is_none() {
syntax = self.find_syntax_by_extension(Path::new(file_name).extension())?;
syntax = try_with_stripped_suffix(file_name, |stripped_file_name| {
self.get_syntax_for_file_name(stripped_file_name) // Note: recursion
})?;
}
Ok(syntax)
}

fn get_syntax_for_file_extension(
&self,
file_name: &OsStr,
) -> Result<Option<SyntaxReferenceInSet>> {
let mut syntax = self.find_syntax_by_extension(Path::new(file_name).extension())?;
if syntax.is_none() {
syntax = try_with_stripped_suffix(file_name, |stripped_file_name| {
self.get_extension_syntax(stripped_file_name) // Note: recursion
self.get_syntax_for_file_extension(stripped_file_name) // Note: recursion
})?;
}
Ok(syntax)
Expand Down Expand Up @@ -530,6 +554,42 @@ mod tests {
assert_eq!(test.syntax_for_file("test.h"), "C");
}

#[test]
fn syntax_detection_with_extension_mapping_to_unknown() {
let mut test = SyntaxDetectionTest::new();

// Normally, a CMakeLists.txt file shall use the CMake syntax, even if it is
// a bash script in disguise
assert_eq!(
test.syntax_for_file_with_content("CMakeLists.txt", "#!/bin/bash"),
"CMake"
);

// Other .txt files shall use the Plain Text syntax
assert_eq!(
test.syntax_for_file_with_content("some-other.txt", "#!/bin/bash"),
"Plain Text"
);

// If we setup MapExtensionToUnknown on *.txt, the match on the full
// file name of "CMakeLists.txt" shall have higher prio, and CMake shall
// still be used for it
test.syntax_mapping
.insert("*.txt", MappingTarget::MapExtensionToUnknown)
.ok();
assert_eq!(
test.syntax_for_file_with_content("CMakeLists.txt", "#!/bin/bash"),
"CMake"
);

// However, for *other* files with a .txt extension, first-line fallback
// shall now be used
assert_eq!(
test.syntax_for_file_with_content("some-other.txt", "#!/bin/bash"),
"Bourne Again Shell (bash)"
);
}

#[test]
fn syntax_detection_is_case_sensitive() {
let mut test = SyntaxDetectionTest::new();
Expand Down
1 change: 1 addition & 0 deletions src/bin/bat/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -72,6 +72,7 @@ fn get_syntax_mapping_to_paths<'a>(
for mapping in mappings {
match mapping {
(_, MappingTarget::MapToUnknown) => {}
(_, MappingTarget::MapExtensionToUnknown) => {}
(matcher, MappingTarget::MapTo(s)) => {
let globs = map.entry(*s).or_insert_with(Vec::new);
globs.push(matcher.glob().glob().into());
Expand Down
20 changes: 14 additions & 6 deletions src/syntax_mapping.rs
Original file line number Diff line number Diff line change
Expand Up @@ -6,8 +6,21 @@ use globset::{Candidate, GlobBuilder, GlobMatcher};

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum MappingTarget<'a> {
/// For mapping a path to a specific syntax.
MapTo(&'a str),

/// For mapping a path (typically an extension-less file name) to an unknown
/// syntax. This typically means later using the contents of the first line
/// of the file to determine what syntax to use.
MapToUnknown,

/// For mapping a file extension (e.g. `*.conf`) to an unknown syntax. This
/// typically means later using the contents of the first line of the file
/// to determine what syntax to use. However, if a syntax handles a file
/// name that happens to have the given file extension (e.g. `resolv.conf`),
/// then that association will have higher precedence, and the mapping will
/// be ignored.
MapExtensionToUnknown,
cbolgiano marked this conversation as resolved.
Show resolved Hide resolved
cbolgiano marked this conversation as resolved.
Show resolved Hide resolved
}

#[derive(Debug, Clone, Default)]
Expand Down Expand Up @@ -54,12 +67,7 @@ impl<'a> SyntaxMapping<'a> {
// Nginx and Apache syntax files both want to style all ".conf" files
// see #1131 and #1137
mapping
.insert("*.conf", MappingTarget::MapToUnknown)
.unwrap();

// Re-insert a mapping for resolv.conf, see #1510
mapping
.insert("resolv.conf", MappingTarget::MapTo("resolv"))
.insert("*.conf", MappingTarget::MapExtensionToUnknown)
.unwrap();

for glob in &[
Expand Down