The text that is printed is generated when building assets, by analyzing LICENSE
and NOTICE files that comes with syntaxes and themes.
We take this opportunity to also add a NOTICE file as defined by Apache License 2.0.
We started to use lazycell because syntect already used it. But syntect has
changed to use once_cell. So we should also do that to prepare for using the
upcoming version of syntect.
I had to use a `lazy_static` due to that the clap API that only accepts a
reference to a version string. And, in our code, only a 'static reference to a
version string.
Code could probably be refactored to accept a "normal" reference, but that would
be a major undertaking.
By forwarding the task to find the `Plain Text` syntax to `assets`. Not only does
the code become simpler; we also get rid of a call to `self.get_syntax_set()`
which is beneficial to the long term goal of replacing `syntaxes.bin` with
`minimal_syntaxes.bin`.
Note that the use of `.expect()` is not a regression in error handling. It was
previously hidden in `.find_syntax_plain_text()`.
This information is useful when you want to build several SyntaxSets, but
without having to duplicate SyntaxDefinitions. For example:
"Rust" has no dependencies. But "Markdown" depends on "Rust". With the data
structures this code adds, we know that "Rust" is a dependent syntax for
"Markdown", and can construct a SyntaxSet that takes that into account.
Note that code has a temporary environment flag to ignore any information about
dependents when constructing SyntaxSets. Code that makes use of the new data
structure will be added later.
This significantly speeds up the startup time of bat, since only a single
linked SyntaxDefinition is loaded for each file. The size increase of the
binary is just ~400 kB.
In order for startup time to be improved, the --language arg must be used, and
it must match one of the following names:
"Plain Text", "ActionScript", "AppleScript", "Batch File", "NAnt Build File",
"C#", "C", "CSS", "D", "Diff", "Erlang", "Go", "Haskell", "JSON", "Java
Properties", "BibTeX", "LaTeX Log", "TeX", "Lisp", "Lua", "MATLAB", "Pascal",
"R", "Regular Expression", "Rust", "SQL", "Scala", "Tcl", "XML", "YAML", "Apache
Conf", "ARM Assembly", "Assembly (x86_64)", "CMakeCache", "Comma Separated
Values", "Cabal", "CoffeeScript", "CpuInfo", "Dart Analysis Output", "Dart",
"Dockerfile", "DotENV", "F#", "Friendly Interactive Shell (fish)", "Fortran
(Fixed Form)", "Fortran (Modern)", "Fortran Namelist", "fstab", "GLSL",
"GraphQL", "Groff/troff", "group", "hosts", "INI", "Jinja2", "jsonnet",
"Kotlin", "Less", "LLVM", "Lean", "MemInfo", "Nim", "Ninja", "Nix", "passwd",
"PowerShell", "Protocol Buffer (TEXT)", "Puppet", "Rego", "resolv", "Robot
Framework", "SML", "Strace", "Stylus", "Solidity", "Vyper", "Swift",
"SystemVerilog", "TOML", "Terraform", "TypeScript", "TypeScriptReact",
"Verilog", "VimL", "Zig", "gnuplot", "log", "requirements.txt", "Highlight
non-printables", "Private Key", "varlink"
Later commits will improve startup time for more code paths.
* fix some typos and misspellings
* CHANGELOG.md: Add Performance section (preliminary)
* Add a CHANGELOG.md entry for this PR
This will be needed to later support zero-copy deserialization of independent
syntax sets, but is interesting and useful on its own.
Instead of deferring serialization and deserialization to syntect, we implement it
ourselves in the same way, but make compression optional.
We can't use #[from] on Error::Msg(String) because String does not implement Error.
(Which it shouldn't; see e.g. https://internals.rust-lang.org/t/impl-error-for-string/8881.)
So we implement From manually for Error::Msg, since our current code was written
in that way for error-chain.
Move code to build assets to its own file. That results in better modularity and flexibility.
It also allows us to simplify HighlightingAssets a lot, since it will now always
be initialized with a SerializedSyntaxSet.
To improve startup performance, we will later load smaller `SyntaxSet`s instead
of one giant one. However, the current API assumes only one `SyntaxSet` is ever used,
and that that implicitly is the `SyntaxSet` from which returned `SyntaxReference`s
comes.
This change changes the API to reflect that `SyntaxSet` and `SyntaxReference`
are tightly coupled, and enables the use of several `SyntaxSet`.
Instead of 100 ms - 50 ms, startup takes 10 ms - 5 ms.
HighlightingAssets::get_syntax_set() is never called when e.g. piping the bat
output to a file (see Config::loop_through), so by loading the SyntaxSet only
when needed, we radically improve startup time when it is not needed.
They are just a way to get access to data embedded in the binary, so they don't
conceptually belong inside HighlightingAssets.
This has the nice side effect of getting HighlightingAssets::from_cache() and
::from_binary(), that are highly related, next to each other.
Or rather, introduce new versions of these methods and deprecate the old ones.
This is preparation to enable robust and user-friendly support for lazy-loading.
With lazy-loading, we don't know if the SyntaxSet is valid until after we try to
use it, so wherever we try to use it, we need to return a Result. See discussion
about panics in #1747.
Using BufReader makes sense for large files, but assets are never large enough
to require buffering. It is significantly faster to load the file contents in
one go, so let's do that instead.
Closes#1753
It already now reduces code duplication slightly, but will become even more
useful in the future when we add more complicated logic such as lazy-loading.
Since we only modify `pub(crate)` items, the stable bat-as-a-library API is not
affected.
This takes us one step closer to making SyntaxSet lazy-loaded, which in turn
takes us one step closer to solving #951.
This fixes a bug on Windows where `Command::new` would also run
executables from the current working directory, possibly resulting in
accidental runs of programs called `less`.
Otherwise Rust 1.53.0 gets confused during `cargo doc` because it thinks
we want an actual URL:
warning: this URL is not a hyperlink
--> src/pretty_printer.rs:331:40
|
331 | /// The title for the input (e.g. "http://example.com/example.txt")
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: use an automatic link instead: `<http://example.com/example.txt>`
|
= note: `#[warn(rustdoc::bare_urls)]` on by default
= note: bare URLs are not automatically turned into clickable links
It was perhaps also a bit confusing to give an URL as an example in the
first place, because according to our own API example
`examples/inputs.rs` it is meant to be more a free-text thing.
Less 581.2 is here, and it has a ".2" in the version string, which can't
be parsed as a usize.
Update the check to find a non-digit character rather than a space. This
ignores the minor version, but parses the major version correctly.
Do not ignore `BAT_CONFIG_PATH` if it doesn't exist. Both when
generating a new config file with `--generate-config-file` and
when attempting to read the config.
Also, provide a better error message in case the file can not
be created.
closes#1550
closes#1510
The change in `create_highlighted_versions.py` fixes a "unknown theme
"'1337'" warning. The single quotes were wrong. `bat` was always falling
back to the default theme, so let's use that for now.
Fixed by implementing the proposal by sharkdp:
* Allow PAGER=bat, but ignore the setting in bat and simply default to
less. Unless of course, BAT_PAGER or --pager is used to overwrite the
value of PAGER.
* Disallow the usage of bat within BAT_PAGER and --pager.
This will fix#614 by making it clear what is wrong by showing the
following error message:
Failed to load one or more themes from
'/Users/me/.config/bat/themes' (reason: 'Invalid syntax theme
settings')
We also need to add a check if theme_dir.exists(), otherwise an absent
dir will seem like an error:
Failed to load one or more themes from
'/Users/me/.config/bat/themes' (reason: 'IO error for
operation on /Users/me/.config/bat/themes: No such file or
directory (os error 2)')
(This is the same check we already have for syntax_dir.)
To trigger/verify the changed code, run
bat --list-languages # or -L
This is the last clippy warning in the code that you get if you run
cargo clippy --all-targets --all-features -- --allow clippy::style
so by fixing it it becomes easier to spot when a new warning is
introduced (that does not belong to the clippy category clippy::style).
And by making it easy to spot new warnings, we increase chance of such
regressions not ending up in the code base.
This macro is intended to be package-internal and is not to be
considered part of the public lib API.
Use it in three places to reduce code duplication. However, main reason
for this refactoring is to allow us to fix#1063 without duplicating the
code yet another time.
The macro can also be used for the "Binary content from {} will not be
printed to the terminal" message if that message starts to use eprintln!
instead (if ever).
To trigger/verify the changed code, the following commands can be used:
cargo run -- --theme=ansi-light tests/examples/single-line.txt
cargo run -- --theme=does-not-exist tests/examples/single-line.txt
cargo run -- --style=grid,rule tests/examples/single-line.txt
This combines ansi-light and ansi-dark into a single theme that works
with both light and dark backgrounds. Instead of specifying white/black,
the ansi theme uses the terminal's default foreground/background color
by setting alpha=01, i.e. #00000001. This is in addition to the alpha=00
encoding where red contains an ANSI color palette number.
Now, `--theme ansi-light` and `--theme ansi-dark` will print a
deprecation notice and use ansi instead (unless the user has a custom
theme named ansi-light or ansi-dark, which would take precedence).