- this PR should close#14514
# Description
Makes updates to `$env.ENV_CONVERSIONS` take effect immediately.
# User-Facing Changes
No breaking change, `$env.ENV_CONVERSIONS` can be set and its effect
used in the same file.
# Tests + Formatting
- 🟢 toolkit fmt
- 🟢 toolkit clippy
- 🟢 toolkit test
- 🟢 toolkit test stdlib
# After Submitting
N/A
# Description
Because `and` and `or` are short-circuiting operations in Nushell, they
must be compiled to a sequence that avoids evaluating the RHS if the LHS
is already sufficient to determine the output - i.e., `false` for `and`
and `true` for `or`. I initially implemented this with `branch-if`
instructions, simply returning the RHS if it needed to be evaluated, and
returning the short-circuited boolean value if it did not.
Example for `$a and $b`:
```
0: load-variable %0, var 999 "$a"
1: branch-if %0, 3
2: jump 5
3: load-variable %0, var 1000 "$b" # label(0), from(1:)
4: jump 6
5: load-literal %0, bool(false) # label(1), from(2:)
6: span %0 # label(2), from(4:)
7: return %0
```
Unfortunately, this broke polars, because using `and`/`or` on custom
values is perfectly valid and they're allowed to define that behavior
differently, and the polars plugin uses this for boolean masks. But
without using the `binary-op` instruction, that custom behavior is never
invoked. Additionally, `branch-if` requires a boolean, and custom values
are not booleans. This changes the IR to the following, using the
`match` instruction to check for the specific short-circuit value
instead, and still invoking `binary-op` otherwise:
```
0: load-variable %0, var 125 "$a"
1: match (false), %0, 4
2: load-variable %1, var 124 "$b"
3: binary-op %0, Boolean(And), %1
4: span %0 # label(0), from(1:)
5: return %0
```
I've also renamed `Pattern::Value` to `Pattern::Expression` and added a
proper `Pattern::Value` variant that actually contains a `Value`
instead. I'm still hoping to remove `Pattern::Expression` eventually,
because it's kind of a hack - we don't actually evaluate the expression,
we just match it against a few cases specifically for pattern matching,
and it's one of the cases where AST leaks into IR and I want to remove
all of those cases, because AST should not leak into IR.
Fixes#14518
# User-Facing Changes
- `and` and `or` now support custom values again.
- the IR is actually a little bit cleaner, though it may be a bit
slower; `match` is more complex.
# Tests + Formatting
The existing tests pass, but I didn't add anything new. Unfortunately I
don't think there's anything built-in to trigger this, but maybe some
testcases could be added to polars to test it.
# Description
Before this PR, `help commands` uses the name from a command's
declaration rather than the name in the scope. This is problematic when
trying to view the help page for the `main` command of a module. For
example, `std bench`:
```nushell
use std/bench
help bench
# => Error: nu::parser::not_found
# =>
# => × Not found.
# => ╭─[entry #10:1:6]
# => 1 │ help bench
# => · ──┬──
# => · ╰── did not find anything under this name
# => ╰────
```
This can also cause confusion when importing specific commands from
modules. Furthermore, if there are multiple commands with the same name
from different modules, the help text for _both_ will appear when
querying their help text (this is especially problematic for `main`
commands, see #14033):
```nushell
use std/iter
help iter find
# => Error: nu::parser::not_found
# =>
# => × Not found.
# => ╭─[entry #3:1:6]
# => 1│ help iter find
# => · ────┬────
# => · ╰── did not find anything under this name
# => ╰────
help find
# => Searches terms in the input.
# =>
# => Search terms: filter, regex, search, condition
# =>
# => Usage:
# => > find {flags} ...(rest)
# [...]
# => Returns the first element of the list that matches the
# => closure predicate, `null` otherwise
# [...]
# (full text omitted for brevity)
```
This PR changes `help commands` to use the name as it is in scope, so
prefixing any command in scope with `help` will show the correct help
text.
```nushell
use std/bench
help bench
# [help text for std bench]
use std/iter
help iter find
# [help text for std iter find]
use std
help std bench
# [help text for std bench]
help std iter find
# [help text for std iter find]
```
Additionally, the IR code generation for commands called with the
`--help` text has been updated to reflect this change.
This does have one side effect: when a module has a `main` command
defined, running `help <name>` (which checks `help aliases`, then `help
commands`, then `help modules`) will show the help text for the `main`
command rather than the module. The help text for the module is still
accessible with `help modules <name>`.
Fixes#10499, #10311, #11609, #13470, #14033, and #14402.
Partially fixes#10707.
Does **not** fix#11447.
# User-Facing Changes
* Help text for commands can be obtained by running `help <command
name>`, where the command name is the same thing you would type in order
to execute the command. Previously, it was the name of the function as
written in the source file.
* For example, for the following module `spam` with command `meow`:
```nushell
module spam {
# help text
export def meow [] {}
}
```
* Before this PR:
* Regardless of how `meow` is `use`d, the help text is viewable by
running `help meow`.
* After this PR:
* When imported with `use spam`: The `meow` command is executed by
running `spam meow` and the `help` text is viewable by running `help
spam meow`.
* When imported with `use spam foo`: The `meow` command is executed by
running `meow` and the `help` text is viewable by running `meow`.
* When a module has a `main` command defined, `help <module name>` will
return help for the main command, rather than the module. To access the
help for the module, use `help modules <module name>`.
# Tests + Formatting
- 🟢 `toolkit fmt`
- 🟢 `toolkit clippy`
- 🟢 `toolkit test`
- 🟢 `toolkit test stdlib`
# After Submitting
N/A
# Description
The "append" operator currently serves as both the append operator and
the concatenation operator. This dual role creates ambiguity when
operating on nested lists.
```nu
[1 2] ++ 3 # appends a value to a list [1 2 3]
[1 2] ++ [3 4] # concatenates two lists [1 2 3 4]
[[1 2] [3 4]] ++ [5 6]
# does this give [[1 2] [3 4] [5 6]]
# or [[1 2] [3 4] 5 6]
```
Another problem is that `++=` can change the type of a variable:
```nu
mut str = 'hello '
$str ++= ['world']
($str | describe) == list<string>
```
Note that appending is only relevant for lists, but concatenation is
relevant for lists, strings, and binary values. Additionally, appending
can be expressed in terms of concatenation (see example below). So, this
PR changes the `++` operator to only perform concatenation.
# User-Facing Changes
Using the `++` operator with a list and a non-list value will now be a
compile time or runtime error.
```nu
mut list = []
$list ++= 1 # error
```
Instead, concatenate a list with one element:
```nu
$list ++= [1]
```
Or use `append`:
```nu
$list = $list | append 1
```
# After Submitting
Update book and docs.
---------
Co-authored-by: Douglas <32344964+NotTheDr01ds@users.noreply.github.com>
# Description
Fixes: #14110Fixes: #14087
I think it's ok to not generating instruction to `def` and `export def`
call. Because they just return `PipelineData::Empty` without doing
anything.
If nushell generates instructions for `def` and `export def`, nushell
will try to capture variables for these block. It's not the time to do
this.
# User-Facing Changes
```
nu -c "
def bar [] {
let x = 1
($x | foo)
}
def foo [] {
foo
}
"
```
Will no longer raise error.
# Tests + Formatting
Added 4 tests
# Description
Fixes#13991. This was done by more clearly separating the case when a
pipeline is drained vs when it is being written (to a file).
I also added an `OutDest::Print` case which might not be strictly
necessary, but is a helpful addition.
# User-Facing Changes
Bug fix.
# Tests + Formatting
Added a test.
# After Submitting
There are still a few redirection bugs that I found, but they require
larger code changes, so I'll leave them until after the release.
# Description
In the PR #13832 I used some newtypes for the old IDs. `SpanId` and
`RegId` already used newtypes, to streamline the code, I made them into
the same style as the other marker-based IDs.
Since `RegId` should be a bit smaller (it uses a `u32` instead of
`usize`) according to @devyn, I made the `Id` type generic with `usize`
as the default inner value.
The question still stands how `Display` should be implemented if even.
# User-Facing Changes
Users of the internal values of `RegId` or `SpanId` have breaking
changes but who outside nushell itself even uses these?
# After Submitting
The IDs will be streamlined and all type-safe.
# Description
Partialy addresses #13868. `try` does not catch non-zero exit code
errors from the last command in a pipeline if the result is assigned to
a variable using `let` (or `mut`).
This was fixed by adding a new `OutDest::Value` case. This is used when
the pipeline is in a "value" position. I.e., it will be collected into a
value. This ended up replacing most of the usages of `OutDest::Capture`.
So, this PR also renames `OutDest::Capture` to `OutDest::PipeSeparate`
to better fit the few remaining use cases for it.
# User-Facing Changes
Bug fix.
# Tests + Formatting
Added two tests.
# Description
Fixes a bug in the IR for `try` to match that of the regular evaluator
(continuing from #13515):
```nushell
# without IR:
try { ^false } catch { 'caught' } # == 'caught'
# with IR:
try { ^false } catch { 'caught' } # error, non-zero exit code
```
In this PR, both now evaluate to `caught`. For the implementation, I had
to add another instruction, and feel free to suggest better
alternatives. In the future, it might be possible to get rid of this
extra instruction.
# User-Facing Changes
Bug fix, `try { ^false } catch { 'caught' }` now works in IR.
# Description
This PR makes it so that non-zero exit codes and termination by signal
are treated as a normal `ShellError`. Currently, these are silent
errors. That is, if an external command fails, then it's code block is
aborted, but the parent block can sometimes continue execution. E.g.,
see #8569 and this example:
```nushell
[1 2] | each { ^false }
```
Before this would give:
```
╭───┬──╮
│ 0 │ │
│ 1 │ │
╰───┴──╯
```
Now, this shows an error:
```
Error: nu:🐚:eval_block_with_input
× Eval block failed with pipeline input
╭─[entry #1:1:2]
1 │ [1 2] | each { ^false }
· ┬
· ╰── source value
╰────
Error: nu:🐚:non_zero_exit_code
× External command had a non-zero exit code
╭─[entry #1:1:17]
1 │ [1 2] | each { ^false }
· ──┬──
· ╰── exited with code 1
╰────
```
This PR fixes#12874, fixes#5960, fixes#10856, and fixes#5347. This
PR also partially addresses #10633 and #10624 (only the last command of
a pipeline is currently checked). It looks like #8569 is already fixed,
but this PR will make sure it is definitely fixed (fixes#8569).
# User-Facing Changes
- Non-zero exit codes and termination by signal now cause an error to be
thrown.
- The error record value passed to a `catch` block may now have an
`exit_code` column containing the integer exit code if the error was due
to an external command.
- Adds new config values, `display_errors.exit_code` and
`display_errors.termination_signal`, which determine whether an error
message should be printed in the respective error cases. For
non-interactive sessions, these are set to `true`, and for interactive
sessions `display_errors.exit_code` is false (via the default config).
# Tests
Added a few tests.
# After Submitting
- Update docs and book.
- Future work:
- Error if other external commands besides the last in a pipeline exit
with a non-zero exit code. Then, deprecate `do -c` since this will be
the default behavior everywhere.
- Add a better mechanism for exit codes and deprecate
`$env.LAST_EXIT_CODE` (it's buggy).
# Description
Seems like I developed a bit of a bad habit of trying to link
```rust
/// [`.foo()`]
```
in docstrings, and this just doesn't work automatically; you have to do
```rust
/// [`.foo()`](Self::foo)
```
if you want it to actually link. I think I found and replaced all of
these.
# User-Facing Changes
Just docs.
# Description
[Discovered](https://discord.com/channels/601130461678272522/614593951969574961/1266503282554179604)
by `@warp` on Discord.
The IR compiler was not properly setting redirect modes for
subexpressions because `FullCellPath` was always being compiled with
capture-out redirection. This is the correct behavior if there is a tail
to the `FullCellPath`, as we need the value in order to try to extract
anything from it (although this is unlikely to work) - however, the
parser also generates `FullCellPath`s with an empty tail quite often,
including for bare subexpressions.
Because of this, the following did not behave as expected:
```nushell
(docker run -it --rm alpine)
```
Capturing the output meant that `docker` didn't have direct access to
the terminal as a TTY.
As this is a minor bug fix, it should be okay to include in the 0.96.1
patch release.
# User-Facing Changes
- Fixes the bug as described when running with IR evaluation enabled.
# Tests + Formatting
I added a test for this, though we're not currently running all tests
with IR on the CI, but it should ensure this behaviour is consistent.
The equivalent minimum repro I could find was:
```nushell
(nu --testbin cococo); null
```
as this should cause the `cococo` message to appear on stdout, and if
Nushell is capturing the output, it would be discarded instead.
# Description
This grew quite a bit beyond its original scope, but I've tried to make
`$in` a bit more consistent and easier to work with.
Instead of the parser generating calls to `collect` and creating
closures, this adds `Expr::Collect` which just evaluates in the same
scope and doesn't require any closure.
When `$in` is detected in an expression, it is replaced with a new
variable (also called `$in`) and wrapped in `Expr::Collect`. During
eval, this expression is evaluated directly, with the input and with
that new variable set to the collected value.
Other than being faster and less prone to gotchas, it also makes it
possible to typecheck the output of an expression containing `$in`,
which is nice. This is a breaking change though, because of the lack of
the closure and because now typechecking will actually happen. Also, I
haven't attempted to typecheck the input yet.
The IR generated now just looks like this:
```gas
collect %in
clone %tmp, %in
store-variable $in, %tmp
# %out <- ...expression... <- %in
drop-variable $in
```
(where `$in` is the local variable created for this collection, and not
`IN_VARIABLE_ID`)
which is a lot better than having to create a closure and call `collect
--keep-env`, dealing with all of the capture gathering and allocation
that entails. Ideally we can also detect whether that input is actually
needed, so maybe we don't have to clone, but I haven't tried to do that
yet. Theoretically now that the variable is a unique one every time, it
should be possible to give it a type - I just don't know how to
determine that yet.
On top of that, I've also reworked how `$in` works in pipeline-initial
position. Previously, it was a little bit inconsistent. For example,
this worked:
```nushell
> 3 | do { let x = $in; let y = $in; print $x $y }
3
3
```
However, this causes a runtime variable not found error on the second
`$in`:
```nushell
> def foo [] { let x = $in; let y = $in; print $x $y }; 3 | foo
Error: nu:🐚:variable_not_found
× Variable not found
╭─[entry #115:1:35]
1 │ def foo [] { let x = $in; let y = $in; print $x $y }; 3 | foo
· ─┬─
· ╰── variable not found
╰────
```
I've fixed this by making the first element `$in` detection *always*
happen at the block level, so if you use `$in` in pipeline-initial
position anywhere in a block, it will collect with an implicit
subexpression around the whole thing, and you can then use that `$in`
more than once. In doing this I also rewrote `parse_pipeline()` and
hopefully it's a bit more straightforward and possibly more efficient
too now.
Finally, I've tried to make `let` and `mut` a lot more straightforward
with how they handle the rest of the pipeline, and using a redirection
with `let`/`mut` now does what you'd expect if you assume that they
consume the whole pipeline - the redirection is just processed as
normal. These both work now:
```nushell
let x = ^foo err> err.txt
let y = ^foo out+err>| str length
```
It was previously possible to accomplish this with a subexpression, but
it just seemed like a weird gotcha that you couldn't do it. Intuitively,
`let` and `mut` just seem to take the whole line.
- closes#13137
# User-Facing Changes
- `$in` will behave more consistently with blocks and closures, since
the entire block is now just wrapped to handle it if it appears in the
first pipeline element
- `$in` no longer creates a closure, so what can be done within an
expression containing `$in` is less restrictive
- `$in` containing expressions are now type checked, rather than just
resulting in `any`. However, `$in` itself is still `any`, so this isn't
quite perfect yet
- Redirections are now allowed in `let` and `mut` and behave pretty much
how you'd expect
# Tests + Formatting
Added tests to cover the new behaviour.
# After Submitting
- [ ] release notes (definitely breaking change)
# Description
This PR adds an internal representation language to Nushell, offering an
alternative evaluator based on simple instructions, stream-containing
registers, and indexed control flow. The number of registers required is
determined statically at compile-time, and the fixed size required is
allocated upon entering the block.
Each instruction is associated with a span, which makes going backwards
from IR instructions to source code very easy.
Motivations for IR:
1. **Performance.** By simplifying the evaluation path and making it
more cache-friendly and branch predictor-friendly, code that does a lot
of computation in Nushell itself can be sped up a decent bit. Because
the IR is fairly easy to reason about, we can also implement
optimization passes in the future to eliminate and simplify code.
2. **Correctness.** The instructions mostly have very simple and
easily-specified behavior, so hopefully engine changes are a little bit
easier to reason about, and they can be specified in a more formal way
at some point. I have made an effort to document each of the
instructions in the docs for the enum itself in a reasonably specific
way. Some of the errors that would have happened during evaluation
before are now moved to the compilation step instead, because they don't
make sense to check during evaluation.
3. **As an intermediate target.** This is a good step for us to bring
the [`new-nu-parser`](https://github.com/nushell/new-nu-parser) in at
some point, as code generated from new AST can be directly compared to
code generated from old AST. If the IR code is functionally equivalent,
it will behave the exact same way.
4. **Debugging.** With a little bit more work, we can probably give
control over advancing the virtual machine that `IrBlock`s run on to
some sort of external driver, making things like breakpoints and single
stepping possible. Tools like `view ir` and [`explore
ir`](https://github.com/devyn/nu_plugin_explore_ir) make it easier than
before to see what exactly is going on with your Nushell code.
The goal is to eventually replace the AST evaluator entirely, once we're
sure it's working just as well. You can help dogfood this by running
Nushell with `$env.NU_USE_IR` set to some value. The environment
variable is checked when Nushell starts, so config runs with IR, or it
can also be set on a line at the REPL to change it dynamically. It is
also checked when running `do` in case within a script you want to just
run a specific piece of code with or without IR.
# Example
```nushell
view ir { |data|
mut sum = 0
for n in $data {
$sum += $n
}
$sum
}
```
```gas
# 3 registers, 19 instructions, 0 bytes of data
0: load-literal %0, int(0)
1: store-variable var 904, %0 # let
2: drain %0
3: drop %0
4: load-variable %1, var 903
5: iterate %0, %1, end 15 # for, label(1), from(14:)
6: store-variable var 905, %0
7: load-variable %0, var 904
8: load-variable %2, var 905
9: binary-op %0, Math(Plus), %2
10: span %0
11: store-variable var 904, %0
12: load-literal %0, nothing
13: drain %0
14: jump 5
15: drop %0 # label(0), from(5:)
16: drain %0
17: load-variable %0, var 904
18: return %0
```
# Benchmarks
All benchmarks run on a base model Mac Mini M1.
## Iterative Fibonacci sequence
This is about as best case as possible, making use of the much faster
control flow. Most code will not experience a speed improvement nearly
this large.
```nushell
def fib [n: int] {
mut a = 0
mut b = 1
for _ in 2..=$n {
let c = $a + $b
$a = $b
$b = $c
}
$b
}
use std bench
bench { 0..50 | each { |n| fib $n } }
```
IR disabled:
```
╭───────┬─────────────────╮
│ mean │ 1ms 924µs 665ns │
│ min │ 1ms 700µs 83ns │
│ max │ 3ms 450µs 125ns │
│ std │ 395µs 759ns │
│ times │ [list 50 items] │
╰───────┴─────────────────╯
```
IR enabled:
```
╭───────┬─────────────────╮
│ mean │ 452µs 820ns │
│ min │ 427µs 417ns │
│ max │ 540µs 167ns │
│ std │ 17µs 158ns │
│ times │ [list 50 items] │
╰───────┴─────────────────╯
```

##
[gradient_benchmark_no_check.nu](https://github.com/nushell/nu_scripts/blob/main/benchmarks/gradient_benchmark_no_check.nu)
IR disabled:
```
╭───┬──────────────────╮
│ 0 │ 27ms 929µs 958ns │
│ 1 │ 21ms 153µs 459ns │
│ 2 │ 18ms 639µs 666ns │
│ 3 │ 19ms 554µs 583ns │
│ 4 │ 13ms 383µs 375ns │
│ 5 │ 11ms 328µs 208ns │
│ 6 │ 5ms 659µs 542ns │
╰───┴──────────────────╯
```
IR enabled:
```
╭───┬──────────────────╮
│ 0 │ 22ms 662µs │
│ 1 │ 17ms 221µs 792ns │
│ 2 │ 14ms 786µs 708ns │
│ 3 │ 13ms 876µs 834ns │
│ 4 │ 13ms 52µs 875ns │
│ 5 │ 11ms 269µs 666ns │
│ 6 │ 6ms 942µs 500ns │
╰───┴──────────────────╯
```
##
[random-bytes.nu](https://github.com/nushell/nu_scripts/blob/main/benchmarks/random-bytes.nu)
I got pretty random results out of this benchmark so I decided not to
include it. Not clear why.
# User-Facing Changes
- IR compilation errors may appear even if the user isn't evaluating
with IR.
- IR evaluation can be enabled by setting the `NU_USE_IR` environment
variable to any value.
- New command `view ir` pretty-prints the IR for a block, and `view ir
--json` can be piped into an external tool like [`explore
ir`](https://github.com/devyn/nu_plugin_explore_ir).
# Tests + Formatting
All tests are passing with `NU_USE_IR=1`, and I've added some more eval
tests to compare the results for some very core operations. I will
probably want to add some more so we don't have to always check
`NU_USE_IR=1 toolkit test --workspace` on a regular basis.
# After Submitting
- [ ] release notes
- [ ] further documentation of instructions?
- [ ] post-release: publish `nu_plugin_explore_ir`