24 Commits

Author SHA1 Message Date
Devyn Cairns
d7392f1f3b
Internal representation (IR) compiler and evaluator (#13330)
# Description

This PR adds an internal representation language to Nushell, offering an
alternative evaluator based on simple instructions, stream-containing
registers, and indexed control flow. The number of registers required is
determined statically at compile-time, and the fixed size required is
allocated upon entering the block.

Each instruction is associated with a span, which makes going backwards
from IR instructions to source code very easy.

Motivations for IR:

1. **Performance.** By simplifying the evaluation path and making it
more cache-friendly and branch predictor-friendly, code that does a lot
of computation in Nushell itself can be sped up a decent bit. Because
the IR is fairly easy to reason about, we can also implement
optimization passes in the future to eliminate and simplify code.
2. **Correctness.** The instructions mostly have very simple and
easily-specified behavior, so hopefully engine changes are a little bit
easier to reason about, and they can be specified in a more formal way
at some point. I have made an effort to document each of the
instructions in the docs for the enum itself in a reasonably specific
way. Some of the errors that would have happened during evaluation
before are now moved to the compilation step instead, because they don't
make sense to check during evaluation.
3. **As an intermediate target.** This is a good step for us to bring
the [`new-nu-parser`](https://github.com/nushell/new-nu-parser) in at
some point, as code generated from new AST can be directly compared to
code generated from old AST. If the IR code is functionally equivalent,
it will behave the exact same way.
4. **Debugging.** With a little bit more work, we can probably give
control over advancing the virtual machine that `IrBlock`s run on to
some sort of external driver, making things like breakpoints and single
stepping possible. Tools like `view ir` and [`explore
ir`](https://github.com/devyn/nu_plugin_explore_ir) make it easier than
before to see what exactly is going on with your Nushell code.

The goal is to eventually replace the AST evaluator entirely, once we're
sure it's working just as well. You can help dogfood this by running
Nushell with `$env.NU_USE_IR` set to some value. The environment
variable is checked when Nushell starts, so config runs with IR, or it
can also be set on a line at the REPL to change it dynamically. It is
also checked when running `do` in case within a script you want to just
run a specific piece of code with or without IR.

# Example

```nushell
view ir { |data|
  mut sum = 0
  for n in $data {
    $sum += $n
  }
  $sum
}
```
  
```gas
# 3 registers, 19 instructions, 0 bytes of data
   0: load-literal           %0, int(0)
   1: store-variable         var 904, %0 # let
   2: drain                  %0
   3: drop                   %0
   4: load-variable          %1, var 903
   5: iterate                %0, %1, end 15 # for, label(1), from(14:)
   6: store-variable         var 905, %0
   7: load-variable          %0, var 904
   8: load-variable          %2, var 905
   9: binary-op              %0, Math(Plus), %2
  10: span                   %0
  11: store-variable         var 904, %0
  12: load-literal           %0, nothing
  13: drain                  %0
  14: jump                   5
  15: drop                   %0          # label(0), from(5:)
  16: drain                  %0
  17: load-variable          %0, var 904
  18: return                 %0
```

# Benchmarks

All benchmarks run on a base model Mac Mini M1.

## Iterative Fibonacci sequence

This is about as best case as possible, making use of the much faster
control flow. Most code will not experience a speed improvement nearly
this large.

```nushell
def fib [n: int] {
  mut a = 0
  mut b = 1
  for _ in 2..=$n {
    let c = $a + $b
    $a = $b
    $b = $c
  }
  $b
}
use std bench
bench { 0..50 | each { |n| fib $n } }
```

IR disabled:

```
╭───────┬─────────────────╮
│ mean  │ 1ms 924µs 665ns │
│ min   │ 1ms 700µs 83ns  │
│ max   │ 3ms 450µs 125ns │
│ std   │ 395µs 759ns     │
│ times │ [list 50 items] │
╰───────┴─────────────────╯
```

IR enabled:

```
╭───────┬─────────────────╮
│ mean  │ 452µs 820ns     │
│ min   │ 427µs 417ns     │
│ max   │ 540µs 167ns     │
│ std   │ 17µs 158ns      │
│ times │ [list 50 items] │
╰───────┴─────────────────╯
```

![explore ir
view](https://github.com/nushell/nushell/assets/10729/d7bccc03-5222-461c-9200-0dce71b83b83)

##
[gradient_benchmark_no_check.nu](https://github.com/nushell/nu_scripts/blob/main/benchmarks/gradient_benchmark_no_check.nu)

IR disabled:

```
╭───┬──────────────────╮
│ 0 │ 27ms 929µs 958ns │
│ 1 │ 21ms 153µs 459ns │
│ 2 │ 18ms 639µs 666ns │
│ 3 │ 19ms 554µs 583ns │
│ 4 │ 13ms 383µs 375ns │
│ 5 │ 11ms 328µs 208ns │
│ 6 │  5ms 659µs 542ns │
╰───┴──────────────────╯
```

IR enabled:

```
╭───┬──────────────────╮
│ 0 │       22ms 662µs │
│ 1 │ 17ms 221µs 792ns │
│ 2 │ 14ms 786µs 708ns │
│ 3 │ 13ms 876µs 834ns │
│ 4 │  13ms 52µs 875ns │
│ 5 │ 11ms 269µs 666ns │
│ 6 │  6ms 942µs 500ns │
╰───┴──────────────────╯
```

##
[random-bytes.nu](https://github.com/nushell/nu_scripts/blob/main/benchmarks/random-bytes.nu)

I got pretty random results out of this benchmark so I decided not to
include it. Not clear why.

# User-Facing Changes
- IR compilation errors may appear even if the user isn't evaluating
with IR.
- IR evaluation can be enabled by setting the `NU_USE_IR` environment
variable to any value.
- New command `view ir` pretty-prints the IR for a block, and `view ir
--json` can be piped into an external tool like [`explore
ir`](https://github.com/devyn/nu_plugin_explore_ir).

# Tests + Formatting
All tests are passing with `NU_USE_IR=1`, and I've added some more eval
tests to compare the results for some very core operations. I will
probably want to add some more so we don't have to always check
`NU_USE_IR=1 toolkit test --workspace` on a regular basis.

# After Submitting
- [ ] release notes
- [ ] further documentation of instructions?
- [ ] post-release: publish `nu_plugin_explore_ir`
2024-07-10 17:33:59 -07:00
Ian Manske
d7ba8872bf
Rename IoStream to OutDest (#12433)
# Description
I spent a while trying to come up with a good name for what is currently
`IoStream`. Looking back, this name is not the best, because it:
1. Implies that it is a stream, when it all it really does is specify
the output destination for a stream/pipeline.
2. Implies that it handles input and output, when it really only handles
output.

So, this PR renames `IoStream` to `OutDest` instead, which should be
more clear.
2024-04-09 16:48:32 +00:00
Ian Manske
b6c7656194
IO and redirection overhaul (#11934)
# Description
The PR overhauls how IO redirection is handled, allowing more explicit
and fine-grain control over `stdout` and `stderr` output as well as more
efficient IO and piping.

To summarize the changes in this PR:
- Added a new `IoStream` type to indicate the intended destination for a
pipeline element's `stdout` and `stderr`.
- The `stdout` and `stderr` `IoStream`s are stored in the `Stack` and to
avoid adding 6 additional arguments to every eval function and
`Command::run`. The `stdout` and `stderr` streams can be temporarily
overwritten through functions on `Stack` and these functions will return
a guard that restores the original `stdout` and `stderr` when dropped.
- In the AST, redirections are now directly part of a `PipelineElement`
as a `Option<Redirection>` field instead of having multiple different
`PipelineElement` enum variants for each kind of redirection. This
required changes to the parser, mainly in `lite_parser.rs`.
- `Command`s can also set a `IoStream` override/redirection which will
apply to the previous command in the pipeline. This is used, for
example, in `ignore` to allow the previous external command to have its
stdout redirected to `Stdio::null()` at spawn time. In contrast, the
current implementation has to create an os pipe and manually consume the
output on nushell's side. File and pipe redirections (`o>`, `e>`, `e>|`,
etc.) have precedence over overrides from commands.

This PR improves piping and IO speed, partially addressing #10763. Using
the `throughput` command from that issue, this PR gives the following
speedup on my setup for the commands below:
| Command | Before (MB/s) | After (MB/s) | Bash (MB/s) |
| --------------------------- | -------------:| ------------:|
-----------:|
| `throughput o> /dev/null` | 1169 | 52938 | 54305 |
| `throughput \| ignore` | 840 | 55438 | N/A |
| `throughput \| null` | Error | 53617 | N/A |
| `throughput \| rg 'x'` | 1165 | 3049 | 3736 |
| `(throughput) \| rg 'x'` | 810 | 3085 | 3815 |

(Numbers above are the median samples for throughput)

This PR also paves the way to refactor our `ExternalStream` handling in
the various commands. For example, this PR already fixes the following
code:
```nushell
^sh -c 'echo -n "hello "; sleep 0; echo "world"' | find "hello world"
```
This returns an empty list on 0.90.1 and returns a highlighted "hello
world" on this PR.

Since the `stdout` and `stderr` `IoStream`s are available to commands
when they are run, then this unlocks the potential for more convenient
behavior. E.g., the `find` command can disable its ansi highlighting if
it detects that the output `IoStream` is not the terminal. Knowing the
output streams will also allow background job output to be redirected
more easily and efficiently.

# User-Facing Changes
- External commands returned from closures will be collected (in most
cases):
  ```nushell
  1..2 | each {|_| nu -c "print a" }
  ```
This gives `["a", "a"]` on this PR, whereas this used to print "a\na\n"
and then return an empty list.

  ```nushell
  1..2 | each {|_| nu -c "print -e a" }
  ```
This gives `["", ""]` and prints "a\na\n" to stderr, whereas this used
to return an empty list and print "a\na\n" to stderr.

- Trailing new lines are always trimmed for external commands when
piping into internal commands or collecting it as a value. (Failure to
decode the output as utf-8 will keep the trailing newline for the last
binary value.) In the current nushell version, the following three code
snippets differ only in parenthesis placement, but they all also have
different outputs:

  1. `1..2 | each { ^echo a }`
     ```
     a
     a
     ╭────────────╮
     │ empty list │
     ╰────────────╯
     ```
  2. `1..2 | each { (^echo a) }`
     ```
     ╭───┬───╮
     │ 0 │ a │
     │ 1 │ a │
     ╰───┴───╯
     ```
  3. `1..2 | (each { ^echo a })`
     ```
     ╭───┬───╮
     │ 0 │ a │
     │   │   │
     │ 1 │ a │
     │   │   │
     ╰───┴───╯
     ```

  But in this PR, the above snippets will all have the same output:
  ```
  ╭───┬───╮
  │ 0 │ a │
  │ 1 │ a │
  ╰───┴───╯
  ```

- All existing flags on `run-external` are now deprecated.

- File redirections now apply to all commands inside a code block:
  ```nushell
  (nu -c "print -e a"; nu -c "print -e b") e> test.out
  ```
This gives "a\nb\n" in `test.out` and prints nothing. The same result
would happen when printing to stdout and using a `o>` file redirection.

- External command output will (almost) never be ignored, and ignoring
output must be explicit now:
  ```nushell
  (^echo a; ^echo b)
  ```
This prints "a\nb\n", whereas this used to print only "b\n". This only
applies to external commands; values and internal commands not in return
position will not print anything (e.g., `(echo a; echo b)` still only
prints "b").

- `complete` now always captures stderr (`do` is not necessary).

# After Submitting
The language guide and other documentation will need to be updated.
2024-03-14 15:51:55 -05:00
Stefan Holderbach
6e590fe0a2
Remove unused Index(Mut) impls on AST types (#11903)
# Description
Both `Block` and `Pipeline` had `Index`/`IndexMut` implementations to
access their elements, that are currently unused.
Explicit helpers or iteration would generally be preferred anyways but
in the current state the inner containers are `pub` and are liberally
used. (Sometimes with potentially panicking indexing or also iteration)

As it is potentially unclear what the meaning of the element from a
block or pipeline queried by a usize is, let's remove it entirely until
we come up with a better API.

# User-Facing Changes
None

Plugin authors shouldn't dig into AST internals
2024-02-21 18:02:30 +08:00
Wind
58c6fea60b
Support redirect stderr and stdout+stderr with a pipe (#11708)
# Description
Close: #9673
Close: #8277
Close: #10944

This pr introduces the following syntax:
1. `e>|`, pipe stderr to next command. Example: `$env.FOO=bar nu
--testbin echo_env_stderr FOO e>| str length`
2. `o+e>|` and `e+o>|`, pipe both stdout and stderr to next command,
example: `$env.FOO=bar nu --testbin echo_env_mixed out-err FOO FOO e+o>|
str length`

Note: it only works for external commands. ~There is no different for
internal commands, that is, the following three commands do the same
things:~ Edit: it raises errors if we want to pipes for internal
commands
``` 
❯ ls e>| str length
Error:   × `e>|` only works with external streams
   ╭─[entry #1:1:1]
 1 │ ls e>| str length
   ·    ─┬─
   ·     ╰── `e>|` only works on external streams
   ╰────

❯ ls e+o>| str length
Error:   × `o+e>|` only works with external streams
   ╭─[entry #2:1:1]
 1 │ ls e+o>| str length
   ·    ──┬──
   ·      ╰── `o+e>|` only works on external streams
   ╰────
```

This can help us to avoid some strange issues like the following:

`$env.FOO=bar (nu --testbin echo_env_stderr FOO) e>| str length`

Which is hard to understand and hard to explain to users.

# User-Facing Changes
Nan

# Tests + Formatting
To be done

# After Submitting
Maybe update documentation about these syntax.
2024-02-09 01:30:46 +08:00
TrMen
4b91ed57dd
Enforce call stack depth limit for all calls (#11729)
# Description
Previously, only direcly-recursive calls were checked for recursion
depth. But most recursive calls in nushell are mutually recursive since
expressions like `for`, `where`, `try` and `do` all execute a separte
block.

```nushell
def f [] {
    do { f }
}
```
Calling `f` would crash nushell with a stack overflow.

I think the only general way to prevent such a stack overflow is to
enforce a maximum call stack depth instead of only disallowing directly
recursive calls.

This commit also moves that logic into `eval_call()` instead of
`eval_block()` because the recursion limit is tracked in the `Stack`,
but not all blocks are evaluated in a new stack. Incrementing the
recursion depth of the caller's stack would permanently increment that
for all future calls.

Fixes #11667

# User-Facing Changes
Any function call can now fail with `recursion_limit_reached` instead of
just directly recursive calls. Mutually-recursive calls no longer crash
nushell.

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->
2024-02-08 06:42:24 +08:00
WindSoilder
077d1c8125
Support o>>, e>>, o+e>> to append output to an external file (#10764)
# Description
Close: #10278

This pr introduces `o>>`, `e>>`, `o+e>>` to allow redirection to append
to a file.
Examples:
```nushell
echo abc o>> a.txt
echo abc o>> a.txt
cat asdf e>> a.txt
cat asdf e>> a.txt
cat asdf o+e>> a.txt
```

~~TODO:~~
~~1. currently internal commands with `o+e>` redirect to a variable is
broken: `let x = "a.txt"; echo abc o+e> $x`, not sure when it was
introduced...~~
~~2. redirect stdout and stderr with append mode doesn't supported yet:
`cat asdf o>>a.txt e>>b.ext`~~

~~For these 2 items, I'd like to fix them in different prs.~~
Already done in this pr
2023-11-27 07:52:39 -06:00
JT
786ba3bf91
Input output checking (#9680)
# Description

This PR tights input/output type-checking a bit more. There are a lot of
commands that don't have correct input/output types, so part of the
effort is updating them.

This PR now contains updates to commands that had wrong input/output
signatures. It doesn't add examples for these new signatures, but that
can be follow-up work.

# User-Facing Changes

BREAKING CHANGE BREAKING CHANGE

This work enforces many more checks on pipeline type correctness than
previous nushell versions. This strictness may uncover incompatibilities
in existing scripts or shortcomings in the type information for internal
commands.

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used -A
clippy::needless_collect -A clippy::result_large_err` to check that
you're using the standard code style
- `cargo test --workspace` to check that all tests pass
- `cargo run -- -c "use std testing; testing run-tests --path
crates/nu-std"` to run the tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->
2023-07-14 15:20:35 +12:00
JT
9068093081
Improve type hovers (#9515)
# Description

This PR does a few things to help improve type hovers and, in the
process, fixes a few outstanding issues in the type system. Here's a
list of the changes:

* `for` now will try to infer the type of the iteration variable based
on the expression it's given. This fixes things like `for x in [1, 2, 3]
{ }` where `x` now properly gets the int type.
* Removed old input/output type fields from the signature, focuses on
the vec of signatures. Updated a bunch of dataframe commands that hadn't
moved over. This helps tie things together a bit better
* Fixed inference of types from subexpressions to use the last
expression in the block
* Fixed handling of explicit types in `let` and `mut` calls, so we now
respect that as the authoritative type

I also tried to add `def` input/output type inference, but unfortunately
we only know the predecl types universally, which means we won't have
enough information to properly know what the types of the custom
commands are.

# User-Facing Changes

Script typechecking will get tighter in some cases
Hovers should be more accurate in some cases that previously resorted to
any.

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used -A
clippy::needless_collect -A clippy::result_large_err` to check that
you're using the standard code style
- `cargo test --workspace` to check that all tests pass
- `cargo run -- crates/nu-std/tests/run.nu` to run the tests for the
standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->

---------

Co-authored-by: Darren Schroeder <343840+fdncred@users.noreply.github.com>
2023-06-29 05:19:48 +12:00
Darren Schroeder
4b8a259916
update ast to support output to json (#8962)
# Description
This PR changes the `ast` command to be able to output `--json` as well
as `nuon` (default) with "pretty" and "minified" output. I'm hoping this
functionality will be usable in the vscode extension for semantic
tokenization and highlighting.

# User-Facing Changes
There's a new `--json`/`-j` option. Prior version output of nuon is
maintained as default.

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used -A
clippy::needless_collect` to check that you're using the standard code
style
- `cargo test --workspace` to check that all tests pass
- `cargo run -- crates/nu-std/tests/run.nu` to run the tests for the
standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->
2023-04-26 08:15:42 -05:00
JT
32f098d91d
Hopefully speedup startup (#8913)
# Description

Trying a few different things to hopefully speedup startup a bit. I'm
seeing some improvement on my box for the profiles I have, but the data
I'm seeing is noisy.

- Remove allocations in a few places where we created vec's but could
use iterators
- Pre-allocate space for blocks based on the lite block
- Removed a few extra clones

# User-Facing Changes
<!-- List of all changes that impact the user experience here. This
helps us keep track of breaking changes. -->

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used -A
clippy::needless_collect` to check that you're using the standard code
style
- `cargo test --workspace` to check that all tests pass
- `cargo run -- crates/nu-std/tests/run.nu` to run the tests for the
standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->
2023-04-18 20:19:08 +12:00
Amirhossein Akhlaghpour
00469de93e
Limit recursion to avoid stack overflow (#7657)
Add recursion limit to `def` and `block`.
Summary of this PR , it will detect if `def` call itself or not .
Then execute by using `stack` which I think best choice to use with this
design and core as it is available in all crates and mutable and
calculate the recursion limit on calling `def`.
Set 50 as recursion limit on `Config`.
Add some tests too .

Fixes #5899

Co-authored-by: Reilly Wood <reilly.wood@icloud.com>
2023-01-04 18:38:50 -08:00
JT
56b3fc61a3
Remove statements, replaced by pipelines (#4482) 2022-02-15 14:31:14 -05:00
Jakub Žádník
2fbd182993
Allow viewing the source code of blocks (#894)
* Add spans to blocks and view command

* Better description; Cleanup

* Rename "view" command to "view-source"
2022-01-31 00:05:25 +02:00
JT
44821d9941
Add support for def-env and export def-env (#887) 2022-01-29 15:45:46 -05:00
Jakub Žádník
f8f437b060
Separate Overlay into its own thing (#344)
It's no longer attached to a Block. Makes access to overlays more
streamlined by removing this one indirection. Also makes it easier to
create standalone overlays without a block which might come in handy.
2021-11-17 17:23:55 +13:00
Jakub Žádník
5459d30a24
Add environment variable support for modules (#331)
* Add 'expor env' dummy command

* (WIP) Abstract away module exportables as Overlay

* Switch to Overlays for use/hide

Works for decls only right now.

* Fix passing import patterns of hide to eval

* Simplify use/hide of decls

* Add ImportPattern as Expr; Add use env eval

Still no parsing of "export env" so I can't test it yet.

* Refactor export parsing; Add InternalError

* Add env var export and activation; Misc changes

Now it is possible to `use` env var that was exported from a module.

This commit also adds some new errors and other small changes.

* Add env var hiding

* Fix eval not recognizing hidden decls

Without this change, calling `hide foo`, the evaluator does not know
whether a custom command named "foo" was hidden during parsing,
therefore, it is not possible to reliably throw an error about the "foo"
name not found.

* Add use/hide/export env var tests; Cleanup; Notes

* Ignore hide env related tests for now

* Fix main branch merge mess

* Fixed multi-word export def

* Fix hiding tests on Windows

* Remove env var hiding for now
2021-11-16 12:16:06 +13:00
JT
d29208dd9e WIP 2021-10-26 09:04:23 +13:00
Jakub Žádník
3f8f3ecf9a Fmt 2021-09-26 14:12:39 +03:00
Jakub Žádník
f57f7b2def Allow adding definitions from module into scope 2021-09-26 13:53:52 +03:00
Jakub Žádník
12cf1a8f83 Allow adding module blocks to engine state 2021-09-26 12:12:32 +03:00
Fernando Herrera
0794ebf5fa error parsing for def, alias and let 2021-09-10 08:28:43 +01:00
JT
aaee3a8b61 WIP 2021-09-06 11:16:27 +12:00
JT
94687a7603 Back to working state 2021-09-03 06:21:37 +12:00