Commit Graph

3086 Commits

Author SHA1 Message Date
YizhePKU
6c649809d3
Rewrite run_external.rs (#12921)
This PR is a complete rewrite of `run_external.rs`. The main goal of the
rewrite is improving readability, but it also fixes some bugs related to
argument handling and the PATH variable (fixes
https://github.com/nushell/nushell/issues/6011).

I'll discuss some technical details to make reviewing easier.

## Argument handling

Quoting arguments for external commands is hard. Like, *really* hard.
We've had more than a dozen issues and PRs dedicated to quoting
arguments (see Appendix) but the current implementation is still buggy.

Here's a demonstration of the buggy behavior:

```nu
let foo = "'bar'"
^touch $foo            # This creates a file named `bar`, but it should be `'bar'`
^touch ...[ "'bar'" ]  # Same
```

I'll describe how this PR deals with argument handling.

First, we'll introduce the concept of **bare strings**. Bare strings are
**string literals** that are either **unquoted** or **quoted by
backticks** [^1]. Strings within a list literal are NOT considered bare
strings, even if they are unquoted or quoted by backticks.

When a bare string is used as an argument to external process, we need
to perform tilde-expansion, glob-expansion, and inner-quotes-removal, in
that order. "Inner-quotes-removal" means transforming from
`--option="value"` into `--option=value`.

## `.bat` files and CMD built-ins

On Windows, `.bat` files and `.cmd` files are considered executable, but
they need `CMD.exe` as the interpreter. The Rust standard library
supports running `.bat` files directly and will spawn `CMD.exe` under
the hood (see
[documentation](https://doc.rust-lang.org/std/process/index.html#windows-argument-splitting)).
However, other extensions are not supported [^2].

Nushell also supports a selected number of CMD built-ins. The problem
with CMD is that it uses a different set of quoting rules. Correctly
quoting for CMD requires using
[Command::raw_arg()](https://doc.rust-lang.org/std/os/windows/process/trait.CommandExt.html#tymethod.raw_arg)
and manually quoting CMD special characters, on top of quoting from the
Nushell side. ~~I decided that this is too complex and chose to reject
special characters in CMD built-ins instead [^3]. Hopefully this will
not affact real-world use cases.~~ I've implemented escaping that works
reasonably well.

## `which-support` feature

The `which` crate is now a hard dependency of `nu-command`, making the
`which-support` feature essentially useless. The `which` crate is
already a hard dependency of `nu-cli`, and we should consider removing
the `which-support` feature entirely.

## Appendix

Here's a list of quoting-related issues and PRs in rough chronological
order.

* https://github.com/nushell/nushell/issues/4609
* https://github.com/nushell/nushell/issues/4631
* https://github.com/nushell/nushell/issues/4601
  * https://github.com/nushell/nushell/pull/5846
* https://github.com/nushell/nushell/issues/5978
  * https://github.com/nushell/nushell/pull/6014
* https://github.com/nushell/nushell/issues/6154
  * https://github.com/nushell/nushell/pull/6161
* https://github.com/nushell/nushell/issues/6399
  * https://github.com/nushell/nushell/pull/6420
  * https://github.com/nushell/nushell/pull/6426
* https://github.com/nushell/nushell/issues/6465
* https://github.com/nushell/nushell/issues/6559
  * https://github.com/nushell/nushell/pull/6560

[^1]: The idea that backtick-quoted strings act like bare strings was
introduced by Kubouch and briefly mentioned in [the language
reference](https://www.nushell.sh/lang-guide/chapters/strings_and_text.html#backtick-quotes).

[^2]: The documentation also said "running .bat scripts in this way may
be removed in the future and so should not be relied upon", which is
another reason to move away from this. But again, quoting for CMD is
hard.

[^3]: If anyone wants to try, the best resource I found on the topic is
[this](https://daviddeley.com/autohotkey/parameters/parameters.htm).
2024-05-23 02:05:27 +00:00
Wind
ac4125f8ed
fix range semantic in detect_columns, str substring, str index-of (#12894)
# Description
Fixes: https://github.com/nushell/nushell/issues/7761

It's still unsure if we want to change the `range semantic` itself, but
it's good to keep range semantic consistent between nushell commands.

# User-Facing Changes
### Before
```nushell
❯ "abc" | str substring 1..=2
b
```
### After
```nushell
❯ "abc" | str substring 1..=2
bc
```

# Tests + Formatting
Adjust tests to fit new behavior
2024-05-22 20:00:58 +03:00
Jakub Žádník
75689ec98a
Small improvements to debug profile (#12930)
<!--
if this PR closes one or more issues, you can automatically link the PR
with
them by using one of the [*linking
keywords*](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword),
e.g.
- this PR should close #xxxx
- fixes #xxxx

you can also mention related issues, PRs or discussions!
-->

# Description
<!--
Thank you for improving Nushell. Please, check our [contributing
guide](../CONTRIBUTING.md) and talk to the core team before making major
changes.

Description of your pull request goes here. **Provide examples and/or
screenshots** if your changes affect the user experience.
-->

1. With the `-l` flag, `debug profile` now collects files and line
numbers of profiled pipeline elements

![profiler_lines](https://github.com/nushell/nushell/assets/25571562/b400a956-d958-4aff-aa4c-7e65da3f78fa)

2. Error from the profiled closure will be reported instead of silently
ignored.

![profiler_lines_error](https://github.com/nushell/nushell/assets/25571562/54f7ad7a-06a3-4d56-92c2-c3466917bee8)


# User-Facing Changes
<!-- List of all changes that impact the user experience here. This
helps us keep track of breaking changes. -->

New `--lines(-l)` flag to `debug profile`. The command will also fail if
the profiled closure fails, so technically it is a breaking change.

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use toolkit.nu; toolkit test stdlib"` to run the
tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->

---------

Co-authored-by: Ian Manske <ian.manske@pm.me>
2024-05-22 19:56:51 +03:00
Devyn Cairns
7de513a4e0
Implement streaming I/O for CSV and TSV commands (#12918)
# Description

Implements streaming for:

- `from csv`
- `from tsv`
- `to csv`
- `to tsv`

via the new string-typed ByteStream support.

# User-Facing Changes
Commands above. Also:

- `to csv` and `to tsv` now have `--columns <List(String)>`, to provide
the exact columns desired in the output. This is required for them to
have streaming output, because otherwise collecting the entire list is
necessary to determine the output columns. If we introduce
`TableStream`, this may become less necessary.

# Tests + Formatting
- 🟢 `toolkit fmt`
- 🟢 `toolkit clippy`
- 🟢 `toolkit test`
- 🟢 `toolkit test stdlib`

# After Submitting
- [ ] release notes

---------

Co-authored-by: Ian Manske <ian.manske@pm.me>
2024-05-22 16:55:24 +00:00
Devyn Cairns
758c5d447a
Add support for the ps command on FreeBSD, NetBSD, and OpenBSD (#12892)
# Description

I feel like it's a little sad that BSDs get to enjoy almost everything
other than the `ps` command, and there are some tests that rely on this
command, so I figured it would be fun to patch that and make it work.

The different BSDs have diverged from each other somewhat, but generally
have a similar enough API for reading process information via
`sysctl()`, with some slightly different args.

This supports FreeBSD with the `freebsd` module, and NetBSD and OpenBSD
with the `netbsd` module. OpenBSD is a fork of NetBSD and the interface
has some minor differences but many things are the same.

I had wanted to try to support DragonFlyBSD too, but their Rust version
in the latest release is only 1.72.0, which is too old for me to want to
try to compile rustc up to 1.77.2... but I will revisit this whenever
they do update it. Dragonfly is a fork of FreeBSD, so it's likely to be
more or less the same - I just don't want to enable it without testing
it.

Fixes #6862 (partially, we probably won't be adding `zfs list`)

# User-Facing Changes
`ps` added for FreeBSD, NetBSD, and OpenBSD.

# Tests + Formatting
The CI doesn't run tests for BSDs, so I'm not entirely sure if
everything was already passing before. (Frankly, it's unlikely.) But
nothing appears to be broken.

# After Submitting
- [ ] release notes?
- [ ] DragonflyBSD, whenever they do update Rust to something close
enough for me to try it
2024-05-22 08:13:45 -07:00
Ian Manske
905e3d0715
Remove dataframes crate and feature (#12889)
# Description
Removes the old `nu-cmd-dataframe` crate in favor of the polars plugin.
As such, this PR also removes the `dataframe` feature, related CI, and
full releases of nushell.
2024-05-20 17:22:08 +00:00
Ian Manske
c98960d053
Take owned Read and Write (#12909)
# Description
As @YizhePKU pointed out, the [Rust API
guidelines](https://rust-lang.github.io/api-guidelines/interoperability.html#generic-readerwriter-functions-take-r-read-and-w-write-by-value-c-rw-value)
recommend that generic functions take readers and writers by value and
not by reference. This PR changes `copy_with_interupt` and few other
places to take owned `Read` and `Write` instead of mutable references.
2024-05-20 15:10:36 +02:00
Devyn Cairns
c61075e20e
Add string/binary type color to ByteStream (#12897)
# Description

This PR allows byte streams to optionally be colored as being
specifically binary or string data, which guarantees that they'll be
converted to `Binary` or `String` appropriately on `into_value()`,
making them compatible with `Type` guarantees. This makes them
significantly more broadly usable for command input and output.

There is still an `Unknown` type for byte streams coming from external
commands, which uses the same behavior as we previously did where it's a
string if it's UTF-8.

A small number of commands were updated to take advantage of this, just
to prove the point. I will be adding more after this merges.

# User-Facing Changes
- New types in `describe`: `string (stream)`, `binary (stream)`
- These commands now return a stream if their input was a stream:
  - `into binary`
  - `into string`
  - `bytes collect`
  - `str join`
  - `first` (binary)
  - `last` (binary)
  - `take` (binary)
  - `skip` (binary)
- Streams that are explicitly binary colored will print as a streaming
hexdump
  - example:
    ```nushell
    1.. | each { into binary } | bytes collect
    ```

# Tests + Formatting
I've added some tests to cover it at a basic level, and it doesn't break
anything existing, but I do think more would be nice. Some of those will
come when I modify more commands to stream.

# After Submitting
There are a few things I'm not quite satisfied with:

- **String trimming behavior.** We automatically trim newlines from
streams from external commands, but I don't think we should do this with
internal commands. If I call a command that happens to turn my string
into a stream, I don't want the newline to suddenly disappear. I changed
this to specifically do it only on `Child` and `File`, but I don't know
if this is quite right, and maybe we should bring back the old flag for
`trim_end_newline`
- **Known binary always resulting in a hexdump.** It would be nice to
have a `print --raw`, so that we can put binary data on stdout
explicitly if we want to. This PR doesn't change how external commands
work though - they still dump straight to stdout.

Otherwise, here's the normal checklist:

- [ ] release notes
- [ ] docs update for plugin protocol changes (added `type` field)

---------

Co-authored-by: Ian Manske <ian.manske@pm.me>
2024-05-20 00:35:32 +00:00
Ian Manske
baeba19b22
Make get_full_help take &dyn Command (#12903)
# Description
Changes `get_full_help` to take a `&dyn Command` instead of multiple
arguments (`&Signature`, `&Examples` `is_parser_keyword`). All of these
arguments can be gathered from a `Command`, so there is no need to pass
the pieces to `get_full_help`.

This PR also fixes an issue where the search terms are not shown if
`--help` is used on a command.
2024-05-19 19:56:33 +02:00
Ian Manske
474293bf1c
Clear environment for child Commands (#12901)
# Description
There is a bug when `hide-env` is used on environment variables that
were present at shell startup. Namely, child processes still inherit the
hidden environment variable. This PR fixes #12900, fixes #11495, and
fixes #7937.

# Tests + Formatting
Added a test.
2024-05-19 15:35:07 +00:00
Ian Manske
cc9f41e553
Use CommandType in more places (#12832)
# Description
Kind of a vague title, but this PR does two main things:
1. Rather than overriding functions like `Command::is_parser_keyword`,
this PR instead changes commands to override `Command::command_type`.
The `CommandType` returned by `Command::command_type` is then used to
automatically determine whether `Command::is_parser_keyword` and the
other `is_{type}` functions should return true. These changes allow us
to remove the `CommandType::Other` case and should also guarantee than
only one of the `is_{type}` functions on `Command` will return true.
2. Uses the new, reworked `Command::command_type` function in the `scope
commands` and `which` commands.


# User-Facing Changes
- Breaking change for `scope commands`: multiple columns (`is_builtin`,
`is_keyword`, `is_plugin`, etc.) have been merged into the `type`
column.
- Breaking change: the `which` command can now report `plugin` or
`keyword` instead of `built-in` in the `type` column. It may also now
report `external` instead of `custom` in the `type` column for known
`extern`s.
2024-05-18 23:37:31 +00:00
Ian Manske
580c60bb82
Preserve metadata in more places (#12848)
# Description
This PR makes some commands and areas of code preserve pipeline
metadata. This is in an attempt to make the issue described in #12599
and #9456 less likely to occur. That is, reading and writing to the same
file in a pipeline will result in an empty file. Since we preserve
metadata in more places now, there will be a higher chance that we
successfully detect this error case and abort the pipeline.
2024-05-17 17:59:32 +00:00
Devyn Cairns
c10aa2cf09
collect: don't require a closure (#12788)
# Description

This changes the `collect` command so that it doesn't require a closure.
Still allowed, optionally.

Before:

```nushell
open foo.json | insert foo bar | collect { save -f foo.json }
```

After:

```nushell
open foo.json | insert foo bar | collect | save -f foo.json
```

The closure argument isn't really necessary, as collect values are also
supported as `PipelineData`.

# User-Facing Changes
- `collect` command changed

# Tests + Formatting
Example changed to reflect.

# After Submitting
- [ ] release notes
- [ ] we may want to deprecate the closure arg?
2024-05-17 18:46:03 +02:00
Wind
8adf3406e5
allow define it as a variable inside closure (#12888)
# Description
Fixes: #12690 

The issue is happened after
https://github.com/nushell/nushell/pull/12056 is merged. It will raise
error if user doesn't supply required parameter when run closure with
do.
And parser adds a `$it` parameter when parsing closure or block
expression.

I believe the previous behavior is because we allow such syntax on
previous version(0.44):
```nushell
let x = { print $it }
```
But it's no longer allowed after 0.60.  So I think they can be removed.

# User-Facing Changes
```nushell
let tmp = {
  let it = 42
  print $it
}

do -c $tmp
```
should be possible again.

# Tests + Formatting
Added 1 test
2024-05-17 00:03:13 +00:00
Ian Manske
6891267b53
Support ByteStreams in bytes starts-with and bytes ends-with (#12887)
# Description
Restores `bytes starts-with` so that it is able to work with byte
streams once again. For parity/consistency, this PR also adds byte
stream support to `bytes ends-with`.

# User-Facing Changes
- `bytes ends-with` now supports byte streams.

# Tests + Formatting
Re-enabled tests for `bytes starts-with` and added tests for `bytes
ends-with`.
2024-05-17 07:59:08 +08:00
Ian Manske
aec41f3df0
Add Span merging functions (#12511)
# Description
This PR adds a few functions to `Span` for merging spans together:
- `Span::append`: merges two spans that are known to be in order.
- `Span::concat`: returns a span that encompasses all the spans in a
slice. The spans must be in order.
- `Span::merge`: merges two spans (no order necessary).
- `Span::merge_many`: merges an iterator of spans into a single span (no
order necessary).

These are meant to replace the free-standing `nu_protocol::span`
function.

The spans in a `LiteCommand` (the `parts`) should always be in order
based on the lite parser and lexer. So, the parser code sees the most
usage of `Span::append` and `Span::concat` where the order is known. In
other code areas, `Span::merge` and `Span::merge_many` are used since
the order between spans is often not known.
2024-05-16 22:34:49 +00:00
Ian Manske
6fd854ed9f
Replace ExternalStream with new ByteStream type (#12774)
# Description
This PR introduces a `ByteStream` type which is a `Read`-able stream of
bytes. Internally, it has an enum over three different byte stream
sources:
```rust
pub enum ByteStreamSource {
    Read(Box<dyn Read + Send + 'static>),
    File(File),
    Child(ChildProcess),
}
```

This is in comparison to the current `RawStream` type, which is an
`Iterator<Item = Vec<u8>>` and has to allocate for each read chunk.

Currently, `PipelineData::ExternalStream` serves a weird dual role where
it is either external command output or a wrapper around `RawStream`.
`ByteStream` makes this distinction more clear (via `ByteStreamSource`)
and replaces `PipelineData::ExternalStream` in this PR:
```rust
pub enum PipelineData {
    Empty,
    Value(Value, Option<PipelineMetadata>),
    ListStream(ListStream, Option<PipelineMetadata>),
    ByteStream(ByteStream, Option<PipelineMetadata>),
}
```

The PR is relatively large, but a decent amount of it is just repetitive
changes.

This PR fixes #7017, fixes #10763, and fixes #12369.

This PR also improves performance when piping external commands. Nushell
should, in most cases, have competitive pipeline throughput compared to,
e.g., bash.
| Command | Before (MB/s) | After (MB/s) | Bash (MB/s) |
| -------------------------------------------------- | -------------:|
------------:| -----------:|
| `throughput \| rg 'x'` | 3059 | 3744 | 3739 |
| `throughput \| nu --testbin relay o> /dev/null` | 3508 | 8087 | 8136 |

# User-Facing Changes
- This is a breaking change for the plugin communication protocol,
because the `ExternalStreamInfo` was replaced with `ByteStreamInfo`.
Plugins now only have to deal with a single input stream, as opposed to
the previous three streams: stdout, stderr, and exit code.
- The output of `describe` has been changed for external/byte streams.
- Temporary breaking change: `bytes starts-with` no longer works with
byte streams. This is to keep the PR smaller, and `bytes ends-with`
already does not work on byte streams.
- If a process core dumped, then instead of having a `Value::Error` in
the `exit_code` column of the output returned from `complete`, it now is
a `Value::Int` with the negation of the signal number.

# After Submitting
- Update docs and book as necessary
- Release notes (e.g., plugin protocol changes)
- Adapt/convert commands to work with byte streams (high priority is
`str length`, `bytes starts-with`, and maybe `bytes ends-with`).
- Refactor the `tee` code, Devyn has already done some work on this.

---------

Co-authored-by: Devyn Cairns <devyn.cairns@gmail.com>
2024-05-16 07:11:18 -07:00
Ian Manske
06fe7d1e16
Remove usages of Call::positional_nth (#12871)
# Description
Following from #12867, this PR replaces usages of `Call::positional_nth`
with existing spans. This removes several `expect`s from the code.

Also remove unused `positional_nth_mut` and `positional_iter_mut`
2024-05-15 19:59:42 +02:00
NotTheDr01ds
b08135d877
Fixed small error in the help-examples for the get command (#12877)
# Description

Another small error in Help, this time for the `get` command example.

# User-Facing Changes

Help only
2024-05-15 19:49:08 +02:00
Ian Manske
0cfbdc909e
Fix sys panic (#12846)
# Description
This should fix #10155 where the `sys` command can panic due to date
math in certain cases / on certain systems.

# User-Facing Changes
The `boot_time` column now has a date value instead of a formatted date
string. This is technically a breaking change.
2024-05-15 15:40:04 +08:00
Ian Manske
c3da44cbb7
Fix char panic (#12867)
# Description
The `char` command can panic due to a failed `expect`: `char --integer
...[77 78 79]`

This PR fixes the panic for the `--integer` flag and also the
`--unicode` flag.

# After Submitting
Check other commands and places where similar bugs can occur due to
usages of `Call::positional_nth` and related methods.
2024-05-14 21:10:06 +00:00
NotTheDr01ds
aa46bc97b3
Search terms for compact command (#12864)
# Description

There was a question in Discord today about how to remove empty rows
from a table. The user found the `compact` command on their own, but I
realized that there were no search terms on the command. I've added
'empty' and 'remove', although I subsequently figured out that 'empty'
is found in the "usage" anyway. That said, I don't think it hurts to
have good search terms behind it regardless.

# User-Facing Changes

Just the help

# Tests + Formatting

- 🟢 `toolkit fmt`
- 🟢 `toolkit clippy`
- 🟢 `toolkit test`
- 🟢 `toolkit test stdlib`

# After Submitting
2024-05-14 09:21:50 -05:00
Maxime Jacob
cd381b74e0
Fix improperly escaped strings in stor insert (#12820)
- fixes #12764 

Replaced the custom logic with values_to_sql method that is already used
in crate::database.
This will ensure that handling of parameters is the same between sqlite
and stor.
2024-05-13 20:22:39 -05:00
NotTheDr01ds
c70c43aae9
Add example and search term for 'repeat' to the fill command (#12844)
# Description

It's commonly forgotten or overlooked that a lot of `std repeat`
functionality can be handled with the built-in `fill`. Added 'repeat` as
a search term for `fill` to improve discoverability.

Also replaced one of the existing examples with one `fill`ing an empty
string, a la `repeat`. There were 6 examples already, and 3 of them
pretty much were variations on the same theme, so I repurposed one of
those rather than adding a 7th.

# User-Facing Changes

Changes to `help` only

# Tests + Formatting

- 🟢 `toolkit fmt`
- 🟢 `toolkit clippy`
- 🟢 `toolkit test`
- 🟢 `toolkit test stdlib`

# After Submitting

I assume the "Commands" doc is auto-generated from the `help`, but I'll
double-check that assumption.
2024-05-12 20:55:07 -05:00
Ian Manske
30fc832035
Fix custom converters with save (#12833)
# Description
Fixes #10429 where `save` fails if a custom command is used as the file
format converter.

# Tests + Formatting
Added a test.
2024-05-12 13:19:28 +02:00
Brage Ingebrigtsen
075535f869
remove --not flag for 'str contains' (#12837)
# Description
This PR resolves an inconsistency between different `str` subcommands,
notably `str contains`, `str starts-with` and `str ends-with`. Only the
`str contains` command has the `--not` flag and a desicion was made in
this #12781 PR to remove the `--not` flag and use the `not` operator
instead.

Before:
`"blob" | str contains --not o`
After:
`not ("blob" | str contains o)` OR `"blob" | str contains o | not $in`

> Note, you can currently do all three, but the first will be broken
after this PR is merged.

# User-Facing Changes
- remove `--not(-n)` flag from `str contains` command
  - This is a breaking change!

# Tests + Formatting
- [x] Added tests
- [x] Ran `cargo fmt --all`
- [x] Ran `cargo clippy --workspace -- -D warnings -D
clippy::unwrap_used`
- [x] Ran `cargo test --workspace`
- [ ] Ran `cargo run -- -c "use std testing; testing run-tests --path
crates/nu-std"`
    - I was unable to get this working.
```
Error: nu::parser::export_not_found

  × Export not found.
   ╭─[source:1:9]
 1 │ use std testing; testing run-tests --path crates/nu-std
   ·         ───┬───
   ·            ╰── could not find imports
   ╰────
```
^ I still can't figure out how to make this work 😂 

# After Submitting
Requires update of documentation
2024-05-11 23:13:36 +00:00
Ian Manske
cab86f49c0
Fix pipe redirection into complete (#12818)
# Description
Fixes #12796 where a combined out and err pipe redirection (`o+e>|`)
into `complete` still provides separate `stdout` and `stderr` columns in
the record. Now, the combined output will be in the `stdout` column.
This PR also fixes a similar error with the `e>|` pipe redirection.

# Tests + Formatting
Added two tests.
2024-05-11 15:32:00 +00:00
YizhePKU
b9a7faad5a
Implement PWD recovery (#12779)
This PR has two parts. The first part is the addition of the
`Stack::set_pwd()` API. It strips trailing slashes from paths for
convenience, but will reject otherwise bad paths, leaving PWD in a good
state. This should reduce the impact of faulty code incorrectly trying
to set PWD.
(https://github.com/nushell/nushell/pull/12760#issuecomment-2095393012)

The second part is implementing a PWD recovery mechanism. PWD can become
bad even when we did nothing wrong. For example, Unix allows you to
remove any directory when another process might still be using it, which
means PWD can just "disappear" under our nose. This PR makes it possible
to use `cd` to reset PWD into a good state. Here's a demonstration:

```sh
mkdir /tmp/foo
cd /tmp/foo

# delete "/tmp/foo" in a subshell, because Nushell is smart and refuse to delete PWD
nu -c 'cd /; rm -r /tmp/foo'

ls          # Error:   × $env.PWD points to a non-existent directory
            # help: Use `cd` to reset $env.PWD into a good state

cd /
pwd         # prints /
```

Also, auto-cd should be working again.
2024-05-10 11:06:33 -05:00
Ian Manske
3b3f48202c
Refactor message printing in rm (#12799)
# Description
Changes the iterator in `rm` to be an iterator over
`Result<Option<String>, ShellError>` (an optional message or error)
instead of an iterator over `Value`. Then, the iterator is consumed and
each message is printed. This allows the
`PipelineData::print_not_formatted` method to be removed.
2024-05-09 13:36:47 +08:00
Stefan Holderbach
ba6f38510c
Shrink Value by boxing Range/Closure (#12784)
# Description
On 64-bit platforms the current size of `Value` is 56 bytes. The
limiting variants were `Closure` and `Range`. Boxing the two reduces the
size of Value to 48 bytes. This is the minimal size possible with our
current 16-byte `Span` and any 24-byte `Vec` container which we use in
several variants. (Note the extra full 8-bytes necessary for the
discriminant or other smaller values due to the 8-byte alignment of
`usize`)

This is leads to a size reduction of ~15% for `Value` and should overall
be beneficial as both `Range` and `Closure` are rarely used compared to
the primitive types or even our general container types.

# User-Facing Changes
Less memory used, potential runtime benefits.

(Too late in the evening to run the benchmarks myself right now)
2024-05-09 08:10:58 +08:00
Ian Manske
3b26c08dab
Refactor parse command (#12791)
# Description
- Switches the `excess` in the `ParserStream` and
`ParseStreamerExternal` types from a `Vec` to a `VecDeque`
- Removes unnecessary clones to `stream_helper`
- Other simplifications and loop restructuring
- Merges the `ParseStreamer` and `ParseStreamerExternal` types into a
common `ParseIter`
- `parse` now streams for list values
2024-05-08 06:50:58 -05:00
YizhePKU
7a86b98f61
Migrate to a new PWD API (part 2) (#12749)
Refer to #12603 for part 1.

We need to be careful when migrating to the new API, because the new API
has slightly different semantics (PWD can contain symlinks). This PR
handles the "obviously safe" part of the migrations. Namely, it handles
two specific use cases:

* Passing PWD into `canonicalize_with()`
* Passing PWD into `EngineState::merge_env()`

The first case is safe because symlinks are canonicalized away. The
second case is safe because `EngineState::merge_env()` only uses PWD to
call `std::env::set_current_dir()`, which shouldn't affact Nushell. The
commit message contains detailed stats on the updated files.

Because these migrations touch a lot of files, I want to keep these PRs
small to avoid merge conflicts.
2024-05-07 18:17:49 +03:00
Ian Manske
b9331d1b08
Add sys users command (#12787)
# Description
Add a new `sys users` command which returns a table of the users of the
system. This is the same table that is currently present as
`(sys).host.sessions`. The same table has been removed from the recently
added `sys host` command.

# User-Facing Changes
Adds a new command. (The old `sys` command is left as is.)
2024-05-07 07:52:02 -05:00
Ian Manske
1038c64f80
Add sys subcommands (#12747)
# Description
Adds subcommands to `sys` corresponding to each column of the record
returned by `sys`. This is to alleviate the fact that `sys` now returns
a regular record, meaning that it must compute every column which might
take a noticeable amount of time. The subcommands, on the other hand,
only need to compute and return a subset of the data which should be
much faster. In fact, it should be as fast as before, since this is how
the lazy record worked (it would compute only each column as necessary).

I choose to add subcommands instead of having an optional cell-path
parameter on `sys`, since the cell-path parameter would:
- increase the code complexity (can access any value at any row or
nested column)
- prevents discovery with tab-completion
- hinders type checking and allows users to pass potentially invalid
columns

# User-Facing Changes
Deprecates `sys` in favor of the new `sys` subcommands.
2024-05-06 23:20:27 +00:00
Wind
460a1c8f87
Allow ls works inside dir with [] brackets (#12625)
# Description
Fixes: #12429

To fix the issue, we need to pass the `input pattern` itself to
`glob_from` function, but currently on latest main, nushell pass
`expanded path of input pattern` to `glob_from` function.
It causes globbing failed if expanded path includes `[]` brackets.

It's a pity that I have to duplicate `nu_engine::glob_from` function
into `ls`, because `ls` might convert from `NuGlob::NotExpand` to
`NuGlob::Expand`, in that case, `nu_engine::glob_from` won't work if
user want to ls for a directory which includes tilde:
```
mkdir "~abc"
ls "~abc"
```
So I need to duplicate `glob_from` function and pass original
`expand_tilde` information.

# User-Facing Changes
Nan

# Tests + Formatting
Done

# After Submitting
Nan
2024-05-06 14:01:32 +08:00
Ian Manske
e879d4ecaf
ListStream touchup (#12524)
# Description

Does some misc changes to `ListStream`:
- Moves it into its own module/file separate from `RawStream`.
- `ListStream`s now have an associated `Span`.
- This required changes to `ListStreamInfo` in `nu-plugin`. Note sure if
this is a breaking change for the plugin protocol.
- Hides the internals of `ListStream` but also adds a few more methods.
- This includes two functions to more easily alter a stream (these take
a `ListStream` and return a `ListStream` instead of having to go through
the whole `into_pipeline_data(..)` route).
  -  `map`: takes a `FnMut(Value) -> Value`
  - `modify`: takes a function to modify the inner stream.
2024-05-05 16:00:59 +00:00
Viktor Szépe
8eefb7313e
Minimize future false positive typos (#12751)
# Description

Make typos config more strict: ignore false positives where they occur.

1. Ignore only files with typos
2. Add regexp-s with context
3. Ignore variable names only in Rust code
4. Ignore only 1 "identifier"
5. Check dot files

🎁 Extra bonus: fix typos!!
2024-05-04 15:00:44 +00:00
Maxime Jacob
3ae6fe2114
Enable columns with spaces for into_sqlite by adding quotes to column names (#12759)
# Description
Spaces were causing an issue with into_sqlite when they appeared in
column names.

This is because the column names were not properly wrapped with
backticks that allow sqlite to properly interpret the column.

The issue has been addressed by adding backticks to the column names of
into sqlite. The output of the column names when using open is
unchanged, and the column names appear without backticks as expected.

fixes #12700 

# User-Facing Changes
N/A

# Tests + Formatting
Formatting has been respected.

Repro steps from the issue have been done, and ran multiple times. New
values get added to the correct columns as expected.
2024-05-04 08:12:44 -05:00
Ian Manske
1e71cd4777
Bump base64 to 0.22.1 (#12757)
# Description
Bumps `base64` to 0.22.1 which fixes the alphabet used for binhex
encoding and decoding. This required updating some test expected output.

Related to PR #12469 where `base64` was also bumped and ran into the
failing tests.

# User-Facing Changes
Bug fix, but still changes binhex encoding and decoding output.

# Tests + Formatting
Updated test expected output.
2024-05-04 15:56:16 +03:00
Devyn Cairns
709b2479d9
Fix trailing slash in PWD set by cd (#12760)
# Description

Fixes #12758.

#12662 introduced a bug where calling `cd` with a path with a trailing
slash would cause `PWD` to be set to a path including a trailing slash,
which is not allowed. This adds a helper to `nu_path` to remove this,
and uses it in the `cd` command to clean it up before setting `PWD`.

# Tests + Formatting
I added some tests to make sure we don't regress on this in the future.

- 🟢 `toolkit fmt`
- 🟢 `toolkit clippy`
- 🟢 `toolkit test`
- 🟢 `toolkit test stdlib`
2024-05-04 12:38:37 +03:00
Stefan Holderbach
406df7f208
Avoid taking unnecessary ownership of intermediates (#12740)
# Description

Judiciously try to avoid allocations/clone by changing the signature of
functions

- **Don't pass str by value unnecessarily if only read**
- **Don't require a vec in `Sandbox::with_files`**
- **Remove unnecessary string clone**
- **Fixup unnecessary borrow**
- **Use `&str` in shape color instead**
- **Vec -> Slice**
- **Elide string clone**
- **Elide `Path` clone**
- **Take &str to elide clone in tests**

# User-Facing Changes
None

# Tests + Formatting
This touches many tests purely in changing from owned to borrowed/static
data
2024-05-04 00:53:15 +00:00
Raphael Gaschignard
eff7f33086
Report errors that occur on file operations in ls (#12033)
Currently errors just create empty entries inside of resulting
dataframes.

This changeset is meant to help debug #12004, though generally speaking
I do think it's worth having ways to make errors be visible in this kind
of pipeline be visible

An example of what this looks like

<img width="954" alt="image"
src="https://github.com/nushell/nushell/assets/1408472/2c3c9167-2aaf-4f87-bab5-e8302d7a1170">
2024-05-03 10:12:43 -05:00
YizhePKU
bdb6daa4b5
Migrate to a new PWD API (#12603)
This is the first PR towards migrating to a new `$env.PWD` API that
returns potentially un-canonicalized paths. Refer to PR #12515 for
motivations.

## New API: `EngineState::cwd()`

The goal of the new API is to cover both parse-time and runtime use
case, and avoid unintentional misuse. It takes an `Option<Stack>` as
argument, which if supplied, will search for `$env.PWD` on the stack in
additional to the engine state. I think with this design, there's less
confusion over parse-time and runtime environments. If you have access
to a stack, just supply it; otherwise supply `None`.

## Deprecation of other PWD-related APIs

Other APIs are re-implemented using `EngineState::cwd()` and properly
documented. They're marked deprecated, but their behavior is unchanged.
Unused APIs are deleted, and code that accesses `$env.PWD` directly
without using an API is rewritten.

Deprecated APIs:

* `EngineState::current_work_dir()`
* `StateWorkingSet::get_cwd()`
* `env::current_dir()`
* `env::current_dir_str()`
* `env::current_dir_const()`
* `env::current_dir_str_const()`

Other changes:

* `EngineState::get_cwd()` (deleted)
* `StateWorkingSet::list_env()` (deleted)
* `repl::do_run_cmd()` (rewritten with `env::current_dir_str()`)

## `cd` and `pwd` now use logical paths by default

This pulls the changes from PR #12515. It's currently somewhat broken
because using non-canonicalized paths exposed a bug in our path
normalization logic (Issue #12602). Once that is fixed, this should
work.

## Future plans

This PR needs some tests. Which test helpers should I use, and where
should I put those tests?

I noticed that unquoted paths are expanded within `eval_filepath()` and
`eval_directory()` before they even reach the `cd` command. This means
every paths is expanded twice. Is this intended?

Once this PR lands, the plan is to review all usages of the deprecated
APIs and migrate them to `EngineState::cwd()`. In the meantime, these
usages are annotated with `#[allow(deprecated)]` to avoid breaking CI.

---------

Co-authored-by: Jakub Žádník <kubouch@gmail.com>
2024-05-03 14:33:09 +03:00
Ian Manske
847646e44e
Remove lazy records (#12682)
# Description
Removes lazy records from the language, following from the reasons
outlined in #12622. Namely, this should make semantics more clear and
will eliminate concerns regarding maintainability.

# User-Facing Changes
- Breaking change: `lazy make` is removed.
- Breaking change: `describe --collect-lazyrecords` flag is removed.
- `sys` and `debug info` now return regular records.

# After Submitting
- Update nushell book if necessary.
- Explore new `sys` and `debug info` APIs to prevent them from taking
too long (e.g., subcommands or taking an optional column/cell-path
argument).
2024-05-03 08:36:10 +08:00
Stefan Holderbach
b88d8726d0
Rework for new clippy lints (#12736)
- **Clippy lint `assigning_clones`**
- **Clippy lint `legacy_numeric_constants`**
- **`clippy::float_equality_without_abs`**
- **`nu-table`: clippy::zero_repeat_side_effects**

---------

Co-authored-by: Ian Manske <ian.manske@pm.me>
2024-05-02 19:29:03 +02:00
Darren Schroeder
0805f1fd90
overhaul shell_integration to enable individual control over ansi escape sequences (#12629)
# Description

This PR overhauls the shell_integration system by allowing individual
control over which ansi escape sequences are used. As we continue to
broaden our support for more ansi escape sequences, we can't really have
an all-or-nothing strategy. Some ansi escapes cause problems in certain
operating systems or terminals. We should allow the user to choose which
escapes they want.

TODO:
* Gather feedback
* Should osc7, osc9_9 and osc633p be mutually exclusive?
* Is the naming convention for these settings too nerdy osc2, osc7, etc?

closes #11301

# User-Facing Changes
shell_integration is no longer a boolean value. This is what is
supported in the default_config.nu
```nushell
  shell_integration: {
    # osc2 abbreviates the path if in the home_dir, sets the tab/window title, shows the running command in the tab/window title
    osc2: true
    # osc7 is a way to communicate the path to the terminal, this is helpful for spawning new tabs in the same directory
    osc7: true
    # osc8 is also implemented as the deprecated setting ls.show_clickable_links, it shows clickable links in ls output if your terminal supports it
    osc8: true
    # osc9_9 is from ConEmu and is starting to get wider support. It's similar to osc7 in that it communicates the path to the terminal
    osc9_9: false
    # osc133 is several escapes invented by Final Term which include the supported ones below.
    # 133;A - Mark prompt start
    # 133;B - Mark prompt end
    # 133;C - Mark pre-execution
    # 133;D;exit - Mark execution finished with exit code
    # This is used to enable terminals to know where the prompt is, the command is, where the command finishes, and where the output of the command is
    osc133: true
    # osc633 is closely related to osc133 but only exists in visual studio code (vscode) and supports their shell integration features
    # 633;A - Mark prompt start
    # 633;B - Mark prompt end
    # 633;C - Mark pre-execution
    # 633;D;exit - Mark execution finished with exit code
    # 633;E - NOT IMPLEMENTED - Explicitly set the command line with an optional nonce
    # 633;P;Cwd=<path> - Mark the current working directory and communicate it to the terminal
    # and also helps with the run recent menu in vscode
    osc633: true
    # reset_application_mode is escape \x1b[?1l and was added to help ssh work better
    reset_application_mode: true
  }
```

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use std testing; testing run-tests --path
crates/nu-std"` to run the tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->
2024-05-02 09:56:50 -04:00
Darren Schroeder
8ed0d84d6a
add raw-string literal support (#9956)
# Description

This PR adds raw string support by using `r#` at the beginning of single
quoted strings and `#` at the end.

Notice that escapes do not process, even within single quotes,
parentheses don't mean anything, $variables don't mean anything. It's
just a string.
```nushell
❯ echo r#'one\ntwo (blah) ($var)'#
one\ntwo (blah) ($var)
```
Notice how they work without `echo` or `print` and how they work without
carriage returns.
```nushell
❯ r#'adsfa'#
adsfa
❯ r##"asdfa'@qpejq'##
asdfa'@qpejq
❯ r#'asdfasdfasf
∙ foqwejfqo@'23rfjqf'#
```
They also have a special configurable color in the repl. (use single
quotes though)

![image](https://github.com/nushell/nushell/assets/343840/8780e21d-de4c-45b3-9880-2425f5fe10ef)

They should work like rust raw literals and allow `r##`, `r###`,
`r####`, etc, to help with having one or many `#`'s in the middle of
your raw-string.

They should work with `let` as well.

```nushell
r#'some\nraw\nstring'# | str upcase
```

closes https://github.com/nushell/nushell/issues/5091
# User-Facing Changes
<!-- List of all changes that impact the user experience here. This
helps us keep track of breaking changes. -->

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used -A
clippy::needless_collect -A clippy::result_large_err` to check that
you're using the standard code style
- `cargo test --workspace` to check that all tests pass
- `cargo run -- -c "use std testing; testing run-tests --path
crates/nu-std"` to run the tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->

---------

Co-authored-by: WindSoilder <WindSoilder@outlook.com>
Co-authored-by: Ian Manske <ian.manske@pm.me>
2024-05-02 09:36:37 -04:00
Devyn Cairns
2970d48d41
Make bytes build accept integer values as individual bytes (#12685)
# Description
This creates an option for building binary data from byte integers.
Previously I think you could only do this by formatting the integers to
hex and using `decode hex`.

One potentially confusing thing is that this is different from the `into
binary` behavior. But since this doesn't support any of the other `into
binary` behaviors, it might be okay.

# User-Facing Changes
- `bytes build` accepts single byte arguments as integers

# Tests + Formatting
Example added.

# After Submitting
- [ ] release notes
2024-05-01 17:29:33 -05:00
pwygab
b22d131279
Prevent each from swallowing errors when eval_block returns a ListStream (#12412)
<!--
if this PR closes one or more issues, you can automatically link the PR
with
them by using one of the [*linking
keywords*](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword),
e.g.
- this PR should close #xxxx
- fixes #xxxx

you can also mention related issues, PRs or discussions!
-->

# Description
<!--
Thank you for improving Nushell. Please, check our [contributing
guide](../CONTRIBUTING.md) and talk to the core team before making major
changes.

Description of your pull request goes here. **Provide examples and/or
screenshots** if your changes affect the user experience.
-->

Prior, it seemed that nested errors would not get detected and shown.
This PR fixes that.

Resolves #10176:
```
~/CodingProjects/nushell> [[1,2]] | each {|x| $x | each {|y| error make {msg: "oh noes"} } }                        05/04/2024 21:34:08
Error: nu:🐚:eval_block_with_input

  × Eval block failed with pipeline input
   ╭─[entry #1:1:3]
 1 │ [[1,2]] | each {|x| $x | each {|y| error make {msg: "oh noes"} } }
   ·   ┬
   ·   ╰── source value
   ╰────

Error:   × oh noes
   ╭─[entry #1:1:36]
 1 │ [[1,2]] | each {|x| $x | each {|y| error make {msg: "oh noes"} } }
   ·                                    ─────┬────
   ·                                         ╰── originates from here
   ╰────
```

Resolves #11224:
```
~/CodingProjects/nushell> [0] | each { |_|                                                                          05/04/2024 21:35:40
:::     [0] | each { |_|
:::         non-existent-command
:::     }
::: }
Error: nu:🐚:eval_block_with_input

  × Eval block failed with pipeline input
   ╭─[entry #1:2:6]
 1 │ [0] | each { |_|
 2 │     [0] | each { |_|
   ·      ┬
   ·      ╰── source value
 3 │         non-existent-command
   ╰────

Error: nu:🐚:external_command

  × External command failed
   ╭─[entry #1:3:9]
 2 │     [0] | each { |_|
 3 │         non-existent-command
   ·         ──────────┬─────────
   ·                   ╰── executable was not found
 4 │     }
   ╰────
  help: No such file or directory (os error 2)
```

# User-Facing Changes
<!-- List of all changes that impact the user experience here. This
helps us keep track of breaking changes. -->

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use std testing; testing run-tests --path
crates/nu-std"` to run the tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->
2024-05-01 17:24:54 -05:00
Devyn Cairns
21ebdfe8d7
Bump version to 0.93.1 (#12710)
# Description

Next patch/dev release, `0.93.1`
2024-05-01 17:19:20 -05:00