Commit Graph

57 Commits

Author SHA1 Message Date
f4136aa3f4 Add pipeline span to metadata (#16014)
# Description

This PR makes the span of a pipeline accessible through `metadata`,
meaning it's possible to get the span of a pipeline without collecting
it.

Examples:
```nushell
ls | metadata
# => ╭────────┬────────────────────╮
# => │        │ ╭───────┬────────╮ │
# => │ span   │ │ start │ 170218 │ │
# => │        │ │ end   │ 170220 │ │
# => │        │ ╰───────┴────────╯ │
# => │ source │ ls                 │
# => ╰────────┴────────────────────╯
```

```nushell
ls | metadata access {|meta|
  error make {msg: "error", label: {text: "here", span: $meta.span}}
}
# => Error:   × error
# =>    ╭─[entry #7:1:1]
# =>  1 │ ls | metadata access {|meta|
# =>    · ─┬
# =>    ·  ╰── here
# =>  2 │   error make {msg: "error", label: {text: "here", span: $meta.span}}
# =>    ╰────
```

Here's an example that wouldn't be possible before, since you would have
to use `metadata $in` to get the span, collecting the (infinite) stream

```nushell
generate {|x=0| {out: 0, next: 0} } | metadata access {|meta|
  # do whatever with stream
  error make {msg: "error", label: {text: "here", span: $meta.span}}
}
# => Error:   × error
# =>    ╭─[entry #16:1:1]
# =>  1 │ generate {|x=0| {out: 0, next: 0} } | metadata access {|meta|
# =>    · ────┬───
# =>    ·     ╰── here
# =>  2 │   # do whatever with stream
# =>    ╰────
```

I haven't done the tests or anything yet since I'm not sure how we feel
about having this as part of the normal metadata, rather than a new
command like `metadata span` or something. We could also have a
`metadata access` like functionality for that with an optional closure
argument potentially.

# User-Facing Changes

* The span of a pipeline is now available through `metadata` and
`metadata access` without collecting a stream.

# Tests + Formatting

TODO

# After Submitting

N/A
2025-06-30 23:17:43 +02:00
833471241a Refactor: Construct IoError from std::io::Error instead of std::io::ErrorKind (#15777) 2025-05-18 14:52:40 +02:00
c2ac8f730e Rust 1.85, edition=2024 (#15741) 2025-05-13 16:49:30 +02:00
430b2746b8 Parse XML documents with DTDs by default, and add --disallow-dtd flag (#15272)
<!--
if this PR closes one or more issues, you can automatically link the PR
with
them by using one of the [*linking
keywords*](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword),
e.g.
- this PR should close #xxxx
- fixes #xxxx

you can also mention related issues, PRs or discussions!
-->

# Description
<!--
Thank you for improving Nushell. Please, check our [contributing
guide](../CONTRIBUTING.md) and talk to the core team before making major
changes.

Description of your pull request goes here. **Provide examples and/or
screenshots** if your changes affect the user experience.
-->
This PR allows `from xml` to parse XML documents with [document type
declarations](https://en.wikipedia.org/wiki/Document_type_declaration)
by default. This is especially notable since many HTML documents start
with `<!DOCTYPE html>`, and `roxmltree` should be able to parse some
simple HTML documents. The security concerns with DTDs are [XXE
attacks](https://en.wikipedia.org/wiki/XML_external_entity_attack), and
[exponential entity expansion
attacks](https://en.wikipedia.org/wiki/Billion_laughs_attack).
`roxmltree` [doesn't
support](d2c7801624/src/tokenizer.rs (L535-L547))
external entities (it parses them, but doesn't do anything with them),
so it is not vulnerable to XXE attacks. Additionally, `roxmltree` has
[some
safeguards](d2c7801624/src/parse.rs (L424-L452))
in place to prevent exponential entity expansion, so enabling DTDs by
default is relatively safe. The worst case is no worse than running
`loop {}`, so I think allowing DTDs by default is best, and DTDs can
still be disabled with `--disallow-dtd` if needed.

# User-Facing Changes
<!-- List of all changes that impact the user experience here. This
helps us keep track of breaking changes. -->
* Allows `from xml` to parse XML documents with [document type
declarations](https://en.wikipedia.org/wiki/Document_type_declaration)
by default, and adds a `--disallow-dtd` flag to disallow parsing
documents with DTDs.

This PR also improves the errors in `from xml` by pointing at the issue
in the XML source. Example:

```
$ open --raw foo.xml | from xml 
Error:   × Failed to parse XML
   ╭─[2:7]
 1 │ <html>
 2 │     <p<>hi</p>
   ·       ▲
   ·       ╰── Unexpected character <, expected a whitespace
 3 │ </html>
   ╰────
```

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use toolkit.nu; toolkit test stdlib"` to run the
tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->
N/A

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->
N/A
2025-03-12 08:09:55 -05:00
66bc0542e0 Refactor I/O Errors (#14927)
<!--
if this PR closes one or more issues, you can automatically link the PR
with
them by using one of the [*linking
keywords*](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword),
e.g.
- this PR should close #xxxx
- fixes #xxxx

you can also mention related issues, PRs or discussions!
-->

# Description
<!--
Thank you for improving Nushell. Please, check our [contributing
guide](../CONTRIBUTING.md) and talk to the core team before making major
changes.

Description of your pull request goes here. **Provide examples and/or
screenshots** if your changes affect the user experience.
-->

As mentioned in #10698, we have too many `ShellError` variants, with
some even overlapping in meaning. This PR simplifies and improves I/O
error handling by restructuring `ShellError` related to I/O issues.
Previously, `ShellError::IOError` only contained a message string,
making it convenient but overly generic. It was widely used without
providing spans (#4323).

This PR introduces a new `ShellError::Io` variant that consolidates
multiple I/O-related errors (except for `ShellError::NetworkFailure`,
which remains distinct for now). The new `ShellError::Io` variant
replaces the following:

- `FileNotFound`
- `FileNotFoundCustom`
- `IOInterrupted`
- `IOError`
- `IOErrorSpanned`
- `NotADirectory`
- `DirectoryNotFound`
- `MoveNotPossible`
- `CreateNotPossible`
- `ChangeAccessTimeNotPossible`
- `ChangeModifiedTimeNotPossible`
- `RemoveNotPossible`
- `ReadingFile`

## The `IoError`
`IoError` includes the following fields:

1. **`kind`**: Extends `std::io::ErrorKind` to specify the type of I/O
error without needing new `ShellError` variants. This aligns with the
approach used in `std::io::Error`. This adds a second dimension to error
reporting by combining the `kind` field with `ShellError` variants,
making it easier to describe errors in more detail. As proposed by
@kubouch in [#design-discussion on
Discord](https://discord.com/channels/601130461678272522/615329862395101194/1323699197165178930),
this helps reduce the number of `ShellError` variants. In the error
report, the `kind` field is displayed as the "source" of the error,
e.g., "I/O error," followed by the specific kind of I/O error.
2. **`span`**: A non-optional field to encourage providing spans for
better error reporting (#4323).
3. **`path`**: Optional `PathBuf` to give context about the file or
directory involved in the error (#7695). If provided, it’s shown as a
help entry in error reports.
4. **`additional_context`**: Allows adding custom messages when the
span, kind, and path are insufficient. This is rendered in the error
report at the labeled span.
5. **`location`**: Sometimes, I/O errors occur in the engine itself and
are not caused directly by user input. In such cases, if we don’t have a
span and must set it to `Span::unknown()`, we need another way to
reference the error. For this, the `location` field uses the new
`Location` struct, which records the Rust file and line number where the
error occurred. This ensures that we at least know the Rust code
location that failed, helping with debugging. To make this work, a new
`location!` macro was added, which retrieves `file!`, `line!`, and
`column!` values accurately. If `Location::new` is used directly, it
issues a warning to remind developers to use the macro instead, ensuring
consistent and correct usage.

### Constructor Behavior
`IoError` provides five constructor methods:
- `new` and `new_with_additional_context`: Used for errors caused by
user input and require a valid (non-unknown) span to ensure precise
error reporting.
- `new_internal` and `new_internal_with_path`: Used for internal errors
where a span is not available. These methods require additional context
and the `Location` struct to pinpoint the source of the error in the
engine code.
- `factory`: Returns a closure that maps an `std::io::Error` to an
`IoError`. This is useful for handling multiple I/O errors that share
the same span and path, streamlining error handling in such cases.

## New Report Look
This is simulation how the I/O errors look like (the `open crates` is
simulated to show how internal errors are referenced now):
![Screenshot 2025-01-25
190426](https://github.com/user-attachments/assets/a41b6aa6-a440-497d-bbcc-3ac0121c9226)

## `Span::test_data()`
To enable better testing, `Span::test_data()` now returns a value
distinct from `Span::unknown()`. Both `Span::test_data()` and
`Span::unknown()` refer to invalid source code, but having a separate
value for test data helps identify issues during testing while keeping
spans unique.

## Cursed Sneaky Error Transfers
I removed the conversions between `std::io::Error` and `ShellError` as
they often removed important information and were used too broadly to
handle I/O errors. This also removed the problematic implementation
found here:

7ea4895513/crates/nu-protocol/src/errors/shell_error.rs (L1534-L1583)

which hid some downcasting from I/O errors and made it hard to trace
where `ShellError` was converted into `std::io::Error`. To address this,
I introduced a new struct called `ShellErrorBridge`, which explicitly
defines this transfer behavior. With `ShellErrorBridge`, we can now
easily grep the codebase to locate and manage such conversions.

## Miscellaneous
- Removed the OS error added in #14640, as it’s no longer needed.
- Improved error messages in `glob_from` (#14679).
- Trying to open a directory with `open` caused a permissions denied
error (it's just what the OS provides). I added a `is_dir` check to
provide a better error in that case.

# User-Facing Changes
<!-- List of all changes that impact the user experience here. This
helps us keep track of breaking changes. -->

- Error outputs now include more detailed information and are formatted
differently, including updated error codes.
- The structure of `ShellError` has changed, requiring plugin authors
and embedders to update their implementations.

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use toolkit.nu; toolkit test stdlib"` to run the
tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

I updated tests to account for the new I/O error structure and
formatting changes.

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->

This PR closes #7695 and closes #14892 and partially addresses #4323 and
#10698.

---------

Co-authored-by: Darren Schroeder <343840+fdncred@users.noreply.github.com>
2025-01-28 16:03:31 -06:00
5615d21ce9 remove content_type metadata from pipeline after from ... commands (#14602)
# Description

`from ...` conversions pass along all metadata except `content_type`,
which they set to `None`.

## Rationale

`open`ing a file results in no `content_type` metadata if it can be
parsed into a nu data structure, and using `open --raw` results in
`content_type` metadata.

`from ...` commands should preserve metadata ***except*** for
`content_type`, as after parsing it's no longer that `content_type` and
just structured nu data.

These commands should return identical data *and* identical metadata

```nushell
open foo.csv
```

```nushell
open foo.csv --raw | from csv
```

# User-Facing Changes

N/A

# Tests + Formatting
- 🟢 toolkit fmt
- 🟢 toolkit clippy
- 🟢 toolkit test
- 🟢 toolkit test stdlib

# After Submitting
N/A
2024-12-16 15:59:18 -06:00
95b78eee25 Change the usage misnomer to "description" (#13598)
# Description
    
The meaning of the word usage is specific to describing how a command
function is *used* and not a synonym for general description. Usage can
be used to describe the SYNOPSIS or EXAMPLES sections of a man page
where the permitted argument combinations are shown or example *uses*
are given.
Let's not confuse people and call it what it is a description.

Our `help` command already creates its own *Usage* section based on the
available arguments and doesn't refer to the description with usage.

# User-Facing Changes

`help commands` and `scope commands` will now use `description` or
`extra_description`
`usage`-> `description`
`extra_usage` -> `extra_description`

Breaking change in the plugin protocol:

In the signature record communicated with the engine.
`usage`-> `description`
`extra_usage` -> `extra_description`

The same rename also takes place for the methods on
`SimplePluginCommand` and `PluginCommand`

# Tests + Formatting
- Updated plugin protocol specific changes
# After Submitting
- [ ] update plugin protocol doc
2024-08-22 12:02:08 +02:00
d34a24db33 setting content type metadata on all core to * commands (#13506)
# Description

All core `to *` set content type pipeline metadata. 

# User-Facing Changes
- For consistency, `from json` no longer sets the content type metadata

# Tests + Formatting
- 🟢 `toolkit fmt`
- 🟢 `toolkit clippy`
- 🟢 `toolkit test`
- 🟢 `toolkit test stdlib
2024-08-01 11:10:52 +02:00
399a7c8836 Add and use new Signals struct (#13314)
# Description
This PR introduces a new `Signals` struct to replace our adhoc passing
around of `ctrlc: Option<Arc<AtomicBool>>`. Doing so has a few benefits:
- We can better enforce when/where resetting or triggering an interrupt
is allowed.
- Consolidates `nu_utils::ctrl_c::was_pressed` and other ad-hoc
re-implementations into a single place: `Signals::check`.
- This allows us to add other types of signals later if we want. E.g.,
exiting or suspension.
- Similarly, we can more easily change the underlying implementation if
we need to in the future.
- Places that used to have a `ctrlc` of `None` now use
`Signals::empty()`, so we can double check these usages for correctness
in the future.
2024-07-07 22:29:01 +00:00
0d060aeae8 Use pipeline data for http post|put|patch|delete commands. (#13254)
# Description
Provides the ability to use http commands as part of a pipeline.
Additionally, this pull requests extends the pipeline metadata to add a
content_type field. The content_type metadata field allows commands such
as `to json` to set the metadata in the pipeline allowing the http
commands to use it when making requests.

This pull request also introduces the ability to directly stream http
requests from streaming pipelines.

One other small change is that Content-Type will always be set if it is
passed in to the http commands, either indirectly or throw the content
type flag. Previously it was not preserved with requests that were not
of type json or form data.

# User-Facing Changes
* `http post`, `http put`, `http patch`, `http delete` can be used as
part of a pipeline
* `to text`, `to json`, `from json` all set the content_type metadata
field and the http commands will utilize them when making requests.
2024-07-01 12:34:19 -07:00
b06f31d3c6 Make from json --objects streaming (#12949)
# Description

Makes the `from json --objects` command produce a stream, and read
lazily from an input stream to produce its output.

Also added a helper, `PipelineData::get_type()`, to make it easier to
construct a wrong type error message when matching on `PipelineData`. I
expect checking `PipelineData` for either a string value or an `Unknown`
or `String` typed `ByteStream` will be very, very common. I would have
liked to have a helper that just returns a readable stream from either,
but that would either be a bespoke enum or a `Box<dyn BufRead>`, which
feels like it wouldn't be so great for performance. So instead, taking
the approach I did here is probably better - having a function that
accepts the `impl BufRead` and matching to use it.

# User-Facing Changes

- `from json --objects` no longer collects its input, and can be used
for large datasets or streams that produce values over time.

# Tests + Formatting
All passing.

# After Submitting
- [ ] release notes

---------

Co-authored-by: Ian Manske <ian.manske@pm.me>
2024-05-24 23:37:50 +00:00
6fd854ed9f Replace ExternalStream with new ByteStream type (#12774)
# Description
This PR introduces a `ByteStream` type which is a `Read`-able stream of
bytes. Internally, it has an enum over three different byte stream
sources:
```rust
pub enum ByteStreamSource {
    Read(Box<dyn Read + Send + 'static>),
    File(File),
    Child(ChildProcess),
}
```

This is in comparison to the current `RawStream` type, which is an
`Iterator<Item = Vec<u8>>` and has to allocate for each read chunk.

Currently, `PipelineData::ExternalStream` serves a weird dual role where
it is either external command output or a wrapper around `RawStream`.
`ByteStream` makes this distinction more clear (via `ByteStreamSource`)
and replaces `PipelineData::ExternalStream` in this PR:
```rust
pub enum PipelineData {
    Empty,
    Value(Value, Option<PipelineMetadata>),
    ListStream(ListStream, Option<PipelineMetadata>),
    ByteStream(ByteStream, Option<PipelineMetadata>),
}
```

The PR is relatively large, but a decent amount of it is just repetitive
changes.

This PR fixes #7017, fixes #10763, and fixes #12369.

This PR also improves performance when piping external commands. Nushell
should, in most cases, have competitive pipeline throughput compared to,
e.g., bash.
| Command | Before (MB/s) | After (MB/s) | Bash (MB/s) |
| -------------------------------------------------- | -------------:|
------------:| -----------:|
| `throughput \| rg 'x'` | 3059 | 3744 | 3739 |
| `throughput \| nu --testbin relay o> /dev/null` | 3508 | 8087 | 8136 |

# User-Facing Changes
- This is a breaking change for the plugin communication protocol,
because the `ExternalStreamInfo` was replaced with `ByteStreamInfo`.
Plugins now only have to deal with a single input stream, as opposed to
the previous three streams: stdout, stderr, and exit code.
- The output of `describe` has been changed for external/byte streams.
- Temporary breaking change: `bytes starts-with` no longer works with
byte streams. This is to keep the PR smaller, and `bytes ends-with`
already does not work on byte streams.
- If a process core dumped, then instead of having a `Value::Error` in
the `exit_code` column of the output returned from `complete`, it now is
a `Value::Int` with the negation of the signal number.

# After Submitting
- Update docs and book as necessary
- Release notes (e.g., plugin protocol changes)
- Adapt/convert commands to work with byte streams (high priority is
`str length`, `bytes starts-with`, and maybe `bytes ends-with`).
- Refactor the `tee` code, Devyn has already done some work on this.

---------

Co-authored-by: Devyn Cairns <devyn.cairns@gmail.com>
2024-05-16 07:11:18 -07:00
e879d4ecaf ListStream touchup (#12524)
# Description

Does some misc changes to `ListStream`:
- Moves it into its own module/file separate from `RawStream`.
- `ListStream`s now have an associated `Span`.
- This required changes to `ListStreamInfo` in `nu-plugin`. Note sure if
this is a breaking change for the plugin protocol.
- Hides the internals of `ListStream` but also adds a few more methods.
- This includes two functions to more easily alter a stream (these take
a `ListStream` and return a `ListStream` instead of having to go through
the whole `into_pipeline_data(..)` route).
  -  `map`: takes a `FnMut(Value) -> Value`
  - `modify`: takes a function to modify the inner stream.
2024-05-05 16:00:59 +00:00
c747ec75c9 Add command_prelude module (#12291)
# Description
When implementing a `Command`, one must also import all the types
present in the function signatures for `Command`. This makes it so that
we often import the same set of types in each command implementation
file. E.g., something like this:
```rust
use nu_protocol::ast::Call;
use nu_protocol::engine::{Command, EngineState, Stack};
use nu_protocol::{
    record, Category, Example, IntoInterruptiblePipelineData, IntoPipelineData, PipelineData,
    ShellError, Signature, Span, Type, Value,
};
```

This PR adds the `nu_engine::command_prelude` module which contains the
necessary and commonly used types to implement a `Command`:
```rust
// command_prelude.rs
pub use crate::CallExt;
pub use nu_protocol::{
    ast::{Call, CellPath},
    engine::{Command, EngineState, Stack},
    record, Category, Example, IntoInterruptiblePipelineData, IntoPipelineData, IntoSpanned,
    PipelineData, Record, ShellError, Signature, Span, Spanned, SyntaxShape, Type, Value,
};
```

This should reduce the boilerplate needed to implement a command and
also gives us a place to track the breadth of the `Command` API. I tried
to be conservative with what went into the prelude modules, since it
might be hard/annoying to remove items from the prelude in the future.
Let me know if something should be included or excluded.
2024-03-26 21:17:30 +00:00
4e0a65c822 Strict JSON parsing (#11592)
# Description
Adds the `--strict` flag for `from json` which will try to parse text
while following the exact JSON specification (e.g., no comments or
trailing commas allowed). Fixes issue #11548.
2024-01-30 08:10:19 -06:00
1867bb1a88 Fix incorrect handling of boolean flags for builtin commands (#11492)
# Description
Possible fix of #11456
This PR fixes a bug where builtin commands did not respect the logic of
dynamically passed boolean flags. The reason is
[has_flag](6f59abaf43/crates/nu-protocol/src/ast/call.rs (L204C5-L212C6))
method did not evaluate and take into consideration expression used with
flag.

To address this issue a solution is proposed:
1. `has_flag` method is moved to `CallExt` and new logic to evaluate
expression and check if it is a boolean value is added
2. `has_flag_const` method is added to `CallExt` which is a constant
version of `has_flag`
3. `has_named` method is added to `Call` which is basically the old
logic of `has_flag`
4. All usages of `has_flag` in code are updated, mostly to pass
`engine_state` and `stack` to new `has_flag`. In `run_const` commands it
is replaced with `has_flag_const`. And in a few select places: parser,
`to nuon` and `into string` old logic via `has_named` is used.

# User-Facing Changes
Explicit values of boolean flags are now respected in builtin commands.
Before:

![image](https://github.com/nushell/nushell/assets/17511668/f9fbabb2-3cfd-43f9-ba9e-ece76d80043c)
After:

![image](https://github.com/nushell/nushell/assets/17511668/21867596-2075-437f-9c85-45563ac70083)

Another example:
Before:

![image](https://github.com/nushell/nushell/assets/17511668/efdbc5ca-5227-45a4-ac5b-532cdc2bbf5f)
After:

![image](https://github.com/nushell/nushell/assets/17511668/2907d5c5-aa93-404d-af1c-21cdc3d44646)


# Tests + Formatting
Added test reproducing some variants of original issue.
2024-01-11 17:19:48 +02:00
3e5f81ae14 Convert remainder of ShellError variants to named fields (#11276)
# Description

Removed variants that are no longer in use:
* `NoFile*`
* `UnexpectedAbbrComponent`

Converted:
* `OutsideSpannedLabeledError`
* `EvalBlockWithInput`
* `Break`
* `Continue`
* `Return`
* `NotAConstant`
* `NotAConstCommand`
* `NotAConstHelp`
* `InvalidGlobPattern`
* `ErrorExpandingGlob`

Fixes #10700 

# User-Facing Changes

None

# Tests + Formatting

- 🟢 `toolkit fmt`
- 🟢 `toolkit clippy`
- 🟢 `toolkit test`
- 🟢 `toolkit test stdlib`

# After Submitting

N/A
2023-12-09 18:46:21 -06:00
a95a4505ef Convert Shellerror::GenericError to named fields (#11230)
# Description

Replace `.to_string()` used in `GenericError` with `.into()` as
`.into()` seems more popular

Replace `Vec::new()` used in `GenericError` with `vec![]` as `vec![]`
seems more popular

(There are so, so many)
2023-12-07 00:40:03 +01:00
4b301710d3 Convert more examples and tests to record! macro (#10840)
# Description
Use `record!` macro instead of defining two separate `vec!` for `cols`
and `vals` when appropriate.
This visually aligns the key with the value.
Further more you don't have to deal with the construction of `Record {
cols, vals }` so we can hide the implementation details in the future.

## State

Not covering all possible commands yet, also some tests/examples are
better expressed by creating cols and vals separately.

# User/Developer-Facing Changes
The examples and tests should read more natural. No relevant functional
change

# Bycatch

Where I noticed it I replaced usage of `Value` constructors with
`Span::test_data()` or `Span::unknown()` to the `Value::test_...`
constructors. This should make things more readable and also simplify
changes to the `Span` system in the future.
2023-10-28 14:52:31 +02:00
JT
6cdfee3573 Move Value to helpers, separate span call (#10121)
# Description

As part of the refactor to split spans off of Value, this moves to using
helper functions to create values, and using `.span()` instead of
matching span out of Value directly.

Hoping to get a few more helping hands to finish this, as there are a
lot of commands to update :)

# User-Facing Changes
<!-- List of all changes that impact the user experience here. This
helps us keep track of breaking changes. -->

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use std testing; testing run-tests --path
crates/nu-std"` to run the tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->

---------

Co-authored-by: Darren Schroeder <343840+fdncred@users.noreply.github.com>
Co-authored-by: WindSoilder <windsoilder@outlook.com>
2023-09-03 07:27:29 -07:00
JT
1e3e034021 Spanned Value step 1: span all value cases (#10042)
# Description

This doesn't really do much that the user could see, but it helps get us
ready to do the steps of the refactor to split the span off of Value, so
that values can be spanless. This allows us to have top-level values
that can hold both a Value and a Span, without requiring that all values
have them.

We expect to see significant memory reduction by removing so many
unnecessary spans from values. For example, a table of 100,000 rows and
5 columns would have a savings of ~8megs in just spans that are almost
always duplicated.

# User-Facing Changes

Nothing yet

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used -A
clippy::needless_collect -A clippy::result_large_err` to check that
you're using the standard code style
- `cargo test --workspace` to check that all tests pass
- `cargo run -- -c "use std testing; testing run-tests --path
crates/nu-std"` to run the tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->
2023-08-25 08:48:05 +12:00
8da27a1a09 Create Record type (#10103)
# Description
This PR creates a new `Record` type to reduce duplicate code and
possibly bugs as well. (This is an edited version of #9648.)
- `Record` implements `FromIterator` and `IntoIterator` and so can be
iterated over or collected into. For example, this helps with
conversions to and from (hash)maps. (Also, no more
`cols.iter().zip(vals)`!)
- `Record` has a `push(col, val)` function to help insure that the
number of columns is equal to the number of values. I caught a few
potential bugs thanks to this (e.g. in the `ls` command).
- Finally, this PR also adds a `record!` macro that helps simplify
record creation. It is used like so:
   ```rust
   record! {
       "key1" => some_value,
       "key2" => Value::string("text", span),
       "key3" => Value::int(optional_int.unwrap_or(0), span),
       "key4" => Value::bool(config.setting, span),
   }
   ```
Since macros hinder formatting, etc., the right hand side values should
be relatively short and sweet like the examples above.

Where possible, prefer `record!` or `.collect()` on an iterator instead
of multiple `Record::push`s, since the first two automatically set the
record capacity and do less work overall.

# User-Facing Changes
Besides the changes in `nu-protocol` the only other breaking changes are
to `nu-table::{ExpandedTable::build_map, JustTable::kv_table}`.
2023-08-25 07:50:29 +12:00
a52386e837 Box ShellError in Value::Error (#8375)
# Description

Our `ShellError` at the moment has a `std::mem::size_of<ShellError>` of
136 bytes (on AMD64). As a result `Value` directly storing the struct
also required 136 bytes (thanks to alignment requirements).

This change stores the `Value::Error` `ShellError` on the heap.

Pro:
- Value now needs just 80 bytes
- Should be 1 cacheline less (still at least 2 cachelines)

Con:
- More small heap allocations when dealing with `Value::Error`
  - More heap fragmentation
  - Potential for additional required memcopies

# Further code changes

Includes a small refactor of `try` due to a type mismatch in its large
match.

# User-Facing Changes

None for regular users.

Plugin authors may have to update their matches on `Value` if they use
`nu-protocol`

Needs benchmarking to see if there is a benefit in real world workloads.
**Update** small improvements in runtime for workloads with high volume
of values. Significant reduction in maximum resident set size, when many
values are held in memory.

# Tests + Formatting
2023-03-12 09:57:27 +01:00
62575c9a4f Document and critically review ShellError variants - Ep. 3 (#8340)
Continuation of #8229 and #8326

# Description

The `ShellError` enum at the moment is kind of messy. 

Many variants are basic tuple structs where you always have to reference
the implementation with its macro invocation to know which field serves
which purpose.
Furthermore we have both variants that are kind of redundant or either
overly broad to be useful for the user to match on or overly specific
with few uses.

So I set out to start fixing the lacking documentation and naming to
make it feasible to critically review the individual usages and fix
those.
Furthermore we can decide to join or split up variants that don't seem
to be fit for purpose.

# Call to action

**Everyone:** Feel free to add review comments if you spot inconsistent
use of `ShellError` variants.

# User-Facing Changes

(None now, end goal more explicit and consistent error messages)

# Tests + Formatting

(No additional tests needed so far)

# Commits (so far)

- Remove `ShellError::FeatureNotEnabled`
- Name fields on `SE::ExternalNotSupported`
- Name field on `SE::InvalidProbability`
- Name fields on `SE::NushellFailed` variants
- Remove unused `SE::NushellFailedSpannedHelp`
- Name field on `SE::VariableNotFoundAtRuntime`
- Name fields on `SE::EnvVarNotFoundAtRuntime`
- Name fields on `SE::ModuleNotFoundAtRuntime`
- Remove usused `ModuleOrOverlayNotFoundAtRuntime`
- Name fields on `SE::OverlayNotFoundAtRuntime`
- Name field on `SE::NotFound`
2023-03-06 18:33:09 +01:00
a5c604c283 Uniformize usage() and extra_usage() message ending for commands helper. (#8268)
# Description

Working on uniformizing the ending messages regarding methods usage()
and extra_usage(). This is related to the issue
https://github.com/nushell/nushell/issues/5066 after discussing it with
@jntrnr

# User-Facing Changes

None.

# Tests + Formatting

Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used -A
clippy::needless_collect` to check that you're using the standard code
style
- `cargo test --workspace` to check that all tests pass

# After Submitting

If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
2023-02-28 21:33:02 -08:00
99076af18b Use imported names in Command::run signatures (#7967)
# Description

_(Thank you for improving Nushell. Please, check our [contributing
guide](../CONTRIBUTING.md) and talk to the core team before making major
changes.)_

I opened this PR to unify the run command method. It's mainly to improve
consistency across the tree.

# User-Facing Changes

None.

# Tests + Formatting

Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used -A
clippy::needless_collect` to check that you're using the standard code
style
- `cargo test --workspace` to check that all tests pass

# After Submitting

If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
2023-02-05 22:17:46 +01:00
ab480856a5 Use variable names directly in the format strings (#7906)
# Description

Lint: `clippy::uninlined_format_args`

More readable in most situations.
(May be slightly confusing for modifier format strings
https://doc.rust-lang.org/std/fmt/index.html#formatting-parameters)

Alternative to #7865

# User-Facing Changes

None intended

# Tests + Formatting

(Ran `cargo +stable clippy --fix --workspace -- -A clippy::all -D
clippy::uninlined_format_args` to achieve this. Depends on Rust `1.67`)
2023-01-29 19:37:54 -06:00
45fe3be83e Further cleanup of Span::test_data usage + span fixes (#7595)
# Description

Inspired by #7592

For brevity use `Value::test_{string,int,float,bool}`

Includes fixes to commands that were abusing `Span::test_data` in their
implementation. Now the call span is used where possible or the explicit
`Span::unknonw` is used.

## Command fixes
- Fix abuse of `Span::test_data()` in `query_xml`
- Fix abuse of `Span::test_data()` in `term size`
- Fix abuse of `Span::test_data()` in `seq date`
- Fix two abuses of `Span::test_data` in `nu-cli`
- Change `Span::test_data` to `Span::unknown` in `keybindings listen`
- Add proper call span to `registry query`
- Fix span use in `nu_plugin_query`
- Fix span assignment in `select`
- Use `Span::unknown` instead of `test_data` in more places

## Other
- Use `Value::test_int`/`test_float()` consistently
- More `test_string` and `test_bool`
- Fix unused imports


# User-Facing Changes

Some commands may now provide more helpful spans for downstream use in
errors
2022-12-24 07:41:57 -06:00
dd7b7311b3 Standardise the use of ShellError::UnsupportedInput and ShellError::TypeMismatch and add spans to every instance of the former (#7217)
# Description

* I was dismayed to discover recently that UnsupportedInput and
TypeMismatch are used *extremely* inconsistently across the codebase.
UnsupportedInput is sometimes used for input type-checks (as per the
name!!), but *also* used for argument type-checks. TypeMismatch is also
used for both.
I thus devised the following standard: input type-checking *only* uses
UnsupportedInput, and argument type-checking *only* uses TypeMismatch.
Moreover, to differentiate them, UnsupportedInput now has *two* error
arrows (spans), one pointing at the command and the other at the input
origin, while TypeMismatch only has the one (because the command should
always be nearby)
* In order to apply that standard, a very large number of
UnsupportedInput uses were changed so that the input's span could be
retrieved and delivered to it.
* Additionally, I noticed many places where **errors are not propagated
correctly**: there are lots of `match` sites which take a Value::Error,
then throw it away and replace it with a new Value::Error with
less/misleading information (such as reporting the error as an
"incorrect type"). I believe that the earliest errors are the most
important, and should always be propagated where possible.
* Also, to standardise one broad subset of UnsupportedInput error
messages, who all used slightly different wordings of "expected
`<type>`, got `<type>`", I created OnlySupportsThisInputType as a
variant of it.
* Finally, a bunch of error sites that had "repeated spans" - i.e. where
an error expected two spans, but `call.head` was given for both - were
fixed to use different spans.

# Example
BEFORE
```
〉20b | str starts-with 'a'
Error: nu:🐚:unsupported_input (link)

  × Unsupported input
   ╭─[entry #31:1:1]
 1 │ 20b | str starts-with 'a'
   ·   ┬
   ·   ╰── Input's type is filesize. This command only works with strings.
   ╰────

〉'a' | math cos
Error: nu:🐚:unsupported_input (link)

  × Unsupported input
   ╭─[entry #33:1:1]
 1 │ 'a' | math cos
   · ─┬─
   ·  ╰── Only numerical values are supported, input type: String
   ╰────

〉0x[12] | encode utf8
Error: nu:🐚:unsupported_input (link)

  × Unsupported input
   ╭─[entry #38:1:1]
 1 │ 0x[12] | encode utf8
   ·          ───┬──
   ·             ╰── non-string input
   ╰────
```
AFTER
```
〉20b | str starts-with 'a'
Error: nu:🐚:pipeline_mismatch (link)

  × Pipeline mismatch.
   ╭─[entry #1:1:1]
 1 │ 20b | str starts-with 'a'
   ·   ┬   ───────┬───────
   ·   │          ╰── only string input data is supported
   ·   ╰── input type: filesize
   ╰────

〉'a' | math cos
Error: nu:🐚:pipeline_mismatch (link)

  × Pipeline mismatch.
   ╭─[entry #2:1:1]
 1 │ 'a' | math cos
   · ─┬─   ────┬───
   ·  │        ╰── only numeric input data is supported
   ·  ╰── input type: string
   ╰────

〉0x[12] | encode utf8
Error: nu:🐚:pipeline_mismatch (link)

  × Pipeline mismatch.
   ╭─[entry #3:1:1]
 1 │ 0x[12] | encode utf8
   · ───┬──   ───┬──
   ·    │        ╰── only string input data is supported
   ·    ╰── input type: binary
   ╰────
```

# User-Facing Changes

Various error messages suddenly make more sense (i.e. have two arrows
instead of one).

# Tests + Formatting

Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used -A
clippy::needless_collect` to check that you're using the standard code
style
- `cargo test --workspace` to check that all tests pass

# After Submitting

If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
2022-12-23 01:48:53 -05:00
220b105efb Reduced LOC by replacing several instances of Value::Int {}, Value::Float{}, Value::Bool {}, and Value::String {} with Value::int(), Value::float(), Value::boolean() and Value::string() (#7412)
# Description

While perusing Value.rs, I noticed the `Value::int()`, `Value::float()`,
`Value::boolean()` and `Value::string()` constructors, which seem
designed to make it easier to construct various Values, but which aren't
used often at all in the codebase. So, using a few find-replaces
regexes, I increased their usage. This reduces overall LOC because
structures like this:
```
Value::Int {
  val: a,
  span: head
}
```
are changed into
```
Value::int(a, head)
```
and are respected as such by the project's formatter.
There are little readability concerns because the second argument to all
of these is `span`, and it's almost always extremely obvious which is
the span at every callsite.

# User-Facing Changes

None.

# Tests + Formatting

Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used -A
clippy::needless_collect` to check that you're using the standard code
style
- `cargo test --workspace` to check that all tests pass

# After Submitting

If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
2022-12-09 11:37:51 -05:00
850ecf648a Protocol: debug_assert!() Span to reflect a valid slice (#6806)
Also enforce this by #[non_exhaustive] span such that going forward we
cannot, in debug builds (1), construct invalid spans.

The motivation for this stems from #6431 where I've seen crashes due to
invalid slice indexing.

My hope is this will mitigate such senarios

1. https://github.com/nushell/nushell/pull/6431#issuecomment-1278147241

# Description

(description of your pull request here)

# Tests

Make sure you've done the following:

- [ ] Add tests that cover your changes, either in the command examples,
the crate/tests folder, or in the /tests folder.
- [ ] Try to think about corner cases and various ways how your changes
could break. Cover them with tests.
- [ ] If adding tests is not possible, please document in the PR body a
minimal example with steps on how to reproduce so one can verify your
change works.

Make sure you've run and fixed any issues with these commands:

- [x] `cargo fmt --all -- --check` to check standard code formatting
(`cargo fmt --all` applies these changes)
- [ ] `cargo clippy --workspace --features=extra -- -D warnings -D
clippy::unwrap_used -A clippy::needless_collect` to check that you're
using the standard code style
- [ ] `cargo test --workspace --features=extra` to check that all the
tests pass

# Documentation

- [ ] If your PR touches a user-facing nushell feature then make sure
that there is an entry in the documentation
(https://github.com/nushell/nushell.github.io) for the feature, and
update it if necessary.
2022-12-03 11:44:12 +02:00
d9d6cea5a9 Make json require string and pass around metadata (#7010)
* Make json require string and pass around metadata

The json deserializer was accepting any inputs by coercing non-strings
into strings. As an example, if the input was `[1, 2]` the coercion
would turn into `[12]` and deserialize as a list containing number
twelve instead of a list of two numbers, one and two. This could lead
to silent data corruption.

Aside from that pipeline metadata wasn't passed aroud.

This commit fixes the type issue by adding a strict conversion
function that errors if the input type is not a string or external
stream. It then uses this function instead of the original
`collect_string()`. In addition, this function returns the pipeline
metadata so it can be passed along.

* Make other formats require string

The problem with json coercing non-string types to string was present in
all other text formats. This reuses the `collect_string_strict` function
to fix them.

* `IntoPipelineData` cleanup

The method `into_pipeline_data_with_metadata` can now be conveniently
used.
2022-11-20 17:06:09 -08:00
df94052180 Declare input and output types of commands (#6796)
* Add failing test that list of ints and floats is List<Number>

* Start defining subtype relation

* Make it possible to declare input and output types for commands

- Enforce them in tests

* Declare input and output types of commands

* Add formatted signatures to `help commands` table

* Revert SyntaxShape::Table -> Type::Table change

* Revert unnecessary derive(Hash) on SyntaxShape

Co-authored-by: JT <547158+jntrnr@users.noreply.github.com>
2022-11-10 10:55:05 +13:00
6a7a60429f Remove unnecessary #[allow(...)] annotations (#6870)
* Remove unnecessary `#[allow]` annots

Reduce the number of lint exceptions that are not necessary with the
current state of the code (or more recent toolchain)

* Remove dead code from `FileStructure` in nu-command

* Replace `allow(unused)` with relevant feature switch

* Deal with `needless_collect` with annotations

* Change hack for needless_collect in `from json`

This change obviates the need for `allow(needless_collect)`
Removes a pessimistic allocation for empty strings, but increases
allocation size to `Value`

Probably not really worth it.

* Revert "Deal with `needless_collect` with annotations"

This reverts commit 05aca98445.

The previous state seems to better from a performance perspective as a
`Vec<String>` is lighter weight than `Vec<Value>`
2022-10-24 20:12:16 +02:00
JT
76079d5183 Move config to be an env var (#5230)
* Move config to be an env var

* fix fmt and tests
2022-04-19 10:28:01 +12:00
1314a87cb0 update miette and switch to GenericErrors (#5222) 2022-04-19 00:34:10 +12:00
ea7c8c237e CantConvert improvements (#4926)
* CantConvert improvements

* cargo fmt
2022-03-24 07:04:31 -05:00
JT
fd88920a9d Make sure we have text before json parse (#4697) 2022-03-02 15:58:56 -05:00
JT
d454fad4dc Improve json errors a bit (#4579)
* Improve json errors a bit

* typo
2022-02-21 07:08:09 -05:00
JT
fd22211737 Add nuon format for fun (#4401)
* Add nuon format for fun

* more fun

* More nuon fixes, allow comments, improve errors
2022-02-20 16:26:41 -05:00
JT
fa75c93765 Slight cleanup of 'from json' line-at-a-time (#4512) 2022-02-17 12:49:31 -05:00
JT
3522bead97 Add string stream and binary stream, add text decoding (#570)
* WIP

* Add binary/string streams and text decoding

* Make string collection fallible

* Oops, forgot pretty hex

* Oops, forgot pretty hex

* clippy
2021-12-24 18:22:11 +11:00
JT
2883d6cd1e Remove Span::unknown (#525) 2021-12-19 18:46:13 +11:00
JT
2013e9300a Make config default if broken (#482)
* Make config default if broken

* Make config default if broken
2021-12-13 14:16:51 +11:00
b35914bd17 Category option for signature (#343)
* category option for signature

* category option for signature

* column description for $scope
2021-11-17 17:22:37 +13:00
JT
0f107b2830 Add a config variable with engine support (#332)
* Add a config variable with engine support

* Add a config variable with engine support

* Oops, cleanup
2021-11-15 08:25:57 +13:00
JT
bb1740d733 Add from csv and from tsv (#320) 2021-11-10 09:17:37 +13:00
JT
02b8027749 Improve external output in subexprs (#294) 2021-11-06 18:50:33 +13:00
JT
bac8b8a450 Add initial ctrl-c support 2021-10-28 17:13:10 +13:00
JT
5d19017603 WIP 2021-10-26 05:58:58 +13:00