Commit Graph

9 Commits

Author SHA1 Message Date
Ian Manske
6fd854ed9f
Replace ExternalStream with new ByteStream type (#12774)
# Description
This PR introduces a `ByteStream` type which is a `Read`-able stream of
bytes. Internally, it has an enum over three different byte stream
sources:
```rust
pub enum ByteStreamSource {
    Read(Box<dyn Read + Send + 'static>),
    File(File),
    Child(ChildProcess),
}
```

This is in comparison to the current `RawStream` type, which is an
`Iterator<Item = Vec<u8>>` and has to allocate for each read chunk.

Currently, `PipelineData::ExternalStream` serves a weird dual role where
it is either external command output or a wrapper around `RawStream`.
`ByteStream` makes this distinction more clear (via `ByteStreamSource`)
and replaces `PipelineData::ExternalStream` in this PR:
```rust
pub enum PipelineData {
    Empty,
    Value(Value, Option<PipelineMetadata>),
    ListStream(ListStream, Option<PipelineMetadata>),
    ByteStream(ByteStream, Option<PipelineMetadata>),
}
```

The PR is relatively large, but a decent amount of it is just repetitive
changes.

This PR fixes #7017, fixes #10763, and fixes #12369.

This PR also improves performance when piping external commands. Nushell
should, in most cases, have competitive pipeline throughput compared to,
e.g., bash.
| Command | Before (MB/s) | After (MB/s) | Bash (MB/s) |
| -------------------------------------------------- | -------------:|
------------:| -----------:|
| `throughput \| rg 'x'` | 3059 | 3744 | 3739 |
| `throughput \| nu --testbin relay o> /dev/null` | 3508 | 8087 | 8136 |

# User-Facing Changes
- This is a breaking change for the plugin communication protocol,
because the `ExternalStreamInfo` was replaced with `ByteStreamInfo`.
Plugins now only have to deal with a single input stream, as opposed to
the previous three streams: stdout, stderr, and exit code.
- The output of `describe` has been changed for external/byte streams.
- Temporary breaking change: `bytes starts-with` no longer works with
byte streams. This is to keep the PR smaller, and `bytes ends-with`
already does not work on byte streams.
- If a process core dumped, then instead of having a `Value::Error` in
the `exit_code` column of the output returned from `complete`, it now is
a `Value::Int` with the negation of the signal number.

# After Submitting
- Update docs and book as necessary
- Release notes (e.g., plugin protocol changes)
- Adapt/convert commands to work with byte streams (high priority is
`str length`, `bytes starts-with`, and maybe `bytes ends-with`).
- Refactor the `tee` code, Devyn has already done some work on this.

---------

Co-authored-by: Devyn Cairns <devyn.cairns@gmail.com>
2024-05-16 07:11:18 -07:00
Ian Manske
e879d4ecaf
ListStream touchup (#12524)
# Description

Does some misc changes to `ListStream`:
- Moves it into its own module/file separate from `RawStream`.
- `ListStream`s now have an associated `Span`.
- This required changes to `ListStreamInfo` in `nu-plugin`. Note sure if
this is a breaking change for the plugin protocol.
- Hides the internals of `ListStream` but also adds a few more methods.
- This includes two functions to more easily alter a stream (these take
a `ListStream` and return a `ListStream` instead of having to go through
the whole `into_pipeline_data(..)` route).
  -  `map`: takes a `FnMut(Value) -> Value`
  - `modify`: takes a function to modify the inner stream.
2024-05-05 16:00:59 +00:00
Devyn Cairns
72f3942c37
Upgrade to interprocess 2.0.0 (#12729)
# Description

This fixes #12724. NetBSD confirmed to work with this change.

The update also behaves a bit better in some ways - it automatically
unlinks and reclaims sockets on Unix, and doesn't try to flush/sync the
socket on Windows, so I was able to remove that platform-specific logic.

They also have a way to split the socket so I could just use one socket
now, but I haven't tried to do that yet. That would be more of a
breaking change but I think it's more straightforward.

# User-Facing Changes

- Hopefully more platforms work

# Tests + Formatting
- 🟢 `toolkit fmt`
- 🟢 `toolkit clippy`
- 🟢 `toolkit test`
- 🟢 `toolkit test stdlib`
2024-05-02 22:31:33 -07:00
Ian Manske
847646e44e
Remove lazy records (#12682)
# Description
Removes lazy records from the language, following from the reasons
outlined in #12622. Namely, this should make semantics more clear and
will eliminate concerns regarding maintainability.

# User-Facing Changes
- Breaking change: `lazy make` is removed.
- Breaking change: `describe --collect-lazyrecords` flag is removed.
- `sys` and `debug info` now return regular records.

# After Submitting
- Update nushell book if necessary.
- Explore new `sys` and `debug info` APIs to prevent them from taking
too long (e.g., subcommands or taking an optional column/cell-path
argument).
2024-05-03 08:36:10 +08:00
Devyn Cairns
ad6deadf24
Flush on every plugin Data message (#12728)
# Description

This helps to ensure data produced on a stream is immediately available
to the consumer of the stream. The BufWriter introduced for performance
reasons in 0.93 exposed the behavior that data messages wouldn't make it
to the other side until they filled the buffer in @cablehead's
[`nu_plugin_from_sse`](https://github.com/cablehead/nu_plugin_from_sse).

I had originally not flushed on every `Data` message because I figured
that it isn't really critical that the other side sees those messages
immediately, since they're not used for control and they are flushed
when waiting for acknowledgement or when the buffer is too full anyway.

Increasing the amount of data that can be sent with a single underlying
write increases performance, but this interferes with some plugins that
want to use streams in a more real-time way. In the future I would like
to make this configurable, maybe even per-command, so that a command can
decide what the priority is. But for now I think this is reasonable.

In the worst case, this decreases performance by about 40%, when sending
very small values (just numbers). But for larger values, this PR
actually increases performance by about 20%, because I've increased the
buffer size about 2x to 16,384 bytes. The previous value of 8,192 bytes
was too small to fit a full buffer coming from an external command, so
doubling it makes sense, and now a write of a buffer from an external
command can be done in exactly one write call, which I think makes
sense. I'm doing this at the same time because flushing each data
message would make it very likely that each individual data message from
an external stream would require exactly two writes rather than
approximately one (amortized).

Again, hopefully the tradeoff isn't too bad, and if it is I'll just make
it configurable.

# User-Facing Changes

- Performance of plugin streams will be a bit different
- Plugins that expect to send streams in real-time will work again

# Tests + Formatting

- 🟢 `toolkit fmt`
- 🟢 `toolkit clippy`
- 🟢 `toolkit test`
- 🟢 `toolkit test stdlib`
2024-05-02 23:51:16 +00:00
Devyn Cairns
21ebdfe8d7
Bump version to 0.93.1 (#12710)
# Description

Next patch/dev release, `0.93.1`
2024-05-01 17:19:20 -05:00
Devyn Cairns
3b220e07e3
Bump version to 0.93.0 (#12709)
# Description

Bump version to `0.93.0`
2024-04-30 15:51:13 -07:00
Devyn Cairns
648486400c
Fix missing local socket feature (#12698)
# Description

So sorry to do this during the pre-release freeze, but my plugin crate
split PR broke local socket mode, because `nu-plugin-protocol` didn't
have the compile feature to advertise the `LocalSocket` protocol
feature.

This is a very simple, configuration-only bugfix that I think really
needs to be merged before the release, or else local socket mode won't
work at all.

# Tests + Formatting

There's an oversight in my testing that caused this to not be caught:
the engine really did have the feature, but it just wasn't advertising
it, so for `stress_internals` it was still able to use it successfully.
Post-release I'll try to make sure this is properly handled somehow.
2024-04-29 15:02:56 +08:00
Devyn Cairns
0c4d5330ee
Split the plugin crate (#12563)
# Description

This breaks `nu-plugin` up into four crates:

- `nu-plugin-protocol`: just the type definitions for the protocol, no
I/O. If someone wanted to wire up something more bare metal, maybe for
async I/O, they could use this.
- `nu-plugin-core`: the shared stuff between engine/plugin. Less stable
interface.
- `nu-plugin-engine`: everything required for the engine to talk to
plugins. Less stable interface.
- `nu-plugin`: everything required for the plugin to talk to the engine,
what plugin developers use. Should be the most stable interface.

No changes are made to the interface exposed by `nu-plugin` - it should
all still be there. Re-exports from `nu-plugin-protocol` or
`nu-plugin-core` are used as required. Plugins shouldn't ever have to
use those crates directly.

This should be somewhat faster to compile as `nu-plugin-engine` and
`nu-plugin` can compile in parallel, and the engine doesn't need
`nu-plugin` and plugins don't need `nu-plugin-engine` (except for test
support), so that should reduce what needs to be compiled too.

The only significant change here other than splitting stuff up was to
break the `source` out of `PluginCustomValue` and create a new
`PluginCustomValueWithSource` type that contains that instead. One bonus
of that is we get rid of the option and it's now more type-safe, but it
also means that the logic for that stuff (actually running the plugin
for custom value ops) can live entirely within the `nu-plugin-engine`
crate.

# User-Facing Changes
- New crates.
- Added `local-socket` feature for `nu` to try to make it possible to
compile without that support if needed.

# Tests + Formatting
- 🟢 `toolkit fmt`
- 🟢 `toolkit clippy`
- 🟢 `toolkit test`
- 🟢 `toolkit test stdlib`
2024-04-27 12:08:12 -05:00