Add string/binary type color to ByteStream (#12897)

# Description

This PR allows byte streams to optionally be colored as being
specifically binary or string data, which guarantees that they'll be
converted to `Binary` or `String` appropriately on `into_value()`,
making them compatible with `Type` guarantees. This makes them
significantly more broadly usable for command input and output.

There is still an `Unknown` type for byte streams coming from external
commands. It keeps the previous behavior: the data becomes a string if it
is valid UTF-8, and binary otherwise.
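
As a rough illustration of the conversion rule described above, here is a minimal, self-contained sketch. The types below are hypothetical stand-ins and not the real `nu-protocol` definitions. In particular, returning an error for a `String`-colored stream that turns out not to be UTF-8 is an assumption of this sketch.

```rust
/// Hypothetical stand-in for the stream's type color.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ByteStreamType {
    Binary,
    String,
    Unknown,
}

/// Hypothetical stand-in for the two value kinds a byte stream can collect into.
#[derive(Debug, PartialEq)]
enum Value {
    Binary(Vec<u8>),
    String(String),
}

/// Hypothetical error for string-colored data that is not valid UTF-8.
#[derive(Debug, PartialEq)]
struct NotUtf8;

/// Collect already-read bytes into a value according to the type color.
fn into_value(bytes: Vec<u8>, type_: ByteStreamType) -> Result<Value, NotUtf8> {
    match type_ {
        // Binary-colored data is never interpreted as text.
        ByteStreamType::Binary => Ok(Value::Binary(bytes)),
        // String-colored data must decode as UTF-8 (assumption: error otherwise).
        ByteStreamType::String => String::from_utf8(bytes)
            .map(Value::String)
            .map_err(|_| NotUtf8),
        // Unknown keeps the old behavior: string if UTF-8, binary otherwise.
        ByteStreamType::Unknown => Ok(match String::from_utf8(bytes) {
            Ok(text) => Value::String(text),
            Err(err) => Value::Binary(err.into_bytes()),
        }),
    }
}
```

The point of the color is that a `Binary` stream containing only ASCII bytes still collects to binary, which is what makes the result predictable for type checking.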

A small number of commands were updated to take advantage of this, just
to prove the point. I will be adding more after this merges.

# User-Facing Changes
- New types in `describe`: `string (stream)`, `binary (stream)`
- These commands now return a stream if their input was a stream:
  - `into binary`
  - `into string`
  - `bytes collect`
  - `str join`
  - `first` (binary)
  - `last` (binary)
  - `take` (binary)
  - `skip` (binary)
- Streams that are explicitly colored as binary will print as a streaming
hexdump
  - example:
    ```nushell
    1.. | each { into binary } | bytes collect
    ```
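
The new type names and the re-coloring done by `into binary` and `into string` can be modeled roughly as follows. This is a sketch with hypothetical helper names (`retag_as_binary`, `retag_as_string`), not the actual command code. The `"byte stream"` name for `Unknown` is an assumption based on the wording of the previous error message.

```rust
/// Hypothetical stand-in for the stream's type color.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ByteStreamType {
    Binary,
    String,
    Unknown,
}

impl ByteStreamType {
    /// Name shown by `describe` and in error messages.
    fn describe(self) -> &'static str {
        match self {
            Self::Binary => "binary (stream)",
            Self::String => "string (stream)",
            // Assumption: uncolored streams keep the old generic name.
            Self::Unknown => "byte stream",
        }
    }
}

/// `into binary` on a stream: only the color changes, no data is read.
fn retag_as_binary(_current: ByteStreamType) -> ByteStreamType {
    ByteStreamType::Binary
}

/// `into string` on a stream: re-color unless the stream is already known to be binary.
fn retag_as_string(current: ByteStreamType) -> Result<ByteStreamType, &'static str> {
    if current != ByteStreamType::Binary {
        Ok(ByteStreamType::String)
    } else {
        // A known-binary stream needs an explicit decoding step instead.
        Err("can't convert binary to string; try using the `decode` command")
    }
}
```

Because only the color changes, neither conversion forces the stream to be collected, which is why these commands can now return a stream when given one.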

# Tests + Formatting
I've added some tests that cover this at a basic level, and it doesn't
break anything existing, but I do think more coverage would be nice. Some
of that will come when I modify more commands to stream.

# After Submitting
There are a few things I'm not quite satisfied with:

- **String trimming behavior.** We automatically trim newlines from
streams from external commands, but I don't think we should do this for
internal commands. If I call a command that happens to turn my string
into a stream, I don't want the newline to suddenly disappear. I changed
this so that trimming happens only for `Child` and `File` sources, but I'm
not sure this is quite right; maybe we should bring back the old
`trim_end_newline` flag.
- **Known binary always resulting in a hexdump.** It would be nice to
have a `print --raw`, so that we can put binary data on stdout
explicitly if we want to. This PR doesn't change how external commands
work though - they still dump straight to stdout.
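
The trimming rule from the first point above could look something like this sketch. The `Source` enum and `collect_string` function are hypothetical. The `Read` variant for internally produced streams, and the choice to trim exactly one trailing `\n` or `\r\n`, are assumptions of the sketch and not taken from the PR.

```rust
/// Hypothetical stand-in for where a byte stream's data comes from.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Source {
    /// Produced by an internal command (assumed variant name).
    Read,
    /// Read from a file.
    File,
    /// Output of an external child process.
    Child,
}

/// Finish collecting a stream into a string, trimming one trailing newline
/// only when the data came from a file or a child process.
fn collect_string(mut text: String, source: Source) -> String {
    if matches!(source, Source::File | Source::Child) && text.ends_with('\n') {
        text.pop();
        // Also drop the carriage return of a Windows-style line ending.
        if text.ends_with('\r') {
            text.pop();
        }
    }
    text
}
```

With this shape, a string that an internal command happens to turn into a stream keeps its trailing newline, while output from external commands is trimmed as before.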

Otherwise, here's the normal checklist:

- [ ] release notes
- [ ] docs update for plugin protocol changes (added `type` field)

---------

Co-authored-by: Ian Manske <ian.manske@pm.me>
Devyn Cairns
2024-05-19 17:35:32 -07:00
committed by GitHub
parent baeba19b22
commit c61075e20e
42 changed files with 1107 additions and 416 deletions


@@ -127,15 +127,18 @@ fn into_binary(
     let cell_paths = call.rest(engine_state, stack, 0)?;
     let cell_paths = (!cell_paths.is_empty()).then_some(cell_paths);
-    if let PipelineData::ByteStream(stream, ..) = input {
-        // TODO: in the future, we may want this to stream out, converting each to bytes
-        Ok(Value::binary(stream.into_bytes()?, head).into_pipeline_data())
+    if let PipelineData::ByteStream(stream, metadata) = input {
+        // Just set the type - that should be good enough
+        Ok(PipelineData::ByteStream(
+            stream.with_type(ByteStreamType::Binary),
+            metadata,
+        ))
     } else {
         let args = Arguments {
             cell_paths,
             compact: call.has_flag(engine_state, stack, "compact")?,
         };
-        operate(action, args, input, call.head, engine_state.ctrlc.clone())
+        operate(action, args, input, head, engine_state.ctrlc.clone())
     }
 }


@@ -103,7 +103,7 @@ fn into_cell_path(call: &Call, input: PipelineData) -> Result<PipelineData, Shel
         }
         PipelineData::ByteStream(stream, ..) => Err(ShellError::OnlySupportsThisInputType {
             exp_input_type: "list, int".into(),
-            wrong_type: "byte stream".into(),
+            wrong_type: stream.type_().describe().into(),
             dst_span: head,
             src_span: stream.span(),
         }),


@@ -156,9 +156,23 @@ fn string_helper(
     let cell_paths = call.rest(engine_state, stack, 0)?;
     let cell_paths = (!cell_paths.is_empty()).then_some(cell_paths);
-    if let PipelineData::ByteStream(stream, ..) = input {
-        // TODO: in the future, we may want this to stream out, converting each to bytes
-        Ok(Value::string(stream.into_string()?, head).into_pipeline_data())
+    if let PipelineData::ByteStream(stream, metadata) = input {
+        // Just set the type - that should be good enough. There is no guarantee that the data
+        // within a string stream is actually valid UTF-8. But refuse to do it if it was already set
+        // to binary
+        if stream.type_() != ByteStreamType::Binary {
+            Ok(PipelineData::ByteStream(
+                stream.with_type(ByteStreamType::String),
+                metadata,
+            ))
+        } else {
+            Err(ShellError::CantConvert {
+                to_type: "string".into(),
+                from_type: "binary".into(),
+                span: stream.span(),
+                help: Some("try using the `decode` command".into()),
+            })
+        }
     } else {
         let config = engine_state.get_config().clone();
         let args = Arguments {