Add string/binary type color to ByteStream (#12897)

# Description

This PR allows byte streams to optionally be colored as being
specifically binary or string data, which guarantees that they'll be
converted to `Binary` or `String` appropriately on `into_value()`,
making them compatible with `Type` guarantees. This makes them
significantly more broadly usable for command input and output.

There is still an `Unknown` type for byte streams coming from external
commands. It keeps the previous behavior: the data becomes a string if
it's valid UTF-8, and binary otherwise.
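
The effect of the three colors on collection can be sketched in isolation. This is a minimal illustration and not nushell's actual implementation: `StreamType`, `Collected`, and `collect_bytes` are hypothetical names, and returning an error for invalid UTF-8 in a string-colored stream is an assumption.

```rust
/// Hypothetical stand-in for the type color carried by a byte stream.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StreamType {
    Binary,
    String,
    Unknown,
}

/// Hypothetical stand-in for the value produced when a stream is collected.
#[derive(Debug, PartialEq, Eq)]
enum Collected {
    Binary(Vec<u8>),
    String(String),
}

/// Collect raw bytes into a value according to the stream's color.
fn collect_bytes(bytes: Vec<u8>, ty: StreamType) -> Result<Collected, String> {
    match ty {
        // Binary-colored streams always become binary, even if the bytes are valid UTF-8.
        StreamType::Binary => Ok(Collected::Binary(bytes)),
        // String-colored streams must be valid UTF-8 (assumption: invalid data is an error).
        StreamType::String => String::from_utf8(bytes)
            .map(Collected::String)
            .map_err(|err| format!("invalid UTF-8: {err}")),
        // Unknown streams keep the old rule: string if valid UTF-8, binary otherwise.
        StreamType::Unknown => Ok(match String::from_utf8(bytes) {
            Ok(text) => Collected::String(text),
            Err(err) => Collected::Binary(err.into_bytes()),
        }),
    }
}
```

Calling `collect_bytes(b"hi".to_vec(), StreamType::Binary)` yields binary even though the bytes are valid UTF-8; that fixed outcome is what lets a colored stream satisfy a declared type.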

A small number of commands were updated to take advantage of this, just
to prove the point. I will be adding more after this merges.

# User-Facing Changes
- New types in `describe`: `string (stream)`, `binary (stream)`
- These commands now return a stream if their input was a stream:
  - `into binary`
  - `into string`
  - `bytes collect`
  - `str join`
  - `first` (binary)
  - `last` (binary)
  - `take` (binary)
  - `skip` (binary)
- Streams that are explicitly binary colored will print as a streaming
hexdump
  - example:
    ```nushell
    1.. | each { into binary } | bytes collect
    ```
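
For `bytes collect` to return a stream, the separator has to be placed lazily between chunks instead of being appended after every chunk and popped off at the end. The commit does this with `Itertools::intersperse`; the sketch below shows the same idea using only the standard library. `join_chunks` and the `String` error type are hypothetical and stand in for nushell's own types.

```rust
/// Lazily yield each chunk, with the separator (if any) between consecutive chunks.
/// Nothing is buffered, so this also works for an unbounded input.
fn join_chunks<I>(
    chunks: I,
    separator: Option<Vec<u8>>,
) -> impl Iterator<Item = Result<Vec<u8>, String>>
where
    I: Iterator<Item = Result<Vec<u8>, String>>,
{
    chunks.enumerate().flat_map(move |(index, chunk)| {
        // Emit the separator before every chunk except the first one.
        let leading = match (&separator, index) {
            (Some(sep), i) if i > 0 => Some(Ok(sep.clone())),
            _ => None,
        };
        // Errors pass through unchanged as items of the output iterator.
        leading.into_iter().chain(std::iter::once(chunk))
    })
}
```

Because the separator only appears between items, an empty input produces an empty output without the special case the old code needed.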

# Tests + Formatting
I've added some tests to cover it at a basic level, and it doesn't break
anything existing, but I do think more would be nice. Some of those will
come when I modify more commands to stream.

# After Submitting
There are a few things I'm not quite satisfied with:

- **String trimming behavior.** We automatically trim newlines from
streams from external commands, but I don't think we should do this with
internal commands. If I call a command that happens to turn my string
into a stream, I don't want the newline to suddenly disappear. I changed
this to specifically do it only on `Child` and `File`, but I don't know
if this is quite right, and maybe we should bring back the old flag for
`trim_end_newline`.
- **Known binary always resulting in a hexdump.** It would be nice to
have a `print --raw`, so that we can put binary data on stdout
explicitly if we want to. This PR doesn't change how external commands
work though - they still dump straight to stdout.
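
The trimming rule described in the first point can be sketched as follows. This is only an illustration of the idea: `Source`, its `Internal` variant, and `finish_string` are hypothetical names, and trimming exactly one trailing `\n` or `\r\n` is an assumption about what "trim newlines" means here.

```rust
/// Hypothetical stand-in for where a byte stream's data comes from.
#[derive(Debug, Clone, Copy)]
enum Source {
    /// Produced by an internal command.
    Internal,
    /// Read from a file.
    File,
    /// Read from an external child process.
    Child,
}

/// Trim one trailing newline, but only for data from a file or a child process.
fn finish_string(mut text: String, source: Source) -> String {
    if matches!(source, Source::File | Source::Child) && text.ends_with('\n') {
        text.pop();
        // Also drop the carriage return of a Windows-style line ending.
        if text.ends_with('\r') {
            text.pop();
        }
    }
    text
}
```

With this rule, a string that an internal command happens to stream keeps its final newline, while output captured from an external command is trimmed as before.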

Otherwise, here's the normal checklist:

- [ ] release notes
- [ ] docs update for plugin protocol changes (added `type` field)

---------

Co-authored-by: Ian Manske <ian.manske@pm.me>

Author: Devyn Cairns
Date: 2024-05-19 17:35:32 -07:00
Committed by GitHub
Parent: baeba19b22
Commit: c61075e20e
42 changed files with 1107 additions and 416 deletions

@@ -1,3 +1,4 @@
+use itertools::Itertools;
 use nu_engine::command_prelude::*;
 
 #[derive(Clone, Copy)]
@@ -35,46 +36,33 @@ impl Command for BytesCollect {
         input: PipelineData,
     ) -> Result<PipelineData, ShellError> {
         let separator: Option<Vec<u8>> = call.opt(engine_state, stack, 0)?;
+
+        let span = call.head;
+
         // input should be a list of binary data.
-        let mut output_binary = vec![];
-        for value in input {
-            match value {
-                Value::Binary { mut val, .. } => {
-                    output_binary.append(&mut val);
-                    // manually concat
-                    // TODO: make use of std::slice::Join when it's available in stable.
-                    if let Some(sep) = &separator {
-                        let mut work_sep = sep.clone();
-                        output_binary.append(&mut work_sep)
-                    }
-                }
-                // Explicitly propagate errors instead of dropping them.
-                Value::Error { error, .. } => return Err(*error),
-                other => {
-                    return Err(ShellError::OnlySupportsThisInputType {
+        let metadata = input.metadata();
+        let iter = Itertools::intersperse(
+            input.into_iter_strict(span)?.map(move |value| {
+                // Everything is wrapped in Some in case there's a separator, so we can flatten
+                Some(match value {
+                    // Explicitly propagate errors instead of dropping them.
+                    Value::Error { error, .. } => Err(*error),
+                    Value::Binary { val, .. } => Ok(val),
+                    other => Err(ShellError::OnlySupportsThisInputType {
                         exp_input_type: "binary".into(),
                         wrong_type: other.get_type().to_string(),
-                        dst_span: call.head,
+                        dst_span: span,
                         src_span: other.span(),
-                    });
-                }
-            }
-        }
+                    }),
+                })
+            }),
+            Ok(separator).transpose(),
+        )
+        .flatten();
 
-        match separator {
-            None => Ok(Value::binary(output_binary, call.head).into_pipeline_data()),
-            Some(sep) => {
-                if output_binary.is_empty() {
-                    Ok(Value::binary(output_binary, call.head).into_pipeline_data())
-                } else {
-                    // have push one extra separator in previous step, pop them out.
-                    for _ in sep {
-                        let _ = output_binary.pop();
-                    }
-                    Ok(Value::binary(output_binary, call.head).into_pipeline_data())
-                }
-            }
-        }
+        let output = ByteStream::from_result_iter(iter, span, None, ByteStreamType::Binary);
+
+        Ok(PipelineData::ByteStream(output, metadata))
     }
 
     fn examples(&self) -> Vec<Example> {