Make from json --objects streaming (#12949)

# Description

Makes the `from json --objects` command produce a stream, and read
lazily from an input stream to produce its output.

Also added a helper, `PipelineData::get_type()`, to make it easier to
construct a wrong type error message when matching on `PipelineData`. I
expect checking `PipelineData` for either a string value or an `Unknown`
or `String` typed `ByteStream` will be very, very common. I would have
liked to have a helper that just returns a readable stream from either,
but that would either be a bespoke enum or a `Box<dyn BufRead>`, which
feels like it wouldn't be so great for performance. So instead, taking
the approach I did here is probably better - having a function that
accepts the `impl BufRead` and matching to use it.

# User-Facing Changes

- `from json --objects` no longer collects its input, and can be used
for large datasets or streams that produce values over time.

# Tests + Formatting
All passing.

# After Submitting
- [ ] release notes

---------

Co-authored-by: Ian Manske <ian.manske@pm.me>
This commit is contained in:
Devyn Cairns
2024-05-24 16:37:50 -07:00
committed by GitHub
parent 84b7a99adf
commit b06f31d3c6
3 changed files with 124 additions and 34 deletions

View File

@ -3,7 +3,7 @@ use crate::{
engine::{EngineState, Stack},
process::{ChildPipe, ChildProcess, ExitStatus},
ByteStream, ByteStreamType, Config, ErrSpan, ListStream, OutDest, PipelineMetadata, Range,
ShellError, Span, Value,
ShellError, Span, Type, Value,
};
use nu_utils::{stderr_write_all_and_flush, stdout_write_all_and_flush};
use std::{
@ -99,6 +99,24 @@ impl PipelineData {
}
}
/// Get a type that is representative of the `PipelineData`.
///
/// The type returned here makes no effort to collect a stream, so it may be a different type
/// than would be returned by [`Value::get_type()`] on the result of [`.into_value()`].
///
/// Specifically, a `ListStream` results in [`list stream`](Type::ListStream) rather than
/// the fully complete [`list`](Type::List) type (which would require knowing the contents),
/// and a `ByteStream` with [unknown](crate::ByteStreamType::Unknown) type results in
/// [`any`](Type::Any) rather than [`string`](Type::String) or [`binary`](Type::Binary).
pub fn get_type(&self) -> Type {
match self {
PipelineData::Empty => Type::Nothing,
PipelineData::Value(value, _) => value.get_type(),
PipelineData::ListStream(_, _) => Type::ListStream,
PipelineData::ByteStream(stream, _) => stream.type_().into(),
}
}
pub fn into_value(self, span: Span) -> Result<Value, ShellError> {
match self {
PipelineData::Empty => Ok(Value::nothing(span)),