Improve working with IntoValue and FromValue for byte collections (#13641)

<!--
if this PR closes one or more issues, you can automatically link the PR
with
them by using one of the [*linking
keywords*](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword),
e.g.
- this PR should close #xxxx
- fixes #xxxx

you can also mention related issues, PRs or discussions!
-->

# Description
<!--
Thank you for improving Nushell. Please, check our [contributing
guide](../CONTRIBUTING.md) and talk to the core team before making major
changes.

Description of your pull request goes here. **Provide examples and/or
screenshots** if your changes affect the user experience.
-->
I was working with byte collections like `Vec<u8>` and
[`bytes::Bytes`](https://docs.rs/bytes/1.7.1/bytes/struct.Bytes.html),
both are currently not possible to be used directly in a struct that
derives `IntoValue` and `FromValue` at the same time. The `Vec<u8>` will
convert itself into a `Value::List` but expects a `Value::String` or
`Value::Binary` to load from. I now also implemented that it can load
from `Value::List` just like the other `Vec<uX>` versions. For further
working with byte collections the type `bytes::Bytes` is wildly used,
therefore I added a implementation for it. `bytes` is already part of
the dependency graph as many crates (more than 5000 to crates.io) use
it.

# User-Facing Changes
<!-- List of all changes that impact the user experience here. This
helps us keep track of breaking changes. -->
User of `nu-protocol` as library, e.g. plugin developers, can now use
byte collections more easily in their data structures and derive
`IntoValue` and `FromValue` for it.

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use toolkit.nu; toolkit test stdlib"` to run the
tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->
I added a few tests that check that these byte collections are correctly
translated in and from `Value`. They live in `test_derive.rs` as part of
the `ByteContainer` and I also explicitely tested that `FromValue` for
`Vec<u8>` works as expected.

- 🟢 `toolkit fmt`
- 🟢 `toolkit clippy`
- 🟢 `toolkit test`
- 🟢 `toolkit test stdlib`

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->
Maybe it should be explored if `Value::Binary` should use `bytes::Bytes`
instead of `Vec<u8>`.
This commit is contained in:
Piepmatz 2024-08-23 00:59:00 +02:00 committed by GitHub
parent 43dcf19ac3
commit 712fec166d
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
6 changed files with 162 additions and 19 deletions

1
Cargo.lock generated
View File

@ -3364,6 +3364,7 @@ version = "0.97.2"
dependencies = [
"brotli",
"byte-unit",
"bytes",
"chrono",
"chrono-humanize",
"convert_case",

View File

@ -69,6 +69,7 @@ base64 = "0.22.1"
bracoxide = "0.1.2"
brotli = "5.0"
byteorder = "1.5"
bytes = "1"
bytesize = "1.3"
calamine = "0.24.0"
chardetng = "0.1.17"

View File

@ -19,6 +19,7 @@ nu-system = { path = "../nu-system", version = "0.97.2" }
nu-derive-value = { path = "../nu-derive-value", version = "0.97.2" }
brotli = { workspace = true, optional = true }
bytes = { workspace = true }
byte-unit = { version = "5.1", features = [ "serde" ] }
chrono = { workspace = true, features = [ "serde", "std", "unstable-locales" ], default-features = false }
chrono-humanize = { workspace = true }

View File

@ -1,13 +1,14 @@
use crate::{
ast::{CellPath, PathMember},
engine::Closure,
NuGlob, Range, Record, ShellError, Spanned, Type, Value,
NuGlob, Range, Record, ShellError, Span, Spanned, Type, Value,
};
use chrono::{DateTime, FixedOffset};
use std::{
any,
cmp::Ordering,
collections::{HashMap, VecDeque},
fmt,
path::PathBuf,
str::FromStr,
};
@ -221,20 +222,8 @@ macro_rules! impl_from_value_for_int {
"int should be within the valid range for {}",
stringify!($type)
),
i64::MIN..=MIN => ShellError::GenericError {
error: "Integer too small".to_string(),
msg: format!("{int} is smaller than {}", <$type>::MIN),
span: Some(span),
help: None,
inner: vec![],
},
MAX..=i64::MAX => ShellError::GenericError {
error: "Integer too large".to_string(),
msg: format!("{int} is larger than {}", <$type>::MAX),
span: Some(span),
help: None,
inner: vec![],
},
i64::MIN..=MIN => int_too_small_error(int, <$type>::MIN, span),
MAX..=i64::MAX => int_too_large_error(int, <$type>::MAX, span),
})
}
@ -417,14 +406,31 @@ impl FromValue for String {
}
}
// This impl is different from Vec<T> as it reads from Value::Binary and
// Value::String instead of Value::List.
// This also denies implementing FromValue for u8.
// This impl is different from Vec<T> as it allows reading from Value::Binary and Value::String too.
// This also denies implementing FromValue for u8 as it would be in conflict with the Vec<T> impl.
impl FromValue for Vec<u8> {
fn from_value(v: Value) -> Result<Self, ShellError> {
match v {
Value::Binary { val, .. } => Ok(val),
Value::String { val, .. } => Ok(val.into_bytes()),
Value::List { vals, .. } => {
const U8MIN: i64 = u8::MIN as i64;
const U8MAX: i64 = u8::MAX as i64;
let mut this = Vec::with_capacity(vals.len());
for val in vals {
let span = val.span();
let int = i64::from_value(val)?;
// calculating -1 on these ranges would be less readable
#[allow(overlapping_range_endpoints)]
#[allow(clippy::match_overlapping_arm)]
match int {
U8MIN..=U8MAX => this.push(int as u8),
i64::MIN..=U8MIN => return Err(int_too_small_error(int, U8MIN, span)),
U8MAX..=i64::MAX => return Err(int_too_large_error(int, U8MAX, span)),
};
}
Ok(this)
}
v => Err(ShellError::CantConvert {
to_type: Self::expected_type().to_string(),
from_type: v.get_type().to_string(),
@ -664,9 +670,51 @@ where
}
}
// Foreign Types
impl FromValue for bytes::Bytes {
fn from_value(v: Value) -> Result<Self, ShellError> {
match v {
Value::Binary { val, .. } => Ok(val.into()),
v => Err(ShellError::CantConvert {
to_type: Self::expected_type().to_string(),
from_type: v.get_type().to_string(),
span: v.span(),
help: None,
}),
}
}
fn expected_type() -> Type {
Type::Binary
}
}
// Use generics with `fmt::Display` to allow passing different kinds of integer
fn int_too_small_error(int: impl fmt::Display, min: impl fmt::Display, span: Span) -> ShellError {
ShellError::GenericError {
error: "Integer too small".to_string(),
msg: format!("{int} is smaller than {min}"),
span: Some(span),
help: None,
inner: vec![],
}
}
fn int_too_large_error(int: impl fmt::Display, max: impl fmt::Display, span: Span) -> ShellError {
ShellError::GenericError {
error: "Integer too large".to_string(),
msg: format!("{int} is larger than {max}"),
span: Some(span),
help: None,
inner: vec![],
}
}
#[cfg(test)]
mod tests {
use crate::{engine::Closure, FromValue, Record, Type};
use crate::{engine::Closure, FromValue, IntoValue, Record, Span, Type, Value};
use std::ops::Deref;
#[test]
fn expected_type_default_impl() {
@ -680,4 +728,34 @@ mod tests {
Type::Custom("Closure".to_string().into_boxed_str())
);
}
#[test]
fn from_value_vec_u8() {
let vec: Vec<u8> = vec![1, 2, 3];
let span = Span::test_data();
let string = "Hello Vec<u8>!".to_string();
assert_eq!(
Vec::<u8>::from_value(vec.clone().into_value(span)).unwrap(),
vec.clone(),
"Vec<u8> roundtrip"
);
assert_eq!(
Vec::<u8>::from_value(Value::test_string(string.clone()))
.unwrap()
.deref(),
string.as_bytes(),
"Vec<u8> from String"
);
assert_eq!(
Vec::<u8>::from_value(Value::test_binary(vec.clone())).unwrap(),
vec,
"Vec<u8> from Binary"
);
assert!(Vec::<u8>::from_value(vec![u8::MIN as i32 - 1].into_value(span)).is_err());
assert!(Vec::<u8>::from_value(vec![u8::MAX as i32 + 1].into_value(span)).is_err());
}
}

View File

@ -172,6 +172,14 @@ impl IntoValue for Value {
}
}
// Foreign Types
impl IntoValue for bytes::Bytes {
fn into_value(self, span: Span) -> Value {
Value::binary(self.to_vec(), span)
}
}
// TODO: use this type for all the `into_value` methods that types implement but return a Result
/// A trait for trying to convert a value into a `Value`.
///

View File

@ -1,4 +1,5 @@
use crate::{record, FromValue, IntoValue, Record, Span, Value};
use bytes::Bytes;
use std::collections::HashMap;
// Make nu_protocol available in this namespace, consumers of this crate will
@ -440,3 +441,56 @@ enum_rename_all! {
Flat: "flatcase" => ["alphaone", "betatwo", "charliethree"],
UpperFlat: "UPPERFLATCASE" => ["ALPHAONE", "BETATWO", "CHARLIETHREE"]
}
#[derive(IntoValue, FromValue, Debug, PartialEq)]
struct ByteContainer {
vec: Vec<u8>,
bytes: Bytes,
}
impl ByteContainer {
fn make() -> Self {
ByteContainer {
vec: vec![1, 2, 3],
bytes: Bytes::from_static(&[4, 5, 6]),
}
}
fn value() -> Value {
Value::test_record(record! {
"vec" => Value::test_list(vec![
Value::test_int(1),
Value::test_int(2),
Value::test_int(3),
]),
"bytes" => Value::test_binary(vec![4, 5, 6]),
})
}
}
#[test]
fn bytes_into_value() {
let expected = ByteContainer::value();
let actual = ByteContainer::make().into_test_value();
assert_eq!(expected, actual);
}
#[test]
fn bytes_from_value() {
let expected = ByteContainer::make();
let actual = ByteContainer::from_value(ByteContainer::value()).unwrap();
assert_eq!(expected, actual);
}
#[test]
fn bytes_roundtrip() {
let expected = ByteContainer::make();
let actual = ByteContainer::from_value(ByteContainer::make().into_test_value()).unwrap();
assert_eq!(expected, actual);
let expected = ByteContainer::value();
let actual = ByteContainer::from_value(ByteContainer::value())
.unwrap()
.into_test_value();
assert_eq!(expected, actual);
}