mirror of
https://github.com/nushell/nushell.git
synced 2025-04-23 20:58:19 +02:00
<!-- if this PR closes one or more issues, you can automatically link the PR with them by using one of the [*linking keywords*](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword), e.g. - this PR should close #xxxx - fixes #xxxx you can also mention related issues, PRs or discussions! --> # Description <!-- Thank you for improving Nushell. Please, check our [contributing guide](../CONTRIBUTING.md) and talk to the core team before making major changes. Description of your pull request goes here. **Provide examples and/or screenshots** if your changes affect the user experience. --> This PR adds type checking of all command input types at run-time. Generally, these errors should be caught by the parser, but sometimes we can't know the type of a value at parse-time. The simplest example is using the `echo` command, which has an output type of `any`, so prefixing a literal with `echo` will bypass parse-time type checking. Before this PR, each command has to individually check its input types. This can result in scenarios where the input/output types don't match the actual command behavior. This can cause valid usage with an non-`any` type to become a parse-time error if a command is missing that type in its pipeline input/output (`drop nth` and `history import` do this before this PR). Alternatively, a command may not list a type in its input/output types, but doesn't actually reject that type in its code, which can have unintended side effects (`get` does this on an empty pipeline input, and `sort` used to before #13154). After this PR, the type of the pipeline input is checked to ensure it matches one of the input types listed in the proceeding command's input/output types. While each of the issues in the "before this PR" section could be addressed with each command individually, this PR solves this issue for _all_ commands. **This will likely cause some breakage**, as some commands have incorrect input/output types, and should be adjusted. Also, some scripts may have erroneous usage of commands. In writing this PR, I discovered that `toolkit.nu` was passing `null` values to `str join`, which doesn't accept nothing types (if folks think it should, we can adjust it in this PR or in a different PR). I found some issues in the standard library and its tests. I also found that carapace's vendor script had an incorrect chaining of `get -i`: ```nushell let expanded_alias = (scope aliases | where name == $spans.0 | get -i 0 | get -i expansion) ``` Before this PR, if the `get -i 0` ever actually did evaluate to `null`, the second `get` invocation would error since `get` doesn't operate on `null` values. After this PR, this is immediately a run-time error, alerting the user to the problematic code. As a side note, we'll need to PR this fix (`get -i 0 | get -i expansion` -> `get -i 0.expansion`) to carapace. A notable exception to the type checking is commands with input type of `nothing -> <type>`. In this case, any input type is allowed. This allows piping values into the command without an error being thrown. For example, `123 | echo $in` would be an error without this exception. Additionally, custom types bypass type checking (I believe this also happens during parsing, but not certain) I added a `is_subtype` method to `Value` and `PipelineData`. It functions slightly differently than `get_type().is_subtype()`, as noted in the doccomments. Notably, it respects structural typing of lists and tables. For example, the type of a value `[{a: 123} {a: 456, b: 789}]` is a subtype of `table<a: int>`, whereas the type returned by `Value::get_type` is a `list<any>`. Similarly, `PipelineData` has some special handling for `ListStream`s and `ByteStream`s. The latter was needed for this PR to work properly with external commands. Here's some examples. Before: ```nu 1..2 | drop nth 1 Error: nu::parser::input_type_mismatch × Command does not support range input. ╭─[entry #9:1:8] 1 │ 1..2 | drop nth 1 · ────┬─── · ╰── command doesn't support range input ╰──── echo 1..2 | drop nth 1 # => ╭───┬───╮ # => │ 0 │ 1 │ # => ╰───┴───╯ ``` After this PR, I've adjusted `drop nth`'s input/output types to accept range input. Before this PR, zip accepted any value despite not being listed in its input/output types. This caused different behavior depending on if you triggered a parse error or not: ```nushell 1 | zip [2] # => Error: nu::parser::input_type_mismatch # => # => × Command does not support int input. # => ╭─[entry #3:1:5] # => 1 │ 1 | zip [2] # => · ─┬─ # => · ╰── command doesn't support int input # => ╰──── echo 1 | zip [2] # => ╭───┬───────────╮ # => │ 0 │ ╭───┬───╮ │ # => │ │ │ 0 │ 1 │ │ # => │ │ │ 1 │ 2 │ │ # => │ │ ╰───┴───╯ │ # => ╰───┴───────────╯ ``` After this PR, it works the same in both cases. For cases like this, if we do decide we want `zip` or other commands to accept any input value, then we should explicitly add that to the input types. ```nushell 1 | zip [2] # => Error: nu::parser::input_type_mismatch # => # => × Command does not support int input. # => ╭─[entry #3:1:5] # => 1 │ 1 | zip [2] # => · ─┬─ # => · ╰── command doesn't support int input # => ╰──── echo 1 | zip [2] # => Error: nu:🐚:only_supports_this_input_type # => # => × Input type not supported. # => ╭─[entry #14:2:6] # => 2 │ echo 1 | zip [2] # => · ┬ ─┬─ # => · │ ╰── only list<any> and range input data is supported # => · ╰── input type: int # => ╰──── ``` # User-Facing Changes <!-- List of all changes that impact the user experience here. This helps us keep track of breaking changes. --> **Breaking change**: The type of a command's input is now checked against the input/output types of that command at run-time. While these errors should mostly be caught at parse-time, in cases where they can't be detected at parse-time they will be caught at run-time instead. This applies to both internal commands and custom commands. Example function and corresponding parse-time error (same before and after PR): ```nushell def foo []: int -> nothing { print $"my cool int is ($in)" } 1 | foo # => my cool int is 1 "evil string" | foo # => Error: nu::parser::input_type_mismatch # => # => × Command does not support string input. # => ╭─[entry #16:1:17] # => 1 │ "evil string" | foo # => · ─┬─ # => · ╰── command doesn't support string input # => ╰──── # => ``` Before: ```nu echo "evil string" | foo # => my cool int is evil string ``` After: ```nu echo "evil string" | foo # => Error: nu:🐚:only_supports_this_input_type # => # => × Input type not supported. # => ╭─[entry #17:1:6] # => 1 │ echo "evil string" | foo # => · ──────┬────── ─┬─ # => · │ ╰── only int input data is supported # => · ╰── input type: string # => ╰──── ``` Known affected internal commands which erroneously accepted any type: * `str join` * `zip` * `reduce` # Tests + Formatting <!-- Don't forget to add tests that cover your changes. Make sure you've run and fixed any issues with these commands: - `cargo fmt --all -- --check` to check standard code formatting (`cargo fmt --all` applies these changes) - `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to check that you're using the standard code style - `cargo test --workspace` to check that all tests pass (on Windows make sure to [enable developer mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging)) - `cargo run -- -c "use toolkit.nu; toolkit test stdlib"` to run the tests for the standard library > **Note** > from `nushell` you can also use the `toolkit` as follows > ```bash > use toolkit.nu # or use an `env_change` hook to activate it automatically > toolkit check pr > ``` --> - 🟢 `toolkit fmt` - 🟢 `toolkit clippy` - 🟢 `toolkit test` - 🟢 `toolkit test stdlib` # After Submitting <!-- If your PR had any user-facing changes, update [the documentation](https://github.com/nushell/nushell.github.io) after the PR is merged, if necessary. This will help us keep the docs up to date. --> * Play whack-a-mole with the commands and scripts this will inevitably break
484 lines
12 KiB
Rust
484 lines
12 KiB
Rust
use nu_test_support::fs::Stub::FileWithContentToBeTrimmed;
|
||
use nu_test_support::playground::Playground;
|
||
use nu_test_support::{nu, pipeline};
|
||
|
||
#[test]
|
||
fn table_to_csv_text_and_from_csv_text_back_into_table() {
|
||
let actual = nu!(
|
||
cwd: "tests/fixtures/formats",
|
||
"open caco3_plastics.csv | to csv | from csv | first | get origin "
|
||
);
|
||
|
||
assert_eq!(actual.out, "SPAIN");
|
||
}
|
||
|
||
#[test]
|
||
fn table_to_csv_text() {
|
||
Playground::setup("filter_to_csv_test_1", |dirs, sandbox| {
|
||
sandbox.with_files(&[FileWithContentToBeTrimmed(
|
||
"csv_text_sample.txt",
|
||
r#"
|
||
importer,shipper,tariff_item,name,origin
|
||
Plasticos Rival,Reverte,2509000000,Calcium carbonate,Spain
|
||
Tigre Ecuador,OMYA Andina,3824909999,Calcium carbonate,Colombia
|
||
"#,
|
||
)]);
|
||
|
||
let actual = nu!(
|
||
cwd: dirs.test(), pipeline(
|
||
r#"
|
||
open csv_text_sample.txt
|
||
| lines
|
||
| str trim
|
||
| split column "," a b c d origin
|
||
| last 1
|
||
| to csv
|
||
| lines
|
||
| get 1
|
||
"#
|
||
));
|
||
|
||
assert!(actual
|
||
.out
|
||
.contains("Tigre Ecuador,OMYA Andina,3824909999,Calcium carbonate,Colombia"));
|
||
})
|
||
}
|
||
|
||
#[test]
|
||
fn table_to_csv_text_skipping_headers_after_conversion() {
|
||
Playground::setup("filter_to_csv_test_2", |dirs, sandbox| {
|
||
sandbox.with_files(&[FileWithContentToBeTrimmed(
|
||
"csv_text_sample.txt",
|
||
r#"
|
||
importer,shipper,tariff_item,name,origin
|
||
Plasticos Rival,Reverte,2509000000,Calcium carbonate,Spain
|
||
Tigre Ecuador,OMYA Andina,3824909999,Calcium carbonate,Colombia
|
||
"#,
|
||
)]);
|
||
|
||
let actual = nu!(
|
||
cwd: dirs.test(), pipeline(
|
||
r#"
|
||
open csv_text_sample.txt
|
||
| lines
|
||
| str trim
|
||
| split column "," a b c d origin
|
||
| last 1
|
||
| to csv --noheaders
|
||
"#
|
||
));
|
||
|
||
assert!(actual
|
||
.out
|
||
.contains("Tigre Ecuador,OMYA Andina,3824909999,Calcium carbonate,Colombia"));
|
||
})
|
||
}
|
||
|
||
#[test]
|
||
fn infers_types() {
|
||
Playground::setup("filter_from_csv_test_1", |dirs, sandbox| {
|
||
sandbox.with_files(&[FileWithContentToBeTrimmed(
|
||
"los_cuatro_mosqueteros.csv",
|
||
r#"
|
||
first_name,last_name,rusty_luck,d
|
||
Andrés,Robalino,1,d
|
||
JT,Turner,1,d
|
||
Yehuda,Katz,1,d
|
||
Jason,Gedge,1,d
|
||
"#,
|
||
)]);
|
||
|
||
let actual = nu!(
|
||
cwd: dirs.test(), pipeline(
|
||
r#"
|
||
open los_cuatro_mosqueteros.csv
|
||
| where rusty_luck > 0
|
||
| length
|
||
"#
|
||
));
|
||
|
||
assert_eq!(actual.out, "4");
|
||
})
|
||
}
|
||
|
||
#[test]
|
||
fn from_csv_text_to_table() {
|
||
Playground::setup("filter_from_csv_test_2", |dirs, sandbox| {
|
||
sandbox.with_files(&[FileWithContentToBeTrimmed(
|
||
"los_tres_caballeros.txt",
|
||
r#"
|
||
first_name,last_name,rusty_luck
|
||
Andrés,Robalino,1
|
||
JT,Turner,1
|
||
Yehuda,Katz,1
|
||
"#,
|
||
)]);
|
||
|
||
let actual = nu!(
|
||
cwd: dirs.test(), pipeline(
|
||
r#"
|
||
open los_tres_caballeros.txt
|
||
| from csv
|
||
| get rusty_luck
|
||
| length
|
||
"#
|
||
));
|
||
|
||
assert_eq!(actual.out, "3");
|
||
})
|
||
}
|
||
|
||
#[test]
|
||
fn from_csv_text_with_separator_to_table() {
|
||
Playground::setup("filter_from_csv_test_3", |dirs, sandbox| {
|
||
sandbox.with_files(&[FileWithContentToBeTrimmed(
|
||
"los_tres_caballeros.txt",
|
||
r#"
|
||
first_name;last_name;rusty_luck
|
||
Andrés;Robalino;1
|
||
JT;Turner;1
|
||
Yehuda;Katz;1
|
||
"#,
|
||
)]);
|
||
|
||
let actual = nu!(
|
||
cwd: dirs.test(), pipeline(
|
||
r#"
|
||
open los_tres_caballeros.txt
|
||
| from csv --separator ";"
|
||
| get rusty_luck
|
||
| length
|
||
"#
|
||
));
|
||
|
||
assert_eq!(actual.out, "3");
|
||
})
|
||
}
|
||
|
||
#[test]
|
||
fn from_csv_text_with_tab_separator_to_table() {
|
||
Playground::setup("filter_from_csv_test_4", |dirs, sandbox| {
|
||
sandbox.with_files(&[FileWithContentToBeTrimmed(
|
||
"los_tres_caballeros.txt",
|
||
r#"
|
||
first_name last_name rusty_luck
|
||
Andrés Robalino 1
|
||
JT Turner 1
|
||
Yehuda Katz 1
|
||
"#,
|
||
)]);
|
||
|
||
let actual = nu!(
|
||
cwd: dirs.test(), pipeline(
|
||
r#"
|
||
open los_tres_caballeros.txt
|
||
| from csv --separator (char tab)
|
||
| get rusty_luck
|
||
| length
|
||
"#
|
||
));
|
||
|
||
assert_eq!(actual.out, "3");
|
||
})
|
||
}
|
||
|
||
#[test]
|
||
#[ignore = "csv crate has a bug when the last line is a comment: https://github.com/BurntSushi/rust-csv/issues/363"]
|
||
fn from_csv_text_with_comments_to_table() {
|
||
Playground::setup("filter_from_csv_test_5", |dirs, sandbox| {
|
||
sandbox.with_files(&[FileWithContentToBeTrimmed(
|
||
"los_tres_caballeros.txt",
|
||
r#"
|
||
# This is a comment
|
||
first_name,last_name,rusty_luck
|
||
# This one too
|
||
Andrés,Robalino,1
|
||
Jonathan,Turner,1
|
||
Yehuda,Katz,1
|
||
# This one also
|
||
"#,
|
||
)]);
|
||
|
||
let actual = nu!(
|
||
cwd: dirs.test(), pipeline(
|
||
r##"
|
||
open los_tres_caballeros.txt
|
||
| from csv --comment "#"
|
||
| get rusty_luck
|
||
| length
|
||
"##
|
||
));
|
||
|
||
assert_eq!(actual.out, "3");
|
||
})
|
||
}
|
||
|
||
#[test]
|
||
fn from_csv_text_with_custom_quotes_to_table() {
|
||
Playground::setup("filter_from_csv_test_6", |dirs, sandbox| {
|
||
sandbox.with_files(&[FileWithContentToBeTrimmed(
|
||
"los_tres_caballeros.txt",
|
||
r#"
|
||
first_name,last_name,rusty_luck
|
||
'And''rés',Robalino,1
|
||
Jonathan,Turner,1
|
||
Yehuda,Katz,1
|
||
"#,
|
||
)]);
|
||
|
||
let actual = nu!(
|
||
cwd: dirs.test(), pipeline(
|
||
r#"
|
||
open los_tres_caballeros.txt
|
||
| from csv --quote "'"
|
||
| first
|
||
| get first_name
|
||
"#
|
||
));
|
||
|
||
assert_eq!(actual.out, "And'rés");
|
||
})
|
||
}
|
||
|
||
#[test]
|
||
fn from_csv_text_with_custom_escapes_to_table() {
|
||
Playground::setup("filter_from_csv_test_7", |dirs, sandbox| {
|
||
sandbox.with_files(&[FileWithContentToBeTrimmed(
|
||
"los_tres_caballeros.txt",
|
||
r#"
|
||
first_name,last_name,rusty_luck
|
||
"And\"rés",Robalino,1
|
||
Jonathan,Turner,1
|
||
Yehuda,Katz,1
|
||
"#,
|
||
)]);
|
||
|
||
let actual = nu!(
|
||
cwd: dirs.test(), pipeline(
|
||
r"
|
||
open los_tres_caballeros.txt
|
||
| from csv --escape '\'
|
||
| first
|
||
| get first_name
|
||
"
|
||
));
|
||
|
||
assert_eq!(actual.out, "And\"rés");
|
||
})
|
||
}
|
||
|
||
#[test]
|
||
fn from_csv_text_skipping_headers_to_table() {
|
||
Playground::setup("filter_from_csv_test_8", |dirs, sandbox| {
|
||
sandbox.with_files(&[FileWithContentToBeTrimmed(
|
||
"los_tres_amigos.txt",
|
||
r#"
|
||
Andrés,Robalino,1
|
||
JT,Turner,1
|
||
Yehuda,Katz,1
|
||
"#,
|
||
)]);
|
||
|
||
let actual = nu!(
|
||
cwd: dirs.test(), pipeline(
|
||
r#"
|
||
open los_tres_amigos.txt
|
||
| from csv --noheaders
|
||
| get column2
|
||
| length
|
||
"#
|
||
));
|
||
|
||
assert_eq!(actual.out, "3");
|
||
})
|
||
}
|
||
|
||
#[test]
|
||
fn from_csv_text_with_missing_columns_to_table() {
|
||
Playground::setup("filter_from_csv_test_9", |dirs, sandbox| {
|
||
sandbox.with_files(&[FileWithContentToBeTrimmed(
|
||
"los_tres_caballeros.txt",
|
||
r#"
|
||
first_name,last_name,rusty_luck
|
||
Andrés,Robalino
|
||
Jonathan,Turner,1
|
||
Yehuda,Katz,1
|
||
"#,
|
||
)]);
|
||
|
||
let actual = nu!(
|
||
cwd: dirs.test(), pipeline(
|
||
r#"
|
||
open los_tres_caballeros.txt
|
||
| from csv --flexible
|
||
| get -i rusty_luck
|
||
| compact
|
||
| length
|
||
"#
|
||
));
|
||
|
||
assert_eq!(actual.out, "2");
|
||
})
|
||
}
|
||
|
||
#[test]
|
||
fn from_csv_text_with_multiple_char_separator() {
|
||
Playground::setup("filter_from_csv_test_10", |dirs, sandbox| {
|
||
sandbox.with_files(&[FileWithContentToBeTrimmed(
|
||
"los_tres_caballeros.txt",
|
||
r#"
|
||
first_name,last_name,rusty_luck
|
||
Andrés,Robalino,1
|
||
Jonathan,Turner,1
|
||
Yehuda,Katz,1
|
||
"#,
|
||
)]);
|
||
|
||
let actual = nu!(
|
||
cwd: dirs.test(), pipeline(
|
||
r#"
|
||
open los_tres_caballeros.txt
|
||
| from csv --separator "li"
|
||
"#
|
||
));
|
||
|
||
assert!(actual
|
||
.err
|
||
.contains("separator should be a single char or a 4-byte unicode"));
|
||
})
|
||
}
|
||
|
||
#[test]
|
||
fn from_csv_text_with_wrong_type_separator() {
|
||
Playground::setup("filter_from_csv_test_11", |dirs, sandbox| {
|
||
sandbox.with_files(&[FileWithContentToBeTrimmed(
|
||
"los_tres_caballeros.txt",
|
||
r#"
|
||
first_name,last_name,rusty_luck
|
||
Andrés,Robalino,1
|
||
Jonathan,Turner,1
|
||
Yehuda,Katz,1
|
||
"#,
|
||
)]);
|
||
|
||
let actual = nu!(
|
||
cwd: dirs.test(), pipeline(
|
||
r#"
|
||
open los_tres_caballeros.txt
|
||
| from csv --separator ('123' | into int)
|
||
"#
|
||
));
|
||
|
||
assert!(actual.err.contains("can't convert int to string"));
|
||
})
|
||
}
|
||
|
||
#[test]
|
||
fn table_with_record_error() {
|
||
let actual = nu!(pipeline(
|
||
r#"
|
||
[[a b]; [1 2] [3 {a: 1 b: 2}]]
|
||
| to csv
|
||
"#
|
||
));
|
||
|
||
assert!(actual.err.contains("can't convert"))
|
||
}
|
||
|
||
#[test]
|
||
fn list_not_table_error() {
|
||
let actual = nu!(pipeline(
|
||
r#"
|
||
[{a: 1 b: 2} {a: 3 b: 4} 1]
|
||
| to csv
|
||
"#
|
||
));
|
||
|
||
assert!(actual.err.contains("Input type not supported"))
|
||
}
|
||
|
||
#[test]
|
||
fn string_to_csv_error() {
|
||
let actual = nu!(pipeline(
|
||
r#"
|
||
'qwe' | to csv
|
||
"#
|
||
));
|
||
|
||
assert!(actual.err.contains("command doesn't support"))
|
||
}
|
||
|
||
#[test]
|
||
fn parses_csv_with_unicode_sep() {
|
||
Playground::setup("filter_from_csv_unicode_sep_test_3", |dirs, sandbox| {
|
||
sandbox.with_files(&[FileWithContentToBeTrimmed(
|
||
"los_tres_caballeros.txt",
|
||
r#"
|
||
first_name;last_name;rusty_luck
|
||
Andrés;Robalino;1
|
||
JT;Turner;1
|
||
Yehuda;Katz;1
|
||
"#,
|
||
)]);
|
||
|
||
let actual = nu!(
|
||
cwd: dirs.test(), pipeline(
|
||
r#"
|
||
open los_tres_caballeros.txt
|
||
| from csv --separator "003B"
|
||
| get rusty_luck
|
||
| length
|
||
"#
|
||
));
|
||
|
||
assert_eq!(actual.out, "3");
|
||
})
|
||
}
|
||
|
||
#[test]
|
||
fn parses_csv_with_unicode_x1f_sep() {
|
||
Playground::setup("filter_from_csv_unicode_sep_x1f_test_3", |dirs, sandbox| {
|
||
sandbox.with_files(&[FileWithContentToBeTrimmed(
|
||
"los_tres_caballeros.txt",
|
||
r#"
|
||
first_namelast_namerusty_luck
|
||
AndrésRobalino1
|
||
JTTurner1
|
||
YehudaKatz1
|
||
"#,
|
||
)]);
|
||
|
||
let actual = nu!(
|
||
cwd: dirs.test(), pipeline(
|
||
r#"
|
||
open los_tres_caballeros.txt
|
||
| from csv --separator "001F"
|
||
| get rusty_luck
|
||
| length
|
||
"#
|
||
));
|
||
|
||
assert_eq!(actual.out, "3");
|
||
})
|
||
}
|
||
|
||
#[test]
|
||
fn from_csv_test_flexible_extra_vals() {
|
||
let actual = nu!(pipeline(
|
||
r#"
|
||
echo "a,b\n1,2,3" | from csv --flexible | first | values | to nuon
|
||
"#
|
||
));
|
||
assert_eq!(actual.out, "[1, 2, 3]");
|
||
}
|
||
|
||
#[test]
|
||
fn from_csv_test_flexible_missing_vals() {
|
||
let actual = nu!(pipeline(
|
||
r#"
|
||
echo "a,b\n1" | from csv --flexible | first | values | to nuon
|
||
"#
|
||
));
|
||
assert_eq!(actual.out, "[1]");
|
||
}
|