mirror of
https://github.com/nushell/nushell.git
synced 2025-04-29 07:34:28 +02:00
669b44ad7d
171 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
|
669b44ad7d
|
feat(polars): add polars truncate for rounding datetimes (#15582)
# Description
This PR directly ports the polars function `polars.Expr.dt.truncate` (https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.dt.truncate.html), which truncates (rounds down) each datetime to the start of a period of arbitrary length. This function is particularly useful when rounding to variable period lengths such as months or quarters. See below for examples.
```nushell
# Truncate a series of dates by period length
> seq date -b 2025-01-01 --periods 4 --increment 6wk -o "%Y-%m-%d %H:%M:%S" | polars into-df | polars as-datetime "%F %H:%M:%S" --naive | polars select datetime (polars col datetime | polars truncate 5d37m | polars as truncated) | polars collect
╭───┬───────────────────────┬───────────────────────╮
│ # │       datetime        │       truncated       │
├───┼───────────────────────┼───────────────────────┤
│ 0 │ 01/01/2025 12:00:00AM │ 12/30/2024 04:49:00PM │
│ 1 │ 02/12/2025 12:00:00AM │ 02/08/2025 09:45:00PM │
│ 2 │ 03/26/2025 12:00:00AM │ 03/21/2025 02:41:00AM │
│ 3 │ 05/07/2025 12:00:00AM │ 05/05/2025 08:14:00AM │
╰───┴───────────────────────┴───────────────────────╯

# Truncate based on period length measured in quarters and months
> seq date -b 2025-01-01 --periods 4 --increment 6wk -o "%Y-%m-%d %H:%M:%S" | polars into-df | polars as-datetime "%F %H:%M:%S" --naive | polars select datetime (polars col datetime | polars truncate 1q5mo | polars as truncated) | polars collect
╭───┬───────────────────────┬───────────────────────╮
│ # │       datetime        │       truncated       │
├───┼───────────────────────┼───────────────────────┤
│ 0 │ 01/01/2025 12:00:00AM │ 09/01/2024 12:00:00AM │
│ 1 │ 02/12/2025 12:00:00AM │ 09/01/2024 12:00:00AM │
│ 2 │ 03/26/2025 12:00:00AM │ 09/01/2024 12:00:00AM │
│ 3 │ 05/07/2025 12:00:00AM │ 05/01/2025 12:00:00AM │
╰───┴───────────────────────┴───────────────────────╯
```
# User-Facing Changes
No breaking changes. This PR introduces a new command, `polars truncate`.
# Tests + Formatting
Example test was added.
# After Submitting |
||
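For fixed-length periods such as `5d37m`, the truncated values in the #15582 example are consistent with flooring each timestamp to a whole multiple of the period counted from the Unix epoch. The sketch below reproduces the first table under that assumption, using only the Python standard library. `truncate_fixed` is a hypothetical helper, not part of polars or Nushell, and it does not handle calendar periods such as `1q5mo`.

```python
from datetime import datetime, timedelta

# Assumption: naive timestamps are bucketed relative to the Unix epoch.
EPOCH = datetime(1970, 1, 1)


def truncate_fixed(ts: datetime, every: timedelta) -> datetime:
    """Floor a naive datetime to the start of its epoch-aligned bucket."""
    buckets = (ts - EPOCH) // every  # whole periods elapsed since the epoch
    return EPOCH + buckets * every


every = timedelta(days=5, minutes=37)
for ts in (datetime(2025, 1, 1), datetime(2025, 2, 12)):
    print(ts, "->", truncate_fixed(ts, every))
# 2025-01-01 00:00:00 -> 2024-12-30 16:49:00
# 2025-02-12 00:00:00 -> 2025-02-08 21:45:00
```

Because the buckets are anchored at the epoch rather than at the first row, the truncated values do not line up with calendar boundaries unless the period itself does.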
|
8f81812ef9
|
fix cannot find issue when performing collect on an eager dataframe (#15577)
# Description Performing a `polars collect` on an eager dataframe should be a no-op. However, when it is used in a pipeline without saving the result to a variable, a cache error occurs. This PR fixes that cache error. |
||
|
a33650a69e
|
fix(polars): cast as date now returns Date type instead of Datetime<ns> (#15574)
# Description
This PR fixes the bug where various commands that cast a column as a `date` type would return `datetime<ns>` rather than the intended type `date`. Affected commands include `polars into-df --schema`, `polars into-lazy --schema`, `polars as-date`, and `polars cast date`. This bug derives from the fact that Nushell uses the `date` type to denote a datetime type whereas polars differentiates between `Date` and `Datetime` types. By default, this PR retains the behavior that a Nushell `date` type will be mapped to a polars `Datetime<ns>` unless otherwise specified.
```nushell
# Current (erroneous) implementation
> [[a]; [2025-03-20]] | polars into-df --schema {a: "date"} | polars schema
╭───┬──────────────╮
│ a │ datetime<ns> │
╰───┴──────────────╯

# Fixed implementation
> [[a]; [2025-03-20]] | polars into-df --schema {a: "date"} | polars schema
╭───┬──────╮
│ a │ date │
╰───┴──────╯

# Fixed implementation: by default, Nushell dates map to datetime<ns>
> [[a]; [2025-03-20]] | polars into-df | polars schema
╭───┬───────────────────╮
│ a │ datetime<ns, UTC> │
╰───┴───────────────────╯
```
# User-Facing Changes
Soft breaking change: users who previously wanted to cast a date column to type `date` can now expect the output to be type `date` instead of `datetime<ns>`.
# Tests + Formatting
Example test added to `polars as-date` command.
# After Submitting |
||
|
89322f59f2
|
Fix output type of polars schema (#15572)
# Description The output type in the `polars schema` signature is dataframe. It should be record. # User-Facing Changes - `polars schema` now has an output type of record |
||
|
4e307480e4
|
polars : extend NuExpression::extract_exprs to handle records (#15553)
# Description
This PR seeks to simplify the syntax for commands that handle a list of expressions (e.g., `select`, `with-column`, and `agg`) by letting the user replace a list of expressions, each aliased with `polars as`, with a single record in which each key is the alias for its value. See below for examples in several contexts.
```nushell
# Select a column from a dataframe using a record
> [[a b]; [6 2] [4 2] [2 2]] | polars into-df | polars select {c: ((polars col a) * 2)}
╭───┬────╮
│ # │ c  │
├───┼────┤
│ 0 │ 12 │
│ 1 │  8 │
│ 2 │  4 │
╰───┴────╯

# Select a column from a dataframe using a mix of expressions and record of expressions
> [[a b]; [6 2] [4 2] [2 2]] | polars into-df | polars select a b {c: ((polars col a) * 2)}
╭───┬───┬───┬────╮
│ # │ a │ b │ c  │
├───┼───┼───┼────┤
│ 0 │ 6 │ 2 │ 12 │
│ 1 │ 4 │ 2 │  8 │
│ 2 │ 2 │ 2 │  4 │
╰───┴───┴───┴────╯

# Add series to the dataframe using a record
> [[a b]; [1 2] [3 4]] | polars into-lazy | polars with-column { c: ((polars col a) * 2) d: ((polars col a) * 3) } | polars collect
╭───┬───┬───┬───┬───╮
│ # │ a │ b │ c │ d │
├───┼───┼───┼───┼───┤
│ 0 │ 1 │ 2 │ 2 │ 3 │
│ 1 │ 3 │ 4 │ 6 │ 9 │
╰───┴───┴───┴───┴───╯

# Group by and perform an aggregation using a record
> [[a b]; [1 2] [1 4] [2 6] [2 4]] | polars into-lazy | polars group-by a | polars agg { b_min: (polars col b | polars min) b_max: (polars col b | polars max) b_sum: (polars col b | polars sum) } | polars collect | polars sort-by a
╭───┬───┬───────┬───────┬───────╮
│ # │ a │ b_min │ b_max │ b_sum │
├───┼───┼───────┼───────┼───────┤
│ 0 │ 1 │     2 │     4 │     6 │
│ 1 │ 2 │     4 │     6 │    10 │
╰───┴───┴───────┴───────┴───────╯
```
# User-Facing Changes
No breaking changes. Users can now use a mix of lists of expressions and records of expressions where previously only lists of expressions were accepted (e.g., in `select`, `with-column`, and `agg`).
# Tests + Formatting
Example tests were added to `select`, `with-column`, and `agg`.
# After Submitting |
||
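The record syntax from #15553 amounts to flattening a mixed argument list into (alias, expression) pairs. The following is a rough Python illustration of that idea only. It is not the Rust code behind `NuExpression::extract_exprs`; expressions are modelled as plain strings, and the function name is reused here purely for readability.

```python
# Assumption: an expression is modelled as its text; a record is a dict.
def extract_exprs(args: list) -> list:
    """Flatten a mix of expressions and {alias: expression} records."""
    out = []
    for arg in args:
        if isinstance(arg, dict):
            # Each record key becomes the alias of its value.
            out.extend((alias, expr) for alias, expr in arg.items())
        else:
            # A bare expression carries no explicit alias.
            out.append((None, arg))
    return out


print(extract_exprs(["a", "b", {"c": "col(a) * 2"}]))
# [(None, 'a'), (None, 'b'), ('c', 'col(a) * 2')]
```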
|
ceaa0f9375
|
polars : add new command polars over (#15551)
# Description
Introducing a basic implementation of the polars expression for window functions: `over` (https://docs.pola.rs/api/python/stable/reference/expressions/api/polars.Expr.over.html). Note that this PR only implements the default values for the sorting and `mapping_strategy` parameters. Implementations for other values for these parameters may be added in a future PR, as the demand arises.
```nushell
# Compute expression over an aggregation window
> [[a b]; [x 2] [x 4] [y 6] [y 4]] | polars into-lazy | polars select a (polars col b | polars cumulative sum | polars over a | polars as cum_b) | polars collect
╭───┬───┬───────╮
│ # │ a │ cum_b │
├───┼───┼───────┤
│ 0 │ x │     2 │
│ 1 │ x │     6 │
│ 2 │ y │     6 │
│ 3 │ y │    10 │
╰───┴───┴───────╯

# Compute expression over an aggregation window where partitions are defined by expressions
> [[a b]; [x 2] [X 4] [Y 6] [y 4]] | polars into-lazy | polars select a (polars col b | polars cumulative sum | polars over (polars col a | polars lowercase) | polars as cum_b) | polars collect
╭───┬───┬───────╮
│ # │ a │ cum_b │
├───┼───┼───────┤
│ 0 │ x │     2 │
│ 1 │ X │     6 │
│ 2 │ Y │     6 │
│ 3 │ y │    10 │
╰───┴───┴───────╯
```
# User-Facing Changes
No breaking changes. This PR seeks to add a new command only.
# Tests + Formatting
Example tests are included.
# After Submitting |
||
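The window semantics shown in #15551 can be pictured as a running sum that restarts for each partition key while the rows keep their original order. Here is a small standard-library sketch of that behaviour; `cumulative_sum_over` is a hypothetical name, and it covers only the cumulative-sum case from the examples.

```python
from collections import defaultdict


def cumulative_sum_over(keys, values):
    """Running sum of `values`, tracked separately per key, in row order."""
    running = defaultdict(int)
    out = []
    for key, value in zip(keys, values):
        running[key] += value
        out.append(running[key])
    return out


print(cumulative_sum_over(["x", "x", "y", "y"], [2, 4, 6, 4]))
# [2, 6, 6, 10]

# Partitions defined by an expression: lowercase the keys first.
keys = [k.lower() for k in ["x", "X", "Y", "y"]]
print(cumulative_sum_over(keys, [2, 4, 6, 4]))
# [2, 6, 6, 10]
```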
|
d31b7024d8
|
polars : update get- datetime components commands to allow expressions as inputs (#15557)
# Description
This PR updates the following functions so they may also be used in a polars expression:
- `polars get-day`
- `polars get-hour`
- `polars get-minute`
- `polars get-month`
- `polars get-nanosecond`
- `polars get-ordinal`
- `polars get-second`
- `polars get-week`
- `polars get-weekday`
- `polars get-year`

The examples below compare the two contexts in which each of these commands may be used:
```nushell
# Returns day from a date (current use case)
> let dt = ('2020-08-04T16:39:18+00:00' | into datetime --timezone 'UTC'); let df = ([$dt $dt] | polars into-df); $df | polars get-day
╭───┬───╮
│ # │ 0 │
├───┼───┤
│ 0 │ 4 │
│ 1 │ 4 │
╰───┴───╯

# Returns day from a date in an expression (additional use case provided by this PR)
> let dt = ('2020-08-04T16:39:18+00:00' | into datetime --timezone 'UTC'); let df = ([$dt $dt] | polars into-df); $df | polars select (polars col 0 | polars get-day)
╭───┬───╮
│ # │ 0 │
├───┼───┤
│ 0 │ 4 │
│ 1 │ 4 │
╰───┴───╯
```
# User-Facing Changes
No breaking changes. Each of these functions retains its current behavior and can now also be used in an expression.
# Tests + Formatting
Tests have been added to each of the examples.
# After Submitting |
||
|
9dd30d7756
|
polars : update polars lit to handle nushell Value::Duration and Value::Date types (#15564)
# Description
This PR seeks to expand `polars lit` to handle additional nushell types: Value::Date and Value::Duration. This change is especially relevant to the `polars filter` command, where expressions would then directly incorporate Value::Date and Value::Duration types as literals. See one such example below.
```nushell
# Filter dataframe for rows where dt is within the last 2 days of the maximum dt value
> [[dt val]; [2025-04-01 1] [2025-04-02 2] [2025-04-03 3] [2025-04-04 4]] | polars into-df | polars filter ((polars col dt) > ((polars col dt | polars max | $in - 2day)))
╭───┬─────────────────────┬─────╮
│ # │         dt          │ val │
├───┼─────────────────────┼─────┤
│ 0 │ 04/03/25 12:00:00AM │   3 │
│ 1 │ 04/04/25 12:00:00AM │   4 │
╰───┴─────────────────────┴─────╯
```
# User-Facing Changes
No breaking changes. Users now can directly access Value::Date and Value::Duration types as literals in polars expressions.
# Tests + Formatting
Several additional examples added to `polars lit` and `polars filter`
# After Submitting |
||
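The filter in the #15564 example keeps rows whose `dt` is strictly later than the maximum `dt` minus a two-day duration. The same arithmetic, sketched with the Python standard library (the helper name `within_window` is made up for this illustration):

```python
from datetime import datetime, timedelta


def within_window(rows, window: timedelta):
    """Keep (dt, val) rows whose dt is strictly after max(dt) - window."""
    cutoff = max(dt for dt, _ in rows) - window
    return [(dt, val) for dt, val in rows if dt > cutoff]


rows = [(datetime(2025, 4, day), day) for day in (1, 2, 3, 4)]
print([val for _, val in within_window(rows, timedelta(days=2))])
# [3, 4]
```

The comparison is strict, so the row dated exactly two days before the maximum (2025-04-02) is excluded, matching the table above.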
|
885b87a842
|
polars : add new command polars convert-time-zone (#15550)
# Description
This is a direct port of the python polars command `convert_time_zone` (https://docs.pola.rs/api/python/stable/reference/series/api/polars.Series.dt.convert_time_zone.html). Consistent with the rust/python implementation, naive datetimes are treated as if they are in UTC time.
```nushell
# Convert timezone for timezone-aware datetime
> ["2025-04-10 09:30:00 -0400" "2025-04-10 10:30:00 -0400"] | polars into-df | polars as-datetime "%Y-%m-%d %H:%M:%S %z" | polars select (polars col datetime | polars convert-time-zone "Europe/Lisbon")
╭───┬───────────────────────╮
│ # │       datetime        │
├───┼───────────────────────┤
│ 0 │ 04/10/2025 02:30:00PM │
│ 1 │ 04/10/2025 03:30:00PM │
╰───┴───────────────────────╯

# Timezone conversions for timezone-naive datetime will assume the original timezone is UTC
> ["2025-04-10 09:30:00" "2025-04-10 10:30:00"] | polars into-df | polars as-datetime "%Y-%m-%d %H:%M:%S" --naive | polars select (polars col datetime | polars convert-time-zone "America/New_York")
╭───┬───────────────────────╮
│ # │       datetime        │
├───┼───────────────────────┤
│ 0 │ 04/10/2025 05:30:00AM │
│ 1 │ 04/10/2025 06:30:00AM │
╰───┴───────────────────────╯
```
# User-Facing Changes
No breaking changes. Users have access to a new command `polars convert-time-zone`
# Tests + Formatting
Example tests have been added.
# After Submitting |
||
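The two #15550 examples can be checked by hand with fixed UTC offsets. The sketch below assumes UTC+01:00 for Europe/Lisbon and UTC-04:00 for America/New_York on 2025-04-10 (summer time in both places) instead of consulting a time zone database; `convert_time_zone` here is a hypothetical Python helper, not the polars function.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offsets for 2025-04-10; a real conversion would use tz data.
LISBON = timezone(timedelta(hours=1))
NEW_YORK = timezone(timedelta(hours=-4))


def convert_time_zone(ts: datetime, target: timezone) -> datetime:
    """Express the same instant in another zone; naive input is taken as UTC."""
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)
    return ts.astimezone(target)


aware = datetime.strptime("2025-04-10 09:30:00 -0400", "%Y-%m-%d %H:%M:%S %z")
print(convert_time_zone(aware, LISBON))    # 2025-04-10 14:30:00+01:00

naive = datetime(2025, 4, 10, 9, 30)
print(convert_time_zone(naive, NEW_YORK))  # 2025-04-10 05:30:00-04:00
```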
|
1a0778d77e
|
polars : add new command polars replace-time-zone (#15538)
# Description
This PR seeks to add a direct port of the python polars `replace_time_zone` command in the `dt` namespace (https://docs.pola.rs/api/python/stable/reference/series/api/polars.Series.dt.replace_time_zone.html). Please note: I opted for two keywords "dt" and "replace-time-zone" to map directly with the implementation in both the rust and python packages, but I'm open to simplifying it to just one keyword, or `polars replace-time-zone`
```nushell
# Apply timezone to a naive datetime
> ["2021-12-30 00:00:00" "2021-12-31 00:00:00"] | polars into-df | polars as-datetime "%Y-%m-%d %H:%M:%S" --naive | polars select (polars col datetime | polars dt replace-time-zone "America/New_York")
╭───┬─────────────────────╮
│ # │      datetime       │
├───┼─────────────────────┤
│ 0 │ 12/30/21 12:00:00AM │
│ 1 │ 12/31/21 12:00:00AM │
╰───┴─────────────────────╯

# Apply timezone with ambiguous datetime
> ["2025-11-02 00:00:00", "2025-11-02 01:00:00", "2025-11-02 02:00:00", "2025-11-02 03:00:00"] | polars into-df | polars as-datetime "%Y-%m-%d %H:%M:%S" --naive | polars select (polars col datetime | polars dt replace-time-zone "America/New_York" --ambiguous null)
╭───┬─────────────────────╮
│ # │      datetime       │
├───┼─────────────────────┤
│ 0 │ 11/02/25 12:00:00AM │
│ 1 │                     │
│ 2 │ 11/02/25 02:00:00AM │
│ 3 │ 11/02/25 03:00:00AM │
╰───┴─────────────────────╯

# Apply timezone with nonexistent datetime
> ["2025-03-09 01:00:00", "2025-03-09 02:00:00", "2025-03-09 03:00:00", "2025-03-09 04:00:00"] | polars into-df | polars as-datetime "%Y-%m-%d %H:%M:%S" --naive | polars select (polars col datetime | polars dt replace-time-zone "America/New_York" --nonexistent null)
╭───┬─────────────────────╮
│ # │      datetime       │
├───┼─────────────────────┤
│ 0 │ 03/09/25 01:00:00AM │
│ 1 │                     │
│ 2 │ 03/09/25 03:00:00AM │
│ 3 │ 03/09/25 04:00:00AM │
╰───┴─────────────────────╯
```
# User-Facing Changes
No breaking changes. The user will be able to access the new command.
# Tests + Formatting
See example tests.
# After Submitting |
||
|
f8ed4b45fd
|
Introducing polars into-schema (#15534)
# Description
Introduces `polars into-schema`, which converts Values such as records to a schema. Today this conversion happens implicitly when records are passed into commands like `polars into-df`. The new command lets you convert to a schema object ahead of time and reuse it, which can be useful for guaranteeing that the schema is correct.
```nu
> let schema = ({name: str, type: str} | polars into-schema)
> ls | select name type | polars into-lazy -s $schema | polars schema
╭──────┬─────╮
│ name │ str │
│ type │ str │
╰──────┴─────╯
```
# User-Facing Changes
- Introduces `polars into-schema` allowing records to be converted to schema objects. |
||
|
b0f9cda9b5
|
Introduction of NuDataType and polars dtype (#15529)
# Description
This pull request does a lot of the heavy lifting needed to support more complex dtypes like categorical dtypes. It introduces a new CustomValue, NuDataType, and makes NuSchema a full CustomValue. Furthermore, it introduces a new command `polars into-dtype` that allows a dtype to be created. This can then be passed into schemas when they are created.
```nu
> let dt = ("str" | polars into-dtype)
> [[a b]; ["one" "two"]] | polars into-df -s {a: $dt, b: str} | polars schema
╭───┬─────╮
│ a │ str │
│ b │ str │
╰───┴─────╯
```
# User-Facing Changes
- Introduces new command `polars into-dtype`, allows dtype variables to be passed in during schema creation. |
||
|
c0b944edb6
|
build(deps): bump indexmap from 2.8.0 to 2.9.0 (#15531)
Bumps [indexmap](https://github.com/indexmap-rs/indexmap) from 2.8.0 to 2.9.0. <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/indexmap-rs/indexmap/blob/main/RELEASES.md">indexmap's changelog</a>.</em></p> <blockquote> <h2>2.9.0 (2025-04-04)</h2> <ul> <li>Added a <code>get_disjoint_mut</code> method to <code>IndexMap</code>, matching Rust 1.86's <code>HashMap</code> method.</li> <li>Added a <code>get_disjoint_indices_mut</code> method to <code>IndexMap</code> and <code>map::Slice</code>, matching Rust 1.86's <code>get_disjoint_mut</code> method on slices.</li> <li>Deprecated the <code>borsh</code> feature in favor of their own <code>indexmap</code> feature, solving a cyclic dependency that occured via <code>borsh-derive</code>.</li> </ul> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href=" |
||
|
147009a161
|
polars into-df /polars into-lazy : --schema will not throw error if only some columns are defined (#15473)
# Description
The current implementation of `polars into-df` and `polars into-lazy` will throw an error if `--schema` is provided but not all columns are defined. This PR seeks to remove this requirement so that when a partial `--schema` is provided, the types on the defined columns are overridden while the remaining columns take on their default types.

**Current Implementation**
```
$ [[a b]; [1 "foo"] [2 "bar"]] | polars into-df -s {a: str} | polars schema
Error:   × Schema does not contain column: b
   ╭─[entry #88:1:12]
 1 │ [[a b]; [1 "foo"] [2 "bar"]] | polars into-df -s {a: str} | polars schema
   ·            ─────
   ╰────
```
**New Implementation (no error thrown on partial schema definition)**

Column b is not defined in `--schema`
```
$ [[a b]; [1 "foo"] [2 "bar"]] | polars into-df --schema {a: str} | polars schema
╭───┬─────╮
│ a │ str │
│ b │ str │
╰───┴─────╯
```
# User-Facing Changes
Soft breaking change: The user's previous (erroneous) code that would have thrown an error would no longer throw an error. The user's previous working code will still work.
# Tests + Formatting
# After Submitting |
||
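The partial-schema behaviour described in #15473 is essentially "inferred types, overridden where the user supplied one". A minimal Python sketch of that merge follows. The inferred dtype `i64` for column `a` is an assumption made for illustration, `apply_partial_schema` is a hypothetical name, and the sketch says nothing about how the plugin treats schema entries for columns that are absent from the data.

```python
def apply_partial_schema(inferred: dict, overrides: dict) -> dict:
    """Use the override for a column when given, else keep the inferred dtype."""
    return {name: overrides.get(name, dtype) for name, dtype in inferred.items()}


inferred = {"a": "i64", "b": "str"}  # assumed defaults for [[a b]; [1 "foo"] [2 "bar"]]
print(apply_partial_schema(inferred, {"a": "str"}))
# {'a': 'str', 'b': 'str'}
```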
|
1c6c85d35d
|
Fix clippy (#15489)
# Description Clippy (version 0.1.86) reports some errors in the nushell repo. This PR fixes them. # User-Facing Changes Hopefully none. # Tests + Formatting N/A # After Submitting N/A |
||
|
7ca2a6f8ac
|
FIX polars as-datetime : ignores timezone information on conversion (#15490)
# Description This PR seeks to fix an error in `polars as-datetime` where timezone information is entirely ignored. This behavior raises a host of silent errors when dealing with datetime conversions (see example below). ## Current Implementation Timezones are entirely ignored and datetimes with different timezones are converted to the same naive datetimes even when the user specifically indicates that the timezone should be parsed. For example, "2021-12-30 00:00:00 +0000" and "2021-12-30 00:00:00 -0400" will both be parsed to "2021-12-30 00:00:00" even when the format string specifically includes "%z". ``` $ ["2021-12-30 00:00:00 +0000" "2021-12-30 00:00:00 -0400"] | polars into-df | polars as-datetime "%Y-%m-%d %H:%M:%S %z" ╭───┬───────────────────────╮ │ # │ datetime │ ├───┼───────────────────────┤ │ 0 │ 12/30/2021 12:00:00AM │ │ 1 │ 12/30/2021 12:00:00AM │ <-- Same datetime even though the first is +0000 and second is -0400 ╰───┴───────────────────────╯ $ ["2021-12-30 00:00:00 +0000" "2021-12-30 00:00:00 -0400"] | polars into-df | polars as-datetime "%Y-%m-%d %H:%M:%S %z" | polars schema ╭──────────┬──────────────╮ │ datetime │ datetime<ns> │ ╰──────────┴──────────────╯ ``` ## New Implementation Datetimes are converted to UTC and timezone information is retained. ``` $ "2021-12-30 00:00:00 +0000" "2021-12-30 00:00:00 -0400"] | polars into-df | polars as-datetime "%Y-%m-%d %H:%M:%S %z" ╭───┬───────────────────────╮ │ # │ datetime │ ├───┼───────────────────────┤ │ 0 │ 12/30/2021 12:00:00AM │ │ 1 │ 12/30/2021 04:00:00AM │ <-- Converted to UTC ╰───┴───────────────────────╯ $ ["2021-12-30 00:00:00 +0000" "2021-12-30 00:00:00 -0400"] | polars into-df | polars as-datetime "%Y-%m-%d %H:%M:%S %z" | polars schema ╭──────────┬───────────────────╮ │ datetime │ datetime<ns, UTC> │ ╰──────────┴───────────────────╯ ``` The user may intentionally ignore timezone information by setting the `--naive` flag. 
``` $ ["2021-12-30 00:00:00 +0000" "2021-12-30 00:00:00 -0400"] | polars into-df | polars as-datetime "%Y-%m-%d %H:%M:%S %z" --naive ╭───┬───────────────────────╮ │ # │ datetime │ ├───┼───────────────────────┤ │ 0 │ 12/30/2021 12:00:00AM │ │ 1 │ 12/30/2021 12:00:00AM │ <-- the -0400 offset is ignored when --naive is set ╰───┴───────────────────────╯ $ ["2021-12-30 00:00:00 +0000" "2021-12-30 00:00:00 -0400"] | polars into-df | polars as-datetime "%Y-%m-%d %H:%M:%S %z" --naive | polars schema ╭──────────┬──────────────╮ │ datetime │ datetime<ns> │ ╰──────────┴──────────────╯ ``` # User-Facing Changes `polars as-datetime` will now account for timezone information and return type `datetime<ns, UTC>` rather than `datetime<ns>` by default. The user can replicate the previous behavior by setting `--naive`. # Tests + Formatting Tests that incorporated `polars as-datetime` had to be tweaked to include the `--naive` flag to replicate previous behavior. |
||
|
2bf0397d80
|
bump to the latest rust version (#15483)
# Description This PR bumps nushell to use the latest rust version 1.84.1. |
||
|
470d130289
|
polars cast : add decimal option for dtype parameter (#15464)
# Description This PR expands the `dtype` parameter of the `polars cast` command to include the `decimal<precision, scale>` type. Setting precision to "*" will cause the value to be inferred. Note, however, that setting scale to a non-integer value will throw an explicit error (the underlying polars crate assigns scale = 0 in such a case, but I opted for throwing an error instead). ``` $ [[a b]; [1 2] [3 4]] | polars into-df | polars cast decimal<4,2> a | polars schema ╭───┬──────────────╮ │ a │ decimal<4,2> │ │ b │ i64 │ ╰───┴──────────────╯ $ [[a b]; [10.5 2] [3.1 4]] | polars into-df | polars cast decimal<*,2> a | polars schema ╭───┬──────────────╮ │ a │ decimal<*,2> │ │ b │ i64 │ ╰───┴──────────────╯ $ [[a b]; [10.05 2] [3.1 4]] | polars into-df | polars cast decimal<5,*> a | polars schema Error: × Invalid polars data type ╭─[entry #25:1:47] 1 │ [[a b]; [10.05 2] [3.1 4]] | polars into-df | polars cast decimal<5,*> a | polars schema · ─────┬───── · ╰── `*` is not a permitted value for scale ╰──── ``` # User-Facing Changes There are no breaking changes. The user has the additional option to `polars cast` to a decimal type. # Tests + Formatting Tests have been added to `nu_plugin_polars/src/dataframe/values/nu_schema.rs` |
||
|
eaf522b41f
|
Polars cut (#15431)
- fixes #15366 # Description Introducing binning commands, `polars cut` and `polars qcut` # User-Facing Changes - New command `polars cut` - New command `polars qcut` |
||
|
1979b61a92
|
build(deps): bump tokio from 1.43.0 to 1.44.1 (#15419) | ||
|
946cef77f1
|
build(deps): bump uuid from 1.12.0 to 1.16.0 (#15346) | ||
|
c99c8119fe
|
build(deps): bump indexmap from 2.7.0 to 2.8.0 (#15345) | ||
|
2c7ab6e898
|
Bump to 0.103.1 dev version (#15347)
# Description Marks development or hotfix |
||
|
c986426478
|
Bump version for 0.103.0 release (#15340) | ||
|
42aa2ff5ba
|
remove mimalloc allocator (#15317)
# Description This PR removes the mimalloc allocator due to run-away memory leaks recently found. closes #15311 |
||
|
0f6996b70d
|
Support for reading Categorical and Enum types (#15292)
# fixes https://github.com/nushell/nushell/issues/15281 # Description Provides the ability to read dataframes with Categorical and Enum data. The ability to write Categorical and Enum data will be provided in a future PR. |
||
|
966cebec34
|
Adds polars list-contains command (#15304)
# Description This PR adds the `polars list-contains` command. It works like this: ``` ~/Projects/nushell/nushell> let df = [[a]; [[a,b,c]] [[b,c,d]] [[c,d,f]]] | polars into-df -s {a: list<str>}; ~/Projects/nushell/nushell> $df | polars with-column [(polars col a | polars list-contains (polars lit a) | polars as b)] | polars collect ╭───┬───────────┬───────╮ │ # │ a │ b │ ├───┼───────────┼───────┤ │ 0 │ ╭───┬───╮ │ true │ │ │ │ 0 │ a │ │ │ │ │ │ 1 │ b │ │ │ │ │ │ 2 │ c │ │ │ │ │ ╰───┴───╯ │ │ │ 1 │ ╭───┬───╮ │ false │ │ │ │ 0 │ b │ │ │ │ │ │ 1 │ c │ │ │ │ │ │ 2 │ d │ │ │ │ │ ╰───┴───╯ │ │ │ 2 │ ╭───┬───╮ │ false │ │ │ │ 0 │ c │ │ │ │ │ │ 1 │ d │ │ │ │ │ │ 2 │ f │ │ │ │ │ ╰───┴───╯ │ │ ╰───┴───────────┴───────╯ ``` or ``` ~/Projects/nushell/nushell> let df = [[a, b]; [[a,b,c], a] [[b,c,d], f] [[c,d,f], f]] | polars into-df -s {a: list<str>, b: str} ~/Projects/nushell/nushell> $df | polars with-column [(polars col a | polars list-contains b | polars as c)] | polars collect ╭───┬───────────┬───┬───────╮ │ # │ a │ b │ c │ ├───┼───────────┼───┼───────┤ │ 0 │ ╭───┬───╮ │ a │ true │ │ │ │ 0 │ a │ │ │ │ │ │ │ 1 │ b │ │ │ │ │ │ │ 2 │ c │ │ │ │ │ │ ╰───┴───╯ │ │ │ │ 1 │ ╭───┬───╮ │ f │ false │ │ │ │ 0 │ b │ │ │ │ │ │ │ 1 │ c │ │ │ │ │ │ │ 2 │ d │ │ │ │ │ │ ╰───┴───╯ │ │ │ │ 2 │ ╭───┬───╮ │ f │ true │ │ │ │ 0 │ c │ │ │ │ │ │ │ 1 │ d │ │ │ │ │ │ │ 2 │ f │ │ │ │ │ │ ╰───┴───╯ │ │ │ ╰───┴───────────┴───┴───────╯ ``` or ``` ~/Projects/nushell/nushell> let df = [[a, b]; [[1,2,3], 4] [[2,4,1], 2] [[2,1,6], 3]] | polars into-df -s {a: list<i64>, b: i64} ~/Projects/nushell/nushell> $df | polars with-column [(polars col a | polars list-contains ((polars col b) * 2) | polars as c)] | polars collect ╭───┬───────────┬───┬───────╮ │ # │ a │ b │ c │ ├───┼───────────┼───┼───────┤ │ 0 │ ╭───┬───╮ │ 4 │ false │ │ │ │ 0 │ 1 │ │ │ │ │ │ │ 1 │ 2 │ │ │ │ │ │ │ 2 │ 3 │ │ │ │ │ │ ╰───┴───╯ │ │ │ │ 1 │ ╭───┬───╮ │ 2 │ true │ │ │ │ 0 │ 2 │ │ │ │ │ │ │ 1 │ 4 │ │ │ │ │ │ │ 2 │ 1 │ │ │ │ │ │ ╰───┴───╯ 
│ │ │ │ 2 │ ╭───┬───╮ │ 3 │ true │ │ │ │ 0 │ 2 │ │ │ │ │ │ │ 1 │ 1 │ │ │ │ │ │ │ 2 │ 6 │ │ │ │ │ │ ╰───┴───╯ │ │ │ ╰───┴───────────┴───┴───────╯ ``` Let me know what you think. I'm a bit surprised that a list by default seems to get converted to "object" when doing `into-df` which is why I added the extra `-s` flag every time to explicitly force it into a list. |
||
|
e926919582
|
polars open : exposing the ability to configure hive settings. (#15255)
# Description Exposes parameters for working with [hive](https://docs.pola.rs/user-guide/io/hive/#scanning-hive-partitioned-data) partitioning. # User-Facing Changes - Added flags `--hive-enabled`, `--hive-start-idx`, `--hive-schema`, `--hive-try-parse-dates` to `polars open` |
||
|
2dab65f852
|
Polars: Map pq extension to parquet files (#15284)
# Description Files with the extension pq will automatically be treated as parquet files. closes #15282 |
||
|
087fe484f6
|
Enhance polars plugin documentation (#15250)
This PR (based on #15249 and #15248 because it mentions them) adds extra documentation to the main polars command outlining the main datatypes that are used by the plugin. The lack of a description of the types involved in `polars xxx` commands was quite confusing to me when I started using the plugin and this is a first try improving it. I didn't find a better place but please let me know what you think. |
||
|
88bbe4abaa
|
Add Xor to polars plugin nu_expressions (#15249)
Solution for #15242, based on PR #15248. Allows doing this: ``` ~/Projects/nushell> [[a, b]; [1., 2.], [3.,3.], [4., 6.]] | polars into-df | polars filter (((polars col a) < 2) xor ((polars col b) > 5)) ╭───┬──────┬──────╮ │ # │ a │ b │ ├───┼──────┼──────┤ │ 0 │ 1.00 │ 2.00 │ │ 1 │ 4.00 │ 6.00 │ ╰───┴──────┴──────╯ ``` |
||
|
7939fb05ea
|
polars strip-chars : Allow any polars expression for pattern argument (#15178)
# Description Allow any polars expression for pattern argument for `polars strip-chars` |
||
|
53d30ee7ea
|
add polars str strip chars (with --end / --start options) (#15118)
# Description This PR adds `polars str-strip-chars-end` # User-Facing Changes New function that can be used as follows: ``` ~/Projects/nushell> [[text]; [hello!!!] [world!!!]] | polars into-df | polars select (polars col text | polars str-strip-chars-end "!") | polars collect ╭───┬───────╮ │ # │ text │ ├───┼───────┤ │ 0 │ hello │ │ 1 │ world │ ╰───┴───────╯ ``` # Tests + Formatting Tests ran locally. I ran the formatter. |
||
|
058ce0ed2d
|
move to polars bigidx (#15177)
Fixes [#15157](https://github.com/nushell/nushell/issues/15157) # Description Utilizes the Polars `bigidx` feature to support massive datasets. |
||
|
3d58c3f70e
|
Expose flag to not maintain order on polars concat (#15145)
|
||
|
c504c93a1d
|
Polars: Minor code cleanup (#15144)
# Description Removing TODOs and dead code from a previous refactor |
||
|
62e56d3581
|
Rework operator type errors (#14429)
# Description This PR adds two new `ParseError` and `ShellError` cases for type errors relating to operators. - `OperatorUnsupportedType` is used when a type is not supported by an operator in any way, shape, or form. E.g., `+` does not support `bool`. - `OperatorIncompatibleTypes` is used when an operator is used with types it supports, but the combination of types provided cannot be used together. E.g., `filesize + duration` is not a valid combination. The other preexisting error cases related to operators have been removed and replaced with the new ones above. Namely: - `ShellError::OperatorMismatch` - `ShellError::UnsupportedOperator` - `ParseError::UnsupportedOperationLHS` - `ParseError::UnsupportedOperationRHS` - `ParseError::UnsupportedOperationTernary` # User-Facing Changes - `help operators` now lists the precedence of `not` as 55 instead of 0 (above the other boolean operators). Fixes #13675. - `math median` and `math mode` now ignore NaN values so that `[NaN NaN] | math median` and `[NaN NaN] | math mode` no longer trigger a type error. Instead, it's now an empty input error. Fixing this in earnest can be left for a future PR. - Comparisons with `nan` now return false instead of causing an error. E.g., `1 == nan` is now `false`. - All the operator type errors have been standardized and reworked. In particular, they can now have a help message, which is currently used for type errors relating to `++`. ```nu [1] ++ 2 ``` ``` Error: nu::parser::operator_unsupported_type × The '++' operator does not work on values of type 'int'. ╭─[entry #1:1:5] 1 │ [1] ++ 2 · ─┬ ┬ · │ ╰── int · ╰── does not support 'int' ╰──── help: if you meant to append a value to a list or a record to a table, use the `append` command or wrap the value in a list. For example: `$list ++ $value` should be `$list ++ [$value]` or `$list | append $value`. ``` |
||
|
bdc767bf23
|
fix polars save example typo (#15008)
# Description fix polars save example dfr -> polars I'm wondering why the commands `polars open` and `polars save` don't have the same flags? |
||
|
0705fb9cd1
|
Added S3 support for polars save (#15005)
# Description Parquet, CSV, NDJSON, and Arrow files can be written to AWS S3 via `polars save`. This mirrors the s3 functionality provided by `polars open`. ```nushell ls | polars into-df | polars save s3://my-bucket/test.parquet ``` # User-Facing Changes - S3 urls are now supported by `polars save` |
||
|
803a348f41
|
Bump to 0.102.1 dev version (#15012) | ||
|
1aa2ed1947
|
Bump version to 0.102.0 (#14998) | ||
|
13d5a15f75
|
Run-time pipeline input typechecking tweaks (#14922)
# Description This PR makes two changes related to [run-time pipeline input type checking](https://github.com/nushell/nushell/pull/14741): 1. The check which bypasses type checking for commands with only `Type::Nothing` input types has been expanded to work with commands with multiple `Type::Nothing` inputs for different outputs. For example, `ast` has three input/output type pairs, but all of the inputs are `Type::Nothing`: ``` ╭───┬─────────┬────────╮ │ # │ input │ output │ ├───┼─────────┼────────┤ │ 0 │ nothing │ table │ │ 1 │ nothing │ record │ │ 2 │ nothing │ string │ ╰───┴─────────┴────────╯ ``` Before this PR, passing a value (which would otherwise be ignored) to `ast` caused a run-time type error: ``` Error: nu::shell::only_supports_this_input_type × Input type not supported. ╭─[entry #1:1:6] 1 │ echo 123 | ast -j -f "hi" · ─┬─ ─┬─ · │ ╰── only nothing, nothing, and nothing input data is supported · ╰── input type: int ╰──── ``` After this PR, no error is raised. This doesn't really matter for `ast` (the only other built-in command with a similar input/output type signature is `cal`), but it's more logically consistent. 2. Bypasses input type-checking (parse-time ***and*** run-time) for some (not all, see below) commands which have both a `Type::Nothing` input and some other non-nothing `Type` input. 
This is accomplished by adding a `Type::Any` input with the same output as the corresponding `Type::Nothing` input/output pair. This is necessary because some commands are intended to operate on an argument with empty pipeline input, or operate on an empty pipeline input with no argument. This causes issues when a value is implicitly passed to one of these commands. I [discovered this issue](https://discord.com/channels/601130461678272522/615962413203718156/1329945784346611712) when working with an example where the `open` command is used in `sort-by` closure: ```nushell ls | sort-by { open -r $in.name | lines | length } ``` Before this PR (but after the run-time input type checking PR), this error is raised: ``` Error: nu::shell::only_supports_this_input_type × Input type not supported. ╭─[entry #1:1:1] 1 │ ls | sort-by { open -r $in.name | lines | length } · ─┬ ──┬─ · │ ╰── only nothing and string input data is supported · ╰── input type: record<name: string, type: string, size: filesize, modified: date> ╰──── ``` While this error is technically correct, we don't actually want to return an error here since `open` ignores its pipeline input when an argument is passed. This would be a parse-time error as well if the parser was able to infer that the closure input type was a record, but our type inference isn't that robust currently, so this technically incorrect form snuck by type checking until #14741. However, there are some commands with the same kind of type signature where this behavior is actually desirable. This means we can't just bypass type-checking for any command with a `Type::Nothing` input. These commands operate on true `null` values, rather than ignoring their input. For example, `length` returns `0` when passed a `null` value. It's correct, and even desirable, to throw a run-time error when `length` is passed an unexpected type. 
For example, a string, which should instead be measured with `str length`: ```nushell ["hello" "world"] | sort-by { length } # => Error: nu::shell::only_supports_this_input_type # => # => × Input type not supported. # => ╭─[entry #32:1:10] # => 1 │ ["hello" "world"] | sort-by { length } # => · ───┬─── ───┬── # => · │ ╰── only list<any>, binary, and nothing input data is supported # => · ╰── input type: string # => ╰──── ``` We need a more robust way for commands to express how they handle the `Type::Nothing` input case. I think a possible solution here is to allow commands to express that they operate on `PipelineData::Empty`, rather than `Value::Nothing`. Then, a command like `open` could have an empty pipeline input type rather than a `Type::Nothing`, and the parse-time and run-time pipeline input type checks know that `open` will safely ignore an incorrectly typed input. That being said, we have a release coming up and the above solution might take a while to implement, so while unfortunate, bypassing input type-checking for these problematic commands serves as a workaround to avoid breaking changes in the release until a more robust solution is implemented. This PR bypasses input type-checking for the following commands: * `load-env`: can take record of envvars as input or argument * `nu-check`: checks input string or filename argument * `open`: can take filename as input or argument * `polars when`: can be used with input, or can be chained with another `polars when` * `stor insert`: data record can be passed as input or argument * `stor update`: data record can be passed as input or argument * `format date`: `--list` ignores input value * `into datetime`: `--list` ignores input value (also added a `Type::Nothing` input which was missing from this command) These commands have a similar input/output signature to the above commands, but are working as intended: * `cd`: The input/output signature was actually incorrect, `cd` always ignores its input. 
I fixed this in this PR. * `generate` * `get` * `history import` * `interleave` * `into bool` * `length` # User-Facing Changes As a temporary workaround, pipeline input type-checking for the following commands has been bypassed to avoid undesirable run-time input type checking errors which were previously not caught at parse-time: * `open` * `load-env` * `format date` * `into datetime` * `nu-check` * `stor insert` * `stor update` * `polars when` # Tests + Formatting CI became green in the time it took me to type the description 😄 # After Submitting N/A |
||
|
66bc0542e0
|
Refactor I/O Errors (#14927)
# Description
As mentioned in #10698, we have too many `ShellError` variants, with
some even overlapping in meaning. This PR simplifies and improves I/O
error handling by restructuring `ShellError` related to I/O issues.
Previously, `ShellError::IOError` only contained a message string,
making it convenient but overly generic. It was widely used without
providing spans (#4323).
This PR introduces a new `ShellError::Io` variant that consolidates
multiple I/O-related errors (except for `ShellError::NetworkFailure`,
which remains distinct for now). The new `ShellError::Io` variant
replaces the following:
- `FileNotFound`
- `FileNotFoundCustom`
- `IOInterrupted`
- `IOError`
- `IOErrorSpanned`
- `NotADirectory`
- `DirectoryNotFound`
- `MoveNotPossible`
- `CreateNotPossible`
- `ChangeAccessTimeNotPossible`
- `ChangeModifiedTimeNotPossible`
- `RemoveNotPossible`
- `ReadingFile`
## The `IoError`
`IoError` includes the following fields:
1. **`kind`**: Extends `std::io::ErrorKind` to specify the type of I/O
error without needing new `ShellError` variants. This aligns with the
approach used in `std::io::Error`. This adds a second dimension to error
reporting by combining the `kind` field with `ShellError` variants,
making it easier to describe errors in more detail. As proposed by
@kubouch in [#design-discussion on
Discord](https://discord.com/channels/601130461678272522/615329862395101194/1323699197165178930),
this helps reduce the number of `ShellError` variants. In the error
report, the `kind` field is displayed as the "source" of the error,
e.g., "I/O error," followed by the specific kind of I/O error.
2. **`span`**: A non-optional field to encourage providing spans for
better error reporting (#4323).
3. **`path`**: Optional `PathBuf` to give context about the file or
directory involved in the error (#7695). If provided, it’s shown as a
help entry in error reports.
4. **`additional_context`**: Allows adding custom messages when the
span, kind, and path are insufficient. This is rendered in the error
report at the labeled span.
5. **`location`**: Sometimes, I/O errors occur in the engine itself and
are not caused directly by user input. In such cases, if we don’t have a
span and must set it to `Span::unknown()`, we need another way to
reference the error. For this, the `location` field uses the new
`Location` struct, which records the Rust file and line number where the
error occurred. This ensures that we at least know the Rust code
location that failed, helping with debugging. To make this work, a new
`location!` macro was added, which retrieves `file!`, `line!`, and
`column!` values accurately. If `Location::new` is used directly, it
issues a warning to remind developers to use the macro instead, ensuring
consistent and correct usage.
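A minimal sketch of how a `location!`-style macro can capture the call site, assuming a simplified `Location` with `file`, `line`, and `column` fields. The field names and the `failing_internal_step` helper are illustrative only and are not taken from the actual implementation:

```rust
/// Hypothetical stand-in for the `Location` struct described above.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Location {
    pub file: &'static str,
    pub line: u32,
    pub column: u32,
}

/// Builds a `Location` for the place where the macro is invoked.
/// `file!`, `line!`, and `column!` are expanded at the call site,
/// so they describe the caller rather than this definition.
macro_rules! location {
    () => {
        Location {
            file: file!(),
            line: line!(),
            column: column!(),
        }
    };
}

/// Illustrative engine-internal function that records where it failed.
pub fn failing_internal_step() -> Location {
    location!()
}
```

Because `macro_rules!` expands in place, the recorded line is the one where `location!()` is written. A plain helper function calling `line!()` internally would always report its own body instead.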
### Constructor Behavior
`IoError` provides five constructor methods:
- `new` and `new_with_additional_context`: Used for errors caused by
user input and require a valid (non-unknown) span to ensure precise
error reporting.
- `new_internal` and `new_internal_with_path`: Used for internal errors
where a span is not available. These methods require additional context
and the `Location` struct to pinpoint the source of the error in the
engine code.
- `factory`: Returns a closure that maps an `std::io::Error` to an
`IoError`. This is useful for handling multiple I/O errors that share
the same span and path, streamlining error handling in such cases.
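The `factory` pattern can be sketched roughly as follows. This is a simplified, self-contained illustration: `Span` and `IoError` here are reduced stand-ins with assumed field types, and the real signatures may differ:

```rust
use std::io;
use std::path::PathBuf;

/// Hypothetical, simplified span: a byte range in the user's source.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
}

/// Reduced stand-in for the `IoError` described above.
#[derive(Debug)]
pub struct IoError {
    pub kind: io::ErrorKind,
    pub span: Span,
    pub path: Option<PathBuf>,
    pub additional_context: Option<String>,
}

impl IoError {
    /// Error caused by user input at a known span.
    pub fn new(kind: io::ErrorKind, span: Span, path: Option<PathBuf>) -> Self {
        IoError {
            kind,
            span,
            path,
            additional_context: None,
        }
    }

    /// Returns a closure that converts any `std::io::Error` into an
    /// `IoError` carrying the same span and path.
    pub fn factory(span: Span, path: Option<PathBuf>) -> impl Fn(io::Error) -> IoError {
        move |err: io::Error| IoError::new(err.kind(), span, path.clone())
    }
}
```

With such a closure in hand, several fallible calls on the same file can share one mapping, for example `let from_io = IoError::factory(span, Some(path)); file.read_to_string(&mut buf).map_err(&from_io)?;`.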
## New Report Look
This is a simulation of how the I/O errors look (the `open crates` case is
simulated to show how internal errors are referenced now):

## `Span::test_data()`
To enable better testing, `Span::test_data()` now returns a value
distinct from `Span::unknown()`. Both `Span::test_data()` and
`Span::unknown()` refer to invalid source code, but having a separate
value for test data helps identify issues during testing while keeping
spans unique.
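The idea of keeping the two placeholder spans distinguishable can be illustrated with a small sketch. The concrete sentinel values below are assumptions chosen for the example, not the values used in the codebase, and `is_unknown` is a hypothetical helper:

```rust
/// Hypothetical, simplified span type for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
}

impl Span {
    /// Placeholder for "no source location available".
    pub const fn unknown() -> Self {
        Span { start: 0, end: 0 }
    }

    /// Placeholder used only in tests. The sentinel value is an
    /// assumption; the only requirement is that it differs from `unknown()`.
    pub const fn test_data() -> Self {
        Span {
            start: usize::MAX / 2,
            end: usize::MAX / 2,
        }
    }

    /// True only for the `unknown()` placeholder.
    pub fn is_unknown(&self) -> bool {
        *self == Span::unknown()
    }
}
```

With distinct values, a check that rejects unknown spans (such as the one a user-facing error constructor might perform) still accepts test spans.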
## Cursed Sneaky Error Transfers
I removed the conversions between `std::io::Error` and `ShellError` as
they often removed important information and were used too broadly to
handle I/O errors. This also removed the problematic implementation
found here:
|
||
|
c0b4d19761
|
Polars upgrade to 0.46 (#14933)
Upgraded to Polars 0.46 |
||
|
0ad5f4389c
|
nu_plugin_polars: add polars into-repr to display dataframe in portable repr format (#14917)
# Description This PR adds a new command that outputs a NuDataFrame or NuLazyFrame in its repr format, which can then be ingested in another polars instance. Advantages of serializing a dataframe in this format are that it can be viewed as a table, carries type information, and can easily be copied to the clipboard. ```nushell # In Nushell > [[a b]; [2025-01-01 2] [2025-01-02 4]] | polars into-df | polars into-lazy | polars into-repr shape: (2, 2) ┌─────────────────────┬─────┐ │ a ┆ b │ │ --- ┆ --- │ │ datetime[ns] ┆ i64 │ ╞═════════════════════╪═════╡ │ 2025-01-01 00:00:00 ┆ 2 │ │ 2025-01-02 00:00:00 ┆ 4 │ └─────────────────────┴─────┘ ``` ```python # In python >>> import polars as pl >>> df = pl.from_repr(""" ... shape: (2, 2) ... ┌─────────────────────┬─────┐ ... │ a ┆ b │ ... │ --- ┆ --- │ ... │ datetime[ns] ┆ i64 │ ... ╞═════════════════════╪═════╡ ... │ 2025-01-01 00:00:00 ┆ 2 │ ... │ 2025-01-02 00:00:00 ┆ 4 │ ... 
└─────────────────────┴─────┘""") shape: (2, 2) ┌─────────────────────┬─────┐ │ a ┆ b │ │ --- ┆ --- │ │ datetime[ns] ┆ i64 │ ╞═════════════════════╪═════╡ │ 2025-01-01 00:00:00 ┆ 2 │ │ 2025-01-02 00:00:00 ┆ 4 │ └─────────────────────┴─────┘ >>> df.select(pl.col("a").dt.offset_by("12m")) shape: (2, 1) ┌─────────────────────┐ │ a │ │ --- │ │ datetime[ns] │ ╞═════════════════════╡ │ 2025-01-01 00:12:00 │ │ 2025-01-02 00:12:00 │ └─────────────────────┘ ``` # User-Facing Changes A new command `polars into-repr` is added. No other commands are impacted by the changes in this PR. # Tests + Formatting Examples were added in the command definition. |
||
|
b99a8c9d80
|
Bump tokio from 1.42.0 to 1.43.0 (#14829)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.42.0 to 1.43.0.

Release notes, sourced from [tokio's releases](https://github.com/tokio-rs/tokio/releases):

## Tokio v1.43.0

# 1.43.0 (Jan 8th, 2025)

### Added

- net: add `UdpSocket::peek` methods ([#7068](https://redirect.github.com/tokio-rs/tokio/issues/7068))
- net: add support for Haiku OS ([#7042](https://redirect.github.com/tokio-rs/tokio/issues/7042))
- process: add `Command::into_std()` ([#7014](https://redirect.github.com/tokio-rs/tokio/issues/7014))
- signal: add `SignalKind::info` on illumos ([#6995](https://redirect.github.com/tokio-rs/tokio/issues/6995))
- signal: add support for realtime signals on illumos ([#7029](https://redirect.github.com/tokio-rs/tokio/issues/7029))

### Fixed

- io: don't call `set_len` before initializing vector in `Blocking` ([#7054](https://redirect.github.com/tokio-rs/tokio/issues/7054))
- macros: suppress `clippy::needless_return` in `#[tokio::main]` ([#6874](https://redirect.github.com/tokio-rs/tokio/issues/6874))
- runtime: fix thread parking on WebAssembly ([#7041](https://redirect.github.com/tokio-rs/tokio/issues/7041))

### Changes

- chore: use unsync loads for `unsync_load` ([#7073](https://redirect.github.com/tokio-rs/tokio/issues/7073))
- io: use `Buf::put_bytes` in `Repeat` read impl ([#7055](https://redirect.github.com/tokio-rs/tokio/issues/7055))
- task: drop the join waker of a task eagerly ([#6986](https://redirect.github.com/tokio-rs/tokio/issues/6986))

### Changes to unstable APIs

- metrics: improve flexibility of H2Histogram Configuration ([#6963](https://redirect.github.com/tokio-rs/tokio/issues/6963))
- taskdump: add accessor methods for backtrace ([#6975](https://redirect.github.com/tokio-rs/tokio/issues/6975))

### Documented

- io: clarify `ReadBuf::uninit` allows initialized buffers as well ([#7053](https://redirect.github.com/tokio-rs/tokio/issues/7053))
- net: fix ambiguity in `TcpStream::try_write_vectored` docs ([#7067](https://redirect.github.com/tokio-rs/tokio/issues/7067))
- runtime: fix `LocalRuntime` doc links ([#7074](https://redirect.github.com/tokio-rs/tokio/issues/7074))
- sync: extend documentation for `watch::Receiver::wait_for` ([#7038](https://redirect.github.com/tokio-rs/tokio/issues/7038))
- sync: fix typos in `OnceCell` docs ([#7047](https://redirect.github.com/tokio-rs/tokio/issues/7047))

... (truncated) |
||
|
d9bfcb4c09
|
Bump uuid from 1.11.0 to 1.12.0 (#14830)
Bumps [uuid](https://github.com/uuid-rs/uuid) from 1.11.0 to 1.12.0.

Release notes, sourced from [uuid's releases](https://github.com/uuid-rs/uuid/releases):

## 1.12.0

## What's Changed

- feat: Add `NonZeroUuid` type for optimized `Option<Uuid>` representation by [`@ab22593k`](https://github.com/ab22593k) in [uuid-rs/uuid#779](https://redirect.github.com/uuid-rs/uuid/pull/779)
- Finalize `NonNilUuid` by [`@KodrAus`](https://github.com/KodrAus) in [uuid-rs/uuid#783](https://redirect.github.com/uuid-rs/uuid/pull/783)
- Prepare for 1.12.0 release by [`@KodrAus`](https://github.com/KodrAus) in [uuid-rs/uuid#784](https://redirect.github.com/uuid-rs/uuid/pull/784)

## New Contributors

- [`@ab22593k`](https://github.com/ab22593k) made their first contribution in [uuid-rs/uuid#779](https://redirect.github.com/uuid-rs/uuid/pull/779)

**Full Changelog**: https://github.com/uuid-rs/uuid/compare/1.11.1...1.12.0

## 1.11.1

## What's Changed

- Finish cut off docs by [`@KodrAus`](https://github.com/KodrAus) in [uuid-rs/uuid#777](https://redirect.github.com/uuid-rs/uuid/pull/777)
- Fix links in CONTRIBUTING.md by [`@jacobggman`](https://github.com/jacobggman) in [uuid-rs/uuid#778](https://redirect.github.com/uuid-rs/uuid/pull/778)
- Update rust toolchain before building by [`@KodrAus`](https://github.com/KodrAus) in [uuid-rs/uuid#781](https://redirect.github.com/uuid-rs/uuid/pull/781)
- Prepare for 1.11.1 release by [`@KodrAus`](https://github.com/KodrAus) in [uuid-rs/uuid#782](https://redirect.github.com/uuid-rs/uuid/pull/782)

## New Contributors

- [`@jacobggman`](https://github.com/jacobggman) made their first contribution in [uuid-rs/uuid#778](https://redirect.github.com/uuid-rs/uuid/pull/778)

**Full Changelog**: https://github.com/uuid-rs/uuid/compare/1.11.0...1.11.1 |
||
|
214714e0ab
|
Add run-time type checking for command pipeline input (#14741)
# Description

This PR adds type checking of all command input types at run-time. Generally, these errors should be caught by the parser, but sometimes we can't know the type of a value at parse-time. The simplest example is using the `echo` command, which has an output type of `any`, so prefixing a literal with `echo` will bypass parse-time type checking.

Before this PR, each command had to check its input types individually. This can result in scenarios where the input/output types don't match the actual command behavior. It can cause valid usage with a non-`any` type to become a parse-time error if a command is missing that type in its pipeline input/output types (`drop nth` and `history import` do this before this PR). Alternatively, a command may not list a type in its input/output types but also not reject that type in its code, which can have unintended side effects (`get` does this on an empty pipeline input, and `sort` used to before #13154).

After this PR, the type of the pipeline input is checked to ensure it matches one of the input types listed in the receiving command's input/output types. While each of the issues in the "before this PR" section could be addressed in each command individually, this PR solves the issue for _all_ commands.
**This will likely cause some breakage**, as some commands have incorrect input/output types and should be adjusted. Also, some scripts may have erroneous usage of commands. In writing this PR, I discovered that `toolkit.nu` was passing `null` values to `str join`, which doesn't accept nothing types (if folks think it should, we can adjust it in this PR or in a different PR). I found some issues in the standard library and its tests. I also found that carapace's vendor script had an incorrect chaining of `get -i`:

```nushell
let expanded_alias = (scope aliases | where name == $spans.0 | get -i 0 | get -i expansion)
```

Before this PR, if the `get -i 0` ever actually did evaluate to `null`, the second `get` invocation would error, since `get` doesn't operate on `null` values. After this PR, this is immediately a run-time error, alerting the user to the problematic code. As a side note, we'll need to PR this fix (`get -i 0 | get -i expansion` -> `get -i 0.expansion`) to carapace.

A notable exception to the type checking is commands with an input type of `nothing -> <type>`. In this case, any input type is allowed. This allows piping values into the command without an error being thrown. For example, `123 | echo $in` would be an error without this exception. Additionally, custom types bypass type checking (I believe this also happens during parsing, but I'm not certain).

I added an `is_subtype` method to `Value` and `PipelineData`. It functions slightly differently than `get_type().is_subtype()`, as noted in the doc comments. Notably, it respects structural typing of lists and tables. For example, the value `[{a: 123} {a: 456, b: 789}]` is a subtype of `table<a: int>`, whereas the type returned by `Value::get_type` is `list<any>`. Similarly, `PipelineData` has some special handling for `ListStream`s and `ByteStream`s. The latter was needed for this PR to work properly with external commands.

Here are some examples.
Before:

```nu
1..2 | drop nth 1
Error: nu::parser::input_type_mismatch

  × Command does not support range input.
   ╭─[entry #9:1:8]
 1 │ 1..2 | drop nth 1
   ·        ────┬───
   ·            ╰── command doesn't support range input
   ╰────

echo 1..2 | drop nth 1
# => ╭───┬───╮
# => │ 0 │ 1 │
# => ╰───┴───╯
```

After this PR, I've adjusted `drop nth`'s input/output types to accept range input.

Before this PR, `zip` accepted any value despite not listing every type in its input/output types. This caused different behavior depending on whether you triggered a parse error or not:

```nushell
1 | zip [2]
# => Error: nu::parser::input_type_mismatch
# =>
# =>   × Command does not support int input.
# =>    ╭─[entry #3:1:5]
# =>  1 │ 1 | zip [2]
# =>    ·     ─┬─
# =>    ·      ╰── command doesn't support int input
# =>    ╰────

echo 1 | zip [2]
# => ╭───┬───────────╮
# => │ 0 │ ╭───┬───╮ │
# => │   │ │ 0 │ 1 │ │
# => │   │ │ 1 │ 2 │ │
# => │   │ ╰───┴───╯ │
# => ╰───┴───────────╯
```

After this PR, it works the same in both cases. For cases like this, if we do decide we want `zip` or other commands to accept any input value, then we should explicitly add that to the input types.

```nushell
1 | zip [2]
# => Error: nu::parser::input_type_mismatch
# =>
# =>   × Command does not support int input.
# =>    ╭─[entry #3:1:5]
# =>  1 │ 1 | zip [2]
# =>    ·     ─┬─
# =>    ·      ╰── command doesn't support int input
# =>    ╰────

echo 1 | zip [2]
# => Error: nu::shell::only_supports_this_input_type
# =>
# =>   × Input type not supported.
# =>    ╭─[entry #14:2:6]
# =>  2 │ echo 1 | zip [2]
# =>    ·      ┬   ─┬─
# =>    ·      │    ╰── only list<any> and range input data is supported
# =>    ·      ╰── input type: int
# =>    ╰────
```

# User-Facing Changes

**Breaking change**: The type of a command's input is now checked against the input/output types of that command at run-time.
While these errors should mostly be caught at parse-time, in cases where they can't be detected at parse-time they will be caught at run-time instead. This applies to both internal commands and custom commands.

Example function and corresponding parse-time error (same before and after PR):

```nushell
def foo []: int -> nothing {
  print $"my cool int is ($in)"
}

1 | foo
# => my cool int is 1

"evil string" | foo
# => Error: nu::parser::input_type_mismatch
# =>
# =>   × Command does not support string input.
# =>    ╭─[entry #16:1:17]
# =>  1 │ "evil string" | foo
# =>    ·                 ─┬─
# =>    ·                  ╰── command doesn't support string input
# =>    ╰────
# =>
```

Before:

```nu
echo "evil string" | foo
# => my cool int is evil string
```

After:

```nu
echo "evil string" | foo
# => Error: nu::shell::only_supports_this_input_type
# =>
# =>   × Input type not supported.
# =>    ╭─[entry #17:1:6]
# =>  1 │ echo "evil string" | foo
# =>    ·      ──────┬──────   ─┬─
# =>    ·            │          ╰── only int input data is supported
# =>    ·            ╰── input type: string
# =>    ╰────
```

Known affected internal commands which erroneously accepted any type:

* `str join`
* `zip`
* `reduce`

# Tests + Formatting

- 🟢 `toolkit fmt`
- 🟢 `toolkit clippy`
- 🟢 `toolkit test`
- 🟢 `toolkit test stdlib`

# After Submitting

* Play whack-a-mole with the commands and scripts this will inevitably break |
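To make the structural subtyping and the `nothing` exception described in this PR easier to picture, here is a minimal Rust sketch. It is only an illustration under stated assumptions: the `Type` and `Value` enums and the `record_matches`, `is_subtype`, and `input_allowed` functions are simplified, hypothetical stand-ins, not nushell's actual definitions or the code added by this PR. Custom types and streams are left out entirely.

```rust
// Hypothetical, simplified model. These are NOT nushell's real `Type`/`Value` types.
#[derive(Debug, Clone, PartialEq)]
enum Type {
    Any,
    Nothing,
    Int,
    String,
    List(Box<Type>),
    // Each entry is a column name and the type expected in that column.
    Record(Vec<(String, Type)>),
    Table(Vec<(String, Type)>),
}

#[derive(Debug, Clone)]
enum Value {
    Nothing,
    Int(i64),
    String(String),
    List(Vec<Value>),
    Record(Vec<(String, Value)>),
}

/// A record matches when every required column is present with a compatible
/// value. Extra columns in the record are allowed (structural typing).
fn record_matches(fields: &[(String, Value)], cols: &[(String, Type)]) -> bool {
    cols.iter().all(|(name, ty)| {
        fields.iter().any(|(n, v)| n == name && is_subtype(v, ty))
    })
}

/// Checks the value itself against a type, instead of first collapsing the
/// value to a single type (which would yield `list<any>` for mixed records).
fn is_subtype(value: &Value, ty: &Type) -> bool {
    match (value, ty) {
        (_, Type::Any) => true,
        (Value::Nothing, Type::Nothing) => true,
        (Value::Int(_), Type::Int) => true,
        (Value::String(_), Type::String) => true,
        (Value::List(items), Type::List(inner)) => {
            items.iter().all(|v| is_subtype(v, inner))
        }
        // A list of records is a table if every row has the required columns.
        (Value::List(rows), Type::Table(cols)) => rows.iter().all(|row| match row {
            Value::Record(fields) => record_matches(fields, cols),
            _ => false,
        }),
        (Value::Record(fields), Type::Record(cols)) => record_matches(fields, cols),
        _ => false,
    }
}

/// Run-time input check (assumed shape): a declared input type of `nothing`
/// accepts any input; otherwise the input must match a listed input type.
fn input_allowed(input: &Value, input_types: &[Type]) -> bool {
    input_types
        .iter()
        .any(|ty| matches!(ty, Type::Nothing) || is_subtype(input, ty))
}
```

In this model, the value `[{a: 123} {a: 456, b: 789}]` passes `is_subtype` against `table<a: int>` because each row carries an `a` column holding an int, and a command declared as `nothing -> <type>` accepts `123` as input, mirroring the `123 | echo $in` case above.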
||
|
df3892f323
|
Provide the ability to split strings in columns via polars str-split (#14723)
# Description

Provides the ability to split string columns. This will change the column type to list<str>.

```nushell
> ❯ : [[a]; ["one,two,three"]] | polars into-df | polars select (polars col a | polars str-split ",") | polars collect
╭───┬───────────────╮
│ # │       a       │
├───┼───────────────┤
│ 0 │ ╭───┬───────╮ │
│   │ │ 0 │ one   │ │
│   │ │ 1 │ two   │ │
│   │ │ 2 │ three │ │
│   │ ╰───┴───────╯ │
╰───┴───────────────╯

> ❯ : [[a]; ["one,two,three"]] | polars into-df | polars select (polars col a | polars str-split ",") | polars schema
╭───┬───────────╮
│ a │ list<str> │
╰───┴───────────╯
```

# User-Facing Changes

- Introduces new command `polars str-split` |
||
|
23ba613b00
|
Polars AWS S3 support (#14648)
# Description

Provides Amazon S3 support.

- Utilizes your existing AWS CLI configuration.
- Supports AWS SSO.
- Supports [gimme-aws-creds](https://github.com/Nike-Inc/gimme-aws-creds).
- Respects the AWS_PROFILE environment variable for selecting the profile config.
- Supports the AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_REGION environment variables for configuring without an AWS config.

Usage:

```nushell
polars open s3://bucket/and/path.parquet
```

Supports:

- CSV
- Parquet
- NDJSON / JSON lines
- Arrow

Doesn't support:

- eager dataframes
- Avro
- JSON |