nushell

mirror of https://github.com/nushell/nushell.git synced 2025-07-30 06:11:48 +02:00

Author	SHA1	Message	Date
suimong	12f57dbc62	Add "--as-columns" flag to `polars into-df` (#13449 ) Per discussion on [Discord](https://discord.com/channels/601130461678272522/864228801851949077/1265718178927870045) # Description To facilitate column-oriented dataframe construction, this PR added a `--as-columns` flag to `polars into-df` command so that when specified, and when input shape is record of lists, each list will be treated as a column rather than a cell value, i.e. `{a: [1 3], b: [2 4]} \| polars into-df --as-columns` returns the same dataframe as `[[a b];[1 2] [3 4]] \| polars into-df` # User-Facing Changes A new flag `--as-columns`, no change of semantics if this flag is unspecified. --------- Co-authored-by: Ben Yang <ben@ya.ng>	2024-07-30 08:50:50 -05:00
Devyn Cairns	c31291753c	Bump version to `0.96.2` (#13485 ) This should be the new development version. We most likely don't need a 0.96.2 patch release. Should be free to merge PRs after this.	2024-07-29 17:20:55 -07:00
Devyn Cairns	9f90d611e1	Bump version to `0.96.1` (#13439 ) (Post-release bump.)	2024-07-25 18:28:18 +08:00
Devyn Cairns	a80dfe8e80	Bump version to `0.96.0` (#13433 )	2024-07-23 16:10:35 -07:00
dependabot[bot]	63cea44130	Bump uuid from 1.9.1 to 1.10.0 (#13390 ) Bumps [uuid](https://github.com/uuid-rs/uuid) from 1.9.1 to 1.10.0. Release notes (sourced from [uuid's releases](https://github.com/uuid-rs/uuid/releases)): ## 1.10.0 ## Deprecations This release deprecates and renames the following functions: - `Builder::from_rfc4122_timestamp` -> `Builder::from_gregorian_timestamp` - `Builder::from_sorted_rfc4122_timestamp` -> `Builder::from_sorted_gregorian_timestamp` - `Timestamp::from_rfc4122` -> `Timestamp::from_gregorian` - `Timestamp::to_rfc4122` -> `Timestamp::to_gregorian` ## What's Changed - Use const identifier in uuid macro by @Vrajs16 in uuid-rs/uuid#764 - Rename most methods referring to RFC4122 by @Mikopet / @KodrAus in uuid-rs/uuid#765 - prepare for 1.10.0 release by @KodrAus in uuid-rs/uuid#766 ## New Contributors - @Vrajs16 made their first contribution in uuid-rs/uuid#764 **Full Changelog**: https://github.com/uuid-rs/uuid/compare/1.9.1...1.10.0 Commits: - `4b4c590` Merge pull request #766 from uuid-rs/cargo/1.10.0 - `68eff32` Merge pull request #765 from uuid-rs/chore/time-fn-deprecations - `3d5384d` update docs and deprecation messages for timestamp fns - `de50f20` renaming rfc4122 functions - `4a88417` prepare for 1.10.0 release - `66b4fce` Merge pull request #764 from Vrajs16/main - `8896e26` Use expr instead of ident - `09973d6` Added changes - `6edf3e8` Use const identifer in uuid macro - See full diff in [compare view](https://github.com/uuid-rs/uuid/compare/1.9.1...1.10.0) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-07-17 09:46:59 +08:00
Jack Wright	b68c7cf3fa	Make `polars unpivot` consistent with `polars pivot` (#13335 ) # Description Makes `polars unpivot` use the same arguments as `polars pivot` and makes it consistent with the Polars Rust API. Additionally, support for the Polars streaming engine has been exposed on eager dataframes. Previously, it only worked with lazy dataframes. # User-Facing Changes * `polars unpivot` argument `--columns`\|`-c` has been renamed to `--index`\|`-i` * `polars unpivot` argument `--values`\|`-v` has been renamed to `--on`\|`-o` * `polars unpivot` short argument for `--streamable` is now `-t` to make it consistent with `polars pivot`. It was made `-t` for `polars pivot` because `-s` is short for `--short`	2024-07-10 16:36:38 -05:00
Jack Wright	ff27d6a18e	Implemented a command to expose Polars' pivot functionality (#13282 ) # Description Implementing pivot support. The example below is a port of the [python API example](https://docs.pola.rs/api/python/stable/reference/dataframe/api/polars.DataFrame.pivot.html) <img width="1079" alt="Screenshot 2024-07-01 at 14 29 27" src="https://github.com/nushell/nushell/assets/56345/277eb7a2-233b-4070-9d24-c2183805c1b8"> # User-Facing Changes * Introduction of the `polars pivot` command	2024-07-09 10:17:20 -07:00
Jack Wright	8316a1597e	Polars: Check to see if the cache is empty before enabling GC. More logging (#13286 ) There was a bug where anytime the plugin cache remove was called, the plugin gc was turned back on. This probably happened when I added the reference counter logic.	2024-07-03 06:44:26 -05:00
Jack Wright	720b4cbd01	Polars 0.41 Upgrade (#13238 ) # Description Upgrading to Polars 0.41 # User-Facing Changes * `polars melt` has been renamed to `polars unpivot` to match the change in the polars API. Additionally, it now supports lazy dataframes. Introduced a `--streamable` option to use the polars streaming engine for lazy frames. * The parameter `outer` has been replaced with `full` in `polars join` to match polars change. * `polars value-count` now supports the column (rename count column), parallelize (multithread), sort, and normalize options. The list of polars changes can be found [here](https://github.com/pola-rs/polars/releases/tag/rs-0.41.2)	2024-06-28 06:37:45 -05:00
Jack Wright	1f1f581357	Converted perf function to be a macro. Utilized the perf macro within the polars plugin. (#13224 ) In this pull request, I converted the `perf` function within `nu_utils` to a macro. This change facilitates easier usage within plugins by allowing the use of `env_logger` and setting `RUST_LOG=nu_plugin_polars` (or another plugin). Without this conversion, the `RUST_LOG` variable would need to be set to `RUST_LOG=nu_utils::utils`, which is less intuitive and makes it impossible to narrow the perf results to one plugin.	2024-06-27 18:56:56 -05:00
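The commit above does not spell out why a macro changes the `RUST_LOG` target. One plausible reading is that logging macros derive their default target from `module_path!()`, which expands where it is written: inside a function body it names the module that defines the function, while inside a macro it names the module of the caller. The sketch below illustrates only that language behaviour; the module names `utils` and `plugin` and the helper names are hypothetical and are not taken from the nushell source.

```rust
// A macro expands at the call site, so `module_path!()` inside it
// reports the module of whoever invokes the macro.
macro_rules! perf_target {
    () => {
        module_path!()
    };
}

mod utils {
    // A plain function always reports the module it is defined in.
    pub fn perf_target_fn() -> &'static str {
        module_path!()
    }
}

mod plugin {
    pub fn via_fn() -> &'static str {
        super::utils::perf_target_fn()
    }

    pub fn via_macro() -> &'static str {
        perf_target!()
    }
}

fn main() {
    // The function-based target ends in `utils`, the macro-based one in `plugin`.
    println!("function target: {}", plugin::via_fn());
    println!("macro target:    {}", plugin::via_macro());
}
```

With a per-caller target, a log filter can select one plugin's timing output instead of everything routed through the shared utility module.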
dependabot[bot]	38ecb6d380	Bump uuid from 1.8.0 to 1.9.1 (#13227 )	2024-06-26 06:43:46 +00:00
Jack Wright	0dd35cddcd	Bumping version to 0.95.1 (#13231 ) Marks development for hotfix	2024-06-25 18:26:07 -07:00
Jakub Žádník	f93c6680bd	Bump to 0.95.0 (#13221 )	2024-06-25 21:29:47 +03:00
Jack Wright	db86dd9f26	Polars default infer (#13193 ) Addresses performance issues that @maxim-uvarov found with CSV and JSON lines. This ensures that the schema inference follows the polars default of 100 lines. Recent changes caused the default values to be overridden, so the entire file was scanned when inferring the schema.	2024-06-22 07:23:42 -05:00
Devyn Cairns	91d44f15c1	Allow plugins to report their own version and store it in the registry (#12883 ) # Description This allows plugins to report their version (and potentially other metadata in the future). The version is shown in `plugin list` and in `version`. The metadata is stored in the registry file, and reflects whatever was retrieved on `plugin add`, not necessarily the running binary. This can help you to diagnose if there's some kind of mismatch with what you expect. We could potentially use this functionality to show a warning or error if a plugin being run does not have the same version as what was in the cache file, suggesting `plugin add` be run again, but I haven't done that at this point. It is optional, and it requires the plugin author to make some code changes if they want to provide it, since I can't automatically determine the version of the calling crate or anything tricky like that to do it. Example: ``` > plugin list \| select name version is_running pid ╭───┬────────────────┬─────────┬────────────┬─────╮ │ # │ name │ version │ is_running │ pid │ ├───┼────────────────┼─────────┼────────────┼─────┤ │ 0 │ example │ 0.93.1 │ false │ │ │ 1 │ gstat │ 0.93.1 │ false │ │ │ 2 │ inc │ 0.93.1 │ false │ │ │ 3 │ python_example │ 0.1.0 │ false │ │ ╰───┴────────────────┴─────────┴────────────┴─────╯ ``` cc @maxim-uvarov (he asked for it) # User-Facing Changes - `plugin list` gets a `version` column - `version` shows plugin versions when available - plugin authors should add `fn metadata()` to their `impl Plugin`, but don't have to # Tests + Formatting Tested the low level stuff and also the `plugin list` column. # After Submitting - [ ] update plugin guide docs - [ ] update plugin protocol docs (`Metadata` call & response) - [ ] update plugin template (`fn metadata()` should be easy) - [ ] release notes	2024-06-21 06:27:09 -05:00
Jack Wright	20834c9d47	Added the ability to turn on performance debugging through an env var for the polars plugin (#13191 ) This allows performance debugging to be turned on by setting: ```nushell $env.POLARS_PLUGIN_PERF = "true" ``` Furthermore, this improves the other plugin debugging by allowing the env variable for debugging to be set at any time, rather than having to be available when nushell is launched: ```nushell $env.POLARS_PLUGIN_DEBUG = "true" ``` This pull request introduces a `perf` function that will output timing results. It works very similarly to the perf function available in nu_utils::utils::perf. This version prints everything to stderr so as not to break the plugin stream, and uses the engine interface to see if the env variable is configured. This pull request uses this `perf` function when: * opening csv files as dataframes * opening json lines files as dataframes This will hopefully help provide some more fine-grained information on how long it takes polars to open different dataframes. The `perf` function can also be utilized later for other dataframe use cases.	2024-06-20 16:37:38 -07:00
Jack Wright	7d2d573eb8	Added the ability to open json lines dataframes with polars lazy json lines reader. (#13167 ) The `--lazy` flag will now use the polars' LazyJsonLinesReader when opening a json lines file with `polars open`	2024-06-20 10:55:49 -07:00
Jack Wright	021b8633cb	Allow the addition of an index column to be optional (#13097 ) Per discussion on the Discord dataframes channel with @maxim-uvarov and pyz. When converting a dataframe to a Nushell value via `polars into-nu`, the index column should not be added by default and should only be added when specifying `--index`	2024-06-10 10:45:25 +08:00
Jack Wright	650ae537c3	Fix the use of right hand expressions in operations (#13096 ) As reported by @maxim-uvarov and pyz in the dataframes discord channel: ```nushell [[a b]; [1 1] [1 2] [2 1] [2 2] [3 1] [3 2]] \| polars into-df \| polars with-column ((polars col a) / (polars col b)) --name c × Type mismatch. ╭─[entry #45:1:102] 1 │ [[a b]; [1 1] [1 2] [2 1] [2 2] [3 1] [3 2]] \| polars into-df \| polars with-column ((polars col a) / (polars col b)) --name c · ───────┬────── · ╰── Right hand side not a dataframe expression ╰──── ``` This pull request corrects the type casting on the right hand side and allows more than just polars literal expressions.	2024-06-10 10:44:04 +08:00
Jack Wright	a6b1d1f6d9	Upgrade to polars 0.40 (#13069 ) Upgrading to polars 0.40	2024-06-06 07:26:47 +08:00
Jack Wright	b10325dff1	Allow int values to be converted into floats. (#13025 ) Addresses the bug found by @maxim-uvarov when trying to coerce an int Value to a polars float: <img width="863" alt="image" src="https://github.com/nushell/nushell/assets/56345/4d858812-a7b3-4296-98f4-dce0c544b4c6"> Conversion now works correctly: <img width="891" alt="Screenshot 2024-05-31 at 14 28 51" src="https://github.com/nushell/nushell/assets/56345/78d9f711-7ad5-4503-abc6-7aba64a2e675">	2024-06-04 18:51:11 -07:00
Jack Wright	a84fdb1d37	Fixed a couple of incorrect error messages (#13043 ) Fixed a couple of error messages that were incorrectly reported as parquet errors instead of CSV errors.	2024-06-05 07:40:02 +08:00
Wind	ad5a6cdc00	bump version to 0.94.3 (#13055 )	2024-06-05 06:52:40 +08:00
Devyn Cairns	6635b74d9d	Bump version to `0.94.2` (#13014 ) Version bump after 0.94.1 patch release.	2024-06-03 10:28:35 +03:00
Devyn Cairns	f3991f2080	Bump version to `0.94.1` (#12988 ) Merge this PR before merging any other PRs.	2024-05-28 22:41:23 +00:00
Jakub Žádník	61182deb96	Bump version to 0.94.0 (#12987 )	2024-05-28 12:04:09 -07:00
Darren Schroeder	0c5a67f4e5	make polars plugin use mimalloc (#12967 ) # Description @maxim-uvarov did a ton of research and work with the dply-rs author and ritchie from polars and found out that the allocator matters on macos and it seems to be what was messing up the performance of polars plugin. ritchie suggested to use jemalloc but i switched it to mimalloc to match nushell and it seems to run better. ## Before (default allocator) note - using 1..10 vs 1..100 since it takes so long. also notice how high the `max` timings are compared to mimalloc below. ```nushell ❯ 1..10 \| each {timeit {polars open Data7602DescendingYearOrder.csv \| polars group-by year \| polars agg (polars col geo_count \| polars sum) \| polars collect \| null}} \| \| {mean: ($in \| math avg), min: ($in \| math min), max: ($in \| math max), stddev: ($in \| into int \| into float \| math stddev \| into int \| $'($in)ns' \| into duration)} ╭────────┬─────────────────────────╮ │ mean │ 4sec 999ms 605µs 995ns │ │ min │ 983ms 627µs 42ns │ │ max │ 13sec 398ms 135µs 791ns │ │ stddev │ 3sec 476ms 479µs 939ns │ ╰────────┴─────────────────────────╯ ❯ use std bench ❯ bench { polars open Data7602DescendingYearOrder.csv \| polars group-by year \| polars agg (polars col geo_count \| polars sum) \| polars collect \| null } -n 10 ╭───────┬────────────────────────╮ │ mean │ 6sec 220ms 783µs 983ns │ │ min │ 1sec 184ms 997µs 708ns │ │ max │ 18sec 882ms 81µs 708ns │ │ std │ 5sec 350ms 375µs 697ns │ │ times │ [list 10 items] │ ╰───────┴────────────────────────╯ ``` ## After (using mimalloc) ```nushell ❯ 1..100 \| each {timeit {polars open Data7602DescendingYearOrder.csv \| polars group-by year \| polars agg (polars col geo_count \| polars sum) \| polars collect \| null}} \| \| {mean: ($in \| math avg), min: ($in \| math min), max: ($in \| math max), stddev: ($in \| into int \| into float \| math stddev \| into int \| $'($in)ns' \| into duration)} ╭────────┬───────────────────╮ │ mean │ 103ms 728µs 902ns │ │ min 
│ 97ms 107µs 42ns │ │ max │ 149ms 430µs 84ns │ │ stddev │ 5ms 690µs 664ns │ ╰────────┴───────────────────╯ ❯ use std bench ❯ bench { polars open Data7602DescendingYearOrder.csv \| polars group-by year \| polars agg (polars col geo_count \| polars sum) \| polars collect \| null } -n 100 ╭───────┬───────────────────╮ │ mean │ 103ms 620µs 195ns │ │ min │ 97ms 541µs 166ns │ │ max │ 130ms 262µs 166ns │ │ std │ 4ms 948µs 654ns │ │ times │ [list 100 items] │ ╰───────┴───────────────────╯ ``` ## After (using jemalloc - just for comparison) ```nushell ❯ 1..100 \| each {timeit {polars open Data7602DescendingYearOrder.csv \| polars group-by year \| polars agg (polars col geo_count \| polars sum) \| polars collect \| null}} \| \| {mean: ($in \| math avg), min: ($in \| math min), max: ($in \| math max), stddev: ($in \| into int \| into float \| math stddev \| into int \| $'($in)ns' \| into duration)} ╭────────┬───────────────────╮ │ mean │ 113ms 939µs 777ns │ │ min │ 108ms 337µs 333ns │ │ max │ 166ms 467µs 458ns │ │ stddev │ 6ms 175µs 618ns │ ╰────────┴───────────────────╯ ❯ use std bench ❯ bench { polars open Data7602DescendingYearOrder.csv \| polars group-by year \| polars agg (polars col geo_count \| polars sum) \| polars collect \| null } -n 100 ╭───────┬───────────────────╮ │ mean │ 114ms 363µs 530ns │ │ min │ 108ms 804µs 833ns │ │ max │ 143ms 521µs 459ns │ │ std │ 5ms 88µs 56ns │ │ times │ [list 100 items] │ ╰───────┴───────────────────╯ ``` ## After (using parquet + mimalloc) ```nushell ❯ 1..100 \| each {timeit {polars open data.parquet \| polars group-by year \| polars agg (polars col geo_count \| polars sum) \| polars collect \| null}} \| \| {mean: ($in \| math avg), min: ($in \| math min), max: ($in \| math max), stddev: ($in \| into int \| into float \| math stddev \| into int \| $'($in)ns' \| into duration)} ╭────────┬──────────────────╮ │ mean │ 34ms 255µs 492ns │ │ min │ 31ms 787µs 250ns │ │ max │ 76ms 408µs 416ns │ │ stddev │ 4ms 472µs 916ns │ 
╰────────┴──────────────────╯ ❯ use std bench ❯ bench { polars open data.parquet \| polars group-by year \| polars agg (polars col geo_count \| polars sum) \| polars collect \| null } -n 100 ╭───────┬──────────────────╮ │ mean │ 34ms 897µs 562ns │ │ min │ 31ms 518µs 542ns │ │ max │ 65ms 943µs 625ns │ │ std │ 3ms 450µs 741ns │ │ times │ [list 100 items] │ ╰───────┴──────────────────╯ ```	2024-05-25 09:10:01 -05:00
Ian Manske	84b7a99adf	Revert "Polars lazy refactor (#12669 )" (#12962 ) This reverts commit `68adc4657f`. # Description Reverts the lazyframe refactor (#12669) for the next release, since there are still a few lingering issues. This temporarily solves #12863 and #12828. After the release, the lazyframes can be added back and cleaned up.	2024-05-24 18:09:26 -05:00
Ian Manske	905e3d0715	Remove dataframes crate and feature (#12889 ) # Description Removes the old `nu-cmd-dataframe` crate in favor of the polars plugin. As such, this PR also removes the `dataframe` feature, related CI, and full releases of nushell.	2024-05-20 17:22:08 +00:00
Ian Manske	aec41f3df0	Add `Span` merging functions (#12511 ) # Description This PR adds a few functions to `Span` for merging spans together: - `Span::append`: merges two spans that are known to be in order. - `Span::concat`: returns a span that encompasses all the spans in a slice. The spans must be in order. - `Span::merge`: merges two spans (no order necessary). - `Span::merge_many`: merges an iterator of spans into a single span (no order necessary). These are meant to replace the free-standing `nu_protocol::span` function. The spans in a `LiteCommand` (the `parts`) should always be in order based on the lite parser and lexer. So, the parser code sees the most usage of `Span::append` and `Span::concat` where the order is known. In other code areas, `Span::merge` and `Span::merge_many` are used since the order between spans is often not known.	2024-05-16 22:34:49 +00:00
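The four merging functions described above can be sketched on a stand-alone type. This is a minimal illustration, not the `nu_protocol` implementation: the `Span` struct here is a hypothetical half-open byte range, and returning `Option` for empty input is an assumption made for the sketch.

```rust
/// Hypothetical stand-in for a source span: a half-open byte range.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Span {
    start: usize,
    end: usize,
}

impl Span {
    fn new(start: usize, end: usize) -> Self {
        assert!(start <= end, "span start must not exceed its end");
        Span { start, end }
    }

    /// Merge two spans that are known to be in order.
    fn append(self, other: Span) -> Span {
        debug_assert!(self.end <= other.start, "spans must be in order");
        Span::new(self.start, other.end)
    }

    /// Cover an ordered slice of spans; `None` for an empty slice.
    fn concat(spans: &[Span]) -> Option<Span> {
        let first = spans.first()?;
        let last = spans.last()?;
        Some(Span::new(first.start, last.end))
    }

    /// Merge two spans given in any order.
    fn merge(self, other: Span) -> Span {
        Span::new(self.start.min(other.start), self.end.max(other.end))
    }

    /// Merge any number of spans in any order; `None` if there are none.
    fn merge_many(spans: impl IntoIterator<Item = Span>) -> Option<Span> {
        spans.into_iter().reduce(Span::merge)
    }
}

fn main() {
    let a = Span::new(0, 3);
    let b = Span::new(5, 9);
    println!("{:?}", a.append(b));
    println!("{:?}", Span::concat(&[a, b]));
    println!("{:?}", b.merge(a));
    println!("{:?}", Span::merge_many([b, a]));
}
```

The ordered variants only need the first start and the last end, which is why they suit parser code where the order is already known; the unordered variants pay for a min/max comparison per span.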
Ian Manske	6fd854ed9f	Replace `ExternalStream` with new `ByteStream` type (#12774 ) # Description This PR introduces a `ByteStream` type which is a `Read`-able stream of bytes. Internally, it has an enum over three different byte stream sources: ```rust pub enum ByteStreamSource { Read(Box<dyn Read + Send + 'static>), File(File), Child(ChildProcess), } ``` This is in comparison to the current `RawStream` type, which is an `Iterator<Item = Vec<u8>>` and has to allocate for each read chunk. Currently, `PipelineData::ExternalStream` serves a weird dual role where it is either external command output or a wrapper around `RawStream`. `ByteStream` makes this distinction more clear (via `ByteStreamSource`) and replaces `PipelineData::ExternalStream` in this PR: ```rust pub enum PipelineData { Empty, Value(Value, Option<PipelineMetadata>), ListStream(ListStream, Option<PipelineMetadata>), ByteStream(ByteStream, Option<PipelineMetadata>), } ``` The PR is relatively large, but a decent amount of it is just repetitive changes. This PR fixes #7017, fixes #10763, and fixes #12369. This PR also improves performance when piping external commands. Nushell should, in most cases, have competitive pipeline throughput compared to, e.g., bash. \| Command \| Before (MB/s) \| After (MB/s) \| Bash (MB/s) \| \| -------------------------------------------------- \| -------------:\| ------------:\| -----------:\| \| `throughput \\| rg 'x'` \| 3059 \| 3744 \| 3739 \| \| `throughput \\| nu --testbin relay o> /dev/null` \| 3508 \| 8087 \| 8136 \| # User-Facing Changes - This is a breaking change for the plugin communication protocol, because the `ExternalStreamInfo` was replaced with `ByteStreamInfo`. Plugins now only have to deal with a single input stream, as opposed to the previous three streams: stdout, stderr, and exit code. - The output of `describe` has been changed for external/byte streams. - Temporary breaking change: `bytes starts-with` no longer works with byte streams. 
This is to keep the PR smaller, and `bytes ends-with` already does not work on byte streams. - If a process core dumped, then instead of having a `Value::Error` in the `exit_code` column of the output returned from `complete`, it now is a `Value::Int` with the negation of the signal number. # After Submitting - Update docs and book as necessary - Release notes (e.g., plugin protocol changes) - Adapt/convert commands to work with byte streams (high priority is `str length`, `bytes starts-with`, and maybe `bytes ends-with`). - Refactor the `tee` code, Devyn has already done some work on this. --------- Co-authored-by: Devyn Cairns <devyn.cairns@gmail.com>	2024-05-16 07:11:18 -07:00
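The difference between a `Read`-able stream and an `Iterator<Item = Vec<u8>>` can be shown with a reduced version of the source enum. This is a rough sketch under stated assumptions: `ByteSource` is a hypothetical simplification (the child-process variant is left out), and the two counting helpers are illustrative names, not functions from nushell.

```rust
use std::fs::File;
use std::io::{self, Cursor, Read};

/// Simplified stand-in for the byte stream sources (no child-process variant).
#[allow(dead_code)]
enum ByteSource {
    Read(Box<dyn Read + Send + 'static>),
    File(File),
}

impl Read for ByteSource {
    // Delegate to whichever underlying reader this source wraps.
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        match self {
            ByteSource::Read(reader) => reader.read(buf),
            ByteSource::File(file) => file.read(buf),
        }
    }
}

/// Iterator style: the producer allocates one `Vec<u8>` per chunk.
fn count_bytes_chunked(chunks: impl Iterator<Item = Vec<u8>>) -> usize {
    chunks.map(|chunk| chunk.len()).sum()
}

/// `Read` style: the consumer reuses a single buffer for every read.
fn count_bytes_read(mut source: impl Read) -> io::Result<usize> {
    let mut buf = [0u8; 8];
    let mut total = 0;
    loop {
        let n = source.read(&mut buf)?;
        if n == 0 {
            return Ok(total);
        }
        total += n;
    }
}

fn main() -> io::Result<()> {
    let source = ByteSource::Read(Box::new(Cursor::new(b"hello world".to_vec())));
    println!("read-style total: {}", count_bytes_read(source)?);

    let chunks = vec![b"hello ".to_vec(), b"world".to_vec()];
    println!("chunk-style total: {}", count_bytes_chunked(chunks.into_iter()));
    Ok(())
}
```

Because the enum itself implements `Read`, a consumer does not need to know whether the bytes come from a file or from another reader.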
Jack Wright	6f3dbc97bb	fixed syntax shape requirements for --quantiles option for polars summary (#12878 ) Fix for #12730 All of the code expected a list of floats, but the syntax shape expected a table. Resolved by changing the syntax shape to list of floats. cc: @maxim-uvarov	2024-05-15 16:55:07 -05:00
Jack Wright	98369985b1	Allow custom value operations to work on eager and lazy dataframes interchangeably. (#12819 ) Fixes Bug #12809 The example that @maxim-uvarov posted now works as expected: <img width="1223" alt="Screenshot 2024-05-09 at 16 21 01" src="https://github.com/nushell/nushell/assets/56345/a4df62e3-e432-4c09-8e25-9a6c198741a3">	2024-05-13 18:17:31 -05:00
Jack Wright	68adc4657f	Polars lazy refactor (#12669 ) This moves to predominantly supporting only lazy dataframes for most operations. It removes a lot of the type conversion between lazy and eager dataframes based on what was inputted into the command. For the most part the changes will mean: * You will need to run `polars collect` after performing operations * The into-lazy command has been removed as it is redundant. * When opening files a lazy frame will be outputted by default if the reader supports lazy frames A list of individual command changes can be found [here](https://hackmd.io/@nucore/Bk-3V-hW0) --------- Co-authored-by: Ian Manske <ian.manske@pm.me>	2024-05-06 23:19:11 +00:00
Maxim Uvarov	a1287f7b3f	add more tests to the `polars` plugin (#12719 ) # Description I added some more tests to our mighty `polars` ~~, yet I don't know how to add expected results in some of them. I would like to ask for help.~~ ~~My experiments are in the last commit: [polars: experiments](`f7e5e72019`). Without those experiments `cargo test` goes well.~~ UPD. I moved out my unsuccessful test experiments into a separate [branch](https://github.com/maxim-uvarov/nushell/blob/polars-tests-broken2/). So, this branch seems ready for a merge. @ayax79, maybe you'll find time for me please? It's not urgent for sure. P.S. I'm very new to git. Please feel free to give me any suggestions on how I should use it better	2024-05-03 20:14:55 -05:00
Stefan Holderbach	be6137d136	Fix clippy::wrong_self_convention in polars plugin (#12737 ) Expected `into_` for `fn(self) -> T`	2024-05-02 19:31:51 +02:00
Devyn Cairns	21ebdfe8d7	Bump version to `0.93.1` (#12710 ) # Description Next patch/dev release, `0.93.1`	2024-05-01 17:19:20 -05:00
Devyn Cairns	3b220e07e3	Bump version to `0.93.0` (#12709 ) # Description Bump version to `0.93.0`	2024-04-30 15:51:13 -07:00
Maxim Uvarov	884d5312bb	add tests to `polars unique` (#12683 ) # Description I would like to help with `polars` plugin development and add tests to all the `polars` command's existing params. Since I have never written any lines of Rust, even though the task of creating tests is relatively simple, I would like to ask for feedback to ensure I did everything correctly here.	2024-04-27 12:04:54 -05:00
Ian Manske	9996e4a1f8	Shrink the size of `Expr` (#12610 ) # Description Continuing from #12568, this PR further reduces the size of `Expr` from 64 to 40 bytes. It also reduces `Expression` from 128 to 96 bytes and `Type` from 32 to 24 bytes. This was accomplished by: - for `Expr` with multiple fields (e.g., `Expr::Thing(A, B, C)`), merging the fields into new AST struct types and then boxing this struct (e.g. `Expr::Thing(Box<ABC>)`). - replacing `Vec<T>` with `Box<[T]>` in multiple places. `Expr`s and `Expression`s should rarely be mutated, if at all, so this optimization makes sense. By reducing the size of these types, I didn't notice a large performance improvement (at least compared to #12568). But this PR does reduce the memory usage of nushell. My config is somewhat light so I only noticed a difference of 1.4MiB (38.9MiB vs 37.5MiB). --------- Co-authored-by: Stefan Holderbach <sholderbach@users.noreply.github.com>	2024-04-24 15:46:35 +00:00
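The two techniques named above, boxing multi-field variants and replacing `Vec<T>` with `Box<[T]>`, can be illustrated on toy types. The enums below are hypothetical and much smaller than nushell's real `Expr`; exact sizes depend on the target, so the sketch only prints them rather than claiming specific numbers.

```rust
use std::mem::size_of;

/// Toy enum with a wide multi-field variant stored inline.
#[allow(dead_code)]
enum WideExpr {
    Int(i64),
    Call(String, Vec<u32>, bool),
}

/// The same fields merged into one struct, with the `Vec` turned into a boxed slice.
#[allow(dead_code)]
struct CallData {
    name: String,
    args: Box<[u32]>,
    redirect: bool,
}

/// Toy enum where the wide variant is a single pointer to the struct.
#[allow(dead_code)]
enum NarrowExpr {
    Int(i64),
    Call(Box<CallData>),
}

/// A `Vec` that will not grow again can drop its capacity field.
fn freeze(args: Vec<u32>) -> Box<[u32]> {
    args.into_boxed_slice()
}

fn main() {
    println!("WideExpr:   {} bytes", size_of::<WideExpr>());
    println!("NarrowExpr: {} bytes", size_of::<NarrowExpr>());
    println!("Vec<u32>:   {} bytes", size_of::<Vec<u32>>());
    println!("Box<[u32]>: {} bytes", size_of::<Box<[u32]>>());
    println!("frozen args: {:?}", freeze(vec![1, 2, 3]));
}
```

The trade-off is an extra heap allocation and pointer hop for the boxed variant, which is acceptable when, as the commit notes, the values are rarely mutated after construction.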
Jack Wright	a60381a932	Added commands for working with the plugin cache. (#12576 ) # Description This pull request provides three new commands: `polars store-ls` - moved from `polars ls`; it lists all objects stored in the plugin cache. `polars store-rm` - deletes a cached object. `polars store-get` - gets an object from the cache. The addition of `polars store-get` required adding a reference_count to cached entries. `polars store-get` is the only command that will increment this value. `polars store-rm` will remove the value regardless of its count. Calls to PolarsPlugin::custom_value_dropped will decrement the value. The prefix store- was chosen because there is already a `polars cache` command. These commands were not made subcommands as there isn't a way to display help for subcommands in plugins (e.g. `polars store` displaying help), and the store- prefix seemed fine anyway. The output of `polars store-ls` now shows the reference count for each object. # User-Facing Changes `polars ls` has now moved to `polars store-ls` --------- Co-authored-by: Jack Wright <jack.wright@disqo.com>	2024-04-21 19:43:43 -05:00
Jack Wright	9fb59a6f43	Removed the polars dtypes command (#12577 ) # Description The `polars dtypes` command is largely redundant since the introduction of the schema command. The schema command also has the added benefit that its output can be used as a parameter to other schema commands: ```nushell [[a b]; [5 6] [5 7]] | polars into-df -s ($df | polars schema) ``` # User-Facing Changes `polars dtypes` has been removed. Users should use `polars schema` instead. Co-authored-by: Jack Wright <jack.wright@disqo.com>	2024-04-19 07:01:47 -05:00
Jack Wright	cc7b5c5a26	Only mark collected dataframes as from_lazy=false when collect is called from the collect command. (#12571 ) I had previously changed NuLazyFrame::collect to set the NuDataFrame's from_lazy field to false to prevent conversion back to a lazy frame. It appears there are cases where this should happen. Instead, I am only setting from_lazy=false inside the `polars collect` command. [Related discord message](https://discord.com/channels/601130461678272522/1227612017171501136/1230600465159421993) Co-authored-by: Jack Wright <jack.wright@disqo.com>	2024-04-18 17:10:38 -05:00
Jack Wright	410f3c5c8a	Upgrading nu_plugin_polars to polars 0.39.1 (#12551 ) # Description Upgrading nu_plugin_polars to polars 0.39.1 Co-authored-by: Jack Wright <jack.wright@disqo.com>	2024-04-17 06:35:09 -05:00
Jack Wright	a7a5ec31be	Fixing NuLazyFrame/NuDataFrame conversion issues (#12538 ) # Description @maxim-uvarov brought up another case where converting back and forth between eager and lazy dataframes was not working correctly: ``` > [[a b]; [6 2] [1 4] [4 1]] | polars into-lazy | polars append -c ([[a b]; [6 2] [1 4] [4 1]] | polars into-df) Error: nu::shell::cant_convert × Can't convert to NuDataFrame. ╭─[entry #1:1:49] 1 │ [[a b]; [6 2] [1 4] [4 1]] | polars into-lazy | polars append -c ([[a b]; [6 2] [1 4] [4 1]] | polars into-df) · ──────┬────── · ╰── can't convert NuLazyFrameCustomValue to NuDataFrame ╰──── ``` This pull request fixes this case and the glaringly obvious similar cases I could find. Co-authored-by: Jack Wright <jack.wright@disqo.com>	2024-04-16 11:16:37 -05:00
Jack Wright	1661bb68f9	Cleaning up to_pipe_line_data and cache_and_to_value, making them part of CustomValueSupport (#12528 ) # Description This is just some cleanup. I moved to_pipeline_data and to_cache_value to the CustomValueSupport trait, where I should've put them to begin with. Co-authored-by: Jack Wright <jack.wright@disqo.com>	2024-04-16 06:35:52 -05:00
Jack Wright	5f818eaefe	Ensure that lazy frames converted via to-lazy are not converted back to eager frames later in the pipeline. (#12525 ) # Description @maxim-uvarov discovered the following error: ``` > [[a b]; [6 2] [1 4] [4 1]] | polars into-lazy | polars sort-by a | polars unique --subset [a] Error: × Error using as series ╭─[entry #1:1:68] 1 │ [[a b]; [6 2] [1 4] [4 1]] | polars into-lazy | polars sort-by a | polars unique --subset [a] · ──────┬────── · ╰── dataframe has more than one column ╰──── ``` During investigation, I discovered the root cause was that the lazy frame was incorrectly converted back to an eager dataframe. To keep this from happening, I explicitly set that the dataframe did not come from an eager frame. This causes the conversion logic to not attempt to convert the dataframe later in the pipeline. --------- Co-authored-by: Jack Wright <jack.wright@disqo.com>	2024-04-15 18:29:42 -05:00
Devyn Cairns	2ae9ad8676	Copy-on-write for record values (#12305 ) # Description This adds a `SharedCow` type as a transparent copy-on-write pointer that clones to unique on mutate. As an initial test, the `Record` within `Value::Record` is shared. There are some pretty big wins for performance. I'll post benchmark results in a comment. The biggest winner is nested access, as that would have cloned the records for each cell path follow before and it doesn't have to anymore. The reusability of the `SharedCow` type is nice and I think it could be used to clean up the previous work I did with `Arc` in `EngineState`. It's meant to be a mostly transparent clone-on-write that just clones on `.to_mut()` or `.into_owned()` if there are actually multiple references, but avoids cloning if the reference is unique. # User-Facing Changes - `Value::Record` field is a different type (plugin authors) # Tests + Formatting - 🟢 `toolkit fmt` - 🟢 `toolkit clippy` - 🟢 `toolkit test` - 🟢 `toolkit test stdlib` # After Submitting - [ ] use for `EngineState` - [ ] use for `Value::List`	2024-04-14 01:42:03 +00:00
Jack Wright	10a9a17b8c	Two consecutive calls to into-lazy should not fail (#12505 ) # Description From @maxim-uvarov's [post](https://discord.com/channels/601130461678272522/1227612017171501136/1228656319704203375). When calling `into-lazy` back to back in a pipeline, an error should not occur: ``` > [[a b]; [6 2] [1 4] [4 1]] | polars into-lazy | polars into-lazy Error: nu::shell::cant_convert × Can't convert to NuDataFrame. ╭─[entry #1:1:30] 1 │ [[a b]; [6 2] [1 4] [4 1]] | polars into-lazy | polars into-lazy · ────────┬─────── · ╰── can't convert NuLazyFrameCustomValue to NuDataFrame ╰──── ``` This pull request ensures that custom values of type NuLazyFrameCustomValue are properly converted when passed in. Co-authored-by: Jack Wright <jack.wright@disqo.com>	2024-04-13 13:00:46 -05:00
Jack Wright	b9dd47ebb7	Polars 0.38 upgrade (#12506 ) # Description Polars 0.38 upgrade for both the dataframe crate and the polars plugin. --------- Co-authored-by: Jack Wright <jack.wright@disqo.com>	2024-04-13 13:00:04 -05:00
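
The copy-on-write behaviour described in the `SharedCow` commit (2ae9ad8676) can be illustrated with the standard library alone. The following is a minimal sketch, not Nushell's implementation: it assumes `Arc::make_mut` as a stand-in for the "clone only if there are multiple references" step, and `Record` and `set_field` are hypothetical names introduced for illustration.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Hypothetical stand-in for a record value; NOT Nushell's actual `Record` type.
type Record = HashMap<String, i64>;

/// Sets a field on a shared record.
/// `Arc::make_mut` clones the underlying map only when other references
/// exist; a uniquely owned map is mutated in place.
fn set_field(record: &mut Arc<Record>, key: &str, value: i64) {
    Arc::make_mut(record).insert(key.to_string(), value);
}

fn main() {
    let mut a: Arc<Record> = Arc::new(Record::new());
    set_field(&mut a, "x", 1); // unique reference: no clone needed

    let b = Arc::clone(&a); // cheap: only bumps the reference count
    assert!(Arc::ptr_eq(&a, &b)); // both point at the same allocation

    set_field(&mut a, "x", 2); // shared: `a` gets its own copy first
    assert!(!Arc::ptr_eq(&a, &b));
    assert_eq!(a["x"], 2);
    assert_eq!(b["x"], 1); // `b` still sees the old value
}
```

Read-only access, such as following a nested cell path, never triggers the clone in this sketch, which matches the case the commit message singles out as the biggest win.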

58 Commits