nushell

mirror of https://github.com/nushell/nushell.git synced 2025-07-15 05:45:10 +02:00

Author	SHA1	Message	Date
Jack Wright	5f818eaefe	Ensure that lazy frames converted via to-lazy are not converted back to eager frames later in the pipeline. (#12525) # Description @maxim-uvarov discovered the following error: ``` > [[a b]; [6 2] [1 4] [4 1]] | polars into-lazy | polars sort-by a | polars unique --subset [a] Error: × Error using as series ╭─[entry #1:1:68] 1 │ [[a b]; [6 2] [1 4] [4 1]] | polars into-lazy | polars sort-by a | polars unique --subset [a] · ──────┬────── · ╰── dataframe has more than one column ╰──── ``` During investigation, I discovered the root cause was that the lazy frame was incorrectly converted back to an eager dataframe. To keep this from happening, I explicitly set that the dataframe did not come from an eager frame. This causes the conversion logic to not attempt to convert the dataframe later in the pipeline. --------- Co-authored-by: Jack Wright <jack.wright@disqo.com>	2024-04-15 18:29:42 -05:00
Devyn Cairns	2ae9ad8676	Copy-on-write for record values (#12305) # Description This adds a `SharedCow` type as a transparent copy-on-write pointer that clones to unique on mutate. As an initial test, the `Record` within `Value::Record` is shared. There are some pretty big wins for performance. I'll post benchmark results in a comment. The biggest winner is nested access, as that would have cloned the records for each cell path follow before and it doesn't have to anymore. The reusability of the `SharedCow` type is nice and I think it could be used to clean up the previous work I did with `Arc` in `EngineState`. It's meant to be a mostly transparent clone-on-write that just clones on `.to_mut()` or `.into_owned()` if there are actually multiple references, but avoids cloning if the reference is unique. # User-Facing Changes - `Value::Record` field is a different type (plugin authors) # Tests + Formatting - 🟢 `toolkit fmt` - 🟢 `toolkit clippy` - 🟢 `toolkit test` - 🟢 `toolkit test stdlib` # After Submitting - [ ] use for `EngineState` - [ ] use for `Value::List`	2024-04-14 01:42:03 +00:00
Jack Wright	10a9a17b8c	Two consecutive calls to into-lazy should not fail (#12505) # Description From @maxim-uvarov's [post](https://discord.com/channels/601130461678272522/1227612017171501136/1228656319704203375). When calling `to-lazy` back to back in a pipeline, an error should not occur: ``` > [[a b]; [6 2] [1 4] [4 1]] | polars into-lazy | polars into-lazy Error: nu::shell::cant_convert × Can't convert to NuDataFrame. ╭─[entry #1:1:30] 1 │ [[a b]; [6 2] [1 4] [4 1]] | polars into-lazy | polars into-lazy · ────────┬─────── · ╰── can't convert NuLazyFrameCustomValue to NuDataFrame ╰──── ``` This pull request ensures that custom values of type NuLazyFrameCustomValue are properly converted when passed in. Co-authored-by: Jack Wright <jack.wright@disqo.com>	2024-04-13 13:00:46 -05:00
Jack Wright	b9dd47ebb7	Polars 0.38 upgrade (#12506) # Description Polars 0.38 upgrade for both the dataframe crate and the polars plugin. --------- Co-authored-by: Jack Wright <jack.wright@disqo.com>	2024-04-13 13:00:04 -05:00
Ian Manske	211d9c685c	Fix clippy lint (#12504) Just fixes a clippy lint.	2024-04-13 16:19:32 +00:00
Jack Wright	1bded8572c	Ensure that two columns named index don't exist when converting a Dataframe to a nu Value. (#12501) # Description @maxim-uvarov discovered an issue with the current implementation. When executing `[[index a]; [1 1]] | polars into-df`, a plugin_failed_to_decode error occurs. This happens because a Record is created with two columns named "index" as an index column is added during conversion. This pull request addresses the problem by not adding an index column if there is already a column named "index" in the dataframe. --------- Co-authored-by: Jack Wright <jack.wright@disqo.com>	2024-04-13 06:33:29 -05:00
Jack Wright	f975c9923a	Handle relative paths correctly on polars to-(parquet|jsonl|arrow|etc) commands (#12486) # Description All polars commands that output a file were not handling relative paths correctly. A command like ``` [[a b]; [6 2] [1 4] [4 1]] | polars into-df | polars to-parquet foo.json``` was outputting the foo.json to the directory of the plugin executable. This pull request pulls in nu-path and uses it for resolving the file paths. Related discussion https://discord.com/channels/601130461678272522/1227612017171501136/1227889870358183966 # User-Facing Changes None # Tests + Formatting Done, added tests for each of the polars to-* commands. --------- Co-authored-by: Jack Wright <jack.wright@disqo.com>	2024-04-12 19:30:37 -05:00
Jack Wright	50fb8243c8	Added a short flag -c to polars append --col (#12487) # Description `dfr append --col` had a short version -c. This pull request adds the short flag back. Reference Conversation: https://discord.com/channels/601130461678272522/1227612017171501136/1227902980628676688 Co-authored-by: Jack Wright <jack.wright@disqo.com>	2024-04-12 10:55:36 -05:00
Jack Wright	b9c2f9ee56	displaying span information, creation time, and size with polars ls (#12472) # Description `polars ls` is already different than `dfr ls`. Currently it just shows the cache key, columns, rows, and type. I have added: - creation time - size - span contents - span start and end # Tests + Formatting Done Co-authored-by: Jack Wright <jack.wright@disqo.com>	2024-04-12 09:23:46 -05:00
Stefan Holderbach	872945ae8e	Bump version to `0.92.3` (#12476)	2024-04-12 08:00:43 -05:00
Jack Wright	81c61f3243	Showing full help when running the polars command (#12462) Displays the full help message for all sub commands. Co-authored-by: Jack Wright <jack.wright@disqo.com>	2024-04-10 07:26:33 -05:00
Jack Wright	efc1cfa939	Move dataframes support to a plugin (#12220) WIP This PR covers migration of crates/nu-cmd-dataframes to a new plugin ./crates/nu_plugin_polars ## TODO List Other: - [X] Fix examples - [x] Fix Plugin Test Harness - [X] Move Cache to Mutex<BTreeMap> - [X] Logic for disabling/enabling plugin GC based off whether items are cached. - [x] NuExpression custom values - [X] Optimize caching (don't cache every object creation). - [x] Fix dataframe operations (in NuDataFrameCustomValue::operations) - [x] Added plugin_debug! macro for checking an env variable POLARS_PLUGIN_DEBUG Fix duplicated commands: - [x] There are two polars median commands, one for lazy and one for expr. There should only be one that works for both. I temporarily called one polars expr-median (inside expressions_macros.rs) - [x] polars quantile (lazy, and expr). the expr one is temporarily expr-median - [x] polars is-in (renamed one series-is-in) Commands: - [x] AppendDF - [x] CastDF - [X] ColumnsDF - [x] DataTypes - [x] Summary - [x] DropDF - [x] DropDuplicates - [x] DropNulls - [x] Dummies - [x] FilterWith - [X] FirstDF - [x] GetDF - [x] LastDF - [X] ListDF - [x] MeltDF - [X] OpenDataFrame - [x] QueryDf - [x] RenameDF - [x] SampleDF - [x] SchemaDF - [x] ShapeDF - [x] SliceDF - [x] TakeDF - [X] ToArrow - [x] ToAvro - [X] ToCSV - [X] ToDataFrame - [X] ToNu - [x] ToParquet - [x] ToJsonLines - [x] WithColumn - [x] ExprAlias - [x] ExprArgWhere - [x] ExprCol - [x] ExprConcatStr - [x] ExprCount - [x] ExprLit - [x] ExprWhen - [x] ExprOtherwise - [x] ExprQuantile - [x] ExprList - [x] ExprAggGroups - [x] ExprCount - [x] ExprIsIn - [x] ExprNot - [x] ExprMax - [x] ExprMin - [x] ExprSum - [x] ExprMean - [x] ExprMedian - [x] ExprStd - [x] ExprVar - [x] ExprDatePart - [X] LazyAggregate - [x] LazyCache - [X] LazyCollect - [x] LazyFetch - [x] LazyFillNA - [x] LazyFillNull - [x] LazyFilter - [x] LazyJoin - [x] LazyQuantile - [x] LazyMedian - [x] LazyReverse - [x] LazySelect - [x] LazySortBy - [x] ToLazyFrame - [x] ToLazyGroupBy - [x] LazyExplode - [x] LazyFlatten - [x] AllFalse - [x] AllTrue - [x] ArgMax - [x] ArgMin - [x] ArgSort - [x] ArgTrue - [x] ArgUnique - [x] AsDate - [x] AsDateTime - [x] Concatenate - [x] Contains - [x] Cumulative - [x] GetDay - [x] GetHour - [x] GetMinute - [x] GetMonth - [x] GetNanosecond - [x] GetOrdinal - [x] GetSecond - [x] GetWeek - [x] GetWeekDay - [x] GetYear - [x] IsDuplicated - [x] IsIn - [x] IsNotNull - [x] IsNull - [x] IsUnique - [x] NNull - [x] NUnique - [x] NotSeries - [x] Replace - [x] ReplaceAll - [x] Rolling - [x] SetSeries - [x] SetWithIndex - [x] Shift - [x] StrLengths - [x] StrSlice - [x] StrFTime - [x] ToLowerCase - [x] ToUpperCase - [x] Unique - [x] ValueCount --------- Co-authored-by: Jack Wright <jack.wright@disqo.com>	2024-04-09 19:31:43 -05:00

12 Commits
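
Note on commit 2ae9ad8676: the message describes `SharedCow` as a pointer that clones only on `.to_mut()` or `.into_owned()`, and only when the value is actually shared. The sketch below illustrates that behavior using the standard library's `Arc::make_mut`. It is not the nushell implementation; the struct layout and the `shares_with` helper are assumptions made for illustration, and only the names `SharedCow`, `to_mut`, and `into_owned` come from the commit message.

```rust
use std::ops::Deref;
use std::sync::Arc;

/// Hypothetical stand-in for the `SharedCow` type named in the commit.
#[derive(Clone, Debug)]
pub struct SharedCow<T: Clone>(Arc<T>);

impl<T: Clone> SharedCow<T> {
    pub fn new(value: T) -> Self {
        SharedCow(Arc::new(value))
    }

    /// Mutable access. `Arc::make_mut` clones the inner value only if
    /// other references to it exist; a unique reference is reused as-is.
    pub fn to_mut(&mut self) -> &mut T {
        Arc::make_mut(&mut self.0)
    }

    /// Take ownership, cloning only if the value is still shared.
    pub fn into_owned(self) -> T {
        Arc::try_unwrap(self.0).unwrap_or_else(|shared| (*shared).clone())
    }

    /// Illustration-only helper: do both handles point at the same allocation?
    pub fn shares_with(&self, other: &Self) -> bool {
        Arc::ptr_eq(&self.0, &other.0)
    }
}

// Reads go straight through to the inner value, which keeps the
// wrapper mostly transparent to callers.
impl<T: Clone> Deref for SharedCow<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.0
    }
}
```

Cloning a `SharedCow` only bumps a reference count, so following a nested cell path does not copy the record at each step. The copy happens on the first `to_mut()` call made through a handle that is still shared.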