Provides support for reading Polars structs. This allows opening of
supported files (JSONL, Parquet, etc.) that contain rows with structured
data.
The attached JSON Lines file
([receipts.jsonl.gz](https://github.com/nushell/nushell/files/13311476/receipts.jsonl.gz))
contains a `customer` column with structured data. This file can now be
loaded via `dfr open` and will render as follows:
<img width="525" alt="Screenshot 2023-11-09 at 10 09 18"
src="https://github.com/nushell/nushell/assets/56345/4b26ccdc-c230-43ae-a8d5-8af88a1b72de">
This also includes some cleanup of date handling, making use of
timezones where they are provided.
This pull request only addresses reading data from Polars structs. I
will address converting nushell data to Polars structs in a future pull
request, as this change is large enough as it is.
---------
Co-authored-by: Jack Wright <jack.wright@disqo.com>
# Description
These changes are easy to make with rust-analyzer, but I didn't want to
just pump them all out without feedback.
Part of #10700
# User-Facing Changes
None
# Tests + Formatting
- 🟢 `toolkit fmt`
- 🟢 `toolkit clippy`
- 🟢 `toolkit test`
- 🟢 `toolkit test stdlib`
# After Submitting
N/A
---------
Co-authored-by: Stefan Holderbach <sholderbach@users.noreply.github.com>
# Description
This PR cleans up some warnings from the latest chrono dependency.
# User-Facing Changes
# Tests + Formatting
# After Submitting
# Description
As part of the refactor to split spans off of Value, this moves to using
helper functions to create values, and using `.span()` instead of
matching span out of Value directly.
Hoping to get a few more helping hands to finish this, as there are a
lot of commands to update :)
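As an illustrative sketch of the pattern (the value, span, and function name here are made up, not taken from the diff):

```rust
use nu_protocol::{Span, Value};

fn example() -> (Value, Span) {
    let span = Span::unknown();

    // Before this refactor, code would construct the enum variant directly
    // (e.g. `Value::Int { val: 42, span }`) and later match the span back
    // out of the value.

    // After: use the helper constructor and the `span()` accessor instead.
    let value = Value::int(42, span);
    let value_span = value.span();
    (value, value_span)
}
```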
# User-Facing Changes
# Tests + Formatting
# After Submitting
---------
Co-authored-by: Darren Schroeder <343840+fdncred@users.noreply.github.com>
Co-authored-by: WindSoilder <windsoilder@outlook.com>
This addresses the warnings generated from using `DateTime::from_utc`.
`DateTime::from_utc` was deprecated as of chrono 0.4.27.
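For reference, the usual migration looks like the sketch below (illustrative only, not the exact diff from this PR):

```rust
use chrono::{DateTime, NaiveDateTime, Utc};

fn to_utc(naive: NaiveDateTime) -> DateTime<Utc> {
    // Deprecated as of chrono 0.4.27:
    // DateTime::<Utc>::from_utc(naive, Utc)

    // Non-deprecated replacement (newer chrono versions also offer `naive.and_utc()`):
    DateTime::from_naive_utc_and_offset(naive, Utc)
}
```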
Co-authored-by: Jack Wright <jack.wright@disqo.com>
Upgrades Polars and SQLParser. I have exposed features that have been
added to Polars as command arguments where appropriate.
---------
Co-authored-by: Jack Wright <jack.wright@disqo.com>
Co-authored-by: Darren Schroeder <343840+fdncred@users.noreply.github.com>
Co-authored-by: sholderbach <sholderbach@users.noreply.github.com>
# Description
This doesn't really do much that the user could see, but it helps get us
ready to do the steps of the refactor to split the span off of Value, so
that values can be spanless. This allows us to have top-level values
that can hold both a Value and a Span, without requiring that all values
have them.
We expect to see significant memory reduction by removing so many
unnecessary spans from values. For example, a table of 100,000 rows and
5 columns would have a savings of ~8megs in just spans that are almost
always duplicated.
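Back-of-the-envelope (assuming a `Span` is two 8-byte offsets, i.e. 16 bytes per value): 100,000 rows × 5 columns × 16 bytes ≈ 8,000,000 bytes, or about 8 MB spent on spans.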
# User-Facing Changes
Nothing yet
# Tests + Formatting
# After Submitting
# Description
This PR creates a new `Record` type to reduce duplicate code and
possibly bugs as well. (This is an edited version of #9648.)
- `Record` implements `FromIterator` and `IntoIterator` and so can be
iterated over or collected into. For example, this helps with
conversions to and from (hash)maps. (Also, no more
`cols.iter().zip(vals)`!)
- `Record` has a `push(col, val)` function to help ensure that the
number of columns is equal to the number of values. I caught a few
potential bugs thanks to this (e.g. in the `ls` command).
- Finally, this PR also adds a `record!` macro that helps simplify
record creation. It is used like so:
```rust
record! {
"key1" => some_value,
"key2" => Value::string("text", span),
"key3" => Value::int(optional_int.unwrap_or(0), span),
"key4" => Value::bool(config.setting, span),
}
```
Since macros hinder formatting, etc., the right-hand-side values should
be relatively short and sweet, like the examples above.
Where possible, prefer `record!` or `.collect()` on an iterator instead
of multiple `Record::push`s, since the first two automatically set the
record capacity and do less work overall.
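As a rough illustration of that guidance (a sketch, assuming a `Record::new()` constructor and that `Record` collects from `(String, Value)` pairs as described above; the column names and spans are made up):

```rust
use nu_protocol::{Record, Span, Value};

fn example() -> (Record, Record) {
    let span = Span::unknown();

    // Collecting an iterator of (column, value) pairs sets the capacity up front.
    let collected: Record = (1..=3i64)
        .map(|i| (format!("col{i}"), Value::int(i, span)))
        .collect();

    // Equivalent result, but each push grows the record one entry at a time.
    let mut pushed = Record::new();
    for i in 1..=3i64 {
        pushed.push(format!("col{i}"), Value::int(i, span));
    }

    (collected, pushed)
}
```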
# User-Facing Changes
Besides the changes in `nu-protocol` the only other breaking changes are
to `nu-table::{ExpandedTable::build_map, JustTable::kv_table}`.
# Description
- Adds support for converting between nushell lists and Polars lists
instead of treating them as a Polars object.
- Fixed `explode` and `flatten` to work both as expressions and as lazy
dataframe commands. The previous item was required to make this work.
---------
Co-authored-by: Jack Wright <jack.wright@disqo.com>
Co-authored-by: Darren Schroeder <343840+fdncred@users.noreply.github.com>
# Description
Today the only way to extract date parts from a dfr series is the `dfr
get-*` set of commands. These create a new dataframe containing just the
date part, which is almost entirely useless: as far as I can tell,
there is no way to append the result as a series in the original
dataframe. In discussion with fdncred on Discord, we decided the best
route was to add an expression for extracting date parts that can be
used with `dfr with-column`, since expressions are the way you
manipulate series within a dataframe.
I'd like feedback on this approach - I think it's a fair way to handle
things. An example to test it:
```
[[record_time]; [(date now)]] | dfr into-df | dfr with-column [
    ((dfr col record_time) | dfr datepart nanosecond | dfr as "ns"),
    ((dfr col record_time) | dfr datepart second | dfr as "s"),
    ((dfr col record_time) | dfr datepart minute | dfr as "m"),
    ((dfr col record_time) | dfr datepart hour | dfr as "h")
]
```
I'm also proposing we deprecate the `dfr get-*` commands. I've not been able to figure out any meaningful way they could ever be useful, and this approach makes more sense: it attaches the extracted date part to the original dataframe as a new column.
# User-Facing Changes
Adds `dfr datepart` as an expression.
# Tests + Formatting
I still need to add tests with better assertions. I'm also not sure how to properly write the `test_dataframe` at the bottom, but I will revisit that as part of this PR. I wanted to get feedback early.
# After Submitting
---------
Co-authored-by: Robert Waugh <robert@waugh.io>
All of the dataframe commands were ported over with no issues.
### 11 tests are commented out (for now)
100 of the original 111 tests are passing, with only 11 tests ignored
for now.
As per our conversation in the core team meeting on Wednesday, I took
@jntrnr's suggestion and commented out the tests dealing with
[IntoDatetime](https://github.com/nushell/nushell/blob/main/crates/nu-command/src/conversions/into/mod.rs).
Later on we can move this functionality out of nu-command if we decide
it makes sense.
### The following tests were ignored...
```
modified: crates/nu-cmd-dataframe/src/dataframe/series/date/get_day.rs
modified: crates/nu-cmd-dataframe/src/dataframe/series/date/get_hour.rs
modified: crates/nu-cmd-dataframe/src/dataframe/series/date/get_minute.rs
modified: crates/nu-cmd-dataframe/src/dataframe/series/date/get_month.rs
modified: crates/nu-cmd-dataframe/src/dataframe/series/date/get_nanosecond.rs
modified: crates/nu-cmd-dataframe/src/dataframe/series/date/get_ordinal.rs
modified: crates/nu-cmd-dataframe/src/dataframe/series/date/get_second.rs
modified: crates/nu-cmd-dataframe/src/dataframe/series/date/get_week.rs
modified: crates/nu-cmd-dataframe/src/dataframe/series/date/get_weekday.rs
modified: crates/nu-cmd-dataframe/src/dataframe/series/date/get_year.rs
modified: crates/nu-cmd-dataframe/src/dataframe/series/string/strftime.rs
```