nushell

mirror of https://github.com/nushell/nushell.git synced 2025-07-09 10:57:54 +02:00

Author	SHA1	Message	Date
Ian Manske	f4c0d9d45b	Path migration part 4: various tests (#13373) # Description Part 4 of replacing std::path types with nu_path types added in https://github.com/nushell/nushell/pull/13115. This PR migrates various tests throughout the code base.	2024-08-03 10:09:13 +02:00
Ian Manske	d56457d63e	Path migration part 2: `nu-test-support` (#13329) # Description Part 2 of replacing `std::path` types with `nu_path` types added in #13115. This PR targets `nu-test-support`.	2024-07-12 02:43:10 +00:00
Stefan Holderbach	406df7f208	Avoid taking unnecessary ownership of intermediates (#12740 ) # Description Judiciously try to avoid allocations/clone by changing the signature of functions - Don't pass str by value unnecessarily if only read - Don't require a vec in `Sandbox::with_files` - Remove unnecessary string clone - Fixup unnecessary borrow - Use `&str` in shape color instead - Vec -> Slice - Elide string clone - Elide `Path` clone - Take &str to elide clone in tests # User-Facing Changes None # Tests + Formatting This touches many tests purely in changing from owned to borrowed/static data	2024-05-04 00:53:15 +00:00
Antoine Stevan	be5ed3290c	add "to nuon" enumeration of possible styles (#12591 ) # Description in order to change the style of the _serialized_ NUON data, `nuon::to_nuon` takes three mutually exclusive arguments, `raw: bool`, `tabs: Option<usize>` and `indent: Option<usize>` 🤔 this begs to use an enumeration with all possible alternatives, right? this PR changes the signature of `nuon::to_nuon` to use `nuon::ToStyle` which has three variants - `Raw`: no newlines - `Tabs(n: usize)`: newlines and `n` tabulations as indent - `Spaces(n: usize)`: newlines and `n` spaces as indent # User-Facing Changes the signature of `nuon::to_nuon` changes from ```rust to_nuon( input: &Value, raw: bool, tabs: Option<usize>, indent: Option<usize>, span: Option<Span>, ) -> Result<String, ShellError> ``` to ```rust to_nuon( input: &Value, style: ToStyle, span: Option<Span> ) -> Result<String, ShellError> ``` # Tests + Formatting # After Submitting	2024-04-20 11:40:52 +02:00
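A minimal sketch of why one enumeration is a better fit than three mutually exclusive arguments: with an enum, only one style can be chosen at a time, and a `match` covers every case. The variant names follow the commit message above; the `indent_unit` and `render_list` helpers are hypothetical and only illustrate the idea. This is not the `nuon` crate's implementation, and the output format here is a toy one.

```rust
/// Toy stand-in for the style enum described in the commit message.
/// Variant names come from the message; everything else is illustrative.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ToStyle {
    /// No newlines.
    Raw,
    /// Newlines, with `n` tab characters per indent level.
    Tabs(usize),
    /// Newlines, with `n` spaces per indent level.
    Spaces(usize),
}

/// Hypothetical helper: the indent string for one level, or `None` for `Raw`.
pub fn indent_unit(style: ToStyle) -> Option<String> {
    match style {
        ToStyle::Raw => None,
        ToStyle::Tabs(n) => Some("\t".repeat(n)),
        ToStyle::Spaces(n) => Some(" ".repeat(n)),
    }
}

/// Hypothetical helper: render a flat list of integers in the chosen style.
pub fn render_list(items: &[i64], style: ToStyle) -> String {
    let parts: Vec<String> = items.iter().map(|i| i.to_string()).collect();
    match indent_unit(style) {
        // Raw: everything on one line.
        None => format!("[{}]", parts.join(", ")),
        // Tabs/Spaces: one item per line, each prefixed with the indent.
        Some(indent) => {
            let body: Vec<String> = parts.iter().map(|p| format!("{indent}{p}")).collect();
            format!("[\n{}\n]", body.join(",\n"))
        }
    }
}
```

With the old three-argument form, a caller could pass `raw: true` together with `tabs: Some(2)`, and the function had to decide which one wins; the enum makes that combination unrepresentable.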
Antoine Stevan	55edef5dda	create `nuon` crate from `from nuon` and `to nuon` (#12553) # Description playing with the NUON format in Rust code in some plugins, we agreed with the team it was a great time to create a standalone NUON format to allow Rust devs to use this Nushell file format. > Note > this PR almost copy-pastes the code from `nu_commands/src/formats/from/nuon.rs` and `nu_commands/src/formats/to/nuon.rs` to `nuon/src/from.rs` and `nuon/src/to.rs`, with minor tweaks to make them standalone functions, e.g. remove the rest of the command implementations ### TODO - [x] add tests - [x] add documentation # User-Facing Changes devs will have access to a new crate, `nuon`, and two functions, `from_nuon` and `to_nuon` ```rust from_nuon( input: &str, span: Option<Span>, ) -> Result<Value, ShellError> ``` ```rust to_nuon( input: &Value, raw: bool, tabs: Option<usize>, indent: Option<usize>, span: Option<Span>, ) -> Result<String, ShellError> ``` # Tests + Formatting i've basically taken all the tests from `crates/nu-command/tests/format_conversions/nuon.rs` and converted them to use `from_nuon` and `to_nuon` instead of Nushell commands - i've created a `nuon_end_to_end` to run both conversions with an optional middle value to check that all is fine > Note > the `nuon::tests::read_code_should_fail_rather_than_panic` test does give different results locally and in the CI... > i've left it ignored with comments to help future us :) # After Submitting mention that in the release notes for sure!!	2024-04-19 13:54:16 +02:00
Skyler Hawthorne	cf923fc44c	into sqlite: Fix insertion of null values (#12328) # Description In #10232, the allowed input types were changed to be stricter, only allowing records with types that can easily map onto sqlite equivalents. Unfortunately, null was left out of the accepted input types, which makes inserting rows with null values impossible. This change fixes that by accepting null values as input. One caveat of this is that when the command is creating a new table, it uses the first row to infer an appropriate sqlite schema. If the first row contains a null value, then it is impossible to tell which type this column is supposed to have. Throwing a hard error seems undesirable from a UX perspective, but guessing can lead to a potentially useless database if we guess wrong. So as a compromise, for null columns, we will assume the sqlite type is TEXT and print a warning so the user knows. For the time being, if users can't avoid a first row with null values, but also want the right schema, they are advised to create their table before running `into sqlite`. A future PR can add the ability to explicitly specify a schema. Fixes #12225 # Tests + Formatting * Tests added to cover expected behavior around insertion of null values	2024-03-29 06:41:16 -05:00
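The compromise described above (infer the schema from the first row, fall back to TEXT for null, and warn) can be sketched as follows. This is an illustration under stated assumptions, not the command's code: the `Cell` enum and `infer_schema` function are hypothetical, and the mapping of non-null values to `INTEGER`, `REAL` and `TEXT` is assumed for the example rather than taken from the PR.

```rust
/// Hypothetical cell type standing in for the values of a first row.
#[derive(Debug, Clone, PartialEq)]
pub enum Cell {
    Null,
    Int(i64),
    Float(f64),
    Text(String),
}

/// Infer a column type for each field of the first row.
/// Returns the (column name, SQL type) pairs plus any warnings to show the user.
pub fn infer_schema(first_row: &[(&str, Cell)]) -> (Vec<(String, &'static str)>, Vec<String>) {
    let mut columns = Vec::new();
    let mut warnings = Vec::new();
    for (name, cell) in first_row {
        let sql_type = match cell {
            // Assumed mapping for illustration only.
            Cell::Int(_) => "INTEGER",
            Cell::Float(_) => "REAL",
            Cell::Text(_) => "TEXT",
            // A null carries no type information: assume TEXT and warn,
            // rather than failing hard or guessing silently.
            Cell::Null => {
                warnings.push(format!(
                    "column '{name}' is null in the first row; assuming TEXT"
                ));
                "TEXT"
            }
        };
        columns.push((name.to_string(), sql_type));
    }
    (columns, warnings)
}
```

A caller that needs a different type for such a column would, as the message advises, create the table beforehand so that no inference happens.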
Ian Manske	dfe072fd30	Fix `chrono` deprecation warnings (#12091) # Description Bumps `chrono` to 0.4.35 and fixes any deprecation warnings.	2024-03-07 06:01:30 -06:00
Ian Manske	fb4251aba7	Remove `Record::from_raw_cols_vals_unchecked` (#11810) # Description Follows from #11718 and replaces all usages of `Record::from_raw_cols_vals_unchecked` with iterator or `record!` equivalents.	2024-02-18 14:20:22 +02:00
Ian Manske	1c49ca503a	Name the `Value` conversion functions more clearly (#11851) # Description This PR renames the conversion functions on `Value` to be more consistent. It follows the Rust [API guidelines](https://rust-lang.github.io/api-guidelines/naming.html#ad-hoc-conversions-follow-as_-to_-into_-conventions-c-conv) for ad-hoc conversions. The conversion functions on `Value` now come in a few forms: - `coerce_{type}` takes a `&Value` and attempts to convert the value to `type` (e.g., `i64` are converted to `f64`). This is the old behavior of some of the `as_{type}` functions -- these functions have simply been renamed to better reflect what they do. - The new `as_{type}` functions take a `&Value` and return an `Ok` result only if the value is of `type` (no conversion is attempted). The returned value will be borrowed if `type` is non-`Copy`, otherwise an owned value is returned. - `into_{type}` exists for non-`Copy` types, but otherwise does not attempt conversion just like `as_{type}`. It takes an owned `Value` and always returns an owned result. - `coerce_into_{type}` has the same relationship with `coerce_{type}` as `into_{type}` does with `as_{type}`. - `to_{kind}_string`: conversion to different string formats (debug, abbreviated, etc.). Only two of the old string conversion functions were removed, the rest have been renamed only. - `to_{type}`: other conversion functions. Currently, only `to_path` exists. (And `to_string` through `Display`.)
This table summarizes the above:

| Form | Cost | Input Ownership | Output Ownership | Converts `Value` case/`type` |
| --- | --- | --- | --- | --- |
| `as_{type}` | Cheap | Borrowed | Borrowed/Owned | No |
| `into_{type}` | Cheap | Owned | Owned | No |
| `coerce_{type}` | Cheap | Borrowed | Borrowed/Owned | Yes |
| `coerce_into_{type}` | Cheap | Owned | Owned | Yes |
| `to_{kind}_string` | Expensive | Borrowed | Owned | Yes |
| `to_{type}` | Expensive | Borrowed | Owned | Yes |

# User-Facing Changes Breaking API change for `Value` in `nu-protocol` which is exposed as part of the plugin API.	2024-02-17 18:14:16 +00:00
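To make the naming forms concrete, here is a small sketch on a toy `Value` enum. It is not the `nu-protocol` type: the three variants, the `String` error type and the method bodies are assumptions chosen for illustration. Only the naming pattern (`as_`, `into_`, `coerce_`) follows the commit message.

```rust
/// Toy value type for illustration; the real `Value` has many more cases.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Int(i64),
    Float(f64),
    String(String),
}

impl Value {
    /// `as_{type}`: borrowed input, succeeds only if the case already matches.
    /// `f64` is `Copy`, so an owned value is returned.
    pub fn as_float(&self) -> Result<f64, String> {
        match self {
            Value::Float(f) => Ok(*f),
            other => Err(format!("expected float, got {other:?}")),
        }
    }

    /// `coerce_{type}`: borrowed input, also converts compatible cases (int to float).
    pub fn coerce_float(&self) -> Result<f64, String> {
        match self {
            Value::Float(f) => Ok(*f),
            Value::Int(i) => Ok(*i as f64),
            other => Err(format!("cannot coerce {other:?} to float")),
        }
    }

    /// `as_{type}` for a non-`Copy` type: the result borrows from the value.
    pub fn as_str(&self) -> Result<&str, String> {
        match self {
            Value::String(s) => Ok(s),
            other => Err(format!("expected string, got {other:?}")),
        }
    }

    /// `into_{type}`: takes ownership and returns an owned result, no conversion.
    pub fn into_string(self) -> Result<String, String> {
        match self {
            Value::String(s) => Ok(s),
            other => Err(format!("expected string, got {other:?}")),
        }
    }
}
```

In this sketch `Value::Int(2).as_float()` is an error while `Value::Int(2).coerce_float()` yields `2.0`, which is the distinction the rename is meant to make visible at the call site.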
Skyler Hawthorne	7ac3e97bfe	Fix memory consumption of into sqlite (#10232 ) # Description Currently, the `into sqlite` command collects the entire input stream into a single Value, which soaks up the entire input into memory, before it ever tries to write anything to the DB. This is very problematic for large inputs; for example, I tried transforming a multi-gigabyte CSV file into SQLite, and before I knew what was happening, my system's memory was completely exhausted, and I had to hard reboot to recover. This PR fixes this problem by working directly with the pipeline stream, inserting into the DB as values are read from the stream. In order to facilitate working with the stream directly, I introduced a new `Table` struct to store the connection and a few configuration parameters, as well as to make it easier to lazily create the table on the first read value. In addition to the purely functional fixes, a few other changes were made to the serialization and user facing behavior. ### Serialization Much of the preexisting code was focused on generating the exact text needed for a SQL statement. This is unneeded and less safe than using the `rusqlite` crate's serialization for native Rust types along with prepared statements. ### User-Facing Changes Currently, the command is very liberal in the input types it accepts. The strategy is basically if it is a record, try to follow its structure and make an analogous SQL row, which is pretty reasonable. However, when it's not a record, it basically tries to guess what the user wanted and just makes a single column table and serializes the value into that one column, whatever type it may be. This has been changed so that it only accepts records as input. If the user wants to serialize non-record types into SQL, then they must explicitly opt into doing this by constructing a record or table with it first. 
For a utility for inserting data into SQL, I think it makes more sense to let the user choose how to convert their data, rather than make a choice for them that may surprise them. However, I understand this may be a controversial change. If the maintainers don't agree, I can change this back. #### Long switch names The `file_name` and `table_name` long form switches are currently snake_case and expect to be as such at the command line. These have been changed to kebab-case to be more conventional. # Tests + Formatting To test the memory consumption, I used [this publicly available index of all Wikipedia articles](https://dumps.wikimedia.org/enwiki/20230820/), using the first 10,000, 100,000, and 1,000,000 entries, in that order. I ran the following script to benchmark the changes against the current stable release:

```nu
#!/usr/bin/nu

# let shellbin = $"($env.HOME)/src/nushell/target/aarch64-linux-android/release/nu"
let shellbin = "nu"
const dbpath = 'enwiki-index.db'

[10000, 100000, 1000000] | each {|rows|
    rm -f $dbpath;
    do {
        time -f '%M %e %U %S' $shellbin -c (
            $"bzip2 -cdk ~/enwiki-20230820-pages-articles-multistream-index.txt.bz2 | head -n ($rows) | lines | parse '{offset}:{id}:{title}' | update cells -c [offset, id] { into int } | into sqlite ($dbpath)"
        )
    }
    | complete
    | get stderr
    | str trim
    | parse '{rss_max} {real} {user} {kernel}'
    | update cells -c [rss_max] { $"($in)kb" | into filesize }
    | update cells -c [real, user, kernel] { $"($in)sec" | into duration }
    | insert rows $rows
    | roll right
}
| flatten
| to nuon
```

This yields the following results.

Current stable release:

|rows|rss_max|real|user|kernel|
|-|-|-|-|-|
|10000|53.6 MiB|770ms|460ms|420ms|
|100000|209.6 MiB|6sec 940ms|3sec 740ms|4sec 380ms|
|1000000|1.7 GiB|1min 8sec 810ms|38sec 690ms|42sec 550ms|

This PR:

|rows|rss_max|real|user|kernel|
|-|-|-|-|-|
|10000|38.2 MiB|780ms|440ms|410ms|
|100000|39.8 MiB|6sec 450ms|3sec 530ms|4sec 160ms|
|1000000|39.8 MiB|1min 3sec 230ms|37sec 440ms|40sec 180ms|

# Note I started this branch kind of at the same time as my others, but I understand the feedback that smaller PRs are preferred. Let me know if it would be better to split this up. I do think the scope of the changes is on the bigger side even without the behavior changes I mentioned, so I'm not sure if that will help this particular PR very much, but I'm happy to oblige on request.	2024-01-15 21:41:25 -06:00
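The streaming approach in this last commit message (insert each value as it is read, and create the table lazily from the first one) can be sketched without a database. This is an illustration only: the `Table` struct here is a hypothetical in-memory stand-in that counts inserts, whereas the PR's `Table` holds a connection and uses `rusqlite` prepared statements. The field and function names are assumptions.

```rust
/// Hypothetical stand-in for a lazily created table; no real database is involved.
pub struct Table {
    pub name: String,
    /// `None` until the first record arrives and fixes the column names.
    pub columns: Option<Vec<String>>,
    /// Number of records "written" so far.
    pub inserted: usize,
}

impl Table {
    pub fn new(name: &str) -> Self {
        Table { name: name.to_string(), columns: None, inserted: 0 }
    }

    /// Insert one record, creating the "schema" from it if none exists yet.
    pub fn insert(&mut self, record: &[(String, String)]) -> Result<(), String> {
        let names: Vec<String> = record.iter().map(|(k, _)| k.clone()).collect();
        // Lazy creation: the first record decides the columns.
        let columns = self.columns.get_or_insert_with(|| names.clone());
        if *columns != names {
            return Err(format!("record columns {names:?} do not match table `{}`", self.name));
        }
        // A real implementation would execute a prepared INSERT here.
        self.inserted += 1;
        Ok(())
    }
}

/// Consume the stream one record at a time instead of collecting it first,
/// so at most one record is held in memory by this function.
pub fn insert_stream<I>(table: &mut Table, stream: I) -> Result<usize, String>
where
    I: IntoIterator<Item = Vec<(String, String)>>,
{
    for record in stream {
        table.insert(&record)?;
        // `record` is dropped at the end of each iteration.
    }
    Ok(table.inserted)
}
```

Because each record is dropped after it is written, peak memory in this sketch does not grow with the number of rows, which is consistent with the flat `rss_max` column in the "This PR" results.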

10 Commits