Commit Graph

8967 Commits

Author SHA1 Message Date
Embers-of-the-Fire
96493b26d9
Make string related commands parse-time evaluatable (#13032)
<!--
if this PR closes one or more issues, you can automatically link the PR
with
them by using one of the [*linking
keywords*](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword),
e.g.
- this PR should close #xxxx
- fixes #xxxx

you can also mention related issues, PRs or discussions!
-->

Related meta-issue: #10239.

# Description
<!--
Thank you for improving Nushell. Please, check our [contributing
guide](../CONTRIBUTING.md) and talk to the core team before making major
changes.

Description of your pull request goes here. **Provide examples and/or
screenshots** if your changes affect the user experience.
-->

This PR will modify some `str`-related commands so that they can be
evaluated at parse time.

See the following list for those implemented by this pr.

# User-Facing Changes
<!-- List of all changes that impact the user experience here. This
helps us keep track of breaking changes. -->

Available now:
- `str` subcommands
  - `trim`
  - `contains`
  - `distance`
  - `ends-with`
  - `expand`
  - `index-of`
  - `join`
  - `replace`
  - `reverse`
  - `starts-with`
  - `stats`
  - `substring`
  - `capitalize`
  - `downcase`
  - `upcase`
- `split` subcommands
  - `chars`
  - `column`
  - `list`
  - `row`
  - `words`
- `format` subcommands
  - `date`
  - `duration`
  - `filesize`
- string related commands
  - `parse`
  - `detect columns`
  - `encode` & `decode`

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use toolkit.nu; toolkit test stdlib"` to run the
tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

Unresolved questions:
- [ ] Is there any routine of testing const expressions? I haven't found
any yet.
- [ ] Is const expressions required to behave just like there non-const
version, like what rust promises?

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->

Unresolved questions:
- [ ] Do const commands need special marks in the docs?
2024-06-05 22:21:52 +03:00
NotTheDr01ds
36ad7f15c4
cd/def --env examples (#13068)
# Description

Per a Discord question
(https://discord.com/channels/601130461678272522/1244293194603167845/1247794228696711198),
this adds examples to the `help` for both:

* `cd`
* `def`

to demonstrate that `def --env` is required when changing directories in
a custom command.

Since the existing examples for `def` were a bit more complex (and had
output) but the `cd` ones were more simplified, I did use slightly
different examples in each. Either or both could be tweaked if desired.

# User-Facing Changes

Command `help` examples

# Tests + Formatting

- 🟢 `toolkit fmt`
- 🟢 `toolkit clippy`
- 🟢 `toolkit test`
- 🟢 `toolkit test stdlib`

# After Submitting

N/A

---------

Co-authored-by: Jakub Žádník <kubouch@gmail.com>
2024-06-05 21:35:31 +03:00
Michael Angerman
78fdb2f4d1
reduce log tracing in nu-cli (#13067)
This one trace message creates thousands of lines of trace messages that
is probably
better suited to using on an *as needed* basis rather than everyone
having to wade
through it...

For now, I just commented it out but eventually this line of code should
be removed
and used simply for the time when someone needs to see it...
2024-06-05 08:54:38 -07:00
Devyn Cairns
a3bc85bb5f
Add options for filtering the log output from nu (#13044)
# Description

Add `--log-include` and `--log-exclude` options to filter the log output
from `nu` to specific module prefixes. For example,

```nushell
nu --log-level trace --log-exclude '[nu_parser]'
```

This avoids having to scan through parser spam when trying to debug
something else at `TRACE` level, and should make it feel much more
reasonable to add logging, particularly at `TRACE` level, to various
places in the codebase. It can also be used to debug non-Nushell crates
that support the Rust logging infrastructure, as many do.

You can also include a specific module instead of excluding the parser
log output:

```nushell
nu --log-level trace --log-include '[nu_plugin]'
```

Pinging #13041 for reference, but hesitant to outright say that this
closes that. I think it address that concern though. I've also struggled
with debugging plugin stuff with all of the other output, so this will
really help me there.

# User-Facing Changes

- New `nu` option: `--log-include`
- New `nu` option: `--log-exclude`
2024-06-05 16:42:55 +08:00
Reilly Wood
a9c2349ada
Refactor explore cursor code (#12979)
`explore` has 3 cursor-related structs that are extensively used to
track the currently shown "window" of the data being shown. I was
finding the cursor code quite difficult to follow, so this PR:
- rewrites the base `Cursor` struct from scratch, with some tests
- makes big changes to `WindowCursor`
- renames `XYCursor` to `WindowCursor2D`
- makes some of the cursor functions fallible as a start towards better
error handling
- changes lots of function names to things that I find more intuitive
- adds comments, including ASCII diagrams to explain how the cursors
work

More work could be done (I'd like to review/change more function names
in `WindowCursor` and `WindowCursor2D` and add more tests), but this is
the limit of what I can get done in a weekend. I think this part of the
code is in a better place now.

# Testing performed

I did a lot of manual testing in the record view and binary viewer,
moving around with arrow keys / page up+down / home+end.

This can definitely wait until after the release freeze, this area has
very few automated tests and it'd be good to let the changes bake a bit.
2024-06-04 19:50:11 -07:00
Jakub Žádník
e4104d0792
Span ID Refactor - Step 1 (#12960)
# Description
First part of SpanID refactoring series. This PR adds a `SpanId` type
and a corresponding `span_id` field to `Expression`. Parser creating
expressions will now add them to an array in `StateWorkingSet`,
generates a corresponding ID and saves the ID to the Expression. The IDs
are not used anywhere yet.

For the rough overall plan, see
https://github.com/nushell/nushell/issues/12963.

# User-Facing Changes
Hopefully none. This is only a refactor of Nushell's internals that
shouldn't have visible side effects.

# Tests + Formatting

# After Submitting
2024-06-05 09:57:14 +08:00
Jack Wright
b10325dff1
Allow int values to be converted into floats. (#13025)
Addresses the bug found by @maxim-uvarov when trying to coerce an int
Value to a polars float:

<img width="863" alt="image"
src="https://github.com/nushell/nushell/assets/56345/4d858812-a7b3-4296-98f4-dce0c544b4c6">

Conversion now works correctly:

<img width="891" alt="Screenshot 2024-05-31 at 14 28 51"
src="https://github.com/nushell/nushell/assets/56345/78d9f711-7ad5-4503-abc6-7aba64a2e675">
2024-06-04 18:51:11 -07:00
Antoine Büsch
d4fa014534
Make query xml return nodes in document order (#13047)
# Description

`query xml` used to return results from an XPath query in a random,
non-deterministic order. With this change, results get returned in the
order they appear in the document.

# User-Facing Changes
`query xml` will now return results in a non-random order.

# Tests + Formatting
- 🟢 `toolkit fmt`
- 🟢 `toolkit clippy`
- 🟢 `toolkit test`
- 🟢 `toolkit test stdlib`

# After Submitting
2024-06-05 09:47:36 +08:00
dependabot[bot]
c310175a1e
Bump hustcer/setup-nu from 3.10 to 3.11 (#13058) 2024-06-05 09:17:27 +08:00
dependabot[bot]
6b2012bdfa Bump crate-ci/typos from 1.21.0 to 1.22.0
Bumps [crate-ci/typos](https://github.com/crate-ci/typos) from 1.21.0 to 1.22.0.
- [Release notes](https://github.com/crate-ci/typos/releases)
- [Changelog](https://github.com/crate-ci/typos/blob/master/CHANGELOG.md)
- [Commits](https://github.com/crate-ci/typos/compare/v1.21.0...v1.22.0)

---
updated-dependencies:
- dependency-name: crate-ci/typos
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-06-05 01:01:54 +00:00
Devyn Cairns
28e33587d9
msgpackz: increase default compression level (#13035)
# Description
Increase default compression level for brotli on msgpackz to 3. This has
the best compression time generally. Level 0 and 1 give weird results
and sometimes cause extremely inflated outputs rather than being
compressed. So far this hasn't really been a problem for the plugin
registry file, but has been for other data.

The `$example` is the web-app example from https://json.org/example.html

Benchmarked with:

```nushell
seq 0 11 | each { |level|
  let compressed = ($example | to msgpackz --quality $level)
  let time = (timeit { $example | to msgpackz --quality $level })
  {
    level: $level
    time: $time
    length: ($compressed | bytes length)
    ratio: (($uncompressed_length | into float) / ($compressed | bytes length))
  }
}
```

```
╭────┬───────┬─────────────────┬────────┬───────╮
│  # │ level │      time       │ length │ ratio │
├────┼───────┼─────────────────┼────────┼───────┤
│  0 │     0 │ 4ms 611µs 875ns │   3333 │  0.72 │
│  1 │     1 │ 1ms 334µs 500ns │   3333 │  0.72 │
│  2 │     2 │     190µs 333ns │   1185 │  2.02 │
│  3 │     3 │      184µs 42ns │   1128 │  2.12 │
│  4 │     4 │      245µs 83ns │   1098 │  2.18 │
│  5 │     5 │     265µs 584ns │   1040 │  2.30 │
│  6 │     6 │     270µs 792ns │   1040 │  2.30 │
│  7 │     7 │     444µs 708ns │   1040 │  2.30 │
│  8 │     8 │       1ms 801µs │   1040 │  2.30 │
│  9 │     9 │     843µs 875ns │   1037 │  2.31 │
│ 10 │    10 │ 4ms 128µs 375ns │    984 │  2.43 │
│ 11 │    11 │ 6ms 352µs 834ns │    986 │  2.43 │
╰────┴───────┴─────────────────┴────────┴───────╯
```

cc @maxim-uvarov
2024-06-04 17:19:10 -07:00
Antoine Stevan
ff5cb6f1ff
complete the type of --error-label in std assert commands (#12998)
i was looking at the website documentation of `std assert` and i noticed
one thing
- the `--error-label` argument of `assert` and `assert not` was just a
`record` -> now it's that complete type `record<text: string, span:
record<start: int, end: int>>`
2024-06-05 08:02:17 +08:00
Antoine Büsch
65911c125c
Try to preserve the ordering of elements in from toml (#13045)
# Description

Enable the `preserve_order` feature of the `toml` crate to preserve the
ordering of elements when converting from/to toml.

Additionally, use `to_string_pretty()` instead of `to_string()` in `to
toml`. This displays arrays on multiple lines instead of one big single
line. I'm not sure if this one is a good idea or not... Happy to remove
this from this PR if it's not.

# User-Facing Changes
The order of elements will be different when using `from toml`. The
formatting of arrays will also be different when using `to toml`. For
example:

- before
```
❯ "foo=1\nbar=2\ndoo=3" | from toml
╭─────┬───╮
│ bar │ 2 │
│ doo │ 3 │
│ foo │ 1 │
╰─────┴───╯
❯ {a: [a b c d]} | to toml
a = ["a", "b", "c", "d"]
```
- after
```
❯ "foo=1\nbar=2\ndoo=3" | from toml
╭─────┬───╮
│ foo │ 1 │
│ bar │ 2 │
│ doo │ 3 │
╰─────┴───╯
❯ {a: [a b c d]} | to toml
a = [
    "a",
    "b",
    "c",
    "d",
]
```

# Tests + Formatting
- 🟢 `toolkit fmt`
- 🟢 `toolkit clippy`
- 🔴 `toolkit test`
-  `toolkit test stdlib`

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->
2024-06-05 08:00:39 +08:00
Sang-Heon Jeon
3f0db11ae5
support plus sign for "into filesize" (#12974)
# Description
Fixes https://github.com/nushell/nushell/issues/12968. After apply this
patch, we can use explict plus sign character included string with `into
filesize` cmd.

# User-Facing Changes
AS-IS (before fixing)
```
$ "+8 KiB" | into filesize                                                                                                         
Error: nu:🐚:cant_convert

  × Can't convert to int.
   ╭─[entry #31:1:1]
 1 │ "+8 KiB" | into filesize
   · ────┬───
   ·     ╰── can't convert string to int
   ╰────
```

TO-BE (after fixing)
```
$ "+8KiB" | into filesize                                                                                       
8.0 KiB
```

# Tests + Formatting
Added a test 

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->
2024-06-05 07:43:50 +08:00
Reilly Wood
7746e84199
Make LS_COLORS functionality faster in explore, especially on Windows (#12984)
Closes #12980. More context there, but basically `explore` was getting
file metadata for every row every time the record view was rendered. The
quick fix for now is to do the `LS_COLORS` colouring with a `&str`
instead of a path and file metadata.
2024-06-05 07:43:12 +08:00
Jack Wright
a84fdb1d37
Fixed a couple of incorrect errors messages (#13043)
Fixed a couple of error message that incorrectly reported as parquet
errors instead of CSV errors.
2024-06-05 07:40:02 +08:00
Wind
ad5a6cdc00
bump version to 0.94.3 (#13055) 2024-06-05 06:52:40 +08:00
Devyn Cairns
be8c1dc006
Fix run_external::expand_glob() to return paths that are PWD-relative but reflect the original intent (#13028)
# Description

Fix #13021

This changes the `expand_glob()` function to use
`nu_engine::glob_from()` so that absolute paths are actually preserved,
rather than being made relative to the provided parent. This preserves
the intent of whoever wrote the original path/glob, and also makes it so
that tilde always produces absolute paths.

I also made `expand_glob()` handle Ctrl-C so that it can be interrupted.

cc @YizhePKU

# Tests + Formatting
No additional tests here... but that might be a good idea.
2024-06-03 10:38:55 +03:00
Devyn Cairns
b50903cf58
Fix external command name parsing with backslashes, and add tests (#13027)
# Description

Fixes #13016 and adds tests for many variations of external call
parsing.

I just realized @kubouch took a crack at this too (#13022) so really
whichever is better, but I think the
tests are a good addition.
2024-06-03 10:28:45 +03:00
Devyn Cairns
6635b74d9d
Bump version to 0.94.2 (#13014)
Version bump after 0.94.1 patch release.
2024-06-03 10:28:35 +03:00
Ian Manske
f3cf693ec7
Disallow more characters in arguments for internal cmd commands (#13009)
# Description
Makes `run-external` error if arguments to `cmd.exe` internal commands
contain newlines or a percent sign. This is because the percent sign can
expand environment variables, potentially? allowing command injection.
Newlines I think will truncate the rest of the arguments and should
probably be disallowed to be safe.

# After Submitting
- If the user calls `cmd.exe` directly, then this bypasses our
handling/checking for internal `cmd` commands. Instead, we use the
handling from the Rust std lib which, in this case, does not do special
handling and is potentially unsafe. Then again, it could be the user's
specific intention to run `cmd` with whatever trusted input. The problem
is that since we use the std lib handling, it assumes the exe uses the C
runtime escaping rules and will perform some unwanted escaping. E.g., it
will add backslashes to the quotes in `cmd echo /c '""'`.
- If `cmd` is called indirectly via a `.bat` or `.cmd` file, then we use
the Rust std lib which has separate handling for bat files that should
be safe, but will reject some inputs.
- ~~I'm not sure how we handle `PATHEXT`, that can also cause a file
without an extension to be run as a bat file. If so, I don't know where
the handling, if any, is done for that.~~ It looks like we use the
`which` crate to do the lookup using `PATHEXT`. Then, we pass the exe
path from that to the Rust std lib `Command`, which should be safe
(except for the first `cmd.exe` note).

So, in the future we need to unify and/or fix these different
implementations, including our own special handling for internal `cmd`
commands that this PR tries to fix.
2024-05-30 19:24:48 +00:00
Ian Manske
31f3d2f664
Restore path type behavior (#13006)
# Description
Restores `path type` to return an empty string on error like it did pre
0.94.0.
2024-05-30 13:42:22 +00:00
Wind
40772fea15
fix do closure with both required, options, and rest args (#13002)
# Description
Fixes: #12985

`val_iter` has already handle required positional and optional
positional arguments, it not skip them again while handling rest
arguments.

# User-Facing Changes
Makes `do {|a, ...b| echo $a ...$b} 1 2 3 4` output the following again:
```nushell
╭───┬───╮
│ 0 │ 1 │
│ 1 │ 2 │
│ 2 │ 3 │
│ 3 │ 4 │
╰───┴───╯
```

# Tests + Formatting
Added some test cases
2024-05-30 08:29:46 -05:00
Devyn Cairns
0e1553026e
Restore tilde expansion on external command names (#13001)
# Description

Fix a regression introduced by #12921, where tilde expansion was no
longer done on the external command name, breaking things like

```nushell
> ~/.cargo/bin/exa
```

This properly handles quoted strings, so they don't expand:

```nushell
> ^"~/.cargo/bin/exa"
Error: nu:🐚:external_command

  × External command failed
   ╭─[entry #1:1:2]
 1 │ ^"~/.cargo/bin/exa"
   ·  ─────────┬────────
   ·           ╰── Command `~/.cargo/bin/exa` not found
   ╰────
  help: `~/.cargo/bin/exa` is neither a Nushell built-in or a known external command

```

This required a change to the parser, so the command name is also parsed
in the same way the arguments are - i.e. the quotes on the outside
remain in the expression. Hopefully that doesn't break anything else. 🤞

Fixes #13000. Should include in patch release 0.94.1

cc @YizhePKU

# User-Facing Changes
- Tilde expansion now works again for external commands
- The `command` of `run-external` will now have its quotes removed like
the other arguments if it is a literal string
- The parser is changed to include quotes in the command expression of
`ExternalCall` if they were present

# Tests + Formatting
I would like to add a regression test for this, but it's complicated
because we need a well-known binary within the home directory, which
just isn't a thing. We could drop one there, but that's kind of a bad
behavior for a test to do. I also considered changing the home directory
for the test, but that's so platform-specific - potentially could get it
working on specific platforms though. Changing `HOME` env on Linux
definitely works as far as tilde expansion works.

- 🟢 `toolkit fmt`
- 🟢 `toolkit clippy`
- 🟢 `toolkit test`
- 🟢 `toolkit test stdlib`
2024-05-29 18:48:29 -07:00
Darren Schroeder
6a2996251c
fixes a bug in OSC9;9 execution (#12994)
# Description

This fixes a bug in the `OSC 9;9` functionality where the path wasn't
being constructed properly and therefore wasn't getting set right for
things like "Duplicate Tab" in Windows Terminal. Thanks to @Araxeus for
finding it.

Related to https://github.com/nushell/nushell/issues/10166

# User-Facing Changes
<!-- List of all changes that impact the user experience here. This
helps us keep track of breaking changes. -->

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use toolkit.nu; toolkit test stdlib"` to run the
tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->
2024-05-29 18:06:47 -05:00
Devyn Cairns
f3991f2080
Bump version to 0.94.1 (#12988)
Merge this PR before merging any other PRs.
2024-05-28 22:41:23 +00:00
Jakub Žádník
61182deb96
Bump version to 0.94.0 (#12987) 2024-05-28 12:04:09 -07:00
Ian Manske
4ab2c3238a
Disable reedline patch for 0.94.0 (#12986)
Disable crates.io git patch for reedline for 0.94.0 release.
2024-05-28 18:53:51 +00:00
Ian Manske
6012af2412
Fix panic when redirecting nothing (#12970)
# Description
Fixes #12969 where the parser can panic if a redirection is applied to
nothing / an empty command.

# Tests + Formatting
Added a test.
2024-05-27 10:03:06 +08:00
YizhePKU
f74dd33ba9
Fix touch --reference using PWD from the environment (#12976)
This PR fixes `touch --reference path` so that it resolves `path` using
PWD from the engine state.
2024-05-26 20:24:00 +03:00
YizhePKU
a1fc41db22
Fix path type using PWD from the environment (#12975)
This PR fixes the `path type` command so that it resolves relative paths
using PWD from the engine state.

As a bonus, it also fixes the issue of `path type` returning an empty
string instead of an error when it fails.
2024-05-26 20:23:52 +03:00
YizhePKU
f38f88d42c
Fixes . expanded incorrectly as external argument (#12950)
This PR fixes a bug where `.` is expanded into an empty string when used
as an argument to external commands. Fixes
https://github.com/nushell/nushell/issues/12948.

---------

Co-authored-by: Ian Manske <ian.manske@pm.me>
2024-05-26 07:06:17 +08:00
Darren Schroeder
0c5a67f4e5
make polars plugin use mimalloc (#12967)
# Description
@maxim-uvarov did a ton of research and work with the dply-rs author and
ritchie from polars and found out that the allocator matters on macos
and it seems to be what was messing up the performance of polars plugin.
ritchie suggested to use jemalloc but i switched it to mimalloc to match
nushell and it seems to run better.

## Before (default allocator)
note - using 1..10 vs 1..100 since it takes so long. also notice how
high the `max` timings are compared to mimalloc below.
```nushell
❯ 1..10 | each {timeit {polars open Data7602DescendingYearOrder.csv | polars group-by year | polars agg (polars col geo_count | polars sum) | polars collect | null}} |   | {mean: ($in | math avg), min: ($in | math min), max: ($in | math max), stddev: ($in | into int | into float | math stddev | into int | $'($in)ns' | into duration)}
╭────────┬─────────────────────────╮
│ mean   │ 4sec 999ms 605µs 995ns  │
│ min    │ 983ms 627µs 42ns        │
│ max    │ 13sec 398ms 135µs 791ns │
│ stddev │ 3sec 476ms 479µs 939ns  │
╰────────┴─────────────────────────╯
❯ use std bench
❯ bench { polars open Data7602DescendingYearOrder.csv | polars group-by year | polars agg (polars col geo_count | polars sum) | polars collect | null } -n 10
╭───────┬────────────────────────╮
│ mean  │ 6sec 220ms 783µs 983ns │
│ min   │ 1sec 184ms 997µs 708ns │
│ max   │ 18sec 882ms 81µs 708ns │
│ std   │ 5sec 350ms 375µs 697ns │
│ times │ [list 10 items]        │
╰───────┴────────────────────────╯
```

## After (using mimalloc)
```nushell
❯ 1..100 | each {timeit {polars open Data7602DescendingYearOrder.csv | polars group-by year | polars agg (polars col geo_count | polars sum) | polars collect | null}} |   | {mean: ($in | math avg), min: ($in | math min), max: ($in | math max), stddev: ($in | into int | into float | math stddev | into int | $'($in)ns' | into duration)}
╭────────┬───────────────────╮
│ mean   │ 103ms 728µs 902ns │
│ min    │ 97ms 107µs 42ns   │
│ max    │ 149ms 430µs 84ns  │
│ stddev │ 5ms 690µs 664ns   │
╰────────┴───────────────────╯
❯ use std bench
❯ bench { polars open Data7602DescendingYearOrder.csv | polars group-by year | polars agg (polars col geo_count | polars sum) | polars collect | null } -n 100
╭───────┬───────────────────╮
│ mean  │ 103ms 620µs 195ns │
│ min   │ 97ms 541µs 166ns  │
│ max   │ 130ms 262µs 166ns │
│ std   │ 4ms 948µs 654ns   │
│ times │ [list 100 items]  │
╰───────┴───────────────────╯
```

## After (using jemalloc - just for comparison)
```nushell
❯ 1..100 | each {timeit {polars open Data7602DescendingYearOrder.csv | polars group-by year | polars agg (polars col geo_count | polars sum) | polars collect | null}} |   | {mean: ($in | math avg), min: ($in | math min), max: ($in | math max), stddev: ($in | into int | into float | math stddev | into int | $'($in)ns' | into duration)}

╭────────┬───────────────────╮
│ mean   │ 113ms 939µs 777ns │
│ min    │ 108ms 337µs 333ns │
│ max    │ 166ms 467µs 458ns │
│ stddev │ 6ms 175µs 618ns   │
╰────────┴───────────────────╯
❯ use std bench
❯ bench { polars open Data7602DescendingYearOrder.csv | polars group-by year | polars agg (polars col geo_count | polars sum) | polars collect | null } -n 100
╭───────┬───────────────────╮
│ mean  │ 114ms 363µs 530ns │
│ min   │ 108ms 804µs 833ns │
│ max   │ 143ms 521µs 459ns │
│ std   │ 5ms 88µs 56ns     │
│ times │ [list 100 items]  │
╰───────┴───────────────────╯
```

## After (using parquet + mimalloc)
```nushell
❯ 1..100 | each {timeit {polars open data.parquet | polars group-by year | polars agg (polars col geo_count | polars sum) | polars collect | null}} |   | {mean: ($in | math avg), min: ($in | math min), max: ($in | math max), stddev: ($in | into int | into float | math stddev | into int | $'($in)ns' | into duration)}
╭────────┬──────────────────╮
│ mean   │ 34ms 255µs 492ns │
│ min    │ 31ms 787µs 250ns │
│ max    │ 76ms 408µs 416ns │
│ stddev │ 4ms 472µs 916ns  │
╰────────┴──────────────────╯
❯ use std bench
❯ bench { polars open data.parquet | polars group-by year | polars agg (polars col geo_count | polars sum) | polars collect | null } -n 100
╭───────┬──────────────────╮
│ mean  │ 34ms 897µs 562ns │
│ min   │ 31ms 518µs 542ns │
│ max   │ 65ms 943µs 625ns │
│ std   │ 3ms 450µs 741ns  │
│ times │ [list 100 items] │
╰───────┴──────────────────╯
```

# User-Facing Changes
<!-- List of all changes that impact the user experience here. This
helps us keep track of breaking changes. -->

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use toolkit.nu; toolkit test stdlib"` to run the
tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->
2024-05-25 09:10:01 -05:00
Ian Manske
95977faf2d
Do not propagate glob creation error for external args (#12955)
# Description
Instead of returning an error, this PR changes `expand_glob` in
`run_external.rs` to return the original string arg if glob creation
failed. This makes it so that, e.g.,
```nushell
^echo `[`
^echo `***`
```
no longer fail with a shell error. (This follows from #12921.)
2024-05-25 08:59:36 +08:00
Ian Manske
c5d716951f
Allow byte streams with unknown type to be compatiable with binary (#12959)
# Description
Currently, this pipeline doesn't work `open --raw file | take 100`,
since the type of the byte stream is `Unknown`, but `take` expects
`Binary` streams. This PR changes commands that expect
`ByteStreamType::Binary` to also work with `ByteStreamType::Unknown`.
This was done by adding two new methods to `ByteStreamType`:
`is_binary_coercible` and `is_string_coercible`. These return true if
the type is `Unknown` or matches the type in the method name.
2024-05-24 17:54:38 -07:00
Devyn Cairns
b06f31d3c6
Make from json --objects streaming (#12949)
# Description

Makes the `from json --objects` command produce a stream, and read
lazily from an input stream to produce its output.

Also added a helper, `PipelineData::get_type()`, to make it easier to
construct a wrong type error message when matching on `PipelineData`. I
expect checking `PipelineData` for either a string value or an `Unknown`
or `String` typed `ByteStream` will be very, very common. I would have
liked to have a helper that just returns a readable stream from either,
but that would either be a bespoke enum or a `Box<dyn BufRead>`, which
feels like it wouldn't be so great for performance. So instead, taking
the approach I did here is probably better - having a function that
accepts the `impl BufRead` and matching to use it.

# User-Facing Changes

- `from json --objects` no longer collects its input, and can be used
for large datasets or streams that produce values over time.

# Tests + Formatting
All passing.

# After Submitting
- [ ] release notes

---------

Co-authored-by: Ian Manske <ian.manske@pm.me>
2024-05-24 23:37:50 +00:00
Ian Manske
84b7a99adf
Revert "Polars lazy refactor (#12669)" (#12962)
This reverts commit 68adc4657f.

# Description

Reverts the lazyframe refactor (#12669) for the next release, since
there are still a few lingering issues. This temporarily solves #12863
and #12828. After the release, the lazyframes can be added back and
cleaned up.
2024-05-24 18:09:26 -05:00
Darren Schroeder
7d11c28eea
Revert "Remove std::env::set_current_dir() call from EngineState::merge_env()" (#12954)
Reverts nushell/nushell#12922
2024-05-24 11:09:59 -05:00
Ian Manske
bf07806b1b
Use cwd in grid (#12947)
# Description
Fixes #12946. The `grid` command does not use the cwd when trying to get
the icon or color for a file/path.
2024-05-23 20:38:47 +00:00
Reilly Wood
0b5a4c0d95
explore refactoring+clarification (#12940)
Another very boring PR cleaning up and documenting some of `explore`'s
innards. Mostly renaming things that I found confusing or vague when
reading through the code, also adding some comments.
2024-05-23 08:51:39 -05:00
Wind
f53aa6fcbf
fix std help (#12943)
# Description
Fixes: #12941

~~The issue is cause by some columns(is_builtin, is_plugin, is_custom,
is_keyword) are removed in #10023~~
Edit: I'm wrong

# Tests + Formatting
Added one test for `std help`
2024-05-23 08:51:02 -05:00
Ian Manske
2612a167e3
Remove list support in with-env (#12939)
# Description
Following from #12523, this PR removes support for lists of environments
variables in the `with-env` command. Rather, only records will be
supported now.

# After Submitting
Update examples using the list form in the docs and book.
2024-05-23 13:53:55 +08:00
Reilly Wood
c7097ca937
explore cleanup: remove+move binary viewer config (#12920)
Small change, removing 4 more configuration options from `explore`'s
binary viewer:

1. `show_index`
2. `show_data`
3. `show_ascii`
4. `show_split`

These controlled whether the 3 columns in the binary viewer (index, hex
data, ASCII) and the pipe separator (`|`) in between them are shown. I
don't think we need this level of configurability until the `explore`
command is more mature, and maybe even not then; we can just show them
all.

I think it's very unlikely that anyone is using these configuration
points.

Also, the row offset (e.g. how many rows we have scrolled down) was
being stored in config/settings when it's arguably not config; more like
internal state of the binary viewer. I moved it to a more appropriate
location and renamed it.
2024-05-22 20:06:14 -07:00
Wind
58cf0c56f8
add some completion tests (#12908)
# Description
```nushell
❯ ls
╭───┬───────┬──────┬──────┬──────────╮
│ # │ name  │ type │ size │ modified │
├───┼───────┼──────┼──────┼──────────┤
│ 0 │ a.txt │ file │  0 B │ now      │
╰───┴───────┴──────┴──────┴──────────╯

❯ ls a.
NO RECORDS FOUND
```

There is a completion issue on previous version, I think @amtoine have
reproduced it before. But currently I can't reproduce it on latest main.
To avoid such regression, I added some tests for completion.

---------

Co-authored-by: Antoine Stevan <44101798+amtoine@users.noreply.github.com>
2024-05-23 10:47:06 +08:00
YizhePKU
6c649809d3
Rewrite run_external.rs (#12921)
This PR is a complete rewrite of `run_external.rs`. The main goal of the
rewrite is improving readability, but it also fixes some bugs related to
argument handling and the PATH variable (fixes
https://github.com/nushell/nushell/issues/6011).

I'll discuss some technical details to make reviewing easier.

## Argument handling

Quoting arguments for external commands is hard. Like, *really* hard.
We've had more than a dozen issues and PRs dedicated to quoting
arguments (see Appendix) but the current implementation is still buggy.

Here's a demonstration of the buggy behavior:

```nu
let foo = "'bar'"
^touch $foo            # This creates a file named `bar`, but it should be `'bar'`
^touch ...[ "'bar'" ]  # Same
```

I'll describe how this PR deals with argument handling.

First, we'll introduce the concept of **bare strings**. Bare strings are
**string literals** that are either **unquoted** or **quoted by
backticks** [^1]. Strings within a list literal are NOT considered bare
strings, even if they are unquoted or quoted by backticks.

When a bare string is used as an argument to external process, we need
to perform tilde-expansion, glob-expansion, and inner-quotes-removal, in
that order. "Inner-quotes-removal" means transforming from
`--option="value"` into `--option=value`.

## `.bat` files and CMD built-ins

On Windows, `.bat` files and `.cmd` files are considered executable, but
they need `CMD.exe` as the interpreter. The Rust standard library
supports running `.bat` files directly and will spawn `CMD.exe` under
the hood (see
[documentation](https://doc.rust-lang.org/std/process/index.html#windows-argument-splitting)).
However, other extensions are not supported [^2].

Nushell also supports a selected number of CMD built-ins. The problem
with CMD is that it uses a different set of quoting rules. Correctly
quoting for CMD requires using
[Command::raw_arg()](https://doc.rust-lang.org/std/os/windows/process/trait.CommandExt.html#tymethod.raw_arg)
and manually quoting CMD special characters, on top of quoting from the
Nushell side. ~~I decided that this is too complex and chose to reject
special characters in CMD built-ins instead [^3]. Hopefully this will
not affact real-world use cases.~~ I've implemented escaping that works
reasonably well.

## `which-support` feature

The `which` crate is now a hard dependency of `nu-command`, making the
`which-support` feature essentially useless. The `which` crate is
already a hard dependency of `nu-cli`, and we should consider removing
the `which-support` feature entirely.

## Appendix

Here's a list of quoting-related issues and PRs in rough chronological
order.

* https://github.com/nushell/nushell/issues/4609
* https://github.com/nushell/nushell/issues/4631
* https://github.com/nushell/nushell/issues/4601
  * https://github.com/nushell/nushell/pull/5846
* https://github.com/nushell/nushell/issues/5978
  * https://github.com/nushell/nushell/pull/6014
* https://github.com/nushell/nushell/issues/6154
  * https://github.com/nushell/nushell/pull/6161
* https://github.com/nushell/nushell/issues/6399
  * https://github.com/nushell/nushell/pull/6420
  * https://github.com/nushell/nushell/pull/6426
* https://github.com/nushell/nushell/issues/6465
* https://github.com/nushell/nushell/issues/6559
  * https://github.com/nushell/nushell/pull/6560

[^1]: The idea that backtick-quoted strings act like bare strings was
introduced by Kubouch and briefly mentioned in [the language
reference](https://www.nushell.sh/lang-guide/chapters/strings_and_text.html#backtick-quotes).

[^2]: The documentation also said "running .bat scripts in this way may
be removed in the future and so should not be relied upon", which is
another reason to move away from this. But again, quoting for CMD is
hard.

[^3]: If anyone wants to try, the best resource I found on the topic is
[this](https://daviddeley.com/autohotkey/parameters/parameters.htm).
2024-05-23 02:05:27 +00:00
Jakub Žádník
64afb52ffa
Fix leftover wrong column name (#12937)
# Description

Small fixup for https://github.com/nushell/nushell/pull/12930
2024-05-22 21:24:22 +00:00
Wind
ac4125f8ed
fix range semantic in detect_columns, str substring, str index-of (#12894)
# Description
Fixes: https://github.com/nushell/nushell/issues/7761

It's still unsure if we want to change the `range semantic` itself, but
it's good to keep range semantic consistent between nushell commands.

# User-Facing Changes
### Before
```nushell
❯ "abc" | str substring 1..=2
b
```
### After
```nushell
❯ "abc" | str substring 1..=2
bc
```

# Tests + Formatting
Adjust tests to fit new behavior
2024-05-22 20:00:58 +03:00
YizhePKU
7ede90cba5
Remove std::env::set_current_dir() call from EngineState::merge_env() (#12922)
As discussed in https://github.com/nushell/nushell/pull/12749, we no
longer need to call `std::env::set_current_dir()` to sync `$env.PWD`
with the actual working directory. This PR removes the call from
`EngineState::merge_env()`.
2024-05-22 19:58:27 +03:00
Jakub Žádník
75689ec98a
Small improvements to debug profile (#12930)
<!--
if this PR closes one or more issues, you can automatically link the PR
with
them by using one of the [*linking
keywords*](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword),
e.g.
- this PR should close #xxxx
- fixes #xxxx

you can also mention related issues, PRs or discussions!
-->

# Description
<!--
Thank you for improving Nushell. Please, check our [contributing
guide](../CONTRIBUTING.md) and talk to the core team before making major
changes.

Description of your pull request goes here. **Provide examples and/or
screenshots** if your changes affect the user experience.
-->

1. With the `-l` flag, `debug profile` now collects files and line
numbers of profiled pipeline elements

![profiler_lines](https://github.com/nushell/nushell/assets/25571562/b400a956-d958-4aff-aa4c-7e65da3f78fa)

2. Error from the profiled closure will be reported instead of silently
ignored.

![profiler_lines_error](https://github.com/nushell/nushell/assets/25571562/54f7ad7a-06a3-4d56-92c2-c3466917bee8)


# User-Facing Changes
<!-- List of all changes that impact the user experience here. This
helps us keep track of breaking changes. -->

New `--lines(-l)` flag to `debug profile`. The command will also fail if
the profiled closure fails, so technically it is a breaking change.

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use toolkit.nu; toolkit test stdlib"` to run the
tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->

---------

Co-authored-by: Ian Manske <ian.manske@pm.me>
2024-05-22 19:56:51 +03:00
Devyn Cairns
7de513a4e0
Implement streaming I/O for CSV and TSV commands (#12918)
# Description

Implements streaming for:

- `from csv`
- `from tsv`
- `to csv`
- `to tsv`

via the new string-typed ByteStream support.

# User-Facing Changes
Commands above. Also:

- `to csv` and `to tsv` now have `--columns <List(String)>`, to provide
the exact columns desired in the output. This is required for them to
have streaming output, because otherwise collecting the entire list is
necessary to determine the output columns. If we introduce
`TableStream`, this may become less necessary.

# Tests + Formatting
- 🟢 `toolkit fmt`
- 🟢 `toolkit clippy`
- 🟢 `toolkit test`
- 🟢 `toolkit test stdlib`

# After Submitting
- [ ] release notes

---------

Co-authored-by: Ian Manske <ian.manske@pm.me>
2024-05-22 16:55:24 +00:00