Fix quoting in to nuon and refactor quoting functions (#14180)

<!--
if this PR closes one or more issues, you can automatically link the PR
with
them by using one of the [*linking
keywords*](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword),
e.g.
- this PR should close #xxxx
- fixes #xxxx

you can also mention related issues, PRs or discussions!
-->

# Description
<!--
Thank you for improving Nushell. Please, check our [contributing
guide](../CONTRIBUTING.md) and talk to the core team before making major
changes.

Description of your pull request goes here. **Provide examples and/or
screenshots** if your changes affect the user experience.
-->

This PR fixes the quoting and escaping of column names in `to nuon`.
Before the PR, column names with quotes inside them would get quoted,
but not escaped:

```nushell
> { 'a"b': 2 } | to nuon
{ "a"b": 2 }

> { 'a"b': 2 } | to nuon | from nuon
Error:   × error when loading nuon text
   ╭─[entry #1:1:27]
 1 │ { "a\"b": 2 } | to nuon | from nuon
   ·                           ────┬────
   ·                               ╰── could not load nuon text
   ╰────

Error:   × error when parsing nuon text
   ╭─[entry #1:1:27]
 1 │ { "a\"b": 2 } | to nuon | from nuon
   ·                           ────┬────
   ·                               ╰── could not parse nuon text
   ╰────

Error:   × error when parsing
   ╭────
 1 │ {"a"b": 2}
   ·          ┬
   ·          ╰── Unexpected end of code.
   ╰────

> [['a"b']; [2] [3]] | to nuon
[["a"b"]; [2], [3]]

> [['a"b']; [2] [3]] | to nuon | from nuon
Error:   × error when loading nuon text
   ╭─[entry #1:1:32]
 1 │ [['a"b']; [2] [3]] | to nuon | from nuon
   ·                                ────┬────
   ·                                    ╰── could not load nuon text
   ╰────

Error:   × error when parsing nuon text
   ╭─[entry #1:1:32]
 1 │ [['a"b']; [2] [3]] | to nuon | from nuon
   ·                                ────┬────
   ·                                    ╰── could not parse nuon text
   ╰────

Error:   × error when parsing
   ╭────
 1 │ [["a"b"]; [2], [3]]
   ·                   ┬
   ·                   ╰── Unexpected end of code.
   ╰────
```

After this PR, the quote is escaped properly:

```nushell
> { 'a"b': 2 } | to nuon
{ "a\"b": 2 }

> { 'a"b': 2 } | to nuon | from nuon
╭─────┬───╮
│ a"b │ 2 │
╰─────┴───╯

> [['a"b']; [2] [3]] | to nuon
[["a\"b"]; [2], [3]]

> [['a"b']; [2] [3]] | to nuon | from nuon
╭─────╮
│ a"b │
├─────┤
│   2 │
│   3 │
╰─────╯
```

The cause of the issue was that `to nuon` simply wrapped column names in
`'"'` instead of calling `escape_quote_string`.

As part of this change, I also moved the functions related to quoting
(`needs_quoting` and `escape_quote_string`) into `nu-utils`, since
previously they were defined in very ad-hoc places (and, in the case of
`escape_quote_string`, it was defined multiple times with the same
body!).

# User-Facing Changes
<!-- List of all changes that impact the user experience here. This
helps us keep track of breaking changes. -->

`to nuon` now properly escapes quotes in column names.

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use toolkit.nu; toolkit test stdlib"` to run the
tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

All tests pass, including workspace and stdlib tests.

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->
This commit is contained in:
Alex Ionescu
2024-10-29 13:43:26 +01:00
committed by GitHub
parent 3ec76af96e
commit 79ea70d4ec
12 changed files with 76 additions and 86 deletions

View File

@ -4,6 +4,7 @@ mod deansi;
pub mod emoji;
pub mod filesystem;
pub mod locale;
mod quoting;
mod shared_cow;
pub mod utils;
@ -18,6 +19,7 @@ pub use deansi::{
strip_ansi_likely, strip_ansi_string_likely, strip_ansi_string_unlikely, strip_ansi_unlikely,
};
pub use emoji::contains_emoji;
pub use quoting::{escape_quote_string, needs_quoting};
pub use shared_cow::SharedCow;
#[cfg(unix)]

View File

@ -0,0 +1,43 @@
use fancy_regex::Regex;
use once_cell::sync::Lazy;
// This hits, in order:
// • Any character of []:`{}#'";()|$,
// • Any digit (\d)
// • Any whitespace (\s)
// • Case-insensitive sign-insensitive float "keywords" inf, infinity and nan.
static NEEDS_QUOTING_REGEX: Lazy<Regex> = Lazy::new(|| {
Regex::new(r#"[\[\]:`\{\}#'";\(\)\|\$,\d\s]|(?i)^[+\-]?(inf(inity)?|nan)$"#)
.expect("internal error: NEEDS_QUOTING_REGEX didn't compile")
});
pub fn needs_quoting(string: &str) -> bool {
if string.is_empty() {
return true;
}
// These are case-sensitive keywords
match string {
// `true`/`false`/`null` are active keywords in JSON and NUON
// `&&` is denied by the nu parser for diagnostics reasons
// (https://github.com/nushell/nushell/pull/7241)
"true" | "false" | "null" | "&&" => return true,
_ => (),
};
// All other cases are handled here
NEEDS_QUOTING_REGEX.is_match(string).unwrap_or(false)
}
pub fn escape_quote_string(string: &str) -> String {
let mut output = String::with_capacity(string.len() + 2);
output.push('"');
for c in string.chars() {
if c == '"' || c == '\\' {
output.push('\\');
}
output.push(c);
}
output.push('"');
output
}