Support extended unicode escapes in strings: "\u{10fff}" (#7883)

# Description

Support extended Unicode escapes in strings, using the same syntax as Rust:
`"\u{6e}"`.

# User-Facing Changes

New syntax in string literals, `\u{NNNNNN}`, to go along with the
existing `\uNNNN`.
The new syntax accepts 1-6 hex digits and rejects values greater than
0x10FFFF (the maximum Unicode code point).
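
To make the accepted range concrete, here is a minimal sketch of the validation rule described above. It is not the parser code from this PR: `decode_extended_escape` is a hypothetical helper that receives only the text between the braces. Rejecting surrogate values (0xD800-0xDFFF) is an assumption made here because Rust's `char` cannot hold them; the description does not say how the PR treats them.

```rust
/// Hypothetical helper (not nushell's actual parser): decodes the text
/// found between the braces of a `\u{...}` escape into a `char`.
fn decode_extended_escape(digits: &str) -> Result<char, String> {
    // The description says 1-6 hex digits are accepted.
    if digits.is_empty() || digits.len() > 6 {
        return Err(format!("expected 1-6 hex digits, got {:?}", digits));
    }
    // Check the digits explicitly: `from_str_radix` alone would also
    // accept a leading `+` sign.
    if !digits.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err(format!("non-hex digit in {:?}", digits));
    }
    let value = u32::from_str_radix(digits, 16).map_err(|e| e.to_string())?;
    // Values above 0x10FFFF are outside the Unicode code space.
    if value > 0x10FFFF {
        return Err(format!("{:#x} is greater than 0x10FFFF", value));
    }
    // Assumption: surrogates are rejected too, since they are not valid
    // `char` values in Rust.
    char::from_u32(value).ok_or_else(|| format!("{:#x} is not a valid char", value))
}
```

With this sketch, `decode_extended_escape("6e")` yields `Ok('n')`, while `decode_extended_escape("110000")` and `decode_extended_escape("")` both return an error.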

Won't break existing scripts, since this is new syntax.  

We might consider deprecating `char -u`, since users can now embed
unicode chars > 0xFFFF with the new escape.
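
As a rough illustration of why the four-digit form cannot reach these characters, the sketch below checks whether a code point fits in four hex digits and renders the brace form for any `char`. Both function names are hypothetical and exist only for this example; they are not part of nushell.

```rust
/// Hypothetical helper: true if the code point fits in the four hex
/// digits of a `\uNNNN` escape.
fn fits_four_digit_escape(c: char) -> bool {
    (c as u32) <= 0xFFFF
}

/// Hypothetical helper: renders a char in the brace-delimited form,
/// using lowercase hex without padding.
fn to_extended_escape(c: char) -> String {
    // `{{` and `}}` are literal braces; `{:x}` formats the code point.
    format!("\\u{{{:x}}}", c as u32)
}
```

For the character used in the new test, U+1F10B, `fits_four_digit_escape` returns `false` and `to_extended_escape` produces `\u{1f10b}`.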

# Tests + Formatting

Several unit tests and one integration test added.

- [x] `cargo fmt --all -- --check` to check standard code formatting
(`cargo fmt --all` applies these changes)
Done
- [x] `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used -A
clippy::needless_collect` to check that you're using the standard code
style
Done
- [x] `cargo test --workspace` to check that all tests pass  
Done

# After Submitting

- [ ] If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
This commit is contained in:
Bob Hyman
2023-01-28 15:25:53 -05:00
committed by GitHub
parent 2a39332d51
commit e616b2e247
4 changed files with 146 additions and 64 deletions


@@ -380,14 +380,15 @@ fn block_arity_check1() -> TestResult {
     )
 }
 
+// deprecating former support for escapes like `/uNNNN`, dropping test.
 #[test]
-fn string_escape() -> TestResult {
-    run_test(r#""\u015B""#, "ś")
+fn string_escape_unicode_extended() -> TestResult {
+    run_test(r#""\u{015B}\u{1f10b}""#, "ś🄋")
 }
 
 #[test]
 fn string_escape_interpolation() -> TestResult {
-    run_test(r#"$"\u015B(char hamburger)abc""#, "ś≡abc")
+    run_test(r#"$"\u{015B}(char hamburger)abc""#, "ś≡abc")
 }
 
 #[test]