This PR is an attempt to fix #8257 and fix #10985 (which is
duplicate-ish)
# Description
The parser currently doesn't know how to deal with colons appearing
while lexing whitespace-terminated tokens specifying a record value.
Most notably, this means you can't use datetime literals in record value
position (and as a consequence, `| to nuon | from nuon` roundtrips can
fail), but it also means that bare words containing colons cause a
non-useful error message.
![image](https://github.com/user-attachments/assets/f04a8417-ee18-44e7-90eb-a0ecef943a0f)
`parser::parse_record` calls `lex::lex` with the `:` colon character in
the `special_tokens` argument. This allows colons to terminate record
keys, but as a side effect, it also causes colons to terminate record
*values*. I added a new function `lex::lex_n_tokens`, which allows the
caller to drive the lexing process more explicitly, and used it in
`parser::parse_record` to let colons terminate record keys while not
giving them special treatment when appearing in record values.
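To make the idea concrete, here is a minimal, self-contained sketch of the key/value distinction. It is not the actual `lex::lex_n_tokens` code: `Entry`, `lex_one`, and `parse_record_body` are hypothetical names invented for illustration, and the sketch does not reject bare words containing colons in value position.

```rust
/// One `key: value` pair from a record body (hypothetical helper type).
#[derive(Debug, PartialEq)]
pub struct Entry {
    pub key: String,
    pub value: String,
}

/// Lex a single whitespace-terminated token starting at `pos`.
/// If `colon_terminates` is set, a `:` also ends the token.
/// Returns the token text and the position just past it.
fn lex_one(src: &[char], mut pos: usize, colon_terminates: bool) -> (String, usize) {
    // Skip leading whitespace.
    while pos < src.len() && src[pos].is_whitespace() {
        pos += 1;
    }
    let start = pos;
    while pos < src.len() && !src[pos].is_whitespace() {
        if colon_terminates && src[pos] == ':' {
            break;
        }
        pos += 1;
    }
    (src[start..pos].iter().collect(), pos)
}

/// Parse the inside of a record (without the braces), one token at a time.
/// Colons terminate keys, but are ordinary characters inside values.
pub fn parse_record_body(body: &str) -> Result<Vec<Entry>, String> {
    let src: Vec<char> = body.chars().collect();
    let mut pos = 0;
    let mut entries = Vec::new();
    loop {
        let (key, after_key) = lex_one(&src, pos, true);
        if key.is_empty() {
            if after_key == src.len() {
                return Ok(entries); // nothing left to lex
            }
            // The token stopped immediately, so it started with a colon.
            return Err("colon in record key position".to_string());
        }
        if src.get(after_key) != Some(&':') {
            return Err("expected ':'".to_string());
        }
        // The value is lexed without treating `:` as special.
        let (value, after_value) = lex_one(&src, after_key + 1, false);
        if value.is_empty() {
            return Err("expected record value".to_string());
        }
        entries.push(Entry { key, value });
        pos = after_value;
    }
}
```

With this shape, `parse_record_body("a: 2024-08-13T22:11:09")` yields a single entry whose value keeps its colons, because the caller decides per token whether `:` is a terminator.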
This PR description previously said: *Another approach suggested in one
of the issues was to support an additional datetime literal format that
doesn't require colons. I like that that wouldn't require new
`lex::lex_internal` behaviour, but an advantage of my approach is that
it also newly allows for string record values given as bare words
containing colons. I think this eliminates another possible source of
confusion.* It was determined that this is undesirable, and in the
current state of this PR, bare word record values with colons are
rejected explicitly. The better error message is still a win.
# User-Facing Changes
In addition to the above, this PR also disables the use of "special"
(non-item) tokens in record key and value position, and the use of a
single bare `:` as a record key.
Examples of behaviour *before* this PR:
```nu
{ a: b } # Valid, same as { 'a': 'b' }
{ a: b:c } # Error: expected ':'
{ a: 2024-08-13T22:11:09 } # Error: expected ':'
{ :: 1 } # Valid, same as { ':': 1 }
{ ;: 1 } # Valid, same as { ';': 1 }
{ a: || } # Valid, same as { 'a': '||' }
```
Examples of behaviour *after* this PR:
```nu
{ a: b } # (Unchanged) Valid, same as { 'a': 'b' }
{ a: b:c } # Error: colon in bare word specifying record value
{ a: 2024-08-13T22:11:09 } # Valid, same as { a: (2024-08-13T22:11:09) }
{ :: 1 } # Error: colon in bare word specifying record key
{ ;: 1 } # Error: expected item in record key position
{ a: || } # Error: expected item in record value position
```
# Tests + Formatting
I added tests, but I'm not sure if they're sufficient and in the right
place.
# After Submitting
I don't think documentation changes are needed for this, but please let
me know if you disagree.
# Description
This is a pretty heavy refactor of the parser to support multiple parser
errors. It has a few issues we should address before landing:
- [x] In some cases, error quality has gotten worse (`1 / "bob"`, for
example)
- [x] if/else isn't currently parsing correctly
- probably others
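For readers unfamiliar with the approach, the general pattern behind "multiple parser errors" can be sketched as follows. This is only an illustration of error accumulation, not the refactored parser itself: `ParseError` and `parse_ints` are hypothetical names, and the real parser's types differ.

```rust
/// A recorded problem, kept instead of aborting the parse (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct ParseError {
    pub token_index: usize,
    pub message: String,
}

/// Parse whitespace-separated integers, collecting every error
/// rather than returning on the first one.
pub fn parse_ints(src: &str) -> (Vec<Option<i64>>, Vec<ParseError>) {
    let mut values = Vec::new();
    let mut errors = Vec::new();
    for (token_index, token) in src.split_whitespace().enumerate() {
        match token.parse::<i64>() {
            Ok(n) => values.push(Some(n)),
            Err(_) => {
                // Record the error and keep going with a placeholder value.
                errors.push(ParseError {
                    token_index,
                    message: format!("expected int, found `{}`", token),
                });
                values.push(None);
            }
        }
    }
    (values, errors)
}
```

The trade-off is the one noted above: since parsing continues past the first failure, later errors may be less precise than a single early error would have been.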
# User-Facing Changes
This may have error quality degradation as we adjust to the new error
reporting mechanism.
# Tests + Formatting
Don't forget to add tests that cover your changes.
Make sure you've run and fixed any issues with these commands:
- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used -A
clippy::needless_collect` to check that you're using the standard code
style
- `cargo test --workspace` to check that all tests pass
- `cargo run -- crates/nu-utils/standard_library/tests.nu` to run the
tests for the standard library
> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it automatically
> toolkit check pr
> ```
# After Submitting
If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
this pr refines #8270 and closes #8109
# description
examples:
the original syntax is okay
```nu
def okay [nums: list] {} # the type of list will be list<any>
```
empty annotations are allowed in any variation
the last two may be caught by a future formatter,
but do not affect `nu` code currently
```nu
def okay [nums: list<>] {} # okay
def okay [nums: list< >] {} # weird but also okay
def okay [nums: list<
>] {} # also weird but okay
```
types are allowed (See [notes](#notes) below)
```nu
def okay [nums: list<int>] {} # `okay [a b c]` will throw an error
def okay [nums: list< int >] {} # any amount of space within the angle brackets is okay
def err [nums: list <int>] {} # this is not okay, `nums` and `<int>` will be parsed as
# two separate params
```
nested annotations are allowed in many variations
```nu
def okay [items: list<list<int>>] {}
def okay [items: list<list>] {}
```
any unterminated annotation is caught
```nu
Error: nu::parser::unexpected_eof
× Unexpected end of code.
╭─[source:1:1]
1 │ def err [nums: list<int] {}
· ▲
· ╰── expected closing >
╰────
```
unknown types are flagged
```nu
Error: nu::parser::unknown_type
× Unknown type.
╭─[source:1:1]
1 │ def err [nums: list<str>] {}
· ─┬─
· ╰── unknown type
╰────
Error: nu::parser::unknown_type
× Unknown type.
╭─[source:1:1]
1 │ def err [nums: list<int, string>] {}
· ─────┬─────
· ╰── unknown type
╰────
```
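the rules shown in the examples above (bare `list`, empty brackets, nesting, unterminated and unknown types) can be sketched as a small recursive function. this is a simplified illustration and not the code in this pr: `Ty` and `parse_type` are hypothetical names, and only `any`, `int`, `string` and `list` are recognised

```rust
/// A tiny subset of types, for illustration only (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum Ty {
    Any,
    Int,
    String,
    List(Box<Ty>),
}

/// Parse a type annotation such as `list<list<int>>`.
pub fn parse_type(text: &str) -> Result<Ty, &'static str> {
    let text = text.trim();
    if let Some(rest) = text.strip_prefix("list") {
        if rest.is_empty() {
            // Bare `list` means `list<any>`.
            return Ok(Ty::List(Box::new(Ty::Any)));
        }
        if let Some(inner) = rest.strip_prefix('<') {
            // The annotation must end with the matching `>`.
            let inner = inner.strip_suffix('>').ok_or("expected closing >")?;
            if inner.trim().is_empty() {
                // `list<>` and `list< >` are also `list<any>`.
                return Ok(Ty::List(Box::new(Ty::Any)));
            }
            return Ok(Ty::List(Box::new(parse_type(inner)?)));
        }
    }
    match text {
        "any" => Ok(Ty::Any),
        "int" => Ok(Ty::Int),
        "string" => Ok(Ty::String),
        _ => Err("unknown type"),
    }
}
```

in this sketch `list<int, string>` is reported as an unknown type because the whole inner text `int, string` is looked up as one name, which mirrors the second error shown above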
# notes
the error message for mismatched types is not as intuitive
```nu
Error: nu::parser::parse_mismatch
× Parse mismatch during operation.
╭─[source:1:1]
1 │ def err [nums: list<int>] {}; err [a b c]
· ┬
· ╰── expected int
╰────
```
it should be something like this
```nu
Error: nu::parser::parse_mismatch
× Parse mismatch during operation.
╭─[source:1:1]
1 │ def err [nums: list<int>] {}; err [a b c]
· ──┬──
· ╰── expected list<int>
╰────
```
this is currently not implemented
# Description
Previously `nix run nixpkgs#hello` was lexed as `Item, Item, Item,
Comment`. However, `#hello` is *not* supposed to be a comment here and
should be parsed as part of the third `Item`.
This change introduces this behavior by not interrupting the parse of
the current token upon seeing a `#`.
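As a rough illustration of the new rule, here is a minimal sketch of a lexer where `#` only starts a comment at the beginning of a token. It is not the code in `lex.rs`: `Token` and `lex_line` are hypothetical names, and the sketch ignores quoting, pipes, and other special tokens.

```rust
/// Simplified token kinds (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum Token {
    Item(String),
    Comment(String),
}

/// Lex a single line into whitespace-separated items.
/// A `#` begins a comment only when it appears at the start of a token;
/// inside a token it is an ordinary character.
pub fn lex_line(line: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut rest = line.trim_start();
    while !rest.is_empty() {
        if rest.starts_with('#') {
            // The comment runs to the end of the line.
            tokens.push(Token::Comment(rest.to_string()));
            break;
        }
        // Read up to the next whitespace; `#` does not interrupt the item.
        let end = rest.find(char::is_whitespace).unwrap_or(rest.len());
        tokens.push(Token::Item(rest[..end].to_string()));
        rest = rest[end..].trim_start();
    }
    tokens
}
```

Under this rule `nix run nixpkgs#hello` lexes as three items, while `ls # list files` still ends in a comment.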
Thank you so much for considering this. I think many `nix` users will be
grateful for this change, and I think this will lead to more adoption
in the ecosystem.
- closes #8137 and #6335
# User-Facing Changes
- code like `somecode# bla` and `somecode#bla` will no longer be parsed as
`somecode, comment` but as a single item containing the `#`. Hence this
is a breaking change for all users who didn't put a space before the
comment-introducing token (`#`)
# Tests + Formatting
I've added tests that cover this behavior in `test_lex.rs`
- [x] `cargo fmt --all -- --check` to check standard code formatting
(`cargo fmt --all` applies these changes)
- [x] `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used -A
clippy::needless_collect` to check that you're using the standard code
style
- [x] `cargo test --workspace` to check that all tests pass
# After Submitting
> If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
I think this is expected behavior in most other shells, so the
documentation was lacking for not documenting the unexpected behavior
before and hence now is automatically more complete >D
Also enforce this by marking span as `#[non_exhaustive]`, such that going
forward we cannot, in debug builds (1), construct invalid spans.
The motivation for this stems from #6431, where I've seen crashes due to
invalid slice indexing.
My hope is that this will mitigate such scenarios.
1. https://github.com/nushell/nushell/pull/6431#issuecomment-1278147241
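A minimal sketch of the pattern, assuming a constructor that checks the invariant with `debug_assert!`. The `Span` below is a simplified stand-in rather than the real `nu-protocol` type, and `slice` is a hypothetical helper added for illustration.

```rust
/// Simplified stand-in for a source span (byte offsets, `start <= end`).
/// `#[non_exhaustive]` stops *other crates* from using a struct literal,
/// so they have to go through `Span::new`. Within the defining crate the
/// attribute has no effect.
#[non_exhaustive]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
}

impl Span {
    /// Construct a span, checking the invariant in debug builds only.
    pub fn new(start: usize, end: usize) -> Span {
        debug_assert!(
            end >= start,
            "Can't create a Span whose end < start, start={start}, end={end}"
        );
        Span { start, end }
    }

    /// Slice `source` by this span (hypothetical helper for illustration).
    /// This is the kind of indexing that panics if the span is invalid.
    pub fn slice<'a>(&self, source: &'a str) -> &'a str {
        &source[self.start..self.end]
    }
}
```

Because `debug_assert!` compiles to nothing in release builds, the check costs nothing there, which is why the guarantee only holds in debug builds.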
# Description
(description of your pull request here)
# Tests
Make sure you've done the following:
- [ ] Add tests that cover your changes, either in the command examples,
the crate/tests folder, or in the /tests folder.
- [ ] Try to think about corner cases and various ways how your changes
could break. Cover them with tests.
- [ ] If adding tests is not possible, please document in the PR body a
minimal example with steps on how to reproduce so one can verify your
change works.
Make sure you've run and fixed any issues with these commands:
- [x] `cargo fmt --all -- --check` to check standard code formatting
(`cargo fmt --all` applies these changes)
- [ ] `cargo clippy --workspace --features=extra -- -D warnings -D
clippy::unwrap_used -A clippy::needless_collect` to check that you're
using the standard code style
- [ ] `cargo test --workspace --features=extra` to check that all the
tests pass
# Documentation
- [ ] If your PR touches a user-facing nushell feature then make sure
that there is an entry in the documentation
(https://github.com/nushell/nushell.github.io) for the feature, and
update it if necessary.