nushell/crates/nu-parser/tests
Kira ae0e13733d
Fix parsing record values containing colons (#13413)
This PR is an attempt to fix #8257 and fix #10985 (which is
duplicate-ish)

# Description
The parser currently doesn't know how to deal with colons appearing
while lexing whitespace-terminated tokens specifying a record value.
Most notably, this means you can't use datetime literals in record value
position (and as a consequence, `| to nuon | from nuon` roundtrips can
fail), but it also means that bare words containing colons cause a
non-useful error message.

![image](https://github.com/user-attachments/assets/f04a8417-ee18-44e7-90eb-a0ecef943a0f)

`parser::parse_record` calls `lex::lex` with the `:` colon character in
the `special_tokens` argument. This allows colons to terminate record
keys, but as a side effect, it also causes colons to terminate record
*values*. I added a new function `lex::lex_n_tokens`, which allows the
caller to drive the lexing process more explicitly, and used it in
`parser::parse_record` to let colons terminate record keys while not
giving them special treatment when appearing in record values.

This PR description previously said: *Another approach suggested in one
of the issues was to support an additional datetime literal format that
doesn't require colons. I like that that wouldn't require new
`lex::lex_internal` behaviour, but an advantage of my approach is that
it also newly allows for string record values given as bare words
containing colons. I think this eliminates another possible source of
confusion.* It was determined that this is undesirable, and in the
current state of this PR, bare word record values with colons are
rejected explicitly. The better error message is still a win.

# User-Facing Changes
In addition to the above, this PR also disables the use of "special"
(non-item) tokens in record key and value position, and the use of a
single bare `:` as a record key.

Examples of behaviour *before* this PR:
```nu
{ a: b } # Valid, same as { 'a': 'b' }
{ a: b:c } # Error: expected ':'
{ a: 2024-08-13T22:11:09 } # Error: expected ':'
{ :: 1 } # Valid, same as { ':': 1 }
{ ;: 1 } # Valid, same as { ';': 1 }
{ a: || } # Valid, same as { 'a': '||' }
```

Examples of behaviour *after* this PR:
```nu
{ a: b } # (Unchanged) Valid, same as { 'a': 'b' }
{ a: b:c } # Error: colon in bare word specifying record value
{ a: 2024-08-13T22:11:09 } # Valid, same as { a: (2024-08-13T22:11:09) }
{ :: 1 } # Error: colon in bare word specifying record key
{ ;: 1 } # Error: expected item in record key position
{ a: || } # Error: expected item in record value position
```

# Tests + Formatting
I added tests, but I'm not sure if they're sufficient and in the right
place.

# After Submitting
I don't think documentation changes are needed for this, but please let
me know if you disagree.
2024-08-28 22:53:56 +02:00
..
test_lex.rs Fix parsing record values containing colons (#13413) 2024-08-28 22:53:56 +02:00
test_parser_unicode_escapes.rs IO and redirection overhaul (#11934) 2024-03-14 15:51:55 -05:00
test_parser.rs Fix parsing record values containing colons (#13413) 2024-08-28 22:53:56 +02:00