Fix spread operator lexing in records (#15023)

# Description

Zyphys found that when parsing `{...{}, ...{}, a: 1}`, the `a:` would be
considered one token, leading to a parse error ([Discord
message](https://discord.com/channels/601130461678272522/614593951969574961/1336762075535511573)).
This PR fixes that.

Previously, while getting tokens for a record, two steps would run in a loop (a simplified sketch of this tokenization follows the list):
1. Get the next two tokens while treating `:` as a special character, so that
we get the next field key and a colon token.
2. Get the next token while *not* treating `:` as a special character, so that
we get the next value (values may themselves contain colons).
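
To make the failure concrete, here is a minimal, runnable sketch of that two-step tokenization, assuming a toy lexer that only splits on spaces, commas, and (optionally) colons; `next_token` is a hypothetical stand-in for nushell's `lex_n_tokens`, not the real implementation:

```rust
/// Toy lexer: reads one token from `input`, splitting on spaces and commas,
/// and treating `:` as its own token only when `colon_special` is set.
/// (Hypothetical sketch; the real lexer handles much more.)
fn next_token(input: &str, colon_special: bool) -> Option<(&str, &str)> {
    // Skip separators (spaces and commas) before the token.
    let s = input.trim_start_matches(|c| c == ' ' || c == ',');
    if s.is_empty() {
        return None;
    }
    // When `:` is special, a colon is a token by itself.
    if colon_special && s.starts_with(':') {
        return Some((&s[..1], &s[1..]));
    }
    // Otherwise the token runs until the next separator.
    let end = s
        .find(|c: char| c == ' ' || c == ',' || (colon_special && c == ':'))
        .unwrap_or(s.len());
    Some((&s[..end], &s[end..]))
}

fn main() {
    // Contents of the record `{...{}, ...{}, a: 1}`:
    let mut rest = "...{}, ...{}, a: 1";

    // Step 1: two tokens with `:` special -- intended to be a field key
    // and a colon, but here both tokens are spreads.
    for _ in 0..2 {
        let (tok, r) = next_token(rest, true).unwrap();
        println!("step 1 token: {tok:?}"); // "...{}", then "...{}"
        rest = r;
    }

    // Step 2: one token with `:` NOT special -- intended to be the value,
    // but the key and its colon come out fused as one token.
    let (tok, _) = next_token(rest, false).unwrap();
    println!("step 2 token: {tok:?}"); // "a:" -- hence the parse error
}
```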

I didn't update this loop when I added the spread operator. With `{...{},
...{}, a: 1}`, step 1 would consume `...{}` and `...{}` as its two tokens,
and step 2 would then lex `a:` as a single token. This PR changes the loop
to first get a single token, check whether it's spreading a record, and skip
straight to the next iteration if so; otherwise it lexes the colon and the
value as before.
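For reference, here's that check pulled out as a standalone function, a minimal sketch extracted from the diff further down (the helper name `is_record_spread` is hypothetical; the PR inlines this condition in the lexing loop, and the sketch restates the chained `||` comparisons with `matches!`):

```rust
/// A token spreads a record if it starts with `...` followed by something
/// that can evaluate to a record: a variable (`$`), a record literal (`{`),
/// or a subexpression (`(`).
fn is_record_spread(contents: &[u8]) -> bool {
    contents.len() > 3
        && contents.starts_with(b"...")
        && matches!(contents[3], b'$' | b'{' | b'(')
}

fn main() {
    assert!(is_record_spread(b"...{}"));
    assert!(is_record_spread(b"...$other"));
    assert!(is_record_spread(b"...(make-record)"));
    assert!(!is_record_spread(b"a")); // ordinary field key
    assert!(!is_record_spread(b"...")); // too short: no spread target
    println!("all checks pass");
}
```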

Alternatives considered:
- Treat `:` as a special character when getting the value too. This would
simplify the loop greatly, but would mean you couldn't use colons in values
(e.g., a bare value containing a colon, such as `{time: 12:30}`, would stop
parsing).
- Merge the loop for getting tokens and the loop for parsing those
tokens. I tried this, but it complicates things if you run into a syntax
error and want to create a garbage span going to the end of the record.

# User-Facing Changes

Nothing new

```diff
@@ -6013,20 +6013,30 @@ pub fn parse_record(working_set: &mut StateWorkingSet, span: Span) -> Expression
         error: None,
         span_offset: start,
     };
-    let mut lex_n = |additional_whitespace, special_tokens, max_tokens| {
-        lex_n_tokens(
-            &mut lex_state,
-            additional_whitespace,
-            special_tokens,
-            true,
-            max_tokens,
-        )
-    };
     loop {
-        if lex_n(&[b'\n', b'\r', b','], &[b':'], 2) < 2 {
+        let additional_whitespace = &[b'\n', b'\r', b','];
+        if lex_n_tokens(&mut lex_state, additional_whitespace, &[b':'], true, 1) < 1 {
             break;
         };
-        if lex_n(&[b'\n', b'\r', b','], &[], 1) < 1 {
+        let span = lex_state
+            .output
+            .last()
+            .expect("should have gotten 1 token")
+            .span;
+        let contents = working_set.get_span_contents(span);
+        if contents.len() > 3
+            && contents.starts_with(b"...")
+            && (contents[3] == b'$' || contents[3] == b'{' || contents[3] == b'(')
+        {
+            // This was a spread operator, so there's no value
+            continue;
+        }
+        // Get token for colon
+        if lex_n_tokens(&mut lex_state, additional_whitespace, &[b':'], true, 1) < 1 {
+            break;
+        };
+        // Get token for value
+        if lex_n_tokens(&mut lex_state, additional_whitespace, &[], true, 1) < 1 {
             break;
         };
     }
```

```diff
@@ -65,6 +65,7 @@ fn spread_type_list() -> TestResult {
 #[test]
 fn spread_in_record() -> TestResult {
     run_test(r#"{...{} ...{}, a: 1} | to nuon"#, "{a: 1}").unwrap();
+    run_test(r#"{...{...{...{}}}} | to nuon"#, "{}").unwrap();
     run_test(
         r#"{foo: bar ...{a: {x: 1}} b: 3} | to nuon"#,
```