truly flexible csv/tsv parsing (#14399)

- fixes #14398

I will properly fill out this PR and fix any tests that might break when
I have the time, this was a quick fix.

# Description

This PR makes `from csv` and `from tsv`, with the `--flexible` flag,
stop dropping extra/unexpected columns.

# User-Facing Changes

`$text`'s contents
```csv
value
1,aaa
2,bbb
3
4,ddd
5,eee,extra
```

Old behavior
```nushell
> $text | from csv --flexible --noheaders 
╭─#─┬─column0─╮
│ 0 │ value   │
│ 1 │       1 │
│ 2 │       2 │
│ 3 │       3 │
│ 4 │       4 │
│ 5 │       5 │
╰─#─┴─column0─╯
```

New behavior
```nushell
> $text | from csv --flexible --noheaders 
╭─#─┬─column0─┬─column1─┬─column2─╮
│ 0 │ value   │       │
│ 1 │       1 │ aaa     │       │
│ 2 │       2 │ bbb     │       │
│ 3 │       3 │       │
│ 4 │       4 │ ddd     │       │
│ 5 │       5 │ eee     │ extra   │
╰─#─┴─column0─┴─column1─┴─column2─╯
```

- The first line in a csv (or tsv) document no longer limits the number
of columns
- Missing values in columns are longer automatically filled with `null`
with this change, as a later row can introduce new columns. **BREAKING
CHANGE**

Because missing columns are different from empty columns, operations on
possibly missing columns will have to use optional access syntax e.g.
`get foo` => `get foo?`
  
# Tests + Formatting
Added examples that run as tests and adjusted existing tests to confirm
the new behavior.

# After Submitting

Update the workaround with fish completer mentioned
[here](https://www.nushell.sh/cookbook/external_completers.html#fish-completer)
This commit is contained in:
Bahex
2024-11-22 00:58:31 +03:00
committed by GitHub
parent 2a90cb7355
commit 5f7082f053
4 changed files with 62 additions and 30 deletions

View File

@ -469,7 +469,7 @@ fn from_csv_test_flexible_extra_vals() {
echo "a,b\n1,2,3" | from csv --flexible | first | values | to nuon
"#
));
assert_eq!(actual.out, "[1, 2]");
assert_eq!(actual.out, "[1, 2, 3]");
}
#[test]
@ -479,5 +479,5 @@ fn from_csv_test_flexible_missing_vals() {
echo "a,b\n1" | from csv --flexible | first | values | to nuon
"#
));
assert_eq!(actual.out, "[1, null]");
assert_eq!(actual.out, "[1]");
}