nushell/crates/nu-command/src/conversions/split_cell_path.rs
Bahex c4dcfdb77b
feat!: Explicit cell-path case sensitivity syntax (#15692)
Related:
- #15683
- #14551
- #849
- #12701
- #11527

# Description
Currently, commands differ in how they treat case in cell-paths:

```nushell
{a: 1, A: 2} | get a A
# => ╭───┬───╮
# => │ 0 │ 2 │
# => │ 1 │ 2 │
# => ╰───┴───╯
{a: 1, A: 2} | select a A
# => ╭───┬───╮
# => │ a │ 1 │
# => │ A │ 2 │
# => ╰───┴───╯
{A: 1} | update a 2
# => Error: nu::shell::column_not_found
# => 
# =>   × Cannot find column 'a'
# =>    ╭─[entry #62:1:1]
# =>  1 │ {A: 1} | update a 2
# =>    · ───┬──          ┬
# =>    ·    │            ╰── cannot find column 'a'
# =>    ·    ╰── value originates here
# =>    ╰────
```

Proposal: make cell-path access case-sensitive by default and add new
syntax (`!`) for case-insensitive parts, similar to optional (`?`) parts.

```nushell
{FOO: BAR}.foo
# => Error: nu::shell::name_not_found
# => 
# =>   × Name not found
# =>    ╭─[entry #60:1:21]
# =>  1 │ {FOO: BAR}.foo
# =>    ·            ─┬─
# =>    ·             ╰── did you mean 'FOO'?
# =>    ╰────
{FOO: BAR}.foo!
# => BAR
```

This would solve the problem of case sensitivity for all commands
without causing an explosion of flags, _and_ make the choice more
granular, since it applies to each part of a cell-path individually.
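
As a rough illustration of the proposed lookup rule, here is a minimal standalone sketch. It does not use nushell's real types: `Record` and `get_column` are hypothetical names made up for this example, and ASCII-only case folding is a simplifying assumption.

```rust
/// How a single cell-path part compares column names.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Casing {
    /// Default: names must match exactly.
    Sensitive,
    /// Written with a trailing `!`: case is ignored.
    Insensitive,
}

/// Hypothetical stand-in for a record: ordered (column, value) pairs.
type Record = Vec<(String, String)>;

/// Return the value of the first column matching `name` under `casing`.
/// Assumption: ASCII-only case folding is good enough for this sketch.
fn get_column<'a>(record: &'a Record, name: &str, casing: Casing) -> Option<&'a str> {
    record
        .iter()
        .find(|(col, _)| match casing {
            Casing::Sensitive => col.as_str() == name,
            Casing::Insensitive => col.eq_ignore_ascii_case(name),
        })
        .map(|(_, val)| val.as_str())
}
```

With a record holding only `FOO`, looking up `foo` finds nothing under `Casing::Sensitive` and finds the `FOO` value under `Casing::Insensitive`, mirroring `{FOO: BAR}.foo` versus `{FOO: BAR}.foo!` above.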

Assigning to a field using a case-insensitive path is case-preserving.
```nushell
mut val = {FOO: "I'm FOO"}; $val
# => ╭─────┬─────────╮
# => │ FOO │ I'm FOO │
# => ╰─────┴─────────╯
$val.foo! = "I'm still FOO"; $val
# => ╭─────┬───────────────╮
# => │ FOO │ I'm still FOO │
# => ╰─────┴───────────────╯
```

For `update`, a case-insensitive path is likewise case-preserving.
```nushell
{FOO: 1} | update foo! { $in + 1 }
# => ╭─────┬───╮
# => │ FOO │ 2 │
# => ╰─────┴───╯
```
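
The case-preserving behavior can be sketched in isolation as follows. This is an illustrative model only: `Record` and `update_column` are hypothetical names, not part of nushell, and matching is ASCII-only by assumption.

```rust
/// Hypothetical stand-in for a record: ordered (column, value) pairs.
type Record = Vec<(String, i64)>;

/// Apply `f` to the first column matching `name`.
/// When `insensitive` is true, ASCII case is ignored while matching.
/// The stored column name is never rewritten, so the update is case-preserving.
/// Returns false when no column matched.
fn update_column(
    record: &mut Record,
    name: &str,
    insensitive: bool,
    f: impl FnOnce(i64) -> i64,
) -> bool {
    let slot = record.iter_mut().find(|entry| {
        if insensitive {
            entry.0.eq_ignore_ascii_case(name)
        } else {
            entry.0.as_str() == name
        }
    });
    match slot {
        Some((_, val)) => {
            // Only the value changes; the key keeps its original spelling.
            *val = f(*val);
            true
        }
        None => false,
    }
}
```

Only the value slot is written, so a column stored as `FOO` keeps that spelling even when it is addressed as `foo!`.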

`insert` can insert values into nested values, so accessing existing
columns is case-insensitive, but new columns are created using the
cell-path exactly as written.
So `insert foo! ...` and `insert FOO! ...` would work exactly as they do
without `!`.
```nushell
{FOO: {quox: 0}}
# => ╭─────┬──────────────╮
# => │     │ ╭──────┬───╮ │
# => │ FOO │ │ quox │ 0 │ │
# => │     │ ╰──────┴───╯ │
# => ╰─────┴──────────────╯
{FOO: {quox: 0}} | insert foo.bar 1
# => ╭─────┬──────────────╮
# => │     │ ╭──────┬───╮ │
# => │ FOO │ │ quox │ 0 │ │
# => │     │ ╰──────┴───╯ │
# => │     │ ╭─────┬───╮  │
# => │ foo │ │ bar │ 1 │  │
# => │     │ ╰─────┴───╯  │
# => ╰─────┴──────────────╯
{FOO: {quox: 0}} | insert foo!.bar 1
# => ╭─────┬──────────────╮
# => │     │ ╭──────┬───╮ │
# => │ FOO │ │ quox │ 0 │ │
# => │     │ │ bar  │ 1 │ │
# => │     │ ╰──────┴───╯ │
# => ╰─────┴──────────────╯
```
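
One way to read the `insert` rule is sketched below. This is an interpretation rather than the actual implementation: `Value`, `Member`, `member_matches` and `insert_at` are hypothetical names, matching is ASCII-only, and the final path part is assumed to be compared exactly, since the text says a trailing `!` there changes nothing.

```rust
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Int(i64),
    Record(Vec<(String, Value)>),
}

/// One cell-path part: a column name plus whether it was written with `!`.
struct Member<'a> {
    name: &'a str,
    insensitive: bool,
}

fn member_matches(col: &str, m: &Member) -> bool {
    if m.insensitive {
        col.eq_ignore_ascii_case(m.name)
    } else {
        col == m.name
    }
}

/// Insert `new` at `path`. Intermediate parts find existing columns using
/// their own casing; missing columns are created with the name as written.
fn insert_at(value: &mut Value, path: &[Member], new: Value) -> Result<(), String> {
    let Value::Record(cols) = value else {
        return Err("cannot insert into a non-record".to_string());
    };
    let Some((head, rest)) = path.split_first() else {
        return Err("empty cell-path".to_string());
    };
    if rest.is_empty() {
        // Assumption: the last part is always compared exactly.
        if cols.iter().any(|(col, _)| col.as_str() == head.name) {
            return Err(format!("column '{}' already exists", head.name));
        }
        cols.push((head.name.to_string(), new));
        return Ok(());
    }
    let found = cols.iter().position(|(col, _)| member_matches(col, head));
    let idx = match found {
        Some(i) => i,
        None => {
            // No matching column: create it, spelled as in the cell-path.
            cols.push((head.name.to_string(), Value::Record(Vec::new())));
            cols.len() - 1
        }
    };
    insert_at(&mut cols[idx].1, rest, new)
}
```

Under these assumptions, inserting at `foo.bar` into `{FOO: {quox: 0}}` adds a separate `foo` column, while `foo!.bar` descends into the existing `FOO` record, matching the two outputs above.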

`upsert` is tricky: depending on the input, rows might end up with
differently cased column names. We can either forbid case-insensitive
cell-paths for `upsert` or trust the user to keep their data in a
sensible shape.

This would be a breaking change, as it would make existing cell-path
accesses case-sensitive. However, case sensitivity is already
inconsistent, and any attempt at making it consistent would be a
breaking change.

> What about `$env`?

1. Initially, special-case it so it keeps its current behavior.
2. Accessing environment variables with non-matching paths gives a
deprecation warning urging users to either use exact casing or use the
new explicit case-sensitivity syntax.
3. Eventually remove `$env`'s special case, making `$env` accesses
case-sensitive by default as well.
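
The transitional step 2 could look roughly like the sketch below. It is purely illustrative: `EnvLookup` and `lookup_env` are hypothetical names, the environment is modeled as a plain slice of pairs, and ASCII-only folding is assumed.

```rust
/// Outcome of a transitional `$env` lookup (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum EnvLookup<'a> {
    /// The name matched with exact casing.
    Exact(&'a str),
    /// Only a case-insensitive match exists; the caller would emit a
    /// deprecation warning that mentions the `actual` spelling.
    Deprecated { value: &'a str, actual: &'a str },
    /// No variable matches, even when ignoring case.
    Missing,
}

/// Prefer an exact match; fall back to a case-insensitive one and flag it.
fn lookup_env<'a>(env: &'a [(String, String)], name: &str) -> EnvLookup<'a> {
    if let Some((_, v)) = env.iter().find(|(k, _)| k.as_str() == name) {
        return EnvLookup::Exact(v.as_str());
    }
    match env.iter().find(|(k, _)| k.eq_ignore_ascii_case(name)) {
        Some((k, v)) => EnvLookup::Deprecated {
            value: v.as_str(),
            actual: k.as_str(),
        },
        None => EnvLookup::Missing,
    }
}
```

Once step 3 lands, the `Deprecated` branch would simply become `Missing`.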

> `$env.ENV_CONVERSIONS`?

In addition to `from_string` and `to_string`, add an optional field to
opt into case-insensitive (case-preserving) behavior.

# User-Facing Changes

- `get`, `where` and other previously case-insensitive commands are now
case-sensitive by default.
- `get`'s `--sensitive` flag is removed. Similar to `--ignore-errors`,
there is now an `--ignore-case` flag that treats all parts of the
cell-path as case-insensitive.
- Users can explicitly choose the case sensitivity of cell-path
accesses or commands.
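
Conceptually, `--ignore-case` amounts to marking every string part of the path as case-insensitive. The sketch below shows that idea on a simplified model: the `PathMember` here only loosely mirrors the `val`, `optional` and `casing` fields used in this file (spans are omitted), and `ignore_case` is a hypothetical helper, not the flag's actual implementation.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Casing {
    Sensitive,
    Insensitive,
}

/// Simplified model of a cell-path member, without spans.
#[derive(Clone, Debug, PartialEq, Eq)]
enum PathMember {
    String { val: String, optional: bool, casing: Casing },
    Int { val: usize, optional: bool },
}

/// Treat every string member as if it had been written with a trailing `!`.
/// Integer members have no casing and pass through unchanged.
fn ignore_case(path: Vec<PathMember>) -> Vec<PathMember> {
    path.into_iter()
        .map(|member| match member {
            PathMember::String { val, optional, .. } => PathMember::String {
                val,
                optional,
                casing: Casing::Insensitive,
            },
            other => other,
        })
        .collect()
}
```

In this model, a path written as `a.0?.B` becomes equivalent to `a!.0?.B!`; optionality is left untouched.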

# Tests + Formatting

Existing tests required minimal modification. ***However, new tests are
not yet added***.

- 🟢 toolkit fmt
- 🟢 toolkit clippy
- 🟢 toolkit test
- 🟢 toolkit test stdlib

# After Submitting

- Update the website to include the new syntax
- Update [tree-sitter-nu](https://github.com/nushell/tree-sitter-nu)

---------

Co-authored-by: Bahex <17417311+Bahex@users.noreply.github.com>
2025-05-18 12:19:09 +03:00


use nu_engine::command_prelude::*;
use nu_protocol::{IntoValue, ast::PathMember, casing::Casing};

#[derive(Clone)]
pub struct SplitCellPath;

impl Command for SplitCellPath {
    fn name(&self) -> &str {
        "split cell-path"
    }

    fn signature(&self) -> Signature {
        Signature::build(self.name())
            .input_output_types(vec![
                (Type::CellPath, Type::List(Box::new(Type::Any))),
                (
                    Type::CellPath,
                    Type::List(Box::new(Type::Record(
                        [
                            ("value".into(), Type::Any),
                            ("optional".into(), Type::Bool),
                            ("insensitive".into(), Type::Bool),
                        ]
                        .into(),
                    ))),
                ),
            ])
            .category(Category::Conversions)
            .allow_variants_without_examples(true)
    }

    fn description(&self) -> &str {
        "Split a cell-path into its components."
    }

    fn search_terms(&self) -> Vec<&str> {
        vec!["convert"]
    }

    fn run(
        &self,
        _engine_state: &EngineState,
        _stack: &mut Stack,
        call: &Call,
        input: PipelineData,
    ) -> Result<PipelineData, ShellError> {
        let head = call.head;
        let input_type = input.get_type();

        let src_span = match input {
            // Early return on correct type and empty pipeline
            PipelineData::Value(Value::CellPath { val, .. }, _) => {
                return Ok(split_cell_path(val, head)?.into_pipeline_data());
            }
            PipelineData::Empty => return Err(ShellError::PipelineEmpty { dst_span: head }),

            // Extract span from incorrect pipeline types
            // NOTE: Match arms can't be combined, `stream`s are of different types
            PipelineData::Value(other, _) => other.span(),
            PipelineData::ListStream(stream, ..) => stream.span(),
            PipelineData::ByteStream(stream, ..) => stream.span(),
        };
        Err(ShellError::OnlySupportsThisInputType {
            exp_input_type: "cell-path".into(),
            wrong_type: input_type.to_string(),
            dst_span: head,
            src_span,
        })
    }

    fn examples(&self) -> Vec<Example> {
        vec![
            Example {
                description: "Split a cell-path into its components",
                example: "$.5?.c | split cell-path",
                result: Some(Value::test_list(vec![
                    Value::test_record(record! {
                        "value" => Value::test_int(5),
                        "optional" => Value::test_bool(true),
                        "insensitive" => Value::test_bool(false),
                    }),
                    Value::test_record(record! {
                        "value" => Value::test_string("c"),
                        "optional" => Value::test_bool(false),
                        "insensitive" => Value::test_bool(false),
                    }),
                ])),
            },
            Example {
                description: "Split a complex cell-path",
                example: r#"$.a!.b?.1."2"."c.d" | split cell-path"#,
                result: Some(Value::test_list(vec![
                    Value::test_record(record! {
                        "value" => Value::test_string("a"),
                        "optional" => Value::test_bool(false),
                        "insensitive" => Value::test_bool(true),
                    }),
                    Value::test_record(record! {
                        "value" => Value::test_string("b"),
                        "optional" => Value::test_bool(true),
                        "insensitive" => Value::test_bool(false),
                    }),
                    Value::test_record(record! {
                        "value" => Value::test_int(1),
                        "optional" => Value::test_bool(false),
                        "insensitive" => Value::test_bool(false),
                    }),
                    Value::test_record(record! {
                        "value" => Value::test_string("2"),
                        "optional" => Value::test_bool(false),
                        "insensitive" => Value::test_bool(false),
                    }),
                    Value::test_record(record! {
                        "value" => Value::test_string("c.d"),
                        "optional" => Value::test_bool(false),
                        "insensitive" => Value::test_bool(false),
                    }),
                ])),
            },
        ]
    }
}

fn split_cell_path(val: CellPath, span: Span) -> Result<Value, ShellError> {
    #[derive(IntoValue)]
    struct PathMemberRecord {
        value: Value,
        optional: bool,
        insensitive: bool,
    }

    impl PathMemberRecord {
        fn from_path_member(pm: PathMember) -> Self {
            let (optional, insensitive, internal_span) = match pm {
                PathMember::String {
                    optional,
                    casing,
                    span,
                    ..
                } => (optional, casing == Casing::Insensitive, span),
                PathMember::Int { optional, span, .. } => (optional, false, span),
            };
            let value = match pm {
                PathMember::String { val, .. } => Value::string(val, internal_span),
                PathMember::Int { val, .. } => Value::int(val as i64, internal_span),
            };
            Self {
                value,
                optional,
                insensitive,
            }
        }
    }

    let members = val
        .members
        .into_iter()
        .map(|pm| {
            let span = match pm {
                PathMember::String { span, .. } | PathMember::Int { span, .. } => span,
            };
            PathMemberRecord::from_path_member(pm).into_value(span)
        })
        .collect();

    Ok(Value::list(members, span))
}

#[cfg(test)]
mod test {
    use super::*;

    #[test]
    fn test_examples() {
        use crate::test_examples;

        test_examples(SplitCellPath {})
    }
}