nushell/crates/nu-protocol/src/ast/match_pattern.rs
Devyn Cairns 35d2750757 Change how and and or operations are compiled to IR to support custom values (#14653)
# Description

Because `and` and `or` are short-circuiting operations in Nushell, they
must be compiled to a sequence that avoids evaluating the RHS if the LHS
is already sufficient to determine the output - i.e., `false` for `and`
and `true` for `or`. I initially implemented this with `branch-if`
instructions, simply returning the RHS if it needed to be evaluated, and
returning the short-circuited boolean value if it did not.

Example for `$a and $b`:

```
   0: load-variable          %0, var 999 "$a"
   1: branch-if              %0, 3
   2: jump                   5
   3: load-variable          %0, var 1000 "$b" # label(0), from(1:)
   4: jump                   6
   5: load-literal           %0, bool(false) # label(1), from(2:)
   6: span                   %0          # label(2), from(4:)
   7: return                 %0
```

Unfortunately, this broke polars. Using `and`/`or` on custom values is
perfectly valid, and custom values are allowed to define that behavior
differently; the polars plugin uses this for boolean masks. But without
the `binary-op` instruction, that custom behavior is never invoked.
Additionally, `branch-if` requires a boolean, and custom values are not
booleans. This changes the IR to the following, using the `match`
instruction to check for the specific short-circuit value instead, and
still invoking `binary-op` otherwise:

```
   0: load-variable          %0, var 125 "$a"
   1: match                  (false), %0, 4
   2: load-variable          %1, var 124 "$b"
   3: binary-op              %0, Boolean(And), %1
   4: span                   %0          # label(0), from(1:)
   5: return                 %0
```
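
To make the difference between the two lowerings concrete, here is a minimal sketch in plain Rust. Everything in it is a hypothetical stand-in: `Val`, `binary_and`, `and_branch_if`, and `and_match` are illustrative names, not nu-protocol or nu-engine APIs, and `Val::Mask` only imitates a custom value such as a polars boolean mask.

```rust
// Hypothetical, simplified value model; NOT the real nu-protocol `Value`.
#[derive(Debug, Clone, PartialEq)]
enum Val {
    Bool(bool),
    // Stand-in for a custom value, e.g. a boolean mask.
    Mask(Vec<bool>),
}

/// Stand-in for `binary-op %0, Boolean(And), %1`: plain booleans use
/// logical AND, while masks define their own element-wise behavior.
fn binary_and(lhs: &Val, rhs: &Val) -> Result<Val, String> {
    match (lhs, rhs) {
        (Val::Bool(a), Val::Bool(b)) => Ok(Val::Bool(*a && *b)),
        (Val::Mask(a), Val::Mask(b)) if a.len() == b.len() => {
            Ok(Val::Mask(a.iter().zip(b).map(|(x, y)| *x && *y).collect()))
        }
        _ => Err(format!("unsupported operands: {lhs:?} and {rhs:?}")),
    }
}

/// Old lowering: `branch-if` on the LHS. It needs a boolean and never
/// reaches `binary-op`, so custom behavior is never invoked.
fn and_branch_if(lhs: Val, rhs: impl FnOnce() -> Val) -> Result<Val, String> {
    match lhs {
        Val::Bool(true) => Ok(rhs()),
        Val::Bool(false) => Ok(Val::Bool(false)),
        other => Err(format!("branch-if needs a bool, got {other:?}")),
    }
}

/// New lowering: `match (false)` short-circuits; anything else falls
/// through to `binary-op`, which custom values can handle themselves.
fn and_match(lhs: Val, rhs: impl FnOnce() -> Val) -> Result<Val, String> {
    if lhs == Val::Bool(false) {
        // Jump straight to the end; the RHS closure is never called.
        return Ok(lhs);
    }
    let rhs = rhs();
    binary_and(&lhs, &rhs)
}
```

In this sketch, `and_match(Val::Bool(false), ...)` returns without calling the RHS closure, while two `Val::Mask` operands reach `binary_and`; `and_branch_if` rejects the same masks because it requires a boolean.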

I've also renamed `Pattern::Value` to `Pattern::Expression` and added a
proper `Pattern::Value` variant that actually contains a `Value`
instead. I'm still hoping to remove `Pattern::Expression` eventually,
because it's kind of a hack: we don't actually evaluate the expression,
we just match it against a few cases specifically for pattern matching.
It's also one of the places where AST leaks into IR, and I want to remove
all of those.
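
A rough sketch of the distinction, assuming a toy AST and value model: `Expr`, `Val`, `Pat`, and `matches` below are hypothetical and much smaller than the real nu-protocol types. The `Value` arm is a direct comparison, whereas the `Expression` arm never evaluates anything and only recognizes a few literal shapes.

```rust
// Hypothetical miniature value and AST types, for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Val {
    Bool(bool),
    Int(i64),
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Bool(bool),
    Int(i64),
    // Something that would need real evaluation, e.g. a command call.
    Call(String),
}

enum Pat {
    /// Literal that still lives in the AST.
    Expression(Box<Expr>),
    /// Literal that is already a value.
    Value(Val),
}

fn matches(pat: &Pat, value: &Val) -> bool {
    match pat {
        // A pure value can simply be compared.
        Pat::Value(expected) => expected == value,
        // The expression is not evaluated; only a few literal shapes are
        // recognized, and everything else fails to match.
        Pat::Expression(expr) => match (expr.as_ref(), value) {
            (Expr::Bool(a), Val::Bool(b)) => a == b,
            (Expr::Int(a), Val::Int(b)) => a == b,
            _ => false,
        },
    }
}
```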

Fixes #14518

# User-Facing Changes

- `and` and `or` now support custom values again.
- The IR is a little cleaner, though it may be a bit slower, since
`match` is more complex.

# Tests + Formatting

The existing tests pass, but I didn't add anything new. Unfortunately I
don't think there's anything built-in to trigger this, but maybe some
testcases could be added to polars to test it.
2024-12-25 06:12:53 -06:00

use super::Expression;
use crate::{Span, Value, VarId};
use serde::{Deserialize, Serialize};

/// AST Node for match arm with optional match guard
#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]
pub struct MatchPattern {
    pub pattern: Pattern,
    pub guard: Option<Box<Expression>>,
    pub span: Span,
}

impl MatchPattern {
    pub fn variables(&self) -> Vec<VarId> {
        self.pattern.variables()
    }
}

/// AST Node for pattern matching rules
#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]
pub enum Pattern {
    /// Destructuring of records
    Record(Vec<(String, MatchPattern)>),
    /// List destructuring
    List(Vec<MatchPattern>),
    /// Matching against a literal (from expression result)
    // TODO: it would be nice if this didn't depend on AST
    // maybe const evaluation can get us to a Value instead?
    Expression(Box<Expression>),
    /// Matching against a literal (pure value)
    Value(Value),
    /// binding to a variable
    Variable(VarId),
    /// the `pattern1 | pattern2` or-pattern
    Or(Vec<MatchPattern>),
    /// the `..$foo` pattern
    Rest(VarId),
    /// the `..` pattern
    IgnoreRest,
    /// the `_` pattern
    IgnoreValue,
    /// Failed parsing of a pattern
    Garbage,
}

impl Pattern {
    pub fn variables(&self) -> Vec<VarId> {
        let mut output = vec![];
        match self {
            Pattern::Record(items) => {
                for item in items {
                    output.append(&mut item.1.variables());
                }
            }
            Pattern::List(items) => {
                for item in items {
                    output.append(&mut item.variables());
                }
            }
            Pattern::Variable(var_id) => output.push(*var_id),
            Pattern::Or(patterns) => {
                for pattern in patterns {
                    output.append(&mut pattern.variables());
                }
            }
            Pattern::Rest(var_id) => output.push(*var_id),
            Pattern::Expression(_)
            | Pattern::Value(_)
            | Pattern::IgnoreValue
            | Pattern::Garbage
            | Pattern::IgnoreRest => {}
        }
        output
    }
}
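
The traversal in `variables()` can be tried in isolation with a reduced, standalone copy. This is only a sketch under stated assumptions: `Pat` is a hypothetical cut-down version of `Pattern` without spans or guards, `VarId` is assumed to be a plain `usize` here, and the free function `variables` is an illustrative rewrite, not the method above.

```rust
// Assumption for this sketch: variable ids are plain integers.
type VarId = usize;

// Hypothetical cut-down pattern type (no spans, guards, or literals).
#[derive(Debug, Clone)]
enum Pat {
    Record(Vec<(String, Pat)>),
    List(Vec<Pat>),
    Variable(VarId),
    Or(Vec<Pat>),
    Rest(VarId),
    IgnoreValue,
}

/// Collect every variable bound by the pattern, depth-first and in
/// source order, mirroring the recursive walk shown above.
fn variables(pat: &Pat) -> Vec<VarId> {
    match pat {
        Pat::Record(items) => items.iter().flat_map(|(_, p)| variables(p)).collect(),
        Pat::List(items) | Pat::Or(items) => items.iter().flat_map(variables).collect(),
        Pat::Variable(id) | Pat::Rest(id) => vec![*id],
        Pat::IgnoreValue => vec![],
    }
}
```

As in the original method, nothing is deduplicated: an or-pattern whose alternatives bind the same variable reports that id once per alternative.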