Add Range and start Signature support

This commit contains two improvements:

- Support for a Range syntax (and a corresponding Range value)
- Work towards a signature syntax

Implementing the Range syntax resulted in cleaning up how operators in
the core syntax works. There are now two kinds of infix operators

- tight operators (`.` and `..`)
- loose operators

Tight operators may not be interspersed (`$it.left..$it.right` is a
syntax error). Loose operators require whitespace on both sides of the
operator, and can be arbitrarily interspersed. Precedence is left to
right in the core syntax.

Note that delimited syntax (like `( ... )` or `[ ... ]`) is a single
token node in the core syntax. A single token node can be parsed from
beginning to end in a context-free manner.

The rule for `.` is `<token node>.<member>`. The rule for `..` is
`<token node>..<token node>`.

Loose operators all have the same syntactic rule: `<token
node><space><loose op><space><token node>`.

The second aspect of this pull request is the beginning of support for a
signature syntax. Before implementing signatures, a necessary
prerequisite is for the core syntax to support multi-line programs.

That work establishes a few things:

- `;` and newlines are handled in the core grammar, and both count as
  "separators"
- line comments begin with `#` and continue until the end of the line

In this commit, multi-token productions in the core grammar can use
separators interchangably with spaces. However, I think we will
ultimately want a different rule preventing separators from occurring
before an infix operator, so that the end of a line is always
unambiguous. This would avoid gratuitous differences between modules and
repl usage.

We already effectively have this rule, because otherwise `x<newline> |
y` would be a single pipeline, but of course that wouldn't work.
This commit is contained in:
Yehuda Katz
2019-12-04 13:14:52 -08:00
parent 16272b1b20
commit 57af9b5040
64 changed files with 2522 additions and 738 deletions

View File

@ -8,6 +8,7 @@ mod return_value;
mod signature;
mod syntax_shape;
mod type_name;
mod type_shape;
mod value;
pub use crate::call_info::{CallInfo, EvaluatedArgs};
@ -17,9 +18,11 @@ pub use crate::return_value::{CommandAction, ReturnSuccess, ReturnValue};
pub use crate::signature::{NamedType, PositionalType, Signature};
pub use crate::syntax_shape::SyntaxShape;
pub use crate::type_name::{PrettyType, ShellTypeName, SpannedTypeName};
pub use crate::type_shape::{Row as RowType, Type};
pub use crate::value::column_path::{did_you_mean, ColumnPath, PathMember, UnspannedPathMember};
pub use crate::value::dict::{Dictionary, TaggedDictBuilder};
pub use crate::value::evaluate::{Evaluate, EvaluateTrait, Scope};
pub use crate::value::primitive::format_primitive;
pub use crate::value::primitive::Primitive;
pub use crate::value::range::{Range, RangeInclusion};
pub use crate::value::{UntaggedValue, Value};

View File

@ -1,4 +1,5 @@
use crate::syntax_shape::SyntaxShape;
use crate::type_shape::Type;
use indexmap::IndexMap;
use nu_source::{b, DebugDocBuilder, PrettyDebug, PrettyDebugWithSource};
use serde::{Deserialize, Serialize};
@ -76,6 +77,8 @@ pub struct Signature {
pub positional: Vec<(PositionalType, Description)>,
pub rest_positional: Option<(SyntaxShape, Description)>,
pub named: IndexMap<String, (NamedType, Description)>,
pub yields: Option<Type>,
pub input: Option<Type>,
pub is_filter: bool,
}
@ -98,14 +101,16 @@ impl PrettyDebugWithSource for Signature {
}
impl Signature {
pub fn new(name: String) -> Signature {
pub fn new(name: impl Into<String>) -> Signature {
Signature {
name,
name: name.into(),
usage: String::new(),
positional: vec![],
rest_positional: None,
named: IndexMap::new(),
is_filter: false,
yields: None,
input: None,
}
}
@ -186,4 +191,14 @@ impl Signature {
self.rest_positional = Some((ty, desc.into()));
self
}
pub fn yields(mut self, ty: Type) -> Signature {
self.yields = Some(ty);
self
}
pub fn input(mut self, ty: Type) -> Signature {
self.input = Some(ty);
self
}
}

View File

@ -8,6 +8,7 @@ pub enum SyntaxShape {
Member,
ColumnPath,
Number,
Range,
Int,
Path,
Pattern,
@ -22,6 +23,7 @@ impl PrettyDebug for SyntaxShape {
SyntaxShape::Member => "member shape",
SyntaxShape::ColumnPath => "column path shape",
SyntaxShape::Number => "number shape",
SyntaxShape::Range => "range shape",
SyntaxShape::Int => "integer shape",
SyntaxShape::Path => "file path shape",
SyntaxShape::Pattern => "pattern shape",

View File

@ -0,0 +1,382 @@
use crate::value::dict::Dictionary;
use crate::value::primitive::Primitive;
use crate::value::range::RangeInclusion;
use crate::value::{UntaggedValue, Value};
use derive_new::new;
use nu_source::{b, DebugDoc, DebugDocBuilder, PrettyDebug};
use serde::{Deserialize, Deserializer, Serialize};
use std::collections::BTreeMap;
use std::fmt::Debug;
use std::hash::Hash;
/**
This file describes the structural types of the nushell system.
Its primary purpose today is to identify "equivalent" values for the purpose
of merging rows into a single table or identify rows in a table that have the
same shape for reflection.
*/
#[derive(Debug, Clone, Eq, PartialEq, Ord, PartialOrd, Hash, Serialize, Deserialize, new)]
pub struct RangeType {
from: (Type, RangeInclusion),
to: (Type, RangeInclusion),
}
#[derive(Debug, Clone, Eq, PartialEq, Ord, PartialOrd, Hash, Serialize, Deserialize)]
pub enum Type {
Nothing,
Int,
Range(Box<RangeType>),
Decimal,
Bytesize,
String,
Line,
ColumnPath,
Pattern,
Boolean,
Date,
Duration,
Path,
Binary,
Row(Row),
Table(Vec<Type>),
// TODO: Block arguments
Block,
// TODO: Error type
Error,
// Stream markers (used as bookend markers rather than actual values)
BeginningOfStream,
EndOfStream,
}
#[derive(Debug, Clone, Eq, PartialEq, Ord, PartialOrd, Hash, new)]
pub struct Row {
#[new(default)]
map: BTreeMap<Column, Type>,
}
impl Serialize for Row {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: serde::Serializer,
{
serializer.collect_map(self.map.iter())
}
}
impl<'de> Deserialize<'de> for Row {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: Deserializer<'de>,
{
struct RowVisitor;
impl<'de> serde::de::Visitor<'de> for RowVisitor {
type Value = Row;
fn expecting(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {
write!(formatter, "a row")
}
fn visit_map<A>(self, mut map: A) -> Result<Self::Value, A::Error>
where
A: serde::de::MapAccess<'de>,
{
let mut new_map = BTreeMap::new();
loop {
let entry = map.next_entry()?;
match entry {
None => return Ok(Row { map: new_map }),
Some((key, value)) => {
new_map.insert(key, value);
}
}
}
}
}
deserializer.deserialize_map(RowVisitor)
}
}
impl Type {
pub fn from_primitive(primitive: &Primitive) -> Type {
match primitive {
Primitive::Nothing => Type::Nothing,
Primitive::Int(_) => Type::Int,
Primitive::Range(range) => {
let (left_value, left_inclusion) = &range.from;
let (right_value, right_inclusion) = &range.to;
let left_type = (Type::from_primitive(left_value), *left_inclusion);
let right_type = (Type::from_primitive(right_value), *right_inclusion);
let range = RangeType::new(left_type, right_type);
Type::Range(Box::new(range))
}
Primitive::Decimal(_) => Type::Decimal,
Primitive::Bytes(_) => Type::Bytesize,
Primitive::String(_) => Type::String,
Primitive::Line(_) => Type::Line,
Primitive::ColumnPath(_) => Type::ColumnPath,
Primitive::Pattern(_) => Type::Pattern,
Primitive::Boolean(_) => Type::Boolean,
Primitive::Date(_) => Type::Date,
Primitive::Duration(_) => Type::Duration,
Primitive::Path(_) => Type::Path,
Primitive::Binary(_) => Type::Binary,
Primitive::BeginningOfStream => Type::BeginningOfStream,
Primitive::EndOfStream => Type::EndOfStream,
}
}
pub fn from_dictionary(dictionary: &Dictionary) -> Type {
let mut map = BTreeMap::new();
for (key, value) in dictionary.entries.iter() {
let column = Column::String(key.clone());
map.insert(column, Type::from_value(value));
}
Type::Row(Row { map })
}
pub fn from_table<'a>(table: impl IntoIterator<Item = &'a Value>) -> Type {
let mut vec = vec![];
for item in table.into_iter() {
vec.push(Type::from_value(item))
}
Type::Table(vec)
}
pub fn from_value<'a>(value: impl Into<&'a UntaggedValue>) -> Type {
match value.into() {
UntaggedValue::Primitive(p) => Type::from_primitive(p),
UntaggedValue::Row(row) => Type::from_dictionary(row),
UntaggedValue::Table(table) => Type::from_table(table.iter()),
UntaggedValue::Error(_) => Type::Error,
UntaggedValue::Block(_) => Type::Block,
}
}
}
impl PrettyDebug for Type {
fn pretty(&self) -> DebugDocBuilder {
match self {
Type::Nothing => ty("nothing"),
Type::Int => ty("integer"),
Type::Range(range) => {
let (left, left_inclusion) = &range.from;
let (right, right_inclusion) = &range.to;
let left_bracket = b::delimiter(match left_inclusion {
RangeInclusion::Exclusive => "(",
RangeInclusion::Inclusive => "[",
});
let right_bracket = b::delimiter(match right_inclusion {
RangeInclusion::Exclusive => ")",
RangeInclusion::Inclusive => "]",
});
b::typed(
"range",
(left_bracket
+ left.pretty()
+ b::operator(",")
+ b::space()
+ right.pretty()
+ right_bracket)
.group(),
)
}
Type::Decimal => ty("decimal"),
Type::Bytesize => ty("bytesize"),
Type::String => ty("string"),
Type::Line => ty("line"),
Type::ColumnPath => ty("column-path"),
Type::Pattern => ty("pattern"),
Type::Boolean => ty("boolean"),
Type::Date => ty("date"),
Type::Duration => ty("duration"),
Type::Path => ty("path"),
Type::Binary => ty("binary"),
Type::Error => b::error("error"),
Type::BeginningOfStream => b::keyword("beginning-of-stream"),
Type::EndOfStream => b::keyword("end-of-stream"),
Type::Row(row) => (b::kind("row")
+ b::space()
+ b::intersperse(
row.map.iter().map(|(key, ty)| {
(b::key(match key {
Column::String(string) => string.clone(),
Column::Value => "<value>".to_string(),
}) + b::delimit("(", ty.pretty(), ")").into_kind())
.nest()
}),
b::space(),
)
.nest())
.nest(),
Type::Table(table) => {
let mut group: Group<DebugDoc, Vec<(usize, usize)>> = Group::new();
for (i, item) in table.iter().enumerate() {
group.add(item.to_doc(), i);
}
(b::kind("table") + b::space() + b::keyword("of")).group()
+ b::space()
+ (if group.len() == 1 {
let (doc, _) = group.into_iter().nth(0).unwrap();
DebugDocBuilder::from_doc(doc)
} else {
b::intersperse(
group.into_iter().map(|(doc, rows)| {
(b::intersperse(
rows.iter().map(|(from, to)| {
if from == to {
b::description(from)
} else {
(b::description(from)
+ b::space()
+ b::keyword("to")
+ b::space()
+ b::description(to))
.group()
}
}),
b::description(", "),
) + b::description(":")
+ b::space()
+ DebugDocBuilder::from_doc(doc))
.nest()
}),
b::space(),
)
})
}
Type::Block => ty("block"),
}
}
}
#[derive(Debug, new)]
struct DebugEntry<'a> {
key: &'a Column,
value: &'a Type,
}
impl<'a> PrettyDebug for DebugEntry<'a> {
fn pretty(&self) -> DebugDocBuilder {
(b::key(match self.key {
Column::String(string) => string.clone(),
Column::Value => format!("<value>"),
}) + b::delimit("(", self.value.pretty(), ")").into_kind())
}
}
fn ty(name: impl std::fmt::Display) -> DebugDocBuilder {
b::kind(format!("{}", name))
}
pub trait GroupedValue: Debug + Clone {
type Item;
fn new() -> Self;
fn merge(&mut self, value: Self::Item);
}
impl GroupedValue for Vec<(usize, usize)> {
type Item = usize;
fn new() -> Vec<(usize, usize)> {
vec![]
}
fn merge(&mut self, new_value: usize) {
match self.last_mut() {
Some(value) if value.1 == new_value - 1 => {
value.1 += 1;
}
_ => self.push((new_value, new_value)),
}
}
}
#[derive(Debug)]
pub struct Group<K: Debug + Eq + Hash, V: GroupedValue> {
values: indexmap::IndexMap<K, V>,
}
impl<K, G> Group<K, G>
where
K: Debug + Eq + Hash,
G: GroupedValue,
{
pub fn new() -> Group<K, G> {
Group {
values: indexmap::IndexMap::default(),
}
}
pub fn len(&self) -> usize {
self.values.len()
}
pub fn into_iter(self) -> impl Iterator<Item = (K, G)> {
self.values.into_iter()
}
pub fn add(&mut self, key: impl Into<K>, value: impl Into<G::Item>) {
let key = key.into();
let value = value.into();
let group = self.values.get_mut(&key);
match group {
None => {
self.values.insert(key, {
let mut group = G::new();
group.merge(value.into());
group
});
}
Some(group) => {
group.merge(value.into());
}
}
}
}
#[derive(Debug, Clone, Eq, PartialEq, Ord, PartialOrd, Serialize, Deserialize, Hash)]
pub enum Column {
String(String),
Value,
}
impl Into<Column> for String {
fn into(self) -> Column {
Column::String(self)
}
}
impl Into<Column> for &String {
fn into(self) -> Column {
Column::String(self.clone())
}
}
impl Into<Column> for &str {
fn into(self) -> Column {
Column::String(self.to_string())
}
}

View File

@ -4,6 +4,7 @@ mod debug;
pub mod dict;
pub mod evaluate;
pub mod primitive;
pub mod range;
mod serde_bigdecimal;
mod serde_bigint;
@ -11,11 +12,12 @@ use crate::type_name::{ShellTypeName, SpannedTypeName};
use crate::value::dict::Dictionary;
use crate::value::evaluate::Evaluate;
use crate::value::primitive::Primitive;
use crate::value::range::{Range, RangeInclusion};
use crate::{ColumnPath, PathMember};
use bigdecimal::BigDecimal;
use indexmap::IndexMap;
use nu_errors::ShellError;
use nu_source::{AnchorLocation, HasSpan, Span, Tag};
use nu_source::{AnchorLocation, HasSpan, Span, Spanned, Tag};
use num_bigint::BigInt;
use serde::{Deserialize, Serialize};
use std::path::PathBuf;
@ -156,6 +158,13 @@ impl UntaggedValue {
UntaggedValue::Primitive(Primitive::Binary(binary))
}
pub fn range(
left: (Spanned<Primitive>, RangeInclusion),
right: (Spanned<Primitive>, RangeInclusion),
) -> UntaggedValue {
UntaggedValue::Primitive(Primitive::Range(Box::new(Range::new(left, right))))
}
pub fn boolean(s: impl Into<bool>) -> UntaggedValue {
UntaggedValue::Primitive(Primitive::Boolean(s.into()))
}
@ -224,6 +233,23 @@ impl Value {
_ => Err(ShellError::type_error("Path", self.spanned_type_name())),
}
}
pub fn as_primitive(&self) -> Result<Primitive, ShellError> {
match &self.value {
UntaggedValue::Primitive(primitive) => Ok(primitive.clone()),
_ => Err(ShellError::type_error(
"Primitive",
self.spanned_type_name(),
)),
}
}
pub fn as_u64(&self) -> Result<u64, ShellError> {
match &self.value {
UntaggedValue::Primitive(primitive) => primitive.as_u64(self.tag.span),
_ => Err(ShellError::type_error("integer", self.spanned_type_name())),
}
}
}
impl Into<UntaggedValue> for &str {

View File

@ -28,6 +28,7 @@ impl PrettyType for Primitive {
match self {
Primitive::Nothing => ty("nothing"),
Primitive::Int(_) => ty("integer"),
Primitive::Range(_) => ty("range"),
Primitive::Decimal(_) => ty("decimal"),
Primitive::Bytes(_) => ty("bytesize"),
Primitive::String(_) => ty("string"),
@ -51,6 +52,21 @@ impl PrettyDebug for Primitive {
Primitive::Nothing => b::primitive("nothing"),
Primitive::Int(int) => prim(format_args!("{}", int)),
Primitive::Decimal(decimal) => prim(format_args!("{}", decimal)),
Primitive::Range(range) => {
let (left, left_inclusion) = &range.from;
let (right, right_inclusion) = &range.to;
b::typed(
"range",
(left_inclusion.debug_left_bracket()
+ left.pretty()
+ b::operator(",")
+ b::space()
+ right.pretty()
+ right_inclusion.debug_right_bracket())
.group(),
)
}
Primitive::Bytes(bytes) => primitive_doc(bytes, "bytesize"),
Primitive::String(string) => prim(string),
Primitive::Line(string) => prim(string),

View File

@ -1,12 +1,14 @@
use crate::type_name::ShellTypeName;
use crate::value::column_path::ColumnPath;
use crate::value::range::Range;
use crate::value::{serde_bigdecimal, serde_bigint};
use bigdecimal::BigDecimal;
use chrono::{DateTime, Utc};
use chrono_humanize::Humanize;
use nu_source::PrettyDebug;
use nu_errors::{ExpectedRange, ShellError};
use nu_source::{PrettyDebug, Span, SpannedItem};
use num_bigint::BigInt;
use num_traits::cast::FromPrimitive;
use num_traits::cast::{FromPrimitive, ToPrimitive};
use serde::{Deserialize, Serialize};
use std::path::PathBuf;
@ -25,6 +27,7 @@ pub enum Primitive {
Boolean(bool),
Date(DateTime<Utc>),
Duration(u64), // Duration in seconds
Range(Box<Range>),
Path(PathBuf),
#[serde(with = "serde_bytes")]
Binary(Vec<u8>),
@ -34,6 +37,25 @@ pub enum Primitive {
EndOfStream,
}
impl Primitive {
pub fn as_u64(&self, span: Span) -> Result<u64, ShellError> {
match self {
Primitive::Int(int) => match int.to_u64() {
None => Err(ShellError::range_error(
ExpectedRange::U64,
&format!("{}", int).spanned(span),
"converting an integer into a 64-bit integer",
)),
Some(num) => Ok(num),
},
other => Err(ShellError::type_error(
"integer",
other.type_name().spanned(span),
)),
}
}
}
impl From<BigDecimal> for Primitive {
fn from(decimal: BigDecimal) -> Primitive {
Primitive::Decimal(decimal)
@ -51,6 +73,7 @@ impl ShellTypeName for Primitive {
match self {
Primitive::Nothing => "nothing",
Primitive::Int(_) => "integer",
Primitive::Range(_) => "range",
Primitive::Decimal(_) => "decimal",
Primitive::Bytes(_) => "bytes",
Primitive::String(_) => "string",
@ -91,6 +114,11 @@ pub fn format_primitive(primitive: &Primitive, field_name: Option<&String>) -> S
Primitive::Duration(sec) => format_duration(*sec),
Primitive::Int(i) => i.to_string(),
Primitive::Decimal(decimal) => decimal.to_string(),
Primitive::Range(range) => format!(
"{}..{}",
format_primitive(&range.from.0.item, None),
format_primitive(&range.to.0.item, None)
),
Primitive::Pattern(s) => s.to_string(),
Primitive::String(s) => s.to_owned(),
Primitive::Line(s) => s.to_owned(),
@ -125,7 +153,8 @@ pub fn format_primitive(primitive: &Primitive, field_name: Option<&String>) -> S
Primitive::Date(d) => d.humanize().to_string(),
}
}
fn format_duration(sec: u64) -> String {
pub fn format_duration(sec: u64) -> String {
let (minutes, seconds) = (sec / 60, sec % 60);
let (hours, minutes) = (minutes / 60, minutes % 60);
let (days, hours) = (hours / 24, hours % 24);

View File

@ -0,0 +1,32 @@
use crate::value::Primitive;
use derive_new::new;
use nu_source::{b, DebugDocBuilder, Spanned};
use serde::{Deserialize, Serialize};
#[derive(Debug, Copy, Clone, PartialEq, PartialOrd, Eq, Ord, Serialize, Deserialize, Hash)]
pub enum RangeInclusion {
Inclusive,
Exclusive,
}
impl RangeInclusion {
pub fn debug_left_bracket(&self) -> DebugDocBuilder {
b::delimiter(match self {
RangeInclusion::Exclusive => "(",
RangeInclusion::Inclusive => "[",
})
}
pub fn debug_right_bracket(&self) -> DebugDocBuilder {
b::delimiter(match self {
RangeInclusion::Exclusive => ")",
RangeInclusion::Inclusive => "]",
})
}
}
#[derive(Debug, Clone, PartialEq, PartialOrd, Eq, Ord, Serialize, Deserialize, new)]
pub struct Range {
pub from: (Spanned<Primitive>, RangeInclusion),
pub to: (Spanned<Primitive>, RangeInclusion),
}