* Document the lexer and lightly improve its names
The bulk of this pull request adds a substantial amount of new inline
documentation for the lexer. Along the way, I made a few minor changes
to the names in the lexer, most of which were internal.
The main change that affects other files is renaming `group` to `block`,
since the function is actually parsing a block (a list of groups).
* Further clean up the lexer
- Consolidate the logic of the various token builders into a single type
- Improve and clean up the event-driven BlockParser
- Clean up comment parsing. Comments now contain their original leading
whitespace as well as trailing whitespace, and know how to move some
leading whitespace back into the body based on how the lexer decides
to dedent the comments. This preserves the original whitespace
information while still making it straight-forward to eliminate leading
whitespace in help comments.
* Update meta.rs
* WIP
* fix clippy
* remove unwraps
* remove unwraps
Co-authored-by: Jonathan Turner <jonathandturner@users.noreply.github.com>
Co-authored-by: Jonathan Turner <jonathan.d.turner@gmail.com>
* Put parse_definition related funcs into own module
* Add failing lexer test
* Implement Parsing of definition signature
This commit applied changes how the signature of a function is parsed. Before
there was a little bit of "quick-and-dirty" string-matching/parsing involved.
Now, a signature is a little bit more properly parsed.
The grammar of a definition signature understood by these parsing-functions is
as follows:
`[ (parameter | flag | <eol>)* ]`
where
parameter is:
`name (<:> type)? (<,> | <eol> | (#Comment <eol>))?`
flag is:
`--name (-shortform)? (<:> type)? (<,> | <eol> | (#Comment <eol>))?`
(Note: After the last item no <,> has to come.)
Note: It is now possible to pass comments to flags and parameters
Example:
[
d:int # The required d parameter
--x (-x):string # The all powerful x flag
--y (-y):int # The accompanying y flag
]
(Sadly there seems to be a bug (Or is this expected behaviour?) in the lexer, because of which `--x(-x)` would
be treated as one baseline token and is therefore not correctly recognized as 2. For
now a space has to be inserted)
During the implementation of the module, 2 question arose:
Should flag/parameter names be allowed to be type names?
Example case:
```shell
def f [ string ] { echo $string }
```
Currently an error is thrown
* Fix clippy lints
* Remove wrong comment
* Add spacing
* Add Cargo.lock
* Update dependencies
* Document the lexer and lightly improve its names
The bulk of this pull request adds a substantial amount of new inline
documentation for the lexer. Along the way, I made a few minor changes
to the names in the lexer, most of which were internal.
The main change that affects other files is renaming `group` to `block`,
since the function is actually parsing a block (a list of groups).
* Fix rustfmt
* Update lock
Co-authored-by: Jonathan Turner <jonathandturner@users.noreply.github.com>
Co-authored-by: Jonathan Turner <jonathan.d.turner@gmail.com>
* Begin allowing comments and multiline scripts.
* clippy
* Finish moving to groups. Test pass
* Keep going
* WIP
* WIP
* BROKEN WIP
* WIP
* WIP
* Fix more tests
* WIP: alias starts working
* Broken WIP
* Broken WIP
* Variables begin to work
* captures start working
* A little better but needs fixed scope
* Shorthand env setting
* Update main merge
* Broken WIP
* WIP
* custom command parsing
* Custom commands start working
* Fix coloring and parsing of block
* Almost there
* Add some tests
* Add more param types
* Bump version
* Fix benchmark
* Fix stuff
* remove unused dependencies
* moved umask to cfg(unix)
* changed Inflector to inflector, hoping it fixes the issue.
* roll back Inflector
* removed commented out deps now that everything looks good.
* Manifests check. Ignore doctests for now.
* We continue with refactorings towards the separation of concerns between
crates. `nu_plugin_inc` and `nu_plugin_str` common test helpers usage
has been refactored into `nu-plugin` value test helpers.
Inc also uses the new API for integration tests.
This commit contains two improvements:
- Support for a Range syntax (and a corresponding Range value)
- Work towards a signature syntax
Implementing the Range syntax resulted in cleaning up how operators in
the core syntax works. There are now two kinds of infix operators
- tight operators (`.` and `..`)
- loose operators
Tight operators may not be interspersed (`$it.left..$it.right` is a
syntax error). Loose operators require whitespace on both sides of the
operator, and can be arbitrarily interspersed. Precedence is left to
right in the core syntax.
Note that delimited syntax (like `( ... )` or `[ ... ]`) is a single
token node in the core syntax. A single token node can be parsed from
beginning to end in a context-free manner.
The rule for `.` is `<token node>.<member>`. The rule for `..` is
`<token node>..<token node>`.
Loose operators all have the same syntactic rule: `<token
node><space><loose op><space><token node>`.
The second aspect of this pull request is the beginning of support for a
signature syntax. Before implementing signatures, a necessary
prerequisite is for the core syntax to support multi-line programs.
That work establishes a few things:
- `;` and newlines are handled in the core grammar, and both count as
"separators"
- line comments begin with `#` and continue until the end of the line
In this commit, multi-token productions in the core grammar can use
separators interchangably with spaces. However, I think we will
ultimately want a different rule preventing separators from occurring
before an infix operator, so that the end of a line is always
unambiguous. This would avoid gratuitous differences between modules and
repl usage.
We already effectively have this rule, because otherwise `x<newline> |
y` would be a single pipeline, but of course that wouldn't work.
Previously, external words accidentally used
ExpansionRule::new().allow_external_command(), when it should have been
ExpansionRule::new().allow_external_word().
External words are the broadest category in the parser, and are the
appropriate category for external arguments. This was just a mistake.
This commit extracts five new crates:
- nu-source, which contains the core source-code handling logic in Nu,
including Text, Span, and also the pretty.rs-based debug logic
- nu-parser, which is the parser and expander logic
- nu-protocol, which is the bulk of the types and basic conveniences
used by plugins
- nu-errors, which contains ShellError, ParseError and error handling
conveniences
- nu-textview, which is the textview plugin extracted into a crate
One of the major consequences of this refactor is that it's no longer
possible to `impl X for Spanned<Y>` outside of the `nu-source` crate, so
a lot of types became more concrete (Value became a concrete type
instead of Spanned<Value>, for example).
This also turned a number of inherent methods in the main nu crate into
plain functions (impl Value {} became a bunch of functions in the
`value` namespace in `crate::data::value`).