Commit Graph

22 Commits

Author SHA1 Message Date
Leonhard Kipp
ef4e3f907c
parser/refactor def (#2986)
* Move tests into own file

* Move data structs to own file

* Move functions parsing 1 Token (primitives) into own file

* Rename param_flag_list to signature

* Add tests

* Fix clippy lint

* Change imports to new lexer structure
2021-02-08 08:10:14 +13:00
Yehuda Katz
d07789677f
Clean up lexer (#2956)
* Document the lexer and lightly improve its names

The bulk of this pull request adds a substantial amount of new inline
documentation for the lexer. Along the way, I made a few minor changes
to the names in the lexer, most of which were internal.

The main change that affects other files is renaming `group` to `block`,
since the function is actually parsing a block (a list of groups).

* Further clean up the lexer

- Consolidate the logic of the various token builders into a single type
- Improve and clean up the event-driven BlockParser
- Clean up comment parsing. Comments now contain their original leading
  whitespace as well as trailing whitespace, and know how to move some
  leading whitespace back into the body based on how the lexer decides
  to dedent the comments. This preserves the original whitespace
  information while still making it straight-forward to eliminate leading
  whitespace in help comments.

* Update meta.rs

* WIP

* fix clippy

* remove unwraps

* remove unwraps

Co-authored-by: Jonathan Turner <jonathandturner@users.noreply.github.com>
Co-authored-by: Jonathan Turner <jonathan.d.turner@gmail.com>
2021-02-04 20:20:21 +13:00
Leonhard Kipp
3e6e3a207c
Feature/def signature with comments (#2905)
* Put parse_definition related funcs into own module

* Add failing lexer test

* Implement Parsing of definition signature

This commit applied changes how the signature of a function is parsed. Before
there was a little bit of "quick-and-dirty" string-matching/parsing involved.
Now, a signature is a little bit more properly parsed.
The grammar of a definition signature understood by these parsing-functions is
as follows:
 `[ (parameter | flag | <eol>)* ]`
where
parameter is:
    `name (<:> type)? (<,> | <eol> | (#Comment <eol>))?`
flag is:
    `--name (-shortform)? (<:> type)? (<,> | <eol> | (#Comment <eol>))?`
(Note: After the last item no <,> has to come.)
Note: It is now possible to pass comments to flags and parameters
Example:
[
  d:int          # The required d parameter
  --x (-x):string # The all powerful x flag
  --y (-y):int    # The accompanying y flag
]

(Sadly there seems to be a bug (Or is this expected behaviour?) in the lexer, because of which `--x(-x)` would
be treated as one baseline token and is therefore not correctly recognized as 2. For
now a space has to be inserted)

During the implementation of the module, 2 question arose:
Should flag/parameter names be allowed to be type names?
Example case:
```shell
def f [ string ] { echo $string }
```
Currently an error is thrown

* Fix clippy lints

* Remove wrong comment

* Add spacing

* Add Cargo.lock
2021-01-12 06:53:58 +13:00
Yehuda Katz
f410fb6689
Document lexer (#2865)
* Update dependencies

* Document the lexer and lightly improve its names

The bulk of this pull request adds a substantial amount of new inline
documentation for the lexer. Along the way, I made a few minor changes
to the names in the lexer, most of which were internal.

The main change that affects other files is renaming `group` to `block`,
since the function is actually parsing a block (a list of groups).

* Fix rustfmt

* Update lock

Co-authored-by: Jonathan Turner <jonathandturner@users.noreply.github.com>
Co-authored-by: Jonathan Turner <jonathan.d.turner@gmail.com>
2021-01-07 16:03:00 +13:00
Jonathan Turner
ac578b8491
Multiline scripts part 2 (#2795)
* Begin allowing comments and multiline scripts.

* clippy

* Finish moving to groups. Test pass

* Keep going

* WIP

* WIP

* BROKEN WIP

* WIP

* WIP

* Fix more tests

* WIP: alias starts working

* Broken WIP

* Broken WIP

* Variables begin to work

* captures start working

* A little better but needs fixed scope

* Shorthand env setting

* Update main merge

* Broken WIP

* WIP

* custom command parsing

* Custom commands start working

* Fix coloring and parsing of block

* Almost there

* Add some tests

* Add more param types

* Bump version

* Fix benchmark

* Fix stuff
2020-12-18 20:53:49 +13:00
Jonathan Turner
8df748463d
Getting ready for multiline scripts (#2737)
* WIP

* WIP

* WIP

* Tests are passing

* make parser more resilient

* lint
2020-11-10 05:27:07 +13:00
Jason Gedge
cda53b6cda
Return incomplete parse from lite_parse (#2284)
* Move lite_parse tests into a submodule

* Have lite_parse return partial parses when error encountered.

Although a parse fails, we can generally still return what was successfully
parsed. This is useful, for example, when figuring out completions at some
cursor position, because we can map the cursor to something more structured
(e.g., cursor is at a flag name).
2020-08-02 06:39:55 +12:00
Jonathan Turner
b89976daef
let format access variables also (#1842) 2020-05-19 16:20:09 +12:00
Jonathan Turner
0743b69ad5
Move from language-reporting to codespan (#1825) 2020-05-19 06:44:27 +12:00
Kevin Del Castillo
9ec2aca86f
Support completion for paths with multiple dots (#1640)
* refactor: expand_path and expand_ndots now work for any string.

* refactor: refactor test and add new ones.

* refactor: convert expanded to owned string

* feat: pub export of expand_ndots

* feat: add completion for ndots in fs-shell
2020-04-23 16:17:38 +12:00
Jonathan Turner
eec94e4016
Semicolon (#1613)
* WIP on blocks

* Getting further

* add some tests
2020-04-20 18:41:51 +12:00
Kevin Del Castillo
928188b18e
Redesign custom canonicalize, enable multiple dots expansion earlier. (#1588)
* Expand n dots early where tilde was also expanded.

* Remove normalize, not needed.
New function absolutize, doesn't follow links neither checks existence.
Renamed canonicalize_existing to canonicalize, works as expected.

* Remove normalize usages, change canonicalize.

* Treat strings as paths
2020-04-16 11:29:22 +12:00
Jonathan Turner
08a09e2273
Pipeline blocks (#1579)
* Making Commands match what UntaggedValue needs

* WIP

* WIP

* WIP

* Moved to expressions for conditions

* Add 'each' command to use command blocks

* More cleanup

* Add test for 'each'

* Instead use an expression block
2020-04-13 19:59:57 +12:00
Jonathan Turner
c4daa2e40f
Add experimental new parser (#1554)
Move to an experimental new parser
2020-04-06 19:16:14 +12:00
Andrés N. Robalino
b36d21e76f
Infer types from regular delimited plain text unstructured files. (#1494)
* Infer types from regular delimited plain text unstructured files.

* Nothing resolves to an empty string.
2020-03-16 15:50:45 -05:00
Yehuda Katz
7efb31a4e4 Restructure and streamline token expansion (#1123)
Restructure and streamline token expansion

The purpose of this commit is to streamline the token expansion code, by
removing aspects of the code that are no longer relevant, removing
pointless duplication, and eliminating the need to pass the same
arguments to `expand_syntax`.

The first big-picture change in this commit is that instead of a handful
of `expand_` functions, which take a TokensIterator and ExpandContext, a
smaller number of methods on the `TokensIterator` do the same job.

The second big-picture change in this commit is fully eliminating the
coloring traits, making coloring a responsibility of the base expansion
implementations. This also means that the coloring tracer is merged into
the expansion tracer, so you can follow a single expansion and see how
the expansion process produced colored tokens.

One side effect of this change is that the expander itself is marginally
more error-correcting. The error correction works by switching from
structured expansion to `BackoffColoringMode` when an unexpected token
is found, which guarantees that all spans of the source are colored, but
may not be the most optimal error recovery strategy.

That said, because `BackoffColoringMode` only extends as far as a
closing delimiter (`)`, `]`, `}`) or pipe (`|`), it does result in
fairly granular correction strategy.

The current code still produces an `Err` (plus a complete list of
colored shapes) from the parsing process if any errors are encountered,
but this could easily be addressed now that the underlying expansion is
error-correcting.

This commit also colors any spans that are syntax errors in red, and
causes the parser to include some additional information about what
tokens were expected at any given point where an error was encountered,
so that completions and hinting could be more robust in the future.

Co-authored-by: Jonathan Turner <jonathandturner@users.noreply.github.com>
Co-authored-by: Andrés N. Robalino <andres@androbtech.com>
2020-01-21 17:45:03 -05:00
Jonathan Turner
72838cc083
Move to using clippy (#1142)
* Clippy fixes

* Finish converting to use clippy

* fix warnings in new master

* fix windows

* fix windows

Co-authored-by: Artem Vorotnikov <artem@vorotnikov.me>
2019-12-31 20:36:08 +13:00
Andrés N. Robalino
87cc6d6f01 Separate internal and external command definitions. 2019-12-15 01:24:31 -05:00
Yehuda Katz
57af9b5040 Add Range and start Signature support
This commit contains two improvements:

- Support for a Range syntax (and a corresponding Range value)
- Work towards a signature syntax

Implementing the Range syntax resulted in cleaning up how operators in
the core syntax works. There are now two kinds of infix operators

- tight operators (`.` and `..`)
- loose operators

Tight operators may not be interspersed (`$it.left..$it.right` is a
syntax error). Loose operators require whitespace on both sides of the
operator, and can be arbitrarily interspersed. Precedence is left to
right in the core syntax.

Note that delimited syntax (like `( ... )` or `[ ... ]`) is a single
token node in the core syntax. A single token node can be parsed from
beginning to end in a context-free manner.

The rule for `.` is `<token node>.<member>`. The rule for `..` is
`<token node>..<token node>`.

Loose operators all have the same syntactic rule: `<token
node><space><loose op><space><token node>`.

The second aspect of this pull request is the beginning of support for a
signature syntax. Before implementing signatures, a necessary
prerequisite is for the core syntax to support multi-line programs.

That work establishes a few things:

- `;` and newlines are handled in the core grammar, and both count as
  "separators"
- line comments begin with `#` and continue until the end of the line

In this commit, multi-token productions in the core grammar can use
separators interchangably with spaces. However, I think we will
ultimately want a different rule preventing separators from occurring
before an infix operator, so that the end of a line is always
unambiguous. This would avoid gratuitous differences between modules and
repl usage.

We already effectively have this rule, because otherwise `x<newline> |
y` would be a single pipeline, but of course that wouldn't work.
2019-12-11 16:41:07 -08:00
Jonathan Turner
b6ba7f97fd WIP param completions 2019-12-08 18:58:53 +13:00
Belhorma Bendebiche
8f9dd6516e Add =~ and !~ operators on strings
`left =~ right` return true if left contains right, using Rust's
`String::contains`. `!~` is the negated version.

A new `apply_operator` function is added which decouples evaluation from
`Value::compare`. This returns a `Value` and opens the door to
implementing `+` for example, though it wouldn't be useful immediately.

The `operator!` macro had to be changed slightly as it would choke on
`~` in arguments.
2019-12-02 11:02:57 -08:00
Yehuda Katz
e4226def16 Extract core stuff into own crates
This commit extracts five new crates:

- nu-source, which contains the core source-code handling logic in Nu,
  including Text, Span, and also the pretty.rs-based debug logic
- nu-parser, which is the parser and expander logic
- nu-protocol, which is the bulk of the types and basic conveniences
  used by plugins
- nu-errors, which contains ShellError, ParseError and error handling
  conveniences
- nu-textview, which is the textview plugin extracted into a crate

One of the major consequences of this refactor is that it's no longer
possible to `impl X for Spanned<Y>` outside of the `nu-source` crate, so
a lot of types became more concrete (Value became a concrete type
instead of Spanned<Value>, for example).

This also turned a number of inherent methods in the main nu crate into
plain functions (impl Value {} became a bunch of functions in the
`value` namespace in `crate::data::value`).
2019-12-02 10:54:12 -08:00