tables and able to work with them for data processing & viewing
purposes. At the moment, one of the ways we can process these tables is
to view a histogram of a given column.
As usage matures, we may find certain core commands that could
be used ergonomically when working with tables in Nu.
The functions for retrieving, replacing, and inserting values into values all assumed they get the complete
column path as regular tagged strings. This commit changes them to accept tagged values instead. Basically,
this means we can have column paths containing both strings and numbers (e.g. package.authors.1).
Unfortunately, for the moment, all members parsed and deserialized for a command that expects column paths
of tagged values arrive as tagged values (encapsulating Members) holding strings only.
This makes it impossible to determine whether the final member in package.authors.1 or package.authors."1"
(meaning the "number" 1) is a string member or a number member. It therefore prevents us from knowing, and
enforcing for the user, that members enclosed in double quotes mean "retrieve the column with this name from
the table" and that numbers retrieve a particular row number from a table.
This commit puts in place the infrastructure needed for when integer members land. In the meantime, the workaround
is to convert the tagged values passed in the column paths back to strings.
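As a rough illustration of the distinction this is working towards, here is a minimal sketch. The `Member` enum, `parse_column_path`, and `members_to_strings` are hypothetical names for this example only, not the types used in the codebase, and dots inside quoted members are not handled.

```rust
/// Hypothetical member type: either a column name or a row index.
#[derive(Debug, PartialEq)]
enum Member {
    String(String),
    Int(usize),
}

/// Splits a dotted path into members. A segment wrapped in double quotes is
/// always a column name; a bare segment made only of ASCII digits is a row
/// index; anything else is a column name.
fn parse_column_path(path: &str) -> Vec<Member> {
    path.split('.')
        .map(|segment| {
            let quoted =
                segment.len() >= 2 && segment.starts_with('"') && segment.ends_with('"');
            let all_digits =
                !segment.is_empty() && segment.bytes().all(|b| b.is_ascii_digit());
            if quoted {
                Member::String(segment[1..segment.len() - 1].to_string())
            } else {
                match segment.parse::<usize>() {
                    Ok(index) if all_digits => Member::Int(index),
                    _ => Member::String(segment.to_string()),
                }
            }
        })
        .collect()
}

/// The interim workaround described above: every member becomes a string
/// again, which loses the column/row distinction.
fn members_to_strings(members: &[Member]) -> Vec<String> {
    members
        .iter()
        .map(|member| match member {
            Member::String(name) => name.clone(),
            Member::Int(index) => index.to_string(),
        })
        .collect()
}
```

With this sketch, `package.authors.1` ends in `Member::Int(1)` while `package.authors."1"` ends in `Member::String("1")`; after `members_to_strings` both look the same, which is the limitation described above.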
The original purpose of this PR was to modernize the external parser to
use the new Shape system.
This commit does include some of that change, but a more important
aspect of this change is an improvement to the expansion trace.
The previous commit, 6a7c00ea, added trace infrastructure to the syntax coloring
feature. This commit adds tracing to the expander.
The bulk of that work, in addition to the tree builder logic, was an
overhaul of the formatter traits to make them more general purpose, and
more structured.
Some highlights:
- `ToDebug` was split into two traits (`ToDebug` and `DebugFormat`)
because implementations needed to become objects, but a convenience
method on `ToDebug` didn't qualify
- `DebugFormat`'s `fmt_debug` method now takes a `DebugFormatter` rather
than a standard formatter, and `DebugFormatter` has a new (but still
limited) facility for structured formatting.
- Implementations of `ExpandSyntax` need to produce output that
implements `DebugFormat`.
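To illustrate why such a split can be needed, here is a minimal sketch of one common object-safety pattern. It is an assumption about the general shape, not the actual definitions: the `DebugFormatter` fields, the `say` helper, the `debug` convenience method, and the `Word` node are all made up for the example.

```rust
use std::fmt::{self, Write};

/// Hypothetical stand-in for the formatter described above.
struct DebugFormatter<'a> {
    out: &'a mut String,
}

impl<'a> DebugFormatter<'a> {
    /// A small structured helper: writes `kind(text)`.
    fn say(&mut self, kind: &str, text: &str) -> fmt::Result {
        write!(self.out, "{}({})", kind, text)
    }
}

/// Object-safe trait: it can be used as `dyn DebugFormat`.
trait DebugFormat {
    fn fmt_debug(&self, f: &mut DebugFormatter<'_>, source: &str) -> fmt::Result;
}

/// Convenience trait. The `Sized` bound keeps it off trait objects, so the
/// convenience method no longer prevents `DebugFormat` from being an object.
trait ToDebug: DebugFormat + Sized {
    fn debug(&self, source: &str) -> String {
        let mut out = String::new();
        // Writing into a String does not fail, so the result is ignored here.
        let _ = self.fmt_debug(&mut DebugFormatter { out: &mut out }, source);
        out
    }
}

/// Every sized `DebugFormat` implementation gets the convenience method.
impl<T: DebugFormat> ToDebug for T {}

/// Example node holding a byte range into the source text.
struct Word {
    start: usize,
    end: usize,
}

impl DebugFormat for Word {
    fn fmt_debug(&self, f: &mut DebugFormatter<'_>, source: &str) -> fmt::Result {
        f.say("word", &source[self.start..self.end])
    }
}
```

In this sketch a `Vec<Box<dyn DebugFormat>>` can hold heterogeneous nodes for the trace, while concrete types still get `node.debug(source)` through the blanket impl.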
Unlike the highlighter changes, these changes are fairly focused on the
trace output, so they aren't behind a flag.
Adds a new substr function to the str plugin, with tests and documentation.
The function takes a start/end location as a string in the form "##,##", where both sides of the comma are
optional, and behaves like Rust's own index operator [##..##].
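A minimal sketch of how such a range string could be interpreted. `parse_range` and `substr` here are illustrative helpers, not the plugin's code, and returning `None` for an out-of-bounds range (rather than panicking as the index operator would, or clamping) is an assumption of this example.

```rust
/// Parses "start,end" where either side may be empty.
/// Returns None when the comma is missing or a side is not a
/// non-negative integer.
fn parse_range(spec: &str) -> Option<(Option<usize>, Option<usize>)> {
    let (start, end) = spec.split_once(',')?;
    let parse_side = |side: &str| -> Option<Option<usize>> {
        let side = side.trim();
        if side.is_empty() {
            Some(None) // an omitted side means "open-ended"
        } else {
            side.parse().ok().map(Some)
        }
    };
    Some((parse_side(start)?, parse_side(end)?))
}

/// Applies the range to `input` using byte offsets, like `input[start..end]`,
/// but yields None instead of panicking on an invalid range.
fn substr<'a>(input: &'a str, spec: &str) -> Option<&'a str> {
    let (start, end) = parse_range(spec)?;
    let start = start.unwrap_or(0);
    let end = end.unwrap_or(input.len());
    input.get(start..end)
}
```

For example, `substr("nushell", "2,")` yields `Some("shell")` and `substr("nushell", ",2")` yields `Some("nu")`, mirroring `[2..]` and `[..2]`.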
a joy. Fundamentally we embrace functional programming principles for
transforming the dataset from any format picked up by Nu. These table
processing "primitive" commands will build up and make pipelines
composable with data processing capabilities, allowing us to evaluate,
reduce, and map the tables, even composing these steps declaratively.
In this regard, `split-by` expects a table with grouped data, and we
can use its output in interesting ways (e.g. collecting labels for
visualizing the data in charts and/or shaping it for a particular chart
of interest).
This commit should finish the `coloring_in_tokens` feature, which moves
the shape accumulator into the token stream. This allows rollbacks of
the token stream to also roll back any shapes that were added.
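The idea can be sketched as follows. This is a simplified illustration, not the parser's actual types: `TokenStream`, `Shape`, `next_colored`, and `atomic` are hypothetical names, and tokens are plain string slices here.

```rust
/// Hypothetical shapes recorded for syntax coloring.
#[derive(Debug, Clone, PartialEq)]
enum Shape {
    Word,
    Number,
}

/// A token stream that owns the shape accumulator, so both move together.
struct TokenStream<'a> {
    tokens: &'a [&'a str],
    index: usize,
    shapes: Vec<Shape>,
}

impl<'a> TokenStream<'a> {
    fn new(tokens: &'a [&'a str]) -> Self {
        TokenStream { tokens, index: 0, shapes: Vec::new() }
    }

    /// Consumes the next token and records a shape for it.
    fn next_colored(&mut self, shape: Shape) -> Option<&'a str> {
        let token = self.tokens.get(self.index).copied()?;
        self.index += 1;
        self.shapes.push(shape);
        Some(token)
    }

    /// Runs `block` atomically: if it fails, both the stream position and
    /// any shapes added during the attempt are rolled back.
    fn atomic<T, E>(
        &mut self,
        block: impl FnOnce(&mut Self) -> Result<T, E>,
    ) -> Result<T, E> {
        let (index, shape_len) = (self.index, self.shapes.len());
        let result = block(self);
        if result.is_err() {
            self.index = index;
            self.shapes.truncate(shape_len);
        }
        result
    }
}
```

Because the shapes live in the stream, a failed attempt inside `atomic` leaves no stray coloring behind, which is the property described above.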
This commit also adds a much nicer syntax highlighter trace, which shows
all of the paths the highlighter took to arrive at a particular coloring
output. This change is fairly substantial, but really improves the
understandability of the flow. I intend to update the normal parser with
a similar tracing view.
In general, this change also fleshes out the concept of "atomic" token
stream operations.
A good next step would be to try to make the parser more
error-correcting, using the coloring infrastructure. A follow-up step
would involve merging the parser and highlighter shapes themselves.
The code still compiles, so this doesn't seem to break anything. That also means
it's not critical to fix it, but having dead code around isn't great either.
Previously it would split the last column on the first separator value found
between the start of the column and the end of the row. Changing this to use
everything from the start of the column to the end of the string makes it behave
more similarly to the other columns, making it less surprising.
The table parsing/creation logic has changed from treating every line the same
to processing each line in the context of the column headers' placement. Previously,
items on separate rows would go towards the same column as long as they had the
same index based on separator alone. Now, each item's index is based on vertical
alignment to the column header.
This may seem brittle, but it solves the problem of some tables operating with
empty cells that would cause remaining values to be paired with the wrong
column.
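A minimal sketch of alignment-based parsing, under stated assumptions: headers are separated by whitespace, the input is ASCII so byte offsets equal display columns, and a cell never overflows into the next header's offset. `header_columns` and `parse_row` are illustrative names, not the functions in this change.

```rust
/// Returns (name, start offset) for each whitespace-separated header.
fn header_columns(header: &str) -> Vec<(String, usize)> {
    let mut columns = Vec::new();
    let mut start: Option<usize> = None;
    for (i, ch) in header.char_indices() {
        match (ch.is_whitespace(), start) {
            // First non-whitespace character: a header name begins here.
            (false, None) => start = Some(i),
            // Whitespace after a name: the header name ends here.
            (true, Some(s)) => {
                columns.push((header[s..i].to_string(), s));
                start = None;
            }
            _ => {}
        }
    }
    if let Some(s) = start {
        columns.push((header[s..].to_string(), s));
    }
    columns
}

/// Slices a row by the header offsets. Each cell spans from its header's
/// offset to the next header's offset; the last cell runs to the end of
/// the line. Missing or empty regions yield an empty cell.
fn parse_row(columns: &[(String, usize)], line: &str) -> Vec<(String, String)> {
    columns
        .iter()
        .enumerate()
        .map(|(i, (name, start))| {
            let end = columns
                .get(i + 1)
                .map(|(_, next)| *next)
                .unwrap_or(line.len())
                .min(line.len());
            let cell = line.get(*start..end).unwrap_or("").trim();
            (name.clone(), cell.to_string())
        })
        .collect()
}
```

With a separator-only approach, a row whose middle cell is empty would shift its remaining values one column to the left; here the empty region under the header simply produces an empty cell.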
Based on Kubernetes output (get pods, events), the new method has shown
much greater success rates for parsing.
New tests are added to test for additional cases that might be trickier to
handle with the new logic.
Old tests are updated where their previous expectations no longer hold.
For instance: previously, lines would be treated separately, allowing any index
offset between columns on different rows, as long as they had the same row index
as decided by a separator. When this is no longer the case, some things need to
be adjusted.
The benefit of this is that coloring can be made atomic alongside token
stream forwarding.
I put the feature behind a flag so I can continue to iterate on it
without possibly regressing existing functionality. The flag has to go
in a lot of places, but I expect it to be short-lived, and the flags
are fully contained in the parser.