uniq and uniq-by optimization (#7477) (#7534)

# Description

Refactored `uniq` to replace its quadratic-time duplicate detection with a
HashMap lookup, using each Value converted to a string as the key.
I tried to use HashableValue, but it does not look well developed yet, and
that approach was becoming more complex and difficult.

This improves performance on large data sets.
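
A minimal sketch of the idea, not the code from this commit: each item is turned into a string key and looked up in a HashMap, so duplicates are found in roughly linear time instead of by comparing every pair. The names `to_key` and `uniq_with_counts` are hypothetical, and `Debug` formatting stands in for the real Value-to-string conversion.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::fmt::Debug;

/// Hypothetical stand-in for converting a value to a string key.
/// Assumption: equal values always produce equal strings.
fn to_key<T: Debug>(item: &T) -> String {
    format!("{:?}", item)
}

/// Returns each distinct item once, in first-seen order, with its count.
fn uniq_with_counts<T: Debug + Clone>(items: &[T]) -> Vec<(T, usize)> {
    // Maps the string key to the item's position in `out`.
    let mut index_of: HashMap<String, usize> = HashMap::new();
    let mut out: Vec<(T, usize)> = Vec::new();

    for item in items {
        match index_of.entry(to_key(item)) {
            // Seen before: bump the count of the stored item.
            Entry::Occupied(e) => out[*e.get()].1 += 1,
            // First occurrence: remember its position and store it.
            Entry::Vacant(e) => {
                e.insert(out.len());
                out.push((item.clone(), 1));
            }
        }
    }
    out
}
```

Calling `uniq_with_counts(&[3, 1, 3, 2, 1])` yields the distinct items in input order, each paired with how many times it appeared.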

Fixes https://github.com/nushell/nushell/issues/7477


# Tests + Formatting
```
> let data = fetch "https://home.treasury.gov/system/files/276/yield-curve-rates-1990-2021.csv"
> $data | uniq
```

It keeps the original attribute order in records:
```
> [ {b:2, a:1} {a:1, b:2} ] | uniq 
╭───┬───┬───╮
│ # │ b │ a │
├───┼───┼───┤
│ 0 │ 2 │ 1 │
╰───┴───┴───╯
```
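
One way to get the behaviour shown above is to build the lookup key from the fields sorted by name while storing the record untouched. This is a sketch under that assumption; whether the commit canonicalizes records this way is not stated here. `Record`, `canonical_key` and `uniq_records` are hypothetical names, and the `name=value` key format is a simplification that only suits simple column names.

```rust
use std::collections::HashSet;

/// Hypothetical record type: ordered (column, value) pairs.
type Record = Vec<(String, i64)>;

/// Builds a key that ignores column order by sorting fields by name.
fn canonical_key(rec: &Record) -> String {
    let mut fields: Vec<&(String, i64)> = rec.iter().collect();
    fields.sort_by(|a, b| a.0.cmp(&b.0));
    fields
        .iter()
        .map(|f| format!("{}={}", f.0, f.1))
        .collect::<Vec<_>>()
        .join(",")
}

/// Keeps the first record for each canonical key, with its original
/// column order intact.
fn uniq_records(records: &[Record]) -> Vec<Record> {
    let mut seen: HashSet<String> = HashSet::new();
    let mut out = Vec::new();
    for rec in records {
        // `insert` returns true only when the key was not present yet.
        if seen.insert(canonical_key(rec)) {
            out.push(rec.clone());
        }
    }
    out
}
```

With the two records from the example, `{b:2, a:1}` and `{a:1, b:2}` share one key, so only the first survives and its `b`, `a` column order is preserved.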
raccmonteiro
2023-01-04 19:35:49 +00:00
committed by GitHub
parent f0e87da830
commit 75cb3fcc5f
4 changed files with 95 additions and 26 deletions


@@ -19,6 +19,7 @@ pub use command::To;
 pub use html::ToHtml;
 pub use json::ToJson;
 pub use md::ToMd;
+pub use nuon::value_to_string;
 pub use nuon::ToNuon;
 pub use text::ToText;
 pub use tsv::ToTsv;


@@ -50,7 +50,7 @@ impl Command for ToNuon {
     }
 }
-fn value_to_string(v: &Value, span: Span) -> Result<String, ShellError> {
+pub fn value_to_string(v: &Value, span: Span) -> Result<String, ShellError> {
     match v {
         Value::Binary { val, .. } => {
             let mut s = String::with_capacity(2 * val.len());