Use nucleo instead of skim for completions (#14846)

# Description

This PR replaces `SkimMatcherV2` from the
[fuzzy-matcher](https://docs.rs/fuzzy-matcher/latest/fuzzy_matcher/)
crate with the
[nucleo-matcher](https://docs.rs/nucleo-matcher/latest/nucleo_matcher/)
crate for doing fuzzy matching. This touches both our completion code in
`nu-cli` and symbol filtering in `nu-lsp`.

Nucleo should give us better performance than Skim. If we later decide
to use the Nucleo frontend ([crate
docs](https://docs.rs/nucleo/latest/nucleo/)) as well, it also works on
Windows, unlike [Skim](https://github.com/skim-rs/skim), which appears
to support only Linux and macOS.

Unfortunately, we still have an indirect dependency on `fuzzy-matcher`,
because the [`dialoguer`](https://github.com/console-rs/dialoguer) crate
uses it.

# User-Facing Changes

No breaking changes. Suggestions will be sorted differently, because
Nucleo uses a different algorithm from Skim for matching/scoring.
Hopefully, the new sorting will generally make more sense.
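
To illustrate why the order can change even when the same candidates match, here is a minimal sketch in which two scoring rules rank the same inputs differently. Both scorers (`prefers_short`, `prefers_early`) and the `rank` helper are hypothetical and exist only for this example; neither reproduces Skim's or Nucleo's actual scoring.

```rust
/// Keep only candidates that `score` accepts, ordered by descending score.
/// `score` returns `Some(value)` on a match and `None` otherwise.
fn rank<'a>(candidates: &[&'a str], score: impl Fn(&str) -> Option<u32>) -> Vec<&'a str> {
    let mut scored: Vec<(&'a str, u32)> = candidates
        .iter()
        .filter_map(|&c| score(c).map(|s| (c, s)))
        .collect();
    // Stable sort: candidates with equal scores keep their input order.
    scored.sort_by(|a, b| b.1.cmp(&a.1));
    scored.into_iter().map(|(c, _)| c).collect()
}

/// Hypothetical scorer: among candidates containing the query, shorter ones score higher.
fn prefers_short(candidate: &str, query: &str) -> Option<u32> {
    candidate
        .contains(query)
        .then(|| 1000u32.saturating_sub(candidate.len() as u32))
}

/// Hypothetical scorer: candidates where the query starts earlier score higher.
fn prefers_early(candidate: &str, query: &str) -> Option<u32> {
    candidate
        .find(query)
        .map(|pos| 1000u32.saturating_sub(pos as u32))
}
```

For the candidates `config_backup_old`, `readme`, and `my_config` with the query `config`, `prefers_short` puts `my_config` first while `prefers_early` puts `config_backup_old` first; `readme` is dropped by both.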

# Tests + Formatting

In `nu-cli`, I modified an existing test but didn't test performance. I
haven't tested `nu-lsp` manually, but the existing tests pass.

I did manually run `ls /nix/store/<TAB>`, `ls /nix/store/d<TAB>`, etc.,
but didn't notice Nucleo being faster (my `/nix/store` folder has 34136
items at the time of writing).
Yash Thakur
2025-01-17 07:24:00 -05:00
committed by GitHub
parent 8759936636
commit 75105033b2
7 changed files with 90 additions and 64 deletions


@@ -2,7 +2,6 @@ use std::collections::{BTreeMap, HashSet};
 use std::hash::{Hash, Hasher};
 use crate::{path_to_uri, span_to_range, uri_to_path, Id, LanguageServer};
-use fuzzy_matcher::{skim::SkimMatcherV2, FuzzyMatcher};
 use lsp_textdocument::{FullTextDocument, TextDocuments};
 use lsp_types::{
     DocumentSymbolParams, DocumentSymbolResponse, Location, Range, SymbolInformation, SymbolKind,
@@ -14,6 +13,8 @@ use nu_protocol::{
     engine::{CachedFile, EngineState, StateWorkingSet},
     DeclId, Span, VarId,
 };
+use nucleo_matcher::pattern::{AtomKind, CaseMatching, Normalization, Pattern};
+use nucleo_matcher::{Config, Matcher, Utf32Str};
 use std::{cmp::Ordering, path::Path};
 /// Struct stored in cache, uri not included
@@ -70,7 +71,7 @@ impl Symbol {
 /// Cache symbols for each opened file to avoid repeated parsing
 pub struct SymbolCache {
     /// Fuzzy matcher for symbol names
-    matcher: SkimMatcherV2,
+    matcher: Matcher,
     /// File Uri --> Symbols
     cache: BTreeMap<Uri, Vec<Symbol>>,
     /// If marked as dirty, parse on next request
@@ -80,7 +81,7 @@ pub struct SymbolCache {
 impl SymbolCache {
     pub fn new() -> Self {
         SymbolCache {
-            matcher: SkimMatcherV2::default(),
+            matcher: Matcher::new(Config::DEFAULT),
             cache: BTreeMap::new(),
             dirty_flags: BTreeMap::new(),
         }
@@ -240,12 +241,20 @@ impl SymbolCache {
         )
     }
-    pub fn get_fuzzy_matched_symbols(&self, query: &str) -> Vec<SymbolInformation> {
+    pub fn get_fuzzy_matched_symbols(&mut self, query: &str) -> Vec<SymbolInformation> {
+        let pat = Pattern::new(
+            query,
+            CaseMatching::Smart,
+            Normalization::Smart,
+            AtomKind::Fuzzy,
+        );
         self.cache
             .iter()
             .flat_map(|(uri, symbols)| symbols.iter().map(|s| s.clone().to_symbol_information(uri)))
             .filter_map(|s| {
-                self.matcher.fuzzy_match(&s.name, query)?;
+                let mut buf = Vec::new();
+                let name = Utf32Str::new(&s.name, &mut buf);
+                pat.score(name, &mut self.matcher)?;
                 Some(s)
             })
             .collect()
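
The filtering step in the last hunk keeps a symbol only when the matcher returns a score, and uses `?` inside `filter_map` to drop everything else. The sketch below shows the same shape using only the standard library. `toy_fuzzy_score` and `filter_symbols` are hypothetical names invented for this example, and the subsequence scoring rule is a simplified stand-in, not Nucleo's algorithm. The "smart" case handling here (case-insensitive unless the query contains an uppercase letter) is an assumption about what `CaseMatching::Smart` is meant to convey, and the folding is ASCII-only.

```rust
/// Toy stand-in for a fuzzy matcher: returns `Some(score)` when every character
/// of `query` appears in `candidate` in order, and `None` otherwise.
/// NOT Nucleo's algorithm; the scoring rule is made up for illustration.
fn toy_fuzzy_score(candidate: &str, query: &str) -> Option<u32> {
    // Assumed "smart" case rule: ignore case unless the query has an uppercase letter.
    let case_sensitive = query.chars().any(char::is_uppercase);
    // ASCII-only case folding, applied to both sides of each comparison.
    let fold = |c: char| if case_sensitive { c } else { c.to_ascii_lowercase() };

    let mut score = 0u32;
    let mut prev_matched = false;
    let mut haystack = candidate.chars();
    for q in query.chars() {
        // Advance through the candidate until this query character is found.
        loop {
            let h = haystack.next()?; // candidate exhausted: no match
            if fold(h) == fold(q) {
                // Consecutive matches earn more than scattered ones.
                score += if prev_matched { 2 } else { 1 };
                prev_matched = true;
                break;
            }
            prev_matched = false;
        }
    }
    Some(score)
}

/// Keep only the names the toy matcher accepts, preserving input order.
fn filter_symbols<'a>(names: &[&'a str], query: &str) -> Vec<&'a str> {
    names
        .iter()
        .copied()
        .filter_map(|name| {
            toy_fuzzy_score(name, query)?; // `None` drops the name, as in the diff
            Some(name)
        })
        .collect()
}
```

Given the names `get_fuzzy_matched_symbols`, `SymbolCache`, and `span_to_range`, the query `sym` keeps the first two, while `Sym` keeps only `SymbolCache` because the uppercase letter makes the match case-sensitive.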