Improves startup time when using std-lib (#13842)

Updated summary for commit
[612e0e2](612e0e2160)
- While folks are welcome to read through the entire comments, the core
information is summarized here.

# Description

This PR drastically improves startup times of Nushell by only parsing a
single submodule of the Standard Library that provides the `banner` and
`pwd` commands. All other Standard Library commands and submodules are
parsed when imported by the user. This cuts startup times by more than
60%.

At the moment, we have stopped adding to `std-lib` because every
addition adds a small amount to the Nushell startup time.
With this change, we should once again be able to allow new
functionality to be added to the Standard Library without it impacting
`nu` startup times.

# User-Facing Changes

* Nushell now starts about 60% faster
* Breaking change: The `dirs` (Shells) aliases now print a warning
that they will not be auto-loaded in the following release, along
with instructions on how to restore them (and disable the message)
* The `use std <submodule> *` syntax is available for convenience, but
should be avoided in scripts as it parses the entire `std` module and
all other submodules and places it in scope. The correct syntax to
*just* load a submodule is `use std/<submodule> *` (asterisk optional).
The slash is important. This will be documented.
* `use std *` can be used for convenience to load all of the library but
still incurs the full loading-time.
* `std/dirs`: Semi-breaking change. The `dirs` command replaces the
`show` command. This is more in line with the directory-stack
functionality found in other shells. Existing users will not be impacted
by this as the alias (`shells`) remains the same.

* Breaking change (though it probably only impacts maintainers of
`std`): The virtual path for the standard library has changed. It could
previously be imported using its virtual path (and technically, this
would have been the correct way to do it):

  ```nu
  use NU_STDLIB_VIRTUAL_DIR/std
  ```

  The path is now simply `std/`:

  ```nu
  use std
  ```

  All submodules have moved accordingly.
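
A minimal sketch of the import forms described above, using the `assert` submodule purely as an example (the updated tests in this commit import it the same way). The two forms are alternatives; a script would normally use only the first:

```nu
# Preferred: parses and loads only the `assert` submodule (note the slash).
use std/assert
assert equal (1 + 1) 2

# Convenience form: gives the same `assert` commands, but parses all of
# `std` and its other submodules first, so it pays the full loading time.
use std assert
assert equal (1 + 1) 2
```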
  

# Timings

Comparisons below were made:

* In a temporary, clean config directory using `$env.XDG_CONFIG_HOME =
(mktemp -d)`.
* `nu` was run with a release build
* `nu` was run one time to generate the default `config.nu` (etc.) files
(otherwise timings would include the user prompt)
* The shell was exited and then restarted several times to get timing
samples
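
The procedure above can be sketched roughly as follows. This is an interactive outline rather than a script, and it assumes `nu` on the `PATH` is the release build being measured:

```nu
# In the parent shell: use a throwaway config directory so that existing
# user configuration does not affect the measurement.
$env.XDG_CONFIG_HOME = (mktemp -d)

# First launch: let Nushell generate the default config files, then exit.
nu

# Later launches: at the new prompt, evaluate `$nu.startup-time` to get one
# sample, exit, and repeat for further samples.
nu
```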

(Note: The old timings are based on 0.97 rather than 0.98, but should
still be close enough to be representative.)

| Scenario | `$nu.startup-time` |
| --- | --- |
| 0.97.2 ([aaaab8e](aaaab8e070)) without this PR | 23ms - 24ms |
| This PR with deprecated commands | 9ms - <11ms |
| This PR after deprecated commands are removed in following release | 8ms - <10ms |
| Final PR (remove deprecated), using `--no-std-lib` | 6.1ms - 6.4ms |
| Final PR (remove deprecated), using `--no-config-file` | 3.1ms - 3.6ms |
| Final PR (remove deprecated), using `--no-config-file --no-std-lib` | 1ms - 1.5ms |

*These last two timings point to the opportunity for further
optimization (see comment in thread below; will link once I write it).*

# Implementation details for future maintenance

* `use std banner` is a ridiculously deceptive call. That call parses
and imports *all* of `std` into scope. Simply replacing it with `use
std/core *` is essentially what saves ~14-15ms. This *only* imports the
submodule with the `banner` and `pwd` commands.

* From the code-comments, the reason that `NU_STDLIB_VIRTUAL_DIR` was
used as a prefix was so that there wouldn't be an issue if a user had a
`./std/mod.nu` in the current directory. This does **not** appear to be
an issue. After removing the prefix, I tested with both a relative
module as well as one in the `$env.NU_LIB_DIRS` path, and in all cases
the *internal* `std` still took precedence.

* By removing the prefix, users can now `use std` (and variants) without
requiring that it already be parsed and in scope.

* In the next release, we'll stop autoloading the `dirs` (shells)
functionality. While this only costs an additional 1-1.5ms, I think it's
better moved to the `config.nu` where the user can optionally remove it.
The main reason is its use of aliases (which have also caused issues) -
The `n`, `p`, and `g` short-commands are valuable real-estate, and users
may want to map these to something else.
  
For this release, there's a `deprecated_dirs` module that is still
autoloaded. As with the top-level commands, use of these will give a
deprecation warning with instructions on how to handle going forward.

To help with this, the aliases were moved to their own submodule inside
the `dirs` module.

* Also sneaks in a small change where the top-level `dirs` command is
now the replacement for `dirs show`

* Fixed a double-import of `assert` in `dirs.nu`
* The `show_banner` step is replaced with simply `banner` rather than
re-importing it.

* A `virtual_path` may now be referenced with either a forward-slash or
a backward-slash on Windows. This allows `use std/<submodule>` to work
on all platforms.
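
The `dirs` change described above can be sketched as follows. This is only an illustration based on the updated tests; `$nu.temp-path` is used here as an arbitrary existing directory:

```nu
# Load only the directory-stack submodule.
use std/dirs

# Add a directory to the stack.
dirs add $nu.temp-path

# Previously `dirs show`: the top-level command now lists the stack as a
# table with `active` and `path` columns.
dirs
```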

# Performance side-notes:

* Future parsing and/or IR improvements should improve performance even
further.
* While the existing load time penalty of `std-lib` was not noticeable
on many systems, Nushell runs on a wide variety of hardware and OS
platforms. Slower platforms will naturally see a bigger jump in
performance here. For users starting multiple Nushell sessions
frequently (e.g., `tmux`, Zellij, `screen`, et al.) it is recommended
to keep total startup time (including user configuration) under ~250ms.

# Tests + Formatting

* All tests are green

* Updated tests:
  - Removed the test that confirmed that `std` was loaded (since we
    don't).
  - Removed the `shells` test since it is not autoloaded. Main `dirs.nu`
    functionality is tested through `stdlib-test`.
  - Many tests assumed that the library was fully loaded, because it was
    (even though we didn't intend for it to be). Fixed those tests.
  - Tests now import only the necessary submodules (e.g., `use
    std/assert`, rather than `use std assert`)
  - Some tests *thought* they were loading `std/log`, but were doing so
    improperly. This was masked by the now-fixed
    "load-everything-into-scope bug". Local CI would pass due to the
    `$env.NU_LOG_<...>` variables being inherited from the calling
    process, but would fail in the "clean" GitHub CI environment. These
    tests have also been fixed.

 * Added additional tests for the changes

# After Submitting

Will update the Standard Library doc page
Douglas
2024-10-03 07:28:22 -04:00
committed by GitHub
parent 157494e803
commit 00709fc5bd
28 changed files with 578 additions and 375 deletions


@ -1,4 +1,4 @@
use std *
use std/assert
def run [
system_level,
@ -6,9 +6,9 @@ def run [
--short
] {
if $short {
^$nu.current-exe --no-config-file --commands $'use std; NU_LOG_LEVEL=($system_level) std log ($message_level) --short "test message"'
^$nu.current-exe --no-config-file --commands $'use std; use std/log; NU_LOG_LEVEL=($system_level) log ($message_level) --short "test message"'
} else {
^$nu.current-exe --no-config-file --commands $'use std; NU_LOG_LEVEL=($system_level) std log ($message_level) "test message"'
^$nu.current-exe --no-config-file --commands $'use std; use std/log; NU_LOG_LEVEL=($system_level) log ($message_level) "test message"'
}
| complete | get --ignore-errors stderr
}


@ -1,5 +1,4 @@
use std *
use std log *
use std/assert
use commons.nu *
def run-command [
@ -12,12 +11,12 @@ def run-command [
] {
if ($level_prefix | is-empty) {
if ($ansi | is-empty) {
^$nu.current-exe --no-config-file --commands $'use std; NU_LOG_LEVEL=($system_level) std log custom "($message)" "($format)" ($log_level)'
^$nu.current-exe --no-config-file --commands $'use std/log; NU_LOG_LEVEL=($system_level) log custom "($message)" "($format)" ($log_level)'
} else {
^$nu.current-exe --no-config-file --commands $'use std; NU_LOG_LEVEL=($system_level) std log custom "($message)" "($format)" ($log_level) --ansi "($ansi)"'
^$nu.current-exe --no-config-file --commands $'use std/log; NU_LOG_LEVEL=($system_level) log custom "($message)" "($format)" ($log_level) --ansi "($ansi)"'
}
} else {
^$nu.current-exe --no-config-file --commands $'use std; NU_LOG_LEVEL=($system_level) std log custom "($message)" "($format)" ($log_level) --level-prefix "($level_prefix)" --ansi "($ansi)"'
^$nu.current-exe --no-config-file --commands $'use std/log; NU_LOG_LEVEL=($system_level) log custom "($message)" "($format)" ($log_level) --level-prefix "($level_prefix)" --ansi "($ansi)"'
}
| complete | get --ignore-errors stderr
}
@ -31,6 +30,7 @@ def errors_during_deduction [] {
#[test]
def valid_calls [] {
use std/log *
assert equal (run-command "DEBUG" "msg" "%MSG%" 25 --level-prefix "abc" --ansi (ansi default) | str trim --right) "msg"
assert equal (run-command "DEBUG" "msg" "%LEVEL% %MSG%" 20 | str trim --right) $"((log-prefix).INFO) msg"
assert equal (run-command "DEBUG" "msg" "%LEVEL% %MSG%" --level-prefix "abc" 20 | str trim --right) "abc msg"
@ -39,6 +39,7 @@ def valid_calls [] {
#[test]
def log-level_handling [] {
use std/log *
assert equal (run-command "DEBUG" "msg" "%LEVEL% %MSG%" 20 | str trim --right) $"((log-prefix).INFO) msg"
assert equal (run-command "WARNING" "msg" "%LEVEL% %MSG%" 20 | str trim --right) ""
}


@ -1,5 +1,6 @@
use std *
use std log *
use std/log *
use std/assert
use commons.nu *
def run-command [
@ -10,9 +11,9 @@ def run-command [
--short
] {
if $short {
^$nu.current-exe --no-config-file --commands $'use std; NU_LOG_LEVEL=($system_level) std log ($message_level) --format "($format)" --short "($message)"'
^$nu.current-exe --no-config-file --commands $'use std; use std/log; NU_LOG_LEVEL=($system_level) log ($message_level) --format "($format)" --short "($message)"'
} else {
^$nu.current-exe --no-config-file --commands $'use std; NU_LOG_LEVEL=($system_level) std log ($message_level) --format "($format)" "($message)"'
^$nu.current-exe --no-config-file --commands $'use std; use std/log; NU_LOG_LEVEL=($system_level) log ($message_level) --format "($format)" "($message)"'
}
| complete | get --ignore-errors stderr
}


@ -1,5 +1,6 @@
use std *
use std log *
use std/assert
use std/log
use std/log *
#[test]
def env_log-ansi [] {


@ -1,4 +1,5 @@
use std *
use std/assert
#[test]
def assert_basic [] {


@ -0,0 +1,7 @@
use std/assert
#[test]
def banner [] {
use std/core
assert ((core banner | lines | length) == 15)
}


@ -1,6 +1,5 @@
use std assert
use std assert
use std log
use std/assert
use std/log
# A couple of nuances to understand when testing module that exports environment:
# Each 'use' for that module in the test script will execute the def --env block.
@ -48,7 +47,7 @@ def dirs_command [] {
# must execute these uses for the UOT commands *after* the test and *not* just put them at top of test module.
# the def --env gets messed up
use std dirs
use std/dirs
# Stack: [BASE]
assert equal [$c.base_path] $env.DIRS_LIST "list is just pwd after initialization"
@ -80,7 +79,7 @@ def dirs_command [] {
assert length $env.DIRS_LIST 2 "drop removes from list"
assert equal $env.PWD $c.path_b "drop changes PWD to previous in list (before dropped element)"
assert equal (dirs show) [[active path]; [false $c.base_path] [true $c.path_b]] "show table contains expected information"
assert equal (dirs) [[active path]; [false $c.base_path] [true $c.path_b]] "show table contains expected information"
# Stack becomes: [BASE]
dirs drop
@ -96,7 +95,7 @@ def dirs_next [] {
cd $c.base_path
assert equal $env.PWD $c.base_path "test setup"
use std dirs
use std/dirs
cur_dir_check $c.base_path "use module test setup"
dirs add $c.path_a $c.path_b
@ -117,7 +116,7 @@ def dirs_cd [] {
# must set PWD *before* doing `use` that will run the def --env block in dirs module.
cd $c.base_path
use std dirs
use std/dirs
cur_dir_check $c.base_path "use module test setup"
@ -139,7 +138,7 @@ def dirs_cd [] {
def dirs_goto_bug10696 [] {
let $c = $in
cd $c.base_path
use std dirs
use std/dirs
dirs add $c.path_a
cd $c.path_b
@ -153,7 +152,7 @@ def dirs_goto_bug10696 [] {
def dirs_goto [] {
let $c = $in
cd $c.base_path
use std dirs
use std/dirs
# check that goto can move *from* any position in the ring *to* any other position (correctly)
@ -174,4 +173,7 @@ def dirs_goto [] {
assert equal $env.PWD ($exp_dir | get $other_pos) "goto changed working directory correctly"
}
}
# check that 'dirs goto' with no argument maps to `dirs` (main)
assert length (dirs goto) 3
}


@ -1,5 +1,5 @@
use std assert
use std dt *
use std/assert
use std/dt *
#[test]
def equal_times [] {


@ -1,4 +1,4 @@
use std assert
use std/assert
def test_data_multiline [] {
let lines = [
@ -19,7 +19,7 @@ def test_data_multiline [] {
#[test]
def from_ndjson_multiple_objects [] {
use std formats *
use std/formats *
let result = test_data_multiline | from ndjson
let expect = [{a:1},{a:2},{a:3},{a:4},{a:5},{a:6}]
assert equal $result $expect "could not convert from NDJSON"
@ -27,7 +27,7 @@ def from_ndjson_multiple_objects [] {
#[test]
def from_ndjson_single_object [] {
use std formats *
use std/formats *
let result = '{"a": 1}' | from ndjson
let expect = [{a:1}]
assert equal $result $expect "could not convert from NDJSON"
@ -35,13 +35,13 @@ def from_ndjson_single_object [] {
#[test]
def from_ndjson_invalid_object [] {
use std formats *
use std/formats *
assert error { '{"a":1' | from ndjson }
}
#[test]
def from_jsonl_multiple_objects [] {
use std formats *
use std/formats *
let result = test_data_multiline | from jsonl
let expect = [{a:1},{a:2},{a:3},{a:4},{a:5},{a:6}]
assert equal $result $expect "could not convert from JSONL"
@ -49,7 +49,7 @@ def from_jsonl_multiple_objects [] {
#[test]
def from_jsonl_single_object [] {
use std formats *
use std/formats *
let result = '{"a": 1}' | from jsonl
let expect = [{a:1}]
assert equal $result $expect "could not convert from JSONL"
@ -57,13 +57,13 @@ def from_jsonl_single_object [] {
#[test]
def from_jsonl_invalid_object [] {
use std formats *
use std/formats *
assert error { '{"a":1' | from jsonl }
}
#[test]
def to_ndjson_multiple_objects [] {
use std formats *
use std/formats *
let result = [{a:1},{a:2},{a:3},{a:4},{a:5},{a:6}] | to ndjson | str trim
let expect = test_data_multiline
assert equal $result $expect "could not convert to NDJSON"
@ -71,7 +71,7 @@ def to_ndjson_multiple_objects [] {
#[test]
def to_ndjson_single_object [] {
use std formats *
use std/formats *
let result = [{a:1}] | to ndjson | str trim
let expect = "{\"a\":1}"
assert equal $result $expect "could not convert to NDJSON"
@ -79,7 +79,7 @@ def to_ndjson_single_object [] {
#[test]
def to_jsonl_multiple_objects [] {
use std formats *
use std/formats *
let result = [{a:1},{a:2},{a:3},{a:4},{a:5},{a:6}] | to jsonl | str trim
let expect = test_data_multiline
assert equal $result $expect "could not convert to JSONL"
@ -87,7 +87,7 @@ def to_jsonl_multiple_objects [] {
#[test]
def to_jsonl_single_object [] {
use std formats *
use std/formats *
let result = [{a:1}] | to jsonl | str trim
let expect = "{\"a\":1}"
assert equal $result $expect "could not convert to JSONL"


@ -1,5 +1,5 @@
use std assert
use std help
use std/assert
use std/help
#[test]
def show_help_on_commands [] {


@ -1,4 +1,5 @@
use std *
use std/assert
#[test]
def iter_find [] {


@ -1,8 +1,8 @@
use std
use std/lib
#[test]
def path_add [] {
use std assert
use std/assert
let path_name = if "PATH" in $env { "PATH" } else { "Path" }
@ -11,19 +11,19 @@ def path_add [] {
assert equal (get_path) []
std path add "/foo/"
lib path add "/foo/"
assert equal (get_path) (["/foo/"] | path expand)
std path add "/bar/" "/baz/"
lib path add "/bar/" "/baz/"
assert equal (get_path) (["/bar/", "/baz/", "/foo/"] | path expand)
load-env {$path_name: []}
std path add "foo"
std path add "bar" "baz" --append
lib path add "foo"
lib path add "bar" "baz" --append
assert equal (get_path) (["foo", "bar", "baz"] | path expand)
assert equal (std path add "fooooo" --ret) (["fooooo", "foo", "bar", "baz"] | path expand)
assert equal (lib path add "fooooo" --ret) (["fooooo", "foo", "bar", "baz"] | path expand)
assert equal (get_path) (["fooooo", "foo", "bar", "baz"] | path expand)
load-env {$path_name: []}
@ -35,18 +35,18 @@ def path_add [] {
android: "quux",
}
std path add $target_paths
lib path add $target_paths
assert equal (get_path) ([($target_paths | get $nu.os-info.name)] | path expand)
load-env {$path_name: [$"(["/foo", "/bar"] | path expand | str join (char esep))"]}
std path add "~/foo"
lib path add "~/foo"
assert equal (get_path) (["~/foo", "/foo", "/bar"] | path expand)
}
}
#[test]
def path_add_expand [] {
use std assert
use std/assert
# random paths to avoid collision, especially if left dangling on failure
let real_dir = $nu.temp-path | path join $"real-dir-(random chars)"
@ -63,25 +63,21 @@ def path_add_expand [] {
with-env {$path_name: []} {
def get_path [] { $env | get $path_name }
std path add $link_dir
lib path add $link_dir
assert equal (get_path) ([$link_dir])
}
rm $real_dir $link_dir
}
#[test]
def banner [] {
std assert ((std banner | lines | length) == 15)
}
#[test]
def repeat_things [] {
std assert error { "foo" | std repeat -1 }
use std/assert
assert error { "foo" | lib repeat -1 }
for x in ["foo", [1 2], {a: 1}] {
std assert equal ($x | std repeat 0) []
std assert equal ($x | std repeat 1) [$x]
std assert equal ($x | std repeat 2) [$x $x]
assert equal ($x | lib repeat 0) []
assert equal ($x | lib repeat 1) [$x]
assert equal ($x | lib repeat 2) [$x $x]
}
}


@ -1,5 +1,5 @@
use std log
use std assert
use std/log
use std/assert
#[before-each]
def before-each [] {


@ -0,0 +1,11 @@
use std/assert
export use std *
#[test]
def std_post_import [] {
assert length (scope commands | where name == "path add") 1
assert length (scope commands | where name == "ellie") 1
assert length (scope commands | where name == "repeat") 1
assert length (scope commands | where name == "formats from jsonl") 1
assert length (scope commands | where name == "dt datetime-diff") 1
}


@ -0,0 +1,11 @@
use std/assert
#[test]
def std_pre_import [] {
# These commands shouldn't exist without an import
assert length (scope commands | where name == "path add") 0
assert length (scope commands | where name == "ellie") 0
assert length (scope commands | where name == "repeat") 0
assert length (scope commands | where name == "from jsonl") 0
assert length (scope commands | where name == "datetime-diff") 0
}


@ -1,7 +1,5 @@
use std xml xaccess
use std xml xupdate
use std xml xinsert
use std assert
use std/xml *
use std/assert
#[before-each]
def before-each [] {