Files
nushell/crates/nu-plugin-engine/src/persistent.rs
Piepmatz 66bc0542e0 Refactor I/O Errors (#14927)
<!--
if this PR closes one or more issues, you can automatically link the PR
with
them by using one of the [*linking
keywords*](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword),
e.g.
- this PR should close #xxxx
- fixes #xxxx

you can also mention related issues, PRs or discussions!
-->

# Description
<!--
Thank you for improving Nushell. Please, check our [contributing
guide](../CONTRIBUTING.md) and talk to the core team before making major
changes.

Description of your pull request goes here. **Provide examples and/or
screenshots** if your changes affect the user experience.
-->

As mentioned in #10698, we have too many `ShellError` variants, with
some even overlapping in meaning. This PR simplifies and improves I/O
error handling by restructuring `ShellError` related to I/O issues.
Previously, `ShellError::IOError` only contained a message string,
making it convenient but overly generic. It was widely used without
providing spans (#4323).

This PR introduces a new `ShellError::Io` variant that consolidates
multiple I/O-related errors (except for `ShellError::NetworkFailure`,
which remains distinct for now). The new `ShellError::Io` variant
replaces the following:

- `FileNotFound`
- `FileNotFoundCustom`
- `IOInterrupted`
- `IOError`
- `IOErrorSpanned`
- `NotADirectory`
- `DirectoryNotFound`
- `MoveNotPossible`
- `CreateNotPossible`
- `ChangeAccessTimeNotPossible`
- `ChangeModifiedTimeNotPossible`
- `RemoveNotPossible`
- `ReadingFile`

## The `IoError`
`IoError` includes the following fields:

1. **`kind`**: Extends `std::io::ErrorKind` to specify the type of I/O
error without needing new `ShellError` variants. This aligns with the
approach used in `std::io::Error`. This adds a second dimension to error
reporting by combining the `kind` field with `ShellError` variants,
making it easier to describe errors in more detail. As proposed by
@kubouch in [#design-discussion on
Discord](https://discord.com/channels/601130461678272522/615329862395101194/1323699197165178930),
this helps reduce the number of `ShellError` variants. In the error
report, the `kind` field is displayed as the "source" of the error,
e.g., "I/O error," followed by the specific kind of I/O error.
2. **`span`**: A non-optional field to encourage providing spans for
better error reporting (#4323).
3. **`path`**: Optional `PathBuf` to give context about the file or
directory involved in the error (#7695). If provided, it’s shown as a
help entry in error reports.
4. **`additional_context`**: Allows adding custom messages when the
span, kind, and path are insufficient. This is rendered in the error
report at the labeled span.
5. **`location`**: Sometimes, I/O errors occur in the engine itself and
are not caused directly by user input. In such cases, if we don’t have a
span and must set it to `Span::unknown()`, we need another way to
reference the error. For this, the `location` field uses the new
`Location` struct, which records the Rust file and line number where the
error occurred. This ensures that we at least know the Rust code
location that failed, helping with debugging. To make this work, a new
`location!` macro was added, which retrieves `file!`, `line!`, and
`column!` values accurately. If `Location::new` is used directly, it
issues a warning to remind developers to use the macro instead, ensuring
consistent and correct usage.

### Constructor Behavior
`IoError` provides five constructor methods:
- `new` and `new_with_additional_context`: Used for errors caused by
user input and require a valid (non-unknown) span to ensure precise
error reporting.
- `new_internal` and `new_internal_with_path`: Used for internal errors
where a span is not available. These methods require additional context
and the `Location` struct to pinpoint the source of the error in the
engine code.
- `factory`: Returns a closure that maps an `std::io::Error` to an
`IoError`. This is useful for handling multiple I/O errors that share
the same span and path, streamlining error handling in such cases.

## New Report Look
This is simulation how the I/O errors look like (the `open crates` is
simulated to show how internal errors are referenced now):
![Screenshot 2025-01-25
190426](https://github.com/user-attachments/assets/a41b6aa6-a440-497d-bbcc-3ac0121c9226)

## `Span::test_data()`
To enable better testing, `Span::test_data()` now returns a value
distinct from `Span::unknown()`. Both `Span::test_data()` and
`Span::unknown()` refer to invalid source code, but having a separate
value for test data helps identify issues during testing while keeping
spans unique.

## Cursed Sneaky Error Transfers
I removed the conversions between `std::io::Error` and `ShellError` as
they often removed important information and were used too broadly to
handle I/O errors. This also removed the problematic implementation
found here:

7ea4895513/crates/nu-protocol/src/errors/shell_error.rs (L1534-L1583)

which hid some downcasting from I/O errors and made it hard to trace
where `ShellError` was converted into `std::io::Error`. To address this,
I introduced a new struct called `ShellErrorBridge`, which explicitly
defines this transfer behavior. With `ShellErrorBridge`, we can now
easily grep the codebase to locate and manage such conversions.

## Miscellaneous
- Removed the OS error added in #14640, as it’s no longer needed.
- Improved error messages in `glob_from` (#14679).
- Trying to open a directory with `open` caused a permissions denied
error (it's just what the OS provides). I added a `is_dir` check to
provide a better error in that case.

# User-Facing Changes
<!-- List of all changes that impact the user experience here. This
helps us keep track of breaking changes. -->

- Error outputs now include more detailed information and are formatted
differently, including updated error codes.
- The structure of `ShellError` has changed, requiring plugin authors
and embedders to update their implementations.

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use toolkit.nu; toolkit test stdlib"` to run the
tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

I updated tests to account for the new I/O error structure and
formatting changes.

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->

This PR closes #7695 and closes #14892 and partially addresses #4323 and
#10698.

---------

Co-authored-by: Darren Schroeder <343840+fdncred@users.noreply.github.com>
2025-01-28 16:03:31 -06:00

373 lines
13 KiB
Rust

use crate::{
init::{create_command, make_plugin_interface},
PluginGc,
};
use super::{PluginInterface, PluginSource};
use nu_plugin_core::CommunicationMode;
use nu_protocol::{
engine::{EngineState, Stack},
shell_error::io::IoError,
HandlerGuard, Handlers, PluginGcConfig, PluginIdentity, PluginMetadata, RegisteredPlugin,
ShellError, Span,
};
use std::{
collections::HashMap,
sync::{Arc, Mutex},
};
/// A box that can keep a plugin that was spawned persistent for further uses. The plugin may or
/// may not be currently running. [`.get()`] gets the currently running plugin, or spawns it if it's
/// not running.
#[derive(Debug)]
pub struct PersistentPlugin {
/// Identity (filename, shell, name) of the plugin
identity: PluginIdentity,
/// Mutable state
mutable: Mutex<MutableState>,
}
/// The mutable state for the persistent plugin. This should all be behind one lock to prevent lock
/// order problems.
#[derive(Debug)]
struct MutableState {
/// Reference to the plugin if running
running: Option<RunningPlugin>,
/// Metadata for the plugin, e.g. version.
metadata: Option<PluginMetadata>,
/// Plugin's preferred communication mode (if known)
preferred_mode: Option<PreferredCommunicationMode>,
/// Garbage collector config
gc_config: PluginGcConfig,
/// RAII guard for this plugin's signal handler
signal_guard: Option<HandlerGuard>,
}
#[derive(Debug, Clone, Copy)]
enum PreferredCommunicationMode {
Stdio,
#[cfg(feature = "local-socket")]
LocalSocket,
}
#[derive(Debug)]
struct RunningPlugin {
/// Interface (which can be cloned) to the running plugin
interface: PluginInterface,
/// Garbage collector for the plugin
gc: PluginGc,
}
impl PersistentPlugin {
/// Create a new persistent plugin. The plugin will not be spawned immediately.
pub fn new(identity: PluginIdentity, gc_config: PluginGcConfig) -> PersistentPlugin {
PersistentPlugin {
identity,
mutable: Mutex::new(MutableState {
running: None,
metadata: None,
preferred_mode: None,
gc_config,
signal_guard: None,
}),
}
}
/// Get the plugin interface of the running plugin, or spawn it if it's not currently running.
///
/// Will call `envs` to get environment variables to spawn the plugin if the plugin needs to be
/// spawned.
pub fn get(
self: Arc<Self>,
envs: impl FnOnce() -> Result<HashMap<String, String>, ShellError>,
) -> Result<PluginInterface, ShellError> {
let mut mutable = self.mutable.lock().map_err(|_| ShellError::NushellFailed {
msg: format!(
"plugin `{}` mutex poisoned, probably panic during spawn",
self.identity.name()
),
})?;
if let Some(ref running) = mutable.running {
// It exists, so just clone the interface
Ok(running.interface.clone())
} else {
// Try to spawn. On success, `mutable.running` should have been set to the new running
// plugin by `spawn()` so we just then need to clone the interface from there.
//
// We hold the lock the whole time to prevent others from trying to spawn and ending
// up with duplicate plugins
//
// TODO: We should probably store the envs somewhere, in case we have to launch without
// envs (e.g. from a custom value)
let envs = envs()?;
let result = self.clone().spawn(&envs, &mut mutable);
// Check if we were using an alternate communication mode and may need to fall back to
// stdio.
if result.is_err()
&& !matches!(
mutable.preferred_mode,
Some(PreferredCommunicationMode::Stdio)
)
{
log::warn!("{}: Trying again with stdio communication because mode {:?} failed with {result:?}",
self.identity.name(),
mutable.preferred_mode);
// Reset to stdio and try again, but this time don't catch any error
mutable.preferred_mode = Some(PreferredCommunicationMode::Stdio);
self.clone().spawn(&envs, &mut mutable)?;
}
Ok(mutable
.running
.as_ref()
.ok_or_else(|| ShellError::NushellFailed {
msg: "spawn() succeeded but didn't set interface".into(),
})?
.interface
.clone())
}
}
/// Run the plugin command, then set up and set `mutable.running` to the new running plugin.
fn spawn(
self: Arc<Self>,
envs: &HashMap<String, String>,
mutable: &mut MutableState,
) -> Result<(), ShellError> {
// Make sure `running` is set to None to begin
if let Some(running) = mutable.running.take() {
// Stop the GC if there was a running plugin
running.gc.stop_tracking();
}
let source_file = self.identity.filename();
// Determine the mode to use based on the preferred mode
let mode = match mutable.preferred_mode {
// If not set, we try stdio first and then might retry if another mode is supported
Some(PreferredCommunicationMode::Stdio) | None => CommunicationMode::Stdio,
// Local socket only if enabled
#[cfg(feature = "local-socket")]
Some(PreferredCommunicationMode::LocalSocket) => {
CommunicationMode::local_socket(source_file)
}
};
let mut plugin_cmd = create_command(source_file, self.identity.shell(), &mode);
// We need the current environment variables for `python` based plugins
// Or we'll likely have a problem when a plugin is implemented in a virtual Python environment.
plugin_cmd.envs(envs);
let program_name = plugin_cmd.get_program().to_os_string().into_string();
// Before running the command, prepare communication
let comm = mode.serve()?;
// Run the plugin command
let child = plugin_cmd.spawn().map_err(|err| {
let error_msg = match err.kind() {
std::io::ErrorKind::NotFound => match program_name {
Ok(prog_name) => {
format!("Can't find {prog_name}, please make sure that {prog_name} is in PATH.")
}
_ => {
format!("Error spawning child process: {err}")
}
},
_ => {
format!("Error spawning child process: {err}")
}
};
ShellError::PluginFailedToLoad { msg: error_msg }
})?;
// Start the plugin garbage collector
let gc = PluginGc::new(mutable.gc_config.clone(), &self).map_err(|err| {
IoError::new_with_additional_context(
err.kind(),
Span::unknown(),
None,
"Could not start plugin gc",
)
})?;
let pid = child.id();
let interface = make_plugin_interface(
child,
comm,
Arc::new(PluginSource::new(self.clone())),
Some(pid),
Some(gc.clone()),
)?;
// If our current preferred mode is None, check to see if the plugin might support another
// mode. If so, retry spawn() with that mode
#[cfg(feature = "local-socket")]
if mutable.preferred_mode.is_none()
&& interface
.protocol_info()?
.supports_feature(&nu_plugin_protocol::Feature::LocalSocket)
{
log::trace!(
"{}: Attempting to upgrade to local socket mode",
self.identity.name()
);
// Stop the GC we just created from tracking so that we don't accidentally try to
// stop the new plugin
gc.stop_tracking();
// Set the mode and try again
mutable.preferred_mode = Some(PreferredCommunicationMode::LocalSocket);
return self.spawn(envs, mutable);
}
mutable.running = Some(RunningPlugin { interface, gc });
Ok(())
}
fn stop_internal(&self, reset: bool) -> Result<(), ShellError> {
let mut mutable = self.mutable.lock().map_err(|_| ShellError::NushellFailed {
msg: format!(
"plugin `{}` mutable mutex poisoned, probably panic during spawn",
self.identity.name()
),
})?;
// If the plugin is running, stop its GC, so that the GC doesn't accidentally try to stop
// a future plugin
if let Some(ref running) = mutable.running {
running.gc.stop_tracking();
}
// We don't try to kill the process or anything, we just drop the RunningPlugin. It should
// exit soon after
mutable.running = None;
// If this is a reset, we should also reset other learned attributes like preferred_mode
if reset {
mutable.preferred_mode = None;
}
Ok(())
}
}
impl RegisteredPlugin for PersistentPlugin {
fn identity(&self) -> &PluginIdentity {
&self.identity
}
fn is_running(&self) -> bool {
// If the lock is poisoned, we return false here. That may not be correct, but this is a
// failure state anyway that would be noticed at some point
self.mutable
.lock()
.map(|m| m.running.is_some())
.unwrap_or(false)
}
fn pid(&self) -> Option<u32> {
// Again, we return None for a poisoned lock.
self.mutable
.lock()
.ok()
.and_then(|r| r.running.as_ref().and_then(|r| r.interface.pid()))
}
fn stop(&self) -> Result<(), ShellError> {
self.stop_internal(false)
}
fn reset(&self) -> Result<(), ShellError> {
self.stop_internal(true)
}
fn metadata(&self) -> Option<PluginMetadata> {
self.mutable.lock().ok().and_then(|m| m.metadata.clone())
}
fn set_metadata(&self, metadata: Option<PluginMetadata>) {
if let Ok(mut mutable) = self.mutable.lock() {
mutable.metadata = metadata;
}
}
fn set_gc_config(&self, gc_config: &PluginGcConfig) {
if let Ok(mut mutable) = self.mutable.lock() {
// Save the new config for future calls
mutable.gc_config = gc_config.clone();
// If the plugin is already running, propagate the config change to the running GC
if let Some(gc) = mutable.running.as_ref().map(|running| running.gc.clone()) {
// We don't want to get caught holding the lock
drop(mutable);
gc.set_config(gc_config.clone());
gc.flush();
}
}
}
fn as_any(self: Arc<Self>) -> Arc<dyn std::any::Any + Send + Sync> {
self
}
fn configure_signal_handler(self: Arc<Self>, handlers: &Handlers) -> Result<(), ShellError> {
let guard = {
// We take a weakref to the plugin so that we don't create a cycle to the
// RAII guard that will be stored on the plugin.
let plugin = Arc::downgrade(&self);
handlers.register(Box::new(move |action| {
// write a signal packet through the PluginInterface if the plugin is alive and
// running
if let Some(plugin) = plugin.upgrade() {
if let Ok(mutable) = plugin.mutable.lock() {
if let Some(ref running) = mutable.running {
let _ = running.interface.signal(action);
}
}
}
}))?
};
if let Ok(mut mutable) = self.mutable.lock() {
mutable.signal_guard = Some(guard);
}
Ok(())
}
}
/// Anything that can produce a plugin interface.
pub trait GetPlugin: RegisteredPlugin {
/// Retrieve or spawn a [`PluginInterface`]. The `context` may be used for determining
/// environment variables to launch the plugin with.
fn get_plugin(
self: Arc<Self>,
context: Option<(&EngineState, &mut Stack)>,
) -> Result<PluginInterface, ShellError>;
}
impl GetPlugin for PersistentPlugin {
fn get_plugin(
self: Arc<Self>,
mut context: Option<(&EngineState, &mut Stack)>,
) -> Result<PluginInterface, ShellError> {
self.get(|| {
// Get envs from the context if provided.
let envs = context
.as_mut()
.map(|(engine_state, stack)| {
// We need the current environment variables for `python` based plugins. Or
// we'll likely have a problem when a plugin is implemented in a virtual Python
// environment.
let stack = &mut stack.start_collect_value();
nu_engine::env::env_to_strings(engine_state, stack)
})
.transpose()?;
Ok(envs.unwrap_or_default())
})
}
}