nushell/crates/nu-protocol/src/span.rs
Piepmatz 66bc0542e0 Refactor I/O Errors (#14927)

# Description

As mentioned in #10698, we have too many `ShellError` variants, with
some even overlapping in meaning. This PR simplifies and improves I/O
error handling by restructuring `ShellError` related to I/O issues.
Previously, `ShellError::IOError` only contained a message string,
making it convenient but overly generic. It was widely used without
providing spans (#4323).

This PR introduces a new `ShellError::Io` variant that consolidates
multiple I/O-related errors (except for `ShellError::NetworkFailure`,
which remains distinct for now). The new `ShellError::Io` variant
replaces the following:

- `FileNotFound`
- `FileNotFoundCustom`
- `IOInterrupted`
- `IOError`
- `IOErrorSpanned`
- `NotADirectory`
- `DirectoryNotFound`
- `MoveNotPossible`
- `CreateNotPossible`
- `ChangeAccessTimeNotPossible`
- `ChangeModifiedTimeNotPossible`
- `RemoveNotPossible`
- `ReadingFile`

## The `IoError`
`IoError` includes the following fields:

1. **`kind`**: Extends `std::io::ErrorKind` to specify the type of I/O
error without needing new `ShellError` variants. This aligns with the
approach used in `std::io::Error`. This adds a second dimension to error
reporting by combining the `kind` field with `ShellError` variants,
making it easier to describe errors in more detail. As proposed by
@kubouch in [#design-discussion on
Discord](https://discord.com/channels/601130461678272522/615329862395101194/1323699197165178930),
this helps reduce the number of `ShellError` variants. In the error
report, the `kind` field is displayed as the "source" of the error,
e.g., "I/O error," followed by the specific kind of I/O error.
2. **`span`**: A non-optional field to encourage providing spans for
better error reporting (#4323).
3. **`path`**: Optional `PathBuf` to give context about the file or
directory involved in the error (#7695). If provided, it’s shown as a
help entry in error reports.
4. **`additional_context`**: Allows adding custom messages when the
span, kind, and path are insufficient. This is rendered in the error
report at the labeled span.
5. **`location`**: Sometimes, I/O errors occur in the engine itself and
are not caused directly by user input. In such cases, if we don’t have a
span and must set it to `Span::unknown()`, we need another way to
reference the error. For this, the `location` field uses the new
`Location` struct, which records the Rust file and line number where the
error occurred. This ensures that we at least know the Rust code
location that failed, helping with debugging. To make this work, a new
`location!` macro was added, which retrieves `file!`, `line!`, and
`column!` values accurately. If `Location::new` is used directly, it
issues a warning to remind developers to use the macro instead, ensuring
consistent and correct usage.
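The five fields above could be sketched roughly as follows. This is a simplified illustration, not the actual definition in `nu-protocol`: it uses `std::io::ErrorKind` directly instead of the extended kind type, the `Span` here is a stand-in, and `internal_not_found` is a hypothetical helper.

```rust
use std::io::ErrorKind;
use std::path::PathBuf;

/// Simplified stand-in for `nu_protocol::Span` (illustration only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
}

/// Rust source position where an internal error was raised.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Location {
    pub file: &'static str,
    pub line: u32,
    pub column: u32,
}

/// Captures the call site via `file!`, `line!`, and `column!`.
macro_rules! location {
    () => {
        Location {
            file: file!(),
            line: line!(),
            column: column!(),
        }
    };
}

/// Assumed layout mirroring the five fields listed above.
#[derive(Debug)]
pub struct IoError {
    pub kind: ErrorKind,
    pub span: Span,
    pub path: Option<PathBuf>,
    pub additional_context: Option<String>,
    pub location: Option<Location>,
}

/// Hypothetical helper: builds an internal error that has no user-facing span,
/// so the Rust call site is recorded instead.
pub fn internal_not_found(path: &str) -> IoError {
    IoError {
        kind: ErrorKind::NotFound,
        span: Span { start: 0, end: 0 },
        path: Some(PathBuf::from(path)),
        additional_context: Some("could not read config".to_string()),
        location: Some(location!()),
    }
}
```

Because `location!` expands at the call site, the recorded file and line point at the code that created the error rather than at a shared constructor.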

### Constructor Behavior
`IoError` provides five constructor methods:
- `new` and `new_with_additional_context`: Used for errors caused by
user input. Both require a valid (non-unknown) span to ensure precise
error reporting.
- `new_internal` and `new_internal_with_path`: Used for internal errors
where a span is not available. These methods require additional context
and the `Location` struct to pinpoint the source of the error in the
engine code.
- `factory`: Returns a closure that maps an `std::io::Error` to an
`IoError`. This is useful for handling multiple I/O errors that share
the same span and path, streamlining error handling in such cases.
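A minimal sketch of the `factory` idea, assuming a reduced `IoError` with only `kind`, `span`, and `path`. The signature below is an assumption made for illustration and may differ from the real one; `read_len` is a hypothetical caller.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Simplified stand-in for `nu_protocol::Span` (illustration only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
}

/// Reduced error type for this sketch.
#[derive(Debug)]
pub struct IoError {
    pub kind: io::ErrorKind,
    pub span: Span,
    pub path: Option<PathBuf>,
}

impl IoError {
    /// Returns a closure that turns any `std::io::Error` into an `IoError`
    /// carrying the same span and path.
    pub fn factory<'p>(span: Span, path: &'p Path) -> impl Fn(io::Error) -> IoError + 'p {
        move |err| IoError {
            kind: err.kind(),
            span,
            path: Some(path.to_path_buf()),
        }
    }
}

/// Hypothetical caller: two fallible steps share one factory through `map_err`.
pub fn read_len(span: Span, path: &Path) -> Result<u64, IoError> {
    let from_io_error = IoError::factory(span, path);
    let file = std::fs::File::open(path).map_err(&from_io_error)?;
    let metadata = file.metadata().map_err(&from_io_error)?;
    Ok(metadata.len())
}
```

Passing `&from_io_error` to `map_err` lets the same closure be reused for every fallible call in the function.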

## New Report Look
This is a simulation of how the I/O errors look (the `open crates` case
is simulated to show how internal errors are now referenced):
![Screenshot 2025-01-25
190426](https://github.com/user-attachments/assets/a41b6aa6-a440-497d-bbcc-3ac0121c9226)

## `Span::test_data()`
To enable better testing, `Span::test_data()` now returns a value
distinct from `Span::unknown()`. Both `Span::test_data()` and
`Span::unknown()` refer to invalid source code, but having a separate
value for test data helps identify issues during testing while keeping
spans unique.
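The distinction can be checked directly. The two constructors below copy the values used in `span.rs` in this commit. `has_known_span` is a hypothetical helper that stands in for any check that rejects `Span::unknown()`.

```rust
/// Reduced copy of `Span` with only the two constructors under discussion.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
}

impl Span {
    pub const fn unknown() -> Self {
        Self { start: 0, end: 0 }
    }

    pub const fn test_data() -> Self {
        Self {
            start: usize::MAX / 2,
            end: usize::MAX / 2,
        }
    }
}

/// Hypothetical guard: accepts every span except the unknown one.
pub fn has_known_span(span: Span) -> bool {
    span != Span::unknown()
}
```

With distinct values, a test that passes `Span::test_data()` gets through such a guard, while code that falls back to `Span::unknown()` is still caught.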

## Cursed Sneaky Error Transfers
I removed the conversions between `std::io::Error` and `ShellError` as
they often removed important information and were used too broadly to
handle I/O errors. This also removed the problematic implementation
found here:

7ea4895513/crates/nu-protocol/src/errors/shell_error.rs (L1534-L1583)

which hid some downcasting from I/O errors and made it hard to trace
where `ShellError` was converted into `std::io::Error`. To address this,
I introduced a new struct called `ShellErrorBridge`, which explicitly
defines this transfer behavior. With `ShellErrorBridge`, we can now
easily grep the codebase to locate and manage such conversions.
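The bridging pattern might look like the sketch below. It only illustrates the general technique of wrapping a custom error inside `std::io::Error` and downcasting it back. `ShellError` is a stand-in type here, and `into_io` and `from_io` are hypothetical names, not the actual `ShellErrorBridge` API.

```rust
use std::error::Error;
use std::fmt;
use std::io;

/// Hypothetical stand-in for `ShellError` (illustration only).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ShellError(pub String);

/// Explicit wrapper that carries a `ShellError` through `std::io::Error`.
#[derive(Debug)]
pub struct ShellErrorBridge(pub ShellError);

impl fmt::Display for ShellErrorBridge {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", (self.0).0)
    }
}

impl Error for ShellErrorBridge {}

/// Wraps a shell error so it can pass through an API that returns `io::Error`.
pub fn into_io(err: ShellError) -> io::Error {
    io::Error::new(io::ErrorKind::Other, ShellErrorBridge(err))
}

/// Recovers the original shell error, or hands back the untouched `io::Error`
/// when it does not contain a bridge.
pub fn from_io(err: io::Error) -> Result<ShellError, io::Error> {
    let recovered = err
        .get_ref()
        .and_then(|inner| inner.downcast_ref::<ShellErrorBridge>())
        .map(|bridge| bridge.0.clone());
    recovered.ok_or(err)
}
```

Every such conversion now names the bridge type explicitly, so searching for it finds all of them.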

## Miscellaneous
- Removed the OS error added in #14640, as it’s no longer needed.
- Improved error messages in `glob_from` (#14679).
- Trying to open a directory with `open` caused a permission denied
error (it's just what the OS provides). I added an `is_dir` check to
provide a better error in that case.
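The `is_dir` check could be sketched as below. `OpenError` and `open_file` are hypothetical names for illustration, and the real `open` command reports the problem through `ShellError::Io` instead.

```rust
use std::fs::File;
use std::io;
use std::path::Path;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum OpenError {
    /// The path points at a directory, which cannot be opened as a file.
    IsADirectory,
    /// Any other I/O failure, reduced to its kind.
    Io(io::ErrorKind),
}

/// Checks for a directory first so the caller gets a specific error
/// instead of whatever the OS reports for the failed open.
pub fn open_file(path: &Path) -> Result<File, OpenError> {
    if path.is_dir() {
        return Err(OpenError::IsADirectory);
    }
    File::open(path).map_err(|err| OpenError::Io(err.kind()))
}
```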

# User-Facing Changes

- Error outputs now include more detailed information and are formatted
differently, including updated error codes.
- The structure of `ShellError` has changed, requiring plugin authors
and embedders to update their implementations.

# Tests + Formatting

I updated tests to account for the new I/O error structure and
formatting changes.

# After Submitting

This PR closes #7695 and closes #14892 and partially addresses #4323 and
#10698.

---------

Co-authored-by: Darren Schroeder <343840+fdncred@users.noreply.github.com>
2025-01-28 16:03:31 -06:00

243 lines
7.3 KiB
Rust

//! [`Span`] to point to sections of source code and the [`Spanned`] wrapper type

use crate::SpanId;
use miette::SourceSpan;
use serde::{Deserialize, Serialize};
use std::ops::Deref;

pub trait GetSpan {
    fn get_span(&self, span_id: SpanId) -> Span;
}
/// A spanned area of interest, generic over what kind of thing is of interest
#[derive(Clone, Copy, Debug, Serialize, Deserialize, PartialEq, Eq)]
pub struct Spanned<T> {
    pub item: T,
    pub span: Span,
}

impl<T> Spanned<T> {
    /// Map to a spanned reference of the inner type, i.e. `Spanned<T> -> Spanned<&T>`.
    pub fn as_ref(&self) -> Spanned<&T> {
        Spanned {
            item: &self.item,
            span: self.span,
        }
    }

    /// Map to a mutable reference of the inner type, i.e. `Spanned<T> -> Spanned<&mut T>`.
    pub fn as_mut(&mut self) -> Spanned<&mut T> {
        Spanned {
            item: &mut self.item,
            span: self.span,
        }
    }

    /// Map to the result of [`.deref()`](std::ops::Deref::deref) on the inner type.
    ///
    /// This can be used for example to turn `Spanned<Vec<T>>` into `Spanned<&[T]>`.
    pub fn as_deref(&self) -> Spanned<&<T as Deref>::Target>
    where
        T: Deref,
    {
        Spanned {
            item: self.item.deref(),
            span: self.span,
        }
    }

    /// Map the spanned item with a function.
    pub fn map<U>(self, f: impl FnOnce(T) -> U) -> Spanned<U> {
        Spanned {
            item: f(self.item),
            span: self.span,
        }
    }
}

impl<T, E> Spanned<Result<T, E>> {
    /// Move the `Result` to the outside, resulting in a spanned `Ok` or unspanned `Err`.
    pub fn transpose(self) -> Result<Spanned<T>, E> {
        match self {
            Spanned {
                item: Ok(item),
                span,
            } => Ok(Spanned { item, span }),
            Spanned {
                item: Err(err),
                span: _,
            } => Err(err),
        }
    }
}
/// Helper trait to create [`Spanned`] more ergonomically.
pub trait IntoSpanned: Sized {
    /// Wrap items together with a span into [`Spanned`].
    ///
    /// # Example
    ///
    /// ```
    /// # use nu_protocol::{Span, IntoSpanned};
    /// # let span = Span::test_data();
    /// let spanned = "Hello, world!".into_spanned(span);
    /// assert_eq!("Hello, world!", spanned.item);
    /// assert_eq!(span, spanned.span);
    /// ```
    fn into_spanned(self, span: Span) -> Spanned<Self>;
}

impl<T> IntoSpanned for T {
    fn into_spanned(self, span: Span) -> Spanned<Self> {
        Spanned { item: self, span }
    }
}
/// Spans are a global offset across all seen files, which are cached in the engine's state. The start and
/// end offset together make the inclusive start/exclusive end pair for where to underline to highlight
/// a given point of interest.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Serialize, Deserialize)]
pub struct Span {
    pub start: usize,
    pub end: usize,
}

impl Span {
    pub fn new(start: usize, end: usize) -> Self {
        debug_assert!(
            end >= start,
            "Can't create a Span whose end < start, start={start}, end={end}"
        );

        Self { start, end }
    }

    pub const fn unknown() -> Self {
        Self { start: 0, end: 0 }
    }

    /// Span for testing purposes.
    ///
    /// The provided span does not point into any known source but is unequal to [`Span::unknown()`].
    ///
    /// Note: Only use this for test data, *not* live data, as it will point into unknown source
    /// when used in errors
    pub const fn test_data() -> Self {
        Self {
            start: usize::MAX / 2,
            end: usize::MAX / 2,
        }
    }

    pub fn offset(&self, offset: usize) -> Self {
        Self::new(self.start - offset, self.end - offset)
    }

    pub fn contains(&self, pos: usize) -> bool {
        self.start <= pos && pos < self.end
    }

    pub fn contains_span(&self, span: Self) -> bool {
        self.start <= span.start && span.end <= self.end && span.end != 0
    }

    /// Point to the space just past this span, useful for missing values
    pub fn past(&self) -> Self {
        Self {
            start: self.end,
            end: self.end,
        }
    }

    /// Returns the minimal [`Span`] that encompasses both of the given spans.
    ///
    /// The two `Spans` can overlap in the middle,
    /// but must otherwise be in order by satisfying:
    /// - `self.start <= after.start`
    /// - `self.end <= after.end`
    ///
    /// If this is not guaranteed to be the case, use [`Span::merge`] instead.
    pub fn append(self, after: Self) -> Self {
        debug_assert!(
            self.start <= after.start && self.end <= after.end,
            "Can't merge two Spans that are not in order"
        );
        Self {
            start: self.start,
            end: after.end,
        }
    }

    /// Returns the minimal [`Span`] that encompasses both of the given spans.
    ///
    /// The spans need not be in order or have any relationship.
    ///
    /// [`Span::append`] is slightly more efficient if the spans are known to be in order.
    pub fn merge(self, other: Self) -> Self {
        Self {
            start: usize::min(self.start, other.start),
            end: usize::max(self.end, other.end),
        }
    }

    /// Returns the minimal [`Span`] that encompasses all of the spans in the given slice.
    ///
    /// The spans are assumed to be in order, that is, all consecutive spans must satisfy:
    /// - `spans[i].start <= spans[i + 1].start`
    /// - `spans[i].end <= spans[i + 1].end`
    ///
    /// (Two consecutive spans can overlap as long as the above is true.)
    ///
    /// Use [`Span::merge_many`] if the spans are not known to be in order.
    pub fn concat(spans: &[Self]) -> Self {
        // TODO: enable assert below
        // debug_assert!(!spans.is_empty());
        debug_assert!(spans.windows(2).all(|spans| {
            let &[a, b] = spans else {
                return false;
            };
            a.start <= b.start && a.end <= b.end
        }));
        Self {
            start: spans.first().map(|s| s.start).unwrap_or(0),
            end: spans.last().map(|s| s.end).unwrap_or(0),
        }
    }

    /// Returns the minimal [`Span`] that encompasses all of the spans in the given iterator.
    ///
    /// The spans need not be in order or have any relationship.
    ///
    /// [`Span::concat`] is more efficient if the spans are known to be in order.
    pub fn merge_many(spans: impl IntoIterator<Item = Self>) -> Self {
        spans
            .into_iter()
            .reduce(Self::merge)
            .unwrap_or(Self::unknown())
    }
}
impl From<Span> for SourceSpan {
    fn from(s: Span) -> Self {
        Self::new(s.start.into(), s.end - s.start)
    }
}

/// An extension trait for [`Result`], which adds a span to the error type.
///
/// This trait might be removed later, since the old [`Spanned<std::io::Error>`] to [`ShellError`]
/// conversion was replaced by [`IoError`](io_error::IoError).
pub trait ErrSpan {
    type Result;

    /// Adds the given span to the error type, turning it into a [`Spanned<E>`].
    fn err_span(self, span: Span) -> Self::Result;
}

impl<T, E> ErrSpan for Result<T, E> {
    type Result = Result<T, Spanned<E>>;

    fn err_span(self, span: Span) -> Self::Result {
        self.map_err(|err| err.into_spanned(span))
    }
}