nushell/crates/nu-command/src/system/registry_query.rs
Piepmatz 66bc0542e0
Refactor I/O Errors (#14927)
<!--
if this PR closes one or more issues, you can automatically link the PR
with
them by using one of the [*linking
keywords*](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword),
e.g.
- this PR should close #xxxx
- fixes #xxxx

you can also mention related issues, PRs or discussions!
-->

# Description
<!--
Thank you for improving Nushell. Please, check our [contributing
guide](../CONTRIBUTING.md) and talk to the core team before making major
changes.

Description of your pull request goes here. **Provide examples and/or
screenshots** if your changes affect the user experience.
-->

As mentioned in #10698, we have too many `ShellError` variants, with
some even overlapping in meaning. This PR simplifies and improves I/O
error handling by restructuring `ShellError` related to I/O issues.
Previously, `ShellError::IOError` only contained a message string,
making it convenient but overly generic. It was widely used without
providing spans (#4323).

This PR introduces a new `ShellError::Io` variant that consolidates
multiple I/O-related errors (except for `ShellError::NetworkFailure`,
which remains distinct for now). The new `ShellError::Io` variant
replaces the following:

- `FileNotFound`
- `FileNotFoundCustom`
- `IOInterrupted`
- `IOError`
- `IOErrorSpanned`
- `NotADirectory`
- `DirectoryNotFound`
- `MoveNotPossible`
- `CreateNotPossible`
- `ChangeAccessTimeNotPossible`
- `ChangeModifiedTimeNotPossible`
- `RemoveNotPossible`
- `ReadingFile`

## The `IoError`
`IoError` includes the following fields:

1. **`kind`**: Extends `std::io::ErrorKind` to specify the type of I/O
error without needing new `ShellError` variants. This aligns with the
approach used in `std::io::Error`. This adds a second dimension to error
reporting by combining the `kind` field with `ShellError` variants,
making it easier to describe errors in more detail. As proposed by
@kubouch in [#design-discussion on
Discord](https://discord.com/channels/601130461678272522/615329862395101194/1323699197165178930),
this helps reduce the number of `ShellError` variants. In the error
report, the `kind` field is displayed as the "source" of the error,
e.g., "I/O error," followed by the specific kind of I/O error.
2. **`span`**: A non-optional field to encourage providing spans for
better error reporting (#4323).
3. **`path`**: Optional `PathBuf` to give context about the file or
directory involved in the error (#7695). If provided, it’s shown as a
help entry in error reports.
4. **`additional_context`**: Allows adding custom messages when the
span, kind, and path are insufficient. This is rendered in the error
report at the labeled span.
5. **`location`**: Sometimes, I/O errors occur in the engine itself and
are not caused directly by user input. In such cases, if we don’t have a
span and must set it to `Span::unknown()`, we need another way to
reference the error. For this, the `location` field uses the new
`Location` struct, which records the Rust file and line number where the
error occurred. This ensures that we at least know the Rust code
location that failed, helping with debugging. To make this work, a new
`location!` macro was added, which retrieves `file!`, `line!`, and
`column!` values accurately. If `Location::new` is used directly, it
issues a warning to remind developers to use the macro instead, ensuring
consistent and correct usage.

### Constructor Behavior
`IoError` provides five constructor methods:
- `new` and `new_with_additional_context`: Used for errors caused by
user input and require a valid (non-unknown) span to ensure precise
error reporting.
- `new_internal` and `new_internal_with_path`: Used for internal errors
where a span is not available. These methods require additional context
and the `Location` struct to pinpoint the source of the error in the
engine code.
- `factory`: Returns a closure that maps an `std::io::Error` to an
`IoError`. This is useful for handling multiple I/O errors that share
the same span and path, streamlining error handling in such cases.

## New Report Look
This is simulation how the I/O errors look like (the `open crates` is
simulated to show how internal errors are referenced now):
![Screenshot 2025-01-25
190426](https://github.com/user-attachments/assets/a41b6aa6-a440-497d-bbcc-3ac0121c9226)

## `Span::test_data()`
To enable better testing, `Span::test_data()` now returns a value
distinct from `Span::unknown()`. Both `Span::test_data()` and
`Span::unknown()` refer to invalid source code, but having a separate
value for test data helps identify issues during testing while keeping
spans unique.

## Cursed Sneaky Error Transfers
I removed the conversions between `std::io::Error` and `ShellError` as
they often removed important information and were used too broadly to
handle I/O errors. This also removed the problematic implementation
found here:

7ea4895513/crates/nu-protocol/src/errors/shell_error.rs (L1534-L1583)

which hid some downcasting from I/O errors and made it hard to trace
where `ShellError` was converted into `std::io::Error`. To address this,
I introduced a new struct called `ShellErrorBridge`, which explicitly
defines this transfer behavior. With `ShellErrorBridge`, we can now
easily grep the codebase to locate and manage such conversions.

## Miscellaneous
- Removed the OS error added in #14640, as it’s no longer needed.
- Improved error messages in `glob_from` (#14679).
- Trying to open a directory with `open` caused a permissions denied
error (it's just what the OS provides). I added a `is_dir` check to
provide a better error in that case.

# User-Facing Changes
<!-- List of all changes that impact the user experience here. This
helps us keep track of breaking changes. -->

- Error outputs now include more detailed information and are formatted
differently, including updated error codes.
- The structure of `ShellError` has changed, requiring plugin authors
and embedders to update their implementations.

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use toolkit.nu; toolkit test stdlib"` to run the
tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

I updated tests to account for the new I/O error structure and
formatting changes.

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->

This PR closes #7695 and closes #14892 and partially addresses #4323 and
#10698.

---------

Co-authored-by: Darren Schroeder <343840+fdncred@users.noreply.github.com>
2025-01-28 16:03:31 -06:00

335 lines
14 KiB
Rust

use nu_engine::command_prelude::*;
use nu_protocol::shell_error::io::IoError;
use windows::{core::PCWSTR, Win32::System::Environment::ExpandEnvironmentStringsW};
use winreg::{enums::*, types::FromRegValue, RegKey};
#[derive(Clone)]
pub struct RegistryQuery;
impl Command for RegistryQuery {
fn name(&self) -> &str {
"registry query"
}
fn signature(&self) -> Signature {
Signature::build("registry query")
.input_output_types(vec![(Type::Nothing, Type::Any)])
.switch("hkcr", "query the hkey_classes_root hive", None)
.switch("hkcu", "query the hkey_current_user hive", None)
.switch("hklm", "query the hkey_local_machine hive", None)
.switch("hku", "query the hkey_users hive", None)
.switch("hkpd", "query the hkey_performance_data hive", None)
.switch("hkpt", "query the hkey_performance_text hive", None)
.switch("hkpnls", "query the hkey_performance_nls_text hive", None)
.switch("hkcc", "query the hkey_current_config hive", None)
.switch("hkdd", "query the hkey_dyn_data hive", None)
.switch(
"hkculs",
"query the hkey_current_user_local_settings hive",
None,
)
.switch(
"no-expand",
"do not expand %ENV% placeholders in REG_EXPAND_SZ",
Some('u'),
)
.required("key", SyntaxShape::String, "Registry key to query.")
.optional(
"value",
SyntaxShape::String,
"Optionally supply a registry value to query.",
)
.category(Category::System)
}
fn description(&self) -> &str {
"Query the Windows registry."
}
fn extra_description(&self) -> &str {
"Currently supported only on Windows systems."
}
fn run(
&self,
engine_state: &EngineState,
stack: &mut Stack,
call: &Call,
_input: PipelineData,
) -> Result<PipelineData, ShellError> {
registry_query(engine_state, stack, call)
}
fn examples(&self) -> Vec<Example> {
vec![
Example {
description: "Query the HKEY_CURRENT_USER hive",
example: "registry query --hkcu environment",
result: None,
},
Example {
description: "Query the HKEY_LOCAL_MACHINE hive",
example: r"registry query --hklm 'SYSTEM\CurrentControlSet\Control\Session Manager\Environment'",
result: None,
},
]
}
}
fn registry_query(
engine_state: &EngineState,
stack: &mut Stack,
call: &Call,
) -> Result<PipelineData, ShellError> {
let call_span = call.head;
let skip_expand = call.has_flag(engine_state, stack, "no-expand")?;
let registry_key: Spanned<String> = call.req(engine_state, stack, 0)?;
let registry_key_span = &registry_key.clone().span;
let registry_value: Option<Spanned<String>> = call.opt(engine_state, stack, 1)?;
let reg_hive = get_reg_hive(engine_state, stack, call)?;
let reg_key = reg_hive
.open_subkey(registry_key.item)
.map_err(|err| IoError::new(err.kind(), *registry_key_span, None))?;
if registry_value.is_none() {
let mut reg_values = vec![];
for (name, val) in reg_key.enum_values().flatten() {
let reg_type = format!("{:?}", val.vtype);
let nu_value = reg_value_to_nu_value(val, call_span, skip_expand);
reg_values.push(Value::record(
record! {
"name" => Value::string(name, call_span),
"value" => nu_value,
"type" => Value::string(reg_type, call_span),
},
*registry_key_span,
))
}
Ok(reg_values.into_pipeline_data(call_span, engine_state.signals().clone()))
} else {
match registry_value {
Some(value) => {
let reg_value = reg_key.get_raw_value(value.item.as_str());
match reg_value {
Ok(val) => {
let reg_type = format!("{:?}", val.vtype);
let nu_value = reg_value_to_nu_value(val, call_span, skip_expand);
Ok(Value::record(
record! {
"name" => Value::string(value.item, call_span),
"value" => nu_value,
"type" => Value::string(reg_type, call_span),
},
value.span,
)
.into_pipeline_data())
}
Err(_) => Err(ShellError::GenericError {
error: "Unable to find registry key/value".into(),
msg: format!("Registry value: {} was not found", value.item),
span: Some(value.span),
help: None,
inner: vec![],
}),
}
}
None => Ok(Value::nothing(call_span).into_pipeline_data()),
}
}
}
fn get_reg_hive(
engine_state: &EngineState,
stack: &mut Stack,
call: &Call,
) -> Result<RegKey, ShellError> {
let flags = [
"hkcr", "hkcu", "hklm", "hku", "hkpd", "hkpt", "hkpnls", "hkcc", "hkdd", "hkculs",
]
.iter()
.copied()
.filter_map(|flag| match call.has_flag(engine_state, stack, flag) {
Ok(true) => Some(Ok(flag)),
Ok(false) => None,
Err(e) => Some(Err(e)),
})
.collect::<Result<Vec<_>, ShellError>>()?;
if flags.len() > 1 {
return Err(ShellError::GenericError {
error: "Only one registry key can be specified".into(),
msg: "Only one registry key can be specified".into(),
span: Some(call.head),
help: None,
inner: vec![],
});
}
let hive = flags.first().copied().unwrap_or("hkcu");
let hkey = match hive {
"hkcr" => HKEY_CLASSES_ROOT,
"hkcu" => HKEY_CURRENT_USER,
"hklm" => HKEY_LOCAL_MACHINE,
"hku" => HKEY_USERS,
"hkpd" => HKEY_PERFORMANCE_DATA,
"hkpt" => HKEY_PERFORMANCE_TEXT,
"hkpnls" => HKEY_PERFORMANCE_NLSTEXT,
"hkcc" => HKEY_CURRENT_CONFIG,
"hkdd" => HKEY_DYN_DATA,
"hkculs" => HKEY_CURRENT_USER_LOCAL_SETTINGS,
_ => {
return Err(ShellError::NushellFailedSpanned {
msg: "Entered unreachable code".into(),
label: "Unknown registry hive".into(),
span: call.head,
})
}
};
Ok(RegKey::predef(hkey))
}
fn reg_value_to_nu_value(
mut reg_value: winreg::RegValue,
call_span: Span,
skip_expand: bool,
) -> nu_protocol::Value {
match reg_value.vtype {
REG_NONE => Value::nothing(call_span),
REG_BINARY => Value::binary(reg_value.bytes, call_span),
REG_MULTI_SZ => reg_value_to_nu_list_string(reg_value, call_span),
REG_SZ | REG_EXPAND_SZ => reg_value_to_nu_string(reg_value, call_span, skip_expand),
REG_DWORD | REG_DWORD_BIG_ENDIAN | REG_QWORD => reg_value_to_nu_int(reg_value, call_span),
// This should be impossible, as registry symlinks should be automatically transparent
// to the registry API as it's used by winreg, since it never uses REG_OPTION_OPEN_LINK.
// If it happens, decode as if the link is a string; it should be a registry path string.
REG_LINK => {
reg_value.vtype = REG_SZ;
reg_value_to_nu_string(reg_value, call_span, skip_expand)
}
// Decode these as binary; that seems to be the least bad option available to us.
// REG_RESOURCE_LIST is a struct CM_RESOURCE_LIST.
// REG_FULL_RESOURCE_DESCRIPTOR is a struct CM_FULL_RESOURCE_DESCRIPTOR.
// REG_RESOURCE_REQUIREMENTS_LIST is a struct IO_RESOURCE_REQUIREMENTS_LIST.
REG_RESOURCE_LIST | REG_FULL_RESOURCE_DESCRIPTOR | REG_RESOURCE_REQUIREMENTS_LIST => {
reg_value.vtype = REG_BINARY;
Value::binary(reg_value.bytes, call_span)
}
}
}
fn reg_value_to_nu_string(
reg_value: winreg::RegValue,
call_span: Span,
skip_expand: bool,
) -> nu_protocol::Value {
let value = String::from_reg_value(&reg_value)
.expect("registry value type should be REG_SZ or REG_EXPAND_SZ");
// REG_EXPAND_SZ contains unexpanded references to environment variables, for example, %PATH%.
// winreg not expanding these is arguably correct, as it's just wrapping raw registry access.
// These placeholder-having strings work in *some* Windows contexts, but Rust's fs/path APIs
// don't handle them, so they won't work in Nu unless we expand them here. Eagerly expanding the
// strings here seems to be the least bad option. This is what PowerShell does, for example,
// although reg.exe does not. We could do the substitution with our env, but the officially
// correct way to expand these strings is to call Win32's ExpandEnvironmentStrings function.
// ref: <https://learn.microsoft.com/en-us/windows/win32/sysinfo/registry-value-types>
// We can skip the dance if the string doesn't actually have any unexpanded placeholders.
if skip_expand || reg_value.vtype != REG_EXPAND_SZ || !value.contains('%') {
return Value::string(value, call_span);
}
// The encoding dance is unfortunate since we read "Windows Unicode" from the registry, but
// it's the most resilient option and avoids making potentially wrong alignment assumptions.
let value_utf16 = value.encode_utf16().chain([0]).collect::<Vec<u16>>();
// Like most Win32 string functions, the return value is the number of TCHAR written,
// or the required buffer size (in TCHAR) if the buffer is too small, or 0 for error.
// Since we already checked for the case where no expansion is done, we can start with
// an empty output buffer, since we expect to require at least one resize loop anyway.
let mut out_buffer = vec![];
loop {
match unsafe {
ExpandEnvironmentStringsW(PCWSTR(value_utf16.as_ptr()), Some(&mut *out_buffer))
} {
0 => {
// 0 means error, but we don't know what the error is. We could try to get
// the error code with GetLastError, but that's a whole other can of worms.
// Instead, we'll just return the original string and hope for the best.
// Presumably, registry strings shouldn't ever cause this to error anyway.
return Value::string(value, call_span);
}
size if size as usize <= out_buffer.len() => {
// The buffer was large enough, so we're done. Remember to remove the trailing nul!
let out_value_utf16 = &out_buffer[..size as usize - 1];
let out_value = String::from_utf16_lossy(out_value_utf16);
return Value::string(out_value, call_span);
}
size => {
// The buffer was too small, so we need to resize and try again.
// Clear first to indicate we don't care about the old contents.
out_buffer.clear();
out_buffer.resize(size as usize, 0);
continue;
}
}
}
}
#[test]
fn no_expand_does_not_expand() {
let unexpanded = "%AppData%";
let reg_val = || winreg::RegValue {
bytes: unexpanded
.encode_utf16()
.chain([0])
.flat_map(u16::to_ne_bytes)
.collect(),
vtype: REG_EXPAND_SZ,
};
// normally we do expand
let nu_val_expanded = reg_value_to_nu_string(reg_val(), Span::unknown(), false);
assert!(nu_val_expanded.coerce_string().is_ok());
assert_ne!(nu_val_expanded.coerce_string().unwrap(), unexpanded);
// unless we skip expansion
let nu_val_skip_expand = reg_value_to_nu_string(reg_val(), Span::unknown(), true);
assert!(nu_val_skip_expand.coerce_string().is_ok());
assert_eq!(nu_val_skip_expand.coerce_string().unwrap(), unexpanded);
}
fn reg_value_to_nu_list_string(reg_value: winreg::RegValue, call_span: Span) -> nu_protocol::Value {
let values = <Vec<String>>::from_reg_value(&reg_value)
.expect("registry value type should be REG_MULTI_SZ")
.into_iter()
.map(|s| Value::string(s, call_span));
// There's no REG_MULTI_EXPAND_SZ, so no need to do placeholder expansion here.
Value::list(values.collect(), call_span)
}
fn reg_value_to_nu_int(reg_value: winreg::RegValue, call_span: Span) -> nu_protocol::Value {
let value =
match reg_value.vtype {
// See discussion here https://github.com/nushell/nushell/pull/10806#issuecomment-1791832088
// "The unwraps here are effectively infallible...", so I changed them to expects.
REG_DWORD => u32::from_reg_value(&reg_value)
.expect("registry value type should be REG_DWORD") as i64,
REG_DWORD_BIG_ENDIAN => {
// winreg (v0.51.0) doesn't natively decode REG_DWORD_BIG_ENDIAN
u32::from_be_bytes(unsafe { *reg_value.bytes.as_ptr().cast() }) as i64
}
REG_QWORD => u64::from_reg_value(&reg_value)
.expect("registry value type should be REG_QWORD") as i64,
_ => unreachable!(
"registry value type should be REG_DWORD, REG_DWORD_BIG_ENDIAN, or REG_QWORD"
),
};
Value::int(value, call_span)
}