nushell/crates/nu-command/src/platform/du.rs
Bob Hyman 09b3dab35d
Allow filesystem commands to access files with glob metachars in name (#10694)
(squashed version of #10557, clean commit history and review thread)

Fixes #10571, also potentially: #10364, #10211, #9558, #9310,


# Description
Changes processing of arguments to filesystem commands that are source
paths or globs.
Applies to `cp, cp-old, mv, rm, du` but not `ls` (because it uses a
different globbing interface) or `glob` (because it uses a different
globbing library).

The core of the change is to lookup the argument first as a file and
only glob if it is not. That way,
a path containing glob metacharacters can be referenced without glob
quoting, though it will have to be single quoted to avoid nushell
parsing.

Before: A file path that looks like a glob is not matched by the glob
specified as a (source) argument and takes some thinking about to
access. You might say the glob pattern shadows a file with the same
spelling.
```
> ls a*
╭───┬────────┬──────┬──────┬────────────────╮
│ # │  name  │ type │ size │    modified    │
├───┼────────┼──────┼──────┼────────────────┤
│ 0 │ a[bc]d │ file │  0 B │ 34 seconds ago │
│ 1 │ abd    │ file │  0 B │ now            │
│ 2 │ acd    │ file │  0 B │ now            │
╰───┴────────┴──────┴──────┴────────────────╯

> cp --verbose 'a[bc]d' dest
copied /home/bobhy/src/rust/work/r4/abd to /home/bobhy/src/rust/work/r4/dest/abd
copied /home/bobhy/src/rust/work/r4/acd to /home/bobhy/src/rust/work/r4/dest/acd

> ## Note -- a[bc]d *not* copied, and seemingly hard to access.
> cp --verbose 'a\[bc\]d' dest
Error:   × No matches found
   ╭─[entry #33:1:1]
 1 │ cp --verbose 'a\[bc\]d' dest
   ·              ─────┬────
   ·                   ╰── no matches found
   ╰────

> #.. but is accessible with enough glob quoting.
> cp --verbose 'a[[]bc[]]d' dest
copied /home/bobhy/src/rust/work/r4/a[bc]d to /home/bobhy/src/rust/work/r4/dest/a[bc]d
```
Before_2: if file has glob metachars but isn't a valid pattern, user
gets a confusing error:

```
> touch 'a[b'
> cp 'a[b' dest
Error:   × Pattern syntax error near position 30: invalid range pattern
   ╭─[entry #13:1:1]
 1 │ cp 'a[b' dest
   ·    ──┬──
   ·      ╰── invalid pattern
   ╰────
```

After: Args to cp, mv, etc. are tried first as literal files, and only
as globs if not found to be files.

```
> cp --verbose 'a[bc]d' dest
copied /home/bobhy/src/rust/work/r4/a[bc]d to /home/bobhy/src/rust/work/r4/dest/a[bc]d
> cp --verbose '[a][bc]d' dest
copied /home/bobhy/src/rust/work/r4/abd to /home/bobhy/src/rust/work/r4/dest/abd
copied /home/bobhy/src/rust/work/r4/acd to /home/bobhy/src/rust/work/r4/dest/acd
```
After_2: file with glob metachars but invalid pattern just works.
(though Windows does not allow file name to contain `*`.).

```
> cp --verbose 'a[b' dest
copied /home/bobhy/src/rust/work/r4/a[b to /home/bobhy/src/rust/work/r4/dest/a[b
```

So, with this fix, a file shadows a glob pattern with the same spelling.
If you have such a file and really want to use the glob pattern, you
will have to glob quote some of the characters in the pattern. I think
that's less confusing to the user: if ls shows a file with a weird name,
s/he'll still be able to copy, rename or delete it.

# User-Facing Changes
Could break some existing scripts. If user happened to have a file with
a globbish name but was using a glob pattern with the same spelling, the
new version will process the file and not expand the glob.

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use std testing; testing run-tests --path
crates/nu-std"` to run the tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->

---------

Co-authored-by: Darren Schroeder <343840+fdncred@users.noreply.github.com>
2023-10-18 13:31:15 -05:00

193 lines
5.5 KiB
Rust

use crate::{DirBuilder, DirInfo, FileInfo};
use nu_cmd_base::arg_glob;
use nu_engine::{current_dir, CallExt};
use nu_glob::{GlobError, Pattern};
use nu_protocol::{
ast::Call,
engine::{Command, EngineState, Stack},
Category, Example, IntoInterruptiblePipelineData, PipelineData, ShellError, Signature, Span,
Spanned, SyntaxShape, Type, Value,
};
use serde::Deserialize;
#[derive(Clone)]
pub struct Du;
#[derive(Deserialize, Clone, Debug)]
pub struct DuArgs {
path: Option<Spanned<String>>,
all: bool,
deref: bool,
exclude: Option<Spanned<String>>,
#[serde(rename = "max-depth")]
max_depth: Option<Spanned<i64>>,
#[serde(rename = "min-size")]
min_size: Option<Spanned<i64>>,
}
impl Command for Du {
fn name(&self) -> &str {
"du"
}
fn usage(&self) -> &str {
"Find disk usage sizes of specified items."
}
fn signature(&self) -> Signature {
Signature::build("du")
.input_output_types(vec![(Type::Nothing, Type::Table(vec![]))])
.allow_variants_without_examples(true)
.optional("path", SyntaxShape::GlobPattern, "starting directory")
.switch(
"all",
"Output file sizes as well as directory sizes",
Some('a'),
)
.switch(
"deref",
"Dereference symlinks to their targets for size",
Some('r'),
)
.named(
"exclude",
SyntaxShape::GlobPattern,
"Exclude these file names",
Some('x'),
)
.named(
"max-depth",
SyntaxShape::Int,
"Directory recursion limit",
Some('d'),
)
.named(
"min-size",
SyntaxShape::Int,
"Exclude files below this size",
Some('m'),
)
.category(Category::Core)
}
fn run(
&self,
engine_state: &EngineState,
stack: &mut Stack,
call: &Call,
_input: PipelineData,
) -> Result<PipelineData, ShellError> {
let tag = call.head;
let min_size: Option<Spanned<i64>> = call.get_flag(engine_state, stack, "min-size")?;
let max_depth: Option<Spanned<i64>> = call.get_flag(engine_state, stack, "max-depth")?;
if let Some(ref max_depth) = max_depth {
if max_depth.item < 0 {
return Err(ShellError::NeedsPositiveValue(max_depth.span));
}
}
if let Some(ref min_size) = min_size {
if min_size.item < 0 {
return Err(ShellError::NeedsPositiveValue(min_size.span));
}
}
let current_dir = current_dir(engine_state, stack)?;
let args = DuArgs {
path: call.opt(engine_state, stack, 0)?,
all: call.has_flag("all"),
deref: call.has_flag("deref"),
exclude: call.get_flag(engine_state, stack, "exclude")?,
max_depth,
min_size,
};
let exclude = args.exclude.map_or(Ok(None), move |x| {
Pattern::new(&x.item)
.map(Some)
.map_err(|e| ShellError::InvalidGlobPattern(e.msg.to_string(), x.span))
})?;
let include_files = args.all;
let mut paths = match args.path {
Some(p) => arg_glob(&p, &current_dir)?,
// The * pattern should never fail.
None => arg_glob(
&Spanned {
item: "*".into(),
span: Span::unknown(),
},
&current_dir,
)?,
}
.filter(move |p| {
if include_files {
true
} else {
match p {
Ok(f) if f.is_dir() => true,
Err(e) if e.path().is_dir() => true,
_ => false,
}
}
})
.map(|v| v.map_err(glob_err_into));
let all = args.all;
let deref = args.deref;
let max_depth = args.max_depth.map(|f| f.item as u64);
let min_size = args.min_size.map(|f| f.item as u64);
let params = DirBuilder {
tag,
min: min_size,
deref,
exclude,
all,
};
let mut output: Vec<Value> = vec![];
for p in paths.by_ref() {
match p {
Ok(a) => {
if a.is_dir() {
output.push(
DirInfo::new(a, &params, max_depth, engine_state.ctrlc.clone()).into(),
);
} else if let Ok(v) = FileInfo::new(a, deref, tag) {
output.push(v.into());
}
}
Err(e) => {
output.push(Value::error(e, tag));
}
}
}
Ok(output.into_pipeline_data(engine_state.ctrlc.clone()))
}
fn examples(&self) -> Vec<Example> {
vec![Example {
description: "Disk usage of the current directory",
example: "du",
result: None,
}]
}
}
fn glob_err_into(e: GlobError) -> ShellError {
let e = e.into_error();
ShellError::from(e)
}
#[cfg(test)]
mod tests {
use super::Du;
#[test]
fn examples_work_as_expected() {
use crate::test_examples;
test_examples(Du {})
}
}