nushell/crates
132ikl 430b2746b8
Parse XML documents with DTDs by default, and add --disallow-dtd flag (#15272)

# Description
This PR allows `from xml` to parse XML documents with [document type
declarations](https://en.wikipedia.org/wiki/Document_type_declaration)
by default. This is especially notable since many HTML documents start
with `<!DOCTYPE html>`, and `roxmltree` should be able to parse some
simple HTML documents. The security concerns with DTDs are [XXE
attacks](https://en.wikipedia.org/wiki/XML_external_entity_attack), and
[exponential entity expansion
attacks](https://en.wikipedia.org/wiki/Billion_laughs_attack).
`roxmltree` [doesn't
support](d2c7801624/src/tokenizer.rs (L535-L547))
external entities (it parses them, but doesn't do anything with them),
so it is not vulnerable to XXE attacks. Additionally, `roxmltree` has
[some
safeguards](d2c7801624/src/parse.rs (L424-L452))
in place to prevent exponential entity expansion, so enabling DTDs by
default is relatively safe. The worst case is no worse than running
`loop {}`, so I think allowing DTDs by default is best, and DTDs can
still be disabled with `--disallow-dtd` if needed.
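
To make the reasoning about exponential entity expansion concrete, here is a minimal sketch of how a parser can bound the work an entity bomb causes. This is not `roxmltree`'s actual code: the names (`expand`, `ExpandError`) and the limits (`MAX_DEPTH`, `MAX_OUTPUT_BYTES`) are hypothetical and chosen only for illustration.

```rust
use std::collections::HashMap;

// Hypothetical limits, picked for illustration only.
const MAX_DEPTH: usize = 10;
const MAX_OUTPUT_BYTES: usize = 1_000_000;

#[derive(Debug, PartialEq)]
pub enum ExpandError {
    UnknownEntity(String),
    Unterminated,
    TooDeep,
    TooLarge,
}

/// Expands `&name;` references in `text` using `entities`, refusing to nest
/// deeper than MAX_DEPTH or to produce more than MAX_OUTPUT_BYTES of output.
pub fn expand(text: &str, entities: &HashMap<String, String>) -> Result<String, ExpandError> {
    let mut out = String::new();
    expand_into(text, entities, 0, &mut out)?;
    Ok(out)
}

fn expand_into(
    text: &str,
    entities: &HashMap<String, String>,
    depth: usize,
    out: &mut String,
) -> Result<(), ExpandError> {
    // Self-referencing or deeply nested entities stop here.
    if depth > MAX_DEPTH {
        return Err(ExpandError::TooDeep);
    }
    let mut rest = text;
    while let Some(start) = rest.find('&') {
        push_checked(out, &rest[..start])?;
        let after = &rest[start + 1..];
        let end = after.find(';').ok_or(ExpandError::Unterminated)?;
        let name = &after[..end];
        let value = entities
            .get(name)
            .ok_or_else(|| ExpandError::UnknownEntity(name.to_string()))?;
        // The replacement text may itself contain references.
        expand_into(value, entities, depth + 1, out)?;
        rest = &after[end + 1..];
    }
    push_checked(out, rest)
}

/// Appends `s` to `out` unless that would exceed the output budget.
fn push_checked(out: &mut String, s: &str) -> Result<(), ExpandError> {
    if out.len() + s.len() > MAX_OUTPUT_BYTES {
        return Err(ExpandError::TooLarge);
    }
    out.push_str(s);
    Ok(())
}
```

With limits like these, a "billion laughs" style input fails with an error after a bounded amount of work instead of exhausting memory, which is the property the argument above relies on.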

# User-Facing Changes
* Allows `from xml` to parse XML documents with [document type
declarations](https://en.wikipedia.org/wiki/Document_type_declaration)
by default, and adds a `--disallow-dtd` flag to disallow parsing
documents with DTDs.

This PR also improves the errors in `from xml` by pointing at the issue
in the XML source. Example:

```
$ open --raw foo.xml | from xml 
Error:   × Failed to parse XML
   ╭─[2:7]
 1 │ <html>
 2 │     <p<>hi</p>
   ·       ▲
   ·       ╰── Unexpected character <, expected a whitespace
 3 │ </html>
   ╰────
```

# Tests + Formatting
N/A

# After Submitting
N/A
2025-03-12 08:09:55 -05:00

| Name | Last commit | Date |
| --- | --- | --- |
| nu_plugin_custom_values | Rework operator type errors (#14429) | 2025-02-12 20:03:40 -08:00 |
| nu_plugin_example | Bump to 0.102.1 dev version (#15012) | 2025-02-05 00:19:48 -05:00 |
| nu_plugin_formats | Bump to 0.102.1 dev version (#15012) | 2025-02-05 00:19:48 -05:00 |
| nu_plugin_gstat | Bump to 0.102.1 dev version (#15012) | 2025-02-05 00:19:48 -05:00 |
| nu_plugin_inc | Bump to 0.102.1 dev version (#15012) | 2025-02-05 00:19:48 -05:00 |
| nu_plugin_nu_example | Bump to 0.102.1 dev version (#15012) | 2025-02-05 00:19:48 -05:00 |
| nu_plugin_polars | polars open: exposing the ability to configure hive settings. (#15255) | 2025-03-11 14:18:36 -07:00 |
| nu_plugin_python | Bump to 0.102.1 dev version (#15012) | 2025-02-05 00:19:48 -05:00 |
| nu_plugin_query | update query json help and examples (#15190) | 2025-02-26 09:15:14 -06:00 |
| nu_plugin_stress_internals | Bump to 0.102.1 dev version (#15012) | 2025-02-05 00:19:48 -05:00 |
| nu-cli | fix(completion): full set of operators for type any (#15303) | 2025-03-12 08:04:20 -05:00 |
| nu-cmd-base | Bump to 0.102.1 dev version (#15012) | 2025-02-05 00:19:48 -05:00 |
| nu-cmd-extra | Refactor/fix tests affecting the whole command set (#15073) | 2025-02-11 11:36:36 +01:00 |
| nu-cmd-lang | fix $env.FILE_PWD and $env.CURRENT_FILE inside overlay use (#15126) | 2025-03-05 21:13:44 +02:00 |
| nu-cmd-plugin | Refactor/fix tests affecting the whole command set (#15073) | 2025-02-11 11:36:36 +01:00 |
| nu-color-config | Rework operator type errors (#14429) | 2025-02-12 20:03:40 -08:00 |
| nu-command | Parse XML documents with DTDs by default, and add --disallow-dtd flag (#15272) | 2025-03-12 08:09:55 -05:00 |
| nu-derive-value | Use proc-macro-error2 instead of proc-macro-error (#15093) | 2025-02-11 15:13:34 -05:00 |
| nu-engine | check signals in nu-glob and ls (#15140) | 2025-02-28 19:36:39 +01:00 |
| nu-explore | fix: new clippy warnings from rust 1.85.0 (#15203) | 2025-02-27 14:11:47 +01:00 |
| nu-glob | check signals in nu-glob and ls (#15140) | 2025-02-28 19:36:39 +01:00 |
| nu-json | fix: new clippy warnings from rust 1.85.0 (#15203) | 2025-02-27 14:11:47 +01:00 |
| nu-lsp | fix(lsp): find_id for custom def in custom def (#15289) | 2025-03-12 07:35:28 -05:00 |
| nu-parser | Fix unterminated loop in parse_record (#15246) | 2025-03-05 21:02:03 +01:00 |
| nu-path | Bump to 0.102.1 dev version (#15012) | 2025-02-05 00:19:48 -05:00 |
| nu-plugin | Bump to 0.102.1 dev version (#15012) | 2025-02-05 00:19:48 -05:00 |
| nu-plugin-core | Replaced IoError::new_with_additional_context calls that still had Span::unknown() (#15056) | 2025-02-08 09:23:28 -06:00 |
| nu-plugin-engine | Rework operator type errors (#14429) | 2025-02-12 20:03:40 -08:00 |
| nu-plugin-protocol | make plugin compatible with nightly nushell version (#15084) | 2025-02-11 06:40:15 -06:00 |
| nu-plugin-test-support | Bump to 0.102.1 dev version (#15012) | 2025-02-05 00:19:48 -05:00 |
| nu-pretty-hex | bugfix: math commands now return error with infinite range [#15135] (#15236) | 2025-03-11 14:40:26 +01:00 |
| nu-protocol | Parse XML documents with DTDs by default, and add --disallow-dtd flag (#15272) | 2025-03-12 08:09:55 -05:00 |
| nu-std | allow bench to handle larger numbers (#15162) | 2025-02-25 15:02:42 +01:00 |
| nu-system | Jobs (#14883) | 2025-02-25 12:09:52 -05:00 |
| nu-table | Bump to 0.102.1 dev version (#15012) | 2025-02-05 00:19:48 -05:00 |
| nu-term-grid | Bump to 0.102.1 dev version (#15012) | 2025-02-05 00:19:48 -05:00 |
| nu-test-support | fix(test-support): use CARGO_BUILD_TARGET_DIR env var (#15212) | 2025-02-28 20:08:44 +01:00 |
| nu-utils | Add filesize.show_unit config option (#15276) | 2025-03-09 17:34:55 -05:00 |
| nuon | Custom command attributes (#14906) | 2025-02-11 06:34:51 -06:00 |
| README.md | Remove old nushell/merge engine-q | 2022-02-07 14:54:06 -05:00 |

Nushell core libraries and plugins

These sub-crates form both the foundation for Nu and a set of plugins which extend Nu with additional functionality.

Foundational libraries are split into two kinds of crates:

  • Core crates - those crates that work together to build the Nushell language engine
  • Support crates - a set of crates that support the engine with additional features like JSON support, ANSI support, and more.

Plugins are likewise split into two types:

  • Core plugins - plugins that provide part of the default experience of Nu, including access to the system properties, processes, and web-connectivity features.
  • Extra plugins - plugins that provide a wide range of additional capabilities, like working with different file types, charting, viewing binary data, and more.