httpie-cli/httpie/output/formatters/xml.py
Mickaël Schoentgen 4f1c9441c5
Fix encoding error with non-prettified encoded responses (#1168)
* Fix encoding error with non-prettified encoded responses

Removed `--format-option response.as` and promoted `--response-as`: using
the format option would be misleading, as it is now also used by non-prettified
responses.

* Encoding refactoring

* split --response-as into --response-mime and --response-charset
* add support for Content-Type charset for requests printed to terminal
* add support for charset detection for requests printed to terminal without a Content-Type charset
* etc.

* `test_unicode.py` → `test_encoding.py`

* Drop sequence length check

* Clean-up tests

* [skip ci] Tweaks

* Use the compatible release clause for `charset_normalizer` requirement

Cf. https://www.python.org/dev/peps/pep-0440/#version-specifiers

* Clean-up

* Partially revert d52a4833e4

* Changelog

* Tweak tests

* [skip ci] Better test name

* Cleanup tests and add request body charset detection

* More test suite cleanups

* Cleanup

* Fix code style in test

* Improve detect_encoding() docstring

* Uniformize pytest.mark.parametrize() calls

* [skip ci] Comment out TODOs (will be tackled in a specific PR)

Co-authored-by: Jakub Roztocil <jakub@roztocil.co>
2021-10-06 17:27:07 +02:00

60 lines
1.9 KiB
Python

import sys
from typing import TYPE_CHECKING, Optional

from ...encoding import UTF8
from ...plugins import FormatterPlugin

if TYPE_CHECKING:
    from xml.dom.minidom import Document


def parse_xml(data: str) -> 'Document':
    """Parse given XML `data` string into an appropriate :class:`~xml.dom.minidom.Document` object."""
    from defusedxml.minidom import parseString
    return parseString(data)


def pretty_xml(document: 'Document',
               encoding: Optional[str] = UTF8,
               indent: int = 2,
               standalone: Optional[bool] = None) -> str:
    """Render the given :class:`~xml.dom.minidom.Document` `document` into a prettified string."""
    kwargs = {
        'encoding': encoding or UTF8,
        'indent': ' ' * indent,
    }
    if standalone is not None and sys.version_info >= (3, 9):
        kwargs['standalone'] = standalone
    body = document.toprettyxml(**kwargs).decode(kwargs['encoding'])

    # Remove blank lines automatically added by `toprettyxml()`.
    return '\n'.join(line for line in body.splitlines() if line.strip())


class XMLFormatter(FormatterPlugin):

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.enabled = self.format_options['xml']['format']

    def format_body(self, body: str, mime: str):
        if 'xml' not in mime:
            return body

        from xml.parsers.expat import ExpatError
        from defusedxml.common import DefusedXmlException

        try:
            parsed_body = parse_xml(body)
        except ExpatError:
            pass  # Invalid XML, ignore.
        except DefusedXmlException:
            pass  # Unsafe XML, ignore.
        else:
            body = pretty_xml(parsed_body,
                              encoding=parsed_body.encoding,
                              indent=self.format_options['xml']['indent'],
                              standalone=parsed_body.standalone)

        return body
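
The rendering steps in `pretty_xml()` above can be tried in isolation. The following is a minimal sketch, not part of the module: `prettify` is a hypothetical helper, and it swaps in the standard library's `xml.dom.minidom.parseString` for `defusedxml.minidom.parseString` so that it runs without extra dependencies (which also means it lacks the unsafe-XML protection the real formatter relies on).

```python
import sys
from xml.dom.minidom import parseString  # stdlib stand-in; the module uses defusedxml


def prettify(xml_text: str, indent: int = 2) -> str:
    """Hypothetical helper mirroring the steps of pretty_xml(), for illustration only."""
    document = parseString(xml_text)
    # Fall back to UTF-8 when the XML declaration names no encoding.
    encoding = document.encoding or 'utf-8'
    kwargs = {'encoding': encoding, 'indent': ' ' * indent}
    # toprettyxml() only accepts `standalone` on Python 3.9 and later.
    if document.standalone is not None and sys.version_info >= (3, 9):
        kwargs['standalone'] = document.standalone
    # With an explicit encoding, toprettyxml() returns bytes, hence the decode.
    body = document.toprettyxml(**kwargs).decode(encoding)
    # Drop the whitespace-only lines produced from the source's own line breaks.
    return '\n'.join(line for line in body.splitlines() if line.strip())


result = prettify(
    '<?xml version="1.0" encoding="utf-8"?><root>\n  <item>1</item>\n</root>'
)
print(result)
```

Without the final filtering step, the whitespace text nodes between `<root>` and `<item>` would show up as blank lines in the output, which is why the formatter strips them.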