Simplify get_content_type() (#1125)

We were using the potential `encoding` returned by `mimetypes.guess_type()`
to expand the `Content-Type` header.
According to the RFC-7231 [1] the `Content-Type` should contain a charset
and nothing more. But as stated in the `mimetypes.guess_type()` doc [2],
the `encoding` would be the name of the program used to encode (e.g. compress
or gzip) the payload. The `encoding` is suitable for use as a `Content-Encoding`
header. See [3] for potential `encoding`s, none is a IANA registered one [4], and
so a valid charset to be used by the `Content-Type` header.

[1] https://httpwg.org/specs/rfc7231.html#header.content-type
[2] https://docs.python.org/3/library/mimetypes.html#guess_type
[3] 938e84b4fa/Lib/mimetypes.py (L416-L422)
[4] https://www.iana.org/assignments/character-sets/character-sets.xhtml
This commit is contained in:
Mickaël Schoentgen 2021-08-06 12:35:38 +02:00 committed by GitHub
parent 6633b5ae9b
commit 6bd6648545
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

View File

@ -80,12 +80,7 @@ def get_content_type(filename):
to ``mimetypes``.
"""
mime, encoding = mimetypes.guess_type(filename, strict=False)
if mime:
content_type = mime
if encoding:
content_type = f'{mime}; charset={encoding}'
return content_type
return mimetypes.guess_type(filename, strict=False)[0]
def split_cookies(cookies):