Parser
- class headerparser.HeaderParser(normalizer: Callable[[str], Any] | None = None, body: bool | None = None, **kwargs: Any)
A parser for RFC 822-style header sections. Define the fields the parser should recognize with the add_field() method, configure handling of unrecognized fields with add_additional(), and then parse input with parse() or another parse_*() method.
- Parameters:
normalizer (callable) – By default, the parser will consider two field names to be equal iff their lowercased forms are equal. This can be overridden by setting normalizer to a custom callable that takes a field name and returns a “normalized” name for use in equality testing. The normalizer will also be used when looking up keys in the NormalizedDict instances returned by the parser’s parse_*() methods.
body (bool) – whether the parser should allow or forbid a body after the header section; True means a body is required, False means a body is prohibited, and None (the default) means a body is optional
kwargs – Passed to the Scanner constructor
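The normalizer rule described above can be illustrated in plain Python. This is only a sketch of the comparison rule, not headerparser's implementation; same_field, default_normalizer, and dash_insensitive are hypothetical names introduced for the example.

```python
from typing import Any, Callable


def default_normalizer(name: str) -> str:
    """Mimic the documented default: compare lowercased field names."""
    return name.lower()


def dash_insensitive(name: str) -> str:
    """Hypothetical custom normalizer: also treat '_' and '-' as the same."""
    return name.lower().replace("_", "-")


def same_field(
    a: str, b: str, normalizer: Callable[[str], Any] = default_normalizer
) -> bool:
    """Two field names are equal iff their normalized forms are equal."""
    return normalizer(a) == normalizer(b)


print(same_field("Content-Type", "content-type"))                    # True
print(same_field("Content_Type", "content-type"))                    # False
print(same_field("Content_Type", "content-type", dash_insensitive))  # True
```

A custom normalizer only needs to map every spelling that should count as "the same field" to one common value.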
- add_additional(enable: bool = True, **kwargs: Any) -> None
Specify how the parser should handle fields in the input that were not previously registered with add_field. By default, unknown fields will cause the parse_* methods to raise an UnknownFieldError, but calling this method with enable=True (the default) will change the parser’s behavior so that all unregistered fields are processed according to the options in **kwargs. (If no options are specified, the additional values will just be stored in the result dictionary.)
If this method is called more than once, only the settings from the last call will be used.
Note that additional field values are always stored in the result dictionary using their field name as the key, and two fields are considered the same (for the purposes of multiple) iff their names are the same after normalization. Customization of the dictionary key and field name can only be done through add_field.
Changed in version 0.2.0: action argument added
- Parameters:
enable (bool) – whether the parser should accept input fields that were not registered with add_field; setting this to False disables additional fields and restores the parser’s default behavior
multiple (bool) – If True, each additional header field will be allowed to occur more than once in the input, and each field’s values will be stored in a list. If False (the default), a DuplicateFieldError will be raised if an additional field occurs more than once in the input.
unfold (bool) – If True (default False), additional field values will be “unfolded” (i.e., line breaks will be removed and whitespace around line breaks will be converted to a single space) before applying type
type (callable) – a callable to apply to additional field values before storing them in the result dictionary
choices (iterable) – A sequence of values which additional fields are allowed to have. If choices is defined, all additional field values in the input must have one of the given values (after applying type) or else an InvalidChoiceError is raised.
action (callable) – A callable to invoke whenever the field is encountered in the input. The callable will be passed the current dictionary of header fields, the field’s name, and the field’s value (after processing with type and unfold and checking against choices). The callable replaces the default behavior of storing the field’s values in the result dictionary, and so the callable must explicitly store the values if desired.
- Returns:
None
- Raises:
if enable is true and a previous call to add_field used a custom dest
if choices is an empty sequence
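The order in which unfold, type, and choices are applied to a value can be sketched as follows. This is an illustration in plain Python under stated assumptions, not the library's code: process_value and its do_unfold flag are hypothetical, and ValueError stands in for the library's own exception classes.

```python
import re
from typing import Any, Callable, Iterable, Optional


def unfold(value: str) -> str:
    """Replace each line break, plus the whitespace around it, with one space."""
    return re.sub(r"[ \t]*(?:\r\n|\r|\n)[ \t]*", " ", value.strip())


def process_value(
    value: str,
    *,
    do_unfold: bool = False,
    type: Optional[Callable[[str], Any]] = None,
    choices: Optional[Iterable[Any]] = None,
) -> Any:
    """Apply the documented order: unfold, then type, then the choices check."""
    allowed = None if choices is None else list(choices)
    if allowed is not None and not allowed:
        raise ValueError("choices must not be an empty sequence")
    if do_unfold:
        value = unfold(value)
    result = type(value) if type is not None else value
    if allowed is not None and result not in allowed:
        # Stand-in for the library's InvalidChoiceError.
        raise ValueError(f"{result!r} is not one of {allowed!r}")
    return result


print(process_value("one,\n   two", do_unfold=True))    # one, two
print(process_value("3", type=int, choices=[1, 2, 3]))  # 3
```

Because the choices check runs after type, the allowed values are compared against converted values (here the integer 3, not the string "3").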
- add_field(name: str, *altnames: str, **kwargs: Any) -> None
Define a header field for the parser to parse. During parsing, if a field is encountered whose name (modulo normalization) equals either name or one of the altnames, the field’s value will be processed according to the options in **kwargs. (If no options are specified, the value will just be stored in the result dictionary.)
Changed in version 0.2.0: action argument added
- Parameters:
name (string) – the primary name for the field, used in error messages and as the default value of dest
altnames (strings) – field name synonyms
dest – The key in the result dictionary in which the field’s value(s) will be stored; defaults to name. When additional headers are enabled (see add_additional), dest must equal (after normalization) one of the field’s names.
required (bool) – If True (default False), the parse_* methods will raise a MissingFieldError if the field is not present in the input
default – The value to associate with the field if it is not present in the input. If no default value is specified, the field will be omitted from the result dictionary if it is not present in the input. default cannot be set when the field is required. type, unfold, and action will not be applied to the default value, and the default value need not belong to choices.
multiple (bool) – If True, the header field will be allowed to occur more than once in the input, and all of the field’s values will be stored in a list. If False (the default), a DuplicateFieldError will be raised if the field occurs more than once in the input.
unfold (bool) – If True (default False), the field value will be “unfolded” (i.e., line breaks will be removed and whitespace around line breaks will be converted to a single space) before applying type
type (callable) – a callable to apply to the field value before storing it in the result dictionary
choices (iterable) – A sequence of values which the field is allowed to have. If choices is defined, all occurrences of the field in the input must have one of the given values (after applying type) or else an InvalidChoiceError is raised.
action (callable) – A callable to invoke whenever the field is encountered in the input. The callable will be passed the current dictionary of header fields, the field’s name, and the field’s value (after processing with type and unfold and checking against choices). The callable replaces the default behavior of storing the field’s values in the result dictionary, and so the callable must explicitly store the values if desired. When action is defined for a field, dest cannot be.
- Returns:
None
- Raises:
if another field with the same name or dest was already defined
if dest is not one of the field’s names and add_additional is enabled
if default is defined and required is true
if choices is an empty sequence
if both dest and action are defined
TypeError – if name or one of the altnames is not a string
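How name, altnames, dest, required, default, and multiple interact can be sketched with a small stand-in registry. TinyFields and its collect method are hypothetical and greatly simplified: lookups just lowercase the name, ValueError and KeyError stand in for the library's error classes, and type, unfold, choices, and action are left out.

```python
from typing import Any, Dict, Iterable, List, Tuple

_MISSING = object()  # sentinel: "no default was given"


class TinyFields:
    """Hypothetical, much-reduced stand-in for a field registry."""

    def __init__(self) -> None:
        self._by_name: Dict[str, dict] = {}  # lowercased name -> field spec
        self._specs: List[dict] = []

    def add_field(self, name: str, *altnames: str, dest: Any = None,
                  required: bool = False, default: Any = _MISSING,
                  multiple: bool = False) -> None:
        names = (name, *altnames)
        if not all(isinstance(n, str) for n in names):
            raise TypeError("field names must be strings")
        if required and default is not _MISSING:
            raise ValueError("default cannot be set on a required field")
        keys = [n.lower() for n in names]
        if any(k in self._by_name for k in keys):
            raise ValueError(f"field {name!r} clashes with an existing field")
        spec = {"name": name, "dest": name if dest is None else dest,
                "required": required, "default": default, "multiple": multiple}
        for k in keys:
            self._by_name[k] = spec
        self._specs.append(spec)

    def collect(self, pairs: Iterable[Tuple[str, str]]) -> Dict[Any, Any]:
        result: Dict[Any, Any] = {}
        for name, value in pairs:
            spec = self._by_name[name.lower()]  # KeyError for unknown fields
            if spec["multiple"]:
                result.setdefault(spec["dest"], []).append(value)
            elif spec["dest"] in result:
                raise ValueError(f"duplicate field {spec['name']!r}")
            else:
                result[spec["dest"]] = value
        for spec in self._specs:
            if spec["dest"] in result:
                continue
            if spec["required"]:
                raise ValueError(f"missing required field {spec['name']!r}")
            if spec["default"] is not _MISSING:
                result[spec["dest"]] = spec["default"]
        return result


fields = TinyFields()
fields.add_field("Name", required=True)
fields.add_field("Type", "Kind", dest="kind", default="misc")
fields.add_field("Tag", multiple=True)
print(fields.collect([("name", "demo"), ("TAG", "a"), ("tag", "b")]))
```

In this sketch the absent Type field falls back to its default under the key kind, while the repeated Tag field accumulates into a list.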
- parse(data: str | Iterable[str]) -> NormalizedDict
Added in version 0.4.0.
Parse an RFC 822-style header field section (possibly followed by a message body) from the contents of the given string, filehandle, or sequence of lines and return a dictionary of the header fields (possibly with body attached). If data is an iterable of str, newlines will be appended to lines in multiline header fields where not already present but will not be inserted where missing inside the body.
Changed in version 0.5.0: data can now be a string.
- Parameters:
data – a string, text-file-like object, or iterable of lines to parse
- Return type:
NormalizedDict
- Raises:
ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
ScannerError – if the header section is malformed
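To make "header section possibly followed by a body" concrete, here is a minimal scanning sketch in plain Python. It is an assumption-laden illustration, not the library's Scanner: scan_sketch is a hypothetical name, the body is reported as a pair whose name is None (inferred from the parse_stream signature), and malformed input raises ValueError rather than ScannerError.

```python
from typing import Iterator, Optional, Tuple


def scan_sketch(text: str) -> Iterator[Tuple[Optional[str], str]]:
    """Yield (name, value) pairs, then (None, body) if a blank line is found."""
    lines = text.splitlines(keepends=True)
    name: Optional[str] = None
    value = ""
    index = 0
    while index < len(lines):
        line = lines[index]
        if not line.strip():          # blank line: the header section is over
            break
        if line[0] in " \t":          # continuation of a folded field
            if name is None:
                raise ValueError("continuation line before any field")
            value += line
        else:
            if name is not None:
                yield name, value.strip()
            name, sep, value = line.partition(":")
            if not sep:
                raise ValueError(f"not a header line: {line!r}")
        index += 1
    if name is not None:
        yield name, value.strip()
    if index < len(lines):            # everything after the blank line
        yield None, "".join(lines[index + 1:])


for pair in scan_sketch("Subject: hi\nTo: a,\n  b\n\nBody text\n"):
    print(pair)
```

Folded values keep their internal line break in this sketch, which is why a separate unfolding step is useful before applying a type.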
- parse_next_stanza(iterator: Iterator[str]) -> NormalizedDict
Added in version 0.4.0.
Parse an RFC 822-style header field section from the contents of the given filehandle or iterator of lines and return a dictionary of the header fields. Input processing stops at the end of the header section, leaving the rest of the iterator unconsumed. As a message body is not consumed, calling this method when body is true will produce a MissingBodyError.
Deprecated since version 0.5.0: Instead combine Scanner.scan_next_stanza() with parse_stream()
- Parameters:
iterator – a text-file-like object or iterator of lines to parse
- Return type:
NormalizedDict
- Raises:
ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
ScannerError – if a header section is malformed
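The phrase "leaving the rest of the iterator unconsumed" can be illustrated with a small helper. take_next_stanza is a hypothetical name; skipping leading blank lines and consuming the terminating blank line are assumptions of this sketch, not documented behavior.

```python
from typing import Iterator, List


def take_next_stanza(lines: Iterator[str]) -> List[str]:
    """Read lines up to the blank line that ends the first stanza."""
    stanza: List[str] = []
    for line in lines:
        if line.strip():
            stanza.append(line)
        elif stanza:       # a blank line after some content ends the stanza
            break
    return stanza


source = iter(["A: 1\n", "B: 2\n", "\n", "C: 3\n"])
print(take_next_stanza(source))  # ['A: 1\n', 'B: 2\n']
print(list(source))              # ['C: 3\n']
```

The same iterator can then be handed to another reader, which picks up where the stanza ended.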
- parse_next_stanza_string(s: str) -> tuple[NormalizedDict, str]
Added in version 0.4.0.
Parse an RFC 822-style header field section from the given string and return a pair of a dictionary of the header fields and the rest of the string. As a message body is not consumed, calling this method when body is true will produce a MissingBodyError.
Deprecated since version 0.5.0: Instead combine Scanner.scan_next_stanza() with parse_stream()
- Parameters:
s (string) – the text to parse
- Return type:
pair of NormalizedDict and a string
- Raises:
ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
ScannerError – if a header section is malformed
- parse_stanzas(data: str | Iterable[str]) -> Iterator[NormalizedDict]
Added in version 0.4.0.
Parse zero or more stanzas of RFC 822-style header fields from the given string, filehandle, or sequence of lines and return a generator of dictionaries of header fields.
All of the input is treated as header sections, not message bodies; as a result, calling this method when body is true will produce a MissingBodyError.
Changed in version 0.5.0: data can now be a string.
- Parameters:
data – a string, text-file-like object, or iterable of lines to parse
- Return type:
generator of NormalizedDict
- Raises:
ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
ScannerError – if a header section is malformed
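Treating all of the input as header sections amounts to splitting it on blank lines. The generator below is a plain-Python sketch of that idea, with split_stanzas as a hypothetical name; it assumes that runs of blank lines separate stanzas and never produce empty ones.

```python
from typing import Iterator, List


def split_stanzas(text: str) -> Iterator[List[str]]:
    """Yield each run of non-blank lines as one stanza."""
    stanza: List[str] = []
    for line in text.splitlines():
        if line.strip():
            stanza.append(line)
        elif stanza:
            yield stanza
            stanza = []
    if stanza:             # final stanza without a trailing blank line
        yield stanza


print(list(split_stanzas("A: 1\nB: 2\n\n\nA: 3\n")))
```

Input containing only blank lines yields no stanzas at all, which matches the "zero or more" wording above.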
- parse_stanzas_stream(fields: Iterable[Iterable[tuple[str, str]]]) -> Iterator[NormalizedDict]
Added in version 0.4.0.
Parse an iterable of iterables of (name, value) pairs as returned by scan_stanzas() and return a generator of dictionaries of header fields. This is a low-level method that you will usually not need to call.
- Parameters:
fields – an iterable of iterables of pairs of strings
- Return type:
generator of NormalizedDict
- Raises:
ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
ScannerError – if a header section is malformed
- parse_stanzas_string(s: str) -> Iterator[NormalizedDict]
Added in version 0.4.0.
Parse zero or more stanzas of RFC 822-style header fields from the given string and return a generator of dictionaries of header fields.
All of the input is treated as header sections, not message bodies; as a result, calling this method when body is true will produce a MissingBodyError.
Deprecated since version 0.5.0: Use parse_stanzas() instead.
- Parameters:
s (string) – the text to parse
- Return type:
generator of NormalizedDict
- Raises:
ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
ScannerError – if a header section is malformed
- parse_stream(fields: Iterable[tuple[str | None, str]]) -> NormalizedDict
Process a sequence of (name, value) pairs as returned by scan() and return a dictionary of header fields (possibly with body attached). This is a low-level method that you will usually not need to call.
- Parameters:
fields (iterable of pairs of strings) – a sequence of (name, value) pairs representing the input fields
- Return type:
NormalizedDict
- Raises:
ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
ValueError – if the input contains more than one body pair
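The body rules can be sketched over a stream of pairs. This is an illustration under assumptions, not the library's code: split_fields_and_body is hypothetical, a body is assumed to arrive as a pair whose name is None (suggested by the str | None in the signature), and MissingBody and UnexpectedBody are local stand-ins for the library's own error classes.

```python
from typing import Iterable, List, Optional, Tuple


class MissingBody(ValueError):
    """Stand-in for the library's MissingBodyError."""


class UnexpectedBody(ValueError):
    """Hypothetical error for a body that appears when body=False."""


def split_fields_and_body(
    pairs: Iterable[Tuple[Optional[str], str]],
    body: Optional[bool] = None,
) -> Tuple[List[Tuple[str, str]], Optional[str]]:
    """Separate header pairs from the body pair and enforce the body setting."""
    fields: List[Tuple[str, str]] = []
    body_text: Optional[str] = None
    for name, value in pairs:
        if name is None:
            if body_text is not None:
                raise ValueError("more than one body pair")
            body_text = value
        else:
            fields.append((name, value))
    if body is True and body_text is None:
        raise MissingBody("a body is required")
    if body is False and body_text is not None:
        raise UnexpectedBody("a body is prohibited")
    return fields, body_text


print(split_fields_and_body([("Subject", "hi"), (None, "Hello\n")]))
```

With body=None (the default), both inputs with and without a body pair are accepted.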
- parse_string(s: str) -> NormalizedDict
Parse an RFC 822-style header field section (possibly followed by a message body) from the given string and return a dictionary of the header fields (possibly with body attached).
Deprecated since version 0.5.0: Use parse() instead.
- Parameters:
s (string) – the text to parse
- Return type:
NormalizedDict
- Raises:
ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
ScannerError – if the header section is malformed