Parser

class headerparser.HeaderParser(normalizer=None, body=None)

A parser for RFC 822-style header sections. Define the fields the parser should recognize with the add_field method, configure handling of unrecognized fields with add_additional, and then parse input with parse_file or parse_string.

Parameters:
- normalizer (callable) – By default, the parser will consider two field names to be equal iff their lowercased forms are equal. This can be overridden by setting normalizer to a custom callable that takes a field name and returns a “normalized” name for use in equality testing. The normalizer will also be used when looking up keys in the NormalizedDict instances returned by the parser’s parse_* methods.
- body (bool) – whether the parser should allow or forbid a body after the header section; True means a body is required, False means a body is prohibited, and None (the default) means a body is optional
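
The normalizer contract described above (take a field name, return a normalized name) can be illustrated with a minimal sketch. The function name hyphen_insensitive is hypothetical and not part of headerparser; it ignores case and additionally treats underscores and hyphens as interchangeable.

```python
def hyphen_insensitive(name: str) -> str:
    """Hypothetical normalizer: ignore case and treat '_' like '-'."""
    return name.lower().replace("_", "-")


# Two spellings of the same field normalize to the same name.
assert hyphen_insensitive("Content_Type") == hyphen_insensitive("content-type")

# Assumed usage, based on the constructor signature documented above:
#     parser = headerparser.HeaderParser(normalizer=hyphen_insensitive)
```

Since the same callable is also used for key lookups in the returned dictionaries, a normalizer like this one would presumably make result["content_type"] and result["Content-Type"] refer to the same entry.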

add_additional(enable=True, **kwargs)

Specify how the parser should handle fields in the input that were not previously registered with add_field. By default, unknown fields will cause the parse_* methods to raise an UnknownFieldError, but calling this method with enable=True (the default) will change the parser’s behavior so that all unregistered fields are processed according to the options in **kwargs. (If no options are specified, the additional values will just be stored in the result dictionary.)

If this method is called more than once, only the settings from the last call will be used.

Note that additional field values are always stored in the result dictionary using their field name as the key, and two fields are considered the same (for the purposes of multiple) iff their names are the same after normalization. Customization of the dictionary key and field name can only be done through add_field.

New in version 0.2.0: action argument added

Parameters:
- enable (bool) – whether the parser should accept input fields that were not registered with add_field; setting this to False disables additional fields and restores the parser’s default behavior
- multiple (bool) – If True, each additional header field will be allowed to occur more than once in the input, and each field’s values will be stored in a list. If False (the default), a DuplicateFieldError will be raised if an additional field occurs more than once in the input.
- unfold (bool) – If True (default False), additional field values will be “unfolded” (i.e., line breaks will be removed and whitespace around line breaks will be converted to a single space) before applying type
- type (callable) – a callable to apply to additional field values before storing them in the result dictionary
- choices (iterable) – A sequence of values which additional fields are allowed to have. If choices is defined, all additional field values in the input must have one of the given values (after applying type) or else an InvalidChoiceError is raised.
- action (callable) – A callable to invoke whenever the field is encountered in the input. The callable will be passed the current dictionary of header fields, the field’s name, and the field’s value (after processing with type and unfold and checking against choices). The callable replaces the default behavior of storing the field’s values in the result dictionary, and so the callable must explicitly store the values if desired.

Raises:
- ValueError –
  - if enable is true and a previous call to add_field used a custom dest
  - if choices is an empty sequence
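
To make the action contract concrete, here is a minimal sketch of a callable with the three arguments described above (the current dictionary of header fields, the field’s name, and the field’s value). The name collect_extra and the "extra" key are hypothetical, and a plain dict stands in for the parser’s result dictionary so that the sketch runs without the library.

```python
from typing import Any, MutableMapping


def collect_extra(fields: MutableMapping[str, Any], name: str, value: Any) -> None:
    """Hypothetical action: gather every additional field under one key.

    Because an action replaces the default storing behavior, it has to
    write the value into the dictionary itself.
    """
    fields.setdefault("extra", []).append((name, value))


# Simulate the documented behavior: the action is called once per
# unregistered field and receives the current result dictionary.
result: dict = {}
collect_extra(result, "X-Debug", "on")
collect_extra(result, "X-Trace", "abc123")
print(result)

# Assumed usage, based on the signature documented above:
#     parser.add_additional(action=collect_extra)
```

An action that does not write to the dictionary effectively discards the field, which may be a reasonable way to accept and ignore unknown fields.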

add_field(name, *altnames, **kwargs)

Define a header field for the parser to parse. During parsing, if a field is encountered whose name (modulo normalization) equals either name or one of the altnames, the field’s value will be processed according to the options in **kwargs. (If no options are specified, the value will just be stored in the result dictionary.)

New in version 0.2.0: action argument added

Parameters:
- name (string) – the primary name for the field, used in error messages and as the default value of dest
- altnames (strings) – field name synonyms
- dest – The key in the result dictionary in which the field’s value(s) will be stored; defaults to name. When additional headers are enabled (see add_additional), dest must equal (after normalization) one of the field’s names.
- required (bool) – If True (default False), the parse_* methods will raise a MissingFieldError if the field is not present in the input
- default – The value to associate with the field if it is not present in the input. If no default value is specified, the field will be omitted from the result dictionary if it is not present in the input. default cannot be set when the field is required. type, unfold, and action will not be applied to the default value, and the default value need not belong to choices.
- multiple (bool) – If True, the header field will be allowed to occur more than once in the input, and all of the field’s values will be stored in a list. If False (the default), a DuplicateFieldError will be raised if the field occurs more than once in the input.
- unfold (bool) – If True (default False), the field value will be “unfolded” (i.e., line breaks will be removed and whitespace around line breaks will be converted to a single space) before applying type
- type (callable) – a callable to apply to the field value before storing it in the result dictionary
- choices (iterable) – A sequence of values which the field is allowed to have. If choices is defined, all occurrences of the field in the input must have one of the given values (after applying type) or else an InvalidChoiceError is raised.
- action (callable) – A callable to invoke whenever the field is encountered in the input. The callable will be passed the current dictionary of header fields, the field’s name, and the field’s value (after processing with type and unfold and checking against choices). The callable replaces the default behavior of storing the field’s values in the result dictionary, and so the callable must explicitly store the values if desired. When action is defined for a field, dest cannot be.

Raises:
- ValueError –
  - if another field with the same name or dest was already defined
  - if dest is not one of the field’s names and add_additional is enabled
  - if default is defined and required is true
  - if choices is an empty sequence
  - if both dest and action are defined
- TypeError – if name or one of the altnames is not a string
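
The parameter descriptions above imply an order of operations for a single field value: unfold first, then type, then the check against choices. The following is a rough sketch of that pipeline using only the standard library. The helper names unfold and process_value are hypothetical, the regular expression is one possible reading of “unfolding” rather than the library’s own implementation, and ValueError stands in for InvalidChoiceError.

```python
import re
from typing import Any, Callable, Iterable, Optional


def unfold(value: str) -> str:
    """Remove line breaks, turning surrounding whitespace into one space.

    Illustrative only; not the library's own implementation.
    """
    return re.sub(r"[ \t]*[\r\n]+[ \t\r\n]*", " ", value).strip()


def process_value(
    value: str,
    do_unfold: bool = False,
    convert: Optional[Callable[[str], Any]] = None,
    choices: Optional[Iterable[Any]] = None,
) -> Any:
    """Hypothetical helper mirroring the documented order of operations."""
    if do_unfold:
        value = unfold(value)
    # "convert" plays the role of the documented type option.
    converted = convert(value) if convert is not None else value
    if choices is not None and converted not in list(choices):
        # The parser is documented to raise InvalidChoiceError here;
        # ValueError stands in so the sketch needs no third-party import.
        raise ValueError(f"{converted!r} is not a valid choice")
    return converted


print(process_value("High\n", do_unfold=True, convert=str.lower, choices=["low", "high"]))
print(process_value("first line\n  second line", do_unfold=True))
```

Because choices is checked after conversion, the allowed values should be written in their converted form (for example, integers rather than digit strings when the converter is int).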

parse_file(fp)

Parse an RFC 822-style header field section (possibly followed by a message body) from the contents of the given filehandle and return a dictionary of the header fields (possibly with body attached).

Parameters: fp (file-like object) – the file to parse
Return type: NormalizedDict
Raises:
- ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
- ScannerError – if the header section is malformed

parse_lines(iterable)

Parse an RFC 822-style header field section (possibly followed by a message body) from the given sequence of lines and return a dictionary of the header fields (possibly with body attached). Newlines will be inserted where not already present in multiline header fields but will not be inserted inside the body.

Parameters: iterable (iterable of strings) – a sequence of lines comprising the text to parse
Return type: NormalizedDict
Raises:
- ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
- ScannerError – if the header section is malformed

parse_stream(fields)

Process a sequence of (name, value) pairs as returned by scan_lines() and return a dictionary of header fields (possibly with body attached). This is a low-level method that you will usually not need to call.

Parameters: fields (iterable of pairs of strings) – a sequence of (name, value) pairs representing the input fields
Return type: NormalizedDict
Raises:
- ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
- ValueError – if the input contains more than one body pair
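
The shape of the input expected by parse_stream can be sketched with plain tuples. This example assumes that a body is represented by a pair whose name is None; that convention is not stated in this section and should be checked against the scan_lines() documentation. The helper count_body_pairs is hypothetical.

```python
from typing import Iterable, Optional, Tuple

Pair = Tuple[Optional[str], str]


def count_body_pairs(fields: Iterable[Pair]) -> int:
    """Count pairs whose name is None (assumed here to mark the body)."""
    return sum(1 for name, _ in fields if name is None)


pairs = [
    ("Subject", "Hello"),
    ("To", "me@example.com"),
    (None, "Message text\n"),  # assumed representation of the body
]

# Mirror the documented ValueError condition before handing pairs to a parser.
if count_body_pairs(pairs) > 1:
    raise ValueError("more than one body pair")
print(count_body_pairs(pairs))
```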

parse_string(s)

Parse an RFC 822-style header field section (possibly followed by a message body) from the given string and return a dictionary of the header fields (possibly with body attached).

Parameters: s (string) – the text to parse
Return type: NormalizedDict
Raises:
- ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
- ScannerError – if the header section is malformed