Parser

class headerparser.HeaderParser(normalizer=None, body=None)

A parser for RFC 822-style header sections. Define the fields the parser should recognize with the add_field method, configure handling of unrecognized fields with add_additional, and then parse input with parse_file or parse_string.

Parameters:
  • normalizer (callable) – By default, the parser will consider two field names to be equal iff their lowercased forms are equal. This can be overridden by setting normalizer to a custom callable that takes a field name and returns a “normalized” name for use in equality testing. The normalizer will also be used when looking up keys in the NormalizedDict instances returned by the parser’s parse_* methods.
  • body (bool) – whether the parser should allow or forbid a body after the header section; True means a body is required, False means a body is prohibited, and None (the default) means a body is optional
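
As an illustration of the normalizer parameter described above, here is a minimal sketch of a custom normalizer. The function name normalize_name and the rule it applies (treating underscores and hyphens as equivalent, on top of lowercasing) are assumptions made for this example, not something the library provides. Such a callable would be passed as HeaderParser(normalizer=normalize_name).

```python
def normalize_name(name: str) -> str:
    """Hypothetical normalizer: ignore case and treat '_' like '-'."""
    return name.lower().replace("_", "-")


# Under the default rule (plain lowercasing) these two names differ ...
print("Content_Type".lower() == "content-type".lower())  # False

# ... but under the custom normalizer they are considered the same field.
print(normalize_name("Content_Type") == normalize_name("content-type"))  # True
```

Because the same callable is used for key lookups in the returned NormalizedDict, a result parsed with this normalizer should be retrievable under either spelling.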
add_additional(enable=True, **kwargs)

Specify how the parser should handle fields in the input that were not previously registered with add_field. By default, unknown fields will cause the parse_* methods to raise an UnknownFieldError, but calling this method with enable=True (the default) will change the parser’s behavior so that all unregistered fields are processed according to the options in **kwargs. (If no options are specified, the additional values will just be stored in the result dictionary.)

If this method is called more than once, only the settings from the last call will be used.

Note that additional field values are always stored in the result dictionary using their field name as the key, and two fields are considered the same (for the purposes of multiple) iff their names are the same after normalization. Customization of the dictionary key and field name can only be done through add_field.

New in version 0.2.0: action argument added

Parameters:
  • enable (bool) – whether the parser should accept input fields that were not registered with add_field; setting this to False disables additional fields and restores the parser’s default behavior
  • multiple (bool) – If True, each additional header field will be allowed to occur more than once in the input, and each field’s values will be stored in a list. If False (the default), a DuplicateFieldError will be raised if an additional field occurs more than once in the input.
  • unfold (bool) – If True (default False), additional field values will be “unfolded” (i.e., line breaks will be removed and whitespace around line breaks will be converted to a single space) before applying type
  • type (callable) – a callable to apply to additional field values before storing them in the result dictionary
  • choices (iterable) – A sequence of values which additional fields are allowed to have. If choices is defined, all additional field values in the input must have one of the given values (after applying type) or else an InvalidChoiceError is raised.
  • action (callable) – A callable to invoke whenever the field is encountered in the input. The callable will be passed the current dictionary of header fields, the field’s name, and the field’s value (after processing with type and unfold and checking against choices). The callable replaces the default behavior of storing the field’s values in the result dictionary, and so the callable must explicitly store the values if desired.
Returns:

None

Raises:

ValueError

  • if enable is true and a previous call to add_field used a custom dest
  • if choices is an empty sequence
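
The order in which unfold, type, and choices are applied can be easier to follow as code. The sketch below is a plain-Python approximation of that order as described in the parameter list; it is not the library's implementation. The helper names unfold and process_value are hypothetical, the exact whitespace handling is an assumption, and ValueError stands in for InvalidChoiceError.

```python
import re


def unfold(value: str) -> str:
    """Approximate unfolding: a line break plus surrounding spaces/tabs -> one space."""
    return re.sub(r"[ \t]*[\r\n]+[ \t]*", " ", value)


def process_value(raw, do_unfold=False, convert=None, choices=None):
    """Apply the options in the documented order: unfold, then type, then choices."""
    value = unfold(raw) if do_unfold else raw
    if convert is not None:  # corresponds to the `type` option
        value = convert(value)
    if choices is not None and value not in choices:
        # Stand-in for the library's InvalidChoiceError
        raise ValueError(f"{value!r} is not a valid choice")
    return value


print(process_value("one,\n  two,\n\tthree", do_unfold=True))  # one, two, three
print(process_value("42", convert=int, choices=[41, 42]))      # 42
```

Note that the choices check sees the converted value, so choices must be expressed in terms of what type returns (integers here, not strings).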

add_field(name, *altnames, **kwargs)

Define a header field for the parser to parse. During parsing, if a field is encountered whose name (modulo normalization) equals either name or one of the altnames, the field’s value will be processed according to the options in **kwargs. (If no options are specified, the value will just be stored in the result dictionary.)

New in version 0.2.0: action argument added

Parameters:
  • name (string) – the primary name for the field, used in error messages and as the default value of dest
  • altnames (strings) – field name synonyms
  • dest – The key in the result dictionary in which the field’s value(s) will be stored; defaults to name. When additional headers are enabled (see add_additional), dest must equal (after normalization) one of the field’s names.
  • required (bool) – If True (default False), the parse_* methods will raise a MissingFieldError if the field is not present in the input
  • default – The value to associate with the field if it is not present in the input. If no default value is specified, the field will be omitted from the result dictionary if it is not present in the input. default cannot be set when the field is required. type, unfold, and action will not be applied to the default value, and the default value need not belong to choices.
  • multiple (bool) – If True, the header field will be allowed to occur more than once in the input, and all of the field’s values will be stored in a list. If False (the default), a DuplicateFieldError will be raised if the field occurs more than once in the input.
  • unfold (bool) – If True (default False), the field value will be “unfolded” (i.e., line breaks will be removed and whitespace around line breaks will be converted to a single space) before applying type
  • type (callable) – a callable to apply to the field value before storing it in the result dictionary
  • choices (iterable) – A sequence of values which the field is allowed to have. If choices is defined, all occurrences of the field in the input must have one of the given values (after applying type) or else an InvalidChoiceError is raised.
  • action (callable) – A callable to invoke whenever the field is encountered in the input. The callable will be passed the current dictionary of header fields, the field’s name, and the field’s value (after processing with type and unfold and checking against choices). The callable replaces the default behavior of storing the field’s values in the result dictionary, and so the callable must explicitly store the values if desired. When action is defined for a field, dest cannot be.
Returns:

None

Raises:
  • ValueError
    • if another field with the same name or dest was already defined
    • if dest is not one of the field’s names and add_additional is enabled
    • if default is defined and required is true
    • if choices is an empty sequence
    • if both dest and action are defined
  • TypeError – if name or one of the altnames is not a string
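
To make the action contract concrete, here is a minimal sketch of a callable with the three-argument shape described above (current dictionary, field name, processed value). The function collect_tags and the "tags" key are hypothetical, and a plain dict stands in for the dictionary the parser would pass. Since an action replaces the default storage step, the callable stores the value itself.

```python
def collect_tags(fields, name, value):
    """Hypothetical action: split a comma-separated value and accumulate it."""
    tags = [part.strip() for part in value.split(",") if part.strip()]
    # Nothing is stored unless the action does it explicitly.
    fields.setdefault("tags", []).extend(tags)


# Simulate the parser invoking the action once per occurrence of the field.
result = {}
collect_tags(result, "Tag", "alpha, beta")
collect_tags(result, "Tag", "gamma")
print(result)  # {'tags': ['alpha', 'beta', 'gamma']}
```

With the real parser, a callable like this would be registered via add_field("Tag", multiple=True, action=collect_tags), leaving dest unset because dest and action cannot be combined.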
parse_file(fp)

Parse an RFC 822-style header field section (possibly followed by a message body) from the contents of the given filehandle and return a dictionary of the header fields (possibly with body attached).

Parameters:

fp (file-like object) – the file to parse

Return type:

NormalizedDict

Raises:
parse_lines(iterable)

Parse an RFC 822-style header field section (possibly followed by a message body) from the given sequence of lines and return a dictionary of the header fields (possibly with body attached). Newlines will be inserted where not already present in multiline header fields but will not be inserted inside the body.

Parameters:

iterable (iterable of strings) – a sequence of lines comprising the text to parse

Return type:

NormalizedDict

Raises:
parse_stream(fields)

Process a sequence of (name, value) pairs as returned by scan_lines() and return a dictionary of header fields (possibly with body attached). This is a low-level method that you will usually not need to call.

Parameters:

fields (iterable of pairs of strings) – a sequence of (name, value) pairs representing the input fields

Return type:

NormalizedDict

Raises:
parse_string(s)

Parse an RFC 822-style header field section (possibly followed by a message body) from the given string and return a dictionary of the header fields (possibly with body attached).

Parameters:

s (string) – the text to parse

Return type:

NormalizedDict

Raises: