Scanner

headerparser.scan_file(fp)

Scan a file for RFC 822-style header fields and return a generator of (name, value) pairs for each header field in the input, plus a (None, body) pair representing the body (if any) after the header section.

See scan_lines() for more information on the exact behavior of the scanner.

Parameters:

fp – A file-like object that can be iterated over to produce lines to pass to scan_lines(). Opening the file in universal newlines mode is recommended.

Return type:

generator of pairs of strings

Raises:
  • MalformedHeaderError – if an invalid header line, i.e., a line without either a colon or leading whitespace, is encountered
  • UnexpectedFoldingError – if a folded (indented) line that is not preceded by a valid header line is encountered
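
To illustrate the recommendation about universal newlines mode, here is a minimal sketch using only the standard library. It shows what a file-like object opened this way yields when iterated. The sample bytes and variable names are invented for the example. In Python 3, open() in text mode applies the same newline=None translation by default.

```python
import io

# Made-up sample input with CR LF line endings (an assumption for this example).
raw = b"Subject: Hello\r\nTo: me\r\n\r\nBody text\r\n"

# newline=None enables universal newlines: CR, LF, and CR LF are all
# translated to "\n" as the lines are read.
fp = io.TextIOWrapper(io.BytesIO(raw), encoding="utf-8", newline=None)

# Iterating over the file-like object produces one string per line; these are
# the lines that a call such as headerparser.scan_file(fp) would consume.
lines = list(fp)
print(lines)
```

Because every line ending has already been normalized by the time the scanner sees it, the input behaves the same regardless of the platform that produced the file.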

headerparser.scan_lines(iterable)

Scan an iterable of lines for RFC 822-style header fields and return a generator of (name, value) pairs for each header field in the input, plus a (None, body) pair representing the body (if any) after the header section.

Each field value is a single string, the concatenation of one or more lines, with leading whitespace on lines after the first preserved. The ending of each line is converted to '\n' (added if there is no ending), and the last line of the field value has its trailing line ending (if any) removed.

Note

“Line ending” here means a CR, LF, or CR LF sequence at the end of one of the lines in iterable. Unicode line separators, along with line endings occurring in the middle of a line, are not treated as line endings and are not trimmed or converted to \n.

All lines after the first blank line are concatenated and yielded as-is in a (None, body) pair. (Note that body lines which do not end with a line terminator will not have one appended.) If there is no empty line in iterable, then no body pair is yielded. If the empty line is the last line in iterable, the body will be the empty string. If the empty line is the first line in iterable, then all other lines will be treated as part of the body and will not be scanned for header fields.

Parameters:

iterable – an iterable of strings representing lines of input

Return type:

generator of pairs of strings

Raises:
  • MalformedHeaderError – if an invalid header line, i.e., a line without either a colon or leading whitespace, is encountered
  • UnexpectedFoldingError – if a folded (indented) line that is not preceded by a valid header line is encountered
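
The rules above can be approximated in a few lines of plain Python. The following is a rough sketch of the documented behavior, not the library's implementation: sketch_scan_lines, strip_line_ending, and SketchScanError are hypothetical names. Trimming whitespace around the colon, and treating a whitespace-only line as a folded line, are assumptions the text above does not spell out.

```python
class SketchScanError(ValueError):
    """Hypothetical stand-in for the library's scanner exceptions."""


def strip_line_ending(line):
    """Remove a single trailing CR LF, LF, or CR, if present."""
    for ending in ("\r\n", "\n", "\r"):
        if line.endswith(ending):
            return line[: -len(ending)]
    return line


def sketch_scan_lines(iterable):
    """Yield (name, value) pairs, then (None, body) if a blank line is seen."""
    lines = iter(iterable)
    name = None
    value = ""
    for raw_line in lines:
        line = strip_line_ending(raw_line)
        if line == "":
            # Blank line: the header section is over and everything that
            # remains is joined as-is to form the body.
            if name is not None:
                yield (name, value)
            yield (None, "".join(lines))
            return
        if line[0] in " \t":
            # Folded line: only valid as a continuation of a header field.
            if name is None:
                raise SketchScanError(f"unexpected folding: {line!r}")
            value += "\n" + line
        elif ":" in line:
            if name is not None:
                yield (name, value)
            name, _, value = line.partition(":")
            # Assumption: whitespace around the colon is not part of the field.
            name = name.rstrip(" \t")
            value = value.lstrip(" \t")
        else:
            raise SketchScanError(f"malformed header line: {line!r}")
    if name is not None:
        yield (name, value)


sample = ["Subject: Hello\r\n", "  world\n", "To: me\n", "\n", "Body 1\n", "Body 2"]
pairs = list(sketch_scan_lines(sample))
print(pairs)
```

Note how the folded line keeps its leading whitespace in the field value, and how the final body line is passed through without a line terminator being appended.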

headerparser.scan_string(s)

Scan a string for RFC 822-style header fields and return a generator of (name, value) pairs for each header field in the input, plus a (None, body) pair representing the body (if any) after the header section.

See scan_lines() for more information on the exact behavior of the scanner.

Parameters:

s – a string which will be broken into lines on CR, LF, and CR LF boundaries and passed to scan_lines()

Return type:

generator of pairs of strings

Raises:
  • MalformedHeaderError – if an invalid header line, i.e., a line without either a colon or leading whitespace, is encountered
  • UnexpectedFoldingError – if a folded (indented) line that is not preceded by a valid header line is encountered
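
Splitting only on CR, LF, and CR LF differs from Python's built-in str.splitlines(), which also breaks on Unicode line separators such as U+2028. The sketch below shows one way such a split might be done. split_ascii_lines is a hypothetical helper written for this example and is not claimed to be how the library does it.

```python
import re

# One line is either any run of non-newline characters followed by a single
# CR LF, CR, or LF, or a final non-empty run with no line ending at all.
_LINE_RE = re.compile(r"[^\r\n]*(?:\r\n|\r|\n)|[^\r\n]+")


def split_ascii_lines(s):
    """Split s on CR, LF, and CR LF only, keeping the line endings."""
    return [m.group(0) for m in _LINE_RE.finditer(s)]


text = "A: 1\r\nB: 2\rC: 3\n\nbody\u2028more"

ascii_lines = split_ascii_lines(text)
builtin_lines = text.splitlines(keepends=True)

print(ascii_lines)
print(builtin_lines)
```

With the helper, the U+2028 character stays inside the last line, whereas str.splitlines() splits there and produces an extra line.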