PDF Parsers

Parsers for extracting routine data from PDF files.

Exam Routine Parser

class routinepy.lib.scraper.parsers.pdf.exam_routine.BaseExamPdfParser

Factory class that routes exam table extractions to program-specific extractors.

This class provides the main interface for extracting raw exam tables from a PDF into a workable list for further processing.

Note

Each program code needs its own parser implementation.

See also

For an example implementation, see the source code of Program006ExamPdfParser.
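
The routing described above can be illustrated with a minimal, self-contained sketch. It is not the library's code: DemoProgramCode, DemoProgram006Parser, DemoExamPdfParser, the registry dictionary, the enum values, and the NotImplementedError raised for unsupported codes are all assumptions made for illustration.

```python
from enum import Enum
from pathlib import Path


class DemoProgramCode(Enum):
    # Hypothetical stand-in for routinepy.lib.api.enums.ProgramCode.
    # Only the member name CSE_DAY appears in the documentation; the
    # values and the OTHER member are assumed.
    CSE_DAY = "006"
    OTHER = "001"


class DemoProgram006Parser:
    """Hypothetical stand-in for a program-specific parser."""

    @staticmethod
    def extract_raw_tables(path: Path) -> list:
        # A real parser would open the PDF at `path`; this stub returns
        # a fixed header row so the sketch runs without a PDF file.
        return [["Course", "Date", "Time"]]


class DemoExamPdfParser:
    """Hypothetical factory that routes to a program-specific parser."""

    # Assumed registry mapping program codes to parser classes.
    _PARSERS = {DemoProgramCode.CSE_DAY: DemoProgram006Parser}

    @staticmethod
    def extract_raw_tables(program_code: DemoProgramCode, path: Path) -> list:
        parser = DemoExamPdfParser._PARSERS.get(program_code)
        if parser is None:
            # Assumed behaviour; the real library may fail differently.
            raise NotImplementedError(
                f"No exam PDF parser for {program_code.name}"
            )
        return parser.extract_raw_tables(path)


rows = DemoExamPdfParser.extract_raw_tables(
    DemoProgramCode.CSE_DAY, Path("routine.pdf")
)
print(rows)
```

Under this design, supporting another program would only require writing a new parser class and adding one entry to the registry; callers would keep using the same static method.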

static extract_raw_tables(program_code: ProgramCode, path: Path) → list

Extracts and returns cleaned raw table data from a PDF file based on the specified program code.

This method selects an appropriate parser for the given program code and uses it to extract table data from the provided PDF file.

Parameters:
  • program_code (ProgramCode) – The program code (e.g., ‘006’, ‘001’) that determines which parser to use.

  • path (Path) – Path to the PDF file to parse.

Returns:

A list of raw table data extracted from the PDF.

Return type:

list

Note

  • Each program code needs its own parser implementation.

Warning

  • Currently, only routinepy.lib.api.enums.ProgramCode.CSE_DAY is supported, via Program006ExamPdfParser.

See also

For an example implementation, see the source code of Program006ExamPdfParser.
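
The documentation does not say what "cleaned" means or what shape the returned list has. As a rough sketch only, assuming each table is a list of rows and each row is a list of cell strings or None (a common shape for PDF table extraction), a cleaning step might look like the following. The function clean_raw_table and the sample rows are hypothetical and not part of routinepy.

```python
def clean_raw_table(table):
    """Normalise cells and drop rows that contain no text.

    Assumes `table` is a list of rows, each a list of str or None.
    """
    cleaned = []
    for row in table:
        # Treat missing cells as empty strings and trim whitespace.
        cells = [(cell or "").strip() for cell in row]
        # Keep the row only if at least one cell has content.
        if any(cells):
            cleaned.append(cells)
    return cleaned


# Made-up sample data, for illustration only.
raw = [
    ["  CSE 101 ", "2024-01-05", None],
    [None, "", "   "],
    ["CSE 102", " 2024-01-07", "10:00 "],
]
print(clean_raw_table(raw))
# [['CSE 101', '2024-01-05', ''], ['CSE 102', '2024-01-07', '10:00']]
```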