PDF Parsers
For parsing routine data from PDF files.
Exam Routine Parser
- class routinepy.lib.scraper.parsers.pdf.exam_routine.BaseExamPdfParser
Factory class that routes exam table extractions to program-specific extractors.
This class provides the main interface for extracting raw exam tables from a PDF into a list that later processing steps can work with.
Note
Specific program codes need to have their own implementation.
See also
For an example implementation, see the source code of
Program006ExamPdfParser.
- static extract_raw_tables(program_code: ProgramCode, path: Path) → list
Extracts and returns cleaned raw table data from a PDF file based on the specified program code.
This method selects an appropriate parser for the given program code and uses it to extract table data from the provided PDF file.
- Parameters:
program_code (ProgramCode) – The program code (e.g., ‘006’, ‘001’) to determine the parser to use.
path (Path) – The file path to the PDF file to be parsed.
- Returns:
A list of raw table data extracted from the PDF.
- Return type:
list
Note
Specific program codes need to have their own implementation.
Warning
Only routinepy.lib.api.enums.ProgramCode.CSE_DAY, using Program006ExamPdfParser, is currently supported.
See also
For an example implementation, see the source code of
Program006ExamPdfParser.
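The routing described above can be illustrated with a minimal, self-contained sketch. It does not reproduce routinepy's actual implementation: the `ProgramCode` stand-in and its values, `Program006Parser`, `ExamParserFactory`, the stubbed table data, and the choice of `NotImplementedError` for unsupported codes are all assumptions made for illustration.

```python
from enum import Enum
from pathlib import Path


class ProgramCode(Enum):
    """Stand-in for routinepy.lib.api.enums.ProgramCode (member values are assumed)."""
    CSE_DAY = "006"
    OTHER = "001"  # hypothetical member with no parser registered


class Program006Parser:
    """Hypothetical program-specific extractor."""

    @staticmethod
    def extract(path: Path) -> list:
        # A real extractor would read the PDF; this stub returns a fixed table.
        return [["Course", "Date"], [f"parsed from {path.name}", "TBD"]]


class ExamParserFactory:
    """Hypothetical factory mirroring the role of BaseExamPdfParser."""

    # Registry mapping each supported program code to its extractor.
    _PARSERS = {ProgramCode.CSE_DAY: Program006Parser}

    @staticmethod
    def extract_raw_tables(program_code: ProgramCode, path: Path) -> list:
        parser = ExamParserFactory._PARSERS.get(program_code)
        if parser is None:
            # Assumed behaviour: fail loudly for codes without an implementation.
            raise NotImplementedError(f"No exam PDF parser for {program_code}")
        return parser.extract(path)


tables = ExamParserFactory.extract_raw_tables(ProgramCode.CSE_DAY, Path("exam.pdf"))
```

With a registry like this, supporting another program code would mean writing its extractor and adding one entry to the mapping, which matches the note that each program code needs its own implementation.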