pod5_view

class Field(expr: Expr, docs: str)

Bases: tuple

Container class for storing the polars expression for a named field

property docs

Alias for field number 1

property expr

Alias for field number 0
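The "Alias for field number N" entries are the text Python generates for named-tuple attributes, which suggests that Field is a named tuple with expr at index 0 and docs at index 1. The following is a minimal sketch under that assumption. `FieldSketch` is a hypothetical stand-in, and a plain string replaces the polars expression so the example runs without polars installed.

```python
from typing import Any, NamedTuple


class FieldSketch(NamedTuple):
    """Hypothetical stand-in for Field; not the pod5 implementation."""

    expr: Any  # a polars expression in pod5; a plain string in this sketch
    docs: str


field = FieldSketch(expr="col('read_id')", docs="Read identifier")

# Attribute access and positional access refer to the same values.
same_expr = field.expr == field[0]
same_docs = field.docs == field[1]
```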

assert_unique_acquisition_id(run_info: LazyFrame, path: Path) -> None

Check that the acquisition ids are unique, raising AssertionError otherwise

format_view_table(lazyframe: LazyFrame, path: Path, selected_fields: Set[str]) -> LazyFrame

Format the view table based on the selected fields

get_field_or_raise(key: str) -> Field

Get the Field for this key or raise a KeyError
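A lookup of this kind can be sketched as a dictionary access that re-raises KeyError with a more helpful message. The registry `FIELDS_SKETCH`, its single entry, and the error wording are hypothetical; the real field names and expressions are defined by pod5.

```python
from typing import Dict, NamedTuple


class FieldSketch(NamedTuple):
    """Hypothetical stand-in for Field."""

    expr: str
    docs: str


# Hypothetical registry of known fields, keyed by field name.
FIELDS_SKETCH: Dict[str, FieldSketch] = {
    "read_id": FieldSketch("col('read_id')", "Read identifier"),
}


def get_field_or_raise_sketch(key: str) -> FieldSketch:
    """Return the registered field for key, or raise KeyError with a hint."""
    try:
        return FIELDS_SKETCH[key]
    except KeyError:
        known = ", ".join(sorted(FIELDS_SKETCH))
        raise KeyError(f"Unknown field {key!r}; known fields: {known}") from None
```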

get_reads_tables(path: Path, selected_fields: Set[str], threshold: int = 100000) -> Generator[LazyFrame, None, None]

Generate lazy dataframes from pod5 records. If the number of records is greater than threshold, yield chunks to limit memory consumption and improve overall performance

join_reads_to_run_info(reads: LazyFrame, run_info: LazyFrame) -> LazyFrame

Join the reads and run_info tables

join_workers(processes: List[SpawnProcess], exceptions: JoinableQueue) -> None

Poll the worker processes, checking the exceptions queue for errors raised by any of them

launch_view_workers(paths: Set[Path], output: Path, selection: Set[str], separator: str, num_workers: int)

main()

parse_read_table_chunks(reader: Reader, approx_size: int = 99999) -> Generator[LazyFrame, None, None]

Read record batches and yield polars LazyFrames of approximately approx_size records. Records are yielded in units of whole batches of the underlying table
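Yielding in units of whole batches means a chunk can exceed approx_size, because a batch is never split. The sketch below shows one plausible reading of that behaviour using plain lists in place of record batches and LazyFrames. The function name and the "emit once the target is reached" rule are assumptions, not the pod5 implementation.

```python
from typing import Iterable, Iterator, List, Sequence


def chunk_whole_batches(
    batches: Iterable[Sequence[int]], approx_size: int
) -> Iterator[List[int]]:
    """Yield lists of records that are always built from whole batches."""
    chunk: List[int] = []
    for batch in batches:
        chunk.extend(batch)
        # Emit once the target is reached; the current batch is never split.
        if len(chunk) >= approx_size:
            yield chunk
            chunk = []
    # Emit whatever remains after the last batch.
    if chunk:
        yield chunk


batches = [[1, 2, 3], [4, 5], [6, 7, 8, 9], [10]]
chunks = list(chunk_whole_batches(batches, approx_size=4))
# chunks == [[1, 2, 3, 4, 5], [6, 7, 8, 9], [10]]
```

The first chunk holds five records rather than four because the second batch is kept intact.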

parse_reads_table_all(reader: Reader) -> LazyFrame

Parse all records in the reads table returning a polars LazyFrame

parse_reads_table_batch(reader: Reader, batch_index: int) -> Tuple[LazyFrame, int]

Parse the reads table record batch at batch_index from a pod5 file returning a polars LazyFrame and the number of records in it

parse_run_info_table(reader: Reader) -> LazyFrame

Parse the run info table from a pod5 file returning a polars LazyFrame

print_fields()

Print a list of the available columns

resolve_output(output: Optional[Path], force_overwrite: bool) -> Optional[Path]

Resolve the output path if necessary, checking against accidental overwrite and resolving to the default output if given a path
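The overwrite check can be sketched with pathlib as follows. Several details here are assumptions made for illustration: that None passes through unchanged, that the default output applies when the path is a directory, the file name in `DEFAULT_NAME_SKETCH`, and the use of FileExistsError.

```python
from pathlib import Path
from typing import Optional

DEFAULT_NAME_SKETCH = "view.txt"  # hypothetical default file name


def resolve_output_sketch(
    output: Optional[Path], force_overwrite: bool
) -> Optional[Path]:
    """Return a usable output path, refusing to overwrite unless forced."""
    if output is None:
        # Assumption: no path means there is nothing to resolve.
        return None
    if output.is_dir():
        # Assumption: a directory resolves to a default file inside it.
        output = output / DEFAULT_NAME_SKETCH
    if output.exists() and not force_overwrite:
        raise FileExistsError(f"{output} exists and force_overwrite is not set")
    return output
```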

select_fields(*, group_read_id: bool = False, include: Optional[str] = None, exclude: Optional[str] = None) -> Set[str]

Select fields to write

view_pod5(inputs: List[Path], output: Path, separator: str = '\t', recursive: bool = False, force_overwrite: bool = False, list_fields: bool = False, no_header: bool = False, threads: int = 2, **kwargs) -> None

Given a list of POD5 files write a table to view their contents

worker_process(paths: JoinableQueue, exceptions: JoinableQueue, lock: Lock, output: Path, separator: bool, selection: Set[str]) -> None

Consume pod5 paths from the paths queue, parse the records, and write them to output after acquiring the lock. Returns when the finish sentinel None is received on the paths queue
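The sentinel-terminated consumer pattern described here can be sketched as follows. The signatures above indicate that pod5 uses multiprocessing processes and JoinableQueue; this sketch substitutes threads and queue.Queue to stay short and portable, and a list stands in for the output file. All names are hypothetical.

```python
import queue
import threading
from typing import List, Optional


def worker_sketch(
    paths: "queue.Queue[Optional[str]]",
    lock: threading.Lock,
    output: List[str],
) -> None:
    """Consume paths until the None sentinel arrives."""
    while True:
        path = paths.get()
        try:
            if path is None:
                return  # sentinel received: this worker is finished
            record = f"parsed:{path}"  # stand-in for parsing a pod5 file
            with lock:  # serialise writes to the shared output
                output.append(record)
        finally:
            paths.task_done()


paths: "queue.Queue[Optional[str]]" = queue.Queue()
results: List[str] = []
lock = threading.Lock()
workers = [
    threading.Thread(target=worker_sketch, args=(paths, lock, results))
    for _ in range(2)
]
for worker in workers:
    worker.start()
for name in ("a.pod5", "b.pod5", "c.pod5"):
    paths.put(name)
for _ in workers:
    paths.put(None)  # one sentinel per worker
for worker in workers:
    worker.join()
```

One sentinel is queued per worker so that every consumer sees its own stop signal.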

write(ldf: LazyFrame, output: Optional[Path], separator: str = '\t') -> None

Write the polars.LazyFrame

write_header(output: Optional[Path], selected: Set[str], separator: str = '\t') -> None

Write the header line