pod5_convert_from_fast5

Tool for converting fast5 files to the pod5 format

class OutputHandler(output_root: Path, one_to_one: Optional[Path], force_overwrite: bool)[source]

Bases: object

Class for managing p5.Writer handles

__init__(output_root: Path, one_to_one: Optional[Path], force_overwrite: bool)[source]
close_all()[source]

Close all open writers

get_writer(input_path: Path) Writer[source]

Get a Pod5Writer to write data from the input_path

static resolve_one_to_one_path(path: Path, root: Path, relative_root: Path)[source]

Find the relative path between the input path and the relative root

static resolve_output_path(path: Path, root: Path, relative_root: Optional[Path]) Path[source]

Resolve the output path. If relative_root is a path, resolve the relative output path under root. Otherwise, the output is root itself, or a new file within root if root is a directory.
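The resolution rules above can be sketched with pathlib alone. This is a minimal illustration, not the library's implementation: the function name sketch_resolve_output_path, the default_name parameter with its value "output.pod5", and the switch to a .pod5 suffix in the one-to-one case are all assumptions made for the example.

```python
from pathlib import Path
from typing import Optional


def sketch_resolve_output_path(
    path: Path,
    root: Path,
    relative_root: Optional[Path],
    default_name: str = "output.pod5",  # assumption: hypothetical default file name
) -> Path:
    """Hypothetical restatement of the documented resolution rules."""
    if relative_root is not None:
        # One-to-one mode: mirror the input's position relative to
        # relative_root under root. The .pod5 suffix swap is an assumption.
        return root / path.relative_to(relative_root).with_suffix(".pod5")
    if root.is_dir():
        # root is an existing directory: place a new file inside it.
        return root / default_name
    # Otherwise root itself is treated as the output file.
    return root
```

In this sketch, Path.relative_to raises ValueError when path does not lie under relative_root, so a caller would need to handle inputs that fall outside the one-to-one root.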

set_input_complete(input_path: Path) None[source]

Close the Pod5Writer for associated input_path

class QueueManager(context: SpawnContext, inputs: List[Path], threads: int, timeout: float)[source]

Bases: object

__init__(context: SpawnContext, inputs: List[Path], threads: int, timeout: float) None[source]

Manager for balancing work queues

await_data() Tuple[Optional[Path], Optional[Union[List[CompressedRead], int]]][source]

Await compressed reads, or the total count of reads compressed (file end), for an input filepath. Enqueues the next request if necessary.

await_request() None[source]

Await a request for data

enqueue_data(path: Optional[Path], reads: Optional[Union[List[CompressedRead], int]]) None[source]

Enqueues an input path and either a list of compressed reads to be written, or the total count of reads converted for that path. Otherwise, if path is None, mark the child process as being empty.
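The three message shapes described above (a list of reads, an integer total marking the end of a file, and a None path marking an empty child process) can be sketched with a plain queue.Queue. This is only an illustration of the protocol: sketch_drain is a hypothetical helper, strings stand in for CompressedRead objects, and the real QueueManager is built on a multiprocessing context rather than queue.Queue.

```python
import queue
from pathlib import Path
from typing import Dict, List, Optional, Tuple, Union

# Strings stand in for CompressedRead objects in this sketch.
Item = Tuple[Optional[Path], Optional[Union[List[str], int]]]


def sketch_drain(
    data_queue: "queue.Queue[Item]", workers: int
) -> Tuple[Dict[Path, int], Dict[Path, int]]:
    """Consume items until every worker has reported that it is empty."""
    written: Dict[Path, int] = {}    # reads received per input path
    completed: Dict[Path, int] = {}  # final totals reported per input path
    active = workers
    while active > 0:
        path, payload = data_queue.get()
        if path is None:
            # A None path marks one child process as having no more work.
            active -= 1
        elif isinstance(payload, list):
            # A batch of reads to be written for this input path.
            written[path] = written.get(path, 0) + len(payload)
        elif isinstance(payload, int):
            # An integer is the total read count: the input is finished.
            completed[path] = payload
    return written, completed
```

A consumer built this way can compare the running count against the reported total to confirm that every read for an input was received before closing its writer.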

enqueue_exception(path: Path, exception: Exception, trace: str) None[source]
enqueue_input(path: Path) None[source]

Enqueue an input path

enqueue_request() None[source]
get_exception() Optional[Tuple[Path, Exception, str]][source]

Promptly get an exception, if any

get_input() Optional[Path][source]

Promptly get an input, if any, returning None if the queue is empty

shutdown() Tuple[int, int, int, int][source]

Shut down all queues, returning the counts of all remaining items

class StatusMonitor(paths: List[Path])[source]

Bases: object

Class for monitoring the status of the conversion

__init__(paths: List[Path])[source]
close() None[source]

Close the progress bar

increment_reads(n: int) None[source]

Increment the reads status by n

property total_files: int
property total_reads: int
update_reads_total(path: Path, total: int) None[source]

Increment the reads status by n and update the total reads

write(msg: str, file: Any) None[source]

Write runtime message to avoid clobbering tqdm pbar

convert_datetime_as_epoch_ms(time_str: Optional[str]) datetime[source]

Convert the fast5 time string to timestamp
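A conversion of this kind might look like the following sketch. It assumes the fast5 time string is ISO 8601 (for example "2019-05-13T11:11:43Z") and that a missing value maps to the Unix epoch. Both are assumptions, and sketch_convert_time is a hypothetical name, not the library function.

```python
from datetime import datetime, timezone
from typing import Optional


def sketch_convert_time(time_str: Optional[str]) -> datetime:
    """Hypothetical parser: ISO 8601 string to a timezone-aware datetime."""
    if time_str is None:
        # Assumption: a missing time falls back to the Unix epoch.
        return datetime.fromtimestamp(0, tz=timezone.utc)
    normalised = time_str.strip()
    if normalised.endswith("Z"):
        # Older Python versions do not accept a trailing "Z" in fromisoformat.
        normalised = normalised[:-1] + "+00:00"
    parsed = datetime.fromisoformat(normalised)
    if parsed.tzinfo is None:
        # Assumption: naive timestamps are interpreted as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed
```

Given a timezone-aware result, milliseconds since the epoch can be derived with int(result.timestamp() * 1000).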

convert_fast5_end_reason(fast5_end_reason: int) EndReason[source]

Return an EndReason instance from the given end_reason integer from a fast5 file. This will handle the difference between fast5 and pod5 values for this enumeration and set the default “forced” value for each fast5 enumeration value.

convert_fast5_file(path: Path, queues: QueueManager, signal_chunk_size: int = 102400) int[source]

Convert the reads in a fast5 file

convert_fast5_file_chunk(queues: QueueManager, handle: File, chunk: Iterable[str], cache: Dict[str, RunInfo], signal_chunk_size: int) List[CompressedRead][source]
convert_fast5_files(queues: QueueManager, signal_chunk_size: int = 102400) None[source]

Main function for converting fast5s available in queues. Collections of converted reads are emplaced on the data_queue for writing in the main process.

convert_fast5_read(fast5_read: Group, run_info_cache: Dict[str, RunInfo], signal_chunk_size: int = 102400) CompressedRead[source]

Given a fast5 read parsed from a fast5 file, return a CompressedRead object.

convert_from_fast5(inputs: List[Path], output: Path, recursive: bool = False, threads: int = 10, one_to_one: Optional[Path] = None, force_overwrite: bool = False, signal_chunk_size: int = 102400, strict: bool = False) None[source]

Convert fast5 files found (optionally recursively) at the given input Paths into pod5 file(s). If one_to_one is a Path, then the new pod5 files are created in a new relative directory structure within output, relative to the one_to_one Path.
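A call using the documented keyword arguments might be assembled as in the sketch below. The helper names build_conversion_kwargs and run_conversion are hypothetical, and the import path pod5.tools.pod5_convert_from_fast5 is an assumption that may differ between releases. The import is deferred so the argument set can be inspected without the package installed.

```python
from pathlib import Path
from typing import Any, Dict


def build_conversion_kwargs(input_dir: Path, output_dir: Path) -> Dict[str, Any]:
    """Collect keyword arguments matching the documented signature."""
    return dict(
        inputs=[input_dir],
        output=output_dir,
        recursive=True,            # search input_dir recursively
        threads=4,
        one_to_one=input_dir,      # mirror the layout below input_dir in output_dir
        force_overwrite=False,
        signal_chunk_size=102400,  # the documented default
        strict=False,
    )


def run_conversion(input_dir: Path, output_dir: Path) -> None:
    # Assumption: this module path is a guess at where the function lives.
    from pod5.tools.pod5_convert_from_fast5 import convert_from_fast5

    convert_from_fast5(**build_conversion_kwargs(input_dir, output_dir))
```

Because QueueManager takes a SpawnContext, the conversion appears to start worker processes with the spawn method. A script calling run_conversion would therefore be wise to do so under an `if __name__ == "__main__":` guard.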

convert_run_info(acq_id: str, adc_max: int, adc_min: int, sample_rate: int, context_tags: Dict[str, str], device_type: str, tracking_id: Dict[str, str]) RunInfo[source]

Create a Pod5RunInfo instance from parsed fast5 data

decode_str(value: Union[str, bytes]) str[source]

Decode an h5py UTF-8 byte string to a Python string
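Given the Union[str, bytes] input type, the behaviour can be sketched as follows. This is a minimal illustration under the assumption that str values pass through unchanged. The name sketch_decode_str is hypothetical.

```python
from typing import Union


def sketch_decode_str(value: Union[str, bytes]) -> str:
    """Return value as a str, decoding UTF-8 bytes when necessary."""
    if isinstance(value, bytes):
        return value.decode("utf-8")
    # Assumption: values that are already str are returned untouched.
    return value
```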

get_read_from_fast5(group_name: str, h5_file: File) Optional[Group][source]

Read a group from an h5 file, ensuring that it’s a read

handle_exception(exception: Tuple[Path, Exception, str], output_handler: OutputHandler, status: StatusMonitor, strict: bool) None[source]
is_multi_read_fast5(path: Path) bool[source]

Assert that the given path points to a multi-read fast5 file for which direct-to-pod5 conversion is supported.

issue_not_multi_read_exception(path: Path, queues: QueueManager)[source]
logged(log_return: bool = False, log_args: bool = False, log_time: bool = False)[source]

Logging parameterised decorator
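A parameterised decorator with these three flags could be structured as in the sketch below. It illustrates the general pattern only: sketch_logged, the logger name, and the message formats are assumptions and do not reproduce the library's output.

```python
import functools
import logging
import time
from typing import Any, Callable

logger = logging.getLogger("sketch")  # hypothetical logger name


def sketch_logged(
    log_return: bool = False, log_args: bool = False, log_time: bool = False
) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Build a decorator that logs whichever details the flags enable."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)  # keep the wrapped function's name and docstring
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            if log_args:
                logger.debug("%s args=%r kwargs=%r", func.__name__, args, kwargs)
            started = time.perf_counter()
            result = func(*args, **kwargs)
            if log_time:
                elapsed = time.perf_counter() - started
                logger.debug("%s took %.3fs", func.__name__, elapsed)
            if log_return:
                logger.debug("%s returned %r", func.__name__, result)
            return result

        return wrapper

    return decorator


@sketch_logged(log_return=True, log_time=True)
def add(a: int, b: int) -> int:
    return a + b
```

The outer function receives the options and the inner decorator receives the function, which is why the decorator is applied with parentheses even when every flag keeps its default.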

logged_all(func)
main()[source]

Main function for pod5_convert_from_fast5

process_conversion_tasks(queues: QueueManager, output_handler: OutputHandler, status: StatusMonitor, strict: bool, threads: int) None[source]

Work through the queues of data until all work is done

terminate_processes(processes: List[SpawnProcess]) None[source]

Terminate all child processes