pod5_convert_from_fast5
Tool for converting fast5 files to the pod5 format
- class OutputHandler(output_root: Path, one_to_one: Optional[Path], force_overwrite: bool)[source]
Bases: object
Class for managing p5.Writer handles
- static resolve_one_to_one_path(path: Path, root: Path, relative_root: Path)[source]
Find the relative path between the input path and the relative root
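The path mapping implied by the one-to-one option can be sketched with pathlib alone. This is a minimal sketch, not the package's implementation: the helper name `sketch_one_to_one_path` is hypothetical, and it assumes the output keeps the input's position relative to relative_root and swaps the suffix to `.pod5`.

```python
from pathlib import Path


def sketch_one_to_one_path(path: Path, root: Path, relative_root: Path) -> Path:
    """Hypothetical helper: mirror the location of `path` under `relative_root` inside `root`."""
    # Position of the input file relative to the one-to-one root
    relative = path.relative_to(relative_root)
    # Re-root under the output directory and swap the extension (assumed behaviour)
    return (root / relative).with_suffix(".pod5")


example = sketch_one_to_one_path(
    Path("/data/fast5/run1/batch_0.fast5"),
    Path("/data/pod5"),
    Path("/data/fast5"),
)
```

Note that `Path.relative_to` raises `ValueError` when the input does not sit under the relative root, so a real caller would need to handle that case.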
- class QueueManager(context: SpawnContext, inputs: List[Path], threads: int, timeout: float)[source]
Bases: object
- __init__(context: SpawnContext, inputs: List[Path], threads: int, timeout: float) None[source]
Manager for balancing work queues
- await_data() Tuple[Optional[Path], Optional[Union[List[CompressedRead], int]]][source]
Await compressed reads, or the total count of reads compressed (file end), for an input filepath. Enqueues the next request if necessary.
- enqueue_data(path: Optional[Path], reads: Optional[Union[List[CompressedRead], int]]) None[source]
Enqueues an input path and either a list of compressed reads to be written, or the total count of reads converted for that path. Otherwise, if path is None, mark the child process as being empty.
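The message convention described for enqueue_data and await_data can be illustrated with a single-process stand-in. This is a rough sketch under stated assumptions: it uses `queue.Queue` instead of queues from a multiprocessing spawn context, plain strings in place of CompressedRead, and a hypothetical consumer named `drain`. It is not the QueueManager implementation.

```python
import queue
from pathlib import Path
from typing import List, Optional, Tuple, Union

# Stand-in for CompressedRead: plain strings, for illustration only
Message = Tuple[Optional[Path], Optional[Union[List[str], int]]]


def drain(data_queue: "queue.Queue[Message]", workers: int) -> Tuple[dict, dict]:
    """Hypothetical consumer: collect reads per path until every worker reports empty."""
    pending: dict = {}   # path -> reads received so far
    finished: dict = {}  # path -> total read count reported at file end
    active = workers
    while active > 0:
        path, payload = data_queue.get(timeout=1.0)
        if path is None:
            active -= 1  # a worker signalled that it has no more work
        elif isinstance(payload, int):
            finished[path] = payload  # an integer payload marks the end of a file
        else:
            pending.setdefault(path, []).extend(payload or [])
    return pending, finished


q: "queue.Queue[Message]" = queue.Queue()
p = Path("a.fast5")
q.put((p, ["read_1", "read_2"]))  # a batch of reads to write
q.put((p, ["read_3"]))
q.put((p, 3))                     # file end: total reads converted
q.put((None, None))               # worker is empty
pending, finished = drain(q, workers=1)
```

Using the payload type to tell "more reads" from "file finished" keeps the protocol to a single queue of two-element tuples.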
- class StatusMonitor(paths: List[Path])[source]
Bases: object
Class for monitoring the status of the conversion
- property total_files: int
- property total_reads: int
- convert_datetime_as_epoch_ms(time_str: Optional[str]) datetime[source]
Convert the fast5 time string to a timestamp
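A conversion of this shape (optional string in, datetime out) might look like the sketch below. The helper name `sketch_parse_fast5_time` is hypothetical. The assumptions are that the input is an ISO 8601 string, that a missing or unparseable value falls back to the Unix epoch, and that naive timestamps are treated as UTC. The package itself may behave differently.

```python
from datetime import datetime, timezone
from typing import Optional

# Fallback value used by this sketch for missing or unparseable input
EPOCH = datetime.fromtimestamp(0, tz=timezone.utc)


def sketch_parse_fast5_time(time_str: Optional[str]) -> datetime:
    """Hypothetical parser: ISO 8601 string to an aware datetime, epoch on failure."""
    if time_str is None:
        return EPOCH
    try:
        # Older Python versions of fromisoformat reject a trailing "Z"
        parsed = datetime.fromisoformat(time_str.replace("Z", "+00:00"))
    except ValueError:
        return EPOCH
    # Treat naive timestamps as UTC (an assumption of this sketch)
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)
```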
- convert_fast5_end_reason(fast5_end_reason: int) EndReason[source]
Return an EndReason instance from the given end_reason integer from a fast5 file. This will handle the difference between fast5 and pod5 values for this enumeration and set the default “forced” value for each fast5 enumeration value.
- convert_fast5_file(path: Path, queues: QueueManager, signal_chunk_size: int = 102400) int[source]
Convert the reads in a fast5 file
- convert_fast5_file_chunk(queues: QueueManager, handle: File, chunk: Iterable[str], cache: Dict[str, RunInfo], signal_chunk_size: int) List[CompressedRead][source]
- convert_fast5_files(queues: QueueManager, signal_chunk_size: int = 102400) None[source]
Main function for converting fast5s available in queues. Collections of converted reads are emplaced on the data_queue for writing in the main process.
- convert_fast5_read(fast5_read: Group, run_info_cache: Dict[str, RunInfo], signal_chunk_size: int = 102400) CompressedRead[source]
Given a fast5 read parsed from a fast5 file, return a pod5.Read object.
- convert_from_fast5(inputs: List[Path], output: Path, recursive: bool = False, threads: int = 10, one_to_one: Optional[Path] = None, force_overwrite: bool = False, signal_chunk_size: int = 102400, strict: bool = False) None[source]
Convert fast5 files found (optionally recursively) at the given input Paths into pod5 file(s). If one_to_one is a Path, the new pod5 files are created in a new relative directory structure within output, relative to the one_to_one Path.
- convert_run_info(acq_id: str, adc_max: int, adc_min: int, sample_rate: int, context_tags: Dict[str, str], device_type: str, tracking_id: Dict[str, str]) RunInfo[source]
Create a Pod5RunInfo instance from parsed fast5 data
- get_read_from_fast5(group_name: str, h5_file: File) Optional[Group][source]
Read a group from an h5 file, ensuring that it’s a read
- handle_exception(exception: Tuple[Path, Exception, str], output_handler: OutputHandler, status: StatusMonitor, strict: bool) None[source]
- is_multi_read_fast5(path: Path) bool[source]
Assert that the given path points to a multi-read fast5 file for which direct-to-pod5 conversion is supported.
- issue_not_multi_read_exception(path: Path, queues: QueueManager)[source]
- logged(log_return: bool = False, log_args: bool = False, log_time: bool = False)[source]
Logging parameterised decorator
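For readers unfamiliar with the pattern, a parameterised decorator with these three switches could be sketched as follows. The name `logged_sketch`, the logger name, and the exact messages are assumptions made for illustration. Only the parameter names come from the signature above.

```python
import functools
import logging
import time

logger = logging.getLogger("sketch")


def logged_sketch(log_return: bool = False, log_args: bool = False, log_time: bool = False):
    """Hypothetical parameterised logging decorator with the same three switches."""

    def decorator(func):
        @functools.wraps(func)  # keep the wrapped function's name and docstring
        def wrapper(*args, **kwargs):
            if log_args:
                logger.debug("%s args=%r kwargs=%r", func.__name__, args, kwargs)
            start = time.perf_counter()
            result = func(*args, **kwargs)
            if log_time:
                logger.debug("%s took %.3fs", func.__name__, time.perf_counter() - start)
            if log_return:
                logger.debug("%s returned %r", func.__name__, result)
            return result

        return wrapper

    return decorator


@logged_sketch(log_return=True, log_time=True)
def add(a: int, b: int) -> int:
    return a + b
```

The outer function takes the options and returns the actual decorator, which is why it is applied with parentheses even when all defaults are used.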
- logged_all(func)
- process_conversion_tasks(queues: QueueManager, output_handler: OutputHandler, status: StatusMonitor, strict: bool, threads: int) None[source]
Work through the queues of data until all work is done