repack
Tools to assist repacking pod5 data into other pod5 files
- class Repacker
Bases: object
Wrapper class around native pod5 tools to repack data
- add_all_reads_to_output(output_ref: Pod5RepackerOutput, reader: Reader) → None
Copy every read from the given Reader into the Repacker output reference returned by add_output()
- Parameters:
output_ref (lib_pod5.pod5_format_pybind.Pod5RepackerOutput) – The repacker handle reference returned from add_output()
reader (Reader) – The Pod5 file reader to copy reads from
- add_output(output_file: Writer) → Pod5RepackerOutput
Add an output file writer to the repacker, so it can have read data repacked into it.
Once an output has been added, the returned reference can be passed as the output to add_selected_reads_to_output() or add_all_reads_to_output()
- Parameters:
output_file (writer.Writer) – The output file writer to use
- Returns:
repacker_object – Use this as “output_ref” in calls to add_selected_reads_to_output() or add_all_reads_to_output()
- Return type:
p5b.Pod5RepackerOutput
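The add_output() and add_all_reads_to_output() calls are typically used together. A minimal sketch of that flow follows; the helper name repack_all is hypothetical, and it assumes repacker is a Repacker, writer is an open Writer, and readers is an iterable of open Reader objects.

```python
def repack_all(repacker, writer, readers):
    """Queue every read from each reader into one output, then block until done.

    Assumed argument types (not checked here): `repacker` is a
    pod5.repack.Repacker, `writer` an open pod5 Writer, and `readers`
    an iterable of open pod5 Reader objects.
    """
    # Register the writer once; the returned handle identifies this output.
    output_ref = repacker.add_output(writer)

    # Queue all reads of every source file against the same output.
    for reader in readers:
        repacker.add_all_reads_to_output(output_ref, reader)

    # Block until the repacker reports completion; returns reads written.
    return repacker.wait()
```

With pod5 installed, this might be called as repack_all(Repacker(), writer, [reader]) inside with pod5.Writer(...) and with pod5.Reader(...) blocks. Keeping the readers and the writer open until wait() returns is a cautious assumption, not something this page states.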
- add_selected_reads_to_output(output_ref: Pod5RepackerOutput, reader: Reader, selected_read_ids: Collection[str])
Copy the selected read_ids from the given Reader into the Repacker output reference returned by add_output()
- Parameters:
output_ref (lib_pod5.pod5_format_pybind.Pod5RepackerOutput) – The repacker handle reference returned from add_output()
reader (Reader) – The Pod5 file reader to copy reads from
selected_read_ids (Collection[str]) – A Collection of read_ids as strings
- Raises:
RuntimeError – If any of the selected_read_ids were not found in the source file
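Because a missing read_id raises RuntimeError, callers may want to decide how to react rather than let the exception propagate. The sketch below shows one way to do that; repack_selected is a hypothetical helper, and converting each id with str() is an assumption made so that UUID-like values also satisfy Collection[str].

```python
def repack_selected(repacker, output_ref, reader, read_ids):
    """Queue only `read_ids` from `reader`; return False if the call is rejected.

    `output_ref` is the handle previously returned by add_output().
    """
    # The documented parameter type is Collection[str], so normalise first.
    ids = [str(read_id) for read_id in read_ids]
    try:
        repacker.add_selected_reads_to_output(output_ref, reader, ids)
    except RuntimeError:
        # Documented failure mode: at least one id was not in the source file.
        return False
    return True
```

Whether any reads are queued before the exception is raised is not stated on this page, so a False result is best treated as a failed request for that reader.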
- property batches_completed: int
Find the number of batches that have finished writing to destination files
- property batches_requested: int
Find the number of batches requested to be read from source files
- property is_complete: bool
Find if the requested repack operations are complete
- property pending_batch_writes: int
Find the number of batches in flight, awaiting writing
- property reads_completed: int
Find the number of reads written to files
- property reads_requested: int
Find the number of requested reads to be written
- property reads_sample_bytes_completed: int
Find the number of bytes for sample data repacked
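The progress properties above can be read together to report how far a repack has got. The following sketch copies them into a small record; RepackProgress, snapshot and fraction_done are hypothetical names, and only the property names listed above are taken from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepackProgress:
    """Point-in-time copy of the Repacker progress properties."""

    reads_requested: int
    reads_completed: int
    batches_requested: int
    batches_completed: int
    pending_batch_writes: int
    sample_bytes_completed: int
    is_complete: bool

    @property
    def fraction_done(self) -> float:
        """Share of requested reads that have been written so far."""
        if self.reads_requested == 0:
            # Nothing queued: report done only if the repacker says so.
            return 1.0 if self.is_complete else 0.0
        return self.reads_completed / self.reads_requested


def snapshot(repacker) -> RepackProgress:
    """Read each documented property once from `repacker`."""
    return RepackProgress(
        reads_requested=repacker.reads_requested,
        reads_completed=repacker.reads_completed,
        batches_requested=repacker.batches_requested,
        batches_completed=repacker.batches_completed,
        pending_batch_writes=repacker.pending_batch_writes,
        sample_bytes_completed=repacker.reads_sample_bytes_completed,
        is_complete=repacker.is_complete,
    )
```

The properties are read one after another, so on a running repacker the values in one snapshot may come from slightly different moments.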
- wait(finish: bool = True, interval: float = 0.5, desc: str = '', total_reads: Optional[int] = None, offset: int = 0) → int
Block until the repacker is done, checking every interval seconds. Shows a progress bar at the current process index, with the desc string as its description.
- Parameters:
finish (bool) – Flag to toggle an optional final call to finish() to close the repacker and free resources
interval (float) – The interval (in seconds) between checks of is_complete
desc (str) – Progress bar description string
total_reads (Optional[int]) – Overrides the total number of reads expected
offset (int) – Sets the progress bar position offset
- Returns:
num_reads_completed – The number of reads written
- Return type:
int
- waiter(interval: float = 0.5) → Generator[int, None, None]
Block until the repacker is done, checking every interval seconds. Yields the number of reads completed.
- Parameters:
interval (float) – The interval (in seconds) between checks of is_complete
num_reads_completed – The number of reads written
- Return type:
int
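Since waiter() is a generator, it can drive custom progress reporting in place of the built-in progress bar. A minimal sketch, assuming each yielded value is the running count of reads written; log_progress and its report callback are hypothetical.

```python
def log_progress(repacker, interval=0.5, report=print):
    """Iterate repacker.waiter() and report whenever the read count changes.

    Returns the last count yielded (0 if nothing was yielded).
    """
    last = 0
    for done in repacker.waiter(interval=interval):
        # Skip repeated values so quiet intervals do not produce output.
        if done != last:
            report(f"{done} reads written")
        last = done
    return last
```

This page does not say whether finish() still has to be called after the generator is exhausted, so the sketch leaves that decision to the caller.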