pod5_types
Container class for a pod5 Read object
- class BaseRead(read_id: ~uuid.UUID, pore: ~pod5.pod5_types.Pore, calibration: ~pod5.pod5_types.Calibration, read_number: int, start_sample: int, median_before: float, end_reason: ~pod5.pod5_types.EndReason, run_info: ~pod5.pod5_types.RunInfo, num_minknow_events: int = 0, tracked_scaling: ~pod5.pod5_types.ShiftScalePair = <factory>, predicted_scaling: ~pod5.pod5_types.ShiftScalePair = <factory>, num_reads_since_mux_change: int = 0, time_since_mux_change: float = 0.0)
Bases: object
Base class for POD5 Read Data
- Parameters:
read_id (UUID) – The read_id of this read as UUID.
pore (Pore) – Pore data.
calibration (Calibration) – Calibration data.
read_number (int) – The read number on channel. Read numbers increase but are not necessarily consecutive.
start_sample (int) – The number of samples recorded on this channel before the read started.
median_before (float) – The level of current in the well before this read.
end_reason (EndReason) – EndReason data.
run_info (RunInfo) – RunInfo data.
num_minknow_events (int) – Number of MinKNOW events that the read contains.
tracked_scaling (ShiftScalePair) – Shift and scale for tracked read scaling values (based on previous reads’ shift).
predicted_scaling (ShiftScalePair) – Shift and scale for predicted read scaling values (based on this read’s raw signal).
num_reads_since_mux_change (int) – Number of selected reads since the last mux change on this read’s channel.
time_since_mux_change (float) – Time in seconds since the last mux change on this read’s channel.
- __init__(read_id: ~uuid.UUID, pore: ~pod5.pod5_types.Pore, calibration: ~pod5.pod5_types.Calibration, read_number: int, start_sample: int, median_before: float, end_reason: ~pod5.pod5_types.EndReason, run_info: ~pod5.pod5_types.RunInfo, num_minknow_events: int = 0, tracked_scaling: ~pod5.pod5_types.ShiftScalePair = <factory>, predicted_scaling: ~pod5.pod5_types.ShiftScalePair = <factory>, num_reads_since_mux_change: int = 0, time_since_mux_change: float = 0.0) None
- calibration: Calibration
Calibration metadata
- median_before: float
The level of current in the well before this read.
- num_minknow_events: int = 0
Number of MinKNOW events that the read contains.
- num_reads_since_mux_change: int = 0
Number of selected reads since the last mux change on this read’s channel.
- predicted_scaling: ShiftScalePair
Shift and Scale for predicted read scaling values (based on this read’s raw signal)
- read_id: UUID
The read_id of this read as UUID
- read_number: int
The read number on channel. Read numbers increase but are not necessarily consecutive.
- start_sample: int
The number of samples recorded on this channel before the read started.
- time_since_mux_change: float = 0.0
Time in seconds since the last mux change on this read’s channel.
- tracked_scaling: ShiftScalePair
Shift and scale for tracked read scaling values (based on previous reads’ shift).
- class Calibration(offset: float, scale: float)
Bases: object
Parameters to convert the signal data to picoamps.
- Parameters:
offset (float) – Calibration offset used to convert raw ADC data into pA readings.
scale (float) – Calibration scale factor used to convert raw ADC data into pA readings.
- __init__(offset: float, scale: float) None
- classmethod from_range(offset: float, adc_range: float, digitisation: float) Calibration
Create a Calibration instance from offset, adc_range and digitisation.
- offset: float
Calibration offset used to convert raw ADC data into pA readings.
- scale: float
Calibration scale factor used to convert raw ADC data into pA readings.
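The relationship between the two fields can be sketched as follows. This is a minimal stand-in, not the library's implementation: `CalibrationSketch` and `to_picoamps` are hypothetical names, and both the `(adc + offset) * scale` conversion and the `adc_range / digitisation` scale are assumptions based on the common nanopore convention rather than anything stated above.

```python
from dataclasses import dataclass


@dataclass
class CalibrationSketch:
    """Hypothetical stand-in mirroring the offset/scale fields of Calibration."""

    offset: float
    scale: float

    @classmethod
    def from_range(
        cls, offset: float, adc_range: float, digitisation: float
    ) -> "CalibrationSketch":
        # Assumption: scale is the pA range divided by the number of ADC levels.
        return cls(offset=offset, scale=adc_range / digitisation)

    def to_picoamps(self, adc_value: int) -> float:
        # Assumption: the offset is added to the raw ADC value before scaling.
        return (adc_value + self.offset) * self.scale


calibration = CalibrationSketch.from_range(
    offset=10.0, adc_range=2048.0, digitisation=8192.0
)
picoamps = calibration.to_picoamps(90)
```

With these illustrative numbers the scale is 2048 / 8192 = 0.25, so a raw value of 90 maps to (90 + 10) × 0.25 pA.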
- class CompressedRead(read_id: ~uuid.UUID, pore: ~pod5.pod5_types.Pore, calibration: ~pod5.pod5_types.Calibration, read_number: int, start_sample: int, median_before: float, end_reason: ~pod5.pod5_types.EndReason, run_info: ~pod5.pod5_types.RunInfo, num_minknow_events: int = 0, tracked_scaling: ~pod5.pod5_types.ShiftScalePair = <factory>, predicted_scaling: ~pod5.pod5_types.ShiftScalePair = <factory>, num_reads_since_mux_change: int = 0, time_since_mux_change: float = 0.0, signal_chunks: ~typing.List[~numpy.ndarray[~typing.Any, ~numpy.dtype[~numpy.uint8]]] = <factory>, signal_chunk_lengths: ~typing.List[int] = <factory>)
Bases: BaseRead
POD5 Read Data with a compressed signal.
- Parameters:
read_id (UUID) – The read_id of this read as UUID.
pore (Pore) – Pore data.
calibration (Calibration) – Calibration data.
read_number (int) – The read number on channel. Read numbers increase but are not necessarily consecutive.
start_sample (int) – The number of samples recorded on this channel before the read started.
median_before (float) – The level of current in the well before this read.
end_reason (EndReason) – EndReason data.
run_info (RunInfo) – RunInfo data.
signal_chunks (List[numpy.array[uint8]]) – Compressed signal data in chunks.
signal_chunk_lengths (List[int]) – Chunk lengths (number of samples) of signal data before compression.
- __init__(read_id: ~uuid.UUID, pore: ~pod5.pod5_types.Pore, calibration: ~pod5.pod5_types.Calibration, read_number: int, start_sample: int, median_before: float, end_reason: ~pod5.pod5_types.EndReason, run_info: ~pod5.pod5_types.RunInfo, num_minknow_events: int = 0, tracked_scaling: ~pod5.pod5_types.ShiftScalePair = <factory>, predicted_scaling: ~pod5.pod5_types.ShiftScalePair = <factory>, num_reads_since_mux_change: int = 0, time_since_mux_change: float = 0.0, signal_chunks: ~typing.List[~numpy.ndarray[~typing.Any, ~numpy.dtype[~numpy.uint8]]] = <factory>, signal_chunk_lengths: ~typing.List[int] = <factory>) None
- property decompressed_signal: ndarray[Any, dtype[int16]]
Decompress and return the chunked signal data as a contiguous numpy array.
- Returns:
decompressed_signal – Decompressed signal data
- Return type:
numpy.array[int16]
- property sample_count: int
Return the total number of samples in the uncompressed signal.
- signal_chunk_lengths: List[int]
Chunk lengths (number of samples) of signal data before compression.
- signal_chunks: List[ndarray[Any, dtype[uint8]]]
Compressed signal data in chunks.
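Because `signal_chunk_lengths` records the number of samples in each chunk before compression, the total sample count can be obtained without decompressing anything. The sketch below illustrates that idea under the assumption that `sample_count` is simply the sum of the chunk lengths; `total_sample_count` and `duration_seconds` are hypothetical helpers, and the numbers are made up for illustration.

```python
from typing import List


def total_sample_count(signal_chunk_lengths: List[int]) -> int:
    # Each entry is the number of samples in one chunk before compression,
    # so the sum is the length of the decompressed signal (assumption).
    return sum(signal_chunk_lengths)


def duration_seconds(sample_count: int, sample_rate: int) -> float:
    # sample_rate is the number of samples acquired per second per channel.
    return sample_count / sample_rate


chunk_lengths = [102400, 102400, 35200]  # illustrative values only
samples = total_sample_count(chunk_lengths)
seconds = duration_seconds(samples, sample_rate=4000)
```

Here the `sample_rate` argument corresponds to the `sample_rate` field of `RunInfo`.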
- class EndReason(reason: EndReasonEnum, forced: bool)
Bases: object
Data on why the Read ended.
- Parameters:
reason (EndReasonEnum) – The end reason enumeration.
forced (bool) – True if it is a ‘forced’ read break.
- __init__(reason: EndReasonEnum, forced: bool) None
- forced: bool
True if it is a ‘forced’ read break (e.g. mux_change, unblock), False otherwise.
- classmethod from_reason_with_default_forced(reason: EndReasonEnum) EndReason
Return a new EndReason instance with the ‘forced’ flag set to the expected default for the given reason
- property name: str
Return the reason name as a lowercase string.
- reason: EndReasonEnum
The end reason enumeration
- class EndReasonEnum(value)
Bases: Enum
EndReason Enumeration
- DATA_SERVICE_UNBLOCK_MUX_CHANGE = 3
- MUX_CHANGE = 1
- SIGNAL_NEGATIVE = 5
- SIGNAL_POSITIVE = 4
- UNBLOCK_MUX_CHANGE = 2
- UNKNOWN = 0
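How the `forced` default and the `name` property relate to the enumeration can be sketched with stand-in classes. The enum values below are copied from the listing above, but `EndReasonEnumSketch`, `EndReasonSketch` and `ASSUMED_FORCED` are hypothetical names, and the set of reasons treated as forced is an assumption extrapolated from the "(e.g. mux_change, unblock)" hint, not the library's confirmed behaviour.

```python
from dataclasses import dataclass
from enum import Enum


class EndReasonEnumSketch(Enum):
    """Stand-in with the same members and values as EndReasonEnum."""

    UNKNOWN = 0
    MUX_CHANGE = 1
    UNBLOCK_MUX_CHANGE = 2
    DATA_SERVICE_UNBLOCK_MUX_CHANGE = 3
    SIGNAL_POSITIVE = 4
    SIGNAL_NEGATIVE = 5


# Assumption: mux changes and unblocks are the 'forced' read breaks.
ASSUMED_FORCED = frozenset(
    {
        EndReasonEnumSketch.MUX_CHANGE,
        EndReasonEnumSketch.UNBLOCK_MUX_CHANGE,
        EndReasonEnumSketch.DATA_SERVICE_UNBLOCK_MUX_CHANGE,
    }
)


@dataclass
class EndReasonSketch:
    reason: EndReasonEnumSketch
    forced: bool

    @classmethod
    def from_reason_with_default_forced(
        cls, reason: EndReasonEnumSketch
    ) -> "EndReasonSketch":
        # Pick the 'forced' flag from the assumed default table.
        return cls(reason=reason, forced=reason in ASSUMED_FORCED)

    @property
    def name(self) -> str:
        # The reason name as a lowercase string.
        return self.reason.name.lower()


unblock = EndReasonSketch.from_reason_with_default_forced(
    EndReasonEnumSketch.UNBLOCK_MUX_CHANGE
)
natural = EndReasonSketch.from_reason_with_default_forced(
    EndReasonEnumSketch.SIGNAL_POSITIVE
)
```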
- class Pore(channel: int, well: int, pore_type: str)
Bases: object
Data for the pore that the Read was acquired on
- Parameters:
channel (int) – 1-indexed channel.
well (int) – 1-indexed well.
pore_type (str) – Name of the pore type present in the well.
- __init__(channel: int, well: int, pore_type: str) None
- channel: int
1-indexed channel.
- pore_type: str
Name of the pore type present in the well.
- well: int
1-indexed well.
- class Read(read_id: ~uuid.UUID, pore: ~pod5.pod5_types.Pore, calibration: ~pod5.pod5_types.Calibration, read_number: int, start_sample: int, median_before: float, end_reason: ~pod5.pod5_types.EndReason, run_info: ~pod5.pod5_types.RunInfo, num_minknow_events: int = 0, tracked_scaling: ~pod5.pod5_types.ShiftScalePair = <factory>, predicted_scaling: ~pod5.pod5_types.ShiftScalePair = <factory>, num_reads_since_mux_change: int = 0, time_since_mux_change: float = 0.0, signal: ~numpy.ndarray[~typing.Any, ~numpy.dtype[~numpy.int16]] = <factory>)
Bases: BaseRead
POD5 Read Data with an uncompressed signal
- Parameters:
read_id (UUID) – The read_id of this read as UUID.
pore (Pore) – Pore data.
calibration (Calibration) – Calibration data.
read_number (int) – The read number on channel. Read numbers increase but are not necessarily consecutive.
start_sample (int) – The number of samples recorded on this channel before the read started.
median_before (float) – The level of current in the well before this read.
end_reason (EndReason) – EndReason data.
run_info (RunInfo) – RunInfo data.
signal (numpy.array[int16]) – Uncompressed signal data.
- __init__(read_id: ~uuid.UUID, pore: ~pod5.pod5_types.Pore, calibration: ~pod5.pod5_types.Calibration, read_number: int, start_sample: int, median_before: float, end_reason: ~pod5.pod5_types.EndReason, run_info: ~pod5.pod5_types.RunInfo, num_minknow_events: int = 0, tracked_scaling: ~pod5.pod5_types.ShiftScalePair = <factory>, predicted_scaling: ~pod5.pod5_types.ShiftScalePair = <factory>, num_reads_since_mux_change: int = 0, time_since_mux_change: float = 0.0, signal: ~numpy.ndarray[~typing.Any, ~numpy.dtype[~numpy.int16]] = <factory>) None
- property sample_count: int
Return the total number of samples in the uncompressed signal.
- signal: ndarray[Any, dtype[int16]]
Uncompressed signal data.
- class RunInfo(acquisition_id: str, acquisition_start_time: datetime, adc_max: int, adc_min: int, context_tags: Dict[str, str], experiment_name: str, flow_cell_id: str, flow_cell_product_code: str, protocol_name: str, protocol_run_id: str, protocol_start_time: datetime, sample_id: str, sample_rate: int, sequencing_kit: str, sequencer_position: str, sequencer_position_type: str, software: str, system_name: str, system_type: str, tracking_id: Dict[str, str])
Bases: object
Higher-level information about the Reads that correspond to a part of an experiment, protocol or acquisition
- Parameters:
acquisition_id (str) – A unique identifier for the acquisition.
acquisition_start_time (datetime.datetime) – This is the clock time for sample 0
adc_max (int) – The maximum ADC value that might be encountered.
adc_min (int) – The minimum ADC value that might be encountered.
context_tags (Dict[str, str]) – The context tags for the run. (For compatibility with fast5).
experiment_name (str) – The user-supplied name for the experiment being run.
flow_cell_id (str) – Uniquely identifies the flow cell the data was captured on.
flow_cell_product_code (str) – Identifies the type of flow cell the data was captured on.
protocol_name (str) – The name of the protocol that was run.
protocol_run_id (str) – The unique identifier for the protocol run that produced this data.
protocol_start_time (datetime.datetime) – When the protocol that the acquisition was part of started.
sample_id (str) – A user-supplied name for the sample being analysed.
sample_rate (int) – The number of samples acquired each second on each channel.
sequencing_kit (str) – The type of sequencing kit used to prepare the sample.
sequencer_position (str) – The sequencer position the data was collected on.
sequencer_position_type (str) – The type of sequencing hardware the data was collected on.
software (str) – A description of the software that acquired the data.
system_name (str) – The name of the system the data was collected on.
system_type (str) – The type of system the data was collected on.
tracking_id (Dict[str, str]) – The tracking id for the run. (For compatibility with fast5).
- __init__(acquisition_id: str, acquisition_start_time: datetime, adc_max: int, adc_min: int, context_tags: Dict[str, str], experiment_name: str, flow_cell_id: str, flow_cell_product_code: str, protocol_name: str, protocol_run_id: str, protocol_start_time: datetime, sample_id: str, sample_rate: int, sequencing_kit: str, sequencer_position: str, sequencer_position_type: str, software: str, system_name: str, system_type: str, tracking_id: Dict[str, str]) None
- acquisition_id: str
A unique identifier for the acquisition - note that readers should not depend on this uniquely determining the other fields in the run_info, or being unique among the dictionary keys.
- acquisition_start_time: datetime
This is the clock time for sample 0
- adc_max: int
The maximum ADC value that might be encountered. This is a hardware constraint.
- adc_min: int
The minimum ADC value that might be encountered. This is a hardware constraint.
- context_tags: Dict[str, str]
The context tags for the run. (For compatibility with fast5).
- experiment_name: str
The user-supplied name for the experiment being run.
- flow_cell_id: str
Uniquely identifies the flow cell the data was captured on. This is written on the flow cell case.
- flow_cell_product_code: str
Identifies the type of flow cell the data was captured on.
- protocol_name: str
The name of the protocol that was run.
- protocol_run_id: str
The unique identifier for the protocol run that produced this data.
- protocol_start_time: datetime
When the protocol that the acquisition was part of started.
- sample_id: str
A user-supplied name for the sample being analysed.
- sample_rate: int
The number of samples acquired each second on each channel.
- sequencer_position: str
The sequencer position the data was collected on. For removable positions, like MinION Mk1Bs, this is unique (e.g. ‘MN12345’), while for integrated positions it is not (e.g. ‘X1’ on a GridION).
- sequencer_position_type: str
The type of sequencing hardware the data was collected on. For example: ‘MinION Mk1B’ or ‘GridION’ or ‘PromethION’.
- sequencing_kit: str
The type of sequencing kit used to prepare the sample.
- software: str
A description of the software that acquired the data. For example: ‘MinKNOW 21.05.12 (Bream 5.1.6, Configurations 16.2.1, Core 5.1.9, Guppy 4.2.3)’.
- system_name: str
The name of the system the data was collected on. This might be a sequencer serial (e.g. ‘GXB1234’) or a host name (e.g. ‘Lab PC’).
- system_type: str
The type of system the data was collected on. For example, ‘GridION Mk1’ or ‘PromethION P48’. If the system is not a Nanopore sequencer with built-in compute, this will be a description of the operating system (e.g. ‘Ubuntu 20.04’).
- tracking_id: Dict[str, str]
The tracking id for the run. (For compatibility with fast5).
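Since `acquisition_start_time` is the clock time for sample 0 and `sample_rate` is the number of samples per second, the wall-clock start of a read can be derived from its `start_sample`. The following is a minimal sketch of that arithmetic; `read_start_time` is a hypothetical helper, not part of the module, and the timestamps are illustrative.

```python
from datetime import datetime, timedelta, timezone


def read_start_time(
    acquisition_start_time: datetime, start_sample: int, sample_rate: int
) -> datetime:
    # start_sample counts samples recorded on the channel before the read,
    # so dividing by sample_rate gives the elapsed seconds since sample 0.
    return acquisition_start_time + timedelta(seconds=start_sample / sample_rate)


acquisition_start = datetime(2023, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
started = read_start_time(acquisition_start, start_sample=20000, sample_rate=4000)
```

With 20000 samples at 4000 samples per second, the read begins five seconds after the acquisition started.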