DVF module
- class FixedParamsDupmark(*, duplication_window_size: Annotated[int, AfterValidator(func=<lambda>)] = 6)
Bases: FixedParams
- duplication_window_size: Annotated[int, AfterValidator(func=<lambda>)] = 6
- model_config: ConfigDict = {'arbitrary_types_allowed': True, 'extra': 'forbid', 'frozen': True, 'strict': True}
Parent dataclass to be inherited from, storing the fixed parameters for a particular subclass of FilterTester (that is, for a particular filtering test). Using subclasses of this class for fixed parameters provides type safety and a consistent interface for implementing filters.
- tag_dups(run_params: RunParamsShared, fixed_params: FixedParamsDupmark)
- class TaggerDupmark(engine_fixed_params: FixedParams | None, require_marks: Sequence[str], exclude_marks: Sequence[str])
Bases: ReadAwareProcess
- AddsMarks: ClassVar[set[str] | None] = {Tags.STUTTER_DUP_TAG}
- EngineFactory() -> ProcessEngineProtocol[RunParams_T, None]
- FixedParamClass: alias of FixedParamsDupmark
- ProcessNamespace: ClassVar[str | None] = 'mark-duplicates'
- ProcessType: ClassVar[ProcessKindEnum | None] = <class 'hairpin2.infrastructure.process_engines.ReadTaggerEngine'>
- class ResultDVF(variant_flagged: TestOutcomes, info_flag: enum.Flag | None, reads_seen: int, loss_ratio: float)
Bases: FlagResult
- reads_seen: int
- loss_ratio: float
- getinfo(alt: str) -> str
Return basic filter info in a string formatted for use in the VCF INFO field - “<flag>|<code>”.
Each filter must return INFO as it should be formatted for the VCF INFO field, or None if not applicable. Subclasses must override this method to return more specific info.
- FlagName: ClassVar[str] = 'DVF'
- InfoFlagsAllSet: ClassVar[Flag | None] = 7
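The “<flag>|<code>” INFO format described for getinfo can be illustrated with a minimal sketch. This is not the hairpin2 implementation: ExampleInfoFlags and format_info are hypothetical names, the three flag members are invented so that all bits set equals 7 (matching InfoFlagsAllSet), and rendering the code as the integer value of the flag is an assumption.

```python
import enum
from typing import Optional


class ExampleInfoFlags(enum.Flag):
    # Hypothetical condition codes; the real members are not listed in this section.
    CONDITION_A = 1
    CONDITION_B = 2
    CONDITION_C = 4


def format_info(flag_name: str, code: Optional[enum.Flag]) -> Optional[str]:
    """Return '<flag>|<code>', or None when there is no code to report."""
    if code is None:
        return None
    # Assumption: the code is written as the integer value of the flag.
    return f"{flag_name}|{code.value}"


# All three hypothetical bits set gives the integer value 7.
all_set = (
    ExampleInfoFlags.CONDITION_A
    | ExampleInfoFlags.CONDITION_B
    | ExampleInfoFlags.CONDITION_C
)
info_string = format_info("DVF", all_set)
```

A real subclass would override getinfo to add more specific detail, as the description above requires.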
- class FixedParamsDVF(*, read_loss_threshold: Annotated[float, AfterValidator(func=<lambda>)], min_pass_reads: Annotated[int, AfterValidator(func=<lambda>)], nsamples_threshold: int)
Bases: FixedParams
read_loss_threshold - percent threshold for the number of lq reads relative to the number of input reads for a given variant and sample, above which we flag DVF
min_pass_reads - the absolute minimum number of reads required for a variant not to be flagged DVF
- read_loss_threshold: Annotated[float, AfterValidator(func=<lambda>)]
- min_pass_reads: Annotated[int, AfterValidator(func=<lambda>)]
- nsamples_threshold: int
- model_config: ConfigDict = {'arbitrary_types_allowed': True, 'extra': 'forbid', 'frozen': True, 'strict': True}
Parent dataclass to be inherited from, storing the fixed parameters for a particular subclass of FilterTester (that is, for a particular filtering test). Using subclasses of this class for fixed parameters provides type safety and a consistent interface for implementing filters.
- test_DVF(run_params: RunParamsShared, fixed_params: FixedParamsDVF)
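How read_loss_threshold and min_pass_reads might combine into a flagging decision can be sketched as follows. This is an illustration under stated assumptions, not the body of test_DVF: ExampleThresholds and should_flag are hypothetical names, the threshold is treated as a fraction between 0 and 1, and min_pass_reads is assumed to apply to the reads remaining after lq reads are removed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExampleThresholds:
    # Assumption: expressed as a fraction in [0, 1] rather than 0-100.
    read_loss_threshold: float
    # Assumption: compared against reads left after removing lq reads.
    min_pass_reads: int


def should_flag(n_input_reads: int, n_lq_reads: int, thresholds: ExampleThresholds) -> bool:
    """Return True if the variant would be flagged under the assumptions above."""
    # Guard against division by zero when no reads were supplied.
    loss_ratio = n_lq_reads / n_input_reads if n_input_reads else 0.0
    n_pass_reads = n_input_reads - n_lq_reads
    too_much_loss = loss_ratio > thresholds.read_loss_threshold
    too_few_reads = n_pass_reads < thresholds.min_pass_reads
    return too_much_loss or too_few_reads


thresholds = ExampleThresholds(read_loss_threshold=0.5, min_pass_reads=2)
flag_high_loss = should_flag(10, 6, thresholds)   # loss ratio 0.6 exceeds 0.5
flag_clean = should_flag(10, 2, thresholds)       # loss ratio 0.2, 8 reads pass
flag_low_depth = should_flag(2, 1, thresholds)    # only 1 read passes
```

The role of nsamples_threshold is not described in this section, so it is left out of the sketch.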
- class FlaggerDVF(engine_fixed_params: FixedParams | None, require_marks: Sequence[str], exclude_marks: Sequence[str])
Bases: ReadAwareProcess
Duplication variant filter - a portion of the reads supporting the variant are suspected to arise from duplicated reads that have escaped dupmarking.
In regions of low complexity, short repeats and homopolymer tracts can cause PCR stuttering, leading to, for example, an additional A on the read when amplifying a tract of As. If duplicated reads contain stutter, read length and alignment to the reference can vary between reads that are in fact duplicates. These duplicates then evade dupmarking and give rise to spurious variants when calling.
min_boundary_deviation sets the minimum deviation of start/end coordinates above which reads are assumed not to be duplicates.
read_number_difference_threshold sets the threshold for the absolute difference between the number of reads supporting the variant with and without duplicates removed. If this threshold is exceeded, the flag is set.
- AddsMarks: ClassVar[set[str] | None] = None
- EngineFactory() -> ProcessEngineProtocol[RunParams, FlagResult]
- FixedParamClass: alias of FixedParamsDVF
- ProcessNamespace: ClassVar[str | None] = 'DVF'
- ProcessType: ClassVar[ProcessKindEnum | None] = <class 'hairpin2.infrastructure.process_engines.VariantFlaggerEngine'>
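The idea of catching stutter duplicates by comparing read boundaries can be sketched as below. This is a simplified illustration, not the algorithm used by TaggerDupmark or FlaggerDVF: tag_stutter_duplicates is a hypothetical function, reads are reduced to (start, end) pairs, and the window argument only loosely mirrors duplication_window_size (default 6). Real duplicate detection would likely also consider strand, mate position, and other read properties.

```python
from typing import List, Tuple


def tag_stutter_duplicates(reads: List[Tuple[int, int]], window: int = 6) -> List[bool]:
    """Mark reads whose start and end both lie within `window` bases
    of an earlier kept read. Returns one bool per input read."""
    # Visit reads in coordinate order so the first read of each cluster is kept.
    order = sorted(range(len(reads)), key=lambda i: reads[i])
    is_dup = [False] * len(reads)
    kept: List[Tuple[int, int]] = []  # one representative per cluster

    for i in order:
        start, end = reads[i]
        for kept_start, kept_end in kept:
            # Boundaries deviating by more than the window are treated
            # as evidence of an independent fragment.
            if abs(start - kept_start) <= window and abs(end - kept_end) <= window:
                is_dup[i] = True
                break
        else:
            kept.append((start, end))
    return is_dup


# Three reads differing by a base or two at each end, plus one distinct read.
example_reads = [(100, 250), (101, 251), (100, 249), (180, 330)]
example_tags = tag_stutter_duplicates(example_reads)
reads_remaining = example_tags.count(False)
```

Comparing the supporting read count before and after removing the tagged reads gives the kind of read-number difference that the flag description above refers to.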