coalispr.count_analyze.countfile_plots¶

Module to plot count comparisons.

Attributes¶

logger

Functions¶

`compare_un_spec_lengths`(rdkind, readlen, strand, notitles)	Compare distribution of lengths of all SPECIFIC and UNSPECIFIC reads.
`compare_exp_lengths`(rdkind, readlen, strand, group, ...)	Obtain read-length distribution for all separate library samples.
`compare_libtotals`(rdkind, strand, group, notitles, ...)	Obtain read total library count distribution for two dataframes.
`compare_exp_bins`(rdkind, strand, group, notitles, ...)	Obtain bin distribution for all SPECIFIC samples.
`compare_exp_mulmaps`(rdkind, strand, group, notitles, ...)	Obtain hit distribution for multimappers in each library.
`compare_un_spec_mulmaps`(rdkind, strand, notitles)	Compare distribution of hit-numbers (repeats) for SPECIFIC and UNSPECIFIC multimappers.

Module Contents¶

coalispr.count_analyze.countfile_plots.logger¶

coalispr.count_analyze.countfile_plots.compare_un_spec_lengths(rdkind, readlen, strand, notitles)¶

Compare distribution of lengths of all SPECIFIC and UNSPECIFIC reads.

As set in process_bamdata, data from unused samples are not included. Relevant file-titles: {kind}_[RLENCOUNTS | LENCOUNTS]_ALL_{strand}TSV.

The output consists of diagrams with seaborn bar plots for the read-length distribution of all or only UNIQ reads; these can be saved as png or svg.

Parameters:

rdkind (str) – One of LIBR, UNIQ, CHREXTRA, COLLR, INTR, INTR + COLLR. Choose all (LIBR), only uniquely-mapped reads (UNIQ, leaving out repetitive sequences like tRNA, rRNA or common transposons), reads for extra sequences (CHREXTRA), or get intron-like gaps (INTR).
readlen (tuple) – Limits (int,int) of read lengths to show.
strand (str) – One of COMBI, MUNR, or CORB.
notitles (bool) – Flag to set display of figure title on graph.
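
The effect of the readlen limits can be pictured as a filter on a table of counts per read length, followed by a conversion to percentages. The following is a minimal sketch, not coalispr code: the helper names `filter_lengths` and `as_percentages`, the dict-based input and the made-up counts are all hypothetical, and it assumes that both limits are inclusive and that percentages are taken over the lengths shown (the documentation above does not specify either).

```python
def filter_lengths(counts, readlen):
    """Keep counts whose read length lies within the (min, max) limits.

    Assumption: both limits are inclusive.
    """
    lo, hi = readlen
    return {length: n for length, n in sorted(counts.items()) if lo <= length <= hi}


def as_percentages(counts):
    """Express each count as a percentage of the total of the given counts."""
    total = sum(counts.values())
    if total == 0:
        return {length: 0.0 for length in counts}
    return {length: 100.0 * n / total for length, n in counts.items()}


# Made-up counts per read length, for illustration only.
example_counts = {18: 10, 19: 30, 20: 40, 21: 15, 22: 5}
shown = filter_lengths(example_counts, (19, 21))
percentages = as_percentages(shown)
```

A mapping like `percentages` is the kind of input a bar plot of read-length distribution would take, with lengths on one axis and percentages on the other.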

coalispr.count_analyze.countfile_plots.compare_exp_lengths(rdkind, readlen, strand, group, notitles, use, showdiscards)¶

Obtain read-length distribution for all separate library samples.

Relevant file-titles: {kind}_[RLENCOUNTS | LENCOUNTS]_{strand}TSV.

Parameters:

rdkind (str) – One of LIBR, cDNAs (COLLR), CHRXTRA, COLLR, INTR, COLLR + INTR, possibly preceded by UNIQ or MULMAP to get only uniquely-mapped reads (UNIQ), or the repetitive sequences (MULMAP) like tRNA, rRNA or some transposons.
readlen (tuple) – Limits (int,int) of read lengths to show.
strand (str) – One of COMBI, MUNR, or CORB.
group (str) – Name determining grouping of samples, CATEGORY, METHOD, or FRACTION.
notitles (bool) – Flag to set display of figure title on graph.
use (str) – Tag to filter (SPECIFIC or UNSPECIFIC) particular reads.
showdiscards (bool) – Show discarded samples.

coalispr.count_analyze.countfile_plots.compare_libtotals(rdkind, strand, group, notitles, showdiscards, diff, log2)¶

Obtain read total library count distribution for two dataframes.

Relevant file-titles: “{kind}{COU}_{strand}{TSV}”.

Parameters:

rdkind (str) – One of LIBR, UNIQ, CHRXTRA, COLLR, INTR, INTR+COLLR, INTR+MULMAP, SKIP. Choose all (LIBR), only uniquely-mapped reads (UNIQ, leaving out repetitive sequences, MULMAP, like tRNA, rRNA or common transposons), reads for extra sequences (CHREXTRA), intron-like gaps (INTR), or reads skipped (SKIP) because of imperfect alignment according to the CIGAR string.
strand (str) – One of {ALL_COMBI, COMBI, MUNR, CORB}.
group (str) – Name determining grouping of samples, either CATEGORY, METHOD, or FRACTION.
notitles (bool) – Flag to set display of figure title on graph.
showdiscards (bool) – Show discarded samples.
diff (bool) – Show log2 difference or plain UNSELECTED values with log2 scale.
log2 (bool) – Use log2 scale if True.
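
The log2 difference mentioned for diff can be illustrated with a small calculation on two sets of library totals. This is a minimal sketch under stated assumptions, not the function's actual implementation: the helper `log2_difference`, the sample names and the totals are hypothetical, and which two totals coalispr compares is not stated above. Samples with a zero or missing total are skipped here because log2 is undefined for them.

```python
import math


def log2_difference(totals_a, totals_b):
    """Return log2(a) - log2(b) per sample with a positive total in both mappings."""
    result = {}
    for sample, a in totals_a.items():
        b = totals_b.get(sample, 0)
        if a > 0 and b > 0:
            result[sample] = math.log2(a) - math.log2(b)
    return result


# Made-up library totals, for illustration only.
first = {"sample1": 8000, "sample2": 1024, "sample3": 0}
second = {"sample1": 1000, "sample2": 4096}
diffs = log2_difference(first, second)
```

A positive value means the first total is larger, a negative value that the second is larger, and a difference of 1 corresponds to a two-fold change.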

coalispr.count_analyze.countfile_plots.compare_exp_bins(rdkind, strand, group, notitles, showdiscards)¶

Obtain bin distribution for all SPECIFIC samples.

Relevant file-titles: {kind}_{BCO}_{strand}TSV.

Parameters:

rdkind (str) – One of LIBR, UNIQ, CHRXTRA, COLLR, INTR, or COLLR+INTR. Choose all (LIBR), only uniquely-mapped reads (UNIQ, leaving out repetitive sequences like tRNA, rRNA or common transposons), or cDNAs (COLLR).
strand (str) – One of COMBI, MUNR, or CORB.
group (str) – Name determining grouping of samples, CATEGORY, METHOD, or FRACTION.
notitles (bool) – Flag to set display of figure title on graph.
showdiscards (bool) – Show discarded samples.

coalispr.count_analyze.countfile_plots.compare_exp_mulmaps(rdkind, strand, group, notitles, use, showdiscards)¶

Obtain hit distribution for multimappers in each library.

Relevant file-titles: {kind}_{MULMAP}_{strand}TSV.

Parameters:

rdkind (str) – One of LIBR or INTR.
strand (str) – One of COMBI, MUNR, or CORB.
group (str) – Name determining grouping of samples, CATEGORY, METHOD, or FRACTION.
notitles (bool) – Flag to set display of figure title on graph.
use (str) – Tag to filter particular (SPECIFIC or UNSPECIFIC) reads.
showdiscards (bool) – Show discarded samples.

coalispr.count_analyze.countfile_plots.compare_un_spec_mulmaps(rdkind, strand, notitles)¶

Compare distribution of hit-numbers (repeats) for SPECIFIC and UNSPECIFIC multimappers.

As set in process_bamdata, data from unused samples are not included, so that they do not affect the totals used to calculate percentages. Relevant file-titles: {kind}_[MULMAP]_ALL_{strand}TSV.

The output consists of diagrams with seaborn bar plots for the hit-number distribution of MULMAPs; these can be saved as png or svg.

Parameters:

rdkind (str) – One of LIBR or INTR.
strand (str) – One of COMBI, MUNR, or CORB.
notitles (bool) – Flag to set display of figure title on graph.
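
A hit-number distribution expressed as percentages can be sketched with the standard library. This is an illustration only, not coalispr code: the helper `hit_distribution`, the flat list of hit numbers and the values in it are hypothetical, and the real data are read from the count files named above.

```python
from collections import Counter


def hit_distribution(hit_numbers):
    """Return the percentage of reads for each hit number, sorted by hit number."""
    counts = Counter(hit_numbers)
    total = sum(counts.values())
    return {hits: 100.0 * n / total for hits, n in sorted(counts.items())}


# Made-up hit numbers (one per multimapping read), for illustration only.
example_hits = [2, 2, 3, 2, 5, 3, 2, 2]
distribution = hit_distribution(example_hits)
```

Because every percentage is taken against the same total, adding reads from samples that should not be counted would shift all values, which is why unused samples are left out of the totals.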