coalispr.count_analyze.countfile_plots

Module to plot count comparisons.

Attributes

Functions

compare_un_spec_lengths(rdkind, readlen, strand, notitles)

Compare distribution of lengths of all SPECIFIC and UNSPECIFIC reads.

compare_exp_lengths(rdkind, readlen, strand, group, ...)

Obtain read-length distribution for all separate library samples.

compare_libtotals(rdkind, strand, group, notitles, ...)

Obtain read total library count distribution for two dataframes.

compare_exp_bins(rdkind, strand, group, notitles, ...)

Obtain bin distribution for all SPECIFIC samples.

compare_exp_mulmaps(rdkind, strand, group, notitles, ...)

Obtain hit distribution for multimappers in each library.

compare_un_spec_mulmaps(rdkind, strand, notitles)

Compare distribution of hit-numbers (repeats) for SPECIFIC and UNSPECIFIC multimappers.

Module Contents

coalispr.count_analyze.countfile_plots.logger
coalispr.count_analyze.countfile_plots.compare_un_spec_lengths(rdkind, readlen, strand, notitles)

Compare distribution of lengths of all SPECIFIC and UNSPECIFIC reads.

As set in process_bamdata, data from unused samples are not included. Relevant file-titles: {kind}_[RLENCOUNTS | LENCOUNTS]_ALL_{strand}TSV.

The output consists of diagrams with seaborn bar plots for the read-length distribution of all or only UNIQ reads; these can be saved as png or svg.

Parameters:
  • rdkind (str) – One of LIBR, UNIQ, CHRXTRA, COLLR, INTR, INTR + COLLR. Choose all (LIBR), only uniquely-mapped reads (UNIQ, leaving out repetitive sequences like tRNA, rRNA or common transposons), reads for extra sequences (CHRXTRA), or get intron-like gaps (INTR).

  • readlen (tuple) – Limits (int,int) of read lengths to show.

  • strand (str) – One of COMBI, MUNR, or CORB.

  • notitles (bool) – Flag to set display of figure title on graph.
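
The effect of the readlen limits and the comparison of two length distributions can be pictured with a small, stand-alone sketch. It does not call coalispr; the helper names clip_lengths and as_percentages, the example counts, the assumption that both limits are inclusive, and the choice to compute percentages after clipping are all illustrative and not taken from the module.

```python
def clip_lengths(counts, readlen):
    """Keep only read lengths within the (min, max) limits (assumed inclusive)."""
    low, high = readlen
    return {length: n for length, n in counts.items() if low <= length <= high}


def as_percentages(counts):
    """Convert raw counts per read length to percentages of their total."""
    total = sum(counts.values())
    if total == 0:
        return {length: 0.0 for length in counts}
    return {length: 100.0 * n / total for length, n in counts.items()}


# Made-up counts per read length; real values would come from the count files.
specific = {19: 10, 20: 30, 21: 50, 22: 10, 30: 5}
unspecific = {19: 25, 20: 25, 21: 25, 22: 25, 30: 40}

readlen = (19, 22)
spec_pct = as_percentages(clip_lengths(specific, readlen))
unspec_pct = as_percentages(clip_lengths(unspecific, readlen))

# Print the two distributions side by side, one row per read length.
for length in range(readlen[0], readlen[1] + 1):
    print(length, spec_pct[length], unspec_pct[length])
```

Expressing both sets as percentages makes their shapes comparable even when the raw totals differ strongly.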

coalispr.count_analyze.countfile_plots.compare_exp_lengths(rdkind, readlen, strand, group, notitles, use, showdiscards)

Obtain read-length distribution for all separate library samples.

Relevant file-titles: {kind}_[RLENCOUNTS | LENCOUNTS]_{strand}TSV.

Parameters:
  • rdkind (str) – One of LIBR, CHRXTRA, COLLR (cDNAs), INTR, or COLLR + INTR, possibly preceded by UNIQ or MULMAP to get only uniquely-mapped reads (UNIQ) or the repetitive sequences (MULMAP), like tRNA, rRNA or some transposons.

  • readlen (tuple) – Limits (int,int) of read lengths to show.

  • strand (str) – One of COMBI, MUNR, or CORB.

  • group (str) – Name determining grouping of samples, CATEGORY, METHOD, or FRACTION.

  • notitles (bool) – Flag to set display of figure title on graph.

  • use (str) – Tag to filter particular (SPECIFIC or UNSPECIFIC) reads.

  • showdiscards (bool) – Show discarded samples.
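
As an illustration of what grouping by CATEGORY, METHOD, or FRACTION could amount to, the sketch below arranges sample names by one annotation column. It is a stand-alone example under stated assumptions: the function group_samples, the sample names, and the annotation values are hypothetical and do not come from coalispr.

```python
from collections import defaultdict

# Hypothetical sample annotations; the keys mirror the group names in the text,
# the values are invented for this example.
samples = [
    {"name": "s1", "CATEGORY": "mutant", "METHOD": "rip", "FRACTION": "total"},
    {"name": "s2", "CATEGORY": "mutant", "METHOD": "input", "FRACTION": "total"},
    {"name": "s3", "CATEGORY": "reference", "METHOD": "rip", "FRACTION": "nuclear"},
]


def group_samples(samples, group):
    """Return {group value: [sample names]} for the chosen annotation key."""
    if group not in ("CATEGORY", "METHOD", "FRACTION"):
        raise ValueError(f"unknown group: {group}")
    grouped = defaultdict(list)
    for sample in samples:
        grouped[sample[group]].append(sample["name"])
    return dict(grouped)


print(group_samples(samples, "METHOD"))
```

Samples that share a value for the chosen key end up together, which is the arrangement a grouped plot would display.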

coalispr.count_analyze.countfile_plots.compare_libtotals(rdkind, strand, group, notitles, showdiscards, diff, log2)

Obtain read total library count distribution for two dataframes.

Relevant file-titles: “{kind}{COU}_{strand}{TSV}”.

Parameters:
  • rdkind (str) – One of LIBR, UNIQ, CHRXTRA, COLLR, INTR, INTR+COLLR, INTR+MULMAP, SKIP. Choose all (LIBR), only uniquely-mapped reads (UNIQ, leaving out repetitive sequences, MULMAP, like tRNA, rRNA or common transposons), reads for extra sequences (CHRXTRA), intron-like gaps (INTR), or reads skipped (SKIP) because of imperfect alignment according to the CIGAR string.

  • strand (str) – One of {ALL_COMBI, COMBI, MUNR, CORB}.

  • group (str) – Name determining grouping of samples, either CATEGORY, METHOD, or FRACTION.

  • notitles (bool) – Flag to set display of figure title on graph.

  • showdiscards (bool) – Show discarded samples.

  • diff (bool) – Show log2 difference, or plain UNSELECTED values with log2 scale.

  • log2 (bool) – Use log2 scale if True.
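
For readers unfamiliar with the term, a log2 difference between two totals can be sketched as below. This is a minimal stand-alone example, not the module's code: the function log2_difference, the library names, and the numbers are assumptions, and which two totals compare_libtotals actually compares is not specified here.

```python
import math


def log2_difference(first, second):
    """Return log2(first / second), which equals log2(first) - log2(second)."""
    if first <= 0 or second <= 0:
        raise ValueError("totals must be positive to take a logarithm")
    return math.log2(first / second)


# Invented library totals: (first count, second count) per library.
totals = {"lib1": (8000, 1000), "lib2": (500, 2000)}

for name, (first, second) in totals.items():
    print(name, log2_difference(first, second))
```

A value of 0 means the two totals are equal; each step of 1 up or down corresponds to a doubling or halving, so large and small libraries can share one axis.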

coalispr.count_analyze.countfile_plots.compare_exp_bins(rdkind, strand, group, notitles, showdiscards)

Obtain bin distribution for all SPECIFIC samples.

Relevant file-titles: {kind}_{BCO}_{strand}TSV.

Parameters:
  • rdkind (str) – One of LIBR, UNIQ, CHRXTRA, COLLR, INTR, or COLLR+INTR. Choose all (LIBR), only uniquely-mapped reads (UNIQ, leaving out repetitive sequences like tRNA, rRNA or common transposons), or cDNAs (COLLR).

  • strand (str) – One of COMBI, MUNR, or CORB.

  • group (str) – Name determining grouping of samples, CATEGORY, METHOD, or FRACTION.

  • notitles (bool) – Flag to set display of figure title on graph.

  • showdiscards (bool) – Show discarded samples.

coalispr.count_analyze.countfile_plots.compare_exp_mulmaps(rdkind, strand, group, notitles, use, showdiscards)

Obtain hit distribution for multimappers in each library.

Relevant file-titles: {kind}_{MULMAP}_{strand}TSV.

Parameters:
  • rdkind (str) – One of LIBR or INTR.

  • strand (str) – One of COMBI, MUNR, or CORB.

  • group (str) – Name determining grouping of samples, CATEGORY, METHOD, or FRACTION.

  • notitles (bool) – Flag to set display of figure title on graph.

  • use (str) – Tag to filter particular (SPECIFIC or UNSPECIFIC) reads.

  • showdiscards (bool) – Show discarded samples.

coalispr.count_analyze.countfile_plots.compare_un_spec_mulmaps(rdkind, strand, notitles)

Compare distribution of hit-numbers (repeats) for SPECIFIC and UNSPECIFIC multimappers.

As set in process_bamdata, data from unused samples are not included, so that they do not affect the totals used for calculating percentages. Relevant file-titles: {kind}_[MULMAP]_ALL_{strand}TSV.

The output consists of diagrams with seaborn bar plots for the hit-number distribution of MULMAPs; these can be saved as png or svg.

Parameters:
  • rdkind (str) – One of LIBR or INTR.

  • strand (str) – One of COMBI, MUNR, or CORB.

  • notitles (bool) – Flag to set display of figure title on graph.
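
Why leaving out unused samples matters for the percentages can be seen in a small stand-alone sketch. The function hit_percentages, the sample names, and the counts are hypothetical; the sketch only assumes that counts are pooled per hit-number and then divided by the pooled total, which may differ from what coalispr does internally.

```python
def hit_percentages(hit_counts, unused=()):
    """Pool counts per hit-number over used samples and return percentages.

    hit_counts maps sample name -> {hit_number: count}.
    """
    pooled = {}
    for sample, counts in hit_counts.items():
        if sample in unused:
            continue  # skip samples marked as unused
        for hits, n in counts.items():
            pooled[hits] = pooled.get(hits, 0) + n
    total = sum(pooled.values())
    if total == 0:
        return {}
    return {hits: 100.0 * n / total for hits, n in sorted(pooled.items())}


# Invented multimapper counts: number of reads per hit-number, per sample.
hit_counts = {
    "s1": {2: 30, 3: 10},
    "s2": {2: 50, 4: 10},
    "s3": {2: 900},  # pretend this sample is unused
}

print(hit_percentages(hit_counts, unused={"s3"}))
print(hit_percentages(hit_counts))
```

In this made-up case the single unused sample would dominate the pooled total and flatten the percentages for every other hit-number.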