coalispr.count_analyze.load_countfile

Module to obtain dataframes from count files.

Attributes

logger

Functions

`load_count_table`(name[, use, bam, segments, overmax, ...])	Retrieve count tables with given keywords in the folder/filename.
`load_lengths_table_perc`(name, use, debug)	Retrieve lengthcounts from `coalispr.bedgraph_analyze.process_bamdata`
`load_bin_table_perc`(name, use, debug)	Retrieve bin-counts from `coalispr.bedgraph_analyze.process_bamdata`
`get_freq_frame`(name, use, debug[, idx, allc])	Get frame with distribution for index `idx` and column `allc`.
`split_readlengths_index`(df)
`get_readlengths_frame`(name, use, debug)	Add columns PLOTSTRT, PLOTLEN to frame with read lengths.

Module Contents

coalispr.count_analyze.load_countfile.logger

coalispr.count_analyze.load_countfile.load_count_table(name, use=SPECIFIC, bam=TAGCOLL, segments=TAGUNCOLL, overmax=LOG2BG, maincut=UNSPECLOG10, usegaps=USEGAPS, index_col=0, debug=0)

Retrieve count tables with given keywords in the folder/filename.

The count files come from coalispr.bedgraph_analyze.process_bamdata.

Parameters:

name (str) – Name of particular count file to retrieve.
use (str (default: SPECIFIC)) – What type of counted reads to use, i.e. SPECIFIC or UNSPECIFIC.
bam (str (default: TAGCOLL)) – Flag indicating the kind of aligned reads, TAGCOLL or TAGUNCOLL, used to obtain the bam alignments.
segments (str (default: TAGUNCOLL)) – Flag indicating the kind of aligned reads, TAGCOLL or TAGUNCOLL, used to obtain the segment definitions.
overmax (int (default: LOG2BG)) – Exponent to set threshold above which read signals are considered; part of folder name with stored count files.
maincut (float (default: UNSPECLOG10)) – Exponent to set difference between SPECIFIC and UNSPECIFIC reads; part of folder name with stored count files.
usegaps (int (default: USEGAPS)) – Region tolerated between peaks of mapped reads to form a contiguous segment; part of folder name with stored count files.
index_col (int (default: 0)) – Number of the index column in the table loaded from the csv file.
debug (int (default: 0)) – Number of levels separating this support function from the relevant calling function; used for debugging.

Return type:

pandas.DataFrame
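The role of `index_col` can be illustrated with plain pandas. This is a minimal sketch and not the coalispr implementation: the comma delimiter, the sample names, the segment labels and the helper `read_count_text` are all assumptions made for illustration, and the real count files may be laid out differently.

```python
import io

import pandas as pd

# Hypothetical stand-in for a stored count file; names and values are invented.
COUNT_TEXT = """segment,sample_a,sample_b
chr1:100-250,12,30
chr1:900-1040,0,7
chr2:15-180,44,3
"""


def read_count_text(text, index_col=0):
    """Read delimited count text into a frame indexed by column `index_col`."""
    return pd.read_csv(io.StringIO(text), index_col=index_col)


counts = read_count_text(COUNT_TEXT)
print(counts)
```

With `index_col=0` the first column supplies the row labels, so the remaining columns hold only counts and can be summed or scaled directly.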

coalispr.count_analyze.load_countfile.load_lengths_table_perc(name, use, debug)

Retrieve length counts from coalispr.bedgraph_analyze.process_bamdata as percentages for the given kind of reads.

Parameters:

name (str) – Name of particular count file to retrieve.
use (str) – What type of counted reads to use, i.e. SPECIFIC or UNSPECIFIC.
debug (int) – Level to obtain relevant function calling this support function.

Returns:

Dataframe with read-length counts expressed as percentages.

Return type:

pandas.DataFrame
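As an illustration of what "counts as percentages" can mean, the sketch below scales each column of a count frame so that it sums to 100. It assumes that percentages are taken per sample (column); the documentation above does not state which total is used. The helper `to_percentages` and the data are hypothetical.

```python
import pandas as pd


def to_percentages(counts):
    """Scale each column so that its values sum to 100."""
    totals = counts.sum(axis=0)
    # Multiply first, then divide every column by its own total.
    return counts.mul(100).div(totals, axis=1)


# Invented read-length counts for two hypothetical samples.
length_counts = pd.DataFrame(
    {"sample_a": [10, 30, 60], "sample_b": [5, 5, 10]},
    index=pd.Index([20, 21, 22], name="length"),
)
length_perc = to_percentages(length_counts)
print(length_perc)
```

Expressing counts this way makes samples with different sequencing depths comparable in one plot.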

coalispr.count_analyze.load_countfile.load_bin_table_perc(name, use, debug)

Retrieve bin-counts from coalispr.bedgraph_analyze.process_bamdata as percentages for given kind of reads.

coalispr.count_analyze.load_countfile.get_freq_frame(name, use, debug, idx=PLOTINTRLEN, allc=PLOTPERC)

Get frame with distribution for index idx and column allc.

Parameters:

name (str) – Name of particular count file to retrieve.
use (str) – What type of counted reads to use, i.e. SPECIFIC or UNSPECIFIC.
debug (int) – Level to obtain relevant function calling this support function.
idx (str) – Name of index column, PLOTINTRLEN for lengths (introns) or PLOTFREQ for number of hits (multimappers).
allc (str) – Name of column with values to be shown, PLOTINTR for lengths or PLOTMMAP for multimappers.

Returns:

Dataframe with frequencies for lengths or multimappers.

Return type:

pandas.DataFrame
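The shape of such a frame, one named index and one value column, can be sketched with pandas alone. The helper `freq_frame` is hypothetical, and the strings `"length"` and `"count"` are placeholders for the constants passed as `idx` and `allc`, whose actual values are not given here.

```python
import pandas as pd


def freq_frame(values, idx="length", allc="count"):
    """Count how often each value occurs; index named `idx`, one column `allc`."""
    freq = pd.Series(values).value_counts().sort_index()
    frame = freq.to_frame(name=allc)
    frame.index.name = idx
    return frame


# Invented intron lengths; 52 occurs three times, 60 twice, 75 once.
intron_lengths = [52, 60, 52, 75, 60, 52]
frame = freq_frame(intron_lengths)
print(frame)
```

The same helper would serve for multimapper hit numbers by passing other names for `idx` and `allc`.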

coalispr.count_analyze.load_countfile.split_readlengths_index(df)

coalispr.count_analyze.load_countfile.get_readlengths_frame(name, use, debug)

Add columns PLOTSTRT, PLOTLEN to frame with read lengths.

Parameters:

name (str) – Name of file to load.
use (str) – Which kind of reads to use, SPECIFIC or UNSPECIFIC.
debug (int) – For debugging: the number of levels by which the calling function is separated from this function.
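A `debug` level of this kind, counting how many calls separate a support function from the function of interest, can be resolved with the standard library's `inspect` module. How coalispr uses the value internally is not shown in this reference, so the sketch below is only one possible approach; every function name in it is hypothetical.

```python
import inspect


def caller_name(level=1):
    """Return the name of the function `level` frames above this one."""
    stack = inspect.stack()
    # stack[0] is caller_name itself, stack[1] its direct caller, and so on.
    if level < 0 or level >= len(stack):
        return None
    return stack[level].function


def support_function(debug=1):
    """Report which function sits `debug` levels above this one."""
    # The extra 1 skips the frame of support_function itself.
    return caller_name(debug + 1)


def plotting_routine(debug=1):
    return support_function(debug=debug)


def main_entry():
    # Two levels up from support_function is main_entry.
    return plotting_routine(debug=2)


print(plotting_routine())
print(main_entry())
```

Passing the level explicitly lets a log message name the function a user actually invoked rather than the helper that reads the file.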