coalispr.count_analyze.load_countfile
Module to obtain dataframes from count files.
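Attributes
- logger
Functions
- load_count_table: Retrieve count tables with given keywords in the folder/filename.
- load_lengths_table_perc: Retrieve lengthcounts from coalispr.bedgraph_analyze.process_bamdata as percentages for given kind of reads.
- load_bin_table_perc: Retrieve bin-counts from coalispr.bedgraph_analyze.process_bamdata as percentages for given kind of reads.
- get_freq_frame: Get frame with distribution for index idx and column allc.
- get_readlengths_frame: Add columns PLOTSTRT, PLOTLEN to frame with read lengths.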
Module Contents
- coalispr.count_analyze.load_countfile.logger
- coalispr.count_analyze.load_countfile.load_count_table(name, use=SPECIFIC, bam=TAGCOLL, segments=TAGUNCOLL, overmax=LOG2BG, maincut=UNSPECLOG10, usegaps=USEGAPS, index_col=0, debug=0)
Retrieve count tables with given keywords in the folder/filename.
The count files come from coalispr.bedgraph_analyze.process_bamdata.
- Parameters:
name (str) – Name of particular count file to retrieve.
use (str (default: SPECIFIC)) – What type of counted reads to use, i.e. SPECIFIC or UNSPECIFIC.
bam (str (default: TAGCOLL)) – Flag to indicate sort of aligned-reads, TAGCOLL or TAGUNCOLL, used to obtain bam-alignments.
segments (str (default: TAGUNCOLL)) – Flag to indicate sort of aligned-reads, TAGCOLL or TAGUNCOLL, used to obtain segment definitions.
overmax (int (default: LOG2BG)) – Exponent to set threshold above which read signals are considered; part of folder name with stored count files.
maincut (float (default: UNSPECLOG10)) – Exponent to set difference between SPECIFIC and UNSPECIFIC reads; part of folder name with stored count files.
usegaps (int (default: USEGAPS)) – Region tolerated between peaks of mapped reads to form a contiguous segment; part of folder name with stored count files.
index_col (int (default: 0)) – Number of the index column in the table loaded from the csv file.
debug (int) – Level to obtain relevant function calling this support function.
- Return type:
pandas.DataFrame
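Several parameters above are described as "part of folder name with stored count files". As a rough illustration of that idea only, the sketch below selects files whose folder name carries every requested keyword. The helper `find_count_files`, the example paths, and the keyword spellings are all hypothetical and not part of coalispr; the real naming scheme is defined by the library.

```python
from pathlib import PurePosixPath


def find_count_files(paths, name, keywords):
    """Return paths whose file name contains `name` and whose parent
    folder name holds every keyword as a '_'-separated token.

    Hypothetical helper for illustration; not the coalispr API.
    """
    matches = []
    for path in map(PurePosixPath, paths):
        # Compare whole tokens so 'specific' does not match 'unspecific'.
        tokens = set(path.parent.name.split("_"))
        if name in path.name and all(str(k) in tokens for k in keywords):
            matches.append(str(path))
    return matches


# Made-up paths; real folder names are produced by coalispr itself.
candidates = [
    "counts/specific_collapsed_gap10/libcounts.tsv",
    "counts/unspecific_collapsed_gap10/libcounts.tsv",
    "counts/specific_collapsed_gap10/lengthcounts.tsv",
]
selected = find_count_files(candidates, "libcounts", ["specific", "gap10"])
print(selected)
```

Matching on whole tokens rather than substrings matters here because one keyword (specific) is contained in another (unspecific).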
- coalispr.count_analyze.load_countfile.load_lengths_table_perc(name, use, debug)
Retrieve lengthcounts from coalispr.bedgraph_analyze.process_bamdata as percentages for given kind of reads.
- Parameters:
name (str) – Name of particular count file to retrieve.
use (str) – What type of counted reads to use, i.e. SPECIFIC or UNSPECIFIC.
debug (int) – Level to obtain relevant function calling this support function.
- Returns:
Dataframe with percentaged counts for read-lengths.
- Return type:
pandas.DataFrame
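What "percentaged counts for read-lengths" can mean is sketched below with plain dictionaries. This assumes each sample is normalized against its own total; whether coalispr uses that total or another one is not stated here. The function `to_percentages` and the sample names are hypothetical.

```python
def to_percentages(counts):
    """Convert {sample: {read_length: count}} into percentages per sample.

    Assumption: each sample is scaled by its own sum of counts.
    """
    result = {}
    for sample, by_length in counts.items():
        total = sum(by_length.values())
        result[sample] = {
            # Guard against empty samples to avoid division by zero.
            length: (100.0 * n / total if total else 0.0)
            for length, n in by_length.items()
        }
    return result


# Made-up counts of reads per read length (in nt) for two samples.
raw = {
    "sample_a": {21: 30, 22: 50, 23: 20},
    "sample_b": {21: 1, 22: 3},
}
perc = to_percentages(raw)
print(perc["sample_b"])
```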
- coalispr.count_analyze.load_countfile.load_bin_table_perc(name, use, debug)
Retrieve bin-counts from coalispr.bedgraph_analyze.process_bamdata as percentages for given kind of reads.
- coalispr.count_analyze.load_countfile.get_freq_frame(name, use, debug, idx=PLOTINTRLEN, allc=PLOTPERC)
Get frame with distribution for index idx and column allc.
- Parameters:
name (str) – Name of particular count file to retrieve.
use (str) – What type of counted reads to use, i.e. SPECIFIC or UNSPECIFIC.
idx (str) – Name of index column, PLOTINTRLEN for lengths (introns) or PLOTFREQ for number of hits (multimappers).
allc (str) – Name of column with values to be shown, PLOTINTR for lengths or PLOTMMAP for multimappers
- Returns:
Dataframe with frequencies for lengths or multimappers.
- Return type:
pandas.DataFrame
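A frequency distribution of this kind can be pictured as a table with one row per observed value (the index) and a count next to it (the value column). The sketch below builds such a table with the standard library. The function `frequency_table` and the input numbers are hypothetical and only illustrate the shape of the result, not how coalispr computes it.

```python
from collections import Counter


def frequency_table(values):
    """Return sorted (value, count) pairs for the observed values.

    The first element plays the role of the index (e.g. intron length
    or number of hits), the second that of the value column.
    """
    return sorted(Counter(values).items())


# Made-up intron lengths; real values come from the count files.
intron_lengths = [52, 60, 52, 75, 60, 52]
table = frequency_table(intron_lengths)
for length, count in table:
    print(length, count)
```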
- coalispr.count_analyze.load_countfile.split_readlengths_index(df)
- coalispr.count_analyze.load_countfile.get_readlengths_frame(name, use, debug)
Add columns PLOTSTRT, PLOTLEN to frame with read lengths.
- Parameters:
name (str) – Name of file to load.
use (str) – Which kind of reads to use, SPECIFIC or UNSPECIFIC.
debug (int) – For debugging: the number of levels separating the calling function.