coalispr.resources.utilities¶
Attributes¶
Functions¶
bins_sum(df, level=BINN) | Sum bin-values to get bin totals.
check_newpath(p) |
chrom_region(chrnam, region) | Create label for chromosome region.
chrxtra() | Check for the presence of extra DNA and annotations.
clean_dict(adict) | Remove empty items from a dictionary.
doneon() | Return date of function called (for saving files).
get_bnypath() | Return location of folder with BNY files.
get_count_folder(kind) | Return folder name for stored count files.
get_skip() | Provide a value of fragment size skipped during counting, which depends on BINSTEP and MIRNAPKBUF.
get_string_info(df) | Return a string for logging info, bypassing standard out.
get_tsvpath() | Return location of folder with TSV files.
get_ylabel(label, strand=COMBI, spaces=0) | Return formatted label for y-axis of count plots.
include_test() | Include test command for profiling functions.
is_all_zero(df) | Is this a dataframe with only 0 values?
is_backedup(pathtofile, moveit=True) | Create a backup of file or directory or link, return success.
joinall(labels, conn="', '") | Return string of words from list or dict of labels.
joiner(symb=None) | Quote list-items when joined to string.
merg(df1, df2) | Merge bedgraphs for each chr on intervals with hits.
multi_process(func, keys) | Generalized multi_processor function for counting reads.
percentaged(df) | Turn dataframe values into percentages of column totals.
read_bny(filenam, kwargs) | Reader of tabbed binary files, returns dataframe.
read_tsv(filenam, kwargs=None) | Reader of tabbed csv files, returns dataframe.
remove_odds(termodds) | Prevent spaces or odd symbols in filename string.
replace_dot(termdot) | Remove dots from file name.
replacelist(linestring, names_old_new) | Replace sections in a string, pathname, print line etc.
replacelist_list(listofitems, names_old_new) | Replace items in a list.
thisfunc(n=0) | Return name of current or calling function for logging.
timer(func) | Print the time needed to run the decorated function.
write_bny(df, filenam, kwargs=None) | Writer of binary files from dataframe.
write_suffix(suffix, df, filenam, kwargs=None) | Select and write file based on suffix.
write_tsv(df, filenam, kwargs=None) | Writer of tabbed csv files from dataframe.
Module Contents¶
- coalispr.resources.utilities.logger¶
- coalispr.resources.utilities.TEST = False¶
- coalispr.resources.utilities.binary_read¶
- coalispr.resources.utilities.binary_write¶
- coalispr.resources.utilities.bins_sum(df, level=BINN)¶
Sum bin-values to get bin totals.
- Parameters:
df (pandas.DataFrame) – Dataframe with bins that constitute a level to be converted.
level (str) – Column header with indices to be grouped (default=BINN).
- Return type:
pandas.DataFrame
- coalispr.resources.utilities.check_newpath(p)¶
- coalispr.resources.utilities.chrom_region(chrnam, region)¶
Create label for chromosome region.
- Parameters:
chrnam (str) – Chromosome name.
region (tuple) – Tuple of coordinates.
- coalispr.resources.utilities.chrxtra()¶
Check for the presence of extra DNA and annotations.
- coalispr.resources.utilities.clean_dict(adict)¶
Remove empty items from a dictionary.
- Parameters:
adict (dict) – Dictionary
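As an illustration only, the behaviour described above could look like the sketch below. It assumes that "empty" means `None` or a zero-length string or container, which the docstring does not spell out; `clean_dict_sketch` is a hypothetical stand-in, not the coalispr implementation.

```python
def _is_empty(val):
    """Assumed meaning of 'empty': None or a zero-length string/container."""
    if val is None:
        return True
    return isinstance(val, (str, list, tuple, dict, set)) and len(val) == 0


def clean_dict_sketch(adict):
    """Return a new dict without the items whose value is empty."""
    # Values such as 0 or False are kept: they are data, not empty containers.
    return {key: val for key, val in adict.items() if not _is_empty(val)}


cleaned = clean_dict_sketch({"a": [1], "b": [], "c": None, "d": 0, "e": ""})
```

Returning a new dictionary rather than deleting keys in place avoids changing a dictionary while iterating over it.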
- coalispr.resources.utilities.doneon()¶
Return date of function called (for saving files).
- coalispr.resources.utilities.get_bnypath()¶
Return location of folder with BNY files.
- coalispr.resources.utilities.get_count_folder(kind)¶
Return folder name for stored count files.
- Parameters:
kind (str) – Kind of reads: UNSPECIFIC or SPECIFIC
Notes
- TAGBAM: str (bam)
Flag to indicate the sort of aligned reads, TAGCOLL or TAGUNCOLL, used to obtain bam-alignments.
- TAGSEG: str (segments)
Flag to indicate the sort of aligned reads, TAGCOLL or TAGUNCOLL, used to obtain segment definitions.
- LOG2BG: int (over)
Exponent to set threshold above which read signals are considered; part of folder name with stored count files.
- UNSPECLOG10: float (unspec)
Exponent to set difference between SPECIFIC and UNSPECIFIC reads; part of folder name with stored count files.
- gaps: int
Region tolerated between peaks of mapped reads to form a contiguous segment, USEGAPS or UNSPCGAPS.
- coalispr.resources.utilities.get_skip()¶
Provide a value of fragment size skipped during counting, which depends on BINSTEP and MIRNAPKBUF.
- Returns:
A value representing an extra margin to expand a read segment with a single peak beyond 0.
- Return type:
int
- coalispr.resources.utilities.get_string_info(df)¶
Return a string for logging info, bypassing standard out.
- Parameters:
df (pandas.DataFrame) – Dataframe to get info for.
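A common way to capture dataframe info as a string, rather than having it printed to standard out, is to pass a text buffer to `DataFrame.info`. The sketch below shows that pattern; `get_string_info_sketch` and the example columns are hypothetical, and the source does not confirm that coalispr uses this exact approach.

```python
import io

import pandas as pd


def get_string_info_sketch(df):
    """Return the text that df.info() would otherwise print."""
    buffer = io.StringIO()
    df.info(buf=buffer)  # write the summary into the buffer, not to stdout
    return buffer.getvalue()


example = pd.DataFrame({"plus": [1, 2], "minus": [0, 3]})
info = get_string_info_sketch(example)
```

The returned string can then be handed to a logger call instead of appearing on the console.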
- coalispr.resources.utilities.get_tsvpath()¶
Return location of folder with TSV files.
- coalispr.resources.utilities.get_ylabel(label, strand=COMBI, spaces=0)¶
Return formatted label for y-axis of count plots.
- Parameters:
label (str) – Read kind name to retrieve a label for configured in CNTLABELS.
strand (str) – One of COMBI, MUNR or CORB to indicate strand counted reads map to.
spaces (int) – Number of spaces to start second line with.
- coalispr.resources.utilities.include_test()¶
Include test command for profiling functions.
- coalispr.resources.utilities.is_all_zero(df)¶
Is this a dataframe with only 0 values?
- Parameters:
df (pandas.DataFrame) – Dataframe to check.
- Returns:
Flag to indicate whether all values are 0.
- Return type:
bool
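A minimal sketch of such a check with pandas is given below. The name `is_all_zero_sketch` is hypothetical and the actual implementation may differ, for instance in how missing values are treated.

```python
import pandas as pd


def is_all_zero_sketch(df):
    """Return True when every cell of df equals 0."""
    # First .all() reduces each column, second .all() reduces the columns.
    return bool((df == 0).all().all())


zeros = pd.DataFrame({"s1": [0, 0], "s2": [0, 0]})
mixed = pd.DataFrame({"s1": [0, 0], "s2": [0, 4]})
```

In this sketch a NaN cell counts as non-zero, because `NaN == 0` evaluates to False.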
- coalispr.resources.utilities.is_backedup(pathtofile, moveit=True)¶
Create a backup of file or directory or link, return success.
- Parameters:
pathtofile (Path) – Path to object to be overwritten/replaced with a new version with the same name.
moveit (bool) – Flag to indicate to rename and move file (default) or copy it.
- Returns:
Flag to indicate backup process went through.
- Return type:
bool
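The move-or-copy behaviour controlled by `moveit` can be sketched with the standard library as follows. This is a simplified illustration under stated assumptions: it only handles regular files, and the `.bak` naming scheme and the name `is_backedup_sketch` are hypothetical, not taken from coalispr.

```python
import shutil
import tempfile
from pathlib import Path


def is_backedup_sketch(pathtofile, moveit=True):
    """Back up a regular file next to the original; return True on success."""
    src = Path(pathtofile)
    if not src.is_file():
        return False
    backup = src.with_name(src.name + ".bak")  # hypothetical naming scheme
    try:
        if moveit:
            shutil.move(str(src), str(backup))  # original name becomes free
        else:
            shutil.copy2(src, backup)  # original stays in place
    except OSError:
        return False
    return backup.exists()


with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp) / "counts.tsv"
    second = Path(tmp) / "lengths.tsv"
    first.write_text("a\tb\n")
    second.write_text("c\td\n")
    copied = is_backedup_sketch(first, moveit=False)
    still_there = first.exists()
    moved = is_backedup_sketch(second, moveit=True)
    gone = not second.exists()
```

Returning a flag instead of raising lets the caller decide whether to go ahead with overwriting the original.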
- coalispr.resources.utilities.joinall(labels, conn="', '")¶
Return string of words from list or dict of labels.
- Parameters:
labels (list or dict) – List/Dictionary of lists of words to be joined.
conn (str) – Connector linking the words from labels.
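The default connector `"', '"` only supplies the quotes between words, so the caller adds the outer pair. The sketch below illustrates this; it assumes `labels` is either a flat list of words or a dictionary whose values are lists of words, and `joinall_sketch` is a hypothetical stand-in for the real function.

```python
def joinall_sketch(labels, conn="', '"):
    """Join words from a list, or from all lists stored in a dict."""
    if isinstance(labels, dict):
        # Flatten the dictionary values (assumed to be lists of words).
        words = [word for group in labels.values() for word in group]
    else:
        words = list(labels)
    return conn.join(words)


# The caller wraps the result in the opening and closing quote.
quoted = f"'{joinall_sketch(['a1', 'a2', 'b1'])}'"
plain = joinall_sketch({"x": ["a", "b"], "y": ["c"]}, conn=", ")
```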
- coalispr.resources.utilities.joiner(symb=None)¶
Quote list-items when joined to string. Add the start and end ' in the calling format function (using 'repr' by including !r gives no control over whether ' or " is used).
- coalispr.resources.utilities.merg(df1, df2)¶
Merge bedgraphs for each chr on intervals with hits.
All rows/columns need to be combined; this creates duplicate columns with adapted names when non-unique columns are merged.
- Parameters:
df1 (pandas.DataFrame) – Dataframes to merge
df2 (pandas.DataFrame) – Dataframes to merge
- Returns:
Merged dataframe.
- Return type:
pandas.DataFrame
- coalispr.resources.utilities.multi_process(func, keys)¶
Generalized multi_processor function for counting reads.
- Parameters:
func (function) – Name of function to run in separate process.
keys (list) – List of keys (SHORT names), each leading to one alignment file to be counted in a separate process.
- Returns:
collected_objects – Collection of objects generated by fusing outcomes of each process.
- Return type:
list
- coalispr.resources.utilities.percentaged(df)¶
Turn dataframe values into percentages of column totals.
- Parameters:
df (pandas.DataFrame) – Dataframe with raw counts
- Return type:
pandas.DataFrame
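As a sketch of the conversion described, each column can be divided by its own total and scaled to 100. The function name `percentaged_sketch` and the sample columns are hypothetical; how the real function rounds or handles empty columns is not stated in the source.

```python
import pandas as pd


def percentaged_sketch(df):
    """Express every value as a percentage of its column total."""
    # df.sum() gives one total per column; division aligns on column names.
    # A column whose total is 0 yields NaN in this sketch.
    return df / df.sum() * 100


counts = pd.DataFrame({"s1": [1, 3], "s2": [5, 5]})
pct = percentaged_sketch(counts)
```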
- coalispr.resources.utilities.read_bny(filenam, kwargs)¶
Reader of tabbed binary files, returns dataframe.
- Parameters:
filenam (Path or str) – Input binary file
kwargs (dict) – Additional parameters
- coalispr.resources.utilities.read_tsv(filenam, kwargs=None)¶
Reader of tabbed csv files, returns dataframe.
- Parameters:
filenam (Path or str) – Input text file
kwargs (dict) – Additional parameters fitting pd.read_csv; omit comment and sep.
- coalispr.resources.utilities.remove_odds(termodds)¶
Prevent spaces or odd symbols in filename string.
- Parameters:
termodds (str) – String for a filename that possibly contains symbols or spaces.
- Returns:
Lower case name without odds; not to be confused with extension.
- Return type:
str
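One possible way to sanitise such a string is a regular-expression substitution, sketched below. The choice of `_` as replacement and the set of characters that are kept are assumptions for illustration; `remove_odds_sketch` is not the library function.

```python
import re


def remove_odds_sketch(termodds):
    """Return a lower-case name that is safe to use in a filename."""
    # Assumption: every run of characters other than letters, digits,
    # '-' and '_' collapses into a single underscore.
    cleaned = re.sub(r"[^A-Za-z0-9_-]+", "_", termodds)
    return cleaned.strip("_").lower()


safe = remove_odds_sketch("My Sample (rep 1)")
```

Because dots are replaced as well in this sketch, it should be applied to the name part only, before an extension is appended.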
- coalispr.resources.utilities.replace_dot(termdot)¶
Remove dots from file name.
- Parameters:
termdot (str) – String for a filename (excluding extension) that possibly contains dots ('.').
- Returns:
Name without dot(s); not to be confused with extension.
- Return type:
str
- coalispr.resources.utilities.replacelist(linestring, names_old_new)¶
Replace sections in a string, pathname, print line etc.
- Parameters:
linestring (str) – String with particular sections to be replaced.
names_old_new (list of tuples) – Contents of listed tuples: (search-string, replacement).
- Returns:
String after replacement.
- Return type:
str
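The list-of-tuples interface can be illustrated with a simple loop over `str.replace`. This is a minimal sketch with a hypothetical name, `replacelist_sketch`; whether the library applies the pairs in the same sequential manner is an assumption.

```python
def replacelist_sketch(linestring, names_old_new):
    """Apply each (search-string, replacement) pair in turn."""
    for old, new in names_old_new:
        linestring = linestring.replace(old, new)
    return linestring


renamed = replacelist_sketch(
    "/data/run1/sample_A.bam",
    [("run1", "run2"), (".bam", ".tsv")],
)
```

With sequential application the order of the tuples matters, since a later pair sees the result of the earlier ones.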
- coalispr.resources.utilities.replacelist_list(listofitems, names_old_new)¶
Replace items in a list.
- Parameters:
listofitems (list) – List with particular items to be changed.
names_old_new (list of tuples) – Contents of listed tuples: (search-string, replacement).
- Returns:
List with items including those that have been replaced.
- Return type:
list
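For lists, a comparable sketch is shown below. It assumes that an item is replaced only when it equals the search-string as a whole, which the docstring leaves open; `replacelist_list_sketch` is a hypothetical name.

```python
def replacelist_list_sketch(listofitems, names_old_new):
    """Return a new list in which matching items are swapped for replacements."""
    lookup = dict(names_old_new)  # search-string -> replacement
    # Items without an entry in the lookup are passed through unchanged.
    return [lookup.get(item, item) for item in listofitems]


samples = replacelist_list_sketch(["wt", "mut1", "mut2"], [("mut1", "ago1")])
```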
- coalispr.resources.utilities.thisfunc(n=0)¶
Return name of current or calling function for logging.
- Parameters:
n (int) – For current func name, specify 0 or no argument. For name of caller of current func, specify 1. For name of caller of caller of current func, specify 2. etc.
- Returns:
Name of function containing call.
- Return type:
str
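The meaning of `n` can be illustrated by walking up the call stack. The sketch below relies on `sys._getframe`, which is specific to CPython; `thisfunc_sketch`, `inner` and `outer` are hypothetical names, and the library may obtain the name differently.

```python
import sys


def thisfunc_sketch(n=0):
    """Return the name of the function n levels above the caller."""
    # Frame 0 is thisfunc_sketch itself, hence the extra offset of 1.
    return sys._getframe(n + 1).f_code.co_name


def inner():
    # n=0: the function containing the call; n=1: the caller of that function.
    return thisfunc_sketch(), thisfunc_sketch(1)


def outer():
    return inner()


names = outer()
```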
- coalispr.resources.utilities.timer(func)¶
Print the time needed to run the decorated function. Source: https://realpython.com/primer-on-python-decorators/
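A timing decorator of this kind typically wraps the call between two clock readings. The sketch below follows that general pattern; the names `timer_sketch` and `total` and the message format are assumptions, not the coalispr code.

```python
import functools
import time


def timer_sketch(func):
    """Print how long the decorated function took to run."""

    @functools.wraps(func)  # keep the name and docstring of func
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"Finished {func.__name__!r} in {elapsed:.4f} s")
        return result

    return wrapper


@timer_sketch
def total(n):
    return sum(range(n))


answer = total(1000)
```

`functools.wraps` keeps the original function name visible, which matters when names are looked up for logging.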
- coalispr.resources.utilities.write_bny(df, filenam, kwargs=None)¶
Writer of binary files from dataframe.
- Parameters:
df (pd.DataFrame) – Dataframe to save.
filenam (Path or str) – Output binary file.
kwargs (dict) – Additional parameters fitting pd.to_<format>.
- coalispr.resources.utilities.write_suffix(suffix, df, filenam, kwargs=None)¶
Select and write file based on suffix.
- coalispr.resources.utilities.write_tsv(df, filenam, kwargs=None)¶
Writer of tabbed csv files from dataframe.
- Parameters:
df (pd.DataFrame) – Dataframe to save.
filenam (Path or str) – Output path for tabbed text file.
kwargs (dict) – Additional parameters, apart from ‘comment’, ‘sep’, ‘quoting’, ‘quotingchar’, or ‘escapechar’, fitting pd.to_csv.