coalispr.bedgraph_analyze.collect_bedgraphs
Module for collecting data and configured experiment information.
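Attributes
- logger
Functions
- label_frame(): Dataframe linking files by a filename-derived key to a short name.
- get_categories_dict(): Returns dict for categories with SHORT names as keys.
- checkSRCDIR(src_dir, tag): Returns path to main folder with folders containing data files.
- get_experiments([category, method, fraction, plusdiscards]): Returns list of short names based on properties of experiments.
- collect_bedgraphs(tag[, src_dir, ndirlevels, category]): Find file paths to bedgraphs and return them as two dictionaries (PLUS- and MINUS-strand data).
- collect_references(): Find file paths to bedgraphs for reference data and return them as two dictionaries (PLUS- and MINUS-strand data).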
Module Contents
- coalispr.bedgraph_analyze.collect_bedgraphs.logger
- coalispr.bedgraph_analyze.collect_bedgraphs.label_frame()
Dataframe linking files by a filename-derived key to a short name.
This is the starting dataframe. It defines abbreviated names for display (SHORT), their CATEGORY, etc., and is built from a tab-separated text file, EXPFILE, that contains all details. The function tries to correct errors that easily occur in EXPFILE, such as a value in a group column that consists only of spaces.
- Returns:
Dataframe of EXPFILE with SHORT as index.
- Return type:
pandas.DataFrame
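The cleanup described above can be pictured with a short pandas sketch. This is only an illustration under stated assumptions, not coalispr's implementation: the helper `read_label_frame` and the column names `FILEKEY`, `CATEGORY` and `GROUP` are hypothetical, and the real EXPFILE layout may differ.

```python
import io

import pandas as pd


def read_label_frame(expfile):
    """Read a tab-separated experiment file and index it by SHORT.

    Hypothetical helper for illustration; not part of coalispr.
    """
    # Read everything as text and keep empty fields as empty strings.
    df = pd.read_csv(expfile, sep="\t", dtype=str, keep_default_na=False)
    # Strip surrounding whitespace, so a value of only spaces becomes "".
    df = df.apply(lambda col: col.str.strip())
    # Treat empty strings as missing values.
    df = df.replace("", pd.NA)
    return df.set_index("SHORT")


# Small in-memory stand-in for EXPFILE (assumed columns).
example = io.StringIO(
    "FILEKEY\tSHORT\tCATEGORY\tGROUP\n"
    "sample_001\twt1\tunspecific\t  \n"
    "sample_002\tmut1\tmutant\tg1\n"
)
frame = read_label_frame(example)
```

Here the whitespace-only `GROUP` value of `wt1` ends up as a missing value rather than as a spurious group named by spaces.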
- coalispr.bedgraph_analyze.collect_bedgraphs.get_categories_dict()
Returns dict for categories with SHORT names as keys.
- coalispr.bedgraph_analyze.collect_bedgraphs.checkSRCDIR(src_dir, tag)
Returns path to main folder with folders containing data files.
- Parameters:
src_dir (str) – Folder name (SRCFLDR) to create a Path for.
tag (str) – Flag indicating the kind of data (TAGUNCOLL or TAGCOLL); required to find data files, but can be unset (None) for references.
- coalispr.bedgraph_analyze.collect_bedgraphs.get_experiments(category=None, method=None, fraction=None, plusdiscards=True)
Returns list of short names based on properties of experiments.
- Parameters:
category (str or list) – Name or list of category item(s) as present in EXPFILE
method (str or list) – Name or list of method(s) as given in EXPFILE
fraction (str or list) – Name or list of fraction(s) as given in EXPFILE
plusdiscards (bool) – Flag to indicate whether to include samples marked as a discard.
- Returns:
A list of SHORT names representing the samples/experiments requested.
- Return type:
list
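One way to picture this selection is as a series of filters on the label dataframe. The sketch below is an assumption-laden illustration rather than coalispr's code: the helper `select_experiments`, the column names `CATEGORY`, `METHOD`, `FRACTION` and `DISCARD`, and the convention that a non-empty `DISCARD` value marks a discarded sample are all hypothetical.

```python
import pandas as pd


def _as_list(value):
    """Wrap a single string in a list; pass other iterables through as a list."""
    return [value] if isinstance(value, str) else list(value)


def select_experiments(df, category=None, method=None, fraction=None,
                       plusdiscards=True):
    """Return SHORT names (the index of df) that match the given properties."""
    mask = pd.Series(True, index=df.index)
    for column, wanted in (("CATEGORY", category), ("METHOD", method),
                           ("FRACTION", fraction)):
        if wanted is not None:
            mask &= df[column].isin(_as_list(wanted))
    if not plusdiscards:
        # Assumed convention: a non-empty DISCARD value marks a discard.
        mask &= df["DISCARD"].isna()
    return df.index[mask.to_numpy()].tolist()


# Toy label dataframe with made-up values.
labels = pd.DataFrame(
    {
        "CATEGORY": ["specific", "unspecific", "specific"],
        "METHOD": ["rip1", "rip1", "rip2"],
        "FRACTION": ["total", "total", "ip"],
        "DISCARD": [None, None, "x"],
    },
    index=pd.Index(["a1", "n1", "a2"], name="SHORT"),
)

all_specific = select_experiments(labels, category="specific")
kept_specific = select_experiments(labels, category="specific",
                                   plusdiscards=False)
```

Accepting either a string or a list for each property mirrors the "str or list" parameter types documented above.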
- coalispr.bedgraph_analyze.collect_bedgraphs.collect_bedgraphs(tag, src_dir=SRCFLDR, ndirlevels=SRCNDIRLEVEL, category=None)
Find file paths to bedgraphs and return them as two dictionaries (PLUS- and MINUS-strand data).
- Parameters:
tag (str (default: TAGUNCOLL)) – Flag to indicate the kind of aligned reads, TAGUNCOLL or TAGCOLL.
src_dir (str (default: SRCFLDR)) – Path, as a string, to the main folder (file path + TAGCOLL or + TAGUNCOLL) with folders containing data files.
ndirlevels (int (default: SRCNDIRLEVEL)) – Number of folders in between SRCFLDR and data files.
category (str or list) – Name or list of category item(s) as present in EXPFILE.
- Returns:
A tuple of dictionaries with FILEKEY items and paths to separate bedgraph files for PLUS- and MINUS-strand data.
- Return type:
dict, dict
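The idea of searching a fixed number of folder levels below the source folder and splitting the hits by strand can be sketched with `pathlib`. This is a hedged illustration only: the helper `find_strand_bedgraphs`, the file suffixes `_plus.bedgraph` and `_minus.bedgraph`, and the way the key is derived from the file name are assumptions, not coalispr's actual naming scheme.

```python
import tempfile
from pathlib import Path


def find_strand_bedgraphs(src_dir, ndirlevels=1,
                          plus_suffix="_plus.bedgraph",
                          minus_suffix="_minus.bedgraph"):
    """Return (plus, minus) dicts mapping a file key to a bedgraph path."""
    # One "*" per folder level between src_dir and the data files.
    pattern = "/".join(["*"] * ndirlevels + ["*.bedgraph"])
    plus, minus = {}, {}
    for path in sorted(Path(src_dir).glob(pattern)):
        for suffix, target in ((plus_suffix, plus), (minus_suffix, minus)):
            if path.name.endswith(suffix):
                # Assumed key: the file name without the strand suffix.
                target[path.name[: -len(suffix)]] = path
    return plus, minus


# Demonstration on a throwaway folder tree.
with tempfile.TemporaryDirectory() as tmp:
    for key in ("sample_001", "sample_002"):
        folder = Path(tmp) / key
        folder.mkdir()
        (folder / f"{key}_plus.bedgraph").touch()
        (folder / f"{key}_minus.bedgraph").touch()
    plus_files, minus_files = find_strand_bedgraphs(tmp, ndirlevels=1)
    found_keys = sorted(plus_files)
```

Returning two dictionaries with the same keys makes it straightforward to pair the PLUS- and MINUS-strand files of one sample.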
- coalispr.bedgraph_analyze.collect_bedgraphs.collect_references()
Find file paths to bedgraphs for reference data and return them as two dictionaries (PLUS- and MINUS-strand data).
- Returns:
A tuple of dictionaries with FILEKEY items and paths to separate bedgraph files for the reference, with PLUS- and MINUS-strand data.
- Return type:
dict, dict