Coalispr introduction¶
Coalispr (COunt ALIgned SPecified Reads) is a Python tool to clean up (small) RNA sequencing results. It can visualize over 100 bedgraphs in one panel [1] and helps to retrieve read counts from associated bam files without reliance on reference features (GTF annotations).
Features¶
- Fast and voluminous
  - Input files are bedgraph files.
- Interactive visualization
  - All bedgraphs for a chromosome are shown (with Matplotlib).
  - Toggle data display.
  - Load GTF files for overlap with annotated features.
  - Signal scale: normal or log2.
  - Save snapshots as svg, jpg, pdf or png.
- Counting
  - Map contiguous regions of unspecific or specific reads.
  - These segment definitions are stored in tsv files to:
    - Split segments into a number of bins to profile coverage.
    - Collect counts for various read properties and save to tsv files.
    - Obtain counts for particular chromosomal regions.
  - Thus, counting relies on genome coordinates, not GTF references.
- Analysis
  - Count-outputs can be diagrammed (with Matplotlib and Seaborn).
  - Compare numbers for reads, cDNAs, introns, multimappers.
  - Check length-distributions of reads, also for a particular genomic region.
  - Annotate count files with gene-information from GTF references.
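The idea of splitting a segment into bins to profile coverage can be illustrated with a minimal sketch. This is not Coalispr's implementation: the function names `parse_bedgraph` and `bin_coverage`, the sample data, and the choice to weight each bedgraph value by the number of bases overlapping a bin are assumptions made for illustration. The sketch relies only on the standard bedgraph layout of four whitespace-separated fields (chromosome, start, end, value) with 0-based, half-open coordinates.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[str, int, int, float]  # chrom, start, end (half-open), value


def parse_bedgraph(lines: Iterable[str]) -> List[Interval]:
    """Parse bedgraph data lines, skipping track/browser headers and comments."""
    records = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("track", "browser", "#")):
            continue
        chrom, start, end, value = line.split()[:4]
        records.append((chrom, int(start), int(end), float(value)))
    return records


def bin_coverage(records, chrom, seg_start, seg_end, n_bins):
    """Sum (value x overlapping bases) per bin for segment [seg_start, seg_end)."""
    width = (seg_end - seg_start) / n_bins
    totals = [0.0] * n_bins
    for rec_chrom, start, end, value in records:
        if rec_chrom != chrom:
            continue
        for i in range(n_bins):
            bin_start = seg_start + i * width
            bin_end = bin_start + width
            overlap = min(end, bin_end) - max(start, bin_start)
            if overlap > 0:
                totals[i] += value * overlap
    return totals


# Hypothetical sample data, not taken from the Coalispr tutorials.
example = [
    "track type=bedGraph",
    "chr1\t0\t10\t2.0",
    "chr1\t10\t20\t4.0",
    "chr2\t0\t20\t9.0",
]
records = parse_bedgraph(example)
profile = bin_coverage(records, "chr1", 0, 20, 4)
print(profile)
```

Because the segment is addressed by chromosome and coordinates only, no GTF annotation is needed to obtain the profile.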
For a rationale and application of Coalispr see the essay: ‘Bio‑informatics: Integrate negative controls to get the good data’.
Requirements¶
Enough RAM for loading genome data.
The numerical expression evaluator for NumPy, Numexpr, can help to get the most out of your machine's computing capabilities [7].
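Whether Numexpr is visible to the Python interpreter can be checked without importing it. The following is a small sketch using only the standard library; the helper name `has_module` is hypothetical and not part of Coalispr.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module can be found by the current interpreter."""
    return importlib.util.find_spec(name) is not None


# Report on Numexpr and the libraries it is meant to accelerate.
for module in ("numexpr", "numpy", "pandas"):
    status = "found" if has_module(module) else "not installed"
    print(f"{module}: {status}")
```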
Installation¶
Coalispr can be downloaded from Codeberg.org and PyPI.org.
Configuration files with properties will have to be edited by the user to analyze their own data (see Tutorials). Therefore, this package is best installed locally in user space, not system wide. Alternatively, the program can be installed in a virtual environment [8].
After extraction of the source archive, go to the coalispr project folder with the setup.py and pyproject.toml files and run in a terminal (as user):
python3 -m pip install --editable .
This also makes it easy to adapt source code and directly test the changes.
A script, callable from the command line with coalispr, will be installed locally [9] (alternatively, you can run python3 -m coalispr instead of coalispr).
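When Coalispr is called from another script, the choice between the two invocation forms can be automated. The sketch below assumes nothing beyond the two forms named above; the helper `coalispr_command` is hypothetical, and it uses the current interpreter (`sys.executable`) in place of a literal `python3`.

```python
import shutil
import sys
from typing import List


def coalispr_command(script: str = "coalispr") -> List[str]:
    """Return the command prefix: the script if it is on PATH, else the module form."""
    if shutil.which(script):
        return [script]
    # Fall back to running the package as a module with the current interpreter.
    return [sys.executable, "-m", "coalispr"]


# Build the help command; pass this list to subprocess.run() to execute it.
print(coalispr_command() + ["-h"])
```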
With installs of pandas-2.x, please link coalispr/resources/numeric.py to python3/site_packages/pandas/core/indexes/ (see here).
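One way to create such a link from Python is sketched below. It assumes that "link" means a symbolic link and that an existing numeric.py in the target folder should be left alone; the helper `link_numeric` is hypothetical and not part of Coalispr. The example runs against a throwaway directory tree, so no real pandas install is touched.

```python
import tempfile
from pathlib import Path


def link_numeric(source: Path, pandas_dir: Path) -> Path:
    """Symlink `source` as numeric.py inside <pandas_dir>/core/indexes/."""
    target = pandas_dir / "core" / "indexes" / "numeric.py"
    if target.exists() or target.is_symlink():
        return target  # leave an existing file or link untouched
    target.symlink_to(source.resolve())
    return target


# Dry run against a temporary directory tree instead of a real pandas install.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    source = root / "coalispr" / "resources" / "numeric.py"
    source.parent.mkdir(parents=True)
    source.write_text("# placeholder\n")
    fake_pandas = root / "pandas"
    (fake_pandas / "core" / "indexes").mkdir(parents=True)
    link = link_numeric(source, fake_pandas)
    linked_ok = link.is_symlink() and link.read_text() == "# placeholder\n"
    print(linked_ok)
```

For a real installation, the pandas folder could be located with `Path(pandas.__file__).parent` after `import pandas`; check the result before linking anything into it.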
Installation can be done in a virtual environment as described in INSTALL.txt.
Run Coalispr¶
In a terminal, run the following command, which shows the various options for Coalispr:
coalispr -h
See the How-to guides and the Tutorials for more information.
Contribute¶
All resources for Coalispr are accessible at:
Source Code: https://codeberg.org/coalispr/coalispr
Issue Tracker: https://codeberg.org/coalispr/coalispr/issues
Documentation¶
This documentation is online at https://coalispr.codeberg.page/
Sources for the documentation can be found at https://codeberg.org/coalispr/coalispr/docs
Datasets supporting the tutorials are published at Zenodo.org under DOI 10.5281/zenodo.12822543
Licences¶
The program source code is published under the European Union Public Licence (EUPL).
The documentation is under the Creative Commons Attribution License (CC-BY-4.0).