BiC2PAM
BiC2PAM (BiClustering with Constraints using PAttern Mining) is a biclustering algorithm for unsupervised data analysis with domain knowledge. BiC2PAM integrates recent breakthroughs on pattern-based biclustering (including the BicPAM, BicNET and BicSPAM algorithms) and extends them to effectively incorporate constraints and annotations. In this context, the underlying pattern mining searches are adapted to learn from data with annotations derived from knowledge repositories, and enhanced to exploit the efficiency gains offered by constraints with succinct, (anti-)monotone and convertible properties.
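To make the role of (anti-)monotone constraints concrete, the following is a minimal sketch of how an anti-monotone constraint can prune a level-wise pattern search. It is not BiC2PAM's implementation: the function `constrained_itemsets`, the per-item `cost` table and the `budget` threshold are hypothetical names chosen for illustration, and a generic Apriori-style search stands in for the pattern miners used by BiC2PAM. The constraint "total cost of the pattern is at most `budget`" is anti-monotone when costs are non-negative: once a pattern violates it, every superset violates it too, so the whole branch can be discarded.

```python
from itertools import combinations


def constrained_itemsets(transactions, min_support, cost, budget):
    """Apriori-style search pruned by an anti-monotone cost constraint.

    Illustrative sketch only (hypothetical names, not the BiC2PAM API).
    Assumes non-negative costs, which makes `sum(cost) <= budget`
    anti-monotone: a violating itemset has only violating supersets.
    """
    def support(itemset):
        # Number of transactions containing every item of the itemset.
        return sum(1 for t in transactions if itemset <= t)

    def within_budget(itemset):
        return sum(cost[i] for i in itemset) <= budget

    def keep(itemset):
        # Both checks are anti-monotone, so failing either prunes the branch.
        return within_budget(itemset) and support(itemset) >= min_support

    items = sorted({i for t in transactions for i in t})
    level = [s for s in (frozenset([i]) for i in items) if keep(s)]
    result = []
    while level:
        result.extend(level)
        # Candidates are built only from surviving itemsets, so supersets
        # of pruned itemsets are never generated from two pruned parents.
        candidates = set()
        for a, b in combinations(level, 2):
            union = a | b
            if len(union) == len(a) + 1:
                candidates.add(union)
        level = [c for c in candidates if keep(c)]
    return result


transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
cost = {"a": 1, "b": 2, "c": 4}
patterns = constrained_itemsets(transactions, min_support=2, cost=cost, budget=5)
```

In this toy run the pair {b, c} exceeds the budget and is dropped, and {a, b, c} is rejected for the same reason, while the remaining pairs and all single items are kept.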
Authors: Rui Henriques and Sara Madeira
Please cite: contributions currently under review; contact Rui Henriques (rmch@tecnico.ulisboa.pt) to obtain the updated reference.
Synthetic datasets (non-exhaustive set):
- Datasets with constant biclusters (noise up to ±5%) and respective hidden biclusters can be downloaded here.
500x50, 1000x100, 2000x200, 4000x400 settings with both Uniform and Gaussian distribution of background values.
- Datasets with order-preserving biclusters (noise up to ±5%) and respective hidden biclusters can be downloaded here.
500x50, 1000x75, 2000x100 settings with both Uniform and Gaussian distribution of background values.
- Datasets with varying levels of planted noise and missing values can be downloaded here (constant) and here (order-preserving).
- Annotated data with artificially generated labels for the recovery of constant biclusters with label-consistency (2000x200 setting):
data#1 (10±4 labels/row, 200±10 rows/label), data#2 (10±4,100±10), data#3 (4±2,200±10), data#4 (4±2,100±10).
- dlblc.arff (660 genes, 180 conditions): diffuse large-B-cell lymphoma.
- hughes.arff (6300 genes, 300 conditions): oligonucleotide array for Saccharomyces cerevisiae.
- gasch.txt (6152 genes, 176 conditions): Yeast responses to different stress conditions.
- human.sig (6314 nodes, 423335 interactions): human GIs from STRING database.
- ecoli.sig (8428 nodes, 3293416 interactions): Escherichia coli GIs from STRING database.
- yeast.sig (19247 nodes, 8548002 interactions): yeast GIs from STRING database.
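For readers who want to reproduce a setting similar to the synthetic datasets with constant biclusters listed earlier, here is a minimal sketch of planting one constant bicluster in a matrix of uniform background values. It is not the generator used to build the distributed files: the function `plant_constant_bicluster` and its parameters are hypothetical, values are assumed to lie in [0, 1], and "noise up to ±5%" is interpreted here as a uniform perturbation of at most 0.05 on that range.

```python
import random


def plant_constant_bicluster(n_rows, n_cols, bic_rows, bic_cols,
                             value=0.5, noise=0.05, seed=0):
    """Return (matrix, rows, cols) with one planted constant bicluster.

    Illustrative sketch only (hypothetical generator, not the one used
    for the distributed datasets). Background values are uniform in
    [0, 1]; planted cells equal `value` plus noise in [-noise, +noise].
    """
    rng = random.Random(seed)
    # Uniform background values.
    matrix = [[rng.uniform(0.0, 1.0) for _ in range(n_cols)]
              for _ in range(n_rows)]
    # Rows and columns of the hidden bicluster (need not be contiguous).
    rows = sorted(rng.sample(range(n_rows), bic_rows))
    cols = sorted(rng.sample(range(n_cols), bic_cols))
    # Overwrite the selected cells with the noisy constant value.
    for r in rows:
        for c in cols:
            matrix[r][c] = value + rng.uniform(-noise, noise)
    return matrix, rows, cols


# Example: a 500x50 setting with one hidden 20x5 bicluster.
matrix, rows, cols = plant_constant_bicluster(500, 50, 20, 5)
```

The returned `rows` and `cols` play the role of the hidden-bicluster description that accompanies each dataset, so a recovered bicluster can be compared against the planted one.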
Results: statistical sheets and biological analyses
Software