SISUA

Semi-supervised Single-cell modeling:

Free software: MIT license
Documentation: https://github.com/trungnt13/sisua/tree/master/docs

Reference:

Trung Ngo Trong, Roger Kramer, Juha Mehtonen, Gerardo González, Ville Hautamäki, Merja Heinäniemi. "SISUA: SemI-SUpervised Generative Autoencoder for Single Cell Data", ICML Workshop on Computational Biology, 2019. [pdf]

Installation

SISUA only requires Python 3.6. Install the stable version via pip:

pip install sisua

Or install the nightly version from GitHub:

pip install git+https://github.com/trungnt13/sisua@master

For developers, we provide a conda environment, sisua_env, for contributing to SISUA:

conda env create -f=sisua_env.yml
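
After any of the installation routes above, a quick sanity check can confirm that the interpreter meets the Python 3.6 requirement and that the package is visible to it. The sketch below uses only the standard library; the helper name check_install is hypothetical and is not part of SISUA.

```python
import importlib.util
import sys


def check_install(package="sisua", min_python=(3, 6)):
    """Return (python_ok, package_found) without importing the package.

    check_install is an illustrative helper, not a SISUA function.
    """
    # Compare only (major, minor) against the required version.
    python_ok = sys.version_info[:2] >= min_python
    # find_spec returns None when a top-level package cannot be located.
    package_found = importlib.util.find_spec(package) is not None
    return python_ok, package_found


if __name__ == "__main__":
    python_ok, package_found = check_install()
    print("Python >= 3.6:", python_ok)
    print("sisua importable:", package_found)
```

If the second value is False, the pip command most likely ran against a different interpreter or environment than the one executing the check.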

Getting started

The basics:
- Datasets description
- Models specification
- Basic API and workflow
Single-cell analysis:
- Latent space
- Imputation of gene expression
- Prediction of protein markers
Advanced technical topics:
- Probabilistic embedding
- Hierarchical modeling (coming soon)
- Causal analysis (coming soon)
- Cross-dataset analysis (coming soon)
Benchmarks:
- Scalability test
- Fine-tuning networks
- Data normalization

Roadmap

[x] Multi-OMICs single-cell dataset (link)
[x] Disentanglement VAE for multi-OMICs data (link)
[x] New models: FactorVAE, BetaVAE, MIxture Semi-supervised Autoencoder (MISA) (link)
[ ] Better imputation via a hierarchical latent model.
[ ] Release SISUA 2

Toolkits

We provide binary toolkits for fast and efficient analysis of single-cell datasets:

sisua-train: trains single-cell modeling algorithms; supports training multiple systems in parallel.
sisua-analyze: evaluates, compares, and interprets trained models.
sisua-embed: probabilistic embedding for semi-supervised training.
sisua-data: coming soon

Some important arguments:

-model

Name of a function declared in models:

scvi: single-cell Variational Inference model
dca: Deep Count Autoencoder
vae: single-cell Variational Autoencoder
movae: SISUA

-ds

Name of a dataset declared in data.

Descriptions of all predefined datasets are in docs.

Some good datasets for practice:

pbmc8k_ly
cortex
pbmcecc_ly
pbmcscvi
pbmcscvae
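
To make the two arguments above concrete, here is a minimal sketch that assembles sisua-train command lines from a model name and a dataset name. It only builds and prints the commands; nothing is executed. The helper build_train_command is hypothetical, and the exact way sisua-train parses its arguments is an assumption based on the -model and -ds flags listed above.

```python
import shlex

# Model names copied from the -model list above. Extend this tuple if
# the models module declares more functions.
MODELS = ("scvi", "dca", "vae", "movae")


def build_train_command(model, dataset, program="sisua-train"):
    """Return the argument list for one training run (hypothetical helper)."""
    if model not in MODELS:
        raise ValueError("unknown model %r, expected one of %s" % (model, MODELS))
    # Dataset names are not validated here: the list above only names
    # a few good practice datasets, not every dataset declared in data.
    return [program, "-model", model, "-ds", dataset]


if __name__ == "__main__":
    # One command per model on the same practice dataset.
    for model in MODELS:
        command = build_train_command(model, "pbmc8k_ly")
        print(" ".join(shlex.quote(part) for part in command))
```

Returning a list rather than a single string keeps each argument separate, so the result can be passed to subprocess.run without shell quoting issues.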

Configuration

By default, downloaded data is saved in your home folder at ~/bio_data, and experiment outputs are stored at ~/bio_log.

You can customize these two paths using the environment variables:

For storing downloaded and preprocessed data: SISUA_DATA
For the experiments: SISUA_EXP

For example:

import os
os.environ['SISUA_DATA'] = '/tmp/bio_data'
os.environ['SISUA_EXP'] = '/tmp/bio_log'

from sisua.data import EXP_DIR, DATA_DIR

print(DATA_DIR) # /tmp/bio_data
print(EXP_DIR)  # /tmp/bio_log

Alternatively, set the variables in the shell in advance:

export SISUA_DATA=/tmp/bio_data
export SISUA_EXP=/tmp/bio_log
python sisua/train.py
# or using the provided toolkit: sisua-train
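
The lookup order described in this section (environment variable first, default path under the home folder otherwise) can be sketched as follows. This illustrates the behavior only and is not SISUA's actual implementation; the function resolve_dir and its environ parameter are hypothetical.

```python
import os


def resolve_dir(env_name, default, environ=None):
    """Return the directory from an environment variable, or a default.

    Hypothetical helper: mirrors the documented behavior, not SISUA's code.
    """
    environ = os.environ if environ is None else environ
    # An unset or empty variable falls back to the default location.
    path = environ.get(env_name) or default
    # Expand "~" and normalize to an absolute path.
    return os.path.abspath(os.path.expanduser(path))


if __name__ == "__main__":
    print(resolve_dir("SISUA_DATA", "~/bio_data"))
    print(resolve_dir("SISUA_EXP", "~/bio_log"))
```

Passing a plain dictionary as environ makes the fallback logic easy to exercise without touching the real process environment.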
