# CNA_origin
**Repository Path**: wanqikang/CNA_origin
## Basic Information
- **Project Name**: CNA_origin
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-08-03
- **Last Updated**: 2021-08-03
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# CNA_origin
We proposed a two-step computational framework called CNA_origin to predict the tissue-of-origin of a tumor from its gene CNA levels. CNA origin set up an intellectual deep-learning network mainly composed of autoencoder and convolution neural network (CNN).
If you want to use CNA_origin, you must have gene-level CNA file and label file.
The use of CNA_origin:
CNA_origin.py -T PATH_GENE_CNV: File of the gene CNV
-G PATH_LABEL: File of the sample label
[-d DIM_NUMBER]:The Number of Features after Dimension Reduction, default:100
[-k K_CROSS_VALIDATION]: k fold cross validation, default:10
[-s TRAINING_PART_SCALE]: Split scale for train/test,default:0.1
[-o OUTPUT_FILE]: The result output path
The merge-group file contains sample label information. The merge-sample file contains the gene-level CNA information of 50 samples. The complete datasets were from primary solid tumor samples released by MSKCC in 2013, which could be downloaded from http://cbio.mskcc.org/cancergenomics/pancan_tcga/ or http://gdac.broadinstitute.org/. We recommend using dataset with sample size greater than 400.
for example: python CNA_origin.py -T merge-sample -G merge-group
CNA origin was implemented in python 3.7.3 using keras (2.24) with the backend of tensorflow (1.14.0)
The program now has a bug that can only be run using CPU (not GPU). We are trying to fix it.
If you have any question,please send email to aliang1229@126.com. We will continue to improve the code of CNA_origin.