# partial-labels

**Repository Path**: fulowl/partial-labels

## Basic Information

- **Project Name**: partial-labels
- **Description**: multi label classification with missing labels
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2019-12-09
- **Last Updated**: 2020-12-19

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# partial-labels

## Presentation

This is the working repository for a research project about partial labels.

Authors are [Laura Calem](https://lcalem.github.io/) and [Olivier Petit](https://www.olivier-petit.fr/).

## Setup

### 1. Download repo

1. Go to a folder of your choice.
2. `git clone git@github.com:lcalem/partial-labels.git .`
3. Put `/partial-labels` in your `PYTHONPATH` by adding this line to your `/.bashrc` or whatever file you're using:

   ```
   export PYTHONPATH="${PYTHONPATH}://partial-labels"
   ```

### 2. Datasets

#### 2.1. PascalVOC

2.1.1. Download
- Make a dataset folder and `cd` into it (it will be called )
- Download the Pascal VOC 2007 dataset:
  - trainval: `wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar`
  - test: `wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar`

2.1.2. Untar everything in place; all annotations and images should end up in the right folders.

- `tar xvf VOCtrainval_06-Nov-2007.tar`
- `tar xvf VOCtest_06-Nov-2007.tar`

2.1.3. Preprocess using `data/pascalvoc/preprocessing/pp_multilabel.py` to create a single annotation csv (do it once for trainval and a second time for test, separately):

- `python3 pp_multilabel.py train`
- `python3 pp_multilabel.py val`

2.1.4. Use `data/pascalvoc/preprocessing/partial_datasets.py` to create partial datasets: `python3 partial_datasets.py /Annotations`

#### 2.2. MS COCO

2.2.1. Download
- Make a dataset folder and `cd` into it (it will be called )
- Download the MS COCO 2014 dataset:
  - train images: `wget http://images.cocodataset.org/zips/train2014.zip`
  - val images: `wget http://images.cocodataset.org/zips/val2014.zip`
  - train+val annotations: `wget http://images.cocodataset.org/annotations/annotations_trainval2014.zip`

2.2.2. Unzip everything in place.

2.2.3. Preprocess using `data/coco/preprocessing/pp_multilabel.py` to create a single annotation csv (do it once for train and once for val, separately):

- `python3 pp_multilabel.py train2014`
- `python3 pp_multilabel.py val2014`

This creates the complete csv datasets `/annotations/multilabel_train2014.csv` and `/annotations/multilabel_val2014.csv`.

2.2.4. Use `data/coco/preprocessing/partial_datasets.py` to create partial datasets (one with 10% known labels, one with 20% known labels, and so on up to 100% known labels, which should be identical to `multilabel_train2014.csv`): `python3 partial_datasets.py /annotations/multilabel_train2014.csv`
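
To make the idea of a partial dataset concrete, here is a minimal sketch of how label masking at increasing proportions could work. It is not the code of `partial_datasets.py`: the function name `make_partial`, the in-memory row layout, the label encoding (`1` positive, `-1` negative, `0` unknown) and the choice to sample known labels independently per image are all assumptions made for illustration.

```python
import random

UNKNOWN = 0  # assumed marker for a missing label; the repository's encoding may differ


def make_partial(rows, known_percent, seed=0):
    """Return a copy of `rows` keeping only `known_percent` % of each image's labels.

    rows: list of (image_id, labels), labels being a list with values in {1, -1}
    (hypothetical layout, not necessarily the one used by the csv files).
    """
    rng = random.Random(seed)
    n_classes = len(rows[0][1]) if rows else 0
    n_known = n_classes * known_percent // 100  # number of labels kept per image
    partial = []
    for image_id, labels in rows:
        # Pick which class indices stay known for this image.
        keep = set(rng.sample(range(n_classes), n_known))
        masked = [lab if i in keep else UNKNOWN for i, lab in enumerate(labels)]
        partial.append((image_id, masked))
    return partial


# Toy "complete" dataset with 10 classes (made-up values).
complete = [
    ("img_0", [1, -1, -1, 1, -1, -1, -1, 1, -1, -1]),
    ("img_1", [-1, -1, 1, -1, -1, 1, -1, -1, -1, -1]),
]

# One partial dataset per proportion: 10%, 20%, ..., 100% known labels.
partials = {pct: make_partial(complete, pct) for pct in range(10, 101, 10)}
```

In this sketch the 100% version keeps every label, so it equals the complete dataset, which mirrors the identity check mentioned in step 2.2.4. Fixing the seed makes the masks reproducible across runs.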