Manage datasets
This section explains how to upload custom datasets to Deep Learning Studio.

How to add datasets in Deep Learning Studio (DLS)

There are two ways to upload a dataset:
1) From the Datasets tab
2) From the File Browser
    Web-based
    Native browser

1) How to add a dataset using the "Datasets" tab

    Click the "Datasets" tab in the left navigation bar
    Click the upload icon at the top right to upload a dataset
    Clicking the upload icon opens a pop-up console where you drag and drop the zipped dataset folder
Important!
    Make sure you have zipped the dataset folder before uploading it.
    The zipped dataset must be smaller than 1 GB.
    Drag and drop the zipped folder onto the datasets canvas
    Select the dataset format from the drop-down menu
    Three dataset formats are available:
      DLS Native
      Image Folder Dataset
      MS COCO Dataset
    Click Start Upload
      The upload may take a few seconds to complete, depending on the size of the dataset.
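
The zipping step and the size limit above can be checked before opening the upload dialog. The following is a minimal sketch, not part of DLS itself: the function name `zip_dataset` is hypothetical, and reading "1 GB" as 1024^3 bytes is an assumption.

```python
import shutil
from pathlib import Path
from typing import Tuple

ONE_GB = 1024 ** 3  # assumption: the 1 GB limit is interpreted as 1024^3 bytes


def zip_dataset(dataset_dir: str, limit_bytes: int = ONE_GB) -> Tuple[Path, bool]:
    """Zip dataset_dir next to itself and report whether the archive is under the limit."""
    src = Path(dataset_dir).resolve()
    if not src.is_dir():
        raise NotADirectoryError(f"{src} is not a folder")
    # root_dir/base_dir keep the dataset folder as the top-level entry in the archive.
    archive = shutil.make_archive(str(src), "zip", root_dir=src.parent, base_dir=src.name)
    zip_path = Path(archive)
    return zip_path, zip_path.stat().st_size < limit_bytes
```

If the second returned value is False, the archive is too large for the Datasets tab and the File Browser route is the one to use.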

2) How to upload a dataset from the "File Browser"

1) How to upload datasets using the "Native" file browser
    Click "File Browser" in the left navigation bar, then "Native". This opens the file explorer (<DLS Folder>/user_data/1 /)
    Open the "dataset" folder
    Create, copy, or move your custom dataset into this folder
    Download dataset_config.yaml and place it in your dataset folder
    dataset_format : <dataset format>
    source : Upload
You may need to modify this config file: update dataset_format in dataset_config.yaml to match the format of your dataset.
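
For datasets prepared by a script, the config file can also be written programmatically. This is a minimal sketch under stated assumptions: `write_dataset_config` is a hypothetical helper, and the exact strings DLS accepts for dataset_format are not listed here, so the value is passed through unchanged and should be taken from the downloaded template.

```python
from pathlib import Path


def write_dataset_config(dataset_dir: str, dataset_format: str) -> Path:
    """Write dataset_config.yaml with the two keys shown above into dataset_dir."""
    folder = Path(dataset_dir)
    if not folder.is_dir():
        raise NotADirectoryError(f"{folder} is not a folder")
    config_path = folder / "dataset_config.yaml"
    # The key names and "source : Upload" mirror the template; the format value is caller-supplied.
    config_path.write_text(
        f"dataset_format : {dataset_format}\nsource : Upload\n", encoding="utf-8"
    )
    return config_path
```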

The following sections describe which dataset format to select for a custom dataset.

Upload Instructions

    Use the DLS Datasets tab for zipped datasets smaller than 1 GB (recommended)
    Use the DLS "Native" File Browser for datasets larger than 1 GB

Dataset Formats

1. DLS Native

    1.  Folder format dataset for image classification
        root/class_x/xxx.ext
        root/class_x/xxy.ext
        root/class_x/xxz.ext

        root/class_y/123.ext
        root/class_y/nsdf3.ext
        root/class_y/asd932_.ext
    2.  CSV file named train.csv with two or more columns
        e.g. to create an IMDB-like dataset:
        text : the text encoded as a string of semicolon-separated numbers, padded as needed to keep the sequence length fixed
        label : the rating (e.g. 1)
You can also refer to the How-to Videos for uploading a DLS Native custom dataset.
In Deep Learning Studio, the DLS Native format dataset can only be used for Custom Neural Network project types.
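
The CSV variant described above (text encoded as semicolon-separated numbers, padded to a fixed length) could be produced along these lines. This is a sketch, not the DLS tooling: the whitespace tokenizer, the use of 0 as the padding id, the header row, and all function names are assumptions made for illustration.

```python
import csv
from pathlib import Path
from typing import Dict, Iterable, List, Tuple

PAD_ID = 0  # assumption: id 0 is reserved for padding


def build_vocab(texts: Iterable[str]) -> Dict[str, int]:
    """Assign each distinct lowercase word an integer id, starting at 1."""
    vocab: Dict[str, int] = {}
    for text in texts:
        for word in text.lower().split():
            vocab.setdefault(word, len(vocab) + 1)
    return vocab


def encode(text: str, vocab: Dict[str, int], seq_len: int) -> str:
    """Encode text as semicolon-separated ids, truncated or padded to seq_len."""
    ids = [vocab[w] for w in text.lower().split() if w in vocab][:seq_len]
    ids += [PAD_ID] * (seq_len - len(ids))
    return ";".join(str(i) for i in ids)


def write_train_csv(rows: List[Tuple[str, int]], out_dir: str, seq_len: int = 8) -> Path:
    """Write train.csv with a text column and a label column."""
    vocab = build_vocab(text for text, _ in rows)
    out_path = Path(out_dir) / "train.csv"
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["text", "label"])
        for text, label in rows:
            writer.writerow([encode(text, vocab, seq_len), label])
    return out_path
```

Every row then has the same number of semicolon-separated values, which is what the fixed sequence length requires.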

2. Image folder format

    This dataset contains only images.
A dataset of image files stored in a folder structure.
root
├── test
│   ├── brick
│   │   ├── brick_001968.jpg
│   │   └── brick_001981.jpg
│   ├── water
│   │   ├── water_002256.jpg
│   │   └── water_002296.jpg
│   └── wood
│       ├── wood_000770.jpg
│       └── wood_000793.jpg
├── train
│   ├── brick
│   │   ├── brick_000593.jpg
│   │   └── brick_002089.jpg
│   ├── carpet
│   │   ├── carpet_002084.jpg
│   │   └── carpet_002375.jpg
│   └── wood
│       ├── wood_002278.jpg
│       └── wood_002391.jpg
└── val
    ├── brick
    │   ├── brick_000168.jpg
    │   └── brick_002137.jpg
    ├── water
    │   ├── water_000792.jpg
    │   └── water_000797.jpg
    └── wood
        ├── wood_001844.jpg
        └── wood_002146.jpg
In Deep Learning Studio, the Image Folder format dataset can only be used for the AI APP Module Classification project type.
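
Before uploading, it can help to confirm that a local folder follows the split/class/image layout shown above. The sketch below only counts images per class in each split. The helper name `summarize_image_folder` and the list of accepted extensions are assumptions, not something DLS defines.

```python
from pathlib import Path
from typing import Dict

IMAGE_EXTS = {".jpg", ".jpeg", ".png"}  # assumption: extensions treated as images


def summarize_image_folder(root: str) -> Dict[str, Dict[str, int]]:
    """Return {split: {class_name: image_count}} for a root/split/class/image layout."""
    summary: Dict[str, Dict[str, int]] = {}
    for split_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        classes: Dict[str, int] = {}
        for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
            classes[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_EXTS
            )
        summary[split_dir.name] = classes
    return summary
```

An empty class or a missing split shows up directly in the returned dictionary.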

3. MS COCO Dataset

    An MS COCO dataset contains 2 folders:
    1.  Image folder (contains the images)
    2.  Annotations folder (contains 2 JSON annotation files for the images)
├── annotations
│   ├── instances_train2017.json
│   └── instances_val2017.json
└── images
    ├── 000000000074.jpg
    ├── 000000000109.jpg
    ├── 000000008458.jpg
    ├── 000000008781.jpg
    ├── 000000008787.jpg
    ├── 000000008821.jpg
    ├── 000000016775.jpg
    ├── 000000016957.jpg
    ├── 000000024664.jpg
    ├── 000000024861.jpg
    ├── 000000024935.jpg
    ├── 000000025148.jpg
    ├── 000000025234.jpg
    ├── 000000033325.jpg
    ├── 000000033377.jpg
    ├── 000000033405.jpg
    ├── 000000033444.jpg
    ├── 000000041311.jpg
    ├── 000000041552.jpg
    ├── 000000041568.jpg
    ├── 000000049814.jpg
    ├── 000000052891.jpg
    └── 000000581654.jpg
In Deep Learning Studio, the MS COCO format dataset can only be used for the AI APP Module Segmentation project type.
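
A quick consistency check for the layout above is to compare the image entries in an annotation file against the files actually present in the images folder. This sketch relies on the standard COCO JSON structure, where each entry in "images" has a "file_name" key. The function name `missing_coco_images` is hypothetical, and whether DLS runs a similar check on upload is not stated here.

```python
import json
from pathlib import Path
from typing import List


def missing_coco_images(dataset_root: str, annotation_file: str) -> List[str]:
    """List file names referenced in annotations/<annotation_file> but absent from images/."""
    root = Path(dataset_root)
    with open(root / "annotations" / annotation_file, encoding="utf-8") as fh:
        coco = json.load(fh)
    present = {p.name for p in (root / "images").iterdir() if p.is_file()}
    return sorted(
        img["file_name"]
        for img in coco.get("images", [])
        if img["file_name"] not in present
    )
```

An empty list means every image referenced by that annotation file was found.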

CVAT Dataset

    You can refer to the CVAT Dataset link for detailed information.