Uploading Data with bayes

This guide walks through the complete process of uploading a dataset with bayes:

  1. Create a new dataset
  2. Create an empty dataset version
  3. Upload data to the specified version
  4. View and manage datasets

The steps below use the tiny-imagenet dataset as an example.

Create a New Dataset

$ bayes data create tiny-imagenet -m "A brief description of this tiny-imagenet dataset" -o

Dataset tiny-imagenet (tiQXU5Z5DIy) created successfully
Open the webpage https://app.hyper.ai/console/username/datasets/tiQXU5Z5DIy to view detailed information about dataset tiny-imagenet (tiQXU5Z5DIy)
Redirecting to browser...
Browser opened successfully

Here are some available parameters:

  • -m or --message: a description of the dataset; optional
  • -o or --open: opens the dataset's web page in the browser after the dataset is created

You can also see the dataset's URL and ID in the terminal output.
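
When the create step runs inside a script, the ID has to be read from that terminal output. The sketch below is one way to do it, assuming the success line is written to standard output in exactly the format shown above. extract_dataset_id is a hypothetical helper name, not part of bayes.

```shell
# Hypothetical helper: read bayes output on stdin and print the dataset ID
# found in the parentheses of the "created successfully" line.
# Assumes the line format shown above; other bayes versions may differ.
extract_dataset_id() {
  sed -n 's/^Dataset .* (\([^)]*\)) created successfully$/\1/p' | head -n 1
}
```

For the output shown above, piping the result of bayes data create tiny-imagenet -m "A brief description of this tiny-imagenet dataset" into extract_dataset_id would print tiQXU5Z5DIy.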

Create an Empty Dataset Version

Before uploading data, create an empty dataset version with the following command:

$ bayes data new-version tiQXU5Z5DIy

Currently operating on personal account admin...
Dataset tiQXU5Z5DIy/1 created successfully

After successful creation, the command prints the new version number (1 in the output above). Note this number: you pass it to the upload command through the --version or -v parameter. If you forget it, you can list all versions with the bayes data versions tiQXU5Z5DIy command.
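
In a script, the version number can be captured instead of remembered. This is a minimal sketch that assumes the "Dataset <id>/<version> created successfully" line is printed to standard output as shown above. extract_version is a hypothetical helper, not a bayes command.

```shell
# Hypothetical helper: read bayes output on stdin and print the version
# number that follows the slash in "Dataset <id>/<version> created successfully".
# Assumes version numbers are plain integers, as in the output above.
extract_version() {
  sed -n 's|^Dataset [^/ ]*/\([0-9][0-9]*\) created successfully$|\1|p' | head -n 1
}
```

A script could then store the value with version=$(bayes data new-version tiQXU5Z5DIy | extract_version) and pass "$version" to -v in the upload step.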

Upload Folder Directly via Command Line

After creating an empty dataset version, you can upload data to it. In the upload command, set the --version or -v parameter to the version number created in the previous step:

$ bayes data upload tiQXU5Z5DIy -v 1 -p '/Users/username/test-upload' -o
Currently operating on personal account admin...
Current working directory /Users/username/test-upload
Preparing to upload dataset tiQXU5Z5DIy...
Obtaining upload authorization...
Starting file upload, please be patient...
Found 3 files in total
Analyzing file list...
Excluding files and folders ignored in .openbayesignore...
2 files need to be uploaded in total, 2 files ignored

Ignored file list:
  - .DS_Store
  - .openbayesignore
78949397fb964f6293f9c71b0488e2d9.jpeg: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 117k/117k [00:00<00:00, 122kB/s]
↑ Uploaded: /Users/username/test-upload/78949397fb964f6293f9c71b0488e2d9.jpeg
Starting new multipart upload: 测试视频.mov
Creating new upload ID: f436061f-305d-43c3-b32d-5d9557ada5cb
测试视频.mov: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 37.3M/37.3M [00:11<00:00, 3.18MB/s]
↑ Uploaded: /Users/username/test-upload/测试视频.mov

✅ Upload successful! Uploaded 2 files, skipped 0 existing files
Version v1 of dataset tiQXU5Z5DIy has been updated
Opening dataset https://app.hyper.ai/console/admin/datasets/tiQXU5Z5DIy
Redirecting to browser...
Browser opened successfully.

Command-line uploads are resumable. If an upload is interrupted unexpectedly, running the same upload command again automatically resumes the unfinished upload task:

$ bayes data upload tiQXU5Z5DIy -v 1 -p '/Users/username/test-upload' -o
Currently operating on personal account admin...
Current working directory /Users/username/test-upload
Preparing to upload dataset tiQXU5Z5DIy...
Obtaining upload authorization...
Starting file upload, please be patient...
Found 4 files in total
Analyzing file list...
Filtering out files and folders ignored in .openbayesignore...
2 files need to be uploaded, 2 files ignored

Ignored file list:
  - .DS_Store
  - .openbayesignore
↷ Skipped: /Users/username/test-upload/78949397fb964f6293f9c71b0488e2d9.jpeg (already exists)
Found incomplete upload task, resuming: 测试视频.mov
Resuming upload ID: 8a005a27-5639-45ff-af1b-ec9444d62c63
Uploaded 2 parts out of 8 parts
测试视频.mov: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 37.3M/37.3M [00:05<00:00, 4.60MB/s]
↑ Uploaded: /Users/username/test-upload/测试视频.mov

✅ Upload successful! Uploaded 1 file, skipped 1 existing file
Version v1 of dataset tiQXU5Z5DIy has been updated
Opening dataset https://app.hyper.ai/console/admin/datasets/tiQXU5Z5DIy
Redirecting to browser...
Browser opened successfully.
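
Because re-running the same command resumes an unfinished upload, the command can be wrapped in a retry loop on unreliable connections. The following is a minimal sketch, assuming bayes exits with a non-zero status when an upload is interrupted; the documentation above does not state this, so verify it before relying on the loop. retry is a hypothetical helper, not part of bayes.

```shell
# Hypothetical helper: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it exits with status 0 or MAX_ATTEMPTS is reached.
# Assumption: an interrupted bayes upload returns a non-zero exit status.
retry() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while :; do
    # Success: stop retrying.
    "$@" && return 0
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Giving up after $attempt attempts" >&2
      return 1
    fi
    echo "Attempt $attempt failed; retrying in ${delay}s..." >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}
```

For example, retry 5 10 bayes data upload tiQXU5Z5DIy -v 1 -p '/Users/username/test-upload' would try the upload up to five times, pausing ten seconds between attempts.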

Here are some available parameters:

  • -v or --version: the version number of the dataset to upload to; required
  • -p or --path: the local path of the file or folder to upload; defaults to the current directory
  • -d or --directory: the destination path for the uploaded files; defaults to the root directory
  • -o or --open: opens the dataset's web page in the browser after the upload completes
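
The parameters above can be combined in a small wrapper that checks its inputs before calling bayes. This is a sketch under stated assumptions: version numbers are plain integers (as in the example output), and -d names a destination directory inside the dataset version. upload_to_version is a hypothetical function name.

```shell
# Hypothetical wrapper: upload_to_version DATASET_ID VERSION [LOCAL_PATH] [REMOTE_DIR]
# Validates the arguments, then runs bayes data upload with -v, -p and, if given, -d.
upload_to_version() {
  dataset_id=$1
  version=$2
  local_path=${3:-.}   # mirrors the documented default: current directory
  remote_dir=${4:-}    # empty means "upload to the root directory"

  # Assumption: version numbers are non-negative integers.
  case $version in
    ''|*[!0-9]*)
      echo "version must be a number, got '$version'" >&2
      return 2
      ;;
  esac

  if [ ! -e "$local_path" ]; then
    echo "local path not found: $local_path" >&2
    return 2
  fi

  if [ -n "$remote_dir" ]; then
    bayes data upload "$dataset_id" -v "$version" -p "$local_path" -d "$remote_dir"
  else
    bayes data upload "$dataset_id" -v "$version" -p "$local_path"
  fi
}
```

For instance, upload_to_version tiQXU5Z5DIy 1 '/Users/username/test-upload' runs the same upload as the command shown earlier, without the -o flag.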

Once server-side data synchronization finishes, the uploaded data appears in the dataset.

:::note If you already have a compressed archive to upload to HyperAI, you can upload it directly with the command bayes data upload tiQXU5Z5DIy -v 1 -p '/Users/username/test/测试.zip'. :::

:::note If you only need to upload a single file, pass its path in the same way: bayes data upload tiQXU5Z5DIy -v 1 -p '/Users/username/test/test.txt'. :::

Open Dataset Web Interface via Command Line

The following command opens the dataset's web interface directly from the command line:

$ bayes data open tiQXU5Z5DIy

Currently operating on personal account admin...
Opening dataset https://app.hyper.ai/console/username/datasets/tiQXU5Z5DIy
Redirecting to browser...
Browser opened successfully.

Alternatively, add the -o parameter to the upload command, and the command-line tool opens the corresponding web interface as soon as the upload completes:

$ bayes data upload tiQXU5Z5DIy -v 1 -p '/Users/username/test-upload' -o

Currently operating on personal account admin...
Current working directory /Users/username/test-upload
Preparing to upload dataset tiQXU5Z5DIy...
Obtaining upload authorization...
Starting file upload, please wait patiently...
Found 3 files in total
Analyzing file list...
Excluding files and folders ignored in .openbayesignore...
2 files need to be uploaded, 2 files ignored

Ignored files list:
  - .DS_Store
  - .openbayesignore
78949397fb964f6293f9c71b0488e2d9.jpeg: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 117k/117k [00:00<00:00, 122kB/s]
↑ Uploaded: /Users/username/test-upload/78949397fb964f6293f9c71b0488e2d9.jpeg
Starting new multipart upload: 测试视频.mov
Creating new upload ID: f436061f-305d-43c3-b32d-5d9557ada5cb
测试视频.mov: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 37.3M/37.3M [00:11<00:00, 3.18MB/s]
↑ Uploaded: /Users/username/test-upload/测试视频.mov

✅ Upload successful! Uploaded 2 files, skipped 0 existing files
Dataset tiQXU5Z5DIy version v1 has been updated
Opening dataset https://app.hyper.ai/console/admin/datasets/tiQXU5Z5DIy
Redirecting to browser...
Browser opened successfully.