Computing Power Container Storage Persistence Guide

A detailed explanation of which stored data in HyperAI containers is lost after a restart and which is kept

HyperAI offers several kinds of computing environments. This article covers data persistence in the traditional container mode, where some stored data survives a container restart and the rest is lost. It explains which data is retained, which is lost, and how to manage data in containers effectively.

:::info Platform Technology Evolution
HyperAI is constantly evolving. In addition to the traditional container mode introduced in this article, the platform is actively developing new technical solutions such as "stateful containers" and "virtual machines," which will provide more flexible data persistence options. The content of this article mainly applies to the currently widely used traditional container mode.
:::

Basic Principles

Why Does Data Get Lost?

Traditional containers are designed to be "stateless" and "reproducible" environments, which means:

  1. Isolation: Each time a container starts, it is a fresh, clean environment, unaffected by previous operations
  2. Consistency: When different people start containers with the same configuration, they get exactly the same initial environment
  3. Security: Each restart clears potential errors, conflicts, or security issues

This design makes traditional containers particularly suitable for scientific computing and machine learning work, as it ensures experimental reproducibility and environmental consistency.

Why This Design?

This design has several important advantages:

  1. Reliability: Avoids the "it only works on my machine" problem
  2. Scalability: Containers can run on any machine that supports the same container technology
  3. Resource Efficiency: Multiple containers can share underlying resources while remaining isolated from each other
  4. Rapid Deployment: New environments can be quickly created without affecting other work

HyperAI's Technical Solution Evolution

As user needs diversify, HyperAI is expanding its range of computing environments:

  1. Traditional Container Mode (focus of this article): Provides isolated, consistent environments with specific directory persistence
  2. Stateful Containers (coming soon): Provide more complete state preservation, allowing installed software and configurations to persist
  3. Virtual Machine Solution (in development): Provides fully isolated environments and full disk persistence, similar to working on a physical machine

Each solution has its own strengths, so you can choose the one that best fits your needs:

  • For scenarios requiring high experimental reproducibility, traditional containers are the ideal choice
  • For scenarios needing to preserve complex environment configurations, stateful containers will be more convenient
  • For scenarios requiring complete control and system customization, virtual machine solutions are more appropriate

Rich Pre-built Image Library

To reduce users' need to manually install complex dependencies, HyperAI provides a large number of carefully prepared pre-built images:

  • Multiple Deep Learning Framework Versions: Including multiple versions of mainstream frameworks such as PyTorch and PaddlePaddle
  • Domain-Specific Images: Environments optimized for specific fields such as computer vision, natural language processing, and reinforcement learning
  • Integrated Common Tools: Pre-installed with commonly used data science libraries and tools, such as NumPy, Pandas, Scikit-learn, etc.
  • GPU Acceleration Support: Multiple CUDA versions and corresponding deep learning frameworks

These pre-built images have been carefully configured and tested by professional teams to ensure compatibility and stability between components, greatly reducing the complexity of user environment configuration. Users can directly select the image closest to their needs without having to install and configure complex dependencies from scratch.

:::info Choosing the Right Image
When creating a container, it is recommended to carefully review the available image list and detailed descriptions, and select the pre-built environment that best meets your project requirements. This can minimize the need for additional dependency installation and improve work efficiency.
:::

Persistent Storage Solution

In traditional container mode, HyperAI provides a dedicated persistent directory (/openbayes/home) and a data warehouse feature. Think of it as a safe inside a disposable workspace: whatever you put in the safe is still there after the workspace is reset.

This design balances the advantages of environment isolation with the need for data persistence, allowing you to enjoy the convenience of containers without worrying about losing important data.
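
As an illustration of putting work "in the safe," the sketch below writes a small piece of run state (for example, the last completed epoch) into the persistent directory and reads it back after a restart. It assumes the state is JSON-serializable. The helper names `save_state` and `load_state` and the file layout are hypothetical and are not part of the HyperAI platform.

```python
import json
import os
import tempfile
from pathlib import Path

# Assumption: the persistent working directory named in this guide.
PERSISTENT_ROOT = Path("/openbayes/home")


def save_state(state: dict, name: str, root: Path = PERSISTENT_ROOT) -> Path:
    """Atomically write a JSON-serializable dict to <root>/<name>.json."""
    root.mkdir(parents=True, exist_ok=True)
    target = root / f"{name}.json"
    # Write to a temporary file in the same directory, then rename it,
    # so a crash mid-write does not corrupt a previously saved file.
    fd, tmp_path = tempfile.mkstemp(dir=root, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle)
        os.replace(tmp_path, target)
    except BaseException:
        # Remove the temporary file if the write or rename failed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return target


def load_state(name: str, root: Path = PERSISTENT_ROOT, default=None):
    """Return the saved dict, or `default` if nothing has been saved yet."""
    target = root / f"{name}.json"
    if not target.exists():
        return default
    with target.open(encoding="utf-8") as handle:
        return json.load(handle)
```

On a first run, `load_state("train")` returns `None`; after `save_state({"epoch": 3}, "train")`, a restarted container can read the same value back and continue from there.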

:::info Data Management Resources

  • To learn how to upload data to the HyperAI platform, please refer to the Data Upload Guide
  • To learn how to bind data to containers, please refer to the Data Binding Guide

:::

Storage Content That Will Be Lost

  1. System Disk (content under root directory /)

    • All data on the system disk will be cleared after the container is shut down
    • This includes the root directory and all its subdirectories (except for specific persistent directories)
    • System dependencies installed via apt will be lost after container restart
    • By default, Python dependencies installed via pip are also lost (unless they were installed with the --user option)
  2. Environment Settings and Configurations

    • Each container "execution" runs in its own isolated environment
    • Manually installed software and dependency packages cannot be recovered after the container shuts down
    • Even a container started through "Continue Execution" gets a completely fresh environment
  3. Temporary Files and Cache

    • Temporary files and log files generated within the container (outside the working directory)
    • Data held in memory (such as live variables and model state)
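
Whether the `--user` exception above actually protects a package depends on where Python's user site-packages directory points. This guide states that such installs end up under /openbayes/home/.pylibs; a quick way to confirm that in your own container might look like the following sketch. The helper names are hypothetical, and the check is purely lexical (it does not resolve symlinks).

```python
import site
from pathlib import PurePosixPath

# Assumption: the persistent working directory named in this guide.
PERSISTENT_ROOT = "/openbayes/home"


def is_under(path: str, root: str = PERSISTENT_ROOT) -> bool:
    """Return True if `path` equals `root` or lies inside it (lexical check)."""
    p, r = PurePosixPath(path), PurePosixPath(root)
    return p == r or r in p.parents


def user_site_report(user_site=None, root: str = PERSISTENT_ROOT) -> str:
    """Describe whether `pip install --user` packages would land under `root`."""
    # site.getusersitepackages() is the directory `pip install --user` targets.
    location = user_site if user_site is not None else site.getusersitepackages()
    verdict = "inside" if is_under(location, root) else "outside"
    return f"user site-packages {location} is {verdict} {root}"


if __name__ == "__main__":
    print(user_site_report())
```

If the report says "outside," packages installed with `--user` would live on the system disk and should be expected to disappear with it.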

Storage Content That Will Not Be Lost

  1. Working Directory (/openbayes/home)

    • All data saved in this directory is preserved automatically when the container shuts down
    • This data is synchronized to the user's global storage space and counts against the corresponding storage quota
    • The next time you start a container, you can continue using this data through data binding
  2. Specific pip Dependencies

    • If you install dependencies using the pip install --user xxx command, these dependencies will be saved in the /openbayes/home/.pylibs directory
    • These dependencies will still be available after container restart
  3. Custom Conda Environments

    • If you create custom Conda environments in the /openbayes/home directory, these environments will still be available after container restart
  4. Data in the Data Warehouse

    • Data bound to containers through the data warehouse (datasets, models) will not be lost
    • Data bound to the /openbayes/input/input0-4 directories can be shared across different containers
    • For how to bind data from the data warehouse, please refer to the Data Binding Guide
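
The rules in the two lists above can be condensed into a small lookup. The sketch below is an illustration only: it assumes that "input0-4" means the five directories input0 through input4, compares paths lexically, and uses the made-up labels `persistent`, `bound-data`, and `ephemeral`.

```python
import posixpath

# Assumptions taken from this guide: the working directory, and the bind
# points input0 through input4 under /openbayes/input.
HOME = "/openbayes/home"
INPUTS = tuple(f"/openbayes/input/input{i}" for i in range(5))


def _inside(path: str, root: str) -> bool:
    """True if `path` is `root` itself or a path below it."""
    return path == root or path.startswith(root + "/")


def classify(path: str) -> str:
    """Label an absolute container path by how it behaves across restarts."""
    # normpath collapses ".." and duplicate slashes; symlinks are not resolved.
    normalized = posixpath.normpath(path)
    if _inside(normalized, HOME):
        return "persistent"   # saved with the working directory
    if any(_inside(normalized, root) for root in INPUTS):
        return "bound-data"   # kept in the data warehouse
    return "ephemeral"        # system disk, cleared on shutdown
```

For example, `classify("/openbayes/home/models/best.pt")` yields `persistent`, while a path such as `/usr/local/lib/python3.10` is reported as `ephemeral`.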

Best Practice Recommendations

  1. All data that needs to be persistently saved must be stored in the /openbayes/home directory

  2. For large datasets that don't need modification but are frequently used, it's recommended to create them as data warehouse items and bind them to /openbayes/input/input0-4 directories (Learn how to bind data)

  3. For frequently used dependency packages, you can:

    • Install to user directory using pip install --user xxx
    • Create custom Conda environments under /openbayes/home (Detailed Tutorial)
    • Include requirements.txt/conda-packages.txt files in your code repository
  4. Save system dependency installation commands in a dependencies.sh script for automatic installation each time the container starts

  5. Regularly clean up unnecessary data to avoid running out of storage space

  6. For large datasets that need long-term preservation or sharing across multiple containers, it's recommended to upload them to the data warehouse and use them through data binding
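
For recommendation 4, one possible way to keep system dependencies reproducible is to generate dependencies.sh from a list of package names kept alongside your code. The sketch below only renders the script text; where the platform expects the file and how it runs it are not covered here. The function name is hypothetical, and the package names in the usage note are placeholders.

```python
import shlex


def render_dependencies_script(apt_packages):
    """Return the text of a shell script that installs the given apt packages."""
    lines = ["#!/bin/bash", "set -e"]  # stop at the first failing command
    if apt_packages:
        # Deduplicate and sort so the generated script is stable across runs,
        # and quote each name so it is passed to the shell as a single word.
        quoted = " ".join(shlex.quote(name) for name in sorted(set(apt_packages)))
        lines.append("apt-get update")
        lines.append(f"apt-get install -y {quoted}")
    return "\n".join(lines) + "\n"
```

Calling `render_dependencies_script(["libgl1", "ffmpeg"])` returns a four-line script ending in `apt-get install -y ffmpeg libgl1`, which you could then write to dependencies.sh with `pathlib.Path.write_text`.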

:::info Choose the Right Technical Solution
If you find yourself frequently needing to install the same system dependencies or requiring complex environment configurations, consider using HyperAI's upcoming "Stateful Container" or "Virtual Machine" solutions, which provide more persistent environment preservation capabilities.
:::

By properly utilizing these persistence mechanisms, you can effectively manage data storage after container restarts, reduce repetitive work, and improve development efficiency.