BeeGFS is a parallel file system suitable for High Performance Computing (HPC) with a proven track record in the scalable storage solution space. In this post, hosted by Verne Global, we explore how the different components of BeeGFS fit together and how we have incorporated them into an Ansible role for a seamless storage cluster deployment experience.
With access to a high-performance InfiniBand fabric, Verne Global's hpcDIRECT users can take advantage of BeeGFS’s native RDMA support to create high performance parallel filesystems as part of their HPC deployments, either on dedicated storage resources or hyperconverged with their compute nodes. This is a great way of making scratch space available to hpcDIRECT workloads.
Users looking to optimise the time to science can do this for their deployments through the hpcDIRECT portal. For those looking to get more hands-on, this guide will take you behind the scenes on how a BeeGFS storage cluster can be configured as part of the deployment of cloud-native HPC.
In this post we'll focus on some practical details for how to dynamically provision BeeGFS filesystems and/or clients running in cloud environments. There are actually no dependencies on OpenStack APIs here - although we do like to draw our Ansible inventory from Cluster-as-a-Service infrastructure and hpcDIRECT makes this possible.
BeeGFS is made up of several services, which may be familiar concepts to those who have worked with parallel file systems:
- Management service: for registering and watching all other services
- Storage service: for storing the distributed file contents
- Metadata service: for storing access permissions and striping info
- Client service: for mounting the file system to access stored data
- Admon service (optional): for presenting administration and monitoring options through a graphical user interface
Introducing our Ansible role for BeeGFS
We have published an Ansible role on Ansible Galaxy that handles the end-to-end deployment of BeeGFS. It takes care of everything from deploying the management, storage and metadata servers to setting up client nodes and mounting the file system. The role is installed in the usual way for Galaxy roles, using the ansible-galaxy install command.
There is a README that describes the role parameters and example usage.
An Ansible inventory is organised into groups, each representing a different role within the filesystem (or its clients). An example inventory file, inventory-beegfs, would list two hosts, bgfs1 and bgfs2, and assign each of them to the groups for the services it should run.
By controlling the membership of each inventory group, it is possible to create a variety of configurations: client-only deployments, server-only deployments, or hyperconverged deployments in which the filesystem servers are also the clients, as with bgfs1 and bgfs2 in this example.
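To make the group-membership idea concrete, here is a minimal Python sketch that renders INI-style inventory text from a mapping of groups to hosts. The group names (beegfs_mgmt, beegfs_mds, beegfs_oss, beegfs_client) and the render_inventory helper are hypothetical and chosen purely for illustration; the role's README defines the group names it actually expects. Only the host names bgfs1 and bgfs2 come from the example in this post.

```python
def render_inventory(groups):
    """Render a mapping of group name -> host list as INI-style inventory text."""
    sections = []
    for group, hosts in groups.items():
        # Each group becomes a [section] header followed by one host per line.
        sections.append("\n".join([f"[{group}]"] + list(hosts)))
    return "\n\n".join(sections) + "\n"


# Hypothetical group names; check the role's README for the real ones.
# Hyperconverged: both hosts store data and also mount the file system.
hyperconverged = {
    "beegfs_mgmt": ["bgfs1"],
    "beegfs_mds": ["bgfs1"],
    "beegfs_oss": ["bgfs1", "bgfs2"],
    "beegfs_client": ["bgfs1", "bgfs2"],
}

# Client-only: only the client group is populated.
client_only = {
    "beegfs_client": ["bgfs1", "bgfs2"],
}

inventory_text = render_inventory(hyperconverged)
print(inventory_text)
```

Switching between use cases is then only a matter of which hosts appear in which group; the playbook itself does not need to change.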
A minimal Ansible playbook, which we shall refer to as beegfs.yml, applies the role to the hosts in these inventory groups.
To create a BeeGFS cluster spanning the two nodes defined in the inventory, run the playbook. The same playbook handles both the setup and the teardown of the BeeGFS storage cluster components, depending on whether the beegfs_state flag is set to present or absent.
The playbook is designed to fail if the path specified for the BeeGFS storage service under beegfs_oss is already in use by another service. To override this behaviour, pass the extra option -e beegfs_force_format=yes. Be warned that this causes data loss: it formats the disk if a block device is specified, and it erases management and metadata server data if there is an existing BeeGFS deployment.
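As a rough illustration of how these invocations differ, the Python sketch below builds the ansible-playbook argument list for setup, forced setup and teardown. The file names inventory-beegfs and beegfs.yml and the variables beegfs_state and beegfs_force_format are the ones used in this post; passing beegfs_state as an extra variable with -e, rather than setting it in the playbook, is an assumption, and playbook_command is a hypothetical helper.

```python
import shlex


def playbook_command(state, force_format=False,
                     inventory="inventory-beegfs", playbook="beegfs.yml"):
    """Build the ansible-playbook argument list for a given beegfs_state."""
    if state not in ("present", "absent"):
        raise ValueError("beegfs_state must be 'present' or 'absent'")
    # Assumption: beegfs_state is supplied as an extra variable via -e.
    cmd = ["ansible-playbook", "-i", inventory, playbook,
           "-e", f"beegfs_state={state}"]
    if force_format:
        # Destructive: reformats storage and erases existing BeeGFS data.
        cmd += ["-e", "beegfs_force_format=yes"]
    return cmd


# Print the three variants as shell command lines without executing them.
for args in (playbook_command("present"),
             playbook_command("present", force_format=True),
             playbook_command("absent")):
    print(shlex.join(args))
```

Keeping the destructive option behind an explicit force_format argument mirrors the role's own default of refusing to touch a path that is already in use. To actually run one of these commands, the list can be handed to subprocess.run with check=True.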
Highlights of the Ansible role for BeeGFS
- The role is idempotent: it leaves state unchanged if the configuration has not changed since the previous deployment.
- The tuning parameters that the BeeGFS maintainers recommend for optimal storage server performance are set automatically.
- The role can deploy both storage-as-a-service and hyperconverged architectures, depending on how roles are ascribed to hosts in the Ansible inventory. In the hyperconverged case, storage and client services run on the same nodes; in the disaggregated case, the clients run on separate nodes from the storage servers.
One point to be aware of: BeeGFS is sensitive to hostnames. It expects each node's hostname to be consistent and permanent, and if a hostname changes, services refuse to start. It is therefore worth settling on hostnames before the initial cluster setup.
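One way to guard against this is a small pre-flight check that compares a node's current hostname with the name it had when the cluster was first deployed. The Python sketch below is an illustration only: hostname_matches and the recorded name are hypothetical, and this is not how BeeGFS or the role detects a changed hostname.

```python
import socket


def hostname_matches(recorded, current=None):
    """Return True if the node's current hostname equals the recorded one."""
    if current is None:
        # Ask the operating system for this node's hostname.
        current = socket.gethostname()
    # Exact comparison, ignoring only surrounding whitespace.
    return recorded.strip() == current.strip()


# Hypothetical: the name this node had when the cluster was first deployed.
recorded_name = "bgfs1"
if not hostname_matches(recorded_name):
    print(f"hostname is {socket.gethostname()!r}, expected {recorded_name!r}")
```

Run on each node before starting services, a check like this turns a refused service start into an explicit message about which name was expected.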
The simplicity of BeeGFS deployment and configuration makes it a great fit for automated cloud-native deployments such as hpcDIRECT. We have seen a lot of potential in the performance of BeeGFS, and we hope to be publishing more details from our tests in a future post. Watch this space!