Background
I develop a GPU SaaS platform that runs on Kubernetes with Ceph as the storage backend. Recently I ran into performance issues with CephFS: even a command as simple as import torch took around 180 seconds. Investigating further, it became clear that CephFS performance was constrained by the single MDS (Metadata Server) daemon that runs by default.
However, I could not find clear documentation on how to configure multiple active MDS daemons, so I wrote this article to record the process.
What is MDS
Ceph MDS is the metadata server for the Ceph File System (CephFS): it handles metadata operations (directory lookups, file creation, permissions, and so on) while file data goes directly to the OSDs. An MDS deployment generally takes one of three forms:
- Single Active: The default configuration, with one active MDS daemon.
- Hot Standby: An additional idle standby (or standby-replay) daemon that takes over if the active one fails.
- Multiple Active: Several active MDS daemons sharing the metadata load.
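To see which of these configurations a cluster is currently running, the MDS map can be inspected. A minimal check, assuming a file system named <FS_NAME> (the same placeholder used in the steps below):
> ceph mds stat            # one-line summary: active ranks and the number of standby daemons
> ceph fs get <FS_NAME>    # full MDS map for one file system: max_mds, in/up ranks, flags such as allow_standby_replay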
When to Use Multiple Active MDS Daemons
As per the Ceph documentation, multiple active MDS daemons are recommended in the following scenarios:
- Large Number of Clients: When handling a substantial client base.
- Metadata Performance Bottleneck: When the single default MDS can no longer keep up with the metadata workload.
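Before adding more daemons, it is worth confirming that the MDS really is the limiting factor. A rough check, assuming <FS_NAME> as above; the daemon name in the second command follows the naming pattern shown in the output further down and is only illustrative:
> ceph fs status <FS_NAME>                          # the ACTIVITY column shows the request rate each active rank is handling
> ceph tell mds.<FS_NAME>.node01.zcsmff perf dump   # detailed counters for one daemon (requests, cache, journal, and so on)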
Configuring Multiple Active MDS Daemons
- List MDS daemons.
Identify existing MDS daemons in your environment.
> ceph orch ps --daemon_type=mds
- Create new MDS instances.
Add new MDS instances on the desired hosts (for example node01, node02, and so on). The --dry-run flag below only previews the placement; the actual apply is shown right after.
> ceph orch apply mds <FS_NAME> --placement="node01,node02,node03,node04,node05,node06" --dry-run
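Once the dry run shows the expected plan, the same command without the flag applies it; a sketch, assuming the six hosts above:
> ceph orch apply mds <FS_NAME> --placement="node01,node02,node03,node04,node05,node06"
> ceph orch ps --daemon_type=mds    # re-check until every new MDS daemon reports running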
- Enable standby replay.
Enable standby-replay so that standby daemons can take over quickly if an active daemon fails. As the documentation puts it:
"Even with multiple active MDS daemons, a highly available system still requires standby daemons to take over if any of the servers running an active daemon fail."
> ceph fs set <FS_NAME> allow_standby_replay 1
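The flag can then be verified in the file system's MDS map; a quick check, assuming <FS_NAME> as before (the exact wording of the flags line may vary between releases):
> ceph fs get <FS_NAME> | grep -i standby    # the flags should now include allow_standby_replay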
- Increase the number of active MDS daemons.
Raise max_mds to the desired number of active ranks (three in this example).
> ceph fs set <FS_NAME> max_mds 3
Note: Total MDS daemons = active MDS daemons + (standby daemons per active MDS × active MDS daemons). With 3 active ranks and one standby-replay daemon each, that is 6 MDS daemons in total, matching the six hosts above.
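The setting is also easy to revert: on recent releases (this cluster runs 17.2.6), lowering max_mds makes Ceph stop the extra ranks automatically, so going back to a single active MDS is a single command:
> ceph fs set <FS_NAME> max_mds 1    # extra ranks are deactivated automatically and their daemons return to standby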
- Check the status of MDS daemons.
Verify that the configured MDS daemons are up and serving the expected ranks.
> ceph status
...
services:
mon: 5 daemons, quorum node01,node02,node03,node04,node05 (age 9w)
mgr: node01.iitngk(active, since 6M), standbys: node02.wjppdy
mds: 3/3 daemons up, 3 hot standby
osd: 6 osds: 6 up (since 2w), 6 in (since 7M)
...
> ceph orch ps --daemon_type=mds
NAME HOST PORTS STATUS REFRESHED AGE MEM USE MEM LIM VERSION IMAGE ID CONTAINER ID
mds.<FS_NAME>.node01.zcsmff node01 running (7M) 74s ago 7M 12.9G - 17.2.6 c9a1062f7289 a45f77ca1ecb
mds.<FS_NAME>.node02.xtterx node02 running (4d) 9m ago 4d 4320M - 17.2.6 c9a1062f7289 c4553f2d538b
mds.<FS_NAME>.node03.ktibkb node03 running (4d) 9m ago 4d 11.2G - 17.2.6 c9a1062f7289 36aeb10dc1a6
mds.<FS_NAME>.node04.uhmpnt node04 running (4d) 5m ago 4d 2150M - 17.2.6 c9a1062f7289 7aa9bac68b0f
mds.<FS_NAME>.node05.jqwdxe node05 running (4d) 4m ago 4d 403M - 17.2.6 c9a1062f7289 730118ae7af0
mds.<FS_NAME>.node06.skhdpr node06 running (4d) 4m ago 4d 487M - 17.2.6 c9a1062f7289 5054ffa1bfe0
> ceph fs dump
...
max_mds 3
in 0,1,2
up {0=14232,1=6103278,2=6110459}
...
standby_count_wanted 1
...
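Finally, to confirm the improvement from the client side, the command that exposed the problem can simply be timed again; a minimal check, assuming the CephFS volume is mounted in the pod and PyTorch is installed there:
> time python -c "import torch"    # compare against the ~180 seconds observed with a single active MDS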