Magnus Hagdorn

Research Software Engineer

Deleting too many files

CephFS is now an integral part of the School of GeoScience’s IT infrastructure. It provides both personal and group storage for Windows, Mac and Linux users. Since our first disaster, caused by running out of resources, it has been running very reliably and has provided all the benefits we had hoped for. Until disaster struck again yesterday: the first signs of trouble were users reporting that deleting files resulted in a No space left on device error.

It turned out that a user was deleting a particularly deep directory structure with around 2 million entries. We snapshot our file system to provide easy access to accidentally deleted files. When files are deleted from a snapshotted directory they end up in a stray files location. It is a single location with a default limit of one million entries. We exceeded this limit, which resulted in the No space left on device errors.

We tried a lot of things to get the file system to behave again. In the end, increasing the maximum number of stray files solved this particular issue. The good news is that the current version of Ceph, Pacific, has solved this problem: it allows the stray directory to be split so that the limit no longer exists.

We didn’t lose any data, but we ended up with a central resource that was offline for about 24 hours. Upgrading our Ceph system to Pacific has just become more urgent.

The Geeky Stuff

Finding the number of stray files:
ceph daemon mds.`hostname -s` perf dump | grep stray

Finding the maximum number of stray files (the effective limit is this number multiplied by 10):
ceph daemon mds.`hostname -s` config get mds_bal_fragment_size_max
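To keep an eye on how close the stray count is to the limit, the two commands above can be combined into a small check. The following is only a minimal sketch: the helper names (stray_limit, stray_usage_percent, extract_num_strays) are made up for illustration, it assumes that perf dump reports the count in a "num_strays" counter, and the multiplier of 10 is taken from the note above. The commented-out commands at the end show one way the limit might be raised with the admin socket's config set; the value 200000 is purely illustrative and is not the setting we used.

```shell
# Effective stray limit: mds_bal_fragment_size_max multiplied by 10
# (multiplier taken from the note above).
stray_limit() {
    echo $(( $1 * 10 ))
}

# Percentage of the effective limit in use.
# Arguments: current number of strays, mds_bal_fragment_size_max.
stray_usage_percent() {
    echo $(( $1 * 100 / $(stray_limit "$2") ))
}

# Pull the "num_strays" counter out of `perf dump` output read from stdin.
# Assumes the counter appears as: "num_strays": <number>
extract_num_strays() {
    sed -n 's/.*"num_strays": *\([0-9][0-9]*\).*/\1/p'
}

# Example use on an MDS host (not run here; 100000 is an assumed setting):
#   strays=$(ceph daemon mds.$(hostname -s) perf dump | extract_num_strays)
#   echo "stray usage: $(stray_usage_percent "$strays" 100000)%"
#
# Raising the limit on a running MDS (illustrative value):
#   ceph daemon mds.$(hostname -s) config set mds_bal_fragment_size_max 200000
```

A check like this could be run from cron or a monitoring system so that a large delete shows up as a rising percentage well before users see No space left on device errors.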
