[cf-dev] rep fd keep increasing until 'too many open files' and cell in bad status

Qiu Jie QJ Li
Hi CF developers,
We ran into a problem where the number of file descriptors (fds) held by rep keeps increasing until it fails with 'too many open files'.

Our Cloud Foundry environment is built on a Kubernetes cluster running on 3 VMs: 1 for the diego-cell (4 cores, 16G) and 2 for everything else. In our stress test, 10+ threads continuously push/start/stop/../delete apps, with 10s of think time between steps. The test starts with 0 errors but always ends, hours later, with the cell in a bad state: app staging fails with 'can't communicate with compatible cells', and 'too many open files' appears in rep.stdout.log. Because of the 'too many open files' hint, we began monitoring the number of files under /proc/<rep-pid>/fd. The count was steady at first, but from some point on it kept increasing, even after the push test was completely stopped. This growing fd count looks like the cause of 'too many open files', and it would most likely make the node (VM) unreachable in the end.
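For anyone who wants to reproduce the measurement, the monitoring described above can be sketched roughly as follows. This is only a minimal illustration, not the script we actually ran: the function names (count_fds, watch_fds, is_growing), the sampling interval, and the proc_root parameter are all made up for this example. It assumes a Linux host, and reading another process's /proc/<pid>/fd typically requires running as the same user or as root.

```python
import os
import time


def count_fds(pid, proc_root="/proc"):
    """Return the number of entries under <proc_root>/<pid>/fd."""
    return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))


def watch_fds(pid, interval_s=60.0, samples=10, proc_root="/proc"):
    """Sample the descriptor count `samples` times, `interval_s` seconds apart.

    Returns a list of (unix_timestamp, fd_count) tuples.
    """
    history = []
    for i in range(samples):
        history.append((time.time(), count_fds(pid, proc_root)))
        if i < samples - 1:
            time.sleep(interval_s)
    return history


def is_growing(history):
    """True if the count never drops and ends higher than it started."""
    counts = [count for _, count in history]
    if len(counts) < 2:
        return False
    never_drops = all(b >= a for a, b in zip(counts, counts[1:]))
    return never_drops and counts[-1] > counts[0]
```

Calling watch_fds with the rep PID while the stress test is stopped, and then passing the result to is_growing, would show whether the count keeps climbing with no load, which is the behaviour we observed.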

Why would the fd count keep increasing? Is there a leak, or is something not being released?
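One way to narrow this down is to look at what the descriptors point to, since each entry under /proc/<pid>/fd is a symlink whose target looks like "socket:[...]", "pipe:[...]", "anon_inode:[...]", or a regular path. The sketch below groups the entries by that kind. The helper names (classify_target, fd_breakdown) and the proc_root parameter are hypothetical, and the output only shows which category is accumulating; it does not by itself identify the cause.

```python
import os
import re
from collections import Counter


def classify_target(target):
    """Map a /proc/<pid>/fd symlink target to a coarse category."""
    match = re.match(r"^(socket|pipe|anon_inode):", target)
    return match.group(1) if match else "file"


def fd_breakdown(pid, proc_root="/proc"):
    """Count the open descriptors of `pid`, grouped by category."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    counts = Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir and readlink.
            continue
        counts[classify_target(target)] += 1
    return counts
```

Comparing two breakdowns taken some time apart would show, for example, whether the growth is in sockets or in regular files.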

I have opened an issue in the rep repository, https://github.com/cloudfoundry/rep/issues/21 , with more details. Please let us know what additional information you need.

Thanks a lot.

Qiu Jie (Sophy) Li

