About Me

Rohit is an investor, startup advisor and an Application Modernization Scale Specialist working at Google.

Sunday, October 27, 2019

How do you get thread dumps and heap dumps for Java applications running in Cloud Foundry?

You cannot!! You have hit a classic pain point: the Java buildpack ships a JRE and not a full JDK.

So the issue is that you cannot cf ssh into the container in PCF and run the jcmd command to trigger a Java thread dump, because jcmd ships with the JDK and not with the JRE. The classic way of resolving high CPU is to take three such thread dumps 30 seconds apart and look for the threads that are stuck: ones that are not moving, are contending on locks, or are deadlocked. You then pair this with CPU profiling information from the VM.
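To make the "sample and compare" idea concrete, here is a minimal sketch that uses only the standard java.lang.management API. It runs inside the JVM being inspected, so it illustrates the technique rather than replacing jcmd. The class name StuckThreadSampler and its method names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of any buildpack or framework.
class StuckThreadSampler {

    /** Maps thread id to a one-line summary: name, state and top stack frame. */
    static Map<Long, String> snapshot() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<Long, String> summary = new HashMap<>();
        // false, false: skip monitor and synchronizer details to keep each sample cheap.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info == null) {
                continue;
            }
            StackTraceElement[] stack = info.getStackTrace();
            String top = stack.length > 0 ? stack[0].toString() : "<no frames>";
            summary.put(info.getThreadId(),
                    info.getThreadName() + " [" + info.getThreadState() + "] at " + top);
        }
        return summary;
    }

    /** Returns summaries of threads that look identical in every sample. */
    static List<String> unchangedThreads(int samples, long intervalMillis)
            throws InterruptedException {
        Map<Long, String> candidates = snapshot();
        for (int i = 1; i < samples; i++) {
            Thread.sleep(intervalMillis);
            Map<Long, String> current = snapshot();
            // Drop threads that moved, changed state, or exited since the earlier samples.
            candidates.entrySet().removeIf(e -> !e.getValue().equals(current.get(e.getKey())));
        }
        return new ArrayList<>(candidates.values());
    }

    /** Returns the ids of deadlocked threads, or an empty array if there are none. */
    static long[] deadlockedThreadIds() {
        long[] ids = ManagementFactory.getThreadMXBean().findDeadlockedThreads();
        return ids == null ? new long[0] : ids;
    }

    public static void main(String[] args) throws InterruptedException {
        // Three samples 30 seconds apart, as in the classic procedure.
        for (String line : unchangedThreads(3, 30_000)) {
            System.out.println("unchanged: " + line);
        }
        System.out.println("deadlocked threads: " + deadlockedThreadIds().length);
    }
}
```

Idle pool threads will also show up as unchanged, so for a high CPU investigation the interesting entries are the ones that stay RUNNABLE in the same frame across all samples.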

Not being able to take a thread dump in PCF is frustrating. WAS/WebLogic had excellent support for getting these artifacts via must-gathers.

So what can you do?
You cannot simply rely on the /threaddump actuator endpoint, because it does not provide nearly as much information as a classic thread dump.

Again, this is a problem faced by anyone who wants to use the JDK tools from an app in PCF. For instance, say we want to run the javac command inside the app in PCF. We simply can't, for the reason described above.
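One way to see this limitation from inside an application is to ask the runtime for its system Java compiler. A minimal sketch follows; the class name CompilerCheck is hypothetical, and it assumes the runtime still includes the javax.tools API, which may not hold for every trimmed-down JRE image.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Hypothetical helper for illustration only.
class CompilerCheck {

    /** Returns true when the running Java runtime provides a system Java compiler. */
    static boolean compilerAvailable() {
        // getSystemJavaCompiler() returns null when no compiler is provided,
        // which is what you would expect on a JRE-only runtime.
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        return compiler != null;
    }

    public static void main(String[] args) {
        System.out.println("java.home = " + System.getProperty("java.home"));
        System.out.println(compilerAvailable()
                ? "A system Java compiler is available (full JDK)."
                : "No system Java compiler found (likely a JRE).");
    }
}
```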

OK, so what can be done?
- Create a one-time custom Java buildpack rebased on a full OpenJDK JDK and not a JRE, then restage the app with this custom full-JDK Java buildpack. This is not sustainable in the long term.
- Trojan-horse the JDK tooling (jcmd, jmap and the other command-line tools) into the app via a sidecar container or something like pcfshell https://github.com/tfynes-pivotal/pcfshell, or have the app carry the executables with it.
- Have the app itself carry a /threaddump endpoint via Spring Boot Actuator, although this seldom works if the app is dying due to an OOM or high CPU.
- If the app is crashing due to an OOM, it writes out a histogram and a cause of failure. In such a case you can (1) enable verbose GC logging to stdout so that you can collect and visualize the GC logs, and (2) configure a persistent volume bind so that the Java buildpack writes the heap dump to a persistent volume (see the buildpack's oom-killer and jre-docs).
The existence of a single bound Volume Service will result in terminal heap dumps being written.
- Use flame graphs in PCF to debug high CPU. This requires some investigation.
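As a variation on the option where the app carries its own diagnostics, the JVM's management beans can produce both artifacts without any JDK command-line tools. The sketch below is a rough illustration, assuming a HotSpot-based runtime (HotSpotDiagnosticMXBean is HotSpot-specific). The class name InProcessDumps and the file locations are hypothetical, and like the actuator endpoint it will not help much if the app is already unresponsive.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; wire it to whatever trigger suits the app.
class InProcessDumps {

    /** Renders all threads with full stack traces and the lock each one waits on. */
    static String threadDump() {
        StringBuilder out = new StringBuilder();
        // true, true: include locked monitors and ownable synchronizers.
        for (ThreadInfo info : ManagementFactory.getThreadMXBean().dumpAllThreads(true, true)) {
            out.append('"').append(info.getThreadName()).append("\" id=")
               .append(info.getThreadId()).append(' ').append(info.getThreadState());
            if (info.getLockName() != null) {
                out.append(" waiting on ").append(info.getLockName());
            }
            out.append('\n');
            // ThreadInfo.toString() truncates the stack, so print every frame explicitly.
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    /** Writes an .hprof heap dump of live objects to a file that must not exist yet. */
    static void heapDump(Path file) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(file.toString(), true); // true = dump only reachable (live) objects
    }

    public static void main(String[] args) throws IOException {
        System.out.print(threadDump());
        Path file = Files.createTempDirectory("dumps").resolve("heap.hprof");
        heapDump(file);
        System.out.println("heap dump written to " + file);
    }
}
```

Since the container filesystem in Cloud Foundry is ephemeral, the heap dump path would need to point at a bound persistent volume, or the file would have to be copied out before the container is recycled.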
