Recently I encountered this issue. One of the application connects with Apache Cassandra NoSQL Database. The application uses DataStax java driver to connect with Cassandra. DataStax has a dependency on the netty library. To be specific following are the Jars that application uses:

  • cassandra-driver-core-2.0.1.jar
  • netty-3.9.0.Final.jar

This application all of sudden ran into ‘java.lang.OutOfMemoryError: unable to create new native thread‘. When thread dump was taken on the application, around 1650 threads were in ‘runnable’ state with following stack trace:

"New I/O worker #211" prio=10 tid=0x00007fa06424d000 nid=0x1a58 runnable [0x00007f9f832f6000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
	at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
	at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
	at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
	- locked  (a sun.nio.ch.Util$2)
	- locked  (a java.util.Collections$UnmodifiableSet)
	- locked  (a sun.nio.ch.EPollSelectorImpl)
	at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
	at org.jboss.netty.channel.socket.nio.SelectorUtil.select(SelectorUtil.java:68)
	at org.jboss.netty.channel.socket.nio.AbstractNioSelector.select(AbstractNioSelector.java:415)
	at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:212)
	at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
	at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
	at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
	at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:722)

Oh boy! This is way too many threads. All of them are netty library threads. Apparently, the issue turned out that Apache Cassandra NoSQL DB ran out of space. This issue was cascading in the application as OutOfMemoryError. When more space was allocated with Cassandra DB, the problem went away.

However the point here is: Even though Cassandra ran out of space, client applications should be resilient to it. It can’t result in OutOfMemoryError. It’s unacceptable behavior.

Advertisements