hybrid jvm design (was Re: Paper on "Java, Threads, and Scheduling in Linux")

Dom Ene 23 21:33:31 CST 2000

On Fri, 14 Jan 2000 raybry en us.ibm.com wrote:

> http://www-4.ibm.com/software/developer/library/java2/index.html

i'd like to take a moment to bring this long thread back to the central
problem:  there's no non-blocking i/o in java.  maybe a future version of
java will have non-blocking i/o, but it's not here yet.  however i do
think there's a portable solution for handling highly threaded java
programs.  (portable means no kernel changes required.)

many folks have pointed out that there are differences between
compute-bound and i/o-bound threads.  i want to make the bold assertion
that for compute-bound threads it doesn't matter much if you have a
one-one or a many-many threading model -- in both cases you've got N
stacks which are all live/busy, your cache will thrash and it's not going
to matter how clever your scheduler is.

it's the i/o-bound threads which are of most interest -- they offer the
most opportunities for optimisation.  however, in the case of a jvm, the
"thread" which is blocked on i/o is actually a *java* thread.  the jvm
implementation can use non-blocking i/o underneath the covers without the
java thread being aware of it.  that's exactly what the first jdks gave us
with their greenthreads implementation.  obviously we wish to take
advantage of parallelism at the hardware level, and so we need multiple
native threads.

someone needs to write code, and the debate is right here:  do the jvm
folks write hybrid (many-many) code specific to java, or do the
linuxthreads folks write hybrid code implementing pthreads?

i personally think hybrid *pthreads* are completely uninteresting.
pthreads are very complicated threads -- they include async notification
(such as signals) which complicates the implementation enormously. it is
easy to program without async notification, in fact most folks don't even
use it.  for example, i was able to architect apache 2.0 such that it
doesn't require signals.  thankfully, a jvm doesn't need async
notification.

so i believe that the hybrid code should be written as part of a jvm.

there are two remaining issues:  native classes; and native-thread pool
size.  let's consider native-thread pool size first.  (note, by native
thread i'm referring to a kernel supplied thread, on solaris this would be
a lwp rather than a pthread, on linux it's just a clone.)

the easiest is to use a static sized pool of native-threads, but this does
not adapt to highly compute bound java apps very well (and requires the
user to configure the number of threads).  solaris offers a signal
SIGWAITING to help a hybrid threads implementation decide when to grow its
native-thread pool, but i claimed i had a portable implementation in mind.

allocate one native-thread, call it the timer thread, which sets up a
recurring signal every 1/10th of a second (number pulled out of my butt,
could use some tuning).  when the timer thread is signalled, it checks the
java runnable queue to see if any java-threads are available to be run. if
so, it adds another thread to the native-thread pool.

threads in the native-thread pool use poll() to find the next i/o event,
and then schedule a java thread and run it until the java thread has to
block on i/o.  before using poll() again, the native-thread could do a
rudimentary check to see if there are too many native-threads and die if
so (if you sit down and design the semaphores for this you'll see where
this fits in nicely, i've done the work for a possible apache 2.1
architecture).[1]

that gives us a dynamically sized thread pool.  there's a few tricks you
can pull, such as exponential spawning (divide the timer signal in half
each time you have to spawn so that you can ramp up fast if you need to,
and then degrade it back out to the slower 1/10th of a second when there's
no need to spawn).

for a large number of i/o-bound threads the pool will gravitate towards
the parallelism available in the hardware.  for a large number of
compute-bound threads the pool will gravitate towards a one-one
relationship, which as i stated before is the least of your worries in
this type of a program.

the remaining issue is native classes.  when the jvm needs to call out to
a native class, it is already running in a native thread -- and the jvm is
using no special semantics of the native threads (such as signals), so the
native class is free to use the thread in any way it wants. if the native
class blocks on i/o, the timer thread will eventually allocate a new
native thread to handle the jvm.

native classes which are "aware" of the hybrid jvm could do non-blocking
i/o using methods provided by the jvm -- this is how all the socket stuff
would work.  but for any real jvm it needs to handle fairly arbitrary 3rd
party libraries which do all sorts of blocking nastiness.  in this design,
the worst case scenario using nasty native classes degrades to a one-one
model.  there are only a few core classes which need to be made aware of
the hybrid jvm in order to reap the benefits.

there is an alternative design in which the native threads in the thread
pool take timer interrupts in order to reschedule compute-bound java
threads.  but this requires the native threads to be signal ready, which
vastly complicates the interface with native classes.  we've gone through
this problem in apache 1.3 -- very few 3rd party libraries are happy
receiving a SIGALRM in the middle of their operation, they've no way to
cleanup.

just say no to async notification.[2]

programming with non-blocking i/o, aka state machine programming, is
*hard*.  that's why we built up a lot of concepts so that we express more
in a simple notation.  i don't necessarily think java is wrong for its
lack of non-blocking i/o -- i think it's a fact of life.  it means that we
have to think of clever designs to implement the jvms.  but when we do, we
reap a bunch of rewards.

Dean

[1] i don't really need to elaborate how painful unix poll/select
semantics make this i/o event loop hard to implement.  zach's work on
phhttpd has given me much new hope in this area!  i can't wait to see
linux clobbering other unixes on benchmarks due to this streamlined event
model.

[2] obviously we have to take async notification somewhere -- because we
have to take interrupts from devices at a minimum.  we just choose to
partition the problem such that the amount of code requiring async
notification is comparatively small -- the kernel.  i design such that the
kernel is the only place where asynchronous events occur... because i know
that the folks who do kernel hacking are keenly aware of the needs of
async notification.  the vast masses of programmers out there writing
userland applications and libraries aren't aware of what happens when they
take a signal in the middle of an operation.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo en vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/