SPEC OF ORCA FILES

http://blog.huxx.me/2014/11/03/Spec1-1/:

Some contention occurs so fast that I cannot capture it!.

Basically, the backend has two responsibilities: extracting static and runtime information from JVM, and then dumping to files.

We are interested in concurrent information, that includes thread life cycle and interleaving caused by locks. Our basic tool is JVMTI, Java virtual machine tool interface, which is a powerful toolbox providing almost all possible information of JVM and the running program. It provides a set of events, and a set of methods.

Class and method metadata

An issue with ClassLoad and ClassPrepare is that not all classes have both two events. For Ljava/lang/Object, as an example, no ClassPrepare event is raised during its loading, thus, featured method metadata has to be extracted in the ClassLoad event.

Metadata information we are interested in includes:

  • method ID, a 32bit integer
  • method source map, how jlocations are mapped to source code lines

Not all class and method metadata need to be dumped. We are interested in two types of methods:

  • The methods we are tracking, for examples, Thread.start. We know these methods in advance. Therefore, method IDs are extracted in ClassLoad and ClassPrepare callbacks.
  • Those methods we are not interested about, but they appear in the stack trace. We do not know these methods before execution. Thus, their method ID is recorded at runtime (in a hashtable), and their detailed information, including method name, signature, line tables, are dumped when they are unloaded from JVM.

Meta log file

First part of this file is class descriptions. Each line shows which source file the class is corresponding to.

<classSignature>,<sourceFileName>

<classSignature> is the signature of a JAVA class, corresponding to JAVA specification (e.g. Ljava/lang/Object;). <sourceFileName> is a string which could be empty (no source is found).

And the second part of this file, ach line of describles a method,

<methodID>,<methodName>,<methodSignature>,<declaringClass>,[lineTable]

<methodID> is a 32-bit hexadecimal integer, <methodName> is a string for the name of that method, <methodSignature> is a string (see JVM specification for method signatures), <declaringClass> is the signature of the class the method is declared in. The last parameter is the source line table of this method.

<entriesCount>,(<startLocation>,<lineNumber>)*

Where <entriesCount> is a decimal integer. Sometimes the lineTables does not exist. In that case, <entriesCount> should be -1. Then the lineTable is followed by some pairs seperated by commas. <startLocation> is a 32-bit hexadecimal integer for each jLocation, and <lineNumber> is the corresponding line number in the source file. Name of source file could be retrieved from the first part of meta log file.

Runtime events

Log file format

  • Each event is recorded in one single line. And different parameters are seperated by commas.
  • All hashcodes are generated by the JVMTI function getObjectHashcode. Their type is jint, in our case a 32bit integer. So in the logfile, all hashcode are hexadecimal numbers of length 8 (with leading 0s, uppercased), e.g. FFFF38D6
  • All timestamps are retrieved by QueryPerformanceCounter function on Windows. In logfile, they are represented by a floating point number with unit of second, e.g. 34232.84192374283

Method call events

Ljava/lang/Object;

[timestamp],ObjectWait,<currentThread>,<object>,[stackTrace]
[timestamp],ObjectWaited,<currentThread>,<object>,[stackTrace]

These two events are generated when Object.wait is called, at the method entry point and exit point. <currentThread> is the hash code of current thread, and <object> is the object it wants to wait.
Note that Object.wait has several overloaded forms, we don not care about which one is using, but if one is called, the nested wait calls are supressed.

[timestamp],ObjectNotify,<currentThread>,<object>,[stackTrace]
[timestamp],ObjectNotifyAll,<currentThread>,<object>,[stackTrace]

When Object.notify or ObjectNotifyAll is called, the corresponding event is raised. <currentThread> is the current thread hashcode, and <object> is the object notified.

Ljava/lang/Thread;

[timestamp],ThreadStart,<currentThread>,<startThread>,[stackTrace]

This event is generated when Thread.start is called (at the method entry). currentThread is the current thread hashcode, <startThread> is the new thread hashcode.

[timestamp],ThreadJoin,<currentThread>,<waitingThread>,[stackTrace]
[timestamp],ThreadJoined,<currentThread>,<waitingThread>,[stackTrace]

Generated when Thread.join is called, both at the method entry and method exit. currentThread is the hashcode of current thread. waitingThread is the hashcode of thread it is waiting for.
Note that in JAVA implementation, Thread.Join calls Thread.wait. In logfile, the inner Thread.wait is supressed.

[timestamp],ThreadSleep,<currentThread>,[stackTrace]
[timestamp],ThreadSlept,<currentThread>,[stackTrace]

Generated when Thread.sleep is called, both at the method entry and method exit. currentThread is the hashcode of current thread.

Monitor Contention Events

Monitor contention happens when two or more threads are executing the following code snippet simutanously,

synchronized(obj) {
    foo();
}

There are two events related to monitor contention.

[timestamp],MonitorContendedEnter,<currentThread>,<monitor>,<ownerThread>,[stackTrace]
[timestamp],MonitorContendedEntered,<currentThread>,<monitor>,[stackTrace]

Generated at the beginning of contention, and the end (current thread owns the monitor object). <currentThread> is the hashcode of current thread, and <monitor> is the hashcode of monitor object. <ownerThread> is the hashcode of thread who is currently owning that monitor. When the contention is over, current thread should be the owner of the monitor.
Note these two events are only triggered when contention happens. If there is no contention, i.e. only one thread is trying to enter the monitor, both events are muted.

Thread events

Thread events are hooked when a thread is created or destroyed.

[timestamp],ThreadStarted,<thread>,<threadName>
[timestamp],ThreadEnded,<thread>

Where <thread> is the hashcode of current thread (of course the new thread). <threadName> is a string representing the name of new thread, not all threads have a name, some of them are empty.
Note most threads are invoked by Thread.start, but not necessarily all of them. So not each ThreadStartedevent has a corresponding ThreadStart event.

Stack trace

A stack trace is dumped for almost all events.

<stackDepth>,(methodID,Location)*

Stack trace always begins by a non-negative integer <stackDepth>, and followed by pairs of <methodID> and <location><methodID> is a 64-bit hexadecimal address (in uppercase, with leading 0s?), and <location>is also a hexadecimal address.

Last edited Dec 11, 2014 at 6:15 AM by LuoLiang, version 1