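The entry point for class file parsing is the parseClassFile() method defined in the ClassFileParser class. Once the previous section has obtained the file byte stream (stream), ClassLoader::load_classfile() calls parseClassFile(). The calling code is as follows:
Source location: src/share/vm/classfile/classLoader.cpp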
instanceKlassHandle h;
if (stream != NULL) {
  // class file found, parse it
  ClassFileParser parser(stream);
  ClassLoaderData* loader_data = ClassLoaderData::the_null_class_loader_data();
  Handle protection_domain;
  TempNewSymbol parsed_name = NULL;
  instanceKlassHandle result =
      parser.parseClassFile(h_name, loader_data, protection_domain, parsed_name, false, CHECK_(h));
  // add to package table
  if (add_package(name, classpath_index, THREAD)) {
    h = result;
  }
}
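Other functions also call parseClassFile() when needed, for example SystemDictionary::resolve_from_stream(), which is called when loading the Java main class.
The parseClassFile() function that is called is implemented as follows: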
instanceKlassHandle parseClassFile(Symbol* name,
                                   ClassLoaderData* loader_data,
                                   Handle protection_domain,
                                   TempNewSymbol& parsed_name,
                                   bool verify,
                                   TRAPS) {
  KlassHandle no_host_klass;
  return parseClassFile(name, loader_data, protection_domain, no_host_klass, NULL, parsed_name, verify, THREAD);
}
The prototype of the overload it delegates to is as follows:
instanceKlassHandle ClassFileParser::parseClassFile(Symbol* name,
                                                    ClassLoaderData* loader_data,
                                                    Handle protection_domain,
                                                    KlassHandle host_klass,
                                                    GrowableArray<Handle>* cp_patches,
                                                    TempNewSymbol& parsed_name,
                                                    bool verify,
                                                    TRAPS)
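The implementation of this method is long, so the walkthrough below breaks it into several steps.
1. Parsing the magic number and the major and minor version numbers
ClassFileStream* cfs = stream();
...
u4 magic = cfs->get_u4_fast();
guarantee_property(magic == JAVA_CLASSFILE_MAGIC,
                   "Incompatible magic value %u in class file %s",
                   magic, CHECK_(nullHandle));
// Version numbers
u2 minor_version = cfs->get_u2_fast();
u2 major_version = cfs->get_u2_fast();
…
_major_version = major_version;
_minor_version = minor_version;
The magic number is read mainly to verify that its value is 0xCAFEBABE. The minor and major version numbers of the class file are then read, in that order, and saved in the _minor_version and _major_version fields of the ClassFileParser instance.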
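To make the byte-level layout concrete, here is a minimal sketch of reading the same three fields from an in-memory buffer. ByteReader, ClassHeader, and read_header are hypothetical names invented for this illustration and are not HotSpot code; the sketch only assumes that class files store multi-byte values in big-endian order and begin with magic, minor_version, major_version.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for ClassFileStream: reads big-endian values
// from an in-memory buffer. This is an illustration, not HotSpot code.
class ByteReader {
 public:
  explicit ByteReader(const std::vector<std::uint8_t>& bytes)
      : bytes_(bytes), pos_(0) {}

  std::uint16_t get_u2() {
    assert(pos_ + 2 <= bytes_.size());
    std::uint16_t v =
        static_cast<std::uint16_t>((bytes_[pos_] << 8) | bytes_[pos_ + 1]);
    pos_ += 2;
    return v;
  }

  std::uint32_t get_u4() {
    // Two separate statements keep the read order well defined.
    std::uint32_t hi = get_u2();
    std::uint32_t lo = get_u2();
    return (hi << 16) | lo;
  }

 private:
  const std::vector<std::uint8_t>& bytes_;
  std::size_t pos_;
};

struct ClassHeader {
  bool valid_magic;
  std::uint16_t minor_version;
  std::uint16_t major_version;
};

// Reads the first 8 bytes of a class file: magic, then minor, then major.
ClassHeader read_header(const std::vector<std::uint8_t>& bytes) {
  ByteReader in(bytes);
  ClassHeader h;
  h.valid_magic = (in.get_u4() == 0xCAFEBABEu);
  h.minor_version = in.get_u2();
  h.major_version = in.get_u2();
  return h;
}
```

Passing the bytes CA FE BA BE 00 00 00 34 to read_header() yields a valid magic, minor version 0, and major version 52, which is the major version used by Java 8 class files.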
2. Parsing the access flags
// Access flags
AccessFlags access_flags;
jint flags = cfs->get_u2_fast() & JVM_RECOGNIZED_CLASS_MODIFIERS;
if ((flags & JVM_ACC_INTERFACE) && _major_version < JAVA_6_VERSION) {
  // Set abstract bit for old class files for backward compatibility
  flags |= JVM_ACC_ABSTRACT;
}
access_flags.set_flags(flags);
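The access flags are read and validated. They are used later when parsing fields and methods, mainly to determine whether those fields or methods are declared in an interface or in a class. JVM_RECOGNIZED_CLASS_MODIFIERS is a macro, defined as follows: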
#define JVM_RECOGNIZED_CLASS_MODIFIERS (JVM_ACC_PUBLIC | \
                                        JVM_ACC_FINAL | \
                                        JVM_ACC_SUPER | /* affects invokespecial semantics */ \
                                        JVM_ACC_INTERFACE | \
                                        JVM_ACC_ABSTRACT | \
                                        JVM_ACC_ANNOTATION | \
                                        JVM_ACC_ENUM | \
                                        JVM_ACC_SYNTHETIC)
The last flag, JVM_ACC_SYNTHETIC, is added by the front-end compiler (such as javac) and marks the type as synthetic.
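The masking and the backward-compatibility rule can be tried out in isolation. The following is a rough sketch, not HotSpot code: normalize_class_flags is a hypothetical helper, the flag values are the ones given in the JVM specification, and the constant 50 stands in for JAVA_6_VERSION.

```cpp
#include <cassert>
#include <cstdint>

// Class access flag values as listed in the JVM specification.
constexpr std::uint16_t ACC_PUBLIC     = 0x0001;
constexpr std::uint16_t ACC_FINAL      = 0x0010;
constexpr std::uint16_t ACC_SUPER      = 0x0020;
constexpr std::uint16_t ACC_INTERFACE  = 0x0200;
constexpr std::uint16_t ACC_ABSTRACT   = 0x0400;
constexpr std::uint16_t ACC_SYNTHETIC  = 0x1000;
constexpr std::uint16_t ACC_ANNOTATION = 0x2000;
constexpr std::uint16_t ACC_ENUM       = 0x4000;

constexpr std::uint16_t RECOGNIZED_CLASS_MODIFIERS =
    ACC_PUBLIC | ACC_FINAL | ACC_SUPER | ACC_INTERFACE |
    ACC_ABSTRACT | ACC_ANNOTATION | ACC_ENUM | ACC_SYNTHETIC;

// Assumed stand-in for JAVA_6_VERSION (major version 50).
constexpr std::uint16_t JAVA_6_MAJOR = 50;

// Hypothetical helper mirroring the snippet above: drop bits that are
// not class modifiers, then force ABSTRACT on interfaces from old files.
std::uint16_t normalize_class_flags(std::uint16_t raw,
                                    std::uint16_t major_version) {
  std::uint16_t flags =
      static_cast<std::uint16_t>(raw & RECOGNIZED_CLASS_MODIFIERS);
  if ((flags & ACC_INTERFACE) != 0 && major_version < JAVA_6_MAJOR) {
    flags = static_cast<std::uint16_t>(flags | ACC_ABSTRACT);
  }
  return flags;
}
```

For instance, a public interface (0x0201) from a class file with major version 49 comes back as 0x0601, while the same flags from a version 52 file are returned unchanged.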
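3. Parsing the this_class index
The class index (this_class) is a u2 value used to determine the fully qualified name of this class. It points to a CONSTANT_Class_info entry in the constant pool, and the index stored in that entry in turn locates a CONSTANT_Utf8_info string in the constant pool.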
// This class and superclass
u2 this_class_index = cfs->get_u2_fast();
Symbol* class_name = cp->unresolved_klass_at(this_class_index);
assert(class_name != NULL, "class_name can't be null");
// Update _class_name which could be null previously to be class_name
_class_name = class_name;
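The name of the current class is saved in the _class_name field of the ClassFileParser instance.
The cp->unresolved_klass_at() method that is called is implemented as follows:
Source location: /hotspot/src/share/vm/oops/constantPool.hpp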
// Not yet resolved: returns a Symbol*
// This method should only be used with a cpool lock or during parsing or gc
Symbol* unresolved_klass_at(int which) {     // Temporary until actual use
  intptr_t* oaar = obj_at_addr_raw(which);
  Symbol* tmp = (Symbol*)OrderAccess::load_ptr_acquire(oaar);
  Symbol* s = CPSlot(tmp).get_symbol();
  // check that the klass is still unresolved.
  assert(tag_at(which).is_unresolved_klass(), "Corrupted constant pool");
  return s;
}
For example:
#3 = Class              #17            // TestClass
...
#17 = Utf8               TestClass
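The class index is 0x0003. Entry 3 in the constant pool is a class descriptor whose name index is 17, and entry 17 is the string "TestClass". obj_at_addr_raw() returns the address of the slot, and the value stored in that slot is a pointer to the Symbol object representing the string "TestClass". In other words, when the constant pool entries are parsed, the slot that originally stored the index 17 is rewritten to store a pointer to the Symbol object.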
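The two-step lookup in this example can be modeled with a small sketch. It reflects only the file-level view (a Class entry that stores the index of a Utf8 entry), not HotSpot's in-memory form where the slot already holds a Symbol pointer. Tag, CpEntry, class_name_at, and make_example_pool are hypothetical names used for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class Tag { Invalid, Utf8, Class };

// Simplified constant pool entry: a Class entry uses name_index,
// a Utf8 entry uses utf8. Illustration only, not HotSpot's layout.
struct CpEntry {
  Tag tag;
  int name_index;
  std::string utf8;
};

// Follows this_class -> CONSTANT_Class_info -> CONSTANT_Utf8_info.
std::string class_name_at(const std::vector<CpEntry>& cp, int class_index) {
  assert(class_index > 0 &&
         static_cast<std::size_t>(class_index) < cp.size());
  const CpEntry& klass = cp[class_index];
  assert(klass.tag == Tag::Class);
  const CpEntry& name = cp[klass.name_index];
  assert(name.tag == Tag::Utf8);
  return name.utf8;
}

// Builds the pool from the example: #3 = Class #17, #17 = Utf8 TestClass.
std::vector<CpEntry> make_example_pool() {
  std::vector<CpEntry> cp(18, CpEntry{Tag::Invalid, 0, ""});
  cp[3] = CpEntry{Tag::Class, 17, ""};
  cp[17] = CpEntry{Tag::Utf8, 0, "TestClass"};
  return cp;
}
```

Calling class_name_at(make_example_pool(), 3) follows index 3 to index 17 and returns "TestClass".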
The obj_at_addr_raw() method that is called is implemented as follows:
intptr_t* obj_at_addr_raw(int which) const {
  assert(is_within_bounds(which), "index out of bounds");
  return (intptr_t*) &base()[which];
}
intptr_t* base() const {
  return (intptr_t*) (
      ( (char*) this ) + sizeof(ConstantPool)
  );
}
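base() is a method defined in ConstantPool, so the this pointer is the address at which the current ConstantPool object starts in memory. Adding the size of the ConstantPool class itself moves the pointer to the constant pool data, which is typically laid out as an array of length pointer-width slots, where length is the number of constant pool entries. The expression (intptr_t*)&base()[which] yields the address of the slot for constant pool index which; in the example above, that slot holds a pointer to a Symbol object.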
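The "object header followed by a trailing array" layout can be reproduced in a few lines. This is a rough sketch under stated assumptions: MiniPool is a hypothetical class, the memory comes from calloc rather than HotSpot's metaspace, and indexing past the end of the object relies on the same low-level layout trick that base() uses, which strict ISO C++ does not bless.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Hypothetical miniature of the ConstantPool layout: a fixed-size object
// immediately followed in memory by `length` pointer-width slots.
class alignas(std::intptr_t) MiniPool {
 public:
  static MiniPool* create(int length) {
    std::size_t bytes = sizeof(MiniPool) +
                        static_cast<std::size_t>(length) * sizeof(std::intptr_t);
    void* mem = std::calloc(1, bytes);  // zeroed, so every slot starts as 0
    assert(mem != nullptr);
    return new (mem) MiniPool(length);
  }

  static void destroy(MiniPool* p) {
    p->~MiniPool();
    std::free(p);
  }

  int length() const { return length_; }

  // Same idea as ConstantPool::base(): skip over the object itself.
  std::intptr_t* base() const {
    return reinterpret_cast<std::intptr_t*>(
        reinterpret_cast<char*>(const_cast<MiniPool*>(this)) + sizeof(MiniPool));
  }

  // Same idea as obj_at_addr_raw(): address of slot `which`.
  std::intptr_t* slot_addr(int which) const {
    assert(which >= 0 && which < length_);
    return &base()[which];
  }

 private:
  explicit MiniPool(int length) : length_(length) {}
  int length_;
};
```

The alignas specifier guarantees that sizeof(MiniPool) is a multiple of the slot alignment, so the first slot is properly aligned. A caller would create a pool with MiniPool::create(n), store a pointer value through slot_addr(i), and release it with MiniPool::destroy().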
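4. Parsing the super_class index
The superclass index (super_class) is a u2 value used to determine the fully qualified name of this class's superclass. Because the Java language does not allow multiple inheritance of classes, there is only one superclass index. It points to a CONSTANT_Class_info entry in the constant pool, and the index stored in that entry in turn locates a CONSTANT_Utf8_info string in the constant pool.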
u2 super_class_index = cfs->get_u2_fast();
instanceKlassHandle super_klass = parse_super_class(super_class_index, CHECK_NULL);
The parse_super_class() method that is called is implemented as follows:
instanceKlassHandle ClassFileParser::parse_super_class(int super_class_index, TRAPS) {
  instanceKlassHandle super_klass;
  if (super_class_index == 0) { // Only java.lang.Object has no superclass
    check_property(_class_name == vmSymbols::java_lang_Object(),
                   "Invalid superclass index %u in class file %s", super_class_index, CHECK_NULL);
  } else {
    check_property(valid_klass_reference_at(super_class_index),
                   "Invalid superclass index %u in class file %s", super_class_index, CHECK_NULL);
    // The class name should be legal because it is checked when parsing constant pool.
    // However, make sure it is not an array type.
    bool is_array = false;
    constantTag mytemp = _cp->tag_at(super_class_index);
    if (mytemp.is_klass()) {
      super_klass = instanceKlassHandle(THREAD, _cp->resolved_klass_at(super_class_index));
    }
  }
  return super_klass;
}
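If the constant pool entry has already been resolved, the InstanceKlass instance representing the superclass can be obtained directly through super_class_index; otherwise the returned handle is NULL.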
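The decision made by parse_super_class() can be summarized as a small classification function. The sketch below is a simplified model, not the HotSpot implementation: SuperLookup, CpTag, and classify_super are hypothetical names, and the class name is compared as a plain string instead of a Symbol.

```cpp
#include <cassert>
#include <string>

// Possible outcomes when looking at super_class_index (illustrative names).
enum class SuperLookup { Invalid, NoSuperclass, AlreadyResolved, ResolveLater };

// What the constant pool tag at super_class_index says (illustrative names).
enum class CpTag { Other, UnresolvedClass, ResolvedClass };

SuperLookup classify_super(const std::string& class_name,
                           int super_class_index,
                           CpTag tag_at_index) {
  assert(super_class_index >= 0);
  if (super_class_index == 0) {
    // Index 0 is only legal for java.lang.Object, which has no superclass.
    return class_name == "java/lang/Object" ? SuperLookup::NoSuperclass
                                            : SuperLookup::Invalid;
  }
  switch (tag_at_index) {
    case CpTag::ResolvedClass:
      return SuperLookup::AlreadyResolved;  // a Klass* is already in the slot
    case CpTag::UnresolvedClass:
      return SuperLookup::ResolveLater;     // handle stays empty for now
    default:
      return SuperLookup::Invalid;          // not a class reference at all
  }
}
```

The ResolveLater outcome corresponds to the case where the method returns an empty handle and the superclass is resolved at a later point.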
The resolved_klass_at() method is implemented as follows:
Source location: /hotspot/src/share/vm/oops/constantPool.hpp
// Already resolved: returns a Klass*
Klass* resolved_klass_at(int which) const {  // Used by Compiler
  // Must do an acquire here in case another thread resolved the klass
  // behind our back, lest we later load stale values thru the oop.
  Klass* tmp = (Klass*)OrderAccess::load_ptr_acquire(obj_at_addr_raw(which));
  return CPSlot(tmp).get_klass();
}
The CPSlot class used here is implemented as follows:
class CPSlot VALUE_OBJ_CLASS_SPEC {
  intptr_t _ptr;
 public:
  CPSlot(intptr_t ptr): _ptr(ptr) {}
  CPSlot(Klass* ptr): _ptr((intptr_t)ptr) {}
  CPSlot(Symbol* ptr): _ptr((intptr_t)ptr | 1) {} // Setting the low bit tags the slot as unresolved: it still holds a Symbol*
  intptr_t value()   { return _ptr; }
  bool is_resolved()   { return (_ptr & 1) == 0; }
  bool is_unresolved() { return (_ptr & 1) == 1; }
  Symbol* get_symbol() {
    assert(is_unresolved(), "bad call");
    return (Symbol*)(_ptr & ~1);
  }
  Klass* get_klass() {
    assert(is_resolved(), "bad call");
    return (Klass*)_ptr;
  }
};
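CPSlot is an instance of the tagged-pointer technique: because the pointed-to objects are aligned, the lowest bit of their address is always 0 and can be borrowed as a flag. The following is a minimal standalone sketch of that idea. TaggedSlot, FakeSymbol, and FakeKlass are hypothetical types made up for illustration; only the bit convention (low bit set means unresolved Symbol, low bit clear means resolved Klass) is taken from the code above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical placeholder types standing in for Symbol and Klass.
struct FakeSymbol { const char* text; };
struct FakeKlass  { const char* name; };

// The trick needs at least 2-byte alignment so that bit 0 is free.
static_assert(alignof(FakeSymbol) >= 2, "low bit must be available");
static_assert(alignof(FakeKlass) >= 2, "low bit must be available");

class TaggedSlot {
 public:
  // Low bit set: the slot still holds a name (unresolved).
  static TaggedSlot from_symbol(FakeSymbol* s) {
    return TaggedSlot(reinterpret_cast<std::uintptr_t>(s) | 1u);
  }
  // Low bit clear: the slot holds the resolved class.
  static TaggedSlot from_klass(FakeKlass* k) {
    return TaggedSlot(reinterpret_cast<std::uintptr_t>(k));
  }

  bool is_resolved() const { return (bits_ & 1u) == 0; }

  FakeSymbol* symbol() const {
    assert(!is_resolved());
    // Clear the tag bit before using the value as a pointer again.
    return reinterpret_cast<FakeSymbol*>(bits_ & ~static_cast<std::uintptr_t>(1));
  }

  FakeKlass* klass() const {
    assert(is_resolved());
    return reinterpret_cast<FakeKlass*>(bits_);
  }

 private:
  explicit TaggedSlot(std::uintptr_t bits) : bits_(bits) {}
  std::uintptr_t bits_;
};
```

A single machine word can therefore represent both states of a class entry, and a reader can tell which one it is looking at without consulting a separate tag array.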
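5. Parsing the implemented interfaces
In the interface table, each value in the interfaces[] array must be a valid index into the constant_pool table; the array has interfaces_count elements. Each interfaces[i], where 0 ≤ i < interfaces_count, must refer to a CONSTANT_Class_info constant. The order of the interfaces in the interfaces[] array matches the left-to-right order given in the source code, so interfaces[0] is the leftmost interface in the source.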
u2 itfs_len = cfs->get_u2_fast();
Array<Klass*>* local_interfaces =
    parse_interfaces(itfs_len, protection_domain, _class_name, &has_default_methods, CHECK_(nullHandle));
The parse_interfaces() method is implemented as follows:
Array<Klass*>* ClassFileParser::parse_interfaces(int length,
                                                 Handle protection_domain,
                                                 Symbol* class_name,
                                                 bool* has_default_methods,
                                                 TRAPS) {
  if (length == 0) {
    _local_interfaces = Universe::the_empty_klass_array();
  } else {
    ClassFileStream* cfs = stream();
    _local_interfaces = MetadataFactory::new_array<Klass*>(_loader_data, length, NULL, CHECK_NULL);
    int index;
    for (index = 0; index < length; index++) {
      u2 interface_index = cfs->get_u2(CHECK_NULL);
      KlassHandle interf;
      if (_cp->tag_at(interface_index).is_klass()) {
        interf = KlassHandle(THREAD, _cp->resolved_klass_at(interface_index));
      } else {
        Symbol* unresolved_klass = _cp->klass_name_at(interface_index);
        Handle class_loader(THREAD, _loader_data->class_loader());
        // Call resolve_super so classcircularity is checked
        Klass* k = SystemDictionary::resolve_super_or_fail(class_name,
                                                           unresolved_klass,
                                                           class_loader,
                                                           protection_domain,
                                                           false, CHECK_NULL);
        // Wrap the InstanceKlass instance that represents the interface in a KlassHandle
        interf = KlassHandle(THREAD, k);
      }
      if (InstanceKlass::cast(interf())->has_default_methods()) {
        *has_default_methods = true;
      }
      _local_interfaces->at_put(index, interf());
    }
    if (!_need_verify || length <= 1) {
      return _local_interfaces;
    }
  }
  return _local_interfaces;
}
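The loop processes each interface implemented by the class. For every interface_index it finds the InstanceKlass instance that represents the interface on the C++ side, wraps it in a KlassHandle, and stores it in the _local_interfaces array. The point to note is how the InstanceKlass instance is found from interface_index. If the constant pool slot at that index already holds the InstanceKlass instance, the entry has been resolved and can be fetched directly with _cp->resolved_klass_at(). If the slot holds only the name string, SystemDictionary::resolve_super_or_fail() must be called to resolve it. That method is covered in detail in the discussion of linking and is not described further here.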
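The shape of this loop (use the cached class if the slot is resolved, otherwise look it up by name, and record whether any interface has default methods) can be sketched independently of HotSpot. Everything below is a simplified model: Iface, Slot, and collect_interfaces are hypothetical names, and a std::map stands in for the lookup that SystemDictionary::resolve_super_or_fail() performs.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an InstanceKlass that represents an interface.
struct Iface {
  std::string name;
  bool has_default_methods;
};

// Hypothetical constant pool slot: resolved == nullptr means only the
// name is known so far.
struct Slot {
  std::string name;
  const Iface* resolved;
};

std::vector<const Iface*> collect_interfaces(
    const std::vector<Slot>& cp,
    const std::vector<int>& interface_indices,
    const std::map<std::string, Iface>& dictionary,
    bool* has_default_methods) {
  std::vector<const Iface*> result;
  for (int index : interface_indices) {
    const Slot& slot = cp.at(index);
    const Iface* iface = slot.resolved;
    if (iface == nullptr) {
      // Unresolved: look the interface up by name. The map lookup throws
      // std::out_of_range if the name is unknown.
      iface = &dictionary.at(slot.name);
    }
    if (iface->has_default_methods) {
      *has_default_methods = true;
    }
    result.push_back(iface);  // order follows the interfaces[] table
  }
  return result;
}
```

The result keeps the same order as the interfaces[] table, so the first element corresponds to the leftmost interface in the source declaration.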
The klass_name_at() method is implemented as follows:
Symbol* ConstantPool::klass_name_at(int which) {
  assert(tag_at(which).is_unresolved_klass() || tag_at(which).is_klass(),
         "Corrupted constant pool");
  // A resolved constantPool entry will contain a Klass*, otherwise a Symbol*.
  // It is not safe to rely on the tag bit's here, since we don't have a lock, and the entry and
  // tag is not updated atomicly.
  CPSlot entry = slot_at(which);
  if (entry.is_resolved()) { // Resolved: the slot holds a pointer to an InstanceKlass instance
    // Already resolved - return entry's name.
    assert(entry.get_klass()->is_klass(), "must be");
    return entry.get_klass()->name();
  } else { // Unresolved: the slot holds a pointer to a Symbol instance
    assert(entry.is_unresolved(), "must be either symbol or klass");
    return entry.get_symbol();
  }
}
The slot_at() method used here is implemented as follows:
CPSlot slot_at(int which) {
  assert(is_within_bounds(which), "index out of bounds");
  // Uses volatile because the klass slot changes without a lock.
  volatile intptr_t adr = (intptr_t)OrderAccess::load_ptr_acquire(obj_at_addr_raw(which));
  assert(adr != 0 || which == 0, "cp entry for klass should not be zero");
  return CPSlot(adr);
}
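This likewise calls obj_at_addr_raw() to read the value stored at the given index in the ConstantPool, then wraps it in a CPSlot object and returns it.
6. Parsing the class attributes
ClassAnnotationCollector parsed_annotations;
parse_classfile_attributes(&parsed_annotations, CHECK_(nullHandle));
parse_classfile_attributes() is called to parse the class attributes. The implementation is tedious but simply parses each attribute according to its format; interested readers can study it on their own.
Parsing of the constant pool, fields, and methods is covered in detail later and is not discussed here.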
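Related articles in this series:
1. Compiling the OpenJDK 8 source code on Ubuntu 16.04
13. Class loaders
14. The parent delegation model for classes
15. Preloading core classes
16. Loading the Java main class
17. Triggering class loading
18. Introduction to class files
19. File streams
The author maintains a personal blog at classloading.com.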
