Kryo
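Spark can use the Kryo framework for serialization.
Three main metrics matter during serialization:
- Size of the serialized object : a serialization tool turns an object into a byte array that contains the object's field values plus metadata, so that the bytes can be deserialized back into an object
- Speed of serialization and deserialization : the time needed to turn an object into a byte array depends on how the tool generates and parses that byte array
- Overhead of the serialization tool itself : creating the serialization tool has a cost of its own
- Chunked Encoding : encoding data in chunks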
- Forward compatibility : reading bytes serialized by newer classes
- Backward compatibility : reading bytes serialized by older classes
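The first two metrics can be measured directly. The sketch below is a minimal illustration that uses the JDK's built-in java.io serialization rather than Kryo, so that it runs without extra dependencies; the Point class and the SerializationMetrics name are hypothetical and exist only for this example. The same measurements apply to any serializer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializationMetrics {
    // Hypothetical sample type, used only for this illustration.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Serializes an object to a byte array (field values plus class metadata).
    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // Rebuilds an object from the bytes produced by serialize().
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Point p = new Point(3, 4);

        long start = System.nanoTime();
        byte[] data = serialize(p);
        long serializeNanos = System.nanoTime() - start;

        start = System.nanoTime();
        Point copy = (Point) deserialize(data);
        long deserializeNanos = System.nanoTime() - start;

        System.out.println("serialized size (bytes): " + data.length);
        System.out.println("serialize time (ns): " + serializeNanos);
        System.out.println("deserialize time (ns): " + deserializeNanos);
        System.out.println("round trip ok: " + (copy.x == p.x && copy.y == p.y));
    }
}
```

A single timed call like this is only a rough indication; a fair comparison between serializers would need warm-up and many iterations.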
Kryo local development
Change the version of bcel from:
<dependency>
  <groupId>org.apache.bcel</groupId>
  <artifactId>bcel</artifactId>
  <version>6.0-SNAPSHOT</version>
</dependency>
to:
<dependency>
  <groupId>org.apache.bcel</groupId>
  <artifactId>bcel</artifactId>
  <version>6.0</version>
</dependency>
Run command : mvn clean compile -P java8
- >> is arithmetic shift right, >>> is logical shift right.
In an arithmetic shift, the sign bit is extended to preserve the signedness of the number.
For example: -2 represented in 8 bits would be 11111110 (because the most significant bit has negative weight). Shifting it right one bit using arithmetic shift would give you 11111111, or -1. Logical right shift, however, does not care that the value could possibly represent a number; it simply moves everything to the right and fills in from the left with 0s. Shifting our -2 right one bit using logical shift would give 01111111.
For an int operand, Java uses only the low five bits of the shift distance, so -1 >>> 32 is equivalent to -1 >>> 0, -1 >>> 33 is equivalent to -1 >>> 1, and, especially confusing, -1 >>> -1 is equivalent to -1 >>> 31.
System.out.println(Integer.toBinaryString(7));
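The shift rules above can be checked with a small program. This is a minimal sketch; the ShiftDemo class and its method names are made up for illustration. Note that Java's int is 32 bits wide, so the logical shift of -2 gives 0x7FFFFFFF rather than the 8-bit value 01111111 used in the explanation.

```java
class ShiftDemo {
    // Arithmetic shift right: the sign bit is copied into the vacated high bits.
    static int arithmeticShift(int value, int distance) {
        return value >> distance;
    }

    // Logical shift right: the vacated high bits are filled with zeros.
    static int logicalShift(int value, int distance) {
        return value >>> distance;
    }

    public static void main(String[] args) {
        System.out.println(arithmeticShift(-2, 1)); // -1
        System.out.println(logicalShift(-2, 1));    // 2147483647 (0x7FFFFFFF)

        // For int operands only the low five bits of the distance are used.
        System.out.println(logicalShift(-1, 32) == logicalShift(-1, 0));  // true
        System.out.println(logicalShift(-1, 33) == logicalShift(-1, 1));  // true
        System.out.println(logicalShift(-1, -1) == logicalShift(-1, 31)); // true
    }
}
```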
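b & 0x7F is a bitwise AND: b is taken in binary and ANDed bit by bit with 0111 1111. AND behaves like per-bit multiplication, so the result is b itself with everything except the sign bit (the highest bit) kept.
b & 0x80 handles the sign bit, that is, the highest bit of the byte.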
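These two masks are the building blocks of a variable-length integer (varint) encoding, where each byte carries 7 payload bits and the highest bit signals that another byte follows. The sketch below illustrates the general technique under that assumption; it is not Kryo's actual source, and the VarIntSketch class and its methods are hypothetical names.

```java
import java.io.ByteArrayOutputStream;

class VarIntSketch {
    // Encodes an int as 1 to 5 bytes: 7 payload bits per byte, lowest bits first.
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // While bits remain above the low 7, emit 7 bits with the high bit set.
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7; // logical shift, so negative values terminate
        }
        out.write(value); // last byte: high bit is clear
        return out.toByteArray();
    }

    // Decodes the bytes produced by encode().
    static int decode(byte[] data) {
        int result = 0;
        int shift = 0;
        for (byte b : data) {
            result |= (b & 0x7F) << shift; // keep the 7 payload bits
            if ((b & 0x80) == 0) {
                break; // high bit clear: this was the last byte
            }
            shift += 7;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(encode(1).length);   // 1
        System.out.println(encode(300).length); // 2
        System.out.println(encode(-1).length);  // 5
        System.out.println(decode(encode(300))); // 300
    }
}
```

Small non-negative values take one byte, while a negative value such as -1 has its high bits set and always needs five bytes in this plain form.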
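- When optimizePositive=false, writeVarInt() uses ZigZag encoding, similar to Protocol Buffers.
- In Kryo, each class has an associated registration ID and a reference ID for its class name. The first time a class is written, the order is: registration ID, class-name reference ID, then the class name.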
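The ZigZag mapping mentioned for writeVarInt() interleaves negative and non-negative numbers (0, -1, 1, -2, ... become 0, 1, 2, 3, ...), so values with a small magnitude stay small after encoding and fit into few varint bytes. The sketch below shows the usual 32-bit formulation as used by Protocol Buffers; the ZigZagSketch class is a hypothetical name and the code is an illustration, not a copy of Kryo's implementation.

```java
class ZigZagSketch {
    // Maps a signed int to an unsigned-style int: the sign moves to the lowest bit.
    static int encode(int value) {
        // value >> 31 is 0 for non-negative input and -1 (all ones) for negative input.
        return (value << 1) ^ (value >> 31);
    }

    // Inverse of encode().
    static int decode(int encoded) {
        // -(encoded & 1) is 0 when the lowest bit is clear and -1 when it is set.
        return (encoded >>> 1) ^ -(encoded & 1);
    }

    public static void main(String[] args) {
        System.out.println(encode(0));  // 0
        System.out.println(encode(-1)); // 1
        System.out.println(encode(1));  // 2
        System.out.println(encode(-2)); // 3
        System.out.println(decode(encode(-12345))); // -12345
    }
}
```

The encode step relies on the arithmetic shift (>>) to spread the sign bit, and the decode step relies on the logical shift (>>>) to drop it.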