Thrift Framework Explained (Part 4)

Protocols and Codecs (1): TBinaryProtocol

Posted by Jason Lee on 2020-03-12

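Overview

This section discusses Thrift's codecs.

Core Components

Overall Architecture

Thrift's core components are the following:

  • IDL service description component: provides cross-platform and cross-language support by generating Server-side and Client-side code for each target language
  • TServer and Client: the server-side and client-side implementations
  • TProtocol: the protocol and encoding/decoding component
  • TTransport: the transport component
  • TProcessor: the service invocation component, which dispatches calls to the service implementation

Protocols and codecs are among the core concerns of any network application. The client and server exchange messages (data) over an agreed protocol, encode and decode the byte stream in a specific format, and turn it into business messages for the upper layers of the framework to consume.

This section focuses on the TProtocol component.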

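Encoding and Decoding

Protocols

Thrift lets users choose the wire protocol used between server and client. Broadly, there are two kinds: binary and text.

Choose a binary protocol to save bandwidth, or a text protocol to make packet capture and debugging easier. Pick whichever fits the needs of your project.

Class Diagram

  • TCompactProtocol: a compact, efficient binary wire protocol
  • TBinaryProtocol: a plain binary wire protocol, used the same way as TCompactProtocol
  • TJSONProtocol: a wire protocol that encodes data as JSON
  • TSimpleJSONProtocol: a wire protocol that encodes data as simple JSON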

We can see that all of these protocols extend TProtocol.

TProtocol Methods

public abstract class TProtocol {
// ...
protected TTransport trans_;
//...
// Write methods
public abstract void writeMessageBegin(TMessage message) throws TException;
public abstract void writeMessageEnd() throws TException;
public abstract void writeStructBegin(TStruct struct) throws TException;
public abstract void writeStructEnd() throws TException;
public abstract void writeFieldBegin(TField field) throws TException;
public abstract void writeFieldEnd() throws TException;
public abstract void writeFieldStop() throws TException;
public abstract void writeMapBegin(TMap map) throws TException;
public abstract void writeMapEnd() throws TException;
public abstract void writeListBegin(TList list) throws TException;
public abstract void writeListEnd() throws TException;
public abstract void writeSetBegin(TSet set) throws TException;
public abstract void writeSetEnd() throws TException;
public abstract void writeBool(boolean b) throws TException;
public abstract void writeByte(byte b) throws TException;
public abstract void writeI16(short i16) throws TException;
public abstract void writeI32(int i32) throws TException;
public abstract void writeI64(long i64) throws TException;
public abstract void writeDouble(double dub) throws TException;
public abstract void writeString(String str) throws TException;
public abstract void writeBinary(ByteBuffer buf) throws TException;
// Read methods
public abstract TMessage readMessageBegin() throws TException;
public abstract void readMessageEnd() throws TException;
public abstract TStruct readStructBegin() throws TException;
public abstract void readStructEnd() throws TException;
public abstract TField readFieldBegin() throws TException;
public abstract void readFieldEnd() throws TException;
public abstract TMap readMapBegin() throws TException;
public abstract void readMapEnd() throws TException;
public abstract TList readListBegin() throws TException;
public abstract void readListEnd() throws TException;
public abstract TSet readSetBegin() throws TException;
public abstract void readSetEnd() throws TException;
public abstract boolean readBool() throws TException;
public abstract byte readByte() throws TException;
public abstract short readI16() throws TException;
public abstract int readI32() throws TException;
public abstract long readI64() throws TException;
public abstract double readDouble() throws TException;
public abstract String readString() throws TException;
public abstract ByteBuffer readBinary() throws TException;
}

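As you can see, this class defines a set of abstract methods that serve as the serialization entry points for every kind of Thrift message. Each subclass implements them according to its own characteristics, so that different protocol types serialize the same data in different formats.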
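
To illustrate the idea of one abstract contract with several wire formats, here is a minimal sketch. It does not use Thrift itself: MiniProtocol, MiniBinaryProtocol, MiniTextProtocol, and MiniProtocolDemo are hypothetical names, only two write methods are modeled, and the text flavor is a made-up newline-separated format rather than Thrift's JSON encoding.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class MiniProtocolDemo {
    /** The caller only sees the abstract type, so the wire format is swappable. */
    static byte[] encode(MiniProtocol protocol) {
        protocol.writeI32(1234567890);
        protocol.writeString("ping");
        return protocol.toBytes();
    }

    public static void main(String[] args) {
        System.out.println("binary: " + encode(new MiniBinaryProtocol()).length + " bytes");
        System.out.println("text:   " + encode(new MiniTextProtocol()).length + " bytes");
    }
}

/** Hypothetical, stripped-down stand-in for TProtocol (assumption, not Thrift code). */
abstract class MiniProtocol {
    protected final ByteArrayOutputStream out = new ByteArrayOutputStream();

    abstract void writeI32(int value);

    abstract void writeString(String value);

    byte[] toBytes() {
        return out.toByteArray();
    }
}

/** Binary flavor: fixed-width big-endian integers, length-prefixed UTF-8 strings. */
final class MiniBinaryProtocol extends MiniProtocol {
    @Override
    void writeI32(int value) {
        // ByteArrayOutputStream.write(int) keeps only the low 8 bits of its argument.
        out.write(value >>> 24);
        out.write(value >>> 16);
        out.write(value >>> 8);
        out.write(value);
    }

    @Override
    void writeString(String value) {
        byte[] data = value.getBytes(StandardCharsets.UTF_8);
        writeI32(data.length);
        out.write(data, 0, data.length);
    }
}

/** Text flavor: human-readable tokens separated by newlines (made-up format). */
final class MiniTextProtocol extends MiniProtocol {
    private void writeToken(String token) {
        byte[] data = (token + "\n").getBytes(StandardCharsets.UTF_8);
        out.write(data, 0, data.length);
    }

    @Override
    void writeI32(int value) {
        writeToken(Integer.toString(value));
    }

    @Override
    void writeString(String value) {
        writeToken(value);
    }
}
```

The encode method never names a concrete class, which is the same property that lets a Thrift server or client switch protocols without touching the generated serialization code.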

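Let's take the most basic format, the binary protocol, and look at how serialization actually works.

TBinaryProtocol (Binary)

Binary serialization is a basic method that every serialization framework should provide. Everything ultimately travels over the network as bytes anyway, but a binary protocol lets us decide the exact byte layout ourselves, which avoids wasting bandwidth.
Let's look at the source first.

Source Code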

public class TBinaryProtocol extends TProtocol {
private static final TStruct ANONYMOUS_STRUCT = new TStruct();

protected static final int VERSION_MASK = 0xffff0000; // -65536
protected static final int VERSION_1 = 0x80010000; // -2147418112

protected boolean strictRead_ = false;
protected boolean strictWrite_ = true;

protected int readLength_;
protected boolean checkReadLength_ = false;

/**
* Protocol factory
*/
public static class Factory implements TProtocolFactory {
protected boolean strictRead_ = false;
protected boolean strictWrite_ = true;
protected int readLength_;

public Factory() {
this(false, true);
}

public Factory(boolean strictRead, boolean strictWrite) {
this(strictRead, strictWrite, 0);
}

public Factory(boolean strictRead, boolean strictWrite, int readLength) {
strictRead_ = strictRead;
strictWrite_ = strictWrite;
readLength_ = readLength;
}

public TProtocol getProtocol(TTransport trans) {
TBinaryProtocol proto = new TBinaryProtocol(trans, strictRead_, strictWrite_);
if (readLength_ != 0) {
proto.setReadLength(readLength_);
}
return proto;
}
}
// factory ... ... ... ...

public TBinaryProtocol(TTransport trans) {
this(trans, false, true);
}

public TBinaryProtocol(TTransport trans, boolean strictRead, boolean strictWrite) {
super(trans);
strictRead_ = strictRead;
strictWrite_ = strictWrite;
}
// Write the message header
public void writeMessageBegin(TMessage message) throws TException {
if (strictWrite_) {
int version = VERSION_1 | message.type;
writeI32(version);
writeString(message.name);
writeI32(message.seqid);
} else {
writeString(message.name);
writeByte(message.type);
writeI32(message.seqid);
}
}

public void writeMessageEnd() {}

public void writeStructBegin(TStruct struct) {}

public void writeStructEnd() {}

// Write a field header
public void writeFieldBegin(TField field) throws TException {
writeByte(field.type);
writeI16(field.id);
}

public void writeFieldEnd() {}

// Field-stop marker
public void writeFieldStop() throws TException {
writeByte(TType.STOP);
}
// Begin writing a map
public void writeMapBegin(TMap map) throws TException {
writeByte(map.keyType);
writeByte(map.valueType);
writeI32(map.size);
}

public void writeMapEnd() {}

public void writeListBegin(TList list) throws TException {
writeByte(list.elemType);
writeI32(list.size);
}

public void writeListEnd() {}

public void writeSetBegin(TSet set) throws TException {
writeByte(set.elemType);
writeI32(set.size);
}

public void writeSetEnd() {}

public void writeBool(boolean b) throws TException {
writeByte(b ? (byte)1 : (byte)0);
}

private byte [] bout = new byte[1];
// Write a single byte
public void writeByte(byte b) throws TException {
bout[0] = b;
trans_.write(bout, 0, 1);
}

private byte[] i16out = new byte[2];
public void writeI16(short i16) throws TException {
i16out[0] = (byte)(0xff & (i16 >> 8));
i16out[1] = (byte)(0xff & (i16));
trans_.write(i16out, 0, 2);
}

private byte[] i32out = new byte[4];
public void writeI32(int i32) throws TException {
i32out[0] = (byte)(0xff & (i32 >> 24));
i32out[1] = (byte)(0xff & (i32 >> 16));
i32out[2] = (byte)(0xff & (i32 >> 8));
i32out[3] = (byte)(0xff & (i32));
trans_.write(i32out, 0, 4);
}

private byte[] i64out = new byte[8];
public void writeI64(long i64) throws TException {
i64out[0] = (byte)(0xff & (i64 >> 56));
i64out[1] = (byte)(0xff & (i64 >> 48));
i64out[2] = (byte)(0xff & (i64 >> 40));
i64out[3] = (byte)(0xff & (i64 >> 32));
i64out[4] = (byte)(0xff & (i64 >> 24));
i64out[5] = (byte)(0xff & (i64 >> 16));
i64out[6] = (byte)(0xff & (i64 >> 8));
i64out[7] = (byte)(0xff & (i64));
trans_.write(i64out, 0, 8);
}

public void writeDouble(double dub) throws TException {
writeI64(Double.doubleToLongBits(dub));
}

public void writeString(String str) throws TException {
try {
byte[] dat = str.getBytes("UTF-8");
writeI32(dat.length);
trans_.write(dat, 0, dat.length);
} catch (UnsupportedEncodingException uex) {
throw new TException("JVM DOES NOT SUPPORT UTF-8");
}
}

public void writeBinary(ByteBuffer bin) throws TException {
int length = bin.limit() - bin.position();
writeI32(length);
trans_.write(bin.array(), bin.position() + bin.arrayOffset(), length);
}

/**
* Reading methods.
*/

public TMessage readMessageBegin() throws TException {
int size = readI32();
if (size < 0) {
int version = size & VERSION_MASK;
if (version != VERSION_1) {
throw new TProtocolException(TProtocolException.BAD_VERSION, "Bad version in readMessageBegin");
}
return new TMessage(readString(), (byte)(size & 0x000000ff), readI32());
} else {
if (strictRead_) {
throw new TProtocolException(TProtocolException.BAD_VERSION, "Missing version in readMessageBegin, old client?");
}
return new TMessage(readStringBody(size), readByte(), readI32());
}
}

public void readMessageEnd() {}

public TStruct readStructBegin() {
return ANONYMOUS_STRUCT;
}

public void readStructEnd() {}

public TField readFieldBegin() throws TException {
byte type = readByte();
short id = type == TType.STOP ? 0 : readI16();
return new TField("", type, id);
}

public void readFieldEnd() {}

public TMap readMapBegin() throws TException {
return new TMap(readByte(), readByte(), readI32());
}

public void readMapEnd() {}

public TList readListBegin() throws TException {
return new TList(readByte(), readI32());
}

public void readListEnd() {}

public TSet readSetBegin() throws TException {
return new TSet(readByte(), readI32());
}

public void readSetEnd() {}

public boolean readBool() throws TException {
return (readByte() == 1);
}

private byte[] bin = new byte[1];
public byte readByte() throws TException {
if (trans_.getBytesRemainingInBuffer() >= 1) {
byte b = trans_.getBuffer()[trans_.getBufferPosition()];
trans_.consumeBuffer(1);
return b;
}
readAll(bin, 0, 1);
return bin[0];
}

private byte[] i16rd = new byte[2];
public short readI16() throws TException {
byte[] buf = i16rd;
int off = 0;

if (trans_.getBytesRemainingInBuffer() >= 2) {
buf = trans_.getBuffer();
off = trans_.getBufferPosition();
trans_.consumeBuffer(2);
} else {
readAll(i16rd, 0, 2);
}

return
(short)
(((buf[off] & 0xff) << 8) |
((buf[off+1] & 0xff)));
}

private byte[] i32rd = new byte[4];
public int readI32() throws TException {
byte[] buf = i32rd;
int off = 0;

if (trans_.getBytesRemainingInBuffer() >= 4) {
buf = trans_.getBuffer();
off = trans_.getBufferPosition();
trans_.consumeBuffer(4);
} else {
readAll(i32rd, 0, 4);
}
return
((buf[off] & 0xff) << 24) |
((buf[off+1] & 0xff) << 16) |
((buf[off+2] & 0xff) << 8) |
((buf[off+3] & 0xff));
}

private byte[] i64rd = new byte[8];
public long readI64() throws TException {
byte[] buf = i64rd;
int off = 0;

if (trans_.getBytesRemainingInBuffer() >= 8) {
buf = trans_.getBuffer();
off = trans_.getBufferPosition();
trans_.consumeBuffer(8);
} else {
readAll(i64rd, 0, 8);
}

return
((long)(buf[off] & 0xff) << 56) |
((long)(buf[off+1] & 0xff) << 48) |
((long)(buf[off+2] & 0xff) << 40) |
((long)(buf[off+3] & 0xff) << 32) |
((long)(buf[off+4] & 0xff) << 24) |
((long)(buf[off+5] & 0xff) << 16) |
((long)(buf[off+6] & 0xff) << 8) |
((long)(buf[off+7] & 0xff));
}

public double readDouble() throws TException {
return Double.longBitsToDouble(readI64());
}

public String readString() throws TException {
int size = readI32();

if (trans_.getBytesRemainingInBuffer() >= size) {
try {
String s = new String(trans_.getBuffer(), trans_.getBufferPosition(), size, "UTF-8");
trans_.consumeBuffer(size);
return s;
} catch (UnsupportedEncodingException e) {
throw new TException("JVM DOES NOT SUPPORT UTF-8");
}
}

return readStringBody(size);
}

public String readStringBody(int size) throws TException {
try {
checkReadLength(size);
byte[] buf = new byte[size];
trans_.readAll(buf, 0, size);
return new String(buf, "UTF-8");
} catch (UnsupportedEncodingException uex) {
throw new TException("JVM DOES NOT SUPPORT UTF-8");
}
}

public ByteBuffer readBinary() throws TException {
int size = readI32();
checkReadLength(size);

if (trans_.getBytesRemainingInBuffer() >= size) {
ByteBuffer bb = ByteBuffer.wrap(trans_.getBuffer(), trans_.getBufferPosition(), size);
trans_.consumeBuffer(size);
return bb;
}

byte[] buf = new byte[size];
trans_.readAll(buf, 0, size);
return ByteBuffer.wrap(buf);
}

private int readAll(byte[] buf, int off, int len) throws TException {
checkReadLength(len);
return trans_.readAll(buf, off, len);
}

public void setReadLength(int readLength) {
readLength_ = readLength;
checkReadLength_ = true;
}

protected void checkReadLength(int length) throws TException {
if (length < 0) {
throw new TException("Negative length: " + length);
}
if (checkReadLength_) {
readLength_ -= length;
if (readLength_ < 0) {
throw new TException("Message length exceeded: " + length);
}
}
}
}
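
The versioned header that writeMessageBegin emits in strict mode, and the sign check that readMessageBegin uses to recognize it, can be reproduced without Thrift. The sketch below is an illustration only: StrictHeaderDemo and its methods are hypothetical names, and it uses java.nio.ByteBuffer (big-endian by default) in place of the hand-written shifts in the source above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class StrictHeaderDemo {
    // Same constants as in the TBinaryProtocol source above.
    static final int VERSION_MASK = 0xffff0000;
    static final int VERSION_1 = 0x80010000;

    /** Lays out a strict header: (version | type), name length, name bytes, seqid. */
    static byte[] encodeStrictHeader(String name, byte type, int seqid) {
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(4 + 4 + nameBytes.length + 4);
        buf.putInt(VERSION_1 | type);
        buf.putInt(nameBytes.length);
        buf.put(nameBytes);
        buf.putInt(seqid);
        return buf.array();
    }

    /**
     * Mirrors the check in readMessageBegin: VERSION_1 has its top bit set, so a
     * negative first word means "versioned header"; otherwise the word is the
     * length of the method name written by a non-strict peer.
     */
    static boolean isStrictHeader(byte[] header) {
        int first = ByteBuffer.wrap(header).getInt();
        return first < 0 && (first & VERSION_MASK) == VERSION_1;
    }

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte call = 1; // value of TMessageType.CALL
        byte[] header = encodeStrictHeader("ping", call, 7);
        System.out.println(toHex(header));
        System.out.println("strict header? " + isStrictHeader(header));
    }
}
```

Because a string length can never be negative, the reader can tell the two header layouts apart from the first four bytes alone, which is why strictRead_ only matters when the first word is non-negative.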

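Writing Data

The core of the binary protocol is its serialization code. Thrift wraps the data it writes in the following types:

  • TStruct
  • TMessage
  • TField
  • Collection types: TMap, TList, TSet
  • Primitive types: i16, i32, i64 … binary, string, and so on

The composite types (message, struct, field, and the collections) each have a dedicated write-begin and write-end method. Where there is metadata to record, the begin method writes it out as bytes: the type and id for a field, the element type and size for a collection. All byte output ultimately goes through the following primitive write methods: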

  • writeBool
  • writeByte
  • writeI16
  • writeI32
  • writeI64
  • writeDouble
  • writeString
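
As a rough illustration of how the begin methods and the primitive writes combine, the sketch below encodes a hypothetical struct with two fields (1: i32 count, 2: bool active) using the same layout as the source above: one type byte and a two-byte id per field, then the value, then a single STOP byte. FieldLayoutDemo and encodeSample are made-up names; the type codes are the values of Thrift's TType constants.

```java
import java.io.ByteArrayOutputStream;

final class FieldLayoutDemo {
    // Type codes taken from Thrift's TType.
    static final byte STOP = 0;
    static final byte BOOL = 2;
    static final byte I32 = 8;

    private final ByteArrayOutputStream out = new ByteArrayOutputStream();

    void writeByte(byte b) {
        out.write(b);
    }

    void writeI16(short v) {
        // write(int) keeps only the low 8 bits, so no explicit mask is needed here.
        out.write(v >> 8);
        out.write(v);
    }

    void writeI32(int v) {
        out.write(v >> 24);
        out.write(v >> 16);
        out.write(v >> 8);
        out.write(v);
    }

    void writeBool(boolean b) {
        writeByte(b ? (byte) 1 : (byte) 0);
    }

    void writeFieldBegin(byte type, short id) {
        writeByte(type);
        writeI16(id);
    }

    void writeFieldStop() {
        writeByte(STOP);
    }

    /** Encodes the hypothetical struct; struct begin/end write nothing in this protocol. */
    static byte[] encodeSample(int count, boolean active) {
        FieldLayoutDemo p = new FieldLayoutDemo();
        p.writeFieldBegin(I32, (short) 1);
        p.writeI32(count);
        p.writeFieldBegin(BOOL, (short) 2);
        p.writeBool(active);
        p.writeFieldStop();
        return p.out.toByteArray();
    }

    public static void main(String[] args) {
        for (byte b : encodeSample(42, true)) {
            System.out.printf("%02x ", b);
        }
        System.out.println();
    }
}
```

Field names never appear on the wire; the reader matches fields by id, which is why readFieldBegin in the source returns a TField with an empty name.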

Let's look at one of the core serialization methods.

// A 64-bit value needs 8 bytes
private byte[] i64out = new byte[8];
public void writeI64(long i64) throws TException {
i64out[0] = (byte)(0xff & (i64 >> 56));
i64out[1] = (byte)(0xff & (i64 >> 48));
i64out[2] = (byte)(0xff & (i64 >> 40));
i64out[3] = (byte)(0xff & (i64 >> 32));
i64out[4] = (byte)(0xff & (i64 >> 24));
i64out[5] = (byte)(0xff & (i64 >> 16));
i64out[6] = (byte)(0xff & (i64 >> 8));
i64out[7] = (byte)(0xff & (i64));
trans_.write(i64out, 0, 8);
}
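
  • 0xff has the value:
    00000000 00000000 00000000 00000000 00000000 00000000 00000000 11111111
  • i64 is the number we want to serialize; each group of 8 bits is one byte:
    10000000 01000000 00100000 00010000 00001000 00000100 00000010 00000001
  • ">> 56" shifts that number 56 bits to the right, which moves the top byte, 10000000, into the lowest 8 bits. Java's >> is an arithmetic shift, and the top bit of this example is 1, so the vacated upper bits are filled with 1s.

The shifted value is then ANDed with 0xff:
00000000 00000000 00000000 00000000 00000000 00000000 00000000 11111111
11111111 11111111 11111111 11111111 11111111 11111111 11111111 10000000
The mask clears everything except the lowest 8 bits, leaving the single byte 10000000.

In other words, the code takes the most significant byte of i64 and stores it in the first slot of i64out.
Repeating this for each remaining byte turns the number
10000000 01000000 00100000 00010000 00001000 00000100 00000010 00000001
into a byte array in big-endian order, that is: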

  • i64out[0] = 10000000
  • i64out[1] = 01000000
  • i64out[2] = 00100000
  • i64out[3] = 00010000
  • i64out[4] = 00001000
  • i64out[5] = 00000100
  • i64out[6] = 00000010
  • i64out[7] = 00000001
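
The byte-by-byte result above can be checked with a short, Thrift-free sketch. BigEndianI64Demo, toBigEndian, and toBits are hypothetical helper names; the loop is just a compact rewrite of the eight shift-and-mask lines in writeI64. The example value has its top bit set, so value >> 56 is sign-extended to -128 before the 0xff mask reduces it to the single byte 10000000.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

final class BigEndianI64Demo {
    /** Same shift-and-mask approach as writeI64, written as a loop. */
    static byte[] toBigEndian(long value) {
        byte[] out = new byte[8];
        for (int i = 0; i < 8; i++) {
            int shift = 56 - 8 * i;
            out[i] = (byte) (0xff & (value >> shift));
        }
        return out;
    }

    /** Renders one byte as eight binary digits, padded with leading zeros. */
    static String toBits(byte b) {
        String bits = Integer.toBinaryString(b & 0xff);
        return String.format("%8s", bits).replace(' ', '0');
    }

    public static void main(String[] args) {
        // Bit pattern 10000000 01000000 00100000 ... 00000001 from the text.
        long value = 0x8040201008040201L;
        byte[] manual = toBigEndian(value);
        for (int i = 0; i < manual.length; i++) {
            System.out.println("i64out[" + i + "] = " + toBits(manual[i]));
        }
        // ByteBuffer is big-endian by default, so it should agree with the manual version.
        byte[] viaBuffer = ByteBuffer.allocate(8).putLong(value).array();
        System.out.println("matches ByteBuffer: " + Arrays.equals(manual, viaBuffer));
        System.out.println("value >> 56 before masking: " + (value >> 56));
    }
}
```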

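Reading Data

Reading data is essentially the inverse of writing: the bytes must be read back in exactly the order and format in which they were written. See the source above for the concrete methods; they are not repeated here.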
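
A small round trip makes the symmetry concrete. This is a sketch under assumptions, not Thrift code: ReadI32Demo and its methods are hypothetical, and the deliberately broken readI32Unmasked exists only to show why the read methods in the source mask every byte with 0xff (Java bytes are signed, so an unmasked byte of 0xC8 would be sign-extended).

```java
final class ReadI32Demo {
    /** Big-endian encoding, most significant byte first. */
    static byte[] writeI32(int v) {
        return new byte[] {(byte) (v >> 24), (byte) (v >> 16), (byte) (v >> 8), (byte) v};
    }

    /** Inverse of writeI32, following the same pattern as readI32 in the source. */
    static int readI32(byte[] buf, int off) {
        return ((buf[off] & 0xff) << 24)
            | ((buf[off + 1] & 0xff) << 16)
            | ((buf[off + 2] & 0xff) << 8)
            | (buf[off + 3] & 0xff);
    }

    /** Deliberately wrong: without the mask, negative bytes sign-extend and corrupt the result. */
    static int readI32Unmasked(byte[] buf, int off) {
        return (buf[off] << 24) | (buf[off + 1] << 16) | (buf[off + 2] << 8) | buf[off + 3];
    }

    public static void main(String[] args) {
        int original = 200; // lowest byte is 0xC8, which is negative as a Java byte
        byte[] wire = writeI32(original);
        System.out.println("masked read:   " + readI32(wire, 0));
        System.out.println("unmasked read: " + readI32Unmasked(wire, 0));
    }
}
```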
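One concept needs introducing here: the Transport.
As a specialized form of network programming, RPC wraps a transport layer around the underlying network communication. Thrift uses Transport for this layer, but a Transport is more than raw network I/O: it is also an abstraction over the higher-level streams.
As we have seen, however TProtocol processes the data, it always hands the resulting bytes to a Transport for transmission.
How the Transport actually moves those bytes is outside the scope of this chapter; see the next chapter.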
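
The division of labor can be sketched in a few lines. Everything here is hypothetical and far simpler than Thrift's TTransport: MiniTransport is a one-method byte sink, MemoryTransport stands in for a socket, CountingTransport shows one transport wrapping another, and MiniWriter plays the protocol's role of turning a value into bytes and handing them over.

```java
import java.io.ByteArrayOutputStream;

final class TransportDemo {
    public static void main(String[] args) {
        MemoryTransport memory = new MemoryTransport();
        CountingTransport counting = new CountingTransport(memory);
        new MiniWriter(counting).writeI16((short) 258); // 0x0102
        System.out.println(counting.bytesWritten() + " bytes reached the sink");
    }
}

/** Hypothetical minimal transport: all the protocol needs is somewhere to put bytes. */
interface MiniTransport {
    void write(byte[] buf, int off, int len);
}

/** In-memory stand-in for a real network transport. */
final class MemoryTransport implements MiniTransport {
    private final ByteArrayOutputStream sink = new ByteArrayOutputStream();

    @Override
    public void write(byte[] buf, int off, int len) {
        sink.write(buf, off, len);
    }

    byte[] written() {
        return sink.toByteArray();
    }
}

/** A transport layered on another transport, here only counting the bytes that pass through. */
final class CountingTransport implements MiniTransport {
    private final MiniTransport inner;
    private int count;

    CountingTransport(MiniTransport inner) {
        this.inner = inner;
    }

    @Override
    public void write(byte[] buf, int off, int len) {
        count += len;
        inner.write(buf, off, len);
    }

    int bytesWritten() {
        return count;
    }
}

/** Protocol side: encodes the value, then delegates the actual output to the transport. */
final class MiniWriter {
    private final MiniTransport trans;
    private final byte[] i16out = new byte[2];

    MiniWriter(MiniTransport trans) {
        this.trans = trans;
    }

    void writeI16(short v) {
        i16out[0] = (byte) (0xff & (v >> 8));
        i16out[1] = (byte) (0xff & v);
        trans.write(i16out, 0, 2);
    }
}
```

MiniWriter does not know whether its bytes end up in memory, on a socket, or inside another wrapper, which is the separation the paragraph above describes.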

Factory

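This is the factory that ships with the protocol. It is meant to be exposed for use by the Server and Client, which is why nothing in this class references it.
For details, see Thrift Framework Explained (Part 1).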
