Thrift Framework Explained (Part 4)

Overview

This part discusses Thrift's encoders and decoders (codecs).

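Core Components

Overall Architecture

Thrift's core components fall into the following groups:

IDL service description: provides cross-platform and cross-language support by generating Server-side and Client-side code for each target language
TServer and Client: the server-side and client-side implementations
TProtocol: the protocol and encoding/decoding component
TTransport: the transport component
TProcessor: the service invocation component, which dispatches calls to the service implementation

Protocols and encoding/decoding are among the core concerns of any network application. The client and server exchange messages (data) according to an agreed protocol, encode and decode the byte stream in a specific format, and turn it into business messages that the upper layers of the framework can consume.

This part focuses on the TProtocol component.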

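Encoding and Decoding

Protocols

Thrift lets users pick the transmission protocol, which must match on the server and the client. Broadly, there are two kinds:
binary and text.

A binary protocol is the better choice when bandwidth matters; a text protocol is easier to capture and debug. Choose whichever fits the project's requirements.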
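
To make the bandwidth argument concrete, here is a minimal sketch that encodes the same two values once as fixed-width binary and once as JSON-like text. It is not Thrift's real wire format (no field headers are written), and the class name WireSizeSketch and the field names id and ts are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: raw values only, not the actual Thrift wire format.
class WireSizeSketch {
    // Fixed-width binary: 4 bytes for the int, 8 bytes for the long.
    static byte[] encodeBinary(int id, long ts) {
        return ByteBuffer.allocate(12).putInt(id).putLong(ts).array();
    }

    // Human-readable text: field names and every digit are spelled out as characters.
    static byte[] encodeText(int id, long ts) {
        String json = "{\"id\":" + id + ",\"ts\":" + ts + "}";
        return json.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        int id = 1234567890;
        long ts = 1700000000000L;
        System.out.println("binary bytes: " + encodeBinary(id, ts).length);
        System.out.println("text bytes:   " + encodeText(id, ts).length);
        System.out.println(new String(encodeText(id, ts), StandardCharsets.UTF_8));
    }
}
```

For these values the binary form takes 12 bytes and the text form 36, but the text form can be read directly in a packet capture. Very small numbers can be shorter as text, so the saving depends on the data.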

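Class Diagram

TCompactProtocol: a compact, efficient binary protocol
TBinaryProtocol: a plain binary protocol, used the same way as TCompactProtocol
TJSONProtocol: encodes data as JSON
TSimpleJSONProtocol: encodes data as simplified JSON; it is write-only, intended for output that scripting languages can parse

All of these protocols extend the abstract class TProtocol.

TProtocol Methods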

public abstract class TProtocol {
  // ...
  protected TTransport trans_;
  // ...
  // Write methods
  public abstract void writeMessageBegin(TMessage message) throws TException;
  public abstract void writeMessageEnd() throws TException;
  public abstract void writeStructBegin(TStruct struct) throws TException;
  public abstract void writeStructEnd() throws TException;
  public abstract void writeFieldBegin(TField field) throws TException;
  public abstract void writeFieldEnd() throws TException;
  public abstract void writeFieldStop() throws TException;
  public abstract void writeMapBegin(TMap map) throws TException;
  public abstract void writeMapEnd() throws TException;
  public abstract void writeListBegin(TList list) throws TException;
  public abstract void writeListEnd() throws TException;
  public abstract void writeSetBegin(TSet set) throws TException;
  public abstract void writeSetEnd() throws TException;
  public abstract void writeBool(boolean b) throws TException;
  public abstract void writeByte(byte b) throws TException;
  public abstract void writeI16(short i16) throws TException;
  public abstract void writeI32(int i32) throws TException;
  public abstract void writeI64(long i64) throws TException;
  public abstract void writeDouble(double dub) throws TException;
  public abstract void writeString(String str) throws TException;
  public abstract void writeBinary(ByteBuffer buf) throws TException;
  // Read methods
  public abstract TMessage readMessageBegin() throws TException;
  public abstract void readMessageEnd() throws TException;
  public abstract TStruct readStructBegin() throws TException;
  public abstract void readStructEnd() throws TException;
  public abstract TField readFieldBegin() throws TException;
  public abstract void readFieldEnd() throws TException;
  public abstract TMap readMapBegin() throws TException;
  public abstract void readMapEnd() throws TException;
  public abstract TList readListBegin() throws TException;
  public abstract void readListEnd() throws TException;
  public abstract TSet readSetBegin() throws TException;
  public abstract void readSetEnd() throws TException;
  public abstract boolean readBool() throws TException;
  public abstract byte readByte() throws TException;
  public abstract short readI16() throws TException;
  public abstract int readI32() throws TException;
  public abstract long readI64() throws TException;
  public abstract double readDouble() throws TException;
  public abstract String readString() throws TException;
  public abstract ByteBuffer readBinary() throws TException;
}

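As shown above, TProtocol declares a set of abstract methods that act as the serialization entry points for every kind of value Thrift sends. Each subclass implements them according to its own characteristics, so the same data types can be serialized with different protocols.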
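
The idea can be sketched in miniature. None of the types below are Thrift classes: MiniProtocol, MiniBinaryProtocol, MiniTextProtocol and ProtocolSketch are hypothetical names, and a ByteArrayOutputStream stands in for TTransport. The point is only that the caller codes against the abstract type and the subclass decides the bytes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical miniature of the TProtocol idea; not a Thrift class.
abstract class MiniProtocol {
    protected final ByteArrayOutputStream out; // stands in for TTransport

    MiniProtocol(ByteArrayOutputStream out) { this.out = out; }

    abstract void writeI32(int value);
}

// Writes the int as 4 big-endian bytes.
class MiniBinaryProtocol extends MiniProtocol {
    MiniBinaryProtocol(ByteArrayOutputStream out) { super(out); }

    @Override
    void writeI32(int value) {
        out.write(value >>> 24); // write(int) keeps only the low 8 bits
        out.write(value >>> 16);
        out.write(value >>> 8);
        out.write(value);
    }
}

// Writes the int as decimal text.
class MiniTextProtocol extends MiniProtocol {
    MiniTextProtocol(ByteArrayOutputStream out) { super(out); }

    @Override
    void writeI32(int value) {
        byte[] digits = Integer.toString(value).getBytes(StandardCharsets.US_ASCII);
        out.write(digits, 0, digits.length);
    }
}

class ProtocolSketch {
    // The caller depends only on the abstract type.
    static byte[] serialize(boolean binary, int value) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        MiniProtocol proto = binary ? new MiniBinaryProtocol(sink) : new MiniTextProtocol(sink);
        proto.writeI32(value);
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(serialize(true, 258)));
        System.out.println(new String(serialize(false, 258), StandardCharsets.US_ASCII));
    }
}
```

With the value 258, the binary variant produces the four bytes 0, 0, 1, 2 and the text variant produces the characters "258".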

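Let's take the most basic one, the binary protocol, and look at exactly how it serializes data.

TBinaryProtocol (Binary)

Binary serialization is a basic capability that every serialization framework should provide. Everything ultimately travels over the network as bytes, but this protocol lets us decide the exact byte layout ourselves, which avoids wasting bandwidth.
Let's look at the source code first.

Source Code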

public class TBinaryProtocol extends TProtocol {
  private static final TStruct ANONYMOUS_STRUCT = new TStruct();

  protected static final int VERSION_MASK = 0xffff0000;     // -65536 as a signed int
  protected static final int VERSION_1 = 0x80010000;       // -2147418112 as a signed int

  protected boolean strictRead_ = false;
  protected boolean strictWrite_ = true;

  protected int readLength_;
  protected boolean checkReadLength_ = false;

  /**
   * Protocol factory
   */
  public static class Factory implements TProtocolFactory {
    protected boolean strictRead_ = false;
    protected boolean strictWrite_ = true;
    protected int readLength_;

    public Factory() {
      this(false, true);
    }

    public Factory(boolean strictRead, boolean strictWrite) {
      this(strictRead, strictWrite, 0);
    }

    public Factory(boolean strictRead, boolean strictWrite, int readLength) {
      strictRead_ = strictRead;
      strictWrite_ = strictWrite;
      readLength_ = readLength;
    }

    public TProtocol getProtocol(TTransport trans) {
      TBinaryProtocol proto = new TBinaryProtocol(trans, strictRead_, strictWrite_);
      if (readLength_ != 0) {
        proto.setReadLength(readLength_);
      }
      return proto;
    }
  }
// factory  ... ... ... ...

  public TBinaryProtocol(TTransport trans) {
    this(trans, false, true);
  }

  public TBinaryProtocol(TTransport trans, boolean strictRead, boolean strictWrite) {
    super(trans);
    strictRead_ = strictRead;
    strictWrite_ = strictWrite;
  }
  // Write the message header
  public void writeMessageBegin(TMessage message) throws TException {
    if (strictWrite_) {
      int version = VERSION_1 | message.type;
      writeI32(version);
      writeString(message.name);
      writeI32(message.seqid);
    } else {
      writeString(message.name);
      writeByte(message.type);
      writeI32(message.seqid);
    }
  }

  public void writeMessageEnd() {}
  
  public void writeStructBegin(TStruct struct) {}

  public void writeStructEnd() {}
  
  // Write a field header
  public void writeFieldBegin(TField field) throws TException {
    writeByte(field.type);
    writeI16(field.id);
  }

  public void writeFieldEnd() {}
  
  // Field stop marker
  public void writeFieldStop() throws TException {
    writeByte(TType.STOP);
  }
  // Write a map header
  public void writeMapBegin(TMap map) throws TException {
    writeByte(map.keyType);
    writeByte(map.valueType);
    writeI32(map.size);
  }

  public void writeMapEnd() {}

  public void writeListBegin(TList list) throws TException {
    writeByte(list.elemType);
    writeI32(list.size);
  }

  public void writeListEnd() {}

  public void writeSetBegin(TSet set) throws TException {
    writeByte(set.elemType);
    writeI32(set.size);
  }

  public void writeSetEnd() {}

  public void writeBool(boolean b) throws TException {
    writeByte(b ? (byte)1 : (byte)0);
  }

  private byte [] bout = new byte[1];
  // Write a single byte
  public void writeByte(byte b) throws TException {
    bout[0] = b;
    trans_.write(bout, 0, 1);
  }

  private byte[] i16out = new byte[2];
  public void writeI16(short i16) throws TException {
    i16out[0] = (byte)(0xff & (i16 >> 8));
    i16out[1] = (byte)(0xff & (i16));
    trans_.write(i16out, 0, 2);
  }

  private byte[] i32out = new byte[4];
  public void writeI32(int i32) throws TException {
    i32out[0] = (byte)(0xff & (i32 >> 24));
    i32out[1] = (byte)(0xff & (i32 >> 16));
    i32out[2] = (byte)(0xff & (i32 >> 8));
    i32out[3] = (byte)(0xff & (i32));
    trans_.write(i32out, 0, 4);
  }

  private byte[] i64out = new byte[8];
  public void writeI64(long i64) throws TException {
    i64out[0] = (byte)(0xff & (i64 >> 56));
    i64out[1] = (byte)(0xff & (i64 >> 48));
    i64out[2] = (byte)(0xff & (i64 >> 40));
    i64out[3] = (byte)(0xff & (i64 >> 32));
    i64out[4] = (byte)(0xff & (i64 >> 24));
    i64out[5] = (byte)(0xff & (i64 >> 16));
    i64out[6] = (byte)(0xff & (i64 >> 8));
    i64out[7] = (byte)(0xff & (i64));
    trans_.write(i64out, 0, 8);
  }

  public void writeDouble(double dub) throws TException {
    writeI64(Double.doubleToLongBits(dub));
  }

  public void writeString(String str) throws TException {
    try {
      byte[] dat = str.getBytes("UTF-8");
      writeI32(dat.length);
      trans_.write(dat, 0, dat.length);
    } catch (UnsupportedEncodingException uex) {
      throw new TException("JVM DOES NOT SUPPORT UTF-8");
    }
  }

  public void writeBinary(ByteBuffer bin) throws TException {
    int length = bin.limit() - bin.position();
    writeI32(length);
    trans_.write(bin.array(), bin.position() + bin.arrayOffset(), length);
  }

  /**
   * Reading methods.
   */

  public TMessage readMessageBegin() throws TException {
    int size = readI32();
    if (size < 0) {
      int version = size & VERSION_MASK;
      if (version != VERSION_1) {
        throw new TProtocolException(TProtocolException.BAD_VERSION, "Bad version in readMessageBegin");
      }
      return new TMessage(readString(), (byte)(size & 0x000000ff), readI32());
    } else {
      if (strictRead_) {
        throw new TProtocolException(TProtocolException.BAD_VERSION, "Missing version in readMessageBegin, old client?");
      }
      return new TMessage(readStringBody(size), readByte(), readI32());
    }
  }

  public void readMessageEnd() {}

  public TStruct readStructBegin() {
    return ANONYMOUS_STRUCT;
  }

  public void readStructEnd() {}

  public TField readFieldBegin() throws TException {
    byte type = readByte();
    short id = type == TType.STOP ? 0 : readI16();
    return new TField("", type, id);
  }

  public void readFieldEnd() {}

  public TMap readMapBegin() throws TException {
    return new TMap(readByte(), readByte(), readI32());
  }

  public void readMapEnd() {}

  public TList readListBegin() throws TException {
    return new TList(readByte(), readI32());
  }

  public void readListEnd() {}

  public TSet readSetBegin() throws TException {
    return new TSet(readByte(), readI32());
  }

  public void readSetEnd() {}

  public boolean readBool() throws TException {
    return (readByte() == 1);
  }

  private byte[] bin = new byte[1];
  public byte readByte() throws TException {
    if (trans_.getBytesRemainingInBuffer() >= 1) {
      byte b = trans_.getBuffer()[trans_.getBufferPosition()];
      trans_.consumeBuffer(1);
      return b;
    }
    readAll(bin, 0, 1);
    return bin[0];
  }

  private byte[] i16rd = new byte[2];
  public short readI16() throws TException {
    byte[] buf = i16rd;
    int off = 0;

    if (trans_.getBytesRemainingInBuffer() >= 2) {
      buf = trans_.getBuffer();
      off = trans_.getBufferPosition();
      trans_.consumeBuffer(2);
    } else {
      readAll(i16rd, 0, 2);
    }

    return
      (short)
      (((buf[off] & 0xff) << 8) |
       ((buf[off+1] & 0xff)));
  }

  private byte[] i32rd = new byte[4];
  public int readI32() throws TException {
    byte[] buf = i32rd;
    int off = 0;

    if (trans_.getBytesRemainingInBuffer() >= 4) {
      buf = trans_.getBuffer();
      off = trans_.getBufferPosition();
      trans_.consumeBuffer(4);
    } else {
      readAll(i32rd, 0, 4);
    }
    return
      ((buf[off] & 0xff) << 24) |
      ((buf[off+1] & 0xff) << 16) |
      ((buf[off+2] & 0xff) <<  8) |
      ((buf[off+3] & 0xff));
  }

  private byte[] i64rd = new byte[8];
  public long readI64() throws TException {
    byte[] buf = i64rd;
    int off = 0;

    if (trans_.getBytesRemainingInBuffer() >= 8) {
      buf = trans_.getBuffer();
      off = trans_.getBufferPosition();
      trans_.consumeBuffer(8);
    } else {
      readAll(i64rd, 0, 8);
    }

    return
      ((long)(buf[off]   & 0xff) << 56) |
      ((long)(buf[off+1] & 0xff) << 48) |
      ((long)(buf[off+2] & 0xff) << 40) |
      ((long)(buf[off+3] & 0xff) << 32) |
      ((long)(buf[off+4] & 0xff) << 24) |
      ((long)(buf[off+5] & 0xff) << 16) |
      ((long)(buf[off+6] & 0xff) <<  8) |
      ((long)(buf[off+7] & 0xff));
  }

  public double readDouble() throws TException {
    return Double.longBitsToDouble(readI64());
  }

  public String readString() throws TException {
    int size = readI32();

    if (trans_.getBytesRemainingInBuffer() >= size) {
      try {
        String s = new String(trans_.getBuffer(), trans_.getBufferPosition(), size, "UTF-8");
        trans_.consumeBuffer(size);
        return s;
      } catch (UnsupportedEncodingException e) {
        throw new TException("JVM DOES NOT SUPPORT UTF-8");
      }
    }

    return readStringBody(size);
  }

  public String readStringBody(int size) throws TException {
    try {
      checkReadLength(size);
      byte[] buf = new byte[size];
      trans_.readAll(buf, 0, size);
      return new String(buf, "UTF-8");
    } catch (UnsupportedEncodingException uex) {
      throw new TException("JVM DOES NOT SUPPORT UTF-8");
    }
  }

  public ByteBuffer readBinary() throws TException {
    int size = readI32();
    checkReadLength(size);

    if (trans_.getBytesRemainingInBuffer() >= size) {
      ByteBuffer bb = ByteBuffer.wrap(trans_.getBuffer(), trans_.getBufferPosition(), size);
      trans_.consumeBuffer(size);
      return bb;
    }

    byte[] buf = new byte[size];
    trans_.readAll(buf, 0, size);
    return ByteBuffer.wrap(buf);
  }

  private int readAll(byte[] buf, int off, int len) throws TException {
    checkReadLength(len);
    return trans_.readAll(buf, off, len);
  }

  public void setReadLength(int readLength) {
    readLength_ = readLength;
    checkReadLength_ = true;
  }

  protected void checkReadLength(int length) throws TException {
    if (length < 0) {
      throw new TException("Negative length: " + length);
    }
    if (checkReadLength_) {
      readLength_ -= length;
      if (readLength_ < 0) {
        throw new TException("Message length exceeded: " + length);
      }
    }
  }
}
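
The strict and non-strict message headers in writeMessageBegin, and the way readMessageBegin tells them apart, can be reproduced with the standard library alone. The sketch below is an illustration, not Thrift code: MessageHeaderSketch, encode and isStrictHeader are hypothetical names, while the two version constants are copied from the source above and the value 1 for a CALL message matches Thrift's TMessageType.CALL.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class MessageHeaderSketch {
    static final int VERSION_MASK = 0xffff0000;
    static final int VERSION_1 = 0x80010000;
    static final byte CALL = 1; // same value as TMessageType.CALL

    // Mirrors writeMessageBegin: strict mode puts the version word first.
    static byte[] encode(String name, byte type, int seqid, boolean strict) {
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        int size = (strict ? 4 : 1) + 4 + nameBytes.length + 4;
        ByteBuffer buf = ByteBuffer.allocate(size); // big-endian by default
        if (strict) {
            buf.putInt(VERSION_1 | type);                 // version in the high 16 bits, type in the low byte
            buf.putInt(nameBytes.length).put(nameBytes);  // length-prefixed UTF-8 name
        } else {
            buf.putInt(nameBytes.length).put(nameBytes);
            buf.put(type);
        }
        buf.putInt(seqid);
        return buf.array();
    }

    // Mirrors the check in readMessageBegin: VERSION_1 has its top bit set,
    // so a versioned header starts with a negative int, while a name length never does.
    static boolean isStrictHeader(byte[] header) {
        int first = ByteBuffer.wrap(header).getInt();
        return first < 0 && (first & VERSION_MASK) == VERSION_1;
    }

    public static void main(String[] args) {
        byte[] strict = encode("ping", CALL, 7, true);
        byte[] legacy = encode("ping", CALL, 7, false);
        System.out.println(strict.length + " bytes, strict=" + isStrictHeader(strict));
        System.out.println(legacy.length + " bytes, strict=" + isStrictHeader(legacy));
    }
}
```

For the name "ping" the strict header is 16 bytes and starts with 80 01 00 01, while the non-strict header is 13 bytes and starts with the name length.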

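Writing Data

Most of the core code in the binary protocol deals with serialization. Thrift handles the following kinds of data:

TStruct
TMessage
TField
Collection types: TMap, TList, TSet
Base types: i16, i32, i64 … binary, string, and so on

The message, struct, field, and collection types each have a begin method and an end method for writing; base types are written with a single call. In TBinaryProtocol, a begin method writes the header for that type, for example a field's type and id or a collection's element type and size, while writeStructBegin and all of the end methods write nothing. Every byte that is written ultimately goes through one of the following basic write methods: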

writeBool
writeByte
writeI16
writeI32
writeI64
writeDouble
writeString
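
As an illustration of how these basic writes compose, the sketch below encodes a struct with an i32 field (id 1) and a string field (id 2), followed by the stop byte. FieldEncodingSketch and encodeUser are hypothetical names and a ByteArrayOutputStream stands in for the transport; the type codes 0, 8 and 11 are the values Thrift's TType assigns to STOP, I32 and STRING.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class FieldEncodingSketch {
    // Type codes as defined by Thrift's TType.
    static final byte STOP = 0, I32 = 8, STRING = 11;

    private final ByteArrayOutputStream out = new ByteArrayOutputStream();

    void writeByte(byte b) { out.write(b); }

    void writeI16(short v) { out.write(v >> 8); out.write(v); }

    void writeI32(int v) {
        out.write(v >> 24); out.write(v >> 16); out.write(v >> 8); out.write(v);
    }

    // Strings are a 4-byte length followed by the UTF-8 bytes.
    void writeString(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeI32(utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // Same layout as TBinaryProtocol.writeFieldBegin: one type byte, then a 2-byte field id.
    void writeFieldBegin(byte type, short id) { writeByte(type); writeI16(id); }

    static byte[] encodeUser(int id, String name) {
        FieldEncodingSketch p = new FieldEncodingSketch();
        p.writeFieldBegin(I32, (short) 1);
        p.writeI32(id);
        p.writeFieldBegin(STRING, (short) 2);
        p.writeString(name);
        p.writeByte(STOP); // what writeFieldStop emits
        return p.out.toByteArray();
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) sb.append(String.format("%02x ", b));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // prints: 08 00 01 00 00 00 2a 0b 00 02 00 00 00 02 68 69 00
        System.out.println(hex(encodeUser(42, "hi")));
    }
}
```

Field names never appear on the wire; only the type code and the numeric id do, which is why readFieldBegin in the source returns a TField with an empty name.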

Let's look at one of the core serialization methods first.

// 64 bits take 8 bytes
private byte[] i64out = new byte[8];
public void writeI64(long i64) throws TException {
   i64out[0] = (byte)(0xff & (i64 >> 56));
   i64out[1] = (byte)(0xff & (i64 >> 48));
   i64out[2] = (byte)(0xff & (i64 >> 40));
   i64out[3] = (byte)(0xff & (i64 >> 32));
   i64out[4] = (byte)(0xff & (i64 >> 24));
   i64out[5] = (byte)(0xff & (i64 >> 16));
   i64out[6] = (byte)(0xff & (i64 >> 8));
   i64out[7] = (byte)(0xff & (i64));
   trans_.write(i64out, 0, 8);
 }

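0xff, widened to 64 bits, is:
00000000 00000000 00000000 00000000 00000000 00000000 00000000 11111111
i64 is the number we want to serialize; each group of 8 bits is one byte:
10000000 01000000 00100000 00010000 00001000 00000100 00000010 00000001
">> 56" shifts this number 56 bits to the right, which moves the highest byte, 10000000, into the lowest position. Because >> is a signed shift and the top bit of this example is 1, the vacated high bits are filled with 1s.

The result is then ANDed with 0xff:
00000000 00000000 00000000 00000000 00000000 00000000 00000000 11111111
11111111 11111111 11111111 11111111 11111111 11111111 11111111 10000000
The AND clears everything except the low 8 bits, leaving the byte 10000000.

In other words, this step extracts the highest byte of i64 and stores it in the first slot of i64out. Repeating it with shifts of 48, 40, and so on down to 0 splits the number
10000000 01000000 00100000 00010000 00001000 00000100 00000010 00000001
into eight bytes, most significant first: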

i64out[0] = 10000000
i64out[1] = 01000000
i64out[2] = 00100000
i64out[3] = 00010000
i64out[4] = 00001000
i64out[5] = 00000100
i64out[6] = 00000010
i64out[7] = 00000001
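
The walkthrough can be checked by running the same shift-and-mask steps on that bit pattern. In this sketch the class name WriteI64Sketch and the helpers toBigEndian and bits are made up for illustration; the loop is equivalent to the eight assignments in writeI64, except that it returns the bytes instead of handing them to a transport.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class WriteI64Sketch {
    // Same shift-and-mask steps as TBinaryProtocol.writeI64.
    static byte[] toBigEndian(long i64) {
        byte[] out = new byte[8];
        for (int i = 0; i < 8; i++) {
            out[i] = (byte) (0xff & (i64 >> (56 - 8 * i)));
        }
        return out;
    }

    // Renders one byte as eight binary digits.
    static String bits(byte b) {
        return String.format("%8s", Integer.toBinaryString(b & 0xff)).replace(' ', '0');
    }

    public static void main(String[] args) {
        long value = 0x8040201008040201L; // the bit pattern used in the walkthrough
        byte[] bytes = toBigEndian(value);
        for (int i = 0; i < bytes.length; i++) {
            System.out.println("i64out[" + i + "] = " + bits(bytes[i]));
        }
        // ByteBuffer is big-endian by default, so it should agree byte for byte.
        System.out.println(Arrays.equals(bytes, ByteBuffer.allocate(8).putLong(value).array()));
    }
}
```

The most significant byte goes out first, which is big-endian (network) byte order.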

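Reading Data

Reading is essentially the inverse of writing: the data must be read back in exactly the order and format in which it was written. See the source code above for the details; they are not repeated here.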
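
A small round trip shows the inverse relationship for i64, and also why readI64 masks every byte with 0xff before shifting. ReadI64Sketch and its methods are hypothetical names; readWithoutMask is deliberately wrong and exists only to show the failure.

```java
import java.nio.ByteBuffer;

class ReadI64Sketch {
    // Big-endian bytes, the same layout writeI64 produces.
    static byte[] write(long v) {
        return ByteBuffer.allocate(8).putLong(v).array();
    }

    // Same reassembly as TBinaryProtocol.readI64: mask each byte, then shift it back into place.
    static long read(byte[] buf, int off) {
        return ((long) (buf[off]     & 0xff) << 56)
             | ((long) (buf[off + 1] & 0xff) << 48)
             | ((long) (buf[off + 2] & 0xff) << 40)
             | ((long) (buf[off + 3] & 0xff) << 32)
             | ((long) (buf[off + 4] & 0xff) << 24)
             | ((long) (buf[off + 5] & 0xff) << 16)
             | ((long) (buf[off + 6] & 0xff) << 8)
             | ((long) (buf[off + 7] & 0xff));
    }

    // Deliberately broken: without the mask, a byte such as 0xff is sign-extended to -1.
    static long readWithoutMask(byte[] buf, int off) {
        long result = 0;
        for (int i = 0; i < 8; i++) {
            result |= ((long) buf[off + i]) << (56 - 8 * i);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(read(write(-2L), 0));             // -2
        System.out.println(read(write(255L), 0));            // 255
        System.out.println(readWithoutMask(write(255L), 0)); // -1, not 255
    }
}
```

Java bytes are signed, so the mask is what keeps a byte with its top bit set from flooding the higher bits of the result with 1s.
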
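One concept needs introducing here: Transport.
As a specialized form of network programming, RPC wraps the underlying network communication in a transport layer. Thrift uses Transport for this, but Transport is more than low-level network I/O: it is also the stream abstraction used by the layers above it.
As we have seen, however TProtocol processes the data, it always hands the bytes to a Transport in the end.
How Transport moves those bytes is out of scope for this part; see the next part.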

Factory

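This is a factory that wraps protocol creation. It exists to be handed to the Server and the Client, which is why nothing else in this class refers to it.
For details, see Thrift Framework Explained (Part 1).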
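
The pattern itself is small enough to sketch without Thrift on the classpath. Everything below is a hypothetical stand-in: Proto plays the role of TProtocol, ProtoFactory of TProtocolFactory, BinaryFactory of TBinaryProtocol.Factory, and a ByteArrayOutputStream of the transport. How a real server calls the factory is an assumption here, not something shown in the source above.

```java
import java.io.ByteArrayOutputStream;

class FactorySketch {
    // Hypothetical stand-in for a protocol: bound to one transport, with its own settings.
    static class Proto {
        final ByteArrayOutputStream transport;
        final boolean strictRead;
        final boolean strictWrite;

        Proto(ByteArrayOutputStream transport, boolean strictRead, boolean strictWrite) {
            this.transport = transport;
            this.strictRead = strictRead;
            this.strictWrite = strictWrite;
        }
    }

    // Hypothetical stand-in for TProtocolFactory.
    interface ProtoFactory {
        Proto getProtocol(ByteArrayOutputStream transport);
    }

    // Settings are captured once; each call builds a fresh protocol around the given transport.
    static class BinaryFactory implements ProtoFactory {
        private final boolean strictRead;
        private final boolean strictWrite;

        BinaryFactory(boolean strictRead, boolean strictWrite) {
            this.strictRead = strictRead;
            this.strictWrite = strictWrite;
        }

        @Override
        public Proto getProtocol(ByteArrayOutputStream transport) {
            return new Proto(transport, strictRead, strictWrite);
        }
    }

    public static void main(String[] args) {
        ProtoFactory factory = new BinaryFactory(false, true); // same defaults as the Factory() constructor above
        Proto first = factory.getProtocol(new ByteArrayOutputStream());
        Proto second = factory.getProtocol(new ByteArrayOutputStream());
        System.out.println(first != second);
        System.out.println(first.strictWrite && second.strictWrite);
    }
}
```

A separate instance per transport matters for TBinaryProtocol because, as the source shows, each instance keeps its own scratch buffers (bout, i16out, i32out, i64out) and its own transport reference.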

