现象
最近做项目的时候遇到一个奇怪的问题,服务器将一个id传给app,app解析后拿到的id和服务器传过去的总是不同,或者大一些或者小一些。比如服务器传递的是190115096930816001,而app拿到的却是190115096930816000。
排查
起初是以为中间的某一个链路传错了数据,但是通过日志可以看出服务端各个链路的数据都是正确的。如果服务器这边的数据都是正确的,那问题应该是出在客户端,是否是客户端在使用的时候对id做了什么操作,客户端同学看了代码后也将这种猜测排除了,此时大家陷入了迷茫。
但是车到山前必有路 ~ 一个同学在将服务器传输的json贴到json工具的时候发现,对应的id被解析成了另一个数据。真相逐渐浮出了水面,问题似乎是出在了json的解析上,为了验证这个结论我们将之前的多个出问题的id分别进行json解析,发现得到的数据基本上都与之前的不同,连我们日常使用的jq都会得到错误的数据。这次的改动是将id的位数提高了3位,也就是从原来的15位id变成了18位id,会不会是json对于大整数的支持有问题呢?看样子我们得从json本身入手了。
原理
想了解json的原理,最权威的莫过于rfc协议了。json对应的rfc协议为https://tools.ietf.org/html/rfc7159#page-6。由于篇幅的原因,我们这里只分析协议中的Numbers,之所以只分析这一部分是因为上面提到的id的类型是long,在json协议中对应的是Numbers。
The representation of numbers is similar to that used in most
programming languages. A number is represented in base 10 using
decimal digits. It contains an integer component that may be
prefixed with an optional minus sign, which may be followed by a
fraction part and/or an exponent part. Leading zeros are not
allowed.
A fraction part is a decimal point followed by one or more digits.
An exponent part begins with the letter E in upper or lower case,
which may be followed by a plus or minus sign. The E and optional
sign are followed by one or more digits.
Numeric values that cannot be represented in the grammar below (such
as Infinity and NaN) are not permitted.
number = [ minus ] int [ frac ] [ exp ]
decimal-point = %x2E ; .
digit1-9 = %x31-39 ; 1-9
e = %x65 / %x45 ; e E
exp = e [ minus / plus ] 1*DIGIT
frac = decimal-point 1*DIGIT
int = zero / ( digit1-9 *DIGIT )
minus = %x2D ; -
plus = %x2B ; +
zero = %x30 ; 0
This specification allows implementations to set limits on the range
and precision of numbers accepted. Since software that implements
IEEE 754-2008 binary64 (double precision) numbers [IEEE754] is
generally available and widely used, good interoperability can be
achieved by implementations that expect no more precision or range
than these provide, in the sense that implementations will
approximate JSON numbers within the expected precision. A JSON
number such as 1E400 or 3.141592653589793238462643383279 may indicate
potential interoperability problems, since it suggests that the
software that created it expects receiving software to have greater
capabilities for numeric magnitude and precision than is widely
available.
Note that when such software is used, numbers that are integers and
are in the range [-(2**53)+1, (2**53)-1] are interoperable in the
sense that implementations will agree exactly on their numeric
values.
根据上面的协议可以看到,json中的number的定义可以使用科学记数法来表示。同时协议指出,由于IEEE 754-2008(双精度浮点数)是比较通用的,所以建议json具体的实现参照该协议。双精度浮点数的定义如下:
将双精度浮点数转换为普通数值的公式如下:
其中0x3ff=1023。
参照https://blog.penjee.com/binary-numbers-floating-point-conversion/ 这篇文章,我们尝试将文章最开始提到的出问题的id=190115096930816001,转化为双精度浮点数,
-
十进制到二进制
190115096930816001 ->
1101001100010001111101011101111011110000000001010000000000001 -
二进制转换
1101001100010001111101011101111011110000000001010000000000001 -> 1.101001100010001111101011101111011110000000001010000000000001 * 2^60 -
求exponent
exponent = 1023 + 60 = 1083 = 10000111011 -
求Mantissa
由于Mantissa在双精度浮点数中只有52位,所以得出Mantissa = 1010011000100011111010111011110111100000000010100000
所得到的双精度浮点数为, 0 + 10000111011 + 1010011000100011111010111011110111100000000010100000 = 0100001110111010011000100011111010111011110111100000000010100000
将上面的双精度浮点数根据公式转变为普通数据的时候可以得到190115096930816000,数据的丢失发生在第四步。
到这里,json大整数丢失精度的问题基本上已经定位了,下面一节我们看一下java常用的一些json库是否有类似的问题。
java常用json库测评
String jsonS = "{\"id\":1901150969308160001}";
// fastjson
JSONObject object = JSONObject.parseObject(jsonS);
System.out.printf("%-10s%s\n", "fastjson:", object.get("id"));
// jackson
ObjectMapper mapper = new ObjectMapper();
JsonNode node = mapper.readTree(jsonS);
System.out.printf("%-10s%s\n", "jackson:", node.get("id"));
// gson
Gson gson = new Gson();
Map json = gson.fromJson(jsonS, Map.class);
System.out.printf("%-10s%s\n", "gson:", json.get("id"));
运行上面的代码可以看到如下的输出,
fastjson: 1901150969308160001
jackson: 1901150969308160001
gson: 1.90115096930816E18
可以看到fastjson和jackson可以正常的解析大整数,而gson则会出现丢精度的情况。
可见并不是所有的json框架都能够正常处理大整数,所以在选用json框架的时候要谨慎。