ACM中常常遇到卡精度的问题，卡精度可能因为取整、比较相等、高精度等多种原因。
这里通过一个例子试图探讨两个编译器的浮点数运算实现机制以及类型转换的机制，以及使用不同编译器和使用不同指令集在取整上的卡精度的问题。

问题描述

江科大的nomasp同学在刷Leetcode(171:Excel Colmun Number)的时候发现了一个问题。他的代码简化后如下：

#include <iostream>
#include <cmath>
using namespace std;

int solve(string s) {
    int n = 0, len = s.length();
    for(int i = 0; i < len; ++ i) {
        int c = s[i] - 'A' + 1;
        n += c * pow(26, len - 1 - i);
        cout<<"c = " << c << endl;
        cout<<"pow = " << pow(26, len - 1 - i) << endl;
        cout<<"n = " << n << endl;
        cout<<endl;
    }
    return n;
}

int main() {
	cout<<solve("AAB");
	return 0;
}

使用Mingw32 GCC/G++(4.7.2)编译(g++.exe cpp_path -o exepath -std=c++11 -g3 -static-libstdc++ -static-libgcc -g3)后，输出如下：

使用MSVC140(VS2015)编译后，输出如下：

稍有常识的人都会看出，GCC的结果是错误的。

问题分析

有同学认为pow函数是个浮点函数，因此应当转成int之后再相加减，但实际试验之后发现这个问题仍然存在，结果是：

经过分析，如果手动实现pow函数则GCC的结果是对的。因此定位到调用pow函数这边出了问题。因此取一下测试代码分析。

#include <cstdio> 
#include <cmath>
using namespace std;
int main() {
	float f = pow(26, 2);
	int i = f;
	int i2 = pow(26, 2);
	printf("%f %d %d\n", f, i, i2); //似乎这个版本的GCC不支持%lf
	return 0;
}

在GCC下编译运行得：

在VS2015下编译运行得：

可以发现对GCC，当使用float变量直接初始化int时发生窄化，不过不影响结果。但是将这个float变量换成pow函数后就会影响结果。谦谦同学指出在GCC520版本下源代码也能正常运行。他进而提出pow函数返回的是double，在转成int的时候需要先转成float再转成int。根据这个意见，我测试了下面的代码，如果说会先转换成float，那么yy的值应当为55，因为从double到float的过程中精度丢失了。

int main(){
	float x = 54.999999999999993;
	int y = (int) x;
	double xx = 54.999999999999993;
	int yy = (int) xx;
	long double xxx = 54.999999999999993;
	int yyy = (int) xxx;
	printf("%d %d %d\n", y, yy, yyy); 
	return 0;
}

不过结果是55 54 54，看来从double到int并不会经过float。不过在他的提示下，我把pow函数改成powf函数，发现返回值正常了。既然如此，我又测试了下面的代码

#include <cstdio> 
#include <cmath>
using namespace std;
int main() {
	double f = pow(26, 2);
	int i = f;
	printf("%f %d\n", f, i);
	return 0;
}

按照上面的理论，这里应该输出675，但结果却是正确的676。因此现在的差别实际上就来自pow函数，我倾向于是一个RVO问题。
为了进一步研究原因，需要比较对pow函数的实现以及类型转换机制。

MSVC对pow的实现

MSVC中我们实际上使用的是
double pow<int, int>(int _Left, int _Right); 这个重载版本。
首先定位到头文件xtgmath.h：

//xtgmath.h
_C_STD_BEGIN
// #define _STD ::std::
// #define _CSTD ::
template<class _Ty1, class _Ty2> inline
	typename _STD enable_if< _STD is_arithmetic<_Ty1>::value && _STD is_arithmetic<_Ty2>::value		
	,typename _STD _Common_float_type<_Ty1, _Ty2>::type>::type
	pow(const _Ty1 _Left, const _Ty2 _Right)
	{	// bring mixed types to a common type
	typedef typename _STD _Common_float_type<_Ty1, _Ty2>::type type;
	return (_CSTD pow(type(_Left), type(_Right)));
	}

可以看到最后调用一个参数是_Common_float_type的pow函数，那这个函数在哪里呢，我们首先来看一下_Common_float_type这个类型。

//xtr1common.h	
typedef integral_constant<bool, false> false_type;
// integral_constant is convenient template for integral constant types

//xtgmath.h	
template<class _Ty>
	struct _Promote_to_float
	{	// promote integral to double
	typedef typename conditional<is_integral<_Ty>::value,
		double, _Ty>::type type;
	};
	
template<class _Ty1, class _Ty2>
	struct _Common_float_type
	{	// find type for two-argument math function
	typedef typename _Promote_to_float<_Ty1>::type _Ty1f;
	typedef typename _Promote_to_float<_Ty2>::type _Ty2f;
	typedef typename conditional<is_same<_Ty1f, long double>::value || is_same<_Ty2f, long double>::value, long double,
		typename conditional<is_same<_Ty1f, double>::value || is_same<_Ty2f, double>::value, double,
			float>::type>::type type;
	};

这边说明一下这段代码：
std::is_integral用来判断一个类型是否是整数
std::conditional有三个参数，其作用相当于_Test? _Ty1: _Ty2，其中_Ty1和_Ty2都是类型，而std::enable_if相对std::conditional省略了假的情况。std::is_same用来判断_Ty1和_Ty2是否相同类型。这个是模板编程里面常用到的元函数。
_Promote_to_float用来将integral（整数）类型扩成double类型，如果是浮点数则不变。说到扩展，也挺有意思的，例如char在被爆之后是直接变成int，还有符号扩展和0扩展啥的，可以查看CSAPP P49页。
_Common_float_type意思就是

if (_Ty1 is long double || _Ty2 is long double)
	typedef type long double
else if(_Ty1 is double || _Ty2 is double)
	typedef type double
else
	typedef type float

用来给浮点数之间运算结果选择适合的精度

下面我们开启调试，跟踪pow函数执行。在调试中，在断点_CSTD pow(type(_Left), type(_Right))处发现type(_Left)和type(_Right)都已经通过_Common_float_type变成了double，继续跟踪发现直接调用了math.h中的pow。
特别地，对于c mode，pow实际直接调用了pow(double, double)。
此外还注意到<cmath>中有一个的_Check_return_ inline double pow(_In_ double _Xx, _In_ int _Yx)函数，而这个函数实际上调用了<cmath>中的_Pow_int函数，该函数如下：

template<class _Ty>
	_Check_return_ inline _Ty _Pow_int(_Ty _Xx, int _Yx) _NOEXCEPT
	{
	unsigned int _Nx;
	if (_Yx >= 0)
		_Nx = static_cast<unsigned int>(_Yx);
	else
		_Nx = static_cast<unsigned int>(-_Yx);

	for (_Ty _Zx = static_cast<_Ty>(1); ; _Xx *= _Xx)
		{
		if ((_Nx & 1) != 0)
			_Zx *= _Xx;
		if ((_Nx >>= 1) == 0)
			return (_Yx < 0 ? static_cast<_Ty>(1) / _Zx : _Zx);
		}
	}

这段代码主要就是处理指数为int的情况，其使用的是类似于快速幂的方法得到的结果。不过经过测试，这段代码始终没有被调用的情况，为什么标准库不使用这个重载版本，我想附录里面的一段答案应该能够给我们启发。
对于以上代码，我想最重要的就是_Common_float_type，~~它避免了GCC472版本的类型二次转换的错误，~~直接调用对应类型的重载版本。

MSVC对类型转换的实现

为了分析_pow函数，需要先了解一下MSVC对于类型转换的实现方案

double -> int

;	int i = pow(26, 2);
00E7277E  push        2  
00E72780  push        1Ah  
00E72782  call        pow<int,int> (0E71064h)  
00E72787  add         esp,8  
00E7278A  call        __ftol2_sse (0E71195h)  
00E7278F  mov         dword ptr [i],eax  
;	int i = pow(26.0, 2.0);
00FE277E  sub         esp,8  
00FE2781  movsd       xmm0,mmword ptr ds:[0FEAB98h]  
00FE2789  movsd       mmword ptr [esp],xmm0  
00FE278E  sub         esp,8  
00FE2791  movsd       xmm0,mmword ptr ds:[0FEABD0h]  
00FE2799  movsd       mmword ptr [esp],xmm0  
00FE279E  call        _pow (0FE11FEh)  
00FE27A3  add         esp,10h  
00FE27A6  call        __ftol2_sse (0FE1195h)  
00FE27AB  mov         dword ptr [i],eax  
;	double s = 26.0;
00D8277E  movsd       xmm0,mmword ptr ds:[0D8AB98h]  
00D82786  movsd       mmword ptr [s],xmm0  
;	int i = s;
00D8278B  cvttsd2si   eax,mmword ptr [s]  
00D82790  mov         dword ptr [i],eax

这里我们发现如果pow参数都为int则是调用pow<int, int>，而这个pow<int, int>是通过xtgmath.h中的pow函数实现的：

_C_STD_BEGIN
;template<class _Ty1,
;	class _Ty2> inline
;	typename _STD enable_if< _STD is_arithmetic<_Ty1>::value
;		&& _STD is_arithmetic<_Ty2>::value,
;		typename _STD _Common_float_type<_Ty1, _Ty2>::type>::type
;	pow(const _Ty1 _Left, const _Ty2 _Right)
;	{	// bring mixed types to a common type
00FE2B12  mov         ecx,30h  
00FE2B17  mov         eax,0CCCCCCCCh  
00FE2B1C  rep stos    dword ptr es:[edi]  
;	typedef typename _STD _Common_float_type<_Ty1, _Ty2>::type type;
;	return (_CSTD pow(type(_Left), type(_Right)));
00FE2B1E  cvtsi2sd    xmm0,dword ptr [_Right]  
00FE2B23  sub         esp,8  
00FE2B26  movsd       mmword ptr [esp],xmm0  
00FE2B2B  cvtsi2sd    xmm0,dword ptr [_Left]  
00FE2B30  sub         esp,8  
00FE2B33  movsd       mmword ptr [esp],xmm0  
00FE2B38  call        _pow (0FE11FEh)  
00FE2B3D  add         esp,10h  
;	}

cvtsi2sd来自SSE2，负责取出最低位的64位整型，并将其转换为一个浮点值，存放到xmm0浮点寄存器中。
mmword负责将xmm0内的浮点移到[esp]
所以pow<int, int>实现也是先转换成浮点再调用_pow
对于__ftol2_sse函数底层是调用cvtsi2sd函数的，相比之下由于double是有符号而且表示范围要大于int，因此需要额外加一些处理。所以实际上通过调用cvttsd2si函数进行了类型转换。

float -> int

;	int i = powf(26, 2);
00C5277E  push        ecx  
00C5277F  movss       xmm0,dword ptr ds:[0C5AB48h]  
00C52787  movss       dword ptr [esp],xmm0  
00C5278C  push        ecx  
00C5278D  movss       xmm0,dword ptr ds:[0C5AB4Ch]  
00C52795  movss       dword ptr [esp],xmm0  
00C5279A  call        _powf (0C51523h)  
00C5279F  add         esp,8  
00C527A2  call        __ftol2_sse (0C51195h)  
00C527A7  mov         dword ptr [i],eax  
;	float s = 26.0;
00D5277E  movss       xmm0,dword ptr ds:[0D5AB48h]  
00D52786  movss       dword ptr [s],xmm0  
;	int i = s;
00D5278B  cvttss2si   eax,dword ptr [s]  
00D52790  mov         dword ptr [i],eax

MSVC对_pow函数的实现

下面分析_pow函数
首先win7并没有装MSVC 140的库，编译成静态IDA导入不了PDB，都是天书。于是再在win10下面使用ollydbg来调试，然而并找不到_main，只看到_cinit函数包含的_initterm，只好靠调用关系和参数传递硬找。

过程	图片	解释
进入main函数
main函数
pow函数
pow函数		跳转使用SSE指令集计算pow
pow函数返回
pow函数返回后准备调用类型转换函数
使用cvttsd2si的类型转换函数		第一行的cmp指令由于不等于0，所以使用cvttsd2si而不是fistp，注意和后面gcc使用fistp进行比较
类型转换的结果		od_win10_vs_after_main_exit
退出main函数		可以看到exit和_cexit函数

GCC对pow的实现

GCC版本之间差别比较大，我们这里还以GCC 472(Mingw32)分析。并简化了函数

#include <cstdio> 
#include <cmath>
using namespace std;
int main() {
	int i2 = pow(26, 2);
	printf("%d\n", i2);
	return 0;
}

跟踪pow函数，发现实际调用了/MinGW32/lib/gcc/mingw32/5.7.2/include/c++/cmath中的函数

...
  template<typename _Tp, typename _Up>
    inline _GLIBCXX_CONSTEXPR
    typename __gnu_cxx::__promote_2<_Tp, _Up>::__type
    pow(_Tp __x, _Up __y)
    {
      typedef typename __gnu_cxx::__promote_2<_Tp, _Up>::__type __type;
      return pow(__type(__x), __type(__y));
    }
...

的重载std::pow<int, int>(__x = 26, __y = 2)
汇编代码如下

...
   0x00401d20 <+0>:	push   %ebp
   0x00401d21 <+1>:	mov    %esp,%ebp
   0x00401d23 <+3>:	sub    $0x18,%esp
   0x00401d26 <+6>:	fildl  0xc(%ebp)
   0x00401d29 <+9>:	fildl  0x8(%ebp)
   0x00401d2c <+12>:	fxch   %st(1)
   0x00401d2e <+14>:	fstpl  0x8(%esp)
   0x00401d32 <+18>:	fstpl  (%esp)
=> 0x00401d35 <+21>:	call   0x401c88 <pow>
   0x00401d3a <+26>:	leave  
   0x00401d3b <+27>:	ret   
...

其中fildl表示往st（浮点数操作堆栈）栈顶放入一个长整数，fstpl是取出一个长整型数
然后调用了crt中的pow函数的汇编代码

   0x74cd34b0 <+0>:	cmpl   $0x0,0x74cf6d84
=> 0x74cd34b7 <+7>:	je     0x74cd3544 <msvcrt!_CrtDbgReportWV+84>
   0x74cd34bd <+13>:	sub    $0x8,%esp
   0x74cd34c0 <+16>:	stmxcsr 0x4(%esp)
   0x74cd34c5 <+21>:	mov    0x4(%esp),%eax
   0x74cd34c9 <+25>:	and    $0x1f80,%eax
   0x74cd34ce <+30>:	cmp    $0x1f80,%eax
   0x74cd34d3 <+35>:	jne    0x74cd34e4 <pow+52>
   0x74cd34d5 <+37>:	fnstcw (%esp)
   0x74cd34d8 <+40>:	mov    (%esp),%ax
   0x74cd34dc <+44>:	and    $0x7f,%ax
   0x74cd34e0 <+48>:	cmp    $0x7f,%ax
   0x74cd34e4 <+52>:	lea    0x8(%esp),%esp
   0x74cd34e8 <+56>:	jne    0x74cd3544 <msvcrt!_CrtDbgReportWV+84>
   0x74cd34ea <+58>:	jmp    0x74ce0249 <msvcrt!modf+9193>
   0x74cd34ef <+63>:	nop
...

其中stmxcsr将MXCSR存储到32位寄存器，modf将数分解为整数部分和小数部分。
在0x74cd34b7 <+7>处程序进入msvcrt!_CrtDbgReportWV+84。然后在其中某一个modf方法中卡死。于是很奇怪为什么要用到msvcrt!_CrtDbgReportWV这个函数
后来我试图用IDA来调试，不过win10 把我的安装程序杀掉了，所以我试图在win7里面用IDA调试，首先装完DevCPP之后进行调试，发现了不一样的光景。
crt中pow函数的代码变成了这样！

=> 0x7613608f <+0>:	cmpl   $0x0,0x761b50c0
   0x76136096 <+7>:	je     0x7613609d <pow+14>
   0x76136098 <+9>:	jmp    0x76140eb8 <msvcrt!_CIpow+271>
   0x7613609d <+14>:	lea    0xc(%esp),%edx
   0x761360a1 <+18>:	call   0x76120c6f <msvcrt!_clearfp+230>
   0x761360a6 <+23>:	jmp    0x76120c85 <msvcrt!_clearfp+252>
   0x761360ab <+28>:	movl   $0x8,-0x218(%ebp)
   0x761360b5 <+38>:	jmp    0x7611cf67 <wtoi+1860>
   0x761360ba <+43>:	cmp    %edi,%esi
   0x761360bc <+45>:	jne    0x7611c269 <wcsncpy_s+30>
   0x761360c2 <+51>:	jmp    0x7615dbb3 <msvcrt!_ftol2_sse_excpt+114168>
   0x761360c7 <+56>:	xor    %eax,%eax
   0x761360c9 <+58>:	mov    %ax,(%esi)
   0x761360cc <+61>:	jmp    0x7611c2b8 <wcsncpy_s+109>

反汇编居然不一样了！那到底哪个结果是对的呢？
用IDA调试发现GCC编译结果连main函数都不知道在哪里，到处都是sub_xxxx的无名函数，于是使用OD进行调试，至少能看清楚先后调用关系

下面是调试过程

过程	图片	解释
进入main函数		前面要先经过两个jmp
进入pow<int, int>函数
进入CRT中的pow函数的跳转		这里是一个jmp跳转
进入CRT中真正的pow函数		可以和WIN10和WIN7下的DEVCPP的反汇编进行对比
CRT中pow函数返回		注意右边寄存器表中的ST0的值是正确的`26**2==676.00`
`printf`前的结果		注意到eax实际上存放着printf的参数`int i2`，但是变成了整数`0x02A3==675`而不是676

所以应该是在浮点数变成整数这块出了问题
先解释几个汇编指令

汇编	解释
fstcw	存储FPU控制字到一个内存区域
fldcw	逆运算
fistp	存储ST(0)到整数并弹出寄存器堆栈

下面来看main函数中是如何将FPU中的运算结果拿到eax中并提供给printf输出的

过程	图片	解释
从pow<int, int>函数跳出		发现此时运算结果还在ST(0)中，并且fstcw指令从[esp+1e]处存取浮点控制字，fisttp指令将ST(0)存放到[esp+1c]==0x0028ff1c处
运行完fistp检查内存块0x0028ff1c		发现此时结果变为2a3==675，于是应该是fistp指令出现了问题

总结

MSVC 140和GCC472(MinGW)使用了不同的汇编指令进行类型转换，导致出现问题。GCC使用了x87 FPU指令而MSVC 140使用了SSE指令集。
如果改成这样的方式

#include <cstdio> 
#include <cmath>
using namespace std;
const double eps = 1e-6; 
int main() {
	int i2 = pow(26, 2) + eps;
	printf("%d\n", i2); //似乎这个版本的GCC不支持%lf
	return 0;
}

输出结果在GCC472版本上也是正确的了

Postscript

本文参考了在 cplusplus.com 上guestgulkan给出的这样的解答：

This is actually quite interesting and works differently on Microsoft Visual Studio 2008 and Dev C++(using mingw);

Microsoft Visual Studio 2008 cmath is basically a wrapper that calls math.h.
In math.h if running in C mode you only get one power function pow(double, double).
In C++ mode (which we are using) you get the c++ overloaded functions:
long double pow(long double,int), float pow(float,int), double pow(double,int) and a few others.
So calling pow(int, int) for example pow(3,2) will always fail due to ambiguity whether you include cmath or math.h
DEV C++ with MINGW
With this set up, math.h just contains the the usual C function
pow(double, double) - so all the functions work because with pow(int, int) both ints get promoted to double by compiler and all is OK
cmath in more than a wrapper for math.h. First it includes math.h and then undefines a whole lot of stuff that math.h defined, and substitutes the c++ versions.
This includes the pow function declaration.
As the c++ overloaded functions (same as any other c++ compiler), you will get the ambiguity problem - when using pow(int, int).
P.S The ambiguity occurs with pow(int, int) because integers can be promoted to floats or doubles, which means that pow(int, int) can fit any of the 6 or so overloaded c++ pow function - so the compiler gets confused.

对于标准库对pow函数的处理，stackoverflow上的enigmaticPhysicist给出了这样的回答：

A specialisation of pow(x, n) to where n is a natural number is often useful for time performance. But the standard library’s generic pow() still works pretty (surprisingly!) well for this purpose and it is absolutely critical to include as little as possible in the standard C library so it can be made as portable and as easy to implement as possible. On the other hand, that doesn’t stop it at all from being in the C++ standard library or the STL, which I’m pretty sure nobody is planning on using in some kind of embedded platform.