Integrate over Xi in (0, pi)
Owing to translation symmetry, it is enough to integrate over (0, pi) instead of (0, 2pi).
This gives a speedup of 1.5 to 2 times.
In principle, the ranges (0, pi/3) and (0, pi/2) could be used for hexagonal and square lattices, respectively, reducing computation time further.
However, for a reason that is not yet understood, integrating over these reduced ranges gives a slightly different result.
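
The range reduction can be checked on a toy integrand. The sketch below is not the code this note describes. The functions hex_like and mixed and the midpoint rule are hypothetical stand-ins, chosen only to show when a reduced range reproduces the full (0, 2pi) integral. It assumes the integrand depends on Xi alone. A term that is periodic in pi but not in pi/3 leaves the (0, pi) result unchanged and shifts the (0, pi/3) result, which is one thing worth checking, not a diagnosis of the discrepancy.

```python
import math


def midpoint_integral(f, a, b, n):
    """Composite midpoint rule for the integral of f over (a, b) using n panels."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def hex_like(xi):
    # Hypothetical integrand with period pi/3 (six-fold symmetry in xi).
    return 1.0 + 0.5 * math.cos(6.0 * xi) + 0.25 * math.cos(12.0 * xi)


def mixed(xi):
    # Same as hex_like plus a small term with period pi only,
    # i.e. an integrand that does NOT have the full six-fold symmetry.
    return hex_like(xi) + 0.01 * math.cos(2.0 * xi)


n = 600  # panels over (0, 2pi); divisible by 2 and 6 so the step size is the same

# Integrand with exact six-fold symmetry: all three ranges agree.
full_hex = midpoint_integral(hex_like, 0.0, 2.0 * math.pi, n)
half_hex = 2.0 * midpoint_integral(hex_like, 0.0, math.pi, n // 2)
sixth_hex = 6.0 * midpoint_integral(hex_like, 0.0, math.pi / 3.0, n // 6)

# Integrand with only pi-periodicity: (0, pi) still agrees, (0, pi/3) does not.
full_mixed = midpoint_integral(mixed, 0.0, 2.0 * math.pi, n)
half_mixed = 2.0 * midpoint_integral(mixed, 0.0, math.pi, n // 2)
sixth_mixed = 6.0 * midpoint_integral(mixed, 0.0, math.pi / 3.0, n // 6)

print("symmetric integrand:", full_hex, half_hex, sixth_hex)
print("mixed integrand:    ", full_mixed, half_mixed, sixth_mixed)
```

Comparing the three values for the real integrand in the same way, with the same step size in every range, would show whether the difference comes from the integrand itself or from how the integration grid is built.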
Edited by Mikhail Svechnikov