Overview
Computing a square root appears simple but hides details: integer vs real results, rounding (floor/ceil/nearest), overflow safety, and the domain (nonnegative inputs). This note presents practical, production-ready approaches:
-
Binary search (integer sqrt): robust, easy to implement,
Θ(log U)steps whereUbounds the result. -
Newton’s method / Heron’s algorithm (real or integer): fast (quadratic convergence) with careful termination and rounding rules.
-
Bit-by-bit (restoring) method: integer-only alternative with predictable control flow.
Use these patterns to implement floor_sqrt(n), ceil_sqrt(n), a floating sqrt(x), and checks like “is n a perfect square?” while avoiding overflow and off-by-one traps.
Note
Domain is nonnegative inputs. For negative
x, realsqrt(x)is undefined (NaN); integer sqrt is undefined. In complex arithmetic,√xis multi-valued—outside this note’s scope.
Core Idea
Exploit monotonicity of f(m) = m²:
-
For integers,
m² ≤ nis a monotone predicate inm. The largestmsatisfying it is⌊√n⌋. Binary search or bit tricks find this without floating-point. -
Newton’s method solves
g(y) = y² − x = 0by iterative refinementy_{k+1} = (y_k + x / y_k) / 2. With a good initial guess, errors square each step, rapidly doubling correct digits.
Both methods require care around overflow and termination to guarantee correct rounding.
Algorithm Steps / Pseudocode
A. Integer floor sqrt by binary search (overflow-safe)
function ISQRT_FLOOR(n): // n ≥ 0 (unsigned or nonnegative integer)
if n < 2: return n // 0→0, 1→1
lo = 1
hi = n // 2 + 1 // √n ≤ n/2 + 1 for n ≥ 2
while lo < hi:
mid = lo + (hi - lo + 1) // 2 // upper mid to avoid infinite loop
// Compare mid*mid ≤ n without overflow:
if mid <= n // mid:
lo = mid
else:
hi = mid - 1
return lo // lo = ⌊√n⌋-
Ceiling:
ISQRT_CEIL(n) = ISQRT_FLOOR(n)iflo*lo == nelselo + 1. -
Perfect square test: compute
r = ISQRT_FLOOR(n), checkr*r == n(guard overflow viar ≤ n // ras above if types are tight).
B. Newton/Heron for real sqrt (double) with safe termination
function SQRT_NEWTON(x): // x ≥ 0, floating-point
if x == 0: return 0.0
// Initial guess: use exponent to get ~1 ULP start (optional).
y = initial_guess(x) // e.g., y = max(1.0, 2^(ceil(log2(x))/2))
// Iterate until relative change small and y*y close to x
repeat:
y_next = 0.5 * (y + x / y)
if abs(y_next - y) <= 1e-15 * y_next: break
y = y_next
return y_next-
Better stopping rules compare
|y² − x| ≤ ε · max(x,1)to avoid stagnation near tinyx. -
For languages without fused divide behavior, guard
x / ywhenyunderflows.
C. Newton for integer floor sqrt (round-down at end)
function ISQRT_NEWTON(n): // n ≥ 0 integer; uses integer division
if n < 2: return n
// Start with a power-of-two estimate: 1 << ceil(bit_length(n)/2)
y = 1 << ((bit_length(n) + 1) // 2)
while true:
y_next = (y + n // y) // 2
if y_next >= y: break // converged (monotone non-increasing)
y = y_next
// y is ≥ ⌊√n⌋; adjust down if necessary
while (y + 1) <= n // (y + 1): y = y + 1
while y > n // y: y = y - 1
return y- This uses only integer arithmetic; the final correction ensures exact floor even with early stopping.
D. Bit-by-bit (restoring) integer sqrt
Deterministic loop building the result from MSB to LSB; reminiscent of long division:
function ISQRT_RESTORING(n):
res = 0
rem = 0
// Process bits in pairs from high to low
shift = (bit_length(n) - 1) // 2
while shift >= 0:
rem = (rem << 2) | ((n >> (shift*2)) & 3) // bring next 2 bits
cand = (res << 1) | 1 // trial divisor
if rem >= cand:
rem = rem - cand
res = (res << 1) | 1
else:
res = (res << 1)
shift = shift - 1
return res // = ⌊√n⌋- Predictable branches and no division; good on microcontrollers without fast div.
Example or Trace
Binary search (n = 37):
lo=1, hi=19 → mid=10, 10 ≤ 3? no → hi=9 → mid=5, 5 ≤ 7? yes → lo=5 → mid=7, 7 ≤ 5? no → hi=6 → mid=6, 6 ≤ 6? yes → lo=6, stop with 6, since 6²=36 ≤ 37 < 49.
Newton (x = 10):
Start y=4. Iterates: 3.25, 3.1625, 3.16227766…. Two to three iterations get full double precision.
Tip
Newton’s method converges quadratically near the root: once close, each step roughly doubles the number of correct digits.
Complexity Analysis
-
Binary search (integer):
O(log U)iterations, whereUis an upper bound on⌊√n⌋(e.g.,U = n/2+1or1<<((bitlen+1)/2)). Each step isO(1)arithmetic →Θ(log n)time,Θ(1)extra space. -
Newton (float): Each iteration is
O(1)arithmetic; with a good initial guess,O(log d)iterations to achieveddigits → effectively constant for fixed-precision floats. SpaceΘ(1). -
Newton (integer): Iterations count grows like
O(log log n)from a strong initial guess; overallΘ(1)space. -
Bit-by-bit: Processes
⌈bitlen(n)/2⌉loop steps →Θ(log n)time,Θ(1)space.
Optimizations or Variants
-
Initial guess for Newton: derive from exponent bits (for IEEE-754, split mantissa/exponent) to start within a few ULPs. Alternatively, table-based seeds for ranges.
-
Guarded division: in integer Newton, using
n // yavoids overflow that would occur withy*y. -
Mixed strategy: use 2-3 Newton steps from a rough
y0(e.g., bit-based) then finalize with a binary-search polish between⌊y⌋−1and⌈y⌉+1. -
Vectorized sqrt: on platforms with SIMD, use hardware
sqrtps/sqrtsdthen refine with one Newton step for accuracy across lanes. -
Fast inverse sqrt: compute
1/√x(graphics classic); multiply byxto get√xif needed. Requires careful constant tuning and one Newton step for accuracy.
Applications
-
Geometry & graphics: distance, normalization (
len = √(x²+y²+z²)), perspective calculations. -
Statistics: standard deviation, RMS, Mahalanobis distances.
-
Cryptography / number theory: perfect square checks, Cornacchia’s algorithm, modular square roots (different domain).
-
Systems: integer sqrt for bucket sizing, level-of-detail, tile radii without floats.
Common Pitfalls or Edge Cases
Warning
Overflow in
mid*mid. Always compare viamid ≤ n // mid(integers) or use wider types/bignums.
Warning
Termination on Newton. Stop by checking both relative change and squared residual; tiny
xor very largexcan trick naive deltas.
Warning
Rounding ambiguity. Specify whether you return floor, ceil, or nearest; document ties (e.g., nearest-even).
Warning
Negative inputs. Decide: return NaN (float), throw, or sentinel. For integer APIs, reject at the boundary.
Warning
Precision loss for large
x. In floating-point,y*ycan overflow even wheny ≈ √xfits. Prefer comparingytox / yduring checks, or scalexby powers of two to a safe range, iterate, then rescale.
Implementation Notes or Trade-offs
-
API surface: Provide
isqrt_floor,isqrt_ceil,sqrt_safe(double). Expose prediction-free integer paths for deterministic timing. -
Language quirks: Some languages offer
isqrtin the standard library (e.g., C++20std::sqrtfor floats,std::bit_widthhelps seeding; Pythonmath.isqrtimplements a bit-by-bit/NR hybrid). -
Big integers: Use Newton with big-int division; convergence remains fast because each step roughly doubles digits. Normalize by base-
Bdigits to keep divisions balanced. -
Testing: Include boundaries
n ∈ {0,1,2,3,4}, powers just below/above squares (e.g.,k² − 1,k²,k² + 1), and max values for the type (e.g.,UINT64_MAX).
Summary
Square roots can be computed reliably with:
-
Binary search for integer floors/ceils (simple, safe,
Θ(log n)). -
Newton/Heron for fast real roots or integer roots with final correction (quadratic convergence).
-
Bit-by-bit methods when divisions are slow or unavailable.
Correctness hinges on overflow-safe comparisons, clear rounding semantics, and explicit termination. Choose the method that fits the numeric model, performance constraints, and platform capabilities.