Solving non-linear equations via different numerical methods.

December 18, 2025
7 min read

Before we tackle the idea of “solving non-linear equations”, we will learn several methods for finding the root of a function, so that we can understand when to use one method over another. We start with Newton’s Method. From a pedagogical standpoint, Newton’s Method isn’t exactly the best starting point (according to my arbitrary intuition); however, I don’t think it will be difficult to connect these methods to one another later, since they all solve the same problem (for certain functions) in different but similar ways. So let us begin.

Newton’s Method

Newton’s method (also known as the Newton–Raphson method, named after Isaac Newton and Joseph Raphson) is a root-finding algorithm. We will learn several different root-finding algorithms to help us approximate the root of a function, but first, what is the root of a function?

Definition (Root of a function)

A root of a function $f(x)$ is a value of $x$ such that $f(x)=0$.

Newton’s Method can help us approximate the value of $x$.

Definition (Newton's Method)

Let $f$ be a real-valued differentiable function and let $x_0$ be the initial estimate (or initial guess) for a root $x$, such that $f(x)=0$. Then the sequence of approximations $(x_n)_{n=0}^{\infty}$ is defined recursively by

\begin{equation} x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}, \end{equation}

for all $n\geq 0$, provided that $f'(x_n)\neq 0$.
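The recursion in equation (1) translates almost line for line into code. The following is a minimal Python sketch, not a robust implementation: the function name `newton_iterates` and its parameters are hypothetical names chosen for illustration, and the caller is assumed to supply the derivative by hand.

```python
from typing import Callable, List


def newton_iterates(
    f: Callable[[float], float],
    df: Callable[[float], float],
    x0: float,
    steps: int,
) -> List[float]:
    """Return [x_0, x_1, ..., x_steps] produced by equation (1)."""
    xs = [x0]
    for _ in range(steps):
        x_n = xs[-1]
        slope = df(x_n)
        if slope == 0.0:
            # Equation (1) requires f'(x_n) != 0, so stop instead of dividing by zero.
            raise ZeroDivisionError(f"f'(x_n) = 0 at x_n = {x_n}")
        xs.append(x_n - f(x_n) / slope)
    return xs


# Usage: approximate sqrt(2) as the positive root of f(x) = x^2 - 2.
approximations = newton_iterates(
    lambda x: x * x - 2.0,  # f
    lambda x: 2.0 * x,      # f'
    x0=1.0,
    steps=5,
)
print(approximations[-1])
```

Returning the whole list of iterates, rather than only the last one, makes it easy to inspect how quickly the sequence settles.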

Example (Approximating roots)

Find the root of $-\frac12\ln x=2x$ for $0<x<\frac12$ up to $3$ decimal places.

Solution

First we rewrite the equation as $f(x)=0$ with $f(x)=2x+\frac12 \ln x$. Differentiating gives $f'(x)=\frac{1}{2x}+2$. We choose $x_0=\frac14$ and use equation (1) from Definition 1.2 to approximate $x_1$.

\begin{align*}
x_1 &= x_0-\frac{f(x_0)}{f'(x_0)} \\
x_1 &= \frac14-\frac{f(\frac14)}{f'(\frac14)} \\
x_1 &= \frac14-\frac{2(\frac14)+\frac12 \ln(\frac14)}{\frac{1}{2(\frac14)}+2} \\
x_1 &\approx 0.29828679514.
\end{align*}

By Definition 1.2 again, we approximate $x_2$.

\begin{align*}
x_2 &= x_1-\frac{f(x_1)}{f'(x_1)} \\
x_2 &\approx 0.29828679514-\frac{f(0.29828679514)}{f'(0.29828679514)} \\
x_2 &\approx 0.29828679514-\frac{2(0.29828679514)+\frac{1}{2}\ln(0.29828679514)}{\frac{1}{2(0.29828679514)}+2} \\
x_2 &\approx 0.300538100661.
\end{align*}

By Definition 1.2 again, we approximate $x_3$.

\begin{align*}
x_3 &= x_2-\frac{f(x_2)}{f'(x_2)} \\
x_3 &\approx 0.300538100661-\frac{f(0.300538100661)}{f'(0.300538100661)} \\
x_3 &\approx 0.300538100661-\frac{2(0.300538100661)+\frac{1}{2}\ln(0.300538100661)}{\frac{1}{2(0.300538100661)}+2} \\
x_3 &\approx 0.300541968288.
\end{align*}
Intuition

The question asks us to approximate the root to $3$ decimal places. Notice that $x_2$ and $x_3$ agree in their first four decimal places ($0.3005$), so we stop iterating and round to three decimal places. Thus the root is $x\approx 0.301$, as desired.
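The three iterations above can be checked numerically. Here is a small Python sketch of the same computation; the names `f`, `df`, `iterates`, and `root_3dp` are illustrative choices, and the number of steps is fixed at three to mirror the worked solution rather than chosen by any stopping rule.

```python
import math


def f(x: float) -> float:
    """f(x) = 2x + (1/2) ln x, the rewritten equation from the solution."""
    return 2.0 * x + 0.5 * math.log(x)


def df(x: float) -> float:
    """f'(x) = 1/(2x) + 2."""
    return 1.0 / (2.0 * x) + 2.0


x = 0.25  # x_0 = 1/4, as chosen in the solution
iterates = [x]
for n in range(1, 4):
    x = x - f(x) / df(x)  # one application of equation (1)
    iterates.append(x)
    print(f"x_{n} = {x:.12f}")

root_3dp = round(x, 3)  # round x_3 to three decimal places
print(root_3dp)
```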

Exercise

Find the root of $27e^{-2.7x}=\frac{1}{27}x^2$ for $x_0=1.9$ up to $3$ decimal places.

Answer
Exercise for the reader. $\blacksquare$

The convergence of Newton’s Method

Explanation

Consider $f(x)=x^6-1$. It’s clear that its real roots are $x=-1$ and $x=1$, so Definition 1.2 isn’t necessary, but say we used it anyway. Picking start values from $-1 \leq x_0\leq 1$ (for simplicity’s sake), we will check how many iterations are necessary for an accuracy of three decimal places with a calculation tolerance of $1.0\times10^{-7}$.

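| Start value ($x_0$) | Total iterations ($N$) |
| --- | --- |
| -0.99 | 4 |
| -0.75 | 7 |
| -0.50 | 15 |
| -0.25 | 34 |
| -0.01 | 122 |
| 0.00 | Undefined |
| 0.01 | 122 |
| 0.25 | 34 |
| 0.50 | 15 |
| 0.75 | 7 |
| 0.99 | 4 |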

Seeing this, it’s clear we got lucky with our choice of $x_0$ in Solution 1.4 and our given $x_0$ in Exercise 1.6. Both required only a few iterations of equation (1) from Definition 1.2 to reach an acceptable accuracy, but for $x^6-1$, choosing poorly within the small interval $-1 \leq x_0\leq 1$ is the difference between iterating equation (1) $4$ times or $122$ times. In fact, $122$ isn’t the limit: $N$ can be arbitrarily large. At the other extreme, the method can finish in a single iteration ($N=1$), provided you pick an $x_0$ that is close enough to the real root to meet the accuracy you’re looking for (i.e., $x_0$ and $x_1$ must meet the same requirement that Intuition 1.5 mentioned for $x_2$ and $x_3$).

What about $x_0=0$? Why is it undefined for this function? For $x^6-1$ we know $f'(0) = 0$, and looking at equation (1) it’s clear that we would be dividing by zero, which is undefined. Geometrically this makes sense: because the derivative is zero, the tangent line at $x_0=0$ is the horizontal line $y=-1$. That line is perfectly well defined (it consists of the points $(x,-1)$ for all $x \in \Reals$, where $x$ here is the horizontal coordinate in $\Reals^2$ rather than the root; we are overloading variables for simplicity’s sake), and it extends without bound in both directions, but it never crosses the $x$-axis. It therefore cannot point us to a next estimate $x_1$, and Newton’s Method fails.
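An experiment of this kind can be sketched in Python. The stopping rule used here (stop once two successive iterates differ by less than $1.0\times10^{-7}$) is an assumption, since the exact criterion behind the table is not spelled out, so the counts it prints may differ slightly from the ones above. The function name `count_newton_steps` and the `max_steps` cap are likewise hypothetical.

```python
from typing import Optional


def count_newton_steps(
    x0: float, tol: float = 1.0e-7, max_steps: int = 1000
) -> Optional[int]:
    """Count applications of equation (1) to f(x) = x^6 - 1, starting at x0.

    Stops when successive iterates differ by less than tol (an assumed
    stopping rule). Returns None if f'(x_n) = 0 or max_steps is exhausted.
    """
    x = x0
    for n in range(1, max_steps + 1):
        slope = 6.0 * x ** 5  # f'(x) = 6x^5
        if slope == 0.0:
            return None  # horizontal tangent: equation (1) is undefined
        x_next = x - (x ** 6 - 1.0) / slope
        if abs(x_next - x) < tol:
            return n
        x = x_next
    return None


for x0 in (-0.99, -0.75, -0.50, -0.25, -0.01, 0.00, 0.01, 0.25, 0.50, 0.75, 0.99):
    steps = count_newton_steps(x0)
    print(f"{x0:+.2f}  {'undefined' if steps is None else steps}")
```

Start values near $0$ are expensive because the first step throws the iterate far from the root, and each later step only shrinks it by a modest factor until it gets close to $\pm 1$.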

Now, this leaves a lot of questions, and explaining why is a bit tedious (and not really instructive without motivation). So, instead, you will learn the answers to those questions by answering them yourself; i.e., you will fill in the gaps of understanding that come with this unfinished exposition by doing mathematics. Below are some problems that will hopefully help you answer and fill in all the gaps I mentioned.

Some problems

Problem

From Explanation 1.8, it seems that the positive $x_0$ values converge at the same rate as their additive inverses. Is this always the case? In other words, is $N(x_0)= N(-x_0)$ for each $x_0 \in \Reals$, where $N(x_0)$ denotes the number of steps Newton’s Method takes to converge for some function when starting at $x_0$ (the initial estimate)? Supposing exceptions exist, what type of function(s) are necessary for $N(x_0)= N(-x_0)$ to hold (and why)?

Hint

Look at the function $x^2$, then look at the function $x^3$. They both have symmetry, but one type of symmetry that one of those functions has is a property that allows $N(x_0)= N(-x_0)$ to hold.

Problem (identify stationary points)

From Explanation 1.8 again, we saw that equation (1) is undefined for $x^6-1$ when $x_0=0$. It’s built into Definition 1.2 that $f'(x_n)\neq 0$ is necessary to use Newton’s Method, so being able to identify the points at which the method breaks down could come in handy (in more ways than one). Learn how to find the stationary points of an arbitrary function (assuming the function has stationary points).

Hint

Refer to Definition 1.13 and visually look at some functions. This one is relatively easy (all the problems in this section are). The key is to deepen your understanding of these simple ideas you’re (probably already) familiar with, to a high enough level, so that you can see (and thus learn) how to use these simple ideas in (not so simple) novel ways.

Definition (Stationary point)

Let $f: \Reals^n \to \Reals$ be differentiable. A point $c=(c_1,c_2,\dots,c_n)\in \Reals^n$ is called a stationary point if

\begin{equation} \nabla f(c)=0. \end{equation}

Here $\nabla f$ denotes the gradient of $f$, i.e., $\nabla f = \left(\frac{\partial f}{\partial x_1},\frac{\partial f}{\partial x_2},\dots,\frac{\partial f}{\partial x_n}\right)$.
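As a small illustration of the definition (the function $g$ below is a hypothetical example, not one used elsewhere in this post), take $g(x,y)=x^2+y^2-2x$. Setting every partial derivative to zero gives

\begin{align*}
\nabla g(x,y) &= \left(\frac{\partial g}{\partial x},\frac{\partial g}{\partial y}\right) = (2x-2,\ 2y) = (0,0) \\
\Rightarrow\quad (x,y) &= (1,0),
\end{align*}

so $(1,0)$ is the only stationary point of $g$. In one dimension the condition reduces to $f'(c)=0$; for $f(x)=x^6-1$ this reads $6c^5=0$, i.e. $c=0$.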

Problem

Geometrically derive Definition 1.2, and use that geometric intuition to understand why the speed of convergence is higher or lower for different initial guesses $x_0$ across different functions.

Hint

First sketch some examples geometrically using simple functions (like $y=4x$ or $y=e^x$). After deriving Definition 1.2, play with some initial points $x_0$ for arbitrary functions, and look at the tangent line and gradient very closely.

Problem

Consider the function $x^3-2x+2$. Given an initial guess $x_0=0$, use equation (1) from Definition 1.2 to compute a few iterations of the sequence. Observe the sequence and use your geometric intuition to make sense of what’s going on. Then generalise this understanding to arbitrary functions that have the same properties.

Hint

Using stationary point(s) is one way to understand this. Refer to your understanding of Problem 1.11 and Definition 1.13. Think about how the gradient, concavity, and the tangent line work together.

Problem

Use your newly equipped geometric understanding of Definition 1.2 to answer any other question(s) you have.

Hint

If you can’t think of anything else after having done the previous problems, then this is a good opportunity to make your own definitions for ideas you think may be useful. For example, some generalisation about an idea that you think may come up again and again, but wasn’t explicitly written, or some idea that is true but you’re unsure of whether it can be extended to $n$ dimensions. In other words, start valuing some ideas more than others to build some structure on what’s really relevant (the big picture ideas), build some tools you think will be useful going forward, and try to fill as many gaps as you can.