Thursday, October 10, 2013

Exploring B-Splines in Python

This notebook is the first result of online exploration of material explaining B-Splines. As I get more familiar with them, I'll do other notebooks. I will document my sources and eventually add some notes about things that bothered me. This is only a reformulation with my own terms of the vast b-spline documentation around the web. I try to explain the pitfalls and complete the ellipsis that other websites do. However, I'm not a mathematician so I might get some things wrong. But I really want this to be as precise yet understandable as possible using both precise terms and common language. If you spot errors, misconceptions or cryptic parts, please do leave a comment.

Used Tools:

  • Python 3
  • Numpy
  • Matplotlib
  • IPython

I won't detail their installation but have a look at IPython's installation guide. I prefer running it inside a VirtualEnv. Once installed, run IPython with:

ipython3 notebook --pylab=inline

We will draw these types of curves:

Introduction to B-Splines curves by a dummy

B-splines are mathematical objects which are very useful in computer aided design applications. They are a generalization of Bezier curves which were developped for the car industry in the mid 1900's. The user typically manipulates the vertices (control points) of a polyline and the b-spline will gracefully and smoothly interpolate those points. B-splines have nice curvature properties that make them very useful for design and industry and can be easily used in 2D or 3D (and maybe more...). They were further generalized into Non-Uniform Rational B-splines, yep: NURBs.

Parametric curves

A b-spline curve is a parametric curve whose points' coordinates are determined by b-spline functions of a parameter usually called $t$ (for time - this comes from physics and motion description). More precisely: A $D$-Dimensionnal b-spline curve is a parametric curve whose $D$ component functions are b-spline functions.

Let's pretend $C(t)$ is a 2D B-Spline curve (each of its points has two coordinates) :

  • eq. 1 : $C(t) = ( S_x(t), S_y(t) )$

$S_x$ and $S_y$ are the component functions of $C$ for the $x$ and $y$ coordinates respectively. In other words, $S_x(t)$ and $S_y(t)$ will yield the x and y coordinates of $C(t)$'s points. They are the actual b-spline functions.

B-Spline Function

But $t$ can't define the shape of $C$ all alone. A b-spline curve is also defined by:

  • $P$ : an $m$-sized list of control points. This defines which points the curve will interpolate. For example $P=[(2,3),(4,5),(4,7),(1,5)]$ be a list of $m=4$ vertices with two de Boor points each ($D=2$). Graphically, $P$ forms the control polygone or control cage of the curve.
  • $V$ : a $k$-sized list of non-decreasing real values called knot vector. It's the key to b-splines but is quite abstract. Knot vectors are very intrically tied to...
  • $n$ : the degree of the function. A higher degree means more derivability.

$S_x$ and $S_y$ are the same function $S$ which simply processes different de Boor points (different coordinates of $P$'s points). So for the 2D curve $C$:

  • eq. 2 : $C_{P,V,n} (t) = ( S_{Px,V,n} (t), S_{Py,V,n} (t) )$

Note that $S$ is "configured" (subscripted) like $C$ except that for the $x$ component function it takes the $x$ coordinate of points in $P$, thus the $Px$ subscript, the same goes for the $y$ component/coordinate. This is exactly what I meant above by "processes different de Boor points". These two sentences are equivalent. I also didn't put $P$, $V$ and $n$ as parameters because when you evaluate $C$, you evaluate $S$ and you evaluate it for a varying $t$, while all other inputs remain constant. This expression makes sense when a curve is to be evaluated more often than its control points are modified. Now we just need to know the definition of a B-spline function.

The mathematical definition of a B-Spline of degree $n$ as found in literature [Wolfram,Wikipedia] is

  • eq. 3 : $S_{n}(t) = \sum \limits_i^m p_i b_{i,n}(t)$

$b$ is the b-spline basis function, we're getting close! $p_i$ is the vertex of index i in $P$. This means eq. 3 is for one dimension! Let's express it for curve of $D$-dimensions.

  • eq. 4 : $\forall d \in \{1,...,D\}, C_{P,V,n} = (S_{n,d_1}(t), ...,S_{n,d_D}(t) ) = (\sum \limits_i^m p_{i,d_1} b_{i,n}(t), ..., \sum \limits_i^m p_{i,d_D} b_{i,n}(t))$

Roughly, $S$ is a sum of monomials, making it a polynomial! :) There is one monomial for each index of a vertex and each monomial is the product of the de Boor point at that index and the basis function at $t$ for that index. This is repeated for $D$ dimensions. So computing b-splines this way is quite slow with many loops, etc... but the implementation is easy and there are ways to speed it up (like memoization [Wikipedia, StackOverflow]).

Okay, time to translate eq. 4 to Python even though we don't know what $b$ is exactly. I'll try to follow the same naming but keep in mind that Python uses 0-indexing while maths use 1-indexing, so we shall pay attention to the iterations and array sizes.

In [3]:
def C_factory(P, V=None, n=2):
    """ Returns a b-spline curve C(t) configured with P, V and n.
    
    Parameters
    ==========
    - P (list of D-tuples of reals) : List of de Boor points of dimension D.
    - n (int) : degree of the curve
    - V (list of reals) : list of knots in increasing order (by definition).
    
    Returns
    =======
    A D-dimensionnal B-Spline curve.
    """
    
    # TODO: check that p_len is ok with the degree and > 0
    m = len(P)    # the number of points in P    
    D = len(P[0]) # the dimension of a point (2D, 3D)
    
    # TODO: check the validity of the input knot vector.
    # TODO: create an initial Vector Point.
    
    #############################################################################
    # The following line will be detailed later.                                #
    # We create the highest degree basis spline function, aka. our entry point. #
    # Using the recursive formulation of b-splines, this b_n will call          #
    # lower degree basis_functions. b_n is a function.                          #
    #############################################################################
    b_n = basis_factory(n)
    
    def S(t, d):
        """ The b-spline funtion, as defined in eq. 3. """
        out = 0.
        for i in range(m): #: Iterate over 0-indexed point indices
            out += P[i][d]*b_n(t, i, V)
        return out
    
    def C(t):
        """ The b-spline curve, as defined in eq. 4. """
        out = [0.]*D           #: For each t we return a list of D coordinates
        for d in range(D):     #: Iterate over 0-indexed dimension indices
            out[d] = S(t,d)
        return out
    
    ####################################################################
    # "Enrich" the function with information about its "configuration" #
    ####################################################################    
    C.V = V                   #: The knot vector used by the function
    C.spline = S              #: The spline function.
    C.basis = b_n             #: The highest degree basis function. Useful to do some plotting.
    C.min = V[0]              #: The domain of definition of the function, lower bound for t
    C.max = V[-1]             #: The domain of definition of the function, upper bound for t
    C.endpoint = C.max!=V[-1] #: Is the upper bound included in the domain.
    return C

Some python background

  • C_factory is a high-order function: it is a function which returns a function. The returned C function is a closure, it inherits the namespace of its definition context and holds reference to those variables.
  • In Python 3, the range(start, stop, step) function returns an iterator which iterates over integers like: $[0,X)$. This spares us much fiddling with "-1" adjustments all over the place. In Python 2, there it is called xrange(start, stop, step).

What is the Knot Vector?

Right, good question. $V$ is a list whose length is $k$ (see eq. 6), made of reals $t_q$ (called knots) with $q \in [1, k]$ where $t_q \leq t_{q+1}$. In other words values must never decrease. The current $t$ must lay somewhere in $[t_1, t_k)$. During the calculations $t$ will be compared to all the knot spans and if it falls in those spans, then the basis function will have a certain influence on the final curve. It determines which vertices (de Boor points) will influence the curve at a given $t$.

The distribution of the knots in $V$ will determine if the curve meets the endpoints of the control polyline (it is then clamped), or not (unclamped - the domain of definition is shorter). It can be periodic or not if some knot spans repeat themselves, and uniform or not if the knot spans have a constant size, but more on this later when I wrap my head around it a bit more! Just remember that the distribution of knots will depend on the type of the knot vector. We will focus on clamped knot vectors which are a special case of b-splines behaving like Bezier curves.

Clamped Knot Vector

A clamped knot vector looks like this:

  • eq. 5 : $ V = \{t_1 = ... = t_n, ..., t_{k-n-1}, t_{k-n} = ... = t_{k} \} $

Where, as a reminder, $k$ is the length of $V$, where $n$ is the degree of the curve. and the length of a knot vector is totally conditionned by the type of $V$, the degree of the curve and the number of control points.

The $t_1, ... ,t_n$ and $t_{k-n} = ... = t_k$ are called external knots, the others are internal knots.

  • eq. 6 : $k = m + n + 2$

We can now implement a basic make_knot_vector function which only handles clamped knot vectors.

In [4]:
def make_knot_vector(n, m, style="clamped"):
    """
    Create knot vectors for the requested vector type.
    
    Parameters
    ==========
    - n (int) : degree of the bspline curve that will use this knot vector
    - m (int) : number of vertices in the control polygone
    - style (str) : type of knot vector to output
    
    Returns
    =======
    - A knot vector (tuple)
    """
    if style != "clamped":
        raise NotImplementedError
        
    total_knots = m+n+2                           # length of the knot vector, this is exactly eq.6
    outer_knots = n+1                             # number of outer knots at each of the vector.
    inner_knots = total_knots - 2*(outer_knots)   # number of inner knots
    # Now we translate eq. 5:
    knots  = [0]*(outer_knots)
    knots += [i for i in range(1, inner_knots)]
    knots += [inner_knots]*(outer_knots)
    
    return tuple(knots) # We convert to a tuple. Tuples are hashable, required later for memoization

B-spline basis function

Basis functions are the functions that will be multiplied and summed in the B-Spline function.

Definition

The b-spline basis function is defined as follows:

  • eq. 7 : $b_{i,1}(x) = \left \{ \begin{matrix} 1 & \mathrm{if} \quad t_i \leq x < t_{i+1} \\0 & \mathrm{otherwise} \end{matrix} \right.$

  • eq. 8 : $b_{i,k}(x) = \frac{x - t_i}{t_{i+k-1} - t_i} b_{i,k-1}(x) + \frac{t_{i+k} - x}{t_{i+k} - t_{i+1}} b_{i+1,k-1}(x).$

There is really not much to say about them right now. Sometime later we might try to understand them, but we're fine with this definition right now. However, I did get confused as some sites write eq. 7 this way:

  • $b_{i,1}(x) = \left \{ \begin{matrix} 1 & \mathrm{if} \quad t_i \leq x \leq t_{i+1} \\0 & \mathrm{otherwise} \end{matrix} \right.$

This is wrong (the double $\leq$), at least as far as I could see : graphically, spikes appear on the curve at knot values. It makes sense since at knots the sum (eq. 3) will cumulate values from different intervals because $b_{i,1}$ will evaluate to 1 instead of 0 at internal knots.

What is basis_factory(n) ?

It is a high-order function! It will return b-spline basis functions b_n of degree $n$ which will be used by S. We are ready to implement it. We need to translate eq. 7 and eq. 8 to Python. Again, we must be extra careful with indexing.

In [5]:
def basis_factory(degree):
    """ Returns a basis_function for the given degree """
    if degree == 0:
        
        def basis_function(t, i, knots):
            """The basis function for degree = 0 as per eq. 7"""
            t_this = knots[i]
            t_next = knots[i+1]
            out = 1. if (t>=t_this and t< t_next) else 0.         
            return out
    else:
        
        def basis_function(t, i, knots):
            """The basis function for degree > 0 as per eq. 8"""
            out = 0.
            t_this = knots[i]
            t_next = knots[i+1]
            t_precog  = knots[i+degree]
            t_horizon = knots[i+degree+1]            

            top = (t-t_this)
            bottom = (t_precog-t_this)
     
            if bottom != 0:
                out  = top/bottom * basis_factory(degree-1)(t, i, knots)
                
            top = (t_horizon-t)
            bottom = (t_horizon-t_next)
            if bottom != 0:
                out += top/bottom * basis_factory(degree-1)(t, i+1, knots)
         
            return out
        
    ####################################################################
    # "Enrich" the function with information about its "configuration" #
    ####################################################################         
    basis_function.lower = None if degree==0 else basis_factory(degree-1)
    basis_function.degree = degree
    return basis_function

The puzzle is complete

We now have all that is needed to start drawing clamped b-splines!

Constructing the spline curve

In [6]:
# We decide to draw a quadratic curve
n = 2
# Next we define the control points of our curve
P = [(3,-1), (2.5,3), (0, 1), (-2.5,3), (-3,-1)]
# Create the knot vector
V = make_knot_vector(n, len(P), "clamped")
# Create the Curve function
C = C_factory(P, V, n)

Sampling the curve

$C(t)$ is ready to be sampled (evaluated)! That means that we are ready to feed it with values for $t$ and getting point coordinates out. In the case of clamped curves the domain of definition of $C$ - that is: the values for which $t$ is valid - is $[min(V), max(V))$. These bounds are available as C.min and C.max member attributes of C. Do NOT sample at the endpoint, the curve is not defined there in the case of clamped knot vectors. There are situations where it is possible to sample at C.max but this value will be different to $max(V)$.

There are many ways to sample a function. In general most effort is put in strategies the put more samples (decreasing sample steps) in curvy areas and less in straight lines. This can be done by analyzing the polyline angles for example. The best sampling visually is when each display point was tested for being a point of the curve.

These are complicated strategies that we might explore later. For now we will use a linear sampling of the definition interval.

In [7]:
# Regularly spaced samples
sampling = [t for t in np.linspace(C.min, C.max, 100, endpoint=C.endpoint)]
# Sample the curve!!!!
curvepts = [ C(s) for s in sampling ]

# Create a matplotlib figure
fig = plt.figure()
fig.set_figwidth(16)
ax  = fig.add_subplot(111)
# Draw the curve points
ax.scatter( *zip(*curvepts), marker="o", c=sampling, cmap="jet", alpha=0.5 )
# Draw the control cage.
ax.plot(*zip(*P), alpha=0.3)
Out[7]:
[<matplotlib.lines.Line2D at 0x7f46a6cf67d0>]

Mission complete! We can now draw clamped 2D Curves!

Short discussion

Degree

Compared to Bezier curves, a B-Spline of degree $n$ can interpolate a polyline of an almost arbitrary number of points without having to raise the degree. With Bezier curves, to interpolate a polyline of $m$ vertices you need a curve of $n=m-1$ degree. Or you do a "poly-bezier" curve: you connect a series of lower-degree bezier curves and put effort in keeping control points correctly placed to garantee the continuity of the curve at the "internal endpoints".

Ugly endpoint

In our example, the endpoint is on the left (dark red). It is quite clear that the curve's end doesn't match that of the polyline. This is because in our basic sampling example, we didn't include the endpoint (when $t=max(V)$). And for a good reason : the curve is not defined there. However, we could include a sample $t_f$ very close to $max(V)$:

  • $t_f=max(V)-\epsilon $

Computation

In our example we sampled one short b-spline of low-degree and with a decent amount of samples and it went quickly. However, if you add more curves and sample them in a similar way, you will notice the slowness of the calculations. Memoization is interesting in our case for several reasons:

  • the recursive definition of $b_n$ (eq. 8) and its implementation (see basis_function(n)) imply many requests to create lower-level basis functions.
  • the fact that the same $S(t)$ is used for the different components (coordinates) of the curve points implies many calls to the basis_functions with the same exact input. So in a 2D case, for each point we save one full computation of $b_n$.
  • if we consider that the curve will be more often sampled than modified (eg: adaptive sampling in a resolution-aware drawing system to get exact representation of the curve) then there are chances that the curve will be sampled many times for the same $t$.

This gives serious performance improvements at the expense of memory.

Complete implementation with memoization

See the final listing for an implementation with memoization. The creation of the knot vector was moved inside the C_factory function. A helper function was created for drawing purpose. It includes a dirty fixed hack to approach the endpoint too.

References

Full implementation

In [8]:
def memoize(f):
    """ Memoization decorator for functions taking one or more arguments. """
    class memodict(dict):
        def __init__(self, f):
            self.f = f
        def __call__(self, *args):
            return self[args]
        def __missing__(self, key):
            ret = self[key] = self.f(*key)
            return ret        
    return memodict(f)

def C_factory(P, n=2, V_type="clamped"):
    """ Returns a b-spline curve C(t) configured with P, V_type and n.
    The knot vector will be created according to V_type
    
    Parameters
    ==========
    - P (list of D-tuples of reals) : List of de Boor points of dimension D.
    - n (int) : degree of the curve
    - V_type (str): name of the knit vector type to create.
    
    Returns
    =======
    A D-dimensionnal B-Spline Curve.
    """
    
    # TODO: check that p_len is ok with the degree and > 0
    m = len(P)    # the number of points in P    
    D = len(P[0]) # the dimension of a point (2D, 3D)
        
    # Create the knot vector
    V = make_knot_vector(n, m, V_type)
    # TODO: check the validity of the input knot vector.
    # TODO: create an initial Vector Point.
    
    #############################################################################
    # The following line will be detailed later.                                #
    # We create the highest degree basis spline function, aka. our entry point. #
    # Using the recursive formulation of b-splines, this b_n will call          #
    # lower degree basis_functions. b_n is a function.                          #
    #############################################################################
    b_n = basis_factory(n)
    
    @memoize
    def S(t, d):
        """ The b-spline funtion, as defined in eq. 3. """
        out = 0.
        for i in range(m): #: Iterate over 0-indexed point indices
            out += P[i][d]*b_n(t, i, V)
        return out
    
    def C(t):
        """ The b-spline curve, as defined in eq. 4. """
        out = [0.]*D           #: For each t we return a list of D coordinates
        for d in range(D):     #: Iterate over 0-indexed dimension indices
            out[d] = S(t,d)
        return out
    
    ####################################################################
    # "Enrich" the function with information about its "configuration" #
    ####################################################################  
    C.P = P                   #: The control polygone
    C.V = V                   #: The knot vector used by the function
    C.spline = S              #: The spline function.
    C.basis = b_n             #: The highest degree basis function. Useful to do some plotting.
    C.min = V[0]              #: The domain of definition of the function, lower bound for t
    C.max = V[-1]             #: The domain of definition of the function, upper bound for t
    C.endpoint = C.max!=V[-1] #: Is the upper bound included in the domain.
    return C

def make_knot_vector(n, m, style="clamped"):
    """
    Create knot vectors for the requested vector type.
    
    Parameters
    ==========
    - n (int) : degree of the bspline curve that will use this knot vector
    - m (int) : number of vertices in the control polygone
    - style (str) : type of knot vector to output
    
    Returns
    =======
    - A knot vector (tuple)
    """
    if style != "clamped":
        raise NotImplementedError
        
    total_knots = m+n+2                           # length of the knot vector, this is exactly eq.6
    outer_knots = n+1                             # number of outer knots at each of the vector.
    inner_knots = total_knots - 2*(outer_knots)   # number of inner knots
    # Now we translate eq. 5:
    knots  = [0]*(outer_knots)
    knots += [i for i in range(1, inner_knots)]
    knots += [inner_knots]*(outer_knots)
    
    return tuple(knots) # We convert to a tuple. Tuples are hashable, required later for memoization

@memoize
def basis_factory(degree):
    """ Returns a basis_function for the given degree """
    if degree == 0:
        @memoize        
        def basis_function(t, i, knots):
            """The basis function for degree = 0 as per eq. 7"""
            t_this = knots[i]
            t_next = knots[i+1]
            out = 1. if (t>=t_this and t<t_next) else 0.         
            return out
        
    else:
        @memoize
        def basis_function(t, i, knots):
            """The basis function for degree > 0 as per eq. 8"""
            out = 0.
            t_this = knots[i]
            t_next = knots[i+1]
            t_precog  = knots[i+degree]
            t_horizon = knots[i+degree+1]            

            top = (t-t_this)
            bottom = (t_precog-t_this)
     
            if bottom != 0:
                out  = top/bottom * basis_factory(degree-1)(t, i, knots)
                
            top = (t_horizon-t)
            bottom = (t_horizon-t_next)
            if bottom != 0:
                out += top/bottom * basis_factory(degree-1)(t, i+1, knots)
         
            return out
        
    ####################################################################
    # "Enrich" the function with information about its "configuration" #
    ####################################################################         
    basis_function.lower = None if degree==0 else basis_factory(degree-1)
    basis_function.degree = degree
    return basis_function

import matplotlib
def draw_bspline(C=None, P=None, n=None, V_type=None, endpoint_epsilon=0.00001):
    """Helper function to draw curves."""
    if P and n and V_type:
        C = C_factory(P, n, V_type)
    if C:
        # Use 2D or 3D
        is3d = True if len(C.P[0])==3 else False
        from mpl_toolkits.mplot3d import Axes3D

        # Regularly spaced samples
        sampling = [t for t in np.linspace(C.min, C.max, 100, endpoint=C.endpoint)]
        # Hack to sample close to the endpoint
        sampling.append(C.max - endpoint_epsilon)
        # Sample the curve!!!!
        curvepts = [ C(s) for s in sampling ]
        
        # Create a matplotlib figure
        fig = plt.figure()
        fig.set_figwidth(12)
        if is3d:
            fig.set_figheight(10)
            ax = fig.add_subplot(111, projection='3d')
        else:
            ax  = fig.add_subplot(111)
            
        # Draw the curve points
        ax.scatter( *zip(*curvepts), marker="o", c=sampling, cmap="jet", alpha=0.5 )
        # Draw the control cage.
        ax.plot(*zip(*C.P), alpha=0.3)
        # Draw the knots
        knotspos = [C(s) for s in C.V if s!= C.max]
        knotspos.append( C(C.max - endpoint_epsilon) )
        ax.scatter( *zip(*knotspos), marker="*", c=sampling, alpha=1, s=100 )
        
        # Here we annotate the knots with their values
        prev = None
        occurences = 1
        for i, curr in enumerate(C.V):
            if curr == C.max:
                kpos = C(curr-endpoint_epsilon)
            else:
                kpos = C(curr)
            if curr == prev:
                occurences += 1
            else:
                occurences = 1
            kpos[0] -= 0.3*occurences
            ax.text( *kpos, s="t="+str(curr), fontsize=12 )
            prev = curr
In [9]:
# Next we define the control points of our 2D curve
P = [(3 , 1), (2.5, 4), (0, 1), (-2.5, 4), (-3, 0), (-2.5, -4), (0, -1), (2.5, -4), (3, -1)]
# Create the a quadratic curve function
draw_bspline(P=P, n=2, V_type="clamped")

# Next we define the control points of our 3D curve
P3D = [(3, 1, 8), (2.5, 3, 7), (0, 1, 6), (-2.5, 3, 5), (-3, 0, 4), (-2.5, -3, 3), (0, -1, 2), (2.5, -3, 1), (3, -1, 0)]
# Create the a quadratic curve function
draw_bspline(P=P3D, n=2, V_type="clamped")

Revisions

I decided to add this section a bit late. At first I was planning to use git commits to document changes but it is not possible. The revision number is the gist revision number and I start at revision 11.

  • r.14
    • Fix details of range and xrange. This code is written for Python 3 and the info in that section was for Python 2.
  • r.13
    • More 1-indexing fixes in text, not in code.
    • Corrections of some docstrings.
    • Watch out for < and > in python code. For example "if d < myvar" won't interfer with the html layout when converting using nbconvert, but "if d <myvar" will open a new html tag and wreck the layout.
  • r.12
    • Ugh, I obviously got the gist numbering wrong. Fix to sync with gist.
  • r.11
    • Cosmetic fixes to markdown to get better rendering of mathjax elements and other elements. NBConvert renders to html with a different pipeline (pandoc) than IPython Notebook and uses a stricter version of markdown.
    • Fixes to "What is The Knot Vector" section: there were 0-indexing vs 1-indexing inconsistencies. I think I got it right this time.