© Dmitri Lebedev (http://ryba4.com) [http://ryba4.com], April 2010, version 1.1.
This is a Python implementation of Ramer-Douglas-Peucker
algorithm that simplifies polylines.
We know X, Y and Z (begin, end and curr in the code). We can calculate automatically a and b vectors and need vector c's length (module). To find the longest vector, we may use the squares of the lengths. Now see the way of thought:
Note that |a|2 actually takes less calculations than |a|, so this formula gets rid of few x**.5 operators.
The code:
def ramerdouglas(line, dist): """Does Ramer-Douglas-Peucker simplification of a line with `dist` threshold. `line` must be a list of Vec objects, all of the same type (either 2d or 3d).""" if len(line) < 3: return line begin, end = line[0], line[-1] distSq = [begin.distSq(curr) - ((end - begin) * (curr - begin)) ** 2 / begin.distSq(end) for curr in line[1:-1]] maxdist = max(distSq) if maxdist < dist ** 2: return [begin, end] pos = distSq.index(maxdist) return (ramerdouglas(line[:pos + 2], dist) + ramerdouglas(line[pos + 1:], dist)[1:])
Basically, that's all the algorithm's code. The rest is service code with line and vector classes.
class Line: """Polyline. Contains a list of points and outputs a simplified version of itself.""" def __init__(self, points): pointclass = points[0].__class__ for i in points[1:]: if i.__class__ != pointclass: raise TypeError("""All points in a Line must have the same type""") self.points = points def simplify(self, dist): if self.points[0] != self.points[-1]: points = ramerdouglas(self.points, dist) else: points = ramerdouglas( self.points[:-1], dist) + self.points[-1:] return self.__class__(points) def __repr__(self): return '{0}{1}'.format(self.__class__.__name__, tuple(self.points)) class Vec: """Generic vector class for n-dimensional vectors for any natural n.""" def __eq__(self, obj): """Equality check.""" if self.__class__ == obj.__class__: return self.coords == obj.coords return False def __repr__(self): """String representation. The string is executable as Python code and makes the same vector.""" return '{0}{1}'.format(self.__class__.__name__, self.coords) def __add__(self, obj): """Add a vector.""" if not isinstance(obj, self.__class__): raise TypeError return self.__class__(*map(sum, zip(self.coords, obj.coords))) def __neg__(self): """Reverse the vector.""" return self.__class__(*[-i for i in self.coords]) def __sub__(self, obj): """Substract object from self.""" if not isinstance(obj, self.__class__): raise TypeError return self + (- obj) def __mul__(self, obj): """If obj is scalar, scales the vector. If obj is vector returns the scalar product.""" if isinstance(obj, self.__class__): return sum([a * b for (a, b) in zip(self.coords, obj.coords)]) return self.__class__(*[i * obj for i in self.coords]) def dist(self, obj = None): """Distance to another object. Leave obj empty to get the length of vector from point 0.""" return self.distSq(obj) ** 0.5 def distSq(self, obj = None): """ Square of distance. Use this method to save calculations if you don't need to calculte an extra square root.""" if obj is None: obj = self.__class__(*[0]*len(self.coords)) elif not isinstance(obj, self.__class__): raise TypeError('Parameter must be of the same class') # simple memoization to save extra calculations if obj.coords not in self.distSqMem: self.distSqMem[obj.coords] = sum([(s - o) ** 2 for (s, o) in zip(self.coords, obj.coords)]) return self.distSqMem[obj.coords] class Vec3D(Vec): """3D vector""" def __init__(self, x, y, z): self.coords = x, y, z self.distSqMem = {} class Vec2D(Vec): """2D vector""" def __init__(self, x, y): self.coords = x, y self.distSqMem = {}
Highlighted with Pygments [http://pygments.org].
This code at work:
Figure 1. Original line in 10*10 km square. 492 points.
Figure 2. Simplified line, threshold = 50 m. ~150 points.
Figure 3. Simplified line, threshold = 100 m. ~90 points.
Figure 3. Simplified line, threshold = 250 m. ~45 points.
You can download the code with this line and it's simplifications.
The data is a coastline from this place [http://osm.org/go/2ttZPN0-] in OpenStreetMap [http://osm.org]. Coordinates are in kilometres.
I know this code isn't the fastest, and procedural approach (which I actually like more) can work faster on larger amount of data, but I finally chose OO approach, as it allows readable and maintainable code. Here is a procedural-style implementation [http://mappinghacks.com/2008/05/05/douglas-peucker-line-simplification-in-python/].