Well, is still the latest. I'm still working on optimizing it in assembly. I had to rewrite all of the assembly code: it was messy and had gotten too complex, so I started over. I'm about halfway through (it processes every even pixel now), and I do get to work on it now and then. I'm also trying to do a version with unrolled loops for diameter=3 and 5, which should also give a good speedup. I also made a quality bugfix and some optimizations to 2.1 for Vdub, so there should be a new version soon. (Even though I must admit I accidentally MMX-optimized ConvertToRGB24() instead of working on the smoother.)
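
For anyone wondering what unrolling for a fixed diameter looks like, here is a rough C sketch. It is not the smoother's actual code. The plain box average, the single-row layout, and the names `smooth_row_generic` and `smooth_row_d3` are all assumptions made up for illustration. The point is only that once the diameter is known to be 3, the inner loop over the window can be written out by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Generic version (hypothetical): the inner loop walks a window of any
 * odd diameter. Border pixels without a full window are copied as-is. */
static void smooth_row_generic(const uint8_t *src, uint8_t *dst,
                               size_t n, int diameter)
{
    int r = diameter / 2;
    assert(diameter % 2 == 1 && n >= (size_t)diameter);

    for (size_t x = 0; x < n; x++) {
        if (x < (size_t)r || x >= n - (size_t)r) {
            dst[x] = src[x];
            continue;
        }
        const uint8_t *p = src + x;   /* centre of the window */
        unsigned sum = 0;
        for (int k = -r; k <= r; k++)
            sum += p[k];
        dst[x] = (uint8_t)(sum / (unsigned)diameter);
    }
}

/* Unrolled version for diameter=3 (hypothetical): the window loop is gone,
 * so there is no per-pixel loop counter or branch on the window size. */
static void smooth_row_d3(const uint8_t *src, uint8_t *dst, size_t n)
{
    assert(n >= 3);

    dst[0] = src[0];
    dst[n - 1] = src[n - 1];
    for (size_t x = 1; x + 1 < n; x++)
        dst[x] = (uint8_t)((src[x - 1] + src[x] + src[x + 1]) / 3u);
}
```

Both functions should give identical output for diameter=3. A diameter=5 variant would follow the same pattern with five terms written out.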