Fast contour detection in 4K video: color and complex shapes

In the previous part, “Training sets from video - quickly and efficiently”, we talked about how difficult it is to use neural networks for any task involving rare, unusual, or simply complex objects. Be sure to look at the examples; they are worth it.





Classical computer vision algorithms, as it turned out, can greatly help with obtaining high-quality training sets. Naturally, this approach does not apply in every case, so it is worth understanding where it does.



What is the difficulty?



As shown in the previous part, detailed manual labeling of training sets is a very time-consuming process and, frankly, is generally not an option for any sane person. Automatic labeling, especially when it comes to contours, looks much more attractive, but how do you get the contour of interest quickly and accurately?



Membership function



Perhaps it’s worth starting with the membership function. Suppose the object of interest has a bright color that is, moreover, unique to that object within a particular scene:





Given the specifics of the approach (namely, the need for scenes that are easy to “parse”), it is quite easy to formulate a rule for selecting examples for the training set: scenes in which the color of the desired object is unique are the most useful. Remember that all the difficult cases are left to the neural network, once it has been successfully trained on the generated set.



Actually, color uniqueness is only the necessary minimum, since the color itself can and should also be worked on:





Color distance



Working with color is a very important part of the whole approach. In fact, the membership function can be implemented as a measure of proximity to a given color, with a set threshold value:
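A minimal sketch of such a threshold-based membership function is shown here. It is an illustration under assumptions, not the project's code: the names `color_distance`, `is_member`, and `membership_mask` are hypothetical, and plain Euclidean distance stands in for whatever color metric is actually used.

```python
import math

def color_distance(c1, c2):
    """Euclidean distance between two colors given as equal-length tuples.

    Assumption: a simple stand-in metric; any other distance function
    with the same signature can be passed to is_member instead.
    """
    return math.dist(c1, c2)

def is_member(pixel, target, threshold, distance=color_distance):
    """A pixel belongs to the object if it is close enough to the target color."""
    return distance(pixel, target) <= threshold

def membership_mask(image, target, threshold, distance=color_distance):
    """Apply the membership function to every pixel of a row-major image."""
    return [[is_member(p, target, threshold, distance) for p in row]
            for row in image]

# Tiny hypothetical scene: an orange object on a blue background.
image = [
    [(10, 10, 200), (250, 120, 0), (245, 125, 5)],
    [(12, 8, 198), (12, 9, 199), (240, 118, 10)],
]
mask = membership_mask(image, target=(250, 120, 0), threshold=20.0)
```

Passing the distance as a parameter keeps the thresholding logic separate from the choice of metric, so a perceptual metric can be swapped in without touching the rest.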







The existing solution uses several implementations of Delta E, the most accurate standard measure of color difference. For example, CIE94 in the LCH color space (L*C*h):
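For reference, the CIE94 formula can be written directly in L*C*h coordinates. The sketch below assumes the "graphic arts" constants (kL = 1, K1 = 0.045, K2 = 0.015) and hue in degrees. The function name `delta_e_cie94_lch` is hypothetical, and the parameters the project actually uses are not stated in the article.

```python
import math

def delta_e_cie94_lch(lch_ref, lch_sample, k_L=1.0, K1=0.045, K2=0.015):
    """CIE94 color difference between two (L*, C*, h in degrees) colors.

    Assumption: graphic-arts weights by default, with kC = kH = 1.
    Note that CIE94 is not symmetric: the weighting functions use
    the chroma of the first (reference) color.
    """
    L1, C1, h1 = lch_ref
    L2, C2, h2 = lch_sample

    dL = L1 - L2
    dC = C1 - C2
    # Hue difference expressed as a distance, not as an angle.
    dh = math.radians(h1 - h2)
    dH = 2.0 * math.sqrt(C1 * C2) * math.sin(dh / 2.0)

    S_L = 1.0
    S_C = 1.0 + K1 * C1
    S_H = 1.0 + K2 * C1

    return math.sqrt((dL / (k_L * S_L)) ** 2
                     + (dC / S_C) ** 2
                     + (dH / S_H) ** 2)

# Pure lightness difference: the result is simply |dL|.
lightness_only = delta_e_cie94_lch((50.0, 0.0, 0.0), (40.0, 0.0, 0.0))
```

Used as the distance in a membership test, the reference color would naturally be the target color of the object, and the sample would be the pixel being checked.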







Too large a threshold for the color distance is likely to “break” the contour by capturing pixels that do not belong to the desired object. Too small a threshold selects only part of the desired object. Complex scenes therefore require care, for example:







The whale in the photo is still visible to the eye (with difficulty, of course), but the contour is already built incorrectly. The full example:





Reconstructing the contour



Suppose everything is fine with color. How do you get the desired contour? The task is not simple, since the result is likely to be quite complex, with cavities, minor elements, and so on. Which of the possible reconstructed contours for a single object is the correct one?





Lighting is complex: shadows, reflections, and so on are an integral part of the three-dimensional world. Let's take a more complex example:





The algorithm for obtaining such a result is as follows:







  1. source image
  2. scan step selection (performance critical)
  3. horizontal scan
  4. vertical scanning and intersection analysis to search for isolated "objects"
  5. building an array of meta-pixels (to identify both the shape and internal features of the object) and post-processing (filtering, smoothing, etc.)
  6. "vectorization" of the reconstructed shape of the object
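
Steps 2 and 3 of the list above might be sketched as follows. This is an illustration only: `scan_row`, `horizontal_scan`, and the boolean `mask` (assumed to be the output of a membership function) are hypothetical names, and the article does not show the actual implementation.

```python
def scan_row(mask_row):
    """Return runs of consecutive True values as (start, end_exclusive) pairs."""
    runs = []
    start = None
    for x, inside in enumerate(mask_row):
        if inside and start is None:
            start = x                      # a run begins
        elif not inside and start is not None:
            runs.append((start, x))        # a run ends
            start = None
    if start is not None:                  # run touching the right edge
        runs.append((start, len(mask_row)))
    return runs

def horizontal_scan(mask, step):
    """Scan every step-th row; a larger step is faster but coarser."""
    return {y: scan_row(mask[y]) for y in range(0, len(mask), step)}

# Hypothetical membership mask: '#' marks pixels that passed the color test.
grid = [
    "..##....#.",
    "..###...#.",
    "...##.....",
    "..........",
]
mask = [[ch == "#" for ch in row] for row in grid]
scan = horizontal_scan(mask, step=2)
# scan == {0: [(2, 4), (8, 9)], 2: [(3, 5)]}
```

With a step of 2 only half of the rows are visited, which is where the performance gain on 4K frames would come from, at the price of missing details thinner than the step.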


Analysis of intersections makes it easy to localize separate, unrelated areas. By turning on the scan line display mode, you can easily see both the approach itself and the effect of the scanning step on the final result. Note the very simple trick with a border, which noticeably improves the visual impression of the result:
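One possible way to localize unrelated areas is sketched here. It is not the author's method: instead of a separate vertical scan, it treats two runs on neighboring horizontal scan lines as parts of the same object when their x-ranges overlap, and merges them with a union-find structure. The name `group_runs` and the input layout are assumptions made for the example.

```python
def group_runs(scan_lines):
    """Group runs into isolated objects.

    scan_lines: list of (y, [(x0, x1_exclusive), ...]) pairs.
    Returns a list of groups, each a list of (y, x0, x1) segments.
    Assumption: runs on consecutive scan lines belong together
    when their x-ranges overlap.
    """
    segments = [(y, x0, x1) for y, runs in scan_lines for x0, x1 in runs]
    parent = list(range(len(segments)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri

    # Index segments by scan line.
    by_line = {}
    for idx, (y, _, _) in enumerate(segments):
        by_line.setdefault(y, []).append(idx)

    # Compare each scan line only with the next one.
    ys = sorted(by_line)
    for y_prev, y_next in zip(ys, ys[1:]):
        for i in by_line[y_prev]:
            for j in by_line[y_next]:
                _, a0, a1 = segments[i]
                _, b0, b1 = segments[j]
                if a0 < b1 and b0 < a1:    # x-ranges overlap
                    union(i, j)

    groups = {}
    for idx, seg in enumerate(segments):
        groups.setdefault(find(idx), []).append(seg)
    return list(groups.values())

# Two scan lines, each crossing two separate blobs.
objects = group_runs([(0, [(2, 4), (8, 9)]), (2, [(3, 5), (8, 10)])])
```

With a coarse scanning step this is only an approximation: two objects that overlap in x on neighboring scan lines get merged even if a thin gap separates them between the lines.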







The accuracy of the reconstructed contour is easy to evaluate using the following example:







Final test



More objects, more contours, higher accuracy, hair, and 4K: if you are going to test your implementation, you might as well go all out.





Until next time, with more details that are no less interesting.



Other results









Follow the development of the project



YouTube: RobotsCanSee

Telegram: RobotsCanSeeUs


