Training kits from the video - quickly and efficiently

Any, not the most trivial (or just rare) object can easily create a lot of problems with almost every attempt to use neural networks to solve real problems. Obviously, the lack of a sane training set significantly complicates the overwhelming number of scenarios for using the neurostem approach.



What to do, for example, with a rare species of grasshopper, the recognition of representatives of which, for one reason or another, has become a very important task.





All results / examples were obtained independently (and quickly).



Custom objects



The real world, like real tasks, is overwhelmingly unique, unusual, and often just very specific when it comes to color, shape, behavior, etc.





To successfully solve the corresponding tasks, data is needed (training sets, in our case). And since not everyone is trying to build the “most correct” autopilot or to look for smiles in photographs, the creation of the necessary sets becomes the main problem.



Agree, the probability of finding a ready-made and high-quality set for some very specific coloring style tends to zero:





By the way, youtube algorithms seem a bit fake when it comes to the painted body. At least the returned content looks somewhat controversial.



The usual way of marking



Well, suppose manual markup does not look very scary - you are not afraid of monotonous work or crowd sourcing is suitable for both the quality of the result and the cost. But this is true as long as it comes down to a bounding box (a hackneyed example is used, for illustrative purposes only):





What to do if the specifics of the task require finding the exact contour? Mask RCNN is quite a solution, but it requires a high-quality and accurate training set. And to draw a contour, as you know, this is not a rectangle to mark and such work will require several other efforts.



Automated approach



The eternal question: "What to do?". The answer is no less trivial - to automate. Classical algorithms of computer vision allow achieving acceptable results provided that certain basic conditions are met.





Actually, it is the imposition of additional conditions that does not allow using this approach as the main solution. Nevertheless, the correct standard algorithms allow you to quickly get a high-quality, diverse and easily extensible set.



So high-quality that even the usual color change in the selected area looks like an almost ready-made solution:





More about the approach next time.



Training Set Example



The approach to generating a training set from a video is convenient in that the final result contains exclusively “live” and completely real examples that reflect the variability and complexity of the real world. For example, lips:







Other results









Follow the development of the project



YouTube: RobotsCanSee

Telegram: RobotsCanSeeUs



All Articles