"It's hard to look for a cat in a black room, especially if he is not there"
While analyzing the blogs and releases of familiar IT companies to understand where the modern (including Russian) market is heading, I stumbled (yes, stumbled - this is not a typo) upon an article in the blog of Synesis (Minsk, Belarus). Link to the publication (in English).
Having contacted a company representative (Nikolay Ptitsyn, one of the managing partners), whom I know personally from meetings at various international events and parties (MWC, iGB Affiliate, etc.), I found out a few details of the case.
Long story short: a certain young man by the name of Sancharov Kirill contacted the founders of Synesis (Minsk, Belarus) and proposed a partnership to develop a mathematical data compression algorithm (an archiver) based on a thoroughly scientific-sounding method of "searching for a given n-bit sequence in an infinite irrational number series, which is the value of Pi after the decimal point." The scientific-sounding pitch, a combination of circumstances, and the stubbornness of the young "developer-entrepreneur" turned fantasy into reality. A respected, experienced, multidisciplinary IT company took the project on and gave it the green light, allocating material, human and other resources to the initiative, and even lent its name to presentations of the Aleph project at various specialized conferences and events (EMERGE 2019).
Six months of hard work by the company's team, and the desire to build a bold start-up business case, were wrecked on obstacles of a very strange nature: the periodic disappearance of the idea's "author" under various pretexts (for example, "I have to pass my exam session at the Moscow Institute of Physics and Technology" - at 26? An exam session? Is he in his 8th year there, or what?), attempts by evil hackers to "hack" the system's servers, or the clumsiness of "developers" working at a remote site, who periodically "broke the system" so that an urgent personal presence in another country was needed to fix the "damage". The source code that was demonstrated looked suspiciously like various generations of obfuscated PAQ solutions. It only got better from there: the data was magically compressed 2/4/8/.../n^32 times.
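
To make the pitch concrete, here is a minimal Python sketch of the "find it in Pi" idea. It is an illustration only, not the code that was shown to Synesis. For readability it works on decimal digits rather than bits, it takes the digits from Gibbons' spigot algorithm, and the helper names (`pi_digit_stream`, `pi_fraction`, `offset_cost`) are hypothetical. For every pattern of a given length it finds the first offset in Pi and reports how many digits are needed, on average, to write that offset down.

```python
from itertools import islice


def pi_digit_stream():
    """Yield the decimal digits of Pi (3, 1, 4, 1, 5, ...) using Gibbons' spigot."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            yield n
            # The right-hand side is evaluated with the old q, r and n.
            q, r, n = 10 * q, 10 * (r - n * t), (10 * (3 * q + r)) // t - 10 * n
        else:
            q, r, t, k, n, l = (
                q * k,
                (2 * q + r) * l,
                t * l,
                k + 1,
                (q * (7 * k + 2) + r * l) // (t * l),
                l + 2,
            )


def pi_fraction(count):
    """Return the first `count` digits of Pi after the decimal point as a string."""
    stream = pi_digit_stream()
    next(stream)  # skip the leading 3
    return "".join(str(d) for d in islice(stream, count))


def offset_cost(digits, pattern_len):
    """Find the first offset of every decimal pattern of length `pattern_len`.

    Returns (patterns found, average number of digits needed to write the offset).
    """
    found = 0
    offset_digits = 0
    for value in range(10 ** pattern_len):
        pattern = str(value).zfill(pattern_len)
        pos = digits.find(pattern)
        if pos >= 0:
            found += 1
            offset_digits += len(str(pos))
    average = offset_digits / found if found else None
    return found, average


if __name__ == "__main__":
    digits = pi_fraction(2000)
    for length in (1, 2):
        found, average = offset_cost(digits, length)
        print(f"pattern length {length}: found {found}, "
              f"average offset length {average} digits")
```

The point of the sketch is that the offset does not come for free: distinct patterns need distinct offsets, so on average the "address in Pi" takes at least roughly as many digits as the pattern it is supposed to replace.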
The apotheosis of the story was the moment when the "algorithm" began to compress already-compressed data by the same factor of n (though it is worth noting that the compressed data had to be encrypted first, otherwise the program recognized it and refused to "compress" it again) and miraculously represented 1 GB of dense MPEG-4 data in a record 7 KB!
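
The "1 GB into 7 KB" claim can be checked against a simple counting argument: a lossless compressor must map different inputs to different outputs, and there are far fewer 7 KB outputs than 1 GB inputs. The Python sketch below is my own illustration under that assumption (the function names are hypothetical). It compares the two counts on a log2 scale and, as a practical sanity check, feeds random bytes to the standard `zlib` module several times in a row.

```python
import math
import os
import zlib


def log2_count_of_outputs(max_bytes):
    """log2 of the number of distinct byte strings of length 0..max_bytes."""
    total = (256 ** (max_bytes + 1) - 1) // 255  # geometric series sum
    return math.log2(total)


def log2_count_of_inputs(n_bytes):
    """log2 of the number of distinct files of exactly n_bytes bytes."""
    return 8 * n_bytes


def recompress(data, passes):
    """Compress `data` repeatedly with zlib and return the size after each pass."""
    sizes = [len(data)]
    for _ in range(passes):
        data = zlib.compress(data, 9)
        sizes.append(len(data))
    return sizes


if __name__ == "__main__":
    out_bits = log2_count_of_outputs(7 * 1024)
    in_bits = log2_count_of_inputs(2 ** 30)
    print(f"log2(# outputs of at most 7 KB) = {out_bits:.2f}")
    print(f"log2(# inputs of exactly 1 GB)  = {in_bits}")
    print("sizes after repeated zlib passes:", recompress(os.urandom(1 << 16), 3))
```

Since the first number is vastly smaller than the second, only a vanishing fraction of all 1 GB files could ever be restored from 7 KB, and repeated passes over high-entropy data do not keep shrinking it.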
Synesis closed the project. The company estimates the total damage from the actions of a fraudster at hundreds of thousands of US dollars.
Sounds weird, huh? It is 2019: Python is taught in schools, IT has long since gone mainstream, and the number of people who understand terms like "data normalization", "median" and "code encapsulation" runs to tens, if not hundreds, of thousands. Machine learning is not yet taught to children (or is it already?). And yet, in this environment, a situation like the one described above occurs.
Why? How? Is it true that the spread of knowledge leads not only to growing professional expertise in the industry, but also to an influx of scammers whose lot used to be stealing bank card data and deceiving sellers on online marketplaces?
I am willing to concede that today 9 out of 10 professionals in the IT market (people who regard IT as their profession) have never even heard of Knuth's multi-volume "The Art of Computer Programming". But logic is a stubborn science, and many people in the industry have studied information theory! Following logic and some knowledge of algebra, we can assume that with a certain amount of luck (in the mathematical sense of the word) and with the right search / calculation / inversion technique, you can strike a certain balance between encoding speed and compression ratio and get a sane result. But the key word is balance. I am immediately reminded of the triad of contradictions, "fast, good, cheap": pick any two, and the third is ruled out ("fast and good cannot be cheap", "fast and cheap cannot be good", "good and cheap cannot be fast"). Relying solely on mathematical coding methods, without applying any knowledge of the material being encoded, you cannot beat entropy. As the same source puts it: "JPEG requires the application of a theory from digital signal processing, mathematical analysis, linear algebra, information theory, in particular, Fourier transform, lossless coding, etc." (quoted from habr.com, Philip Volodin, @Fil).
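
To give "entropy" a number, here is a small Python sketch (an illustration of mine, with a hypothetical function name) that estimates the Shannon entropy of a byte string from its byte frequencies. This is only an order-0 estimate: a compressor that models context can do better than this figure on structured data, but no lossless coder can go below the true entropy of the source, which for random bytes is already close to 8 bits per byte.

```python
import math
import os
from collections import Counter


def byte_entropy(data):
    """Empirical Shannon entropy of `data` in bits per byte (order-0 model)."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    # H = sum over symbols of p * log2(1 / p), with p = count / total
    return sum((c / total) * math.log2(total / c) for c in counts.values())


if __name__ == "__main__":
    samples = {
        "repetitive text": b"due diligence " * 1000,
        "random bytes": os.urandom(14000),
    }
    for name, data in samples.items():
        h = byte_entropy(data)
        print(f"{name}: {h:.3f} bits/byte, "
              f"order-0 estimate {math.ceil(h * len(data) / 8)} of {len(data)} bytes")
```

The repetitive sample scores far below 8 bits per byte and compresses well, while the random sample leaves essentially no room, whatever "search in Pi" is layered on top.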
Afterword.
No IT specialist is a stranger to the term "due diligence", but do we really appreciate the importance of such a familiar process in the real world? The story described above, which prompted this article, leads me to rethink the importance of testing a hypothesis not only for the soundness and feasibility of the idea, but also for the integrity of the "inventor". "Babushkin's algorithm", which was a funny joke a few years ago, has happened again - and this is an alarming sign!
The IT world is a world of chasing the "unicorn" in all its glory! And now, having caught up with something horned and seemingly of the right pink hue, questions arise: "Is it really the one?", "Does it really have exactly one horn (and was the second not sawn off by crafty people to idealize the result)?", "And is it really pink (in the deceptive binary light of the moon, shades can change in the most bizarre way)?", along with any other questions we could come up with. On the one hand they all add to our confidence; on the other, they can cause an unforgivable delay in decision-making.
I wish everyone simple and straightforward due diligence, more ideas, and real unicorns!