VKScript Language Analysis: JavaScript, is it you?

TL;DR






VKScript is not JavaScript. The semantics of this language are fundamentally different from those of JavaScript. See the conclusion.







What is VKScript?






VKScript is a JavaScript-like scripting language used in the VKontakte execute API method, which lets clients download exactly the information they need. In essence, VKScript is an analogue of GraphQL, which Facebook uses for the same purpose.







Comparison of GraphQL and VKScript:







| | GraphQL | VKScript |
| --- | --- | --- |
| Implementations | Many open-source implementations in different programming languages | A single implementation, inside the VK API |
| Based on | A brand-new language | JavaScript |
| Capabilities | Data requests with limited filtering; query arguments cannot use the results of previous queries | Arbitrary post-processing of data at the client's discretion; API requests are exposed as methods and can use any data from previous requests |


Description of VKScript from the method page in the VK API documentation (the only official language documentation):







code (string): algorithm code in VKScript, a format similar to JavaScript or ActionScript (compatibility with ECMAScript is assumed). The algorithm must end with the command return %expression%. Statements must be separated by semicolons.


The following are supported:







  • arithmetic operations
  • logical operations
  • creation of arrays and lists ([X, Y])
  • parseInt and parseDouble
  • concatenation (+)
  • if construct
  • array filter by parameter (@.)
  • API method calls, length parameter
  • loops using the while statement
  • JavaScript methods: slice, push, pop, shift, unshift, splice, substr, split
  • delete operator
  • assignment to array elements, for example: row.user.action = "test";
  • search in an array or string with indexOf, for example: "123".indexOf(2) = 1, [1, 2, 3].indexOf(3) = 2. Returns -1 if the element is not found.


Function creation is not currently supported.









The cited documentation states that "compatibility with ECMAScript is assumed". But is that really so? Let's try to figure out how this language works on the inside.









Contents






  1. VKScript virtual machine
  2. Semantics of VKScript objects
  3. Conclusion


VKScript virtual machine






How can a program be analyzed without a local copy? That's right: send requests to the public endpoint and analyze the responses. Let's try, for example, to execute the following code:









 while(1);
      
      





We get the error Runtime error occurred during code invocation: Too many operations. This suggests that the language implementation limits the number of operations performed. Let's try to determine the exact value of the limit:







 var i = 0; while(i < 1000) i = i + 1;
      
      







 var i = 0; while(i < 999) i = i + 1;
      
      







The first program fails with Too many operations, while the second one runs. Thus, the limit is about 1000 "idle" loop iterations. At the same time, such an iteration is most likely not a single "unit" operation. Let's try to find an operation that the compiler does not split into several smaller ones.







The most obvious candidate for such an operation is the so-called empty statement (;). However, after adding 50 ; characters to the code with i < 999, the limit is still not exceeded. This means that either the compiler discards the empty statement so that it costs no operations, or one loop iteration takes more than 50 operations (which is most likely not the case).







The next thing that comes to mind after ; is the evaluation of some simple expression (for example, 1;). Let's try adding a few such expressions to our code:







 var i = 0; while(i < 999) i = i + 1;
 1; // no error
 1; // error: "Too many operations"
      
      





Thus, two 1; statements cost more operations than fifty ; statements. This confirms the hypothesis that the empty statement does not consume any operations.







Let's try reducing the number of loop iterations while adding more 1; statements. It is easy to see that each removed iteration makes room for 5 additional 1; statements; therefore, one loop iteration costs 5 times as many operations as one 1; statement.







But is there an even simpler operation? For example, adding the unary operator ~ does not require evaluating any additional expressions, and the operation itself is performed directly by the processor. It is logical to assume that adding this operator to an expression increases the total number of operations by 1.







Add this operator to our code:







 var i = 0; while(i < 999) i = i + 1; ~1;
      
      





And indeed, we can add one such operator, but adding one more 1; expression on top of it is no longer possible. Therefore, 1; really is not a unit operation.







As with 1;, we reduce the number of loop iterations and add ~ operators. One iteration turns out to be equivalent to 10 unit ~ operations; therefore, the expression 1; costs 2 operations.







Note that the limit is approximately 1000 iterations, i.e. approximately 10,000 unit operations. We assume that the limit is exactly 10,000 operations.









Measuring the number of operations in code






Note that we can now measure the number of operations in any piece of code. To do this, append the code after the loop and add or remove iterations, ~ operators, or the entire last line until the Too many operations error just disappears.
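The procedure can be sketched in JavaScript roughly as follows. This is only an illustration under assumptions: fits(n, k) is a hypothetical callback that stands in for a real call to the execute method and reports whether a probe ran without Too many operations, and buildProbe, headroom and measureCost are names invented here. The sketch relies only on the two ratios established above (one loop iteration is worth 10 unit operations, one ~ is worth 1).

```javascript
// Build the source of one probe: a counting loop, a padding statement, then the snippet.
// n = loop iterations, k = number of "~" operators in front of the padding "1;".
function buildProbe(n, k, snippet) {
  return `var i = 0; while(i < ${n}) i = i + 1; ${"~".repeat(k)}1; ${snippet}`;
}

// Headroom left by a snippet, in unit operations (one iteration = 10, one "~" = 1).
// fits(n, k) is a hypothetical oracle: true if the probe runs without "Too many operations".
function headroom(fits) {
  let n = 999;
  while (n > 0 && !fits(n, 0)) n--; // remove iterations until the probe fits
  let k = 0;
  while (fits(n, k + 1)) k++;       // then add "~" operators until it stops fitting
  return 10 * n + k;
}

// Cost of a snippet = headroom without it minus headroom with it.
function measureCost(fitsWithoutSnippet, fitsWithSnippet) {
  return headroom(fitsWithoutSnippet) - headroom(fitsWithSnippet);
}
```

In practice each fits callback would send buildProbe(n, k, snippet) to the API and inspect the response. Measuring by difference avoids having to know the exact limit in advance.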







Some measurement results:







| Code | Number of operations |
| --- | --- |
| 1; | 2 |
| ~1; | 3 |
| 1+1; | 4 |
| 1+1+1; | 6 |
| (true?1:1); | 5 |
| (false?1:1); | 4 |
| if(0)1; | 2 |
| if(1)1; | 4 |
| if(0)1;else 1; | 4 |
| if(1)1;else 1; | 5 |
| while(0); | 2 |
| i=1; | 3 |
| i=i+1; | 5 |
| var j = 1; | 1 |
| var j = 0;while(j < 1)j=j+1; | 15 |




Determining the type of virtual machine






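First, we need to understand how the VKScript interpreter works. There are two more or less plausible options:

  1. The interpreter recursively traverses the AST.
  2. The compiler translates the AST into a sequence of instructions, which are then executed one after another.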









It is easy to see that VKScript uses the second option. Consider the expressions (true?1:1); (5 operations) and (false?1:1); (4 operations). With sequential execution of instructions, the additional operation is explained by a jump that skips the branch not taken; with a recursive AST traversal, both variants would cost the interpreter the same. A similar effect is observed in if / else with different conditions.
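To illustrate the argument, here is a minimal sketch of a stack machine that counts executed instructions. The instruction set (push, pop, jump, jump_if_false) is purely hypothetical, since the real VKScript bytecode is not public; the point is only that a plausible compilation of the ternary expression reproduces the measured 5 and 4 operations.

```javascript
// Runs a program on a toy stack machine and returns the number of executed instructions.
function run(program) {
  const stack = [];
  let pc = 0;
  let executed = 0;
  while (pc < program.length) {
    const [op, arg] = program[pc++];
    executed++;
    switch (op) {
      case "push": stack.push(arg); break;
      case "pop": stack.pop(); break;
      case "jump": pc = arg; break;
      case "jump_if_false": if (!stack.pop()) pc = arg; break;
      default: throw new Error("unknown op " + op);
    }
  }
  return executed;
}

// Hypothetical compilation of (cond ? 1 : 1);
const ternary = (cond) => [
  ["push", cond],       // 0: evaluate the condition
  ["jump_if_false", 4], // 1: skip the "then" branch when the condition is false
  ["push", 1],          // 2: "then" value
  ["jump", 5],          // 3: jump over the "else" branch
  ["push", 1],          // 4: "else" value
  ["pop"],              // 5: discard the result of the expression statement
];

console.log(run(ternary(true)), run(ternary(false))); // prints "5 4"
```

The true case executes the extra jump over the else branch, which is exactly the one-operation difference seen in the measurements.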







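It is also worth looking at the pair i = 1; (3 operations) and var j = 1; (1 operation). Creating a new variable costs only 1 operation, while assigning to an existing one costs 3? The fact that creating a variable costs 1 operation (most likely, the constant load itself) tells us two things:

  • the virtual machine is most likely stack-based: the loaded constant stays on the stack;
  • local variables most likely live on that same stack, so declaring one needs no separate store operation.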









Using the stack also explains why var j = 1; is cheaper than a bare 1; expression: the latter spends an additional instruction on removing the computed value from the stack.
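One way to check this reading is to write down a plausible instruction listing for each measured snippet and compare its length with the measured count. The opcode names below (push, pop, not, add, load, store) are invented for illustration, and the listings assume that store leaves the assigned value on the stack; nothing here is confirmed by the VK documentation.

```javascript
// Hypothetical stack-machine listings; the opcode names are invented for illustration.
const listings = {
  "1;":         ["push 1", "pop"],
  "~1;":        ["push 1", "not", "pop"],
  "1+1;":       ["push 1", "push 1", "add", "pop"],
  "1+1+1;":     ["push 1", "push 1", "add", "push 1", "add", "pop"],
  "i=1;":       ["push 1", "store i", "pop"],
  "i=i+1;":     ["load i", "push 1", "add", "store i", "pop"],
  "var j = 1;": ["push 1"], // the pushed value simply stays on the stack as j
};

// Operation counts measured in the article.
const measured = {
  "1;": 2, "~1;": 3, "1+1;": 4, "1+1+1;": 6, "i=1;": 3, "i=i+1;": 5, "var j = 1;": 1,
};

// Collect every snippet whose listing length disagrees with the measurement.
const mismatches = Object.keys(measured).filter(
  (code) => listings[code].length !== measured[code]
);
console.log(mismatches.length === 0 ? "all listings match" : mismatches);
```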









Determining the exact limit value



Note that the loop var j=0;while(j < 1)j=j+1; (15 operations) is a miniature copy of the loop that was used for the measurements:







| Code | Number of operations |
| --- | --- |
| var i = 0; while(i < 1) i = i + 1; | 15 |
| var i = 0; while(i < 999) i = i + 1; | 15 + 998 * 10 = 9995 |
| var i = 0; while(i < 999) i = i + 1; ~1; (limit) | 9998 |


Wait, what? A limit of 9998 instructions? We are clearly missing something...







Note that, according to the measurements, the code return 1; executes in 0 instructions. This is easily explained: the compiler appends an implicit return null; to the end of the code, and when we add our own return, the implicit one is never executed. Assuming that the limit is 10000, we conclude that return null; takes 2 instructions (probably something like push null; return;).
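The arithmetic behind this conclusion can be checked with a few lines of JavaScript. The constants are the values measured or inferred above; the exact limit of 10000 remains an assumption.

```javascript
const LIMIT = 10000;                       // assumed exact limit
const loopCost = (n) => 15 + (n - 1) * 10; // 15 for one iteration, 10 for each further one
const TILDE_ONE = 3;                       // measured cost of "~1;"
const IMPLICIT_RETURN = 2;                 // inferred cost of the implicit "return null;"

// The largest program that still ran: 999 iterations followed by "~1;".
const atLimit = loopCost(999) + TILDE_ONE + IMPLICIT_RETURN;

// The 1000-iteration loop, which failed.
const overLimit = loopCost(1000) + IMPLICIT_RETURN;

console.log(atLimit, overLimit); // prints "10000 10007"
```

With the implicit return counted, the 999-iteration probe lands exactly on the assumed limit, and the 1000-iteration loop exceeds it.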









Nested Code Blocks






Let's take some more measurements:







| Code | Number of operations |
| --- | --- |
| {}; | 0 |
| {var j = 1;}; | 2 |
| {var j = 1, k = 2;}; | 3 |
| {var j = 1; var k = 2;}; | 3 |
| var j = 1; var j = 1; | 4 |
| {var j = 1;}; var j = 1; | 3 |


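Let's pay attention to the following facts:

  • An empty block costs no operations.
  • A block that declares variables costs one operation more than its contents: 2 instead of 1 for one variable, 3 instead of 2 for two, no matter how they are declared.
  • Declaring var j = 1; a second time in the same scope costs 3 operations, like an assignment, whereas declaring it after a block that already declared j costs only 1.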









It is easy to understand that an extra operation is spent on removing local variables declared in the block from the stack. Accordingly, when there are no local variables, nothing needs to be deleted.
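The block measurements are consistent with a very simple cost model, sketched below. The function blockCost is hypothetical and merely fitted to the table above: the sum of the statement costs, plus one cleanup operation if the block declared any local variables.

```javascript
// Cost model (an assumption fitted to the measurements above):
// a block costs the sum of its statements plus one operation if it declared any locals.
function blockCost(statementCosts, declaresLocals) {
  const body = statementCosts.reduce((sum, cost) => sum + cost, 0);
  return body + (declaresLocals ? 1 : 0);
}

console.log(blockCost([], false));     // {};                      -> 0
console.log(blockCost([1], true));     // {var j = 1;};            -> 2
console.log(blockCost([1, 1], true));  // {var j = 1; var k = 2;}; -> 3
```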









Objects, Methods, API Calls






| Code | Number of operations |
| --- | --- |
| ""; | 2 |
| "abcdef"; | 2 |
| {}; | 2 |
| []; | 2 |
| [1, 2, 3]; | 5 |
| {a: 1, b: 2, c: 3}; | 5 |
| API.users.isAppUser(1); | 3 |
| "".substr(0, 0); | 6 |
| var j={};j.x=1; | 6 |
| var j={x:1};delete j.x; | 6 |


Let us analyze the results. Creating a string or an empty array / object takes 2 operations, just like loading a number. When a non-empty array or object is created, the operations spent on loading its elements are added. This suggests that the object itself is created in a single operation. No operations are spent on loading property names, so loading them must be part of the object-creation operation.







With the API method call, everything is also quite ordinary: loading the 1, the call itself, and a pop of the result (note that the method name is processed as a whole, not as a chain of property accesses). The last three examples, however, look interesting.













Semantics of VKScript objects



Numbers






Back to the original question: is VKScript a subset of JavaScript or another language? Let's do a simple test:







 return 1000000000 + 2000000000;
      
      





 {"response": -1294967296};
      
      





As we can see, integer addition overflows, even though JavaScript has no integers as such. It is also easy to verify that division by 0 results in an error rather than returning Infinity.
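For comparison, here is what real JavaScript does with the same expressions. The | 0 trick truncates a number to a signed 32-bit integer; it is used here only to show that 32-bit wraparound yields exactly the value in the VKScript response, not as a claim about how VKScript is implemented.

```javascript
// In JavaScript all numbers are doubles, so this addition is exact.
const jsSum = 1000000000 + 2000000000;            // 3000000000

// "| 0" truncates to a signed 32-bit integer, which reproduces the VKScript response.
const wrappedSum = (1000000000 + 2000000000) | 0; // -1294967296

// Division by zero is not an error in JavaScript.
const jsDivision = 1 / 0;                         // Infinity

console.log(jsSum, wrappedSum, jsDivision);
```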









Objects






 return {};
      
      





 {"response": []}
      
      





Wait, what? We return an object and get an array? Yes, that's right. In VKScript, arrays and objects are represented by the same type; in particular, an empty object and an empty array are one and the same. The length property also works on objects and returns the number of properties.







How do list methods behave if you call them on an object?







 return {a:1, b:2, c:3}.pop();
      
      





 3
      
      





The pop method returns the last declared property, which is logical enough. Let's change the order of the properties:







 return {b:1, c:2, a:3}.pop();
      
      





 3
      
      





Apparently, objects in VKScript remember the order in which properties are assigned. Let's try to use numeric properties:







 return {'2':1,'1':2,'0':3}.pop();
      
      





 3
      
      





Now let's see how push works:







 var a = {'2':'a','1':'b','x':'c'}; a.push('d'); return a;
      
      





 {"1": "b", "2": "a", "3": "d", "x": "c"};
      
      





As you can see, the push method sorts the numeric keys and adds the new value after the last numeric key. "Holes" are not filled in the process.







Now try to combine these two methods:







 var a = {'2':'a','1':'b','x':'c'}; a.push(a.pop()); return a;
      
      





 {"1": "b", "2": "a", "3": "c", "x": "c"};
      
      





As we can see, the element was not removed from the array. However, if we put push and pop on separate lines, the bug disappears. We need to go deeper!









Object Storage






 var x = {}; var y = x; x.y = 'z'; return y;
      
      





 {"response": []}
      
      





As it turns out, objects in VKScript are stored by value, unlike in JavaScript. This explains the strange behavior of a.push(a.pop()): apparently, the old value of the array was saved on the stack and later taken from there.
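The contrast with JavaScript is easy to demonstrate. In the sketch below the first half is ordinary JavaScript reference semantics; the second half emulates the observed VKScript result with an explicit shallow copy, which is only a stand-in for "stored by value" and says nothing about the actual implementation.

```javascript
// JavaScript: y refers to the same object as x, so the new property is visible through y.
const x = {};
const y = x;
x.y = "z";
console.log(JSON.stringify(y)); // {"y":"z"}

// Emulating the observed VKScript behavior: the assignment copies the value.
const p = {};
const q = { ...p }; // shallow copy stands in for "stored by value"
p.y = "z";
console.log(JSON.stringify(q)); // {}
```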







However, how then is the data stored in the object if the method modifies it? Apparently, the "extra" instruction when calling the method is intended specifically for recording changes back to the object.









Array Methods






| Method | Action |
| --- | --- |
| push | 1) sort numeric keys by value; 2) take the maximum numeric key and add one; 3) write the argument to the array; 4) add non-numeric keys to the end of the array |
| pop | Remove the last element from the array (not necessarily one with a numeric key) and return it. |
| all others | 1) sort numeric keys by value and remove "holes" in the array; 2) perform the corresponding JavaScript operation; 3) add non-numeric keys to the end of the array |


When using the slice method, the changes are not saved.
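The push and pop rules, together with the stale copy on the stack, are enough to reproduce the a.push(a.pop()) result shown earlier. The JavaScript below is an emulation under assumptions: objects are modelled as arrays of [key, value] pairs, vkPush and vkPop are hypothetical helper names, and the choice of key 0 when there are no numeric keys is a guess.

```javascript
// Objects are modelled as arrays of [key, value] pairs to make key order explicit.
const isNumericKey = ([key]) => /^\d+$/.test(key);

function vkPush(entries, value) {
  const numeric = entries
    .filter(isNumericKey)
    .sort((a, b) => Number(a[0]) - Number(b[0]));
  const others = entries.filter((entry) => !isNumericKey(entry));
  // Assumption: with no numeric keys the new key is 0.
  const next = numeric.length ? Number(numeric[numeric.length - 1][0]) + 1 : 0;
  return [...numeric, [String(next), value], ...others];
}

function vkPop(entries) {
  // Returns the popped value and the remaining entries; the last entry need not be numeric.
  return { value: entries[entries.length - 1][1], rest: entries.slice(0, -1) };
}

// a.push(a.pop()): the receiver is copied before the argument is evaluated,
// so push works on the stale copy and its result overwrites the effect of pop.
let a = [["2", "a"], ["1", "b"], ["x", "c"]];
const staleReceiver = a.slice();
const popped = vkPop(a);
a = popped.rest;                         // pop writes its change back
a = vkPush(staleReceiver, popped.value); // push writes back the result computed from the copy
console.log(JSON.stringify(Object.fromEntries(a)));
```

The final object has the keys 1, 2, 3 and x, with "c" present twice, which matches the response observed for a.push(a.pop()).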









Conclusion






VKScript is not JavaScript. Unlike JavaScript, objects in it are stored by value, not by reference, and have completely different semantics. However, when using VKScript for the purpose for which it is intended, the difference is not noticeable.









P.S. Semantics of operators






The comments mentioned merging objects with +. For this reason, I decided to add information about how the operators work.







| Operator | Actions |
| --- | --- |
| + | If both arguments are objects, create a copy of the first object and add the keys from the second to it (with replacement). If both arguments are numbers, add them as numbers. Otherwise, both operands are cast to strings and concatenated. |
| Other arithmetic operators | Both operands are cast to a number, and the corresponding operation is performed. For bitwise operations, the operands are additionally cast to int. |
| Comparison operators | If two strings or two numbers are compared, they are compared directly. If a string and a number are compared and the string is a valid notation of a number, the string is cast to a number. Otherwise, a Comparing values of different or unsupported types error is returned. |
| Cast to string | Numbers and strings are converted as in JavaScript. Objects are converted to a comma-separated list of their values, in key order. false and null are cast to "", true is cast to "1". |
| Cast to number | If the argument is a string that is a valid notation of a number, that number is returned. Otherwise, a Numeric arguments expected error is returned. |


In operations on numbers (except bitwise ones), if one operand is an int and the other a double, the int is cast to double. If both operands are int, the operation is performed on signed 32-bit integers (with overflow).
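The rules for + can be approximated in JavaScript as shown below. This is a sketch under several assumptions: vkToString and vkAdd are hypothetical names, JavaScript cannot distinguish int from double, so Number.isInteger is used as a rough substitute, and the handling of nested objects in the string cast is a guess.

```javascript
// Cast to string as described above: false and null become "", true becomes "1",
// objects become a comma-separated list of their values.
function vkToString(v) {
  if (v === null || v === false) return "";
  if (v === true) return "1";
  if (typeof v === "object") return Object.values(v).map(vkToString).join(",");
  return String(v);
}

function vkAdd(a, b) {
  const isObject = (v) => v !== null && typeof v === "object";
  // Two objects: copy the first, then add the keys of the second (with replacement).
  if (isObject(a) && isObject(b)) return { ...a, ...b };
  if (typeof a === "number" && typeof b === "number") {
    // Assumption: integer-valued numbers stand in for int; "| 0" wraps to signed 32 bits.
    return Number.isInteger(a) && Number.isInteger(b) ? (a + b) | 0 : a + b;
  }
  // Everything else is concatenated as strings.
  return vkToString(a) + vkToString(b);
}

console.log(JSON.stringify(vkAdd({ a: 1 }, { a: 2, b: 3 }))); // {"a":2,"b":3}
console.log(vkAdd(1000000000, 2000000000));                   // -1294967296
console.log(vkAdd("a", true));                                // a1
```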







