VKScript Language Analysis: JavaScript, is it you?

TL;DR

VKScript is not JavaScript. The semantics of this language are fundamentally different from the semantics of JavaScript. See the conclusion.

What is VKScript?

VKScript is a JavaScript-like scripting language used in the VKontakte `execute` API method, which lets clients fetch exactly the information they need. In essence, VKScript is an analogue of GraphQL, which Facebook uses for the same purpose.

Comparison of GraphQL and VKScript:

	GraphQL	VKScript
Implementations	Many open-source implementations in different programming languages	A single implementation, inside the VK API
Based on	Brand new language	JavaScript
Capabilities	Data requests, limited filtering; query arguments cannot use the results of previous queries	Any post-processing of data at the discretion of the client; API requests are exposed as methods and can use any data from previous requests

Description of VKScript from the method page in the VK API documentation (the only official language documentation):

`code` (string): algorithm code in VKScript, a format similar to JavaScript or ActionScript (compatibility with ECMAScript is planned). The algorithm must end with the command `return %expression%`. Statements must be separated by semicolons.

The following are supported:

arithmetic operations
logical operations
creation of arrays and lists ([X, Y])
parseInt and parseDouble
concatenation (+)
if construct
array filter by parameter (@.)
API method calls, length parameter
loops using the while statement
JavaScript methods: `slice`, `push`, `pop`, `shift`, `unshift`, `splice`, `substr`, `split`
delete operator
assignment to array elements, for example: `row.user.action = "test";`
search in an array or string with `indexOf`, for example: `"123".indexOf(2) = 1`, `[1, 2, 3].indexOf(3) = 2`. Returns -1 if the element is not found.

Function creation is not currently supported.


The cited documentation states that "ECMAScript compatibility is planned." But is it? Let's try to figure out how this language works from the inside.


VKScript virtual machine

How can a program be analyzed without a local copy? That's right: send requests to the public endpoint and analyze the responses. Let's try, for example, to execute the following code:

 while(1);

We get the error `Runtime error occurred during code invocation: Too many operations`. This suggests that the implementation limits the number of operations performed. Let's try to find the exact value of the limit:

 var i = 0; while(i < 1000) i = i + 1;

`Runtime error occurred during code invocation: Too many operations`.

 var i = 0; while(i < 999) i = i + 1;

{"response": null}

The code executed successfully.

Thus, the limit is about 1000 iterations of an "idle" loop. At the same time, one such iteration is clearly not a single unit operation. Let's try to find an operation that the compiler does not split into several smaller ones.

The most obvious candidate for such an operation is the so-called empty statement (`;`). However, after appending 50 `;` characters to the code with `i < 999`, the limit is still not exceeded. This means that either the compiler discards the empty statement and it costs no operations, or one loop iteration takes more than 50 operations (which is most likely not the case).

The next thing that comes to mind after `;` is evaluating some simple expression (for example, `1;`). Let's add a few such expressions to our code:

 var i = 0;
 while(i < 999) i = i + 1;
 1; // still within the limit
 1; // "Too many operations"

Thus, two `1;` statements cost more operations than fifty `;` statements. This confirms the hypothesis that the empty statement does not consume instructions.

Let's reduce the number of loop iterations and add extra `1;` statements instead. It is easy to see that each removed iteration makes room for 5 additional `1;` statements; therefore, one loop iteration costs 5 times as many operations as one `1;` statement.

But is there an even simpler operation? For example, adding the unary operator `~` does not require evaluating any additional expressions, and the operation itself is performed directly by the processor. It is logical to assume that adding this operator to an expression increases the total number of operations by 1.

Add this operator to our code:

 var i = 0; while(i < 999) i = i + 1; ~1;

And indeed, we can add one such operator, whereas one more `1;` statement no longer fits. Therefore, `1;` really is not a unit operation.

As with the `1;` statement, we reduce the number of loop iterations and add `~` operators. One iteration turns out to be equivalent to 10 unit `~` operations; therefore, the statement `1;` costs 2 operations.

Note that the limit is approximately 1000 iterations, i.e. approximately 10,000 unit operations. We assume that the limit is exactly 10,000 operations.

Measuring the number of operations in code

Note that we can now measure the number of operations in any code. To do this, append the code after the loop and add or remove iterations, `~` operators, or the entire last line until the `Too many operations` error disappears.
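The procedure can be automated. Below is a minimal Python sketch, assuming the cost model that this article arrives at: an n-iteration padding loop costs 10n + 5 operations, `1;` costs 2, each extra `~` costs 1, and the budget is 10,000 including 2 operations for the implicit return. The names `padding_source`, `measure` and the `fits` callback are hypothetical; in a real run, `fits` would send the padding followed by the snippet to `execute` and report whether the error appeared.

```python
LIMIT = 10_000      # total operation budget assumed in the article
RETURN_COST = 2     # implicit `return null;` appended by the compiler


def padding_source(pad: int) -> str:
    """Build VKScript that should cost exactly `pad` operations."""
    n, rest = divmod(pad - 5, 10)   # an n-iteration loop costs 10*n + 5
    if rest == 1:                   # no filler statement costs exactly 1
        n, rest = n - 1, 11
    if n < 1:
        raise ValueError("padding too small for this scheme")
    loop = f"var i = 0; while(i < {n}) i = i + 1;"
    if rest == 0:
        return loop
    # `1;` costs 2 and every leading `~` adds 1 more operation.
    return f"{loop} {'~' * (rest - 2)}1;"


def measure(fits, lo: int = 15, hi: int = LIMIT) -> int:
    """Find the largest padding that still runs and derive the snippet cost.

    `fits(pad)` must return True when padding of `pad` operations followed
    by the snippet under test runs without "Too many operations".
    """
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if fits(mid):
            lo = mid
        else:
            hi = mid - 1
    return LIMIT - RETURN_COST - lo


# Simulated endpoint: pretend the snippet under test costs 3 operations.
simulated = lambda pad: pad + 3 + RETURN_COST <= LIMIT
print(measure(simulated))        # 3
print(padding_source(9998))      # var i = 0; while(i < 999) i = i + 1; ~1;
```

The binary search replaces the manual add-and-remove loop; it needs about 14 requests per measured snippet.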

Some measurement results:

Code	Number of operations
`1;`	2
`~1;`	3
`1+1;`	4
`1+1+1;`	6
`(true?1:1);`	5
`(false?1:1);`	4
`if(0)1;`	2
`if(1)1;`	4
`if(0)1;else 1;`	4
`if(1)1;else 1;`	5
`while(0);`	2
`i=1;`	3
`i=i+1;`	5
`var j = 1;`	1
`var j = 0;while(j < 1)j=j+1;`	15

Determining the type of virtual machine

First you need to understand how the VKScript interpreter works. There are two more or less plausible options:

The interpreter recursively traverses the syntax tree and performs an operation on each node.
The compiler translates the syntax tree into a sequence of instructions that the interpreter executes.

It is easy to see that VKScript uses the second option. Consider the expressions `(true?1:1);` (5 operations) and `(false?1:1);` (4 operations). With sequential instruction execution, the additional operation is explained by a jump that skips over the branch that is not taken, whereas with a recursive AST traversal both variants would be equivalent for the interpreter. A similar effect is observed in `if`/`else` with different conditions.
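To illustrate the argument, here is a small Python sketch of a linear instruction sequence for the ternary expression. The opcode names and the layout are assumptions made for illustration, not the actual VKScript bytecode; the point is only that a compiled form naturally executes one instruction more on the "true" path, because that path ends with a jump over the "false" branch.

```python
def run(program):
    """Execute (opcode, argument) pairs and return how many were executed."""
    stack, pc, executed = [], 0, 0
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        executed += 1
        if op == "PUSH":
            stack.append(arg)
        elif op == "POP":
            stack.pop()
        elif op == "JUMP":
            pc = arg
        elif op == "JUMP_IF_FALSE":
            if not stack.pop():
                pc = arg
        else:
            raise ValueError(f"unknown opcode {op}")
    return executed


def ternary(condition):
    """Hypothetical compiled form of `(condition ? 1 : 1);`."""
    return [
        ("PUSH", condition),     # 0: load the condition
        ("JUMP_IF_FALSE", 4),    # 1: skip the "true" branch if needed
        ("PUSH", 1),             # 2: "true" branch
        ("JUMP", 5),             # 3: skip the "false" branch
        ("PUSH", 1),             # 4: "false" branch
        ("POP", None),           # 5: discard the expression result
    ]


print(run(ternary(True)))    # 5
print(run(ternary(False)))   # 4
```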

It is also worth looking at the pair `i = 1;` (3 operations) and `var j = 1;` (1 operation). Creating a new variable costs only 1 operation, while assigning to an existing one costs 3? The fact that creating a variable costs 1 operation (which is, most likely, a constant load) tells us two things:

When creating a new variable, there is no explicit memory allocation for the variable.
When creating a new variable, the value is not loaded into the memory cell. This means that the space for the new variable is allocated where the value of the expression was calculated, and after that this memory is considered allocated. This suggests the use of a stack machine.

Using a stack also explains why the statement `var j = 1;` runs faster than the statement `1;`: the latter spends an additional instruction on removing the computed value from the stack.
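One possible stack-machine decomposition that is consistent with the measured counts is listed below as a Python table. The opcode names are purely illustrative assumptions; only the instruction counts come from the measurements.

```python
# Hypothetical instruction listings; opcode names are illustrative only.
LISTINGS = {
    "1;":         ["PUSH 1", "POP"],
    "~1;":        ["PUSH 1", "NOT", "POP"],
    "1+1;":       ["PUSH 1", "PUSH 1", "ADD", "POP"],
    "1+1+1;":     ["PUSH 1", "PUSH 1", "ADD", "PUSH 1", "ADD", "POP"],
    "i=1;":       ["PUSH 1", "STORE i", "POP"],
    "i=i+1;":     ["LOAD i", "PUSH 1", "ADD", "STORE i", "POP"],
    "var j = 1;": ["PUSH 1"],  # the pushed value stays on the stack as `j`
}

# Operation counts measured through the API (from the table above).
MEASURED = {
    "1;": 2, "~1;": 3, "1+1;": 4, "1+1+1;": 6,
    "i=1;": 3, "i=i+1;": 5, "var j = 1;": 1,
}

for code, ops in LISTINGS.items():
    print(f"{code:12} {len(ops)} ops (measured {MEASURED[code]})")
```

In this reading, an expression statement always ends with a `POP`, while a declaration simply leaves its value in place.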

Determining the exact limit value

Note that the loop `var j=0;while(j < 1)j=j+1;` (15 operations) is a small copy of the loop that was used for the measurements:

Code	Number of operations
`var i = 0; while(i < 1) i = i + 1;`	15
`var i = 0; while(i < 999) i = i + 1;`	15 + 998 * 10 = 9995
`var i = 0; while(i < 999) i = i + 1; ~1;` (limit)	9998

Wait, what? A limit of 9998 instructions? We are clearly missing something...

Note that, according to the measurements, the code `return 1;` costs 0 instructions. This is easily explained: the compiler appends an implicit `return null;` to the end of the code, and when we add our own `return`, the implicit one is never executed. Assuming that the limit is 10,000, we conclude that `return null;` takes 2 instructions (probably something like `push null; return;`).
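The arithmetic behind this conclusion, written out as a short Python check. The only assumption is the one the text already makes: that the limit is exactly 10,000.

```python
LIMIT = 10_000          # assumed exact limit
FIRST_ITERATION = 15    # measured: var i = 0; while(i < 1) i = i + 1;
NEXT_ITERATION = 10     # measured cost of every further iteration
TILDE_ONE = 3           # measured: ~1;

loop_999 = FIRST_ITERATION + 998 * NEXT_ITERATION
largest_passing = loop_999 + TILDE_ONE
implicit_return = LIMIT - largest_passing

print(loop_999, largest_passing, implicit_return)   # 9995 9998 2
```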

Nested Code Blocks

Let's take some more measurements:

Code	Number of operations
`{};`	0
`{var j = 1;};`	2
`{var j = 1, k = 2;};`	3
`{var j = 1; var k = 2;};`	3
`var j = 1; var j = 1;`	4
`{var j = 1;}; var j = 1;`	3

Let's pay attention to the following facts:

Adding a variable to a block takes one extra operation.
When a variable is "declared again", the second declaration is executed as an ordinary assignment.
But at the same time, the variable inside the block is not visible from the outside (see the last example).

It is easy to understand that an extra operation is spent on removing local variables declared in the block from the stack. Accordingly, when there are no local variables, nothing needs to be deleted.
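These observations fit a simple cost model, sketched here in Python. The statement encoding and the function name `cost` are hypothetical; the 3-operation price of a repeated declaration is taken from the earlier `i=1;` measurement.

```python
def cost(statements, visible=frozenset(), is_block=False):
    """Count operations for constant declarations and nested blocks.

    A statement is ("var", name) for `var name = <constant>;`
    or ("block", [statements]) for `{ ... };`.
    """
    ops = 0
    declared = set()                     # variables introduced at this level
    for kind, payload in statements:
        if kind == "var":
            if payload in visible or payload in declared:
                ops += 3                 # acts as an assignment: push, store, pop
            else:
                declared.add(payload)
                ops += 1                 # push; the stack slot becomes the variable
        elif kind == "block":
            ops += cost(payload, visible | declared, is_block=True)
        else:
            raise ValueError(kind)
    if is_block and declared:
        ops += 1                         # one operation drops the block's locals
    return ops


print(cost([("block", [("var", "j"), ("var", "k")])]))   # 3
print(cost([("block", [("var", "j")]), ("var", "j")]))   # 3
```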

Objects, Methods, API Calls

Code	Number of operations
`"";`	2
`"abcdef";`	2
`{};`	2
`[];`	2
`[1, 2, 3];`	5
`{a: 1, b: 2, c: 3};`	5
`API.users.isAppUser(1);`	3
`"".substr(0, 0);`	6
`var j={};j.x=1;`	6
`var j={x:1};delete j.x;`	6

Let us analyze the results. Creating a string or an empty array/object takes 2 operations, just like loading a number. When a non-empty array or object is created, the operations spent on loading its elements are added. This suggests that the object itself is created in a single operation. At the same time, no operations are spent on loading property names; therefore, loading them is part of the object creation operation.

The API method call is also unremarkable: load the constant 1, call the method, `pop` the result (note that the method name is processed as a whole, not as a chain of property accesses). But the last three examples look interesting.

`"".substr(0, 0);`: load the string, load zero, load zero, `pop` the result. For some reason, the method call itself takes 2 instructions (the reason is discussed below).
`var j={};j.x=1;`: create the object, load the object, load the constant 1, `pop` the 1 after the assignment. Again, the assignment itself takes 2 instructions.
`var j={x:1};delete j.x;`: load the constant 1, create the object, load the object, delete. The delete operation takes 3 instructions.

Semantics of VKScript objects

Numbers

Back to the original question: is VKScript a subset of JavaScript or another language? Let's do a simple test:

 return 1000000000 + 2000000000;

 {"response": -1294967296};

As we can see, integer addition overflows, even though JavaScript has no integers as such. It is also easy to verify that division by 0 raises an error instead of returning `Infinity`.
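The returned value is exactly what signed 32-bit wraparound produces, whereas JavaScript would return 3000000000. A minimal Python sketch of the wraparound (the helper name `wrap_int32` is ours):

```python
def wrap_int32(value: int) -> int:
    """Reduce an integer to the signed 32-bit range (two's complement)."""
    return (value + 2**31) % 2**32 - 2**31


print(wrap_int32(1_000_000_000 + 2_000_000_000))   # -1294967296
```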

Objects

 return {};

 {"response": []}

Wait, what? We return an object and get an array? Yes, that is right. In VKScript, arrays and objects are represented by the same type; in particular, an empty object and an empty array are one and the same. The `length` property also works on objects and returns the number of properties.

It is interesting to see how list methods behave when they are called on an object:

 return {a:1, b:2, c:3}.pop();

The `pop` method returns the last declared property, which is logical enough. Let's change the order of the properties:

 return {b:1, c:2, a:3}.pop();

Apparently, objects in VKScript remember the order in which properties are assigned. Let's try to use numeric properties:

 return {'2':1,'1':2,'0':3}.pop();

Now let's see how push works:

 var a = {'2':'a','1':'b','x':'c'}; a.push('d'); return a;

 {"1": "b", "2": "a", "3": "d", "x": "c"};

As you can see, the push method sorts the numerical keys and adds a new value after the last numerical key. “Holes” are not filled in this case.
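The observed behaviour of `push` can be modelled with an insertion-ordered Python `dict`. This is a sketch of the observation, not of the real implementation; the function name `vk_push` is ours, and the key used for an object without numeric keys (`"0"`) is a guess.

```python
def vk_push(obj: dict, value) -> dict:
    """Return a new dict modelling `obj.push(value)` as observed above."""
    def is_numeric(key: str) -> bool:
        return key.isascii() and key.isdigit()

    numeric = sorted((k for k in obj if is_numeric(k)), key=int)
    other = [k for k in obj if not is_numeric(k)]
    # New key: largest numeric key plus one; "0" for no numeric keys is a guess.
    next_key = str(int(numeric[-1]) + 1) if numeric else "0"

    result = {k: obj[k] for k in numeric}        # numeric keys, sorted by value
    result[next_key] = value                     # holes are not filled
    result.update((k, obj[k]) for k in other)    # non-numeric keys go last
    return result


print(vk_push({"2": "a", "1": "b", "x": "c"}, "d"))
# {'1': 'b', '2': 'a', '3': 'd', 'x': 'c'}
```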

Now try to combine these two methods:

 var a = {'2':'a','1':'b','x':'c'}; a.push(a.pop()); return a;

 {"1": "b", "2": "a", "3": "c", "x": "c"};

As we can see, the element was not removed from the array. However, if we put `push` and `pop` on separate lines, the bug disappears. We need to go deeper!

Object Storage

 var x = {}; var y = x; x.y = 'z'; return y;

 {"response": []}

As it turns out, objects in VKScript are stored by value, unlike in JavaScript. This explains the strange behavior of the line `a.push(a.pop());`: apparently, the old value of the array was saved on the stack, from where it was later taken.

But then how do the changes reach the object when a method modifies it? Apparently, the "extra" instruction in a method call exists precisely to write the changes back to the object.
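This hypothesis can be checked with a small Python simulation. It is a model under stated assumptions, not the real interpreter: the receiver of a method call is copied onto the stack before the arguments are evaluated, and every method writes its result back to the variable. All function names are ours.

```python
def vk_push(obj, value):
    """Model of push: numeric keys sorted, new key = max + 1, other keys last."""
    numeric = sorted((k for k in obj if k.isdigit()), key=int)
    other = [k for k in obj if not k.isdigit()]
    next_key = str(int(numeric[-1]) + 1) if numeric else "0"  # "0" is a guess
    result = {k: obj[k] for k in numeric}
    result[next_key] = value
    result.update((k, obj[k]) for k in other)
    return result


def vk_pop(obj):
    """Model of pop: return (copy without the last-inserted key, its value)."""
    updated = dict(obj)
    last_key = list(updated)[-1]
    return updated, updated.pop(last_key)


def push_pop_one_statement(a):
    """a.push(a.pop()); return a;"""
    receiver = dict(a)             # the receiver of push is copied to the stack
    a, arg = vk_pop(a)             # a.pop() runs and writes its result back to `a`
    return vk_push(receiver, arg)  # push uses the stale copy and overwrites `a`


def push_pop_two_statements(a):
    """var tmp = a.pop(); a.push(tmp); return a;"""
    a, tmp = vk_pop(a)
    return vk_push(a, tmp)


a = {"2": "a", "1": "b", "x": "c"}
print(push_pop_one_statement(a))    # {'1': 'b', '2': 'a', '3': 'c', 'x': 'c'}
print(push_pop_two_statements(a))   # {'1': 'b', '2': 'a', '3': 'c'}

# Assignment copies the value, so `y` never sees the later change to `x`.
x = {}
y = dict(x)
x["y"] = "z"
print(y)                            # {}
```

The one-statement variant reproduces the response shown above, and the two-statement variant shows why splitting the calls makes the bug disappear.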

Array Methods

Method	Action
`push`	Sort the numeric keys by value; take the maximum numeric key and add one; write the argument to the array under that key; append the non-numeric keys to the end of the array.
`pop`	Remove the last element of the array (not necessarily one with a numeric key) and return it.
Others	Sort the numeric keys by value and remove "holes" in the array; perform the corresponding JavaScript operation; append the non-numeric keys to the end of the array. With the `slice` method, the changes are not saved.

Conclusion

VKScript is not JavaScript. Unlike JavaScript, objects in it are stored by value, not by reference, and have completely different semantics. However, when using VKScript for the purpose for which it is intended, the difference is not noticeable.

P.S. Semantics of operators

The comments mentioned merging objects with `+`. In this regard, I decided to add information about how the operators work.

Operator	Actions
`+`	If both arguments are objects, create a copy of the first object and add the keys from the second (with replacement) to it. If both arguments are numbers, add them as numbers. Otherwise, both operands are cast to strings and concatenated.
Other arithmetic operators	Both operands are cast to numbers, and the corresponding operation is performed. For bitwise operations, the operands are additionally cast to `int`.
Comparison operators	If two strings or two numbers are compared, they are compared directly. If a string and a number are compared, and the string is a valid notation for a number, the string is cast to a number. Otherwise, a `Comparing values of different or unsupported types` error is returned.
Cast to string	Numbers and strings are converted as in JavaScript. Objects are converted to a comma-separated list of values, in key order. `false` and `null` are cast to `""`, `true` is cast to `"1"`.
Cast to number	If the argument is a string that is a valid notation for a number, the number is returned. Otherwise, a `Numeric arguments expected` error is returned.

In operations on numbers (except bitwise ones), if one operand is an `int` and the other is a `double`, the `int` is cast to `double`. If both operands are `int`, the operation is performed on signed 32-bit integers (with overflow).
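The rules for `+` can be summarized in a short Python model. It follows the table literally and is only an approximation: Python `dict`, `int` and `float` stand in for VKScript objects, `int` and `double`, and number-to-string conversion matches JavaScript only for ordinary finite values. All function names are ours.

```python
def wrap_int32(n: int) -> int:
    """Signed 32-bit wraparound."""
    return (n + 2**31) % 2**32 - 2**31


def is_number(v) -> bool:
    return isinstance(v, (int, float)) and not isinstance(v, bool)


def to_str(v) -> str:
    """Cast to string, following the table above."""
    if v is None or v is False:
        return ""
    if v is True:
        return "1"
    if isinstance(v, dict):
        return ",".join(to_str(x) for x in v.values())
    if isinstance(v, float) and v.is_integer():
        return str(int(v))              # JavaScript prints 2.0 as "2"
    return str(v)


def add(a, b):
    """Model of the VKScript `+` operator."""
    if isinstance(a, dict) and isinstance(b, dict):
        return {**a, **b}               # copy of the first, keys of the second replace
    if is_number(a) and is_number(b):
        if isinstance(a, int) and isinstance(b, int):
            return wrap_int32(a + b)    # int + int: 32-bit with overflow
        return float(a) + float(b)      # int is promoted to double
    return to_str(a) + to_str(b)        # everything else: string concatenation


print(add({"a": 1}, {"a": 2, "b": 3}))       # {'a': 2, 'b': 3}
print(add(1_000_000_000, 2_000_000_000))     # -1294967296
print(add(1, "2"))                           # 12
```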
