Checking Ark Compiler code recently opened by Huawei

Picture 1






During presentations in the summer of 2019, Huawei announced Ark Compiler technology. According to representatives of the company, this open source project can significantly improve the smoothness and responsiveness of Android and third-party applications. By tradition, a new interesting open source project must pass code quality control using PVS-Studio.



Introduction



For the first time, the Huawei Ark compiler was introduced along with the launch of the Huawei P30 and P30 Pro smartphones. According to Huawei, the Ark compiler improves the smoothness of Android by 24%, and the response speed - by 44%. At the same time, third-party Android applications, after recompilation using Ark, can run 60% faster. An open project is called OpenArkCompiler . Its source code is available on the Chinese counterpart of the GitHub site - Gitee .



To test the project, we used a static code analyzer - PVS-Studio . This is a tool for detecting errors and potential vulnerabilities in the source code of programs written in C, C ++, C # and Java.



The analyzer quickly coped with the project on 50K lines of code. For a small project, the results of the analysis are modest: the article included 11 warnings out of 39 (High and Medium levels).



Code defects overview



Warning 1



V502 Perhaps the '?:' Operator works in a different way than it was expected. The '?:' Operator has a lower priority than the '==' operator. mir_parser.cpp 884



enum Opcode : uint8 { kOpUndef, .... OP_intrinsiccall, OP_intrinsiccallassigned, .... kOpLast, }; bool MIRParser::ParseStmtIntrinsiccall(StmtNodePtr &stmt, bool isAssigned) { Opcode o = !isAssigned ? (....) : (....); auto *intrnCallNode = mod.CurFuncCodeMemPool()->New<IntrinsiccallNode>(....); lexer.NextToken(); if (o == !isAssigned ? OP_intrinsiccall : OP_intrinsiccallassigned) { intrnCallNode->SetIntrinsic(GetIntrinsicID(lexer.GetTokenKind())); } else { intrnCallNode->SetIntrinsic(static_cast<MIRIntrinsicID>(....)); } .... }
      
      





We are interested in the following part of this code:



 if (o == !isAssigned ? OP_intrinsiccall : OP_intrinsiccallassigned) { .... }
      
      





The operator '==' has a higher priority than the ternary operator (? :). Consequently, the conditional expression is not computed correctly. The written code is equivalent to the following:



 if ((o == !isAssigned) ? OP_intrinsiccall : OP_intrinsiccallassigned) { .... }
      
      





And taking into account the fact that the constants OP_intrinsiccall and OP_intrinsiccallassigned have nonzero values, this condition always returns the true value. The body of the else branch is unreachable code.



Warning 2



V570 The 'theDoubleVal' variable is assigned to itself. lexer.cpp 283



 int64 theIntVal = 0; float theFloatVal = 0.0; double theDoubleVal = 0.0; TokenKind MIRLexer ::GetFloatConst(uint32 valStart, uint32 startIdx, bool negative) { .... theIntVal = static_cast<int>(theFloatVal); theDoubleVal = static_cast<double>(theDoubleVal); // <= if (theFloatVal == -0) { theDoubleVal = -theDoubleVal; } .... }
      
      





TheDoubleVal variable is assigned to itself, while not changing at all. Most likely, they wanted to write the result to theFloatVal variable. It is this variable that is then used in the condition. In this case, the type conversion should be to float , and not to double . I venture to suggest that the code should be like this:



 theFloatVal = static_cast<float>(theDoubleVal); if (theFloatVal == -0) { theDoubleVal = -theDoubleVal;
      
      





or even if you simply mixed up the variable in the condition:



 if (theDoubleVal == -0) { theDoubleVal = -theDoubleVal;
      
      





Although, maybe I'm wrong, and everything should be different. The code looks very incomprehensible to a third-party programmer like me.



Warnings 3-5



V524 It is odd that the body of '-' function is fully equivalent to the body of '+' function. mpl_number.h 158



 template <typename T, typename Tag> inline Number<T, Tag> operator+(const Number<T, Tag> &lhs, const Number<T, Tag> &rhs) { return Number<T, Tag>(lhs.get() + rhs.get()); } template <typename T, typename Tag> inline Number<T, Tag> operator-(const Number<T, Tag> &lhs, const Number<T, Tag> &rhs) { return Number<T, Tag>(lhs.get() + rhs.get()); }
      
      





In the header file mpl_number.h we duplicated a lot of code with minor changes. And, of course, they made mistakes. In this example, the addition and subtraction operators are implemented identically. In the body of the subtraction operator, they forgot to change the sign of the operation.



Here are a few more examples:





Warning 6



V560 A part of conditional expression is always false:! FirstImport. parser.cpp 2633



 bool MIRParser::ParseMIRForImport() { .... if (paramIsIPA && firstImport) { BinaryMplt *binMplt = new BinaryMplt(mod); mod.SetBinMplt(binMplt); if (!(*binMplt).Import(...., paramIsIPA && !firstImport, paramIsComb)) { .... } .... } .... }
      
      





In the body of the first conditional expression, the firstImport variable is always true . In this case, the expression



 paramIsIPA && !firstImport
      
      





will always be false . This piece of code either contains a logical error, or it can be simplified by passing the constant false to the Import function.



Warning 7



V547 Expression 'idx> = 0' is always true. Unsigned type value is always> = 0. lexer.h 129



 char GetCharAtWithLowerCheck(uint32 idx) const { return idx >= 0 ? line[idx] : 0; }
      
      





Checking the idx index variable in this way (> = 0) makes no sense, since it is an unsigned type. Perhaps here it is worth adding a check for a different access border to the line array, or just delete this meaningless check.



Warning 8



V728 An excessive check can be simplified. The '||' operator is surrounded by opposite expressions 'c! =' \ "'' and 'c ==' \" ''. lexer.cpp 400



 TokenKind MIRLexer::GetTokenWithPrefixDoubleQuotation() { .... char c = GetCurrentCharWithUpperCheck(); while ((c != 0) && (c != '\"' || (c == '\"' && GetCharAtWithLowerCheck(....) == '\\'))) { .... } .... }
      
      





The analyzer "caught" a code pattern that can be simplified. The pattern looks something like this:



 A || (!A && smth)
      
      





The expression ! A will always be true . Then the initial example can be simplified to this:



 while ((c != 0) && (c != '\"' || (GetCharAtWithLowerCheck(....) == '\\'))) { .... }
      
      





Warnings 9-10



V728 An excessive check can be simplified. The '(A &&! B) || (! A && B) 'expression is equivalent to the' bool (A)! = Bool (B) 'expression. mir_nodes.cpp 1552



 bool BinaryNode::Verify() const { .... if ((IsAddress(GetBOpnd(0)->GetPrimType()) && !IsAddress(GetBOpnd(1)->GetPrimType())) || (!IsAddress(GetBOpnd(0)->GetPrimType()) && IsAddress(GetBOpnd(1)->GetPrimType()))) { .... } .... }
      
      





Another example of code that can be refactored. The example itself is divided into several lines for convenience. In the original, the conditional expression is written in two long lines, which greatly impaired readability. This code can be written easier and more clearly:



 if (IsAddress(GetBOpnd(0)->GetPrimType()) != IsAddress(GetBOpnd(1)->GetPrimType())) .... }
      
      





Another place where you can do a similar refactoring:





Warning 11



V1048 The 'floatSpec-> floatStr' variable was assigned the same value. input.inl 1356



 static void SecInitFloatSpec(SecFloatSpec *floatSpec) { floatSpec->floatStr = floatSpec->buffer; floatSpec->allocatedFloatStr = NULL; floatSpec->floatStrSize = sizeof(floatSpec->buffer) / sizeof(floatSpec->buffer[0]); floatSpec->floatStr = floatSpec->buffer; floatSpec->floatStrUsedLen = 0; }
      
      





The analyzer detected 2 identical initialization lines of the variable floatSpec-> floatStr . Most likely, an extra line can be deleted.



Conclusion



More recently, we reviewed the Huawei Cloud DIS SDK code. Huawei has begun to actively open the code to the public, which is good news for the developer community. Projects such as Ark Compiler or Harmony OS have just appeared and have not yet become widespread. Investing in the quality control of the project code at this stage will be very profitable, since potential vulnerabilities and criticism of users can be avoided.



Sitelinks



  1. LLVM Validation in 2011
  2. Validation of LLVM in 2012
  3. GCC Check in 2016
  4. LLVM Validation in 2016
  5. Checking PascalABC.NET in 2017
  6. Checking Roslyn (.NET Compiler Platform) in 2019
  7. Verification of LLVM in 2019










If you want to share this article with an English-speaking audience, then please use the link to the translation: Svyatoslav Razmyslov. Checking the Ark Compiler Recently Made Open-Source by Huawei .



All Articles