ROOT code analysis - research data analysis framework

Picture 3




While the 118th Nobel Week was taking place in Stockholm, a review of the ROOT project code used in scientific research for processing big data was being prepared at the PVS-Studio static code analyzer development office. Of course, you won’t give a bonus for such code, but the developers will receive a detailed review of interesting code defects and a license for a full verification of the project.



Introduction







Picture 1






ROOT - a set of utilities for working with scientific research data. It provides all the functionality needed for big data processing, statistical analysis, visualization and storage. It is mainly written in C ++. Development began at CERN (European Organization for Nuclear Research) for research in high-energy physics. Every day, thousands of physicists use ROOT applications to analyze their data or to simulate.



PVS-Studio is a tool for detecting errors and potential vulnerabilities in the source code of programs written in C, C ++, C # and Java. It works on 64-bit systems on Windows, Linux and macOS and can analyze code designed for 32-bit, 64-bit and embedded ARM platforms.



New Diagnostics Debut



V1046 Unsafe usage of the bool 'and' int 'types together in the operation' & = '. GSLMultiRootFinder.h 175



int AddFunction(const ROOT::Math::IMultiGenFunction & func) { ROOT::Math::IMultiGenFunction * f = func.Clone(); if (!f) return 0; fFunctions.push_back(f); return fFunctions.size(); } template<class FuncIterator> bool SetFunctionList( FuncIterator begin, FuncIterator end) { bool ret = true; for (FuncIterator itr = begin; itr != end; ++itr) { const ROOT::Math::IMultiGenFunction * f = *itr; ret &= AddFunction(*f); } return ret; }
      
      





The beta version of the analyzer, which was used during the verification, found such a terrific error.



Expectation. The SetFunctionList function traverses the list of iterators. If at least one of them is invalid, then the return value will be false , otherwise true .



Reality. The SetFunctionList function can return false even for valid iterators. Let’s take a look at the situation: AddFunction returns the number of valid iterators in the fFunctions list. Those. when adding non-zero iterators, the size of this list will sequentially increase: 1, 2, 3, 4, etc. This is where the error in the code begins to manifest itself:



 ret &= AddFunction(*f);
      
      





Because Since the function returns an result of type int , not bool , the operation '& =' with even numbers will give the value false . After all, the least significant bit of even numbers will always be zero. Therefore, such an unobvious error will spoil the result of the SetFunctionsList function even for valid arguments.







Picture 2






Errors in Conditional Expressions



V501 There are identical sub-expressions to the left and to the right of the '&&' operator: module && module rootcling_impl.cxx 3650



 virtual void HandleDiagnostic(....) override { .... bool isROOTSystemModuleDiag = module && ....; bool isSystemModuleDiag = module && module && module->IsSystem; if (!isROOTSystemModuleDiag && !isSystemModuleDiag) fChild->HandleDiagnostic(DiagLevel, Info); .... }
      
      





Let's start with the most harmless example. The module pointer is checked twice. Most likely, one check is unnecessary. But it’s better to fix the code so that there are no unnecessary questions.



V501 There are identical sub-expressions 'strchr (fHostAuth-> GetHost (),' * ')' to the left and to the right of the '||' operator. TAuthenticate.cxx 300



 TAuthenticate::TAuthenticate(TSocket *sock, const char *remote, const char *proto, const char *user) { .... // If generic THostAuth (ie with wild card or user == any) // make a personalized memory copy of this THostAuth if (strchr(fHostAuth->GetHost(),'*') || strchr(fHostAuth->GetHost(),'*') || fHostAuth->GetServer() == -1 ) { fHostAuth = new THostAuth(*fHostAuth); fHostAuth->SetHost(fqdn); fHostAuth->SetUser(checkUser); fHostAuth->SetServer(servtype); } .... }
      
      





Here in the line fHostAuth-> GetHost () the same character is searched for - '*'. Perhaps one of them should be the symbol '?'. Usually they are used to set different masks.



V517 The use of 'if (A) {...} else if (A) {...}' pattern was detected. There is a probability of logical error presence. Check lines: 163, 165. TProofMonSenderML.cxx 163



 Int_t TProofMonSenderML::SendSummary(TList *recs, const char *id) { .... if (fSummaryVrs == 0) { if ((dsn = recs->FindObject("dataset"))) recs->Remove(dsn); } else if (fSummaryVrs == 0) { // Only the first records xrecs = new TList; xrecs->SetOwner(kFALSE); TIter nxr(recs); TObject *o = 0; while ((o = nxr())) { if (!strcmp(o->GetName(), "vmemmxw")) break; xrecs->Add(o); } } .... }
      
      





The variable fSummaryVrs is compared twice to zero. This causes the code in the else-if branch to never execute. A lot of code is written there ...



V523 The 'then' statement is equivalent to the 'else' statement. TKDTree.cxx 805



 template <typename Index, typename Value> void TKDTree<Index, Value>::UpdateRange(....) { .... if (point[fAxis[inode]]<=fValue[inode]){ //first examine the node that contains the point UpdateRange(GetLeft(inode),point, range, res); UpdateRange(GetRight(inode),point, range, res); } else { UpdateRange(GetLeft(inode),point, range, res); UpdateRange(GetRight(inode),point, range, res); } .... }
      
      





The same copy-paste code is executed under any condition. Perhaps there is a typo in the words left and right .



There are many similar suspicious code in the project:





V547 Expression '! File_name_value.empty ()' is always false. SelectionRules.cxx 1423



 bool SelectionRules::AreAllSelectionRulesUsed() const { for(auto&& rule : fClassSelectionRules){ .... std::string file_name_value; if (!rule.GetAttributeValue("file_name", file_name_value)) file_name_value.clear(); if (!file_name_value.empty()) { // <= // don't complain about defined_in rules continue; } const char* attrName = nullptr; const char* attrVal = nullptr; if (!file_name_value.empty()) { // <= attrName = "file name"; attrVal = file_name_value.c_str(); } else { attrName = "class"; if (!name.empty()) attrVal = name.c_str(); } ROOT::TMetaUtils::Warning(0,"Unused %s rule: %s\n", attrName, attrVal); } .... }
      
      





Most likely, there is no mistake. The analyzer has detected code that can be shortened. Because Since the value of file_name_value.empty () is checked at the beginning of the loop, then lower in the code this check can be removed, significantly reducing the amount of unnecessary code.



V590 Consider inspecting the '! File1 || c <= 0 || c == '*' || c! = '(' 'expression. The expression is excessive or contains a misprint. TTabCom.cxx 840



 TString TTabCom::DetermineClass(const char varName[]) { .... c = file1.get(); if (!file1 || c <= 0 || c == '*' || c != '(') { Error("TTabCom::DetermineClass", "variable \"%s\" not defined?", varName); goto cleanup; } .... }
      
      





Consider the abbreviated part of the conditional expression pointed out by the analyzer:



 if (.... || c == '*' || c != '(') { .... }
      
      





The condition does not depend on whether the symbol "asterisk" is equal to or not. This part of the conditional expression will always be true for any character other than '('. This is easy to see if you build a truth table.



Two more places with strange logic in conditional expressions:





V593 Consider reviewing the expression of the 'A = B <C' kind. The expression is calculated as following: 'A = (B <C)'. TProofServ.cxx 1903



 Int_t TProofServ::HandleSocketInput(TMessage *mess, Bool_t all) { .... if (Int_t ret = fProof->AddWorkers(workerList) < 0) { Error("HandleSocketInput:kPROOF_GETSLAVEINFO", "adding a list of worker nodes returned: %d", ret); } .... }
      
      





The error detected by the analyzer only manifests itself when the program does not work correctly. The ret variable must store the AddWorkers function return code and, in case of emergency, display this value in the log. But the code does not work like that. The condition lacks additional brackets that specify the priority of operations. Not the return code, but the result of the logical comparison is stored in the ret variable, i.e. only 0 or 1.



There is another place with a similar defect:





V768 The enumeration constant 'kCostComplexityPruning' is used as a variable of a Boolean-type. MethodDT.cxx 283



 enum EPruneMethod {kExpectedErrorPruning=0, kCostComplexityPruning, kNoPruning}; void TMVA::MethodDT::ProcessOptions() { .... if (fPruneStrength < 0) fAutomatic = kTRUE; else fAutomatic = kFALSE; if (fAutomatic && fPruneMethod==!DecisionTree::kCostComplexityPruning){ Log() << kFATAL << "Sorry automatic pruning strength determination is ...." << Endl; } .... }
      
      





Hmm ... Why do the negation of the constant value of kCostComplexityPruning ? Most likely, the negation symbol was accidentally added and now leads to incorrect logic of code execution.



Invalid code with pointers



V522 Dereferencing of the null pointer 'pre' might take place. TSynapse.cxx 61



 void TSynapse::SetPre(TNeuron * pre) { if (pre) { Error("SetPre","this synapse is already assigned to a pre-neuron."); return; } fpre = pre; pre->AddPost(this); }
      
      





I tried to figure out this weird code. It seems like the idea is not to set a new value for the fpre field. Then they made a mistake by confusing the pointer, which should be checked in the condition. In the current implementation, if you pass the nullptr value to the SetPre function, the null pointer is dereferenced.



Most likely, the following is correct:



 void TSynapse::SetPre(TNeuron * pre) { if (fpre) { Error("SetPre","this synapse is already assigned to a pre-neuron."); return; } fpre = pre; pre->AddPost(this); }
      
      





True, this still will not save the function from passing a null pointer. But at least this code looks more logical than the original version.



Here is another place that is copied from here with a few changes:





V595 The 'N' pointer was utilized before it was verified against nullptr. Check lines: 484, 488. Scanner.cxx 484



 bool RScanner::shouldVisitDecl(clang::NamedDecl *D) { if (auto M = D->getOwningModule()) { // <= 2 return fInterpreter.getSema().isModuleVisible(M); } return true; } bool RScanner::VisitNamespaceDecl(clang::NamespaceDecl* N) { if (fScanType == EScanType::kOnePCM) return true; if (!shouldVisitDecl(N)) // <= 1 return true; if((N && N->isImplicit()) || !N){ // <= 3 return true; } .... }
      
      





The analyzer detected a very dangerous code! The pointer N in the first case is dereferenced without checking for a zero value. Moreover, you can’t even see the call to the unverified pointer. This happens inside the shouldVisitDecl function.



Traditionally, this diagnostic rule generates many useful warnings. Here are some of them:





The following example is not an error, but once again demonstrates that macros encourage writing incorrect or redundant code.



V571 Recurring check. The 'if (fCanvasImp)' condition was already verified in line 799. TCanvas.cxx 800



 #define SafeDelete(p) { if (p) { delete p; p = 0; } } void TCanvas::Close(Option_t *option) { .... if (fCanvasImp) SafeDelete(fCanvasImp); .... }
      
      





The fCanvasImp pointer is checked twice. One of the checks has already been implemented in the SafeDelete macro. One of the problems with macros is that they are often difficult to navigate from code, so many do not study their contents before use.



Errors when working with arrays



V519 The 'Line [Cursor]' variable is assigned values ​​twice successively. Perhaps this is a mistake. Check lines: 352, 353. Editor.cpp 353



 size_t find_last_non_alnum(const std::string &str, std::string::size_type index = std::string::npos) { .... char tmp = Line.GetText()[Cursor]; Line[Cursor] = Line[Cursor - 1]; Line[Cursor] = tmp; .... }
      
      





The new value of the Line [Cursor] element is overwritten immediately. Something is wrong here…



V557 Array overrun is possible. The 'ivar' index is pointing beyond array bound. BasicMinimizer.cxx 130



 bool BasicMinimizer::SetVariableValue(unsigned int ivar, double val) { if (ivar > fValues.size() ) return false; fValues[ivar] = val; return true; }
      
      





So making mistakes in checking the index of an array is just a massive problem lately. Almost every third project is found. If everything is simple when indexing an array in a loop - the '<' operator is almost always used to compare the index with the size of the array, then with a check like this, you need to use the '> =' operator, not the '>'. Otherwise, it is possible to go beyond the boundary of the array by one element.



This error was propagated over the file several times:





V621 Consider inspecting the 'for' operator. It's possible that the loop will be executed incorrectly or won't be executed at all. TDataMember.cxx 554



 Int_t TDataMember::GetArrayDim() const { if (fArrayDim<0 && fInfo) { R__LOCKGUARD(gInterpreterMutex); TDataMember *dm = const_cast<TDataMember*>(this); dm->fArrayDim = gCling->DataMemberInfo_ArrayDim(fInfo); // fArrayMaxIndex should be zero if (dm->fArrayDim) { dm->fArrayMaxIndex = new Int_t[fArrayDim]; for(Int_t dim = 0; dim < fArrayDim; ++dim) { dm->fArrayMaxIndex[dim] = gCling->DataMemberInfo_MaxIndex(fInfo,dim); } } } return fArrayDim; }
      
      





Most likely, in the for loop they wanted to compare the variable dim with dm-> fArrayDim , and not fArrayDim . The value of the variable used is negative, due to the condition at the beginning of the function. Such a cycle is never executed.



V767 Suspicious access to element of 'current' array by a constant index inside a loop. TClingUtils.cxx 3082



 llvm::StringRef ROOT::TMetaUtils::DataMemberInfo__ValidArrayIndex(....) { .... while (current!=0) { // Check the token if (isdigit(current[0])) { for(i=0;i<strlen(current);i++) { if (!isdigit(current[0])) { if (errstr) *errstr = current; if (errnum) *errnum = NOT_INT; return llvm::StringRef(); } } } else { // current token is not a digit .... } .... } .... }
      
      





This code fragment parses a certain line and checks its correctness. After the null character of the string current is defined as a number, all other characters are traversed to make sure that all characters are numbers to the end of the string. Well, how is it done ... loop counter i is not used in the loop. It was necessary to write current [i] , and not current [0] in the condition.







Picture 4






Memory leak



V773 The function was exited without releasing the 'optionlist' pointer. A memory leak is possible. TDataMember.cxx 355



 void TDataMember::Init(bool afterReading) { .... TList *optionlist = new TList(); //storage for options strings for (i=0;i<token_cnt;i++) { if (strstr(tokens[i],"Items")) { ptr1 = R__STRTOK_R(tokens[i], "()", &rest); if (ptr1 == 0) { Fatal("TDataMember","Internal error, found \"Items....",GetTitle()); return; } ptr1 = R__STRTOK_R(nullptr, "()", &rest); if (ptr1 == 0) { Fatal("TDataMember","Internal error, found \"Items....",GetTitle()); return; } .... } .... } .... // dispose of temporary option list... delete optionlist; .... }
      
      





When exiting a function, memory is not provided for using the optionList pointer. Whether this is necessary in this particular case is difficult for me to say. But usually such errors are corrected in projects based on PVS-Studio reports. It all depends on whether the program should try to continue to work in an emergency or not. There are many such warnings in the project, it is better for developers to double-check the project on their own and see the full analyzer report.



Again about the memset function



V597 The compiler could delete the 'memset' function call, which is used to flush 'x' buffer. The memset_s () function should be used to erase the private data. TMD5.cxx 366



 void TMD5::Transform(UInt_t buf[4], const UChar_t in[64]) { UInt_t a, b, c, d, x[16]; .... // Zero out sensitive information memset(x, 0, sizeof(x)); }
      
      





Many will think that when the code compiles, this comment will not get into the binary file, and they will be right: D. But the memset function will also be deleted by the compiler, not everyone guesses. And it will happen. If the flushed buffer will no longer be used in the code, then the compiler will optimize the code and delete the function call. Technically, he is right, but if there were secret data in the buffer, then they will remain there. This is a classic CWE-14 security flaw.



Miscellaneous warnings



V591 Non-void function should return a value. LogLikelihoodFCN.h 108



 LogLikelihoodFCN & operator = (const LogLikelihoodFCN & rhs) { SetData(rhs.DataPtr() ); SetModelFunction(rhs.ModelFunctionPtr() ); fNEffPoints = rhs.fNEffPoints; fGrad = rhs.fGrad; fIsExtended = rhs.fIsExtended; fWeight = rhs.fWeight; fExecutionPolicy = rhs.fExecutionPolicy; }
      
      





The overloaded statement does not have a return value. Also a very common problem lately.



V596 The object was created but it is not being used. The 'throw' keyword could be missing: throw runtime_error (FOO); RTensor.hxx 363



 template <typename Value_t, typename Container_t> inline RTensor<Value_t, Container_t> RTensor<Value_t, Container_t>::Transpose() { if (fLayout == MemoryLayout::RowMajor) { fLayout = MemoryLayout::ColumnMajor; } else if (fLayout == MemoryLayout::ColumnMajor) { fLayout = MemoryLayout::RowMajor; } else { std::runtime_error("Memory layout is not known."); } .... }
      
      





The error is that the throw keyword is accidentally forgotten. As a result, this code does not throw an exception in case of an error.



In total, there were two such places. Second:





V609 Divide by zero. Denominator range [0..100]. TGHtmlImage.cxx 340



 const char *TGHtml::GetPctWidth(TGHtmlElement *p, char *opt, char *ret) { int n, m, val; .... if (n < 0 || n > 100) return z; if (opt[0] == 'h') { val = fCanvas->GetHeight() * 100; } else { val = fCanvas->GetWidth() * 100; } if (!fInTd) { snprintf(ret, 15, "%d", val / n); // <= } else { .... } .... }
      
      





A case similar to those described earlier about arrays. Here, the variable n is limited to a range of values ​​from 0 to 100. In this case, there is a branch of code in which division by the variable n with the value 0 will occur. Most likely, the restriction of the value of n should be rewritten as follows:



 if (n <= 0 || n > 100) return z;
      
      





V646 Consider inspecting the application's logic. It's possible that 'else' keyword is missing. TProofServ.cxx 729



 TProofServ::TProofServ(Int_t *argc, char **argv, FILE *flog) : TApplication("proofserv", argc, argv, 0, -1) { .... if (!logmx.IsDigit()) { if (logmx.EndsWith("K")) { xf = 1024; logmx.Remove(TString::kTrailing, 'K'); } else if (logmx.EndsWith("M")) { xf = 1024*1024; logmx.Remove(TString::kTrailing, 'M'); } if (logmx.EndsWith("G")) { xf = 1024*1024*1024; logmx.Remove(TString::kTrailing, 'G'); } } .... }
      
      





The analyzer detected strange formatting. In one place, the else keyword is missing. From the code, it can be assumed that the code is really worth fixing.



And a couple of places to fix at the same time:





V663 Infinite loop is possible. The 'cin.eof ()' condition is insufficient to break from the loop. Consider adding the 'cin.fail ()' function call to the conditional expression. MethodKNN.cxx 602



 void TMVA::MethodKNN::ReadWeightsFromStream(std::istream& is) { .... while (!is.eof()) { std::string line; std::getline(is, line); if (line.empty() || line.find("#") != std::string::npos) { continue; } .... } .... }
      
      





When working with the std :: istream class, calling the eof () function is not enough to complete the loop. In the event of a failure while reading data, calling the eof () function will always return false , and there are no other exit points from the loop in this code. To complete the loop in this case, an additional check of the value returned by the fail () function is required:



 while (!is.eof() && !is.fail()) { .... }
      
      





Or just write:



 while (is) { .... }
      
      





V678 An object is used as an argument to its own method. Consider checking the first actual argument of the 'Copy' function. TFormLeafInfo.cxx 2414



 TFormLeafInfoMultiVarDim::TFormLeafInfoMultiVarDim( const TFormLeafInfoMultiVarDim& orig) : TFormLeafInfo(orig) { fNsize = orig.fNsize; fSizes.Copy(fSizes); // <= fCounter2 = orig.fCounter2?orig.fCounter2->DeepCopy():0; fSumOfSizes = orig.fSumOfSizes; fDim = orig.fDim; fVirtDim = orig.fVirtDim; fPrimaryIndex = orig.fPrimaryIndex; fSecondaryIndex = orig.fSecondaryIndex; }
      
      





Finally, there is such an error. Instead of fSizes, you had to pass orig.fSizes to the Copy function.



Conclusion



About a year ago, an overview of the NCBI Genome Workbench project code was provided . This project is also used in scientific research, but the genome. Where I lead, software in this area is very important, but its quality is not given due attention.



By the way, the other day macOS 10.15 Catalina came out, in which they refused to support 32-bit applications. And in PVS-Studio there is a large set of rules for identifying problems when porting programs to 64-bit systems. More on this in the analyzer's blog post .











If you want to share this article with an English-speaking audience, then please use the link to the translation: Svyatoslav Razmyslov. Analyzing the Code of ROOT, Scientific Data Analysis Framework .



All Articles