CMake is a cross-platform automation system for building projects. This system is much older than the PVS-Studio static code analyzer, while no one has yet tried to apply it to the code and review errors. It turns out that there are many mistakes. CMake's audience is huge. On it, new projects begin and old ones are transferred. It’s scary to imagine how many programmers might have this or that error.
Introduction
CMake (from the English cross-platform make) is a cross-platform automation system for building software from source code. CMake does not directly build, but only generates assembly control files from CMakeLists.txt files. The first release of the program took place in 2000. For comparison, the
PVS-Studio static analyzer appeared only in 2008. Then it was focused on finding errors when porting programs from 32-bit systems to 64-bit ones, and in 2010 the first set of general-purpose diagnostics (
V501 -
V545 ) appeared. By the way, there are several warnings from this first set on the CMake code.
Unforgivable Mistakes
V1040 Possible typo in the spelling of a pre-defined macro name. The '__MINGW32_' macro is similar to '__MINGW32__'. winapi.h 4112
#if !defined(__UNICODE_STRING_DEFINED) && defined(__MINGW32_) #define __UNICODE_STRING_DEFINED #endif
V1040 diagnostics
have only recently been implemented. At the time of publication of the article, most likely, there will not be a release with it, but with the help of this diagnostics we have already managed to find a tough mistake.
Here they made a typo in the name
__MINGW32_ . In the end, one underscore is missing. If you search by code with this name, you can make sure that the project actually uses the version with two underscores on both sides:
V531 It is odd that a sizeof () operator is multiplied by sizeof (). cmGlobalVisualStudioGenerator.cxx 558
bool IsVisualStudioMacrosFileRegistered(const std::string& macrosFile, const std::string& regKeyBase, std::string& nextAvailableSubKeyName) { .... if (ERROR_SUCCESS == result) { wchar_t subkeyname[256];
When the array is statically declared, the
sizeof operator will calculate its size in bytes, taking into account both the number of elements and the size of the elements. When calculating the value of the variable
cch_subkeyname, the programmer did not take this into account and received a value 4 times larger than planned. Let us explain where it is "4 times."
The array and its incorrect size are passed to the
RegEnumKeyExW function:
LSTATUS RegEnumKeyExW( HKEY hKey, DWORD dwIndex, LPWSTR lpName,
The pointer
lpcchName must point to a variable containing the size of the buffer specified in characters: "A pointer to a variable that specifies the size of the buffer specified by the
lpClass parameter, in characters". The size of the
subkeyname array is 512 bytes and it is capable of storing 256 characters of the
wchar_t type (in Windows, wchar_t is 2 bytes). This value is 256 and should be passed to the function. Instead, 512 is multiplied by 2 again to get 1024.
I think how to fix the error is now clear. Instead of multiplication, use division:
DWORD cch_subkeyname = sizeof(subkeyname) / sizeof(subkeyname[0]);
By the way, exactly the same error occurs when calculating the value of the
cch_keyclass variable.
The described error can potentially lead to a buffer overflow. It is necessary to fix all such places:
- V531 It is odd that a sizeof () operator is multiplied by sizeof (). cmGlobalVisualStudioGenerator.cxx 556
- V531 It is odd that a sizeof () operator is multiplied by sizeof (). cmGlobalVisualStudioGenerator.cxx 572
- V531 It is odd that a sizeof () operator is multiplied by sizeof (). cmGlobalVisualStudioGenerator.cxx 621
- V531 It is odd that a sizeof () operator is multiplied by sizeof (). cmGlobalVisualStudioGenerator.cxx 622
- V531 It is odd that a sizeof () operator is multiplied by sizeof (). cmGlobalVisualStudioGenerator.cxx 649
V595 The 'this-> BuildFileStream' pointer was utilized before it was verified against nullptr. Check lines: 133, 134. cmMakefileTargetGenerator.cxx 133
void cmMakefileTargetGenerator::CreateRuleFile() { .... this->BuildFileStream->SetCopyIfDifferent(true); if (!this->BuildFileStream) { return; } .... }
The
this-> BuildFileStream pointer is dereferenced just before the validation check. Was this really causing any problems? Below is another example of such a place. It is made directly under the carbon paper. But in fact,
there are a lot of
V595 warnings and most of them are not so obvious. From experience, I can say that correcting warnings of this diagnosis is the longest.
- V595 The 'this-> FlagFileStream' pointer was utilized before it was verified against nullptr. Check lines: 303, 304. cmMakefileTargetGenerator.cxx 303
V614 Uninitialized pointer 'str' used. cmVSSetupHelper.h 80
class SmartBSTR { public: SmartBSTR() { str = NULL; } SmartBSTR(const SmartBSTR& src) { if (src.str != NULL) { str = ::SysAllocStringByteLen((char*)str, ::SysStringByteLen(str)); } else { str = ::SysAllocStringByteLen(NULL, 0); } } .... private: BSTR str; };
The analyzer detected the use of the uninitialized pointer
str . And this arose because of the usual typo. When calling the
SysAllocStringByteLen function
, you had to use the
src.str pointer.
V557 Array overrun is possible. The value of 'lensymbol' index could reach 28. archive_read_support_format_rar.c 2749
static int64_t expand(struct archive_read *a, int64_t end) { .... if ((lensymbol = read_next_symbol(a, &rar->lengthcode)) < 0) goto bad_data; if (lensymbol > (int)(sizeof(lengthbases)/sizeof(lengthbases[0]))) goto bad_data; if (lensymbol > (int)(sizeof(lengthbits)/sizeof(lengthbits[0]))) goto bad_data; len = lengthbases[lensymbol] + 2; if (lengthbits[lensymbol] > 0) { if (!rar_br_read_ahead(a, br, lengthbits[lensymbol])) goto truncated_data; len += rar_br_bits(br, lengthbits[lensymbol]); rar_br_consume(br, lengthbits[lensymbol]); } .... }
Several problems were found in this piece of code. When accessing the arrays of
lengthbases and
lengthbits, it is possible to go beyond the boundary of the array, because above the code, the developers wrote the operator '>' instead of '> ='. Such a check began to skip one invalid value. We are faced with a classic error pattern called
Off-by-one Error .
The whole list of places to access arrays by invalid index:
- V557 Array overrun is possible. The value of 'lensymbol' index could reach 28. archive_read_support_format_rar.c 2750
- V557 Array overrun is possible. The value of 'lensymbol' index could reach 28. archive_read_support_format_rar.c 2751
- V557 Array overrun is possible. The value of 'lensymbol' index could reach 28. archive_read_support_format_rar.c 2753
- V557 Array overrun is possible. The value of 'lensymbol' index could reach 28. archive_read_support_format_rar.c 2754
- V557 Array overrun is possible. The value of 'offssymbol' index could reach 60. archive_read_support_format_rar.c 2797
Memory leak
V773 The function was exited without releasing the 'testRun' pointer. A memory leak is possible. cmCTestMultiProcessHandler.cxx 193
void cmCTestMultiProcessHandler::FinishTestProcess(cmCTestRunTest* runner, bool started) { .... delete runner; if (started) { this->StartNextTests(); } } bool cmCTestMultiProcessHandler::StartTestProcess(int test) { .... cmCTestRunTest* testRun = new cmCTestRunTest(*this);
The analyzer detected a memory leak. Memory by pointer
testRun is not freed if function
testRun-> StartTest returns
true . When another branch of code is executed, the memory using the
testRun pointer
is freed in the
this-> FinishTestProcess function .
Resource leak
V773 The function was exited without closing the file referenced by the 'fd' handle. A resource leak is possible. rhash.c 450
RHASH_API int rhash_file(....) { FILE* fd; rhash ctx; int res; hash_id &= RHASH_ALL_HASHES; if (hash_id == 0) { errno = EINVAL; return -1; } if ((fd = fopen(filepath, "rb")) == NULL) return -1; if ((ctx = rhash_init(hash_id)) == NULL) return -1;
Strange logic in conditions
V590 Consider inspecting the '* s! =' \ 0 '&& * s ==' '' expression. The expression is excessive or contains a misprint. archive_cmdline.c 76
static ssize_t get_argument(struct archive_string *as, const char *p) { const char *s = p; archive_string_empty(as); while (*s != '\0' && *s == ' ') s++; .... }
Comparing the
* s character with a terminal zero is superfluous. The condition of the
while loop depends only on whether the character is equal to a space or not. This is not a mistake, but an extra complication of the code.
V592 The expression was enclosed by parentheses twice: ((expression)). One pair of parentheses is unnecessary or misprint is present. cmCTestTestHandler.cxx 899
void cmCTestTestHandler::ComputeTestListForRerunFailed() { this->ExpandTestsToRunInformationForRerunFailed(); ListOfTests finalList; int cnt = 0; for (cmCTestTestProperties& tp : this->TestList) { cnt++;
The analyzer warns that perhaps negation should be put out of brackets. It seems that there is no such mistake here - just extra double brackets. But, most likely, there is a logical error in this condition.
The
continue statement is executed if the list of tests
this-> TestsToRun is not empty and
cnt is absent in it. It is logical to assume that if the test list is empty, then the same action must be performed. Most likely, the condition should be like this:
if (this->TestsToRun.empty() || std::find(this->TestsToRun.begin(), this->TestsToRun.end(), cnt) == this->TestsToRun.end()) { continue; }
V592 The expression was enclosed by parentheses twice: ((expression)). One pair of parentheses is unnecessary or misprint is present. cmMessageCommand.cxx 73
bool cmMessageCommand::InitialPass(std::vector<std::string> const& args, cmExecutionStatus&) { .... } else if (*i == "DEPRECATION") { if (this->Makefile->IsOn("CMAKE_ERROR_DEPRECATED")) { fatal = true; type = MessageType::DEPRECATION_ERROR; level = cmake::LogLevel::LOG_ERROR; } else if ((!this->Makefile->IsSet("CMAKE_WARN_DEPRECATED") || this->Makefile->IsOn("CMAKE_WARN_DEPRECATED"))) { type = MessageType::DEPRECATION_WARNING; level = cmake::LogLevel::LOG_WARNING; } else { return true; } ++i; } .... }
A similar example, but here I am more confident in the presence of an error. The
IsSet function
("CMAKE_WARN_DEPRECATED") verifies that the
CMAKE_WARN_DEPRECATED value is set globally, and the
IsOn function
("CMAKE_WARN_DEPRECATED") verifies that the value is specified in the project configuration. Most likely, the negation operator is superfluous, because in both cases, it is correct to set the same
type and
level values.
V728 An excessive check can be simplified. The '(A &&! B) || (! A && B) 'expression is equivalent to the' bool (A)! = Bool (B) 'expression. cmCTestRunTest.cxx 151
bool cmCTestRunTest::EndTest(size_t completed, size_t total, bool started) { .... } else if ((success && !this->TestProperties->WillFail) || (!success && this->TestProperties->WillFail)) { this->TestResult.Status = cmCTestTestHandler::COMPLETED; outputStream << " Passed "; } .... }
Such code can be greatly simplified by rewriting the conditional expression this way:
} else if (success != this->TestProperties->WillFail) { this->TestResult.Status = cmCTestTestHandler::COMPLETED; outputStream << " Passed "; }
Some more places you can simplify:
- V728 An excessive check can be simplified. The '(A && B) || (! A &&! B) 'expression is equivalent to the' bool (A) == bool (B) 'expression. cmCTestTestHandler.cxx 702
- V728 An excessive check can be simplified. The '(A &&! B) || (! A && B) 'expression is equivalent to the' bool (A)! = Bool (B) 'expression. digest_sspi.c 443
- V728 An excessive check can be simplified. The '(A &&! B) || (! A && B) 'expression is equivalent to the' bool (A)! = Bool (B) 'expression. tcp.c 1295
- V728 An excessive check can be simplified. The '(A &&! B) || (! A && B) 'expression is equivalent to the' bool (A)! = Bool (B) 'expression. testDynamicLoader.cxx 58
- V728 An excessive check can be simplified. The '(A &&! B) || (! A && B) 'expression is equivalent to the' bool (A)! = Bool (B) 'expression. testDynamicLoader.cxx 65
- V728 An excessive check can be simplified. The '(A &&! B) || (! A && B) 'expression is equivalent to the' bool (A)! = Bool (B) 'expression. testDynamicLoader.cxx 72
Miscellaneous warnings
V523 The 'then' statement is equivalent to the subsequent code fragment. archive_read_support_format_ar.c 415
static int _ar_read_header(struct archive_read *a, struct archive_entry *entry, struct ar *ar, const char *h, size_t *unconsumed) { .... if (strcmp(filename, "__.SYMDEF") == 0) { archive_entry_copy_pathname(entry, filename); return (ar_parse_common_header(ar, entry, h)); } archive_entry_copy_pathname(entry, filename); return (ar_parse_common_header(ar, entry, h)); }
The expression in the last condition is identical to the last two lines of the function. This code can be simplified by removing the condition, or there is an error in the code and should be fixed.
V535 The variable 'i' is being used for this loop and for the outer loop. Check lines: 2220, 2241. multi.c 2241
static CURLMcode singlesocket(struct Curl_multi *multi, struct Curl_easy *data) { .... for(i = 0; (i< MAX_SOCKSPEREASYHANDLE) &&
The variable
i is used as a loop counter in the outer and nested loops. In this case, the counter value again starts counting from zero in the enclosed one. This may not be a mistake here, but the code is suspicious.
V519 The 'tagString' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 84, 86. cmCPackLog.cxx 86
oid cmCPackLog::Log(int tag, const char* file, int line, const char* msg, size_t length) { .... if (tag & LOG_OUTPUT) { output = true; display = true; if (needTagString) { if (!tagString.empty()) { tagString += ","; } tagString = "VERBOSE"; } } if (tag & LOG_WARNING) { warning = true; display = true; if (needTagString) { if (!tagString.empty()) { tagString += ","; } tagString = "WARNING"; } } .... }
The
tagString variable is
frayed by the new value in all places. It’s hard to say what the mistake was, or why they did it. Perhaps the operators '=' and '+ =' were confused.
The whole list of such places:
- V519 The 'tagString' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 94, 96. cmCPackLog.cxx 96
- V519 The 'tagString' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 104, 106. cmCPackLog.cxx 106
- V519 The 'tagString' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 114, 116. cmCPackLog.cxx 116
- V519 The 'tagString' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 125, 127. cmCPackLog.cxx 127
V519 The 'aes-> aes_set' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 4052, 4054. archive_string.c 4054
int archive_mstring_copy_utf8(struct archive_mstring *aes, const char *utf8) { if (utf8 == NULL) { aes->aes_set = 0;
Forcing AES_SET_UTF8 looks suspicious. I think such a code will mislead any developer who is faced with the refinement of this place.
This code was copied to one more place:
- V519 The 'aes-> aes_set' variable is assigned values twice successively. Perhaps this is a mistake. Check lines: 4066, 4068. archive_string.c 4068
How to find errors in a project on CMake
In this section I will tell you a little how to easily and easily check projects on CMake using PVS-Studio.
Windows / Visual Studio
For Visual Studio, you can generate the project file using the CMake GUI or the following command:
cmake -G "Visual Studio 15 2017 Win64" ..
Next, you can open the .sln file and test the project using the
plug-in for Visual Studio.
Linux / macOS
On these systems, the compile_commands.json file is used to verify the project. By the way, it can be generated in different assembly systems. In CMake, this is done like this:
cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=On ..
It remains to run the analyzer in the directory with the .json file:
pvs-studio-analyzer analyze -l /path/to/PVS-Studio.lic -o /path/to/project.log -e /path/to/exclude-path -j<N>
We also developed a module for CMake projects. Some people like to use it. The CMake module and examples of its use can be found in our repository on GitHub:
pvs-studio-cmake-examples .
Conclusion
The huge audience of CMake users is a good tester of the project, but many problems could not have been prevented before the release, using such static code analysis tools as
PVS-Studio .
If you liked the results of the analyzer, but your project is not written in C and C ++, I want to remind you that the analyzer also supports analysis of projects in C # and Java. You can try the analyzer on your project by going to
this page.
If you want to share this article with an English-speaking audience, then please use the link to the translation: Svyatoslav Razmyslov.
CMake: the Case when the Project's Quality is Unforgivable .