Configuration files: the libconfig library and detecting unused settings

Introduction



Large applications use config files to pass settings around. It often happens that, as features are edited and removed, the application code and the stored settings drift out of sync. Put simply, the config accumulates data that will never be used again. Ideally, such settings should be tracked and either marked as deprecated or removed entirely.







Problem



For historical reasons, most of the configs in our project were written in a format that looks like a mixture of JSON and YAML and were parsed with the libconfig library. We had no desire to rewrite the corresponding code and the config contents in, say, YAML, especially with plenty of other, more interesting and more complex tasks waiting. Besides, the part of the library written in C is good in its own right: it is stable and feature-rich (the C++ wrapper is a different story).







One fine day we decided to find out how much garbage had accumulated in our configuration files. Unfortunately, libconfig offered no such feature. At first we tried forking the project on GitHub and changing the part written in C++ (all the lookup and operator[] methods) so that the visited flag would be set on a node automatically. But that would have produced a very large patch, whose acceptance would probably have dragged on. So we chose to write our own C++ wrapper instead, without touching the core of libconfig.







From the user's point of view, the result looks like this:







    #include <variti/util/config.hpp>

    #include <iostream>
    #include <cassert>

    int main(int argc, char* argv[])
    {
        using namespace variti;
        using namespace variti::util;

        // Exactly one argument is expected: the path to the config file.
        assert(argc == 2);

        // The callback is invoked for the leaf nodes when conf is destroyed.
        config conf(
            [](const config_setting& st) {
                if (!st.visited())
                    std::cerr << "config not visited: " << st.path() << "\n";
            });

        conf.load(argv[1]);

        auto root = conf.root();
        root["module"]["name"].to_string();

        return 0;
    }
      
      





    laptop :: work/configpp/example ‹master*› % cat config_example1.conf
    version = "1.0";

    module:
    {
        name = "module1";
        submodules = (
            {
                name = "submodule1";
            },
            {
                name = "submodule2";
            }
        );
    };

    laptop :: work/configpp/example ‹master*› % ./config-example1 config_example1.conf
    config not visited: root.module.submodules.0.name
    config not visited: root.module.submodules.1.name
      
      





In the sample code we accessed the module.name setting. The settings module.submodules.0.name and module.submodules.1.name were never accessed, and that is exactly what the log reports.







Wrapper



How can this be implemented if libconfig has no visited flag or anything like it? The library's developers thought ahead and made it possible to attach a hook to a config_setting_t node: it is set with the config_setting_set_hook function and read with config_setting_get_hook.







We define this hook as follows:







    struct config_setting_hook
    {
        bool visited{false};
    };
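
To illustrate the idea of an opaque per-node hook that the owner releases through a registered destructor, here is a minimal sketch. It does not use libconfig at all: toy_node, toy_config, toy_set_hook, toy_get_hook, toy_destroy and hook_round_trip are hypothetical names invented for this example, and only the shape of config_setting_hook is taken from the article.

```cpp
#include <cassert>

// Hypothetical stand-in for a library node: it only carries an opaque hook.
struct toy_node {
    void* hook = nullptr;
};

// Hypothetical stand-in for the owning config object.
struct toy_config {
    toy_node root;
    void (*destructor)(void*) = nullptr; // called for a non-null hook on teardown
};

// Same shape as the hook from the article.
struct config_setting_hook {
    bool visited{false};
};

inline void toy_set_hook(toy_node& n, void* h) { n.hook = h; }
inline void* toy_get_hook(const toy_node& n) { return n.hook; }

// Releases the hook through the registered destructor, as an owner would on teardown.
inline void toy_destroy(toy_config& c) {
    if (c.destructor && c.root.hook)
        c.destructor(c.root.hook);
    c.root.hook = nullptr;
}

// Attaches a hook lazily, sets the flag, and checks that it can be read back.
inline bool hook_round_trip() {
    toy_config cfg;
    // A captureless lambda converts to a plain function pointer.
    cfg.destructor = [](void* p) { delete static_cast<config_setting_hook*>(p); };

    if (!toy_get_hook(cfg.root))
        toy_set_hook(cfg.root, new config_setting_hook());

    static_cast<config_setting_hook*>(toy_get_hook(cfg.root))->visited = true;
    const bool seen = static_cast<config_setting_hook*>(toy_get_hook(cfg.root))->visited;

    toy_destroy(cfg); // the hook is released exactly once
    return seen && cfg.root.hook == nullptr;
}
```

The point of the pattern is that the library never needs to know the hook's type: it stores a void pointer and hands it back to a user-supplied function when the node goes away.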
      
      





There are two main structures inside libconfig: config_t and config_setting_t. The first gives access to the config as a whole and returns a pointer to the root config_setting_t node; the second gives access to the parent and child nodes, as well as to the value stored in the current node.







We wrap both structures in corresponding handle classes.







Handle around config_t:







    using config_notify = std::function<void(const config_setting&)>;

    struct config : boost::noncopyable
    {
        config(config_notify n = nullptr);
        ~config();

        void load(const std::string& filename);

        config_setting root() const;

        config_notify n;
        config_t* h;
    };
      
      





Note that the config constructor takes a function, which is called from the destructor for each leaf node as the tree is traversed. The example above shows how it can be used.
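
The "report from the destructor" idea can be shown in isolation. The sketch below is not the article's config class: toy_registry, declare, touch and report_on_scope_exit are hypothetical names, and a flat map stands in for the libconfig tree. It only shows a callback passed to the constructor firing once, when the object goes out of scope.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical flat registry of settings; the article walks a libconfig tree instead.
class toy_registry {
public:
    using notify = std::function<void(const std::string& path, bool visited)>;

    explicit toy_registry(notify n = nullptr) : n_(std::move(n)) {}

    // Reports every known setting exactly once, when the registry is destroyed.
    ~toy_registry() {
        if (!n_)
            return;
        for (const auto& kv : seen_)
            n_(kv.first, kv.second);
    }

    toy_registry(const toy_registry&) = delete;
    toy_registry& operator=(const toy_registry&) = delete;

    void declare(const std::string& path) { seen_.emplace(path, false); }
    void touch(const std::string& path) { seen_[path] = true; }

private:
    notify n_;
    std::map<std::string, bool> seen_;
};

// Collects the settings that were declared but never touched.
inline std::vector<std::string> report_on_scope_exit() {
    std::vector<std::string> unused;
    {
        toy_registry reg([&unused](const std::string& path, bool visited) {
            if (!visited)
                unused.push_back(path);
        });
        reg.declare("module.name");
        reg.declare("module.submodules.0.name");
        reg.touch("module.name");
    } // the destructor runs here and fills `unused`
    return unused;
}
```

One design consequence worth keeping in mind: destructors are implicitly noexcept, so a callback that throws would terminate the program. A callback used this way should only log or collect data.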







Handle around config_setting_t:







    struct config_setting : boost::noncopyable
    {
        config_setting(config_setting_t* h, bool visit = false);
        ~config_setting();

        bool to_bool() const;
        std::int32_t to_int32() const;
        std::int64_t to_int64() const;
        double to_double() const;
        std::string to_string() const;

        bool is_bool() const;
        bool is_int32() const;
        bool is_int64() const;
        bool is_double() const;
        bool is_string() const;
        bool is_group() const;
        bool is_array() const;
        bool is_list() const;
        bool is_scalar() const;
        bool is_root() const;

        std::string path() const;
        std::size_t size() const;

        bool exists(const std::string& name) const;

        config_setting parent() const;
        config_setting lookup(const std::string& name, bool visit = false) const;
        config_setting lookup(std::size_t indx, bool visit = false) const;
        config_setting operator[](const std::string& name) const;
        config_setting operator[](std::size_t indx) const;

        std::string filename() const;
        std::size_t fileline() const;

        bool visited() const;

        config_setting_t* h;
    };
      
      





The main magic lies in the lookup methods. Whether a node gets marked as visited is controlled by the last argument, visit, which defaults to false. You are free to pass this value yourself. But since nodes are most often accessed through operator[], it calls lookup with visit set to true. As a result, the nodes you reach through operator[] are automatically marked as visited. Moreover, the whole chain of nodes from the current one up towards the root is marked as visited as well.
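
The difference between a plain lookup and operator[] can be sketched on a toy tree without any libconfig dependency. Everything here (toy_setting, add, visit_up as a member, demo_visit_chain) is a hypothetical illustration of the marking rule described above, not the wrapper's actual code.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical toy tree; names are illustrative and not part of libconfig.
struct toy_setting {
    toy_setting* parent = nullptr;
    bool visited = false;
    std::map<std::string, std::unique_ptr<toy_setting>> children;

    // Adds a child node and returns a reference to it.
    toy_setting& add(const std::string& name) {
        auto& slot = children[name];
        slot = std::make_unique<toy_setting>();
        slot->parent = this;
        return *slot;
    }

    // Finds a child; marks the chain towards the root only when visit is true.
    toy_setting& lookup(const std::string& name, bool visit = false) {
        auto it = children.find(name);
        if (it == children.end())
            throw std::out_of_range("setting not found: " + name);
        if (visit)
            it->second->visit_up();
        return *it->second;
    }

    // operator[] is the common access path, so it always marks nodes as visited.
    toy_setting& operator[](const std::string& name) { return lookup(name, true); }

    // Walks towards the root; stops at the root itself or at an already visited node.
    void visit_up() {
        for (toy_setting* n = this; n->parent != nullptr && !n->visited; n = n->parent)
            n->visited = true;
    }
};

// Accesses module.name via operator[] and module.submodules via plain lookup.
inline bool demo_visit_chain() {
    toy_setting root;
    toy_setting& module = root.add("module");
    toy_setting& name = module.add("name");
    toy_setting& subs = module.add("submodules");

    (void)root["module"]["name"];                     // marks module and name
    (void)root.lookup("module").lookup("submodules"); // leaves the flags untouched

    return module.visited && name.visited && !subs.visited;
}
```

Stopping the upward walk at the first already visited ancestor keeps repeated accesses cheap: once a parent is marked, everything above it is marked too.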







Let's move on to the implementation. Here it is in full for the config class:







    config::config(config_notify n)
        : n(n)
    {
        h = (config_t*)malloc(sizeof(config_t));
        config_init(h);

        // libconfig calls this for every node hook when the config is destroyed.
        config_set_destructor(h, [](void* p) {
            delete reinterpret_cast<config_setting_hook*>(p);
        });
    }

    config::~config()
    {
        // Report the leaf nodes before the tree itself is torn down.
        if (n)
            for_each(root(), n);

        config_destroy(h);
        free(h);
    }

    void config::load(const std::string& filename)
    {
        if (!config_read_file(h, filename.c_str()))
            throw std::runtime_error(std::string("config read file error: ") + filename);
    }

    config_setting config::root() const
    {
        return config_setting(config_root_setting(h));
    }
      
      





And partially for config_setting:







    config_setting::config_setting(config_setting_t* h, bool visit)
        : h(h)
    {
        assert(h);

        // Attach a hook lazily the first time the node is wrapped.
        if (!config_setting_get_hook(h))
            hook(h, new config_setting_hook());

        if (visit)
            visit_up(h);
    }

    config_setting::~config_setting()
    {
        h = nullptr;
    }

    std::size_t config_setting::size() const
    {
        return config_setting_length(h);
    }

    config_setting config_setting::parent() const
    {
        return config_setting(config_setting_parent(h));
    }

    bool config_setting::exists(const std::string& name) const
    {
        if (!is_group())
            return false;

        return config_setting_get_member(h, name.c_str()) != nullptr;
    }

    config_setting config_setting::lookup(const std::string& name, bool visit) const
    {
        assert(is_group());

        auto p = config_setting_get_member(h, name.c_str());
        if (!p)
            throw_not_found(*this);

        return config_setting(p, visit);
    }

    config_setting config_setting::lookup(std::size_t indx, bool visit) const
    {
        assert(is_group() || is_array() || is_list());

        auto p = config_setting_get_elem(h, indx);
        if (!p)
            throw_not_found(*this);

        return config_setting(p, visit);
    }

    config_setting config_setting::operator[](const std::string& name) const
    {
        return lookup(name, true);
    }

    config_setting config_setting::operator[](std::size_t indx) const
    {
        return lookup(indx, true);
    }

    bool config_setting::visited() const
    {
        // The root node and the version setting are always treated as visited.
        // The root check is an exact match: a prefix match on "root" would make
        // every node count as visited, which contradicts the example output above.
        return path() == "root"
            || boost::algorithm::starts_with(path(), "root.version")
            || hook(h)->visited;
    }
      
      





The helpers for working with the hook deserve a separate look:







    void hook(config_setting_t* h, config_setting_hook* k)
    {
        config_setting_set_hook(h, k);
    }

    config_setting_hook* hook(config_setting_t* h)
    {
        return reinterpret_cast<config_setting_hook*>(config_setting_get_hook(h));
    }

    void visit_up(config_setting_t* h)
    {
        // Stop at the root or at the first ancestor that is already marked.
        for (; !config_setting_is_root(h) && !hook(h)->visited; h = config_setting_parent(h))
            hook(h)->visited = true;
    }
      
      





And a helper for traversing the leaf nodes:







    template <typename F>
    void for_each(const config_setting& st, F f)
    {
        // Nodes with children are descended into; nodes without children are reported.
        if (st.size())
            for (std::size_t i = 0; i < st.size(); ++i)
                for_each(st.lookup(i), f);
        else
            f(st);
    }
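
To see how a leaf-only traversal produces a report like the one in the example session, here is a self-contained sketch on a toy tree. The names toy_leaf_node, for_each_leaf and unvisited_leaves are hypothetical, and the visited flags are set by hand rather than through lookups, so this only illustrates the traversal, not the wrapper.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy node; in the article this role is played by config_setting.
struct toy_leaf_node {
    std::string name;
    bool visited;
    std::vector<toy_leaf_node> children;
};

// Calls f(path, node) for every node without children, depth first.
template <typename F>
void for_each_leaf(const toy_leaf_node& n, const std::string& prefix, F f) {
    const std::string path = prefix.empty() ? n.name : prefix + "." + n.name;
    if (n.children.empty()) {
        f(path, n);
        return;
    }
    for (const auto& child : n.children)
        for_each_leaf(child, path, f);
}

// Builds a tree shaped like the example config and collects the unvisited leaves.
// The flags are hard-coded: version and module.name count as visited here.
inline std::vector<std::string> unvisited_leaves() {
    toy_leaf_node root{"root", true, {
        {"version", true, {}},
        {"module", true, {
            {"name", true, {}},
            {"submodules", false, {
                {"0", false, {{"name", false, {}}}},
                {"1", false, {{"name", false, {}}}}
            }}
        }}
    }};

    std::vector<std::string> out;
    for_each_leaf(root, "", [&out](const std::string& path, const toy_leaf_node& n) {
        if (!n.visited)
            out.push_back(path);
    });
    return out;
}
```

Reporting only leaves keeps the output short: an unvisited group shows up through its unvisited children rather than as a separate entry.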
      
      





Conclusion



The result, in our opinion, is clean and more flexible code. We have not given up on the idea of making similar changes to the original libconfig library, or more precisely to its C++ interface. A pull request is in the works, but in the meantime we are already using the wrapper to clean unused settings out of our configs.







Appendix



Check out the source code here!







