Cron on Linux: History, Usage, and Internals



A classic once wrote that happy people do not watch the clock. In those wild times there were neither programmers nor Unix, but today programmers know very well that cron will watch the clock for them.







Command-line utilities are both a weakness of mine and a daily routine. sed, awk, wc, cut and other old programs are run by scripts on our servers every day. Many of those scripts are set up as tasks for cron, a scheduler that dates back to the 1970s.







For a long time I used cron superficially, without going into details, but one day, faced with an error when running a script, I decided to get to the bottom of it. That is how this article came about: while writing it I got acquainted with POSIX crontab, the main cron variants in popular Linux distributions, and the internals of some of them.







Do you use Linux and run tasks with cron? Are you interested in the architecture of Unix system applications? Then this article is for you!












Origin of species



Periodic execution of user or system programs is an obvious need in any operating system. So programmers realized long ago that they needed a service for scheduling and running tasks centrally.







Unix-like operating systems trace their lineage to Version 7 Unix, developed at Bell Labs in the 1970s by a team that included the famous Ken Thompson. Version 7 Unix also shipped with cron, a service for regularly running superuser tasks.







A typical modern cron is a simple program, but the algorithm of the original version was simpler still: the service woke up once a minute, read the task table from a single file (/usr/lib/crontab), and ran, as the superuser, the tasks scheduled for the current minute.







Later, improved versions of this simple and useful service shipped with all Unix-like operating systems.







In 1992, generalized descriptions of the crontab format and of the utility's basic principles were included in POSIX, the main standard for Unix-like operating systems, and so cron went from a de facto standard to a de jure one.







In 1987, Paul Vixie, having polled Unix users about what they wanted from cron, released another version of the daemon that fixed some problems of traditional cron and extended the syntax of the table files.







By its third version, Vixie cron met the POSIX requirements. In addition, the program had a liberal license, or rather no license at all apart from the wishes stated in the README: the author gives no guarantees, the author's name may not be removed, and the program may be sold only together with its source code. These requirements turned out to be compatible with the principles of free software, which was gaining popularity in those years, so some of the key Linux distributions that appeared in the early 1990s adopted Vixie cron as their system cron and still maintain it today.







In particular, Red Hat and SUSE develop cronie, a fork of Vixie cron, while Debian and Ubuntu use the original Vixie cron with many patches.







First we will look at the crontab user utility described in POSIX, then go over the syntax extensions introduced in Vixie cron and how Vixie cron variants are used in popular Linux distributions. Finally, the cherry on top: a walk through the internals of the cron daemon.







POSIX crontab



Whereas the original cron always ran as the superuser, modern schedulers more often deal with the tasks of ordinary users, which is safer and more convenient.







Cron implementations ship as a set of two programs: the constantly running cron daemon and the crontab utility available to users. The latter lets each user edit their own task table, while the daemon runs tasks from the user and system tables.







The POSIX standard does not describe the behavior of the daemon at all; only the crontab user program is formalized. The existence of a mechanism for running user tasks is implied, of course, but not described in detail.







There are four things you can do with the crontab utility: edit the user task table in the editor, load the table from the file, show the current task table, and clear the task table. Examples of the crontab utility:







crontab -e                     # edit the task table
crontab -l                     # show the task table
crontab -r                     # remove the task table
crontab path/to/file.crontab   # load the task table from a file
      
      





When crontab -e is called, the editor specified in the standard EDITOR environment variable is used.







The tasks themselves are described in the following format:







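# comment lines are ignored
#
# task run every minute
* * * * * /path/to/exec -a -b -c
# task run at the 10th minute of every hour
10 * * * * /path/to/exec -a -b -c
# task run at the 10th minute of the second hour of every day, with standard output redirected to a file
10 2 * * * /path/to/exec -a -b -c > /tmp/cron-job-output.log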
      
      





The first five fields of a record are: minutes [0..59], hours [0..23], days of the month [1..31], months [1..12], and days of the week [0..6], where 0 is Sunday. The last, sixth field is a string that will be executed by the standard command interpreter.
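The field limits can be written down as a small lookup table. The following is only a sketch of a range check using the limits POSIX gives for crontab; the struct and function names are invented for illustration and do not come from any cron implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* One row per time field, in crontab order. The names are illustrative. */
struct field_range {
    const char *name;
    int min;
    int max;
};

static const struct field_range FIELD_RANGES[] = {
    { "minute",       0, 59 },
    { "hour",         0, 23 },
    { "day of month", 1, 31 },
    { "month",        1, 12 },
    { "day of week",  0,  6 },  /* 0 is Sunday */
};

#define FIELD_COUNT (sizeof FIELD_RANGES / sizeof FIELD_RANGES[0])

/* Returns true when value is legal for the field at position index (0..4). */
bool field_value_valid(size_t index, int value)
{
    if (index >= FIELD_COUNT)
        return false;
    return value >= FIELD_RANGES[index].min &&
           value <= FIELD_RANGES[index].max;
}
```

With this table, field_value_valid(0, 59) accepts the last minute of an hour, while field_value_valid(0, 60) rejects a minute that does not exist.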







In the first five fields, values can be listed separated by commas:







# task run at the first and tenth minute of every hour
1,10 * * * * /path/to/exec -a -b -c
      
      





Or given as a range with a hyphen:







# task run every minute during the first ten minutes of every hour
0-9 * * * * /path/to/exec -a -b -c
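To make the list and range rules concrete, here is a rough sketch of how a single field could be tested against a value. The function name field_matches is hypothetical, and the code is not taken from any cron source. It handles only "*", plain numbers, comma lists and hyphen ranges.

```c
#include <stdbool.h>
#include <stdlib.h>

/*
 * Returns true when value is selected by a field such as "*", "5",
 * "1,10" or "0-9". This is a sketch: it stops at the first matching
 * element and does not fully validate what follows it.
 */
bool field_matches(const char *field, int value)
{
    const char *p = field;

    if (p[0] == '*' && p[1] == '\0')
        return true;

    while (*p != '\0') {
        char *end;
        long first = strtol(p, &end, 10);
        long last = first;

        if (end == p)
            return false;           /* no digits where a number was expected */
        p = end;

        if (*p == '-') {            /* range: first-last */
            p++;
            last = strtol(p, &end, 10);
            if (end == p)
                return false;
            p = end;
        }

        if (value >= first && value <= last)
            return true;

        if (*p == ',')
            p++;                    /* move on to the next list element */
        else if (*p != '\0')
            return false;           /* unexpected character */
    }
    return false;
}
```

For example, field_matches("1,10", 10) and field_matches("0-9", 9) both hold, while field_matches("0-9", 10) does not.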
      
      





User access to task scheduling is governed in POSIX by the files cron.allow and cron.deny, which list, respectively, the users who may use crontab and the users who may not. The standard does not specify where these files are located.
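The precedence between the two files can be sketched as a small decision function. This assumes the order given in the POSIX description of crontab: cron.allow is consulted first, cron.deny only if cron.allow is absent. The struct, the function name and the fallback parameter for the case where neither file exists are illustrative assumptions.

```c
#include <stdbool.h>

/* What is known about one user; the struct is illustrative. */
struct access_info {
    bool allow_exists;   /* cron.allow is present */
    bool in_allow;       /* the user is listed in cron.allow */
    bool deny_exists;    /* cron.deny is present */
    bool in_deny;        /* the user is listed in cron.deny */
};

/*
 * Decides whether the user may use crontab. fallback is the answer
 * used when neither file exists, which differs between systems.
 */
bool crontab_permitted(const struct access_info *a, bool fallback)
{
    if (a->allow_exists)
        return a->in_allow;     /* cron.allow wins when it exists */
    if (a->deny_exists)
        return !a->in_deny;     /* otherwise cron.deny is consulted */
    return fallback;
}
```

One consequence of this order is that an empty cron.deny with no cron.allow lets every user in.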







According to the standard, at least four environment variables must be passed to the programs being run:







  1. HOME is the user's home directory.
  2. LOGNAME is the user's login name.
  3. PATH is the path used to find the standard system utilities.
  4. SHELL is the path to the command interpreter in use.


Notably, POSIX says nothing about where the values of these variables come from.







Bestseller - Vixie cron 3.0pl1



The common ancestor of the popular cron variants is Vixie cron 3.0pl1, posted to the comp.sources.unix newsgroup in 1992. We will look at the main features of this version in more detail.







Vixie cron comes as two programs (cron and crontab). As usual, the daemon is responsible for reading and running tasks from the system task table and the task tables of individual users, and the crontab utility is responsible for editing user tables.







Task table and configuration files



The superuser's task table is located in /etc/crontab. The syntax of the system table matches the regular Vixie cron syntax, except that the sixth column gives the name of the user on whose behalf the task is run:







# run every minute as user vlad
* * * * * vlad /path/to/exec
      
      





Ordinary users' task tables are located in /var/cron/tabs/username and use the regular syntax. When a user runs the crontab utility, it is these files that get edited.







The lists of users with access to crontab are managed in the files /var/cron/allow and /var/cron/deny, where it is enough to add the user name on a separate line.







Extended syntax



Compared with POSIX crontab, Paul Vixie's solution contains several very useful modifications to the syntax of the task tables.







New table syntax became available: for example, days of the week and months can be specified by name (Mon, Tue, and so on):







# run every minute on Mondays and Tuesdays in January
* * * Jan Mon,Tue /path/to/exec
      
      





You can specify a step at which tasks are run:







# run every two minutes on Mondays and Tuesdays
*/2 * * * Mon,Tue /path/to/exec
      
      





Steps and ranges can be combined:







# run at every second minute from 0 through 10 of every hour
0-10/2 * * * * /path/to/exec
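One way to picture what a range with a step selects is to expand it into a bitmask with one bit per legal value. The sketch below does that for a single element. The function name expand_element and its signature are assumptions made for this illustration, not the parser used by Vixie cron.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Expands one element into a bitmask in which bit n is set when value n
 * is selected. Accepted forms: a star, "7", "0-10", "0-10/2", and a star
 * followed by a step such as "/2". min and max are the legal limits of
 * the field (0 and 59 for minutes); max must be below 64.
 * Returns 0 on success and -1 on malformed or out-of-range input.
 */
int expand_element(const char *spec, int min, int max, uint64_t *mask)
{
    const char *p = spec;
    char *end;
    long first, last, step = 1;

    if (*p == '*') {
        first = min;
        last = max;
        p++;
    } else {
        first = strtol(p, &end, 10);
        if (end == p)
            return -1;
        p = end;
        last = first;
        if (*p == '-') {
            p++;
            last = strtol(p, &end, 10);
            if (end == p)
                return -1;
            p = end;
        }
    }

    if (*p == '/') {
        p++;
        step = strtol(p, &end, 10);
        if (end == p || step <= 0)
            return -1;
        p = end;
    }

    if (*p != '\0' || first < min || last > max || first > last)
        return -1;

    *mask = 0;
    for (long v = first; v <= last; v += step)
        *mask |= UINT64_C(1) << v;
    return 0;
}
```

For "0-10/2" in the minute field this sets bits 0, 2, 4, 6, 8 and 10, which is exactly the set of minutes at which the task would run.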
      
      





Intuitive shortcuts that replace the five time fields are supported (reboot, yearly, annually, monthly, weekly, daily, midnight, hourly):







# run once after a reboot
@reboot /exec/on/reboot
# run once a day
@daily /exec/daily
# run once an hour
@hourly /exec/hourly
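Each shortcut stands for a fixed set of five time fields, as listed in the crontab(5) manual page of Vixie cron. The lookup table below is a sketch of that correspondence. The struct and function names are made up for the example, and @reboot has no time fields because it is tied to daemon startup rather than to a time.

```c
#include <stddef.h>
#include <string.h>

struct nickname {
    const char *name;
    const char *fields;   /* NULL: not tied to a time, runs at daemon startup */
};

static const struct nickname NICKNAMES[] = {
    { "@reboot",   NULL },
    { "@yearly",   "0 0 1 1 *" },
    { "@annually", "0 0 1 1 *" },
    { "@monthly",  "0 0 1 * *" },
    { "@weekly",   "0 0 * * 0" },
    { "@daily",    "0 0 * * *" },
    { "@midnight", "0 0 * * *" },
    { "@hourly",   "0 * * * *" },
};

/* Returns the table row for name, or NULL when the nickname is unknown. */
const struct nickname *find_nickname(const char *name)
{
    for (size_t i = 0; i < sizeof NICKNAMES / sizeof NICKNAMES[0]; i++) {
        if (strcmp(NICKNAMES[i].name, name) == 0)
            return &NICKNAMES[i];
    }
    return NULL;
}
```

So "@daily /exec/daily" behaves like "0 0 * * * /exec/daily".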
      
      





Task execution environment



Vixie cron allows you to change the environment of running applications.







The USER, LOGNAME, and HOME environment variables are not simply supplied by the daemon but taken from the passwd file. PATH is set to "/usr/bin:/bin" and SHELL to "/bin/sh". The values of all variables except LOGNAME can be changed in user tables.







Some environment variables (primarily SHELL and HOME) are used by cron itself to run the task. Here is what using bash instead of the standard sh to run user tasks might look like:







SHELL=/bin/bash
HOME=/tmp/
# exec will be run by bash in /tmp/
* * * * * /path/to/exec
      
      





Ultimately, all environment variables defined in the table (whether used by cron itself or needed by the process) are passed to the task being run.







The crontab utility uses the editor specified in the VISUAL or EDITOR environment variable to edit files. If neither variable is defined in the environment where crontab was launched, "/usr/ucb/vi" is used (ucb probably stands for the University of California, Berkeley).







cron on Debian and Ubuntu



The developers of Debian and its derivatives have released a heavily modified version of Vixie cron 3.0pl1. There are no differences in the syntax of the table files; for users it is the same Vixie cron. The biggest new features are syslog, SELinux, and PAM support.







Less visible but still tangible changes concern the location of the configuration files and task tables.







User tables in Debian are located in the /var/spool/cron/crontabs directory, and the system table is still in /etc/crontab. Task tables belonging to Debian packages are placed in /etc/cron.d, from where the cron daemon reads them automatically. User access is controlled by the /etc/cron.allow and /etc/cron.deny files.







The default shell is still /bin/sh, which in Debian is dash, a small POSIX-compatible shell, started without reading any configuration (in non-interactive mode).







In recent versions of Debian, cron itself is started by systemd, and the startup configuration can be viewed in /lib/systemd/system/cron.service. There is nothing special in the service configuration; any finer task management can be done through environment variables declared directly in each user's crontab.







cronie on Red Hat, Fedora and CentOS



cronie is a fork of Vixie cron version 4.1. As in Debian, the syntax has not changed, but support for PAM and SELinux, operation in a cluster, file tracking with inotify, and other features have been added.







The default configuration is in the usual places: the system table is in /etc/crontab, packages put their tables in /etc/cron.d, and user tables are in /var/spool/cron.







The daemon runs under systemd; the service configuration is in /lib/systemd/system/crond.service.







Red Hat-like distributions use /bin/sh by default to run tasks, and its role is played by the standard bash. Note that when cron tasks are run through /bin/sh, bash starts in POSIX-compatible mode and, operating non-interactively, does not read any additional configuration.







cronie in SLES and openSUSE



The German SLES distribution and its derivative openSUSE use the same cronie. Here too the daemon runs under systemd; the service configuration is in /usr/lib/systemd/system/cron.service. The configuration lives in /etc/crontab, /etc/cron.d, and /var/spool/cron/tabs. The same bash, running in POSIX-compatible non-interactive mode, acts as /bin/sh.







Vixie cron internals



Modern descendants of cron have not changed radically compared with Vixie cron, but they have acquired new features that are not needed to understand how the program works. Many of these extensions are implemented untidily and clutter the code. Paul Vixie's original cron source code, by contrast, is a pleasure to read.







So I decided to examine how cron works using Vixie cron 3.0pl1, the program common to both branches of cron development. I will simplify the examples by removing the ifdefs that make reading harder and omitting secondary details.







The work of the daemon can be divided into several stages:







  1. Program initialization.
  2. Collecting and updating the list of tasks to run.
  3. Running the main cron loop.
  4. Launching a task.


Let's go through them in order.







Initialization



On startup, after checking the process arguments, cron installs the SIGCHLD and SIGHUP signal handlers. The first logs the termination of a child process; the second closes the file descriptor of the log file:







signal(SIGCHLD, sigchld_handler);
signal(SIGHUP, sighup_handler);
      
      





Only one cron daemon ever runs on a system, always as the superuser and from the main cron directory. The following calls create a lock file containing the PID of the daemon process, make sure that the user is correct, and change the current directory to the main one:







acquire_daemonlock(0);
set_cron_uid();
set_cron_cwd();
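The idea of a PID lock file can be sketched in a few lines. The version below takes an exclusive advisory lock with flock(2) and writes the caller's PID into the file. The function name acquire_pid_lock, the file mode and the error handling are assumptions for illustration, not the body of acquire_daemonlock.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Opens (creating if needed) the file at path, takes an exclusive
 * non-blocking lock on it and writes the caller's PID into it.
 * Returns the open descriptor, which must stay open for as long as the
 * lock is needed, or -1 if the file is already locked or cannot be used.
 */
int acquire_pid_lock(const char *path)
{
    char buf[32];
    int len;
    int fd = open(path, O_RDWR | O_CREAT, 0644);

    if (fd < 0)
        return -1;

    if (flock(fd, LOCK_EX | LOCK_NB) < 0) {
        close(fd);              /* somebody else holds the lock */
        return -1;
    }

    len = snprintf(buf, sizeof buf, "%ld\n", (long) getpid());
    if (ftruncate(fd, 0) < 0 || write(fd, buf, (size_t) len) != len) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A second attempt to lock the same file fails while the first descriptor is open, which is what keeps a second daemon from starting. Closing the descriptor, or exiting the process, releases the lock.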
      
      





Next, the default path is set, which will be used when starting processes:







 setenv("PATH", _PATH_DEFPATH, 1);
      
      





Then the process is "daemonized": it creates a child copy of itself by calling fork and starts a new session in the child process (by calling setsid). The parent process is no longer needed, so it exits:







switch (fork()) {
case -1:
    /* could not create the child process */
    exit(0);
    break;
case 0:
    /* child process */
    (void) setsid();
    break;
default:
    /* parent process exits */
    _exit(0);
}
      
      





When the parent process terminates, its lock on the lock file is released, and the PID in the file has to be updated to that of the child. After that, the task database is filled:







/* re-acquire the lock */
acquire_daemonlock(0);

/* fill the database */
database.head = NULL;
database.tail = NULL;
database.mtime = (time_t) 0;
load_database(&database);
      
      





After that, cron enters its main loop. But before turning to it, let's look at how the task list is loaded.







Collecting and updating the task list



The load_database function is responsible for loading the task list. It checks the main system crontab and the directory with user files. If neither the files nor the directory have changed, the task list is not reread. Otherwise, a new task list is built.







Loading the system table with special user and table names:







/* if the system table file exists, read it */
if (syscron_stat.st_mtime) {
    process_crontab("root", "*system*",
                    SYSCRONTAB, &syscron_stat,
                    &new_db, old_db);
}
      
      





Loading user tables in a loop:







while (NULL != (dp = readdir(dir))) {
    char fname[MAXNAMLEN+1],
         tabname[MAXNAMLEN+1];

    /* skip files whose names start with a dot */
    if (dp->d_name[0] == '.')
        continue;

    (void) strcpy(fname, dp->d_name);
    sprintf(tabname, CRON_TAB(fname));
    process_crontab(fname, fname, tabname,
                    &statbuf, &new_db, old_db);
}
      
      





Then the old database is replaced by a new one.







In the examples above, the process_crontab call verifies that a user matching the table file name exists (unless it is the superuser) and then calls load_user. The latter reads the file itself line by line:







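while ((status = load_env(envstr, file)) >= OK) {
    switch (status) {
    case ERR:
        /* read error: discard this user's table */
        free_user(u);
        u = NULL;
        goto done;
    case FALSE:
        /* not a variable assignment: parse the line as a task */
        e = load_entry(file, NULL, pw, envp);
        if (e) {
            e->next = u->crontab;
            u->crontab = e;
        }
        break;
    case TRUE:
        /* the line was VAR=value: add it to the environment */
        envp = env_set(envp, envstr);
        break;
    }
}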
      
      





Here either an environment variable is set (lines of the form VAR=value) by the load_env / env_set functions, or a task description (* * * * * /path/to/exec) is read by the load_entry function.







The entry structure returned by load_entry is our task, which is placed on the general task list. The function itself performs a lengthy parse of the time format, but we are more interested in how the environment variables and task launch parameters are formed:







/* the user and group to run the task as come from passwd */
e->uid = pw->pw_uid;
e->gid = pw->pw_gid;

/* default shell (/bin/sh), unless the user has set another */
e->envp = env_copy(envp);
if (!env_get("SHELL", e->envp)) {
    sprintf(envstr, "SHELL=%s", _PATH_BSHELL);
    e->envp = env_set(e->envp, envstr);
}
/* home directory */
if (!env_get("HOME", e->envp)) {
    sprintf(envstr, "HOME=%s", pw->pw_dir);
    e->envp = env_set(e->envp, envstr);
}
/* path for finding programs */
if (!env_get("PATH", e->envp)) {
    sprintf(envstr, "PATH=%s", _PATH_DEFPATH);
    e->envp = env_set(e->envp, envstr);
}
/* the user name always comes from passwd */
sprintf(envstr, "%s=%s", "LOGNAME", pw->pw_name);
e->envp = env_set(e->envp, envstr);
      
      





It is this current task list that the main loop works with.







Main loop



The original cron from Version 7 Unix worked quite simply: in a loop it reread the configuration, ran the current minute's tasks as the superuser, and slept until the start of the next minute. On older machines this simple approach required too many resources.







An alternative was proposed in SysV, where the daemon slept either until the next minute for which a task was defined or for 30 minutes. This mode spent fewer resources on rereading the configuration and checking tasks, but quickly updating the task list became inconvenient.







Vixie cron went back to checking the task lists once a minute, since by the end of the 1980s standard Unix machines had far more resources:







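/* load the task list */
load_database(&database);

/* run tasks that must be run after a reboot */
run_reboot_jobs(&database);

/* set TargetTime to the start of the next minute */
cron_sync();

while (TRUE) {
    /* run queued tasks, then sleep until TargetTime */
    cron_sleep();

    /* reread the configuration */
    load_database(&database);

    /* collect the tasks for the current minute */
    cron_tick(&database);

    /* move TargetTime to the start of the next minute */
    TargetTime += 60;
}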
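The timing arithmetic behind such a loop is small enough to show on its own. The helpers below are only a sketch: the names next_minute_start and seconds_to_wait are invented, and the code assumes that time_t counts whole seconds since the epoch with minutes starting on multiples of 60.

```c
#include <time.h>

/* Returns the first instant after now that falls on a whole minute. */
time_t next_minute_start(time_t now)
{
    return now + (60 - now % 60);
}

/* Returns how many seconds remain from now until target (0 if it has passed). */
long seconds_to_wait(time_t now, time_t target)
{
    return target > now ? (long) (target - now) : 0;
}
```

Starting from next_minute_start(time(NULL)), adding 60 after every pass keeps the target on minute boundaries, and seconds_to_wait gives the sleep length even when a pass took longer than expected.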
      
      





The cron_sleep function is directly involved in executing tasks: it calls the functions job_runqueue (iterating over the tasks and starting them) and do_command (starting each individual task). The latter deserves a closer look.







Task launch



The do_command function is written in good Unix style: it forks so that the task runs asynchronously. The parent process goes on launching tasks, while the child process prepares the task process:







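switch (fork()) {
case -1:
    /* fork failed */
    break;
case 0:
    /* child process: release the inherited daemon lock descriptor */
    acquire_daemonlock(1);
    /* set up and run the task process */
    child_process(e, u);
    /* exit once the task has finished */
    _exit(OK_EXIT);
    break;
default:
    /* parent process goes on working */
    break;
}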
      
      





There is quite a lot of logic in child_process: it captures the task's standard output and error streams so that they can then be sent by mail (if the MAILTO environment variable is set in the task table), and finally waits for the main task process to finish.







The task process is created by another fork:







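switch (vfork()) {
case -1:
    /* could not create the task process */
    exit(ERROR_EXIT);
case 0:
    /* grandchild: start a new session, detaching from the terminal */
    (void) setsid();

    /*
     * set the group and user the task runs as
     * and change to the home directory
     */
    setgid(e->gid);
    setuid(e->uid);
    chdir(env_get("HOME", e->envp));

    /* run the command itself */
    {
        /* SHELL holds the path to the interpreter to use */
        char *shell = env_get("SHELL", e->envp);

        /*
         * the command is passed to the shell as the -c argument,
         * together with the environment built for the task
         */
        execle(shell, shell, "-c", e->cmd, (char *)0, e->envp);

        /* still here? exec failed */
        perror("execl");
        _exit(ERROR_EXIT);
    }
    break;
default:
    /* the process itself goes on: it waits for the task to finish */
    break;
}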
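The core of this step, running a command string through "shell -c" with a prepared environment, can be tried in isolation. The function below is a simplified sketch under stated assumptions: run_job is a made-up name, it uses fork rather than vfork, it does not switch user or directory, and it waits for the command itself instead of delegating that to a separate process.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Runs cmd through "shell -c" with the given environment and waits for it.
 * Returns the command's exit status (127 if the shell could not be
 * executed), or -1 if fork or waitpid failed or the child was killed
 * by a signal.
 */
int run_job(const char *shell, const char *cmd, char *const envp[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* child: replace the process image with the shell */
        execle(shell, shell, "-c", cmd, (char *) 0, envp);
        _exit(127);             /* only reached when exec failed */
    }

    /* parent: wait for this particular child */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the environment is passed explicitly, the command sees only the variables in envp, which is why a task that works in an interactive shell can behave differently under cron.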
      
      





That, in general, is the whole of cron. I omitted some interesting details, such as accounting for remote users, but covered the essentials.







Afterword



Cron is a surprisingly simple and useful program, made in the best traditions of the Unix world. It does nothing superfluous, yet it has been doing its job remarkably well for several decades. Getting to know the code of the version that ships with Ubuntu took no more than an hour, and I had a lot of fun! I hope I managed to share some of it with you.







I don't know about you, but it makes me a little sad to realize that modern programming, with its tendency to overcomplicate and over-abstract, lost this kind of simplicity long ago.







There are many modern alternatives to cron: systemd timers let you build complex systems with dependencies, and fcron gives more flexible control over the resources tasks consume. But personally, the simplest crontab has always been enough for me.







In short, love Unix, use simple programs, and don't forget to read the man pages for your platform!







