Check Point: CPU and RAM optimization







Hello colleagues! Today I would like to discuss “Optimizing CPU and RAM”, a topic that is relevant for many Check Point administrators. Gateways and management servers often consume unexpectedly large amounts of these resources, and it helps to understand where they are "leaking" and, if possible, to use them more efficiently.



1. Analysis



To analyze the processor load, the following commands are useful. All of them are entered in expert mode:



top shows, in real time, all processes, the percentage of CPU and RAM each one consumes, uptime, process priority and more







cpwd_admin list queries the Check Point WatchDog daemon and shows all the application modules it monitors, their PID, status and number of starts







cpstat -f cpu os shows CPU usage, the number of CPUs and the distribution of CPU time in percent







cpstat -f memory os shows virtual memory usage, total, active and free RAM and more







It is worth noting that everything the cpstat commands report can also be viewed in the cpview utility. To do this, simply enter the cpview command from any mode in an SSH session.









ps auxwf shows a long list of all processes with their ID, virtual memory size, resident memory in RAM and CPU usage







Another variation of the command:



ps -aF prints an extra-full listing that helps to spot the most expensive process







fw ctl affinity -l -a shows how cores are distributed among the firewall instances, i.e. the CoreXL technology







fw ctl pstat shows a RAM analysis and general statistics on connections, cookies and NAT







free -m shows RAM usage in megabytes, including buffers and cache







The netstat command and its variations deserve special attention. For example, netstat -i can help with monitoring interface buffers. The RX dropped packets counter (RX-DRP) in the output of this command usually grows on its own because of drops of illegitimate protocols (IPv6, bad or unintended VLAN tags and others). However, if drops occur for another reason, you should use this article to start an investigation and find out why the network interface discards packets. Once the cause is known, the operation of the gateway can also be optimized.
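A quick way to put RX-DRP into perspective is to relate it to the number of packets received on each interface. The following is a minimal Python sketch, assuming the column layout printed by the Linux net-tools netstat -i (a header row that starts with Iface and contains RX-OK and RX-DRP). The sample text and the function name rx_drop_ratios are made up for illustration, and the ratio is defined here as RX-DRP / (RX-OK + RX-DRP), which is a choice of this sketch rather than a Check Point metric.

```python
def rx_drop_ratios(netstat_output):
    """Return {interface: RX-DRP / (RX-OK + RX-DRP)} parsed from `netstat -i` text."""
    lines = [ln for ln in netstat_output.strip().splitlines() if ln.strip()]
    # Find the header row by its "Iface" column, then look columns up by name
    # so that layouts with or without the "Met" column both work.
    header_idx = next(i for i, ln in enumerate(lines) if ln.split()[0] == "Iface")
    columns = lines[header_idx].split()
    ok_col, drp_col = columns.index("RX-OK"), columns.index("RX-DRP")
    ratios = {}
    for line in lines[header_idx + 1:]:
        fields = line.split()
        rx_ok, rx_drp = int(fields[ok_col]), int(fields[drp_col])
        total = rx_ok + rx_drp
        ratios[fields[0]] = rx_drp / total if total else 0.0
    return ratios


# Hypothetical sample, not taken from a real gateway.
SAMPLE = """\
Kernel Interface table
Iface   MTU Met  RX-OK RX-ERR RX-DRP RX-OVR  TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0   1500   0  99000      0   1000      0  80000      0      0      0 BMRU
lo    16436   0    500      0      0      0    500      0      0      0 LRU
"""

for iface, ratio in rx_drop_ratios(SAMPLE).items():
    print(f"{iface}: {ratio:.2%} of received packets dropped")
```

Comparing two snapshots taken some minutes apart shows whether the counter is still growing, which matters more than its absolute value.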







If the Monitoring blade is enabled, then you can view these indicators graphically in SmartConsole by clicking on an object and selecting “Device & License Information”.



It is not recommended to keep the Monitoring blade enabled permanently, but enabling it for a day of testing is perfectly acceptable.







Moreover, you can add more parameters for monitoring. One of them is very useful: Bytes Throughput (the throughput of the gateway).







If you have some other monitoring system, for example the free Zabbix polling over SNMP, it is also suitable for identifying these problems.



2. RAM “leak” over time



Administrators often ask why the gateway or management server consumes more and more RAM over time. Let me reassure you: this is normal behavior for Linux-like systems.



From the output of the free -m and cpstat -f memory os commands, run on the gateway in expert mode, you can read and calculate all the parameters related to RAM.



In fact, the memory available on the gateway at any given moment is Free Memory + Buffers Memory + Cached Memory, which as a rule comes to roughly 1.5 GB.
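The sum above is easy to check by hand from the free -m columns. A minimal Python sketch of the arithmetic follows; the function names and the figures are hypothetical and were chosen only so that the total matches the 1.5 GB mentioned in the text.

```python
def available_mb(free_mb, buffers_mb, cached_mb):
    """Memory that can still be handed out: free + buffers + cached (MB)."""
    return free_mb + buffers_mb + cached_mb


def used_percent(total_mb, free_mb, buffers_mb, cached_mb):
    """Share of total RAM in use once reclaimable buffers and cache are excluded."""
    available = available_mb(free_mb, buffers_mb, cached_mb)
    return 100.0 * (total_mb - available) / total_mb


# Hypothetical figures in MB, as they might be read from `free -m`.
total, free, buffers, cached = 7872, 310, 240, 950

print(available_mb(free, buffers, cached))  # 1500
print(round(used_percent(total, free, buffers, cached), 1))
```

Looking only at the free column would suggest that 310 MB are left, while the kernel can in fact reclaim buffers and cache on demand.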



According to Check Point, over time the gateway or management server optimizes itself and uses more and more memory until usage reaches about 80%, where it levels off. Rebooting the device resets the figure. 1.5 GB of free RAM is enough for the gateway to perform all its tasks, and the management server rarely reaches such thresholds.



The output of the above commands also shows how much Low memory (RAM that the kernel can address directly) and High memory (RAM outside that region, usable only by user-space processes) is in use.



Kernel processes (including active modules such as Check Point kernel modules) use only Low memory, whereas user processes can use both Low and High memory. Moreover, Low memory is approximately equal to Total Memory.



You should worry only if errors such as “modules reboot or processes being killed to reclaim memory due to OOM (Out of memory)” start appearing in the logs. In that case, reboot the gateway, and contact support if rebooting does not help.



A full description can be found in sk99547 and sk99593.



3. Optimization



Below are questions and answers on optimizing CPU and RAM. Answer them honestly for yourself and take note of the recommendations.



3.1. Was the sizing done correctly? Was there a pilot project?



Even with competent sizing, the network may simply have grown, so that the equipment can no longer cope with the load. The second possibility is that no sizing was done at all.



3.2. Is HTTPS inspection enabled? If so, is the technology configured according to Best Practice?



Refer to the article if you are our customer, or to sk108202.



The order of rules in the HTTPS inspection policy matters a great deal for how quickly HTTPS sites open.



Recommended order of rules:



  1. Bypass rules with categories / URLs
  2. Inspect rules with categories / URLs
  3. Inspect rules for all other categories






As with the firewall policy, Check Point matches packets against the rules from top to bottom, so it is better to place bypass rules at the top: the gateway then does not spend resources running through all the rules when the packet only needs to be bypassed.
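The cost of a buried bypass rule can be illustrated with a toy first-match evaluator. This is a minimal Python sketch and not Check Point's actual matching engine; the categories, the rule tuples and the function name first_match are hypothetical.

```python
def first_match(rules, category):
    """Walk the rules top to bottom; return (rules evaluated, action of the first match)."""
    for position, (action, categories) in enumerate(rules, start=1):
        if categories is None or category in categories:  # None stands for "any"
            return position, action
    return len(rules), None


# Hypothetical policy building blocks.
BYPASS = ("bypass", {"banking", "health"})
INSPECT_RULES = [
    ("inspect", {"social"}),
    ("inspect", {"webmail"}),
    ("inspect", {"file sharing"}),
]
CATCH_ALL = ("inspect", None)

recommended = [BYPASS, *INSPECT_RULES, CATCH_ALL]
bypass_buried = [*INSPECT_RULES, BYPASS, CATCH_ALL]

print(first_match(recommended, "banking"))    # (1, 'bypass')
print(first_match(bypass_buried, "banking"))  # (4, 'bypass')
```

Both orders reach the same decision for banking traffic, but the second one evaluates three inspect rules first, and the gap widens as the inspect section grows.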



3.3. Are address-range objects used?



Objects that hold a range of addresses, for example the range 192.168.0.0-192.168.5.0, consume significantly more RAM than 5 network objects. In general, it is considered good practice to delete unused objects in SmartConsole, since every time the policy is installed, the gateway and management server spend resources and, most importantly, time on verifying and applying it.
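To see which network objects could stand in for a range, the range can be split into CIDR blocks with Python's standard ipaddress module. A minimal sketch follows; it takes the example range literally, so the last block is the single address 192.168.5.0, and the helper name range_to_networks is hypothetical.

```python
import ipaddress


def range_to_networks(first, last):
    """Split an inclusive IPv4 address range into the smallest list of CIDR networks."""
    start = ipaddress.IPv4Address(first)
    end = ipaddress.IPv4Address(last)
    return [str(net) for net in ipaddress.summarize_address_range(start, end)]


print(range_to_networks("192.168.0.0", "192.168.5.0"))
# ['192.168.0.0/22', '192.168.4.0/24', '192.168.5.0/32']
```

If the range was meant to cover whole /24 networks, passing the last address of the final network (for example 192.168.5.255) gives the corresponding list of network objects.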



3.4. How is Threat Prevention policy configured?



First of all, Check Point recommends placing IPS in a separate profile and creating separate rules for this blade.



For example, suppose the administrator decides that the DMZ segment needs to be protected only by IPS. So that the gateway does not waste resources processing these packets with other blades, create a rule specifically for this segment with a profile in which only IPS is enabled.



As for the profiles themselves, it is recommended to configure them according to the best practices in this document (pages 17-20).



3.5. In the IPS settings, how many signatures are in Detect mode?



It is recommended to work through the signatures carefully and disable the ones you do not need (for example, signatures for Adobe products require a lot of computing power, and if the customer does not use such products, it makes sense to disable them). Next, set Prevent instead of Detect wherever possible: in Detect mode the gateway spends resources on processing the entire connection, whereas in Prevent mode it drops the connection immediately and does not spend resources on processing it to the end.



3.6. What files are processed by Threat Emulation, Threat Extraction, Anti-Virus blades?



It makes no sense to emulate and analyze files with extensions that your users do not download or that you consider unnecessary on your network (for example, bat and exe files can easily be blocked with the Content Awareness blade at the firewall level, which costs the gateway fewer resources). Likewise, the Threat Emulation settings let you select the Environment (operating system) in which threats are emulated in the sandbox, and choosing a Windows 7 environment when all users work with Windows 10 makes no sense either.



3.7. Are firewall rules and Application level rules arranged according to best practice?



Rules with many hits (matches) should be placed at the very top, and rules with few hits at the very bottom. The main thing is to make sure that the rules do not overlap or shadow one another. Recommended Firewall Policy Architecture:







Explanations:



First Rules - the rules with the most matches are placed here

Noise Rule - a rule for dropping spurious traffic such as NetBIOS

Stealth Rule - denies access to the gateways and the management server for everyone except the sources specified in the Authentication to Gateway Rules

Clean-Up, Last and Drop Rules - usually combined into a single rule that denies everything not previously allowed



These best practices are described in sk106597.
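The effect of moving high-hit rules to the top can be estimated with simple arithmetic: a connection matched by the rule in position n costs n rule evaluations. Below is a minimal Python sketch under two assumptions: the rules do not overlap, so reordering does not change the verdict, and matching is strictly top to bottom with no acceleration. The hit counts are hypothetical.

```python
def average_rules_checked(hit_counts):
    """Mean number of rules evaluated per connection for rules listed top to bottom.

    hit_counts[i] is the number of connections matched by the rule in position i + 1.
    """
    total_hits = sum(hit_counts)
    checked = sum(position * hits for position, hits in enumerate(hit_counts, start=1))
    return checked / total_hits


# Hypothetical hit counts for five non-overlapping rules.
as_written = [50, 200, 9000, 700, 50]
by_hits = sorted(as_written, reverse=True)

print(average_rules_checked(as_written))  # 3.05
print(average_rules_checked(by_hits))     # 1.145
```

In this made-up policy, sorting by hit count cuts the average from about three rule evaluations per connection to just over one.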



3.8. What are the settings for services created by administrators?



For example, when a TCP service is created on a specific port, it makes sense to uncheck “Match for Any” in the Advanced settings of the service. The service will then match only the rules in which it is named explicitly and will not take part in rules that have Any in the Services column.







Speaking of services, it is worth mentioning that timeouts sometimes need to be tightened. This lets the gateway use its resources more efficiently, because it no longer keeps TCP / UDP sessions open for protocols that do not need a long timeout. For example, in the screenshot below, I changed the domain-udp service timeout from 40 seconds to 30 seconds.
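How much a shorter timeout saves can be estimated with Little's law: the number of entries held at steady state is the arrival rate multiplied by the time each entry is kept. The sketch below is a rough Python estimate, assuming every DNS query creates its own connection-table entry that lives for the full timeout; the query rate is hypothetical.

```python
def steady_state_entries(new_sessions_per_second, timeout_seconds):
    """Little's law: entries held = arrival rate x time each entry is kept."""
    return new_sessions_per_second * timeout_seconds


rate = 500  # hypothetical DNS queries per second crossing the gateway
before = steady_state_entries(rate, 40)
after = steady_state_entries(rate, 30)

print(before, after)  # 20000 15000
print(f"{100 * (before - after) / before:.0f}% fewer entries")  # 25% fewer entries
```

The relative saving depends only on the two timeouts, so going from 40 to 30 seconds trims about a quarter of the domain-udp entries whatever the real query rate is.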







3.9. Is SecureXL used and what is the percentage of acceleration?



You can check how well SecureXL is working with the basic commands fwaccel stat and fwaccel stats -s, run on the gateway in expert mode. Next, you need to understand which traffic is being accelerated and which additional templates can be created.



Drop Templates are not enabled by default, and enabling them has a favorable effect on SecureXL. To do this, open the gateway settings and go to the Optimizations tab:







Also, when working with a cluster, you can reduce CPU load by disabling synchronization of non-critical services, such as UDP DNS, ICMP and others. To do this, go to the service settings → Advanced → “Synchronize connections if State Synchronization is enabled on the cluster”.







All Best Practices are described in sk98348.



3.10. How is CoreXL used?



CoreXL technology, which lets multiple CPUs run firewall instances (firewall modules), clearly helps to optimize the operation of the device. First, the fw ctl affinity -l -a command shows the firewall instances in use and the processors assigned to the SND (the module that distributes traffic to the firewall instances). If not all processors are in use, they can be added with the cpconfig command on the gateway.

Installing the hotfix that enables Multi-Queue is also a good idea. Multi-Queue solves the problem of the processor running the SND being heavily loaded while the firewall instances on other processors sit idle. With it, the SND can create multiple queues for one NIC and set different priorities for different traffic at the kernel level. As a result, the CPU cores are used more efficiently. The details are also described in sk98348.



In conclusion, I would like to say that these are not all of the Best Practices for optimizing Check Point, only the most popular ones. If you want to order an audit of your security policy or solve a problem related to Check Point, please contact sales@tssolution.ru.



Thank you for your attention!


