At the end of last year, after the deal with Rostelecom, we acquired a cloud-based SD-WAN/SDN platform for delivering information security services to customers. We brought in vendors that ship their solutions in virtualized form and ended up with a colossus that we named the Unified Cybersecurity Services Platform, or EPSC. Its key feature is delivering security technologies from the cloud with centralized management: deploying or changing a single network function, or rolling out a global change across all serviced offices, takes a matter of minutes. Today we’ll tell you more about its architecture and internals.
But before we look under the hood, a few words about what the EPSC can actually do. The platform includes the following services: email protection (Secure Email Gateway, SEG), web application protection (Web Application Firewall, WAF), network intrusion prevention (Unified Threat Management, UTM) and Anti-DDoS. They can be used all together or separately: to each according to their needs.
Wrap my traffic, please. How it works
In general terms: Customer Premises Equipment (CPE) routing devices are installed at all customer sites. Each CPE sets up a secure tunnel to our data centers, that is, to the EPSC platform. As a result, only traffic that has been cleaned on our perimeter security equipment reaches the customer. Moreover, thanks to SD-WAN and tunneling, delivering traffic to our platform does not depend on which provider’s network the customer uses.
Platform architecture
The physical infrastructure of the EPSC consists of ordinary x86 servers with virtual machines deployed on them, plus switches and routers. The point is that all this wealth can be managed flexibly and centrally. And here two magic words come to our aid: VNF and MANO.
VNF (Virtualized Network Function) refers to the network functions provided to end users (UTM, SEG and WAF). MANO (Management and Orchestration) is a set of tools for managing the life cycle of virtualized functions. These tools provide the centralization and flexible orchestration that let you make changes at a fundamentally different speed.
And now in more detail.
At the edge of each data center is a Nokia 7750 SR-12 router with a throughput of 400 Gbit/s; it handles routing on the perimeter and inside the cloud. One level down are Juniper 5100 spine switches with 40 Gbit/s ports, which aggregate all network components from the lower layers.
The virtual network includes two segments: SD-WAN, which connects the cloud environment to customer sites, and SDN DC, a software-defined network inside the cloud itself, through which ports, IP addressing and so on are configured when a service chain is created.
The server part consists of CloudBand cloud platform management servers, SDN and SD-WAN management servers (VSD, VSC), and NSG-BR servers: virtual routers that terminate customer tunnels, either IPsec over VXLAN or plain VXLAN, depending on whether IPsec encryption is used.
Public IP addressing for customer traffic is implemented with two /22 subnets (1018 addresses in each): one is routed to the Internet, the other to RSNet.
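As a quick sanity check on the subnet size, here is a minimal sketch using Python’s standard `ipaddress` module. The prefix `198.18.0.0/22` is a hypothetical placeholder, since the real subnets are not named here. A /22 holds 1024 addresses, 1022 of them usable as hosts; the figure of 1018 quoted above is slightly lower, which would be consistent with a few addresses being held back for gateways and similar platform needs (an assumption, not something stated in the text).

```python
import ipaddress

# Hypothetical placeholder prefix: the article gives only the /22 size,
# not the actual subnets.
subnet = ipaddress.ip_network("198.18.0.0/22")

total = subnet.num_addresses   # every address in the block
usable = total - 2             # minus the network and broadcast addresses

print(f"{subnet}: {total} addresses, {usable} usable hosts")
```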
Fault tolerance
The entire cloud infrastructure is made redundant across two geographically separate data centers in Moscow, which operate in Active-Standby mode. In addition, every service chain (a set of functions for a particular client) is deployed in each of them as a cluster. In an emergency, only about 10 packets are lost when switching from one node to the other. Since TCP was invented by clever people, the loss goes unnoticed.
In the future, we plan to expand geographically beyond the Urals to reduce latency for the regions. Even today, though, latency for Khabarovsk is only about 200 ms: customers quite successfully play on Steam and watch YouTube videos (taken from a real review of the service).
Soon it will be possible to connect one CPE to two independent WANs. The link can be MPLS, Ethernet (copper or fiber, depending on the CPE used), a USB LTE modem, and so on. For example, the main link can be fiber and the backup one wireless (because no one is immune to the excavator of fate and severed cables). For now, two WANs can be used only with two CPEs in an HA cluster.
Service Chains
The customer selects the basic parameters in the customer portal and starts building the service chain: network interfaces are brought up automatically, IP addresses are assigned, and VNF images are loaded according to the selected bandwidth. For example, if you order a FortiGate firewall for 300 Mbit/s, you are automatically allocated 4 cores, 8 GB of RAM and a certain amount of disk space, and the image is automatically deployed in a day-0 configuration.
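The sizing step can be pictured as a lookup from the ordered function and bandwidth to a resource profile. The sketch below is purely illustrative: `VnfProfile`, `SIZING` and `build_chain_request` are hypothetical names, not part of the platform’s API, and the only row in the table is the 300 Mbit/s FortiGate example from the text. Disk space is left out because no figure is given for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VnfProfile:
    vcpus: int
    ram_gb: int


# Only this row comes from the text; a real catalogue would hold more tiers.
SIZING = {("FortiGate", 300): VnfProfile(vcpus=4, ram_gb=8)}


def build_chain_request(vnf: str, bandwidth_mbit: int) -> dict:
    """Build a hypothetical provisioning request for one VNF in a chain."""
    profile = SIZING.get((vnf, bandwidth_mbit))
    if profile is None:
        raise KeyError(f"no sizing profile for {vnf} at {bandwidth_mbit} Mbit/s")
    return {
        "vnf": vnf,
        "bandwidth_mbit": bandwidth_mbit,
        "vcpus": profile.vcpus,
        "ram_gb": profile.ram_gb,
        "config": "day-0",  # initial configuration applied at deployment
    }


request = build_chain_request("FortiGate", 300)
print(request)
```

Keeping the table keyed by both function and bandwidth means an unsupported combination fails loudly instead of silently getting a default size.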
The platform is currently sized for 160 service chains of 1 Gbit/s each, and we plan to scale up to 1000 service chains of 50 Mbit/s each (since, by our market assessment, 1 Gbit/s chains will be in much lower demand).
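The arithmetic behind these two figures is worth spelling out. A minimal sketch, assuming decimal units (1 Gbit = 1000 Mbit); the helper name `aggregate_gbit` is made up for illustration:

```python
def aggregate_gbit(chains: int, per_chain_mbit: int) -> float:
    """Total bandwidth in Gbit/s if every chain runs at its ordered rate."""
    return chains * per_chain_mbit / 1000


current = aggregate_gbit(160, 1000)  # 160 chains of 1 Gbit/s
planned = aggregate_gbit(1000, 50)   # 1000 chains of 50 Mbit/s

print(current, planned)
```

The planned profile carries less aggregate traffic (50 Gbit/s against 160 Gbit/s) but needs over six times as many chains, so scaling here is mostly about the number of virtual instances rather than raw throughput.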
CPE: thinner, but more of them?
For our service we chose “thin” CPEs based on the Nokia NSG-C and NSG-E200. They are thin only in that they cannot run third-party VNFs; otherwise they have all the functionality needed for inter-site connectivity and tunneling (IPsec and VPN), although solutions from other manufacturers can be used for this task as well. They also support Access Control Lists (ACLs) for filtering: some traffic can be released locally, and some can be sent for thorough cleaning to the EPSC data center, where the WAF, UTM and SEG virtual functions run.
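The ACL split between local release and cloud cleaning can be sketched as a first-match rule list. This is only an illustration of the idea, not the CPE’s actual configuration syntax: the rule table, the action names and the trusted prefix `203.0.113.0/24` (a documentation range) are all hypothetical.

```python
import ipaddress

# Hypothetical rules, evaluated top to bottom; the first match wins.
ACL = [
    (ipaddress.ip_network("203.0.113.0/24"), "local_breakout"),  # trusted destination
    (ipaddress.ip_network("0.0.0.0/0"), "epsc"),                 # everything else goes to the cloud
]


def route(dst: str) -> str:
    """Return the action for a destination IPv4 address."""
    addr = ipaddress.ip_address(dst)
    for prefix, action in ACL:
        if addr in prefix:
            return action
    return "drop"  # unreachable for IPv4 while the default rule is present


print(route("203.0.113.10"))
print(route("8.8.8.8"))
```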
The device also supports Application Aware Routing (AAR), i.e. it prioritizes bandwidth by traffic type using Deep Packet Inspection (DPI). Fixed networks, mobile networks, or a combination of both for different kinds of traffic can serve as communication links.
CPE activation follows the zero-touch provisioning principle. It is elementary: we connect the CPE to the Internet, plug a laptop into the internal Ethernet interface, click the activation link, then connect our LAN in place of the laptop, and everything works. By default, the IPsec-over-VXLAN tunnel to our cloud comes up (that is, the address space is first isolated, then tunneled).
Overall, the thin-CPE option seems more convenient to us. With this approach the customer avoids possible vendor lock-in and can scale the solution at any time without running into the capacity limit of the CPE (50 Mbps).
In the case of a thick CPE, the device also runs the virtual functions selected by the client, and the EPSC data center handles only service chain management. The advantage of this approach is that all traffic is processed locally. However, such a device costs much more, and it can run only one specific service, so additional CPEs have to be purchased for others. And you can forget about multi-vendor setups.
But if the customer has a fairly large site (office, branch) that passes a lot of traffic (more than 200 Mbit/s) and wants to filter everything locally, it makes sense to look at the Nokia NSG E300 series CPEs (and higher), which offer 1 to 10 Gbit/s of bandwidth and support virtualization (VNF).
By default, SD-WAN joins all of the customer’s sites into a full mesh network, that is, it connects every site to every other one without additional filtering. This helps reduce latency inside the internal perimeter. More complex topologies (star, complex star, multi-tier meshes, etc.) can also be built.
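To get a feel for what “every site to every other one” costs, here is a small sketch comparing the number of site-to-site links in a full mesh with a simple star. The function names are illustrative only; a full mesh of n sites needs n(n-1)/2 links, while a star needs n-1.

```python
def full_mesh_links(sites: int) -> int:
    """Each pair of sites gets its own link: n * (n - 1) / 2."""
    return sites * (sites - 1) // 2


def star_links(sites: int) -> int:
    """One hub, every other site connects only to it: n - 1."""
    return max(sites - 1, 0)


for n in (3, 10, 50):
    print(n, full_mesh_links(n), star_links(n))
```

The mesh grows quadratically, which is the price of the shorter site-to-site paths; a star keeps the link count low but sends all inter-site traffic through the hub.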
Operations
All VNFs in the cloud are currently offered under the classic MSSP model, that is, they are managed by us. In the future, customers will be able to manage these services themselves, but it is important to assess their own competencies soberly.
Some simple functions are already partially automated. For example, the Secure Email Gateway has a default policy under which the user receives a list of suspected spam messages once a day and can decide their fate on their own.
Global threats and vulnerabilities tracked by our JSOC will be closed promptly, including for EPSC customers. This is where centralized management of policies and signatures really helps: it will protect customers from infections like Petya, NotPetya and others.
Honest answers to awkward questions
Although we firmly believe in the advantages of the service model, it is clear that for many the EPSC concept is too unusual. Here is what we are asked most often:
- FortiGate sits in your data center, so will you see our traffic?
- No, we do not see the traffic. FortiGate receives it in encrypted form, decrypts it with the customer’s key, checks and processes it with a streaming antivirus, encrypts it again with the same key, and only then sends it to your network. We have no way to wedge ourselves into this process (short of reverse engineering serious infosec products, but that is far too expensive).
- Can we keep our network segmentation when moving to the cloud?
- Here there is a limitation. Network segmentation is not a cloud story. UTM is a perimeter firewall for the customer’s network, and segmentation cannot be done on it; that is, your data goes out through a single channel. To some extent you can segment the network using ACLs on the CPE by setting different restrictions for different sites. But if you have business applications that need to interact with one another through a firewall in different ways, you will still need a separate firewall for internal segmentation.
- What will happen to our IP addresses when moving to the cloud?
- In most cases you will have to change your public IP addresses to ours. An exception is possible if you have a provider-independent autonomous system (PI AS) allocated to you, which can be re-announced from our side.
Sometimes a combined option makes more sense: for example, pass user traffic through UTM and release other network segments directly to the Internet.
- What if we need to protect a site hosted by an external provider?
- A CPE cannot be installed there. You can connect WAF by changing the DNS A record: point the site’s IP at the IP of the service chain, and traffic will then go through our cloud without a CPE.
If you have other questions, write in the comments and we will try to answer them.