A fault-tolerant IPoE network built from what is at hand

Hello. We run a network of about 5k clients. Recently an unpleasant problem surfaced: the Brocade RX8 at the core of the network started sending a lot of unknown-unicast packets. Since the network is divided into VLANs, this is partly contained, BUT there are special VLANs for public ("white") addresses and the like, and they are stretched in all directions across the network. Now imagine an incoming stream addressed to a client whose MAC address the border has not learned: the stream is flooded toward the radio link of some village (and of every other village too), the channel is clogged, customers are angry, everyone is sad...







The task is to turn the bug into a feature. I first thought about q-in-q with a separate VLAN per client, but all sorts of hardware such as the P3310 stops passing DHCP once dot1q is enabled, still cannot do selective q-in-q, and has plenty of other hidden pitfalls like that. So the choice fell on ip-unnumbered.

What is ip-unnumbered and how does it work? Very briefly: the gateway address plus a route on the interface. For our task we need to: apply shapers, hand out addresses to clients, and add routes to clients through specific interfaces. How do we do all this? The shaper is lisg; DHCP is db2dhcp on two independent servers; a DHCP relay (dhcrelay) runs on the access servers; ucarp also runs on the access servers, for redundancy. But how do we add the routes? You could add everything in advance in one big script, but that is not the right way. So we will build a homemade crutch.
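To make the "gateway address + route on the interface" idea concrete, here is a minimal sketch that only builds the two iproute2 commands involved for one client. The function name `unnumbered_commands` and the sample addresses are assumptions for illustration; the command shapes follow the `ip` invocations used later in this article (the /32 on lo from up.sh and the `ip route replace ... src ...` line from the route-adding program).

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns the commands that describe one "unnumbered"
// client. Nothing is executed here; the strings are only assembled.
std::vector<std::string> unnumbered_commands(const std::string& gateway,
                                             const std::string& client,
                                             const std::string& iface) {
    return {
        // The gateway address lives on lo as a /32, not on the client VLAN.
        "ip addr add " + gateway + "/32 dev lo",
        // The client is reached by a host route through its own interface,
        // with the gateway address as the preferred source.
        "ip route replace " + client + " dev " + iface +
            " src " + gateway + " proto static",
    };
}
```

For a client 10.10.10.25 behind eth0.800 with gateway 10.10.10.1, this yields one address command for lo and one host route through eth0.800.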







After a thorough search on the Internet, I found a wonderful high-level C++ library that lets you sniff traffic nicely. The algorithm of the route-adding program is as follows: we listen for ARP requests on the interface; if the requested address is one of the server's addresses on the lo interface, we add a route to the requester via this interface and add a static ARP entry for that IP. In short: a bit of copy-paste, a bit of improvisation, and it's done.
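The decision rule from the algorithm above can be isolated from the sniffing code. The sketch below is an assumption-laden illustration, not the program itself: `ArpRequest` and `needs_route` are hypothetical names, and the gateway set stands in for the addresses found on lo.

```cpp
#include <set>
#include <string>

// Hypothetical, simplified view of an ARP request: who asks, and for what.
struct ArpRequest {
    std::string sender_ip;
    std::string target_ip;
};

// True when the request asks for one of our gateway addresses and the
// sender is not itself one of the gateways (so we never route to ourselves).
bool needs_route(const std::set<std::string>& gateways, const ArpRequest& req) {
    return gateways.count(req.target_ip) > 0 &&
           gateways.count(req.sender_ip) == 0;
}
```

Keeping the rule in a pure function like this makes it easy to check without capturing real traffic.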







Source of the "route-maker" (the program that adds routes)
#include <stdio.h>
#include <stdlib.h>      // system()
#include <signal.h>      // signal(), SIGHUP
#include <sys/types.h>
#include <ifaddrs.h>
#include <netinet/in.h>
#include <string.h>
#include <arpa/inet.h>
#include <tins/tins.h>
#include <map>
#include <string>
#include <iostream>
#include <functional>
#include <sstream>

using std::cout;
using std::endl;
using std::map;
using std::bind;
using std::string;
using std::stringstream;

using namespace Tins;

class arp_monitor {
public:
    void run(Sniffer &sniffer);
    void reroute();
    void makegws();
    string iface;
    map<string, string> gws;         // gateway addresses found on lo
private:
    bool callback(const PDU &pdu);
    map<string, string> route_map;   // client IP -> gateway IP it asked for
    map<string, string> mac_map;     // client IP -> client MAC
    map<IPv4Address, HWAddress<6>> addresses;
};

// Collect every address configured on lo: these are our gateway addresses.
void arp_monitor::makegws() {
    struct ifaddrs *ifAddrStruct = NULL;
    struct ifaddrs *ifa = NULL;
    void *tmpAddrPtr = NULL;
    gws.clear();
    getifaddrs(&ifAddrStruct);
    for (ifa = ifAddrStruct; ifa != NULL; ifa = ifa->ifa_next) {
        if (!ifa->ifa_addr) {
            continue;
        }
        string ifName = ifa->ifa_name;
        if (ifName == "lo") {
            // Sized for IPv6 so the AF_INET6 branch cannot overflow the buffer.
            char addressBuffer[INET6_ADDRSTRLEN];
            if (ifa->ifa_addr->sa_family == AF_INET) {
                // is a valid IP4 Address
                tmpAddrPtr = &((struct sockaddr_in *) ifa->ifa_addr)->sin_addr;
                inet_ntop(AF_INET, tmpAddrPtr, addressBuffer, INET_ADDRSTRLEN);
            } else if (ifa->ifa_addr->sa_family == AF_INET6) {
                // is a valid IP6 Address
                tmpAddrPtr = &((struct sockaddr_in6 *) ifa->ifa_addr)->sin6_addr;
                inet_ntop(AF_INET6, tmpAddrPtr, addressBuffer, INET6_ADDRSTRLEN);
            } else {
                continue;
            }
            gws[addressBuffer] = addressBuffer;
            cout << "GW " << addressBuffer << " is added" << endl;
        }
    }
    if (ifAddrStruct != NULL) freeifaddrs(ifAddrStruct);
}

void arp_monitor::run(Sniffer &sniffer) {
    cout << "RUNNING" << endl;
    sniffer.sniff_loop(
        bind(&arp_monitor::callback, this, std::placeholders::_1)
    );
}

// Re-apply routes and static ARP entries for every client seen so far.
void arp_monitor::reroute() {
    cout << "REROUTING" << endl;
    map<string, string>::iterator it;
    for (it = route_map.begin(); it != route_map.end(); it++) {
        // it->first is the client, it->second the gateway it asked for.
        // The original tested it->second twice, so the condition was never
        // true; the check below mirrors the one in callback().
        if (this->gws.count(it->second) && !this->gws.count(it->first)) {
            string cmd = "ip route replace ";
            cmd += it->first;
            cmd += " dev " + this->iface;
            cmd += " src " + it->second;
            cmd += " proto static";
            cout << cmd << std::endl;
            cout << "REROUTE " << it->first << " SRC " << it->second << endl;
            system(cmd.c_str());
            cmd = "arp -s ";
            cmd += it->first;
            cmd += " ";
            cmd += mac_map[it->first];
            cout << cmd << endl;
            system(cmd.c_str());
        }
    }
    // Announce each gateway address with an unsolicited ARP.
    for (it = gws.begin(); it != gws.end(); it++) {
        string cmd = "arping -U -s ";
        cmd += it->first;
        cmd += " -I ";
        cmd += this->iface;
        cmd += " -b -c 1 ";
        cmd += it->first;
        system(cmd.c_str());
    }
    cout << "REROUTED" << endl;
}

bool arp_monitor::callback(const PDU &pdu) {
    // Retrieve the ARP layer
    const ARP &arp = pdu.rfind_pdu<ARP>();
    if (arp.opcode() == ARP::REQUEST) {
        string target = arp.target_ip_addr().to_string();
        string sender = arp.sender_ip_addr().to_string();
        this->route_map[sender] = target;
        this->mac_map[sender] = arp.sender_hw_addr().to_string();
        cout << "save sender " << sender << ":" << this->mac_map[sender]
             << " want target " << target << endl;
        // The client asks for one of our gateways: route it via this interface.
        if (this->gws.count(target) && !this->gws.count(sender)) {
            string cmd = "ip route replace ";
            cmd += sender;
            cmd += " dev " + this->iface;
            cmd += " src " + target;
            cmd += " proto static";
            // cout << cmd << std::endl;
            /* cout << "ARP REQUEST FROM " << arp.sender_ip_addr()
                    << " for address " << arp.target_ip_addr()
                    << " sender hw address " << arp.sender_hw_addr() << std::endl
                    << " run cmd: " << cmd << endl; */
            system(cmd.c_str());
            cmd = "arp -s ";
            cmd += arp.sender_ip_addr().to_string();
            cmd += " ";
            cmd += arp.sender_hw_addr().to_string();
            cout << cmd << endl;
            system(cmd.c_str());
        }
    }
    return true;
}

arp_monitor monitor;

// SIGHUP handler. Note: this runs in signal context, where system() and
// iostream are not async-signal-safe.
void reroute(int signum) {
    monitor.makegws();
    monitor.reroute();
}

int main(int argc, char *argv[]) {
    if (argc != 2) {
        cout << "Usage: " << *argv << " <interface>" << endl;
        return 1;
    }
    signal(SIGHUP, reroute);
    monitor.iface = argv[1];
    // Sniffer configuration
    SnifferConfiguration config;
    config.set_promisc_mode(true);
    // Only capture arp packets
    config.set_filter("arp");
    monitor.makegws();
    try {
        // Sniff on the provided interface in promiscuous mode
        Sniffer sniffer(argv[1], config);
        monitor.run(sniffer);
    } catch (std::exception &ex) {
        std::cerr << "Error: " << ex.what() << std::endl;
    }
}
      
      







Libtins installation script
#!/bin/bash
git clone https://github.com/mfontanini/libtins.git
cd libtins
mkdir build
cd build
cmake ../
make
make install
ldconfig
      
      







The command to build the binary
 g++ main.cpp -o arp-rt -O3 -std=c++11 -lpthread -ltins
      
      







How to run it?
 start-stop-daemon --start --exec /opt/ipoe/arp-routes/arp-rt -b -m -p /opt/ipoe/arp-routes/daemons/eth0.800.pid -- eth0.800
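
The program reacts to SIGHUP (the `signal(SIGHUP, reroute)` call in the source), and its handler runs `system()` and stream output directly in signal context, which POSIX does not list as async-signal-safe. A more conservative pattern, sketched here under the assumption that the main code can poll a flag, is to let the handler only set a `sig_atomic_t` and do the real work outside the handler. The names `reload_requested`, `on_sighup` and `take_reload_request` are hypothetical.

```cpp
#include <csignal>

// Set by the signal handler, consumed by normal (non-signal) code.
volatile std::sig_atomic_t reload_requested = 0;

// The handler does the minimum possible: it only records the request.
extern "C" void on_sighup(int) {
    reload_requested = 1;
}

// Returns true once per pending request and clears the flag.
bool take_reload_request() {
    if (!reload_requested) {
        return false;
    }
    reload_requested = 0;
    return true;
}
```

Where the flag is polled is a design choice. One option (an assumption, not what the program above does) is to call `take_reload_request()` at the start of the packet callback and run the rebuild from there.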
      
      







Yes, it rebuilds its tables on a HUP signal. Why didn't I use netlink? Plain laziness, and Linux is scripts on top of scripts anyway, so it's fine. OK, routes are routes, what's next? Next we need to announce the routes that live on this server to the border. Here, because of the same outdated hardware, we took the path of least resistance and handed this task to BGP.







BGP config
hostname *******
password *******
log file /var/log/bgp.log
!
# AS number, addresses and network are made up
router bgp 12345
 bgp router-id 1.2.3.4
 redistribute connected
 redistribute static
 neighbor 1.2.3.1 remote-as 12345
 neighbor 1.2.3.1 next-hop-self
 neighbor 1.2.3.1 route-map none in
 neighbor 1.2.3.1 route-map export out
!
access-list export permit 1.2.3.0/24
!
route-map export permit 10
 match ip address export
!
route-map export deny 20
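
The export filter above only lets through routes that fall inside the permitted network (1.2.3.0/24 in this made-up config), so the per-client static routes added by the program are announced and everything else is dropped. Conceptually the check looks like the sketch below. This is an illustration of prefix matching, not the routing daemon's code, and `in_prefix` is a hypothetical name.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <cstdint>
#include <string>

// True if the IPv4 address `addr` lies inside `network`/`prefix_len`.
// Returns false for unparsable input or an out-of-range prefix length.
bool in_prefix(const std::string& addr, const std::string& network, int prefix_len) {
    if (prefix_len < 0 || prefix_len > 32) {
        return false;
    }
    in_addr a{};
    in_addr n{};
    if (inet_pton(AF_INET, addr.c_str(), &a) != 1) return false;
    if (inet_pton(AF_INET, network.c_str(), &n) != 1) return false;
    // /0 is handled separately: shifting a 32-bit value by 32 is undefined.
    const std::uint32_t mask =
        prefix_len == 0 ? 0u : (0xFFFFFFFFu << (32 - prefix_len));
    return (ntohl(a.s_addr) & mask) == (ntohl(n.s_addr) & mask);
}
```

With the values from the config, a client route for 1.2.3.77 would pass the filter, while 1.2.4.1 would not.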



Let's continue. For the server to answer ARP requests, proxy ARP must be enabled.









 echo 1 > /proc/sys/net/ipv4/conf/eth0.800/proxy_arp
      
      





Moving on: ucarp. We write the launch scripts for this miracle ourselves.







Example of starting a single daemon
 start-stop-daemon --start --exec /usr/sbin/ucarp -b -m -p /opt/ipoe/ucarp-gen2/daemons/$iface.$vhid.$virtualaddr.pid -- --interface=eth0.800 --srcip=1.2.3.4 --vhid=1 --pass=carpasword --addr=10.10.10.1 --upscript=/opt/ipoe/ucarp-gen2/up.sh --downscript=/opt/ipoe/ucarp-gen2/down.sh -z -k 10 -P --xparam="10.10.10.0/24"
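
Since every gateway gets its own ucarp process and its own pid file, the start command can be generated per gateway. The sketch below is only an illustration: the `Gateway` struct and `ucarp_command` are hypothetical, while the paths and flags are copied from the example above. It builds the string and executes nothing.

```cpp
#include <string>

// Hypothetical description of one redundant gateway.
struct Gateway {
    std::string iface;    // e.g. "eth0.800"
    std::string srcip;    // real address of this server
    int vhid;             // ucarp virtual host id
    std::string addr;     // virtual (gateway) address
    std::string network;  // passed to up.sh/down.sh through --xparam
};

// Builds the start-stop-daemon command line for one gateway.
std::string ucarp_command(const Gateway& g, const std::string& password) {
    const std::string base = "/opt/ipoe/ucarp-gen2";
    // One pid file per (interface, vhid, virtual address) triple.
    const std::string pidfile = base + "/daemons/" + g.iface + "." +
                                std::to_string(g.vhid) + "." + g.addr + ".pid";
    return "start-stop-daemon --start --exec /usr/sbin/ucarp -b -m -p " + pidfile +
           " -- --interface=" + g.iface +
           " --srcip=" + g.srcip +
           " --vhid=" + std::to_string(g.vhid) +
           " --pass=" + password +
           " --addr=" + g.addr +
           " --upscript=" + base + "/up.sh" +
           " --downscript=" + base + "/down.sh" +
           " -z -k 10 -P --xparam=\"" + g.network + "\"";
}
```

A launcher script would then loop over the gateway list and run one such command for each entry.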
      
      







up.sh
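#!/bin/bash
# ucarp passes: $1 = interface, $2 = virtual address, $3 = the --xparam value
iface=$1
addr=$2
gw=$3
vlan=`echo $1 | sed "s/eth0.//"`
# Take over the gateway address and blackhole the network as a fallback
ip ad ad $addr/32 dev lo
ip ro add blackhole $gw
echo 1 > /proc/sys/net/ipv4/conf/$iface/proxy_arp
# Restart the DHCP relay
killall -9 dhcrelay
/etc/init.d/dhcrelay zap
/etc/init.d/dhcrelay start
# Tell the route-adding program to rebuild its routes
killall -HUP arp-rt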
      
      







down.sh
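#!/bin/bash
# ucarp passes: $1 = interface, $2 = virtual address, $3 = the --xparam value
iface=$1
addr=$2
gw=$3
# Release the gateway address and remove the blackhole route
ip ad d $addr/32 dev lo
ip ro de blackhole $gw
echo 0 > /proc/sys/net/ipv4/conf/$iface/proxy_arp
# Restart the DHCP relay
killall -9 dhcrelay
/etc/init.d/dhcrelay zap
/etc/init.d/dhcrelay start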
      
      







For dhcrelay to work on an interface, the interface needs an address. So on the interfaces we use, we add dummy addresses, for example 10.255.255.1/32, 10.255.255.2/32, and so on. I won't describe how to set up the relay; everything is simple there.
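
Handing out those dummy addresses by hand gets tedious with many interfaces, so they can be derived from an interface index. A minimal sketch, assuming one address per interface from 10.255.255.1 to 10.255.255.254; `relay_address` and `relay_address_command` are hypothetical names, and the command is only built, not run.

```cpp
#include <stdexcept>
#include <string>

// Dummy /32 address for the interface with the given 1-based index.
std::string relay_address(int index) {
    if (index < 1 || index > 254) {
        throw std::out_of_range("index must be in 1..254");
    }
    return "10.255.255." + std::to_string(index) + "/32";
}

// Command that would assign the dummy address to the interface.
std::string relay_address_command(const std::string& iface, int index) {
    return "ip addr add " + relay_address(index) + " dev " + iface;
}
```

The 254-address limit is a property of this sketch only; a larger network would need a wider dummy range.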







So what do we have? Redundant gateways, automatically configured routes, DHCP. This is the minimum set; lisg is bolted onto it as well, so we already have a shaper.

Why is it all so long and complicated? Wouldn't it be easier to take accel-pppd and just use PPPoE? No, it wouldn't: people can barely plug a patch cord into a router, let alone set up PPPoE. accel-ppp is a cool thing, but it didn't work out for us: a pile of bugs in the code, it crashes, it shapes crookedly, and the saddest part is that when it goes down, people have to restart everything and the phones ring red-hot. In short, it didn't fit.

What is the advantage of ucarp over keepalived? Everything. With 100 gateways in keepalived, one error in the config means nothing works. With ucarp, only one gateway stops working.

As for security, people will say that clients will set bogus addresses and use the service for free. To keep this under control we configure DHCP snooping + source guard + ARP inspection on all switches / OLTs / base stations. If a client has a static address rather than DHCP, we put an access list on the port.







Why was all this done? To get rid of traffic we don't want. Now each switch has its own VLAN and unknown unicast is no longer scary, since it only has to go to one port and not to all of them... The side effects are a standardized hardware configuration and efficient use of the address space.







How to configure lisg is a separate topic. Links to the libraries are attached. Perhaps the above will help someone with their own tasks. We are not deploying IPv6 on our network yet, but it will be a problem when we do: lisg would have to be rewritten for IPv6, and the program that adds routes would need tweaking as well.







Linux ISG

DB2DHCP

Libtins



UPD.



Everything turned out to be a bit more complicated... I had to write a more or less proper daemon and do without proxy ARP. Now my daemon answers ARP itself and adds/removes routes to client subnets; I also had to learn how to work with netlink. One quirk surfaced as well: when Linux sends an ARP request for some address, after determining the outgoing interface it uses the first address it finds on that interface as the source, and some clients do not answer such requests. This is solved with arptables.



In general, a proper version is already available at the link below. By the way, it also listens for changes in routes and addresses and adds/removes routes via netlink (the latter was a terrible headache).



github


