About this project
I’ve come across mesh networking several times now through work projects. I’ve read a lot about hypothetical attacks on mesh networks, and wanted to get more hands-on experience. So, I’ve got my own little mesh network of Raspberry Pis, using the Batman routing protocol. The project is still in its early days, partly because I keep getting side-tracked by related things (see: other recent blog posts).
This blog post is to help me mentally “sort out” where I am in the project so far. It also covers a lot of what I’ll be talking about at BSides Liverpool next week!
What’s a mesh network?
In your home WiFi network, you have a bunch of nodes (laptops, phones, etc.) connected to a router. All of the messages being sent to and from network nodes (even to and from other nodes in the same network) will travel through the router. In mesh networks, there is no router. Every node in the mesh network can message every other node. There are various different configurations for this, such as nodes forming chains across larger geographic areas, or all nodes broadcasting their messages to every other node within a smaller area. Generally nodes within the mesh network can forward messages from other nodes onwards to their destinations.
Quite often, mesh networks are configured to be ad-hoc mesh networks. This means that the mesh configuration can dynamically change if new nodes are introduced, or other nodes drop out of range. This configuration is especially useful when nodes are mobile or likely to power on and off during standard operations.
What are mesh networks used for?
Any time you need data to be transmitted between nodes where it’s impractical to have all traffic go through single point(s). In a previous life, I learned about mesh networks for explosive ordnance disposal (EOD) robotics. As these robots could sometimes drive over lengthy distances and through terrain which would hamper signal strength, smaller robots could additionally be deployed to make up a mesh network. Instead of the operator transmitting their message directly to the EOD robot, it would be relayed through the smaller mobile robots. This made control signals a lot more reliable.
More recently, I have seen the proposed usage of mesh networks for future Connected/Autonomous Vehicles (CAVs). Future vehicles are hypothesized to be constantly sharing information with each other, as well as taking information from roadside infrastructure (such as weather sensors and digital speed signs). Since vehicles tend to move around quite a lot, it would be impractical to have them join a different router every minute. Ad-hoc mesh networks where every node can communicate with all others within the area would be much more practical.
Attacks on mesh networks
A large range of attack types are possible within mesh networks. However, the susceptibility of the network will depend upon the exact protocol being used to create the mesh. Different protocols will have different routing behaviors, and different security features baked in. The exact usage of the mesh network will also affect the type of attacks which can be used; a high-power network designed to handle large amounts of data with some expectation of data loss is very different to a low-powered one designed to transmit a few messages with high levels of accuracy.
Within the proposed CAV mesh networks, emergency services such as ambulances will automatically transmit messages which cause traffic lights to favor them. Within the proposed protocols, these messages do not use any unique sequencing or timestamping value. As such, replay attacks could be highly effective. A malicious party could record an “I’m an ambulance!” message, and replay it any time they approach traffic lights to gain favorable lights. A protocol which includes the unfortunately named nonce (number used once) in each authenticated message, and rejects any message whose nonce has been seen before, would not be susceptible to this.
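As a rough illustration of how a nonce defeats a replay, here is a minimal Python sketch. It is not taken from any CAV proposal: the pre-shared key, the `Receiver` class, and the message format are all hypothetical, and it assumes sender and receiver can share a key so that the nonce can be covered by an HMAC.

```python
import hashlib
import hmac
import os

SHARED_KEY = b"demo-key-not-for-real-use"  # hypothetical pre-shared key


def sign_message(payload: bytes, nonce: bytes, key: bytes = SHARED_KEY) -> bytes:
    """Return an HMAC tag covering both the nonce and the payload."""
    return hmac.new(key, nonce + payload, hashlib.sha256).digest()


class Receiver:
    """Hypothetical traffic-light controller that rejects replayed messages."""

    def __init__(self, key: bytes = SHARED_KEY) -> None:
        self.key = key
        self.seen_nonces = set()

    def accept(self, payload: bytes, nonce: bytes, tag: bytes) -> bool:
        expected = sign_message(payload, nonce, self.key)
        if not hmac.compare_digest(expected, tag):
            return False  # forged or corrupted message
        if nonce in self.seen_nonces:
            return False  # replay: this nonce has already been used
        self.seen_nonces.add(nonce)
        return True


payload = b"I'm an ambulance!"
nonce = os.urandom(16)
tag = sign_message(payload, nonce)

lights = Receiver()
first_attempt = lights.accept(payload, nonce, tag)   # genuine message
replay_attempt = lights.accept(payload, nonce, tag)  # attacker replays the recording
print(first_attempt, replay_attempt)
```

The recorded message is accepted once and refused the second time. A real deployment would also need to bound how long nonces are remembered, for example by pairing them with a timestamp.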
Again depending upon the exact network configuration, data flooding attacks can be effective. Networks of low-powered devices which are designed to largely be inactive can easily be overwhelmed by having to deal with large amounts of data transmission. By forcing a device to remain within a wakeful state for a much greater proportion of time than intended, batteries can be used up more quickly than anticipated. Similarly, a network which relies upon low transmission error rates could suffer greatly from large amounts of data jamming interfaces and preventing legitimate messages from getting through.
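To put some rough numbers on the battery-drain point, here is a back-of-the-envelope sketch. Every figure in it (battery capacity, sleep and awake current, duty cycles) is made up for illustration and does not describe any particular device.

```python
def battery_life_hours(capacity_mah, sleep_ma, awake_ma, awake_fraction):
    """Estimate battery life from the average current draw.

    awake_fraction is the proportion of time (0.0 to 1.0) the device
    spends awake handling traffic; the rest of the time it sleeps.
    """
    average_ma = awake_fraction * awake_ma + (1 - awake_fraction) * sleep_ma
    return capacity_mah / average_ma


# Hypothetical sensor node: 2000 mAh battery, 0.05 mA asleep, 50 mA awake.
normal = battery_life_hours(2000, 0.05, 50, awake_fraction=0.01)
flooded = battery_life_hours(2000, 0.05, 50, awake_fraction=0.50)

print(f"normal duty cycle:  {normal:.0f} hours")
print(f"under flooding:     {flooded:.0f} hours")
```

Under these assumed figures, keeping the node awake half the time instead of 1% of the time cuts the battery life from months to a few days.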
Nodes within ad-hoc mesh networks will generally form their own routing tables based upon their interactions with other nodes. The insertion of malicious data into whichever mechanism is used to do this can cause inefficient routes to be generated, or even allow black hole nodes, which drop all packets, to be entered into routes. This will in part be affected by the requirements to enter the network; if nodes can join without needing to authenticate to each other in some way, there would be little to prevent malicious nodes from joining the network and sending false data.
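The routing-table poisoning idea can be sketched with a toy model. This is not how any specific protocol stores its routes: the `Node` class, the 0 to 255 quality scale, and the node names are all hypothetical, and the only point is that an unauthenticated advert with an inflated quality wins.

```python
class Node:
    """Toy mesh node that trusts whatever route adverts it hears."""

    def __init__(self, name):
        self.name = name
        self.routes = {}  # destination -> (next_hop, advertised_quality)

    def receive_advert(self, neighbour, destination, quality):
        """Accept an unauthenticated advert if it beats the current route."""
        current = self.routes.get(destination)
        if current is None or quality > current[1]:
            self.routes[destination] = (neighbour, quality)

    def next_hop(self, destination):
        return self.routes[destination][0]


node_a = Node("A")

# Honest neighbours B and C advertise their real link quality towards D.
node_a.receive_advert("B", "D", 180)
node_a.receive_advert("C", "D", 200)
honest_choice = node_a.next_hop("D")

# Malicious node M claims a perfect route to D, then silently drops traffic.
node_a.receive_advert("M", "D", 255)
poisoned_choice = node_a.next_hop("D")

print(honest_choice, poisoned_choice)
```

Node A initially routes via C, then switches to M as soon as the false advert arrives, which is all a black hole node needs.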
Ad-hoc mesh networks can also be targeted by Byzantine faults. This refers to a scenario whereby an error appears differently depending upon the viewpoint. Node A might see node B as the issue, whereas B sees node C as the issue, etc. When there is no single clear point of failure within the network, poor quality routing decisions can be taken.
This is of course not an exhaustive list; it is simply some of the more generic attacks which can be used to target mesh networks.
Building a simple mesh network
I’ve built up a little mesh network of Raspberry Pis at home. To handle the routing between the nodes, I’ve been using the Batman routing protocol, and loosely following this guide: https://github.com/binnes/WiFiMeshRaspberryPi
Time for a quick segue: What’s Batman?
Batman routing protocol
Better Approach To Mobile Ad-hoc Networking, or BATMAN, is a routing protocol for multi-hop ad-hoc mesh networks [1]. It allows you to create mesh networks which automatically reconfigure themselves depending on available connectivity, where data can travel as many hops as needed to reach its destination. So, if one node was to drop out of communication, the other nodes would automatically reconfigure themselves to route around it. Batman is also designed to allow you to connect your mesh network up to other networks (such as your standard Wi-Fi network or even the internet), which is cool.
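To make the “route around it” idea concrete, here is a toy sketch. The topology and node names are made up, and a plain breadth-first search stands in for Batman’s real route selection, which works hop by hop rather than computing whole paths.

```python
from collections import deque


def shortest_path(links, source, target):
    """Breadth-first search over a link map; returns a list of nodes or None."""
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for neighbour in sorted(links.get(node, ())):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


def without_node(links, dead):
    """Return a copy of the link map with one node (and its links) removed."""
    return {
        node: {n for n in neighbours if n != dead}
        for node, neighbours in links.items()
        if node != dead
    }


# Hypothetical five-node mesh: A reaches D via B, or the long way via C and E.
links = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A", "E"},
    "D": {"B", "E"},
    "E": {"C", "D"},
}

before = shortest_path(links, "A", "D")
after = shortest_path(without_node(links, "B"), "A", "D")
print(before, after)
```

With B present the traffic takes the two-hop route; once B drops out, the same search finds the longer route through C and E.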
Many routing protocols used for mesh networks sit at layer 3 of the OSI model, whereas Batman sits at layer 2. For easy reference, here’s a brief recap on the OSI model:
| Layer | Name | Function | Examples |
|---|---|---|---|
| 7 | Application | Where the end user can interact with the data | HTTP, Telnet |
| 6 | Presentation | Converts data to readable formats, includes data encryption/decryption | SSL, TLS |
| 5 | Session | Maintains connections, e.g. controlling ports and connection sessions | APIs, Sockets |
| 4 | Transport | Data transmission | TCP, UDP |
| 3 | Network | How data is routed through a network | IP, IPSec, ICMP |
| 2 | Data Link | The format of data on the network | Ethernet, PPP |
| 1 | Physical | Raw bits of physical medium such as cables | Fiber, Wireless, Coax |
Essentially, Batman simulates the devices being physically connected. As such, it creates a new network interface on devices (e.g. bat0), which “real” network interfaces can be assigned to (e.g. eth0). The routing performed by Batman is carried out using the Linux kernel module, batman-adv. This means that you can (within reason) use whatever protocols you want for communication in your network. The decision to sit at layer 2 also makes Batman routing very quick. Traffic is handled in the kernel, without the overheads of having to send data to and from “userland”. This is especially beneficial in low-powered devices.
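For a feel of what attaching a “real” interface to bat0 looks like, here is a small Python sketch that lists the usual commands and only runs them if asked. The interface names are assumptions, the commands need root, and the exact `batctl` syntax can differ between versions, so treat it as a starting point rather than a recipe. It also skips putting a wireless interface into ad-hoc mode.

```python
import subprocess


def batman_setup_commands(real_iface="wlan0", mesh_iface="bat0"):
    """Return the commands that attach real_iface to the Batman mesh interface."""
    return [
        ["modprobe", "batman-adv"],                      # load the kernel module
        ["batctl", "if", "add", real_iface],             # hand the real interface to Batman
        ["ip", "link", "set", "up", "dev", real_iface],  # bring the real interface up
        ["ip", "link", "set", "up", "dev", mesh_iface],  # bring the virtual interface up
    ]


def run_commands(commands, dry_run=True):
    """Print each command, or execute it when dry_run is False (requires root)."""
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)


# Dry run by default, so nothing on the machine is changed.
run_commands(batman_setup_commands("wlan0"))
```

After this, IP addresses get assigned to bat0 rather than to the underlying interface, which is what lets everything above layer 2 behave as if the nodes were on one switch.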
Sitting at layer 2 does make Batman a bit of a pain to work with sometimes. Tools like ping, traceroute, and tcpdump are geared towards the higher layers. As such, the Batman team have provided their own equivalents: batctl ping, batctl traceroute, and batctl tcpdump.
Batman contains a whole range of novel features for creating and maintaining routes. I’m not going to go into detail on them here (perhaps in a future blog post!) but the general concepts are:
- Started out as an improvement to the Optimised Link State Routing (OLSR) protocol.
- Aims to decentralize knowledge of optimal routes between nodes; each node only knows the best next hop for getting its data to the rest of the network.
- Nodes broadcast Originator Messages (OGMs) to alert neighbours of their existence, which are automatically rebroadcast. Nodes can use the “reflection” of their own OGMs to assess link quality.
- Nodes can identify if they originated a message and won’t rebroadcast it, so you don’t get needless “echoes”.
- Nodes inform each other about their onward link quality. Just because node A can reach node B reliably doesn’t mean that node B can reach C, D, and E reliably.
- Routes with more hops are penalized, even if they’re of high quality.
- Nodes with multiple physical network interfaces do all sorts of fancy things to reduce the risk of interference.
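The interplay between link quality and the hop penalty can be sketched with a simplified model. This is not batman-adv’s actual algorithm (which also handles things like asymmetric links): it assumes quality is a 0 to 255 value that is scaled by each link and reduced by a fixed penalty on every rebroadcast, and the penalty of 30 and the link values are picked purely for illustration.

```python
TQ_MAX = 255  # assumed quality scale: 255 means a perfect link


def path_tq(link_tqs, hop_penalty=30):
    """Quality of a path, given the quality of each link along it.

    The originator's message starts at TQ_MAX. Each link scales it down,
    and every rebroadcast by an intermediate node applies the hop penalty.
    """
    tq = TQ_MAX
    for hop_index, link_tq in enumerate(link_tqs):
        if hop_index > 0:
            # An intermediate node is rebroadcasting: apply the hop penalty.
            tq = tq * (TQ_MAX - hop_penalty) // TQ_MAX
        tq = tq * link_tq // TQ_MAX
    return tq


# Hypothetical options for reaching one destination, keyed by next hop.
candidates = {
    "direct": [200],            # one mediocre link
    "via B": [250, 250],        # two good links
    "via C": [250, 250, 250],   # three good links
}

scores = {next_hop: path_tq(links) for next_hop, links in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

In this model the two-hop route of good links beats the mediocre direct link, but the three-hop route loses to it despite every individual link being better, which is the hop penalty doing its job.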
My mesh network
Back to my mesh network. I have 5 raspis, with only 3 of them currently inducted into the mesh network. Batman is a bit finicky at times. The 3 raspis I’ve been using are running Raspbian Buster Lite. On one of them I’ve been able to use apt install batman with zero issues, but on the other two I’ve had to manually build it from source. I have no idea why; it just seems to be difficult sometimes.
Two of the raspis are connected using wireless, whereas two are connected via ethernet. I went for this configuration so I could be “sure” that data was travelling between all 3 nodes. As I didn’t want to faff with Batman on my Linux VM, I’m using a router to connect to the mesh network. Here’s an MS paint diagram:
For debugging, I can plug any raspi into the router and reach it over the eth0 interface. But once the nodes are in their mesh, I can SSH into Raspi1 and use that to reach 2 or 3. Because Batman sits at layer 2, I can SSH to any node in the mesh over it. At some point I’ll add more nodes to the mesh, but frankly I’m lazy and so haven’t gotten around to it.
Where are you at now?
Right now, I’m early days in the project. I’ve gone down various rabbit holes. I wrote a simple guest book in Python which I can use to generate simple TCP traffic between nodes. Details here: https://www.vicharkness.co.uk/2021/08/30/python-guest-book/
From there, I’ve been down another rabbit hole getting to grips with Scapy. Details here: https://www.vicharkness.co.uk/2021/08/30/manually-initialising-connections-with-scapy/
Most recently, I’ve been able to manually craft Batman heartbeat messages using Scapy, I think. They look okay, but I’ve not tried introducing them into a Batman mesh yet to investigate the effects. Ideally I need to get more nodes into the mesh first! I’ll probably be doing a blog post on the process and the results once it’s a bit more refined.
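For anyone curious what such a heartbeat (an OGM) might look like on the wire, here is a rough sketch that packs one by hand with Python’s struct module rather than Scapy. The field layout, the EtherType 0x4305, the compat version 15, and the default TTL of 50 are all assumptions based on the B.A.T.M.A.N. IV OGM header in the batman-adv sources; check them against your own kernel version before trusting any of it. The MAC addresses are made up.

```python
import struct

# Assumed constants, to be verified against the batman-adv headers in use.
ETH_P_BATMAN = 0x4305        # EtherType carried by batman-adv frames
BATADV_IV_OGM = 0x00         # packet type for a B.A.T.M.A.N. IV OGM
BATADV_COMPAT_VERSION = 15   # protocol compatibility version


def mac_to_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into six raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_ogm(orig_mac, prev_sender_mac, seqno, tq=255, ttl=50, flags=0):
    """Pack the assumed 24-byte OGM header (network byte order, no TVLVs)."""
    return struct.pack(
        "!BBBBI6s6sBBH",
        BATADV_IV_OGM,
        BATADV_COMPAT_VERSION,
        ttl,
        flags,
        seqno,
        mac_to_bytes(orig_mac),
        mac_to_bytes(prev_sender_mac),
        0,    # reserved
        tq,   # transmit quality, 0 to 255
        0,    # tvlv_len: no TVLV data appended
    )


def build_frame(src_mac, ogm):
    """Wrap the OGM in a broadcast Ethernet header."""
    header = b"\xff" * 6 + mac_to_bytes(src_mac) + struct.pack("!H", ETH_P_BATMAN)
    return header + ogm


node_mac = "b8:27:eb:00:00:01"  # hypothetical node address
ogm = build_ogm(node_mac, node_mac, seqno=1)
frame = build_frame(node_mac, ogm)
print(len(ogm), len(frame), frame.hex())
```

This only builds the bytes; actually sending them would need a raw socket or Scapy’s sendp, plus root privileges.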
In general, I’d like to test out some of the mesh network attacks which I discussed in the Connected/Autonomous Vehicle (CAV) security paper I wrote at F-Secure. The paper is massive and very dry, but if you’re interested in EU proposals for CAV security it can be found here: https://www.f-secure.com/gb-en/consulting/our-thinking/future-threats-to-its-networks-and-cav-infrastructure
For example, what would happen if I could introduce several malicious nodes into the network? Could I cause other nodes to make improper routing decisions? What is the smallest number of nodes I could do this with? Say I had a single node, and manipulated the link quality metrics of the heartbeat messages passing through it: could I make my malicious node appear to be of high quality? Perhaps I could deliberately target specific nodes, letting their heartbeat messages go through so that link quality is preserved, whilst dropping any data messages from those nodes.
There are many different options on what to play with in a setup such as this. Hopefully I end up somewhere interesting!
[1] Project website: https://www.open-mesh.org/projects/open-mesh/wiki