I have been working in the IoTOps world for some time.
During that time, I had to figure out how to control the bandwidth (or internet connection, if you will) of ARMv8 (ARM64) IoT devices and simulate how traffic behaves.
Tools I used:
- Nvidia’s Jetson Xavier with Jetpack 4.5.1 installed.
- k3s single node cluster based on containerd.
While working on this task, I couldn’t find a working solution that supports HTTPS/TLS and figured I should write about it.
What is Chaos engineering?
Chaos engineering is the discipline of experimenting on a software system in production in order to build confidence in the system's capability to withstand turbulent and unexpected conditions.
While Wikipedia describes Chaos engineering as a discipline practiced in a production environment, this post demonstrates how to generate chaos in any environment you like.
What is Chaos in the IoT sphere?
Most examples online describe Chaos engineering in a cloud or on-premises context, but there is another context: the Internet of Things, or just IoT.
Distributed IoT products are growing. More and more companies are building hardware and software solutions on top of IoT fleets, either deployed in the field, as at Greeneye, or spread worldwide.
As with the cloud, it is important to test these IoT devices to make sure that even when things get out of control, everything will either survive a disaster or return to a normal state after an unexpected event.
Why simulate traffic?
Working in IoTOps for some time has taught me three things:
- It is possible to touch the “edge of technology”, meaning that you might do something no one has written about, and Stack Overflow won’t be there to help you.
- Physical devices, located remotely, require a well-designed deployment system.
- Connectivity is a luxury - IoT devices will most likely experience connectivity issues and, depending on the product, this might mean either slow internet access or no internet access at all.
The third point needs to be covered before the second. Even if a deployment works in a development environment, it might not work the same way in production. Therefore, it’s important to be able to control and simulate traffic behaviour in the lab.
Could Toxiproxy help?
I tried this tool after reading Timothy Agustian’s blog post, Simulating Customized Chaos in Golang using Toxiproxy. I followed a GitHub issue explaining how to use Toxiproxy and set it up on my Jetson device. But I got stuck on a problem: TLS requests didn’t go through, and I couldn’t control the network the way I wanted.
It seems that someone tried to add support for a TLS mode, but that was a while ago and nothing has happened since.
I had to come up with a different solution.
Burp Suite to the rescue
I decided to ditch the idea of running a proxy on the destination device (i.e., the Jetson) and instead forward all of its requests to a host machine (i.e., my MacBook).
I already had Burp installed. I had to do two things before trying my idea:
- Install Burp’s certificates.
- Forward all k3s requests through my host machine.
Nick Frichette explains how to install Burp’s certificates in his Intercept Linux CLI Tool Traffic blog post:
- Open a terminal and run:
export http_proxy="http://<HOST MACHINE IP>:<PORT SET IN BURP>"
export https_proxy="http://<HOST MACHINE IP>:<PORT SET IN BURP>"
wget http://<HOST MACHINE IP>:<PORT SET IN BURP>/cert
openssl x509 -in cert -inform DER -out burp_suite.crt
sudo mv burp_suite.crt /usr/local/share/ca-certificates
sudo update-ca-certificates
Or follow Install Burp’s certificates.
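Before moving the converted file into the CA store, it can be worth checking that the DER-to-PEM conversion really produced a PEM certificate. The following is a minimal sketch, assuming a POSIX shell; the helper name `looks_like_pem_cert` is hypothetical and not part of Burp or OpenSSL.

```shell
#!/bin/sh
# Hypothetical helper (not part of Burp or OpenSSL): succeeds when the
# given file contains the header line of a PEM-encoded certificate.
looks_like_pem_cert() {
    # -e keeps grep from reading the leading dashes as options.
    grep -q -e '-----BEGIN CERTIFICATE-----' "$1"
}
```

For example, `looks_like_pem_cert burp_suite.crt && echo "PEM OK"`. For a fuller check, `openssl x509 -in burp_suite.crt -noout -subject` prints the certificate subject when the file parses.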
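Forwarding k3s requests through the host machine is done by adding environment variables to the k3s service unit (http_proxy, plus https_proxy if HTTPS requests should go through Burp as well), reloading systemd and restarting the k3s service:
sudo vim /etc/systemd/system/k3s.service
Environment="http_proxy=http://<HOST MACHINE IP>:<PORT SET IN BURP>"
Environment="https_proxy=http://<HOST MACHINE IP>:<PORT SET IN BURP>"
sudo systemctl daemon-reload
sudo service k3s restart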
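An alternative to editing the unit file by hand is a systemd drop-in, which keeps the proxy settings in a separate file. The sketch below only renders the drop-in text and changes nothing on the system. The function name `render_k3s_proxy_dropin`, the address `192.0.2.10`, the port `8080` and the drop-in path mentioned afterwards are placeholders for illustration.

```shell
#!/bin/sh
# Hypothetical helper: prints a systemd drop-in that sets proxy variables.
# $1 = host machine IP, $2 = Burp listener port (both placeholders).
render_k3s_proxy_dropin() {
    proxy_url="http://$1:$2"
    printf '[Service]\n'
    printf 'Environment="http_proxy=%s"\n' "$proxy_url"
    printf 'Environment="https_proxy=%s"\n' "$proxy_url"
}

# Example with placeholder values (192.0.2.10 is a documentation address).
render_k3s_proxy_dropin 192.0.2.10 8080
```

The output could be written to a file such as `/etc/systemd/system/k3s.service.d/proxy.conf` (for example by piping it into `sudo tee`), followed by `sudo systemctl daemon-reload` and `sudo systemctl restart k3s`.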
Now that everything is set, all that’s left to do is run pods and check out Burp’s Proxy → HTTP history.
Burp offers the option to Intercept HTTP traffic with Burp Proxy, which also includes the ability to modify requests and responses.
Local Docker registry sniffing as an example
This task required me to investigate a situation where containerd failed to pull Docker images. One of the pods on the Jetson is a local Docker registry. The reason for that is caching: one device pulls from the remote registry, and all the others pull from the local registry.
To reproduce the issue, I had to figure out how to either drop the local Docker registry’s responses or change their actual content.
Using Burp, this became much easier to test.
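To see from the device side whether a response was dropped or altered, a small probe can request the registry API root through the proxy and report the status code. This is a sketch under assumptions: the proxy and registry addresses are placeholders, the function names are hypothetical, and it relies on curl printing `000` for `%{http_code}` when no HTTP response arrives.

```shell
#!/bin/sh
# Placeholder values, not real addresses: override via the environment.
PROXY="${PROXY:-http://192.0.2.10:8080}"
REGISTRY="${REGISTRY:-http://registry.example:5000}"

# Requests the registry API root through the proxy and prints only the
# HTTP status code. curl prints 000 when no HTTP response arrives.
probe_registry() {
    curl --silent --output /dev/null --max-time 10 \
        --proxy "$PROXY" --write-out '%{http_code}' "$REGISTRY/v2/" || true
}

# Maps a status code to a short description of what happened.
describe_status() {
    case "$1" in
        200|401) echo "registry answered" ;;
        000)     echo "no response (dropped or timed out)" ;;
        *)       echo "unexpected status: $1" ;;
    esac
}
```

Running `describe_status "$(probe_registry)"` while dropping or editing the response in Burp shows how the client side perceives the change.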
As a side note, if you have secrets within your Docker images, you should definitely consider using the --secret option in docker buildx; otherwise, those secrets are completely exposed.
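As a rough illustration of the --secret option, the sketch below writes a small Dockerfile that reads a build secret from a mount instead of baking it into a layer. The helper name `write_secret_dockerfile`, the secret id `api_token`, the base image and the file names are all made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: writes an example Dockerfile into directory $1.
# The secret id "api_token" is a made-up name for illustration.
write_secret_dockerfile() {
    cat > "$1/Dockerfile" <<'EOF'
# syntax=docker/dockerfile:1
FROM alpine:3.19
# The secret is mounted at /run/secrets/api_token for this RUN step only
# and is not stored in the resulting image layers.
RUN --mount=type=secret,id=api_token \
    test -s /run/secrets/api_token
EOF
}
```

After `write_secret_dockerfile .`, the image could be built with `docker buildx build --secret id=api_token,src=./api_token.txt -t secret-demo .`, where `api_token.txt` is a local file holding the secret.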
I wrote this post because I couldn’t find any best practices for controlling an IoT device’s network and testing it in a chaotic environment.
IoT devices and IoTOps are on the rise. It seems like there’s a lot to learn about these topics, especially when it comes to testing, deploying and solving real-world problems.
Using a man-in-the-middle (MITM) proxy to control the network and simulate different kinds of problems seems to be a good place to start.