I’ve always been interested in tinkering with Raspberry Pi, but was put off by the size, power requirements, and the need to know my way around Linux. Arduino microcontrollers, on the other hand, are easier to interface with various sensors, since there is no operating system and everything is programmed in C through an IDE. They are also much smaller and consume far less power.
However, there are many things that Raspberry Pi can do beyond being a cheap desktop computer. I recently changed jobs and started working at Cloudflare. The technologies are completely new to me, having spent the last 20 years with Active Directory, Exchange, and SharePoint. I had no idea about the Linux web-hosting world of Apache/Nginx, SSL, CDN, etc.
I started off by building Debian VMs but wanted something a bit more tactile, so I picked up a Raspberry Pi Zero W. This tiny thing is a full Linux computer capable of pretty much anything. I quickly learnt my way around Linux basics, installed Apache, and stood up a basic “hello world” website. This is a great proof-of-concept exercise to show what’s possible with something so small. However, it isn’t really production-ready, nor can it scale to handle real-world visitors to your site. It’s not fast, nor secure, and cannot be configured for high availability.
Cloudflare’s mission is to make a better Internet for all… one that is secure, offering a speedy browsing experience, while being easy to deploy, and equally available for visitors no matter where they are in the world. What can Cloudflare do to improve the experience of visitors who browse content hosted on my tiny little Pi? It sounds like an extreme and unrealistic exercise, doomed to failure. Can I really host “production” traffic on a Raspberry Pi?
I want something portable which I can power on anywhere in the world, and have a website available without any fiddling around with updating DNS, etc. I want to be a modern-day John Cusack, making himself heard over the Internet.
While perfectly serviceable, the Raspberry Pi Zero W does not have a built-in Ethernet port. It has Wi-Fi, which works fine with my home router. However, the office network does not allow my Pi Zero W to connect, which is why I chose to move to the Pi 4. This big brother is much faster, has more memory, and can be powered via PoE with an optional “HAT”. The plan is to cluster four Pi 4s, connected to both Ethernet and Wi-Fi networks, and powered by PoE to keep the number of cables to a minimum.
On the network side, Cloudflare really does make things easy. We will learn to use the following Cloudflare products:
1. DNS – Cloudflare will be our authoritative DNS provider, and will also proxy traffic for our DNS records, keeping our origin IPs hidden.
2. SSL – Visitors connecting to Cloudflare will do so through a secure SSL connection. Furthermore, Cloudflare will connect to our Raspberry Pi through another secure SSL connection.
3. CDN – We will configure Cloudflare to aggressively cache and compress our web content. This allows our web pages to load quickly for visitors no matter where they are located.
4. DDoS protection – Cloudflare’s network is able to absorb and/or deflect network layer attacks by default.
5. WAF – Cloudflare’s Web Application Firewall will block malicious traffic, preventing attacks on our application layer.
6. Argo Tunnel – This technology allows us to simply power on our Raspberry Pi anywhere in the world. A secure tunnel is established to Cloudflare, and our web content becomes available without needing to change DNS records. Everything comes online automatically.
7. Load Balancing – We will scale up for production traffic by clustering multiple Raspberry Pis, each having a synchronised copy of our web content. Cloudflare will automatically load-balance requests to each cluster node as needed.
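To make the load-balancing idea a little more concrete, here is a minimal Python sketch of round-robin distribution across healthy nodes. It only illustrates the general technique and is not a description of how Cloudflare's load balancer works internally. The node names (`pi1` to `pi4`) and the function name are made up for this example.

```python
from itertools import cycle


def assign_requests(nodes, healthy, requests):
    """Hand each request to the next healthy node in rotation.

    nodes    -- ordered list of node names (hypothetical, e.g. "pi1")
    healthy  -- dict mapping node name to True/False
    requests -- iterable of request identifiers
    """
    # Keep only the nodes currently marked healthy, preserving order.
    pool = [node for node in nodes if healthy.get(node, False)]
    if not pool:
        raise RuntimeError("no healthy nodes available")

    rotation = cycle(pool)
    return [(request, next(rotation)) for request in requests]


# Example: four Pis, one of them offline.
nodes = ["pi1", "pi2", "pi3", "pi4"]
healthy = {"pi1": True, "pi2": True, "pi3": False, "pi4": True}

for request, node in assign_requests(nodes, healthy, ["r1", "r2", "r3", "r4", "r5"]):
    print(request, "->", node)
```

The point of the sketch is that a node which fails its health check simply drops out of the rotation, so visitors keep getting served by the remaining nodes.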
At the application level, I’ll be using:
1. Nginx as a web server. I chose this over Apache because Cloudflare uses Nginx in-house. It’s also more resource efficient than Apache.
2. UFW – This is an additional firewall which we can use to block everything other than ports 22 (SSH), 80 (HTTP), 443 (HTTPS), and 5900 (VNC).
3. Bludit – This is a flat-file CMS, which doesn’t use a traditional database (such as MariaDB or MySQL), but instead uses JSON to store content in text files. Bludit is also very resource efficient.
4. Syncthing – A peer-to-peer file sync application that will keep our Bludit content in sync across all four cluster nodes. This will allow us to use Cloudflare’s load balancing to distribute traffic across all four cluster nodes.
5. Crontab – The default task scheduler, which will be used to reset file ownership information after Syncthing is done copying updated files to the cluster nodes. Syncthing does not set ownership of files, so we will script this using Crontab.
6. VNC Connect – Allows remote desktop functionality over the Internet. All cluster nodes can then be managed via both SSH and VNC from anywhere in the world.
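As a rough idea of what the scheduled ownership reset could look like, here is a minimal Python sketch that walks a directory tree and fixes the owner of anything that doesn't match. The web root path and the `www-data` user in the comments are assumptions for illustration, not the setup used later in this series, and the function name is hypothetical.

```python
import os


def iter_tree(root):
    """Yield the root directory and every file and directory below it."""
    yield root
    # os.walk does not descend into symlinked directories by default.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            yield os.path.join(dirpath, name)


def reset_ownership(root, uid, gid, chown=os.chown):
    """Set owner uid:gid on every entry under root that differs.

    Returns the list of paths that were changed. The chown parameter
    exists only so the function can be tried out without root access.
    """
    changed = []
    for path in iter_tree(root):
        info = os.lstat(path)
        if info.st_uid != uid or info.st_gid != gid:
            # Change the link itself rather than whatever it points to.
            chown(path, uid, gid, follow_symlinks=False)
            changed.append(path)
    return changed


# Assumed usage, run as root (path and user are placeholders):
#   import pwd, grp
#   uid = pwd.getpwnam("www-data").pw_uid
#   gid = grp.getgrnam("www-data").gr_gid
#   reset_ownership("/var/www/bludit", uid, gid)
```

Saved as a script, something like this could be run from root's crontab every few minutes. Only touching files whose ownership is actually wrong keeps each run cheap on a small device.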
Stay tuned for the next article, when we'll finally get started with building our cluster...