Venus: Semi-Manual FreeBSD 11-CURRENT AMD64 ZFS+UEFI Installation

2016-02-02 10_50_41

In this post I’ll be describing how to do a semi-manual installation of a FreeBSD 11 ZFS system with UEFI boot. Big thanks to Ganael Laplanche for this mailing list entry, as it was of great help. Some things have changed since then which make the process a little simpler, and that’s why I’m writing this. :) I’ll also include some steps I consider best practices.

The steps outlined below are generalized from how I installed FreeBSD on my dev box named Venus.

As I’m writing this, the latest FreeBSD 11 snapshot is of r294912 (2016-01-27), and does not yet support automatic installation to ZFS on UEFI systems. I’m using this snapshot for installing the system.

Start the installer normally, and go through the steps. When you get to the part where it asks whether you want to install to UFS, ZFS, etc., choose to open a shell.

Create the partition scheme for each drive you will be using in your root zpool, and make sure to use unique labels. Make sure to replace ‘ada0’ with whatever is appropriate for you.
gpart create -s gpt ada0
gpart add -t efi -s 800k ada0
gpart add -t freebsd-zfs -a 1m -s 55g -l YourLabel ada0
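
If the root zpool will span several drives, a dry run can help confirm that every drive gets its own label before anything is written. The sketch below only prints the commands; the helper name print_partition_cmds and the sys0, sys1 label scheme are assumptions for illustration, not part of the original steps.

```shell
#!/bin/sh
# Dry run: print the partitioning commands for each drive instead of
# running them. The helper name and the "sysN" labels are hypothetical.
print_partition_cmds() {
    n=0
    for disk in "$@"; do
        echo "gpart create -s gpt $disk"
        echo "gpart add -t efi -s 800k $disk"
        echo "gpart add -t freebsd-zfs -a 1m -s 55g -l sys$n $disk"
        n=$((n + 1))
    done
}

# Example with two drives; replace the names with your own.
print_partition_cmds ada0 ada1
```

Once the printed commands look right, they can be pasted into the installer shell one by one.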

I aligned the freebsd-zfs partition to 1M to ensure it’s 4k aligned, and to leave room for boot loader changes. I specified a 55GB partition because my SATADOMs are 64GB, and I want to leave some free space in case I need to replace one of them with another which isn’t the exact same size, and because I want to leave some room for other things such as a future log, cache or swap partition.
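
The alignment claim is plain arithmetic: any offset that is a multiple of 1 MiB is also a multiple of 4 KiB. A small sketch, where is_aligned is a hypothetical helper and not something gpart provides:

```shell
#!/bin/sh
# is_aligned OFFSET ALIGNMENT: print "yes" if OFFSET (in bytes) is a
# multiple of ALIGNMENT (in bytes), otherwise "no". Hypothetical helper.
is_aligned() {
    offset=$1
    alignment=$2
    if [ $((offset % alignment)) -eq 0 ]; then
        echo yes
    else
        echo no
    fi
}

# A partition starting on a 1 MiB boundary is also on a 4 KiB boundary.
is_aligned $((1024 * 1024)) 4096    # prints yes

# For contrast: 17408 bytes (sector 34 with 512-byte sectors) is not.
is_aligned 17408 4096               # prints no
```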

Create the zpool and add datasets, then exit the shell. All datasets within sys/ROOT/default are optional.
zpool create -m none -o altroot=/mnt -O atime=off -O checksum=fletcher4 -O compress=lz4 sys gpt/YourLabel
zpool set bootfs=sys/ROOT/default sys
zfs create -p sys/ROOT/default/var
zfs create -o compress=gzip-9 -o setuid=off sys/ROOT/default/var/log
zfs create -o compress=gzip-9 -o setuid=off sys/ROOT/default/var/tmp
zfs create sys/ROOT/default/usr
zfs create -o compress=gzip-9 sys/ROOT/default/usr/src
zfs create sys/ROOT/default/usr/obj
zfs create sys/ROOT/default/usr/local
zfs create sys/data
zfs create -o mountpoint=/usr/home -o setuid=off sys/data/homedirs
zfs mount -a
exit
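
Since the pool is created with -m none, datasets inherit mountpoint=none unless one is set explicitly, and zfs mount -a skips such datasets. Before running exit, it may be worth confirming that sys/ROOT/default and its children resolve to paths under /mnt. A minimal sketch, assuming the tab-separated output of zfs list -H; the helper name datasets_without_mountpoint is hypothetical:

```shell
#!/bin/sh
# Read "name<TAB>mountpoint" lines on stdin (the format printed by
# `zfs list -H -o name,mountpoint`) and print the names of datasets
# whose mountpoint is "none". Hypothetical helper.
datasets_without_mountpoint() {
    awk -F '\t' '$2 == "none" { print $1 }'
}

# Usage from the installer shell, before `exit`:
#   zfs list -H -r -o name,mountpoint sys | datasets_without_mountpoint
```

Seeing sys, sys/ROOT and sys/data in that list is expected. If sys/ROOT/default shows up as well, setting a mountpoint on it (for example zfs set mountpoint=/ sys/ROOT/default) is one way to make its children inherit /var, /usr and so on; that is an assumption rather than one of the steps above.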

Now the installer should continue doing its thing. Do what you’d normally do, but when it asks if you want to open a shell into the new environment, say yes.

Execute this command to ensure ZFS mounts all datasets on boot:
echo 'zfs_enable="YES"' >> /etc/rc.conf

Configure the (U)EFI partitions by doing the following for each drive that is a member of the ‘sys’ zpool: (remember to replace ‘ada0’ with whatever is appropriate for you)
mkdir /mnt/ada0
newfs_msdos ada0p1
mount -t msdosfs /dev/ada0p1 /mnt/ada0
mkdir -p /mnt/ada0/efi/boot
cp /boot/boot1.efi /mnt/ada0/efi/boot/BOOTx64.efi
mkdir -p /mnt/ada0/boot
cat > /mnt/ada0/boot/loader.rc << EOF
unload
set currdev=zfs:sys/ROOT/default:
load boot/kernel/kernel
load boot/kernel/zfs.ko
autoboot
EOF

At this time you can double check you have the expected file hierarchy in /mnt/ada0:

(cd /mnt/ada0 && find .)

Should output:
.
./efi
./efi/boot
./efi/boot/BOOTx64.efi
./boot
./boot/loader.rc
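
With several drives, comparing the listings by eye gets tedious. The check can be scripted; this is a sketch under the assumption that the hierarchy listed above is exactly what should be on each EFI partition, and check_esp_layout is a hypothetical helper name.

```shell
#!/bin/sh
# check_esp_layout DIR: compare the files under DIR with the hierarchy
# expected by this guide. Prints "ok: DIR" or "mismatch: DIR" and
# returns non-zero on a mismatch. Hypothetical helper.
check_esp_layout() {
    dir=$1
    expected=$(printf '%s\n' . ./boot ./boot/loader.rc \
        ./efi ./efi/boot ./efi/boot/BOOTx64.efi | LC_ALL=C sort)
    actual=$(cd "$dir" && find . | LC_ALL=C sort)
    if [ "$expected" = "$actual" ]; then
        echo "ok: $dir"
    else
        echo "mismatch: $dir"
        return 1
    fi
}

# Usage, with the mount point used above:
#   check_esp_layout /mnt/ada0
```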

Now, if you have more than one drive, you can just copy the contents of /mnt/ada0 to the appropriate mountpoints:
cp -R /mnt/ada0/ /mnt/ada1/

Remember to unmount the EFI partitions, then exit the shell and reboot into the new system. :)

Once you’re in the new system, you should create a read-only ZFS dataset for /var/empty.

PS: Similar to how you need to re-apply bootcode when upgrading zpool version, you should probably re-copy the boot loader (/boot/boot1.efi in the steps above) to the EFI partition as ./efi/boot/BOOTx64.efi. I am not sure if this is strictly necessary… But it shouldn’t hurt. :) I’ll update this paragraph when I get a confirmation one way or the other.

Introduction: Venus, the FreeBSD dev box

System Overview (Venus)

(Also known as: The Mini-ITX quarter-depth chassis that could fit a Micro-ATX mainboard)

I’ll be using this system for my FreeBSD hacking, but this post focuses on the system hardware.

The story of this system started when I ordered my Super Micro quarter-depth (SC505-203B) Atom-based firewall named Kuiper from Nextron. The chassis specifications state it’s 24.9cm deep and 43.7cm wide (9.8″ and 17.2″ respectively), and would only fit mini-ITX boards.

As I was also interested in a Xeon E3v5 virtualization server, and would prefer it to be quarter-depth as well, I was a little disappointed that Super Micro didn’t have any mini-ITX mainboards for that platform. Nextron helpfully suggested that they could check if the Super Micro X11SSL-F mainboard (micro-ATX) would fit when building my firewall, as they had it on hand. It has the dimensions 24.4cm by 24.4 cm (9.6″ by 9.6″). I was not expecting it to fit as the depth of the mainboard was a mere 5 mm (0.2″) less than the chassis.

A few days later, when the firewall was built, they reported back: It fits! But they would have to sacrifice one of two 2×2.5″ drive bays. It was also a very, very snug fit, as can be seen in the image below. Excellent! I only need two data drives in that system anyway, and COULD use them as root drives if necessary.

Snug Fit 2

Snug fit! This is NOT the I/O side!

I was happy. Now, considering this chassis was designed for Atom systems, the PSU would probably not be capable of powering an 80W CPU, not to mention the potential cooling trouble. Nextron suggested getting a 45W CPU, but I decided to pay the premium of getting a low-powered CPU, the Xeon E3-1240L, with a TDP of only 25W.

System Parts
Chassis: SuperChassis CSE-505-203B
Mainboard: Super Micro X11SSL-F
CPU: Xeon E3-1240L
RAM: 32GB (2x 16GB DDR4 2133MHz ECC Unbuffered DIMM)
HDD: 2x Seagate Laptop Thin SSHD HDD/SSD Hybrid, 500GB SATA3 5400RPM 2.5″
SSD: 2x Supermicro SATADOM 64GB MLC, Vertical (added later)

Picture Gallery

Please note the pictures above were taken at different points throughout my process of modifying the system. The final setup (for now) has two SATADOMs and two SSHDs. It is shown in this post’s featured image, which is also the last entry in the gallery above.

I should probably also mention that the chassis ‘curvature’/apparent bending seen in some of the photos is a trick of the lens.

Introduction: Kuiper, PFSense Firewall

Kuiper Front Side

This system is a 1U quarter-depth Super Micro, Intel Atom-based system. Its name is slightly irregular considering my server naming scheme is planets orbiting Sol, but I figure “Kuiper” isn’t too hard to associate with a firewall, as the Kuiper belt encapsulates our solar system.

Low power consumption and threading were important when deciding which hardware to get for this system. I expect it to push 1 Gbps (inter-VLAN) without too much trouble. It has five Ethernet ports, one of which is dedicated to IPMI. WAN has a dedicated Ethernet port, and I’ll be setting up LAGG between at least two ports for the LAN side, for more efficient routing between VLANs, once the 16-port switch arrives.

System Parts
Chassis: SuperChassis CSE-505-203B
Mainboard: Super Micro A1SAI-2550F
CPU: Intel Atom C2550 (14W, embedded)
RAM: 1x 8GB DDR3-1600 ECC SODIMM
SSD: 1x Innodisk 32GB InnoLite LP Vertical (for 1U) SATADOM MLC

I did a very simple network benchmark (iperf -c hostname -p 20000 -t 60 -P 1 -L 20000 -w 1m -t 30 -i 5), which showed about 600 Mbit/s of bandwidth. Kuiper’s CPU load was hovering around 10% during the test. I’ll have to do more scientific tests later, as I haven’t made any effort to find the actual bottleneck for this test.

FreeBSD, NGINX, SSL and the ChaCha20 cipher suites

In this post, I’ll be describing the journey of enabling the stronger ChaCha20 cipher suites on my FreeBSD NGINX reverse proxy. I’m using SSL Labs SSLTest to get the details on what is being offered, and for sanity-testing the SSL configuration in general. I’m using FreeBSD 10.2.

In nginx.conf:

http {
  (...)
  ssl_protocols       TLSv1 TLSv1.1 TLSv1.2;
  ssl_ciphers         EECDH+CHACHA20:EECDH+AES128:RSA+AES128:EECDH+AES256:RSA+AES256:!MD5;
  ssl_prefer_server_ciphers on;
  (...)
}

This clearly shows the server prefers the EECDH+CHACHA20 cipher suite. But according to the SSL test, it’s not being offered. Why could this be? I’m using whatever OpenSSL version is in the FreeBSD 10.2 base system, so I go poke it to find which ciphers it supports:

# /usr/bin/openssl version
OpenSSL 1.0.1p-freebsd 9 Jul 2015
# /usr/bin/openssl ciphers | grep -i chacha
#
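
The same check reads a little easier with one cipher per line. A sketch, assuming the colon-separated list that openssl ciphers prints; filter_ciphers is a hypothetical helper name:

```shell
#!/bin/sh
# filter_ciphers PATTERN: read a colon-separated cipher list on stdin
# and print the entries matching PATTERN (case-insensitive), one per
# line. Returns non-zero when nothing matches. Hypothetical helper.
filter_ciphers() {
    tr ':' '\n' | grep -i -- "$1"
}

# Usage with the base system binary from above:
#   /usr/bin/openssl ciphers | filter_ciphers chacha
```

Pointing the same pipeline at a different openssl binary makes it easy to compare two builds.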

Apparently, it doesn’t support the ChaCha cipher suites. Ok, then it’s obvious why NGINX isn’t offering those cipher suites. So I go over to my package build server and change its configuration around a little. I add security/libressl to the build list, and I instruct it to build the reverse proxy’s packages with the following make.conf options:

# Build ports against security/libressl
WITH_OPENSSL_PORT=      yes
OPENSSL_PORT=           security/libressl

I instruct the package builder to run a build for the reverse proxy, and it seems to be doing what I wanted. All ports linking to OpenSSL (including NGINX) are being rebuilt, and LibreSSL is built too.

Once the build server is done, I tell the reverse proxy to force an update and upgrade (pkg update -f and pkg upgrade -f), restart NGINX, and test again.

It’s still not serving CHACHA20. WHAT!? I check if NGINX actually linked against LibreSSL:

# /usr/local/bin/openssl version
LibreSSL 2.2.4
# ldd /usr/local/bin/openssl
/usr/local/bin/openssl:
        libthr.so.3 => /lib/libthr.so.3 (0x800888000)
        libssl.so.35 => /usr/local/lib/libssl.so.35 (0x800aac000)
        libcrypto.so.35 => /usr/local/lib/libcrypto.so.35 (0x800d12000)
        libc.so.7 => /lib/libc.so.7 (0x801110000)

# ldd /usr/local/sbin/nginx
/usr/local/sbin/nginx:
        libthr.so.3 => /lib/libthr.so.3 (0x8008db000)
        libcrypt.so.5 => /lib/libcrypt.so.5 (0x800aff000)
        libpcre.so.1 => /usr/local/lib/libpcre.so.1 (0x800d1f000)
        libssl.so.35 => /usr/local/lib/libssl.so.35 (0x800f96000)
        libcrypto.so.35 => /usr/local/lib/libcrypto.so.35 (0x8011fc000)
        libz.so.6 => /lib/libz.so.6 (0x8015fa000)
        libc.so.7 => /lib/libc.so.7 (0x801810000)

Yes, it is linked against LibreSSL. And /usr/local/bin/openssl ciphers does list CHACHA20. This is very strange. While trying to figure out what’s going on here, I strike up a conversation with Allan Jude on IRC and casually mention my troubles. He mentions something about ChaCha20 being slower (not a bad thing), citing https://wiki.freebsd.org/SSHPerf, and mentions some black magic: using the OpenSSL client to check the SSL session handshake. So I do that.

# /usr/local/bin/openssl s_client -host HOSTNAME -port 443
(...)
SSL-Session:
    Protocol  : TLSv1.2
    Cipher    : ECDHE-RSA-CHACHA20-POLY1305
(...)
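
To avoid scrolling through the whole s_client output, the negotiated cipher can be pulled out with a small filter. This is a sketch that assumes the "Cipher    : NAME" line format shown above; negotiated_cipher is a hypothetical helper name.

```shell
#!/bin/sh
# Read `openssl s_client` output on stdin and print the cipher from the
# first "Cipher : NAME" line of the session block. Hypothetical helper.
negotiated_cipher() {
    sed -n 's/^[[:space:]]*Cipher[[:space:]]*:[[:space:]]*//p' | head -n 1
}

# Usage (HOSTNAME is a placeholder, as above; stdin is closed so that
# s_client exits after the handshake):
#   /usr/local/bin/openssl s_client -host HOSTNAME -port 443 \
#       </dev/null 2>/dev/null | negotiated_cipher
```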

What? This is saying it’s using CHACHA20! Why isn’t the SSL test saying so? And then Allan casually says the SSL report does state the preferred cipher is CHACHA20. I go refresh the page, and indeed it does.

So to make a long story short, I had to link NGINX with LibreSSL to get CHACHA20 support. Once that was done, I checked the now stale report and concluded that CHACHA20 wasn’t supported, even though it actually was.

Lesson of the story: Be thorough. Very thorough. Messing up a simple step can make you stare at the screen wondering if you’ve lost your sanity. Also, sharing the trouble can be entertaining for those you share it with, and save you time. Win-win. :)

Let’s Encrypt on a FreeBSD NGINX reverse proxy

This is a write-up on how I set up “Let’s Encrypt” on the reverse proxy sitting in front of the various VM’s serving a few of my websites. I looked at a guide which was very helpful, but I had to fill in on some gaps and tweak the configuration slightly. I’ll be outlining every step of the way here.

Let’s Encrypt lets people enable HTTPS with a trusted certificate, for free. You can even get multiple-domain certificates, in case you run multiple websites behind a single IP address.

First of all, I installed the Let’s Encrypt package.

I then configured nginx to serve the magic folder for verification (/usr/local/etc/nginx/sites-enabled/letsencrypt.conf), and made my “real” vhosts only listen for SSL traffic. (You may have to temporarily disable them or add the magic stuff to each of them, if you didn’t have an SSL configuration at all before this.) I then reloaded nginx.

This is the catch-all ‘magic’ vhost for verification. It will redirect real traffic to the https version of the site.

server {
  server_name example.com something.example.com anotherdomain.example;
  listen 80;
  location '/.well-known/acme-challenge' {
    default_type "text/plain";
    root /tmp/letsencrypt-auto;
  }
  location / {
    return 301 https://$host$request_uri;
  }
}

I then executed the following commands to create my certificate:
export DOMAINS="-d example.com -d something.example.com -d anotherdomain.example"
export DIR=/tmp/letsencrypt-auto
mkdir -p $DIR
letsencrypt certonly --server https://acme-v01.api.letsencrypt.org/directory \
-a webroot --webroot-path=$DIR --agree-dev-preview $DOMAINS

This command outputs the path to the directory containing the certificate files. “privkey.pem” is the private key, and “fullchain.pem” is the file you want to use as certificate.

I updated the nginx configuration to use these certificates, in /usr/local/etc/nginx/nginx.conf:

http {
  (...)
  ssl_certificate /usr/local/etc/letsencrypt/live/example.com/fullchain.pem;
  ssl_certificate_key /usr/local/etc/letsencrypt/live/example.com/privkey.pem;
  (...)
}

I then created a script [letsencrypt_renew.sh] which renews the certificate when it’s 14 days or less from expiring. I set up crontab to call it once a day, and only report any erroneous output:

13 2 * * * /root/letsencrypt_renew.sh /usr/local/etc/letsencrypt/live/example.com/fullchain.pem > /dev/null
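
The script itself is not reproduced here. As a rough idea of the expiry check it describes, the sketch below uses openssl x509 -checkend, which exits non-zero when a certificate expires within the given number of seconds. The function name needs_renewal is hypothetical, and the actual renewal step is left as a comment.

```shell
#!/bin/sh
# Sketch of an expiry check; not the letsencrypt_renew.sh mentioned above.
# needs_renewal CERT DAYS: succeed if CERT expires within DAYS days.
# An unreadable certificate is also treated as needing attention.
needs_renewal() {
    cert=$1
    days=$2
    if openssl x509 -checkend $((days * 86400)) -noout -in "$cert" \
        >/dev/null 2>&1; then
        return 1    # still valid for longer than DAYS days
    fi
    return 0
}

# Run only when a certificate path is given as the first argument.
if [ $# -ge 1 ]; then
    if needs_renewal "$1" 14; then
        echo "certificate $1 expires within 14 days"
        # Re-run the letsencrypt certonly command from above here,
        # then reload nginx.
    fi
fi
```

Because the status message goes to stdout, the cron line above discards it, and only errors written to stderr end up in cron’s mail.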

And that’s it. My websites which are hosted at home are now served over HTTPS with a trusted certificate. For free.

The Let’s Encrypt public beta will start 3rd December 2015, good luck!

Manually compiling your own FUSE file system on FreeBSD

This is a combined rant and tutorial. The tutorial is available further down, under its own subheading. :)

It started Saturday, when I decided to jump in and get my hands dirty with the FUSE API. I started looking for the API documentation, but couldn’t find any which were relevant for my needs. I found some describing the internal kernel API, but nothing describing how to USE that API.

I found some example code with instructions describing how to compile it. These instructions state “gcc -Wall hello.c pkg-config fuse –cflags –libs -o hello”. I got errors about directories not found and flags not recognized. Oh well, not too surprising. I was maybe a little bit optimistic in thinking it was as easy as replacing gcc with clang!

So I went on to google for cmd switch replacements, etc, to no avail. After banging my head on this for over a day, I figured it’s better to leave this problem for another day, and just go with gcc for now. I installed gcc, and… same problem!

I then tweeted, hoping someone would have some suggestions for me. Almost immediately, @badboy_ replied, saying it was displayed wrong, and that some of it should be wrapped in backticks. Okay, so I tried gcc48 -Wall hello.c `pkg-config fuse --cflags --libs` -o hello. Now it was complaining about pkg-config not being a valid command. Okay, one step further. In a combination of frustration and joy I immediately asked where to find “pkg-config” for FreeBSD. @FreeBSDHelp mentioned pkgconf. I tried the substitution trick on this, and the command line became gcc48 -Wall hello.c `pkgconf fuse --cflags --libs` -o hello. Okay, one step further. It was now complaining about a missing package for fuse. Digging around the ports tree, I found sysutils/fusefs-libs. I installed it, tried again... and voila! It compiled, and it worked.

I once again tried the clang-for-gcc substitution trick, and it works like a charm now. I immediately uninstalled gcc. ;)

How to Compile Your Own FUSE File System on FreeBSD?

  1. Install the package sysutils/fusefs-libs
  2. Have some code which uses FUSE (let’s assume hello.c from http://fuse.sourceforge.net/doxygen/hello_8c.html)
  3. Execute: clang -Wall hello.c `pkg-config fuse --cflags --libs` -o hello
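
The steps above can be wrapped in a small function that fails early with a readable message when a prerequisite is missing. This is only a sketch; build_fuse_example is a hypothetical name, and it relies on pkg-config --exists to detect whether the fuse package metadata is installed.

```shell
#!/bin/sh
# build_fuse_example SRC OUT: compile SRC against FUSE with clang,
# checking the prerequisites first. Hypothetical wrapper.
build_fuse_example() {
    src=$1
    out=$2
    if ! command -v pkg-config >/dev/null 2>&1; then
        echo "pkg-config not found" >&2
        return 1
    fi
    if ! pkg-config --exists fuse; then
        echo "no pkg-config entry for fuse; is sysutils/fusefs-libs installed?" >&2
        return 1
    fi
    # $(...) is equivalent to the backticks used in step 3; the flags
    # are left unquoted on purpose so they split into separate words.
    clang -Wall "$src" $(pkg-config fuse --cflags --libs) -o "$out"
}

# Usage:
#   build_fuse_example hello.c hello
```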

That’s how easy it is, really. Now if only someone could have written that somewhere it could be found. :)

Changelog:
2015-10-28: Replaced ‘pkgconf’ with ‘pkg-config’ according to @myfreeweb’s tweet.
2015-10-28: Added freshports.org link to package name.
2015-10-28: Cleaned up some links, adjusted text to reflect the post wasn’t published the day I started writing it.
2015-10-28: Added profile links for twitter handles.
2015-10-28: Made the link to @badboy_’s tweet more visible and added link to @FreeBSDHelp’s tweet.

October Update: Learning for TunnelFS

It has been a while since I announced I would work on an exciting new project, TunnelFS, and I think it’s time for a status update.

In that post, I mentioned I had a lot to learn before I could get anywhere with this project. I had an idea of how much it was, but it turned out to be more. I’ve been working on learning C — I can read C without much trouble, except the really complex cases, and I can actually write a little on my own. For example, I’ve been going through the entire FreeBSD SCTP implementation in an effort to document its sysctl MIBs (there’s still outstanding work on this, but I’ll cycle back to it at some point). I’ve been reading some books on the subject of C, and it makes sense to me. It’ll take a while to really learn the grammar, but the concepts are nothing new. It feels a bit like learning a new language which is similar to your own, but with slightly different grammar and vocabulary.

This week I’ve been trying to figure out FUSE. It’s difficult. Documentation seems to be virtually non-existent, the code that I have found isn’t documented, and what little documentation I have found is on some sourceforge page, or specific to the kernel-side implementation of the API with little or no concern of how one would use that API. I’ve been trying to figure out how to compile some example code using Clang, but have given up and will be using GCC for now. I don’t have to solve every problem all at once. The important thing is getting a grasp of the API. I hope to implement fuse-tunnelfs in C to make it easier to port it as a native file system, keeping the options open for future potential features such as booting from TunnelFS.

I’ve done some thinking on how I want TunnelFS to be implemented, logistically speaking. I’ve come up with two implementation alternatives, but might land on a middle ground if possible. The only difference between the two implementation options is how it handles filesystem I/O on the host side.

TunnelFS-Implementation1

The first option has the filesystem I/O handled by a separate process, which may run as a user other than root if desired, communicating with bhyve over a UNIX socket or some other means which I have yet to investigate. This means the data has to be copied or streamed three times: between fuse-tunnelfs and the TunnelFS VirtIO driver (guest side), between the guest-side TunnelFS VirtIO driver and its bhyve-side backend, and between that backend and the separate process. On the other hand, it has a smaller chance of breaking bhyve, as it leaves less code running in bhyve’s context. It could also be slightly more portable, as the filesystem I/O code isn’t directly tied to bhyve in any way.

TunnelFS-Implementation2

The second option has the filesystem I/O implemented in bhyve itself. This leaves a bigger footprint in bhyve, meaning there’s a higher chance of breaking bhyve. But it also means there’s one less data-copying step, as it won’t have to communicate with a separate process. It is possibly less portable than the first implementation, but it could remain easily portable if done right (i.e. not implementing the I/O logic directly in the VirtIO backend code). Filesystem I/O would run in the bhyve context, meaning all operations would be executed as root.

I think this is a performance-vs-security/reliability decision. I’d err on the side of security/reliability, but this is all quite far off into the future, so we’ll see what I find out. If anyone has any input on this, please let me know; your knowledge is appreciated.

I think that sums up this month, and the status of TunnelFS pretty well. Hopefully, I’ll have skeleton code by the end of the year, but we’ll see. :)