How does this site work?
You may be wondering, given all of my years of experience in the field of web hosting, how exactly this site is set up and what my thought and content processes are. I’ve taken a DevOps-y approach to this task and today is your lucky day, since I am going to tell you. So buckle up, as this is a long read!
1. The requirements
Whenever I build something, I start out by figuring out exactly what I need it to do. In the case of this site, it needed to share my biography, the services I can offer and my random ramblings and tutorials with the world. It also had to be multi-lingual, as painlessly as possible. You can’t come from my background and have a slow site either, so any bit of speed I can get, I’ll take. It also needed to be fully hosted by me: no relying on third parties to do steps on my behalf, so that I can show my prowess with the technology and the full stack. Open source is also something I can’t compromise on. Lastly, I’m not all that good when it comes to design, so it had to be easy enough to set up some sort of elegant theme and be done with it.
With all that I was able to identify these must-haves:
- Be Open Source (OSS)
- Self-hostability
- Have blogging capabilities
- Have card/CV capabilities
- Multi-lingual support
- It’s gotta go fast 🦔
- Be automate-able
- Be secure
- Be stylish
Oh, and skipping any large learning curve would be quite nice. While I do love tinkering, there are other things that need to be addressed during the day.
2. The options
At first I had the idea of using a CMS. The debate was between Joomla, Drupal and WordPress. Here’s the thing though: I know the most about WordPress - hosting it, securing it - and I have web designer contacts who work with it, so it was a no-brainer.
Our decision of CMS being made, let’s apply it to our requirement matrix:
- OSS ✅
- Self-hostability ✅
- Blogging capabilities ✅
- Card/Basic site capabilities ✅
- Multi-lingual support 🟨 - needs a plugin, increasing complexity
- Fast-ness 🟨 - this is unknown given plugins and templates, which I can’t control
- Automation 🟥 - I mean sure, it can be automated, but it would take a long time and not to the degree I want. What I want is to just sit down, write articles and be done with it, while also keeping data in source control (i.e. Git), which is difficult with the inherent design. I also wanted some form of staging site, and syncing between two WordPress instances is hit and miss at times.
- Security 🟨 - Again, back to plugins/themes, which increase the attack surface; you also have an admin interface to worry about, as well as a database.
- Stylishness ✅ - Although it depends on the theme
Sitting back and looking at this, I was left wondering if I was trying to fit a square peg into a round hole. There is always the danger that when you have a hammer, every problem looks like a nail. Thinking about it, nothing in our requirements actually mandated a dynamic site to begin with. But what to do instead - learn a JavaScript library and write all of the HTML and CSS manually? I mean, it’s going to load fast, but I doubt all of the time saved loading would equal the time I would’ve spent writing it. Then there’s also the question of quickly generating posts for the blog… No, some other solution was needed…
Hugo? HuWho? Hugo to the rescue! I’ve dug down a bit further into it here. TL;DR: it’s a Golang tool that basically reads Markdown files, “compiles” them into HTML and CSS which you can then host. Oh, and it also comes with a lightweight server, so I can actually stage changes; a minimal sketch of the day-to-day workflow follows the matrix below. Suddenly my automation options are close to infinite given the simplicity of the whole setup! Let’s run it by our requirement matrix:
- OSS ✅
- Self-hostability ✅
- Blogging capabilities ✅
- Card/Basic site capabilities ✅
- Multi-lingual support ✅ - Built-in even!
- Fast-ness ✅ - What’s faster than plain HTML?
- Automation ✅ - All we need to do is decide where to host it, as well as the source files.
- Security ✅ - There is no attack surface at the site HTTP/S layer: no plugins, no admin panel, bliss.
- Stylishness ✅ - Again theme-dependent though.
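As an aside, here is that minimal sketch of the day-to-day Hugo workflow - the site and post names are just examples, not what this site actually uses:
# scaffold a site and draft a post as a Markdown file
hugo new site mysite
cd mysite
hugo new posts/my-first-post.md
# preview locally (drafts included) with the built-in server
hugo server -D
# "compile" everything into static HTML/CSS under public/
hugo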
Given the above, it was decided: Hugo for the win! So now it was a question of building the automation…
3. Automation
Our choice of Hugo meant that our automation steps had to be as follows:
- Sit down and write content (actual writing still requires effort, unfortunately 😞). Hugo has a built-in server, so you can have a local environment too.
- Once happy, commit changes to source control.
- These changes would then get compiled.
- The server would then simply serve the HTMLs.
3.1 SCM
So for step 1 we can’t automate much. For step 2, though, we have a bit of a problem… We wanted everything to be self-hosted, and GitHub is quite the opposite. To solve this problem we would need to host our own SCM instance, so let’s see what our requirements are:
- Needs to be OSS again
- Fast, with a small footprint (I don’t have a lot of capacity to host on my Intel NUC).
- Needs to plug into CI/CD
- Easy to configure and manage, as we don’t have a lot of time.
Now that we know our requirements, let’s take a look at our options. Since we want OSS, we have several possibilities. The most popular is GitLab, which has a community edition. It’s based on Ruby on Rails and has a pretty nifty GUI with the GitHub features… it also requires at least 4 CPUs and a PostgreSQL database, and updating looks to be a pain and… okay, maybe this is not the best solution. Do we really need all of this stuff? All I want from a GUI is a place where I can see changes and the files themselves, plus the actual Git server features, so I can plug them into the CI/CD process at a later point. Looking at our matrix:
- OSS ✅
- Small footprint ❌ - Requires at least 4 cores, plus PostgreSQL with its own requirements
- Plugging into CI/CD 🟨 - I would have to use GitLab CI, which isn’t bad, but it’s something new to learn (more time cost) and not as popular as other options out there.
- Easy to configure and live with 🟨 - Given the previous two points, the large GUI will be a pain to keep up to date as well…
Square peg, round hole situation again. Let’s keep looking, as it seems our princess is in another castle…
Gitolite seems to be just what is needed. It’s lightweight as it relies on the built-in Git install, it can create hooks so that we can inform our upstreams of code changes in a given repo and, the coolest part, all of the configuration is actually a repository that we can manage through Git itself! All we need to do is add our public keys to that repository and proceed from there. So, applying it to our matrix again:
- OSS ✅
- Small footprint ✅
- Plugging into CI/CD ✅
- Easy to configure and live with ✅ - Since everything is configured in a repo
All we need to set it up is:
- Follow the install instructions
- Create a repository to host the raw site files - more info here. You can create a staging and a production branch as well, should you want a staging site to preview your changes. A minimal sketch of the whole flow follows this list.
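Since the configuration is itself a repository, adding a repo and a key is just a commit and a push. Here is that sketch, assuming the Debian gitolite3 user, a host named cgit and a repo named site - all hypothetical names, adjust to your setup:
# clone the admin repo and drop in a new public key (e.g. the one Jenkins will use later)
git clone gitolite3@cgit:gitolite-admin
cd gitolite-admin
cp /path/to/jenkins.pub keydir/jenkins.pub
# declare the site repo and its access rights in the config
cat >> conf/gitolite.conf <<'EOF'
repo site
    RW+ = youruser jenkins
EOF
git add -A
git commit -m "Add site repo and the Jenkins key"
git push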
Right, that done, we do have one small problem: Gitolite does not have a GUI. It would be nice to be able to view things at least. There is the cgit CGI script, written in C, which allows us to… see our Git repos, and which I know plugs in quite nicely with Gitolite. It is a bit long in the tooth now, but for my usage it has the benefit of being fast (hard to beat C here) and its UI is clear. Gitolite also knows how to talk to it, enabling a particularly juicy option - “config cgit.ignore=1” under any repo that we don’t want shown on the web interface (the gitolite-admin repo comes to mind).
So we also do the following, since I couldn’t find a tutorial that actually describes the steps:
- Install a web server that can run the CGI script on our Gitolite machine. I chose Apache since I know it works best with this setup.
- Build the CGI using the instructions.
- Next, configure Apache. This is my virtual host, which I am sharing because the rewrite rules were a pain to get working. You can configure it with SSL if you have the certs.
<VirtualHost *:80>
    # Set to whatever you want
    ServerAdmin webmaster@localhost
    # This should also be where the binary is
    DocumentRoot /var/www/htdocs/cgit/
    <Directory "/var/www/htdocs/cgit/">
        Options +ExecCGI
        AddHandler cgi-script .cgi
        DirectoryIndex cgit.cgi
        RewriteEngine on
        RewriteCond %{REQUEST_FILENAME} !-f
        RewriteCond %{REQUEST_FILENAME} !-d
        RewriteRule (.*) /cgit.cgi/$1 [END,QSA]
    </Directory>
    ErrorLog ${APACHE_LOG_DIR}/error.log
    CustomLog ${APACHE_LOG_DIR}/access.log combined
</VirtualHost>
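On a Debian-style Apache install you will also need to enable the CGI and rewrite modules, plus the vhost itself - this sketch assumes the vhost above was saved as cgit.conf:
# on threaded MPMs (e.g. event) enable cgid instead of cgi
sudo a2enmod cgi rewrite
sudo a2ensite cgit
sudo systemctl reload apache2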
- Lastly, we edit /etc/cgitrc (create it if it doesn’t exist). Here is an example which I got working:
enable-index-links=1
remove-suffix=1
enable-commit-graph=1
enable-log-filecount=1
enable-git-config=1
virtual-root=/
enable-index-owner=0
# Set the title and description to whatever you want
root-title=My awesome SCM
root-desc=My super special repos
local-time=1
# This should be the URL that shows up when you want to clone from the UI
# The Debian packages install with the gitolite3 user.
clone-url=gitolite3@$IP_OF_YOUR_SCM:$CGIT_REPO_URL
# search for these files in the root of the default branch
readme=:README.md
readme=:readme.md
readme=:README.mkd
readme=:readme.mkd
readme=:README.rst
readme=:readme.rst
readme=:README.html
readme=:readme.html
readme=:README.htm
readme=:readme.htm
readme=:README.txt
readme=:readme.txt
readme=:README
readme=:readme
readme=:INSTALL.md
readme=:install.md
readme=:INSTALL.mkd
readme=:install.mkd
readme=:INSTALL.rst
readme=:install.rst
readme=:INSTALL.html
readme=:install.html
readme=:INSTALL.htm
readme=:install.htm
readme=:INSTALL.txt
readme=:install.txt
readme=:INSTALL
readme=:install
# Wherever your gitolite repos are
scan-path=/var/lib/gitolite3/repositories
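- Reload Apache and sanity-check that the repo index renders - assuming the defaults from above:
sudo systemctl reload apache2
# the repo list should come back as HTML
curl -s http://localhost/ | head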
- Enjoy your newly installed local SCM with GUI!
- Create a repository for your site and actually push your raw files there.
3.2 CI
It’s all good and dandy having our source files stored in SCM, but web servers can’t serve them as-is. Sure, one of the options suggested by the Hugo devs is to host with GitHub Pages etc., but we want full self-reliance (as much as is physically possible, of course). To make hosting the actual files as easy as possible we can:
- Have a web server read the actual SCM and compile it into HTML, which it then serves. It would act kind of like a proxy.
- Ship already compiled files to a web directory of an already configured site.
Option 1 is cool, but to get it to work our SCM would have to be publicly reachable. Apart from the security risks, I also have a technical limitation: the Gitolite instance is hosted at home, while the server is going to be a public instance. At home, due to my ISP’s setup, I am double NAT-ed, meaning I can’t control ingress. I also don’t want any outside access to this SCM system, as I want to use it for other projects as well, some of which might be more sensitive. Option 2 it is then. Now if we can only figure out what to compile the files with…
Jenkins, ever popular in the DevOps world, is just the tool we need. While it is a bit heavy-handed for our task (to compile Hugo files all you need to do is execute a simple command), I have had other uses for it, so it’s just sitting in an LXC container on my NUC, right next to the Gitolite instance. No need for a requirements matrix when something is already set up; sometimes recycling beats innovation.
In Jenkins what we want to do is:
- First, configure access keys for Jenkins to access the repository. You want an “SSH username and key” entered into the Jenkins credentials store. You then simply add the public key to “gitolite-admin/keydir/” and make sure that Jenkins has the correct access rights in the repository config under “gitolite-admin/conf/gitolite.conf” (as in the sketch from the SCM section).
- Install Hugo so that it is available as a binary to the jenkins user via the command line. On a Debian-based Jenkins host that can be as simple as the distro package, sketched below (grab an official release binary instead if you need a newer version):
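sudo apt-get install -y hugo
# verify that the jenkins user can actually run it
sudo -u jenkins hugo version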
- Generate a Jenkins access token; this will be required by the hook in the next step. It lives under “Manage Jenkins” > “Configure Global Security” > “Security” > “Git plugin notifyCommit access tokens” > “Add new access token”.
- Back in Gitolite, go ahead and create a file at “gitolite-admin/local/hooks/repo-specific/jenkins”. This will be our post-receive hook, which actually tells Gitolite to notify Jenkins of any changes. The file looks like this:
#!/bin/bash
# URLs here are based on my DNS, cgit and jenkins are resolvable
REPOSITORY_URL=gitolite3@cgit:$GL_REPO
TRIGGER_URL=http://jenkins:8080/git/notifyCommit?url=$REPOSITORY_URL
# Note where you are storing this token, it could be anywhere, just change this value.
JENKINS_TOKEN=$(cat /var/lib/gitolite3/.jenkins_token)
echo -n "Notifying Jenkins of changes to $REPOSITORY_URL ... "
curl -s -m 5 "$TRIGGER_URL&token=${JENKINS_TOKEN}"
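Two things are easy to miss here: the hook file must be executable, and each repo has to opt into it through the admin repo. A sketch, reusing the hypothetical site repo from earlier (the rc change in the next step is what makes this option work at all):
# inside your gitolite-admin clone
chmod +x local/hooks/repo-specific/jenkins
cat >> conf/gitolite.conf <<'EOF'
repo site
    option hook.post-receive = jenkins
EOF
git commit -am "Wire the Jenkins hook to the site repo"
git push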
- We also need to ensure that our Gitolite rc file allows hooks to work. The file to edit lives in the gitolite user’s home and is called “.gitolite.rc”. Annoyingly, it isn’t handled by the admin repo (😞), so you need to edit it manually. Here’s what it looks like without comments:
%RC = (
UMASK => 0027,
GIT_CONFIG_KEYS => '.*',
LOG_EXTRA => 1,
ROLES => {
READERS => 1,
WRITERS => 1,
},
LOCAL_CODE => "$rc{GL_ADMIN_BASE}/local",
ENABLE => [
'help',
'desc',
'info',
'perms',
'writable',
'ssh-authkeys',
'git-config',
'daemon',
'gitweb',
'cgit',
'repo-specific-hooks',
],
);
1;
- Now we actually add a multibranch pipeline to a folder of our choice in Jenkins. We configure it with our SCM under Branch Sources and fill in the rest of the details based on our branches (stg/production/master/main etc., if we have them).
- Set the Build Configuration parameter to “by Jenkinsfile” and give it the path where it should look for one. While we could write it out in the web UI, I find it easier if it lives in the same SCM as the site files themselves.
- In your site source repository we can now create our very first Jenkinsfile. Here’s an example of what it would look like:
pipeline {
    agent any
    stages {
        stage('Building HTMLs...') {
            steps {
                sh "/usr/bin/hugo"
            }
        }
    }
    post {
        // Clean after build
        always {
            cleanWs(cleanWhenNotBuilt: false,
                    deleteDirs: true,
                    disableDeferredWipeout: true,
                    notFailBuild: true,
            )
        }
    }
}
I won’t go into details about Groovy syntax, but the main deal is that under steps it executes the hugo binary, which actually creates our HTMLs. The post step is boilerplate for cleaning up after ourselves.
- Write a new article and see if it triggers a build and completes successfully! (You can also poke the trigger by hand, as shown below.)
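If you would rather not push a dummy commit just to test the wiring, the Git plugin’s notifyCommit endpoint can also be hit manually, using the same URL and token the hook builds:
# simulate the post-receive hook (names as in the hook script above)
curl -s "http://jenkins:8080/git/notifyCommit?url=gitolite3@cgit:site&token=$(cat /var/lib/gitolite3/.jenkins_token)"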
So now that we’re actually building our files continuously (😉) whenever there is a change, we can proceed with deploying them… continuously.
3.3 CD
Luckily for us (and my fingers typing this) there aren’t that many steps left. Essentially what we need is a location that will host all of our files: a web server. This can be Apache, Nginx or even lighttpd. You could use IIS, though at your own peril, as Windows is going to be hard to sync to. In my case I have a VPS that runs Nginx and has a public IP, although the deployment target can live anywhere: an AWS EC2 instance, your Raspberry Pi at home (provided you can make it reachable over the network if you want a public site) etc. I’ve opted for the rsync method, so here are the steps I required:
- Configure an SSH user and key in the Jenkins credentials store. Add the public key to an SSH user on the VPS that can actually read and write the “DocumentRoot”. A quick sanity check is sketched below.
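Before wiring the key into the pipeline, it’s worth confirming it works from the Jenkins host - user, host, port and path here are all placeholders:
# the deploy user must be able to write to the web root
ssh -p 22 deploy@your-vps 'touch /var/www/site/.write_test && rm /var/www/site/.write_test && echo OK'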
- Set up the sites you want on your web server. This varies a lot based on your setup, so I leave the details in your capable hands; see the minimal example below.
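For reference, a static Hugo site needs almost nothing from the web server. A bare-bones Nginx sketch - server_name and root are placeholders:
server {
    listen 80;
    server_name example.com;
    # point this at the directory the pipeline rsyncs to
    root /var/www/site;
    index index.html;
}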
- Now you can add the IP of your server, the username, the port and the path to the site files as global variables in Jenkins (you might even want to make them secrets, depending on your setup). This way, if they change, we don’t have to touch the Jenkinsfiles at all; we just change them at one source.
- Amend the Jenkinsfile to look something like this:
pipeline {
    agent any
    // I've added the user for SSH as a "Credential" so it needs to be loaded in an environment block
    environment {
        SSHUSER = credentials('SSH_USER')
    }
    stages {
        stage('Building HTMLs...') {
            steps {
                sh "/usr/bin/hugo"
            }
        }
        stage('Syncing to VPS...') {
            /* Notice that VPS_PORT and VPS are global variables so I don't need to load them, but since
               SSHUSER is an actual credential we need to refer to the "user" portion of the tuple with
               the _USR suffix. Also, here we rsync the public/ directory, as that is where Hugo puts built
               files */
            steps {
                sh """
                    rsync -az --rsh='ssh -p${env.VPS_PORT}' public/ \${SSHUSER_USR}@${env.VPS}:~/${env.PATH_TO_STG}/
                """
            }
        }
    }
    post {
        // Clean after build
        always {
            cleanWs(cleanWhenNotBuilt: false,
                    deleteDirs: true,
                    disableDeferredWipeout: true,
                    notFailBuild: true,
            )
        }
    }
}
- Note that the global variable PATH_TO_STG holds the path to the staging site; you should have a different one for the production site path and user. You could also forgo this entirely and have only one deployment target.
- Make some changes to an article and push them to SCM, to make sure the build is triggered and the pipeline is working as expected.
- You should see your site update as soon as the build completes (it takes about 10 seconds for mine, but I’m sure it can be made faster).
4. The conclusions
So, taking a DevOps-y approach to building a site takes a long time, but now our solution is much more robust and allows multiple people to work on the site in a collaborative manner. We also get the benefit of an SCM and a CI/CD server, which we can use for other fun projects! Stay tuned to see more of those, and I hope you’ve enjoyed going through this tedious article with me!