My Markdown → HTML Setup

A lot of the people I see blogging, and especially micro-blogging, write in markdown.

Markdown is a lightweight markup language for creating formatted text using a plain-text editor.
— Wikipedia | Markdown

It was built on top of syntax that was already common on the web, and was designed to be converted into simple HTML output. Because it leveraged preexisting syntax, it was easy for new users to pick up, and it is now found all over the web and in applications.

Since I started this website, I have been writing each page by hand, using a few tools to facilitate that. For a while I had been looking for a good way to try out markdown for generating some lighter pages and these blog posts.

Writing HTML by hand

When it comes to blogging, a lot of platforms offer a WYSIWYG editor, allowing users to write in rich text that then gets converted into HTML in the style of the platform. But in my case, since I self-host this website, I decided to stick to my roots and write PURE HTML instead.

HTML is fairly simple and easy once you get used to the basic structure of the system. And since I’ve been working in HTML for almost two decades now, at the time it felt like the best solution to make a clean website.

I briefly touched on my design process in 2019-01-21 - First! A New Years Resolution outlining that I wanted to make a very lightweight and simple website. And at the time I believed the best way to achieve this goal was to carefully structure and craft my website’s HTML by hand.

This article is making the process sound far more difficult than it is – it’s mostly just tedious.


<article>
  <h2> Title </h2>
  <p>
    Some paragraph....
  </p>
  <h3> some subsection </h3>
  <p> more text </p>
  <!-- ... etc -->
</article>

That is essentially what the website looks like. You can view the source of this page to see that it’s very simple HTML.

The benefit I found in doing this, mostly by leveraging tidy, was a codebase that is very easy to edit. By leveraging the existing tags and their properties, I also attempted to keep the styling to an absolute minimum, using existing tags to enforce the styling I desired.

Only for certain areas (tables, code, quotes) where readability is an issue do I set up custom CSS.

Most of this process will continue as before, but the actual writing will no longer be obstructed by the tedium of writing HTML.

Micro-blogging in general

At the time of writing this, I have not ported over any of my Gemini micro-blogs. This warrants a longer post, since I wrote consistently in Gemini from March 2021 through May 2021, and only stopped because a long move led to a lot of server downtime, which broke the habit. My Gemini capsule updated multiple days a week, mostly due to the extremely lightweight and limited nature of the platform.

Gemtext

Gemtext was the gemini protocol’s standard MIME type. It was a basic markup language that relied on line-based syntax. It was purposefully kept as lean as possible because this was what was ACTUALLY being served to clients. Unlike Markdown, which first needed to be converted to HTML, gemtext was the actual text served and rendered on the viewer’s client. You could customize the style of your client, but you could not, as an author, dictate how your content would be viewed. This meant the only aspects of your blog you had control over were the actual content and its structure, which for a blog is really all you should care about.

Its syntax contained most of what I was actually using here already from HTML:

headings
paragraphs that were wrapped based on page-width
links
lists
quotes
preformatted-text / codeblocks

Besides links - it also leveraged the same common syntaxes that markdown did.
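To make the "line-based" idea concrete, here is a minimal Python sketch that labels each line of a gemtext document by looking only at how the line starts. The function name and the label strings are made up for illustration, and it simplifies the real format (for example, it does not distinguish heading levels).

```python
FENCE = "`" * 3  # the three-backtick marker that toggles preformatted mode


def classify_gemtext(text):
    """Label every line of a gemtext document with a (kind, line) pair."""
    labelled = []
    preformatted = False
    for line in text.splitlines():
        if line.startswith(FENCE):
            # A fence line switches preformatted mode on or off.
            preformatted = not preformatted
            kind = "preformat-toggle"
        elif preformatted:
            # Inside a preformatted block nothing else is interpreted.
            kind = "preformatted"
        elif line.startswith("=>"):
            kind = "link"
        elif line.startswith("#"):
            kind = "heading"
        elif line.startswith("* "):
            kind = "list-item"
        elif line.startswith(">"):
            kind = "quote"
        else:
            kind = "text"
        labelled.append((kind, line))
    return labelled
```

Because each line is classified on its own, a client only needs to track one piece of state: whether it is currently inside a preformatted block.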

Gemtext links

From my brief time in the IRC and in geminispace in general, a lot of the “recommendations” from new users were about providing in-line links. The philosophy was that by forcing links to exist on their own line, clients could configure how they wanted these to be seen and not have to worry about links interfering with the text.

Like Gopher (and unlike Markdown or HTML), Gemtext only lets you put links to other documents on a line of their own. You can’t make a single word in the middle of a sentence into a link. This takes a little getting used to, but it means that links are extremely easy to find, and clients can style them differently (e.g. to make it clear which protocol they use, or to display the domain name to help users decide whether they want to follow them or not) without interfering with the readability of your actual textual content.
— gemini.circumlunar.space – A quick introduction to “gemtext” markup | Links
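As a rough illustration of why one-link-per-line is convenient for clients, the Python sketch below parses a link line (`=>`, then a URL, then an optional label) and builds a display string that shows the protocol and domain. The function names and the display format are hypothetical and are not taken from any particular client.

```python
from urllib.parse import urlsplit


def parse_link_line(line):
    """Return (url, label) for a gemtext link line, or None for any other line."""
    if not line.startswith("=>"):
        return None
    parts = line[2:].strip().split(maxsplit=1)
    if not parts:
        return None  # "=>" with no URL is not a usable link
    url = parts[0]
    # When no label is given, fall back to showing the URL itself.
    label = parts[1] if len(parts) > 1 else url
    return url, label


def describe_link(line):
    """Build a display string showing the link's protocol and domain."""
    parsed = parse_link_line(line)
    if parsed is None:
        return None
    url, label = parsed
    pieces = urlsplit(url)
    scheme = pieces.scheme or "relative"
    host = pieces.hostname or "same capsule"
    return f"[{scheme}] {label} ({host})"
```

For example, `describe_link("=> https://example.com/post A web page")` gives `[https] A web page (example.com)`, so a reader can tell at a glance that the link leaves geminispace.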

I felt that this provided a lot of useful limitations that removed a huge barrier for me to actually write down ideas without feeling overburdened. I also lurked in the IRC and implemented my own gemini server.

As a quick aside, the Java server was a lot of fun! The protocol was very simple to work with for basic gemtext. I felt the ultimate downside was trying to build both something for basic gemini capsule hosting (like I was using for a decent chunk of my time with gemini) and something for developers to use as a base application server. At the time, in 2021, a lot of talk was happening on IRC about users starting to look to provide more complex experiences via the protocol, and I wanted a way for those interactions to be built out in Java, since most were in Go or Python at the time. This decision led to me burning out due to difficulties splitting those responsibilities cleanly, so that you could host a capsule alongside your application, since I lacked experience with more complex Gemini capsule applications.

But it was a good experience and I got hands-on experience with Certs, Netty, and SNI, which actually came in handy at my job!

Wasn’t this about Markdown?

A lot of what I liked about Gemini I found missing when I returned to the World Wide Web. Writing a new post was tedious and I actually had a few drafts sitting unposted. They’re probably checked into my git at this moment! So I thought - why not just use markdown and convert to HTML? That’s what it’s built for - and I already designed my site to work with minimal customization of raw HTML tags!

How I use Markdown

Firstly, this blogpost was written in Markdown (with minimal HTML sprinkled in). Then I render the markdown into HTML using Discount. Frankly, I don’t know how I stumbled across this markdown parser - I think it came pre-installed on my KDE Arch system because another KDE program used it. But I liked it, and it seemed extensible enough for my needs.

This would produce the “body” of my articles - and I could then prepend and append the template-head and foot to my html output to form a blog post/web page.
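The step above can be sketched in Python as follows. This is not the actual script behind this site, only an illustration under a few assumptions: Discount's `markdown` command is on the PATH and prints an HTML fragment to standard output when given a file, and the template head and foot live in two separate files whose paths are passed in by the caller.

```python
import subprocess
from pathlib import Path


def render_body(markdown_path):
    """Render a markdown file to an HTML fragment using Discount's `markdown` command."""
    result = subprocess.run(
        ["markdown", str(markdown_path)],
        capture_output=True,
        text=True,
        check=True,  # raise if the renderer exits with an error
    )
    return result.stdout


def assemble_page(head, body, foot):
    """Join the template head, rendered body, and template foot into one page."""
    return "\n".join(part.strip("\n") for part in (head, body, foot)) + "\n"


def build_page(markdown_path, head_path, foot_path, output_path):
    """Render one markdown file and write the full HTML page to output_path."""
    page = assemble_page(
        Path(head_path).read_text(encoding="utf-8"),
        render_body(markdown_path),
        Path(foot_path).read_text(encoding="utf-8"),
    )
    Path(output_path).write_text(page, encoding="utf-8")
```

Keeping `assemble_page` separate from the rendering call means the template logic can be checked without having Discount installed.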

Customizations

After I generated the output file, I replaced some placeholders in the templates via sed and then tidy’d the HTML. The only other major issue was that Discount had no way of adding link attributes, so for external links I had sed append the rel and target attributes, which works off the assumption that they’re not already there. A lot of my home-server scripts rely on assumptions…
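For illustration, here are the same two post-processing ideas in Python rather than sed. Everything specific is an assumption: the `{{NAME}}` placeholder syntax, the attribute values `noopener noreferrer` and `_blank`, and the shape of the anchor tags (href as the first attribute, absolute http or https URLs counting as external). Like the sed version, it assumes rel and target are not already present. The tidy step is left out.

```python
import re

# Assumed output shape: <a href="https://..."> with href as the first attribute.
EXTERNAL_LINK = re.compile(r'<a href="(https?://[^"]*)"')


def fill_placeholders(template, values):
    """Replace each {{NAME}} marker (hypothetical syntax) with its value."""
    for name, value in values.items():
        template = template.replace("{{" + name + "}}", value)
    return template


def mark_external_links(html):
    """Append rel and target to anchors whose href is an absolute http(s) URL."""
    return EXTERNAL_LINK.sub(
        r'<a href="\1" rel="noopener noreferrer" target="_blank"',
        html,
    )
```

Relative links such as `/about.html` do not match the pattern, so internal navigation is left untouched.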

This is all bundled up in a simple script file so I can just supply a few arguments and the full page is re-rendered on command.

Two Sources of Truth

In the system I devised, the markdown files are really the “source of truth”, but you could argue that the HTML files hold equal weight, as they’re what you’re reading right now. The markdown is only useful if I render it as HTML. There exist nginx extensions to serve markdown as HTML, so I could store everything as markdown. I could also add some header information to the markdown files to remove the command arguments, and have it generate the .html files in place on boot before launching the site… But these are all nice ideas for a later date.

Ultimately, this is something I contribute to occasionally, so I don’t need anything too complicated. I just need to output some HTML a few times a year. So if I manually publish the HTML each time, that’s likely far more efficient than re-rendering.
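One way the header idea could look, sketched in Python: a few `key: value` lines at the top of each markdown file, separated from the body by a blank line. The header format and the key names are hypothetical. The sketch also assumes every file has such a header, since a file whose first line merely contains a colon would be misread.

```python
def split_header(text):
    """Split leading `key: value` lines from the markdown body that follows."""
    meta = {}
    lines = text.splitlines()
    index = 0
    # Consume lines until the first one without a colon (normally the blank separator).
    while index < len(lines) and ":" in lines[index]:
        key, value = lines[index].split(":", 1)
        meta[key.strip().lower()] = value.strip()
        index += 1
    # Skip the blank line between header and body, if there is one.
    if index < len(lines) and not lines[index].strip():
        index += 1
    return meta, "\n".join(lines[index:])
```

The returned dictionary could then stand in for the command arguments, for example a title for the template placeholder and an output path for the generated .html file.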

Learnings

This is the first post that uses this, though I’ve already converted one page over. Once I worked out the kinks and built a flow that works for me, this made the writing process a LOT easier. Another issue with the old approach was that once I tidy’d an HTML file, it became frustrating to edit, and I didn’t always re-tidy it. Because the output is always tidy’d by the script, I can edit the raw markdown as needed. And the script will generally output the same file (with whatever changes I made, of course). This makes the editing and git history a lot clearer.

I would recommend writing in markdown, or even trying out gemini. You can even host your gemini capsule on the web! (Most gemini webpages are converted gemini capsules.) I am sure other “blog focused markups” exist too.