YeenDeer softness blog (programming and electronics)

Ellie the Yeen is a soft YeenDeer that mweeoops and does programming

View on GitHub About Bots Projects Tags

Making a sitemap XML in Liquid

Decided to make a simple sitemap generator using Liquid that is a part of jekyll. it is a template language that lets you do quite a few things like you can generate quite a few different data formats using it and it supports iterations and variables and such.

Liquid is a fun template language similar to Jinja that allows you to do many useful things and generate all kinds of website related things and it is not only limited to HTML either.

To make something be processed by Liquid you have to place three dashes then some optional yaml then three dashes again like this.

---
---
Something

or

---
title: Some variable
---
Something else

if you want some variables defined and you can even put multiple levels of variables in the yaml like the following

---
layout: post
title: Test post
date: 2023-11-08 09:31
category: Programming
author: A yeen
tags: [tag1, tag2, tag3]
summary: This is just a test post
---
Content and any Liquid code here

This makes it a very powerful language that is easy to write and read both by computers and programmers.

Below is a template using Liquid that generates a sitemap that can be used by various search engines and such.
sitemap.xml liquid

---
---
<?xml version="1.0" encoding="UTF-8"?>
<urlset
      xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
            http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">

<url>
  <loc>{{ site.url | append: site.baseurl }}/</loc>
  <lastmod>{{ site.time | date: '%Y-%m-%dT%H:%M:%S+01:00' }}</lastmod>
  <priority>1.00</priority>
</url>

{%- for p in site.posts %}
<url>
  <loc>{{ site.url | append: site.baseurl | append: p.url | urlencode }}</loc>
  <lastmod>{{ p.date | date: '%Y-%m-%dT%H:%M:%S+01:00' }}</lastmod>
  <priority>0.66</priority>
</url>
{% endfor %}

{%- for p in site.pages %}
<url>
  <loc>{{ site.url | append: site.baseurl | append: p.url | urlencode }}</loc>
  <lastmod>{{ site.time | date: '%Y-%m-%dT%H:%M:%S+01:00' }}</lastmod>
  <priority>0.10</priority>
</url>
{% endfor %}

</urlset>

As you see it supports iteration and filters like append and date which is used for date formatting.

This will then render into something approximately like this. Remember that the dates rendering is based on when the site is rendered which means the search engines will not constantly see the current date for pages. This can be good if there is a few extra things on some pages like links on the side that gets updated with interesting content. Some things are cut off from this as there are a lot of pages and there is one of the 3 categories each here

sitemap.xml example

<?xml version="1.0" encoding="UTF-8"?>
<urlset
      xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
            http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">

<url>
  <loc>https://ellietheyeen.github.io/</loc>
  <lastmod>2023-11-08T06:53:00+01:00</lastmod>
  <priority>1.00</priority>
</url>

<url>
  <loc>https://ellietheyeen.github.io/2023/11/06/github-actions-post-on-mastodon.html</loc>
  <lastmod>2023-11-06T14:15:00+01:00</lastmod>
  <priority>0.66</priority>
</url>

<url>
  <loc>https://ellietheyeen.github.io/about.html</loc>
  <lastmod>2023-11-08T06:53:00+01:00</lastmod>
  <priority>0.10</priority>
</url>

</urlset>

The sitemap is then put inside the robots.txt that is also generated and search engines can find it inside.

robots.txt liquid

---
---
User-agent: *
Allow: /
Sitemap: {{ site.url | append: site.baseurl }}/sitemap.xml

robots.txt rendered

User-agent: *
Allow: /
Sitemap: https://ellietheyeen.github.io/sitemap.xml

and will be reusable for other websites.

It is however very recommended to submit your sitemaps to Bing and Google so they can scan them at times and you will be notified about potential issues just like you can do with RSS feeds.

You can also use it to generate a makeshift JSON API to list all the posts on your Jekyll site which can be convenient since it is is easier to parse if you want to use it for bots or alike.

listposts.json liquid

---
---
{
  {%- for p in site.posts %}
    {{ p.title | jsonify }}: {{ site.url | append: site.baseurl | append: p.url | jsonify }}
    {%- unless forloop.last %},{% endunless -%}
  {% endfor %}
}

You should check out the page for Jekyll variables to understand this code better but here is a simple explanation for what they do.

There are also several filters that can be good to know what they do

There are many fun guides how to do stuff like that like here is one for example https://gist.github.com/MichaelCurrin/f8d908596276bdbb2044f04c352cb7c7

You should try Liquid out and see what useful things you can do using it like generating very convenient web pages with automatically generated elements. Maybe you want a sidebar with the latest posts shown on every page, Liquid makes this very easy.

Anyway this was a fun little project and can be very useful if you have a jekyll site you want to be able to have programmatically listed with a sitemap.

*Mweeoops*

By ellietheyeen 8 November 2023 Permalink Tags: jekyll liquid seo sitemap xml


Instance: