Description of the current state of SEO in the Historiana project.
Updated September 2020 to reflect the current state.
An archive of the original document from September 2019 is kept here.
In the original document we outlined the current and common practices of SEO in general and connected them to the opportunities we wanted to take within the Historiana context. Please refer to the archived version for this background information.
For the statistics we were using Google Analytics, which gave little insight into the usage of the various parts of the site because everything is registered as a single visit.
As an experiment we installed a local version of Matomo, which is in many ways quite similar to Google Analytics but can be self-hosted. The Matomo documentation covers tracking SPA navigation fairly well, and we are currently using the trackPageView calls in the live version of Historiana.
example of custom page-tags in Matomo statistics
We are able to differentiate between the various sections of the site but can’t track the deeper content like specific Historical Content items.
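As an illustration of the trackPageView calls mentioned above, here is a minimal sketch of reporting a virtual page view to Matomo from an SPA. The helper name, the example path and the title are assumptions for illustration and not the actual Historiana code. The three commands (setCustomUrl, setDocumentTitle, trackPageView) are part of the Matomo JavaScript tracker, which reads them from the global _paq queue.

```javascript
// Hypothetical helper: queue a Matomo page view for a virtual SPA page.
// `paq` stands in for the global `window._paq` command queue that the
// Matomo JavaScript tracker reads from.
function trackVirtualPageView(paq, path, title) {
  paq.push(['setCustomUrl', path]);      // URL to record for this view
  paq.push(['setDocumentTitle', title]); // title shown in the reports
  paq.push(['trackPageView']);           // send the page view
  return paq;
}

// In the browser this would typically be called from a router hook with
// window._paq; a plain array is used here so the sketch runs anywhere.
const queue = [];
trackVirtualPageView(queue, '/teaching-learning', 'Teaching & Learning');
console.log(queue);
```

Calling such a helper on every route change is what lets the statistics distinguish sections of the site instead of registering a single visit.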
Although this mostly solves the tracking part of the SPA, it does not help search-engine optimisation at all. Bots visiting the site for Google and other search engines won’t be able to navigate to specific pages. This is partly because some of the navigational code is in JavaScript, and many bots do not execute JavaScript; they just look for <a> links.
Another reason is that currently Historiana is using hash navigation; links to content are created with a # sign. For example: https://historiana.eu/#/teaching-learning instead of https://historiana.eu/teaching-learning
This is the default behaviour of Vue but causes problems on the SEO side of things. Originally the # symbol was intended to address a position within a page. Frameworks like Vue misuse this feature to their advantage. Search engines, however, won’t support these links as it breaks things on their end.
Fixed the # issues
In the early days of the Vue framework, all routing of pages was done via a "hack" using the # symbol in the URL. As this symbol was originally intended to reference a named section of an HTML page, this broke the way Google et al. index pages.
Later releases of Vue support the history mode, where the # is no longer needed. This required quite some changes on the server but, more importantly, also in the codebase of Historiana.
It turned out that moving from the # to history mode required quite some effort, as many parts of the existing eActivity apps were coded explicitly around the # sign method. Disabling it broke quite a few areas of the site, which were all tracked and fixed during development.
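To illustrate the kind of server change history mode generally needs: without the #, a URL such as /teaching-learning reaches the server as a real path, so the server has to answer with the application's index.html for any path that is not a file or an API route. The sketch below shows that decision as a plain function. The function name and the rules (API under /api/, static files recognised by a file extension) are assumptions for illustration, not a description of the actual Historiana server.

```javascript
// Sketch of a history-mode fallback decision (hypothetical helper).
// Assumptions: API routes live under /api/ and static assets have a file
// extension; every other path is treated as a client-side route.
function resolveHistoryRequest(pathname) {
  if (pathname.startsWith('/api/')) {
    return { type: 'api', target: pathname };
  }
  const lastSegment = pathname.split('/').pop();
  if (lastSegment.includes('.')) {
    return { type: 'static', target: pathname };
  }
  // Unknown path: hand index.html to the browser and let the client-side
  // router pick the matching view from the URL.
  return { type: 'spa', target: '/index.html' };
}

console.log(resolveHistoryRequest('/teaching-learning'));
console.log(resolveHistoryRequest('/robots.txt'));
console.log(resolveHistoryRequest('/api/sitemap'));
```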
Apart from the code, we also needed to clean up and convert existing data in the database, as in some situations the URLs stored in the database included the # sign.
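A conversion of this kind can be sketched as a small helper that rewrites a hash-style URL to its history-mode equivalent. The function name is hypothetical and the sketch assumes absolute URLs whose route starts with "#/"; it does not reflect the actual migration script, and a query string placed before the # is not handled.

```javascript
// Hypothetical migration helper: rewrite a stored hash-style URL to its
// history-mode equivalent. URLs without a "#/" route are returned unchanged.
// Assumes absolute URLs; a relative value would make `new URL` throw.
function hashUrlToHistoryUrl(rawUrl) {
  const url = new URL(rawUrl);
  if (!url.hash.startsWith('#/')) {
    return rawUrl;
  }
  const basePath = url.pathname.replace(/\/$/, ''); // "/" becomes ""
  const route = url.hash.slice(1);                   // drop the leading "#"
  return url.origin + basePath + route;
}

console.log(hashUrlToHistoryUrl('https://historiana.eu/#/teaching-learning'));
// prints "https://historiana.eu/teaching-learning"
```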
The current version has been reworked and is completely free of the # sign legacy.
In the old days sites often had a “sitemap” which was a standard HTML page showing a tree of all the pages in the site. Some sites still have this feature but Google is pushing an invisible version of this concept: sitemap.xml.
This is a (computer-generated) file which describes all known URLs of a website in an XML format. It is based on an open standard, the Sitemaps protocol, originally introduced by Google.
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://www.example.com/</loc>
<lastmod>2005-01-01</lastmod>
<changefreq>monthly</changefreq>
<priority>0.8</priority>
</url>
</urlset>
We have extended the Historiana API and added the functionality to automatically generate a sitemap. Currently it is feasible to create the map on request; for now there is no need to generate it on a daily schedule, so it is generated at the time of the request.
The advantage is that newly added content is shown immediately in the sitemap.
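Generating the map at request time can be sketched as a function that turns a list of known URLs into the XML format shown above. This is a minimal sketch, not the Historiana API code: the function names, the entry shape ({ loc, lastmod }) and the example date are assumptions for illustration. In a real endpoint the entries would come from the database at the time of the request.

```javascript
// Escape the characters that must be entity-escaped in sitemap XML values.
function escapeXml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Build a sitemap document from a list of { loc, lastmod } entries.
// The entry shape is an assumption; lastmod is optional.
function buildSitemap(entries) {
  const urls = entries.map((entry) => {
    const lastmod = entry.lastmod
      ? `\n    <lastmod>${escapeXml(entry.lastmod)}</lastmod>`
      : '';
    return `  <url>\n    <loc>${escapeXml(entry.loc)}</loc>${lastmod}\n  </url>`;
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urls,
    '</urlset>',
  ].join('\n');
}

// Example entries; the lastmod value is made up for the illustration.
console.log(buildSitemap([
  { loc: 'https://historiana.eu/teaching-learning', lastmod: '2020-09-01' },
  { loc: 'https://historiana.eu/historical-content/key-moments/world-war-1' },
]));
```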
Google and other search engines will find the sitemap via the robots.txt file at historiana.eu/robots.txt, which refers these engines to the URL of the map at historiana.eu/api/sitemap.
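The reference from robots.txt to the sitemap is a single Sitemap line containing the full URL of the map. The sketch below builds such a file body; the helper name and the permissive "Allow: /" rule are assumptions for illustration, and the actual contents of the Historiana robots.txt may differ.

```javascript
// Hypothetical helper: build a robots.txt body that points crawlers at
// the sitemap. The "Allow: /" rule is an assumption for this sketch.
function buildRobotsTxt(sitemapUrl) {
  return [
    'User-agent: *',
    'Allow: /',
    `Sitemap: ${sitemapUrl}`,
  ].join('\n') + '\n';
}

console.log(buildRobotsTxt('https://historiana.eu/api/sitemap'));
```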
Even though the sitemap has only been live for a few weeks, we find that Google is showing more and deeper content than before. We are optimistic about the long-term results.
While Google is not the only search engine, it is the biggest one and many people use it by default. As a result, many companies buy keywords and try to beat others to the first page of search results.
This business model brings in a lot of money, and hence Google created tools that give its customers insight into their investments. The current name of the main entry point of these tools is Google Search Console; earlier it was called Webmaster Tools (Wikipedia).
We have already set up an account for the Historiana project.
With the sitemap functionality completed, we will continue to use Search Console to monitor the statistics and usability reports it generates.
Quicker managed releases
We moved from a dev > live development cycle to a dev > test > live cycle.
Current development is primarily focused on completing and improving existing parts; we expect to start development of new features in October 2020. At that time we will split the development into two branches. The first is a real development branch which carries the new parts. Things will break in this version, and it will only be available via dev.historiana.eu.
The live site will be a separate branch and published on test.historiana.eu; once validated it will be duplicated on the live site at historiana.eu.
This way we can drastically shorten the time to fix issues on the live site as it is isolated from the development version in which things can be broken. We hope this will improve both stability and update frequency.
SSR
In the previous report we indicated that being a Single Page Application has its drawbacks and that there are various ways of getting around them. One way would be to render everything completely on the server with tools such as NuxtJS, which is part of the VueJS ecosystem.
We explicitly decided not to go that route; see the original document for details.
However, we feel that it would be very useful to include meta information specific to a URL in the header of the page. Currently, sharing a link on Twitter, WhatsApp, Facebook et al. will result in a generic blob of information being attached to the link.
Ideally every URL would have its own custom information; e.g. https://historiana.eu/historical-content/key-moments/world-war-1 would show the image, title and first paragraph of text found on that page.
Doing this requires some server-side processing, which we are currently investigating.
We see this feature as a very welcome extension via which we can show the broad content of the project via social media sharing.
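One possible shape of that server-side processing, sketched under assumptions: before the generic index.html is returned, the server looks up the page behind the requested URL and inserts Open Graph and Twitter Card meta tags into the head. The function names, the page object and its example values (including the image URL) are hypothetical. The og:title, og:description, og:image, og:url and twitter:card tags are the ones commonly read by social platforms for link previews.

```javascript
// Escape a value for safe use inside a double-quoted HTML attribute.
function escapeHtmlAttr(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Insert share metadata just before </head>. `page` is a hypothetical
// object with title, description, image and url fields.
function injectShareMeta(html, page) {
  const tags = [
    ['property', 'og:title', page.title],
    ['property', 'og:description', page.description],
    ['property', 'og:image', page.image],
    ['property', 'og:url', page.url],
    ['name', 'twitter:card', 'summary_large_image'],
  ]
    .map(([attr, key, value]) =>
      `<meta ${attr}="${key}" content="${escapeHtmlAttr(value)}">`)
    .join('\n');
  // A function replacer avoids "$" in the values being read as a pattern.
  return html.replace('</head>', () => `${tags}\n</head>`);
}

// Example values are placeholders, not real Historiana content.
const shell = '<html><head><title>Historiana</title></head><body></body></html>';
console.log(injectShareMeta(shell, {
  title: 'World War 1',
  description: 'First paragraph of the page text goes here.',
  image: 'https://historiana.eu/example-image.jpg',
  url: 'https://historiana.eu/historical-content/key-moments/world-war-1',
}));
```

This keeps the application a Single Page Application: only the head of the served page differs per URL, while rendering of the content stays in the browser.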