
A schematic showing the process of creating a sitemap

A Sitemap is a list of all the Web pages of a Website. It is very useful because it helps search engines discover and index your content faster. There are two types of Sitemap: static and dynamic.

Static Sitemaps are used today only for static Websites, or for Websites that do not have a lot of pages.

Creating these Sitemaps can take a long time, because every time you create a new page on your Website you have to add it to the Sitemap manually. Dynamic Sitemaps, on the other hand, are much more useful and more widely used. They are connected to the Database of the Website, and every time a new page is created, it is automatically included in the Sitemap.

Drupal 8 has a great module that allows us to quickly and easily generate a Sitemap for our Website: Simple XML Sitemap, which you can download from drupal.org. Neither installing nor using the module is complicated.

In its user interface, we can choose which entities we would like to include in the Sitemap.

For example, if we choose the node entity type, we can also choose which node content types, and which individual nodes, we would like to exclude from the Sitemap. After every change, we need to run the "Regenerate sitemap" action, which you can find on the module configuration page ("admin/config/search/simplesitemap"). The Sitemap is then available under your project root, and you can access it by visiting "your_project/sitemap.xml".

As I have already said, this module is very helpful. But what if we want to generate a Sitemap with more specific requirements, for example one that includes only the nodes created after a specific date, or only the nodes created by specific user(s)?

We could still manage that with the module's interface, but what if there are more than a hundred, or even a thousand, nodes to change? Doing this manually would be a pain. It would be a lot easier to write some custom code that regenerates the Sitemap with our changes.

I'll explain in this blog post how to make this happen.

In order to follow the code examples, you will need to have a custom module.

If you don't have any, you'll have to create one. For this tutorial, I'll name mine custom_sitemap.

In the custom_sitemap.module file, I'll create two functions:

/**
 * Builds the Sitemap <url> entry for a path on this site.
 */
function custom_sitemap_sitemap_link($uri, $lang, $changed, $priority) {
  $base_url = \Drupal::request()->getSchemeAndHttpHost();
  $path = $base_url . $uri;

  return custom_sitemap_sitemap_link_markup($path, $lang, $changed, $priority);
}

/**
 * Returns the XML markup for a single Sitemap <url> entry.
 */
function custom_sitemap_sitemap_link_markup($path, $lang, $changed, $priority) {
  // <xhtml:link> is an empty element, so it is self-closed. <lastmod> and
  // <priority> are siblings of <loc>, not children of the link.
  return '<url>
  <loc>' . $path . '</loc>
  <xhtml:link rel="alternate" hreflang="' . $lang . '" href="' . $path . '"/>
  <lastmod>' . $changed . '</lastmod>
  <priority>' . $priority . '</priority>
</url>';
}

To create our custom entries for the Sitemap, I need four values for each page: the URL of the page, the language the node text was written in, a timestamp of when the node was last changed, and the priority of the page in the Sitemap (values from 0.0 to 1.0 are supported).
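As a rough illustration of how these four values map onto a single Sitemap entry, here is a minimal sketch in plain PHP, without Drupal. The function name example_sitemap_entry() is hypothetical and not part of the module. The sketch assumes Unix timestamps formatted in UTC, and it additionally escapes the URL for XML and clamps the priority to the allowed range, which the helper functions above do not do.

```php
<?php

/**
 * Hypothetical helper (not part of Simple XML Sitemap): builds one <url> entry.
 */
function example_sitemap_entry(string $loc, string $lang, int $changed, float $priority): string {
  // Escape characters such as "&" so the values are valid inside XML.
  $safe_loc = htmlspecialchars($loc, ENT_XML1 | ENT_QUOTES, 'UTF-8');
  $safe_lang = htmlspecialchars($lang, ENT_XML1 | ENT_QUOTES, 'UTF-8');
  // Keep the priority inside the 0.0 to 1.0 range.
  $priority = max(0.0, min(1.0, $priority));
  // gmdate('c') formats the timestamp as ISO 8601 in UTC.
  $lastmod = gmdate('c', $changed);

  return "<url>\n"
    . "  <loc>" . $safe_loc . "</loc>\n"
    . '  <xhtml:link rel="alternate" hreflang="' . $safe_lang . '" href="' . $safe_loc . '"/>' . "\n"
    . "  <lastmod>" . $lastmod . "</lastmod>\n"
    . "  <priority>" . number_format($priority, 1) . "</priority>\n"
    . "</url>\n";
}

echo example_sitemap_entry('https://example.com/about-us?a=1&b=2', 'en', time(), 0.7);
```

Escaping matters as soon as a URL contains a query string, because a bare "&" would make the whole Sitemap invalid XML.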

Then I'll create a controller. In the src folder, which should be in the root folder of your module, create a folder named "Controller" if you don't have one, and in it create a PHP file named "MapgenerateController.php". This controller is invoked every time the path "/map-generate" is requested (I will define that route later in my custom_sitemap.routing.yml file), and it calls a service that does the actual Sitemap generation.

<?php
namespace Drupal\custom_sitemap\Controller;

use Drupal\Core\Controller\ControllerBase;

class MapgenerateController extends ControllerBase {

  /**
   * Generates sitemap.
   *
   * @return array
   *   A simple renderable array.
   */
  public function generate() {
    // Call the custom service defined in custom_sitemap.services.yml.
    $sitemap = \Drupal::service('custom_sitemap.sitemap');
    $sitemap->generate();

    $element = [
      '#markup' => 'Sitemap generated.',
    ];
    return $element;
  }
}

In the previous code, I've only called a service, which runs on every request to the route, and returned a markup element for the page that is displayed after the script has run.

The next step is creating and defining a custom route for our controller. In the file custom_sitemap.routing.yml, I've inserted the code below:

custom_sitemap.map_generate:
  path: '/map-generate'
  defaults:
    _controller: '\Drupal\custom_sitemap\Controller\MapgenerateController::generate'
    _title: 'Generate sitemap'
  requirements:
    _permission: 'access toolbar'

I have defined the path for our controller: requesting "/map-generate" executes its "generate" method.

I also need to create the service that I called in our Controller file. It's very simple: in a module_name.services.yml file (in my case, custom_sitemap.services.yml), insert the code below:

custom_sitemap.services.yml

services:
  custom_sitemap.sitemap:
    class: Drupal\custom_sitemap\Sitemap
    public: true

In the previous code, I've defined our service custom_sitemap.sitemap, which will use the class I still need to create. 

The path of the class is: your_project/modules/custom_sitemap/src, the file name in my case is: Sitemap.php.

<?php
namespace Drupal\custom_sitemap;

/**
 * Class Sitemap.
 *
 * @package Drupal\custom_sitemap
 */
class Sitemap {

  public function generate() {
    // Load the Sitemap generated by the Simple XML Sitemap module.
    $string = \Drupal::database()->select('simple_sitemap', 'sm')
      ->fields('sm', ['sitemap_string'])
      ->condition('sm.id', 1)
      ->execute()
      ->fetchField();
    // Drop the last 10 characters (the closing root tag), so that new
    // entries can be appended.
    $string = substr($string, 0, -10);

    // Select the title and langcode of all "page" nodes created by user 1.
    $sql = "SELECT title, langcode FROM {node_field_data} n
            WHERE n.type = :type AND n.uid = :uid";
    $results = \Drupal::database()
      ->query($sql, [':type' => 'page', ':uid' => 1])
      ->fetchAll();

    foreach ($results as $result) {
      // Use the time of this run as the "changed" value (ISO 8601).
      $changed = date('c', time());
      $langcode = $result->langcode;

      // Turn the title into a URL path, e.g. "About Us" becomes "/about-us".
      $name = str_replace(' ', '-', strtolower($result->title));
      $uri = '/' . $name;

      $string .= custom_sitemap_sitemap_link($uri, $langcode, $changed, 0.7);
    }
    // Close the Sitemap again (assuming the root element is <urlset>).
    $string .= '
</urlset>';

    if ($handle = fopen('sitemap.xml', 'w')) {
      if (fwrite($handle, $string) === FALSE) {
        print 'Cannot write to file';
      }
      fclose($handle);
    }
  }

}

This class is probably the most important part of making a custom Sitemap, because it contains the basic logic of the custom integration.

First, I select the current content of the Sitemap, as generated by the Simple XML Sitemap module, from the Database. Then I run an SQL query that selects all nodes of type "Page" that were created by the user with uid 1 (admin). This is just a demo example; you can write the SQL query whatever way you need.

In order to regenerate the Sitemap correctly with our custom values, I need to add the URL, the langcode, and the timestamp of when the node was changed, as well as the priority value for the Sitemap.

To that end, I select only the title and langcode of those nodes from the Database. The title values are adapted for use in URLs: they are converted to lowercase using the function strtolower, and spaces (' ') are replaced with dashes ('-') using the function str_replace.
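The title-to-URL conversion can be tried out on its own in plain PHP. The sketch below uses a hypothetical helper, example_title_to_path(), and differs slightly from the str_replace approach: it also trims the title and collapses runs of whitespace into a single dash. Either way, the result only matches the real page URL if the URL aliases on your site follow the same pattern, which is an assumption of this tutorial.

```php
<?php

/**
 * Hypothetical helper: turns a node title into a URL path.
 */
function example_title_to_path(string $title): string {
  // Lowercase the title and trim surrounding whitespace.
  $name = strtolower(trim($title));
  // Replace each run of whitespace with a single dash.
  $name = preg_replace('/\s+/', '-', $name);

  return '/' . $name;
}

echo example_title_to_path('About Us'), "\n";
echo example_title_to_path('  Contact   the Team '), "\n";
```

If your aliases are not derived from titles in this way, a safer option is probably to load each node and ask Drupal for its URL, for example with $node->toUrl()->toString().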

I set the "changed" timestamp to the moment the script runs, and the priority value to 0.7.

Again, these are example values; you can set them however you want, based on your needs.

Finally, the result is written to the sitemap.xml file.
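The class above makes room for the new entries by cutting a fixed 10 characters off the end of the stored Sitemap. As a hedged alternative, the sketch below looks for the closing tag instead of counting characters. The helper name example_append_entries() is hypothetical, and the sketch assumes that the root element of the Sitemap is <urlset>.

```php
<?php

/**
 * Hypothetical helper: inserts extra <url> entries before the closing tag.
 */
function example_append_entries(string $sitemap, array $entries): string {
  // Assumption: the sitemap root element is <urlset>.
  $closing_tag = '</urlset>';
  // Find the last occurrence of the closing tag instead of relying on a
  // fixed character count.
  $position = strrpos($sitemap, $closing_tag);
  if ($position === FALSE) {
    // Not a complete sitemap; return it unchanged rather than corrupt it.
    return $sitemap;
  }

  return substr($sitemap, 0, $position) . implode('', $entries) . $closing_tag . "\n";
}

$sitemap = "<urlset>\n<url><loc>https://example.com/</loc></url>\n</urlset>\n";
$entries = ["<url><loc>https://example.com/about-us</loc></url>\n"];
echo example_append_entries($sitemap, $entries);
```

This keeps working if the stored Sitemap ends with extra whitespace, or with none at all.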

Now, every time you request "your_domain/map-generate", your Sitemap is regenerated with your newly added custom results. It is highly recommended that you also add a cron job, so that the script runs automatically and periodically, whenever you want. If you don't already have a cron function in your custom module, create one and insert the code below:

function custom_sitemap_cron() {
  $time = date('Hi', time());
 
  // Once a day, just after midnight.
  if ($time >= '0000' && $time <= '0059') {
    // Generate sitemap.
    $generator = \Drupal::service('simple_sitemap.generator');
    $generator->generateSitemap('backend');
   
    $sitemap = \Drupal::service('custom_sitemap.sitemap');
    $sitemap->generate();
   
  }
}

In the previous code, I've defined that my custom cron runs every day, just after midnight. It runs the Regenerate sitemap action of the Simple XML Sitemap module, followed by the newly added custom script.
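One thing to keep in mind: the time check in the cron function is true on every cron run between 00:00 and 00:59, so if cron fires several times in that hour, the Sitemap is regenerated each time. A possible guard, sketched below in plain PHP, also compares against the time of the last run. The helper example_should_run() is hypothetical, and the sketch uses UTC (gmdate) to stay deterministic, whereas the cron function above uses the site's time zone (date).

```php
<?php

/**
 * Hypothetical helper: decides whether the nightly job should run now.
 *
 * @param int $now
 *   Current Unix timestamp.
 * @param int $last_run
 *   Unix timestamp of the last successful run (0 if it never ran).
 */
function example_should_run(int $now, int $last_run): bool {
  // Same window as in the cron function: 00:00 to 00:59 (UTC in this sketch).
  $time = gmdate('Hi', $now);
  $in_window = $time >= '0000' && $time <= '0059';
  // Skip the run if the job already ran on this calendar day.
  $already_ran_today = gmdate('Y-m-d', $last_run) === gmdate('Y-m-d', $now);

  return $in_window && !$already_ran_today;
}

$midnight = gmmktime(0, 30, 0, 1, 2, 2024);
var_dump(example_should_run($midnight, 0));
var_dump(example_should_run($midnight, $midnight - 600));
var_dump(example_should_run($midnight + 3600, 0));
```

In a Drupal module, the last run time could be kept between cron runs with the State API, for example via \Drupal::state().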

 


Lazar Padjan