This version is still in development and is not considered stable yet. For the latest stable version, please use StreamX Guides 1.1.0!

Set up sitemap generation with StreamX and AEM

AEM provides built-in support for generating sitemaps, making the task straightforward if the platform fully controls the website’s structure. However, when working with environments that include multiple sources, markets, and projects, the process can become complex. StreamX is designed to reduce this integration complexity, and its sitemap generation feature helps search engines index the site accurately.

In this tutorial, you will set up StreamX sitemap generation with content originating from AEM.

Prerequisites

To complete this guide, you will need:

An AEM author instance with the WKND project installed. WKND is not included on the AEM author instance by default; it must be installed manually. Installation instructions can be found on the official GitHub project page. To validate the installation, visit the WKND landing page in the English master.
Port 8081 free on your machine: no other StreamX instance or other application may be using it.
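
The port requirement can be checked before starting the mesh. The following is a minimal sketch, not part of StreamX: the helper name port_in_use is hypothetical, and it assumes that "occupying" the port means a process accepts TCP connections on localhost.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper for this tutorial; it only detects listening
    TCP services reachable on the given host.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    if port_in_use(8081):
        print("Port 8081 is already in use; stop the other application first.")
    else:
        print("Port 8081 looks free.")
```

If the script reports that the port is in use, stop the other StreamX instance or application before continuing.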

Step 1: Get the source files

Clone the Git repository containing source files for the example:

git clone https://github.com/streamx-dev/streamx-docs-resources.git

Step 2: Install StreamX OSGi bundles and configuration

To integrate AEM with StreamX, you must install the StreamX OSGi bundle along with all the necessary OSGi dependencies and configurations. Once installed and configured, the OSGi bundle enables feeding StreamX Mesh with data sourced from AEM.

Follow the steps below to install the package:

  1. On the AEM author instance, open CRX Package Manager.

  2. Upload and install aem-with-streamx-tutorials/streamx-aem.all-1.0.2.zip from the cloned project repository.

Step 3: Run the StreamX Mesh

  1. Open the terminal and go to generate-sitemap-aem-tutorial inside the cloned project directory.

  2. Run the StreamX Mesh by using the following command:

    streamx run

  3. Wait for the following output:

    -------------------------------------------------------------------
    STREAMX IS READY!
    -------------------------------------------------------------------
    ...
    -------------------------------------------------------------------
    Network ID:
    ...
    Mesh configuration file: ./mesh.yaml
    -------------------------------------------------------------------

Step 4: Publish content from AEM

  1. Visit http://localhost:8081/sitemap.xml and confirm that the resource is not available. This is because no content has been ingested so far.

  2. On the AEM author instance, open the Sites admin page and navigate to WKND United States.

  3. Select the /content/wknd/us page and, in the top menu, click Manage Publication.

  4. On the next screen (Options):

    • Leave the defaults unchanged:

      • Action: Publish.

      • Scheduling: Now.

    • Click Next.

  5. On the next screen (Scope):

    • Click the thumbnail of the /content/wknd/us item to reveal the Include Children option.

    • Click the Include Children item and deselect every checkbox.

    • Confirm your changes by clicking Add.

  6. Finally, click Publish.

  7. Wait for AEM author to complete the publication process.
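
The availability checks around publication can also be scripted. The sketch below polls the sitemap URL from this tutorial until it answers with HTTP 200. The function names, the retry count, and the delay are illustrative assumptions, as is the idea that an unavailable sitemap shows up as a non-200 status or a failed connection; the tutorial itself only states that the resource is not available.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Optional

SITEMAP_URL = "http://localhost:8081/sitemap.xml"  # URL used in this tutorial


def fetch_status(url: str) -> Optional[int]:
    """Return the HTTP status code for url, or None if no response arrives."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, for example with 404
    except (urllib.error.URLError, OSError):
        return None  # nothing reachable at that address


def wait_for_sitemap(
    url: str = SITEMAP_URL,
    fetch: Callable[[str], Optional[int]] = fetch_status,
    attempts: int = 30,   # assumed value, adjust to your environment
    delay: float = 2.0,   # seconds between attempts, also an assumption
) -> bool:
    """Poll url until fetch reports 200; return False if attempts run out."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling wait_for_sitemap() after clicking Publish returns True once the sitemap is served. The fetch parameter exists so that the polling logic can be exercised without a running mesh.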

Step 5: Verify the sitemap

  1. Visit http://localhost:8081/sitemap.xml again.

    • Check the sitemap content. It should list all the pages you have just published, similar to the following example:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url>
        <loc>http://localhost:8081/content/wknd/us.html</loc>
    </url>
    <url>
        <loc>http://localhost:8081/content/wknd/us/en.html</loc>
    </url>
    ...
    <url>
        <loc>http://localhost:8081/content/wknd/us/en/magazine/ski-touring.html</loc>
    </url>
    <url>
        <loc>http://localhost:8081/content/wknd/us/es.html</loc>
    </url>
</urlset>
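
To compare the sitemap against the pages you published, the listed URLs can be extracted with a few lines of Python. This is a minimal sketch using only the standard library. The function name sitemap_locations and the shortened SAMPLE document are illustrative and not part of StreamX; the namespace is the one shown in the example above.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace declared on the <urlset> element of the sitemap.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Shortened sample for illustration; a real sitemap lists every published page.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url>
        <loc>http://localhost:8081/content/wknd/us.html</loc>
    </url>
    <url>
        <loc>http://localhost:8081/content/wknd/us/en.html</loc>
    </url>
</urlset>"""


def sitemap_locations(xml_text: str) -> List[str]:
    """Return the text of every <loc> element, in document order."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


if __name__ == "__main__":
    for location in sitemap_locations(SAMPLE):
        print(location)
```

To check a live sitemap, pass the body downloaded from http://localhost:8081/sitemap.xml to sitemap_locations instead of SAMPLE.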

Summary

Congratulations! You have set up StreamX sitemap generation with AEM as the data source.