Help Ukraine now!

astro-sitemap

This Astro integration generates a sitemap.xml for your Astro project during build.

Release License: MIT


Why astro-sitemap?

The sitemap.xml file describes the structure of your website and its content: pages, images, videos, and the relations between them. See Google's advice on sitemaps to learn more.

The astro-sitemap integration does everything the official @astrojs/sitemap integration does, and much more.

Advantages of astro-sitemap over @astrojs/sitemap:

  • Exclude pages from a sitemap by glob patterns.
  • More control over the sitemap output:
    • manage XML namespaces;
    • lastmod format option;
    • option to link a custom XSL stylesheet.
  • Automatically creates a link to the sitemap in the <head> section of generated pages.
  • Better logging.

Part of the functionality of astro-sitemap was adopted by the official @astrojs/sitemap integration in its minor update from v0.1.2 to v0.2.0.

Shared functionality with the official @astrojs/sitemap:

  • Split up your large sitemap into multiple sitemaps by a custom limit.
  • Ability to add sitemap-specific attributes such as changefreq, lastmod, and priority.
  • Final output customization via a JS function (sync or async).
  • Localization support. At build time the integration analyzes page URLs for locale signatures in their paths to establish relations between pages.
  • Reliability: all config options are validated.

❗ Neither the official integration nor astro-sitemap supports SSR.

❗ This integration uses the astro:build:done hook (the official @astrojs/sitemap does the same). This hook exposes only generated page paths. Thus, in the current version of Astro, neither integration can analyze the page source, frontmatter, etc. They can add the changefreq, lastmod and priority attributes only to all pages at once or to none.


Installation

Quick Install

The experimental astro add command-line tool automates the installation for you. Run one of the following commands in a new terminal window. (If you aren't sure which package manager you're using, run the first command.) Then, follow the prompts, and type "y" in the terminal (meaning "yes") for each one.

# Using NPM
npx astro add astro-sitemap

# Using Yarn
yarn astro add astro-sitemap

# Using PNPM
pnpm astro add astro-sitemap

Then, restart the dev server by typing CTRL-C and then npm run astro dev in the terminal window that was running Astro.

Because this command is new, it might not set things up properly. If that happens, log an issue on the Astro GitHub repository and try the manual installation steps below.

Manual Install

First, install the astro-sitemap package using your package manager. If you're using npm or aren't sure, run this in the terminal:

npm install --save-dev astro-sitemap

Then, apply this integration to your astro.config.* file using the integrations property:

astro.config.mjs

import sitemap from 'astro-sitemap';

export default {
  // ...
  integrations: [sitemap()],
}

Then, restart the dev server.

Usage

The astro-sitemap integration requires a deployment / site URL for generation. Add your site's URL to your astro.config.* file using the site property.

astro.config.mjs

import { defineConfig } from 'astro/config';
import sitemap from 'astro-sitemap';

export default defineConfig({
  // ...
  site: 'https://example.com',

  integrations: [sitemap()],
});

Now, build your site for production via the astro build command. You should find your sitemap under dist/sitemap-index.xml and dist/sitemap-0.xml!

Example of generated sitemap content

sitemap-index.xml

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-0.xml</loc>
  </sitemap>
</sitemapindex>

sitemap-0.xml

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://example.com/</loc>
  </url>
  <url>
    <loc>https://example.com/second-page/</loc>
  </url>
</urlset>
Example of inserted HTML

All pages generated at build time will contain a link to the sitemap in the <head> section:

<link rel="sitemap" type="application/xml" href="/sitemap-index.xml">

Configuration

To configure this integration, pass an object to the sitemap() function call in astro.config.mjs.

💡 For this integration to work correctly, it is recommended to use the mjs or js configuration file extensions.

astro.config.mjs

...
export default defineConfig({
  integrations: [sitemap({
    filter: ...
  })]
});
canonicalURL
Type Required Default value
String No undefined

Absolute URL. The integration needs canonicalURL or site from astro.config. If both values are provided, only canonicalURL will be used by the integration.

astro.config.mjs

import sitemap from 'astro-sitemap';

export default {
  site: 'https://example.com',

  integrations: [sitemap({
    canonicalURL: 'https://another-domain.com',
  })],
};
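
The precedence rule above can be sketched as a small function. This is only an illustration of the documented behavior, not the integration's source; `resolveBaseUrl` is a hypothetical name.

```javascript
// Hypothetical helper: pick the base URL the way the text describes.
// `canonicalURL` wins when both values are present.
function resolveBaseUrl({ site, canonicalURL } = {}) {
  const base = canonicalURL ?? site;
  if (!base) {
    throw new Error('Provide `canonicalURL` or `site`');
  }
  // `new URL` throws on relative URLs, so only absolute URLs pass.
  return new URL(base).href;
}

console.log(
  resolveBaseUrl({
    site: 'https://example.com',
    canonicalURL: 'https://another-domain.com',
  }),
);
// → https://another-domain.com/
```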
filter
Type Required Default value
(page: String): Boolean No undefined

Function to filter generated pages to exclude some paths from the sitemap.

astro.config.mjs

import sitemap from 'astro-sitemap';

export default {
  site: 'https://example.com',

  integrations: [sitemap({
    filter(page) {
      return !/exclude-this/.test(page);
    },
  })],
};
exclude
Type Required Default value
String[] No undefined

The exclude option is an array of glob patterns to exclude static routes from the generated sitemap.

astro.config.mjs

import sitemap from 'astro-sitemap';

export default {
  site: 'https://example.com',

  integrations: [sitemap({
    exclude: ['404', 'blog-*/'],
  })],
};
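
As a rough mental model of glob-based exclusion, the sketch below supports only the `*` wildcard and assumes patterns are tested against page paths relative to the site root. Both points are assumptions for illustration; the integration's real matcher may behave differently, and `globToRegExp` and `isExcluded` are hypothetical names.

```javascript
// Simplified glob support: only `*` (any run of characters except `/`).
function globToRegExp(pattern) {
  const escaped = pattern
    .split('*')
    // Escape regex metacharacters in the literal parts of the pattern.
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    .join('[^/]*');
  return new RegExp(`^${escaped}$`);
}

// A path is excluded when at least one pattern matches it entirely.
function isExcluded(path, patterns) {
  return patterns.some((pattern) => globToRegExp(pattern).test(path));
}

const exclude = ['404', 'blog-*/'];
console.log(isExcluded('404', exclude)); // true
console.log(isExcluded('blog-2022/', exclude)); // true
console.log(isExcluded('about/', exclude)); // false
```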
customPages
Type Required Default value
String[] No undefined

A list of absolute URLs. It is merged with the URLs of the generated pages.

You should also use customPages to manually list sitemap pages when using an SSR adapter. Currently, the integration cannot detect your site's pages unless you are building statically. To avoid an empty sitemap, list all pages (including the base origin) with this configuration option.

astro.config.mjs

import sitemap from 'astro-sitemap';

export default {
  site: 'https://example.com',

  integrations: [sitemap({
    customPages: [
      'https://example.com/virtual-one.html',
      'https://example.com/virtual-two.html',
    ],
  })],
};
entryLimit
Type Required Default value
Number No 45000

Number of entries per sitemap file.

The integration creates a separate sitemap-${i}.xml file for each batch of entryLimit entries (45000 by default) and adds each file to the index, sitemap-index.xml.

See more on Google.

astro.config.mjs

import sitemap from 'astro-sitemap';

export default {
  site: 'https://example.com',

  integrations: [sitemap({
    entryLimit: 10000,
  })],
};
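
The batching described above can be sketched as follows. This is a minimal illustration under the assumption that files are numbered from zero in order; `chunkEntries` is a hypothetical helper, not part of the integration's API.

```javascript
// Hypothetical helper: split a list of URLs into batches of at most `entryLimit`
// and give each batch the file name it would get in the index.
function chunkEntries(urls, entryLimit = 45000) {
  const files = [];
  for (let i = 0; i < urls.length; i += entryLimit) {
    files.push({
      name: `sitemap-${files.length}.xml`,
      urls: urls.slice(i, i + entryLimit),
    });
  }
  return files;
}

const urls = Array.from({ length: 25 }, (_, n) => `https://example.com/page-${n}/`);
const files = chunkEntries(urls, 10);
console.log(files.map((file) => `${file.name}: ${file.urls.length}`));
// → [ 'sitemap-0.xml: 10', 'sitemap-1.xml: 10', 'sitemap-2.xml: 5' ]
```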
changefreq
Type Required Default value
EnumChangefreq No undefined

This option corresponds to the <changefreq> tag in the Sitemap XML specification.

How frequently the page is likely to change.

Ignored by Google.

Available values:

export enum EnumChangefreq {
  DAILY = 'daily',
  MONTHLY = 'monthly',
  ALWAYS = 'always',
  HOURLY = 'hourly',
  WEEKLY = 'weekly',
  YEARLY = 'yearly',
  NEVER = 'never',
}

astro.config.mjs

import sitemap from 'astro-sitemap';

export default {
  site: 'https://example.com',

  integrations: [sitemap({
    changefreq: 'weekly', // the value of EnumChangefreq.WEEKLY
  })],
};
lastmod
Type Required Default value
Date No undefined

This option corresponds to the <lastmod> tag in the Sitemap XML specification.

The date of page last modification.

astro.config.mjs

import sitemap from 'astro-sitemap';

export default {
  site: 'https://example.com',

  integrations: [sitemap({
    lastmod: new Date(),
  })],
};
priority
Type Required Default value
Number No undefined

This option corresponds to the <priority> tag in the Sitemap XML specification.

The priority of this URL relative to other URLs on your site. Valid values range from 0.0 to 1.0.

Ignored by Google.

astro.config.mjs

import sitemap from 'astro-sitemap';

export default {
  site: 'https://example.com',

  integrations: [sitemap({
    priority: 0.9,
  })],
};
serialize
Type Required Default value
(item: SitemapItem): SitemapItemLoose | undefined | Promise<SitemapItemLoose | undefined> No undefined

Function called for each sitemap entry just before the entries are written to disk. It can be sync or async.

The undefined return value excludes the passed entry from the sitemap.

Type `SitemapItem`

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| url | String | Yes | Absolute URL |
| changefreq | ChangeFreq | No | |
| lastmod | String | No | ISO formatted date |
| priority | Number | No | |
| links | LinkItem[] | No | For localization |

Type `LinkItem`

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| url | String | Yes | Absolute URL |
| lang | String | Yes | hreflang, example: 'en-US' |

Interface `SitemapItemLoose`

The SitemapItemLoose interface is a base for the SitemapItem.

It has the properties video, img and many more.

For more details about the SitemapItemLoose interface, see the sitemap.js repo readme and the types source code.

astro.config.mjs

import sitemap from 'astro-sitemap';

export default {
  site: 'https://example.com',

  integrations: [sitemap({
    serialize(item) { 
      if (/exclude-this/.test(item.url)) {
        return undefined;
      }        
      if (/special-page/.test(item.url)) {
        item.changefreq = 'daily';
        item.lastmod = new Date().toISOString(); // SitemapItem.lastmod is an ISO formatted string
        item.priority = 0.9;
      }
      return item;
    },
  })],
};
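
Since serialize may also be async, here is a sketch of an async variant that can be tried outside of Astro. The `lastModified` map and the `/drafts/` path are hypothetical; in a real project the dates might come from a CMS or a git log.

```javascript
// Hypothetical lookup table of last modification dates, keyed by absolute URL.
const lastModified = new Map([
  ['https://example.com/blog/', '2022-06-01T00:00:00.000Z'],
]);

// Async `serialize` candidate: resolves to the item, or to undefined to drop it.
async function serialize(item) {
  if (/\/drafts\//.test(item.url)) {
    return undefined; // excluded from the sitemap
  }
  const lastmod = lastModified.get(item.url);
  // Return a copy with `lastmod` when a date is known, else the item unchanged.
  return lastmod ? { ...item, lastmod } : item;
}

serialize({ url: 'https://example.com/blog/' }).then((item) => {
  console.log(item.lastmod); // 2022-06-01T00:00:00.000Z
});
```

The function can then be passed as the serialize option in the sitemap() call.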
xslUrl
Type Required Default value
String No undefined

Absolute URL of an XSL file used to style the XML or transform it to another format. Ignored by search engines.

astro.config.mjs

import sitemap from 'astro-sitemap';

export default {
  site: 'https://example.com',

  integrations: [sitemap({
    xslUrl: 'https://example.com/style.xsl',
  })],
};

Example of XML output

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="https://example.com/style.xsl"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml">
...
xmlns
Type Required Default value
NSArgs No undefined

Sets the XML namespaces via xmlns attributes on the <urlset> element.

Interface `NSArgs`

| Name | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| xhtml | Boolean | No | true | xmlns:xhtml="http://www.w3.org/1999/xhtml" |
| news | Boolean | No | | xmlns:news="http://www.google.com/schemas/sitemap-news/0.9" |
| video | Boolean | No | | xmlns:video="http://www.google.com/schemas/sitemap-video/1.1" |
| image | Boolean | No | | xmlns:image="http://www.google.com/schemas/sitemap-image/1.1" |
| custom | String[] | No | | Any custom namespace. Elements of the array are used as is, without any validation. |

astro.config.mjs

import sitemap from 'astro-sitemap';

export default {
  site: 'https://example.com',

  integrations: [sitemap({
    xmlns: { 
      xhtml: true,
      news: true, 
      image: true,
      video: true,
      custom: [
        'xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd"',
        'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"',
      ],
    },
  })],
};

Example of XML output

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:news="http://www.google.com/schemas/sitemap-news/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1" xmlns:video="http://www.google.com/schemas/sitemap-video/1.1" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
...
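
To see how the NSArgs flags map to attributes, the following sketch assembles a <urlset> opening tag from them. The attribute order is an assumption copied from the example output above, and `urlsetOpenTag` is a hypothetical name, not the integration's code.

```javascript
// Namespace attributes taken from the NSArgs table, in the order of the sample output.
const NAMESPACES = {
  news: 'xmlns:news="http://www.google.com/schemas/sitemap-news/0.9"',
  xhtml: 'xmlns:xhtml="http://www.w3.org/1999/xhtml"',
  image: 'xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"',
  video: 'xmlns:video="http://www.google.com/schemas/sitemap-video/1.1"',
};

// Hypothetical helper: `xhtml` defaults to true, the other flags to false.
function urlsetOpenTag({ xhtml = true, news = false, image = false, video = false, custom = [] } = {}) {
  const flags = { news, xhtml, image, video };
  const attrs = [
    'xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"',
    ...Object.keys(NAMESPACES).filter((key) => flags[key]).map((key) => NAMESPACES[key]),
    ...custom, // appended as is, without validation
  ];
  return `<urlset ${attrs.join(' ')}>`;
}

console.log(urlsetOpenTag());
// → <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml">
```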
lastmodDateOnly
Type Required Default value
Boolean No undefined

If its value is true, the lastmod field in the XML output will contain a date part only.

astro.config.mjs

import sitemap from 'astro-sitemap';

export default {
  site: 'https://example.com',

  integrations: [sitemap({
    lastmodDateOnly: true,
    lastmod: new Date(),
  })],
};
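
The difference between the two formats can be illustrated with plain JavaScript dates. This sketch assumes the date part is taken from the UTC ISO string, which may not match the integration's exact behavior; `formatLastmod` is a hypothetical helper.

```javascript
// Hypothetical helper: full ISO timestamp, or only its date part.
function formatLastmod(date, dateOnly = false) {
  const iso = date.toISOString(); // always UTC, e.g. 2022-06-01T12:30:00.000Z
  return dateOnly ? iso.slice(0, 10) : iso;
}

const date = new Date('2022-06-01T12:30:00Z');
console.log(formatLastmod(date)); // 2022-06-01T12:30:00.000Z
console.log(formatLastmod(date, true)); // 2022-06-01
```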
createLinkInHead
Type Required Default value
Boolean No true

Creates a link to the sitemap in the <head> section of generated pages.

This requires reprocessing the final output, which can impact build time for large sites.

astro.config.mjs

import sitemap from 'astro-sitemap';

export default {
  site: 'https://example.com',

  integrations: [sitemap({
    createLinkInHead: false,
  })],
};

💡 See a detailed explanation of sitemap-specific options on sitemaps.org.

Localization

Supply the integration config with the i18n option as an object with two required properties:

locales
Type Required
Record<String, String> Yes

Key/value pairs. The key is used to look up the locale part of the page path. The value is a language attribute; only English letters and hyphens are allowed.

See more on MDN.

defaultLocale
Type Required
String Yes

defaultLocale value must exist as one of the locales keys.

astro.config.mjs

import sitemap from 'astro-sitemap';

export default {
  site: 'https://example.com',

  integrations: [sitemap({
    i18n: {
      // All URLs that don't contain `es` or `fr` after `https://example.com/` will be treated as default locale, i.e. `en`
      defaultLocale: 'en',  
      locales: {
        en: 'en-US',     // The `defaultLocale` value must be present in `locales` keys
        es: 'es-ES',
        fr: 'fr-CA',
      },
    },
  })],
};

Example of XML output

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://example.com/</loc>
    <xhtml:link rel="alternate" hreflang="en-US" href="https://example.com/"/>
    <xhtml:link rel="alternate" hreflang="es-ES" href="https://example.com/es/"/>
    <xhtml:link rel="alternate" hreflang="fr-CA" href="https://example.com/fr/"/>
  </url>
  <url>
    <loc>https://example.com/es/</loc>
    <xhtml:link rel="alternate" hreflang="en-US" href="https://example.com/"/>
    <xhtml:link rel="alternate" hreflang="es-ES" href="https://example.com/es/"/>
    <xhtml:link rel="alternate" hreflang="fr-CA" href="https://example.com/fr/"/>
  </url>
  <url>
    <loc>https://example.com/fr/</loc>
    <xhtml:link rel="alternate" hreflang="en-US" href="https://example.com/"/>
    <xhtml:link rel="alternate" hreflang="es-ES" href="https://example.com/es/"/>
    <xhtml:link rel="alternate" hreflang="fr-CA" href="https://example.com/fr/"/>
  </url>
  <url>
    <loc>https://example.com/es/second-page/</loc>
    <xhtml:link rel="alternate" hreflang="es-ES" href="https://example.com/es/second-page/"/>
    <xhtml:link rel="alternate" hreflang="fr-CA" href="https://example.com/fr/second-page/"/>
    <xhtml:link rel="alternate" hreflang="en-US" href="https://example.com/second-page/"/>
  </url>

...
</urlset>
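
One way to picture how pages are related through locale keys in their paths is the simplified sketch below. It is not the integration's implementation: `splitLocale` and `groupByPage` are hypothetical helpers, and the sketch assumes the locale key is always the first path segment.

```javascript
const i18n = {
  defaultLocale: 'en',
  locales: { en: 'en-US', es: 'es-ES', fr: 'fr-CA' },
};

// Split a pathname into its locale key and the locale-independent rest.
// Paths without a known, non-default locale prefix belong to the default locale.
function splitLocale(pathname, { locales, defaultLocale }) {
  const segments = pathname.split('/').filter(Boolean);
  const [first, ...rest] = segments;
  const isLocale =
    first !== undefined && first !== defaultLocale && Object.keys(locales).includes(first);
  return isLocale
    ? { locale: first, rest: rest.join('/') }
    : { locale: defaultLocale, rest: segments.join('/') };
}

// Group URLs that point to the same page in different locales.
function groupByPage(urls, config) {
  const groups = new Map();
  for (const url of urls) {
    const { locale, rest } = splitLocale(new URL(url).pathname, config);
    if (!groups.has(rest)) groups.set(rest, []);
    groups.get(rest).push({ url, lang: config.locales[locale] });
  }
  return groups;
}

const groups = groupByPage(
  ['https://example.com/', 'https://example.com/es/', 'https://example.com/fr/second-page/'],
  i18n,
);
console.log(groups.get('').map((link) => link.lang)); // [ 'en-US', 'es-ES' ]
console.log(groups.get('second-page').map((link) => link.lang)); // [ 'fr-CA' ]
```

Each group then corresponds to the set of xhtml:link alternates shared by the pages in it.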

Examples

Example Source Playground
basic GitHub Play Online
advanced GitHub Play Online
i18n GitHub Play Online

Contributing

You're welcome to submit an issue or PR!

Changelog

See CHANGELOG.md for a history of changes to this integration.

Inspiration

This module is based on the awesome sitemap.js package ❤️.