Smart Crawl

Smart Crawl will revolutionize how you export static sites from WordPress.

It uses native WordPress functions to find all the pages and files of your WordPress website and exports them as static HTML.

Activate Smart Crawl

Smart Crawl is activated by default on newly set up sites, but you can easily enable it for your existing Simply Static website by navigating to Simply Static -> Settings -> General -> Smart Crawl.

Active Crawlers

You will notice that an additional setting appears, showing a list of all available crawlers.

We enable all of them by default, but you can quickly remove specific crawlers by clicking the x icon next to the name.

A description below the setting explains what each crawler collects on your website.

This option gives you the freedom and flexibility to fully customize how Simply Static crawls your WordPress website.

Adding a Custom Crawler

The new solution is fully extendable by developers, making it easy to build and add your own custom crawler to achieve the best results possible for your website. It's also the official way for other plugin developers to integrate with Simply Static's crawling mechanism.

Simply Static uses a crawler system to discover URLs for static export. Each crawler is responsible for detecting URLs of a specific type (e.g., archive URLs or post type URLs).

The crawler system is designed to be extendable, allowing third-party plugins to add their own crawlers for specific plugins or use cases.

To add a custom crawler, you need to:

  1. Create a class that extends the Simply_Static\Crawler\Crawler abstract class
  2. Implement the required methods and properties
  3. Add your crawler to the list of crawlers using the simply_static_crawlers filter

Step 1: Create a Custom Crawler Class

Here's an example of a custom crawler that detects URLs for a specific plugin:

<?php

namespace My_Plugin\Crawlers;

// Exit if accessed directly.
if ( ! defined( 'ABSPATH' ) ) {
    exit;
}

/**
 * Custom crawler for My Plugin
 */
class My_Custom_Crawler extends \Simply_Static\Crawler\Crawler {

    /**
     * Crawler ID.
     * @var string
     */
    protected $id = 'my-custom-crawler';

    /**
     * Constructor
     */
    public function __construct() {
        $this->name = __( 'My Custom Crawler', 'my-plugin' );
        $this->description = __( 'Detects URLs for My Plugin.', 'my-plugin' );
        
        // Optional: Set to false if you want this crawler to be disabled by default
        // $this->active_by_default = false;
    }

    /**
     * Detect URLs for this crawler type.
     *
     * @return array List of URLs
     */
    public function detect() : array {
        $urls = [];
        
        // Your custom logic to detect URLs goes here.
        // For example, using home_url() instead of hard-coded domains:
        $urls[] = home_url( '/my-plugin/page1/' );
        $urls[] = home_url( '/my-plugin/page2/' );
        
        return $urls;
    }
}
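
In a real plugin, detect() would typically build its URL list from WordPress data rather than hard-coded strings. The following sketch shows one possible approach that collects permalinks for a hypothetical custom post type called my_plugin_item; the post type name and query details are illustrative assumptions, not part of Simply Static's API.

    /**
     * Example detect() implementation that pulls URLs from a custom post type.
     *
     * Drop-in replacement for the detect() method in My_Custom_Crawler above.
     * 'my_plugin_item' is a hypothetical post type used for illustration.
     *
     * @return array List of URLs
     */
    public function detect() : array {
        $urls = [];

        // Fetch the IDs of all published items of the custom post type.
        $post_ids = get_posts(
            [
                'post_type'      => 'my_plugin_item',
                'post_status'    => 'publish',
                'numberposts'    => -1,
                'fields'         => 'ids',
            ]
        );

        // Collect the permalink of each item.
        foreach ( $post_ids as $post_id ) {
            $permalink = get_permalink( $post_id );

            if ( $permalink ) {
                $urls[] = $permalink;
            }
        }

        return $urls;
    }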

Step 2: Add Your Crawler to the List

Add your crawler to the list of crawlers using the simply_static_crawlers filter:

/**
 * Add custom crawler to Simply Static.
 *
 * @param array $crawlers List of registered crawler instances.
 * @return array Filtered list of crawlers.
 */
function add_my_custom_crawler( $crawlers ) {
    // Make sure the Simply Static plugin is active
    if ( class_exists( '\Simply_Static\Crawler\Crawler' ) ) {
        // Add your custom crawler to the list
        $crawlers[] = new \My_Plugin\Crawlers\My_Custom_Crawler();
    }
    
    return $crawlers;
}
add_filter( 'simply_static_crawlers', 'add_my_custom_crawler' );

Step 3: Load Your Crawler Class

Make sure your crawler class is loaded before the filter is applied. You can do this in your plugin's main file:

/**
 * Load custom crawler class
 */
function load_my_custom_crawler() {
    // Only load if Simply Static is active
    if ( class_exists( '\Simply_Static\Crawler\Crawler' ) ) {
        require_once plugin_dir_path( __FILE__ ) . 'includes/crawlers/class-my-custom-crawler.php';
    }
}
add_action( 'plugins_loaded', 'load_my_custom_crawler' );

How It Works

When Simply Static runs the URL discovery task, it will:

  1. Load all built-in crawlers from the src/crawler directory
  2. Apply the simply_static_crawlers filter, allowing your plugin to add custom crawlers
  3. Get the active crawlers (based on user settings)
  4. Run each active crawler to discover URLs

Your custom crawler will be included in the list of available crawlers in the Simply Static settings page, allowing users to enable or disable it as needed.
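
Conceptually, that discovery step boils down to a standard WordPress filter call followed by a loop over the crawlers. The snippet below is a simplified sketch of that flow, not Simply Static's actual source code; the $built_in_crawlers variable is a placeholder used only for illustration.

// Simplified sketch of the URL discovery flow (not Simply Static's actual code).

// Placeholder for the crawlers Simply Static loads from its src/crawler directory.
$built_in_crawlers = [];

// Third-party crawlers are merged in through the filter used in Step 2 above.
$crawlers = apply_filters( 'simply_static_crawlers', $built_in_crawlers );

// Each crawler then contributes the URLs it detects. (The real implementation
// also skips any crawlers the user has disabled in the settings.)
$urls = [];
foreach ( $crawlers as $crawler ) {
    $urls = array_merge( $urls, $crawler->detect() );
}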

Best Practices

  1. Give your crawler a unique ID to avoid conflicts with other crawlers
  2. Provide a clear name and description so users understand what your crawler does
  3. Make your crawler efficient by only detecting URLs that are relevant to your plugin
  4. Consider setting active_by_default to false if your crawler is for a specific use case that not all users will need
  5. Use proper namespacing to avoid conflicts with other plugins