Building a Luxury Hotel Discovery Site with Astro and Notion

Author: Jonathan Devere-Ellery

Why I built this

I was helping someone in the travel industry who needed a website to showcase luxury hotel deals. The requirements sounded simple at first: list hotels, show their perks, let people get in touch.

But the hotel count quickly grew past 600, each with its own images, perks, room categories, special offers, and geographic data. Managing all of that in a traditional CMS felt like overkill, and a full database-backed application felt like even more overkill for what is essentially a content site.

I wanted something that could generate hundreds of static pages at build time, pull content from a source that non-technical people could update easily, and still score well on PageSpeed. So I built Luxury Hotel Offers using Astro, Notion as the content backend, and Vercel for hosting.

Why Astro for a site with 700+ pages?

I had been wanting to try Astro on a real project for a while. The pitch is compelling: zero JavaScript by default, component-based authoring, and a static output mode that generates plain HTML files. For a content-heavy site where interactivity is limited to filtering and theme toggling, that seemed like the right trade-off.

The site generates roughly 700 static HTML pages at build time. That includes individual hotel pages, region pages, country pages, city pages, and a handful of utility pages like the filtered hotel listing and featured offers.

Astro handles this well. Build times are reasonable, and the output is just HTML and CSS with minimal JavaScript shipped to the client.

I went with Astro 6 in static output mode, Tailwind CSS for styling, and TypeScript throughout. Nothing exotic.

Notion as a headless CMS

This was the part I was most skeptical about, and it turned out to be one of the best decisions in the project. The entire hotel database lives in a single Notion database.

Each hotel is a row with properties for the name, description, perks, images, coordinates, partnerships, special offers, and links to city and country pages (which are their own Notion databases via relations).

At build time, the site queries the Notion API, maps each page to a typed hotel object, and resolves the city and country names by following relation IDs. Notion's API has rate limits, so the build includes retry logic with exponential backoff.
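The backoff logic can be sketched roughly like this. This is a minimal illustration, not the site's actual code; the function name, attempt count, and delays are assumptions.

```typescript
// Illustrative retry wrapper with exponential backoff for rate-limited
// Notion API calls. Names and defaults are assumptions for this sketch.
async function fetchWithRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 5,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt === maxAttempts - 1) break;
      // Exponential backoff with a little jitter: ~500ms, 1s, 2s, 4s...
      const delay = baseDelayMs * 2 ** attempt + Math.random() * 100;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```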

One early decision that paid off was treating Notion's property types generically. Notion properties are polymorphic, so rather than writing type-specific handling everywhere, a single utility function normalizes all property types to plain strings. That kept the data mapping layer clean.
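A normalizer along those lines might look like this. The property shapes mirror Notion's API responses, but the function name and the subset of types covered here are assumptions for the sketch.

```typescript
// Sketch of normalizing Notion's polymorphic property values to plain
// strings. Only a few property types are shown; the real mapping would
// cover more.
type NotionProperty =
  | { type: "title"; title: { plain_text: string }[] }
  | { type: "rich_text"; rich_text: { plain_text: string }[] }
  | { type: "number"; number: number | null }
  | { type: "select"; select: { name: string } | null }
  | { type: "url"; url: string | null };

function propertyToString(prop: NotionProperty): string {
  switch (prop.type) {
    case "title":
      return prop.title.map((t) => t.plain_text).join("");
    case "rich_text":
      return prop.rich_text.map((t) => t.plain_text).join("");
    case "number":
      return prop.number?.toString() ?? "";
    case "select":
      return prop.select?.name ?? "";
    case "url":
      return prop.url ?? "";
  }
}
```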

The relations are deduplicated before fetching. Even though there are 600+ hotels, they only span around 80 unique cities and 40 countries, so the build resolves those once and caches the results in memory.

Caching across builds

Fetching 600+ hotels from Notion on every build would be slow and wasteful, so the site uses a two-layer cache. An in-memory layer avoids duplicate calls within a single build, and a disk cache with a 6-hour TTL avoids re-fetching across successive builds. On Vercel, the disk cache is also persisted between deploys so that a fresh checkout doesn't mean a cold start.
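The shape of that two-layer cache, heavily simplified (the real cache directory, key scheme, and serialization are assumptions here):

```typescript
import { existsSync, mkdirSync, readFileSync, statSync, writeFileSync } from "node:fs";

// Sketch of a two-layer cache: an in-memory Map for the current build,
// plus a JSON file on disk with a TTL for successive builds.
const memory = new Map<string, unknown>();
const TTL_MS = 6 * 60 * 60 * 1000; // 6 hours, matching the deploy cadence

mkdirSync(".cache", { recursive: true });

async function cached<T>(key: string, fetcher: () => Promise<T>): Promise<T> {
  // Layer 1: dedupe calls within a single build.
  if (memory.has(key)) return memory.get(key) as T;

  // Layer 2: reuse a fresh-enough result from a previous build.
  const file = `.cache/${key}.json`;
  if (existsSync(file) && Date.now() - statSync(file).mtimeMs < TTL_MS) {
    const value = JSON.parse(readFileSync(file, "utf8")) as T;
    memory.set(key, value);
    return value;
  }

  const value = await fetcher();
  memory.set(key, value);
  writeFileSync(file, JSON.stringify(value));
  return value;
}
```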

A GitHub Actions workflow triggers a Vercel deploy every 6 hours via a deploy hook, so the hotel data stays reasonably fresh without manual intervention.

The image pipeline

This was probably the trickiest part of the build (and the most annoying to debug). Notion serves images through signed S3 URLs that expire after about an hour. That means you can't just reference Notion image URLs in your HTML, because they'll break shortly after the build finishes. The images need to be downloaded and baked into the site at build time.

The prebuild step downloads each hotel's hero image from Notion, then uses sharp to generate multiple optimized variants in WebP. There are four card sizes (ranging from 300w for mobile thumbnails up to 800w for desktop) and three hero sizes (up to 1600w for full-width displays). The card variants use a lower quality setting since they are displayed small, while the hero variants are higher quality.

The key architectural decision here was making the pipeline incremental. A manifest tracks which hotels have already been processed, and on subsequent builds only new or changed hotels get re-downloaded. This brought repeat build times down significantly, since most builds only need to process a handful of new images rather than all 600+.
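The incremental check can be sketched like this. One wrinkle worth noting: the signed portion of an S3 URL changes on every fetch, so the manifest has to compare something stable. Whether the real manifest keys on the URL path, a hash, or a Notion timestamp is an assumption here.

```typescript
// Sketch of the incremental image check. The manifest maps hotel IDs to
// the stable part of the source URL (the path, minus the expiring
// signature query string).
type Manifest = Record<string, string>;

function stableImageKey(url: string): string {
  // Signed S3 URLs change their query string constantly; the path doesn't.
  return new URL(url).pathname;
}

function hotelsToProcess(
  hotels: { id: string; imageUrl: string }[],
  manifest: Manifest,
): { id: string; imageUrl: string }[] {
  // Only hotels that are new, or whose source image moved, need work.
  return hotels.filter((h) => manifest[h.id] !== stableImageKey(h.imageUrl));
}
```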

The hotel card component then uses srcset for responsive loading. Above-the-fold images are eagerly loaded, and everything else is lazy loaded.

Client-side filtering without an API

Since the site is fully static with no API endpoints, all filtering and searching happens in the browser. Every hotel card is pre-rendered with data attributes for its region, country, city, brand, partnership, and perk tags. When a user applies filters or types a search query, JavaScript reads those attributes and shows or hides cards with CSS. No network requests, no server calls. It is instant.
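The matching step boils down to something like the following. In the browser the values would come from each card's dataset attributes; here they're modeled as plain records, and the field names are illustrative.

```typescript
// Sketch of the client-side filter match. Each pre-rendered card exposes
// its metadata via data attributes; this models them as a record.
type CardData = {
  name: string;
  region: string;
  country: string;
  city: string;
  perks: string[]; // parsed from a delimited data attribute
};

type Filters = Partial<{ region: string; country: string; perk: string; query: string }>;

function matches(card: CardData, f: Filters): boolean {
  if (f.region && card.region !== f.region) return false;
  if (f.country && card.country !== f.country) return false;
  if (f.perk && !card.perks.includes(f.perk)) return false;
  if (f.query) {
    const q = f.query.toLowerCase();
    if (!card.name.toLowerCase().includes(q) && !card.city.toLowerCase().includes(q)) {
      return false;
    }
  }
  return true;
}
```

A card that fails `matches` gets a hidden class toggled on; no DOM nodes are created or destroyed, which is what keeps it instant.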

The challenge was making this feel fast with 600+ cards in the DOM. The approach I landed on was progressive rendering: only a subset of cards are in the visible grid on initial load. The remaining cards sit in a hidden container and get moved into the grid on the first user interaction. After that initial expansion, filtering is just toggling visibility.

Filter state is synced to the URL so filtered views are shareable and bookmarkable.

Michelin restaurant integration

This ended up being one of the more interesting features to build. The idea was simple: for every hotel, show the nearest Michelin Guide restaurants so users can see the dining options in the area without leaving the page. The data source is a CSV dataset of Michelin restaurants that gets loaded into memory at build time.

For each hotel that has coordinates, the build calculates proximity to every restaurant using the haversine formula, with a bounding box pre-filter to avoid running the full calculation against the entire dataset. Results are sorted by award tier (three-star first, then two-star, one-star, Bib Gourmand, and Selected), then by distance within each tier.
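The two pieces of that look roughly like this: a cheap rectangular pre-filter (1° of latitude is about 111 km, and longitude degrees shrink by the cosine of the latitude), then the full haversine formula only for candidates that survive it.

```typescript
// Haversine great-circle distance plus a bounding-box pre-filter.
const EARTH_RADIUS_KM = 6371;

function haversineKm(lat1: number, lon1: number, lat2: number, lon2: number): number {
  const toRad = (d: number) => (d * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
}

// Cheap pre-filter: is the restaurant inside a km-sized box around the hotel?
function withinBox(lat: number, lon: number, hotelLat: number, hotelLon: number, km: number): boolean {
  const latDelta = km / 111;
  const lonDelta = km / (111 * Math.cos((hotelLat * Math.PI) / 180));
  return Math.abs(lat - hotelLat) <= latDelta && Math.abs(lon - hotelLon) <= lonDelta;
}
```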

Each hotel page shows aggregate stats at a glance: total nearby restaurants, how many are starred, how many are Bib Gourmand, and how many are Michelin Selected. For a hotel like Fouquet's on the Champs-Élysées, the numbers are staggering: over 500 Michelin Guide restaurants within 50km, with well over 100 starred. That kind of context helps users understand the dining landscape around a hotel in a way that a generic "great location" description never could.

The Michelin data also feeds into the JSON-LD structured data, which gives search engines a richer picture of the hotel's location context.

Embedding Reddit reviews

I wanted to add some social proof to hotel pages beyond the standard marketing copy. Reddit turned out to be a good source for that because the reviews tend to be more candid than what you find on booking platforms. People on Reddit will tell you if the pillows were uncomfortable or the breakfast was underwhelming in a way that a curated hotel review often won't.

The review data lives in the same Notion database as everything else, stored as JSON in a rich text field for each hotel. At build time, the build extracts the review text, author, subreddit, and link. Reviews are capped at three per hotel to keep the pages focused.

Since this is user-generated content from an external source, the build runs it through a profanity filter before rendering. Any review that triggers the filter gets excluded entirely. The component itself is simple: a small card per review showing the text, the subreddit it came from, and a link back to the original post.

I found that even just two or three honest Reddit reviews added more credibility to a hotel page than paragraphs of polished marketing copy.

Smarter related hotels

One feature that took some thought was the "related hotels" section at the bottom of each hotel page. The naive approach would be to show random hotels from the same country, but that felt lazy.

Instead, each hotel is scored against every other hotel based on geographic proximity and brand match. The build groups results into tiers (same city, same country, same region) and only shows a tier if there are enough related hotels to fill it. A hotel in Paris will show other Paris properties first, then other French hotels, then other European ones.
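The tiering logic, minus the brand-match scoring, can be sketched like this. The minimum tier size and field names are assumptions for the example.

```typescript
// Sketch of the tiered related-hotels grouping: same city, then same
// country, then same region, with each tier only shown if it has enough
// hotels to be worth rendering.
type Hotel = { slug: string; city: string; country: string; region: string };

function relatedTiers(hotel: Hotel, all: Hotel[], minPerTier = 3): Hotel[][] {
  const others = all.filter((h) => h.slug !== hotel.slug);
  const sameCity = others.filter((h) => h.city === hotel.city);
  const sameCountry = others.filter(
    (h) => h.country === hotel.country && h.city !== hotel.city,
  );
  const sameRegion = others.filter(
    (h) => h.region === hotel.region && h.country !== hotel.country,
  );
  // Drop tiers that can't fill their row.
  return [sameCity, sameCountry, sameRegion].filter((tier) => tier.length >= minPerTier);
}
```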

This keeps the related section relevant rather than random, and it encourages users to explore nearby alternatives without leaving the site.

SEO and structured data

For a site with 700+ pages, getting the SEO foundations right mattered more than usual. Every page generates its own JSON-LD structured data.

Hotel pages get a LodgingBusiness schema with perks, offers, and nearby restaurants. Listing pages get an ItemList schema, and the homepage gets a WebSite schema with a search action for the sitelinks search box.
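For a rough idea of the hotel-page schema, here is a hedged sketch. The post doesn't show the exact fields the site emits, so this uses common schema.org LodgingBusiness properties as an assumption.

```typescript
// Illustrative JSON-LD builder for a hotel page. Field selection is an
// assumption; the real site also folds in offers and nearby restaurants.
function lodgingJsonLd(hotel: {
  name: string;
  description: string;
  lat: number;
  lon: number;
  perks: string[];
}): string {
  return JSON.stringify({
    "@context": "https://schema.org",
    "@type": "LodgingBusiness",
    name: hotel.name,
    description: hotel.description,
    geo: { "@type": "GeoCoordinates", latitude: hotel.lat, longitude: hotel.lon },
    amenityFeature: hotel.perks.map((p) => ({
      "@type": "LocationFeatureSpecification",
      name: p,
    })),
  });
}
```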

One thing I spent more time on than expected was content-aware lastmod dates for the sitemap. The naive approach of updating every page's lastmod on every build would be inaccurate and wasteful. Instead, the build hashes each hotel's SEO-relevant fields and compares against a persisted store. Only pages whose content actually changed get their lastmod updated.
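The change detection reduces to hashing a canonical form of the SEO-relevant fields and diffing against the previous build's store. Which fields count as SEO-relevant is an assumption in this sketch.

```typescript
import { createHash } from "node:crypto";

// Sketch of content-aware change detection for sitemap lastmod dates.
type HashStore = Record<string, string>; // slug -> hash persisted from the previous build

function seoHash(hotel: { name: string; description: string; offers: string[] }): string {
  // Canonical JSON of only the fields that matter for search.
  const canonical = JSON.stringify([hotel.name, hotel.description, hotel.offers]);
  return createHash("sha256").update(canonical).digest("hex");
}

function changedSlugs(
  hotels: { slug: string; name: string; description: string; offers: string[] }[],
  store: HashStore,
): string[] {
  // Pages whose hash differs (or is missing) get a fresh lastmod.
  return hotels.filter((h) => store[h.slug] !== seoHash(h)).map((h) => h.slug);
}
```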

This cascades to aggregate pages too. When a hotel in Paris changes, the Paris city page, France country page, and Europe region page all get their lastmod bumped. The build then writes a list of changed URLs that feeds into an IndexNow submission, so only genuinely changed pages get pushed to search engines.

The site also includes an llms.txt file. This is a structured, AI-readable context file designed for LLM-based discovery tools to understand the site without crawling every page.

What would I change?

If I were starting this project again, there are a few things I would approach differently.

The Notion API is slow. Fetching 600+ pages with relation resolution takes a while even with caching. For a project of this size, I would probably look at syncing Notion data to a local database or flat files rather than querying the API at build time. The caching layer works, but it adds complexity that would not be needed with a simpler data source.

Image processing at build time is fragile. Network timeouts, expired S3 URLs, and the occasional corrupted download all need handling. If I built this again, I might consider a separate image processing pipeline that runs independently of the main build, pushing optimized images to a CDN directly.

Seven image variants per hotel is probably too many. In practice, the card sizes could probably be collapsed to two or three variants without a noticeable difference in visual quality. The hero sizes are more justified since they are displayed much larger.

The result

The production site scores between 91 and 95 on PageSpeed mobile, serves 700+ static pages with sub-second load times, and keeps its content fresh from Notion every 6 hours. Vercel Speed Insights reports a perfect 100 across all pages, which is a nice validation that the static-first approach and image pipeline are pulling their weight.

The client-side filtering is instant, the image pipeline keeps file sizes reasonable, and the SEO setup means search engines get structured data and accurate change signals.

If you are curious about the live site, you can check it out at Luxury Hotel Offers. The stack is Astro 6, Tailwind CSS 3, TypeScript, the Notion API, sharp for images, and Vercel for hosting.

For anyone considering Notion as a headless CMS for a static site, I would say it works surprisingly well for content that changes infrequently and is managed by a small team. The API has its quirks, but the editing experience for non-technical users is hard to beat.

Hope this is helpful.