Sanity in Content Modeling: How to Structure Complex Data Without Losing Your Mind

Programming· 5 min read

I was recently building a content management application for a Spanish client. I started with what seemed like a simple idea: a system of pages with editable blocks. Two weeks later, my database looked like a tangled mess of cables in a 1990s office basement.

The problem wasn't the technology. It was how I'd modeled the data.

The Dilemma: References vs Embedded Objects

This is the first breaking point in any moderately complex project. When you have related content, you have two paths:

Option 1: References (Relations)

```json
{
  "id": "page_123",
  "title": "My Page",
  "author_id": "user_456",
  "blocks": ["block_1", "block_2", "block_3"]
}

{
  "id": "user_456",
  "name": "Juan",
  "email": "juan@example.es"
}

{
  "id": "block_1",
  "type": "text",
  "content": "Content here"
}
```

Option 2: Embedded Objects

```json
{
  "id": "page_123",
  "title": "My Page",
  "author": {
    "id": "user_456",
    "name": "Juan",
    "email": "juan@example.es"
  },
  "blocks": [
    {
      "id": "block_1",
      "type": "text",
      "content": "Content here"
    }
  ]
}
```

Both work. But they have very different consequences.

With references, you get flexibility. The user can change their email once and the change shows up everywhere it is referenced. But reads get more complex. In Supabase, which sits on Postgres, you end up with joins. In NoSQL databases, you often end up making multiple calls.

With embedded objects, you get speed. A single read gives you everything you need. But if the user changes their email, you need to update every page where it appears, and that cost grows with every page you add.
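To make the trade-off concrete, here is a minimal sketch that models both options with plain in-memory dictionaries instead of a real database. The function names, the three sample pages, and the new email address are assumptions made for illustration; only the document shapes come from the examples above.

```python
import copy

# Hypothetical in-memory "collections" standing in for a real database.
users = {
    "user_456": {"id": "user_456", "name": "Juan", "email": "juan@example.es"},
}

# Referenced model: each page stores only the author's id.
pages_ref = {
    f"page_{n}": {"id": f"page_{n}", "title": f"Page {n}", "author_id": "user_456"}
    for n in range(1, 4)
}

# Embedded model: each page carries its own copy of the author.
pages_emb = {
    f"page_{n}": {
        "id": f"page_{n}",
        "title": f"Page {n}",
        "author": copy.deepcopy(users["user_456"]),
    }
    for n in range(1, 4)
}


def load_page_ref(page_id):
    """Two lookups: the page, then the author it points to."""
    page = dict(pages_ref[page_id])
    page["author"] = users[page.pop("author_id")]
    return page


def change_email_ref(user_id, email):
    """One write, visible from every page that references the user."""
    users[user_id]["email"] = email
    return 1  # number of documents written


def change_email_emb(user_id, email):
    """One write per page that embeds the user."""
    written = 0
    for page in pages_emb.values():
        if page["author"]["id"] == user_id:
            page["author"]["email"] = email
            written += 1
    return written


if __name__ == "__main__":
    print(change_email_ref("user_456", "juan@nuevo.example"))  # 1 write
    print(change_email_emb("user_456", "juan@nuevo.example"))  # 3 writes, one per page
```

The read side mirrors this: `load_page_ref` needs two lookups per page, while an embedded page is complete after one.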

My practical rule:

  • **Use references** if the data changes frequently and must reflect in multiple places
  • **Use embedded objects** if the data is relatively static or context-specific

In the client's project, author data rarely changed, but it was shared across pages, while content blocks were specific to each page. So I embedded the blocks and used references only for the author. Problem solved.

The Singleton Pattern: When It's Sanity, When It's Madness

Every project has data that exists exactly once: global configuration, site options, the color theme.

Many developers create a singleton document: a single record containing everything.

```json
{
  "id": "config_global",
  "site_name": "My Site",
  "primary_color": "#0066ff",
  "secondary_color": "#ff6600",
  "logo_url": "https://...",
  "footer_text": "© 2025",
  "analytics_id": "UA-123456",
  "stripe_key": "pk_live_...",
  "email_from": "noreply@example.es",
  "max_uploads_mb": 50,
  "feature_flags": {
    "new_editor": true,
    "beta_analytics": false
  }
}
```

This seems logical. But it's a disaster waiting to happen.

First, all your services constantly need to read this document. If you cache it, you have to invalidate the cache every time something changes. If you don't cache it, every request reads from the database.

Second, permissions become a problem. Who can change configuration? The whole team? Only admins? A global singleton makes this hard to control.

Third, versioning. If you need to roll back a configuration change, you have no history to roll back to.

My approach now:

I divide configuration into contexts:

```json
// settings/branding
{
  "id": "branding",
  "site_name": "My Site",
  "primary_color": "#0066ff",
  "logo_url": "https://..."
}

// settings/features
{
  "id": "features",
  "new_editor": true,
  "beta_analytics": false
}

// settings/integrations (restricted access)
{
  "id": "integrations",
  "stripe_key": "pk_live_...",
  "analytics_id": "UA-123456"
}
```

Each section has its own access control. I can cache branding aggressively because it rarely changes. Feature flags get a shorter cache lifetime because they change more often. Integrations I never cache.
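One way to express that per-section policy is a small lookup table plus a read-through cache. This is a sketch under stated assumptions, not the setup from the client's project: the TTL values, the role names (`public`, `admin`), and the `SettingsStore` class are all hypothetical, and `fetch` stands in for whatever database read you actually use.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class SectionPolicy:
    ttl_seconds: float     # 0 means "never cache"
    read_roles: frozenset  # roles allowed to read the section


# Illustrative values only; the TTLs and role names are assumptions.
POLICIES = {
    "branding": SectionPolicy(3600, frozenset({"public", "admin"})),
    "features": SectionPolicy(60, frozenset({"public", "admin"})),
    "integrations": SectionPolicy(0, frozenset({"admin"})),
}


class SettingsStore:
    """Read-through cache that applies a different policy to each section."""

    def __init__(self, fetch, clock=time.monotonic):
        self._fetch = fetch  # callable: section name -> dict (the real read)
        self._clock = clock  # injectable so expiry can be tested
        self._cache = {}     # section name -> (expires_at, value)

    def get(self, section, role):
        policy = POLICIES[section]
        if role not in policy.read_roles:
            raise PermissionError(f"{role!r} may not read settings/{section}")
        now = self._clock()
        cached = self._cache.get(section)
        if cached is not None and cached[0] > now:
            return cached[1]
        value = self._fetch(section)
        if policy.ttl_seconds > 0:
            self._cache[section] = (now + policy.ttl_seconds, value)
        return value
```

Because each section is its own document, a change to a feature flag only has to expire the `features` entry. The branding cache is untouched, and integrations are read fresh every time.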

Page Builders: The Double-Edged Sword

Now we get to the interesting part. Page builders are tempting. Drag, drop, publish. Clients love them.

But data modeling is where everything goes sideways.

A typical page builder generates something like this:

```json
{
  "id": "page_landing",
  "title": "Landing Page",
  "slug": "landing",
  "sections": [
    {
      "id": "section_1",
      "type": "hero",
      "props": {
        "title": "Welcome",
        "subtitle": "Your description here",
        "background_image": "https://...",
        "cta_text": "Get Started",
        "cta_url": "/signup"
      }
    },
    {
      "id": "section_2",
      "type": "features",
      "props": {
        "title": "Features",
        "features": [
          {
            "icon": "star",
            "title": "Fast",
            "description": "Very fast"
          }
        ]
      }
    }
  ]
}
```

This works. But it has problems:

1. Rigidity: Components are hardcoded. If you want to add a new section type, you need to change code.

2. Reusability: If the same component appears on 10 pages and you want to change it, you have to update 10 documents.

3. Validation: How do you validate that a hero's props are valid? What if the client adds a field that shouldn't be there?

My approach:

I use a "components as references" pattern:

```json
// components/hero_1
{
  "id": "hero_1",
  "type": "hero",
  "title": "Welcome",
  "subtitle": "Your description here",
  "background_image": "https://...",
  "cta_text": "Get Started",
  "cta_url": "/signup"
}

// pages/landing
{
  "id": "page_landing",
  "title": "Landing Page",
  "slug": "landing",
  "sections": ["hero_1", "features_main", "testimonials_2"]
}
```

Now:

  • I can reuse components across multiple pages
  • If I change the hero, it reflects everywhere
  • I can validate each component type independently
  • It's easy to version changes
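As a rough sketch of how the resolution and per-type validation could look, the snippet below checks each component against a small schema for its type and then swaps the ids in a page for the component documents. The `SCHEMAS` table, both function names, and the shape assumed for the `features` type are hypothetical; the hero fields mirror the example above, and the sample page lists only `hero_1` so the snippet stays self-contained.

```python
# Required fields per component type. The hero fields mirror the example
# above; the "features" entry is an assumed shape.
SCHEMAS = {
    "hero": {
        "title": str,
        "subtitle": str,
        "background_image": str,
        "cta_text": str,
        "cta_url": str,
    },
    "features": {"title": str, "features": list},
}

components = {
    "hero_1": {
        "id": "hero_1",
        "type": "hero",
        "title": "Welcome",
        "subtitle": "Your description here",
        "background_image": "https://...",
        "cta_text": "Get Started",
        "cta_url": "/signup",
    },
}

page = {
    "id": "page_landing",
    "title": "Landing Page",
    "slug": "landing",
    "sections": ["hero_1"],
}


def validate_component(component):
    """Return a list of problems; an empty list means the component is valid."""
    schema = SCHEMAS.get(component.get("type"))
    if schema is None:
        return [f"unknown component type: {component.get('type')!r}"]
    errors = []
    for field, expected in schema.items():
        if field not in component:
            errors.append(f"missing field: {field}")
        elif not isinstance(component[field], expected):
            errors.append(f"{field} should be {expected.__name__}")
    extra = set(component) - set(schema) - {"id", "type"}
    errors.extend(f"unexpected field: {name}" for name in sorted(extra))
    return errors


def resolve_page(page, components):
    """Return a copy of the page with section ids replaced by components."""
    sections = []
    for component_id in page["sections"]:
        component = components.get(component_id)
        if component is None:
            raise KeyError(f"{page['id']} references missing component {component_id}")
        problems = validate_component(component)
        if problems:
            raise ValueError(f"{component_id}: {'; '.join(problems)}")
        sections.append(component)
    return {**page, "sections": sections}
```

Rejecting unexpected fields is one possible answer to the "field that shouldn't be there" question. A looser setup could log them instead of raising.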

The trade-off is that the client can't edit directly on the page. But that's a small price for sanity.

The Underlying Principle

All of this comes from a simple idea: your data should reflect your business reality, not the technical convenience of the moment.

When you design poorly, you pay later. You pay in slow queries. You pay in sync bugs. You pay in features that are impossible to add.

When you design well, everything becomes easy. Adding features is fast. Changes are simple. Code is readable.

How I Do It Now

Before I write a single line of code, I draw:

1. What data changes frequently → References
2. What data is static or contextual → Embedded
3. What data do I need to cache → Divided singletons
4. What data does the user edit directly → Reusable components

This saves me weeks later.

Takeaway

Sanity in programming is predictable. It comes from small decisions made early. Content modeling is where it matters most.

It's not sexy. It's not a new library. But it's the difference between a project that scales and one that keeps you awake at 3 AM debugging data sync issues.

Next time you start a project, invest an hour drawing your data structure. Ask yourself: What changes? What's reusable? What do I need to cache?

You'll thank yourself in three months.

Brian Mena


Software engineer building profitable digital products: SaaS, directories and AI agents. All from scratch, all in production.
