Connecting the Dots: Relational Data in MongoDB

So far, our backend understands how to safely validate incoming data, effectively paginate massive lists, and beautifully handle errors.

But our data is lonely.

We have a User model, but in a real application, users don't just exist to be stored. Users do things. They create posts, they leave comments, they join organizations, and they buy products.

Today, we are moving from storing isolated documents to designing relational data. We will create a Post model and link it back to our User model, answering the vital question: "Who owns this data?"

SQL vs NoSQL: The Relationship Clash

If you have used a traditional SQL database (like Postgres), you know that relationships there are strictly enforced with foreign keys between rigid tables.

MongoDB is a NoSQL (document-based) database. It is famously flexible. You have two main choices when relating data in MongoDB:

Embedding (Subdocuments): Storing the child data inside the parent data.
- Example: Storing an array of shipping addresses inside a User document.
- Pros: Extremely fast to read (one query).
- Cons: Bad for large or unbounded lists. If a user has 10,000 posts, the document keeps growing, gets slow to load and update, and can eventually exceed MongoDB's 16 MB per-document limit.

Referencing (Normalization): Storing an ID that points to a document in another collection.
- Example: A Post document stores the _id of the User who created it.
- Pros: Great for massive, scalable datasets. Models remain independent.
- Cons: Requires a second query to read the full data (but Mongoose makes this easy with .populate()).

For a relationship like "Users and Posts", where the number of posts per user is unbounded, Referencing is the industry standard.
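To make the two shapes concrete, here is a small sketch that uses plain JavaScript objects in place of real collections. The sample IDs, the `addresses` field, and the city values are illustrative assumptions, not part of the models built in this series.

```javascript
// Embedding: the child data lives inside the parent document.
const embeddedUser = {
  _id: 'u1',
  name: 'Alice',
  addresses: [
    { city: 'Paris', zip: '75001' },
    { city: 'Lyon', zip: '69001' },
  ],
};

// Referencing: each post stores only the _id of its author.
const users = [{ _id: 'u1', name: 'Alice' }];
const posts = [
  { _id: 'p1', title: 'My first post!', author: 'u1' },
  { _id: 'p2', title: 'Another post', author: 'u1' },
];

// Reading embedded data needs one lookup: everything is already on the user.
const cities = embeddedUser.addresses.map((a) => a.city);

// Reading referenced data needs two lookups: the post, then its author.
const post = posts.find((p) => p._id === 'p1');
const author = users.find((u) => u._id === post.author);

console.log(cities);
console.log(`${post.title} by ${author.name}`);
```

The trade-off shows up in the last few lines: the embedded read is a single step, while the referenced read needs a second lookup keyed on the stored ID.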

Step 1: Building the Post Model

Let's create our second schema. Notice how we reference the User.

Create: `src/models/post.model.js`

import mongoose from 'mongoose';
 
const postSchema = new mongoose.Schema({
    title: {
        type: String,
        required: true
    },
    content: {
        type: String,
        required: true
    },
    // THIS IS THE CRITICAL FIELD
    author: {
        type: mongoose.Schema.Types.ObjectId, 
        ref: 'User', // Must EXACTLY match the name of the User model
        required: true
    }
}, { timestamps: true });
 
const Post = mongoose.model('Post', postSchema);
export default Post;

What's Happening Here?

By setting the type to ObjectId and providing a ref of 'User', we are telling Mongoose: "This author field isn't just a string. It is a direct link to a document residing in the User collection."

It acts like a digital bridge between two collections.

Step 2: Creating a Post

Now let's build a controller to create a post. How do we know who the author is? For now, we'll imagine the client passes the User ID in the body (we will secure this properly in the Authentication phase, but for now, focus on the database connection).

Create: `src/controllers/post.controller.js`

import { asyncHandler } from '../utils/asyncHandler.js';
import { ApiResponse } from '../utils/ApiResponse.js';
import Post from '../models/post.model.js';
 
export const createPost = asyncHandler(async (req, res) => {
    const { title, content, authorId } = req.body;
    
    // Store the author's ID on the new post
    const newPost = await Post.create({
        title,
        content,
        author: authorId
    });
    
    return res.status(201).json(
        new ApiResponse(201, newPost, "Post created successfully")
    );
});

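Mongoose casts the authorId string to an ObjectId and saves only that reference, not a copy of the user. Serialized as JSON, the stored document looks like this: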

{
  "_id": "b39c0f82...",
  "title": "My first post!",
  "content": "Hello world",
  "author": "a12b3c4d..."  <-- Just the ID string
}
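One thing the schema does not do is verify that the referenced user actually exists: `ref` tells Mongoose where to look during `.populate()`, but MongoDB has no foreign-key constraint. A hedged sketch of a pre-check follows. The helper names `looksLikeObjectId` and `createPostForExistingAuthor` are hypothetical, the models are passed in as parameters so the snippet stays self-contained, and it assumes `authorId` arrives as a 24-character hex string.

```javascript
// Hypothetical helper: true when the value is a 24-character hex string,
// the usual text form of a MongoDB ObjectId.
function looksLikeObjectId(value) {
  return typeof value === 'string' && /^[0-9a-fA-F]{24}$/.test(value);
}

// Hypothetical helper: UserModel and PostModel are assumed to be Mongoose
// models (for example the User and Post models from this series).
async function createPostForExistingAuthor(UserModel, PostModel, { title, content, authorId }) {
  if (!looksLikeObjectId(authorId)) {
    throw new Error('authorId is not a valid ObjectId string');
  }

  // Model.exists() resolves to null when no document matches the filter.
  const authorExists = await UserModel.exists({ _id: authorId });
  if (!authorExists) {
    throw new Error('Author not found');
  }

  // Only now create the post that points at the verified author.
  return PostModel.create({ title, content, author: authorId });
}
```

Inside `createPost`, a call such as `createPostForExistingAuthor(User, Post, req.body)` could stand in for the direct `Post.create()` call. The plain `Error`s are placeholders; in this series they would likely be swapped for whatever error type your error-handling layer expects.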

Step 3: Fetching Data with `.populate()`

Here is the dilemma. If your frontend asks for a Post, and you just do `const post = await Post.findById(id);`, the frontend only gets the author's ID. The frontend wants the author's name and email to render on the screen!

Without help, you would have to write two separate queries yourself: find the Post, then use the ID to find the User.

Mongoose gives us a superpower: .populate().

export const getPost = asyncHandler(async (req, res) => {
    const post = await Post.findById(req.params.id)
                           .populate('author');
    
    // ... error handling omitted for brevity
    return res.status(200).json(
        new ApiResponse(200, post, "Post fetched")
    );
});

When Mongoose sees .populate('author'), it crosses the bridge we built in Step 1. Behind the scenes it runs a second query to fetch the User document matching that ID, then replaces the ID with the actual object before returning it to you:

{
  "_id": "b39c0f82...",
  "title": "My first post!",
  "content": "Hello world",
  "author": {                 <-- Populated Object!
      "_id": "a12b3c4d...",
      "name": "Alice",
      "email": "alice@example.com"
  }
}
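It helps to remember that `.populate()` is still a second query: Mongoose collects the referenced IDs, looks them up in the other collection, and stitches the results in. The sketch below mimics that idea with in-memory arrays rather than a real database. `populateAuthors` and the sample data are hypothetical stand-ins, not Mongoose's actual implementation.

```javascript
// In-memory stand-ins for the users and posts collections.
const userCollection = [
  { _id: 'a12b', name: 'Alice' },
  { _id: 'c34d', name: 'Bob' },
];
const postCollection = [
  { _id: 'p1', title: 'My first post!', author: 'a12b' },
  { _id: 'p2', title: 'Second post', author: 'a12b' },
  { _id: 'p3', title: 'Hello from Bob', author: 'c34d' },
];

// Hypothetical stand-in for populate('author').
function populateAuthors(postDocs, allUsers) {
  // Step 1: collect the distinct author IDs referenced by the posts.
  const authorIds = new Set(postDocs.map((p) => p.author));

  // Step 2: one batched lookup for all of them (similar to a query with $in).
  const usersById = new Map(
    allUsers.filter((u) => authorIds.has(u._id)).map((u) => [u._id, u])
  );

  // Step 3: swap each ID for the matching document (null if none exists).
  return postDocs.map((p) => ({ ...p, author: usersById.get(p.author) ?? null }));
}

const populated = populateAuthors(postCollection, userCollection);
console.log(populated[0].author.name); // Alice
```

Because the lookup is batched, populating a list of posts costs one extra query for the whole list, not one per post.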

Selectively Populating (Security Matters)

Wait, notice anything dangerous above? We just sent the entire User document to whoever requested the post. If that User document contained sensitive fields (like a password hash or an isAdmin flag), we just leaked it!

You must always specify exactly which fields you want Mongoose to fetch.

// Fetch the author, but ONLY bring back the name and email. Exclude _id.
const post = await Post.findById(req.params.id)
                       .populate('author', 'name email -_id');
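The same field selection can also be written with the object form of `.populate()`, which tends to stay readable as options grow. A minimal sketch, assuming `PostModel` is a Mongoose model with an `author` reference like the one above; the wrapper name `findPostWithPublicAuthor` is hypothetical.

```javascript
// Hypothetical wrapper: PostModel is passed in so the snippet stays
// self-contained; in this series it would be the Post model.
function findPostWithPublicAuthor(PostModel, postId) {
  return PostModel.findById(postId).populate({
    path: 'author',       // the field to populate
    select: 'name email', // whitelist: only these fields (plus _id) come back
  });
}
```

Listing the fields you want (`'name email'`) is generally safer than excluding the ones you don't (`'-password'`), because any field added to the User schema later stays hidden until you opt in.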

The Reverse Lookup: Virtuals (Optional Pro Tip)

We linked Post to User. We can easily find the author of a post. But what if we query a User and want to see all their Posts?

We didn't save an array of Posts on the User model! Do we have to run a manual query every time, such as `Post.find({ author: currentUserId })`?

Yes, that is exactly how you should do it for pagination. But if you occasionally want Mongoose to "fake" an array on the User object, you can use Virtuals. A virtual is a field that doesn't exist in the database, but Mongoose computes it on the fly.

// In user.model.js
userSchema.virtual('posts', {
  ref: 'Post',
  localField: '_id',    // The field on the User
  foreignField: 'author' // The matching field on the Post
});

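Now you can call `User.findById(req.params.id).populate('posts')`. Two caveats apply. First, virtuals are left out of JSON output by default, so the posts array only shows up in your response if the schema sets the toJSON option to { virtuals: true }. Second, never do this if a user could have 10,000 posts, because every one of them is loaded into memory at once. Stick to standard .find() with pagination if in doubt.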
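For reference, the paginated alternative might look like the sketch below. The helper name `findPostsByAuthor`, its `page` and `limit` parameters, and the cap of 100 are assumptions for illustration. `PostModel` is expected to be a Mongoose model whose schema has `timestamps: true`, like the Post model from Step 1.

```javascript
// Hypothetical helper: returns one page of a user's posts, newest first.
function findPostsByAuthor(PostModel, authorId, { page = 1, limit = 10 } = {}) {
  // Clamp the inputs so a bad query string cannot request a huge page.
  const safePage = Math.max(1, Number(page) || 1);
  const safeLimit = Math.min(100, Math.max(1, Number(limit) || 10));

  return PostModel.find({ author: authorId })
    .sort({ createdAt: -1 })          // newest first
    .skip((safePage - 1) * safeLimit) // skip the earlier pages
    .limit(safeLimit);                // cap how many documents are loaded
}
```

A controller could then call `await findPostsByAuthor(Post, req.params.id, req.query)` and never hold more than one page of posts in memory.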

Summary

By utilizing MongoDB References and Mongoose .populate(), your database has evolved from a flat filing cabinet into a deeply connected web of information.

- Subdocuments are for small, bounded lists (addresses).
- References (ObjectId) are for unbounded, independent records (a user's posts).
- .populate() bridges the gap, fetching the associated data for you so you don't have to write the second query by hand.

But we have reached a critical security crossroad.

Right now, anyone can send { authorId: '123' } in the POST body to create a post. That means I can easily pretend to be you and write posts under your name.

This is why real APIs require users to prove their identity before allowing them to create or modify related data.

It is time to dive deep into Authentication and Data Security. In the next section of our series, we will learn how to verify identity, issue cryptographically secure Tokens, and ensure that only Alice can delete Alice’s posts.

Happy coding!