Case Study

Building a Production HR SaaS Platform: Architecture, Trade-offs, and Lessons

A deep technical walkthrough of the architecture decisions behind a production employment marketplace — modular monolith design, tri-role RBAC, AI CV analysis, full-text search, and real-time messaging.

Omar Harkouss
6 March 2026 · 14 min read

Building a production SaaS platform is a fundamentally different undertaking from building a side project. The gap is not primarily in lines of code. It is in the invisible decisions — the ones that determine whether the system can be reasoned about at 2am during an incident, whether a new engineer can onboard in a day, and whether the data model survives the first real usage pattern the product team didn't anticipate.

This article is a detailed walkthrough of the architectural decisions behind a production employment marketplace: what was chosen, why, what the trade-offs were, and what the codebase reveals about designing systems that need to work reliably in the real world for real users.

The platform in question is a tri-role HR SaaS serving three distinct user types — candidates, recruiters, and administrators — covering the full recruitment lifecycle: job discovery, applications, shortlisting, AI-powered CV review, peer-to-peer messaging, and transactional notifications.


I. The Foundational Question: Monolith or Services?

The first architectural decision any SaaS project faces is decomposition strategy. For a project at this scale — a single product, a solo or small team, with a bounded and well-understood domain — the answer is almost always the same, even if it is not always the one that gets chosen.

A modular monolith is the correct starting point.

The architecture chosen here follows this principle precisely. A single Next.js application contains all functional domains, each isolated into its own module directory with a consistent internal structure. The business logic boundary is enforced not by network calls or service contracts, but by module cohesion and the tRPC router interface.

This approach has a specific set of advantages that only become visible in production:

  • Atomic transactions across domain boundaries are trivial. In a microservices architecture, marking an application as "reviewed" while simultaneously sending a notification and updating a dashboard counter requires distributed transaction coordination — one of the genuinely hard problems in distributed systems. In a monolith, it is a Prisma $transaction block.
  • Type safety is end-to-end. With tRPC, a TypeScript type defined in a server router is automatically available to the React client without code generation steps or manual type synchronisation. The TypeScript compiler enforces the API contract between frontend and backend at build time.
  • Deployment surface is minimal. One application, one deployment target, one set of environment variables to manage.
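The first of those points is worth making concrete. The real codebase gets atomicity from Prisma's `db.$transaction`; the in-memory store below is an illustrative stand-in (names hypothetical) for the guarantee that call provides:

```typescript
// Illustrative only: a tiny snapshot-and-rollback store, standing in for
// what Prisma's $transaction provides against Postgres.
type Store = { applications: Map<string, string>; notifications: string[] };

function runAtomically(store: Store, ops: Array<(s: Store) => void>): boolean {
  // Snapshot the state before applying any operation.
  const snapshot: Store = {
    applications: new Map(store.applications),
    notifications: [...store.notifications],
  };
  try {
    for (const op of ops) op(store);
    return true;
  } catch {
    // Any failure rolls back every change — no partial state survives.
    store.applications = snapshot.applications;
    store.notifications = snapshot.notifications;
    return false;
  }
}
```

The reviewed-status update and the notification write either both land or neither does — the property a service-split architecture has to buy back with sagas or two-phase commit.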

The cost is well-understood: as the team or domain grows, the monolith requires discipline to keep modules genuinely isolated. Without enforced boundaries, the modular monolith collapses into a distributed ball of mud. The module-first directory structure — each domain owning its own server/, view/, schema/, and types/ directories — is the primary mechanism for preserving those boundaries.


II. The Module Architecture in Detail

Each functional domain in the codebase follows a consistent internal decomposition:

src/modules/[domain]/
  server/          — tRPC router, service functions, DB queries
  view/            — React components, pages, feature hooks
  schema/          — Zod validation schemas
  types/           — Domain-specific TypeScript types
  utils/           — Feature-local utility functions

This structure makes a specific claim about ownership: the module boundary is the unit of reasoning, not the technical layer. A candidate applications engineer does not need to understand the recruiter module to work on application workflows. The schema, API surface, and UI for that domain are co-located.

The domains mapped to modules are:

  • auth — registration, login, session management, email verification
  • candidate/* — profile, applications, CV review, shortlist, dashboard
  • recruiter/* — jobs, applications, company, profile, shortlist, dashboard
  • jobs + companies — public discovery (unauthenticated)
  • messaging — peer-to-peer conversations between candidates and recruiters
  • notifications — in-app notification state and event-driven creation
  • files — upload, metadata, ownership enforcement
  • lookup — reference data (regions, cities, fields of study, sectors)
  • settings — user-level preferences

This is approximately the right level of domain decomposition for a platform of this scope. The domains are large enough to be coherent and small enough to be owned.


III. Role-Based Access Control: The Dual-Layer Model

A three-role system — candidate, recruiter, administrator — requires an RBAC implementation that is both strict and operationally ergonomic. The approach taken here uses two independent enforcement layers that serve different purposes.

Layer 1: Edge Middleware

A middleware function intercepts every request before it reaches application code. It performs fast, stateless checks using cookie presence and URL prefix matching:

  • If the request targets /candidate/* and the session cookie indicates a recruiter role, redirect immediately.
  • If the request targets an authenticated prefix with no session cookie at all, redirect to login.

This layer is intentionally shallow. It cannot — and does not try to — perform deep database validation. Its purpose is user experience: preventing a recruiter from seeing a 403 page deep inside the candidate dashboard by catching the mismatch at the earliest possible point in the request lifecycle.
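Reduced to its decision logic, this layer is a pure function of the request path and the role claimed by the cookie. The sketch below is illustrative — the redirect targets are assumptions, not the codebase's actual routes:

```typescript
// Simplified sketch of the edge-middleware decision: shallow, stateless
// checks on URL prefix and the role claimed by the session cookie.
type Role = "CANDIDATE" | "RECRUITER" | "ADMIN";

const PROTECTED_PREFIXES: Record<string, Role> = {
  "/candidate": "CANDIDATE",
  "/recruiter": "RECRUITER",
  "/admin": "ADMIN",
};

// Returns a redirect target, or null to let the request through.
function resolveRedirect(pathname: string, sessionRole: Role | null): string | null {
  for (const [prefix, requiredRole] of Object.entries(PROTECTED_PREFIXES)) {
    if (!pathname.startsWith(prefix)) continue;
    if (sessionRole === null) return "/login"; // no session at all
    if (sessionRole !== requiredRole) return "/"; // wrong role: bounce early
  }
  return null; // public route, or role matches
}
```

Note what is absent: no database call, no session validation. The function only reads what is already in the request.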

Layer 2: tRPC Procedure Guards

Every mutation and query in the API layer is wrapped in a typed procedure guard:

  • protectedProcedure — requires any authenticated session
  • candidateProcedure — requires authenticated session with role === CANDIDATE
  • recruiterProcedure — requires authenticated session with role === RECRUITER
  • adminProcedure — requires authenticated session with role === ADMIN

These are not UI redirects. They perform actual database-backed session validation and return TRPCError with code UNAUTHORIZED or FORBIDDEN on failure. An authenticated recruiter calling a candidateProcedure endpoint receives a typed error, regardless of what the middleware did or did not catch.

The design principle here is defense in depth: middleware handles UX, procedure guards handle security. The two layers are independent — compromising or bypassing one does not compromise the other.
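Stripped of tRPC plumbing, the guard logic reduces to a few lines. `TRPCError` is stubbed here so the sketch is self-contained; the real procedures are tRPC middlewares built on the same check:

```typescript
// Stand-in for tRPC's error class, so this sketch runs on its own.
class TRPCError extends Error {
  constructor(public code: "UNAUTHORIZED" | "FORBIDDEN") { super(code); }
}

type Role = "CANDIDATE" | "RECRUITER" | "ADMIN";
type Session = { userId: string; role: Role } | null;

function requireRole(session: Session, role: Role): { userId: string } {
  // No session at all: the caller is not authenticated.
  if (!session) throw new TRPCError("UNAUTHORIZED");
  // Authenticated but wrong role: authenticated-yet-forbidden.
  if (session.role !== role) throw new TRPCError("FORBIDDEN");
  return { userId: session.userId };
}
```

The UNAUTHORIZED/FORBIDDEN distinction matters: the first tells the client to re-authenticate, the second tells it that re-authenticating will not help.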

Ownership Verification

Role checks are necessary but not sufficient. A recruiter is authorised to manage their own jobs but not those of another recruiter. A candidate should only access their own application history.

The correct pattern — applied consistently throughout the codebase — is to scope every query to the authenticated user's identity, not just their role:

// Anti-pattern: role check only
const job = await db.job.findUnique({ where: { id: input.jobId } });
if (!job) throw new TRPCError({ code: "NOT_FOUND" });
// Nothing stops a recruiter from editing another recruiter's job

// Correct pattern: scope to authenticated user
const job = await db.job.findUnique({
  where: {
    id: input.jobId,
    recruiterId: ctx.user.id  // ownership enforced at query level
  }
});
if (!job) throw new TRPCError({ code: "NOT_FOUND" });

This pattern means that even if a procedure guard were bypassed, the query itself would return nothing for data that doesn't belong to the authenticated user.


IV. The Data Model: Designing for a Marketplace

A marketplace data model must represent three distinct identity types and the relationships between them. The foundational design decision is whether to use a single User table with optional role-specific profile tables, or to create entirely separate tables per role.

The single User with optional profile tables is almost always the better choice. It preserves a single authentication identity, simplifies session management, and allows sharing of common fields (email, timestamps, preferences) without duplication.

The schema reflects this:

User (role, email, sessions, accounts)
  ├── CandidateProfile (one-to-one, nullable)
  │     ├── CandidateEducation[]
  │     ├── CandidateExperience[]
  │     ├── CandidateLanguage[]
  │     ├── CandidateSkill[]
  │     ├── CandidateDocument[]
  │     └── CvReview[]
  └── RecruiterProfile (one-to-one, nullable)
        └── Company (many-to-one)

The role-specific profile tables are nullable. A user who registers but has not yet completed profile setup has a User row but no CandidateProfile. This supports lazy profile initialisation — a performance optimisation in application flows where creating an empty profile for every new user would generate unnecessary rows.

The Marketplace Relations

Job (recruiter, company, status, vector, searchVector)
Application (candidate, job, status, coverLetter)
Conversation (recruiter, candidate, job)
Message (conversation, sender, content)
Notification (user, type, referenceId, read)
RecruiterShortlistCandidate (recruiter, candidate) — unique constraint
CandidateShortlistJob (candidate, job) — unique constraint

The unique constraints on shortlist tables enforce at the database level that a recruiter cannot shortlist the same candidate twice, and a candidate cannot save the same job twice. This is the correct place to enforce this invariant — not in application code where race conditions are possible, but in the database where the constraint is atomic.
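In Prisma schema terms, the invariant looks like the following — a sketch with assumed field names, not copied from the codebase:

```prisma
// Compound uniqueness enforced by the database, not the application.
model RecruiterShortlistCandidate {
  id          String @id @default(cuid())
  recruiterId String
  candidateId String

  @@unique([recruiterId, candidateId])
}
```

A concurrent duplicate insert then fails with Prisma's P2002 unique-constraint error, which the mutation can translate into an idempotent no-op or a CONFLICT response — no read-then-write race is possible.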

Vector Columns and Full-Text Search

The schema contains two PostgreSQL-native column types that Prisma cannot model natively:

// Prisma schema (simplified)
model CandidateProfile {
  embedding  Unsupported("vector(1536)")?
}

model Job {
  embedding    Unsupported("vector(1536)")?
  searchVector Unsupported("tsvector")?
}

The Unsupported type is Prisma's mechanism for acknowledging that a column exists but deferring its management to raw SQL. This is the correct approach for PostgreSQL extensions like pgvector and full-text search, which have no Prisma-native equivalent.

The embedding columns store 1536-dimensional float vectors generated by the Gemini embedding model. Semantic similarity search uses cosine distance between these vectors to rank candidates against job descriptions — a fundamentally more robust matching strategy than keyword search.

The searchVector column stores a pre-computed tsvector for French-language full-text search, updated via database trigger or application-side raw query on each job write.
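The trigger variant might look like the following — a hedged sketch, since the article does not show the actual migration; the column names and weighting are assumptions:

```sql
-- One possible shape of the trigger (table, columns, and weights assumed).
-- Keeps search_vector in sync on every write, using the French config.
CREATE OR REPLACE FUNCTION jobs_search_vector_update() RETURNS trigger AS $$
BEGIN
  NEW.search_vector :=
    setweight(to_tsvector('french', coalesce(NEW.title, '')), 'A') ||
    setweight(to_tsvector('french', coalesce(NEW.description, '')), 'B');
  RETURN NEW;
END
$$ LANGUAGE plpgsql;

CREATE TRIGGER jobs_search_vector_trigger
  BEFORE INSERT OR UPDATE ON jobs
  FOR EACH ROW EXECUTE FUNCTION jobs_search_vector_update();
```

The trigger approach has the advantage that the column can never drift out of sync with the row, even for writes that bypass the application.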


V. The Search Architecture: A Study in Pragmatic Trade-offs

Job discovery is the highest-traffic path in the application. The search implementation makes an interesting architectural decision that is worth examining closely.

The public job router bifurcates the search path based on query intent:

if (!input.query) {
  // Path A: Prisma-native filtering
  // Fast, type-safe, supports all Prisma filter operators
  const jobs = await db.job.findMany({
    where: buildPrismaFilter(input),
    orderBy: { createdAt: "desc" },
    skip: input.cursor,
    take: input.limit,
  });
} else {
  // Path B: Raw SQL with full-text search
  try {
    const jobs = await db.$queryRaw`
      SELECT *, ts_rank(search_vector, to_tsquery('french', ${input.query})) as rank
      FROM jobs
      WHERE search_vector @@ to_tsquery('french', ${input.query})
      ORDER BY rank DESC
      LIMIT ${input.limit}
    `;
  } catch {
    // Path C: ILIKE fallback if FTS fails
    const jobs = await db.job.findMany({
      where: { title: { contains: input.query, mode: "insensitive" } },
    });
  }
}

This three-path architecture encodes several pragmatic decisions:

Why bifurcate? Prisma's query builder is expressive for structured filtering (status, location, salary range, job type) but cannot express full-text search ranking. Raw SQL is required for ts_rank and the @@ tsvector match operator.

Why the ILIKE fallback? Full-text search with to_tsquery requires the query to be a valid tsquery expression. Common user inputs — partial words, typos, queries with stopwords — can cause to_tsquery to throw. The ILIKE fallback ensures the user sees results rather than an error when this occurs.

The cost of this pattern: Three code paths for one query means three surfaces for bugs and three behaviours to test. The Prisma path and the raw SQL path must produce compatible pagination structures. The fallback path has different ranking semantics than the primary path.

This is a reasonable trade-off for a production system. The alternative — a single unified search path — would require either accepting Prisma's inability to express ranking, or moving all filtering into raw SQL and losing type safety.


VI. The AI Integration: CV Review Pipeline

The CV review feature is architecturally interesting because it chains three distinct systems: file storage, an LLM, and a relational database.

The pipeline:

1. Candidate uploads CV PDF → Cloudinary (storage)
2. Candidate requests review → tRPC mutation
3. Backend downloads PDF from Cloudinary URL
4. PDF bytes sent to Gemini (gemini-2.5-flash) with structured output schema
5. Gemini returns JSON conforming to the schema
6. Backend persists review: scalar fields to typed columns, JSON sections to jsonb
7. Review available to candidate via getReviews / getReviewById queries

The critical design decision is step 4: structured output schema enforcement.

Sending a PDF to an LLM and asking for a free-form review is not an integration — it is a hope. The integration becomes reliable when the LLM is constrained to produce output that conforms to a specific JSON schema, validated before persistence.

The Vercel AI SDK's generateObject function enforces this:

import { generateObject } from "ai";
import { z } from "zod";

const cvReviewSchema = z.object({
  overallScore: z.number().min(0).max(100),
  summary: z.string(),
  strengths: z.array(z.string()),
  improvements: z.array(z.object({
    area: z.string(),
    suggestion: z.string(),
    priority: z.enum(["HIGH", "MEDIUM", "LOW"]),
  })),
  sections: z.object({
    experience: z.object({ score: z.number(), feedback: z.string() }),
    education: z.object({ score: z.number(), feedback: z.string() }),
    skills: z.object({ score: z.number(), feedback: z.string() }),
  }),
});

const { object: review } = await generateObject({
  model: google("gemini-2.5-flash"),
  schema: cvReviewSchema,
  messages: [{ role: "user", content: [{ type: "file", data: pdfBytes, mimeType: "application/pdf" }] }],
});

If the model produces output that does not conform to cvReviewSchema, the SDK throws before the application receives the data. The persistence layer never sees malformed AI output.

The persistence strategy balances two concerns: queryability and flexibility. Scalar fields (overallScore, section scores) are stored in typed columns — this allows filtering and aggregating reviews without parsing JSON. Rich structured sections are stored in jsonb columns — this allows the schema to evolve without database migrations for every structural change.


VII. The Messaging Architecture

Peer-to-peer messaging in a marketplace has a specific constraint that distinguishes it from general chat: access control is not just about authentication, it is about participant membership.

The data model reflects this:

Conversation (recruiterId, candidateId, jobId)
Message (conversationId, senderId, content, createdAt)

Every API operation on a conversation — reading messages, sending a message — performs a participant check before returning or writing data:

const conversation = await db.conversation.findFirst({
  where: {
    id: input.conversationId,
    OR: [
      { recruiterId: ctx.user.id },
      { candidateId: ctx.user.id },
    ],
  },
});

if (!conversation) {
  throw new TRPCError({ code: "FORBIDDEN" });
}

This pattern means a candidate cannot read messages from a conversation they are not part of, even if they know the conversation ID. The OR condition is essential — it allows either participant to pass the check without two separate queries.

Message retrieval uses cursor-based pagination rather than offset pagination:

const messages = await db.message.findMany({
  where: { conversationId: input.conversationId },
  take: input.limit,
  skip: input.cursor ? 1 : 0,
  cursor: input.cursor ? { id: input.cursor } : undefined,
  orderBy: { createdAt: "asc" },
});

Cursor pagination is correct for chat because it is stable under concurrent inserts. Offset pagination breaks when new messages arrive while a user is loading earlier history — the page boundaries shift, causing messages to appear twice or be skipped. Cursor pagination anchors to a specific message ID, making it immune to concurrent writes.
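The difference is easy to demonstrate with an in-memory model. Assume a newest-first message list where concurrent sends prepend; the functions below are illustrative, not the Prisma calls themselves:

```typescript
type Msg = { id: number };

// Offset pagination: page boundaries are positional, so they shift
// whenever something is inserted before them.
function pageByOffset(msgs: Msg[], offset: number, limit: number): Msg[] {
  return msgs.slice(offset, offset + limit);
}

// Cursor pagination: the boundary is anchored to a concrete message id,
// so concurrent inserts elsewhere in the list cannot move it.
function pageByCursor(msgs: Msg[], cursor: number | null, limit: number): Msg[] {
  const start = cursor === null ? 0 : msgs.findIndex((m) => m.id === cursor) + 1;
  return msgs.slice(start, start + limit);
}
```

With a list of ids `[5, 4, 3, 2, 1]`, page one is `[5, 4]` either way. If message 6 arrives before page two is fetched, the offset version re-serves id 4; the cursor version continues cleanly after it.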


VIII. File Management: The Security Invariants

File upload in a multi-tenant application requires explicit ownership enforcement at every layer. The files module enforces two invariants:

Invariant 1: Path ownership. Every uploaded file's Cloudinary public ID must contain the uploading user's ID as a path segment. A request to register a file with path /documents/other-user-id/cv.pdf for user current-user-id is rejected.

// Compare whole path segments — a substring check would also match
// the user's ID embedded inside a longer, different ID
const segments = input.publicId.split("/");
if (!segments.includes(ctx.user.id)) {
  throw new TRPCError({ code: "FORBIDDEN", message: "Path ownership violation" });
}

Invariant 2: Bucket policy enforcement. Different file types have different size and MIME type constraints. A CV upload bucket accepts application/pdf up to 10MB. A profile photo bucket accepts image/* up to 5MB. These constraints are enforced server-side, not just in the client upload widget.
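A server-side policy table for this might look like the sketch below; the bucket names and limits are taken from the constraints described above, not from the codebase itself:

```typescript
// Hypothetical bucket policy table: each bucket declares its own
// size ceiling and acceptable MIME types.
const BUCKETS = {
  cv: { maxBytes: 10 * 1024 * 1024, mime: /^application\/pdf$/ },
  avatar: { maxBytes: 5 * 1024 * 1024, mime: /^image\// },
} as const;

// Returns an error code, or null when the upload is acceptable.
function validateUpload(
  bucket: keyof typeof BUCKETS,
  mimeType: string,
  sizeBytes: number
): string | null {
  const policy = BUCKETS[bucket];
  if (!policy.mime.test(mimeType)) return "INVALID_MIME_TYPE";
  if (sizeBytes > policy.maxBytes) return "FILE_TOO_LARGE";
  return null;
}
```

Because this runs in the tRPC procedure, a client that bypasses the upload widget entirely still hits the same constraints.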

Cleanup ordering: File deletion operations are designed to delete metadata from the database before deleting the file from Cloudinary. If the database deletion fails, the file remains in storage but the application no longer references it — recoverable by manual cleanup. If Cloudinary deletion were attempted first and failed, the database would contain a reference to a non-existent file — a harder problem to recover from.

This is a subtle but important operational detail: design cleanup operations so that the failure mode leaves the system in a state that is safe if inconsistent, rather than consistent but broken.


IX. Notifications: Event-Driven State

Notifications in the application serve two purposes: immediate feedback (a recruiter reviewed your application) and persistent audit state (a record that the event occurred). These two purposes are served by the same data structure but accessed differently.

The notification model:

model Notification {
  id          String   @id
  userId      String
  type        String
  referenceId String?  // ID of the entity the notification refers to
  read        Boolean  @default(false)
  createdAt   DateTime @default(now())

  @@index([userId, read, createdAt])
}

The compound index on (userId, read, createdAt) is a deliberate performance choice. The two most common queries are "unread notifications for this user" (filters on userId and read) and "all notifications for this user ordered by recency" (filters on userId, sorts on createdAt). The first matches the index prefix exactly; the second still narrows on the userId prefix, leaving only a small per-user sort. One index serves both.

Notifications are created as side effects of domain events — application status changes, new messages, recruiter actions. The creation utilities are simple write functions called at the end of the relevant mutations:

// Inside recruiterApplication.updateStatus mutation
await db.application.update({ where: { id: input.applicationId }, data: { status: input.status } });

// Side effect: notify candidate
await createNotification({
  userId: application.candidateId,
  type: "APPLICATION_STATUS_CHANGED",
  referenceId: application.id,
});

This is correct for a system at this scale. The limitation is that notification creation is synchronous within the request — if the notification write fails, it either rolls back the application status update (if inside a transaction) or silently fails (if outside). A background job system would decouple these concerns for higher-volume scenarios.


X. What Production Readiness Actually Requires

The platform described above is architecturally sound and functionally complete. Production readiness, however, is a separate dimension that extends beyond correctness.

Observability

A system in production that you cannot observe is a system you cannot operate. Structured logging — where every log entry includes a request ID, user ID, feature tag, and duration — is the minimum baseline. Without it, correlating a user-reported error to a specific request in a log stream becomes archaeology.

The next layer is distributed tracing: the ability to see a single user action as a trace spanning the middleware, the tRPC procedure, the Prisma query, and the external API call. Tools like OpenTelemetry, Axiom, or Datadog APM provide this. Without it, latency attribution — identifying which component is responsible for a slow response — is guesswork.

Background Processing

Three operations in the current architecture have characteristics that make them poor candidates for synchronous request handling at scale:

  • CV embedding generation: Calling the Gemini embedding API is network-bound and quota-limited. At volume, a synchronous CV upload that blocks on embedding generation will produce timeouts.
  • Notification fan-out: If a job receives 500 applications and the recruiter takes a batch action, creating 500 notification rows synchronously will produce a slow request.
  • Email delivery: Transactional emails via Resend are fast, but email providers have their own rate limits and retry semantics that are better managed by a dedicated queue.

A background job system — Inngest, BullMQ, or a similar queue-backed processor — decouples these operations from the request lifecycle, allowing immediate response to the user and deferred execution of the heavy work.

Integration Testing

The most valuable tests for a system of this architecture are not unit tests of individual functions — they are integration tests of the critical paths at the tRPC procedure level:

  • A candidate cannot apply to the same job twice
  • A recruiter can only edit jobs they own
  • A user cannot read messages from a conversation they are not a participant in
  • An unauthenticated request to a protected procedure returns UNAUTHORIZED

These tests exercise the actual guard logic, the actual database queries, and the actual error semantics. They cannot be replaced by unit tests of the individual components.


XI. Lessons and Transferable Principles

After mapping this architecture in detail, several principles emerge that apply to any production SaaS platform.

Design for the failure case first. The ownership check pattern, the fallback search path, the cleanup ordering in file deletion — these are all expressions of the same instinct: assume the failure will happen, design the system so the failure mode is safe.

Put constraints in the database, not just the application. Unique constraints on shortlist tables, indexed columns on high-traffic queries, vector column types for semantic search — these are invariants that belong in the persistence layer, not left for application code to remember to maintain.

The API contract is the product. tRPC's end-to-end type safety means that the API contract between server and client is enforced by the TypeScript compiler. Breaking changes are caught at build time, not at runtime. This is not a minor convenience — it fundamentally changes the confidence with which you can modify server-side code.

Bifurcate on intent, not on type. The search architecture's split between structured filtering and full-text search is a clean example of matching the tool to the problem. Forcing all queries through a single path — either all Prisma or all raw SQL — would require either abandoning ranking or abandoning type safety.

Observability is not an afterthought. Every production incident eventually comes down to the question: what was the system doing when this happened? The systems that answer that question quickly are the ones that built observability in before they needed it.


Conclusion

The architecture described here is a production employment marketplace built by one developer, shipped to real users, running against a real database, processing real files through a real AI pipeline. The decisions it reflects — modular monolith over microservices, dual-layer RBAC, cursor pagination, schema-constrained AI output, ownership-scoped queries — are not academic preferences. They are solutions to problems that become visible only when systems operate at production scale.

The single most consistent theme across all of these decisions is a preference for constraints. Database constraints over application logic. TypeScript constraints over runtime checks. Structured AI output over free-form generation. Schema validation over defensive conditionals.

Constraints are not limitations on what a system can do. They are the mechanism by which a system remains understandable as it grows.
