PostgreSQL Setup – Create, Insert, and Query Your First User
Alright team, let’s get our hands dirty and lay down the foundational brick of our AI-Powered CRM: the database. Yesterday, we got our React frontend humming, ready to talk to a backend. Today, we give that backend something solid to talk to – a robust, reliable data store. We're talking PostgreSQL, a workhorse of the database world.
Agenda for Day 3: Building Our Data Foundation
Why PostgreSQL? Understanding its role in a hyperscale, AI-driven CRM.
Setting Up PostgreSQL: Quickly getting a local instance running with Docker.
Designing Our users Table: Crafting the blueprint for our core entity.
Hands-On Data Interaction: Connecting, creating, inserting, and querying our first user.
The Big Picture: How this database fits into our evolving CRM architecture.
Core Concepts: The Unsung Hero of Scalability
Today, we're diving into Data Persistence and Relational Database Management Systems (RDBMS). Specifically, we'll touch upon ACID properties (Atomicity, Consistency, Isolation, Durability) and basic Schema Design.
Why PostgreSQL for a Hyperscale AI CRM? The Non-Obvious Truth.
You might be thinking, "AI-powered, hyperscale... shouldn't we be reaching for some cutting-edge NoSQL database right away?" And that's a fair thought. Many systems do leverage NoSQL for certain workloads. But here's the kicker, the non-obvious insight often learned through years in the trenches:
For core business entities like Users, Accounts, Leads, or Orders, even systems handling 100 million requests per second often rely on relational databases, heavily sharded and optimized.
Why? Because these core entities demand strong consistency and transactional integrity. When a user signs up, you absolutely need to know that their record is fully created, correctly linked, and won't suddenly disappear or become corrupted. This is where ACID properties shine. PostgreSQL provides that rock-solid foundation.
Think about it: Your AI models will eventually ingest and process vast amounts of user data to provide personalized insights, predict churn, or automate sales tasks. If the source user data is inconsistent or unreliable, your AI insights become garbage. A well-structured relational database ensures clean, consistent data, which is paramount for effective AI.
In big tech, we often see a hybrid approach: RDBMS for critical, highly consistent core data, and NoSQL databases (like Cassandra, DynamoDB, MongoDB) for high-volume, eventually consistent, or unstructured data (e.g., user activity logs, real-time recommendations caches, large-scale event streams). We'll explore these later, but for our CRM's heart, PostgreSQL is the right call for now. It provides the "source of truth" for our users.
Component Architecture: PostgreSQL in Our CRM
Today, PostgreSQL becomes the backend's persistent storage layer. Our React frontend (from Day 2) will eventually talk to a backend API (which we'll build next), and that backend API will, in turn, communicate with PostgreSQL to store and retrieve data.
(Imagine a clean SVG diagram here: React Frontend -> Backend Service -> PostgreSQL Database)
Control Flow and Data Flow: Getting Data In and Out
For today, our "backend" is essentially a set of direct SQL commands.
Control Flow: We (as developers) execute SQL commands.
Data Flow:
We send CREATE TABLE (DDL – Data Definition Language) to define our users table.
We send INSERT (DML – Data Manipulation Language) to add a user.
We send SELECT (DML) to retrieve user data.
PostgreSQL processes these commands and either modifies its internal state (for DDL/INSERT) or returns data (for SELECT).
State Changes: From Empty Canvas to User Registry
Our database starts as an empty canvas. Today, we'll transition it through these states:
Database Created: An empty crm_db exists.
users Table Created: The schema for users is defined.
User Record Inserted: Our first user exists in the table.
Sizing for Real-time Production Systems (100M RPS)
A users table is deceptively simple. At 100M RPS, this table becomes a critical hot-spot. How do big systems handle it?
Sharding: Breaking the users table into smaller, independent databases based on a key (e.g., user ID range, geographical region). Each shard handles a subset of users.
Read Replicas: Creating multiple read-only copies of the database to distribute read traffic. Writes still go to the primary.
Connection Pooling: Managing database connections efficiently from the application layer to avoid overhead.
For now, we're starting simple, but keep these concepts in your back pocket. The principles of good schema design and understanding query patterns today will pay dividends when you're thinking about sharding strategies tomorrow.
Hands-On: Setting Up PostgreSQL and Creating Our First User
Let's get practical. We'll use Docker for a quick, isolated PostgreSQL instance.
Step 1: Start PostgreSQL with Docker
This command starts a PostgreSQL 14 container, names it crm-postgres, sets up a user, password, and database, and maps port 5432 to your local machine.
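A command matching that description might look like the following. The password and database name match the values used later in this lesson; the username crmuser is an assumption, so substitute your own if you prefer:

```shell
# Start PostgreSQL 14 in a detached container named crm-postgres.
# POSTGRES_USER is an assumed name; POSTGRES_PASSWORD and POSTGRES_DB
# match the credentials referenced in the connection step below.
docker run -d \
  --name crm-postgres \
  -e POSTGRES_USER=crmuser \
  -e POSTGRES_PASSWORD=crmpassword \
  -e POSTGRES_DB=crm_db \
  -p 5432:5432 \
  postgres:14
```

Run `docker ps` afterwards to confirm the container is up and port 5432 is mapped.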
Step 2: Connect to PostgreSQL
Once the container is running, connect using psql (you might need to install postgresql-client if you don't have it).
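Assuming the crmuser username from the Docker step, the connection command would be:

```shell
# Connect to the containerized database from the host.
# -h localhost forces a TCP connection to the mapped port 5432
# instead of a local Unix socket.
psql -h localhost -p 5432 -U crmuser -d crm_db
```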
Enter crmpassword when prompted. You should see the crm_db=# prompt.
Step 3: Design and Create the users Table
We need a table to store our users. Here’s a lean schema:
id: Unique identifier, primary key.
email: User's email, must be unique.
password_hash: Hashed password (never store plain passwords!).
created_at: Timestamp when the user was created.
updated_at: Timestamp for the last update.
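One way to express this schema in SQL (the VARCHAR lengths and TIMESTAMPTZ choice are reasonable defaults, not requirements):

```sql
-- Minimal users table. gen_random_uuid() is built into
-- PostgreSQL 13+, so no extension is needed on Postgres 14.
CREATE TABLE users (
    id            UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email         VARCHAR(255) NOT NULL UNIQUE,
    password_hash VARCHAR(255) NOT NULL,
    created_at    TIMESTAMPTZ  NOT NULL DEFAULT NOW(),
    updated_at    TIMESTAMPTZ  NOT NULL DEFAULT NOW()
);
```

TIMESTAMPTZ (timestamp with time zone) is generally preferred over plain TIMESTAMP so that stored times are unambiguous across regions.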
Insight: We're using UUID for id. Why? In distributed systems, UUIDs are fantastic because they can be generated independently across different services or shards without needing a central authority, reducing coordination overhead and preventing ID collisions. SERIAL integers are simpler for single-instance databases but less suitable for hyperscale distributed systems. gen_random_uuid() is a PostgreSQL function for this.
Step 4: Insert Our First User
Let's add a dummy user. Remember, in a real system, you'd hash the password on the backend before storing it. For now, we'll use a placeholder.
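An insert along these lines works (the email address and placeholder hash are illustrative values):

```sql
-- Placeholder hash only. In production the backend would store a
-- bcrypt/argon2 digest, never the raw password or a fake string.
INSERT INTO users (email, password_hash)
VALUES ('alice@example.com', 'placeholder_hashed_password');
```

Note that id, created_at, and updated_at are all filled in automatically by their column defaults.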
Step 5: Query Our First User
Verify that Alice is in our database:
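For example, assuming the alice@example.com address used in the insert step:

```sql
-- Fetch Alice's row by email.
SELECT id, email, created_at
FROM users
WHERE email = 'alice@example.com';
```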
You should see Alice's details. Congratulations! You've just established the core data foundation for your CRM.
Assignment: Level Up Your Schema and Operations
Now that you've got the basics, let's expand.
Enhance the users table:
Add first_name (VARCHAR(100), NOT NULL)
Add last_name (VARCHAR(100), NOT NULL)
Add phone_number (VARCHAR(20), UNIQUE)
Add is_active (BOOLEAN DEFAULT TRUE)
Hint: Use ALTER TABLE ... ADD COLUMN.
Insert another user: Add a new user, 'bob@example.com', with all the new fields.
Update a user: Change Alice's first_name to 'Alicia'. Hint: Remember to update updated_at.
Delete a user: Remove Bob from the database.
Bonus: Add an Index: Create an index on the email column to speed up lookups. Why is this critical? Because email will be frequently used for login and user identification.
Solution Hints
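One possible set of answers is sketched below. Bob's name and phone number are made-up sample data, and the empty-string defaults are just one way to add NOT NULL columns to a table that already contains rows:

```sql
-- 1. Enhance the table. A NOT NULL column added to a non-empty table
--    needs a DEFAULT so existing rows can be backfilled.
ALTER TABLE users ADD COLUMN first_name VARCHAR(100) NOT NULL DEFAULT '';
ALTER TABLE users ADD COLUMN last_name  VARCHAR(100) NOT NULL DEFAULT '';
ALTER TABLE users ADD COLUMN phone_number VARCHAR(20) UNIQUE;
ALTER TABLE users ADD COLUMN is_active BOOLEAN DEFAULT TRUE;

-- 2. Insert Bob (name and phone are illustrative values).
INSERT INTO users (email, password_hash, first_name, last_name, phone_number)
VALUES ('bob@example.com', 'placeholder_hashed_password',
        'Bob', 'Smith', '+1-555-0100');

-- 3. Update Alice, remembering to touch updated_at.
UPDATE users
SET first_name = 'Alicia', updated_at = NOW()
WHERE email = 'alice@example.com';

-- 4. Delete Bob.
DELETE FROM users WHERE email = 'bob@example.com';

-- 5. Bonus: index on email.
CREATE INDEX idx_users_email ON users (email);
```

One subtlety for the bonus: the UNIQUE constraint on email already created an index behind the scenes, so the explicit CREATE INDEX here is mainly for practice. The habit matters, though, because most lookup columns won't come with a uniqueness constraint to index them for free.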
The Practical Takeaway
Today, you didn't just run some SQL; you established the bedrock of your CRM. You understood why a relational database is crucial for core data even in hyperscale systems, ensuring data integrity and consistency for future AI capabilities. This robust foundation is what allows us to build complex, intelligent features on top. Next, we'll start connecting this database to a real backend service!