Contact Search API: Filtering Contacts by Name
Welcome back, engineers! Today, we're tackling a seemingly straightforward feature that holds hidden complexities and scalability challenges: building a Contact Search API. Specifically, we'll focus on filtering contacts by name. While it might sound simple – just a database query, right? – the path from a local MVP to a hyperscale system handling 100 million requests per second demands a deeper understanding of search mechanics, indexing, and API design.
This isn't just about writing code; it's about understanding the implications of that code when your CRM grows from a handful of users to millions, and your contact database explodes from hundreds to hundreds of millions.
Agenda for Lesson 9:
The "Why" Behind Efficient Search: Understanding the business and technical criticality.
Core Concepts: Simple Filtering & Its Limits: Implementing a basic name filter and immediately seeing its scalability cracks.
Architecting for Future Scale: Introducing the idea of indexing and dedicated search solutions.
Hands-on Build-Along: Implementing our GET /contacts endpoint with name filtering.
Assignment & Solution Hints: Your next steps to solidify this knowledge.
The "Why" Behind Efficient Search: More Than Just a Feature
Imagine a sales rep trying to find "John Doe" among 50 million contacts. If your search takes 10 seconds, that's 10 seconds of lost productivity, 10 seconds of frustration. Multiply that by thousands of reps, and you have a significant operational bottleneck.
In a hyperscale CRM:
User Experience (UX) is King: Instantaneous search results are non-negotiable for a smooth, productive user experience. Latency kills productivity.
System Load: Inefficient queries can bring your database to its knees, impacting all other CRM operations.
Data Volume: As your CRM scales, the sheer volume of contact data makes naive search approaches catastrophic.
Advanced Needs: Users will eventually want partial matches, phonetic searches, fuzzy matching, and searching across multiple fields. We're laying the groundwork for that journey.
Today, we'll start simple, but with an eye toward these future challenges.
Core Concepts: Filtering Contacts by Name
System Design Concept: Progressive Search Architecture
Our strategy for search will be progressive. We start with the simplest, most direct approach, but immediately identify its weaknesses and mentally prepare for the next evolutionary step. This is a common pattern in big tech: build minimum viable, then iterate based on real-world constraints.
Initial Architecture:
Our GET /contacts endpoint will interact directly with our PostgreSQL database. When a name query parameter is provided, we'll use a LIKE clause in our SQL query.
Control Flow:
A user or another service sends an HTTP GET request to /contacts?name=John.
Our API Gateway (or load balancer) routes the request to our CRM Backend Service.
The Backend Service's handler extracts the name query parameter.
It calls the repository layer, passing the name.
The repository constructs a SQL query like SELECT * FROM contacts WHERE name LIKE $1.
The database executes the query and returns matching contacts.
The repository maps the database rows to Contact structs.
The handler formats these structs into a JSON response.
The Backend Service sends the JSON response back to the client.
Data Flow:
Client (HTTP GET /contacts?name=...) -> Backend Service (extracts 'name') -> Repository (SQL query with LIKE) -> Database (returns rows) -> Repository (maps to structs) -> Backend Service (JSON response) -> Client
State Changes:
For a read-only search operation, the system's state doesn't change. However, the state of the request processing evolves: Request Received -> Parameter Parsed -> DB Query Executed -> Results Formatted -> Response Sent. The underlying contact data in the database remains the source of truth.
The Problem with LIKE '%name%' at Scale
When you write SELECT * FROM contacts WHERE name LIKE '%John%', your database often has to perform a full table scan: it examines every single row in your contacts table to find matches. For 100 contacts, that's trivial. For 100 million contacts, it's a disaster.
Insight: SQL LIKE queries with a leading wildcard (%) generally cannot use a standard B-tree index, which makes them very slow on large datasets. A prefix search such as LIKE 'John%' (no leading wildcard) can use an index on the name column; in PostgreSQL this requires either the C collation or an index created with the text_pattern_ops operator class. But users rarely search by prefix alone.
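To make the two pattern shapes concrete, here is a minimal Go sketch of how the search term might be turned into a LIKE pattern. The helper names (escapeLike, containsPattern, prefixPattern) are hypothetical and not part of the lesson's codebase, and the sketch assumes PostgreSQL's default LIKE escape character, the backslash. It also escapes % and _ in user input, so that a search for "50%" matches those characters literally instead of acting as a wildcard.

```go
package main

import (
	"fmt"
	"strings"
)

// escapeLike escapes the LIKE metacharacters (\, %, _) in user input so they
// are matched literally. Assumes the default escape character, backslash.
func escapeLike(term string) string {
	r := strings.NewReplacer(`\`, `\\`, `%`, `\%`, `_`, `\_`)
	return r.Replace(term)
}

// containsPattern builds "%term%". The leading wildcard is what prevents a
// plain B-tree index from being used.
func containsPattern(term string) string {
	return "%" + escapeLike(term) + "%"
}

// prefixPattern builds "term%". With no leading wildcard, a suitable index
// on the column can serve the query.
func prefixPattern(term string) string {
	return escapeLike(term) + "%"
}

func main() {
	fmt.Println(containsPattern("John"))    // pattern for a "contains" search
	fmt.Println(prefixPattern("John"))      // pattern for a prefix search
	fmt.Println(containsPattern("50%_off")) // user-typed wildcards are escaped
}
```

Either pattern is passed to the database as the value of the $1 parameter, never spliced into the SQL text.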
Real-world Application: This is precisely why big tech companies use specialized full-text search engines like Elasticsearch or Apache Solr for search functionality. These systems are designed to index vast amounts of text data and provide near-instantaneous search results, handling partial matches, relevance scoring, and complex queries much more efficiently than a relational database's LIKE operator. We'll explore these in future lessons, but for now, understand the limitation.
For our current lesson, we'll implement the basic LIKE '%name%' but be acutely aware of its limitations and the need for a future upgrade path. This pragmatic approach allows us to deliver functionality quickly while acknowledging technical debt for scale.
Component Architecture
Our CRM backend service, written in Go, will expose a new API endpoint. We'll reuse our existing Contact model and repository.
Hands-on Build-Along: Implementing the Search API
We'll extend our existing Go backend to add the search functionality.
1. Update repository.go
Add a new method to fetch contacts by name.
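A minimal sketch of what this method might look like is shown here. The Contact fields, the column list, and the shape of ContactRepository are assumptions, since the existing model is not reproduced in this lesson; the sketch also assumes the selected columns are NOT NULL, and it matches against both first_name and last_name. The namePattern helper is hypothetical.

```go
package main

import (
	"database/sql"
	"fmt"
)

// Contact mirrors the lesson's existing model; the exact fields are assumed.
type Contact struct {
	ID        int64
	FirstName string
	LastName  string
	Phone     string
}

// ContactRepository wraps the database handle (assumed shape).
type ContactRepository struct {
	db *sql.DB
}

// searchByNameSQL uses the $1 placeholder so the search term is always sent
// as a parameter, never concatenated into the SQL text.
const searchByNameSQL = `
SELECT id, first_name, last_name, phone
FROM contacts
WHERE LOWER(first_name) LIKE LOWER($1)
   OR LOWER(last_name) LIKE LOWER($1)`

// namePattern wraps the search term in wildcards for a "contains" match.
func namePattern(name string) string {
	return "%" + name + "%"
}

// GetContactsByName returns contacts whose first or last name contains name,
// ignoring case.
func (r *ContactRepository) GetContactsByName(name string) ([]Contact, error) {
	rows, err := r.db.Query(searchByNameSQL, namePattern(name))
	if err != nil {
		return nil, fmt.Errorf("query contacts by name: %w", err)
	}
	defer rows.Close()

	contacts := []Contact{} // non-nil, so it encodes as [] rather than null
	for rows.Next() {
		var c Contact
		if err := rows.Scan(&c.ID, &c.FirstName, &c.LastName, &c.Phone); err != nil {
			return nil, fmt.Errorf("scan contact: %w", err)
		}
		contacts = append(contacts, c)
	}
	return contacts, rows.Err()
}

func main() {
	// No database is opened here; this only shows what would be sent to it.
	fmt.Println(searchByNameSQL)
	fmt.Println("parameter 1:", namePattern("John"))
}
```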
Insight: Notice LOWER(first_name) LIKE LOWER($1). This makes our search case-insensitive, which is a common user expectation. Using $1 for the search term is crucial for preventing SQL injection vulnerabilities. Never concatenate user input directly into SQL queries!
2. Update handlers.go
Modify the GetContacts handler to check for the name query parameter.
Insight: The r.URL.Query().Get("name") method is how you safely extract query parameters from the URL. If name is present, we use our new search method; otherwise, we fetch all contacts (which itself will become a performance bottleneck at scale – another future optimization!).
3. Update main.go
Ensure your main.go sets up the routes correctly.
Insight: We've integrated the GetContacts handler to intelligently serve both unfiltered and name-filtered requests based on the presence of the name query parameter. This is a clean way to handle variations of a resource retrieval.
Assignment: Beyond Basic Filtering
Your mission, should you choose to accept it, is to enhance our search capabilities.
Assignment Steps:
Add a phone filter: Modify the GetContacts handler and ContactRepository to allow searching by phone number as well. The API should support /contacts?phone=%2B1-555-123-4567. (The + must be URL-encoded as %2B; a raw + in a query string is decoded as a space.)
Combine filters: Make the API support both name and phone filters simultaneously. If both are present, contacts should match both criteria. Example: /contacts?name=John&phone=%2B1-555-123-4567.
Refine the GetAllContacts fallback: Currently, if no filters are present, GetAllContacts() fetches all contacts. For a large system, this is unsustainable. Plan how you would modify GetAllContacts (or the GetContacts handler) to support pagination (e.g., _limit and _offset query parameters) as a temporary measure. Don't implement it yet, just note where and how you would add it. This is a thought exercise for now, as full pagination will be a separate lesson.
Solution Hints:
Add phone filter:
In handlers.go, retrieve phoneFilter := r.URL.Query().Get("phone").
In repository.go, create a new method GetContactsByPhone(phone string) ([]Contact, error) similar to GetContactsByName.
Update the GetContacts handler to check phoneFilter and call the new repository method.
Combine filters:
The GetContacts handler will need more complex logic. If both nameFilter and phoneFilter are present, you'll need a new repository method like GetContactsByNameAndPhone(name, phone string).
The SQL query in this new method will use WHERE (LOWER(first_name) LIKE LOWER($1) OR LOWER(last_name) LIKE LOWER($1)) AND LOWER(phone) LIKE LOWER($2). Remember to use parameterized queries!
Pagination thought exercise:
Consider adding _limit and _offset query parameters.
The GetAllContacts SQL query would become SELECT ... FROM contacts LIMIT $1 OFFSET $2.
The handler would parse these parameters (with default values) and pass them to the repository. This is vital for any API that returns lists of resources.
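As an alternative to writing one repository method per filter combination, the hints above can be folded into a single query builder. The following is only a sketch under stated assumptions: ContactFilter, buildSearchQuery, the column list, the default and maximum limits, and the choice to treat phone as a "contains" match are all hypothetical. It adds ORDER BY id because LIMIT/OFFSET without a stable ordering can return inconsistent pages.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	defaultLimit = 50  // assumed default page size
	maxLimit     = 200 // assumed upper bound on page size
)

// ContactFilter carries the optional search criteria (hypothetical type).
type ContactFilter struct {
	Name   string
	Phone  string
	Limit  int
	Offset int
}

// buildSearchQuery assembles a parameterized SQL string and its arguments.
// User input only ever ends up in args, never in the SQL text.
func buildSearchQuery(f ContactFilter) (string, []any) {
	var (
		where []string
		args  []any
	)
	// add appends a value and returns its placeholder ($1, $2, ...).
	add := func(v any) string {
		args = append(args, v)
		return fmt.Sprintf("$%d", len(args))
	}

	if f.Name != "" {
		p := add("%" + f.Name + "%")
		where = append(where, fmt.Sprintf(
			"(LOWER(first_name) LIKE LOWER(%s) OR LOWER(last_name) LIKE LOWER(%s))", p, p))
	}
	if f.Phone != "" {
		p := add("%" + f.Phone + "%")
		where = append(where, fmt.Sprintf("LOWER(phone) LIKE LOWER(%s)", p))
	}

	// Clamp pagination values so a client cannot request an unbounded page.
	switch {
	case f.Limit <= 0:
		f.Limit = defaultLimit
	case f.Limit > maxLimit:
		f.Limit = maxLimit
	}
	if f.Offset < 0 {
		f.Offset = 0
	}

	q := "SELECT id, first_name, last_name, phone FROM contacts"
	if len(where) > 0 {
		q += " WHERE " + strings.Join(where, " AND ")
	}
	limitPH := add(f.Limit)
	offsetPH := add(f.Offset)
	q += fmt.Sprintf(" ORDER BY id LIMIT %s OFFSET %s", limitPH, offsetPH)
	return q, args
}

func main() {
	q, args := buildSearchQuery(ContactFilter{Name: "John", Phone: "555"})
	fmt.Println(q)
	fmt.Println(args...)
}
```

The handler would fill ContactFilter from the name, phone, _limit, and _offset query parameters and hand the resulting query and arguments to the repository.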
This lesson equips you with a functional search API and, more importantly, the critical foresight to understand its limitations and future scaling needs. You're now building not just features, but a resilient, scalable system.