Python and FastAPI
A significant amount of AI engineering is ordinary software engineering: data models, request/response handling, retries, error handling, validation, logging, and small backend services. This lesson gives you the Python and API patterns that will make every later lesson easier to build and debug.
Nothing here is AI-specific, and that's quite intentional. The skills you'll build transfer directly to every service, tool endpoint, and evaluation harness we'll create throughout the curriculum.
This lesson is intentionally provider-agnostic. Whether you later call OpenAI Platform, Anthropic's developer platform, Hugging Face, or Ollama Cloud, the FastAPI, validation, retry, and testing patterns here stay the same.
What you'll learn
- Define request and response schemas with Pydantic
- Build a minimal FastAPI application with validation and error handling
- Make outbound HTTP calls with httpx, including retries and timeouts
- Write basic tests with pytest for success and failure paths
- Parse JSON and read configuration from environment variables
Concepts
FastAPI: a modern Python web framework for building APIs. It uses Python type hints and Pydantic models to generate validation, serialization, and documentation automatically. In this curriculum, FastAPI is the default for building tool endpoints, agent APIs, and evaluation services.
Pydantic: a data validation library that uses Python type annotations to define schemas. A Pydantic model is a class that validates incoming data and rejects bad shapes before your code runs. You'll use Pydantic models for tool arguments, API responses, run logs, and benchmark records throughout the curriculum.
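A minimal sketch of that behavior (the ToolArgs model is invented for illustration): valid input becomes a typed object, and a bad shape raises ValidationError before any of your logic runs.

```python
from pydantic import BaseModel, ValidationError

# Hypothetical schema for a tool's arguments
class ToolArgs(BaseModel):
    query: str
    max_results: int

# Valid input becomes a typed, attribute-access object
args = ToolArgs(query="open bugs", max_results=5)
print(args.max_results)  # 5

# A bad shape raises before your code ever sees it
try:
    ToolArgs(query="open bugs", max_results="lots")
except ValidationError as e:
    print(len(e.errors()))  # 1 error: max_results is not an integer
```

This is the same mechanism FastAPI uses under the hood when it turns a bad request body into a 422 response.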
httpx: an HTTP client library for Python. You'll use it to call external APIs (model providers, tool services) and your own FastAPI endpoints. It supports async and timeouts natively. For retries, you write your own logic (shown later in this lesson). httpx does not provide a high-level retry API.
pytest: a testing framework for Python. You'll use it to test your endpoints, your tool implementations, and eventually your evaluation pipelines.
Walkthrough
Project setup
Create the project directory and install dependencies:
```bash
mkdir ai-eng-foundations && cd ai-eng-foundations
python -m venv .venv && source .venv/bin/activate
pip install fastapi uvicorn httpx pydantic pytest
```

Create this file structure:
```text
ai-eng-foundations/
├── app.py           # FastAPI application
├── client.py        # httpx client with retry logic (added later in this lesson)
├── test_app.py      # pytest tests
└── requirements.txt
```
requirements.txt:

```text
fastapi
uvicorn
httpx
pydantic
pytest
```
Start with a minimal FastAPI app
Create app.py with three endpoints:
```python
# app.py
import os

from fastapi import FastAPI
from pydantic import BaseModel

# Read configuration from environment variables with sensible defaults
APP_NAME = os.getenv("APP_NAME", "ai-eng-foundations")

app = FastAPI(title=APP_NAME)


# --- Models ---
class EchoRequest(BaseModel):
    message: str


class EchoResponse(BaseModel):
    message: str


class SummarizeRequest(BaseModel):
    title: str
    body: str
    priority: int


class SummarizeResponse(BaseModel):
    field_count: int
    fields: list[str]
    title_length: int


# --- Endpoints ---
@app.get("/health")
def health():
    return {"status": "ok"}


@app.post("/echo", response_model=EchoResponse)
def echo(request: EchoRequest):
    return EchoResponse(message=request.message)


@app.post("/summarize-request", response_model=SummarizeResponse)
def summarize_request(request: SummarizeRequest):
    return SummarizeResponse(
        field_count=3,
        fields=["title", "body", "priority"],
        title_length=len(request.title),
    )
```

Run it:

```bash
uvicorn app:app --reload
```

Test it:
```bash
# In another terminal
curl http://localhost:8000/health
# Expected: {"status":"ok"}

curl -X POST http://localhost:8000/echo \
  -H "Content-Type: application/json" \
  -d '{"message": "hello"}'
# Expected: {"message":"hello"}

curl -X POST http://localhost:8000/summarize-request \
  -H "Content-Type: application/json" \
  -d '{"title": "Bug report", "body": "Something broke", "priority": 1}'
# Expected: {"field_count":3,"fields":["title","body","priority"],"title_length":10}
```

If all three return the expected output, your FastAPI setup is working. Notice that the Pydantic models define the exact shape of the request and response. If you send {"msg": "hello"} to /echo, FastAPI returns a 422 validation error automatically. Try it and see.
Notice the os.getenv() calls at the top of app.py. This is how you read configuration from environment variables: provide a key and a default value. You can override any of them without changing code:
```bash
APP_NAME=my-project uvicorn app:app --reload
```

Every later lesson uses environment variables for API keys, model names, and service URLs. The pattern is always the same: os.getenv("KEY", "default"). Hardcoded configuration in code is a bug waiting to happen, especially for secrets, which should never appear in source files.
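The configuration pattern has a common companion: optional settings get defaults, but required secrets should fail loudly at startup instead of silently running with a placeholder. A sketch (the require_env helper and the variable names are illustrative, not part of app.py):

```python
import os

# Optional setting: a default is fine
MODEL_NAME = os.getenv("MODEL_NAME", "demo-model")  # illustrative name

def require_env(key: str) -> str:
    """Return a required environment variable, or raise at startup."""
    value = os.getenv(key)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {key}")
    return value

# For a real secret you would write:
# API_KEY = require_env("PROVIDER_API_KEY")
```

Failing at import time turns a missing secret into an immediate, obvious error rather than a confusing 401 deep inside a request handler.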
Every later lesson also assumes you define schemas for your data using Pydantic models, not ad-hoc dictionaries. This is the habit to build now.
Add outbound HTTP calls with timeout and retry
Create client.py to call your own server and an external API with timeout and retry logic:
```python
# client.py
import time

import httpx

# Status codes worth retrying — transient failures only
RETRYABLE_STATUS_CODES = {429, 500, 502, 503, 504}


def call_with_retry(method, url, max_attempts=3, timeout=5.0, **kwargs):
    """Make an HTTP call with timeout and exponential backoff.

    max_attempts is the total number of tries (including the first).
    Only retries on transient failures: timeouts, connection errors,
    and specific HTTP status codes (429, 5xx). Client errors like
    400, 401, 404 fail immediately — retrying those is pointless.
    """
    for attempt in range(max_attempts):
        try:
            response = httpx.request(method, url, timeout=timeout, **kwargs)

            # Client errors (4xx except 429) — fail fast, do not retry
            if 400 <= response.status_code < 500 and response.status_code not in RETRYABLE_STATUS_CODES:
                response.raise_for_status()

            # Retryable server/rate-limit errors
            if response.status_code in RETRYABLE_STATUS_CODES:
                if attempt == max_attempts - 1:
                    response.raise_for_status()
                wait = (2 ** attempt) + 0.1
                print(f"  Attempt {attempt + 1}: got {response.status_code}. Retrying in {wait:.1f}s...")
                time.sleep(wait)
                continue

            return response
        except (httpx.TimeoutException, httpx.ConnectError) as e:
            if attempt == max_attempts - 1:
                raise
            wait = (2 ** attempt) + 0.1
            print(f"  Attempt {attempt + 1} failed: {e}. Retrying in {wait:.1f}s...")
            time.sleep(wait)


# --- Call your own server ---
print("=== Calling /health ===")
r = call_with_retry("GET", "http://localhost:8000/health")
print(r.json())

print("\n=== Calling /echo ===")
r = call_with_retry("POST", "http://localhost:8000/echo", json={"message": "hello from client"})
print(r.json())

# --- Call an external API ---
print("\n=== Calling external API (JSONPlaceholder) ===")
r = call_with_retry("GET", "https://jsonplaceholder.typicode.com/todos/1")
print(r.json())

# --- Demonstrate timeout behavior ---
print("\n=== Demonstrating timeout (this should fail) ===")
try:
    # httpbin delays 10s, but our timeout is 2s
    call_with_retry("GET", "https://httpbin.org/delay/10", timeout=2.0, max_attempts=2)
except httpx.TimeoutException:
    print("  Timed out as expected after the final attempt.")
```

Run it (with your FastAPI server still running in another terminal):

```bash
python client.py
```

Expected output:
```text
=== Calling /health ===
{'status': 'ok'}

=== Calling /echo ===
{'message': 'hello from client'}

=== Calling external API (JSONPlaceholder) ===
{'userId': 1, 'id': 1, 'title': 'delectus aut autem', 'completed': False}

=== Demonstrating timeout (this should fail) ===
  Attempt 1 failed: ... Retrying in 1.1s...
  Timed out as expected after the final attempt.
```
The call_with_retry pattern (timeout on every call, exponential backoff on failure, hard stop after N attempts) will recur in every lesson that calls a model API or external service.
Add validation and error handling
Validation is already partially working. Pydantic catches missing and wrong-typed fields automatically. Verify by sending bad input:
```bash
# Missing required field
curl -X POST http://localhost:8000/echo \
  -H "Content-Type: application/json" \
  -d '{"wrong_field": "hello"}'
# Expected: 422 status with validation error detail, NOT a 500 server error

# Wrong field type
curl -X POST http://localhost:8000/summarize-request \
  -H "Content-Type: application/json" \
  -d '{"title": "Bug", "body": "Broken", "priority": "not-a-number"}'
# Expected: 422 — priority must be an integer
```

Both should return a 422 with a structured error body showing exactly which field failed and why. This is Pydantic doing the work; you did not write any error-handling code for these cases.
The goal is not exhaustive error handling. The goal is establishing the habit: define the expected shape, validate it, and return useful errors when the shape is wrong.
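Type checks are only the first layer of this habit. When you need range or length rules as well, Pydantic's Field expresses them declaratively. A sketch (StrictSummarizeRequest and its limits are invented for illustration, not part of app.py):

```python
from pydantic import BaseModel, Field, ValidationError

class StrictSummarizeRequest(BaseModel):
    title: str = Field(min_length=1, max_length=200)
    body: str = Field(min_length=1)
    priority: int = Field(ge=1, le=5)  # only priorities 1-5 are valid

try:
    # Empty title and out-of-range priority both violate the constraints
    StrictSummarizeRequest(title="", body="x", priority=9)
except ValidationError as e:
    print(len(e.errors()))  # 2: title and priority both failed
```

Used in a FastAPI endpoint, these constraints produce the same structured 422 responses as the type checks above, with no extra handler code.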
Write tests
Create test_app.py:
```python
# test_app.py
from fastapi.testclient import TestClient

from app import app

client = TestClient(app)


def test_health():
    response = client.get("/health")
    assert response.status_code == 200
    assert response.json() == {"status": "ok"}


def test_echo_success():
    response = client.post("/echo", json={"message": "hello"})
    assert response.status_code == 200
    assert response.json() == {"message": "hello"}


def test_echo_missing_field():
    response = client.post("/echo", json={"wrong_field": "hello"})
    assert response.status_code == 422  # Pydantic validation error, not 500
```

Run:

```bash
pytest test_app.py -v
```

Expected output:
```text
test_app.py::test_health PASSED
test_app.py::test_echo_success PASSED
test_app.py::test_echo_missing_field PASSED
```
FastAPI's TestClient runs the server in-process, so there's no need to start uvicorn separately. The 422 status code on the missing-field test confirms that Pydantic catches the validation error and returns a structured error response, not a crash.
Exercises
- Build the FastAPI app described above (/health, /echo, /summarize-request) with Pydantic request/response models.
- Use httpx to call your server and one external API. Add timeout and retry logic.
- Add request validation and error handling for missing fields, wrong types, and outbound failures.
- Write two pytest tests: one success path, one invalid input path.
Completion checkpoint
You can:
- Run your FastAPI app and hit all three endpoints successfully
- Show a Pydantic model that validates a request body and rejects bad input
- Show an httpx call with timeout and retry logic
- Run pytest and see both tests pass
What's next
LLM Mental Models. You have the project scaffold now; before you call a model from code, get clear on tokens, context, and inference so the rest of Module 1 does not feel magical.
References
Start here
- FastAPI Tutorial — walk through this if you have not used FastAPI before; it covers everything this lesson needs
Build with this
- Pydantic docs — reference for model definitions, validators, and serialization
- httpx docs — reference for async HTTP calls and timeouts (retries are custom logic, as shown in this lesson)
Deep dive
- FastAPI full docs — dependency injection, middleware, background tasks, and other features you'll use in later lessons