Module 1: Foundations of AI Engineering

Python and FastAPI

A significant amount of AI engineering is ordinary software engineering: data models, request/response handling, retries, error handling, validation, logging, and small backend services. This lesson gives you the Python and API patterns that will make every later lesson easier to build and debug.

Nothing here is AI-specific, and that's quite intentional. The skills you'll build transfer directly to every service, tool endpoint, and evaluation harness we'll create throughout the curriculum.

This lesson is intentionally provider-agnostic. Whether you later call OpenAI Platform, Anthropic's developer platform, Hugging Face, or Ollama Cloud, the FastAPI, validation, retry, and testing patterns here stay the same.

What you'll learn

  • Define request and response schemas with Pydantic
  • Build a minimal FastAPI application with validation and error handling
  • Make outbound HTTP calls with httpx, including retries and timeouts
  • Write basic tests with pytest for success and failure paths
  • Parse JSON and read configuration from environment variables

Concepts

FastAPI: a modern Python web framework for building APIs. It uses Python type hints and Pydantic models to generate validation, serialization, and documentation automatically. In this curriculum, FastAPI is the default for building tool endpoints, agent APIs, and evaluation services.

Pydantic: a data validation library that uses Python type annotations to define schemas. A Pydantic model is a class that validates incoming data and rejects bad shapes before your code runs. You'll use Pydantic models for tool arguments, API responses, run logs, and benchmark records throughout the curriculum.
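
For a quick standalone illustration (a sketch assuming Pydantic v2; the ToolArgs model is hypothetical, not part of this project), a model rejects bad shapes before any of your code runs:

from pydantic import BaseModel, ValidationError

class ToolArgs(BaseModel):
    query: str
    max_results: int

ToolArgs(query="find login bug", max_results=5)  # valid data passes

try:
    # "lots" cannot be coerced to an int, so validation fails
    ToolArgs.model_validate({"query": "find login bug", "max_results": "lots"})
except ValidationError as exc:
    print(exc)  # names the failing field and the reason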

httpx: an HTTP client library for Python. You'll use it to call external APIs (model providers, tool services) and your own FastAPI endpoints. It supports synchronous and asynchronous calls and has first-class timeout support. It does not provide a high-level retry API, so the retry logic later in this lesson is written by hand.
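
Since the walkthrough below uses only the synchronous API, here is a minimal sketch of the async style for reference (the URL is a placeholder):

import asyncio
import httpx

async def fetch_json(url: str) -> dict:
    # AsyncClient accepts the same timeout settings as the sync API
    async with httpx.AsyncClient(timeout=5.0) as client:
        response = await client.get(url)
        response.raise_for_status()
        return response.json()

# asyncio.run(fetch_json("https://example.com/api"))  # placeholder URL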

pytest: a testing framework for Python. You'll use it to test your endpoints, your tool implementations, and eventually your evaluation pipelines.

Walkthrough

Project setup

Create the project directory and install dependencies:

mkdir ai-eng-foundations && cd ai-eng-foundations
python -m venv .venv && source .venv/bin/activate
pip install fastapi uvicorn httpx pydantic pytest

Create this file structure:

ai-eng-foundations/
├── app.py              # FastAPI application
├── client.py           # httpx client with retry logic (added later in this lesson)
├── test_app.py         # pytest tests
└── requirements.txt

requirements.txt:

fastapi
uvicorn
httpx
pydantic
pytest

Start with a minimal FastAPI app

Create app.py with three endpoints:

# app.py
import os
from fastapi import FastAPI
from pydantic import BaseModel

# Read configuration from environment variables with sensible defaults
APP_NAME = os.getenv("APP_NAME", "ai-eng-foundations")

app = FastAPI(title=APP_NAME)


# --- Models ---

class EchoRequest(BaseModel):
    message: str

class EchoResponse(BaseModel):
    message: str

class SummarizeRequest(BaseModel):
    title: str
    body: str
    priority: int

class SummarizeResponse(BaseModel):
    field_count: int
    fields: list[str]
    title_length: int


# --- Endpoints ---

@app.get("/health")
def health():
    return {"status": "ok"}

@app.post("/echo", response_model=EchoResponse)
def echo(request: EchoRequest):
    return EchoResponse(message=request.message)

@app.post("/summarize-request", response_model=SummarizeResponse)
def summarize_request(request: SummarizeRequest):
    return SummarizeResponse(
        field_count=3,
        fields=["title", "body", "priority"],
        title_length=len(request.title),
    )

Run it:

uvicorn app:app --reload

Test it:

# In another terminal
curl http://localhost:8000/health
# Expected: {"status":"ok"}

curl -X POST http://localhost:8000/echo \
  -H "Content-Type: application/json" \
  -d '{"message": "hello"}'
# Expected: {"message":"hello"}

curl -X POST http://localhost:8000/summarize-request \
  -H "Content-Type: application/json" \
  -d '{"title": "Bug report", "body": "Something broke", "priority": 1}'
# Expected: {"field_count":3,"fields":["title","body","priority"],"title_length":10}

If all three return the expected output, your FastAPI setup is working. Notice that the Pydantic models define the exact shape of the request and response. If you send {"msg": "hello"} to /echo, FastAPI returns a 422 validation error automatically. Try it and see.

Notice the os.getenv() call at the top of app.py. This is how you read configuration from environment variables: provide a key and a default value. You can override the value at launch without changing code:

APP_NAME=my-project uvicorn app:app --reload

Every later lesson uses environment variables for API keys, model names, and service URLs. The pattern is always the same: os.getenv("KEY", "default"). Hardcoded configuration in code is a bug waiting to happen, especially for secrets, which should never appear in source files.
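
For secrets, a common variant is to fail fast at startup instead of falling back to a default. A sketch (the variable name PROVIDER_API_KEY is hypothetical):

import os

# Hypothetical variable name; secrets get no default value
api_key = os.getenv("PROVIDER_API_KEY")
if not api_key:
    raise RuntimeError("PROVIDER_API_KEY is not set; export it before starting the app")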

Every later lesson also assumes you define schemas for your data using Pydantic models, not ad-hoc dictionaries. This is the habit to build now.

Add outbound HTTP calls with timeout and retry

Create client.py to call your own server and an external API with timeout and retry logic:

# client.py
import httpx
import time


# Status codes worth retrying — transient failures only
RETRYABLE_STATUS_CODES = {429, 500, 502, 503, 504}


def call_with_retry(method, url, max_attempts=3, timeout=5.0, **kwargs):
    """Make an HTTP call with timeout and exponential backoff.

    max_attempts is the total number of tries (including the first).
    Only retries on transient failures: timeouts, connection errors,
    and specific HTTP status codes (429, 5xx). Client errors like
    400, 401, 404 fail immediately — retrying those is pointless.
    """
    for attempt in range(max_attempts):
        try:
            response = httpx.request(method, url, timeout=timeout, **kwargs)

            # Client errors (4xx except 429) — fail fast, do not retry
            if 400 <= response.status_code < 500 and response.status_code not in RETRYABLE_STATUS_CODES:
                response.raise_for_status()

            # Retryable server/rate-limit errors
            if response.status_code in RETRYABLE_STATUS_CODES:
                if attempt == max_attempts - 1:
                    response.raise_for_status()
                wait = (2 ** attempt) + 0.1
                print(f"  Attempt {attempt + 1}: got {response.status_code}. Retrying in {wait:.1f}s...")
                time.sleep(wait)
                continue

            return response

        except (httpx.TimeoutException, httpx.ConnectError) as e:
            if attempt == max_attempts - 1:
                raise
            wait = (2 ** attempt) + 0.1
            print(f"  Attempt {attempt + 1} failed: {e}. Retrying in {wait:.1f}s...")
            time.sleep(wait)


# --- Call your own server ---
print("=== Calling /health ===")
r = call_with_retry("GET", "http://localhost:8000/health")
print(r.json())

print("\n=== Calling /echo ===")
r = call_with_retry("POST", "http://localhost:8000/echo", json={"message": "hello from client"})
print(r.json())

# --- Call an external API ---
print("\n=== Calling external API (JSONPlaceholder) ===")
r = call_with_retry("GET", "https://jsonplaceholder.typicode.com/todos/1")
print(r.json())

# --- Demonstrate timeout behavior ---
print("\n=== Demonstrating timeout (this should fail) ===")
try:
    # httpbin delays 10s, but our timeout is 2s
    call_with_retry("GET", "https://httpbin.org/delay/10", timeout=2.0, max_attempts=2)
except httpx.TimeoutException:
    print("  Timed out as expected after the final attempt.")

Run it (with your FastAPI server still running in another terminal):

python client.py

Expected output:

=== Calling /health ===
{'status': 'ok'}

=== Calling /echo ===
{'message': 'hello from client'}

=== Calling external API (JSONPlaceholder) ===
{'userId': 1, 'id': 1, 'title': 'delectus aut autem', 'completed': False}

=== Demonstrating timeout (this should fail) ===
  Attempt 1 failed: ... Retrying in 1.1s...
  Timed out as expected after the final attempt.

The call_with_retry pattern (timeout on every call, exponential backoff on failure, hard stop after N attempts) will recur in every lesson that calls a model API or external service.
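
When you reach the lessons that call a model provider, the same helper wraps those requests. A sketch only: the endpoint, header, and payload below are placeholders, not a specific provider's API, and api_key is assumed to come from an environment variable as shown earlier.

# Placeholder endpoint and payload for illustration only
response = call_with_retry(
    "POST",
    "https://api.example-provider.com/v1/generate",
    headers={"Authorization": f"Bearer {api_key}"},
    json={"prompt": "Summarize this bug report", "max_tokens": 256},
    timeout=30.0,     # model calls take longer than a health check
    max_attempts=3,
)
print(response.json())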

Add validation and error handling

Much of the validation is already in place: Pydantic catches missing fields and wrong types automatically. Verify by sending bad input:

# Missing required field
curl -X POST http://localhost:8000/echo \
  -H "Content-Type: application/json" \
  -d '{"wrong_field": "hello"}'
# Expected: 422 status with validation error detail, NOT a 500 server error

# Wrong field type
curl -X POST http://localhost:8000/summarize-request \
  -H "Content-Type: application/json" \
  -d '{"title": "Bug", "body": "Broken", "priority": "not-a-number"}'
# Expected: 422 — priority must be an integer

Both should return a 422 with a structured error body showing exactly which field failed and why. This is Pydantic doing the work; you did not write any error-handling code for these cases.

The goal is not exhaustive error handling. The goal is establishing the habit: define the expected shape, validate it, and return useful errors when the shape is wrong.
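
When you do want Pydantic to enforce more than presence and type, field constraints are the next step. A sketch (the bounds are illustrative, not requirements of the app above):

from pydantic import BaseModel, Field

class SummarizeRequest(BaseModel):
    title: str = Field(min_length=1, max_length=200)
    body: str = Field(min_length=1)
    priority: int = Field(ge=1, le=5)  # illustrative bounds

With constraints like these, an empty title or a priority of 0 also comes back as a 422 that names the failing field.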

Write tests

Create test_app.py:

# test_app.py
from fastapi.testclient import TestClient
from app import app

client = TestClient(app)


def test_health():
    response = client.get("/health")
    assert response.status_code == 200
    assert response.json() == {"status": "ok"}


def test_echo_success():
    response = client.post("/echo", json={"message": "hello"})
    assert response.status_code == 200
    assert response.json() == {"message": "hello"}


def test_echo_missing_field():
    response = client.post("/echo", json={"wrong_field": "hello"})
    assert response.status_code == 422  # Pydantic validation error, not 500

Run:

pytest test_app.py -v

Expected output:

test_app.py::test_health PASSED
test_app.py::test_echo_success PASSED
test_app.py::test_echo_missing_field PASSED

FastAPI's TestClient runs the server in-process, so there's no need to start uvicorn separately. The 422 status code on the missing-field test confirms that Pydantic catches the validation error and returns a structured error response, not a crash.
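
To cover more invalid shapes without writing a separate test for each, pytest's parametrize helps. A sketch that could be appended to test_app.py (the payloads are illustrative):

import pytest

@pytest.mark.parametrize("payload", [
    {},                          # missing the required field entirely
    {"wrong_field": "hello"},    # unknown field instead of message
    {"message": None},           # null is not a valid string
])
def test_echo_rejects_bad_payloads(payload):
    response = client.post("/echo", json=payload)
    assert response.status_code == 422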

Exercises

  1. Build the FastAPI app described above (/health, /echo, /summarize-request) with Pydantic request/response models.
  2. Use httpx to call your server and one external API. Add timeout and retry logic.
  3. Add request validation and error handling for missing fields, wrong types, and outbound failures.
  4. Write two pytest tests: one success path, one invalid input path.

Completion checkpoint

You can:

  • Run your FastAPI app and hit all three endpoints successfully
  • Show a Pydantic model that validates a request body and rejects bad input
  • Show an httpx call with timeout and retry logic
  • Run pytest and see both tests pass

What's next

LLM Mental Models. You have the project scaffold now; before you call a model from code, get clear on tokens, context, and inference so the rest of Module 1 does not feel magical.

References

Start here

  • FastAPI Tutorial — walk through this if you have not used FastAPI before; it covers everything this lesson needs

Build with this

  • Pydantic docs — reference for model definitions, validators, and serialization
  • httpx docs — reference for async HTTP calls and timeouts (retries are custom logic, as shown in this lesson)

Deep dive

  • FastAPI full docs — dependency injection, middleware, background tasks, and other features you'll use in later lessons