Getting Started

Pyfake

A flexible, schema-driven fake data generator built on top of Pydantic v2.

Documentation · GitHub · PyPI

Generate realistic fake data for your Pydantic models with ease. Perfect for testing, prototyping, and anywhere you need valid mock data.


⚡ Quick Example

from typing import Annotated, List, Set, Literal
from pydantic import BaseModel, Field
from pyfake import fake
from rich import print


class Playlist(BaseModel):
    track_ids: List[int]
    genre: Literal["rock", "pop", "jazz"]
    tags: Annotated[List[str], Field(min_length=2, max_length=5)]
    unique_ratings: Set[int]

result = fake(Playlist, as_dict=True)
print(result)
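
Running the script prints a result like the following (the values are random, so they differ between runs):

python example.py
{
    "track_ids": [28, 25, 95, 40],
    "genre": "pop",
    "tags": ["CJKHILHXTN", "qkhhjDJYiV"],
    "unique_ratings": {17, 49}
}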

✨ Why Pyfake?

| Problem | Most fake data generators | Pyfake |
| --- | --- | --- |
| Random but not structured | ❌ Generates random data without understanding the schema | ✅ Reads your Pydantic models to produce structured, schema-aware data |
| Structured but not realistic | ❌ Generates data that fits the schema but isn't realistic (e.g. random strings for names) | ✅ Uses intelligent generators to produce realistic fake data (e.g. names, addresses) |
| Hard to extend | ❌ Difficult to add custom generators or handle complex types | ✅ Easily extensible with a flexible generator registry and schema resolution system |
| Support for constraints | ❌ Ignores field constraints like min_length, gt, multiple_of | ✅ Respects all Pydantic field constraints when generating data |
| Support for Python primitive types | ❌ Limited support for complex types like Decimal, UUID, datetime | ✅ Full support for Python primitives, including Decimal, UUID, datetime, and more |
| Reproducibility | ❌ No built-in way to generate the same fake data across runs | ✅ Supports seeding for reproducible fake data generation |
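
The reproducibility row rests on a general idea: a pseudo-random generator started from the same seed yields the same sequence. The sketch below shows that idea with the standard library only. It does not use Pyfake's seeding API, which this page does not show; `sample_ratings` is a hypothetical helper.

```python
import random
from typing import List


def sample_ratings(seed: int, count: int = 4) -> List[int]:
    """Return `count` pseudo-random ratings derived from `seed`."""
    # A private Random instance keeps the global random state untouched.
    rng = random.Random(seed)
    return [rng.randint(1, 100) for _ in range(count)]


first = sample_ratings(seed=42)
second = sample_ratings(seed=42)

# Same seed, same data: this is what makes seeded test fixtures repeatable.
print(first == second)  # True
```

Fixing the seed in a test suite means a failing case can be replayed with exactly the same generated input.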

Installation

pyfake is a lightweight and flexible fake data generation library built for developers, data scientists, and testers. You can install it using your preferred Python package manager.

uv is a fast Python package manager and installer. It is the recommended way to install pyfake for better performance and dependency resolution.

# Add pyfake to your project
uv add pyfake

Non-project environment

If you're working in a non-project environment, you can use uv to install pyfake directly:

uv pip install pyfake

💡 uv is typically much faster than pip, and uv add creates and manages the project's virtual environment for you.

Alternatively, you can install pyfake with pip, the standard Python package manager:

pip install pyfake

To install a specific version (replace 0.0.x with the release you need):

pip install pyfake==0.0.x

If you're working in a virtual environment (recommended):

python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install pyfake

If you want the latest development version or plan to contribute, you can install directly from GitHub:

pip install git+https://github.com/Mukhopadhyay/pyfake.git

For local development

If you want to clone the repository and install in editable mode for development:

git clone https://github.com/Mukhopadhyay/pyfake.git
cd pyfake
pip install -e .

This installs pyfake in editable mode, so changes to the source code are reflected immediately.
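
Whichever install route you pick, you may want to confirm that the package landed in the active environment. The snippet below is a small sketch using only the standard library's `importlib.metadata`; `installed_version` is a hypothetical helper, and `pyfake` is assumed to be the distribution name because that is what the install commands above use.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# "pyfake" is the distribution name taken from the install commands.
found = installed_version("pyfake")
if found:
    print(f"pyfake {found} is installed")
else:
    print("pyfake is not installed in this environment")
```

If the second message appears inside a virtual environment, check that the environment was activated before running the install command.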

Requirements

  • Python 3.8+
  • Pydantic v2+
  • NumPy

How It Works

Pyfake reads your Pydantic schema and:

  • Inspects field types and constraints
  • Applies intelligent generators
  • Produces validated fake data

The same flow as a Mermaid diagram:

flowchart LR
    A[Pydantic Model] --> B[Schema Parser]
    B --> C[Generator Engine]
    C --> D[Validated Fake Data]
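
To make the three steps concrete, here is a deliberately small sketch of the same pipeline shape. It is not Pyfake's implementation: it uses standard-library dataclasses instead of Pydantic so it runs without dependencies, and `GENERATORS`, `generate`, and `fake_instance` are hypothetical names invented for illustration. Length constraints are read from dataclass field metadata as a stand-in for Pydantic's `Field(min_length=..., max_length=...)`.

```python
import random
import string
from dataclasses import dataclass, field, fields
from typing import Any, Callable, Dict, List, Optional, get_args, get_origin, get_type_hints

# Hypothetical generator registry: one value-producing function per leaf type.
GENERATORS: Dict[type, Callable[[random.Random], Any]] = {
    int: lambda rng: rng.randint(0, 100),
    str: lambda rng: "".join(rng.choice(string.ascii_letters) for _ in range(10)),
}


def generate(tp: Any, rng: random.Random, min_length: int = 1, max_length: int = 5) -> Any:
    """Pick a generator for the inspected type and produce one value."""
    if get_origin(tp) is list:
        (item_type,) = get_args(tp)
        size = rng.randint(min_length, max_length)
        return [generate(item_type, rng) for _ in range(size)]
    return GENERATORS[tp](rng)


def fake_instance(cls: type, seed: Optional[int] = None) -> Any:
    """Inspect a dataclass, generate each field, and check length constraints."""
    rng = random.Random(seed)
    hints = get_type_hints(cls)  # step 1: inspect field types
    values = {}
    for f in fields(cls):
        low = f.metadata.get("min_length", 1)  # step 1: inspect constraints
        high = f.metadata.get("max_length", 5)
        value = generate(hints[f.name], rng, low, high)  # step 2: apply generators
        # Step 3: validate before handing the data back.
        if isinstance(value, list) and not low <= len(value) <= high:
            raise ValueError(f"{f.name} violates its length constraint")
        values[f.name] = value
    return cls(**values)


@dataclass
class Playlist:
    track_ids: List[int]
    tags: List[str] = field(metadata={"min_length": 2, "max_length": 5})


playlist = fake_instance(Playlist, seed=7)
print(playlist)
```

The registry keeps type-to-generator mapping in one place, so supporting a new leaf type in this sketch means adding one entry rather than touching the traversal logic.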

Read more about the design and implementation in the How It Works section.