Getting Started

A flexible, schema-driven fake data generator built on top of Pydantic v2.

Documentation · GitHub · PyPI

Generate realistic fake data for your Pydantic models with ease. Perfect for testing, prototyping, and anywhere you need valid mock data.

⚡ Quick Example

from typing import Annotated, List, Set, Literal
from pydantic import BaseModel, Field
from pyfake import fake
from rich import print


class Playlist(BaseModel):
    track_ids: List[int]
    genre: Literal["rock", "pop", "jazz"]  # one of the three listed values
    tags: Annotated[List[str], Field(min_length=2, max_length=5)]  # 2 to 5 items
    unique_ratings: Set[int]


# as_dict=True asks for the result as a plain dict
result = fake(Playlist, as_dict=True)
print(result)

python example.py
{
"track_ids": [28, 25, 95, 40],
"genre": "pop",
"tags": ["CJKHILHXTN", "qkhhjDJYiV"],
"unique_ratings": {17, 49}
}
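
As a quick sanity check, the constraints declared on `Playlist` can be verified by hand against the sample output above. The sketch below uses only the standard library. `check_playlist` is a hypothetical helper written for this illustration and is not part of pyfake.

```python
# Sample output copied from the run above.
result = {
    "track_ids": [28, 25, 95, 40],
    "genre": "pop",
    "tags": ["CJKHILHXTN", "qkhhjDJYiV"],
    "unique_ratings": {17, 49},
}


def check_playlist(data):
    """Return a list of constraint violations (empty if the data is valid).

    Hypothetical helper that mirrors the Playlist field definitions by hand.
    """
    problems = []
    if not all(isinstance(i, int) for i in data["track_ids"]):
        problems.append("track_ids must contain only integers")
    if data["genre"] not in {"rock", "pop", "jazz"}:
        problems.append("genre must be 'rock', 'pop' or 'jazz'")
    if not 2 <= len(data["tags"]) <= 5:
        problems.append("tags must contain between 2 and 5 items")
    if not all(isinstance(t, str) for t in data["tags"]):
        problems.append("tags must contain only strings")
    if not isinstance(data["unique_ratings"], set):
        problems.append("unique_ratings must be a set")
    return problems


print(check_playlist(result))  # prints []
```

In a real test suite, `Playlist.model_validate(result)` runs the equivalent checks through Pydantic itself.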

✨ Why Pyfake?

Problem	Most fake data generators	Pyfake
Random but not structured	❌ Generates random data without understanding the schema	✅ Reads your Pydantic models to produce structured, schema-aware data
Structured but not realistic	❌ Generates data that fits the schema but isn't realistic (e.g. random strings for names)	✅ Uses intelligent generators to produce realistic fake data (e.g. names, addresses)
Hard to extend	❌ Difficult to add custom generators or handle complex types	✅ Easily extensible with a flexible generator registry and schema resolution system
Support for constraints	❌ Ignores field constraints like `min_length`, `gt`, `multiple_of`	✅ Respects all Pydantic field constraints when generating data
Support for Python primitive types	❌ Limited support for complex types like `Decimal`, `UUID`, `datetime`	✅ Full support for Python primitives, including `Decimal`, `UUID`, `datetime`, and more
Reproducibility	❌ No built-in way to generate the same fake data across runs	✅ Supports seeding for reproducible fake data generation
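
To make the constraint and reproducibility rows concrete, here is a minimal standard-library sketch of constraint-aware, seeded generation. `constrained_int` is a hypothetical function for illustration. Its parameter names borrow Pydantic's `gt`, `lt`, and `multiple_of`, and it does not show pyfake's actual implementation or seeding API.

```python
import random


def constrained_int(rng, gt=None, lt=None, multiple_of=None):
    """Pick a random integer honoring exclusive bounds and a step.

    Hypothetical helper; assumes multiple_of, when given, is positive.
    """
    # Turn exclusive bounds into inclusive ones; default to a window
    # of roughly 100 values when a bound is missing.
    if gt is not None:
        low = gt + 1
    elif lt is not None:
        low = lt - 101
    else:
        low = 0
    high = lt - 1 if lt is not None else low + 100
    step = multiple_of or 1
    first = -(-low // step) * step  # smallest multiple of step >= low
    last = (high // step) * step    # largest multiple of step <= high
    if first > last:
        raise ValueError("no integer satisfies the constraints")
    return rng.randrange(first, last + 1, step)


# The same seed yields the same value on every run.
a = constrained_int(random.Random(42), gt=10, lt=50, multiple_of=5)
b = constrained_int(random.Random(42), gt=10, lt=50, multiple_of=5)
print(a == b)  # prints True
```

Passing an explicit `random.Random` instance, rather than relying on the module-level generator, keeps the seed local to each call site.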

Installation

pyfake is a lightweight and flexible fake data generation library built for developers, data scientists, and testers. You can install it using your preferred Python package manager.

uv (Recommended) · pip · GitHub

uv is a fast Python package manager and installer. It is the recommended way to install pyfake for better performance and dependency resolution.

# Add pyfake to your project
uv add pyfake

Non-project environment

If you're working in a non-project environment, you can use uv to install pyfake directly:

uv pip install pyfake

💡 uv is typically much faster than pip, and `uv add` creates and manages the project's virtual environment for you.

Alternatively, install pyfake with pip, the standard Python package manager:

pip install pyfake

To install a specific version:

pip install pyfake==0.0.x

If you're working in a virtual environment (recommended):

python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install pyfake

If you want the latest development version or plan to contribute, you can install directly from GitHub:

pip install git+https://github.com/Mukhopadhyay/pyfake.git

For local development

If you want to clone the repository and install in editable mode for development:

git clone https://github.com/Mukhopadhyay/pyfake.git
cd pyfake
pip install -e .

This installs pyfake in editable mode, so changes to the source code are reflected immediately.

Requirements

Python 3.8+
Pydantic v2+
NumPy
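
After installing, a short script can confirm that the environment meets the requirements listed above. This is a sketch using only the standard library. `installed_version` and `check_environment` are hypothetical helper names, and the distribution names are the ones used in the install commands and on PyPI.

```python
import sys
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_environment():
    """Collect a small report on the requirements listed above."""
    return {
        "python_ok": sys.version_info >= (3, 8),
        "pyfake": installed_version("pyfake"),
        "pydantic": installed_version("pydantic"),
        "numpy": installed_version("numpy"),
    }


print(check_environment())
```

A value of `None` for any package means it is not installed in the active environment.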

How It Works

Pyfake reads your Pydantic schema and:

Inspects field types and constraints
Applies intelligent generators
Produces validated fake data

flowchart LR
    A[Pydantic Model] --> B[Schema Parser]
    B --> C[Generator Engine]
    C --> D[Validated Fake Data]
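
The three steps above can be sketched with the standard library alone. This is an illustrative toy that uses a dataclass in place of a Pydantic model. `GENERATORS`, `generate_value`, `is_valid`, and `fake_instance` are hypothetical names, and nothing here reflects pyfake's internals.

```python
import random
import string
from dataclasses import dataclass, fields
from typing import List, Set, get_args, get_origin, get_type_hints

# Hypothetical registry: maps a Python type to a generator function.
GENERATORS = {
    int: lambda rng: rng.randint(0, 100),
    float: lambda rng: rng.uniform(0.0, 1.0),
    bool: lambda rng: rng.random() < 0.5,
    str: lambda rng: "".join(rng.choice(string.ascii_letters) for _ in range(10)),
}


def generate_value(tp, rng):
    """Step 2: pick a generator based on the inspected type."""
    origin = get_origin(tp)
    if origin in (list, set):
        (item_type,) = get_args(tp)
        size = rng.randint(1, 5)
        return origin(generate_value(item_type, rng) for _ in range(size))
    return GENERATORS[tp](rng)


def is_valid(value, tp):
    """Step 3: check that a generated value matches its declared type."""
    origin = get_origin(tp)
    if origin in (list, set):
        (item_type,) = get_args(tp)
        return isinstance(value, origin) and all(is_valid(v, item_type) for v in value)
    return isinstance(value, tp)


def fake_instance(cls, seed=None):
    """Run all three steps for a dataclass and return an instance."""
    rng = random.Random(seed)
    hints = get_type_hints(cls)  # Step 1: inspect field types
    values = {f.name: generate_value(hints[f.name], rng) for f in fields(cls)}
    for name, value in values.items():
        if not is_valid(value, hints[name]):
            raise TypeError(f"generated value for {name!r} failed validation")
    return cls(**values)


@dataclass
class Track:
    track_id: int
    title: str
    ratings: List[int]
    tags: Set[str]


print(fake_instance(Track, seed=7))
```

Keeping generators in a plain dictionary means supporting a new type is a one-line addition, which is the general idea behind a generator registry.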

Read more about the design and implementation in the How It Works section.