Getting Started
A flexible, schema-driven fake data generator built on top of Pydantic v2.
Documentation · GitHub · PyPI
Generate realistic fake data for your Pydantic models with ease. Perfect for testing, prototyping, and anywhere you need valid mock data.
⚡ Quick Example
from typing import Annotated, List, Set, Literal

from pydantic import BaseModel, Field
from pyfake import fake
from rich import print


class Playlist(BaseModel):
    track_ids: List[int]
    genre: Literal["rock", "pop", "jazz"]
    tags: Annotated[List[str], Field(min_length=2, max_length=5)]
    unique_ratings: Set[int]


result = fake(Playlist, as_dict=True)
print(result)
{
    "track_ids": [28, 25, 95, 40],
    "genre": "pop",
    "tags": ["CJKHILHXTN", "qkhhjDJYiV"],
    "unique_ratings": {17, 49}
}
✨ Why Pyfake?
| Problem | Most fake data generators | Pyfake |
|---|---|---|
| Random but not structured | ❌ Generates random data without understanding the schema | ✅ Reads your Pydantic models to produce structured, schema-aware data |
| Structured but not realistic | ❌ Generates data that fits the schema but isn't realistic (e.g. random strings for names) | ✅ Uses intelligent generators to produce realistic fake data (e.g. names, addresses) |
| Hard to extend | ❌ Difficult to add custom generators or handle complex types | ✅ Easily extensible with a flexible generator registry and schema resolution system |
| Support for constraints | ❌ Ignores field constraints like min_length, gt, multiple_of | ✅ Respects all Pydantic field constraints when generating data |
| Support for Python primitive types | ❌ Limited support for complex types like Decimal, UUID, datetime | ✅ Full support for Python primitives, including Decimal, UUID, datetime, and more |
| Reproducibility | ❌ No built-in way to generate the same fake data across runs | ✅ Supports seeding for reproducible fake data generation |
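
The reproducibility row relies on seeded random generation. Pyfake's own seeding API is not shown in this section, so the sketch below uses only the standard library to illustrate the general idea: a generator created from a fixed seed yields the same sequence on every run. The function name `fake_ratings` is hypothetical and not part of Pyfake.

```python
import random
from typing import List


def fake_ratings(seed: int, count: int = 3) -> List[int]:
    """Return `count` pseudo-random ratings in 1..100 from a seeded generator.

    Hypothetical helper for illustration only; not part of Pyfake.
    """
    # A private Random instance keeps the global random state untouched.
    rng = random.Random(seed)
    return [rng.randint(1, 100) for _ in range(count)]


first = fake_ratings(seed=42)
second = fake_ratings(seed=42)
print(first == second)  # True: same seed, same sequence
```

Using a dedicated `random.Random` instance rather than `random.seed()` avoids interfering with other code that draws from the global generator.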
Installation
pyfake is a lightweight and flexible fake data generation library built for developers, data scientists, and testers. You can install it using your preferred Python package manager.
uv is a fast Python package manager and installer. It is the recommended way to install pyfake for better performance and dependency resolution.
# Add pyfake to your project
uv add pyfake
Non-project environment
If you're working in a non-project environment, you can use uv to install pyfake directly:
uv pip install pyfake
💡 uv is significantly faster than pip and handles virtual environments automatically.
You can also install pyfake using pip, the standard Python package manager:
pip install pyfake
To install a specific version:
pip install pyfake==0.0.x
If you're working in a virtual environment (recommended):
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
pip install pyfake
If you want the latest development version or plan to contribute, you can install directly from GitHub:
pip install git+https://github.com/Mukhopadhyay/pyfake.git
For local development
If you want to clone the repository and install in editable mode for development:
git clone https://github.com/Mukhopadhyay/pyfake.git
cd pyfake
pip install -e .
This installs pyfake in editable mode, so changes to the source code are reflected immediately.
Requirements
- Python 3.8+
- Pydantic v2+
- NumPy
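
To confirm that an environment meets the requirements above, a short standard-library script along these lines may help. It is a sketch, not an official Pyfake tool: the helper names `installed_version` and `environment_report` are hypothetical, and it assumes the distribution names on PyPI are `pydantic`, `numpy`, and `pyfake`.

```python
import sys
from importlib import metadata
from typing import Dict, Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report() -> Dict[str, Optional[str]]:
    """Collect the Python version and the versions of the listed requirements."""
    report: Dict[str, Optional[str]] = {
        "python": ".".join(str(part) for part in sys.version_info[:3]),
    }
    # Distribution names are assumed to match the install commands.
    for package in ("pydantic", "numpy", "pyfake"):
        report[package] = installed_version(package)
    return report


for name, version in environment_report().items():
    print(f"{name}: {version or 'not installed'}")
```

A `not installed` entry, a Python version below 3.8, or a Pydantic version starting with `1.` indicates the environment needs attention before using pyfake.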
How It Works
Pyfake reads your Pydantic schema and:
- Inspects field types and constraints
- Applies intelligent generators
- Produces validated fake data
flowchart LR
A[Pydantic Model] --> B[Schema Parser]
B --> C[Generator Engine]
C --> D[Validated Fake Data]
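
The three stages in the diagram can be illustrated with a small standard-library sketch. This is not Pyfake's implementation: `FieldSpec`, `GENERATORS`, `validate`, and `generate` are hypothetical names, and a plain dictionary of field specs stands in for a parsed Pydantic model. It only shows the general shape of the pipeline: read each field's type and constraints, pick a generator for that type, and check the value before returning it.

```python
import random
import string
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class FieldSpec:
    """Hypothetical stand-in for one parsed field: a type plus its constraints."""
    type_: type
    min_value: int = 0    # inclusive bounds, used for int fields
    max_value: int = 100
    min_length: int = 1   # inclusive bounds, used for str fields
    max_length: int = 10


def gen_int(spec: FieldSpec, rng: random.Random) -> int:
    return rng.randint(spec.min_value, spec.max_value)


def gen_str(spec: FieldSpec, rng: random.Random) -> str:
    length = rng.randint(spec.min_length, spec.max_length)
    return "".join(rng.choice(string.ascii_letters) for _ in range(length))


# Registry mapping a field type to the generator that handles it.
GENERATORS: Dict[type, Callable[[FieldSpec, random.Random], Any]] = {
    int: gen_int,
    str: gen_str,
}


def validate(spec: FieldSpec, value: Any) -> bool:
    """Check that a generated value matches the field's type and constraints."""
    if not isinstance(value, spec.type_):
        return False
    if spec.type_ is int:
        return spec.min_value <= value <= spec.max_value
    if spec.type_ is str:
        return spec.min_length <= len(value) <= spec.max_length
    return True


def generate(schema: Dict[str, FieldSpec], seed: Optional[int] = None) -> Dict[str, Any]:
    """Schema -> generators -> validated record, mirroring the diagram."""
    rng = random.Random(seed)
    record: Dict[str, Any] = {}
    for name, spec in schema.items():
        value = GENERATORS[spec.type_](spec, rng)
        if not validate(spec, value):
            raise ValueError(f"generated value for {name!r} violates its constraints")
        record[name] = value
    return record


schema = {
    "age": FieldSpec(int, min_value=18, max_value=65),
    "username": FieldSpec(str, min_length=3, max_length=8),
}
print(generate(schema, seed=7))
```

Keeping generators in a registry keyed by type is one common way to make such a pipeline extensible: supporting a new type means adding one entry rather than changing `generate`.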
Read more about the design and implementation in the How It Works section.