Lists in pyfake

Simple usage

pyfake can generate values for list fields in your Pydantic models. It understands both the typing.List[...] form and the Python 3.9+ built-in generic form list[...].

from typing import List
from pyfake import fake
from pydantic import BaseModel

class User(BaseModel):
  tags: List[str]
  scores: list[int]

result = fake(User)
print(result)
{'tags': ['aBcDeFgHiJ', 'XyZaBcDeFg'], 'scores': [42, 7]}

No setup required — pyfake inspects the schema and generates valid values automatically.

Typing forms

Both typing.List[T] and list[T] are resolved. Use whichever syntax matches your codebase.


Returning Multiple Values

Generate more than one instance by passing num (same behaviour as for other types):

from pyfake import fake
from pydantic import BaseModel

class Wrapper(BaseModel):
  items: list[str]

results = fake(Wrapper, num=3)
print(results)
[
{'items': ['aBcDeFgHiJ']},
{'items': ['GhIjKlMnOp', 'QrStUvWxYz']},
{'items': ['XyZaBcDeFg']}
]

Receiving Model Instances

By default, pyfake returns dictionaries. To receive Pydantic model instances instead, set as_dict=False. This is useful when list items are nested models:

from pyfake import fake
from pydantic import BaseModel

class Inner(BaseModel):
  x: int

class Outer(BaseModel):
  inners: list[Inner]

results = fake(Outer, num=2, as_dict=False)
print(results)
[
Outer(inners=[Inner(x=12), Inner(x=3)]),
Outer(inners=[Inner(x=7)])
]

Metadata & Constraints

pyfake reads Pydantic/Annotated Field metadata to drive list generation. For lists the generator supports min_length and max_length (bounds on the length of the generated sequence). These map to GeneratorArgs.min_length and GeneratorArgs.max_length used by the registry.

How length is chosen

  • Default: when no bounds are provided, the list length is chosen randomly between 1 and 5 (inclusive).
  • Both min_length and max_length: a random length is chosen uniformly between the two bounds (inclusive).
  • Only min_length provided: the length is chosen randomly between min_length and the default upper bound 5.
  • Only max_length provided: the length is chosen randomly between the default lower bound 1 and max_length.

This behaviour mirrors the implementation in pyfake.core.registry.GeneratorRegistry._generate which uses rng.randint(args.min_length or 1, args.max_length or 5) for lists and sets.
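
The expression quoted above can be exercised on its own with the standard library. The sketch below is not pyfake code: pick_length is a hypothetical stand-in that copies only the randint expression from the text. It also shows two consequences of applying defaults with `or`, assuming the implementation is exactly as quoted: a bound of 0 is falsy and falls back to the default, and a lone min_length above 5 produces an empty range, which random.randint rejects with ValueError.

```python
import random

def pick_length(rng, min_length=None, max_length=None):
    """Hypothetical stand-in for the length selection quoted above."""
    return rng.randint(min_length or 1, max_length or 5)

rng = random.Random(42)

# No bounds: every length falls in the default range 1..5.
default_lengths = [pick_length(rng) for _ in range(1000)]

# Both bounds: inclusive on both ends.
bounded_lengths = [pick_length(rng, 2, 4) for _ in range(1000)]

# Equal bounds pin the length to a single value.
fixed_lengths = {pick_length(rng, 3, 3) for _ in range(100)}

# min_length=0 is falsy, so `0 or 1` evaluates to 1 and no empty list appears.
zero_min_lengths = [pick_length(rng, min_length=0) for _ in range(1000)]

# A lone min_length above the default upper bound becomes randint(8, 5).
try:
    pick_length(rng, min_length=8)
    empty_range_error = None
except ValueError as exc:
    empty_range_error = exc
```

If this matches the real behaviour, pairing any min_length above 5 with an explicit max_length avoids the empty range.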

Using Field / Annotated constraints

You can attach constraints with Annotated + Field, or with Field(...) directly on the attribute. Both influence the resulting list length.

from typing import Annotated, List
from pydantic import BaseModel, Field
from pyfake import fake

class BoundedModel(BaseModel):
  items: Annotated[List[int], Field(min_length=2, max_length=4)]

result = fake(BoundedModel, seed=42)
print(result)
{'items': [1, 2, 3]} # length will be between 2 and 4 (inclusive)

Supported Field options

Option Description
min_length Minimum length for the generated collection.
max_length Maximum length for the generated collection.

Unsupported / Partial Support

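  • There is no length override for lists in GeneratorArgs (unlike the string generator's explicit length), so express length control through min_length/max_length. Given the randint call described above, setting both to the same positive value should yield a list of exactly that length.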
  • Item-level pattern (e.g. a regex for string elements) is accepted by the resolver but the default string generator does not guarantee arbitrary regex matches. For regex-based string items, consider a custom generator or post-processing.

Several element types & unions

Lists may contain elements of union types. For example, List[int | str] and List[Union[int, str]] are both supported; each list item is generated by selecting one variant and then generating a value for it.

from typing import List
from pydantic import BaseModel
from pyfake import fake

class Mixed(BaseModel):
  items: List[int | str]

print(fake(Mixed, seed=0))
{'items': [123, 'AbcDefGhIj', 7]}
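
The "select a variant, then generate a value" step can be sketched without pyfake. Everything here is an assumption made for illustration: the toy generators fake_int and fake_str, the VARIANTS table, and the uniform rng.choice. The text does not say how pyfake weights the variants.

```python
import random
import string

def fake_int(rng):
    return rng.randint(0, 999)

def fake_str(rng, length=10):
    return "".join(rng.choice(string.ascii_letters) for _ in range(length))

# Hypothetical variant table for list[int | str].
VARIANTS = [fake_int, fake_str]

def fake_union_list(rng, length):
    # Each element picks its variant independently of the others.
    return [rng.choice(VARIANTS)(rng) for _ in range(length)]

rng = random.Random(0)
items = fake_union_list(rng, 200)
kinds = {type(item) for item in items}
```

Because the choice is made per element, a single list can mix both types in any order, so consuming code should not assume a homogeneous list.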

Nullable elements

If the list's inner type is optional (e.g. list[Optional[int]] or list[int | None]), the resolver marks the inner union as nullable. The registry's nullable branch will return None in about 20% of union generations, so list elements may be None with that probability.

from typing import Optional
from pydantic import BaseModel
from pyfake import fake

class MaybeInts(BaseModel):
  values: list[Optional[int]]

print(fake(MaybeInts, num=3, seed=1))
[
{'values': [None, 5]},
{'values': [3]},
{'values': [None, None, 2]}
]
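
To get a feel for how often None shows up, the sketch below assumes each element is drawn independently with a 0.2 chance of being None. That figure is taken from "about 20%" above; the exact check inside the registry is not shown here, and fake_optional_int is a hypothetical stand-in. Under that assumption, the chance that a list of n elements contains no None at all is 0.8^n.

```python
import random

NONE_PROBABILITY = 0.2  # assumed from "about 20%" above

def fake_optional_int(rng):
    # Hypothetical nullable branch: return None roughly one time in five.
    if rng.random() < NONE_PROBABILITY:
        return None
    return rng.randint(0, 100)

rng = random.Random(1)
sample = [fake_optional_int(rng) for _ in range(10_000)]
none_share = sample.count(None) / len(sample)

# Probability that a list of n elements has no None at all, for n = 1..5.
no_none = {n: round((1 - NONE_PROBABILITY) ** n, 4) for n in range(1, 6)}
```

With the default maximum length of 5, roughly a third of the longest lists would be free of None under this model, so tests that need a None element should not rely on a single generated list.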

Nested lists and collections

Nested lists are supported (e.g. List[List[int]]) and each nested level will have its length chosen independently according to the same rules.

from typing import List
from pydantic import BaseModel
from pyfake import fake

class Matrix(BaseModel):
  matrix: List[List[int]]

print(fake(Matrix, seed=2))
{'matrix': [[1, 2], [3], [4, 5, 6]]}

Sets vs Lists

If the schema is a set[...] (or typing.Set[...]) the registry will generate a Python set. Internally it generates items and then converts to set(...), so duplicates are removed and the final size may be smaller than the requested length.
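
The shrinking effect is easiest to see with a deliberately narrow element domain. This sketch assumes only the generate-then-convert order described above; the 0..2 integer domain is chosen for illustration.

```python
import random

rng = random.Random(7)

requested = 5
# Draw from a tiny domain so collisions are guaranteed.
items = [rng.randint(0, 2) for _ in range(requested)]
unique = set(items)

# Only three distinct values exist, so the set cannot reach the requested size.
shortfall = requested - len(unique)
```

A set built this way can also end up smaller than a min_length bound, which is worth keeping in mind when the element type has few possible values (booleans, small integer ranges, short enums).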

Annotated inner types (UUID example)

Field/Annotated metadata attached to the inner type is propagated when resolving the element schema. For example, Annotated[UUID, UuidVersion(1)] inside a list will cause the UUID generator to be invoked with the corresponding format (see docs/usage/uuid.md).

from typing import Annotated, List
from uuid import UUID
from pydantic.types import UuidVersion
from pydantic import BaseModel
from pyfake import fake

class VList(BaseModel):
  ids: List[Annotated[UUID, UuidVersion(1)]]

result = fake(VList, seed=0)
print(result)
{'ids': ['a1b2c3d4-0000-1000-8000-000000000001', '...']} # uuid v1 strings

Implementation notes

  • Lists/sets: length selection uses rng.randint(args.min_length or 1, args.max_length or 5) (default range 1..5).
  • Elements are produced by recursively generating the schema['items'] node — all resolver-provided metadata for the inner type (format, numeric bounds, uuid version, etc.) will affect element generation.
  • For sets the generated list of items is converted with set(items), which removes duplicates.
  • Format-based dispatch (e.g. UUID versions, date/time formats) is handled through GeneratorArgs.format and the registry's generator map (see pyfake.core.registry).
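
The recursion through schema['items'] can be illustrated with a toy generator. This is a sketch, not pyfake's registry: the schema layout used here ("type", "items", "min_length", "max_length", "ge", "le" as plain dictionary keys) is a simplified assumption, and only the length expression is taken from the notes above.

```python
import random

def generate(schema, rng):
    """Toy recursive generator; the schema layout is assumed for illustration."""
    kind = schema["type"]
    if kind == "integer":
        return rng.randint(schema.get("ge", 0), schema.get("le", 100))
    if kind in ("list", "set"):
        # Same length rule as quoted above, applied afresh at every level.
        length = rng.randint(schema.get("min_length") or 1,
                             schema.get("max_length") or 5)
        items = [generate(schema["items"], rng) for _ in range(length)]
        return set(items) if kind == "set" else items
    raise ValueError(f"unsupported type: {kind}")

matrix_schema = {
    "type": "list",
    "min_length": 2,
    "max_length": 3,
    "items": {"type": "list", "items": {"type": "integer", "ge": 1, "le": 9}},
}

matrix = generate(matrix_schema, random.Random(2))
row_lengths = [len(row) for row in matrix]
```

The outer list honours its own bounds, while each inner list falls back to the default 1..5 range because its node carries no length metadata. Metadata on the innermost node (here ge and le) only affects the elements.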

Unsupported / Partial support

  • The default string generator does not guarantee arbitrary regex pattern matches for elements. Pattern/regex metadata is accepted by the resolver but not enforced by the built-in string generator.
  • Lists do not enforce element uniqueness — use set[...] if uniqueness is required, but note the final size may be smaller due to deduplication.

If you need deterministic lengths, strict regex-based elements, or other behaviour not provided out of the box, you can register a custom generator or post-process generated values.
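
As one possible shape for the post-processing route, the sketch below replaces string elements that fail a pattern with values from a caller-supplied factory. It uses no pyfake API: enforce_pattern and make_sku are hypothetical helpers, and the dictionary stands in for the output of fake(...).

```python
import random
import re

def enforce_pattern(items, pattern, factory, rng):
    """Replace every element that does not fully match `pattern`."""
    compiled = re.compile(pattern)
    fixed = []
    for item in items:
        if isinstance(item, str) and compiled.fullmatch(item):
            fixed.append(item)  # already valid, keep as generated
        else:
            fixed.append(factory(rng))
    return fixed

def make_sku(rng):
    # Hypothetical factory producing two capitals, a dash, and four digits.
    letters = "".join(rng.choice("ABCDEFGHIJKLMNOPQRSTUVWXYZ") for _ in range(2))
    return f"{letters}-{rng.randint(0, 9999):04d}"

rng = random.Random(3)
generated = {"tags": ["aBcDeFgHiJ", "ZZ-0001", "XyZaBcDeFg"]}  # stand-in for fake(...) output
generated["tags"] = enforce_pattern(generated["tags"], r"[A-Z]{2}-\d{4}", make_sku, rng)
```

The list length is preserved, so any min_length/max_length bounds that held before the pass still hold afterwards.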