Bulk Operations
When you need to work with many records at once, bulk operations are significantly faster than per-instance operations.
bulk_create
Multi-row INSERT with batching:
from ryx.bulk import bulk_create
posts = [
    Post(title=f"Post {i}", slug=f"post-{i}", views=i * 10)
    for i in range(1000)
]
# Basic usage
created = await bulk_create(posts, batch_size=100)
# Skip validation (faster)
created = await bulk_create(posts, batch_size=100, validate=False)
# Ignore conflicts (PostgreSQL / MySQL)
created = await bulk_create(posts, batch_size=100, ignore_conflicts=True)
Parameters
| Param | Default | Description |
|---|---|---|
| batch_size | 500 | Rows per INSERT statement |
| validate | False | Run field validators before insert |
| ignore_conflicts | False | INSERT ... ON CONFLICT DO NOTHING |
How It Works
Records are split into batches of batch_size. Each batch becomes a single multi-row INSERT:
INSERT INTO "posts" ("title", "slug", "views") VALUES
('Post 0', 'post-0', 0),
('Post 1', 'post-1', 10),
...
('Post 99', 'post-99', 990)
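The batching step can be sketched in plain Python. This is a minimal illustration, not ryx's actual implementation: the helper names `chunked` and `build_insert` are hypothetical, and the `?` placeholder style is an assumption (drivers for other databases use `%s` or `$1`).

```python
from itertools import islice
from typing import Any, Iterable, Iterator, Sequence


def chunked(rows: Iterable[Sequence[Any]], batch_size: int) -> Iterator[list]:
    """Yield consecutive lists of at most batch_size rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    it = iter(rows)
    while batch := list(islice(it, batch_size)):
        yield batch


def build_insert(table: str, columns: Sequence[str], batch: Sequence[Sequence[Any]]):
    """Build one parameterized multi-row INSERT for a single batch."""
    col_list = ", ".join(f'"{c}"' for c in columns)
    row_placeholder = "(" + ", ".join("?" for _ in columns) + ")"
    values = ", ".join(row_placeholder for _ in batch)
    sql = f'INSERT INTO "{table}" ({col_list}) VALUES {values}'
    # Flatten the rows so the parameters line up with the placeholders.
    params = [value for row in batch for value in row]
    return sql, params


rows = [(f"Post {i}", f"post-{i}", i * 10) for i in range(250)]
statements = [
    build_insert("posts", ("title", "slug", "views"), batch)
    for batch in chunked(rows, batch_size=100)
]
print(len(statements))  # 3 statements: 100 + 100 + 50 rows
```

Binding values as parameters rather than interpolating them into the SQL string keeps the statement safe from injection, at the cost of one placeholder per value.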
Trade-offs
- Bypasses clean(), before_save, after_save, and signals
- Does not populate pk on instances (database-generated)
- Much faster than calling save() in a loop
bulk_update
Individual UPDATEs wrapped in a transaction:
from ryx.bulk import bulk_update
posts = await Post.objects.filter(active=True)
for post in posts:
    post.views += 1
await bulk_update(posts, fields=["views"])
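To show what "individual UPDATEs wrapped in a transaction" means at the SQL level, here is a rough sketch using Python's standard-library `sqlite3` module instead of ryx. The table layout and the sample data are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, views INTEGER NOT NULL)")
conn.executemany("INSERT INTO posts (id, views) VALUES (?, ?)", [(1, 0), (2, 10), (3, 20)])
conn.commit()

# New values computed in memory, analogous to `post.views += 1` per instance.
changes = [(1, 1), (11, 2), (21, 3)]  # (new_views, id)

# Used as a context manager, the connection commits if the block succeeds
# and rolls back if an exception is raised, so the updates apply together.
with conn:
    for new_views, post_id in changes:
        conn.execute("UPDATE posts SET views = ? WHERE id = ?", (new_views, post_id))

updated = conn.execute("SELECT id, views FROM posts ORDER BY id").fetchall()
print(updated)  # [(1, 1), (2, 11), (3, 21)]
```

The statement count still grows with the number of records; the gain comes from committing once rather than once per row.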
bulk_delete
DELETE with IN clause:
from ryx.bulk import bulk_delete
deleted = await bulk_delete(Post.objects.filter(views=0))
print(f"Deleted {deleted} posts")
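The "DELETE with IN clause" pattern looks roughly like the following when written against the standard-library `sqlite3` module. This is a sketch under assumed table and column names, not the SQL that ryx necessarily emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, views INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO posts (id, views) VALUES (?, ?)",
    [(i, i % 3) for i in range(1, 10)],  # ids 3, 6, and 9 get views = 0
)

# Resolve the filter to primary keys first.
ids = [row[0] for row in conn.execute("SELECT id FROM posts WHERE views = 0 ORDER BY id")]

deleted = 0
if ids:  # skip the statement entirely when nothing matches
    placeholders = ", ".join("?" for _ in ids)
    with conn:
        cursor = conn.execute(f"DELETE FROM posts WHERE id IN ({placeholders})", ids)
    deleted = cursor.rowcount

remaining = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
print(f"Deleted {deleted} posts, {remaining} remaining")  # Deleted 3 posts, 6 remaining
```

Databases cap the number of bound parameters per statement, so a very large id list would need to be split into several DELETE statements.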
stream
Async generator for processing large result sets without loading everything into memory:
from ryx.bulk import stream
async for batch in stream(Post.objects.all(), chunk_size=500):
    for post in batch:
        process(post)
Uses keyset pagination under the hood when keyset is provided, or LIMIT/OFFSET otherwise.
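Keyset pagination is more efficient on large datasets because each query seeks directly past the last key seen, whereas OFFSET makes the database scan and discard every skipped row. To stream with keyset pagination, use QuerySet.stream():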
async for batch in (
    Post.objects
    .filter(active=True)
    .order_by("id")
    .stream(chunk_size=100, keyset="id")
):
    for post in batch:
        process(post)
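For intuition, keyset pagination can be sketched as a generator over the standard-library `sqlite3` module. This version is synchronous and the function name `stream_by_id` is hypothetical; it assumes positive integer ids and illustrates the technique rather than ryx's internals.

```python
import sqlite3
from typing import Iterator


def stream_by_id(conn: sqlite3.Connection, chunk_size: int) -> Iterator[list]:
    """Yield batches ordered by id, resuming after the last id seen."""
    last_id = 0  # assumption: ids are positive integers
    while True:
        batch = conn.execute(
            "SELECT id, title FROM posts WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, chunk_size),
        ).fetchall()
        if not batch:
            return
        yield batch
        # The next query starts right after the final row of this batch.
        last_id = batch[-1][0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO posts (id, title) VALUES (?, ?)",
    [(i, f"Post {i}") for i in range(1, 251)],
)

sizes = [len(batch) for batch in stream_by_id(conn, chunk_size=100)]
print(sizes)  # [100, 100, 50]
```

Because every query filters on an indexed key, the cost per batch stays roughly constant no matter how far into the result set the stream has advanced.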
Performance Comparison
| Operation | 100 records | 10,000 records |
|---|---|---|
| save() in loop | ~200ms | ~20s |
| bulk_create | ~10ms | ~100ms |
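Timings like these depend heavily on the database, driver, and network latency, so it is worth measuring on your own setup. The harness below is a rough sketch using an in-memory SQLite database and the standard-library `sqlite3` module rather than ryx. It compares one commit per row with batched `executemany` calls; the function names are hypothetical, and the absolute numbers will not match the table above.

```python
import sqlite3
import time


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posts (title TEXT, slug TEXT, views INTEGER)")
    return conn


def insert_one_by_one(conn, rows):
    for row in rows:
        conn.execute("INSERT INTO posts VALUES (?, ?, ?)", row)
        conn.commit()  # one transaction per row, like save() in a loop


def insert_batched(conn, rows, batch_size=500):
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        with conn:  # one transaction per batch
            conn.executemany("INSERT INTO posts VALUES (?, ?, ?)", batch)


def timed(fn, rows):
    """Run fn against a fresh database; return (seconds, rows inserted)."""
    conn = make_db()
    started = time.perf_counter()
    fn(conn, rows)
    elapsed = time.perf_counter() - started
    count = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
    return elapsed, count


rows = [(f"Post {i}", f"post-{i}", i * 10) for i in range(10_000)]
results = {fn.__name__: timed(fn, rows) for fn in (insert_one_by_one, insert_batched)}
for name, (elapsed, count) in results.items():
    print(f"{name}: {count} rows in {elapsed * 1000:.1f} ms")
```

An in-memory database hides network round trips, so the gap on a real server is usually wider than what this harness reports.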
Tip: Always prefer bulk operations for data imports, migrations, and batch processing.
Next Steps
→ Advanced: Transactions, validation, signals