Get DB aggregations using Django ORM for efficient data analysis and reporting. This repository provides tools to perform complex aggregations with ease, leveraging the power of Django's ORM.
- Perform multiple aggregations simultaneously
- Calculate percentiles (PostgreSQL) and percentages with minimal additional setup
- Group data by multiple fields
- Generate time series data (PostgreSQL, MySQL)
- Flexible result limiting and pagination
- Custom aggregation types support
pip install drf-aggregation
First, add the AggregationMixin
to your viewset:
from drf_aggregation import AggregationMixin
from rest_framework.viewsets import GenericViewSet
class TicketViewSet(AggregationMixin, GenericViewSet):
queryset = Ticket.objects.all()
serializer_class = TicketSerializer
Then, register the viewset with your router:
urlpatterns = [
path("aggregation/ticket", TicketViewSet.as_view({"post": "aggregation"})),
]
Once set up, you can make requests like the following:
POST /aggregation/ticket
Content-Type: application/json
{
"group_by": "service",
"limit": 5,
"order_by": "-total_tasks",
"aggregations": {
"total_tasks": {
"type": "count"
},
"average_execution_time": {
"type": "average",
"field": "execution_time"
}
}
}
You can also use the aggregation function directly:
from drf_aggregation import get_aggregations
result = get_aggregations(
queryset=Ticket.objects.all(),
aggregations={
"total_tasks": {
"type": "count"
}
}
)
-
aggregations: A dictionary specifying the aggregations to perform.
- key: The name under which the aggregation result will be returned.
- value: A dictionary with aggregation settings.
- type: The type of aggregation (e.g., count, sum).
- index_by_group: Index for sorting by a specific field.
- field: Required for sum, average, minimum, maximum, percentile.
- percentile: A value from 0 to 1, required for percentile calculations.
- additional_filter: Uses filter parser from drf-complex-filter, required for percent.
-
group_by: List of fields to group the results by.
-
order_by: List of fields to sort the results.
-
limit: Number of groups to return or a dictionary with settings:
- limit: Number of groups to return.
- offset: Offset for the start of returned groups.
- by_group: Field to limit the result by, defaults to the first grouping field.
- by_aggregation: Aggregation to limit the result by, defaults to the first declared aggregation.
- show_other: Return remaining records as an additional group.
- other_label: Label for the additional group.
IntegerField
FloatField
DateField
(min/max only)DateTimeField
(min/max only)DurationField
By default, the following aggregations are enabled: count
, distinct
, sum
, average
, minimum
, maximum
.
To enable additional aggregations like percent and percentile, modify your settings.py
:
# in settings.py
DRF_AGGREGATION_SETTINGS = {
"AGGREGATION_CLASSES": [
"drf_aggregation.aggregations.common.CommonAggregations",
# Requires additional package "drf-complex-filter"
"drf_aggregation.aggregations.percent.PercentAggregation",
# Works only on PostgreSQL
"drf_aggregation.aggregations.percentile.PercentileAggregation",
],
}
Create a class with static methods for custom aggregation types:
class MyAggregations:
@staticmethod
def my_aggregation(aggregation, queryset):
name = aggregation.get("name")
return {f"{name}": models.Count("id")}
# in settings.py
DRF_AGGREGATION_SETTINGS = {
"AGGREGATION_CLASSES": [
"drf_aggregation.aggregations.common.CommonAggregations",
"path.to.MyAggregations",
],
}
result = get_aggregations(
queryset=Ticket.objects.all(),
aggregations={
"value": {
"type": "my_aggregation"
}
}
)
Group results by a list of fields:
result = get_aggregations(
queryset=Ticket.objects.all(),
aggregations={
"total_tasks": {
"type": "count"
}
},
group_by=["field1", "field2"]
)
Sort results by specified fields:
result = get_aggregations(
queryset=Ticket.objects.all(),
aggregations={
"total_tasks": {
"type": "count"
}
},
group_by="field1",
order_by="field1"
)
Use aggregations as sorting keys:
result = get_aggregations(
queryset=Ticket.objects.all(),
aggregations={
"total_tasks": {
"type": "count"
}
},
group_by="field1",
order_by="-total_tasks"
)
Limit the number of displayed groups:
result = get_aggregations(
queryset=Ticket.objects.all(),
aggregations={
"total_tasks": {
"type": "count"
}
},
group_by="field1",
order_by="-total_tasks",
limit=2
)
Display remaining groups as an additional category:
result = get_aggregations(
queryset=Ticket.objects.all(),
aggregations={
"total_tasks": {
"type": "count"
}
},
group_by="field1",
order_by="-total_tasks",
limit={
"limit": 2,
"show_other": true
}
)
Note: Time series aggregations are not supported on SQLite.
To perform time series aggregations, annotate your queryset with a truncated date field:
truncate_rules = { "created_at": "day" }
queryset = truncate_date(Ticket.objects.all(), truncate_rules)
result = get_aggregations(
queryset=queryset,
aggregations={
"total_tasks": {
"type": "count"
}
},
group_by="created_at__trunc__day"
)
Using AggregationMixin
, pass truncate_rules
in the request body:
POST /aggregation/ticket
Content-Type: application/json
{
"truncate_rules": { "created_at": "day" },
"group_by": "created_at__trunc__day",
"aggregations": {
"total_tasks": {
"type": "count"
}
}
}
Available truncation periods: year
, quarter
, month
, week
, day
, hour
, minute
, second
For more details on date truncation, see the Django documentation.
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License.