Kubecost provides the ability to allocate out of cluster costs, e.g. RDS instances and S3 buckets, back to Kubernetes concepts like namespace and deployment. All billing data remains on your cluster when using this functionality and is not shared externally.
The following guide provides the steps required for allocating out of cluster costs. In a multi-account organization, all of the following steps will need to be completed in the payer account.
This bucket will be used to store AWS cost and usage data.
Instructions for creating an S3 Bucket to be used for Cost and Usage data
When creating the Cost and Usage Report, configure the report to be delivered to the bucket created in Step #1. When following the instructions for this step, choose Athena as the data integration option so that reports are created in Parquet format.
Instructions for creating a Cost and Usage Report
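If you prefer to create the report programmatically instead of through the console, the settings described above can be sketched as a report definition. This is a minimal sketch, not the documented Kubecost procedure: the report name, bucket, prefix, region, and the hourly time unit are all hypothetical placeholders. The field names follow the AWS Cost and Usage Report `PutReportDefinition` API, which expects Parquet format and overwrite versioning when the Athena integration is selected.

```python
def build_report_definition(report_name, bucket, prefix, region):
    """Return a Cost and Usage Report definition configured for Athena."""
    return {
        "ReportName": report_name,
        "TimeUnit": "HOURLY",  # assumption: DAILY is also a valid choice
        "Format": "Parquet",  # required by the Athena integration
        "Compression": "Parquet",
        "AdditionalSchemaElements": ["RESOURCES"],  # include resource IDs
        "S3Bucket": bucket,  # the bucket created in Step #1
        "S3Prefix": prefix,
        "S3Region": region,
        "AdditionalArtifacts": ["ATHENA"],
        "RefreshClosedReports": True,
        "ReportVersioning": "OVERWRITE_REPORT",
    }


# Hypothetical values for illustration only.
definition = build_report_definition(
    "kubecost-cur", "example-billing-bucket", "cur", "us-east-1"
)
print(definition["Format"], definition["AdditionalArtifacts"])
```

With boto3 installed and credentials for the payer account, a dictionary like this could be passed as the `ReportDefinition` argument of the `cur` client's `put_report_definition` call.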
Kubecost utilizes AWS tagging to allocate the costs of AWS resources outside of the Kubernetes cluster to specific Kubernetes concepts, such as namespaces, pods, etc. These costs are then shown in a unified dashboard within the Kubecost interface.
In order to make the custom Kubecost AWS tags appear on the cost and usage reports, and therefore in Kubecost, individual cost allocation tags must be enabled. Details on which tags to enable can be found in Step #6 of this doc.
Instructions for enabling user-defined cost allocation tags
Completing this step makes the Cost and Usage data created in Step #2 available to Amazon Athena, where Kubecost can then query it.
Instructions for making Cost and Usage data available via Amazon Athena
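To check that tagged costs are visible once the data is available in Athena, a query along the following lines may help. This is only an illustrative sketch and not necessarily how Kubecost queries the data: the table name is a hypothetical placeholder, and it assumes the kubernetes_namespace tag has been activated so that it appears as a `resource_tags_user_kubernetes_namespace` column, following the Cost and Usage Report column naming for user-defined tags.

```python
def build_namespace_cost_query(table, start, end):
    """Build an Athena SQL string that sums unblended cost per namespace tag.

    The inputs are interpolated directly into the SQL text, so this is only
    suitable for trusted, hand-written values.
    """
    tag_column = "resource_tags_user_kubernetes_namespace"
    return (
        f"SELECT {tag_column} AS namespace, "
        "SUM(line_item_unblended_cost) AS cost "
        f"FROM {table} "
        f"WHERE line_item_usage_start_date >= TIMESTAMP '{start}' "
        f"AND line_item_usage_start_date < TIMESTAMP '{end}' "
        f"AND {tag_column} <> '' "
        f"GROUP BY {tag_column} "
        "ORDER BY cost DESC"
    )


# Hypothetical database and table names for illustration only.
query = build_namespace_cost_query(
    "athenacurcfn_example.example_report",
    "2024-01-01 00:00:00",
    "2024-02-01 00:00:00",
)
print(query)
```

The resulting string can be pasted into the Athena query editor; an empty result usually means the tag has not yet been activated or no tagged resources incurred cost in the period.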
To access billing data in Athena tables, and to enable other Kubecost functionality, you have two options:
- Create an IAM User with the following IAM permissions. Generate Access Keys for this IAM User and provide them to Kubecost.
- Attach the following permissions to the IAM Role associated with the EC2 instance(s) in the cluster where Kubecost is running.
We recommend kiam as a solution for adding IAM credentials directly to the Kubecost pod(s).
The below policy is designed to provide Kubecost least-privilege access to AWS Cost and Usage data.
Validate the following resource names in the below IAM policy before applying to your account:
- "Sid": "ReadAccessToAthenaCurDataViaGlue": Validate the database and table ARNs listed. If you used the AWS managed deployment, as described in Step #4, this should already be set correctly. If you set up the Cost and Usage report to Athena flow manually, you may need to adjust this value.
- "Sid": "AthenaQueryResultsOutput": Modify the listed bucket ARN to match the location where Athena should put query execution result files.
- "Sid": "S3ReadAccessToAwsBillingData": Modify the bucket ARN to match the name of the bucket created in Step #1.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "FullAthenaAccess",
      "Effect": "Allow",
      "Action": [
        "athena:*"
      ],
      "Resource": [
        "*"
      ]
    },
    {
      "Sid": "ReadAccessToAthenaCurDataViaGlue",
      "Effect": "Allow",
      "Action": [
        "glue:GetDatabase*",
        "glue:GetTable*",
        "glue:GetPartition*",
        "glue:GetUserDefinedFunction",
        "glue:BatchGetPartition"
      ],
      "Resource": [
        "arn:aws:glue:*:*:catalog",
        "arn:aws:glue:*:*:database/athenacurcfn*",
        "arn:aws:glue:*:*:table/athenacurcfn*/*"
      ]
    },
    {
      "Sid": "AthenaQueryResultsOutput",
      "Effect": "Allow",
      "Action": [
        "s3:GetBucketLocation",
        "s3:GetObject",
        "s3:ListBucket",
        "s3:ListBucketMultipartUploads",
        "s3:ListMultipartUploadParts",
        "s3:AbortMultipartUpload",
        "s3:CreateBucket",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::aws-athena-query-results-*"
      ]
    },
    {
      "Sid": "S3ReadAccessToAwsBillingData",
      "Effect": "Allow",
      "Action": [
        "s3:Get*",
        "s3:List*"
      ],
      "Resource": [
        "arn:aws:s3:::<BILLING BUCKET>*"
      ]
    }
  ]
}
- "Sid": "FullAthenaAccess": Grants access to run queries on data exposed via Athena. athena:* is generally safe because Athena constructs are all actually using the Glue catalog under the hood. Access to data in Athena is actually controlled by:
  - Whether the user performing the query has access to the appropriate Glue catalog, database, and table
  - Whether the user performing the query has access to the S3 bucket where the data being queried is actually stored.
- "Sid": "ReadAccessToAthenaCurDataViaGlue": When following the AWS provided instructions for enabling Cost and Usage Report data delivery via Athena, a Glue database will be created with the prefix athenacurcfn. This Statement allows Kubecost to query the specific Glue database and table which stores the Cost and Usage data.
- "Sid": "AthenaQueryResultsOutput": When executing queries in Athena, all results are automatically saved as a CSV to an S3 bucket. The default bucket used is prefixed with aws-athena-query-results. This Statement provides Kubecost the access required to write Athena query results to the results bucket. This bucket can be customized within the Kubecost interface when setting up Out of Cluster resource access and this Statement would need to be updated to reflect the new bucket name.
- "Sid": "S3ReadAccessToAwsBillingData": Provides read access to the underlying Cost and Usage report data being generated by AWS. This Statement will include access to the bucket created during Step #1 of this doc. This is required to allow Kubecost to run queries against the cost data, which exists in this bucket.
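The bucket ARNs called out above can be filled in by hand, or with a small helper such as the following. This is a minimal sketch under stated assumptions: the helper name and both bucket names are hypothetical, and the template is trimmed to the two statements that normally need editing (the full policy above has four).

```python
import json


def customize_policy(policy, billing_bucket, results_bucket):
    """Return a copy of the policy with bucket ARNs filled in, matched by Sid."""
    resources_by_sid = {
        "AthenaQueryResultsOutput": [f"arn:aws:s3:::{results_bucket}*"],
        "S3ReadAccessToAwsBillingData": [f"arn:aws:s3:::{billing_bucket}*"],
    }
    updated = json.loads(json.dumps(policy))  # deep copy via a JSON round trip
    for statement in updated["Statement"]:
        sid = statement.get("Sid")
        if sid in resources_by_sid:
            statement["Resource"] = resources_by_sid[sid]
    return updated


# Trimmed template: only the statements whose Resource usually changes.
template = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AthenaQueryResultsOutput",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": ["arn:aws:s3:::aws-athena-query-results-*"],
        },
        {
            "Sid": "S3ReadAccessToAwsBillingData",
            "Effect": "Allow",
            "Action": ["s3:Get*", "s3:List*"],
            "Resource": ["arn:aws:s3:::<BILLING BUCKET>*"],
        },
    ],
}

# Hypothetical bucket names for illustration only.
policy = customize_policy(
    template, "example-billing-bucket", "example-athena-results"
)
print(json.dumps(policy, indent=2))
```

The helper keeps the trailing wildcard used in the policy above, so each ARN covers both the bucket and the objects inside it. The original template is left unmodified.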
As Kubecost integrates with additional AWS services, additional read access may need to be granted to access those services.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation",
        "s3:ListAllMyBuckets"
      ],
      "Resource": [
        "*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "sns:ListTopics",
        "sns:GetTopicAttributes"
      ],
      "Resource": [
        "*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "cloudwatch:PutMetricAlarm",
        "cloudwatch:DescribeAlarms",
        "cloudwatch:DeleteAlarms",
        "cloudwatch:ListMetrics",
        "cloudwatch:GetMetricStatistics",
        "cloudwatch:Describe*"
      ],
      "Resource": [
        "*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": "ec2:Describe*",
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": "elasticloadbalancing:Describe*",
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": "autoscaling:Describe*",
      "Resource": "*"
    }
  ]
}
To allocate AWS resources to a Kubernetes concept, use the following tag naming scheme:
| Kubernetes Concept | AWS Tag Key | AWS Tag Value |
|---|---|---|
| Namespace | kubernetes_namespace | <namespace name> |
| Deployment | kubernetes_deployment | <deployment name> |
| Label | kubernetes_label_NAME* | <label value> |
| Pod | kubernetes_pod | <pod name> |
| DaemonSet | kubernetes_daemonset | <daemonset name> |
| Container | kubernetes_container | <container name> |
*In the kubernetes_label_NAME tag key, the NAME portion should appear exactly as the label appears inside of Kubernetes. For example, for the label app.kubernetes.io/name, this tag key would appear as kubernetes_label_app.kubernetes.io/name.
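The naming scheme in the table above can be expressed as a small helper that produces the AWS tag keys and values for a given Kubernetes object. This is a sketch for illustration: the function name and the example namespace and label values are hypothetical.

```python
def aws_tags_for(namespace=None, deployment=None, pod=None,
                 daemonset=None, container=None, labels=None):
    """Map Kubernetes concepts to AWS tag keys using the scheme above."""
    simple = {
        "kubernetes_namespace": namespace,
        "kubernetes_deployment": deployment,
        "kubernetes_pod": pod,
        "kubernetes_daemonset": daemonset,
        "kubernetes_container": container,
    }
    # Keep only the concepts that were actually provided.
    tags = {key: value for key, value in simple.items() if value is not None}
    # Label names are appended verbatim, exactly as they appear in Kubernetes.
    for name, value in (labels or {}).items():
        tags[f"kubernetes_label_{name}"] = value
    return tags


# Hypothetical values for illustration only.
tags = aws_tags_for(
    namespace="payments",
    labels={"app.kubernetes.io/name": "checkout"},
)
print(tags)
```

The resulting dictionary can then be applied to the AWS resource through whichever tagging mechanism you already use, such as the console or infrastructure-as-code tooling.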
More on AWS tagging here.
Visit the Kubecost Settings page to provide the AWS access credentials and Athena information.
Note: you must include the protocol for your S3 bucket name, e.g. s3://aws-athena-query-results-5303329856255-us-east-1
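If the bucket value is generated by a script before being entered on the Settings page, a small normalization step can guard against a missing prefix. This is a minimal sketch: the helper name and the example bucket name are hypothetical.

```python
def with_s3_scheme(bucket):
    """Return the bucket location with a leading s3:// prefix."""
    bucket = bucket.strip()
    if bucket.startswith("s3://"):
        return bucket
    return "s3://" + bucket


# Hypothetical bucket name for illustration only.
print(with_s3_scheme("aws-athena-query-results-example"))
```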
- You may need to upgrade AWS Glue if you are running an old version: https://docs.aws.amazon.com/athena/latest/ug/glue-upgrade.html