Cost and Usage Report

Comprehensive cost and usage data delivered to S3 for deep billing analysis

The AWS Cost and Usage Report (CUR) is the most detailed billing dataset AWS produces, delivering line-item cost and usage data to an S3 bucket in CSV or Parquet format. It contains every charge on your AWS bill, broken down to the individual resource level when resource IDs are enabled, with cost allocation tags, pricing details, and usage quantities. For cloud engineers and FinOps practitioners, CUR is the foundation for building custom cost analytics, loading billing data into a data warehouse such as Redshift or querying it in place with Athena, and doing any analysis that goes beyond what Cost Explorer's pre-built reports support.

CUR Architecture - How Data Is Delivered

CUR is configured in the AWS Billing console and delivers files to an S3 bucket you specify. The delivery schedule and format are configurable:

Configuration	Options	Recommendation
Report granularity	Hourly, Daily, Monthly	Hourly for most use cases - enables time-of-day analysis
File format	CSV (gzip), Parquet (snappy)	Parquet for Athena/Redshift - much smaller and faster to query
Delivery frequency	Not configurable - AWS updates the report at least once a day (up to three times a day)	Plan for up to 24 hours of lag; CUR is not a near-real-time data source
Report versioning	Create new version or overwrite	Overwrite for Athena integration; new version for auditing
S3 bucket	Any bucket in your account	Dedicated bucket with restricted access for billing data

The S3 path structure for CUR files:

bash

# CUR S3 path structure
# s3://{bucket}/{prefix}/{report-name}/{date-range}/{assemblyId}/{report-name}-{number}.csv.gz

# Example with Parquet format and Athena integration
s3://my-billing-bucket/cur/MyReport/20240501-20240601/
  ├── 01/
  │   └── MyReport-00001.snappy.parquet
  ├── MyReport-Manifest.json     # describes the files in this delivery
  └── crawler-cfn.yml            # optional CloudFormation for Glue crawler

# The manifest JSON tells downstream tools what files are in this report period
# Always use the manifest to discover CUR files programmatically

💡

AWS re-delivers CUR files multiple times during a month as new data arrives and corrections are made. The manifest file ({report-name}-Manifest.json) is the authoritative list of the files that make up the current complete dataset for a billing period. Always load from the manifest rather than listing the S3 directory, which may still contain files from earlier deliveries.
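As a minimal sketch of manifest-driven loading, the function below turns a manifest document into a list of S3 URIs. It assumes the manifest exposes a `bucket` field and a `reportKeys` array; confirm both against a manifest from your own report before relying on them. The sample manifest, bucket name, and assembly ID are hypothetical.

```python
import json


def report_files_from_manifest(manifest_text: str) -> list[str]:
    """Return s3:// URIs for the data files listed in a CUR manifest."""
    manifest = json.loads(manifest_text)
    bucket = manifest["bucket"]
    # "reportKeys" is assumed to hold the object keys of the current data files.
    return [f"s3://{bucket}/{key}" for key in manifest["reportKeys"]]


# Hypothetical manifest content, trimmed to the fields used above.
sample_manifest = json.dumps({
    "assemblyId": "example-assembly-id",
    "bucket": "my-billing-bucket",
    "reportKeys": [
        "cur/MyReport/20240501-20240601/example-assembly-id/MyReport-00001.csv.gz",
        "cur/MyReport/20240501-20240601/example-assembly-id/MyReport-00002.csv.gz",
    ],
})

for uri in report_files_from_manifest(sample_manifest):
    print(uri)
```

In a real pipeline the manifest text would come from an S3 read of the manifest object; that step is left out here so the sketch runs without AWS credentials.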

Key CUR Data Columns and What They Mean

A CUR file can have 300+ columns. Understanding the most important ones is essential for building cost analysis:

Column Group	Key Columns	Description
Identity	identity/LineItemId, lineItem/UsageAccountId	ID for the line item within a report version; which account generated it
Time	lineItem/UsageStartDate, lineItem/UsageEndDate	When the usage occurred (hourly for hourly reports)
Service	lineItem/ProductCode, product/servicename	Which AWS service (e.g. AmazonEC2, AmazonRDS)
Resource	lineItem/ResourceId	Specific resource ARN or ID - enables per-resource costing
Usage	lineItem/UsageType, lineItem/UsageAmount	What type of usage (e.g. BoxUsage:t3.large) and how much
Cost - Unblended	lineItem/UnblendedCost	Cost charged for this line item: usage amount multiplied by the unblended rate (most common cost metric)
Cost - Blended	lineItem/BlendedCost	Averaged rate across org for this usage type (legacy - avoid)
Cost - Amortized	reservation/AmortizedUpfrontFeeForBillingPeriod	Upfront RI/SP cost spread across the period
Pricing	lineItem/LineItemType, pricing/term, pricing/unit	Whether On-Demand, RI, SP, tax, credit, refund
Tags	resourceTags/user:{TagKey}	One column per activated cost allocation tag, holding the tag value on the resource at time of usage
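The columns above use the `category/Name` form found in CSV reports, while Athena tables expose the same columns in snake_case (for example `line_item_unblended_cost`). The sketch below reproduces that renaming as I understand it: an underscore before each capital letter, everything lowercased, non-alphanumeric characters replaced by underscores, and repeated underscores collapsed. Treat the rules as an assumption and check the result against your own Glue table schema.

```python
import re


def athena_column_name(cur_column: str) -> str:
    """Convert a CUR column such as 'lineItem/UnblendedCost' to its Athena form."""
    name = re.sub(r"([A-Z])", r"_\1", cur_column).lower()  # underscore before capitals
    name = re.sub(r"[^a-z0-9]", "_", name)                 # '/', ':' and similar become '_'
    name = re.sub(r"_+", "_", name)                        # collapse repeated underscores
    return name.strip("_")


for column in (
    "lineItem/UnblendedCost",
    "lineItem/UsageStartDate",
    "resourceTags/user:Environment",
    "product/servicename",
):
    print(f"{column} -> {athena_column_name(column)}")
```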

Line item types in the LineItemType column - each requires different handling in analysis:

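LineItemType	Meaning	Include in Cost Analysis?
Usage	Usage charged at On-Demand rates	Yes - primary cost data
DiscountedUsage	Usage covered by a Reserved Instance	Yes - use reservation/EffectiveCost for the amortized cost
SavingsPlanCoveredUsage	Usage covered by a Savings Plan; UnblendedCost shows the On-Demand equivalent	Yes - use savingsPlan/SavingsPlanEffectiveCost for the amortized cost
SavingsPlanNegation	Negative entry that offsets the On-Demand cost on SavingsPlanCoveredUsage rows	Keep when summing unblended cost so covered usage is not double-counted; exclude from amortized views
SavingsPlanUpfrontFee / SavingsPlanRecurringFee	Upfront and recurring Savings Plans charges	Yes for unblended views; in amortized views count only the unused commitment
Fee	Upfront fee, such as the upfront payment for a Reserved Instance	Yes for unblended views; amortized views spread it across covered usage instead
RIFee	Recurring monthly Reserved Instance fee	Yes for unblended views; in amortized views count only the unused portion
Tax	Applicable taxes	Depends on use case - exclude for infrastructure cost analysis
Credit	AWS credits applied	Depends - may want to show pre-credit cost for planning
Refund	Billing corrections	Include to reconcile with actual bill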
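To make the per-type handling concrete, here is a minimal sketch of an amortized-cost calculation over CUR rows. It follows the commonly used CASE-expression pattern for amortized cost and assumes Athena-style column names such as `reservation_effective_cost` and `savings_plan_savings_plan_effective_cost`. The rows and amounts are hypothetical, and the logic should be validated against your own bill before it is used for chargeback.

```python
def amortized_cost(row: dict) -> float:
    """Amortized cost of one CUR row, keyed by Athena-style column names."""
    kind = row["line_item_line_item_type"]
    if kind == "SavingsPlanCoveredUsage":
        return row["savings_plan_savings_plan_effective_cost"]
    if kind == "DiscountedUsage":
        return row["reservation_effective_cost"]
    if kind == "SavingsPlanRecurringFee":
        # Only the unused commitment; the used part is already counted
        # on the SavingsPlanCoveredUsage rows.
        return (row["savings_plan_total_commitment_to_date"]
                - row["savings_plan_used_commitment"])
    if kind == "RIFee":
        return (row["reservation_unused_amortized_upfront_fee_for_billing_period"]
                + row["reservation_unused_recurring_fee"])
    if kind in ("SavingsPlanNegation", "SavingsPlanUpfrontFee"):
        return 0.0  # accounting entries, not amortized spend
    if kind == "Fee" and row.get("reservation_reservation_a_r_n"):
        return 0.0  # upfront RI fee, already spread across DiscountedUsage rows
    return row["line_item_unblended_cost"]


# Hypothetical rows with made-up amounts.
rows = [
    {"line_item_line_item_type": "Usage", "line_item_unblended_cost": 10.0},
    {"line_item_line_item_type": "SavingsPlanCoveredUsage",
     "line_item_unblended_cost": 5.0,
     "savings_plan_savings_plan_effective_cost": 3.0},
    {"line_item_line_item_type": "SavingsPlanNegation",
     "line_item_unblended_cost": -5.0},
]

total = sum(amortized_cost(r) for r in rows)
print(f"Amortized total: {total:.2f}")
```

The same branching translates directly into a SQL CASE expression if you prefer to compute amortized cost inside Athena.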

Querying CUR with Athena

Amazon Athena is the standard way to query CUR data. AWS provides a CloudFormation template to automatically create the Glue Data Catalog table and Glue Crawler when you enable CUR with Athena integration.

sql

-- Total cost by service for the current month
SELECT
  line_item_product_code AS service,
  SUM(line_item_unblended_cost) AS total_cost
FROM cur_database.cur_table
WHERE
  line_item_usage_start_date >= DATE_TRUNC('month', CURRENT_DATE)
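  -- SavingsPlanNegation stays in: it offsets the On-Demand cost shown on
  -- SavingsPlanCoveredUsage rows, so dropping it would overstate unblended spend
  AND line_item_line_item_type NOT IN ('Tax', 'Credit', 'Refund')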
GROUP BY 1
ORDER BY 2 DESC;

-- Cost per resource for EC2, with tags
SELECT
  line_item_resource_id AS resource_id,
  resource_tags_user_environment AS environment,
  resource_tags_user_team AS team,
  SUM(line_item_unblended_cost) AS monthly_cost
FROM cur_database.cur_table
WHERE
  line_item_product_code = 'AmazonEC2'
  AND year = '2024'
  AND month = '5'
  AND line_item_line_item_type = 'Usage'
GROUP BY 1, 2, 3
ORDER BY 4 DESC
LIMIT 100;

-- Daily cost trend for the last 30 days
SELECT
  DATE_TRUNC('day', line_item_usage_start_date) AS usage_date,
  SUM(line_item_unblended_cost) AS daily_cost
FROM cur_database.cur_table
WHERE
  line_item_usage_start_date >= CURRENT_DATE - INTERVAL '30' DAY
  AND line_item_line_item_type NOT IN ('Tax', 'Credit')
GROUP BY 1
ORDER BY 1;

Athena query costs for CUR analysis:

Format	Typical CUR Size (monthly)	Athena Cost per Full Scan (at $5 per TB scanned)	Recommendation
CSV (gzip)	1-10 GB for mid-size org	$0.005 - $0.05 per query	Usable but not optimal
Parquet (snappy)	100-500 MB for same org	$0.0005 - $0.0025 per query	Recommended - roughly 10x cheaper queries
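The per-query figures follow from simple arithmetic on bytes scanned. The sketch below assumes a price of $5 per TB scanned, decimal units, and a 10 MB minimum charge per query; all three are assumptions to verify against the current Athena pricing page for your region.

```python
import math

PRICE_PER_TB_USD = 5.00   # assumption: check the price for your region
BYTES_PER_MB = 10**6      # assumption: decimal units
BYTES_PER_TB = 10**12
MIN_BILLED_MB = 10        # assumption: per-query minimum charge


def athena_query_cost(bytes_scanned: int) -> float:
    """Estimated cost in USD for a query that scans the given number of bytes."""
    billed_mb = max(MIN_BILLED_MB, math.ceil(bytes_scanned / BYTES_PER_MB))
    return billed_mb * BYTES_PER_MB / BYTES_PER_TB * PRICE_PER_TB_USD


for label, size in (("10 GB CSV", 10 * 10**9), ("500 MB Parquet", 500 * 10**6)):
    print(f"{label}: ${athena_query_cost(size):.4f} per full scan")
```

Because Parquet is columnar, a query that selects only a few columns usually scans far less than the full file, so real costs tend to sit below the full-scan estimate.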

💡

Always filter your Athena queries on the year and month partition columns that the CUR Athena integration adds. A predicate such as year='2024' AND month='5' lets Athena skip every other billing period, which dramatically reduces data scanned and query cost. The Glue crawler created by the CUR CloudFormation template registers these partitions automatically.
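When a query spans several billing periods, the partition predicate can be generated rather than typed by hand. The helper below is a hypothetical sketch that assumes string-typed `year` and `month` partition columns with no zero padding on the month, as in the example above.

```python
from datetime import date


def partition_filter(start: date, end: date) -> str:
    """Build a WHERE fragment covering each billing month from start to end, inclusive."""
    if (start.year, start.month) > (end.year, end.month):
        raise ValueError("start must not be after end")
    clauses = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        clauses.append(f"(year = '{year}' AND month = '{month}')")
        # Advance one month, rolling over into the next year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return "(" + " OR ".join(clauses) + ")"


print(partition_filter(date(2023, 12, 15), date(2024, 2, 1)))
```

The outer parentheses make the fragment safe to combine with other conditions using AND.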

CUR vs Cost Explorer API - When to Use Each

Both CUR and the Cost Explorer API provide cost data, but they serve different needs:

Dimension	CUR	Cost Explorer API
Granularity	Hourly, resource-level	Daily or monthly aggregated
Resource IDs	Yes - every individual resource	Limited - not always available
Latency	24-hour delay on delivery	Data available within 24h of usage
Cost	S3 storage + Athena query costs	$0.01 per API request
Data retention	As long as you keep in S3	13 months in Cost Explorer
Custom analysis	Unlimited - full SQL on raw data	Limited to supported groupings and filters
Integration effort	Higher - requires Athena/Glue setup	Low - direct API calls
Best for	Custom FinOps tooling, data warehouse, auditing	Quick analysis, dashboards, cost anomaly detection

Decision framework:

Use Case	Recommended Tool
Quick ad-hoc cost question	Cost Explorer console
Automated monthly report via script	Cost Explorer API
Custom FinOps dashboard with per-resource breakdown	CUR + Athena
Feeding cost data into a data warehouse (Redshift, Snowflake)	CUR Parquet to S3
Cost allocation by tag with complex hierarchies	CUR + Athena
Auditing historical cost for 2+ years	CUR in S3 (set lifecycle policy)
Billing anomaly detection	AWS Cost Anomaly Detection (managed service, no CUR setup required)
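For the scripted monthly report case, a sketch of the request to Cost Explorer's GetCostAndUsage operation might look like the following. The function only builds the request parameters for the previous calendar month so that it runs without AWS credentials. The function name is hypothetical, and the grouping by service is one choice among several.

```python
from datetime import date


def previous_month_request(today: date) -> dict:
    """Build keyword arguments for a GetCostAndUsage call covering last month."""
    first_of_this_month = today.replace(day=1)
    if first_of_this_month.month == 1:
        first_of_prev = date(first_of_this_month.year - 1, 12, 1)
    else:
        first_of_prev = first_of_this_month.replace(month=first_of_this_month.month - 1)
    return {
        "TimePeriod": {
            "Start": first_of_prev.isoformat(),
            "End": first_of_this_month.isoformat(),  # End date is exclusive
        },
        "Granularity": "MONTHLY",
        "Metrics": ["UnblendedCost"],
        "GroupBy": [{"Type": "DIMENSION", "Key": "SERVICE"}],
    }


print(previous_month_request(date(2024, 6, 10)))
```

With boto3 installed and credentials configured, the result could be passed as `boto3.client("ce").get_cost_and_usage(**previous_month_request(date.today()))`. Each such request is billed, so a monthly script should cache its results.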

CUR Setup and Operational Best Practices

Setting up CUR correctly from the start avoids rework. Key decisions to make at setup time:

bash

# CUR is configured in AWS Billing console or via API
# aws cur put-report-definition (must run in us-east-1)

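# The report definition is passed as JSON to avoid shorthand quoting issues.
# The ATHENA artifact requires Parquet format and OVERWRITE_REPORT versioning.
aws cur put-report-definition \
  --region us-east-1 \
  --report-definition '{
    "ReportName": "my-cur-report",
    "TimeUnit": "HOURLY",
    "Format": "Parquet",
    "Compression": "Parquet",
    "AdditionalSchemaElements": ["RESOURCES", "SPLIT_COST_ALLOCATION_DATA"],
    "S3Bucket": "my-billing-bucket",
    "S3Prefix": "cur",
    "S3Region": "us-east-1",
    "AdditionalArtifacts": ["ATHENA"],
    "RefreshClosedReports": true,
    "ReportVersioning": "OVERWRITE_REPORT"
  }'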

Best practices for CUR configuration and management:

Practice	Why	How
Enable RESOURCES schema element	Adds ResourceId column for per-resource analysis	Include in AdditionalSchemaElements
Use Parquet format	Roughly 10x smaller files and correspondingly cheaper Athena scans	Set Format=Parquet
Enable Athena integration	Auto-creates Glue catalog and partitioning	Set AdditionalArtifacts=[ATHENA]
Set S3 lifecycle policy	CUR grows indefinitely - archive old data	Transition to Glacier after 90 days, delete after 3 years
Enable bucket versioning	CUR overwrites files - versioning enables recovery	Enable on the CUR S3 bucket
Restrict S3 access	Billing data is sensitive	Bucket policy: only Billing service + specific IAM roles
Activate cost allocation tags	Tags only appear in CUR after activation	Billing console > Cost Allocation Tags

⚠️

CUR files for large AWS organizations can be multiple GB per day. The RESOURCES schema element significantly increases file size (sometimes 5-10x) because it adds a row per resource per hour. For organizations with thousands of resources, evaluate whether you need hourly granularity or if daily is sufficient, and use Parquet to manage the data volume.

🎯

Interview Focus Points

1. What is the difference between CUR and the Cost Explorer API? When would you use CUR instead of calling the Cost Explorer API?
2. How would you set up a cost analytics pipeline that gives per-team cost breakdown by tag, updated daily?
3. What is the difference between unblended, blended, and amortized cost in CUR? Which would you use for chargeback to teams?
4. What are SavingsPlanNegation line items in CUR and how should you handle them in unblended versus amortized cost analysis?
5. How does the CUR manifest file work and why should you use it instead of listing the S3 directory?
6. A team's CUR queries in Athena are slow and expensive - what would you do to optimize them?
7. How do you handle CUR data for an AWS Organization where you want per-account cost visibility?
8. What is the SPLIT_COST_ALLOCATION_DATA schema element in CUR and when would you need it?
9. CUR has a 24-hour delay - how would you handle a situation where you need near-real-time cost visibility?
10. How would you model amortized Reserved Instance costs in CUR for a monthly chargeback report?