AWS Cost Management
Cost and Usage Report
Comprehensive cost and usage data delivered to S3 for deep billing analysis
The AWS Cost and Usage Report (CUR) is the most detailed billing dataset AWS produces, delivering line-item cost and usage data to an S3 bucket in CSV or Parquet format. It contains every charge on your AWS bill broken down to the individual resource level, with cost allocation tags, pricing details, and usage quantities. For cloud engineers and FinOps practitioners, CUR is the foundation for building custom cost analytics, querying billing data with Athena or loading it into a data warehouse such as Redshift, and doing any analysis that goes beyond what Cost Explorer's pre-built reports support.
CUR Architecture - How Data Is Delivered
CUR is configured in the AWS Billing console and delivers files to an S3 bucket you specify. Granularity, file format, and versioning are configurable; the refresh schedule is set by AWS:
| Configuration | Options | Recommendation |
|---|---|---|
| Report granularity | Hourly, Daily, Monthly | Hourly for most use cases - enables time-of-day analysis |
| File format | CSV (gzip), Parquet (snappy) | Parquet for Athena/Redshift - much smaller and faster to query |
| Delivery frequency | Not configurable - AWS refreshes the report at least once a day, up to three times a day | Do not rely on CUR for near-real-time cost visibility |
| Report versioning | Create new version or overwrite | Overwrite for Athena integration; new version for auditing |
| S3 bucket | Any bucket in your account | Dedicated bucket with restricted access for billing data |
The S3 path structure for CUR files:
# CUR S3 path structure
# s3://{bucket}/{prefix}/{report-name}/{date-range}/{assemblyId}/{report-name}-{number}.csv.gz
# Example with Parquet format and Athena integration
s3://my-billing-bucket/cur/MyReport/20240501-20240601/
├── 01/
│ └── MyReport-00001.snappy.parquet
├── MyReport-Manifest.json # describes the files in this delivery
└── crawler-cfn.yml # optional CloudFormation for Glue crawler
# The manifest JSON tells downstream tools what files are in this report period
# Always use the manifest to discover CUR files programmatically
AWS re-delivers CUR files multiple times during a month as new data arrives and corrections are made. The manifest.json is the authoritative list of what files constitute the current complete dataset for a billing period. Always load from the manifest, not by listing the S3 directory.
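As a minimal sketch of loading from the manifest, the Python below turns a manifest document into a list of S3 URIs. It assumes the manifest exposes `bucket` and `reportKeys` fields; the sample manifest is trimmed and made up for illustration, and `report_uris` is a hypothetical helper name, not part of any AWS SDK.

```python
import json

def report_uris(manifest: dict) -> list[str]:
    """Return the S3 URIs of the data files listed in a CUR manifest."""
    bucket = manifest["bucket"]
    return [f"s3://{bucket}/{key}" for key in manifest["reportKeys"]]

# Trimmed, made-up manifest for illustration; real manifests carry more fields.
sample_manifest = json.loads("""
{
  "assemblyId": "example-assembly-id",
  "bucket": "my-billing-bucket",
  "reportKeys": [
    "cur/MyReport/20240501-20240601/01/MyReport-00001.snappy.parquet"
  ]
}
""")

for uri in report_uris(sample_manifest):
    print(uri)
```

In a real pipeline the manifest would be fetched from S3 first; only the keys it lists are then loaded, so files left over from an earlier delivery are ignored.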
Key CUR Data Columns and What They Mean
A CUR file can have 300+ columns. Understanding the most important ones is essential for building cost analysis:
| Column Group | Key Columns | Description |
|---|---|---|
| Identity | identity/LineItemId, lineItem/UsageAccountId | ID for the line item within a report version; which account generated the usage |
| Time | lineItem/UsageStartDate, lineItem/UsageEndDate | When the usage occurred (hourly for hourly reports) |
| Service | lineItem/ProductCode, product/servicename | Which AWS service (e.g. AmazonEC2, AmazonRDS) |
| Resource | lineItem/ResourceId | Specific resource ARN or ID - enables per-resource costing |
| Usage | lineItem/UsageType, lineItem/UsageAmount | What type of usage (e.g. BoxUsage:t3.large) and how much |
| Cost - Unblended | lineItem/UnblendedCost | Cost of this line item at the rate actually charged (most common cost metric) |
| Cost - Blended | lineItem/BlendedCost | Averaged rate across org for this usage type (legacy - avoid) |
| Cost - Amortized | reservation/EffectiveCost, reservation/AmortizedUpfrontFeeForBillingPeriod, savingsPlan/SavingsPlanEffectiveCost | RI and Savings Plans cost with upfront fees spread across the period |
| Pricing | lineItem/LineItemType, pricing/term, pricing/unit | Whether On-Demand, RI, SP, tax, credit, refund |
| Tags | resourceTags/user:{TagKey} | All cost allocation tags on the resource at time of usage |
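The column names in this table use the `group/Name` form found in CSV reports, while Athena tables expose the same columns in snake_case (for example `line_item_unblended_cost`). The sketch below approximates that mapping. `athena_column_name` is a hypothetical helper, and the regex rule is an assumption that works for the columns listed here but may not match every column, particularly names containing acronyms.

```python
import re

def athena_column_name(cur_column: str) -> str:
    """Approximate the Athena column name for a CUR column such as 'lineItem/UnblendedCost'."""
    # Split camelCase words: 'lineItem' -> 'line_Item'
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", cur_column)
    # Replace '/', ':' and any other non-alphanumeric run with a single '_'
    name = re.sub(r"[^A-Za-z0-9]+", "_", name)
    return name.lower()

for column in ["lineItem/UnblendedCost", "lineItem/LineItemType", "resourceTags/user:Environment"]:
    print(column, "->", athena_column_name(column))
```

Checking the generated name against the Glue table schema is still the safer route before putting it in a query.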
Line item types in the LineItemType column - each requires different handling in analysis:
| LineItemType | Meaning | Include in Cost Analysis? |
|---|---|---|
| Usage | Normal On-Demand or covered compute usage | Yes - primary cost data |
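| DiscountedUsage | Usage covered by a Reserved Instance at the discounted rate | Yes - real cost |
| SavingsPlanCoveredUsage | Usage covered by a Savings Plan, recorded at its On-Demand cost | Yes, together with the matching SavingsPlanNegation row |
| SavingsPlanNegation | Negative offset that cancels the On-Demand cost on the matching SavingsPlanCoveredUsage row | Keep for unblended totals so covered usage nets to zero; exclude for amortized analysis |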
| Fee | RI or Savings Plans upfront or recurring fee | Yes for amortized cost analysis |
| RIFee | Reserved Instance fee line item | Include in amortized but not unblended analysis |
| Tax | Applicable taxes | Depends on use case - exclude for infrastructure cost analysis |
| Credit | AWS credits applied | Depends - may want to show pre-credit cost for planning |
| Refund | Billing corrections | Include to reconcile with actual bill |
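To see why the Savings Plans rows need care, the sketch below sums unblended cost per line item type on a made-up dataset. The row values are invented for illustration, `unblended_by_type` is a hypothetical helper, and `SavingsPlanRecurringFee` is an additional Savings Plans line item type that the table above does not list.

```python
from collections import defaultdict
from decimal import Decimal

def unblended_by_type(rows):
    """Sum unblended cost per line item type."""
    totals = defaultdict(Decimal)
    for row in rows:
        totals[row["line_item_type"]] += Decimal(row["unblended_cost"])
    return dict(totals)

# Made-up rows: one On-Demand instance and one instance covered by a Savings Plan.
rows = [
    {"line_item_type": "Usage", "unblended_cost": "10.00"},
    {"line_item_type": "SavingsPlanCoveredUsage", "unblended_cost": "8.00"},
    {"line_item_type": "SavingsPlanNegation", "unblended_cost": "-8.00"},
    {"line_item_type": "SavingsPlanRecurringFee", "unblended_cost": "6.00"},
]

totals = unblended_by_type(rows)
sp_net = totals["SavingsPlanCoveredUsage"] + totals["SavingsPlanNegation"]
print(sp_net)                # covered usage and its negation cancel out
print(sum(totals.values()))  # total unblended cost in this toy dataset
```

Dropping only the negation row from an unblended sum would count the covered usage at its On-Demand cost on top of the recurring fee, so the two Savings Plans rows should be kept or dropped together.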
Querying CUR with Athena
Amazon Athena is the standard way to query CUR data. AWS provides a CloudFormation template to automatically create the Glue Data Catalog table and Glue Crawler when you enable CUR with Athena integration.
-- Total cost by service for the current month
SELECT
line_item_product_code AS service,
SUM(line_item_unblended_cost) AS total_cost
FROM cur_database.cur_table
WHERE
line_item_usage_start_date >= DATE_TRUNC('month', CURRENT_DATE)
-- SavingsPlanNegation rows stay in so they offset SavingsPlanCoveredUsage in unblended totals
AND line_item_line_item_type NOT IN ('Tax', 'Credit', 'Refund')
GROUP BY 1
ORDER BY 2 DESC;
-- Cost per resource for EC2, with tags
SELECT
line_item_resource_id AS resource_id,
resource_tags_user_environment AS environment,
resource_tags_user_team AS team,
SUM(line_item_unblended_cost) AS monthly_cost
FROM cur_database.cur_table
WHERE
line_item_product_code = 'AmazonEC2'
AND year = '2024'
AND month = '5'
AND line_item_line_item_type = 'Usage'
GROUP BY 1, 2, 3
ORDER BY 4 DESC
LIMIT 100;
-- Daily cost trend for the last 30 days
SELECT
DATE_TRUNC('day', line_item_usage_start_date) AS usage_date,
SUM(line_item_unblended_cost) AS daily_cost
FROM cur_database.cur_table
WHERE
line_item_usage_start_date >= CURRENT_DATE - INTERVAL '30' DAY
AND line_item_line_item_type NOT IN ('Tax', 'Credit')
GROUP BY 1
ORDER BY 1;
Athena query costs for CUR analysis:
| Format | Typical CUR Size (monthly) | Athena Cost per Full Scan | Recommendation |
|---|---|---|---|
| CSV (gzip) | 1-10 GB for mid-size org | $0.005 - $0.05 per query at $5 per TB scanned | Usable but not optimal |
| Parquet (snappy) | 100-500 MB for same org | $0.0005 - $0.0025 per query, less when only a few columns are read | Recommended - roughly 10x cheaper queries |
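The per-query figures follow from Athena's pay-per-scan pricing, which the sketch below works through. It assumes a price of $5 per TB scanned, a TB of 2**40 bytes, and a 10 MB minimum per query; all three are assumptions to check against current pricing for your region, and `athena_scan_cost` is a hypothetical helper.

```python
TIB = 1024 ** 4             # assumption: one "TB" billed as 2**40 bytes
MIN_BYTES = 10 * 1024 ** 2  # assumption: 10 MB minimum billed per query

def athena_scan_cost(bytes_scanned: int, usd_per_tb: float = 5.0) -> float:
    """Estimate the cost in USD of a query that scans `bytes_scanned` bytes."""
    billed = max(bytes_scanned, MIN_BYTES)
    return billed / TIB * usd_per_tb

sizes = [("500 MB", 500 * 1024 ** 2), ("1 GB", 1024 ** 3), ("10 GB", 10 * 1024 ** 3)]
for label, size in sizes:
    print(f"{label}: ${athena_scan_cost(size):.4f}")
```

A single scan is cheap either way; the format matters once dashboards or scheduled jobs run hundreds of queries a day.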
Always filter your Athena queries on the year and month partition columns that CUR adds. A predicate such as year='2024' AND month='5' restricts the scan to one billing period, which dramatically reduces data scanned and Athena costs. The Glue crawler created by the CUR CloudFormation template adds these partition columns automatically.
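One way to make the partition predicate hard to forget is to generate it. The sketch below builds the filter for the billing month containing a given date and embeds it in a query string. Both function names are hypothetical, and the table name is the placeholder used in the queries above.

```python
from datetime import date

def partition_filter(day: date) -> str:
    """Build the partition predicate for the billing month containing `day`."""
    # Partition values are strings, and the month is not zero-padded ('5', not '05').
    return f"year = '{day.year}' AND month = '{day.month}'"

def monthly_cost_query(table: str, day: date) -> str:
    """Return a cost-by-service query restricted to one billing-period partition."""
    return (
        "SELECT line_item_product_code AS service,\n"
        "       SUM(line_item_unblended_cost) AS total_cost\n"
        f"FROM {table}\n"
        f"WHERE {partition_filter(day)}\n"
        "GROUP BY 1\n"
        "ORDER BY 2 DESC"
    )

print(monthly_cost_query("cur_database.cur_table", date(2024, 5, 15)))
```

The table name is interpolated directly, so this is only suitable for trusted, hard-coded identifiers.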
CUR vs Cost Explorer API - When to Use Each
Both CUR and the Cost Explorer API provide cost data, but they serve different needs:
| Dimension | CUR | Cost Explorer API |
|---|---|---|
| Granularity | Hourly, resource-level | Daily or monthly aggregated |
| Resource IDs | Yes - every individual resource | Limited - not always available |
| Latency | 24-hour delay on delivery | Data available within 24h of usage |
| Cost | S3 storage + Athena query costs | $0.01 per API request |
| Data retention | As long as you keep in S3 | 13 months in Cost Explorer |
| Custom analysis | Unlimited - full SQL on raw data | Limited to supported groupings and filters |
| Integration effort | Higher - requires Athena/Glue setup | Low - direct API calls |
| Best for | Custom FinOps tooling, data warehouse, auditing | Quick analysis, dashboards, cost anomaly detection |
Decision framework:
| Use Case | Recommended Tool |
|---|---|
| Quick ad-hoc cost question | Cost Explorer console |
| Automated monthly report via script | Cost Explorer API |
| Custom FinOps dashboard with per-resource breakdown | CUR + Athena |
| Feeding cost data into a data warehouse (Redshift, Snowflake) | CUR Parquet to S3 |
| Cost allocation by tag with complex hierarchies | CUR + Athena |
| Auditing historical cost for 2+ years | CUR in S3 (set lifecycle policy) |
| Billing anomaly detection | Cost Anomaly Detection (managed service; no CUR pipeline required) |
CUR Setup and Operational Best Practices
Setting up CUR correctly from the start avoids rework. Key decisions to make at setup time:
# CUR is configured in the AWS Billing console or via the API
# The CUR API is served from us-east-1, so pass --region us-east-1
# The report definition is passed as JSON to avoid shorthand-syntax quoting issues
aws cur put-report-definition \
  --region us-east-1 \
  --report-definition '{
    "ReportName": "my-cur-report",
    "TimeUnit": "HOURLY",
    "Format": "Parquet",
    "Compression": "Parquet",
    "AdditionalSchemaElements": ["RESOURCES", "SPLIT_COST_ALLOCATION_DATA"],
    "S3Bucket": "my-billing-bucket",
    "S3Prefix": "cur",
    "S3Region": "us-east-1",
    "AdditionalArtifacts": ["ATHENA"],
    "RefreshClosedReports": true,
    "ReportVersioning": "OVERWRITE_REPORT"
  }'
Best practices for CUR configuration and management:
| Practice | Why | How |
|---|---|---|
| Enable RESOURCES schema element | Adds ResourceId column for per-resource analysis | Include in AdditionalSchemaElements |
| Use Parquet format | 10x smaller files, 10x cheaper Athena queries | Set Format=Parquet |
| Enable Athena integration | Auto-creates Glue catalog and partitioning | Set AdditionalArtifacts=[ATHENA] |
| Set S3 lifecycle policy | CUR grows indefinitely - archive old data | Transition to Glacier after 90 days, delete after 3 years |
| Enable bucket versioning | CUR overwrites files - versioning enables recovery | Enable on the CUR S3 bucket |
| Restrict S3 access | Billing data is sensitive | Bucket policy: only Billing service + specific IAM roles |
| Activate cost allocation tags | Tags only appear in CUR after activation | Billing console > Cost Allocation Tags |
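Several of these settings depend on each other, so a small pre-flight check before creating the report can catch mistakes. The sketch below inspects a report definition shaped like the one passed to `put-report-definition`. The function name is hypothetical, and the rules it encodes (Athena integration paired with Parquet and overwrite versioning, RESOURCES needed for resource IDs) are taken from the recommendations in this section rather than from an exhaustive reading of the API.

```python
def athena_definition_problems(definition: dict) -> list[str]:
    """Return human-readable problems found in a CUR report definition."""
    problems = []
    if "ATHENA" in definition.get("AdditionalArtifacts", []):
        if definition.get("Format") != "Parquet" or definition.get("Compression") != "Parquet":
            problems.append("Athena integration expects Format and Compression set to Parquet")
        if definition.get("ReportVersioning") != "OVERWRITE_REPORT":
            problems.append("Athena integration expects ReportVersioning=OVERWRITE_REPORT")
    if "RESOURCES" not in definition.get("AdditionalSchemaElements", []):
        problems.append("RESOURCES is missing, so resource IDs will not be included")
    return problems

definition = {
    "ReportName": "my-cur-report",
    "TimeUnit": "HOURLY",
    "Format": "Parquet",
    "Compression": "Parquet",
    "AdditionalSchemaElements": ["RESOURCES"],
    "AdditionalArtifacts": ["ATHENA"],
    "ReportVersioning": "OVERWRITE_REPORT",
}

print(athena_definition_problems(definition) or "no problems found")
```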
CUR files for large AWS organizations can be multiple GB per day. The RESOURCES schema element significantly increases file size (sometimes 5-10x) because it adds a row per resource per hour. For organizations with thousands of resources, evaluate whether you need hourly granularity or if daily is sufficient, and use Parquet to manage the data volume.
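A rough way to weigh hourly against daily granularity is to estimate row counts. The sketch below assumes exactly one usage row per resource per reporting period, which is a lower bound since a single resource often produces several usage types; `estimated_rows` is a hypothetical helper and the resource count is made up.

```python
def estimated_rows(resources: int, days: int, hourly: bool) -> int:
    """Lower-bound row count, assuming one usage row per resource per period."""
    periods = days * 24 if hourly else days
    return resources * periods

resources = 5000  # made-up fleet size for illustration
print("hourly:", estimated_rows(resources, 30, hourly=True))
print("daily: ", estimated_rows(resources, 30, hourly=False))
```

Under this assumption hourly granularity produces 24 times as many rows as daily, which is the trade-off to weigh against the need for time-of-day analysis.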
Interview Focus Points
1. What is the difference between CUR and the Cost Explorer API? When would you use CUR instead of calling the Cost Explorer API?
2. How would you set up a cost analytics pipeline that gives per-team cost breakdown by tag, updated daily?
3. What is the difference between unblended, blended, and amortized cost in CUR? Which would you use for chargeback to teams?
4. What are SavingsPlanNegation line items in CUR and why do you need to exclude them from cost analysis?
5. How does the CUR manifest.json file work and why should you use it instead of listing the S3 directory?
6. A team's CUR queries in Athena are slow and expensive - what would you do to optimize them?
7. How do you handle CUR data for an AWS Organization where you want per-account cost visibility?
8. What is the SPLIT_COST_ALLOCATION_DATA schema element in CUR and when would you need it?
9. CUR has a 24-hour delay - how would you handle a situation where you need near-real-time cost visibility?
10. How would you model amortized Reserved Instance costs in CUR for a monthly chargeback report?