Ace Cloud Interviews

AWS Cost Management

Cost and Usage Report

Comprehensive cost and usage data delivered to S3 for deep billing analysis

The AWS Cost and Usage Report (CUR) is the most detailed billing dataset AWS produces, delivering line-item cost and usage data to an S3 bucket in CSV or Parquet format. It contains every charge on your AWS bill, broken down to the individual resource level when resource IDs are enabled, with cost allocation tags, pricing details, and usage quantities. For cloud engineers and FinOps practitioners, CUR is the foundation for building custom cost analytics, querying billing data with Athena or loading it into a data warehouse such as Redshift, and doing any analysis that goes beyond what Cost Explorer's pre-built reports support.

CUR Architecture - How Data Is Delivered

CUR is configured in the AWS Billing console and delivers files to an S3 bucket you specify. Granularity, file format, and versioning are configurable; the refresh schedule is set by AWS:

| Configuration | Options | Recommendation |
| --- | --- | --- |
| Report granularity | Hourly, Daily, Monthly | Hourly for most use cases - enables time-of-day analysis |
| File format | CSV (gzip), Parquet (snappy) | Parquet for Athena/Redshift - much smaller and faster to query |
| Delivery frequency | Set by AWS: the report is updated at least once a day, up to three times a day | Plan for up to a 24-hour lag; CUR is not a near-real-time feed |
| Report versioning | Create new version or overwrite | Overwrite for Athena integration; new version for auditing |
| S3 bucket | Any bucket in your account | Dedicated bucket with restricted access for billing data |

The S3 path structure for CUR files:

bash
# CUR S3 path structure (CSV, "create new report version")
# s3://{bucket}/{prefix}/{report-name}/{date-range}/{assemblyId}/{report-name}-{number}.csv.gz
# With "overwrite existing report", the {assemblyId} level is omitted.

# Example with Parquet format and Athena integration
# (data files are laid out in year=/month= partition folders)
s3://my-billing-bucket/cur/MyReport/
  ├── MyReport/year=2024/month=5/
  │   └── MyReport-00001.snappy.parquet
  ├── 20240501-20240601/
  │   └── MyReport-Manifest.json     # describes the files in this delivery
  └── crawler-cfn.yml                # CloudFormation template for the Glue crawler

# The manifest JSON tells downstream tools what files are in this report period
# Always use the manifest to discover CUR files programmatically
💡 Tip: AWS re-delivers CUR files multiple times during a month as new data arrives and corrections are made. The manifest file ({report-name}-Manifest.json) is the authoritative list of the files that make up the current complete dataset for a billing period. Always load from the manifest, not by listing the S3 directory.
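As a minimal sketch of manifest-driven loading, the snippet below reads the `bucket` and `reportKeys` fields from a manifest and turns them into S3 URIs. The function name and the sample manifest are hypothetical and trimmed to those two fields; a real manifest carries more metadata and would be fetched from S3 rather than embedded.

```python
import json


def report_files_from_manifest(manifest_text):
    """Return the s3:// URIs listed in a CUR manifest's reportKeys array."""
    manifest = json.loads(manifest_text)
    bucket = manifest["bucket"]
    return [f"s3://{bucket}/{key}" for key in manifest["reportKeys"]]


# Illustrative manifest (made-up values, only the two fields used here).
SAMPLE_MANIFEST = json.dumps({
    "bucket": "my-billing-bucket",
    "reportKeys": [
        "cur/MyReport/20240501-20240601/MyReport-00001.csv.gz",
        "cur/MyReport/20240501-20240601/MyReport-00002.csv.gz",
    ],
})

for uri in report_files_from_manifest(SAMPLE_MANIFEST):
    print(uri)
```

Loading only the keys named in the manifest avoids picking up files left over from an earlier delivery of the same billing period.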

Key CUR Data Columns and What They Mean

A CUR file can have 300+ columns. Understanding the most important ones is essential for building cost analysis:

| Column Group | Key Columns | Description |
| --- | --- | --- |
| Identity | identity/LineItemId, lineItem/UsageAccountId | ID for the line item; which account generated it |
| Time | lineItem/UsageStartDate, lineItem/UsageEndDate | When the usage occurred (hourly for hourly reports) |
| Service | lineItem/ProductCode, product/servicename | Which AWS service (e.g. AmazonEC2, AmazonRDS) |
| Resource | lineItem/ResourceId | Specific resource ARN or ID - enables per-resource costing |
| Usage | lineItem/UsageType, lineItem/UsageAmount | What type of usage (e.g. BoxUsage:t3.large) and how much |
| Cost - Unblended | lineItem/UnblendedCost | Cost at the rate actually charged for this line item (most common cost metric) |
| Cost - Blended | lineItem/BlendedCost | Cost at a rate averaged across the organization for this usage type (legacy - avoid) |
| Cost - Amortized | reservation/AmortizedUpfrontFeeForBillingPeriod | Upfront RI fee spread across the billing period (Savings Plans have equivalent savingsPlan/ columns) |
| Pricing | lineItem/LineItemType, pricing/term, pricing/unit | Whether On-Demand, RI, SP, tax, credit, refund |
| Tags | resourceTags/user:{TagKey} | Activated cost allocation tags on the resource at time of usage |
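The table lists CSV-style column names such as lineItem/UnblendedCost, while the Athena queries on this page use names such as line_item_unblended_cost. A rough sketch of that mapping follows; the helper name `athena_column_name` is hypothetical and the rule is an approximation of the naming convention, so check the actual Glue table schema before relying on it.

```python
import re


def athena_column_name(cur_column):
    """Approximate the Athena column name for a CSV-style CUR column name."""
    # Split on separators such as "/" and ":" and drop empty fragments.
    parts = [p for p in re.split(r"[^0-9A-Za-z]+", cur_column) if p]
    # Insert "_" before every capital letter that is not the first character.
    snake = [re.sub(r"(?<!^)(?=[A-Z])", "_", p).lower() for p in parts]
    return "_".join(snake)


for name in ("lineItem/UnblendedCost",
             "lineItem/LineItemType",
             "resourceTags/user:Environment"):
    print(name, "->", athena_column_name(name))
```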

Line item types in the LineItemType column - each requires different handling in analysis:

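| LineItemType | Meaning | Include in Cost Analysis? |
| --- | --- | --- |
| Usage | On-Demand usage | Yes - primary cost data |
| DiscountedUsage | Usage covered by a Reserved Instance | Yes - use reservation/EffectiveCost for the amortized cost |
| SavingsPlanCoveredUsage | Usage covered by a Savings Plan | Yes - use savingsPlan/SavingsPlanEffectiveCost for actual spend; the unblended cost is the On-Demand equivalent |
| SavingsPlanNegation | Offset entry that cancels the unblended cost of the matching SavingsPlanCoveredUsage line | Exclude for amortized analysis; keep when summing unblended cost so totals reconcile with the bill |
| Fee | Upfront fee, such as an RI upfront payment | Yes for unblended (cash) analysis; for amortized analysis the upfront amount is spread over usage instead |
| RIFee | Recurring monthly Reserved Instance fee | Yes for unblended analysis; for amortized analysis count only the unused portion |
| Tax | Applicable taxes | Depends on use case - exclude for infrastructure cost analysis |
| Credit | AWS credits applied | Depends - may want to show pre-credit cost for planning |
| Refund | Billing corrections | Include to reconcile with actual bill |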
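To make the per-type handling concrete, here is a hedged sketch of an amortized-cost rule applied row by row. It follows a pattern commonly used in CUR amortization queries rather than anything prescribed on this page: commitment-covered usage takes its effective cost, fee lines contribute only their unused portion, and everything else falls back to unblended cost. The function names and sample rows are made up for illustration.

```python
from decimal import Decimal


def _num(row, column):
    """Read a numeric CUR column, treating missing or empty values as zero."""
    value = row.get(column)
    return Decimal(str(value)) if value not in (None, "") else Decimal("0")


def amortized_cost(row):
    """Amortized cost of one CUR line item (simplified sketch)."""
    item_type = row["lineItem/LineItemType"]
    if item_type == "SavingsPlanCoveredUsage":
        return _num(row, "savingsPlan/SavingsPlanEffectiveCost")
    if item_type == "DiscountedUsage":
        return _num(row, "reservation/EffectiveCost")
    if item_type == "SavingsPlanRecurringFee":
        # Only the unused commitment; used commitment is counted via covered usage.
        return (_num(row, "savingsPlan/TotalCommitmentToDate")
                - _num(row, "savingsPlan/UsedCommitment"))
    if item_type == "RIFee":
        return (_num(row, "reservation/UnusedAmortizedUpfrontFeeForBillingPeriod")
                + _num(row, "reservation/UnusedRecurringFee"))
    if item_type in ("SavingsPlanNegation", "SavingsPlanUpfrontFee"):
        return Decimal("0")
    if item_type == "Fee" and row.get("reservation/ReservationARN"):
        return Decimal("0")  # RI upfront fee is already spread over usage lines
    return _num(row, "lineItem/UnblendedCost")


# Made-up sample rows for illustration only.
SAMPLE_ROWS = [
    {"lineItem/LineItemType": "Usage", "lineItem/UnblendedCost": "10.00"},
    {"lineItem/LineItemType": "SavingsPlanCoveredUsage",
     "lineItem/UnblendedCost": "5.00",
     "savingsPlan/SavingsPlanEffectiveCost": "3.50"},
    {"lineItem/LineItemType": "SavingsPlanNegation", "lineItem/UnblendedCost": "-5.00"},
    {"lineItem/LineItemType": "DiscountedUsage", "reservation/EffectiveCost": "2.25"},
    {"lineItem/LineItemType": "RIFee",
     "reservation/UnusedAmortizedUpfrontFeeForBillingPeriod": "0.50",
     "reservation/UnusedRecurringFee": "0.25"},
]

total = sum((amortized_cost(r) for r in SAMPLE_ROWS), Decimal("0"))
print("Amortized total:", total)
```

The same CASE-style branching can be expressed in SQL once the rules are agreed; Python is used here only to keep the example self-contained.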

Querying CUR with Athena

Amazon Athena is the standard way to query CUR data. AWS provides a CloudFormation template to automatically create the Glue Data Catalog table and Glue Crawler when you enable CUR with Athena integration.

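sql
-- Total cost by service for the current month
-- Caveat: SavingsPlanCoveredUsage lines carry the On-Demand-equivalent unblended cost, so
-- excluding SavingsPlanNegation overstates Savings Plans spend relative to the invoice.
-- Keep the negation lines when the total must reconcile with the bill.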
SELECT
  line_item_product_code AS service,
  SUM(line_item_unblended_cost) AS total_cost
FROM cur_database.cur_table
WHERE
  line_item_usage_start_date >= DATE_TRUNC('month', CURRENT_DATE)
  AND line_item_line_item_type NOT IN ('Tax', 'Credit', 'Refund', 'SavingsPlanNegation')
GROUP BY 1
ORDER BY 2 DESC;

-- Cost per resource for EC2, with tags
SELECT
  line_item_resource_id AS resource_id,
  resource_tags_user_environment AS environment,
  resource_tags_user_team AS team,
  SUM(line_item_unblended_cost) AS monthly_cost
FROM cur_database.cur_table
WHERE
  line_item_product_code = 'AmazonEC2'
  AND year = '2024'
  AND month = '5'
  AND line_item_line_item_type = 'Usage'
GROUP BY 1, 2, 3
ORDER BY 4 DESC
LIMIT 100;

-- Daily cost trend for the last 30 days
SELECT
  DATE_TRUNC('day', line_item_usage_start_date) AS usage_date,
  SUM(line_item_unblended_cost) AS daily_cost
FROM cur_database.cur_table
WHERE
  line_item_usage_start_date >= CURRENT_DATE - INTERVAL '30' DAY
  AND line_item_line_item_type NOT IN ('Tax', 'Credit')
GROUP BY 1
ORDER BY 1;

Athena query costs for CUR analysis:

| Format | Typical CUR Size (monthly) | Athena Cost per Full Scan (at $5 per TB scanned) | Recommendation |
| --- | --- | --- | --- |
| CSV (gzip) | 1-10 GB for mid-size org | $0.005 - $0.05 per query | Usable but not optimal |
| Parquet (snappy) | 100-500 MB for same org | $0.0005 - $0.0025 per query | Recommended - roughly 10x cheaper queries |
💡 Tip: Always filter your Athena queries on the year and month partition columns that the CUR Athena integration adds. Filtering on year='2024' AND month='5' limits the scan to one month of data, which reduces both runtime and Athena cost. The Glue crawler created by the CUR CloudFormation template registers these partitions automatically.
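The arithmetic behind Athena scan costs is simple enough to sketch. The example below assumes a price of $5 per TB scanned, decimal terabytes, and a 10 MB minimum billed per query; all three are assumptions that should be checked against current Athena pricing for your region. The function name is hypothetical.

```python
ASSUMED_PRICE_PER_TB_USD = 5.00   # assumption: check regional Athena pricing
BYTES_PER_TB = 10**12             # assumption: decimal terabytes
MIN_BILLED_BYTES = 10 * 10**6     # assumption: 10 MB minimum per query


def athena_scan_cost(bytes_scanned, price_per_tb=ASSUMED_PRICE_PER_TB_USD):
    """Estimate the cost in USD of one Athena query from bytes scanned."""
    billed = max(bytes_scanned, MIN_BILLED_BYTES)
    return billed / BYTES_PER_TB * price_per_tb


one_year_parquet = 12 * 500 * 10**6   # twelve months at 500 MB each
one_month_parquet = 500 * 10**6       # same data with a year/month filter

print(f"Full year scan:   ${athena_scan_cost(one_year_parquet):.4f}")
print(f"Single month:     ${athena_scan_cost(one_month_parquet):.4f}")
```

Per-query amounts are small, but scheduled dashboards that run hundreds of unfiltered queries a day multiply them quickly.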

CUR vs Cost Explorer API - When to Use Each

Both CUR and the Cost Explorer API provide cost data, but they serve different needs:

| Dimension | CUR | Cost Explorer API |
| --- | --- | --- |
| Granularity | Hourly, resource-level | Daily or monthly aggregated |
| Resource IDs | Yes - every individual resource | Limited - not always available |
| Latency | 24-hour delay on delivery | Data available within 24h of usage |
| Cost | S3 storage + Athena query costs | $0.01 per API request |
| Data retention | As long as you keep in S3 | 13 months in Cost Explorer |
| Custom analysis | Unlimited - full SQL on raw data | Limited to supported groupings and filters |
| Integration effort | Higher - requires Athena/Glue setup | Low - direct API calls |
| Best for | Custom FinOps tooling, data warehouse, auditing | Quick analysis, dashboards, cost anomaly detection |

Decision framework:

| Use Case | Recommended Tool |
| --- | --- |
| Quick ad-hoc cost question | Cost Explorer console |
| Automated monthly report via script | Cost Explorer API |
| Custom FinOps dashboard with per-resource breakdown | CUR + Athena |
| Feeding cost data into a data warehouse (Redshift, Snowflake) | CUR Parquet to S3 |
| Cost allocation by tag with complex hierarchies | CUR + Athena |
| Auditing historical cost for 2+ years | CUR in S3 (set lifecycle policy) |
| Billing anomaly detection | AWS Cost Anomaly Detection (managed service, no CUR setup needed) |

CUR Setup and Operational Best Practices

Setting up CUR correctly from the start avoids rework. Key decisions to make at setup time:

bash
# CUR is configured in the AWS Billing console or via the API.
# The CUR API is served from us-east-1, so pass --region us-east-1.
# JSON input is used here because the multi-line shorthand form is
# sensitive to embedded whitespace and newlines.

aws cur put-report-definition \
  --region us-east-1 \
  --report-definition '{
    "ReportName": "my-cur-report",
    "TimeUnit": "HOURLY",
    "Format": "Parquet",
    "Compression": "Parquet",
    "AdditionalSchemaElements": ["RESOURCES", "SPLIT_COST_ALLOCATION_DATA"],
    "S3Bucket": "my-billing-bucket",
    "S3Prefix": "cur",
    "S3Region": "us-east-1",
    "AdditionalArtifacts": ["ATHENA"],
    "RefreshClosedReports": true,
    "ReportVersioning": "OVERWRITE_REPORT"
  }'
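Athena integration constrains the other settings: as noted earlier on this page, it goes with Parquet output and overwrite versioning. A small pre-flight check, sketched below, can catch a mismatched definition before it is submitted. The function name `check_athena_ready` is hypothetical, and the rules encode this page's guidance rather than an exhaustive copy of the API's validation.

```python
def check_athena_ready(definition):
    """Return a list of settings that conflict with Athena integration."""
    problems = []
    if "ATHENA" in definition.get("AdditionalArtifacts", []):
        if definition.get("Format") != "Parquet":
            problems.append("Format must be Parquet")
        if definition.get("Compression") != "Parquet":
            problems.append("Compression must be Parquet")
        if definition.get("ReportVersioning") != "OVERWRITE_REPORT":
            problems.append("ReportVersioning must be OVERWRITE_REPORT")
    return problems


# Same values as the CLI example, expressed as a dict.
REPORT_DEFINITION = {
    "ReportName": "my-cur-report",
    "TimeUnit": "HOURLY",
    "Format": "Parquet",
    "Compression": "Parquet",
    "AdditionalSchemaElements": ["RESOURCES", "SPLIT_COST_ALLOCATION_DATA"],
    "S3Bucket": "my-billing-bucket",
    "S3Prefix": "cur",
    "S3Region": "us-east-1",
    "AdditionalArtifacts": ["ATHENA"],
    "RefreshClosedReports": True,
    "ReportVersioning": "OVERWRITE_REPORT",
}

print(check_athena_ready(REPORT_DEFINITION) or "definition looks consistent")
```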

Best practices for CUR configuration and management:

| Practice | Why | How |
| --- | --- | --- |
| Enable RESOURCES schema element | Adds ResourceId column for per-resource analysis | Include in AdditionalSchemaElements |
| Use Parquet format | Roughly 10x smaller files and 10x cheaper Athena queries | Set Format=Parquet and Compression=Parquet |
| Enable Athena integration | Auto-creates Glue catalog and partitioning | Set AdditionalArtifacts=[ATHENA] |
| Set S3 lifecycle policy | CUR grows indefinitely - archive old data | Transition to Glacier after 90 days, delete after 3 years |
| Enable bucket versioning | CUR overwrites files - versioning enables recovery | Enable on the CUR S3 bucket |
| Restrict S3 access | Billing data is sensitive | Bucket policy: only Billing service + specific IAM roles |
| Activate cost allocation tags | Tags only appear in CUR after activation | Billing console > Cost Allocation Tags |
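The lifecycle practice in the table can be expressed as an S3 lifecycle configuration. The sketch below builds the rule document in the shape that S3's lifecycle API accepts; the function name, rule ID, and `cur/` prefix are illustrative assumptions. Note that objects moved to Glacier are no longer directly queryable by Athena, so the transition age should exceed the window you query regularly.

```python
import json


def cur_lifecycle_rules(prefix="cur/", glacier_after_days=90,
                        expire_after_days=3 * 365):
    """Build an S3 lifecycle configuration that archives, then expires, CUR files."""
    if glacier_after_days >= expire_after_days:
        raise ValueError("transition must happen before expiration")
    return {
        "Rules": [{
            "ID": "archive-then-expire-cur",      # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": prefix},
            "Transitions": [
                {"Days": glacier_after_days, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": expire_after_days},
        }]
    }


print(json.dumps(cur_lifecycle_rules(), indent=2))
```

The printed JSON can be saved to a file and applied with the S3 lifecycle configuration API or the equivalent console form.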
⚠️ Warning: CUR files for large AWS organizations can be multiple GB per day. The RESOURCES schema element significantly increases file size (sometimes 5-10x) because line items are broken out per resource, which with hourly granularity means a row per resource per hour. For organizations with thousands of resources, evaluate whether you need hourly granularity or whether daily is sufficient, and use Parquet to manage the data volume.

🎯

Interview Focus Points

  1. What is the difference between CUR and the Cost Explorer API? When would you use CUR instead of calling the Cost Explorer API?
  2. How would you set up a cost analytics pipeline that gives per-team cost breakdown by tag, updated daily?
  3. What is the difference between unblended, blended, and amortized cost in CUR? Which would you use for chargeback to teams?
  4. What are SavingsPlanNegation line items in CUR and why do you need to exclude them from cost analysis?
  5. How does the CUR manifest.json file work and why should you use it instead of listing the S3 directory?
  6. A team's CUR queries in Athena are slow and expensive - what would you do to optimize them?
  7. How do you handle CUR data for an AWS Organization where you want per-account cost visibility?
  8. What is the SPLIT_COST_ALLOCATION_DATA schema element in CUR and when would you need it?
  9. CUR has a 24-hour delay - how would you handle a situation where you need near-real-time cost visibility?
  10. How would you model amortized Reserved Instance costs in CUR for a monthly chargeback report?