Ace Cloud Interviews

AWS Developer Tools & CI/CD

CodeArtifact

Secure artifact repository supporting npm, Maven, PyPI, NuGet, and more

AWS CodeArtifact is a fully managed artifact repository service that supports npm, Maven, PyPI, NuGet, RubyGems, Swift, and generic packages, providing a secure, scalable central store for software dependencies. It can proxy public repositories like npmjs.com and Maven Central, caching packages internally so builds remain fast and reliable even when upstream sources are unavailable.

Core Concepts: Domain, Repository, and Upstream

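CodeArtifact organizes artifacts in a three-level hierarchy: a domain contains repositories, and a repository contains packages. Upstream repositories and external connections then link repositories to each other and to public registries: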

| Concept | Description | Analogy |
| --- | --- | --- |
| Domain | Top-level organizational container, spans accounts | An organization's artifact namespace |
| Repository | Package store within a domain, format-specific or mixed | An npm registry or Maven repo |
| Package | A named, versioned artifact within a repository | react@18.2.0, boto3==1.28.0 |
| Upstream repository | Another CodeArtifact repo or external connection this repo fetches from | Proxy to npmjs.com |
| External connection | Link to a public repository (npm, PyPI, Maven Central, etc.) | npmjs.com gateway |

When a client requests a package, CodeArtifact checks the requested repository first. If the package is not found there, it searches the upstream repositories in order and caches the package for future requests. This chain can span multiple repositories before reaching an external connection.
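
The hierarchy described above can be created from the AWS CLI. The following is a minimal sketch, assuming the placeholder names `my-domain` and `my-repo` used elsewhere in this tutorial; the helper name `create_codeartifact_hierarchy` is hypothetical and nothing runs until it is called.

```bash
# Hypothetical helper: create a domain, then a repository inside it.
create_codeartifact_hierarchy() {
  local domain="$1" repo="$2"

  # Domain: the top-level container for repositories
  aws codeartifact create-domain --domain "$domain" || return 1

  # Repository: lives inside the domain and holds package versions
  aws codeartifact create-repository \
    --domain "$domain" \
    --repository "$repo" \
    --description "Example repository"
}

# Usage (requires AWS credentials with CodeArtifact permissions):
#   create_codeartifact_hierarchy my-domain my-repo
```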

💡

A domain can span multiple AWS accounts, enabling a single artifact store for an entire organization. Package assets stored in a domain are encrypted and deduplicated, so the same package version stored in multiple repositories consumes storage only once.

Configuring Package Managers to Use CodeArtifact

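Each package manager must be pointed at the CodeArtifact repository endpoint and given an auth token. For npm, pip, twine, dotnet, nuget, and swift, the login command handles both steps automatically. Maven and Gradle are not supported by login and instead read a token that you export yourself:

bash
# login fetches an auth token (valid 12 hours by default) and writes the tool's configuration
# npm
aws codeartifact login --tool npm \
  --repository my-repo \
  --domain my-domain \
  --domain-owner 123456789012

# pip (Python)
aws codeartifact login --tool pip \
  --repository my-repo \
  --domain my-domain \
  --domain-owner 123456789012

# Maven: login has no Maven option, so export a token and reference it from settings.xml
export CODEARTIFACT_AUTH_TOKEN=$(aws codeartifact get-authorization-token \
  --domain my-domain \
  --domain-owner 123456789012 \
  --query authorizationToken \
  --output text)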

# NuGet
aws codeartifact login --tool dotnet \
  --repository my-repo \
  --domain my-domain \
  --domain-owner 123456789012
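
Maven picks up CodeArtifact credentials from a settings.xml file. The following is a minimal sketch, assuming the token is exported as `CODEARTIFACT_AUTH_TOKEN` and that your pom.xml declares a repository whose `<id>` is `codeartifact` (both the server id and the helper name `write_maven_settings` are illustrative choices, not fixed by the service).

```bash
# Hypothetical helper: write a Maven settings file that reads the token
# from the CODEARTIFACT_AUTH_TOKEN environment variable at build time.
write_maven_settings() {
  local out="$1"
  # Quoted heredoc delimiter keeps ${env...} literal so Maven, not the shell, expands it
  cat > "$out" <<'EOF'
<settings>
  <servers>
    <server>
      <id>codeartifact</id>
      <username>aws</username>
      <password>${env.CODEARTIFACT_AUTH_TOKEN}</password>
    </server>
  </servers>
</settings>
EOF
}

# Usage:
#   write_maven_settings ./settings.xml
#   mvn -s ./settings.xml deploy
```

Writing to an explicit path rather than `~/.m2/settings.xml` avoids overwriting an existing user configuration.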

For CI/CD pipelines, generate the token programmatically rather than using the login command:

bash
# Get auth token for use in buildspec.yml
CODEARTIFACT_TOKEN=$(aws codeartifact get-authorization-token \
  --domain my-domain \
  --domain-owner 123456789012 \
  --query authorizationToken \
  --output text)

# Use token with pip
pip install --index-url https://aws:$CODEARTIFACT_TOKEN@my-domain-123456789012.d.codeartifact.us-east-1.amazonaws.com/pypi/my-repo/simple/ my-package
⚠️

CodeArtifact auth tokens expire after 12 hours by default, which is also the maximum lifetime. In long-running CI/CD pipelines or build fleets, refresh the token before it expires. Use --duration-seconds to request a shorter-lived token (the minimum is 900 seconds, or 15 minutes). Never store tokens in environment variables that persist across sessions.
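
To keep token lifetimes short in a pipeline, the token request and the endpoint lookup can be wrapped in one step. The following is a sketch under stated assumptions: the helper name `codeartifact_pip_index_url` and its argument order are hypothetical, and it assumes the endpoint returned by `get-repository-endpoint` has the usual `https://<host>/pypi/<repo>/` shape.

```bash
# Hypothetical helper: print a pip index URL with a short-lived token embedded.
# Arguments: domain, domain owner (account ID), repository, token lifetime in seconds.
codeartifact_pip_index_url() {
  local domain="$1" owner="$2" repo="$3" duration="${4:-900}"
  local token endpoint host_path

  # Request a token that expires after $duration seconds (900 = 15 minutes)
  token=$(aws codeartifact get-authorization-token \
    --domain "$domain" --domain-owner "$owner" \
    --duration-seconds "$duration" \
    --query authorizationToken --output text) || return 1

  # Look up the repository's PyPI endpoint instead of hard-coding the hostname
  endpoint=$(aws codeartifact get-repository-endpoint \
    --domain "$domain" --domain-owner "$owner" \
    --repository "$repo" --format pypi \
    --query repositoryEndpoint --output text) || return 1

  # Strip the scheme and any trailing slash, then insert credentials and append simple/
  host_path=$(printf '%s' "$endpoint" | sed -e 's|^https://||' -e 's|/$||')
  printf 'https://aws:%s@%s/simple/\n' "$token" "$host_path"
}

# Usage:
#   pip install --index-url "$(codeartifact_pip_index_url my-domain 123456789012 my-repo)" my-package
```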

Upstream Repositories and Package Caching

CodeArtifact's upstream feature allows you to create a layered repository structure that combines internal packages with proxied public packages:

| Repository Type | Purpose | Example Setup |
| --- | --- | --- |
| Internal only | Private packages your team builds | my-internal-packages repo |
| Proxy/cache only | External connection to npmjs.com/PyPI | npm-store repo with npmjs.com connection |
| Aggregated | Upstream points to both internal and proxy | my-app repo upstream: [my-internal-packages, npm-store] |

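When npm install runs against the aggregated repo, CodeArtifact searches the upstreams in the order they are listed, so my-internal-packages is consulted before npm-store and internal packages are returned ahead of cached public ones. Listing the internal repository first is intended to give internal packages precedence over identically named public packages, which is the basis of dependency confusion attack prevention. CodeArtifact also offers package origin controls, which restrict whether versions of a given package may come from direct publishing or from upstreams.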
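
The three-repository layout from the table can be sketched with the AWS CLI as follows. This assumes the repository names from the table and an existing domain; the helper name `setup_aggregated_repos` is hypothetical, and nothing runs until it is called.

```bash
# Hypothetical helper: build the internal / proxy / aggregated layout in one domain.
setup_aggregated_repos() {
  local domain="$1"

  # 1. Internal-only repository for packages your team publishes
  aws codeartifact create-repository \
    --domain "$domain" --repository my-internal-packages || return 1

  # 2. Proxy repository with an external connection to the public npm registry
  aws codeartifact create-repository \
    --domain "$domain" --repository npm-store || return 1
  aws codeartifact associate-external-connection \
    --domain "$domain" --repository npm-store \
    --external-connection public:npmjs || return 1

  # 3. Aggregated repository; upstreams are listed in priority order
  aws codeartifact create-repository \
    --domain "$domain" --repository my-app \
    --upstreams repositoryName=my-internal-packages repositoryName=npm-store
}

# Usage:
#   setup_aggregated_repos my-domain
```

Clients are then configured against my-app only, so the internal and proxy repositories stay an implementation detail of the domain.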

💡

Once a package version is cached from an upstream, it is retained in CodeArtifact regardless of whether the upstream version is later deleted or the package is unpublished (like the left-pad incident). This makes builds reproducible and resilient to upstream changes.
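
One way to confirm which versions a repository has retained is to list them. This is a small sketch; the helper name `list_cached_versions` is hypothetical, and the package name in the usage line is only an illustration.

```bash
# Hypothetical helper: print the versions of a package currently held in a repository.
list_cached_versions() {
  local domain="$1" repo="$2" format="$3" package="$4"
  aws codeartifact list-package-versions \
    --domain "$domain" --repository "$repo" \
    --format "$format" --package "$package" \
    --query 'versions[].version' --output text
}

# Usage:
#   list_cached_versions my-domain npm-store npm left-pad
```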

Resource-Based Policies and Cross-Account Access

CodeArtifact uses both IAM identity policies and resource-based policies on domains and repositories. Understanding the interaction is important for cross-account setups:

| Policy Type | Attached To | Controls |
| --- | --- | --- |
| IAM identity policy | User, role, or group | What CodeArtifact actions a principal can take |
| Domain resource policy | Domain | Which accounts/principals can access the domain |
| Repository resource policy | Repository | Fine-grained read/write/publish per repository |

bash
# Repository resource policy - allow another account to publish
aws codeartifact put-repository-permissions-policy \
  --domain my-domain \
  --repository my-repo \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::OTHER_ACCOUNT:root"},
        "Action": [
          "codeartifact:PublishPackageVersion",
          "codeartifact:PutPackageMetadata"
        ],
        "Resource": "*"
      }
    ]
  }'
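
A repository policy covers publishing, but the other account must also be able to request an auth token from the domain. The following is a sketch of a matching domain policy, assuming the same placeholder account as above; the helper names are hypothetical. Principals in the other account additionally need an IAM identity policy that allows these CodeArtifact actions and `sts:GetServiceBearerToken`.

```bash
# Hypothetical helper: print a domain policy that lets another account request auth tokens.
domain_policy_for_account() {
  local account_id="$1"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"AWS": "arn:aws:iam::${account_id}:root"},
      "Action": ["codeartifact:GetAuthorizationToken"],
      "Resource": "*"
    }
  ]
}
EOF
}

# Hypothetical helper: attach that policy to the domain.
apply_domain_policy() {
  local domain="$1" account_id="$2"
  aws codeartifact put-domain-permissions-policy \
    --domain "$domain" \
    --policy-document "$(domain_policy_for_account "$account_id")"
}

# Usage:
#   apply_domain_policy my-domain 210987654321
```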

Pricing Model

| Dimension | Price | Notes |
| --- | --- | --- |
| Storage | $0.05 per GB/month | Deduplicated within a domain |
| Package requests | $0.05 per 10,000 requests | Read and write operations combined |
| Data transfer out | Standard S3 rates apply | Transfers within same region are free |

There is no per-repository or per-user fee. Cost scales with actual usage. The free tier includes 2 GB storage and 100,000 requests per month.
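
As a worked example of the pricing above, the following sketch estimates a monthly bill from storage and request volume. It assumes the prices and free tier quoted in this section, ignores data transfer, and uses a hypothetical helper name `estimate_monthly_cost`.

```bash
# Hypothetical helper: estimate monthly cost in USD.
# Arguments: stored GB, number of requests in the month.
estimate_monthly_cost() {
  local storage_gb="$1" requests="$2"
  awk -v gb="$storage_gb" -v req="$requests" 'BEGIN {
    # Subtract the free tier: 2 GB of storage and 100000 requests
    billable_gb  = (gb  > 2)      ? gb  - 2      : 0
    billable_req = (req > 100000) ? req - 100000 : 0
    # $0.05 per GB-month plus $0.05 per 10000 requests
    cost = billable_gb * 0.05 + (billable_req / 10000) * 0.05
    printf "%.2f\n", cost
  }'
}

# Usage: 50 GB stored and 2,000,000 requests in a month
#   estimate_monthly_cost 50 2000000    # prints 11.90
```

Here 48 billable GB cost $2.40 and 1,900,000 billable requests cost $9.50, for a total of $11.90.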

💡

Keep public proxy repositories in the same region as your build fleet to avoid data transfer charges. If builds run in multiple regions, consider per-region repositories with upstream chains back to a central repository.

🎯

Interview Focus Points

  1. What is the hierarchy of domain, repository, and package in CodeArtifact?
  2. How does CodeArtifact upstream configuration work and why is it useful?
  3. How do you authenticate a build pipeline with CodeArtifact and how long do tokens last?
  4. How does CodeArtifact protect against dependency confusion attacks?
  5. How would you set up cross-account artifact publishing with CodeArtifact?
  6. What external connections does CodeArtifact support for public package proxying?
  7. How does package caching from upstreams work - what happens if the upstream deletes a package?
  8. What is the pricing model for CodeArtifact and what drives costs?