Release 1.16
cert-manager 1.16 includes various improvements to the metrics in the cert-manager components.
Breaking changes
- Venafi Issuer may fail to renew Certificates if the duration conflicts with the CA minimum / maximum duration policy in Venafi.
- Venafi Issuer may fail to renew Certificates if the issuer has been configured for TPP with username-password authentication.
- Helm schema validation may reject your existing Helm values files if they contain typos or unrecognized fields.
Themes
Extended Metrics
The webhook and cainjector components now have metrics servers, so that platform teams can monitor the performance of all the cert-manager components and gain more information about the underlying Go runtime in the event of a problem. Read the Prometheus Metrics page to learn more.
Venafi Issuer
We've made some important improvements to the Venafi Issuer.
If you use the Venafi Issuer with a TPP server with username-password authentication, cert-manager 1.16 now uses OAuth authentication instead of the deprecated API Key authentication. This is a potentially breaking change, because you may need to reconfigure your TPP server to enable OAuth authentication, and you may need to reconfigure the cert-manager service accounts in TPP to work with OAuth.
The desired certificate.spec.duration value will now be sent to the Venafi API server. The default value for certificate.spec.duration is 90 days, but you may have changed this in your Certificate resources. Your Venafi issuing template may be configured to ignore the requested From and To times, in which case nothing will change. Your Venafi issuing template may be configured with a maximum or a minimum duration, in which case your certificate requests may fail after you upgrade to cert-manager 1.16. Consider this carefully when upgrading to cert-manager 1.16.
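As a rough illustration of where the duration comes from, here is a minimal Certificate sketch. The resource names, the DNS name and the issuer name are hypothetical placeholders, and the 30 day duration is only an example value. Check it against the minimum and maximum allowed by your own Venafi issuing template.

```yaml
# Illustrative only: all names below are hypothetical placeholders.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: example-com
  namespace: default
spec:
  secretName: example-com-tls
  # As of cert-manager 1.16 this value is sent to the Venafi API server.
  # If the field is omitted, the default of 90 days (2160h) is used.
  duration: 720h # 30 days; pick a value inside your template's min/max policy
  dnsNames:
    - example.com
  issuerRef:
    name: venafi-tpp-issuer # hypothetical Issuer name
    kind: Issuer
```

If your issuing template enforces a maximum shorter than the requested duration, this is the field to adjust before upgrading.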
When connecting to Venafi TPP, cert-manager can now load the CA certificate from a Secret resource. This allows you to manage the CA with familiar tools such as trust-manager.
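A minimal sketch of such an Issuer might look like the following. It assumes the field is named caBundleSecretRef with name and key sub-fields, as we understand it from the API reference; the Secret names, zone and URL are hypothetical. Verify the field names against the Venafi Issuer page before relying on this.

```yaml
# Illustrative only: Secret names, zone and URL are hypothetical placeholders.
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: venafi-tpp-issuer
  namespace: default
spec:
  venafi:
    zone: example-zone # hypothetical policy folder
    tpp:
      url: https://tpp.example.com/vedsdk
      # Secret holding the TPP credentials (for example username and password).
      credentialsRef:
        name: tpp-credentials
      # Assumed field: load the CA bundle from a Secret instead of an inline caBundle.
      caBundleSecretRef:
        name: tpp-ca
        key: ca.crt
```

The referenced Secret could be maintained by hand or by a tool such as trust-manager.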
Read the Venafi Issuer page to learn more.
Route53 DNS01 Solver Cleanup
The Route53 DNS01 solver code had become over-complicated due to its age and due to the variety of authentication methods that have been added over the years. When we upgraded to AWS SDK for Go V2 in the last release, we did not have a good understanding of the new SDK and we were not able to test it thoroughly with all authentication methods. In this release we started putting that right.
In this release we have tidied up the code and added more logging so that it is easier to debug problems in the field. We have improved the documentation of the Route53 API fields, particularly the region field, where we have tried to describe where and how cert-manager uses that value.
We have relaxed the API validation so that the region field is now optional. cert-manager will now fall back to using the AWS_REGION environment variable of the controller Pod, regardless of which authentication mechanism is used. Users who use IAM Roles for Service Accounts (IRSA) or Pod Identity need not specify the region. If your Issuer or ClusterIssuer does include a region (for the sake of satisfying the old API validation), that issuer region will be ignored if the AWS_REGION environment variable is set.
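For example, a ClusterIssuer that relies on ambient credentials (such as IRSA or Pod Identity) could now omit the region entirely. This is a minimal sketch: the issuer name, email address and Secret name are hypothetical, and it assumes that AWS_REGION is set in the cert-manager controller Pod.

```yaml
# Illustrative only: names and email are hypothetical placeholders.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-route53
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-route53-account-key
    solvers:
      - dns01:
          # No region field: the region is expected to come from the
          # AWS_REGION environment variable of the controller Pod.
          route53: {}
```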
cert-manager will now use regional STS endpoints when using AssumeRole or when using a dedicated (non-mounted) Kubernetes ServiceAccount. The regional endpoint will be computed based on the Issuer region field, or the AWS_REGION environment variable.
ℹ️ This change only affects the AssumeRole configuration, which is used for cross-account authentication, and the AssumeRoleWithWebIdentity configuration, where the user supplies the name of a Kubernetes ServiceAccount. It does not affect you if you have configured the cert-manager ServiceAccount for IRSA, where the ServiceAccount token is mounted into the cert-manager controller Pod. Regional STS endpoints were already being used in that case.
ℹ️ There are good reasons to use regional STS endpoints, summarized as follows on the Amazon AWS blog:
Although the global (legacy) AWS STS endpoint https://sts.amazonaws.com is highly available, it’s hosted in a single AWS Region — US East (N. Virginia) — and like other endpoints, it doesn’t provide automatic fail-over to endpoints in other Regions.
📖 Read Manage AWS STS in an AWS Region to learn about which regions support STS.
📖 Read AWS STS Regional endpoints to learn how to configure the use of regional STS endpoints using environment variables.
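To make the dedicated ServiceAccount case more concrete, here is a hedged sketch of an Issuer whose Route53 solver names a Kubernetes ServiceAccount and a region. The account ID, role name, ServiceAccount name and region are hypothetical. With a configuration like this, we would expect STS requests to go to the regional endpoint for eu-west-1 rather than to the global endpoint.

```yaml
# Illustrative only: the role ARN, names and region are hypothetical placeholders.
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: letsencrypt-route53
  namespace: default
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-route53-account-key
    solvers:
      - dns01:
          route53:
            # Used to compute the regional STS endpoint; if omitted,
            # the AWS_REGION environment variable is used instead.
            region: eu-west-1
            role: arn:aws:iam::111122223333:role/cert-manager-route53
            auth:
              kubernetes:
                # Dedicated (non-mounted) ServiceAccount whose token is
                # exchanged via AssumeRoleWithWebIdentity.
                serviceAccountRef:
                  name: route53-solver
```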
Read the ACME Issuer Route53 page to learn more.
Memory Optimizations
We have continued our effort to reduce the memory footprint of cert-manager.
The cainjector no longer caches Secret data; instead it only caches the metadata of Secret resources. This significantly reduces its memory usage. It also reduces the load on the Kubernetes API server when cainjector starts up, because it no longer needs to send all the data of all the Secret resources over the network.
We have added a new ClientWatchList feature flag to the controller, cainjector, and the webhook. This is actually a new beta feature in the Kubernetes client-go module, which enables a much more efficient mechanism for populating the client-side caches. This reduces the load on the Kubernetes API server, because cert-manager components will no longer request complete unpaged lists of all API resources when they start up. It also reduces the peak memory use of the cert-manager components when they start up, because they no longer have to hold a duplicate unpaged list of resources in memory while they add them to the client-side cache.
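If you install cert-manager with Helm, enabling the flag on all three components might look like the sketch below. It assumes the chart exposes a featureGates value for the controller at the top level and under webhook and cainjector. Check the values reference of your chart version before using it.

```yaml
# values.yaml (sketch): assumes these featureGates keys exist in your chart version.
featureGates: "ClientWatchList=true" # controller
webhook:
  featureGates: "ClientWatchList=true"
cainjector:
  featureGates: "ClientWatchList=true"
```

Because the changelog describes the corresponding Kubernetes API server feature as alpha, it is worth confirming that your cluster supports it before enabling the flag.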
Helm Schema Validation
The Helm chart now includes a JSON schema which will validate the values that you supply when installing the chart. This will help you to get your Helm values right first time. It will alert you to typos and unrecognized fields in your existing Helm values files. And it will make it easier for the cert-manager maintainers to maintain the Helm chart, avoiding typos and mistakes in the default values file.
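As a small illustration of the kind of mistake the schema is meant to catch, consider the values file below. The misspelled key is a hypothetical typo, and the exact error message you see will depend on your Helm version.

```yaml
# my-values.yaml (illustrative)
replicaCount: 2 # recognized chart value
# Hypothetical typo: this key is not part of the chart, so schema validation
# is expected to reject the file instead of silently ignoring the field.
replicaCounts: 2
```

Previously a misspelled field like this would have been ignored, leaving the default value in effect without any warning.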
Community
Thanks again to all open-source contributors with commits in this release, including: TODO
Thanks also to the following cert-manager maintainers for their contributions during this release: TODO
Equally, thanks to everyone who provided feedback, helped users, raised issues on GitHub and Slack, and joined our meetings!
Thanks also to the CNCF, which provides resources and support, and to the AWS open source team for being good community members and for their maintenance of the PrivateCA Issuer.
In addition, massive thanks to Venafi for contributing developer time and resources towards the continued maintenance of cert-manager projects.
Changes since v1.15.0
Feature
- Add SecretRef support for Venafi TPP issuer CA Bundle (#7036, @sankalp-at-gh)
- Add renewBeforePercentage alternative to renewBefore (#6987, @cbroglie)
- Add a metrics server to the cainjector (#7194, @wallrj)
- Add a metrics server to the webhook (#7182, @wallrj)
- Add client certificate auth method for Vault issuer (#4330, @joshmue)
- Add process and Go runtime metrics for controller (#6966, @mindw)
- Added app.kubernetes.io/managed-by: cert-manager label to the cert-manager-webhook-ca Secret (#7154, @jrcichra)
- Allow the user to specify a Pod template when using GatewayAPI HTTP01 solver; this mirrors the behavior when using the Ingress HTTP01 solver. (#7211, @ThatsMrTalbot)
- Create token request RBAC for the cert-manager ServiceAccount by default (#7213, @Jasper-Ben)
- Feature: Add a new ClientWatchList feature flag to cert-manager controller, cainjector and webhook, which allows the components to use the ALPHA WatchList / Streaming list feature of the Kubernetes API server. This reduces the load on the Kubernetes API server when cert-manager starts up and reduces the peak memory usage in the cert-manager components. (#7175, @wallrj)
- Feature: Append cert-manager user-agent string to all AWS API requests, including IMDS and STS requests. (#7295, @wallrj)
- Feature: Log AWS SDK warnings and API requests at cert-manager debug level to help debug AWS Route53 problems in the field. (#7292, @wallrj)
- Feature: The Route53 DNS solver of the ACME Issuer will now use regional STS endpoints computed from the region that is supplied in the Issuer spec or in the AWS_REGION environment variable. Feature: The Route53 DNS solver of the ACME Issuer now uses the "ambient" region (AWS_REGION or AWS_DEFAULT_REGION) if issuer.spec.acme.solvers.dns01.route53.region is empty, regardless of the flags --issuer-ambient-credentials and --cluster-issuer-ambient-credentials. (#7299, @wallrj)
- Helm: adds JSON schema validation for the Helm values. (#7069, @inteon)
- If the --controllers flag only specifies disabled controllers, the default controllers are now enabled implicitly. Added disableAutoApproval and approveSignerNames Helm chart options. (#7049, @inteon)
- Make it easier to configure cert-manager using Helm by defaulting config.apiVersion and config.kind within the Helm chart. (#7126, @ThatsMrTalbot)
- Now passes down specified duration to Venafi client instead of using the CA default only. (#7104, @Guitarkalle)
- Reduce the memory usage of cainjector, by only caching the metadata of Secret resources. Reduce the load on the K8S API server when cainjector starts up, by only listing the metadata of Secret resources. (#7161, @wallrj)
- The Route53 DNS01 solver of the ACME Issuer can now detect the AWS region from the AWS_REGION and AWS_DEFAULT_REGION environment variables, which are set by the IAM Roles for Service Accounts (IRSA) webhook and by the Pod Identity webhook. The issuer.spec.acme.solvers.dns01.route53.region field is now optional. The API documentation of the region field has been updated to explain when and how the region value is used. (#7287, @wallrj)
- Venafi TPP issuer can now be used with a username & password combination with OAuth. Fixes #4653. Breaking: cert-manager will no longer use the API Key authentication method which was deprecated in 20.2 and since removed in 24.1 of TPP. (#7084, @hawksight)
- You can now configure the pod security context of HTTP-01 solver pods. (#5373, @aidy)
Bug or Regression
- Adds support (behind a flag) to use a domain qualified finalizer. If the feature is enabled (it is not enabled by default), it should prevent Kubernetes from reporting: metadata.finalizers: "finalizer.acme.cert-manager.io": prefer a domain-qualified finalizer name to avoid accidental conflicts with other finalizer writers (#7273, @jsoref)
- BUGFIX Route53: explicitly set the aws-global STS region which is now required by the github.com/aws/aws-sdk-go-v2 library. (#7108, @inteon)
- BUGFIX: fix issue that caused Vault issuer to not retry signing when an error was encountered. (#7105, @inteon)
- BUGFIX: the dynamic certificate source used by the webhook TLS server failed to detect a root CA approaching expiration, due to a calculation error. This will cause the webhook TLS server to fail renewing its CA certificate. Please upgrade before the expiration of this CA certificate is reached. (#7230, @inteon)
- Bugfix: Prevent aggressive Route53 retries caused by IRSA authentication failures by removing the Amazon Request ID from errors wrapped by the default credential cache. (#7291, @wallrj)
- Bugfix: Prevent aggressive Route53 retries caused by STS authentication failures by removing the Amazon Request ID from STS errors. (#7259, @wallrj)
- Bump grpc-go to fix GHSA-xr7q-jx4m-x55m (#7164, @SgtCoDFish)
- Bump the go-retryablehttp dependency to fix CVE-2024-6104 (#7125, @SgtCoDFish)
- Fix Azure DNS causing panics whenever an authentication error happens (#7177, @eplightning)
- Fix incorrect indentation of endpointAdditionalProperties in the PodMonitor template of the Helm chart (#7190, @wallrj)
- Fixes ACME HTTP01 challenge behavior when using Gateway API to prevent unbounded creation of HTTPRoute resources (#7178, @miguelvr)
- Handle errors arising from challenges missing from the ACME server (#7202, @bdols)
- Helm BUGFIX: the cainjector ConfigMap was not mounted in the cainjector deployment. (#7052, @inteon)
- Improve the startupapicheck: validate that the validating and mutating webhooks are doing their job. (#7057, @inteon)
- The KeyUsages X.509 extension is no longer added when there are no key usages set (in accordance with RFC 5280 Section 4.2.1.3) (#7250, @inteon)
- Update github.com/Azure/azure-sdk-for-go/sdk/azidentity to address CVE-2024-35255 (#7087, @dependabot[bot])
Other (Cleanup or Flake)
- Old API versions were removed from the codebase. Removed:
  (acme.)cert-manager.io/v1alpha2
  (acme.)cert-manager.io/v1alpha3
  (acme.)cert-manager.io/v1beta1
  (#7278, @inteon)
- Upgrading to client-go v0.31.0 removes a lot of noisy reflector.go: unable to sync list result: internal error: cannot cast object DeletedFinalStateUnknown errors from logs. (#7237, @inteon)