Critical alert channels

Get a notification every time a job is re-run after a crash.

This feature is available in the Enterprise Edition.

If the node on which a job runs halts suddenly (for example, because of a power loss), the job is restarted automatically. Windmill itself doesn't crash, and softer interruptions such as a pod termination involve a grace period (300s) to let the job finish.

Critical alerts are generated under the following conditions:

Job is re-run after a crash.
License key does not renew.
Workspace error handler fails.
Number of running workers in a group falls below a specified threshold (configured in the worker group config).
Number of jobs waiting in queue is above a threshold for more than a specified amount of time.
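
The two threshold-based conditions above can be illustrated with a minimal sketch. This is not Windmill's implementation. The names `QueueAlertState`, `observe`, and `workers_below_threshold` are hypothetical, and the sketch assumes the queue alert fires only once the queue has stayed above the threshold continuously for longer than the configured duration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueueAlertState:
    """Hypothetical helper: tracks how long the queue has stayed above a threshold."""

    threshold: int          # number of waiting jobs tolerated
    max_duration_s: float   # how long the queue may stay above the threshold
    above_since: Optional[float] = None  # time the queue first exceeded the threshold

    def observe(self, waiting_jobs: int, now: float) -> bool:
        """Record one sample and return True when an alert should fire."""
        if waiting_jobs <= self.threshold:
            # Queue is back under the threshold: reset the timer.
            self.above_since = None
            return False
        if self.above_since is None:
            # First sample above the threshold: start the timer.
            self.above_since = now
        # Fire only once the excess has lasted longer than the allowed duration.
        return now - self.above_since > self.max_duration_s


def workers_below_threshold(running_workers: int, min_workers: int) -> bool:
    """Hypothetical check: True when a worker group has fewer running workers than required."""
    return running_workers < min_workers
```

In this sketch a short spike in queued jobs resets as soon as the queue drains, so only a sustained backlog would trigger an alert.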

To receive critical alerts, either configure SMTP and set up a critical alert channel (that is, an email address) in the instance settings, or connect your instance to Slack and fill in a channel name.

Critical alert channels Config

You can also set an alert to receive a notification when the number of running workers in a group falls below a given number. This option is available in the worker group config.

Workers alerts Slack
