TimerTrigger

The README gives an overview of TimerTrigger and its use. Below we cover some of the internal details as well as some of the advanced configuration options.

Scheduling

For each of your TimerTrigger functions, an in-process timer is created based on the schedule you specified. The timer starts immediately on startup, with the interval until the next execution based on when your function last executed. When the interval expires, your function is invoked. By default, TimerTrigger persists execution history in storage, ensuring that schedules are maintained across process crashes or node restarts. It uses this history to "catch up" on missed schedule occurrences. You can disable that behavior by setting UseMonitor = false on your TimerTrigger attribute.
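
For example, a function might opt out of the schedule monitor like this (a minimal sketch; the class/function names and CRON schedule are illustrative):

using System;
using System.IO;
using Microsoft.Azure.WebJobs;

public static class Functions
{
    // Runs every 5 minutes (CRON fields: sec min hour day month day-of-week).
    // UseMonitor = false disables schedule persistence, so missed occurrences
    // are not "caught up" after a restart.
    public static void ProcessTimer(
        [TimerTrigger("0 */5 * * * *", UseMonitor = false)] TimerInfo timer,
        TextWriter log)
    {
        log.WriteLine($"Timer fired at {DateTime.UtcNow:o}; IsPastDue = {timer.IsPastDue}");
    }
}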

If your function execution takes longer than the timer interval, another execution won't be triggered while the current invocation is running; the next execution is scheduled only after the current one completes.

Singleton Locks

Behind the scenes, TimerTrigger uses the Singleton feature of the WebJobs SDK to ensure that only a single instance of your triggered function is running at any given time. When the JobHost starts up, a blob lease (the Singleton lock) is taken for each of your TimerTrigger functions. This distributed lock ensures that only a single instance of your scheduled function is running at any time. If the blob for that function is not currently leased, the function acquires the lease and starts running on schedule immediately. If the blob lease cannot be acquired, it generally means that another instance of that function is running, so the function is not started in the current host. When this happens, the host continues to periodically check whether it can acquire the lease. This acts as a "recovery" mode so that, when running at steady state, if an instance goes down, another instance will notice and pick up where it left off.

Local Development

If you set JobHostConfiguration.Tracing.ConsoleLevel to Verbose, you can see these Singleton lock operations written to the Console output.
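
For example (a minimal sketch; ConsoleLevel takes a System.Diagnostics.TraceLevel value):

using System.Diagnostics;
using Microsoft.Azure.WebJobs;

var config = new JobHostConfiguration();

// Surface Singleton lock acquire/renew/release operations in the console output.
config.Tracing.ConsoleLevel = TraceLevel.Verbose;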

There are configuration knobs on JobHostConfiguration.Singleton that control the Singleton lock behavior. For local development, it is recommended that you temporarily change JobHostConfiguration.Singleton.ListenerLockPeriod to its minimum value of 15 seconds, so that your WebJob host starts quickly when it is stopped and restarted often during local development. You can use the JobHostConfiguration.IsDevelopment property and the JobHostConfiguration.UseDevelopmentSettings method to configure these options (and others) automatically. To configure your machine for local development, set a local environment variable named AzureWebJobsEnv with the value Development. Once you've done that, you can modify your startup code as follows.

var config = new JobHostConfiguration();

// Applies development-friendly settings (e.g. a shorter Singleton listener
// lock period) when AzureWebJobsEnv is set to Development on this machine.
if (config.IsDevelopment)
{
    config.UseDevelopmentSettings();
}

config.UseTimers();

JobHost host = new JobHost(config);
host.RunAndBlock();
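
Alternatively, you can set the lease period explicitly rather than relying on UseDevelopmentSettings. A sketch, using the ListenerLockPeriod setting described above:

// UseDevelopmentSettings applies this (and other development defaults) for you;
// 15 seconds is the minimum allowed value.
config.Singleton.ListenerLockPeriod = TimeSpan.FromSeconds(15);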

The default for ListenerLockPeriod is 60 seconds, which means that if the host is killed ungracefully (e.g. the console window is closed when running locally), the blob lease that was held cannot be reacquired until it naturally expires. So if you kill a running instance and then immediately restart the JobHost, the function won't start running right away because it is waiting to acquire the Singleton lock.

Note that this is really only an issue when running locally. In Azure, if the JobHost shuts down (e.g. role restarts), the graceful shutdown logic will release the lock immediately so it can be reacquired when the host starts back up. If the instance terminates abruptly without allowing the graceful shutdown logic to complete, the natural lease timeout still applies. You can adjust that via the ListenerLockPeriod setting mentioned above if you want a shorter lease.

Note that when running in production, you'll want to restore the default configuration settings. While a shorter lease time facilitates local development, it also requires more lease renewals at steady state, which is why the default of 60 seconds was chosen. With the default, the listener only needs to attempt a renewal roughly every 30 seconds (it starts trying at half the lease period), reducing storage calls.

Also note that if you share the same storage account between your local development and production deployment, the Singleton locks (blob leases) will be shared. This means that when you start your JobHost locally, it will compete with the instances running in Azure for the lock leases, and generally the local function instance won't start because another instance is already running. To get around this, either use a separate storage account for local development (which is a good practice anyway), or stop the job running in Azure.
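
As a sketch, one way to point the local host at a dedicated development storage account is to override the host's storage connection strings when running locally (the environment variable name below is illustrative):

var config = new JobHostConfiguration();

if (config.IsDevelopment)
{
    config.UseDevelopmentSettings();

    // Use a dedicated dev storage account so the local host's Singleton blob
    // leases don't compete with the instances running in Azure.
    // "AzureWebJobsDevStorage" is an illustrative environment variable name.
    string devStorage = Environment.GetEnvironmentVariable("AzureWebJobsDevStorage");
    if (!string.IsNullOrEmpty(devStorage))
    {
        config.StorageConnectionString = devStorage;
        config.DashboardConnectionString = devStorage;
    }
}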

Troubleshooting

As mentioned above, timer triggers rely on blob leases for proper operation. A lease is taken for each timer trigger function and renewed automatically behind the scenes. If the host is healthy, these renewals succeed without issue. However, your code can cause issues on the host that prevent the renewals from occurring in time, resulting in lost-lease errors such as:

Singleton lock renewal failed for blob 'my-storage/MyApp.Functions.MyTimerTrigger.Listener'
    with error code 409: LeaseIdMismatchWithLeaseOperation. The last successful renewal completed
    at 0001-01-01T00:00:00Z (-2147483648 milliseconds ago) with a duration of 0 milliseconds. The
    lease period was 60000 milliseconds.

If you are experiencing these issues, here are some things to check:

  • Ensure that your host isn't maxing out CPU. If the host instance is redlining, the background renewal tasks will fail to run, resulting in lease errors like the above.
  • If your function(s) are async, ensure that you're following correct async coding practices. Any calls to Task.Wait(), Task.Result, or Task.GetAwaiter().GetResult() are troublesome because they block threads, which again can prevent the background renewal tasks from running. Ensure your async functions are "async all the way down" (see the sketch after this list).
  • Ensure you're using asynchronous APIs for any IO operations. Synchronous/blocking IO is troublesome.
  • Look for other ways you may be overloading the host. Are you running too many functions on a single instance, stressing the host? Can you break things out?
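
To illustrate the async guidance above, here's a sketch of a timer function that stays asynchronous end to end (the function name and URL are illustrative), alongside the blocking pattern to avoid:

using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;

public static class Functions
{
    private static readonly HttpClient Client = new HttpClient();

    // Good: async all the way down. The thread is released while the HTTP call
    // is in flight, leaving it free for the background lease renewal tasks.
    public static async Task PollEndpointAsync(
        [TimerTrigger("0 */1 * * * *")] TimerInfo timer,
        TextWriter log)
    {
        string body = await Client.GetStringAsync("https://example.com/health");
        await log.WriteLineAsync(body);
    }

    // Bad: calling .Result (or .Wait(), or .GetAwaiter().GetResult()) blocks a
    // thread pool thread for the duration of the call, which can starve the
    // background lease renewal tasks.
    // string body = Client.GetStringAsync("https://example.com/health").Result;
}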