tempdb: Are We There Yet?
- Posted in:
- Database Administration
- SQL Agent
- SQL Server
As a global resource, it can be challenging to manage [tempdb]. It's a broad topic, so for this post, I'll just focus on these two scenarios that leave [tempdb] out of space:
- Maximum file size is limited and there is no more free space to allocate new pages.
- Files are configured for unlimited growth and have consumed all available disk space.
Queries to obtain information for either scenario are reasonably simple: there's the sys.dm_os_volume_stats dynamic management function, the FILEPROPERTY metadata function, and even the (unsupported) extended procedure xp_fixeddrives to name a few. Running queries on demand is a good place to start, but you'll probably want to have some automation, right? A tactic I have used (and that I've seen frequently) is to set up a SQL Agent Job that runs one or more queries and takes action, based on the query outcome.
Naturally, the job runs on a predefined schedule. But how frequently should we check disk/available space for [tempdb]? The temporary nature of [tempdb] makes this a difficult question: objects within aren't saved from one session of SQL Server to another, and evidence to explain runaway growth or loss of available space may be gone before an assessment can be made. Whatever schedule I decide on, I'll always wonder if it's frequent enough (or too frequent).
It's tempting to "over-schedule" a job's frequency, perhaps as much as every X seconds. Asking SQL Server "Are we out of disk space?" over and over again doesn't make a lot of sense to me, though. It reminds me of Bart and Lisa asking Homer "Are we there yet?" until he snaps. Ideally, instead of asking, I want SQL Server to *tell me* when disk/available space is running low.
Performance Condition Alerts
For [tempdb] data files that have a set maximum file size, a performance condition alert for the "Free Space in tempdb (KB)" counter can be created. It's part of the "Transactions" object. (On a side note, trying to find an appropriate counter by exploring what's available in each object, one at a time, is tedious. It's so much easier to query the sys.dm_os_performance_counters DMV.) In the example below, the alert fires if free space falls below 102400KB/100MB. The alert responds by executing a job "Log tempdb Space Usage Stats", and has a five minute delay between responses.
This approach works well for [tempdb] files that are configured with an initial size equal to their maximum size. With no possibility for auto-growth, you'd decide on the free space threshold amount, do a little math to convert it to kilobytes, and enter the value on the General page of the Alert dialog. If [tempdb] configuration and available disk space allow for future auto-growth events, take into consideration the alert may fire more often than desired.
WMI Alerts
For [tempdb] data files configured with an unlimited Maximum File Size, a WMI alert can be configured for the DATA_FILE_AUTO_GROW event. In the previous example, the alert (or more specifically, the counter) only applies to [tempdb]. But the DATA_FILE_AUTO_GROW event applies to data files for *all* databases. In the example below, the WQL query will restrict the alert to just [tempdb] via a WHERE clause. Like the first example, the alert responds by executing the job "Log tempdb Space Usage Stats", and has a five minute delay between responses.
This approach works well for [tempdb] files that are configured with room for growth. Note the event fires once per data file growth event. With the likelihood of multiple data files, the "Delay between responses" remains important. Also, if [tempdb] data files grow enough to consume all available disk space, the last alert might occur long before [tempdb] runs out of available space internally. As an alternative to a WMI event, an Event Notification can be configured for the DATA_FILE_AUTO_GROW event.
What to Log
Depending on how [tempdb] data files are configured, it makes sense to check either available disk space or available space within [tempdb] itself (or both) when an alert fires. In addition, determining what is consuming so much space in [tempdb] makes sense too. Microsoft's documentation includes some really good queries for monitoring [tempdb] use. Sunil Agarwal also has an article I found helpful. (It's an oldie, but a goodie.) Additionally, I've found these DMVs useful:
- sys.dm_db_file_space_usage
- sys.dm_db_task_space_usage
- sys.dm_db_session_space_usage
- sys.dm_tran_session_transactions
Comments