The Unfulfilled Promise of TRY...CATCH
- Posted in:
- Backup
- Database Administration
- Development
- SQL Server
- T-SQL
I've known for quite a while that there are issues with TRY...CATCH for T-SQL. I've had my share of frustrations with it, and I am clearly not alone. Just check out some of these Microsoft Connect items: 1 2, 3, 4, 5, 6, 7, 8, 9, 10. Five of them are Closed As Won't Fix, three of them are Active (all three were opened more than 8 years ago), and the other two are Closed By Design.
This paints a pretty grim picture of the state of error handling with T-SQL. I've been around long enough to have remembered the "old way" of error handling before TRY...CATCH came along in SQL 2005. I was happy to have the feature when it arrived. But TRY...CATCH hasn't lived up to the promise of what it should be.
My particular TRY...CATCH issue is one that's bothered me for years. The times I've discussed it with other SQL pros, I usually get similar input: use PowerShell, or use SqlCmd.exe with an output file, or use individual SQL Agent job steps with hard-coded T-SQL statements. In spite of good intentions, those types of suggestions agitate me. Sure, it's natural to try to seek workarounds when we run into issues. But why should we have to jump out of T-SQL to handle a T-SQL error? It's as if we're admitting that T-SQL isn't a *real* procedural language. And what about the issues we don't know about? Or the issues we can't possibly foresee? That's the point of having and using error handling! If a fundamental part of our programming language doesn't work, what are we to do?
TRY...CATCH Catches Last Error Only
Here's an example of what's gotten my goat for so long. It's a backup statement that fails along with the output from SSMS:
Note there are two errors (3024 and 3013). Let's try it within a TRY...CATCH block:
BACKUP DATABASE master
TO DISK = 'master.diff.bak'
WITH DIFFERENTIAL
Msg 3024, Level 16, State 0, Line 2
You can only perform a full backup of the master database. Use BACKUP DATABASE to back up the entire master database.
Msg 3013, Level 16, State 1, Line 2
BACKUP DATABASE is terminating abnormally.
The CATCH block only "sees" one of the errors. This is problematic. We really need both of those errors for a full understanding of what happened. The one error accessible within the CATCH block doesn't tell us much on its own ("BACKUP DATABASE is terminating abnormally.").
BEGIN TRY
BACKUP DATABASE master
TO DISK = 'master.diff.bak'
WITH DIFFERENTIAL
END TRY
BEGIN CATCH
SELECT
ERROR_NUMBER() AS [Error_Number],
ERROR_SEVERITY() AS [Error_Severity],
ERROR_STATE() AS [Error_State],
ERROR_LINE() AS [Error_Line],
ERROR_MESSAGE() AS [Error_Message];
END CATCH
DBCC CHECKDB
Now an even worse example. If you want to try this yourself, you'll need a corrupt database. You can download a few from Paul Randal. There's also a video by Jes Borland that shows you how to manually corrupt a database with a hex editor.
This time there are three errors and some informational messages. Here's what happens when we use a TRY...CATCH block:
DBCC CHECKDB ('CorruptionTest')
WITH NO_INFOMSGS;
Msg 8939, Level 16, State 98, Line 2
Table error: Object ID 1269579561, index ID 1, partition ID 72057594043105280, alloc unit ID 72057594049789952 (type In-row data), page (1:410). Test (IS_OFF (BUF_IOERR, pBUF->bstat)) failed. Values are 133129 and -4.
Msg 8928, Level 16, State 1, Line 2
Object ID 1269579561, index ID 1, partition ID 72057594043105280, alloc unit ID 72057594049789952 (type In-row data): Page (1:410) could not be processed. See other errors for details.
Msg 8980, Level 16, State 1, Line 2
Table error: Object ID 1269579561, index ID 1, partition ID 72057594043105280, alloc unit ID 72057594049789952 (type In-row data). Index node page (0:0), slot 0 refers to child page (1:410) and previous child (0:0), but they were not encountered.
CHECKDB found 0 allocation errors and 3 consistency errors in table 'tbl_ServiceVersion' (object ID 1269579561).
CHECKDB found 0 allocation errors and 3 consistency errors in database 'CorruptionTest'.
repair_allow_data_loss is the minimum repair level for the errors found by DBCC CHECKDB (CorruptionTest).
We didn't just lose one error. We lost them all! DBCC CHECKDB on a corrupt database just came back "clean" and the CATCH block never executed. We've moved into dangerous territory now. This is no longer an inconvenience or an annoyance. This could lead to data loss, financial harm to your company, maybe even some risk to your career.
BEGIN TRY
DBCC CHECKDB ('CorruptionTest')
WITH NO_INFOMSGS;
END TRY
BEGIN CATCH
SELECT
ERROR_NUMBER() AS [Error_Number],
ERROR_SEVERITY() AS [Error_Severity],
ERROR_STATE() AS [Error_State],
ERROR_LINE() AS [Error_Line],
ERROR_MESSAGE() AS [Error_Message];
END CATCH
I'd love to change the world, but I don't know what to do but if I can't, maybe my little discovery can help a few of you. As opposed to a workaround, I've been trying to find a way to augment TRY...CATCH. And I think I've had a breakthrough. It's pure T-SQL and looks promising. More details available in my next post.
Comments