Sports Analogies
I’m not exactly a sports fan. Not even close. But every once in a while, I will use a sports analogy that would make my dad proud.
A couple of weeks ago, I was troubleshooting a performance problem. Every fifteen minutes, a group of servers would start paging and our IO waits would skyrocket. It was pretty easy to find the culprit. All we needed to do was find whatever processes were running every fifteen minutes. In this case, it was our transaction log backups.
The application on this group of servers uses a series of twenty databases (which is another post for another day), and the transaction log backups were all configured to kick off at the same time. I wanted my servers to run them serially instead of all at once. My boss is a golfer, and I explained that we were doing a scramble instead of a shotgun start. She completely got it. You could say I hit it out of the park. Oh, wait. Mixing analogies probably isn’t a good idea here.
One of the lessons learned in this process (and I thank Kendra Little for her help with this) is that the log shipping engine on the target side wants the filename in a specific format. While I was at it, I converted the timestamp to UTC to match how the log shipping process names its files.
The second lesson learned here is that the schedule for the backup job should differ ever so slightly from the LSCopy job on the target server. Otherwise, assuming your servers have synchronized clocks, you’ll be doing a copy at the same time as the backup, which causes a little bit of disk contention. You also run the risk of your DR site being behind production by the length of your repeat interval. In my case, that was fifteen minutes. What I did was set my backups to start at the top of the hour, repeating every fifteen minutes. I set the LSCopy schedule to start three minutes after the top of the hour, also repeating every fifteen minutes. That means my DR server is only about three minutes behind production.
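As a sketch of how that offset looks in the SQL Server Agent, here’s the scheduling done through msdb’s sp_add_jobschedule. The job names are made up for illustration; your log shipping jobs will have their own names.

```sql
-- Hypothetical sketch: stagger the backup and LSCopy schedules by three minutes.
-- Job names here are assumptions, not the real job names.
USE msdb;
GO
EXEC dbo.sp_add_jobschedule
    @job_name = N'TLogBackup',           -- assumed backup job name
    @name = N'Every 15 min on the hour',
    @freq_type = 4,                      -- daily
    @freq_interval = 1,
    @freq_subday_type = 4,               -- repeat in minutes
    @freq_subday_interval = 15,
    @active_start_time = 000000;         -- 00:00:00, top of the hour
GO
EXEC dbo.sp_add_jobschedule
    @job_name = N'LSCopy_MyServer_MyDB', -- assumed LSCopy job name
    @name = N'Every 15 min at :03',
    @freq_type = 4,
    @freq_interval = 1,
    @freq_subday_type = 4,
    @freq_subday_interval = 15,
    @active_start_time = 000300;         -- 00:03:00, three minutes past
GO
```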
The third lesson here is that the SQL Server Agent doesn’t appear to be governed by the min and max server memory settings on your instance. Since we were copying some pretty big files across the network, the SQL Server Agent was sucking up a hefty amount of RAM, causing the server to page. I had to lower the max server memory setting to compensate. Imagine explaining why you wanted to give a client’s database server less RAM than it already had.
SET QUOTED_IDENTIFIER ON;

DECLARE @v_LocalDrive nvarchar(1)
DECLARE @v_LocalFolder nvarchar(50)
DECLARE @v_timestamp nvarchar(30)
DECLARE @v_dbname nvarchar(100)
DECLARE @v_SQL nvarchar(1000)

SELECT @v_LocalDrive = 'd'
SELECT @v_LocalFolder = 'tlogs'

DECLARE c_databases CURSOR FOR
    select name
    from sys.databases
    where recovery_model = 1 -- FULL recovery only
    and database_id > 4      -- skip the system databases
    ORDER BY [name]

OPEN c_databases
FETCH NEXT FROM c_databases INTO @v_dbname
WHILE (@@fetch_status <> -1)
BEGIN
    IF (@@fetch_status <> -2)
    BEGIN
        -- Build a UTC timestamp in the yyyymmddhhmmss format log shipping expects
        select @v_timestamp = convert(nvarchar, GETUTCDATE(), 112)
            + RIGHT('00' + convert(varchar, datepart(hh, GETUTCDATE())), 2)
            + RIGHT('00' + convert(varchar, datepart(minute, GETUTCDATE())), 2)
            + RIGHT('00' + convert(varchar, datepart(ss, GETUTCDATE())), 2)

        SELECT @v_SQL = 'BACKUP LOG [' + @v_dbname + '] to DISK = N'''
            + @v_LocalDrive + ':\' + @v_LocalFolder + '\' + @v_dbname + '_' + @v_timestamp + '.trn'''
            + ' WITH NOFORMAT, NOINIT, NAME = N''' + @v_dbname + '_' + @v_timestamp
            + ''', SKIP, REWIND, NOUNLOAD, STATS = 10'

        exec (@v_SQL)
    END
    FETCH NEXT FROM c_databases INTO @v_dbname
END
CLOSE c_databases
DEALLOCATE c_databases
More Interview Questions
I wrote a post many months ago with some interview questions for my old boss to use when searching for my replacement. Tomorrow, we have a candidate interviewing to join our team. Here are a few things off the top of my head that I just might ask. You’ll need to find your own answers, though.
- Tell me about VLFs
- Do you have any experience with MSX-TSX relationships in the SQL Server Agent?
- How do you maintain INDEXes and STATISTICS in your environment today? How could you improve that?
- When would you disable auto update of statistics?
- How do you keep your skills current with SQL Server technology?
- What’s the most compelling reason to upgrade from SQL 2005 to SQL 2008 R2?
- What’s the biggest mistake you see other SQL Server DBAs make?
- What’s your worst habit as a DBA?
- Are you familiar with the PASS Summit or local chapters?
- With what trace flags are you familiar?
- What are the two types of authentication? Which is more secure?
- Which will be faster, inserting a million rows of data or updating a million rows of data? (shamelessly stolen from Tom LaRock)
Selectively Updating Statistics
Most blogs should have a disclaimer that your mileage may vary. That’s not the case here. I can assure you that your mileage will vary. Test this before you run it against your 12 TB production data warehouse.
I recently inherited a few servers that have autoupdate of statistics disabled. That’s a long story. In other cases, we want to manually update statistics because it may take quite some time for SQL Server to detect that it has stale statistics. We were running a script that updates all statistics across all indexes on all tables in all databases. And it was taking forever.
One day it hit me.
We’re running Michelle Ufford’s script to reorganize and rebuild indexes. What if we could do the same for statistics? Why were we forcing the update of statistics on a static table? With a little bit of help from a script that Kendra Little wrote, I was able to put together a process that will dynamically update statistics only where needed. And we added a little more logic to set the sample rate as well. For small tables, doing a full scan makes more sense. But for large tables, a smaller sample size will do.
A lot of the values are hard coded here and should be moved to parameters. Maybe in the next version. This thing looks for five percent or 1000 rows, whichever comes first. It works in my environment. Your mileage will vary.
-- Dynamic Database Statistics Update
--
-- Created: Mike Hillwig
-- 01/26/2012
--
create table #statsmaint (
    databasename varchar(100),
    schemaname varchar(100),
    tablename varchar(100),
    indexname varchar(100),
    rowsupdated int,
    totalrows int
)
--- Stats calculation adapted from Kendra Little's script found at --- www.littlekendra.com/2009/04/21/how-stale-are-my-statistics/
exec sp_msforeachdb 'use ?;
INSERT #statsmaint
SELECT DISTINCT ''?''
    , s.name
    , tablename = object_name(i.object_id)
    , index_name = i.[name]
    , si.rowmodctr
    , si.rowcnt
FROM sys.indexes i (nolock)
JOIN sys.objects o (nolock) on i.object_id = o.object_id
JOIN sys.schemas s (nolock) on o.schema_id = s.schema_id
JOIN sys.sysindexes si (nolock) on i.object_id = si.id and i.index_id = si.indid
where STATS_DATE(i.object_id, i.index_id) is not null
and o.type <> ''S''
and (si.rowmodctr > 1000 OR cast(si.rowmodctr as float) / cast(si.rowcnt + 1 as float) > .05)
and ''?'' <> ''tempdb''
order by si.rowmodctr desc'
DECLARE @v_dbname varchar(100)
DECLARE @v_schemaname varchar(100)
DECLARE @v_tablename varchar(100)
DECLARE @v_indexname varchar(100)
DECLARE @v_SQL varchar(1000)
DECLARE @v_rowsupdated int
DECLARE @v_percentscan varchar(10)
DECLARE @v_totalrows int

DECLARE c_statistics CURSOR FOR
    SELECT databasename, schemaname, tablename, indexname, rowsupdated, totalrows
    FROM #statsmaint

OPEN c_statistics
FETCH NEXT FROM c_statistics INTO @v_dbname, @v_schemaname, @v_tablename, @v_indexname, @v_rowsupdated, @v_totalrows
WHILE (@@fetch_status <> -1)
BEGIN
    IF (@@fetch_status <> -2)
    BEGIN
        -- Pick a sample rate: full scan for small tables, smaller samples as tables grow
        SELECT @v_percentscan = '100' where @v_totalrows <= 50000
        SELECT @v_percentscan = '75' WHERE @v_totalrows BETWEEN 50000 AND 1000000
        SELECT @v_percentscan = '50' WHERE @v_totalrows BETWEEN 1000000 AND 10000000
        SELECT @v_percentscan = '25' where @v_totalrows > 10000000

        select @v_SQL = 'UPDATE STATISTICS ' + @v_dbname + '.' + @v_schemaname + '.' + @v_tablename + ' ' + @v_indexname
            + ' WITH SAMPLE ' + @v_percentscan + ' PERCENT --'
            + cast(@v_rowsupdated as varchar) + ' OF ' + cast(@v_totalrows as varchar)
            + ' ROWS UPDATED. STARTED ' + cast(current_timestamp as varchar)

        print @v_sql
        exec (@v_sql)
    END
    FETCH NEXT FROM c_statistics INTO @v_dbname, @v_schemaname, @v_tablename, @v_indexname, @v_rowsupdated, @v_totalrows
END

CLOSE c_statistics
DEALLOCATE c_statistics
drop table #statsmaint
The PNR Datatype
About sixty years ago, American Airlines, in conjunction with IBM, revolutionized an industry by introducing the first computer-based inventory management system, called SABRE. This was revolutionary in many ways. Not only did it change the way the travel industry booked plane tickets and hotel rooms, but it also gave the public their first view of computers in practical use. Culturally, this introduced the concept of a confirmation code. SABRE, like other systems, uses a six-character code. This is frequently called a PNR or a record locator code. Next time you get a boarding pass for your flight, find your PNR. It’s not too hard to spot.
SABRE has evolved over the years, but the PNR code has persisted. Imagine making a flight reservation and having them give you a confirmation number that more closely resembled your bank account number. Yuck.
The more I think about this, the more fascinated I am with using a six character code as a data type. In a modern relational database system, it may be completely impractical. But think back several decades. It was revolutionary.
Imagine using all 26 letters in the English alphabet plus eight digits. We don’t want to use 0 and 1 because they can be confused with O and I, respectively. That’s 34^6, or 1,544,804,416, records that can be held before needing to recycle. Even by today’s standards, that’s quite a lot of records.
Using this datatype today could be a bit messy. Because we don’t store things on sequential punch cards, we’d need an algorithm to find the next available value. And putting a clustered index on this column would cause page splits for days.
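Just to make the idea concrete, here’s a hypothetical sketch that generates a random six-character code from that 34-character alphabet. This is illustration only, not how SABRE allocates record locators, and it doesn’t solve the next-available-value problem.

```sql
-- Hypothetical sketch: one random six-character PNR from a 34-character
-- alphabet (A-Z plus the digits 2-9; 0 and 1 excluded as described above).
DECLARE @alphabet varchar(34) = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ23456789';
DECLARE @pnr varchar(6) = '';
DECLARE @i int = 0;

WHILE @i < 6
BEGIN
    -- NEWID()/CHECKSUM gives a fresh random seed per character
    SELECT @pnr = @pnr
        + SUBSTRING(@alphabet, ABS(CHECKSUM(NEWID())) % 34 + 1, 1);
    SET @i += 1;
END

SELECT @pnr AS pnr;
```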
I’ve been chewing on this idea for close to a year now, and I still can’t decide if I love it or hate it more.
Random Blogs
A couple of weeks ago, I presented at the SNESSUG and had an excellent time. They’re a great group and were a lot of fun. One of my slides was “Beware of Advice from Random Blogs.”
I say this because there is just so much bad advice out there. Know your sources. Challenge the stuff you read. And more than anything, make sure you test the hell out of any script you download. Do you really want to run some script you found at Bernard’s Blog Land against your 12 TB production data warehouse and have it thrashing at your indexes? I certainly wouldn’t.
I work with a few people who I swear have read a few too many bad blog posts. Some of their suggested “best practices” are things that I’m constantly challenging. These are smart people. But I believe they’ve read some bad advice.
That said, I read a blog post from Kimberly Tripp today that made me giggle. It reaffirms something I’ve been saying for six months. And this one isn’t a random blog. So why do I take this one at face value? It comes from a reputable source, someone I’ve met, and this is someone whose advice has never failed me.
Read the slide above: If it comes from Paul or Kimberly, it’s true.
Top Ten Operational Mistakes at SNESSUG
Tonight, I’m presenting my Top Ten Operational Mistakes at the Southern New England SQL Server User Group meeting.
I gave this presentation at SQL Saturday in South Florida and it was a lot of fun. I’ve updated it slightly to include a warning on avoiding advice from random bloggers. My benchmark there is if someone says you should always or never do something, they’re not right.
Top 10 Operational Mistakes to Avoid 2012
Confirming Objects Modified
I work in a hosting environment, and frequently our clients will ask us to promote a stored procedure (or some other schema object) through the DEV, TEST, and Production environments. We have one client that is really big on seeing some type of evidence that we did what we say we did.
I wrote this little nugget that generates enough confirmation for the client’s relationship manager to demonstrate that we did indeed move their code. And it seems to make the client happy.
set nocount on
go
DECLARE @dbname VARCHAR(30)
DECLARE @num_objects INT
DECLARE @object_type VARCHAR(3)
DECLARE @SQL VARCHAR(1000)

SELECT @dbname = 'userdatabasename' -- Use the database where the objects were moved
SELECT @num_objects = 1 -- Use the number of objects moved
SELECT @object_type = 'P' -- Use P for procedures, FN for functions, U for tables, V for views, etc.

select @@servername

SELECT @sql = 'select top ' + cast(@num_objects as varchar) + ' left(name,30) as object, object_id, modify_date
from ' + @dbname + '.sys.objects
where type = ''' + @object_type + '''
order by modify_date desc'

EXEC (@sql)
Setting All Databases to SIMPLE Recovery Mode
I’m cleaning up some stuff in my dev environment today, and I have some pretty big transaction log files. These have gotten big even though I do regular full and transaction log backups. In order to do some maintenance work, I wrote this little nugget this morning. It’s another script that generates a script.
Again, this is for my DEV environment. I’d never advise someone to run all databases in SIMPLE recovery mode in a production environment unless there was a very specific need to do that.
set nocount on
go
select 'ALTER DATABASE [' + name + '] SET RECOVERY SIMPLE'
from sys.databases
where database_id > 6
and recovery_model_desc = 'FULL'
Stopping a Series of SQL Agent Jobs
Yesterday, I participated in my first DR test at this company. It was a long day, but we learned a lot. This was also our first SQL Server client that we tested. For our Oracle clients, the Oracle DBAs have this process down to a science.
We’re using Log Shipping for our DR environment, and before I opened up the databases, I needed to stop the LS Restore processes for all 22 of the databases, and I wasn’t about to right-click on 22 SQL Agent jobs. That’s when I wrote this little nugget:
set nocount on
go
select 'EXEC msdb.dbo.sp_update_job @job_id=N''' + cast(job_id as varchar(60))
+ ''', @enabled=0
GO'
from msdb.dbo.sysjobs
where name like 'LSRestore_%'
and enabled = 1
What this does is generate a SQL script that I can copy, paste, and run. Sure, I could have encapsulated this into a cursor and EXECute it, but I like to have a little bit of control over these things. Lately, I’ve become a big fan of writing scripts that generate other scripts.
After the DR test, I wrote something similar that re-enabled those jobs once I had my backups restored.
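The re-enable companion would look something like this sketch, which simply mirrors the generator above with the enabled flag flipped:

```sql
-- Sketch of the companion script: generate statements that re-enable
-- the LSRestore jobs that are currently disabled.
set nocount on
go
select 'EXEC msdb.dbo.sp_update_job @job_id=N''' + cast(job_id as varchar(60))
+ ''', @enabled=1
GO'
from msdb.dbo.sysjobs
where name like 'LSRestore_%'
and enabled = 0
```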
Audit Prep Toolkit
When I was the DBA at Acme Packet, we went through a Sarbanes-Oxley audit at least twice a year. It’s the price you pay for being a publicly traded company. One of the things I learned in my tenure there was that the best way to survive an audit is to anticipate what the auditor is going to ask for. Over a few years, I developed a great rapport with my auditors and typically had a mountain of data for them to sift through before they even walked through the door. By putting together a handful of scripts and reports, our auditors were able to spend more time doing actual auditing instead of waiting for us to provide data.
I’m working on a toolkit that DBAs will be able to use to have this data ready for their auditors. It’s just a handful of scripts that generate the data needed to demonstrate some basic audit controls. By dumping that data into the BI engine of your choice, it will look like you know what you’re doing and are well prepared. Here are a few things that you can expect.
- Basic Server Configuration Info
- Database logins
- Database logins with the sysadmin role
- List of users per database
- List of users per database including role
- List of database roles and users included
- List of explicit grants for database users
- Backup history
- Failed backups
- Failed backups and proof of notification
- List of SQL Agent Jobs
- SQL Agent Jobs and Schedule
- SQL Agent Job History
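To give a flavor of what’s in the toolkit, here’s a sketch of one of those queries, the list of logins holding the sysadmin role. The column choices are mine; the toolkit version may differ.

```sql
-- Sketch of one toolkit query: logins that hold the sysadmin
-- fixed server role (the third item in the list above).
SELECT p.name, p.type_desc, p.create_date
FROM sys.server_principals p
JOIN sys.server_role_members rm ON p.principal_id = rm.member_principal_id
JOIN sys.server_principals r ON rm.role_principal_id = r.principal_id
WHERE r.name = 'sysadmin'
ORDER BY p.name;
```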