DBPedias

Your Database Knowledge Community

Mike Hillwig

  1. New Server in an Hour? No Problem! #sqlsat142

    If you attended my presentation at SQL Saturday 142 in Waltham, thank you. That may have been the most fun I’ve had presenting to a small group in a while. You were engaged and asked a ton of great questions.

    I promised that I’d have my presentation and slide deck posted by the end of the weekend. Unfortunately, I injured myself yesterday and it’s a bit difficult to type at the moment.

    I’ll get a good blog post together by the end of the week. I have eight hours of flights on Thursday, so that should be doable. And I’ll post it to http://crankydba.com/go/1HourServer.

  2. Coming to #NESQL How Not to Be a Cranky DBA

    On Thursday, May 10, I will be presenting at the New England SQL Server User Group at Microsoft in Waltham. Details can be found here. I’m pretty excited because this is one of the largest SQL user groups in the country. At the same time, this completely scares the bejeezus out of me because this is one of the largest SQL user groups in the country. There is a volcanic amount of brain power in this room.

    When Adam Machanic (blog | twitter) asked me if I’d speak, I jumped at the chance. We started brainstorming my topic and I sent him the slide decks of my two “top ten” presentations. He suggested that we combine them and polish up the title a bit. And that’s how “How Not to Be a Cranky DBA” was born.

    As I’ve been preparing my slide deck, I realized that this user group gets some fantastic speakers. We’ve had Brent Ozar, Grant Fritchey, Aaron Bertrand, and of course, Adam Machanic. These aren’t exactly second stringers. These are big names in the SQL Server community, and I’m being given some really big shoes to fill. This is a really big deal for me, and I need to bring my A game. I’ve spent a good deal of time making my slides visually appealing and choosing the stories I want to tell.

    One of the things I like about this topic is that it’s all story telling and no demo. Anyone who knows me knows that I love to tell stories, and this is a whole presentation full of them.

    Come join us for an evening of SQL goodness.

  3. Sports Analogies

    I’m not exactly a sports fan. Not even close. But every once in a while, I will use a sports analogy that would make my dad proud.

    A couple of weeks ago, I was troubleshooting a performance problem. Every fifteen minutes, a group servers would start paging and our IO waits would skyrocket. It was pretty easy to looking for the culprit. All we needed to do was find whatever processes were running every fifteen minutes. In this case, it was our transaction log backups.

    The application on this group of servers uses a series of twenty databases (which is another post for another day) and the transaction logs backups were all configured to kick off at the same time. I wanted my servers to run them serially instead of all at once.  My boss is a golfer, and I explained that we were doing a scramble instead of a shotgun. She completely got it. You could say I hit it out of the park. Oh, wait. Mixing analogies probably isn’t a good idea here.

    One of the lessons learned in this process (and I thank Kendra Little for her help in this) is that the log shipping engine on the target side wants the filename in a specific format. While I was at it, I converted the timestamp to UTC to match how the log shipping process names its file.

    The second lesson learned here is that the schedule for the backup job should vary every so slightly than the LSCopy job on the target server. Assuming your servers have synchronized times, you will be doing a copy at the same time as the backup. That’s causing a little bit of disk contention. You also run the risk of your DR site being behind production the length of your repeat interval. In my case, that was fifteen minutes. What I did was set my backups to start at the top of the hour, repeating every 15 minutes. The LSCopy schedule to start three minutes after the top of the hour, repeating every fifteen minutes. That means my DR server is about three minutes behind production.

    The third lesson here is that the SQL Server Agent doesn’t appear to be covered in the MIN and MAX RAM setting on your instance. Since we were copying some pretty big files across the network, that SQL Server Agent was sucking up a hefty amount of RAM, causing the server to page.  I had to lower the MAX RAM setting. Imagine explaining why you wanted to give a client’s database server less RAM than it already had.

    SET QUOTED_IDENTIFIER ON;
     DECLARE @v_LocalDrive nvarchar(1)
     DECLARE @v_LocalFolder nvarchar(50)
     declare @v_timestamp nvarchar(30)
     SELECT @v_LocalDrive = 'd'
     SELECT @v_LocalFolder = 'tlogs'
     DECLARE @v_dbname nvarchar(100)
     DECLARE @v_SQL nvarchar(1000)
     DECLARE c_databases CURSOR FOR
     select name
     from sys.databases
     where recovery_model = 1
     and database_id > 4
     ORDER BY [name]
     OPEN c_databases
     FETCH NEXT FROM c_databases INTO @v_dbname
     WHILE (@@fetch_status <> -1)
     BEGIN
     IF (@@fetch_status <> -2)
     BEGIN
     select @v_timestamp = convert(nvarchar, GETUTCDATE(), 112) + RIGHT('00'+ convert(varchar, datepart (hh, GETUTCDATE())),2)
     + RIGHT('00'+ convert(varchar, datepart (minute, GETUTCDATE())),2) + RIGHT('00'+ convert(varchar, datepart (ss, GETUTCDATE())),2)
     SELECT @v_SQL = 'BACKUP LOG [' + @v_dbname + '] to DISK = N''' + @v_LocalDrive + ':\' + @v_LocalFolder + '\' + @v_dbname + '_' + @v_timestamp + '.trn''' +' WITH NOFORMAT, NOINIT, NAME = N'''+ @v_dbname + '_' + @v_timestamp + ''', SKIP, REWIND, NOUNLOAD, STATS = 10'
     exec (@v_SQL)
     END
     FETCH NEXT FROM c_databases INTO @v_dbname
     END
     CLOSE c_databases
     DEALLOCATE c_databases

     

  4. More Interview Questions

    I wrote a post many months ago with some interview questions for my old boss to use when searching for my replacement. Tomorrow, we have a candidate interviewing to join our team.  Here are a few things from the top of my head that I just might ask. You’ll need to find your own answers, though.

    • Tell me about VLFs
    • Do you have any experience with MSX-TSX relationships in the SQL Server Agent?
    • How do you maintain INDEXes and STATISTICS in your environment today? How could you improve that?
    • When would you disable auto update of statistics?
    • How do you keep your skills current with SQL Server technology?
    • What’s the most compelling reason to upgrade from SQL 2005 to SQL 2008 R2?
    • What’s the biggest mistake you see other SQL Server DBAs make?
    • What’s your worse habit as a DBA?
    • Are you familiar with the PASS Summit or local chapters?
    • With what trace flags are you familiar?
    • What are the two types of authentication? Which is more secure?
    • Which will be faster, inserting a million rows of data or updating a million rows of data? (shamelessly stolen from Tom LaRock)
  5. Selectively Updating Statistics

    Most blogs should have a disclaimer that your mileage may vary. That’s not the case here. I can assure you that your mileage will vary. Test this before you run it against your 12 TB production data warehouse.

    I recently inherited a few servers that have autoupdate of statistics disabled. That’s a long story.  In other cases, we want to manually update statistics because it may take a quite some time for SQL Server to detect that it has stale statistics. We were running a script that updates all statistics across all indexes on all tables in all databases. And it was taking forever.

    One day it hit me.

    We’re running Michelle Ufford‘s script to reorganize and rebuild indexes. What if we could do the same for statistics? Why were we forcing the update of statistics on a static table? With a little bit of help from a script that Kendra Little wrote, I was able to put together a process that will dynamically update statistics only where needed. And we added a little more logic to set the sample rate as well. For small tables, doing a full scan makes more sense. But for large tables, this a smaller sample size is needed.

    A lot of the values are hard coded here and should be moved to parameters. Maybe in the next version. This thing looks for five percent or 1000 rows, whichever comes first. It works in my environment. Your mileage will vary.

     

    -- Dynamic Database Statistics Update
     --
     -- Created: Mike Hillwig
     -- 01/26/2012
     --
    create table #statsmaint
     (databasename varchar(100),
     schemaname varchar(100),
     tablename varchar(100),
     indexname varchar(100),
     rowsupdated int,
     totalrows int)
    --- Stats calculation adapted from Kendra Little's script found at
     --- http://www.littlekendra.com/2009/04/21/how-stale-are-my-statistics/
    exec sp_msforeachdb 'use ?;
     INSERT #statsmaint
     SELECT DISTINCT
     ''?''
     , s.name
     , tablename=object_name(i.object_id)
     ,index_name=i.[name]
     , si.rowmodctr
     , si.rowcnt
     FROM sys.indexes i (nolock)
     JOIN sys.objects o (nolock) on
     i.object_id=o.object_id
     JOIN sys.schemas s (nolock) on
     o.schema_id = s.schema_id
     JOIN sys.sysindexes si (nolock) on
     i.object_id=si.id
     and i.index_id=si.indid
     where
     STATS_DATE(i.object_id, i.index_id) is not null
     and o.type <> ''S''
     and (si.rowmodctr > 1000 OR cast(si.rowmodctr as float) / cast (si.rowcnt+1 as float) > .05)
     and ''?'' <> ''tempdb''
     order by si.rowmodctr desc'
    DECLARE @v_dbname varchar(100)
     DECLARE @v_schemaname varchar(100)
     DECLARE @v_tablename varchar(100)
     DECLARE @v_indexname varchar(100)
     DECLARE @v_SQL varchar(1000)
     DECLARE @v_rowsupdated int
     DECLARE @v_percentscan varchar (10)
     DECLARE @v_totalrows int
     DECLARE c_statistics CURSOR FOR
     SELECT databasename, schemaname, tablename, indexname, rowsupdated, totalrows
     FROM #statsmaint
    OPEN c_statistics
     FETCH NEXT FROM c_statistics INTO @v_dbname, @v_schemaname, @v_tablename, @v_indexname, @v_rowsupdated, @v_totalrows
     WHILE (@@fetch_status <> -1)
     BEGIN
     IF (@@fetch_status <> -2)
     BEGIN
    SELECT @v_percentscan = '100' where @v_totalrows <= 50000
     SELECT @v_percentscan = '75' WHERE @v_totalrows BETWEEN 50000 AND 1000000
     SELECT @v_percentscan = '50' WHERE @v_totalrows BETWEEN 1000000 AND 10000000
     SELECT @v_percentscan = '25' where @v_totalrows > 10000000
    select @v_SQL = 'UPDATE STATISTICS ' + @v_dbname + '.' + @v_schemaname + '.' + @v_tablename + ' ' + @v_indexname + ' WITH SAMPLE ' + @v_percentscan + ' PERCENT --' + cast (@v_rowsupdated as varchar) + ' OF ' + cast(@v_totalrows as varchar) + ' ROWS UPDATED. STARTED ' + cast(current_timestamp as varchar)
     print @v_sql
     exec (@v_sql)
    END
     FETCH NEXT FROM c_statistics INTO @v_dbname, @v_schemaname, @v_tablename, @v_indexname, @v_rowsupdated, @v_totalrows
     END
    CLOSE c_statistics
     DEALLOCATE c_statistics
    drop table #statsmaint
  6. The PNR Datatype

    About sixty years ago, American Airlines, in conjunction with IBM, revolutionized an industry by introducing the first computer-based inventory management system called SABRE.  This was revolutionary in many ways. Not only did it change the way the travel industry booked plane tickets and hotel rooms, but it also gave the public their first view of compuers in practical use. Culturally, this introduced the concept of a confirmation code. SABRE, like other systems, uses a six character code. This is frequently called a PNR or a record locator code. Next time you get a boarding pass for your flight, find your PNR. It’s not too hard to spot.

    SABRE has evolved over the years, but the PNR code has persisted. Imagine making a flight reservation and having them give you a confirmation number that more closely resembled your bank account number. Yuck.

    The more I think about this, the more fascinated I am with using a six character code as a data type.  In a modern relational database system, it may be completely impractical. But think back several decades. It was revolutionary.

    Imagine using all 26 letters in the English alphabet plus 8 numbers. We don’t want to use 0 and 1 because they can be confused with O and I, respectively.  That’s 34^6 or 1,544,804,416 records that can be held before needing to recycle. Even by today’s standards, it’s quite a lot of records.

    Using this datatype today could be a bit messy. Because we don’t store things on sequential punch cards, we’d need to have an algorithm to find the next available value. And putting a clustered index on this as a column would cause page splits for days.

    I’ve been chewing on this idea for close to a year now, and I still can’t decide if I love it or hate it more.

  7. Random Blogs

    A couple of weeks ago, I presented at the SNESSUG and had an excellent time. They’re a great group and were a lot of fun. One of my slides was to “Beware of Advice from Random Blogs.”

    I say this because there is just so much bad advice out there. Know your sources. Challenge the stuff you read. And more than anything, make sure you test the hell out of any script you download. Do you really want to run some script you found at Bernard’s Blog Land against your 12 TB production data warehouse and have it thrashing at your indexes? I certainly wouldn’t.

    I work with a few people that I swear have read a few too many bad blog posts. Some of their suggested “best practices” are things that I’m constantly challenging. These are smart people. But I believe they’ve read some bad advice.

    That said, I read a blog post from Kimberly Tripp today that made me giggle. It reaffirms something I’ve been saying for six months. And this one isn’t a random blog. So why do I take this one at face value? It comes from a reputable source, someone I’ve met, and this is someone whose advice has never failed me.

    Read the slide above: If it comes from Paul or Kimberly, it’s true.

     

  8. Confirming Objects Modified

    I work in a hosting environment, and frequently our clients will ask us to promote a stored procedure (or some other schema object) through the DEV, TEST, Production environments. We have one client that is really big on seeing some type of evidence that we did what we say we did.

    I wrote this little nugget that generates enough confirmation for the client’s relationship manager to demonstrate that we did indeed move their code. And it seems to make the client happy.

    set nocount on
    go
    
    DECLARE @dbname VARCHAR(30)
    DECLARE @num_objects INT
    DECLARE @object_type VARCHAR(3)
    
    SELECT @dbname = 'userdatabasename' -- Use the database where the objects were moved
    SELECT @num_objects = 1   -- Use the number of objects moved.
    SELECT @object_type = 'P' -- Use P for procedures, F for functions, U for tables, V for views, etc. 
    
    DECLARE @SQL VARCHAR (1000)
    
    select @@servername
    
    SELECT @sql = 'select top ' + cast(@num_objects as varchar) + 
    ' left(name,30) as object, object_id, modify_date from ' + @dbname +
    '.sys.objects where type = ''' + @object_type + '''
    order by modify_date desc'
    
    EXEC (@sql)

     

  9. Setting All Databases to SIMPLE Recovery Mode

    I’m cleaning up some stuff in my dev environment today, and I have some pretty big transaction log files. These have gotten big even though I do regular full and transaction log backups. In order to do some maintenance work, I wrote this little nugget this morning. It’s anther script that generates a script.

    Again, this is for my DEV environment. I’d never advise someone to run all databases in SIMPLE recovery mode in a production environment unless there was a very specific need to do that.

    set nocount on
    go
    select 'ALTER DATABASE [' + name + '] SET RECOVERY SIMPLE'
    from sys.databases
    where database_id > 6
    and recovery_model_desc = 'FULL'
  10. Audit Prep Toolkit

    When I was the DBA at Acme Packet, we went through a Sarbanes Oxley audit at least twice a year. It’s the price you pay for being a publicly traded company. One of the things I learned in my tenure there was that the best way to survive an audit is to anticipate what the auditor is going to ask for. Over a few years, I developed a great rapport with my auditors and typically had a mountain of data for them to sift through before they even walked through the door. By putting together a handful of scripts and reports, our auditors were able to spend more time doing actual auditing instead of waiting for us to provide data.

    I’m working on a toolkit that DBAs will be able to use to have this data ready for their auditors. It’s just a handful of scripts that generate the data needed to demonstrate some basic audit controls. By dumping that data into the BI engine of your choice, it will look like you know what you’re doing and are well prepared. Here are a few things that you can expect.

    • Basic Server Configuration Info
    • Database logins
    • Database logins with the sysadmin role
    • List of users per database
    • List of users per database including role
    • List of database roles and users included
    • List of explicit grants for database users
    • Backup history
    • Failed backups
    • Failed backups and proof of notification
    • List of SQL Agent Jobs
    • SQL Agent Jobs and Schedule
    • SQL Agent Job History
    Some of these sound redundant, and they absolutely are. It all depends on what your auditor cares about during that particular audit. And frankly, I’m okay with the redundancy because it keeps an auditor off my back.
  1. 1
  2. Next ›
  3. Last »