Wednesday, February 9, 2011

And you may say to yourself… MY GOD! WHAT HAVE I DONE?

Have you ever found yourself living David Byrne’s dire prophecy? (http://www.youtube.com/watch?v=I1wg1DNHbNU... Don’t try to reenact…  the 80’s were more tolerant to bizarre body movements, tics and twitches advertised as a ‘dance’)
Specifically, when you develop SQL, has it ever happened to you that you’ve casually run some UPDATE or DELETE statement,  and got the message that… say, way more records than you had intended were just deleted? And you sit there, staring at the screen, letting the fact of what just happened enter your mind little by little… and the only thought in your mind is… “holy sh**!”
Hopefully you’ve got backup. And a pretty recent one too. And the time to go through the restore hassle.
SQL doesn’t have an ‘undo’. That bothered us for years. We have partially answered this problem with our Data Profiler, which enables you to take quick snapshots of the data before you’re changing anything. But that’s not a full solution.
We’ve now taken a further step in that direction:  Our new Colombo can take a DML statement (INSERT, UPDATE or DELETE) detect exactly what changes its going to make, and generate the SQL script that will do the exact opposite of what the statement does. NICE. Well, we think so. You simply paste your statement into Colombo, select your database, and that’s it!
There is still considerable way to walk to provide a full undo to every sql operation, batch script, and stored procedure that you might call. And we intend to walk that road. For now, we’ve taken the first step. Think we’re crazy? Let us know…
(Now that song will be stuck in my head for the next 3 days. It’s the same problem with anything 80’s….)

Sunday, February 6, 2011

Useful information we can get out of a query execution plan

Much has been written about Query Execution Plans (we’ll call them QEPs), and how they can be used for analyzing how SQL Server interprets your TSQL code and what it does behind the scenes – information you can use to do such things as find bottlenecks, build indexes where they might help, and generally improve the performance of your code.
Well, we have found more uses for it. There is more information in the QEP, information that’s not typically used, yet is highly useful and very easy to retrieve:

1.     Is your code broken?

TSQL code can get ‘broken’ all the time. You might be referencing tables or columns that don’t exist. They might have even existed at the time you wrote the procedure, but have been removed since, leaving your procedure ‘broken’ -  It will raise an error next time its executed. How many of those do YOU have in your database?
Now, if a code is broken, SQL Server will let you know about that when attempting to retrieve its execution plan. Strictly speaking, this is not QEP information. Its information you will get while trying to get the QEP. And that is not the only way to get that information. But it’s the easiest.

2.     What entities are used in for what operation?

Among all the logical and physical operations, and effects on memory etc, QEP also contains the very basic information of what entities it is working with, and what it does on such entities. This information can be then used to build a basic ‘browser database’, which you can use to answer questions like: which code inserts to this table? I got data missing… which procedure removes records from here?

In order to get any SQL’s execution plan in SQL Server, all you need to do is:

set SHOWPLAN_ALL on

write your SQL here

set SHOWPLAN_ALL on

If you get an error, this means the stored procedure is broken, and the error would say why. Otherwise, you can get the execution plan in a tabular format, and retrieve the specific entities used from it.

We put this functionality into one of our freebie products, Diana Lite. You can build the full ‘browser database’ out of the collection of any database’s SQL Code, then easily filter it and get to whatever you are looking for:


Sweet, Huh?

Tuesday, January 25, 2011

Validating DML statements before they happen

In some of our products (Database Compare, Marie-Alix and Columbo) we have been using a new idea for a while, which we call ‘Pre-Execute Validation’. What it does is, quite simply, checks that data can be updated in a table, and if it cannot, tells you specifically where your SQL will fail.

There’s something about manipulating data in sets that is a real pain: it either works or it isn’t. if you are doing an UPDATE for, say, 10,000 rows, and 2 rows in there will result in primary key, foreign key or, say, data type violation, the database wouldn’t tell you where the problem is. It will just say ‘fail, sorry’. Now go figure where your problem lies…

 In our most recent implementation, for example, (release as part of Columbo on January 2011)  ,  it takes a DML statement (UPDATE, INSERT or DELETE) and checks if it can be executed on the database. It goes over all the records that are to be updated, and checks against various database constraints for any potential violations. If there are, it tells you the exact rows and specific values that cause the problems.

If a picture is worth a 1000 words, what does a 2 minute movie worth? We estimate around… 20,000 then? Check it out here:


This little clip shows how we validate one single database integrity rule: field size. But there are a great number of rules to check. So, We wanted to spell here all the specifics constraints it checks against… mainly for your feedback, and if you think there are other things we can check but forgot. We pretty much good 80% of the list below implemented… and the rest, along with whatever other ideas we pick up along the way, to be finished by spring 2011. Ok, so here goes…


Checks for all statements

  • Permission on the object
  • All tables\views and their fields exists
  • All aliases exist (grammatical: if I do SELECT o.name, an o alias must be in the FROM)
  • If I got parameters: is their type AND SIZE are equivalent to the underlying datatapes in tables ot which they INSERT\UPDATE\SELECT…?

Checks for SELECT

  • Collation conflicts? (if there’s a join)
  • If doing convertions (CAST,CONVERT) check that all values that will be in the select will not do an overflow error. Read all about overflow (for instance, converting to smalldattime, do I have any dates that are before 1/1/1900 or anything after… and zero in on those lones)

Checks for UPDATE

  • NOT NULL (cant set to NULL a field that’s NOT NULL)
  • Primary key and other uniqueness
  • Foreign keys
  • Values out of acceptable range (date, numeric, strings ) and warn if there’s an option of data loss (overflow error)
  • Anything with cascade update\delete?

Checks for INSERT

  • All NOT NULL exits
  • Types (+sizes)
  • Primary key and other unique constraints not violated
  • Foreign keys
  • Not inserting into identity (unless explicitly asks, and then check constraints)
  • Value ranges: too big numbers, too long strings… and warn if there’s an option of data loss
  • No empty strings where not allowed (there such an option in SQL Server, if I remember)
  • COLLATE problems? Can data of wrong char set can be inserted? (though that may be harder… for later)
  • Overflow errors
  • Insert on views: can it take place? (the inserting on the one or on the many side… perhaps there are properties if the view is insert-able, or maybe only if certain values are there its insert-able, if the one side fields are NULL… learn)

Checks for DELETE

  • Foreign keys not violated

Function\Proc Calls

  • If a parameter is a SELECT, validate that it returns 1 row only (same if a SELECT is in the field of another SELECT statement?) also if I got SELECT in an expression…SET @a='X'+(SELECT nodename from prv.Category)

Variables

  • That all declared before use
  • Nothing declared more than once
  • A warning if not used
  • A warning if passed as a param before value was set(?)
  • Types of values that are put into them
  • If string type, (varchar,char) length of values, and if set from a SELECT or something, check the SELECT results and see if could potentially have longer value (check underlying DDL for length, and another check for the result set right now and if it violates…estrange eh?)

Temp tables

  • That declared (CREATE TBALE, select..INTO) before use
  • Not declared more than once (sql allows? 2 CREATE for temp table, INTO more than once?)
  • Registers what fields can be there (easier when its CREATE, harder in SELECT) if it’s a SELECT *, can only do it with conn context.
  • Field count, Type and size check as much as possible in transactions involving temp tables

DDL: make sure the data is ok for:

(for all of those, also make sure that the entities exist, of course, and that such constraint doesn’t already exist-  both in terms of name, or if there is another constraint with a different name that just does the same)
  • Creating keys\indexes\unique constraints
  • ALTERing a field to NOT NULL: can be done
  • ALTERing a field size: any data will be trimmed out?
  • ALTERing a field Type: data allows? (say, converting to numeric or date so check that all the values could actually be)
  • Adding an FK : Data allows it?
  • Adding a PK: Data allows it?
  • Adding a field – make sure its not in the table already
  • Adding a NOT NULL field