Level Extreme .NET Magazine April 2008 issue

Universal Thread Magazine March 2007 issue

2008
2007
2006
2005
2004
2003
2002

2006
2005
2004

2007
2006
2005
2004

2007
2006
2005
2004
2003

2007
2006
2005
2004
2003

2007
2006
2005
2004
2003

2004
2003
2002
2001
2000
1998
1997
1996
1993

2003
2002

2003
2001

2003
2002
2001

2003
2001

Visual FoxPro 2001
Visual FoxPro 2000
Visual Studio 2000

West Wind 2002
Double Impact 2001
FoxTeach 2001
Jam sessions
Technical Guidelines

Universal Thread documentation

About
Acronyms
Contact information
Troubleshooting
Time zones
Web Service
Terms & Conditions
Copyright

Manage your account for the messages area options, your subscription information, your invoicing, youir banners and your pictures Subscribe to the Universal Thread and get all the benefits related to the messages area A corporate subscription is needed for companies that have more than one developer Access the Universal Thread store to purchase your subscription, corporate subscriptions and banners The Universal Thread is covering several conferences per year. On site, reporters cover the technical aspect of the conference as well as making interviews, taking pictures and videos and other related content. Get all the reports from our coverages site. Universal Thread home page Level Extreme .NET Magazine, a newly published online magazine by Level Extreme about Microsoft .NET technology and its community Universal Thread Magazine, a magazine dedicated to the Visual FoxPro community PASS Community Summit 2005

The Professional Association for SQL Server (PASS) will host the industry’s largest user-run educational event exclusively dedicated to the SQL Server in Grapevine, Texas on September 27 – 30, 2005. Attend the 2005 PASS Community Summit, the largest event of the year exclusively dedicated to SQL Server education. Get the largest and most comprehensive coverage of SQL Server 2005 anywhere while you develop and expand your knowledge of SQL Server 2000. Meet and network with members of the Microsoft SQL Server development team, SQL Server MVP's and other SQL Server professionals. Check out the world's premier SQL Server vendors at the largest, most comprehensive tradeshow dedicated to SQL Server. And, much more!

Date: 
Location: 

Tuesday, September 27

by Dave Bernard

Everything's bigger in Texas...

...including the hotel. And the temperature. And the humidity.

It was 97 degrees (F) when my roomie, Dan LeClair graciously scooped me up at the airport. It's only a short ride, even in rush hour, to the The Gaylord Texan Resort and Convention Center, our home away from home for a few days.

What a cool place. Did I say it was big?

We promptly got settled in and then went down to registration and a reception. Registration was quick, and we received a nice satchel with materials, including a voucher for a complimetary copy of Microsoft SQL Server 2005 Standard Edition, when they ship it (covered later).

The reception was pretty well-attended and included light snacks, one drink ticket and a live quiz show.

by Dan LeClair

This is the second year I’ve been fortunate enough to cover the PASS Community Summit for the Universal Thread. Both summits have been preceded by hurricanes. Fortunately, neither show was affected by the storms, although we were having some bad moments in Houston last week…

Storms or no, the PASS folks throw a whale of a show, and this years’ is as big and as interesting as last year’s. And, with SQL 2005 in the production pipeline, a lot of excitement was in the air for this “release year” conference.

As before, the summit is made up of multiple activities: pre-conference sessions, the main three days of presentations, two hands-on labs, a large vendor exhibit, as well as daily keynotes and other activities. It’s almost too much to take in; in fact, none of the sessions are scheduled to repeat, which is a bit of a shame as there are many more that I’d like to have seen than I’ll have time for.

Wednesday, September 28

General Session
Kevin Kline and Bill Baker

by Dave Bernard

Kevin Kline, President of PASS, opened the 2005 conference, themed "Users Working Together". Kevin is an engaging speaker, and was obviously proud of how far PASS had come since its inception in 1998. Here are some facts:

  • SQL PASS 2005 is the largest event of it's kind; 1,800+ attendees, 40 exhibitors, 70+ chapters, 100 breakout sessions.
  • Attended by almost 90 of the 120 SQL Server MVP's worldwide are in attendance.
  • Supported by an all-volunteer effort.

Kevin emphasized the strong user-centric "community ecosystem" which gives PASS its foundation. Three main SIG's (Application Development, Business Intelligence and DBA) are going strong, with new ones in the works (Data Modeling and Security, Compliance and Auditing). See their blogs for some great information. Kevin noted the extraordinary growth in members from outside North America, and promised a PASS event in Asia would be a reality in the near future.

Lastly, Kevin gave the PASSion Award to Linda Rab (DBA SIG Chairperson) for going above and beyond the call of duty to help make the SIG program a success; she was responsible for recruiting many others to cause.

Kevin then gave way to Bill Baker, the General Manager for SQL Server Business Intelligence at Microsoft. One of the first things out of Bill's mouth was to announce that on November 7th, Microsoft will launch SQL Server 2005, Visual Studio 2005, and BizTalk 2006, in San Francisco.

Bill continued by quickly enumerating SQL Server 2000's market results, which are nothing to sneeze at (...at which there is nothing to sneeze?):

  • 22% year-over-year growth, 2003-2004;
  • 52% cumulative growth, 2002-2004;
  • 25% CAGR over the past seven years.
Market revenue share (20%) still lags behind IBM and Oracle (each at about 34%), but SQL Sever unit share outstrips those two by any measure, with SQL Server holding about 41% of unit share overall and 45% of the enterprise market. Bill interrupted his talk for the first of several t-shirt tossing events; Microsoft staffers tossed black shirts with single words on the back like Analyze, Integrate, Report, Query, Extend, Replicate (our favorite), Manage and Notify. Bill quipped that "our customers do verbs, but we sell nouns." One audience member loudly suggested the verb "Ship", inviting Bill's (faux) wrath. Bill promptly took out what he introduced as the "SQL Server 2005 Gold Master", ready to ship, but broke it in half getting it out of the case. He assured the crowd that the shipment would not be further delayed, nonetheless.

SQL Server 2005 is not an unknown by any stretch; over one million copies of betas and CTP's have been distributed to date. The SQL Server 2005 project has created several new initiatives and programs designed to give this new release the biggest head start out of the door ever. For example, the Front Runner Initiative, aimed at ISV's in the US, provided technical and marketing resources to spur applications to market faster. Microsoft worked hard to identify large SQL Server implementations (over 50 TB-sized deployments nominated themselves) so that this newest version would be tested in the most demanding of environments. One of these environments was Microsoft itself; keeping with their "eating your own dog food" mantra, the Microsoft IT Department supports over 70 SQL Server applications in-house today, including a 1.7TB SAP R/3 implementation, a staging data warehouse for all Microsoft data (2TB in 1,800+ tables with 500 subscribing systems), sales and revenue reporting (a BI application with 9,000 users running 25,000 reports per week) and an extrordinary 1.8TB worldwide tax information database. In addition, seven customers are live on SQL Server 2005 now, with 25 expected to be live by the time it goes to manufacturing.

SQL Server is the database of choice for some of the world's largest data repositories, including Verizon (52 billion rows) and the eighth largest data warehouse in the world (over 20TB). One of these large installations, Barnes and Noble, provided the data for Project REAL (Reference Implementation, End to End, At Scale, Lots of Users), which Microsoft is using as a gigantic best practices generator and proof of capability for SQL Server 2005. The BN implementation is 2TB in 10 billion rows, representing about three years of sales data, and is growing at the rate of 8.5 million rows per day.

Bill made it clear that the bar was raised very, very high for SQL Server 2005; and this newest iteration of their flagship database product has met and exceeded that bar. It's ready for action.

BI for the Relational Guy
Erin Welker

by Dave Bernard

Erin is the Lead, BI, for Scalability Experts. Her session was a very good introduction to Business Intelligence, aimed at the DBA who may know a lot about relational database technologies, but who has little exposure to concepts like data warehouses, data mining and online analytic processing (OLAP).

Business Intelligence is difficult to describe in a sound bite (byte?), but Erin said it best in suggesting that BI enabled data to be turned into actionable knowledge and decisions. It has evolved over time, starting with an emphasis on getting data into a system rather than turning that data into information. This emphasis on gathering data (as opposed to tuning for reporting) created a conflict for resources due to opposing requirements: reporting (querying) requires the optimization of large amounts of data, while OLTP (adding data) requires optimization over much, much smaller data sets. Optimizing for updates, query volumn and indexing is also at odds. How do we make these equally necessary functions co-exist efficiently in an enterprise?

The answer is to create separate data warehouses that are optimized for reporting purposes. These data warehouses can assume the form of a relational system or an OLAP system. The relational system is usually in the form of a star schema, in which dimensional tables surround a central fact table; facts tie dimension intersections together to present information of interest. For example, a Sales fact table may be surrounded by dimensional tables like Customer, Product, Currency, etc. However, this leads to a purely data-centric approach, without providing much in the way of business logic or other application-centric characteristics.

Keep in mind, also, that the true users of a data warehouse are not members of the IT department, or are even "super users", but are often C-level executives that may use the system infrequently. As Erin says, "there's nothing like implementing a data warehouse to show a company how bad their data is", and these less-than-super users find the holes very quickly. Also, these users need a presentation that is much more human-readable; those short geeky column names have got to go. Enter the OLAP Cube.

An OLAP cube is a multidimensional database that pre-aggregates values to expedite queries (a shortcoming of relational data warehouse schemas). OLAP is designed specifically for dynamic analysis of extremely large data sets. As efficient as they are, your OLAP tool (such as Microsoft Analysis Services) needs to be capable of such functions as data compression (to reduce redundant data), selective aggregation (to prevent someone from building a cube that is "all things to all people") and avoids storing empty columns. Selective aggregation has, perhaps, the greatest effect because it prevents a potential data explosion when the designer attempts to create a cube with every possible relationship imaginable.

Erin emphasized the importance of "soft factors" when implementing a BI strategy. BI/data warehouse projects are high-risk endeavors; take a lesson from the software development side of the house and embrace the RAD philosophy of iterative, piece-by-piece development and deployment; deliver early and often. Don't rely on IT people for all of your information; talk to the target users. And don't try top model the whole enterprise as step one!

Erin characterized data mining as the "discovery of trends and patterns based on existing data." Examples include capabilities like predicting inventory replinishment, credit risk analysis, cross-selling and healthcare outcode analysis. Data mining is powered by your OLAP front end; early OLAP users were expert at wielding Excel pivot tables as a BI tool, but this has been surpassed by tools like, Analysis Services in SQL Server, Reporting Services and Sharepoing Portal Services.

As Erin put it, two users often walk into meetings with conflicting reports based on the same source data. BI can help prevent that kind of occurrence and uncover unforseen actionable information in your enterprise.

Future Directions in Data Language Integration: The LINQ Project
Yair Alan Griver

by Dave Bernard

Before I get into this session, let me say a word about lunch: Great! After our carb-only, protien-free "continental breakfast" (better characterized as "bagels-and-two-types-of-cream-cheese breakfast"), lunch was a tremendous and pleasant surprise. I must say that it was well-organized as well, the lines lacking the normal glacial progress in most conferences. The evening receptions thus far have been adequate as far as food goes, which is often the weakest part of a conference. End of interlude.

To a long time FoxPro developer such as myself, Yag is a legend in the business. He is one of several ex-Foxers who have moved into power positions in Microsoft's development tool leadership hierarchy and, as Group Manager of the Visual Studio Data group, Yag continues to influence the direction of data handling with a distinctive FoxPro flavor. As I listened to this session, I chuckled to myself more than once at how Fox-like this all seemed.

FoxPro developers were long ago blessed with powerful, SQL-based, internal data handling capability. It was embedded right into the language and was famously optimized (known as "Rushmore") for blazing speed. In the view of many FoxPro developers, Visual Studio .NET only offers very clumsy data handling capabilities, via ADO.NET, and lacks the crispness, conciseness and performance of an integrated data manipulation capability.

But, soon, no more.

The recently announced LINQ Project introduces the .NET Language Integrated Query family of API's, divided into three families: Standard Query Operators, DLinq (ADO.NET) and XLinq (System.XML). All are based on many core .NET namespace API's and are designed to greatly simplify querying and working with dynamic data and XML in CLR languages. The version that YAG demonstrated is very new and has a ways to go before being added to Visual Studio, but, as you'll see, the promise is already there.

The Standard Query Operator library enables the querying of all sorts of data, including arrays, folders, system functions, collections, XML hierarchies, tables, etc. DLinq is optimized to work with ADO.NET and is the preferred way to talk to SQL Server data stores while XLinq is optimized to handle XML datasets. This new functionality is all based on .NET framework methods. What's interesting is that execution of your query is actually deferred until you reference the result set itself, such as when binding it to a control or assigning a data value to a variable.

A simple example in VB:

     Dim primes= {1, 3, 5, 7, 9, 11}
     Dim x = Select p From p in primes Where p > 5

     For Each prime in x
        console.WriteLine(prime)
     Next

Yag showed an example of directly reading an IIS log file this way (he even created his own Browse method so that he could easily list the results during his demos, influenced, undoubtably, by it's FoxPro twin.

When you want to access SQL Server data in .NET today, you have to jump through a lot of hoops: create a SQL connection, setup the SQL command, add paramters, use DataReader to execute the command, get a non-stongly-type object back and addressed the data elements ordinally. There's also no compile time checks.

Using DLinq, which is a very lightweight wrapper over your relational model, accessing SQL Server data can be as simple as:

   Class Northwind : DataContext
   Northwind db = new Northwind(@"Server=.\SQLExpress;Database=...");
   var Contacts = From c In db.Customers Where c.City == "London"_
       Select new {c.Name, c.Phone}

Essentially, tables and rows are mapped to classes and objects; this builds on existing ADO.NET and .NET Transactions methods.

XLinq brings powerful XML data manipulation capabilities to CLR languages; it has the expressive power of XPath/XQuery, but with C#/VB as the programming language:

Dim emp = _
    <employee>
        <name>Joe</name>
        <age>28</age>
        <department id="432">
            <deptname>Engineering</deptname>
        </department>
    </employee>

Also:

   Public LinkCosmos as XDocument = GetCurrentLink...("http://localhost/TestStream.XML")
   Dim links = Select dest:= cstr(item.element("linkurl"))
      From item in LinkCosmos.Root.Descendents("item")

is the same as:

   Dim links = Select dest:= cstr(item.linkurl) From item in LinkCosmos.Root...item

The LINQ Project is a huge step in the right direction for the CLR languages; data manipulation made easier than ever.

Build Your Own Reference Database for a Service-Oriented Architecture
Jonathan Storm

by Dave Bernard

When you have a hammer, everything looks like a nail

--Baruch's Law

So started Jonathan, DBA for Port Townsend Paper Corporation, to illustrate his point that, just as you can't build a house with just a hammer, you equally need a full tool set to build data systems, too. Having a toolset is one thing; integrating your tools to work together well is another matter. What integrates your tool set? The hands, head, heart of the master builder.

A Service-Oriented Architecture is an integration architecture approach. The Reference Service is the enterprise SOA service for providing master lists needed by multiple applications, such as people, organizations, customers, products, statuses, geographic information, etc. It defines a way of organizing software assets on a network, providing business users with high-level business services that are easily called into and integrated with other business processes. Reference is a universal information tool, integrating OLTP, OLAP, Analysis Services and data warehouses with each other.

Jonathan believes that we need partially-automated master data translation and routing and connecting, along with a master system that holds the gold standard, the list of master lists, to integrate systems. Reference is a single system to lookup, validate and reference values for many other databases; a compiled enterprise dictionary and thesaurus with cross-references. It is based on a central, physical repository of information in database form. Reference tables need to be highly normalized and extensible. It is an important part of the overall SOA approach to integrating applications.

Like many other significant endeavors, this one takes time, patience and management buy-in at the highest levels. It should be approached incrementally and be allowed to grow organically. You will be using all of your BU resources (your data warehouse will have most of your Reference Service elements already defined). Make your foundation strong and small in the beginning.

Jonathan spent some time going over his experience in creating a Reference Service at his company, how the schema was architected, how the tables were populated, etc. He estimates that a robust implementation might cost 2-3 man-years of effort.

Thanks for a good overview of this up and coming technology, Jonathan.

Managing Unstructured and Semi-Structured Data in SQL Server 2005
Michael Rys

by Dave Bernard

Michael Rys is Program Manager, SQL Server Engine, at Microsoft. I have seen few sessions with so much detail presented in so little time, with little loss of effect. I never imagined there was so much to know about handling unstructured data.

Unstructured data is just that: data of which the data engine has no intrinsic understanding of semantics or internal structure. Examples are Word documents, images and compiled binaries. Semi-structured data may lack an explicit schema, but is usually accompanied by some sort of meta information that describes it in some way. The structure may even vary from instance to instance and over time.

SQL Server 2000 provided very good capabilities for handling this kind of data, but SQL Server 2005 goes further, especially when it come to handling XML data. Full text support is improved and character and binary types support up to 2GB of data, up from 8000 bytes in SQL Server 2000. Of course, support for the CLR in SQL Server 2005 allows for all sorts of robust embedded data handling against non-structured text.

Michael interspersed his talk with loads of actual demonstrations running DDL code. He used the OpenRowSet (BULK...) to load files from disk directly into a BLOB varchar column. Very cool!

Full Text searching in SQL Server 2005 is now about 50% faster; index creation when repopulating catalogs is now about 600% faster! And it scales up to 2 billion documents per catalog (does the Library of Congress have that many documents?) New DDL statements have been added (CREATE FULLTEXT INDEX, native backup/restore/recovery/attach/detach), too. It looks like Full Text searching is finally a full-fledged citizen of SQL Server.

XML data support has been greatly improved; it is now supported as a native SQL type. It is queryable with XQuery, updateable with XML-DML and provides automated checks for well-formed XML, as well as automatic validation checks. You can create schema collections, stored in the system metadata, that allow you to constrain XML documents, that is, you can guarantee the "shape" of your XML data. There is a lot more to this than meets the eye.

You can even index XML data to speed XQuery operations! Even more, you can combine XQuery with Full Text searching for even more powerful searching. I think this is going to lead to very interesting applications in handling large document repositories such as medical abstracts and articles, which are often distributed in XML form but, until now, require translation into a pure text format before effective full text techniques can be brought to bear.

Microsoft has obviously spent considerable effort in improving both Full Text operations and XML data support; taken together, handling of non-structured data has been moved forward in a big way in SQL Server 2005.

A SQL Server DBA’s Guide to CLR Integration
Dr. Greg Low, Readify Pty Ltd.

by Dan LeClair

One of the significant changes for SQL Server is the ability to easily integrate VB.Net and C# code in the many different objects that use to require T-SQL to code. This session’s title intrigued me, as it implied that it wasn’t as easy as just adding assemblies in your code.

One of the basic fundamentals of CLR assemblies in SQL is that the compiled CLR code (and, optionally, the source code) is stored within the database itself, rather than as separate, externally-called classes. This gives it portability, as it travels with a database across backups and detachments. However, it means that common assemblies need to be integrated in each database that needs them. They will get written out in database scripts, but that makes the script hard to read, as all you see is hexadecimal code.

Dr. Low was very clear on his beliefs as to the appropriate places for CLR assemblies, and spent some time in his presentation describing how it can be used and where the greatest benefits are. The bottom line seems to be this: use CLR for tasks that T-SQL is not good at, such as string manipulation, cryptography, file access, and so forth. For data-centric operations, stick with T-SQL. User-defined functions seem to be very good candidates for building as CLR assemblies, as well as defining your own data types and aggregates. Stored procedures tend to be less so, as they usually are processing data. Triggers can also be written using CLR, but again, you need careful thought as to what the gains can be.

In short, his message was to curb the enthusiasm for CLR. Lots of interesting things can now be done but give some thought as to the appropriateness – are they things that make sense to have in your database?

Fine Tuning Performance with Plan Guides
Keith Elmore, Microsoft Corp.

by Dan LeClair

The sessions in this year’s PASS Summit are divided into different tracks, and this session fell into the “Enterprise Database Administration and Deployment” track (in fact, all of the ones I attended did). Unfortunately, I was a bit out of my element in this particular presentation. Not that I didn’t get anything out of it, mind you, but as a database developer, some of what was presented was a little obscure.

But for the DBA’s in the crowd – and this session was in the big hall – it seemed to really be of interest. SQL 2005 Plan Guides allow the DBA to manually fine-tune the execution plan via query hints for a specific stored proc, overriding the default execution plan and (hopefully) speeding up code performance.

Normally, in the course of code development, the DBA or developer would be responsible for testing and tuning queries through code changes, index changes, and so forth, and Plan Guides are not intended to supplant this responsibility. Rather, they are intended for circumstances where queries are not under local control – such as the case where they are written by an offsite partner or ISV not willing (or not able) to change slow-performing code.

What was obvious to me was that this is not something to approach lightly. First, there are three different plan guides, each with very specific requirements regarding code. The SQL plan guide, which offers the most control of query hints, can only operate against queries that match what it was designed against byte-for-byte – including casing, white space, and even carriage-returns and line-feeds. The Object plan guide is a little less restrictive, as the case for key words and white space doesn’t matter. The least restrictive – Template – uses the same case and matching rules as Object, but only supports two levels of parameterization, simple and forced, when developing.

Most of this information was presented at the beginning of the session. Keith then went into a number of different samples of how to implement the plan guides, and the outcome of the guides. Again, the DBA’s in the crowd really seemed to get into what was being shown. For me, it was enough to find out that this tool is available for situations where I don’t have control over query code.

Index Enhancements in SQL 2005
Kalen Delaney, Solid Quality Learning

by Dan LeClair

Kalen Delaney is a well-known SQL guru, having experience with every version of SQL back some eighteen years. I’ve never been able to squeeze into one of her sessions before, as they are usually packed. This one was expected to be as well, and kudos to the PASS organizers for scheduling it in the big hall.

SQL 2005 offers some really incredible enhancements for indexes that should prove to be very useful. A lot of ground was covered in this session, and I’m looking forward to experimenting with what I learned. The enhancements fall into one of three general categories: availability of indexes, manageability of index metadata, and performance improvements.

First, indexes can now be specifically disabled and enabled, rather than having to be dropped and recreated. This is a big improvement for situations like system upgrades and bulk data loads, as the operator or application doing the work does not have to determine what the index syntax is beforehand. Indexes can also now be built on line, rather than having to lock the table for the duration of their construction.

There are also a lot of changes regarding index metadata. These changes involve better index analysis (such as showing usage rates and fragmentation), reorganization (using ALTER INDEX…REORGANIZE instead of DBCC INDEXDEFRAG), and measuring performance statistics. New system stored procedures are available that deliver table-based result sets, meaning you can only return the columns you are specifically interested in; however, Kalen warned about some gotchas when using a couple of these.

One thing that was only briefly mentioned was that the Index Tuning Wizard has been tossed, and replaced by Database Tuning Advisor (a much more capable and complete application), but unfortunately not much more was covered about it. Guess I’ll have to look that one up myself.

A lot more was covered, but I can’t finish this without mentioning the new ability to include columns in indexes without actually adding them to a key. Basically this gives you the ability to get around the 900K / 16 column key limitation for better covering indexes, and hopefully better performance. The caveat here, though, is that it is now possible to build indexes as wide as the table itself, which wouldn’t really buy you much in performance.

Wednesday Vendor Exhibits and Random Thoughts
by Dan LeClair

If you read my report from last year, then you know that I think the vendor exhibits are one of the important features of a conference, as it gives potential buyers a chance to get a good look at what is on the market. This year’s vendor exhibit is at least as big as last year’s, with products ranging from high-end servers to magazine subscriptions and consulting services. HP was a big hit with attendees, with a massive t-shirt giveaway and opportunities to win geek-gear (the photo shows just one of the lines to register for shirts). I saw several products that were very interesting, including Red Gate’s log recovery tool (which I even recommended to someone on-line later in the day), and some others I will cover later. Any attendee that leaves here without info on useful tools (and at least one t-shirt) isn’t trying very hard.

Thursday, September 29

Develop Occasionally Connected Applications: Visual Studio 2005/SQL Server 2005
Steve Lasker, Program Manager, Visual Studio, Microsoft
Philip Vaughn, Program Manager, SQL Server, Microsoft

by Dave Bernard

This was an interesting and well-attended session. Thin client, mobile computing, handheld applications; whatever you want to call it, small footprint devices are increasingly being used to deploy applications in far-flung places. Often, these applications need to work even when the network isn't available. Steve and Philip demonstrated that, by taking advantage of the robust capabilities of Visual Studio 2005 and SQL Server 2005, you can accomplish just that.

Microsoft's overall approach is called the Smart Client, which overlaps both the thin client and rich client platforms, utilizing the best of both worlds. Together with ClickOnce technology, Smart Clients are as easy to deploy as web applications, but deliver the power of a desktop application. ClickOnce provides centralized, URL-based installation that doesn't require Administrator rights; in addition, the application automatically detects necessary updates upon startup.

The word of the day is synchronization. To work effectively in a disconnected state, the local client hardware needs to have enough data (data normally retrieved from a server) to continue functioning. This data may be messages, web service results, query results or even application updates; and it needs to happen in a way that is transparent to the user.

If you're going to synchronize with a database like SQL Server, you'll need a local data store: enter SQL Server Mobile. This lightweight database, now also available on the TabletPC, supports merge replication and remote data access (RDA), which are critical to the synchronization process we need to implement.

Visual Studio 2005 supports the latest Smart Client features. Steve detailed, through a long demo, the creation, deployment (via ClickOnce) and execution of a Smart Client application. Here's what he did:

  • Create new WinForms project in Visual Studio 2005
  • In the Data Sources window (where you can specify database, web service or object source types), establish a connection to your data source.
  • Steve chose to use the new datafile connection approach that allows specification of the actual MDF file instead of figuring out a connection string.
  • This fires up SQL Server Express in the background, which asks you if you want to make a local copy (cool, and recommended).
  • At that point, you can fine tune the datasource by excluding non-essential database objects from your local instance of the MDF.
  • You'll notice that Visual Studio has automatically populated the form controls menu according to columns found in the data source (very cool), and you can further tune that by specifying for each which kind of control provider you want (list box, text box, etc.) when that element is dragged to the form.
  • You can aggregate these columns/controls into containers that can then be dragged to the form en masse. One very cool thing is that labels are automatically created from the column names, with spaces inserted in the proper places where camel-case has been used in the column names! Where has this been all this time?
  • Data binding is accomplished by dragging a data element (table, e.g.) on top of a form control (very cool, and it uses a lot of smart defaults).
  • F5 starts it running.
  • To deploy, select Publish (to web server, CD, etc.), which then builds and installs the application, complete with a ClickOnce web page starting point that outlines the pre-requisites. The Install button on the web page runs a little bit of code to make sure that the pre-requisites and permissions are adequate.

It all seems pretty straightforward to me.

In another demo, Steve prepared a CE application by choosing a pre-built SDF file as the data source (if can now be created in VS 2005 or SQL 2005 (via SSI) rather than the arduous process required before); for SDF support, you need to add certain DLL's to your project for deployment purposes (set them to Copy if Newer).

Phillip quickly covered the merge replication aspects of maintaining offline data sets, contrasting it with the simpler (and weaker) RDA approach. If you're going to do it at all, go ahead with merge replication so that you can take advantage of its full-featured function set as your application evolves. New in SQL Server 2005 are several useful features:

  • Synchronization can now be done over the Internet (requires HTTPS);
  • DDL has been expanded to include many merge functions;
  • Support for database schema changes;
  • Big improvements in performance;
  • Bi-directional synchronization with conflict resolution support;
  • Designed from the ground-up for offline work.

If you're in the mobile application development business, the information covered in this session is a must-know. And if you aren't doing mobile applications now, you most probably will be doing so in the near future. Great session.

Ad-Hoc Reporting with Report Builder
Carolyn Chau, Microsoft

by Dave Bernard

The fact that a good reporting tool is of paramount importance to many attendees and their companies was underscored by the packed room for this session. And it was surely the kind of session where you leave convinced that you got your money's worth. Carolyn's rapid fire but smoothly flowing presentation kept everyone wide awake in that difficult after-lunch time slot.

Carolyn's session was ostensibly covering ad-hoc reporting, but it also served as an effective introduction to the reporting capabilities and enhancements now available with Visual Studio 2005. Report Builder is a tool aimed squarely at end-user needs, rather than those of the developer, who will be more attracted to the full-featured Reporting Services module; it does an effective job at hiding the complexity of the data source's physical layer and provides much more of a business model approach to defining reports. Report Builder is not a replacement for OLAP tools or pivot table approaches. It is being delivered with SQL Server 2005 and is fully integrated into Reporting Services and, like so many other parts of the new offerings, it is deployed as a ClickOnce application (requiring the 2.0 version of the .NET Framework.

Carolyn's session was light on slides and heavy on demos, the current one building on the previous ones; she rarely had to use the keyboard, too. It proved that Microsoft has spent considerable energy in making the design surface slick and easy to use for creating reports; it's modeled after PowerPoint (which appeals to a wide, non-developer audience) and is heavily drag-and-drop driven. It's like having a PowerPoint application that knows about data. It was very easy to add charts, graphs and other "foreign" objects. You can also cause the report to include user-selectable criteria at runtime (usually rendered in the form of pull-downs on the report, but that is tunable by the report developer), a very useful feature for weary report creators. It was very easy (and almost automatic) to add aggregated subtotals to the report; Report Builder makes use of existing schema RI information, but you can also build your own on-the-fly calculations.

Filter conditions are easily applied, too. One really cool aspect of adding filters is that Report Builder can render an English-like description of the defined filter that can then be displayed on the report. Very nice.

Reports can be saved (in RDL format) to be run later. They can be exported to all kinds of file types (XML, TIFF, etc.) You can view the generated SQL and RDL meta data if you want. It's surprising how complex some of the generated SQL calls are.

Infinite Drill (only available in the Enterprise edition) automatially creates drill-down links on the report in cells with appropriate RI relationships. For example, if you render an Invoice History report, links that drill down into the Invoice detail will be created, and there may be links on that report that go further; this allows users to explore data along model navigation paths. If a link makes sense, it will be there. Users can keep drilling as long as there is a path to follow from the current item. Drill-throughs can be tuned by the developer to include more or fewer columns, too.

The Report Model wizard walks you through the creation of a Report Builder model from your data schema; the only real restriction is that stored procedures are not supported, which I find a little surprising. The wizard automatically breaks dates down into parts (day, month, year) and automatically creates sensible aggregate fields. Once a model is created, it serves as the basis for your ad-hoc reports. Large models can be presented in more simplified forms by dividing them into "perspectives". For example, put Customers and Orders in a perspective, leaving out Invoices, General Ledger, etc.

Thanks, Carolyn, for a high-quality, quick-paced session.

Best Practices for Structured SQL Server Development
Adam Machanic, Senior Database Engineer, GetConnected

by Dave Bernard

Adam, a SQL Server MVP and self-described "Total SQL Server Geek", hosted a full session (and in a big room, to boot). This guy obviously has quite a following.

Adam is a fellow who believes that it is a mistake to trust your application and the data it delivers -- and he's right. To begin to trust our applications, we need to structure our applications so that results can be consistently reproduced. So how do we do that? Adam says you have to (1) handle exceptions, (2) unit test and (3) debug properly.

Adam started ot by illustrating worst practices in implementing a stored procedure "API". Stored procedures should be thought of just like any other interfaces: black-boxes with well-defined inputs and outputs, which is not the way most developers regard stored procedures. It means things like:

  • No passing in column names! You don't want to have to change lots of code just because you've changed a column name. This doesn't help readability, either.
  • No passing in table names!
  • No passing in partially formed WHERE clauses! If you do, you're trying to write stored procedures that do too much, which will defeat performance optimizations.
  • Don't change the outputs returned based on the inputs! Always return the same, well-defined data set.

Don’t do things like this (at least they used sp_ExecuteSQL)!
CREATE PROCEDURE myGreatProc
  @sql NVARCHAR(2000)
AS
BEGIN
  SET NOCOUNT ON
  EXEC sp_ExecuteSQL @sql
END

The result is, according to Adam, "a world of pristine beauty." As long as your application knows as little as possible about your database system, and as long as you keep those interfaces intact, you can do anything you want in that database (fix bad data, modify the schema, re-tune your stored procedures); no one will notice because the inputs and outputs have remained unchanged. What you have is a beautiful, totally decoupled application. Ah, nirvana by way of a level of abstraction.

Handle, er, expect, exceptions. Be aware that there are three types of runtime exceptions:

  • Statement terminating; this exception stops the current statement, issues an error, then moves on to the next statement.
  • Batch aborting; this exception stops the entire batch and issues an error. If another batch is in the queue, it will be processed.
  • Connection aborting; this is the sign of a serious problem! Say goodbye to your entire connection.
  • SET_XACT_ABORT ON – this option turns statement terminating errors into batch aborting errors. Also, it forces rollback of a transaction on error. Highly recommended.

Currently, @@ERROR only reports problems with severity > 10 and is reset after every statement; RAISERROR is user-controlled and will not terminate a batch. Because stored procedures do not implicitly start a single transaction in which multiple DML statements can run and exceptions do not automatically abort transactions, DBAs and developers must control transactions themselves!

Enter the TRY/CATCH feture of SQL Server 2005 - a huge improvement:

BEGIN TRY
   SELECT convert(int, 'xyz')
END TRY
BEGIN CATCH
   PRINT ERROR_MESSAGE()
   PRINT ERROR_NUMBER()
   PRINT ERROR_SEVERITY
END CATCH
GO

TRY/CATCH allows you not only to handle an error, but to do something smart in response. XACT_STATE() can be queried to determine the current transaction state; -1 means "must rollback", 1 means "commit or rollback", 0 otherwise. ROLLBACK is your friend. Use RAISERROR to handle things like constraint violations (WHERE NOT EXISTS fails, e.g.), logic violations (an unexpected value in a variable) and bad or missing passed parameters.

Of course, there's always the possibility of unexpected exceptions; the first defense is to code well so they don’t occur (test for divide-by-zero, e.g.) and then test, test, test. You can never test enough. Make sure that, when debugging, you're trying to discover bugs and not just trying to get the code to compile.

Debugging TSQL is tricky; a lot of PRINT and SELECT statements are typically employed. Adam made us aware of a couple of neat open source tools that will help a lot: TSQLMacro and TSQLAssert.

TSQLMacro provides a TSQL macro expansion capability, allowing the developer to centralize common pieces of code and, even better, automate the inclusion of test and debug code. TSQLAssert uses TSQLMacro's power to provide an assertion framework for stored procedures. Very useful!

Great session, Adam.

Writing .Net Code in SQL Server 2005
Peter DeBetta, Wintellect

by Dan LeClair

Having spent most of yesterday in DBA-oriented sessions, I decided to stick to database development topics today, starting with Peter DeBetta’s presentation on writing .Net code in SQL Server. Although he covered many of the same points as Greg Low did the day before, Peter went into much more detail on how to construct SQL objects with CLR.

Writing assemblies for SQL starts with creating a new SQL Server project. Five new project templates have been introduced for SQL 2005 in VS.Net 2005, allowing you to create user-defined aggregates (UDA), user-defined functions (UDF), stored procedures, triggers, and user-defined types (UDT). There are some common properties for all of them, including one for a connection string to the SQL server, one for assembly signing, and one for setting the security level of the assembly. Assembly security levels include SAFE (no external or unsafe calls), EXTERNAL ACCESS (allowing access to files outside of the SQL operating system), and UNSAFE (allowing PInvoke or other “unsafe” calls).

I should mention here that technically, all assemblies operate safely within the SQL operating system. If an assembly does something bad, SQL has control of it and will shut the thread down before it is allowed to crash the system. There may be perfectly valid reasons to allow an UNSAFE security level; however, it is something that should be approached with caution.

Before going on to present examples, Peter spent some time going over the differences between SQL Server data types, Sqltypes (in SQL Server projects), and regular .Net variable types. For example, the various different character types within SQL Server (Char, Varchar, Nchar, Nvarchar, NText, and Text) are represented by the Sqltypes SqlString and SqlChar, and as String and Char() within .Net. You will have to write your SQL assemblies a little different than regular .Net code because of this.

The meat of the session, of course, were examples of the different types of SQL objects that can be written. Space doesn’t allow me to cover them all in detail; however, this small example of a UDF that encrypts a password shows some of the important points of writing SQL objects in CLR:

using System;
using System.Data.Sql;
using System.Data.SqlTypes;
using System.Text;
using System.Security.Cryptography;
public struct MySecurity
{
	[SqlFunction]
	public static SqlBinary EncryptPassword(SqlString password)
	{
		UnicodeEncoding uniEncoding = new UnicodeEncoding();
		byte[] passBytes = uniEncoding.GetBytes((string)password);
		SHA1 sha = new SHA1CryptoServiceProvider();
		byte[] passHash = sha.ComputeHash(passBytes);
		return (SqlBinary) passHash;
	}
}
Two things to note here: one is the new namespace, System.Data.SqlTypes, and the second is the project identifier (SqlUserDefinedAggregate, SqlFunction, and so forth).

Of the five project template, only UDT’s and UDA’s can only be done in CLR; the other three can also be done in T-SQL. Just as Greg Low did yesterday, Peter suggested using some thought before launching into massive CLR assembly construction. UDT’s, for example, should be small and scalar – they are not meant to represent large things like customers, but smaller things like currency (with descriptor for dollar, euro, and so forth) CLR is intended to extend T-SQL – not replace it. For data-centric operations, T-SQL (being native to SQL Server) will generally be faster.

A lot of ground was covered in this session, but afterwards I felt as if I had a much better understanding of the uses of CLR and how to construct assemblies. Definitely worth going to.

Developing Enterprise Data Integration Components
Kirk Haselden,Microsoft Corp.

by Dan LeClair

Dave Bernard and I both wound up in the same 3:15 PM session, so I decided to go next door and check out Kirk Haselden’s presentation on integration components. SQL Server Integration Services (SSIS) is the latest incarnation of DTS, and one of the new features is the ease in which custom tasks are created.

One of the prerequisites for this session was previous experience with Integration Services, but even though I haven’t done much but play around with it, the session was generally easy to follow. One of the main points of the presentation was that custom task writing is much easier to do with .Net than it is with scripts. In fact, Kirk constructed, registered, and instantiated one in under a minute (drafting an attendee with a watch to time him). Okay, it was the classic “hello world” example, but enough to demonstrate his point.

There are a number of common requirements to writing custom tasks, including the following:

  • The project must have a fixed version attribute.
  • You must add a strong name to your assembly in order to register it in the GAC
  • An SNK keyfile must be created in your project directory.
  • A reference to Microsoft.SqlServer.ManagedDTS.dll and Microsoft.SqlServer.DTSRuntimeWrap.dll must be added to the project
  • A reference to Microsoft.SqlServer.Dts.Runtime namespace must be added to the beginning of your source code files.

Custom tasks are based on the Task class and have a number of attributes that should be set. One of the more interesting ones is “TaskContact” which is basically a “who wrote me” line:

TaskContact = "Expression Task;Microsoft Corporation; Microsoft SQL Server v9;
© 2004 Microsoft Corporation; All Rights Reserved;
http://www.microsoft.com/sql/support/default.asp;1"),
However, it also doubles as a placeholder attribute in case a required component is missing on the server the task in installed on. Kirk didn’t just demo a “hello world” task, he had deeper, more useful demos to show as well. He also discussed custom task best practices – provide progress feedback, even if it is just “started” and “finished”, avoid collections if possible, tasks should only have one function, and others. He also explained some of the inner workings of custom tasks, such as synchronous vs. asynchronous buffers, and went over the pipeline component and how it works. Lastly he finished up with the expression task and how it works. Whew! A lot of material was presented in this session, but it was worth attending.

Advanced DDL Tricks
Joe Celko

by Dan LeClair

Of the sessions I went to today, this was the one I was looking forward to the most. Joe Celko, author of SQL for Smarties as well as other books and over 750 articles on SQL Server, was sure to have some great stuff to present. A lot of other attendees thought so too, as the room filled up quickly before he began.

As much as I hate criticizing a fellow Texan, though, this session was disappointing. Instead of advanced DDL (data definition language) tricks, it turned out to be “Joe Celko’s Opinions on Proper Schema Construction.” Granted, it was probably new information to some folks, but some of what was covered I regard as basic table design rules, some points I disagree with, and some comments I frankly felt insulted by.

Celko’s opening point was that most bad SQL queries are the result of bad schema design. That’s a pretty broad statement, but one I might agree with under many circumstances. One particular point he brought up is that there are some standard naming conventions for column names that should be used when possible. ISO-11179 was one recommended standard, but generally accepted industry terminology should also be considered. Names should follow a noun-adjective pattern and columns should be named for what they really are. Constraints should be named descriptively, and used more frequently than most people use them.

However, some unfortunate generalizations were made. For example, null-tolerant columns are not on his most-favored-habit list. In the case of string and numeric columns, I would agree – better to have specified default values to prevent issues with nulls in calculations and searches. But, since an empty value is not valid in a date column, how else do you indicate the absence of a value, other than with a null? Same with foreign key columns.

Another surprising declaration was that use of the IDENTITY column value for primary keys is bad. I was not the only one to do a double-take on this, as it is (at least in my view) one of the easiest ways to implement surrogate keys in SQL Server. Celko’s beef with them is a lack of validation and verification (I didn’t really get his point on that) and that, with cheap disk space and powerful processors available today, there isn’t a really good reason to use them. Apparently he prefers to use natural keys as primary keys, including multi-column natural keys, regardless of the amount of space they consume in both table and index width. But what really hacked me was when he accused people that use IDENTITY columns of being the same ones who write bad code and use a lot of resource-consuming cursors in their queries. At that point, I almost walked out of the session.

Lastly, he had some unkind words about people that confuse “records” with “rows” and “fields” with “columns”. Huh? Sorry, Ashton-Tate and Microsoft used the terms “record” and “field” for years, and the habits of old dBase and FoxPro developers die hard. I’ve worked with SQL for eight years now, and still mix the terms up. Joe, if that makes me less of a person in your eyes – too bad. It also cost you a fan, but with 750 articles to your name I suppose you don’t really care, do you?


Friday, September 30

What's Next for Database?
Dr. Jim Gray, Technical Fellow, Microsoft

by Dave Bernard

In the database world of 1985, it took $50,000 worth of hardware to accomplish 100 transactions per second (tps); the goal was to reach 1,000 tps. Jim's current $400 laptop can handle 8,200 tps using SQL Server 2005; the improvement in price/performance is roughly tracking Moore's Law. That translates into about a 100-fold improvement each decade; so the $1 billion job of 10 years ago can be done for about $10 million today, and $100,000 in 10 years! From a data storage standpoint, terabytes (TB) are common now at about $500/TB, and petabytes (PB) will be common in a decade for about the same price point.

Jim characterizes the current and future state of the database world as "data avalanche", with a need to integrate all kinds of data. We can capture and store everything now (no need to throw anything away, but people need to know "what's interesting?" and "what can you tell me about X?" Jim believes that, just as punch cards were the "stone tablets" of their day (not updateable), the day is coming where databases will only allows inserts and reads, without allowing any deletes or updates. This is a reasonable view given the price/performance improvement curve we have seen and will continue to see.

Now and in the future, most of the data ("everything" currently amounts to about 15 exabytes) exists at the edge of the network: PDA's, cameras, phones, etc., and all of these devices want and need to share data and all will need databases. Jim thinks that an entirely new "perfect system" paradigm needs to be defined, one that:

  • Knows everything,
  • Knows what you want to know,
  • Can answer questions in an easy-to-understand way,
  • Tells you what you should have asked,
  • Can give "approximate" answers,
  • Is inexpensive to buy and own,
  • Is constantly asking questions on its own,
  • And tells you when an interesting answer pops up.

However, managing all that data is very hard; people costs are huge. So this system must emphasize the ability to self-manage, self-organize, self-configure and self-heal. In short, data capture is easy, but curating, organizing, searching and displaying data is still too hard. Jim advocates the re-unification of programs and data, allowing the objectification of information (What is a gene? A star? Their properties and methods?). Glue all these models together, using things like time, space, text, etc., for core types; person, event, document, gene, etc., become extensions of the core types.

I think the most controversial idea that Jim promotes is his vision of the DBMS as a huge web service: put everything inside the database! That means no web server or application server; drive everything from the database engine. I think this is, well, naive. Another smart fellow warned against the overarching goal of universal simplification ("Make everything as simple as possible, but not simpler." -- Albert Einstein); it ignores both history and reality. History has shown that the wants and needs of data consumer will preempt any attempt at universal simplification. Real worlds demands are too fast-changing, complex and diverse for any one-size-fits-all strategy. The law of unintended consequences trumps any attempt to simplify complex systems. You can't define the universe with a simple statement.

Given that, Jim's presentation provided great context for where we've been and where we're going, and much food for thought on how best to prepare for the future.

Microsoft Round Table Discussion
by Dave Bernard

I was invited to participate in one six Round Table Discussions that Microsoft was leading during the conference; small groups of 12-15 people were involved in each, led by a couple of Microsoft representatives. I've never really participated in something like this before, so I didn't know what to expect.

Lunch was served, then each person in turn introduced themselves and gave a breakdown of their job duties by percentage of time spent for any of about 15 categories such as DBA, project management, business analysis, architecture, marketing, quality assurance, R&D, etc. Most of the attendees were DBA's working in larger SQL Server enterprise shops. Everyone wore a lot of hats; many had duties that defied their stated job titles.

After this, we were asked to enumerate our three most important job responsibilities, and then detail the tasks that go with each. For each task, we were scorecard such things as the number of hours we spent on each task in a given time period, and the importance and significance levels we attached to each task. Finally, we mapped these tasks to where they fit in the various stages of the product life cycle.

We were then led into a discussion of "pain points", and lively discussion ensued for the rest of session. Patch management was definitely a hot button topic; many believed that the current methodology was risky and almost everyone relied on multiple non-Microsoft sources to corroborate and vet patches before actually applying them. One person suggested applying the SQL Server 2005 Upgrade Advisor approach to patches, taking a more proactive approach to determining ahead of time potential mine fields and missing dependencies.

One point surprising to me was the near unanimous opinion that, at least in the SQL Server space, Microsoft gets security right, especially compared with many competitors and third party vendors. Other good ideas were put on the table, which are, even now, winging their way to Redmond.

I've followed Microsoft almost since their inception as a company. I will be the first to say that I've long been impressed by their willingness, even eagerness to listen to their customer base; this was most definitely in evidence in this session, to Microsoft's credit. And I also believe that, by and large, the vast majority of reasonable ideas what customers ask for eventually gets included into the product line. While highly laudable, this is both a blessing and a curse. While listening and fulfilling customer requests is a great (and lucrative) way to do business, the breadth and diversity of Microsoft's customer base makes it more and more difficult to shoe-horn all of these myriad, great ideas into a fairly flat product line. It can negatively affect delivery schedules; for example, if I'm a medium size business waiting for the next version of SQL Server, I may have to wait for a release delayed by some feature used only by Fortune 100 companies, in which I couldn't be less interested. I think this is why we are seeing more and more stratification of each product line into different levels of capabilities (largely arranged around customer size). However, Microsoft needs to do a better job of explaining this stratification and who fits where. I wouldn't be surprised to see this stratification end up actually dividing into separate products with their own delivery schedules. That might be a very good thing for customers. General Motors doesn't try to build their entire car line on the same chassis, nor necessarily build all types of cars in the same plant. Pontiac owners are different in nearly every way from Cadillac owners.

Keep listening Microsoft, but it's time to drop the all-things-to-all-people-in-one-package approach. It promises everything, but satisfies no one.

Data Modeling Key Decisions – Pros and Cons
Paul Nielsen, Solid Quality Learning

by Dan LeClair

Paul Nielsen’s session was a great antidote to Celko’s diatribe of the evening before. Nielsen, author of SQL Server 2000 Bible, had some very interesting ideas on data modeling, including approaches that I had never considered before.

To start, here is a quote direct from his session:

Information is an organizational asset, and, according to its value and scope, must be organized, inventoried, secured, and made readily available in a usable format for daily operations and analysis by individuals, groups, and processes, both today and in the future.

With that as a base, he then presented some basic ideas about data modeling, including his “optimization theory” triangle (with good schema design being the basis upon which other optimization techniques rely on); how to evaluate model designs; an overview of the modeling process, and an introduction to various data models themselves.

A large portion of the presentation was devoted to discussing a relatively new idea: the O/R DBMS, or object-oriented relational database. Definitely not your father’s standard relational model! The O/R model, he claimed, supported the best of both worlds: data integrity, standard SQL set-based results, and performance from the relational side, and inheritance, polymorphism, and strong app development from the OO side.

There were a number of skeptics in the audience, including me, and I must confess that until I looked at the completed database (the code for which being on his USB drive, which he very trustingly passed around) it didn’t all quite come together. But, after seeing the actual tables and his description of how it all worked, I would be interested in giving it a try. It should be noted here, though, that when asked by one of the attendees if he had actually put this into a production system, he said no, but I’d be willing to bet we see this theory in practice soon.

He then went on to review some other data-modeling related concepts, including normalization, entity development, primary keys, optional data (such as the use of nulls), data quality, and the applications of standards in model construction. It was very interesting to compare what he presented with Celko’s opinions from the afternoon before.

All in all, a very interesting session, and one which gave me many good ideas.

The 15 Need-to-Know Points About XML
Kent Tegels, HDR Inc.

by Dan LeClair

This was another session I took a chance on, as I’m not a big “XML is everything” believer. However, Kent presented a lot of good information that helped clarify both how and when to use XML within SQL. By this, I don’t mean just returning a result set as XML; what was covered was actually storing XML within a SQL database for later use.

So, what are the 15 need-to-know points?

The first two – what XML is and is not – sort of summed up the session for me:

  • XML is not appropriate storage for most data. Don’t use when it makes sense to use regular rows;
  • XML is not limited to messages – Office 12 is XML-based, and more and more databases need to store complete XML documents;
  • XML is not likely to go away soon, and is not something to ignore or avoid – SQL pro’s need to know this stuff. The market is going to demand that you know it;
  • XML is not unmanageable or unlearnable – it may be tough to learn but worth it.

With an intro like that, I had to pay attention to the rest of the session, despite lunch. He then went on to discuss the primary storage mechanism for XML in SQL, which is the XML data type. It is a little different than regular XML, in that it really only stores fragments; SQL requires well-formed XML but a lot of the other rules are not enforced. Not every part of an xml document is retained – prologue, insignificant white space, for example, get stripped out. In short, XML you put in may not be the XML you get out of SQL.

This led to a discussion of some of the other problems of XML in SQL. As you might guess, your Storage structure affects performance, and if you do DML against XML it will be slower than against regular scalar row info. This can be alleviated somewhat by XML indexing (which he discussed later). It is also very easy to avoid appropriate normalization – which encourages what he called “cowboy programming”. If you are in a heavy OLTP environment, he recommended NOT storing data in xml format, but if you have to, be aware of the costs and how to minimize them.

Some of the other “15 things” he covered included a review of XML Schemas, how to use schema collections, how XML indexing works, the XML type methods, and alternatives to storing XML. It sounds like a lot, and in fact, it was, and Kent admitted that he was tinkering with his presentation up to the last minute to try and cover as much as possible. I felt it was a great introduction to a complex subject.

Analysis Services 2005 Database Design
Dan Bulos, Symmetry Corporation

by Dan LeClair

This was not the session I had planned on finishing the conference with, but I tailed Dave Bernard in for a last-minute chat, saw the title, and decided to sit in, as data-modeling for analysis services is a bit of a mystery to me. Dan’s unofficial title of the session was “Musings of a Cube Dude”, and, in fact, that was exactly what he presented – an entertaining mixture of theory and anecdote from his years as a cube developer.

The session centered around UDM – the Unified Dimensional Model approach to cube design, which is what the Business Intelligence Development Studio is based on. UDM uses attributes, instead of hierarchies, to construct dimensions, and depends on a key attribute upon which all others are based. He used the example of geography to explain this, asking questions like it is an attribute of the customer, should it be a separate dimension, will other sales offices be concerned with it, and so forth.

Something to consider when using UDM is excess “cube space”. When building dimensions, it is possible (depending on the source data) to create space between dimensions where there no “tuples”, or pairings of data. For example, if a sweater company offers red sweaters in “larger” and “medium” but not “small”, adding “small” as an attribute will create cube space as there is no combination of “small, red” sweaters. Sounds pretty basic, but this can be an issue when designing cubes.

Not having strong data warehousing experience, some of the presentation was lost on me. My own fault for randomly sitting in on the session, because modeling analysis cubes is really different than modeling OLTP systems. But there were several points he made that even a cube-novice like me could understand: you always need to think about where the application will change and expand; be careful in the development of your attribute relationships (particularly time, since it can be subdivided so many ways); rules are made to be broken, but only do it with forethought and intent.

Dan was an entertaining speaker on a complicated subject, and it was pretty clear he had a lot of passion regarding cube design. A lot of what he presented I can’t immediately use, but you can be sure his session notes will be on my hard drive for a while.

It's a Wrap...
by Dave Bernard

I come away from this, my first SQL PASS experience, impressed and re-energized. Pound for pound, it has to be one of the best values in the wide world of software technology conferences. It's big enough to be considered comprehensive, but not so big that it loses that close-knit community feel that the PASS organizers hold so dear. The ratio of industry leaders and experts to attendees may be the best I've ever seen. My only nitpick would be the curiously regular (daily) schedule modifications (sessions cancelled or moved), which I can't recall experiencing before at a conference. Oh, and it would have been nice to have some fruit at breakfast.

I applaud the choice of venue; the Gaylord Texan Resort was an outstanding host. This is the best conference venue I've been to in recent memory.

A friend of mine who once owned a training company once told me that "training is free", that is, you almost always get back much more than your investment, And SQL PASS is much more than "training"; it's access to experts, it's commiserating with people you'd otherwise never meet, it's an opportunity to focus on extending your knowledge and capabilities and it's just plain fun, too.

I'll definitely add SQL PASS to my annual list of must-see tech.

Interview of Speakers at PASS 2005

Jean-René Roy has interviewed key speakers at PASS 2005. The question was: "What features in SQL Server 2005 give you a good reason to upgrade?"


View the video

This is the list of speakers which have been interviewed:

 
Rushabh Mehta, Solid Quality Learning 
Rushabh is a business intelligence consultant for Solid Quality Learning and the principal ETL architect for Raymond James Financials. In the past 4 years, Rushabh has been instrumental in the design and development of major data warehouse initiatives for clients in retail and finance. He has architected and developed BI and ETL processes for systems ranging from a What-if analysis system that determine pricing strategies for a retail client; to a multi-terabyte financial decision support system. As the principle ETL architect, Rushabh is currently helping Raymond James Financials shape and achieve their enterprise BI strategy.

 
Randy Dyess, Solid Quality Learning 
Randy has a large variety of experiences dealing with SQL Server over the past 8 years and has worked with environments with Terabytes of data and environments that had over a 1000 databases with only a few megabytes of data in each database. Randy is currently a lead consultant at Scalability Experts and spends his time solving both security and optimization issues. Randy is the author of TransactSQL Language Reference Guide and numerous magazine and newsletter articles pertaining to SQL Server security and optimization issues. Randy is also the founder and principle author of the popular database security web site: www.Database-Security.Info and the popular SQL Server optimization web site: www.TransactSQL.Com. Randy volunteers on the PASS national newsletter committee and the North Texas SQL Server Users Group local SQL Server group.

 
Kevin Kline, Quest Software 
Kevin Kline is the Director of Technology for the SQL Server Solutions group at Quest Software, a leading provider of database management and application monitoring tools. Kevin is also the President of the Professional Association for SQL Server (PASS - www.sqlpass.org) and frequently contributes to database technology magazines, web sites, and discussion forums. Kevin is author of "SQL in a Nutshell" (http://www.oreilly.com/catalog/sqlnut/) and "Transact-SQL Programming" (http://www.oreilly.com/catalog/wintrnssql/) by O'Reilly & Associates . When he’s not working, Kevin likes to romance his wife, play to his four kids, write, and garden.

 
Don Kiely, Information Insights 
Don Kiely, MVP, MCSD, MSDE, is a senior technology consultant specializing in developing desktop and Web applications that integrate databases, Microsoft Office, and related technologies, using tools including SQL Server, Visual Basic, ASP.NET and XML. Don has authored and co-authored several programming books, including Visual Basic Visual Basic 6 Client/Server How-To and Visual Basic Programmer's Guide to the Windows Registry. He writes regularly for many industry journals, including InformationWeek, IEEE Computer magazine, Visual Studio .NET (VBPJ), and many other magazines. Don trains developers and speaks regularly at industry conferences, including TechEd, VSLive!, DevConnections, and others. He earned a BS in Civil Engineering from the University of Notre Dame and an MBA from the University of Colorado, but fortunately learned the error of his establishment ways. In his spare time he roams the Alaska wilderness by foot, dog sled, skis, and kayak, with his dogs Mardy and Izzi.

 
Adam Machanic, Data Manipulation Group 
Adam Machanic is an independent database-focused software consultant, writer, and speaker based in Boston, Massachusetts. He has implemented SQL Server for a variety of high-availability OLTP and large-scale data warehouse applications, and also specializes in .NET data access layer performance optimization. He is a Microsoft Most Valuable Professional (MVP) for SQL Server and a Microsoft Certified Professional.

 
Peter Debetta, Wintellect 
Peter DeBetta is the resident SQL Server trainer and consultant for Wintellect. Peter develops software using SQL Server, .NET, and other technologies. He also teaches SQL Server Programming courses and speaks at various conferences around the world about SQL Server, including VSLive!, WinDev, Devscovery, and Microsoft's SQL Server Summit. Peter has written courseware, books and articles on the subject, including articles for MSDN Magazine and a new Microsoft Press book entitled "Introducing SQL Server 2005 for Developers". When Peter isn't working with SQL Server, you can find him singing and playing guitar, taking pictures, or simply enjoying life with his wife and new son.

 
Wayne Snyder 
Wayne has more than 24 years of experience in project management, database administration, software design, performance measurement and capacity planning. Wayne has been aggressively involved with client server and web technologies using a wide range of databases and Windows GUI development tools. He has provided both stand alone client/server solutions, as well as mainframe/PC integration software. Wayne’s expertise includes: software development and design; systems analysis; systems integration; user education; performance management and tuning; database administration; and project management. He has provided these services to major clients in the healthcare, financial services and manufacturing industries. Wayne Snyder MCT, MCSE, MCDBA and MVP.

 
Pat Wright, 1-800 contacts 
Pat Wright is a DBA for 1-800contacts and the President of the Intermountain Sql Server Users group. He has been a DBA for the past 6 years throughout the valley. He is an outspoken advocate of Microsoft SQL Server and has presented for the 2005 Beta's Unleashed event in SLC and many times for the Sql Server users group. He can be reached at pwright@1800contacts.com or by visiting the Intermountain Sql Server users group web site (issug.sqlservercentral.com).

 
Kurt Windish, Levi, Ray & Shoup, Inc. 
Kurt Windisch is a senior technical specialist with Levi, Ray & Shoup, Inc. (MCSP and CTEC), an information technology firm based out of Springfield, Illinois (www.lrs.com). He has over 13 years of experience in IT and is currently a DBA and web application developer for LRS's internal IT department. Kurt is on the Professional Association for SQL Server's (PASS) board of directors, holds an MSCD certification, has written for SQL Server Professional, and has presented at the past five PASS North America conferences.

Brian Knight, SQL Server Central
http://www.sqlservercentral.com
Brian Knight is a co-founder of SQLServerCentral.com and co-author of many SQL Server books.

West Brown
Product manager for Quest Software

Rick Heiges, Scalability Experts
Blog: http://www.sqljunkies.com/WebLog/rheiges/default.aspx
Rick Heiges is a Database Consultant with Scalability Experts which is one only three companies working with Microsoft in the Ascend program for SQL Server 2005 developing courseware and delivering courses to Microsoft and key clients. Recently, Rick has presented at PASS European Conference and the SQL Serverf 2005 Road Show in Lisbon. Rick is also on the Board of Directors of PASS which is focused on user to user education.

Your reporting team

 
Daniel LeClair 
Daniel LeClair is a Senior Developer with Sogeti USA in Houston, TX., and owner of GSF Development Consortium, also in Houston. He has many years of database development experience, dating back to dBase II, and has been constructing systems with Visual FoxPro from its introduction in 1995. In 1997, he began working with MS SQL Server 6.5, using VFP as a front end; currently he specializes in SQL Server database development and data migration. Daniel was the project leader for Eagle USA’s TALON system, winner of the 1999 VFP Excellence Award in the “Client-Server” category. He has written several articles on VFP development for FoxPro Advisor, and has spoken internationally on VFP and SQL Server development. He has a BBA in MIS from the University of Houston, and currently holds MCSD Certification in VFP.

 
David Bernard, The Intellection Group, Inc. 
Dave is co-founder and Vice President at The Intellection Group, Inc., which builds web-based business solutions for many different industries. He has been a software developer, manager and executive for over 25 years. He has worked in FoxPro continuously from version 1.02 through VFP 9, is Vice President of the Atlanta FoxPro Users Group and holds MCSD and MCDBA certifications. He currently specializes in Internet-based extranet development using VFP and SQL Server and has recently been immersed in voice recognition, text-to-speech, natural language and RFID projects.

 
Jean-René Roy, Technologies Softdesign Inc. 
Jean-René Roy has been developing IT systems for large and small enterprises since 1987. In 1991 he founded the consulting firm Technologies SoftDesign Inc. Since that time he has worked on projects for Bell Canada, CGI, ArchiDATA, Sun Telcom, Accent de Ville, Kraft, ADN Medical and many other clients. He has been and is still very active in the developers’ community by acting as a user group leader for more then ten years with the MFUG, SSMUG and GUVSM user groups in Montreal and the Ottawa.NET and OttawaSQL.NET UG. Jean-René is also the founder and president of www.DevTeach.com / www.SQLTeach.com an international developer’s conference. Jean-René Roy is a SQL Server MVP.



Copyright © 1993-2008, Level Extreme Inc., All Rights Reserved
62 Rue Doucet, Petit-Rocher, New Brunswick, E8J 1L3
Telephone: 1-506-783-9007 Email: mfournier@levelextreme.com