Friday, June 5, 2009

Restricting Users to a Schema in a Shared Database

Our healthcare analysts do lots of "data smashing," so they needed a work area where they could play with the data. Initially we were creating individual databases for each user, but this quickly proved difficult to manage. So we decided to create a single sandbox database and separate users/projects by schemas.

The main stumbling block was to figure out how to grant all of the rights that a database owner would normally have, but only in a specific schema. After some experimenting with SQL Server, here is what worked for us:

-- create the work area (Schema)
create schema HEDIS authorization dbo
GO

-- create user that will be restricted to the HEDIS schema
create user TESTUSER
for login TESTUSER
with DEFAULT_SCHEMA = HEDIS

GO

-- Grant various create rights to the user
-- with grant (the user can authorize others)

grant create table, create view, create procedure,
create function, create synonym

to TESTUSER
with grant option

GO

-- Grant control over the schema
grant control on schema::HEDIS
to TESTUSER with grant option
GO


That's it. Let me know if anyone knows a different/better way to do it or if I missed some rights/permissions.

Data Warehouse Overview

In this post I want to briefly describe what I mean by ultra-light ultra-useful data warehouse.

Our data warehouse development effort came about because our internal IT has largely failed to create a truly easy-to-use integrated data warehouse. We have numerous systems spanning patient encounters, claims, referrals, scheduling, prescriptions, labs, radiology, membership, CRM, provider credentialing, general ledger, payroll, HR and so on. We also have many different "data warehouses." We have such database/reporting technologies as Oracle, SQL Server, DB2 on Unix/mainframes, Sybase, and even a large Teradata complex with Chrystal Enterprise. Yet, in the end, reporting analysts were using Paradox and Access to cobble together integrated data sets from these source systems and existing "data warehouses" and e-mailing PDFs to the business users.

So we have created an externally-hosted SQL Server-based platform to remedy the problem. Here are its main characteristics:

1. Total database size: 480Gb
2. One production database server with SQL Server 2005 Enterprise (migrating to 2008), one production report server with Sharepoint integration, and one development DB server.
3. ETLs use Integration Services and stored procedures.
4. 2.5 FTEs manage the entire solution, excluding server and software management, which is done by the hosting provider.
5. RAD methodology is used to build dimensional models and ETLs quickly and incrementally. All ETLs are framework-driven and highly automated.

In the end, the degree of integration and performance we have achieved makes this solution highly useful, as it quickly eliminated the need for Paradox-based data integration and reporting, provided consistency among analysts, delivered Web-based self-service reports, and provided a way for a rapid response to business needs. The fact that we are building this on a SQL Server platform with 2.5 FTEs and an external hosting vendor makes this an ultra-light solution in terms of the cost to the organization.

In the upcoming posts I will document our ongoing issues and lessons learned, as well as review some interesting designs, such as our highly automated single-code-based "uber-ETL" process, technical server architecture with disk layouts, and so on.

Stay tuned.