Shared Urban Data System (SUDS)

Mark L. Gillenson, Ph.D.
Professor of Management Information Systems

August 24, 2004

The Importance of Data Today

Take a drive through any modern U.S. city and think for a moment about how much of what you see is supported by data and the information systems that store and process it. Automobiles are not only designed and manufactured with information systems but also contain computers as critical operating components. Retail stores collect data and use information systems to manage their inventories and track their sales. The ubiquitous cell phones that we carry depend on computer-enabled networks to function. Indeed, every aspect of modern commerce, from mundane operations like bookkeeping and payroll in companies of every size and description, to more exotic transactions like Internet sales, is based on data and the information systems that support it.

Governments, too, at all levels use data and information systems. In fact, the beginning of the era of modern "data processing" can be traced to the introduction of punched-card data tabulation equipment by the U.S. Census Bureau in the 1890s. At the federal level, today, it would be hard to imagine the social security apparatus functioning without its data and computers. At the state level, driver's license and motor vehicle registration systems are all highly automated. Even at the local level, such requirements of modern living as property tax collections and city personnel management require data automation.

Focusing on the local level, we find today's municipal and county governmental bodies stretched increasingly thin in terms of the services they can reasonably provide. Some local government obligations, such as basic school, police, and fire services, have to take priority. Social services and even non-critical activities in the basic services often take a back seat. At the same time, new demands continue to flow from the populace, as well as from federal, state, and judicial mandates. However, it is often difficult, if not impossible, to generate new revenue to keep up with existing obligations, much less meet these new demands (and even basic services are subject to cuts in times of revenue difficulties). As a separate issue, governmental bodies at all levels are notoriously deficient at coordinating their activities and resources, including their data, to gain mutually beneficial efficiencies. Indeed, government could learn a lot from Wal-Mart and its suppliers, if only the revenues, incentives, and release from the pressures of other obligations were there.

The need for more and better local government services is clear and present. Providing better health and educational services to disadvantaged children; helping adults gain new skills through targeted educational opportunities and other support such as day care for their children; enhancing emergency services to improve performance and coordination; the list is both extensive and compelling. The question that we pose is whether today's technologies can be brought to bear on these issues without necessitating unacceptable financial demands and without requiring the development of major new government organizations.

We believe that the time has come to apply modern data resource and information systems techniques to these pressing social problems. Furthermore, with targeted external funding providing the resources, we believe that a creative solution to the infrastructural problems standing in the way of such help is for local state universities to serve as agency-neutral facilitators and coordinators of such efforts. With these as our beliefs and this as our goal, the University of Memphis' Center for Community Criminology & Research (CCCR) created the Shared Urban Data System (SUDS). SUDS is the information and data engine of CCCR. Its detailed data, analysis, and report generation features hold the promise of helping to solve the community's pressing social problems.

One of the key concepts of SUDS is data sharing. Since data sharing plays such a central role in the SUDS environment, we shall discuss it before proceeding on to describe SUDS in more detail.

Sharing Data

When several applications or users in an information systems environment require particular data, at a very basic level there are two choices for organizing it. One choice is to have copies of the data stored in different files or databases that each support one or several of the applications or users. The other choice is to have the data stored in one database and to have all of the applications or users share it. This is by no means a trivial choice within a single company, much less in a multi-entity environment such as that of multiple local governments and government agencies, where it is certainly more challenging.

Historically, database management systems vendors and professionals have been preaching the benefits of shared data since the early 1970s. More recently, beginning in the 1990s, Enterprise Resource Planning (ERP) systems vendors, such as SAP, PeopleSoft, and others, have built systems and promoted system philosophies that are very much grounded in the idea of a central, shared database. The question here is, what are the benefits of sharing data in general and in the local government environment in particular? An additional question is whether there are any impediments to doing so.

Benefits

The benefits of sharing data can be viewed from both the technical and practical points of view. We shall examine both.

Technical

As noted above, data in a non-shared environment inevitably is stored redundantly in the databases of each application that requires it. There are three essential problems with redundant data. First, it obviously takes up more storage space than if it is stored in just one place and is shared by all applications that need it. Second, when the data changes, time and effort must be expended to update it in every place that it is stored. Third, if an update is applied correctly in one database but incorrectly, or not at all, in another, then the data becomes inconsistent across the databases. This eventuality, which is known as a data integrity problem, can happen within the databases of a single local government or government agency. However, it is especially likely to happen across multiple local governments or government agencies due to varying levels of sophistication, differing priorities, and even a lack of knowledge that changes to the data have taken place.

When data is shared, it is, in the optimum case, stored in one place and is accessible to all who require it and have a legitimate need-to-know (if the data is not public). This greatly reduces the storage space required for the data and it dramatically reduces the effort needed to update it. Furthermore, if the data is stored only once, it eliminates the potential data integrity problems caused by errors or omissions in updating the same data held in multiple databases.
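The contrast between redundant and shared storage can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the record key "clinic_17", the addresses, and the helper find_inconsistencies are hypothetical and are not drawn from any actual agency database.

```python
# Hypothetical illustration: the same facility address held by two agencies.
agency_a = {"clinic_17": "100 Main St"}
agency_b = {"clinic_17": "100 Main St"}


def find_inconsistencies(*copies):
    """Return the keys whose values differ (or are missing) across the copies."""
    keys = set().union(*copies)
    return sorted(k for k in keys if len({c.get(k) for c in copies}) > 1)


# Redundant storage: the change reaches one copy but not the other,
# which produces the data integrity problem described above.
agency_a["clinic_17"] = "250 Oak Ave"
redundant_conflicts = find_inconsistencies(agency_a, agency_b)

# Shared storage: a single record is updated once and read by both agencies.
shared = {"clinic_17": "100 Main St"}
shared["clinic_17"] = "250 Oak Ave"
view_a = shared["clinic_17"]
view_b = shared["clinic_17"]
```

In the redundant case the helper reports "clinic_17" as inconsistent; in the shared case both agencies necessarily read the same value because there is only one copy to update.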

Another issue regarding shared versus redundant data is data security for sensitive data. The more places that sensitive data is stored redundantly, the more opportunities there are for hackers to compromise it. Of course, the other side of the coin is the assumption that if the sensitive data is stored in one place and is shared, the security systems in use at the storage location are trustworthy and robust enough to truly protect the shared data.

Practical

In a sense, it is difficult to separate the technical from the practical benefits. After all, isn't improved data security a "practical" benefit? The approach that we take here is to show how the application of the foregoing technical benefits of shared data adds value to the lives and work of the following constituencies.

To the Policy Maker: The classic problem that corporate executives have with redundant data - and this certainly applies to public policy makers, as well - is that of trust. The all too common scenario is that of the meeting in which several people make presentations to corporate executives. The presentations should all be based on a common set of figures but it turns out that at least some of the presentations are based on figures that are not consistent with those of the others. How could that happen? Simple! The presentations were based on redundant data occurring in different databases that were not consistently updated. How can a policy maker make an informed decision if he or she doesn't know which set of data to believe?

To the Worker: The IT worker benefits from not having to maintain multiple redundant databases. The workers who are the users of the data benefit from knowing that the data they are working with is accurate and is the same data that their counterparts in other agencies are using.

To the Citizen: The citizen benefits in several ways. First, if the policy makers are working with accurate, consistent data, then hopefully the policies they make will be of greater benefit to the citizens. Second, there will be cost savings across local governments and government agencies in not having to maintain and update the same data in multiple databases. Third, the improved security of a single copy of the data being stored in a highly secure IT environment is important in this era of data proliferation.

Impediments

There are some impediments to sharing data which, while not insurmountable, must be understood.

Application

The data owners, in this case the local governments and government agencies, must be willing to give up supervisory control of the data. Further, they must believe and trust that the people and equipment at the shared data site are up to the task of properly managing the data. For non-public data, there must be a process in place to rule on future requests for access to the data.

Technical

The shared data facility must be capable of providing proper data security. It must be capable of providing access to the data to all who have permission to access it. Furthermore, it must provide access at a performance level that is acceptable to all of the users. The absence of any of these capabilities must be considered a serious impediment to the shared data effort.

Data Integration

When diverse data is collected and stored in files in a database, it can also be integrated. Data integration refers to the linking of related data from different files or databases. For example, consider the following files:

· A file showing basic demographics, such as population, income levels, educational levels, head of household characteristics including employment status, number of children in a household, etc., all keyed to zip code or census tract.

· A file showing the location of day care centers, including zip code or census tract.

· A file showing the location of health care facilities including zip code or census tract.

These files can be integrated by matching the common zip codes or census tract numbers found in the records in the different files. By doing so, one might discover that low income neighborhoods with relatively large numbers of children per household are over-served, properly served, or underserved by day care centers or by health care facilities.
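The zip code matching described above can be sketched as follows. This is a minimal sketch, not the SUDS implementation: every record is a made-up placeholder, the field names and the functions integrate and possibly_underserved are hypothetical, and the income and household thresholds are arbitrary values chosen only for illustration.

```python
from collections import Counter

# All records below are made-up placeholders, not SUDS data.
demographics = [
    {"zip": "00001", "median_income": 24000, "children_per_household": 2.4},
    {"zip": "00002", "median_income": 61000, "children_per_household": 1.1},
    {"zip": "00003", "median_income": 27000, "children_per_household": 2.1},
]
day_care_centers = [
    {"name": "Center A", "zip": "00002"},
    {"name": "Center B", "zip": "00002"},
    {"name": "Center C", "zip": "00003"},
]


def integrate(demographics, facilities):
    """Attach a facility count to each demographic row by matching zip codes."""
    counts = Counter(f["zip"] for f in facilities)
    return [dict(row, facility_count=counts.get(row["zip"], 0))
            for row in demographics]


def possibly_underserved(rows, income_ceiling, min_children):
    """List zip codes that are low income, have many children, and no facility."""
    return [
        r["zip"] for r in rows
        if r["median_income"] <= income_ceiling
        and r["children_per_household"] >= min_children
        and r["facility_count"] == 0
    ]


integrated = integrate(demographics, day_care_centers)
# The thresholds are illustrative assumptions, not policy definitions.
flagged = possibly_underserved(integrated, income_ceiling=30000, min_children=2.0)
```

The same integrate function could be applied to the health care facilities file, since it relies only on the shared zip code field. Matching on census tract instead would require nothing more than a different key.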

The Shared Urban Data System (SUDS)

The Shared Urban Data System (SUDS) is an ongoing project of the University of Memphis' Center for Community Criminology & Research (CCCR), which is housed in the university's Department of Criminology & Criminal Justice. It is intended to provide the benefits of data sharing and broad data access to local governments and government agencies in the Memphis metropolitan area.

The SUDS Concept

SUDS is a Web-based information security portal and shared database that resides on the CCCR's network server. It provides secure Internet access to community-wide data in the form of queries and maps. SUDS provides multiple databases (some publicly available, others private and password protected), analysis tools, and cross-platform support from a single data management console. SUDS information integration technology provides secure, read-only access to a collection of data intended to answer research questions and provide tracking of social, demographic, crime, and health trends. Goals of the system include:

· To present users with better information.
· To improve accessibility to integrated data.
· To provide state-of-the-art analytical tools such as GIS to data users.
· To enable informed decision-making and research.
· To improve data consistency from one system to another.
· To improve data quality through timely updating, standardized formatting, and maintenance of high quality metadata.

The SUDS Project Team

The SUDS project is housed within the University of Memphis' Department of Criminology & Criminal Justice and its Center for Community Criminology & Research (CCCR). It is led by Department Chair Prof. Richard Janikowski and by its director, Ms. Cindy Martin. The CCCR staff that the SUDS project can draw from includes:

Research Staff:
· Statisticians
· Survey Specialists
· Social Science Researchers
· Graduate Research Assistants

Technical Staff:
· Database Coordinator / Network Specialist
· Database Programmers
· Geographic Information System (GIS) Programmers
· GIS Technicians
· Graphics and Web Page Designers
· Consulting Faculty
· Graduate Research Assistants (technically experienced; they serve as assistant GIS technicians, assistant database programmers, and assistant web page developers)

The Availability and Accessibility of Data

One of the tremendous advantages of SUDS is its ability to make otherwise inaccessible data available to the public or to a select set of private users. The simple fact is that many of the local governments and government agencies in a metropolitan area, and Shelby County is no exception, do not have the technical capacity to put their data up on a Web site with or without security controls. In many cases such data does not even exist in a digital form and it literally sits in a filing cabinet. SUDS serves as a community resource through which this data can be made available and accessible through the Internet.

The Neutrality of the University Setting

An urban, state university, such as the University of Memphis, is the perfect, neutral setting for a database that is shared by local governments and government agencies, for several reasons. As an arm of the state government, the university should be interested in helping local governments and may even consider that it has a mandate to do so. Also, as an arm of the state government, the university should be interested in social issues concerning the citizens in its region. The university has no incentive to favor one local government or government agency over another. There is no profit motive. The university has significant technical and personnel resources to draw upon for such an effort, assuming that the effort is funded by grants.

SUDS Projects and Grants

The SUDS Public Portal

SUDS currently has three publicly available databases on its Web site. The first of these is the First Years database. Created for the Community Institute for Early Childhood, it is a "compilation of services offered to expectant mothers and young children through 5 years of age in Shelby County." Services listed include:

· Childcare
· Prenatal services
· Advocacy programs
· Nutritional programs
· Recreational outlets
· Cultural outlets
· Safety programs
· Violence prevention/intervention programs
· Health services
· Household and baby goods
· Parenting skills/child development education
· School-based programs

Information on these topics is available in the form of directories, maps, and research-based reports. The maps include overlays showing the locations of bus routes, schools, clinics, hospitals, landmarks, and parks.

The second publicly available database is the Youth Resources Information System. This database includes a wide range of lists, maps, and facts in support of youth, including:

· Employment services and career counseling
· Faith organizations
· Housing and neighborhood initiatives information
· Leisure activities
· Medical information including hospitals, clinics, early childhood health and nutrition programs, and substance abuse and treatment programs
· Public safety and government programs, including early childhood safety and violence prevention programs
· A variety of schools and educational programs
· A wide variety of service programs including parent groups, substance abuse programs, thrift shops, and volunteer agencies

It also includes links to the Web sites of the City of Memphis Schools, Shelby County Schools, the University of Memphis, and Southwest Tennessee Community College.

The third publicly available SUDS database is the Digital Geospatial Metadata System (DGMS). In effect, this "data about the data" is an index to all of the detailed data SUDS has collected in its various databases.

SUDS Reports

A variety of reports are available through the SUDS portal, some of which are public and others of which are private and secure. Currently, the public reports fall into three main categories: early childhood, selected local communities, and general interest.

SUDS Maps

A number of maps are available through the SUDS portal, using its GIS capability. The diverse subject matter of the various maps includes TennCare enrollment information, police patrol districts, various population demographics, housing data, and park locations.

SUDS Future

Data Expansion

SUDS will continue to expand as local governments and government agencies learn about its capabilities and advantages, and ask to participate. In addition, SUDS has data sharing arrangements with state agencies, including the Tennessee Department of Children's Services, the Tennessee Department of Health, and the Tennessee Healthy Weight Network. These agreements will greatly expand the secure databases in SUDS while providing historical data and current baseline data for research and data analysis. SUDS is further expanding with federal-level data through the U.S. Census Bureau (see below).

U.S. Census Pilot Project

The U.S. Census Bureau maintains an extensive, Web-accessible database of its massive decennial census files. The system includes a data retrieval tool called DataFerrett. The Census Bureau is interested in developing a network of local nodes with detailed, local data that will be accessible through its system. Through an agreement with the Brookings Institution and the Census Bureau, the University of Memphis and SUDS will become a pilot site to incorporate SUDS' local administrative and research data into the Census Bureau's nationwide system. Through this new agreement, DataFerrett will become available for use with our local data, providing state-of-the-art technology for researchers and policy makers.

Extended Data Integration

The prospects for research and developing new knowledge through data integration in this environment are boundless. Just being able to integrate diverse local data is exciting enough. Being able to integrate local data with state and federal data through the Census Bureau's nationwide system adds yet another very valuable dimension to the environment. Furthermore, the day may come when we are able to make integrated data comparisons between Memphis data and data pertaining to other cities that are also tied into the Census Bureau's system.