Mark L. Gillenson, Ph.D.
Professor of Management Information Systems
August 24, 2004
The Importance of Data Today
Take a drive through any modern U.S. city and think for a moment about how much of what you see is supported by data and
the information systems that store and process it. Automobiles are not only designed and manufactured with information systems
but also contain computers as critical operating components. Retail stores collect data and use information systems to manage
their inventories and track their sales. The ubiquitous cell phones that we carry depend on computer-enabled networks to function.
Indeed, every aspect of modern commerce, from mundane operations like bookkeeping and payroll in companies of every size
and description to more exotic transactions like Internet sales, is based on data and the information systems that support it.
Governments, too, at all levels use data and information systems. In fact, the beginning of the era of modern "data processing"
can be traced to the introduction of punched-card data tabulation equipment by the U.S. Census Bureau in the 1890s. At the
federal level, today, it would be hard to imagine the social security apparatus functioning without its data and computers.
At the state level, driver's license and motor vehicle registration systems are all highly automated. Even at the local level,
such requirements of modern living as property tax collections and city personnel management require data automation.
Focusing in on the local level, we find today's municipal and county governmental bodies stretched increasingly thin in terms
of the services they can reasonably provide. Some local government obligations, such as basic school, police, and fire services,
have to take priority. Social services and even non-critical activities in the basic services often take a back seat. At the same
time, new demands continue to flow from the populace, as well as from federal, state, and judicial mandates. However, it is often
difficult, if not impossible, to generate new revenue to keep up with existing obligations, much less meet these new demands
(and even basic services are subject to cuts in times of revenue difficulties). As a separate issue, governmental bodies at all
levels are notoriously deficient at coordinating their activities and resources, including their data, to gain mutually beneficial
efficiencies. Indeed, government could learn a lot from Wal-Mart and its suppliers, if only the revenues, incentives, and release
from the pressures of other obligations were there.
The need for more and better local government services is clear and present. Providing better health and educational services
to disadvantaged children; helping adults gain new skills through targeted educational opportunities and other support such as
day care for their children; enhancing emergency services to improve performance and coordination; the list is both extensive
and compelling. The question that we pose is whether today's technologies can be brought to bear on these issues without
necessitating unacceptable financial demands and without requiring the development of major new government organizations.
We believe that the time has come to apply modern data resource and information systems techniques to these pressing social
problems. Furthermore, with targeted external funding providing the resources, we believe that a creative solution to the
infrastructural problems standing in the way of such help is for local state universities to serve as agency-neutral facilitators
and coordinators of such efforts. With these as our beliefs and this as our goal, the University of Memphis' Center for Community
Criminology & Research (CCCR) created the Shared Urban Data System (SUDS). SUDS is the information and data engine of
CCCR. Its detailed data, analysis, and report generation features hold the promise of helping to solve the community's pressing
social problems.
One of the key concepts of SUDS is data sharing. Since data sharing plays such a central role in the SUDS environment, we
shall discuss it before proceeding on to describe SUDS in more detail.
Sharing Data
When several applications or users in an information systems environment require particular data, at a very basic level there
are two choices for organizing it. One choice is to have copies of the data stored in different files or databases that each support
one or several of the applications or users. The other choice is to have the data stored in one database and to have all of the
applications or users share it. This is by no means a trivial choice within a single company, and it is more challenging still in a
multi-entity environment such as that of multiple local governments and government agencies.
Historically, database management systems vendors and professionals have been preaching the benefits of shared data since the
early 1970s. More recently, beginning in the 1990s, Enterprise Resource Planning (ERP) systems vendors, such as SAP, PeopleSoft,
and others, have built systems and promoted system philosophies that are very much grounded in the idea of a central, shared
database. The question here is, what are the benefits of sharing data in general and in the local government environment, in
particular? An additional question is whether there are any impediments to doing so.
Benefits
The benefits of sharing data can be viewed from both the technical and practical points of view. We shall examine both.
Technical
As noted above, data in a non-shared environment inevitably is stored redundantly in the databases of each application that
requires it. There are three essential problems with redundant data. First, it obviously takes up more storage space than if it is
stored in just one place and is shared by all applications that need it. Second, when the data changes, time and effort must be
expended to update it in every place that it is stored. Third, if for any reason an update is applied correctly in one database
but incorrectly, or not at all, in another database, then the data becomes inconsistent across
the databases. This eventuality, which is known as a data integrity problem, can happen within the databases of a single local
government or government agency. However, it is especially likely to happen across multiple local governments or government
agencies due to varying levels of sophistication, priorities, and even knowledge that changes to the data have taken place.
When data is shared, it is, in the optimum case, stored in one place and is accessible to all who require it and have a legitimate
need-to-know (if the data is not public). This greatly reduces the storage space required for the data and it dramatically reduces
the effort needed to update it. Furthermore, storing the data only once eliminates the potential data integrity problems
caused by errors or omissions in updating the same data held in multiple databases.
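The contrast between redundant and shared storage can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two agencies, the resident record, and the function name find_inconsistencies are all hypothetical and do not describe any actual SUDS data or software.

```python
# Hypothetical illustration: the agencies and the record below are invented.

def find_inconsistencies(*databases):
    """Return the keys whose records differ between the first copy and any other."""
    conflicts = set()
    first, *rest = databases
    for key, record in first.items():
        for other in rest:
            if key in other and other[key] != record:
                conflicts.add(key)
    return conflicts

# Redundant storage: each agency keeps its own copy of the same record.
health_dept = {"resident_17": {"address": "100 Main St"}}
school_board = {"resident_17": {"address": "100 Main St"}}

# The resident moves, but only one agency learns of the change.
health_dept["resident_17"]["address"] = "250 Oak Ave"
redundant_conflicts = find_inconsistencies(health_dept, school_board)

# Shared storage: one copy, and both agencies read from that same copy.
shared = {"resident_17": {"address": "100 Main St"}}
health_view = shared
school_view = shared
shared["resident_17"]["address"] = "250 Oak Ave"
shared_conflicts = find_inconsistencies(health_view, school_view)
```

In the redundant case the missed update leaves the two copies in disagreement, which is the data integrity problem described above. In the shared case a single update is seen by every user, so there is nothing to reconcile.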
Another issue regarding shared versus redundant data is data security for sensitive data. The more places that sensitive data
is stored redundantly, the more opportunities there are for hackers to compromise it. Of course, the other side of the coin is
the assumption that if the sensitive data is stored in one place and is shared, the security systems in use at the storage location
are trustworthy and robust enough to truly protect the shared data.
Practical
In a sense, it is difficult to separate the technical from the practical benefits. After all, isn't improved data security a "practical"
benefit? The approach that we take here is to show how the application of the foregoing technical benefits of shared data adds
value to the lives and work of the following constituencies.
To the Policy Maker: The classic problem that corporate executives have with redundant data - and this certainly applies
to public policy makers, as well - is that of trust. The all too common scenario is that of the meeting in which several people make
presentations to corporate executives. The presentations should all be based on a common set of figures but it turns out that at
least some of the presentations are based on figures that are not consistent with those of the others. How could that happen?
Simple! The presentations were based on redundant data occurring in different databases that were not consistently updated.
How can a policy maker make an informed decision if he or she doesn't know which set of data to believe?
To the Worker: The IT worker benefits from not having to maintain multiple redundant databases. The workers who
are the users of the data benefit from knowing that the data they are working with is accurate and is the same data that their
counterparts in other agencies are using.
To the Citizen: The citizen benefits in several ways. First, if the policy makers are working with accurate, consistent
data, then hopefully the policies they make will be of greater benefit to the citizens. Second, there will be cost savings across
local governments and government agencies in not having to maintain and update the same data in multiple databases. Third,
the improved security of a single copy of the data being stored in a highly secure IT environment is important in this era of data
proliferation.
Impediments
There are some impediments to sharing data which, while not insurmountable, must be understood.
Application
The data owners, in this case the local governments and government agencies, must be willing to give up supervisory control
of the data. Further, they must believe and trust that the people and equipment at the shared data site are up to the task of
properly managing the data. For non-public data, there must be a process in place to rule on future requests for access to the
data.
Technical
The shared data facility must be capable of providing proper data security. It must be capable of providing access to the data
to all who have permission to access it. Furthermore, it must provide access at a performance level that is acceptable to all of
the users. The absence of any of these capabilities must be considered a serious impediment to the shared data effort.
Data Integration
When diverse data is collected and stored in files in a database, it can also be integrated. Data integration refers to the linking of
related data from different files or databases. For example, consider the following files:
· A file showing basic demographics, such as population, income levels, educational levels, head of household characteristics
including employment status, number of children in a household, etc., all keyed to zip code or census tract.
· A file showing the location of day care centers, including zip code or census tract.
· A file showing the location of health care facilities including zip code or census tract.
These files can be integrated by matching the common zip codes or census tract numbers found in the records in the different files.
By doing so, one might discover whether low-income neighborhoods with relatively large numbers of children per household are
over-served, properly served, or underserved by day care centers or by health care facilities.
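The matching step described above can be sketched as follows. This is a minimal illustration, not the SUDS implementation: the zip codes, field names, figures, and the income and household thresholds are all invented assumptions, and the function names integrate and underserved_by_day_care are hypothetical.

```python
from collections import Counter

# All values are invented for illustration; they are not SUDS data.
demographics = [
    {"zip": "00001", "median_income": 21000, "children_per_household": 2.4},
    {"zip": "00002", "median_income": 64000, "children_per_household": 1.1},
    {"zip": "00003", "median_income": 23500, "children_per_household": 2.1},
]
day_care_centers = [
    {"name": "Center A", "zip": "00001"},
    {"name": "Center B", "zip": "00001"},
    {"name": "Center C", "zip": "00002"},
]
health_facilities = [
    {"name": "Clinic X", "zip": "00002"},
    {"name": "Clinic Y", "zip": "00003"},
]

def integrate(demographics, day_care_centers, health_facilities):
    """Link the three files on their common zip code field."""
    day_care_counts = Counter(row["zip"] for row in day_care_centers)
    health_counts = Counter(row["zip"] for row in health_facilities)
    return [
        {
            **row,
            # A Counter returns 0 for zip codes with no matching facility.
            "day_care_centers": day_care_counts[row["zip"]],
            "health_facilities": health_counts[row["zip"]],
        }
        for row in demographics
    ]

def underserved_by_day_care(integrated, income_ceiling=25000, min_children=2.0):
    """Zip codes that are low income, have many children, and have no day care."""
    return [
        row["zip"]
        for row in integrated
        if row["median_income"] <= income_ceiling
        and row["children_per_household"] >= min_children
        and row["day_care_centers"] == 0
    ]

integrated = integrate(demographics, day_care_centers, health_facilities)
flagged = underserved_by_day_care(integrated)
```

With this sample data, only zip code 00003 is flagged: it falls under the assumed income ceiling, exceeds the assumed children-per-household level, and has no day care center. What counts as "underserved" is a policy judgment; the thresholds here are placeholders.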
The Shared Urban Data System (SUDS)
The Shared Urban Data System (SUDS) is an ongoing project of the University of Memphis' Center for Community Criminology
& Research (CCCR), which is housed in the university's Department of Criminology & Criminal Justice. It is intended to provide the benefits of
data sharing and broad data access to local government and government agencies in the Memphis metropolitan area.
The SUDS Concept
SUDS is a Web-based information security portal and shared database that resides on the CCCR's network server. It provides
secure Internet access to community-wide data in the form of queries and maps. SUDS provides multiple databases (some
publicly available, others private and password protected), analysis tools, and cross-platform support
from a single data management console. SUDS' information integration technology provides secure, read-only access to a
collection of data intended to answer research questions and provide tracking of social, demographic, crime, and health trends.
Goals of the system include:
· To present users with better information.
· To improve accessibility to integrated data.
· To provide state-of-the-art analytical tools such as GIS to data users.
· To enable informed decision-making and research.
· To improve data consistency from one system to another.
· To improve data quality through timely updating, standardized formatting, and maintenance of high quality metadata.
The SUDS Project Team
The SUDS project is housed within the University of Memphis' Department of Criminology & Criminal Justice and its Center
for Community Criminology & Research (CCCR). It is led by Department Chair Prof. Richard Janikowski and by its director,
Ms. Cindy Martin. The CCCR staff that the SUDS project can draw from includes:
Research Staff:
· Statisticians
· Survey Specialists
· Social Science Researchers
· Graduate Research Assistants
Technical Staff:
· Database Coordinator / Network Specialist
· Database Programmers
· Geographic Information System (GIS) Programmers
· GIS Technicians
· Graphics and Web Page Designers
· Consulting Faculty
· Graduate Research Assistants (technically experienced students who serve as assistant GIS technicians, assistant database
programmers, and assistant web page developers)
The Availability and Accessibility of Data
One of the tremendous advantages of SUDS is its ability to make otherwise inaccessible data available to the public or to
a select set of private users. The simple fact is that many of the local governments and government agencies in a
metropolitan area, and Shelby County is no exception, do not have the technical capacity to put their data up on a Web
site with or without security controls. In many cases such data does not even exist in a digital form and it literally sits
in a filing cabinet. SUDS serves as a community resource through which this data can be made available and accessible
through the Internet.
The Neutrality of the University Setting
An urban, state university, such as the University of Memphis, is the perfect, neutral setting for a database that is shared
by local governments and government agencies, for several reasons. As an arm of the state government, the university
should be interested in helping local governments and may even consider that it has a mandate to do so. Also, as an arm
of the state government, the university should be interested in social issues concerning the citizens in its region. The
university has no incentive to favor one local government or government agency over another. There is no profit motive.
The university has significant technical and personnel resources to draw upon for such an effort, assuming that the effort is
funded by grants.
SUDS Projects and Grants
The SUDS Public Portal
SUDS currently has three publicly available databases on its Web site. The first of these is the First Years database.
Created for the Community Institute for Early Childhood, it is a "compilation of services offered to expectant mothers
and young children through 5 years of age in Shelby County." Services listed include:
· Childcare
· Prenatal services
· Advocacy programs
· Nutritional programs
· Recreational outlets
· Cultural outlets
· Safety programs
· Violence prevention/intervention programs
· Health services
· Household and baby goods
· Parenting skills/child development education
· School-based programs
Information on these topics is available in the form of directories, maps, and research-based reports. The maps include
overlays showing the locations of bus routes, schools, clinics, hospitals, landmarks, and parks.
The second publicly available database is the Youth Resources Information System. This database includes a wide range
of lists, maps, and facts in support of youth, including:
· Employment services and career counseling
· Faith organizations
· Housing and neighborhood initiatives information
· Leisure activities
· Medical information including hospitals, clinics, early childhood health and nutrition programs, and substance abuse
and treatment programs.
· Public safety and government programs, including early childhood safety and violence prevention programs.
· A variety of schools and educational programs
· A wide variety of service programs including parent groups, substance abuse programs, thrift shops, and volunteer
agencies.
It also includes links to the Web sites of the City of Memphis Schools, Shelby County Schools, the University of Memphis, and
Southwest Tennessee Community College.
The third publicly available SUDS database is the Digital Geospatial Metadata System (DGMS). In effect, this "data about
the data" is an index to all of the detailed data SUDS has collected in its various databases.
SUDS Reports
A variety of reports are available through the SUDS portal, some of which are public and others of which are private
and secure. Currently, the public reports fall into three main categories: early childhood, selected local communities,
and general interest.
SUDS Maps
A number of maps are available through the SUDS portal, using its GIS capability. The diverse subject matter of the
various maps includes TennCare enrollment information, police patrol districts, various population demographics, housing
data, and park locations.
SUDS Future
Data Expansion
SUDS will continue to expand as local governments and government agencies learn about its capabilities and advantages,
and ask to participate. In addition, SUDS has data sharing arrangements with state agencies, including the Tennessee
Department of Children's Services, the Tennessee Department of Health, and the Tennessee Healthy Weight Network.
These agreements will greatly expand the secure databases in SUDS while providing historical data and current baseline
data for research and data analysis. SUDS is further expanding with data at the federal level with the U.S. Census Bureau
(see below).
U.S. Census Pilot Project
The U.S. Census Bureau maintains an extensive, Web-accessible database of its massive decennial census files. The
system includes a data retrieval tool called DataFerrett. The Census Bureau is interested in developing a network of
local nodes with detailed, local data that will be accessible through its system. Through an agreement with the Brookings
Institution and the Census Bureau, the University of Memphis and SUDS will become a pilot site to incorporate SUDS' local
administrative and research data into the Census Bureau's nationwide system. Through this new agreement, DataFerrett
will become available for use with our local data, providing state-of-the-art technology for researchers and policy makers.
Extended Data Integration
The prospects for research and developing new knowledge through data integration in this environment are boundless.
Just being able to integrate diverse local data is exciting enough. Being able to integrate local data with state and federal
data through the Census Bureau's nationwide system adds yet another very valuable dimension to the environment.
Furthermore, the day may come when we are able to make integrated data comparisons between Memphis data and
data pertaining to other cities that are also tied into the Census Bureau's system.