Title: The Case for Software Infrastructure Maintenance
1 The Case for Software Infrastructure Maintenance
- Jim Horning
- Chief Scientist
- Information Systems Security Operation
- SPARTA, Inc.
- Sonoma State University, November 13, 2008
2 Overview
- Definitions
- Some ancient history
- Some recent history
- Maintenance of civil infrastructures
- Maintenance of software
- Two things that are not software maintenance
- SCADA
- A final puzzle for you
- References
3 Infrastructure
- An underlying base or foundation, especially for
an organization or system.
- The basic public works of a city or subdivision,
including roads, bridges, sewer and water
systems, drainage systems, and essential public
utilities.
- The roads, bridges, rail lines, and similar
public works that are required for an industrial
economy, or a portion of it, to function.
- Throughout history, infrastructure systems and
services have continuously evolved in both
technology and organization. Indeed, in many
instances, social scientists measure the level of
civilization or advancements of a society on the
basis of the richness and articulation of its
infrastructure systems. One can easily
distinguish at least fifty systems and subsystems
that constitute a city's infrastructure, ranging
from large-scale transportation and water
projects to neighborhood medical clinics and
libraries.
- A computer system's infrastructure would include
the hardware, the operating system, database
management system, communications protocols,
compilers and other development tools; more
generally, any element implicitly relied on in
the provision of a service.
4 Maintenance
- The work of keeping something in proper
condition; upkeep.
- Accounting: Periodic expenditures undertaken to
preserve or retain an asset's operational status
for its originally intended use.
- Military: The routine recurring work required to
keep a facility in such condition that it may be
continuously used, at its original or designed
capacity and efficiency for its intended purpose.
Includes inspection, testing, classification as
to serviceability, adjustment, servicing,
recovery, evacuation, repair, overhaul, and
modification.
- Software: The recurring updating of programs in
order to continue to operate as intended in a
changing environment.
5 Ancient history: Key Roman Infrastructures
- Roads
- Agriculture and food stores
- Aqueducts
- Photo from Assante
6 Timeline of Roman aqueducts (Assante)
7 Lack of maintenance (Assante)
8 Recent history: Civil infrastructures
- Much has been said about the neglect and
consequent deterioration of America's civil
infrastructure: the publicly financed or regulated
structures and facilities that support essential
functions such as transportation (land, water,
and air), water supply and wastewater treatment,
power, and waste disposal.
- There have been many costly infrastructure
failures that could have been prevented by timely
maintenance.
- American engineers have been warning about
under-investment in infrastructure maintenance
for at least a quarter-century (e.g., America in
Ruins: The Decaying Infrastructure, 1983).
- But less has been done than said.
9 New Orleans after Hurricane Katrina
10 Hurricane Katrina, Aug. 29, 2005
- Cascading problems
- Wind
- High water
- Levees collapsed
- Massive flooding
- Electricity lost
- Pumps failed
- Telephones largely failed
- Water and sewer systems largely failed
- Hospitals, schools, police, transportation,
libraries, banks, ...
- Each collapsed infrastructure made restoring
others harder
- Over 1.5 K dead
- Over $100 G in Federal aid alone
- Over 100 K trapped in city during storm; over
250 K refugees
- Complete recovery may take 20 years
11 Interstate 35W bridge collapse, Aug. 1, 2007
New York Times photo
13 Interstate 35W bridge collapse, Aug. 1, 2007
- Multiple causes
- Faulty design
- Gusset plates were too thin for design load
(½ in. instead of 1 in.)
- Structure was fracture critical
- Inspection two years prior failed to recognize
gusset plate buckling that was visible in
photographs
- Deferred maintenance (rated in poor condition
for 17 straight years)
- Bridge overloaded with construction equipment and
materials
- 13 killed, 145 injured
- $38 M compensation package for victims
- Expedited replacement of bridge cost $400 M
- Replacement had been scheduled for 2020-25
- See http://www.transportation.org/sites/bridges/docs/I-35%20Bridge%20Collapse%20and%20Response.pdf
for details and many graphic photos
14 My argument
- Civilization and infrastructure are intimately
intertwined.
- Rising civilizations build and benefit from their
infrastructures in a virtuous cycle.
- As civilizations decline, their infrastructures
decay.
15
- Dependence on critical infrastructures is
increasing globally.
- This is true not only of information systems and
network services, but also of many others that we
rely on for our livelihoods and well-being.
- These critical infrastructures are becoming more
interrelated, and more heavily dependent on
information technology.
- People demand ever more and better services, but
understand ever less about what it takes to
provide those services.
16
- The failure of a critical infrastructure can
cascade into others.
- The very synergies among infrastructures that
allow progress to accelerate are a source of
positive feedback, allowing initial failures to
escalate into much larger long-term problems
involving many different infrastructures.
- Remediating after a collapse often involves many
secondary costs that were not foreseen.
- The more different infrastructures that fail
concurrently, the more difficult it becomes to
restore service in any of them.
- Restoring a lost ecosystem generally costs much
more than the sum of the costs of restoring each
element separately.
17 The maintenance trade-off
- Engineers know that physical infrastructures
decay without regular maintenance, and they
prepare for aging (e.g., corrosion and erosion)
that requires inspections and repairs.
- Proper maintenance is generally the cheapest form
of insurance against failures.
- With rare exceptions, such as spacecraft, where
it's not feasible.
- However, it has a definite present cost that must
be balanced against the unknown future cost of
possible failures.
18 Software maintenance
- Although computer software does not erode or
corrode, it is subject to incompatibilities and
failures caused by changing environments,
changing user practices, and changes in
underlying hardware and software.
- Therefore, it requires maintenance.
- Yet the costs of software maintenance are often
ignored in the planning, design, construction,
and operation of critical systems.
- Incremental upgrades to software are error-prone
and complicate maintenance.
19 Software maintenance examples
- Y2K
- In the '60s it seemed perfectly reasonable to use
two digits in dates to encode the year.
- Who knew the COBOL software would still be used
in '00?
- Global Positioning System satellite 32
- In the November 2008 issue of BoatU.S. magazine,
there's a reference to a new GPS satellite being
switched on. It uses the identifier PRN 32,
which causes some Northstar GPS units to become
confused and shut down. Fortunately, there are
firmware updates available, though in some cases
they cost money. Unfortunately, most boaters
wouldn't know a firmware update if they hooked
one, so there will undoubtedly be accidents and
other problems, and GPS units acting flakey
(they only crash when that particular satellite
is in view).
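The Y2K example above can be sketched in a few lines. This is only an illustration, not code from any real system: the function names and the pivot value of 50 are hypothetical. It contrasts the legacy assumption that every two-digit year lies in the 1900s with "windowing", one commonly used repair.

```python
def expand_naive(yy: int) -> int:
    """Legacy assumption: every two-digit year belongs to the 1900s."""
    return 1900 + yy


def expand_windowed(yy: int, pivot: int = 50) -> int:
    """Windowing repair: years below the (hypothetical) pivot are read as 20xx."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year")
    return (2000 if yy < pivot else 1900) + yy


def years_between(start_yy: int, end_yy: int, expand) -> int:
    """Elapsed years between two stored two-digit years."""
    return expand(end_yy) - expand(start_yy)


# A record created in '65 and examined in '00:
naive_age = years_between(65, 0, expand_naive)        # 1900 - 1965 = -65
windowed_age = years_between(65, 0, expand_windowed)  # 2000 - 1965 = 35
```

Windowing only postpones the problem: any fixed pivot eventually becomes wrong, so the repair is itself something that will need maintenance.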
20 Two things that I don't call Software Maintenance
- Adding new functionality: This is Software
Extension.
- Adding a new wing to a building is not
maintenance.
- Patching bugs: This is just Belated Quality
Assurance (BQA).
- November 11, 2008 (IDG News Service): Some
security patches take time: seven and a half
years, in fact, if you count the time it's taken
Microsoft Corp. to patch a security issue in its
SMB (Server Message Block) service, which was
fixed Tuesday. This software is used by Windows
to share files and print documents over a
network.
- In a blog posting, Microsoft acknowledged that
"Public tools, including a Metasploit module, are
available to perform this attack." Metasploit is
an open-source tool kit used by hackers and
security professionals to build attack code.
According to Metasploit, the flaw goes back to
March 2001, when a hacker named Josh Buchbinder
(a.k.a. Sir Dystic) published code showing how
the attack worked. Ben Greenbaum, research
manager at Symantec Corp., said the flaw may have
first been disclosed at Defcon 2000, by Christien
Rioux (a.k.a. Dildog), chief scientist at
Veracode Inc.
- Whoever discovered the flaw, Microsoft seems to
have taken an unusually long time to fix it.
21 Neglecting maintenance
- Creating maintainable systems is difficult and
requires significant foresight, appropriate
budgets, and skilled individuals.
- Neglect is the inertially easy path; maintenance
requires recurring effort, talent, and funding.
- But appropriate investments in maintenance, and
in maintainability, could yield enormous long-term
benefits, through reliability, robustness against
attack, ease of use, and adaptability to new
needs.
22 Supervisory Control and Data Acquisition Systems
- SCADA refers to a system that collects data from
various sensors at a factory, plant, or other
remote location and then sends it to a computer
system that uses the data to manage and control a
device, a facility, or a collection of
facilities.
- SCADA is used broadly to describe control and
management solutions in a wide range of
industries, including Water Management Systems,
Electric Power, Traffic Signals, Mass Transit
Systems, Environmental Control Systems, and
Manufacturing Systems.
- This is where software and civil infrastructure
meet (or collide).
- Virtually all modern SCADA systems are controlled
by software.
- For operational efficiency, more and more SCADA
systems are being connected to the Internet.
23 (In)security
- Insecure networked computers provide vandals easy
access to the Internet, where spam,
denial-of-service attacks, and botnet acquisition
and control constitute an increasing fraction of
all traffic.
- They directly threaten the viability of one of
our most critical modern infrastructures (the
Internet), and indirectly threaten all the
infrastructures connected to it via SCADA.
- "Although many technological advances are
emerging in the research community, those that
relate to critical systems seem to be of less
interest to the commercial development
community." Risks in Retrospect, Comm. ACM,
July 2000
- Our networked computers, in turn, depend on
various other critical infrastructures:
electricity, telecommunications, ...
24 A final puzzle for you
- Why do tomorrow's software engineers receive so
little education about
- designing for maintainability,
- preparing for software aging,
- maintaining legacy software, and
- knowing when and how to terminate decrepit legacy
software systems?
25 To Dig Deeper: Civil Infrastructures
- Infrastructure Protection in the Ancient World,
Michael J. Assante,
http://www.inl.gov/nationalsecurity/energysecurity/d/infrastructure_protection_in_the_ancient_world.pdf
- America in Ruins: The Decaying Infrastructure,
Pat Choate and Susan Walker, Duke University
Press, 1983.
- Cities and Their Vital Systems: Infrastructure
Past, Present, and Future, Jesse H. Ausubel and
Robert Herman (eds.), National Academies Press,
1988.
- Civil Engineering: Public Works/Infrastructure,
Library of Congress, 1991.
http://www.loc.gov/rr/scitech/tracer-bullets/civilengtb.html
- America's Ailing Cities: Fiscal Health and the
Design of Urban Policy, Helen F. Ladd and John
Yinger, Johns Hopkins University Press, 1991.
- It's Time to Rebuild America, Felix G. Rohatyn
and Warren Rudman, Washington Post, Dec. 13,
2005.
- The Decaying Infrastructure of Complex Society,
2007.
http://deconstructingthemanifest.blogspot.com/search/label/Complex%20Society
- 4 Things the Roman Aqueducts Can Teach Us About
Securing the Power Grid, Michael Assante and Mark
Weatherford, CSO Security and Risk, 2005.
http://www.csoonline.com/article/217014
26 To Dig Deeper: Software
- International Conference on Software Maintenance
(ICSM)
- http://www.icsm2008.org
- European Conference on Software Maintenance and
Reengineering (CSMR)
- http://www.csmr2008.uwaterloo.ca/
- Risks of Neglecting Infrastructure, Jim Horning
and Peter Neumann
- http://www.csl.sri.com/users/neumann/insiderisks08.html
- Communications of the ACM, Inside Risks
- http://www.csl.sri.com/users/neumann/insiderisks.html
- Confessions of a Used Program Salesman:
Institutionalizing Software Reuse, Will Tracz,
Addison Wesley Longman, 1995.
- Risks Digest
- http://www.risks.org
- Computer-Related Risks, Peter G. Neumann,
Addison-Wesley/ACM Press, 1995.
- Illustrative Risks to the Public in the Use of
Computer Systems and Related Technology
- http://www.csl.sri.com/users/neumann/illustrative.html