Visualization and Analysis of Open Source
Software Evolution using An Evolution Curve Method
  • Dr. Robertas Damaševičius
  • Software Engineering Department,
  • Kaunas University of Technology
  • Studentu 50-415, Kaunas, Lithuania

Context and Problem
  • Software systems are
    • designed, constructed and used by people
    • components in larger socio-technical systems
  • Software design is
    • a social process embedded within organizational
      and cultural structures
    • influenced by social processes such as programmer
      collaboration in teams
  • Open source software systems
    • Free to use
    • Free availability of source code
    • Developed by many programmers
    • Continuously evolve
  • Aim: analysis of open source software evolution
    using metrics

What is software evolution?
  • Definition
    • a continuing process in time during which some
      essential software properties are changed
  • Activities
    • modification, adaptation, maintenance, and
      other activities which occur after the delivery
      of the first operational release to the users
  • Importance
    • costs devoted to system maintenance and evolution
      account for more than 90% of total software costs
      (Erlikh, 1990)

Forces and factors of open source software
  • Evolution of open source systems
    • less strict control and management model
    • usually started by a single developer (seed)
    • attracted users become co-developers
    • governed by the needs of users and spontaneous
      collaboration of co-developers
  • Evolution mechanisms
    • natural selection, competition
    • variation-increasing and variation-decreasing
    • influenced by psychological, intellectual, social
      and cultural, economic and business factors

Software metrics
  • Common
    • Source lines of code
    • Cyclomatic complexity
    • Halstead metrics
    • Number of classes and interfaces
    • R.C. Martin's software package metrics
    • Cohesion, Coupling
  • Specific software evolution metrics
    • SDI metric
    • Lmetric
    • AICC metric
    • G-metric
  • Software development models
    • Statistical models
    • Rayleigh model
    • Halstead's Software Science model
    • COCOMO model

Lehman's Laws of Software Evolution
  • Formulated by M.M. Lehman in the 1980s
  • Law of Continuing Change
  • Law of Increasing Complexity
  • Law of Statistically Smooth Growth
  • Law of Organisational Stability
  • Law of Conservation of Familiarity
  • Law of Continuing Growth
  • Law of Declining Quality
  • Law of Feedback System
  • Evolution forces
    • Growth
    • Maintenance

Transition-based model of evolution
  • Stages: many, often overlapping
  • Transitions: breakpoints between stages, which
    represent significant changes. Transitions occur
    because as a system evolves, its structure must
    be regularly adapted to the changing requirements
    and environment
  • Gradual change: a slow process of incremental
    change caused by accumulating maintenance steps
    or gradual decay
  • Sudden change: significant changes in the
    evolving system or in the process by which it is

Information-theoretic methods
  • Shannon entropy
    • A measure of the uncertainty associated with a
      random variable.
    • The information source generates a series of
      symbols $x_i$ belonging to an alphabet of size $N$
      according to a known probability distribution
      $p(x_i)$; the entropy function $H$ of a sequence $X$
      can be defined as
      $H(X) = -\sum_{i=1}^{N} p(x_i) \log_2 p(x_i)$
    • High entropy: higher complexity of the system
    • Low entropy: there are some repeated patterns of
      source code; code maintenance is required
  • Kolmogorov Complexity
    • Measures the complexity (i.e., information
      content) of an object by the length of the
      smallest program that generates it.
    • Kolmogorov Complexity $K_f(x)$ of an object $x$ in the
      description system $f$ is the length of the
      shortest program capable of producing $x$
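
Both measures can be illustrated with a minimal Python sketch. It makes two assumptions that the slides do not state: each byte of the source text is treated as one symbol, and the zlib-compressed length stands in for Kolmogorov Complexity, which is not computable exactly. The function names and the sample inputs are hypothetical.

```python
import math
import zlib
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per symbol, treating each byte as a symbol."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data.

    Assumption: used only as a computable upper-bound estimate of
    Kolmogorov Complexity, not as the measure itself.
    """
    return len(zlib.compress(data, 9))


if __name__ == "__main__":
    # Hypothetical inputs: a highly repetitive snippet and a more varied one.
    repetitive = b"int x = 0;\n" * 200
    varied = bytes(range(256)) * 8
    for name, sample in (("repetitive", repetitive), ("varied", varied)):
        print(name, shannon_entropy(sample), compressed_size(sample))
```

Repeated patterns lower both numbers, which matches the low-entropy case described above.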

Evolution curve method (1)
  • Motivation: the addition of new features to a
    software system leads to the change of basic
    software characteristics (complexity/entropy) in
    the system.
  • Idea: use the change of software size and
    complexity as a means to determine different
    stages of evolution of a software system
  • Inspiration: Z-curve [1] and DNA walk [2] methods used
    in analyzing complex genetic sequences

[1] R. Zhang, C.T. Zhang. Z Curves, an Intuitive
Tool for Visualizing and Analyzing DNA sequences.
J. Biomol. Struc. Dynamics 11, 767-782, 1994.
[2] S. Paxia, A. Rudra, Y. Zhou, B. Mishra. A Random
Walk down the Genomes: DNA Evolution in VALIS.
IEEE Computer 35(7):73-79, 2002.

Evolution curve method (2)
  • E-curve is composed of a series of nodes
    whose coordinates are $x_i$ and $y_i$ ($i = 1, 2, ..., N$),
    where $N$ is the number of versions
    of the analyzed software system.
  • The nodes are connected sequentially with
    straight segments.
  • The coordinates $x_i$ and $y_i$ are calculated from
    • the Kolmogorov Complexity of the i-th
      version of a software system
    • the Shannon entropy of the i-th version
      of a system

Evolution curve method (3)
  • Two dimensions of the Evolution curve
    • x (relative information content) and
    • y (relative complexity)
  • Represent two independent (orthogonal)
    characteristics of a software system
  • x-dimension: amount of information contained in a
    software system; an estimation of software size
  • y-dimension: information entropy of a software
    system; an estimation of software complexity
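
The construction of the curve nodes can be sketched in Python under stated assumptions. The slides' own coordinate formula is not preserved in this transcript, so the sketch assumes that "relative" means "divided by the value for the first version", that zlib-compressed length approximates Kolmogorov Complexity, and that entropy is computed over bytes. The names `e_curve` and `byte_entropy` are hypothetical.

```python
import math
import zlib
from collections import Counter
from typing import List, Tuple


def byte_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte."""
    total = len(data)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(data).values())


def e_curve(versions: List[bytes]) -> List[Tuple[float, float]]:
    """Return one (x, y) node per version, in release order.

    Assumption: x is the compressed size and y the byte entropy of a
    version, each normalised by the first version's value. This is a
    stand-in, not the formula from the original slides.
    """
    if not versions or not versions[0]:
        raise ValueError("need at least one non-empty version")
    sizes = [len(zlib.compress(v, 9)) for v in versions]
    entropies = [byte_entropy(v) if v else 0.0 for v in versions]
    if entropies[0] == 0.0:
        raise ValueError("first version has zero entropy; cannot normalise")
    return [(k / sizes[0], h / entropies[0])
            for k, h in zip(sizes, entropies)]


if __name__ == "__main__":
    # Hypothetical "versions": concatenated source text of three releases.
    v1 = b"int main() { return 0; }\n"
    v2 = v1 + b"int add(int a, int b) { return a + b; }\n"
    v3 = v2 + b"int sub(int a, int b) { return a - b; }\n"
    for node in e_curve([v1, v2, v3]):
        print(node)
```

Connecting consecutive nodes with straight segments gives the curve; the first node is (1.0, 1.0) by construction under this normalisation.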

Software evolution stages
  • Software Growth: system is actively developed
  • Software Maintenance: system becomes simpler,
    often at a cost of its size
  • Software Improvement: system becomes more complex
    and generic
  • Software Shrink: functionality of a system is

Trends of Evolution curve
  • Actively developed systems: long upward trends of
  • Mature, stable systems: long downward trends of

Case studies
  • Source: SourceForge
  • 7-zip
    • Archiver
    • 82 versions, 5 years, 160K LOC
  • Grip
    • CD player/ripper
    • 36 versions, 14K LOC
  • eMule
    • P2P file sharing client

Case study: eMule
  • eMule
    • one of the biggest P2P file sharing clients
    • coded in Microsoft Visual C++ using MFC
    • Free software, released under the GNU GPL
    • Source code first released at version 0.02 on
      July 6, 2002
    • Latest release contains 222,680 lines of code
    • Actively developed by 5 developers
    • Current development status is Production/Stable
  • For analysis, 68 versions of eMule source code
    were used

eMule Entropy
  • [Figure: eMule entropy; marked versions: 015a, 030a, 018a]

eMule Size
  • [Figure: eMule size with fitted curve $y = A + Bx + Cx^2$;
    $A = 7676.17$, $B = 4324.67$, $C = 177.488$, $r = 0.9935$]

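A quadratic fit of the form y = A + Bx + Cx^2 can be reproduced with ordinary least squares. The sketch below is an illustration, not the tool used for the slide: the function names are hypothetical, the sample sizes are made-up numbers rather than eMule data, and r is computed as the square root of 1 - SS_res/SS_tot, which is one common definition and an assumption here.

```python
import math
from typing import Sequence, Tuple


def fit_quadratic(xs: Sequence[float],
                  ys: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares fit of y = A + B*x + C*x^2; returns (A, B, C)."""
    # Power sums that make up the 3x3 normal equations.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    m = [[s[0], s[1], s[2], t[0]],
         [s[1], s[2], s[3], t[1]],
         [s[2], s[3], s[4], t[2]]]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    a, b, c = (m[i][3] / m[i][i] for i in range(3))
    return a, b, c


def fit_correlation(xs: Sequence[float], ys: Sequence[float],
                    coeffs: Tuple[float, float, float]) -> float:
    """Correlation index sqrt(1 - SS_res / SS_tot) of the fitted curve."""
    a, b, c = coeffs
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (a + b * x + c * x * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return math.sqrt(max(0.0, 1.0 - ss_res / ss_tot))


if __name__ == "__main__":
    # Hypothetical version indices and sizes, for illustration only.
    xs = [1, 2, 3, 4, 5, 6, 7, 8]
    ys = [12, 17, 25, 31, 44, 52, 67, 79]
    coeffs = fit_quadratic(xs, ys)
    print(coeffs, fit_correlation(xs, ys, coeffs))
```
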
eMule's Evolution curve

What does the changelog say?
  • Software evolution process can be divided into 4 stages
    • software growth: the size and complexity of
      developed software are increasing
    • software maintenance: the aim is to contain
      complexity and fix software bugs
    • software improvement: the aim is to contain
      software system size at a cost of increasing complexity
    • software shrink: both software size and its
      complexity are trimmed
  • Evolution curve method can
    • identify software evolution stages
    • identify the initial development status of the
      analyzed software system
    • actively developed systems show long growth
    • mature systems show maintenance and improvement
  • Is independent from software implementation
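
The four stages can be read off the direction of each segment between consecutive curve nodes. The Python sketch below is one possible reading, not the author's procedure: it assumes x tracks size and y tracks complexity, it takes "contain size at a cost of increasing" to mean rising complexity, and it treats a zero change as non-negative. The function names are hypothetical.

```python
from typing import List, Tuple


def classify_stage(dx: float, dy: float) -> str:
    """Map the change in size (dx) and complexity (dy) to a stage name."""
    if dx >= 0 and dy >= 0:
        return "growth"        # size and complexity both increase
    if dx >= 0:
        return "maintenance"   # complexity contained while size grows
    if dy >= 0:
        return "improvement"   # size contained while complexity grows
    return "shrink"            # size and complexity both trimmed


def stages(nodes: List[Tuple[float, float]]) -> List[str]:
    """Classify every segment between consecutive (x, y) nodes."""
    return [classify_stage(x2 - x1, y2 - y1)
            for (x1, y1), (x2, y2) in zip(nodes, nodes[1:])]


if __name__ == "__main__":
    # Hypothetical curve nodes, for illustration only.
    curve = [(1.0, 1.0), (1.2, 1.1), (1.3, 1.0), (1.2, 1.05), (1.1, 1.0)]
    print(stages(curve))
```

Long runs of the same label correspond to the long trends that the slides associate with actively developed or mature systems.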

Ongoing Research and Further Work
  • Analysis of other entropy measures such as block
    entropy and Rényi entropies
    • paper submitted to Journal of Software
      Maintenance and Evolution
  • Dynamic models of software evolution
    • Differential equations, etc.
  • More case studies
    • paper submitted to Computing and Information
      Systems Journal

Thank You. Any Questions?

7-zip Evolution curve

Grip Evolution curve