Data Preparation Process for Construction Knowledge Generation through Knowledge Discovery in Databases
- 1 January 2002
- journal article
- research article
- Published by American Society of Civil Engineers (ASCE) in Journal of Computing in Civil Engineering
- Vol. 16 (1), 39-48
- https://doi.org/10.1061/(asce)0887-3801(2002)16:1(39)
Abstract
As the construction industry adapts to new computer technologies in terms of hardware and software, computerized construction data are becoming increasingly available. The explosive growth of many business, government, and scientific databases has begun to far outpace our ability to interpret and digest the data. Such volumes of data clearly overwhelm traditional methods of data analysis such as spreadsheets and ad hoc queries. The traditional methods can create informative reports from data, but cannot analyze the contents of those reports. A significant need exists for a new generation of techniques and tools with the ability to automatically assist humans in analyzing the mountains of data for useful knowledge. Knowledge discovery in databases (KDD) and data mining (DM) are tools that allow identification of valid, useful, and previously unknown patterns so that the construction manager may analyze the large amount of construction project data. These technologies combine techniques from machine learning, artificial intelligence, pattern recognition, statistics, databases, and visualization to automatically extract concepts, interrelationships, and patterns of interest from large databases. This paper presents the steps required to implement KDD: (1) identification of problems, (2) data preparation, (3) data mining, (4) data analysis, and (5) refinement. To test the feasibility of the proposed approach, a prototype KDD system was developed and tested with a construction management database, RMS (Resident Management System), provided by the U.S. Army Corps of Engineers. In this paper, the KDD process was applied to identify the cause(s) of construction activity delays. However, its possible applications can be extended to identifying the cause(s) of cost overruns and quality control/assurance problems, among other construction problems.
Predictable patterns may be revealed in construction data that were previously thought to be chaotic.
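To make the data preparation and data mining steps named in the abstract more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's prototype: the records, the field names (`activity`, `weather`, `planned_days`, `actual_days`), and the helper functions `prepare` and `delay_rate_by` are all hypothetical and are not taken from RMS. A grouped delay rate stands in for the data mining step; the abstract does not say which mining technique the authors used.

```python
from collections import defaultdict

# Hypothetical activity records. Field names and values are invented for
# illustration and do not come from the RMS database.
RAW_RECORDS = [
    {"activity": "Excavation", "weather": "rain", "planned_days": 10, "actual_days": 14},
    {"activity": "Formwork", "weather": "clear", "planned_days": 5, "actual_days": 5},
    {"activity": "Concrete pour", "weather": "rain", "planned_days": 3, "actual_days": 6},
    {"activity": "Roofing", "weather": "clear", "planned_days": 7, "actual_days": None},
    {"activity": "Paving", "weather": "Rain ", "planned_days": 4, "actual_days": 4},
    {"activity": "Framing", "weather": "clear", "planned_days": 8, "actual_days": 9},
]


def prepare(records):
    """Data preparation: drop incomplete rows, normalize labels, derive a delay flag."""
    cleaned = []
    for record in records:
        # Rows with a missing duration cannot be labeled, so skip them.
        if record["planned_days"] is None or record["actual_days"] is None:
            continue
        cleaned.append({
            "activity": record["activity"],
            # Normalize inconsistent labels such as "Rain " to "rain".
            "weather": record["weather"].strip().lower(),
            # Derived target attribute: did the activity overrun its plan?
            "delayed": record["actual_days"] > record["planned_days"],
        })
    return cleaned


def delay_rate_by(records, attribute):
    """Simple stand-in for data mining: fraction of delayed activities per attribute value."""
    counts = defaultdict(lambda: [0, 0])  # value -> [delayed count, total count]
    for record in records:
        counts[record[attribute]][0] += int(record["delayed"])
        counts[record[attribute]][1] += 1
    return {value: delayed / total for value, (delayed, total) in counts.items()}


prepared = prepare(RAW_RECORDS)
rates = delay_rate_by(prepared, "weather")
for value, rate in sorted(rates.items()):
    print(f"{value}: delay rate {rate:.2f}")
```

In this toy data set the preparation step discards one incomplete record and merges two spellings of the same weather label before any pattern is counted. That kind of cleaning is what the data preparation step is there to handle; a real project database would need considerably more of it.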