Data quality is one of the most important problems in data management. A database system typically aims to support the creation, maintenance, and use of large amounts of data, focusing on the quantity of data. However, real-life data are often dirty: inconsistent, duplicated, inaccurate, incomplete, or stale. Dirty data in a database routinely generate misleading or biased analytical results and decisions, and lead to loss of revenues, credibility and customers. With this comes the need for data quality management. In contrast to traditional data management tasks, data quality management enables the detection and correction of errors in the data, syntactic or semantic, in order to improve the quality of the data and hence add value to business processes. While data quality has been a longstanding problem for decades, the prevalent use of the Web has increased the risks, on an unprecedented scale, of creating and propagating dirty data. This monograph gives an overview of fundamental issues underlying central aspects of data quality, namely, data consistency, data deduplication, data accuracy, data currency, and information completeness. We promote a uniform logical framework for dealing with these issues, based on data quality rules. The text is organized into seven chapters, focusing on relational data. Chapter One introduces data quality issues. A conditional dependency theory is developed in Chapter Two, for capturing data inconsistencies. It is followed by practical techniques in Chapter Three for discovering conditional dependencies, and for detecting inconsistencies and repairing data based on conditional dependencies. Matching dependencies are introduced in Chapter Four, as matching rules for data deduplication. A theory of relative information completeness is studied in Chapter Five, revising the classical Closed World Assumption and the Open World Assumption, to characterize incomplete information in the real world. A data currency model is presented in Chapter Six, to identify the current values of entities in a database and to answer queries with the current values, in the absence of reliable timestamps. Finally, interactions between these data quality issues are explored in Chapter Seven. Important theoretical results and practical algorithms are covered, but formal proofs are omitted. The bibliographical notes contain pointers to papers in which the results were presented and proven, as well as references to materials for further reading. This text is intended for a seminar course at the graduate level. It is also intended to serve as a useful resource for researchers and practitioners who are interested in the study of data quality. The fundamental research on data quality draws on several areas, including mathematical logic, computational complexity and database theory. It has raised as many questions as it has answered, and is a rich source of questions and vitality. Table of Contents: Data Quality: An Overview / Conditional Dependencies / Cleaning Data with Conditional Dependencies / Data Deduplication / Information Completeness / Data Currency / Interactions between Data Quality Issues.
Information systems --- Artificial intelligence. Robotics. Simulation. Graphics --- Computer. Automation --- ICT (informatie- en communicatietechnieken) --- IR (information retrieval) --- bedrijfseconomie --- informatica --- informatiesystemen --- database management --- KI (kunstmatige intelligentie) --- informatica management --- robots --- AI (artificiële intelligentie)
Database management --- Web databases --- Information technology --- Bases de données --- Bases de données sur le Web --- Technologie de l'information --- Congresses. --- Management --- Gestion --- Congrès --- Computer Science --- Engineering & Applied Sciences --- Computer science. --- Data structures (Computer science). --- Database management. --- Information storage and retrieval. --- Artificial intelligence. --- Computer Science. --- Data Structures, Cryptology and Information Theory. --- Popular Computer Science. --- Database Management. --- Information Storage and Retrieval. --- Information Systems Applications (incl. Internet). --- Artificial Intelligence (incl. Robotics). --- AI (Artificial intelligence) --- Artificial thinking --- Electronic brains --- Intellectronics --- Intelligence, Artificial --- Intelligent machines --- Machine intelligence --- Thinking, Artificial --- Bionics --- Cognitive science --- Digital computer simulation --- Electronic data processing --- Logic machines --- Machine theory --- Self-organizing systems --- Simulation methods --- Fifth generation computers --- Neural computers --- Data base management --- Data services (Database management) --- Database management services --- DBMS (Computer science) --- Generalized data management systems --- Services, Database management --- Systems, Database management --- Systems, Generalized database management --- Information structures (Computer science) --- Structures, Data (Computer science) --- Structures, Information (Computer science) --- File organization (Computer science) --- Abstract data types (Computer science) --- Informatics --- Science --- Data Structures and Information Theory. --- Artificial Intelligence. --- Information storage and retrieval systems. --- Application software. --- Application computer programs --- Application computer software --- Applications software --- Apps (Computer software) --- Computer software
are essential to the quality of the program. We are grateful to Weiyi Meng, X. Sean Wang, Stratis D. Viglas, and Jeffrey Xu Yu for their valuable help in leading and monitoring the discussions on our behalf. We also thank Jun Yang for his work in preparing the proceedings. Finally, we are deeply indebted to Yu Zhang and Mengya Tang, the Web Masters, who took on tremendous pain and extra work, and ably modified, extended and maintained the conference management tool; for three long months they worked until late night every day, seven days a week; the necessary extension of the tool also received help from Kun Jing, Chengchao Xie, Shiqi Peng, Heng Wang, Cheng Jin and Ruizhi Ye; without their hard and effective work the online discussions would not have been possible, among other things. October 2005 Wenfei Fan, Zhaohui Wu Program Committee Co-chairs WAIM 2005 Dedication: Hongjun Lu (1945-2005) On behalf of the Program Committee, it is with sincere gratitude and great sorrow that we would like to dedicate WAIM 2005 proceedings to Hongjun Lu, who left us on March 3, 2005. Hongjun was not only an excellent researcher and highly productive scholar of the database community, but also a wonderful colleague and dear friend. For many years, he has been the ambassador for database research to China, and tremendously fostered the growth of this community.
Information systems --- Artificial intelligence. Robotics. Simulation. Graphics --- Computer. Automation --- ICT (informatie- en communicatietechnieken) --- IR (information retrieval) --- bedrijfseconomie --- informatica --- informatiesystemen --- database management --- KI (kunstmatige intelligentie) --- informatica management --- robots
This Festschrift volume, published in honour of Peter Buneman, contains contributions written by some of his colleagues, former students, and friends. In celebration of his distinguished career a colloquium was held in Edinburgh, Scotland, 27-29 October, 2013. The articles presented herein belong to some of the many areas of Peter's research interests.
Engineering & Applied Sciences --- Computer Science --- Computer science. --- Programming languages (Electronic computers). --- Computer logic. --- Database management. --- Computer Science. --- Database Management. --- Programming Languages, Compilers, Interpreters. --- Logics and Meanings of Programs. --- Computer science --- Computational complexity --- Mathematics --- Logic design. --- Design, Logic --- Design of logic systems --- Digital electronics --- Electronic circuit design --- Logic circuits --- Machine theory --- Switching theory --- Informatics --- Science --- Data base management --- Data services (Database management) --- Database management services --- DBMS (Computer science) --- Generalized data management systems --- Services, Database management --- Systems, Database management --- Systems, Generalized database management --- Electronic data processing --- Computer science logic --- Logic, Symbolic and mathematical --- Computer languages --- Computer program languages --- Computer programming languages --- Machine language --- Languages, Artificial --- Compilers (Computer programs). --- Compilers and Interpreters. --- Computer Science Logic and Foundations of Programming. --- Compiling programs (Computer programs) --- Computer programs --- Programming software --- Systems software
This book constitutes the proceedings of the 13th Asia-Pacific Conference APWeb 2011 held in conjunction with the APWeb 2011 Workshops XMLDM and USD, in Beijing, China, in April 2011. The 26 full papers presented together with 10 short papers, 3 keynote talks, and 4 demo papers were carefully reviewed and selected from 104 submissions. The submissions range over a variety of topics such as classification and clustering; spatial and temporal databases; personalization and recommendation; data analysis and application; Web mining; Web search and information retrieval; complex and social networks; and secure and semantic Web.
Mathematical statistics --- Computer architecture. Operating systems --- Information systems --- Computer. Automation --- patroonherkenning --- ICT (informatie- en communicatietechnieken) --- IR (information retrieval) --- factoranalyse --- informatiesystemen --- database management --- computernetwerken
Mathematical statistics --- Computer architecture. Operating systems --- Information systems --- Computer. Automation --- patroonherkenning --- ICT (informatie- en communicatietechnieken) --- IR (information retrieval) --- factoranalyse --- informatiesystemen --- database management --- computernetwerken
This Festschrift volume, published in honour of Peter Buneman, contains contributions written by some of his colleagues, former students, and friends. In celebration of his distinguished career a colloquium was held in Edinburgh, Scotland, 27-29 October, 2013. The articles presented herein belong to some of the many areas of Peter's research interests.
Logic --- Computer science --- Programming --- Information systems --- computers --- ontwerpen --- programmeren (informatica) --- database management --- programmeertalen --- computerkunde
This book constitutes the proceedings of the 8th CCF Conference on Big Data, BigData 2020, held in Chongqing, China, in October 2020. The 16 full papers presented in this volume were carefully reviewed and selected from 65 submissions. They present recent research on theoretical and technical aspects of big data, as well as on digital economy demands in big data applications.
Social sciences (general) --- Computer assisted instruction --- Computer science --- Artificial intelligence. Robotics. Simulation. Graphics --- Computer. Automation --- computervisie --- grafische vormgeving --- informatica --- sociale wetenschappen --- computerondersteund onderwijs --- wiskunde --- KI (kunstmatige intelligentie) --- AI (artificiële intelligentie)
Social sciences (general) --- Computer assisted instruction --- Computer science --- Artificial intelligence. Robotics. Simulation. Graphics --- Computer. Automation --- computervisie --- grafische vormgeving --- informatica --- sociale wetenschappen --- computerondersteund onderwijs --- wiskunde --- KI (kunstmatige intelligentie)