1 OVERVIEW

Topics Covered:
1.1 Database management system
1.2 Data Independence
1.3 Data Abstraction
1.4 Data Models
1.5 DBMS Architecture
1.6 Users of DBMS
1.7 Overview of Conventional Data Models

1.1 DATABASE MANAGEMENT SYSTEM (DBMS)

DEFINITION: A database management system is a collection of interrelated data and a set of programs to access those data. The collection of data is referred to as the database. The primary goal of a DBMS is to provide a way to store and retrieve database information that is both convenient and efficient. A DBMS allows us to define structures for the storage of information and also provides mechanisms to manipulate this information. A DBMS also keeps the stored information safe despite system crashes or attempts at unauthorized access.

Limitations of the file-based data processing environment:
1) Data redundancy and inconsistency: Different files have different formats, and the programs that use them are written in different programming languages by different users, so the same information may be duplicated in several files. This may lead to data inconsistency. If a customer changes his address, the change may be reflected in one copy of the data but not in the other.
2) Difficulty in accessing data: The file system environment does not allow needed data to be retrieved in a convenient and efficient manner.
3) Data isolation: Data is scattered across various files, and the files may be in different formats, so the data becomes isolated and is hard to retrieve together.
4) Integrity problems: Data values stored in the database must satisfy consistency constraints. Problems occur when constraints involve several data items from different files.
5) Atomicity problems: If a failure occurs, the data must be restored to the consistent state that existed prior to the failure. For example, suppose a person abc is transferring Rs 5000 to the account of pqr. If the system fails after the money has been withdrawn from abc’s account but before it has been deposited into pqr’s account, then the Rs 5000 should be credited back to abc’s account.
6) Concurrent access anomalies: Many systems allow multiple users to update data simultaneously. Concurrent updates should not result in inconsistent data.
7) Security problems: Not every user of the database system should be able to access all data. The database should be protected from access by unauthorized users.

1.2 DATA INDEPENDENCE

We can define two types of data independence:

1. Logical data independence: It is the capacity to change the conceptual schema without having to change external schemas or application programs. We may change the conceptual schema to expand the database (by adding a record type or data item), or to reduce the database (by removing a record type or data item). In the latter case, external schemas that refer only to the remaining data should not be affected. Only the view definition and the mappings need be changed in a DBMS that supports logical data independence. Application programs that reference the external schema constructs must work as before, after the conceptual schema undergoes a logical reorganization. Changes to constraints can also be applied to the conceptual schema without affecting the external schemas or application programs.

2. Physical data independence: It is the capacity to change the internal schema without having to change the conceptual (or external) schemas.
Changes to the internal schema may be needed because some physical files had to be reorganized (for example, by creating additional access structures) to improve the performance of retrieval or update. If the same data as before remains in the database, we should not have to change the conceptual schema.

Whenever we have a multiple-level DBMS, its catalog must be expanded to include information on how to map requests and data among the various levels. The DBMS uses additional software to accomplish these mappings by referring to the mapping information in the catalog. Data independence is accomplished because, when the schema is changed at some level, the schema at the next higher level remains unchanged; only the mapping between the two levels is changed. Hence, application programs referring to the higher-level schema need not be changed.

1.3 DATA ABSTRACTION

A major purpose of a DBMS is to provide users with an abstract view of the data; that is, the system hides certain details of how the data are stored and maintained. Since many database system users are not computer trained, developers hide the complexity from users through three levels of abstraction, to simplify users’ interaction with the system.
1) Physical level of data abstraction: This is the lowest level of abstraction, which describes how the data are actually stored.
2) Logical level of data abstraction: This level describes what data are actually stored in the database and what relationships exist among them.
3) View level of data abstraction: This is the highest level of abstraction, which describes only the part of the database relevant to a particular user. Views also provide a security mechanism to prevent users from accessing certain parts of the database.

1.4 DATA MODELS

Many data models have been proposed, and we can categorize them according to the types of concepts they use to describe the database structure. High-level or conceptual data models provide concepts that are close to the way many users perceive data, whereas low-level or physical data models provide concepts that describe the details of how data is stored in the computer. Concepts provided by low-level data models are generally meant for computer specialists, not for typical end users. Between these two extremes is a class of representational (or implementation) data models, which provide concepts that may be understood by end users but that are not too far removed from the way data is organized within the computer. Representational data models hide some details of data storage but can be implemented on a computer system in a direct way.
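As a rough illustration of that last point, the sketch below uses the relational model (one representational data model) through Python's built-in sqlite3 module. The employee table, its columns, and the sample rows are invented for this example. The program states only which records and fields exist; the file layout, record ordering, and access paths are left entirely to the DBMS.

```python
import sqlite3


def build_employee_db():
    """Create an in-memory database holding a hypothetical employee table."""
    conn = sqlite3.connect(":memory:")
    # We declare records and fields only; nothing here says how they are stored.
    conn.execute("CREATE TABLE employee (name TEXT PRIMARY KEY, salary INTEGER)")
    conn.executemany(
        "INSERT INTO employee (name, salary) VALUES (?, ?)",
        [("Asha", 52000), ("Ravi", 48000), ("Meena", 61000)],
    )
    conn.commit()
    return conn


def salary_of(conn, name):
    """Look up one salary by name; how the row is located is left to the DBMS."""
    row = conn.execute(
        "SELECT salary FROM employee WHERE name = ?", (name,)
    ).fetchone()
    return None if row is None else row[0]


if __name__ == "__main__":
    conn = build_employee_db()
    print(salary_of(conn, "Ravi"))
    conn.close()
```

The table definition is close enough to the machine to be executed directly, yet the caller of salary_of never sees a file name, a record offset, or an index.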
Conceptual data models use concepts such as entities, attributes, and relationships.
An entity represents a real-world object or concept, such as an employee or a project, that is described in the database. An attribute represents some property of interest that further describes an entity, such as the employee’s name or salary. A relationship among two or more entities represents an interaction among the entities. Entities, attributes, and relationships are the building blocks of the Entity-Relationship model, a popular high-level conceptual data model.

Representational or implementation data models are the models used most frequently in traditional commercial DBMSs. They include the widely used relational data model, as well as the so-called legacy data models (the network and hierarchical models) that have been widely used in the past.

We can regard object data models as a new family of higher-level implementation data models that are closer to conceptual data models. Object data models are also frequently utilized as high-level conceptual models, particularly in the software engineering domain.

Physical data models describe how data is stored in the computer by representing information such as record formats, record orderings, and access paths. An access path is a structure that makes the search for particular database records efficient.

1.5 DBMS ARCHITECTURE

Fig: Three-Schema DBMS Architecture
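The three schema levels, and the two kinds of data independence from Section 1.2, can be approximated in a small experiment. The sketch below is only an analogy built with Python's built-in sqlite3 module, not a description of how any particular DBMS implements the architecture. A base table stands in for the conceptual schema, a view stands in for an external schema, and an index stands in for an internal-level access structure. All table, view, column, and index names are hypothetical.

```python
import sqlite3


def staff_names(conn):
    """The 'application program': it reads only the external view."""
    rows = conn.execute("SELECT name FROM staff_directory ORDER BY name")
    return [row[0] for row in rows]


def demo():
    conn = sqlite3.connect(":memory:")

    # Conceptual level (assumed stand-in): what data exists.
    conn.execute(
        "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
    )
    conn.executemany(
        "INSERT INTO employee (name, salary) VALUES (?, ?)",
        [("Asha", 52000), ("Ravi", 48000)],
    )

    # External level (assumed stand-in): a view that exposes no salary data.
    conn.execute("CREATE VIEW staff_directory AS SELECT id, name FROM employee")
    before = staff_names(conn)

    # Internal-level change: add an access structure. The table and the view
    # definitions are untouched (physical data independence).
    conn.execute("CREATE INDEX idx_employee_name ON employee (name)")
    after_physical = staff_names(conn)

    # Conceptual-level change: expand the schema with a new data item. The
    # view and the application code are untouched (logical data independence).
    conn.execute("ALTER TABLE employee ADD COLUMN department TEXT")
    after_logical = staff_names(conn)

    conn.close()
    return before, after_physical, after_logical


if __name__ == "__main__":
    print(demo())
```

In this sketch staff_names is never edited, yet it returns the same names after both changes, because it depends only on the view and not on the table layout or the index.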