Database Management Systems
1. A database management system (DBMS), or simply a database system (DBS), consists of
o A collection of interrelated and persistent data (usually referred to as the database (DB)).
o A set of application programs used to access, update and manage that data (which form the management system (MS)).
2. The goal of a DBMS is to provide an environment that is both convenient and efficient to use in
o Retrieving information from the database.
o Storing information into the database.
3. Databases are usually designed to manage large bodies of information. This involves
o Definition of structures for information storage (data modeling).
o Provision of mechanisms for the manipulation of information (file and systems structure, query processing).
o Providing for the safety of information in the database (crash recovery and security).
o Concurrency control if the system is shared by users.
In short, a database management system is a collection of interrelated data and a set of programs to access those data. The primary goal of a DBMS is to provide a way to store and retrieve database information that is both convenient and efficient.
Purpose of Database Systems
1. To see why database management systems are necessary, let's look at a typical "file-processing system" supported by a conventional operating system.
The application is a savings bank:
o Savings account and customer records are kept in permanent system files.
o Application programs are written to manipulate files to perform the following tasks:
Debit or credit an account.
Add a new account.
Find an account balance.
Generate monthly statements.
2. Development of the system proceeds as follows:
o New application programs must be written as the need arises.
o New permanent files are created as required.
o Over a long period of time, however, files may end up in different formats, and
o Application programs may be in different languages.
3. So we can see there are problems with the straight file-processing approach:
o Data redundancy and inconsistency
Same information may be duplicated in several places.
All copies may not be updated properly.
o Difficulty in accessing data
May have to write a new application program to satisfy an unusual request.
E.g. find all customers with the same postal code.
The data could be gathered manually, but that would be a long job.
o Data isolation
Data in different files.
Data in different formats.
Difficult to write new application programs.
o Multiple users
Want concurrency for faster response time.
Need protection for concurrent updates.
E.g. two customers withdraw funds from the same account at the same time: the account has $500 in it, and they withdraw $100 and $50. With no protection the result could be $350 (correct), or $400 or $450 if one withdrawal overwrites the other.
o Security problems
Every user of the system should be able to access only the data they are permitted to see.
E.g. payroll people only handle employee records, and cannot see customer accounts; tellers only access account data and cannot see payroll data.
Difficult to enforce this with application programs.
o Integrity problems
Data may be required to satisfy constraints.
E.g. no account balance below $25.00.
Again, difficult to enforce or to change constraints with the file-processing approach.
These problems and others led to the development of database management systems.
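The concurrent-withdrawal example above can be checked with a small simulation. This is only a sketch: it assumes each withdrawal is a separate read step followed by a write step, and the teller labels "A" and "B" and the function name run_schedule are made up for illustration. It tries every interleaving in which each teller reads before writing.

```python
from itertools import permutations

AMOUNTS = {"A": 100, "B": 50}  # hypothetical tellers and withdrawal amounts


def run_schedule(schedule, start_balance=500):
    """Apply a sequence of (teller, action) steps to one shared balance."""
    balance = start_balance
    local = {}  # each teller's private copy of the balance it read
    for teller, action in schedule:
        if action == "read":
            local[teller] = balance
        else:  # "write": store the stale or fresh copy minus the withdrawal
            balance = local[teller] - AMOUNTS[teller]
    return balance


steps = [("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")]
outcomes = set()
for order in permutations(steps):
    # Keep only orders in which each teller reads before it writes.
    a_ok = order.index(("A", "read")) < order.index(("A", "write"))
    b_ok = order.index(("B", "read")) < order.index(("B", "write"))
    if a_ok and b_ok:
        outcomes.add(run_schedule(order))

print(sorted(outcomes))  # [350, 400, 450]
```

Only the two serial orders give $350. Whenever both reads happen before either write, the later write overwrites the earlier one, which is the anomaly concurrency control is meant to prevent.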
Database System Applications:
Banking, Airlines, Universities, Credit Card Transactions, Telecommunications, Finance, Sales, Manufacturing, Human Resources and so on.
Database Systems versus File Systems
Major disadvantages of a file-processing system:
Data redundancy and inconsistency
Difficulty in accessing data
Data isolation
Integrity problems
Atomicity problems
Concurrent access anomalies
Security problems
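For contrast with the "difficulty in accessing data" item, here is a minimal sketch of how an unusual request such as the postal-code example becomes a one-line query once the data sits in a database. SQLite (via Python's standard sqlite3 module) is used only as a convenient stand-in; the customer table, its columns and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE customer (name TEXT, postal_code TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("Ana", "V5A 1S6"), ("Ben", "V6T 1Z4"), ("Cy", "V5A 1S6")],
)


def customers_with_postal_code(code):
    """Answer the ad hoc request with a query, not a new file-parsing program."""
    rows = conn.execute(
        "SELECT name FROM customer WHERE postal_code = ? ORDER BY name",
        (code,),
    ).fetchall()
    return [name for (name,) in rows]


same_code = customers_with_postal_code("V5A 1S6")
print(same_code)  # ['Ana', 'Cy']
```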
Data Abstraction
Since many database users are not computer trained, developers hide the complexity from users through several levels of abstraction, to simplify users' interactions with the system:
Physical level: The lowest level of abstraction describes how the data are actually stored. The physical level describes complex low-level data structures in detail.
Logical level: The next higher level of abstraction describes what data are stored in the database and what relationships exist among these data.
View level: The highest level of abstraction describes only part of the entire database. Many users of the database do not need all the information about a large database; instead they need to access only a part of the database. The view level of abstraction exists to simplify their interaction with the system.
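A view can be sketched in SQL. The example below assumes a hypothetical account table and a teller_view that exposes only the account number and balance; SQLite is used here only because it ships with Python, not because the notes prescribe it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Logical level: what data are stored (hypothetical table).
    CREATE TABLE account (number TEXT, owner TEXT, balance REAL, branch TEXT);
    INSERT INTO account VALUES ('A-101', 'Ana', 500.0, 'Downtown');
    INSERT INTO account VALUES ('A-102', 'Ben', 120.0, 'Uptown');
    -- View level: only the part of the database a teller needs.
    CREATE VIEW teller_view AS SELECT number, balance FROM account;
""")

cursor = conn.execute("SELECT * FROM teller_view ORDER BY number")
view_columns = [col[0] for col in cursor.description]
view_rows = cursor.fetchall()
print(view_columns)  # ['number', 'balance']
print(view_rows)     # [('A-101', 500.0), ('A-102', 120.0)]
```

How the rows are laid out on disk (the physical level) never appears in the example; that is left entirely to the database system.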
Data Independence
1. The ability to modify a scheme definition in one level without affecting a scheme definition in a higher level is called data independence.
2. There are two kinds:
o Physical data independence
The ability to modify the physical scheme without causing application programs to be rewritten
Modifications at this level are usually to improve performance
o Logical data independence
The ability to modify the conceptual scheme without causing application programs to be rewritten
Usually done when logical structure of database is altered
3. Logical data independence is harder to achieve as the application programs are usually heavily dependent on the logical structure of the data. An analogy is made to abstract data types in programming languages.
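Physical data independence can be illustrated with a small sketch: an index is added to improve performance, and the application function that queries the table is left untouched. The table, the index name and the helper find_balance are assumptions made for this example, with SQLite as a stand-in system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (number TEXT, balance REAL)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("A-101", 500.0), ("A-102", 120.0)],
)


def find_balance(number):
    """Application code: depends only on the logical structure of the table."""
    row = conn.execute(
        "SELECT balance FROM account WHERE number = ?", (number,)
    ).fetchone()
    return row[0] if row else None


before = find_balance("A-102")
# Physical-level change: a new access path, made purely for performance.
conn.execute("CREATE INDEX account_number_idx ON account (number)")
after = find_balance("A-102")
print(before, after)  # 120.0 120.0
```

A change to the logical structure, such as renaming the balance column, would break find_balance, which is why logical data independence is the harder of the two.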
Instances and Schemes
1. Databases change over time.
2. The information in a database at a particular point in time is called an instance of the database.
3. The overall design of the database is called the database scheme.
4. Analogy with programming languages:
o Data type definition - scheme
o Value of a variable - instance
5. There are several schemes, corresponding to levels of abstraction:
o Physical scheme
o Conceptual scheme
o Subscheme (can be many)
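The programming-language analogy in item 4 can be written out directly. In this sketch the class definition plays the role of the scheme, and the lists of objects play the role of instances at two points in time; the Account type and its sample values are invented for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class Account:
    """The 'scheme': a type definition that rarely changes."""
    number: str
    balance: float


# Two 'instances': the contents at different points in time.
instance_monday = [Account("A-101", 500.0)]
instance_tuesday = [Account("A-101", 400.0), Account("A-102", 50.0)]

scheme = [f.name for f in fields(Account)]
print(scheme)                                       # ['number', 'balance']
print(len(instance_monday), len(instance_tuesday))  # 1 2
```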
Database Manager
1. The database manager is a program module which provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system.
2. Databases typically require lots of storage space (gigabytes). The data must be stored on disk and is moved between disk and main memory (MM) as needed.
3. The goal of the database system is to simplify and facilitate access to data. Performance is important. Views provide simplification.
4. So the database manager module is responsible for
o Interaction with the file manager: Storing raw data on disk using the file system usually provided by a conventional operating system. The database manager must translate DML statements into low-level file system commands (for storing, retrieving and updating data in the database).
o Integrity enforcement: Checking that updates in the database do not violate consistency constraints (e.g. no bank account balance below $25)
o Security enforcement: Ensuring that users only have access to information they are permitted to see
o Backup and recovery: Detecting failures due to power failure, disk crash, software errors, etc., and restoring the database to its state before the failure
o Concurrency control: Preserving data consistency when there are concurrent users.
5. Some small database systems may lack some of these features, resulting in simpler database managers. (For example, no concurrency control is required on a PC running MS-DOS.) These features are necessary on larger systems.
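Two of these responsibilities, integrity enforcement and restoring a consistent state after a failed update, can be sketched with SQLite through Python's sqlite3 module. The $25 minimum comes from the example above; the table layout, account numbers and the transfer and balances helpers are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "number TEXT PRIMARY KEY, "
    "balance REAL NOT NULL CHECK (balance >= 25))"  # consistency constraint
)
conn.execute("INSERT INTO account VALUES ('A-101', 500.0)")
conn.execute("INSERT INTO account VALUES ('A-102', 40.0)")
conn.commit()


def transfer(src, dst, amount):
    """Move money between accounts; undo both updates if a constraint fails."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            # Credit first, so a failed debit leaves a partial update to undo.
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE number = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE number = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    return dict(conn.execute("SELECT number, balance FROM account ORDER BY number"))


rejected = transfer("A-102", "A-101", 30)   # would leave A-102 at $10
after_rejected = balances()
accepted = transfer("A-101", "A-102", 100)
after_accepted = balances()
print(rejected, after_rejected)  # False {'A-101': 500.0, 'A-102': 40.0}
print(accepted, after_accepted)  # True {'A-101': 400.0, 'A-102': 140.0}
```

The application never checks the $25 rule itself. The constraint is declared once, and the database system rejects any update that would violate it.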
Database Administrator
1. The database administrator is a person having central control over data and programs accessing that data. Duties of the database administrator include:
o Scheme definition: the creation of the original database scheme. This involves writing a set of definitions in a DDL (data definition language), compiled by the DDL compiler into a set of tables stored in the data dictionary.
o Storage structure and access method definition: writing a set of definitions translated by the data storage and definition language compiler
o Scheme and physical organization modification: writing a set of definitions used by the DDL compiler to generate modifications to appropriate internal system tables (e.g. data dictionary). This is done rarely, but sometimes the database scheme or physical organization must be modified.
o Granting of authorization for data access: granting different types of authorization for data access to various users
o Integrity constraint specification: generating integrity constraints. These are consulted by the database manager module whenever updates occur.
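The link between DDL statements and the data dictionary can be seen in a small sketch. SQLite is used as a stand-in: it records every table definition in a built-in catalog table called sqlite_master, which plays roughly the role of the data dictionary described above. The customer and account tables are hypothetical. SQLite has no user accounts, so the authorization duty is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Scheme definition: DDL statements written by the administrator.
conn.execute("CREATE TABLE customer (name TEXT, postal_code TEXT)")
conn.execute("CREATE TABLE account (number TEXT, balance REAL)")

# The system stores the resulting definitions in its own catalog table.
catalog = conn.execute(
    "SELECT name, sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
).fetchall()
table_names = [name for name, _ in catalog]
print(table_names)  # ['account', 'customer']
for name, definition in catalog:
    print(name, "->", definition)
```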
Database Users
1. The database users fall into several categories:
o Application programmers are computer professionals interacting with the system through DML calls embedded in a program written in a host language (e.g. C, PL/I, Pascal).
These programs are called application programs.
The DML precompiler converts DML calls (prefaced by a special character like $, #, etc.) to normal procedure calls in a host language.
The host language compiler then generates the object code.
Some special types of programming languages combine Pascal-like control structures with control structures for the manipulation of a database.
These are sometimes called fourth-generation languages.
They often include features to help generate forms and display data.
o Sophisticated users interact with the system without writing programs.
They form requests by writing queries in a database query language.
These are submitted to a query processor that breaks a DML statement down into instructions for the database manager module.
o Specialized users are sophisticated users writing special database application programs. These may be CADD systems, knowledge-based and expert systems, complex data systems (audio/video), etc.
o Naive users are unsophisticated users who interact with the system by using permanent application programs (e.g. automated teller machine).
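The difference between these categories can be sketched in a few lines. Here Python is the host language and SQL is the DML. The function atm_balance stands for a permanent application program written by an application programmer and used by a naive user, and the final query is the kind of request a sophisticated user might type directly. All names and data are assumptions for illustration; no precompiler is involved, since the sqlite3 module passes DML statements as strings.

```python
import sqlite3


def open_bank():
    """Set up a tiny hypothetical bank database in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (number TEXT PRIMARY KEY, balance REAL)")
    conn.execute("INSERT INTO account VALUES ('A-101', 500.0)")
    conn.commit()
    return conn


def atm_balance(conn, number):
    """Permanent application program: the naive user never sees the query."""
    row = conn.execute(
        "SELECT balance FROM account WHERE number = ?", (number,)
    ).fetchone()
    return "unknown account" if row is None else f"Balance: ${row[0]:.2f}"


conn = open_bank()
message = atm_balance(conn, "A-101")
# A sophisticated user would instead submit a query such as this one directly.
adhoc = conn.execute(
    "SELECT COUNT(*) FROM account WHERE balance > 100"
).fetchone()[0]
print(message)  # Balance: $500.00
print(adhoc)    # 1
```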