Wednesday, September 1, 2010

Data, Database and Database Management System

Chapter 1: Introduction
Data, Database and Database Management System
Data
Data are raw fact or information. Data is representation of fact, concept in formalized manner suitable for communication, processing by automating mean or human being.
Information is data that has been refined and organized by processing and purposeful intelligence.
Data and information are seem to be synonymous but the distinction between them is data is used to refer “What is actually stored in the database” and information is used to refer “the meaning of data as understand by some user”
Database
Database is the collection of inter-related data and set of program to access data.
Database system is basically just a computerized record keeping system i.e. as a kind of electronic filing cabinet. In other word, it is a repository or container for a collection of computerized data files where users can perform variety of operations of such files.
E.g.: Adding new files to the database, inserting data into existing file, removing, operating file from database etc.
Database Management System (DBMS)
DBMS is a collection of program that manages the database, structure and control access to the data store in database.
File System
File System is a method of storing and organizing computer and the data they contain to make them easy to find and access them. File System may use the data storage device such as hard disk or CDROM and involves maintaining the physical location of file.
Purpose of Database System
In the early days database applications where built directly on the top of the file system.
Drawbacks of using file system to store data are as follows
(Advantage of DBMS)
1. Data Redundancy and Consistency
Multiple file formats, duplication of information in different files.
2. Difficulty in accessing data
Need to write a new program to carry out each new task.
3. Data Isolation
Data scattered in different files and may be in different formats. So it may be difficult to write application to retrieve appropriate data accommodating all files and format.
4. Integrity Problem
Data are stored in different file and may be used by different program. In long run, if file format is been change according to one program then the other program may stop responding.
5. Atomicity Problem
Failure may leave data file in an inconsistent state with partial update carried on.
6. Concurrency Access by Multiple Users
Accessing data at a same time by two or more different users create a problem while writing data trying to update data from two or more side.
7. Security Problem
Not only authorized but also other can easily see data store in data file.

So, database system offer solution to the entire above problem.

Disadvantage of Database System
In spite of using DBMS, there are few situations in which such a program may involve unnecessary overhead cost as that would not be incurred in traditional file system. The overhead cost of using DBMS is due to the following.
1. High initial investment in software, hardware and training.
2. Overhead for providing security, recovery and integrity function.
3. Cost of the maintenances of the software which remain forever.
4. Cost of backup and recovery.

View of data
A database system is a collection of interrelated files and a set of program that allows user to access and modify these files. The major purpose of database system is to provide user with an abstract view of the data i.e. the system hides certain details of how the data are stored and maintained.
Parts of view of data
Data Abstraction
For the system to be useable, it must retrieve data efficiently. The need for efficiency has led designer to use complex data structure to represent data in the database. Since database users are not computer trained, developer hides complexity from user through several level of abstraction to simplify user interaction with the system.
Three level of abstraction are:-
1. Physical Level
The lowest level of abstraction describes how the data are actually stored. The physical level describes complex low level data structure in detail.
2. Logical Level
What data are stored in the database and what relationship exists among data is deal by logical level.
Database administrator, who must decide what information to keep in the database use the logical level of abstraction.
3. View Level
The highest level of abstraction describes only part of the entire database to be viewed by the users. Different users have access to only a part of database. The system may provide many users for same database.


Fig: Level of Abstraction
Schema and Instances
Schema
The overall design of the database is called database schema.
E.g.: Employee Schema
Emoloyee(name,age,salary)
The database schema is the description of a database specified during database design.
Analogous to type information of variable of a program.
Types of schema
1. Physical Schema
Database design at a physical level.
It describes physical storage of database.
2. Logical Schema
Database design at logical level.
It hides the details of physical storage structure and concentrates on describing entities, data types, relationship etc.
3. External View
It describes the part of database that a particular user group is interested in and hides the rest of the database from the user group.

-----------------------------------------------------------





Fig: The three schema architecture
Instances
The actual contain of database at a particular point in time is instances.
Analogous to the values of variable.
Employee Schema
Emoloyee(name,age,salary)
(ram,25,10000) instance
Data Model
Representation of reality
Structure of database is called data model. It is the collection of conceptual tool for describing data relationship, data semantic etc. There are various data model.
1. Entity Relationship Model (E-R Model)
It is base on a perception of real world that consists of collection of basic object called entities and relationship among entities.
An entity is a things or object in the real world i.e. distinguishable from other objects. E.g. each person is entity.
Each entity is described by the set of attributes.
Relationship is an association among several entities. The set of all entities of same type and set of all relationship of same type are termed as entity set and relationship set respectively.
(Rectangle) ---->entity set
(Line) ---------> Link
(Ellipse)---->attributes

(Diamond)----->relationship set



2. Relational Model
This model uses a collection of table to represent both data and relationship among those data. Each table has multiple columns and each column has unique name. Each table contains record of particular type. Each record type defines a fixed number of fields or attributes. The relational model is at lower level of abstraction then the ER Model. Database design are often carried out in ER model and then translated into relational model.
Teacher
Id Name address


3. Others model
a. Object oriented data model
It is an extension of ER model with view of encapsulation, function and object identity.
b. Object relational data model
It combines feature of object oriented data model and relational model.
c. Network data model
The model in which all entities are connected to each other is called network data model.
Account Holder Account




Fig: Network Model
d. Hierarchical data model
Similar to network model. Organization of data is as a collection of trees, rather than arbitrary graph.


Fig: Hierarchical Data Model

Database Language
A database system provides a database definition language to specify a database schema and data manipulation language to express database queries and updates.
1. Data Definition Language (DDL)
A special language that specifies a database schema by a set of definitions is called DDL. A special set of table is called data dictionary (collection of data) or data directory which contains Meta data (data about data).
2. Data Manipulation Language(DML)
DML is retrieval of information store in database.
.insertion of new information into the database
.deletion of information from database
.modification of information store in database
.selection of data store in database
Types of DML
a. Procedural DML
It requires the user to specify what data are needed and how to get those results. E.g. relational algebra.
b. Declarative (Non-Procedural)DML
It requires a user to specify what data are needed without specifying how to get those data. E.g.: tuple relational calculus, domain relational calculus.
*Query language
Query is a statement requesting the retrieval of information.
E.g.: Select * from employee
The portion of DML that involves information retrieval is called query language.
Database language is a special type of language that is use to interact with the database concerning with the data definition language and data manipulation language.

Database Users
Database users are the person who uses the database.
1. Sophisticated Users
These are the users who interact with system without writing program. They form their request in a database query language. They submit each query to a query processor which function is to break down DML statement into instruction that the storage manager understand. E.g.: analyst
2. Specialized Users
They are sophisticated users who write specialized database application. Among these applications are computer aided design system, knowledge base and expert system, complex data system (audio/video) etc.
3. Naïve Users
These are unsophisticated users who interact with the system by invoking one of the application programs that have been written previously. E.g.: a bank teller who need to transfer Rs 5000 from account A to account B. Then he/she invokes a program called transfer.
4. Application Programmers
These are the computer professional who write application program to interact database.

Database Administrator
A person who has a central control over the system (both the data and program) is called database administrator (DBA).
Functions of DBA
1. Schema Definition
DBA creates original database schema by executing a set of data definition statement in DDL.
2. Storage Structure and access method definition
It is that how data is to be represented by writing the storage structure definition. The associated internal or conceptual schema most also be specified using data definition language (DDL).
3. Schema and physical modification
DBA carries out changes to the schema and physical organization to reflect the changing needs of the organization or to alter the physical organization to improve the performances.
4. Granting of authorization for data access
By granting different types of authorization DBA can regulate which part of database, various users can access.
5. Routine maintenances
DBA’s routine maintenance activities are:
a. Periodically backing of database either on tapes on to remote server.
b. Ensuring that enough free disk space is available for normal operation and upgrading disk space required.
c. Monitoring jobs running on the database and ensuring that performance does not degrade.

Transaction management
Transaction is the collection of operation that performs a single logical function in a database application.
To insure integrity of data, database system must follow the following properties called ACID properties.
1. Atomicity
Either all the operations of transaction are reflected properly in the database system or none are.

2. Consistency
Execution of a transaction in isolation preserves the consistency of the database.

3. Isolation
Each transaction should be aware of each other transaction executing concurrently in the system.

4. Durability
After a transaction complete successfully, the changes it has made to the database persists, even if there are system failure.

Database System Architecture
A database system is partition into module that deals with each of the responsibilities of the overall system. The functional component of the database system can be broadly divided into: Storage manager and query processor components.
1. Storage manager
It is a program module that provides the interface between the low level data stored in the database and the application program and queries submitted to the system. The storage manager is responsible for interaction with the file manager. The storage manager translates the various DML statements into low level system file command. Storage manager is responsible for storing, retrieving and updating data in the database. The storage manager component includes:
a. Authorization and integrity manager
Test for the satisfaction of integrity constraints and checks the authority to users to access data.

b. Transaction manager
Ensures that the database remains in the consistent state despite system failure and the concurrent transaction execution proceed without conflicting.

c. File manager
Manages the allocation of space on the disk storage and the data structure use to represent information store on disk.

d. Buffer manager
Responsible for fetching data from disk storage into main memory.

2. Query processor
a. DDL interpreter
This interprets DDL statement and records the definitions in the data dictionary.

b. DML compiler
This translates DML language in a query language into an evaluation plan consisting of low level instruction that the query evaluation engine understand.
A query can be translated into number of alternatives evaluation plan that all gives the same level. The DML compiler also performs query optimization i.e. it peaks the lowest cost evaluation plan from the alternatives.

Application Architecture
Today usually database system are not present at the side of users but are connected with the network. We differentiate between them as a client machine on which remote database user works and server machine on which database system run. Database application are partition into 2 or 3 parts:-
1. Two tier architecture
The application is partition into the component resides at the client machine which involves database functionality at the server machine through query language statement.

2. Three tier architecture
Three tier architecture, client machine act as a front end and does not contain any direct database call.
Instead client communicates with application server through a form interface.
The application server in term communicates with database system to access data.

No comments:

Post a Comment