Wednesday, September 1, 2010

Filing and File Structure

1. We have been looking mostly at the higher-level models of a database. At the conceptual or logical level the database was viewed as
o A collection of tables (relational model).
o A collection of classes of objects (object-oriented model).
2. The logical model is the correct level for database users to focus on. However, performance depends on the efficiency of the data structures used to represent data in the database, and on the efficiency of operations on these data structures.

Overview of Physical Storage Media
1. Several types of data storage exist in most computer systems. They vary in speed of access, cost per unit of data, and reliability.
o Cache: the most costly and fastest form of storage. Usually very small, and managed by the computer system hardware.
o Main Memory (MM): the storage area for data available to be operated on.
  General-purpose machine instructions operate on main memory.
  Contents of main memory are usually lost in a power failure or "crash".
  Usually too small (even with many megabytes) and too expensive to store the entire database.
o Flash memory: EEPROM (electrically erasable programmable read-only memory).
  Data in flash memory survives power failure.
  Reading data from flash memory takes about 10 nanoseconds (roughly as fast as reading from main memory); writing data into flash memory is more complicated: a write takes about 4-10 microseconds.
  To overwrite what has been written, one must first erase an entire bank of the memory. Flash memory may support only a limited number of erase cycles.
  It has found popularity as a replacement for magnetic disks for storing small volumes of data (5-10 megabytes).
o Magnetic-disk storage: primary medium for long-term storage.
 Typically the entire database is stored on disk.
 Data must be moved from disk to main memory in order for the data to be operated on.
 After operations are performed, data must be copied back to disk if any changes were made.
 Disk storage is called direct access storage as it is possible to read data on the disk in any order (unlike sequential access).
 Disk storage usually survives power failures and system crashes.
o Optical storage: CD-ROM (compact-disk read-only memory), WORM (write-once read-many) disk (for archival storage of data), and Juke box (containing a few drives and numerous disks loaded on demand).
o Tape Storage: used primarily for backup and archival data.
 Cheaper, but much slower access, since tape must be read sequentially from the beginning.
 Used as protection from disk failures.
2. The storage-device hierarchy is shown in the figure below: the higher the level, the higher the cost per bit and the faster the access time, but the smaller the capacity.

Figure: Storage-device hierarchy
3. Another classification: Primary, secondary, and tertiary storage.
1. Primary storage: the fastest storage media, such as cache and main memory.
2. Secondary (or on-line) storage: the next level of the hierarchy, e.g., magnetic disks.
3. Tertiary (or off-line) storage: magnetic tapes and optical disk juke boxes.
4. Volatility of storage. Volatile storage loses its contents when the power is removed. Without power backup, data in the volatile storage (the part of the hierarchy from main memory up) must be written to nonvolatile storage for safekeeping.

Organization of records into blocks
1. Each file is partitioned into fixed-length storage units, called blocks, which are the units of both storage allocation and data transfer.
2. It is desirable to keep as many blocks as possible in main memory. Usually, we cannot keep all blocks in main memory, so we need to manage the allocation of available main memory space.
3. We need to use disk storage for the database, and to transfer blocks of data between main memory and disk. We also want to minimize the number of such transfers, as they are time-consuming.
4. The buffer is the part of main memory available for storage of copies of disk blocks.
Buffer manager
1. The subsystem responsible for the allocation of buffer space is called the buffer manager.
o The buffer manager handles all requests for blocks of the database.
o If the block is already in main memory, the address in main memory is given to the requester.
o If not, the buffer manager must read the block in from disk (possibly displacing some other block if the buffer is full) and then pass the address in main memory to the requester.
2. The buffer manager must use some sophisticated techniques in order to provide good service:
o Replacement Strategy -- When there is no room left in the buffer, some block must be removed to make way for the new one. Typical operating-system memory-management schemes use a "least recently used" (LRU) method: simply remove the block least recently referenced. This can be improved upon for database applications.
o Pinned Blocks -- For the database to be able to recover from crashes, we must restrict the times when a block may be written back to disk. A block that is not allowed to be written back is said to be pinned. Many operating systems do not provide support for pinned blocks, yet such a feature is essential if a database is to be "crash resistant".
o Forced Output of Blocks -- Sometimes it is necessary to write a block back to disk even though its buffer space is not needed (this is called the forced output of a block). This is because main-memory contents (and thus the buffer) are lost in a crash, while disk data usually survives.
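The ideas above — LRU replacement, pinned blocks, and forced output — can be combined into a minimal sketch. This is an illustration only, not a real buffer manager; the class, the `read_disk`/`write_disk` callbacks, and all names are hypothetical:

```python
from collections import OrderedDict

class BufferManager:
    """Minimal buffer-manager sketch: LRU replacement, pinned blocks,
    and forced output. read_disk/write_disk stand in for real disk I/O."""

    def __init__(self, capacity, read_disk, write_disk):
        self.capacity = capacity
        self.read_disk = read_disk      # block_id -> data
        self.write_disk = write_disk    # (block_id, data) -> None
        self.pool = OrderedDict()       # blocks in least- to most-recently-used order
        self.pinned = set()             # blocks that must not be written back
        self.dirty = set()              # blocks modified since being read

    def get_block(self, block_id):
        """Return a block's in-memory copy, reading it from disk if needed."""
        if block_id in self.pool:
            self.pool.move_to_end(block_id)   # mark as most recently used
            return self.pool[block_id]
        if len(self.pool) >= self.capacity:
            self._evict()                     # make room: displace some block
        self.pool[block_id] = self.read_disk(block_id)
        return self.pool[block_id]

    def _evict(self):
        """Remove the least recently used block that is not pinned."""
        for victim in list(self.pool):        # OrderedDict iterates in LRU order
            if victim not in self.pinned:
                if victim in self.dirty:      # write back changes before discarding
                    self.write_disk(victim, self.pool[victim])
                    self.dirty.discard(victim)
                del self.pool[victim]
                return
        raise RuntimeError("all blocks pinned; no block can be evicted")

    def pin(self, block_id):
        self.pinned.add(block_id)

    def unpin(self, block_id):
        self.pinned.discard(block_id)

    def mark_dirty(self, block_id):
        self.dirty.add(block_id)

    def force_output(self, block_id):
        """Write a block to disk even though its buffer space is not needed."""
        if block_id in self.dirty:
            self.write_disk(block_id, self.pool[block_id])
            self.dirty.discard(block_id)
```

Note that eviction skips pinned blocks entirely; a block being used for crash recovery stays in memory until it is unpinned.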


Buffer replacement policies
1. Replacement Strategy: Goal is minimization of accesses to disk. Generally it is hard to predict which blocks will be referenced. So operating systems use the history of past references as a guide to prediction.
o General Assumption: Blocks referenced recently are likely to be used again.
o Therefore, if we need space, throw out the least recently referenced block (LRU replacement scheme).
2. LRU is acceptable in operating systems; however, a database system is often able to predict future references more accurately.

3. Consider processing of the relational algebra expression

   borrower ⋈ customer

(the natural join of borrower and customer on cname).

4. Further, assume the strategy to process this request is given by the following pseudo-code:
for each tuple b of borrower do
    for each tuple c of customer do
        if b[cname] = c[cname]
        then begin
            let x be a tuple defined as follows:
                x[cname]  := b[cname]
                x[loan#]  := b[loan#]
                x[street] := c[street]
                x[city]   := c[city]
            include tuple x as part of result of borrower ⋈ customer
        end
    end
end
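The pseudo-code can be expressed as runnable Python, with tuples represented as dictionaries; the sample data below is hypothetical:

```python
def nested_loop_join(borrower, customer):
    """Runnable version of the pseudo-code: nested-loop join of
    borrower and customer on cname, with tuples as dictionaries."""
    result = []
    for b in borrower:                 # outer loop over borrower tuples
        for c in customer:             # inner loop over customer tuples
            if b["cname"] == c["cname"]:
                # Build the joined tuple x.
                result.append({"cname": b["cname"], "loan#": b["loan#"],
                               "street": c["street"], "city": c["city"]})
    return result

# Hypothetical sample tuples for illustration.
borrower = [{"cname": "Adams", "loan#": "L-16"}]
customer = [{"cname": "Adams", "street": "Spring", "city": "Pittsfield"}]
joined = nested_loop_join(borrower, customer)
```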

5. Assume that the two relations in this example are stored in different files.
o Once a tuple of borrower has been processed, it is not needed again. Therefore, once processing of an entire block of tuples is finished, that block is not needed in main memory, even though it has been used very recently.
o The buffer manager should free the space occupied by a borrower block as soon as it is processed. This strategy is called toss-immediate.
o Consider blocks containing customer tuples.
o Every block of customer tuples must be examined once for every tuple of the borrower relation. When processing of a customer block is completed, it will not be used again until all other customer blocks have been processed. This means the most recently used (MRU) block will be the last block to be re-referenced, and the least recently used block will be referenced next.
o This is the opposite of LRU assumptions. So for inner block, use MRU strategy -- if a customer block must be removed from the buffer, choose MRU block.
o For MRU strategy, the system must pin the customer block currently being processed until the last tuple has been processed. Then it is unpinned, becoming the most recently used block.
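The MRU victim choice can be sketched in a few lines; the pool and block names here are hypothetical, and the pool is assumed to be ordered from least to most recently used:

```python
from collections import OrderedDict

def choose_victim_mru(pool, pinned):
    """Pick the most recently used unpinned block to evict (MRU policy).
    pool is ordered from least to most recently used."""
    for block_id in reversed(pool):    # scan from most recently used end
        if block_id not in pinned:
            return block_id
    return None                        # everything is pinned; nothing evictable

# Hypothetical customer blocks; c3 is most recently used but pinned
# (it is still being processed), so c2 is chosen under MRU.
pool = OrderedDict([("c1", None), ("c2", None), ("c3", None)])
victim = choose_victim_mru(pool, pinned={"c3"})
```

Under LRU the victim would have been c1; MRU deliberately picks the opposite end, matching the re-reference pattern of the inner relation.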
6. The buffer manager may also use statistical information regarding the probability that a request will reference a particular relation.
o The data dictionary is the most frequently-used part of the database. It should, therefore, not be removed from main memory unless necessary.
o File indices are also frequently used, and should generally be in main memory.
o No single strategy is known that handles all possible scenarios well.
o Many database systems use LRU, despite its faults.
Concurrency and recovery may require other buffer-management strategies, such as delayed buffer-out or forced output.

Organization of Records in Files
There are several ways of organizing records in files.
• heap file organization. Any record can be placed anywhere in the file where there is space for the record. There is no ordering of records.
• sequential file organization. Records are stored in sequential order, based on the value of the search key of each record.
• hashing file organization. A hash function is computed on some attribute of each record. The result of the function specifies in which block of the file the record should be placed.
• clustering file organization. Records of several different relations can be stored in the same file. Related records of the different relations are stored on the same block so that one I/O operation fetches related records from all the relations.
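The hashing organization above is the easiest to sketch concretely. Below is a minimal illustration, assuming a hypothetical file of NUM_BLOCKS blocks and cname as the hashed attribute:

```python
NUM_BLOCKS = 8   # hypothetical number of blocks in the file

def block_for_record(record, num_blocks=NUM_BLOCKS):
    """Hashing file organization: a hash function computed on the
    search-key attribute (here cname) selects the block that should
    hold the record."""
    return hash(record["cname"]) % num_blocks

# Place a few hypothetical records into their blocks.
blocks = [[] for _ in range(NUM_BLOCKS)]
for rec in ({"cname": "Adams"}, {"cname": "Curry"}, {"cname": "Hayes"}):
    blocks[block_for_record(rec)].append(rec)
```

Lookup by cname then needs to examine only one block: recompute the hash and scan that block's records.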

Sequential File Organization
1. A sequential file is designed for efficient processing of records in sorted order on some search key.
o Records are chained together by pointers to permit fast retrieval in search key order.
o Pointer points to next record in order.
o Records are stored physically in search key order (or as close to this as possible).
o This minimizes number of block accesses.

2. It is difficult to maintain physical sequential order as records are inserted and deleted.
o Deletion can be managed with the pointer chains.
o Insertion poses problems if there is no space where the new record should go.
o If there is space, use it, else put new record in an overflow block.
o Adjust pointers accordingly.
o Problem: we now have some records out of physical sequential order.
o If there are very few records in overflow blocks, this approach works well.
o If order is lost, reorganize the file.
o Reorganizations are expensive and done when system load is low.
3. If insertions rarely occur, we could keep the file in physically sorted order and reorganize when insertion occurs. In this case, the pointer fields are no longer required.
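The pointer-chain idea can be sketched as follows. This is a simplified in-memory illustration with hypothetical names; a real system chains records across disk blocks and spills into overflow blocks when a home block is full:

```python
class SequentialFile:
    """Sketch of a sequential file: records chained by pointers in
    search-key order. Insertion splices a record into the chain, so
    retrieval in key order works even if physical order is lost."""

    def __init__(self):
        self.head = None                   # first record in key order

    def insert(self, key, data):
        node = {"key": key, "data": data, "next": None}
        # New smallest key (or empty file): record becomes the chain head.
        if self.head is None or key < self.head["key"]:
            node["next"] = self.head
            self.head = node
            return
        # Walk the chain to find the insertion point.
        cur = self.head
        while cur["next"] is not None and cur["next"]["key"] < key:
            cur = cur["next"]
        node["next"] = cur["next"]         # adjust pointers; the record's
        cur["next"] = node                 # physical position may differ

    def scan(self):
        """Yield keys in search-key order by following the pointer chain."""
        cur = self.head
        while cur is not None:
            yield cur["key"]
            cur = cur["next"]
```

Even when inserted out of order, a scan that follows the pointers always visits records in sorted order, which is exactly what the sequential organization promises.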

Clustering File Organization
1. One relation per file, with fixed-length records, works well for small databases and also reduces the code size.
2. Many large-scale database systems do not rely directly on the underlying operating system for file management. Instead, one large OS file is allocated to the database system, and all relations are stored in that one file.
3. To efficiently execute queries involving the join customer ⋈ depositor, one may store the depositor tuple for each cname near the customer tuple for the corresponding cname.
4. This structure mixes together tuples from two relations, but allows for efficient processing of the join.
5. If the customer has many accounts which cannot fit in one block, the remaining records appear on nearby blocks. This file structure, called clustering, allows us to read many of the required records using one block read.
6. Our use of clustering enhances the processing of a particular join but may result in slow processing of other types of queries, such as selection on customer.
For example, the query
select * from customer

now requires more block accesses, as the customer relation is interspersed with the depositor relation.
7. Thus it is a trade-off, depending on the types of query that the database designer believes to be most frequent. Careful use of clustering may produce significant performance gain.
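A toy sketch of the clustering layout, assuming hypothetical customer and depositor tuples related on cname:

```python
def cluster(customers, depositors):
    """Clustering sketch: each customer tuple is placed immediately
    before the depositor tuples with the same cname, so one block read
    can fetch related records of both relations."""
    by_cname = {}
    for d in depositors:
        by_cname.setdefault(d["cname"], []).append(d)
    file_records = []                      # physical order within the file
    for c in customers:
        file_records.append(("customer", c))
        for d in by_cname.get(c["cname"], []):
            file_records.append(("depositor", d))
    return file_records

# Hypothetical sample tuples for illustration.
layout = cluster(
    [{"cname": "Hayes", "street": "Main", "city": "Harrison"}],
    [{"cname": "Hayes", "account#": "A-102"}],
)
```

The trade-off described above is visible here: a join on cname reads related tuples together, but a full scan of customer alone must now step over the interleaved depositor tuples.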
