MTTM Shimla Syllabus: MTA – (II) ( 12) : ELECTRONIC DATA PROCESSING- module 2

Module – 2 : File Organisation and Processing: Fields, Records, Files, Types of files, Serial, Sequential, Index Sequential and Random files, File Organisations, Batch Processing, Real time processing, Time sharing, Multi Processing, Multi Programming, Client Server processing.

Fields: A field in Microsoft Access is a piece of information related to a single person or thing. Related fields are grouped together to form a record.

Records: In a database, a record (sometimes called a row) is a group of fields within a table that are relevant to a specific entity. For example, in a table called customer contact information, a row would likely contain fields such as ID number, name, street address, city, telephone number and so on.
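The relationship between fields, records and a table can be sketched in a few lines of Python. This is only an illustration: the class name `CustomerContact`, the field names and the sample values are made up for the example and do not come from any real database.

```python
from dataclasses import dataclass, fields


@dataclass
class CustomerContact:
    """One record: a group of related fields describing a single customer."""
    customer_id: int      # each attribute below is one field
    name: str
    street_address: str
    city: str
    telephone: str


# A table (or file) is then a collection of such records (rows).
# The values are invented sample data.
customers = [
    CustomerContact(1, "Asha Verma", "12 Mall Road", "Shimla", "0177-0000001"),
    CustomerContact(2, "Rohit Negi", "4 Lake View", "Solan", "01792-000002"),
]

# The field names shared by every record in the table.
field_names = [f.name for f in fields(CustomerContact)]
```

Every record has the same set of fields, which is what allows a table to be read column by column as well as row by row.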

Files:

A file is a logical collection of information stored on secondary storage such as a hard disk. Physically, a file is the smallest allotment of secondary storage on a device such as a disk. Logically, a file is a sequence of logical records, which is ultimately a sequence of bits and bytes. Files can contain data or programs (both source and object programs). Data files can be numeric, alphabetic, alphanumeric or binary. A file has various attributes such as name, type, location, size, protection, and time and date of creation.

Type of files:

In order to support different types of files, operating systems support two-part file names. The two parts are a name and an extension, separated by a period (dot). For example, a file can be named program.c, where program is the name and .c is the extension.

File Type	Extension	Meaning
Executable file	.exe, .com, .bin	Ready-to-run machine language program
Object file	.obj, .o	Compiled, machine language but not linked
Source code file	.c, .cc, .java, .pas, .asm, .a, .ftn	Represents source code in different languages such as C, Java, Pascal, assembly language or Fortran.
Batch file	.bat, .sh	Commands to the command interpreter.
Text file	.txt, .doc	Documentation
Library file	.lib, .dll	Libraries of routines for programmers
Backup file	.bak	Used for taking backup of some program file
Multimedia file	.mpeg, .mov, .rm	Binary file containing audio or audio/video information
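
As a rough illustration of how the extension identifies the file type, the sketch below splits a two-part file name and looks the extension up in a small table. The mapping `FILE_TYPES` and the function `classify` are hypothetical names, and the table covers only a subset of the extensions listed above.

```python
from pathlib import PurePath

# Illustrative subset of the table above; the mapping name is hypothetical.
FILE_TYPES = {
    ".exe": "Executable file",
    ".obj": "Object file",
    ".c": "Source code file",
    ".bat": "Batch file",
    ".txt": "Text file",
    ".lib": "Library file",
    ".bak": "Backup file",
    ".mpeg": "Multimedia file",
}


def classify(filename: str) -> str:
    """Return the file type suggested by the extension, or 'Unknown'."""
    extension = PurePath(filename).suffix.lower()  # text from the last dot onward
    return FILE_TYPES.get(extension, "Unknown")
```

For example, `classify("program.c")` gives "Source code file", while a name with no extension falls through to "Unknown".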

Access Methods:

Files are used to store data. The information present in the file can be accessed by various methods. Thus, the way of retrieving data from a file is known as access method. Different systems use different access methods. The various access methods used are:

1. Sequential access

It is the simplest and most commonly used access method. Information in the file is accessed in the order in which it is stored, i.e. one record after the other, starting at the beginning and proceeding to the end of the file.

2. Direct access

In the direct access method, the records of a file can be accessed in any order. The various records can be read or written randomly. In this way the records can be accessed by key, rather than by position.

3. Indexed access

In this method, an index is created for the file. This index contains pointers to the various blocks of the file, just like the index at the back of a book. To find a record in the file, the index is searched first and then the pointer from the index is used to access that record directly.
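
The three access methods can be compared on one small file. The sketch below is a simplified illustration and makes several assumptions: the records are fixed length (a 4-byte key plus a 16-byte name), the file lives in memory (`io.BytesIO`) instead of on disk, direct access is done by record number, and the index is a plain dictionary from key to byte offset. All names and sample values are hypothetical.

```python
import io
import struct

RECORD = struct.Struct("<i16s")  # 4-byte integer key + 16-byte name = 20 bytes


def pack(key: int, name: str) -> bytes:
    return RECORD.pack(key, name.encode("ascii"))


def unpack(raw: bytes):
    key, name = RECORD.unpack(raw)
    return key, name.rstrip(b"\x00").decode("ascii")


# Build a small file of fixed-length records and, alongside it, an index
# that maps each key to the byte offset where its record starts.
data_file = io.BytesIO()
index = {}
for key, name in [(30, "Meera"), (10, "Arun"), (20, "Kiran")]:
    index[key] = data_file.tell()
    data_file.write(pack(key, name))


def read_sequential(f):
    """Sequential access: read every record from the beginning to the end."""
    f.seek(0)
    records = []
    while True:
        raw = f.read(RECORD.size)
        if len(raw) < RECORD.size:
            break
        records.append(unpack(raw))
    return records


def read_direct(f, record_number):
    """Direct access: jump straight to record n (numbered from 0)."""
    f.seek(record_number * RECORD.size)
    return unpack(f.read(RECORD.size))


def read_indexed(f, key):
    """Indexed access: look the key up in the index, then follow the pointer."""
    f.seek(index[key])
    return unpack(f.read(RECORD.size))
```

Only `read_sequential` touches every record; the other two functions move to one position and read a single record, which is why they remain fast as the file grows.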

File Organisations

A file organization is chosen so that each base relation can be stored and retrieved efficiently. For example, if we want to retrieve student records in alphabetical order of name, sorting the file by student name is a good file organization. However, if we want to retrieve all students whose marks are in a certain range, a file ordered by student name would not be a good file organization. Some file organizations are efficient for bulk loading data into the database but inefficient for retrieval and other activities.
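
The student example can be made concrete with a small sketch. It assumes the file is a Python list kept sorted by name; the records and the function names `find_by_name` and `find_by_marks` are invented for the illustration.

```python
import bisect

# Student records (name, marks) kept sorted by name, the ordering field.
students = sorted(
    [("Kiran", 72), ("Arun", 55), ("Meera", 91), ("Dev", 64)],
    key=lambda record: record[0],
)
names = [name for name, _ in students]


def find_by_name(name):
    """Binary search works because the file is ordered by name."""
    i = bisect.bisect_left(names, name)
    if i < len(names) and names[i] == name:
        return students[i]
    return None


def find_by_marks(low, high):
    """Marks are not the ordering field, so every record must be scanned."""
    return [record for record in students if low <= record[1] <= high]
```

A lookup by name needs only a handful of comparisons, whereas the range query on marks has to visit every record, which is the sense in which the same organization is good for one query and poor for another.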

Types of File Organization

In order to make an effective selection of file organizations and indexes, the different types of file organization are described here. These are:

• Heap File Organization: An unordered file, sometimes called a heap file, is the simplest type of file organization.

Records are placed in the file in the same order as they are inserted. A new record is inserted in the last page of the file; if there is insufficient space in the last page, a new page is added to the file. This makes insertion very efficient.

• Hash File Organization: In a hash file, records are not stored sequentially; instead, a hash function is used to calculate the address of the page in which the record is to be stored. The field on which the hash function is calculated is called the hash field, and if that field is also the key of the relation it is called the hash key.

• Indexed Sequential Access Method (ISAM) File Organization: In an ISAM system, data is organized into records which are composed of fixed-length fields. Records are stored sequentially. A secondary set of tables known as indexes contains pointers into the data file, allowing individual records to be retrieved without having to search the entire data set. An index is a data structure that allows the DBMS to locate particular records in a file more quickly and thereby speed up responses to user queries.

• B+-tree File Organization: A B+-tree is a more versatile storage structure than hashing. It supports retrievals based on exact key match, pattern matching, range of values, and part key specification. The B+-tree index is dynamic, growing as the relation grows.

• Cluster File Organization:

Some DBMSs, such as Oracle, support clustered and non-clustered tables. Clusters are groups of one or more tables physically stored together because they share common columns and are often used together.

Indexed Clusters

In an index cluster, records with the same cluster key are stored together. Oracle suggests using indexed clusters when:

• Queries retrieve records over a range of cluster key values;

• Clustered tables may grow unpredictably.

Clusters can improve retrieval performance, depending on the data distribution and on which SQL operations are most often performed on the data.

Hash Clusters

Hash clusters also cluster table data in a manner similar to index clusters. However, a record is stored in a hash cluster based on the result of applying a hash function to the record's cluster key value. All records with the same hash key value are stored together on disk.
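
The idea shared by hash files and hash clusters, that a hash function applied to the key decides where a record is stored, can be sketched as follows. This is a toy model under stated assumptions: pages are Python lists, the hash function is a simple modulo, and `NUM_PAGES`, `page_for`, `insert` and `lookup` are hypothetical names. It is not how Oracle or any other DBMS implements hashing.

```python
NUM_PAGES = 4  # assumed number of pages (buckets) in the file


def page_for(hash_key: int) -> int:
    """Hash function: map the hash key to a page address."""
    return hash_key % NUM_PAGES


pages = [[] for _ in range(NUM_PAGES)]


def insert(record: dict) -> None:
    """Store the record on the page chosen by the hash function."""
    pages[page_for(record["student_id"])].append(record)


def lookup(student_id: int):
    """Search only the one page the key hashes to, not the whole file."""
    for record in pages[page_for(student_id)]:
        if record["student_id"] == student_id:
            return record
    return None


# Invented sample records; 101 and 105 hash to the same page.
for sid, name in [(101, "Anita"), (102, "Bhanu"), (105, "Chetan")]:
    insert({"student_id": sid, "name": name})
```

Records whose keys hash to the same value end up on the same page, so an exact-match lookup reads one page. A range query gains nothing from this layout, because neighbouring key values are scattered across pages.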

Batch Processing: In a batch processing system, the various jobs of the users are collected in a queue. This process is known as spooling. SPOOLING is the short form of Simultaneous Peripheral Operations On Line.

Users did not interact directly with the computer system; they prepared a job that consisted of the program, data and some control information. This job was usually in the form of punched cards. The users submitted the job to a computer operator. When a batch of programs had been collected, the operator loaded the batch into the computer at one time, where the programs were executed one after the other. Finally, the operator retrieved the output of these jobs and returned it to the concerned users. In this way many different jobs were processed one after the other, without any interaction from the users during program execution.

· The batch processing operating system was called a monitor. Because it stays in main memory, it is known as the resident monitor.
· The batch monitor executes batches of jobs at definite intervals of time.
· The batch monitor accepts the commands for initializing, processing and terminating a batch.
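
The way a batch is worked through can be sketched as a simple first-in, first-out queue. The function `run_batch` and the three sample jobs are hypothetical; each job here is just a name, a program (an ordinary Python function) and its data.

```python
from collections import deque


def run_batch(jobs):
    """Run queued jobs one after another with no user interaction."""
    queue = deque(jobs)                        # jobs collected by the operator
    outputs = []
    while queue:
        name, program, data = queue.popleft()  # first in, first out
        outputs.append((name, program(data)))  # output is collected, not shown live
    return outputs


batch = [
    ("job1", sum, [1, 2, 3]),
    ("job2", max, [4, 9, 2]),
    ("job3", len, "payroll"),
]
results = run_batch(batch)
```

The outputs come back only after the whole batch has run, which mirrors the operator returning printouts to users once processing is finished.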

Real Time processing

In a real time operating system, a job must be completed within rigid time constraints, otherwise the job loses its meaning. A real time system functions correctly only if it returns the correct result within its time constraints. Thus, in a real-time system, the correctness of the computation depends not only upon the logical correctness of the computation but also upon the time at which the result is produced. A real time system is often used as the control device in a dedicated application such as fuel-injection systems, robotics, air-traffic control, medical imaging systems, systems that control scientific experiments, industrial control systems and weapon systems.
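
The two-part notion of correctness (right answer and on time) can be expressed in a small sketch. The function `run_real_time` is a hypothetical helper. Ordinary Python on a general purpose operating system gives no real-time guarantees, so this only shows the check, not a real-time system.

```python
import time


def run_real_time(task, deadline_seconds):
    """Run task and return (result, met_deadline).

    In a real-time setting a late result counts as a failure even if
    the value itself is logically correct.
    """
    start = time.monotonic()
    result = task()
    elapsed = time.monotonic() - start
    return result, elapsed <= deadline_seconds
```

A caller would treat a `False` second value as a failed job and discard the result, however accurate it may be.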

Time sharing

Time sharing refers to the allocation of computer resources in a time dependent fashion to several programs simultaneously. A time sharing system has many user terminals that are connected to the same computer simultaneously. Using these terminals, different users can work on the system at the same time. In a timesharing system, the CPU time is divided among all the users on a scheduled basis. Each user program is allocated a very short period of CPU time in turn, beginning with the first user program and proceeding to the last one, and then beginning again from the first one. This short period of time during which a user gets the attention of the CPU is known as a time slice, time slot or quantum. Thus, in timesharing, when the CPU is allocated to a user program, the program uses the CPU for the period of the time slot. It releases the CPU under any of the following three conditions:

1. When the allotted time slice expires

2. When the program needs to perform I/O operations.

3. When the execution of the program is over during the time slice.

Even though it may appear that several users are using computer system at the same time, a single CPU system can only execute one instruction at a time. Thus, like multiprogramming, even with a timesharing system, only one program can be in control of the CPU at any given time.
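
The rotation of the CPU among user programs can be simulated in a few lines. The sketch assumes each program needs a known amount of CPU time and ignores I/O, so only two of the three release conditions appear (slice expired, program finished). The function `round_robin` and the sample programs A, B and C are hypothetical.

```python
from collections import deque


def round_robin(burst_times, time_slice):
    """Simulate time sharing; return the turns taken and the finishing order."""
    ready = deque(burst_times.items())   # (program, remaining CPU time)
    timeline = []                        # (program, units run) for each turn
    finished = []
    while ready:
        program, remaining = ready.popleft()
        run = min(time_slice, remaining)
        timeline.append((program, run))
        remaining -= run
        if remaining > 0:
            ready.append((program, remaining))  # slice expired: back of the queue
        else:
            finished.append(program)            # program over within its slice
    return timeline, finished


timeline, finished = round_robin({"A": 5, "B": 2, "C": 4}, time_slice=2)
```

With a time slice of 2 the turns are A, B, C, A, C, A, and the programs finish in the order B, C, A. At every step exactly one program holds the CPU.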

Multi Processing

A multiprocessor system is a system that contains two or more processors (CPUs) and is able to execute several programs simultaneously. Hence the name ‘multi-processor’.

In such a system, multiple processors share the clock, bus, memory and peripheral devices. A multiprocessor system is also known as parallel system.

In such a system, instructions from different and independent programs can be processed at the same instant of time by different CPUs.

In this system, the CPUs may simultaneously execute different instructions from the same program.

Multi Programming

A multiprogramming operating system allows multiple programs to be executed concurrently using a single CPU, i.e. their executions overlap in time even though only one program runs at any instant. In multiprogramming, several processes are kept in main memory and the CPU executes them concurrently: whenever the running process has to wait (for example, for I/O), the CPU immediately switches to another process that is ready to be executed.

The advantages of Multiprogramming are:

1. Increased throughput:

Throughput is increased by utilizing the idle-time of CPU for running other programs that are already present in the main memory.

2. Lowered response time:

Response time is lowered by recognizing the priority of a job as it enters the system and by processing jobs on a priority basis.

3. Ability to assign priorities to Jobs:

Most multiprogramming systems have schemes for setting priorities for rotating programs. They specify when the CPU will rotate to another program, and which program it will rotate to.
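
One simple way to decide which program the CPU rotates to is a priority queue. The sketch below assumes that a smaller number means a higher priority and that jobs with equal priority run in arrival order. Both rules, the function `schedule` and the job names are choices made for this example and not something the scheme above prescribes.

```python
import heapq


def schedule(jobs):
    """Return job names in the order a priority scheduler would run them.

    jobs: list of (priority, name); a smaller number means higher priority
    (an assumption made for this sketch).
    """
    heap = []
    for arrival_order, (priority, name) in enumerate(jobs):
        # arrival_order breaks ties: equal priorities run first come, first served
        heapq.heappush(heap, (priority, arrival_order, name))
    order = []
    while heap:
        _, _, name = heapq.heappop(heap)
        order.append(name)
    return order
```

For instance, `schedule([(3, "report"), (1, "payroll"), (2, "backup"), (1, "billing")])` runs payroll and billing first, then backup, then report.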

Client Server System

Distributed System

A distributed system is a collection of processors located in geographically dispersed physical locations. In this system, the workload is divided between two or more computers that are linked together by a communication network. That is, the different processors communicate using communication links, such as telephone lines and buses. The processors in a distributed system vary in size and function. They may include small microprocessors, workstations, microcomputers, mainframe computers and large general purpose computers. The various processors are also called sites, nodes, hosts or machines. The purpose of a distributed system is to provide an efficient and convenient environment for sharing resources.

Saturday, 8 July 2017