Distributed applications
Distributed applications can be divided into three application layers.
- The presentation layer manages the presentation logic, that is, the modes of interaction with the user: it covers the graphical interface and the rendering of information. This layer is also referred to as the application front end.
- The application logic (or business logic) layer deals with the functions to be made available to the user.
- Finally, the data access logic layer deals with information management, possibly with access to databases.
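As a concrete illustration, here is a minimal sketch of the three layers as separate components (all class and field names are hypothetical). In a single process they are just objects calling one another; in a distributed application each layer could run on a different tier.

```python
class DataAccessLayer:
    """Data access logic: information management (here, an in-memory store)."""
    def __init__(self):
        self._orders = {1: {"item": "book", "qty": 2}}

    def find_order(self, order_id):
        return self._orders.get(order_id)

class BusinessLogicLayer:
    """Application (business) logic: the functions offered to the user."""
    def __init__(self, data: DataAccessLayer):
        self._data = data

    def order_summary(self, order_id):
        order = self._data.find_order(order_id)
        if order is None:
            return None
        return f"{order['qty']} x {order['item']}"

class PresentationLayer:
    """Presentation logic: how results are rendered to the user (front end)."""
    def __init__(self, logic: BusinessLogicLayer):
        self._logic = logic

    def show_order(self, order_id):
        summary = self._logic.order_summary(order_id)
        print(summary if summary else f"Order {order_id} not found")

# Wiring the layers together; in a distributed deployment each layer
# could live in a separate process on a separate machine.
PresentationLayer(BusinessLogicLayer(DataAccessLayer())).show_order(1)
```

The point of the separation is that the presentation layer never touches the data store directly: swapping the data access layer for, say, a remote database client would leave the other two layers unchanged.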
Distributed application: a definition
It is an application consisting of two or more processes running, in parallel, on separate machines connected by a communication network. The processes that constitute a distributed application cooperate by taking advantage of the services provided by the communication network.
These three application (software) layers can be installed on various hardware layers, called tiers, where each tier is a machine with its own processing capabilities. An application can be configured as:
- Single Tiered: the three tiers are hosted on a single machine or host;
- Two Tiered: the three layers are divided between a user machine, which hosts the presentation layer, and a server machine, which hosts the data access layer; the application logic layer can reside on the user or server side or be distributed between the two;
- Three Tiered: the three layers each reside on a dedicated machine: a user workstation (typically a PC), an application server, and a data management server.
The middleware layer, which combines access to networked distributed databases with control and communication objects, is also called the back end. In contrast, the presentation layer is commonly referred to as the front end.
The evolution of computer architectures
Information systems architecture indicates the set of technical and organizational choices that affect the development and use of the technological resources of a system. Information systems architectures have developed and evolved over the years from centralized systems to distributed systems, which are more responsive to the needs of decentralization and cooperation of modern organizations.
We speak of a centralized information system when data and applications reside in a single processing node.
A distributed computing system is one that realizes at least one of the following situations:
- applications, cooperating with each other, reside on multiple processing nodes (distributed processing);
- a unified information asset is hosted on multiple processing nodes (distributed database).
In general terms, then, a distributed system consists of a set of logically independent nodes that collaborate in the pursuit of common goals through a hardware and software communication infrastructure.
Centralized Systems
Centralized systems originated with modern computing in the 1950s and developed in the 1960s and 1970s through the evolution of mainframes, the introduction of timesharing operating systems, and the development of centralized database management systems based on the hierarchical and network models. The emergence and development in the 1970s and 1980s of new, cheaper technologies, both in hardware and in versatile, easy-to-use data management structures, led to the crisis of the centralized model and promoted the implementation of distributed systems.
Nevertheless, in the early 1990s the distributed model came under heavy criticism for the increased complexity of design and management.
Server farms
The physical tiers we analyzed earlier can also be realized as a server farm, which is seen by the other tiers as if it were a single resource. A server farm consists of a set of processors that share applications and data.
A server farm, or server cluster, is a collection of computer servers, usually maintained by a company, to meet server needs far beyond the capacity of a single machine. Server farms are often made up of thousands of computers that require large amounts of power to run and keep cool. At optimum performance levels, a server farm has enormous costs, both financial and environmental.
Server farms often have backup servers that can take over the function of the primary servers in the event of a primary server failure. Server farms are typically collocated with the network switches and/or routers that enable communication between the different parts of the cluster and the users of the cluster.
The computers, routers, power supplies and associated electronics are typically mounted on 19-inch racks in a server room or data centre. The image below shows one of Google’s server farms in Council Bluffs, Iowa, which provides over 115,000 square feet of space for servers running services such as Search and YouTube.
Server farms can be built according to two design principles:
- cloning;
- partitioning.
Cloning
In the former case, each node hosts the same software applications and the same data, thus forming clones. Requests are then dispatched to the various clones through a load-balancing system.
Load balancing is a technique that distributes the processing load among several servers, improving the scalability and reliability of the architecture as a whole. For example, if 20 requests for a Web page arrive at a cluster of 4 servers, the first 5 are answered by the first server, the next 5 by the second, and so on. Scalability comes from the fact that new servers can be added to the cluster as needed, while the increased reliability comes from the fact that the failure of one server does not compromise service delivery; the cluster thus also provides fault tolerance. In fact, load-balancing systems integrate monitoring systems that automatically exclude unreachable servers from the cluster, thereby avoiding incorrect responses to client requests.
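The following sketch illustrates the mechanism just described (all names are hypothetical, and a real balancer would probe servers itself rather than be told of failures): a round-robin dispatcher over a cluster of four servers that skips any server its monitoring has marked unreachable.

```python
import itertools

class LoadBalancer:
    def __init__(self, servers):
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._rr = itertools.cycle(self._servers)  # round-robin order

    def mark_down(self, server):
        self._healthy.discard(server)   # monitoring detected a failure

    def mark_up(self, server):
        self._healthy.add(server)       # server rejoined the cluster

    def route(self, request):
        if not self._healthy:
            raise RuntimeError("no healthy servers in the cluster")
        # Skip servers currently marked as unreachable.
        for server in self._rr:
            if server in self._healthy:
                return f"request {request!r} -> {server}"

lb = LoadBalancer(["srv1", "srv2", "srv3", "srv4"])
lb.mark_down("srv3")                    # a failed server is excluded
for i in range(6):
    print(lb.route(i))                  # requests rotate over srv1, srv2, srv4
```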
A set of clones dedicated to performing a particular service is called a RACS (Reliable Array of Cloned Services): if one clone suffers a failure, another node can continue to deliver that service.
A RACS can be deployed in two configurations:
- shared nothing;
- shared disk.
In the first configuration, the stored data are replicated on each clone and reside on a hard disk local to that clone; an update of the data must therefore be applied to each of the clones. This configuration offers excellent performance for read-only applications, such as serving static pages or downloading files or images from Web servers.
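A minimal sketch of the shared-nothing idea, with illustrative names: every clone holds a full local copy of the data, so a write must fan out to all clones, while a read touches only one of them.

```python
class Clone:
    def __init__(self, name):
        self.name = name
        self.store = {}                 # data on the clone's local disk

class SharedNothingRACS:
    def __init__(self, clones):
        self.clones = clones

    def write(self, key, value):
        # An update must be applied to each clone to keep the copies aligned.
        for clone in self.clones:
            clone.store[key] = value

    def read(self, key, clone_index=0):
        # A read touches only one clone; ideal for read-mostly workloads.
        return self.clones[clone_index].store.get(key)

racs = SharedNothingRACS([Clone("c1"), Clone("c2"), Clone("c3")])
racs.write("page:index", "<html>...</html>")
print(racs.read("page:index", clone_index=2))   # any clone can answer
```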
In the second configuration, also called a cluster, the clones share a storage server that manages the hard disks, as we can see from the image below.
Partitioning
Conversely, the partitioning technique involves the duplication of hardware and software but not of the data, which are instead partitioned among the nodes. Each node then performs a specialized function; for example, a company's sales Web system may be partitioned by customer type or by product line, with each partition managed by a node.
Partitioning is transparent to applications: requests are sent to the partition that holds the relevant data. For example, if partitioning is implemented by commodity type (commodity1, commodity2, etc.), requests for access to commodity1 are routed to the server that can access that commodity's data. However, since each piece of data is stored on a single server, a failure of that server makes the part of the service it handles inaccessible.
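As a sketch of this routing (names such as commodity1 and node-A are illustrative), a simple lookup table can stand in for the request router:

```python
# Data is split by commodity type; each partition lives on exactly one node,
# and the router sends each request to the node holding the relevant data.
PARTITIONS = {
    "commodity1": "node-A",
    "commodity2": "node-B",
    "commodity3": "node-C",
}

def route_request(commodity_type):
    node = PARTITIONS.get(commodity_type)
    if node is None:
        raise KeyError(f"no partition for {commodity_type!r}")
    return node

print(route_request("commodity2"))      # -> node-B
# If node-B fails, only commodity2 data becomes unavailable:
# the other partitions keep serving their share of the service.
```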
This characteristic is known as the graceful degradation (partial degradation) property of distributed systems: unlike a centralized system, in the event of a failure the whole system does not become inaccessible; only some functionality is no longer available. To solve the problem of the unavailability of some application functionality in case of failure, the individual servers constituting a partition are often cloned, thus creating packs. This is then referred to as a RAPS (Reliable Array of Partitioned Services), a solution that ensures both scalability and service availability.
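A closing sketch, again with hypothetical names, of how a RAPS combines the two techniques: each partition is served by a small pack of clones, and a request falls over to the next live clone within the pack.

```python
class Pack:
    def __init__(self, clones):
        self.clones = clones            # cloned servers holding the same partition
        self.failed = set()

    def serve(self, request):
        for clone in self.clones:       # fail over to the next live clone
            if clone not in self.failed:
                return f"{request} served by {clone}"
        return f"{request} failed: whole pack is down"

raps = {
    "commodity1": Pack(["A1", "A2"]),
    "commodity2": Pack(["B1", "B2"]),
}

raps["commodity2"].failed.add("B1")     # one clone of the pack fails
print(raps["commodity2"].serve("get commodity2 price"))   # B2 takes over
```

Partitioning alone gives scalability but leaves each partition as a single point of failure; cloning each partition restores availability, which is exactly the combination the RAPS design aims for.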