In this article I aim to define, in as much depth as possible, what a web system is and to establish a taxonomy that allows such systems to be classified properly. The categories of this taxonomy can point to factors that affect the complexity of a system and help describe its nature.
Web systems are characterized by their client-server model. They are, hence, distributed applications, with potentially many clients accessing and operating on the same server concurrently. Communication happens over a network, typically the Internet. The communication protocols used are HTTP or HTTPS, which usually run over TCP/IP.
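The client-server exchange described above can be sketched with Python's standard library alone. This is a minimal illustration, not a production setup: the handler name `HelloHandler` and the helper `fetch_from_temporary_server` are hypothetical, and the server only listens on the local machine on a port chosen by the operating system.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Server side: answers every GET request with a short text body."""

    def do_GET(self):
        body = f"You requested {self.path}".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_from_temporary_server(path):
    """Start a local server, act as a client once, then shut the server down."""
    # Port 0 asks the operating system for any free port.
    server = ThreadingHTTPServer(("127.0.0.1", 0), HelloHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        # Client side: an HTTP request travelling over a TCP connection.
        connection = HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", path)
        response = connection.getresponse()
        result = (response.status, response.read().decode("utf-8"))
        connection.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
```

`ThreadingHTTPServer` handles each request in its own thread, which is one simple way for a single server to attend to several clients concurrently.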
URI as resource identifier
Uniform Resource Identifiers (URIs) are strings of characters that unambiguously identify a resource and enable interaction with it over a network. Uniform Resource Locators (URLs), a subset of URIs, are commonly known as “web addresses”, even though they can identify other types of services as well. Their use is very characteristic of web systems.
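To make the structure of a URL concrete, the sketch below splits one into its components with `urllib.parse.urlsplit` from the Python standard library. The address itself is a made-up example and does not refer to a real resource.

```python
from urllib.parse import urlsplit

# Hypothetical URL, used only for illustration.
url = "https://example.com:8443/articles/42?lang=en#comments"

parts = urlsplit(url)

print(parts.scheme)    # protocol used to reach the resource
print(parts.hostname)  # machine that serves the resource
print(parts.port)      # network port on that machine
print(parts.path)      # location of the resource on the server
print(parts.query)     # extra parameters sent with the request
print(parts.fragment)  # position inside the resource, handled by the client
```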
Type of Communication
One differentiating factor is whether the clients are human users or other systems.
When a system provides information to be consumed by other systems, it is called a web service. In these systems, the outgoing information is formatted for machine consumption. Two popular formats for this purpose are XML and JSON, both of which are plain-text formats with their own syntax rules. Client systems can obtain information from the service and also, potentially, modify its state.
When the clients are human, the systems are usually called web applications. Human clients mostly access these systems through a web browser. The design of these systems is usually divided into the backend, which comprises the techniques that define how data is processed on the server, and the frontend, which controls how the information is presented to the user. Interaction happens through HTML forms most of the time.
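As an illustration of machine-oriented formatting, the following sketch serializes the same record as JSON and as XML using only the Python standard library. The record and the helper names `to_json` and `to_xml` are assumptions made for this example.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record that a web service might return.
article = {"id": 42, "title": "Web systems"}


def to_json(record):
    """Serialize the record as a JSON document."""
    return json.dumps(record)


def to_xml(record):
    """Serialize the record as an XML element with one child per field."""
    root = ET.Element("article")
    for key, value in record.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


print(to_json(article))
print(to_xml(article))
```

Both outputs are plain text that a client program can parse back into a data structure without human intervention.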
In this context, we distinguish stateful systems from stateless ones. A stateful system keeps information about each client on the server between requests.
Stateful systems are traditionally considered to be less scalable, as they present problems with, for example, load balancing. Stateless applications appeared as an answer to this. In them, the state information is stored on the client and sent to the server with every request.
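One possible way to keep state on the client is to hand it a token that carries the state plus a signature, so the server can trust it when it comes back. The sketch below assumes a shopping-cart state and a hard-coded `SECRET_KEY`; both are hypothetical, and real systems typically rely on an established mechanism such as signed cookies rather than hand-written code like this.

```python
import base64
import hashlib
import hmac
import json

SECRET_KEY = b"demo-secret"  # hypothetical key; a real one must stay private


def encode_state(state):
    """Pack the state into a signed token that the client stores."""
    payload = base64.urlsafe_b64encode(json.dumps(state, sort_keys=True).encode("utf-8"))
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode("ascii") + "." + signature


def decode_state(token):
    """Verify the signature and recover the state sent back by the client."""
    payload, _, signature = token.rpartition(".")
    expected = hmac.new(SECRET_KEY, payload.encode("ascii"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("state was modified outside the server")
    return json.loads(base64.urlsafe_b64decode(payload))


def handle_request(token, item):
    """Stateless handler: everything it needs arrives with the request."""
    state = decode_state(token) if token else {"cart": []}
    state["cart"].append(item)
    return encode_state(state)


first_token = handle_request(None, "book")
second_token = handle_request(first_token, "pen")
print(decode_state(second_token))
```

Because the handler keeps nothing between calls, any server instance holding the same key could process either request, which is what makes load balancing simpler.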
In static delivery, the information is delivered to the client using a representation of it exactly as it is stored. If the state of the system changes, the affected representations are regenerated, but this is not done in response to any client request. It is a state change, not a request, that triggers the regeneration of the representations.
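The idea that a state change, rather than a request, triggers regeneration can be sketched as follows. The class `StaticSite` and its methods are hypothetical names for this example, and HTML escaping is left out for brevity.

```python
import tempfile
from pathlib import Path


class StaticSite:
    """Keeps one pre-built HTML file per article."""

    def __init__(self, output_dir):
        self.output_dir = Path(output_dir)
        self.articles = {}

    def save_article(self, slug, title, body):
        # A state change is the only thing that rebuilds a representation.
        self.articles[slug] = (title, body)
        self._regenerate(slug)

    def _regenerate(self, slug):
        title, body = self.articles[slug]
        html = f"<h1>{title}</h1><p>{body}</p>"
        (self.output_dir / f"{slug}.html").write_text(html, encoding="utf-8")

    def serve(self, slug):
        # A request only reads the stored file; nothing is computed here.
        return (self.output_dir / f"{slug}.html").read_bytes()


with tempfile.TemporaryDirectory() as directory:
    site = StaticSite(directory)
    site.save_article("hello", "Hello", "First version")
    first = site.serve("hello")
    site.save_article("hello", "Hello", "Second version")
    second = site.serve("hello")
```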
In dynamic delivery, the information is built and sent to the client in response to each request. This commonly involves what is called rendering: gathering the information from its sources, which usually are databases or files, processing it, and composing the result in a form that can be sent to the client.
Cached delivery sits between the static and dynamic methods. In response to a request, during the rendering process, a persistent representation is created and stored, and it is then reused for subsequent requests for the same resource. There are different techniques for storing these representations and for invalidating them when they need to be regenerated, which are outside the scope of this article.
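A rendering step of this kind might look like the sketch below, which gathers a row from a database, processes it and composes an HTML response on every call. The table layout and the function name `render_article` are assumptions, and an in-memory SQLite database stands in for whatever data source a real system would use.

```python
import sqlite3
from html import escape


def render_article(connection, article_id):
    """Gather, process and compose the response for a single request."""
    row = connection.execute(
        "SELECT title, body FROM articles WHERE id = ?", (article_id,)
    ).fetchone()
    if row is None:
        return 404, "<h1>Not found</h1>"
    title, body = row
    # Escaping is part of the processing step: stored text becomes safe HTML.
    return 200, f"<h1>{escape(title)}</h1><p>{escape(body)}</p>"


# Hypothetical data source for the example.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
connection.execute("INSERT INTO articles VALUES (1, 'Hello', 'Rendered on request')")

print(render_article(connection, 1))
```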
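The store-and-reuse behaviour can be sketched with a small wrapper around any rendering function. The class `CachedRenderer` is a hypothetical name, and for simplicity the representations are kept in a dictionary in memory instead of in persistent storage.

```python
class CachedRenderer:
    """Renders a resource on its first request and reuses the result afterwards."""

    def __init__(self, render):
        self._render = render
        self._cache = {}
        self.render_count = 0  # only here to make the effect observable

    def get(self, resource):
        if resource not in self._cache:
            self.render_count += 1
            self._cache[resource] = self._render(resource)
        return self._cache[resource]

    def invalidate(self, resource):
        # Called when the underlying data changes; the next request re-renders.
        self._cache.pop(resource, None)


renderer = CachedRenderer(lambda resource: f"<h1>{resource}</h1>")
renderer.get("home")
renderer.get("home")        # served from the cache
renders_before_invalidation = renderer.render_count
renderer.invalidate("home")
renderer.get("home")        # rendered again
renders_after_invalidation = renderer.render_count
```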
The server is installed on a static, internet-connected computer. The resources of that computer can be fully devoted to the execution of one web system or can be shared among several of them, in what is called shared hosting.
The server resides in a virtual machine inside a bigger static machine. Virtualization eases the management of the resources devoted to each of the virtual machines hosted on the bigger machine, bringing greater resource flexibility and cost-efficiency.
In this model, the service provider manages the resources allocated to the application, so the developer does not have to deal with them directly. This approach usually brings the greatest flexibility, alongside pay-per-usage models, instead of pay-per-resource or pay-per-unit. It also eliminates the need for provisioning and resource calculations.
Memory and Files
Some of these systems use the host operating system's file system and memory to persist the data. Sometimes data is kept both in memory and in files, in order to speed up delivery. Some types of databases also offer the option to keep data in in-memory data structures to speed up access.
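One simple way to combine memory and files is to answer reads from an in-memory copy and write every change through to a file. The sketch below assumes a JSON file as storage; the class name `FileBackedStore` is hypothetical.

```python
import json
import tempfile
from pathlib import Path


class FileBackedStore:
    """Reads come from memory; every write is also saved to a JSON file."""

    def __init__(self, path):
        self.path = Path(path)
        if self.path.exists():
            self.data = json.loads(self.path.read_text(encoding="utf-8"))
        else:
            self.data = {}

    def get(self, key, default=None):
        return self.data.get(key, default)  # no disk access on reads

    def set(self, key, value):
        self.data[key] = value
        self.path.write_text(json.dumps(self.data), encoding="utf-8")


with tempfile.TemporaryDirectory() as directory:
    path = Path(directory) / "store.json"
    store = FileBackedStore(path)
    store.set("visits", 1)
    # A new instance simulates a restart: the data is reloaded from the file.
    reloaded = FileBackedStore(path)
    restored_visits = reloaded.get("visits")
```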
SQL databases are one of the oldest and most traditional ways for web systems to persist data. They are usually client-server applications in their own right, and hence can run on the same machine as the application or on a different one. PostgreSQL, MariaDB, SQL Server and MySQL are examples of these. They use relational, table-based models to structure the data, and the SQL language to write the queries.
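The relational, table-based model and an SQL query over it can be sketched as follows. SQLite is used here only because it ships with Python; unlike the servers named above, it is an embedded library and not a client-server application. The table and column names are assumptions for the example.

```python
import sqlite3

connection = sqlite3.connect(":memory:")

# Two tables related through a foreign key.
connection.executescript(
    """
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE articles (
        id INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES authors(id),
        title TEXT NOT NULL
    );
    INSERT INTO authors (id, name) VALUES (1, 'Ada');
    INSERT INTO articles (author_id, title) VALUES (1, 'Web systems'), (1, 'Caching');
    """
)

# The query joins both tables to combine related rows.
rows = connection.execute(
    """
    SELECT authors.name, articles.title
    FROM articles
    JOIN authors ON authors.id = articles.author_id
    ORDER BY articles.title
    """
).fetchall()

print(rows)
```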
NoSQL databases differ from SQL databases mainly in how the data is structured: they use data structures that are not based on the relational table model. The name can be misleading, because many implementations of these databases offer SQL dialects to query the data. There are many subtypes, based on the data structure they implement. Some notable examples are Redis, which uses a key-value data model, MongoDB, which uses a document model, and AllegroGraph, which uses a graph model. These databases usually leave out some functionality of the traditional relational table model in order to make querying faster. Their usage has risen with the advancement of big data.
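To give a feeling for how these data models differ, the sketch below represents similar information in key-value, document and graph form using plain Python structures. It does not use the APIs of Redis, MongoDB or AllegroGraph; the data and the helper functions `find_documents` and `neighbours` are assumptions made for illustration.

```python
# Key-value model: opaque values looked up by a single key.
key_value_store = {
    "article:42:title": "Web systems",
    "article:42:author": "Ada",
}

# Document model: self-contained, nested records without a fixed schema.
document_collection = [
    {"_id": 42, "title": "Web systems", "author": {"name": "Ada"}, "tags": ["web"]},
    {"_id": 43, "title": "Caching", "author": {"name": "Ada"}},
]

# Graph model: nodes connected by labelled edges.
graph_edges = [
    ("Ada", "WROTE", "article:42"),
    ("Ada", "WROTE", "article:43"),
    ("article:42", "TAGGED", "web"),
]


def find_documents(collection, field, value):
    """Return the documents whose top-level field equals the given value."""
    return [doc for doc in collection if doc.get(field) == value]


def neighbours(edges, node, relation):
    """Return the nodes reachable from `node` through edges with this label."""
    return [target for source, label, target in edges if source == node and label == relation]


print(key_value_store["article:42:title"])
print(find_documents(document_collection, "title", "Caching"))
print(neighbours(graph_edges, "Ada", "WROTE"))
```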