28 December 2018

Scaling out with Integration Services in SQL Server

Many software apps in the market nowadays are made by management systems. From the most simplified module, which aims at the management of the telecommunications and the networks between clients, to the most robust, responsible for the failures and connections’ management. All of these modules are made up of users and servers that exchange information among each other, using remote procedure calls (RPC) configured on a distributed processing system.

Taking into account the present and future function of the software, it must be endowed with the most diverse characteristics of flexibility, sustainability and functionality. But sometimes a peculiarity is simply left to chance – Scalability. This is the feature that will be addressed throughout the article.

Before exploring the use of scalability, it is necessary to analyse the concept itself. The scalability is reflected in the ability of a software to efficiently support small or large implementation, during its development, since it may be necessary to add capacity to support more users and/or improve the service’s quality over time. This way, it is possible to affirm that scalability is a desirable feature that allows a system to accommodate an increasing number of resources.

In any case, the presence of reduced scalability leads to poor system performance, as expectable. Therefore, the excessive repetition of resource dissipative algorithmic activities becomes central. These activities are performed with the goal of minimizing possible reengineering that could achieve the full use of parallelism.

Next, two methods are defined that enable the scalability’s increase in softwares that run on SQL Server.

The first method presented is related to the structural scalability (also known as Vertical Scaling or Scale-Up). In this method, the goal is to add more resources (CPU processing capacity, available RAM and storage) to a single machine. This option is chosen when the user does not want to make major changes to the current architecture.

The second method is about the upload scalability (Horizontal Scaling or Scale-Out). Unlike the previous one, nodes are added to the same system that allows a full execution when an increase in information traffic occurs. However, this is not the only way to expand the system. It is possible to explore multiple configurations, such as Peer-to-Peer and AlwaysOn.

Up to the SQL Server 2016 release, the SQL Server Integration Services (SISS) component was only provided with a limited packet execution. Only one run per machine. Through the introduction of Scale Out functionality in SQL Server 2017, Microsoft has removed that gap. The Scale Out allows the execution of packages, presenting high performance, through the distribution of executions by several machines in parallel. This new functionality mitigated the previously mentioned issues, in addition to the following implementations made in SISS:

  • Increase in the number of packages;
  • Increase in the volume of data from various sources;
  • Increased complexity in data transformation tasks.

In order to make use of the offered features, it is necessary to install it and know how to configure it correctly.

Using the SQL Server Installation Wizard – Image 2 – when choosing the characteristics, you must select, at least, the following items:

  • Database Engine Services;
  • Integration Services:
    • Scale Out Master;
    • Scale Out Work;

Scaling out with SQL Server

Image 2. SQL Server 2017 – Selection of features to install

On the next screen, we can configure the server. In the example shown – image 3 –  we choose to continue with the values defined by default.

Scaling out with SQL Server

Image 3. SQL Server 2017 – Configuration of the server

Once the server configuration is complete, it is necessary to define the authentication mode to mixed mode and then choose a password and the list of administrators that will have unrestricted access to the Database Engine – Image 4.

Scaling out with SQL Server

Image 4. SQL Server 2017 – Database Engine configuration.

The next steps will be preponderant for the correct functioning of SSIS Scale Out. The Master Node is the first element configured and it is through this that it becomes possible to receive and manage the various execution requests made by the users. Later, we’ll cover its method of operation and some of the information we can collect through existing views and stored procedures. Regarding its configuration, the default port number is 8391 and due to the lack of a SSL certificate, a new one is created for this purpose – Image 5.

Scaling out with SQL Server

Image 5. SQL Server 2017 – Configuration of the Master Node.

The Worker Node configuration follows the Master’s. The main function of this node is the execution of tasks coming from the Master and execute them locally. In this menu it is only necessary to specify the endpoint in which each Worker Node should be connected to stand in order to receive indications.

Scaling out with SQL Server

Image 6. SQL Server 2017 – Configuration of the Worker Node.

After this, it is possible to check the summary of the features to install and specify the path where the ConfigurationFile.ini will be placed – Image 7. After installation, it is necessary to restart the workstation and install SQL Server Management Studio (SSMS), if you do not already have it – Image 8.

Image 7. SQL Server 2017 – Configuration summary.

Image 8. SQL Server 2017 –  SSMS installation.

The entire process management features that are associated with the Master Node element – previously displayed, are possible to execute because of the information contained in the SSISDB Catalog. As it is possible to verify through Image 9, in this, are located all the information about the various Worker Nodes, packages and executions – Image 10.

Image 9. SSIS Scale Out Master – SSISDB Catalog functionalities.

Image 10.  SSIS Scale Out Master – Package execution.

However, it is possible to access the same features by running Views and Stored Procedures that are available. The following are some of the most relevant for use by the Master Node.

  • Views:
    • [catalog].[master_properties]: Propriedades do Master Node

Image 11. Master’s properties

  • [catalog].[worker_agents]: Informations about the Worker Nodes
  • Stored Procedures:
    • Management:
      • [catalog].[disable_worker_agent]: Enables the removal of the selected Worker from the activity;

Image 12. Disabling of the Worker

  • [catalog].[enable_worker_agent]: Enables the addition of the Worker selected from the activity;

 

Image 13. Activation of the Worker

 

  • Execution:
    • [catalog].[create_execution]: Creates an instance of execution in the Integration Services catalog;
    • [catalog].[add_execution_worker]: Adds a Integration Services Worker to an instance of execution in Scale Out;
    • [catalog].[start_execution]: Starts an instance of execution.

Scale Out was not developed especially for developers who have privileged access to SQL Server Management Studio. This way, Microsoft has provided this component with an extra application that is very intuitive and easy to access –  ISManager.exe – in which presents the following interface – Images 14 and 15.

Image 14. Scale Out Manager – General Information

Image 15. Scale Out Manager – Information about each Worker

As presented, SQL Server Integration Services Scale-Out provides its users with the necessary scalability given the growing need for successful process execution. Nevertheless, there are many factors to take into account when it is necessary to plan and determine the choice between the Scale-Out approach and Scale-Up in a given project. In certain cases, having only one server with great power and performance can be more expensive than having multiple servers with a smaller capacity, especially when the argument is simply financial. However, the disadvantage of using multiple servers is relative to the number of SQL Server licenses, electricity and hosting costs.

André Domingos
Associate Consultant