7 April 2017

Predicting with the R Language in Azure Machine Learning


> With Azure Machine Learning alone, it is not possible to forecast over a timeline.

> This solution arose from the need to predict bookings for n periods into the future. Azure ML on its own only allows predicting the next period.

The Data

> We have historical booking data, and we intend to predict the number of bookings for the next n periods.

The Solution

1.1.1. What is R Language?

> It is a programming language and software environment for statistical computing and graphics supported by the R Foundation for Statistical Computing.

1.1.2. What is Triple Exponential Smoothing (ETS)?

> It is an algorithm that makes predictions from a weighted average of historical values, giving the most recent data a greater weight in the forecast, which in this case is the forecast of bookings.

> It takes into account trend and seasonal variation in the input data.
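To make the "weighted average" idea concrete, here is a minimal sketch of simple exponential smoothing written by hand in base R. It is only an illustration: the function name `ses_forecast`, the smoothing weight `alpha = 0.5` and the four sample values are assumptions made for this example, and it leaves out the trend and seasonal components that a full ETS model also handles.

```r
# Simple exponential smoothing written out by hand (illustrative sketch).
ses_forecast <- function(y, alpha) {
  level <- y[1]                                  # start the level at the oldest observation
  for (t in seq_along(y)[-1]) {
    level <- alpha * y[t] + (1 - alpha) * level  # blend the new value with the old level
  }
  level                                          # forecast for the next period
}

alpha <- 0.5                      # assumed smoothing weight, between 0 and 1
y     <- c(10, 12, 11, 15)        # made-up history, oldest first
n     <- length(y)

# Implied weight of each observation, newest first. The oldest value keeps
# whatever weight is left over because it initialised the level.
w <- c(alpha * (1 - alpha)^(0:(n - 2)), (1 - alpha)^(n - 1))

print(w)                          # the newest observation carries the largest weight
print(sum(w * rev(y)))            # weighted average of the history
print(ses_forecast(y, alpha))     # same number from the recursive form
```

The recursive form and the explicit weighted average give the same forecast, which shows that the smoothing is just a weighted average in which older values count for less.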

1.1.3. The Code

> Next, we present the code implemented and used in the ML project.

    # Data input
    data <- maml.mapInputPort(1)          # class: data.frame, one row expected
    library(forecast)                     # provides ets() and forecast()

    # Preprocessing
    colnames(data) <- c("frequency", "horizon", "dates", "values")
    # "dates" and "values" each hold one ";"-separated string; split it into a vector
    dates  <- strsplit(as.character(data$dates), ";")[[1]]
    values <- strsplit(as.character(data$values), ";")[[1]]
    # as.Date() needs a day, so append the first of the month to each "YYYY/MM" string
    dates  <- as.Date(paste0(dates, "/01"), format = "%Y/%m/%d")
    values <- as.numeric(values)

    # Fit a time-series model
    train_ts    <- ts(values, frequency = data$frequency)   # frequency = observations per seasonal cycle
    fit1        <- ets(train_ts)                             # fit an exponential smoothing (ETS) model
    train_model <- forecast(fit1, h = data$horizon)          # forecast "horizon" periods ahead
    plot(train_model)                                        # plot the history and the forecast

    # Produce the forecast
    train_pred    <- round(train_model$mean, 2)              # point forecasts, rounded to two decimals
    data.forecast <- as.data.frame(t(train_pred))            # one row, one column per forecast period
    colnames(data.forecast) <- paste0("Forecast", seq_len(data$horizon))   # Forecast1, Forecast2, ...

    # Data output
    maml.mapOutputPort("data.forecast")



> With the code above, we can forecast any chosen number of periods, n.
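The script reads a single-row data frame in which the dates and the values are each packed into one semicolon-separated string. The sketch below shows one way to try that preprocessing outside Azure ML. The variable name `input` and all sample numbers are made up for illustration, the dates are assumed to be in "YYYY/MM" form, and the forecasting step runs only if the forecast package happens to be installed.

```r
# Hypothetical one-row input, shaped like the data frame the script expects.
months <- seq(as.Date("2015-01-01"), by = "month", length.out = 36)
sample_values <- round(200 + 2 * (1:36) + 20 * sin(2 * pi * (1:36) / 12))  # made-up bookings

input <- data.frame(
  frequency = 12,                                        # monthly data, yearly cycle
  horizon   = 3,                                         # n = 3 periods to forecast
  dates     = paste(format(months, "%Y/%m"), collapse = ";"),
  values    = paste(sample_values, collapse = ";"),
  stringsAsFactors = FALSE
)

# Same preprocessing idea as in the script: split the strings, then convert.
dates  <- strsplit(as.character(input$dates), ";")[[1]]
values <- as.numeric(strsplit(as.character(input$values), ";")[[1]])
dates  <- as.Date(paste0(dates, "/01"), format = "%Y/%m/%d")  # add a day so as.Date() can parse

train_ts <- ts(values, frequency = input$frequency)

# Forecast only when the forecast package is available locally.
if (requireNamespace("forecast", quietly = TRUE)) {
  fc  <- forecast::forecast(forecast::ets(train_ts), h = input$horizon)
  out <- as.data.frame(t(round(as.numeric(fc$mean), 2)))
  colnames(out) <- paste0("Forecast", seq_len(input$horizon))
  print(out)                                             # one row: Forecast1 ... Forecast3
}
```

Changing `horizon` in the sample row is enough to request a different number of future periods.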




João Leitão

João Pinto

Rui Xavier