The R-programming language is probably the most popular statistical programming language used today, and much of its success could be attributed to its availability to anyone with a computer. R is open sourced, freely available and easily installed on most operating systems including Microsoft Windows, Mac OS and Linux. Perhaps because of its deep roots in open source and its wide adoption in academia, there is a plethora of R packages that can be installed and used in your projects. These packages have a wide variety of applications including data visualization, machine learning, complex statistical calculations and much more! As an Oracle Application Express (APEX) developer, what do you do when your customer asks for the ability to dynamically generate a linear regression model based on the information stored in the database? The Oracle APEX Statistics application should display simple summary statistics and allow users to predict future values given some parameters?
For statistical calculations, it is possible to code a solution with Oracle’s aggregate and/or analytic functions. For example, the following SQL statement will return the Y-intercept and slope (coefficient) using data stored in a database table:
select regr_intercept(roll, unem) as intercept , regr_slope(roll, unem) as coefficient from apexr_enrollments
Another alternative would be to use R! To call the standard linear model function (lm), data must be loaded into a structure that R can understand, e.g. a data frame or list. After that, creating a linear model is as simple as: my_model <- lm(roll~unem, d) where d is the reference to the data frame or list containing the data. The challenge is how R scripts can be executed from within PL/SQL?
The official Oracle recommendation would likely be to license the Advanced Analytics Options, and that also assumes that you are running the enterprise edition of the database. The Oracle R Enterprise team provides a blog series that introduces this approach. However, if resources are limited, or the use case is simple enough not to justify a huge purchase, then here is a workaround that works even with Oracle Database XE.
Tools Required:
- CentOS with OXAR.
Step 1
From with a valid OXAR instance, install additional software and R packages as root:
- Latest version of R from the EPEL repository:
> yum install -y R
- Create an R script with the following contents:
install.packages('needs', repos='http://cran.us.r-project.org') install.packages('jsonlite', repos='http://cran.us.r-project.org')
- Save the file as install-packages.R
- Execute the R script from command line
> Rscript install-packages.R
Step 2
- Create a new project folder and creating a new package.json file
> npm init
- Install the required node packages:
> npm install --save r-script > npm install --save body-parser > npm install --save express
Step 3
- Create the two R scripts that performs the computation
- calculator.R:
x <- as.numeric(input[[1]]) y <- as.numeric(input[[2]]) sum <- x + y product <- x * y data.frame(sum, product)
- linear_model.R:
d <- input[[1]] fit <- lm(roll~unem, d) summary <- summary(fit) data.frame(summary$coefficients)
- calculator.R:
Step 4
Create the REST service using Express:
var express = require('express'); var app = express(); var bodyParser = require('body-parser'); var R = require('r-script'); app.use(bodyParser.urlencoded({extended: true})); app.use(bodyParser.json()); var port = process.env.PORT || 4000; var router = express.Router(); router.get('/', function(req, res) { res.json({ message: 'running' }); }); router.route('/calculate').get(function(req, res) { var result = R('calculator.R') .data(req.query.left_op, req.query.right_op) .callSync(); res.json({ result: result }); }); router.route('/model').post(function(req, res) { var result = R('linear_model.R') .data(JSON.parse(req.body.data)) .callSync(); res.json({ result: result }); }); app.use('/api', router); app.listen(port); console.log('Running on port ' + port);
Step 5
There are many ways to start and daemonize the REST service. My current favourite is to use PM2, or simply run the app with the command:
node server.js.
Step 6
Finally, create the Oracle APEX Statistics application. Add three pages to the sample application.
The first is a simple form that has two numeric fields for the left and right operand. Clicking on the button “Calculate” submits and calls the process that will compute the sum and product of the two numbers and display them.
declare l_result clob; l_values apex_json.t_values; begin l_result := apex_web_service.make_rest_request( p_url => 'http://localhost:4000/api/calculate' , p_http_method => 'GET' , p_parm_name => apex_util.string_to_table('left_op:right_op') , p_parm_value => apex_util.string_to_table( :P1_LEFT_OPERAND || ':' || :P1_RIGHT_OPERAND ) ); apex_debug.info(l_result); apex_json.parse( p_values => l_values , p_source => l_result ); :P1_SUM := apex_json.get_varchar2( p_path => 'result[%d].sum' , p0 => 1 , p_values => l_values ); :P1_PRODUCT := apex_json.get_varchar2( p_path => 'result[%d].product' , p0 => 1 , p_values => l_values ); end;
The second page displays a report containing data imported from the sample file obtained a tutorial on performing simple linear regression with R. Clicking on the button “Generate Model” launches a third modal page that generates a linear model and then return the intercept and coefficient using a page load process.
declare l_request_body clob; l_result clob; l_values apex_json.t_values; begin apex_json.initialize_clob_output; apex_json.open_array(); for result in ( select * from apexr_enrollments ) loop apex_json.open_object; apex_json.write('year', result.year); apex_json.write('roll', result.roll); apex_json.write('unem', result.unem); apex_json.close_object; end loop; apex_json.close_all; l_request_body := apex_json.get_clob_output; apex_web_service.g_request_headers(1).name := 'Content-Type'; apex_web_service.g_request_headers(1).value := 'application/x-www-form-urlencoded'; l_result := apex_web_service.make_rest_request( p_url => 'http://localhost:4000/api/model' , p_http_method => 'POST' , p_body => 'data=' || l_request_body ); apex_debug.info(l_result); apex_json.parse( p_values => l_values , p_source => l_result ); :P3_INTERCEPT := apex_json.get_varchar2( p_path => 'result[%d].Estimate' , p0 => 1 , p_values => l_values ); :P3_COEFFICIENT := apex_json.get_varchar2( p_path => 'result[%d].Estimate' , p0 => 2 , p_values => l_values ); apex_json.free_output; end;
The intercept and coefficient can then be used to predict enrollment rates given an unemployment rate. For the adventurous, try modifying the data using the Interactive Grid and see how the model and predicted values change.
Oracle APEX Statistics
This is a very simple demonstration on how you can integrate and combine the powers of R and APEX, and is only meant as a starting references. There are many ways to improve the solution. For example, using database connectors to have either R or Node.js interact directly with the database. Some might even question, why not do everything in PL/SQL?
In preparation of my upcoming Kscope 17 presentation on Building in Blocks, I have been reading a lot on the Microservices Architecture and how the hype applies to us APEX developers. One characteristic that stands out to me is Decentralized Governance, and I quote:
“One of the consequences of centralized governance is the tendency to standardize on single technology platforms. Experience shows that this approach is constricting – not every problem is a nail and not every solution a hammer. We prefer using the right tool for the job and while monolithic applications can take advantage of different languages to a certain extent, it isn’t that common.”
If we had access to data analysts who are proficient in R, should we not spare them the effort of learning PL/SQL and re-implementing statistical computing code that has already been more elegantly addressed by R? And of course, vice versa.
At Insum, we believe that this is perhaps one of the biggest advantages of using APEX as the core development platform. With so much extensible features available to developers, APEX can be leveraged as a “construction based” or “breadboard”. Simply wire up other third-party application code like R or Node.js scripts, and benefit from other architectural components such as authentication, authorization, session state management and much more!
To learn more about how you can use APEX and Node.js to incorporate advanced JavaScript libraries, please contact us about our training and consulting services.
Thanks for the amazing article Adrian. All of this can be run on Windows too, right? As long as we have the same development environment as OXAR?
Thanks Farzad! Yes, this should work on Windows or Mac since both R and Node.js are supported on these operating systems. You don’t need exactly the same environment as defined in OXAR. All you need is a functioning APEX environment running on your Windows machine.
Adrian,
This is super cool!
This is an excellent way to show how extensible ORDS/APEX can be, I mean this is not just a demo but an amazing (functional) workaround to the main inhibitor for companies to use new technologies. (money).
Great work!!
Thanks Gaspar! Glad you enjoyed the read!