Machine Learning Services and Result Sets
When you run an R or Python script in-database via the sp_execute_external_script stored procedure, result sets returned by the stored procedure are output with unnamed columns by default. Here's an example using R and the [WideWorldImporters] database. The input data is a simple query on the Application.Countries table. R creates a data frame from the input data and merely returns it back to SQL Server (as the "OutputDataSet" data frame). Notice the columns are named "(No column name)" in the SSMS output:
EXECUTE sp_execute_external_script
@language =N'R',
@script=N'OutputDataSet<-InputDataSet',
@input_data_1 =N'SELECT CountryName, IsoAlpha3Code, LatestRecordedPopulation
FROM WideWorldImporters.Application.Countries'
GO
To get named columns in the result set, we can use the WITH RESULTS SET clause of EXECUTE. This also allows us to define the data types of each column. Here's the example from above, but this time using WITH RESULT SETS (lines 6 - 10):
EXECUTE sp_execute_external_script
@language =N'R',
@script=N'OutputDataSet<-InputDataSet',
@input_data_1 =N'SELECT CountryName, IsoAlpha3Code, LatestRecordedPopulation
FROM WideWorldImporters.Application.Countries'
WITH RESULT SETS ((
CountryName NVARCHAR(60),
IsoAlpha3Code NVARCHAR(3),
LatestRecordedPopulation BIGINT
));
GO
The example above is so simple, defining the RESULT SETS poses no problems. But what if the format of the output isn't known at design time? R (or Python) might take the input data set and add, remove, or change columns conditionally. Further, the input data set might not even be known at design time. How would you define the RESULT SETS at run time?
WITH RESULT SETS needs a MAKE_A_GUESS or FIGURE_IT_OUT option. If there's some other type of "easy button" for this, I haven't found it. But I've come up with at least one option that is doable. I'll discuss further in the next post. Until then, happy coding!
Comments