sp_execute_external_script and Bytecode Version Mismatch
- Posted in:
- Development
- Machine Learning Services
- R
- SQL Server
You may run into some trouble with sp_execute_external_script and R when you pass in CHAR and/or VARCHAR "string" data as part of your input data set. Here is an example:
DECLARE @QueryIn NVARCHAR(MAX) = N'
SELECT ''Dave'' AS FirstName, ''Mason'' AS LastName
UNION ALL
SELECT ''Björk'', ''Guðmundsdóttir'';';
EXEC sp_execute_external_script
@language = N'R',
@script = N'Returned_Names <- Input_Names',
@input_data_1 = @QueryIn,
@input_data_1_name = N'Input_Names',
@output_data_1_name = N'Returned_Names';
A 'R' script error occurred during execution of 'sp_execute_external_script' with HRESULT 0x80004004.
Msg 39019, Level 16, State 2, Line 2
An external script error occurred:
During startup - Warning message:
In setJsonDatabasePath(system.file("extdata/capabilities.json", :
bytecode version mismatch; using eval
Error in sort.list(y) : invalid input 'Björk' in 'utf8towcs'
Calls: data.frame ... as.data.frame -> as.data.frame.character -> factor -> sort.list
Error in execution. Check the output for more information.
Error in eval(expr, envir, enclos) :
Error in execution. Check the output for more information.
Calls: source -> withVisible -> eval -> eval -> .Call
Execution halted
A quick turnaround for this issue is to CAST the "string" data as NVARCHAR:
DECLARE @QueryIn NVARCHAR(MAX) = N'
SELECT CAST(''Dave'' AS NVARCHAR(32)) AS FirstName, CAST(''Mason'' AS NVARCHAR(32)) AS LastName
UNION ALL
SELECT CAST(''Björk'' AS NVARCHAR(32)), CAST(''Guðmundsdóttir'' AS NVARCHAR(32));';
EXEC sp_execute_external_script
@language = N'R',
@script = N'Returned_Names <- Input_Names',
@input_data_1 = @QueryIn,
@input_data_1_name = N'Input_Names',
@output_data_1_name = N'Returned_Names';
Personal Experience
During my testing, I've found R handles CHAR and VARCHAR data within the input data set as long as the ASCII codes comprising the data is in the range from 0 to 127. This much is not surprising--those are the character codes for the ASCII table. Starting with character code 128, R begins having some trouble. To demonstrate, I'll use this query that uses undocumented system table spt_values:
SELECT v.number AS [ASCII Code],
CHAR(v.number) AS [Char Value]
FROM master..spt_values v
WHERE v.type = 'p'
AND v.number <= 191
Here is the query above passed as the input data set to sp_execute_external_script:
DECLARE @QueryIn NVARCHAR(MAX) = N'
SELECT v.number, CHAR(v.number)
FROM master..spt_values v
WHERE v.type = ''p''
AND v.number <= 191;';
EXEC sp_execute_external_script
@language = N'R',
@script = N'OutputDataSet <- InputDataSet',
@input_data_1 = @QueryIn
WITH RESULT SETS (
(
[ASCII Code] INT,
[Character Value] NCHAR(1)
)
)
In the output result set, we see character values are interpreted correctly up until ASCII code 128. From there onward, � characters are returned.
Those of you that were watching closely noticed I only selected ASCII codes up to 191. Why did I stop there? If the input data set includes any ASCII codes of 192 or above, we get the "bytecode version mismatch" error:
DECLARE @QueryIn NVARCHAR(MAX) = N'
SELECT v.number, CHAR(v.number)
FROM master..spt_values v
WHERE v.type = ''p''
AND v.number <= 192;';
EXEC sp_execute_external_script
@language = N'R',
@script = N'OutputDataSet <- InputDataSet',
@input_data_1 = @QueryIn
WITH RESULT SETS (
(
[ASCII Code] INT,
[Character Value] NCHAR(1)
)
)
Msg 39004, Level 16, State 20, Line 15
A 'R' script error occurred during execution of 'sp_execute_external_script' with HRESULT 0x80004004.
Msg 39019, Level 16, State 2, Line 15
An external script error occurred:
During startup - Warning message:
In setJsonDatabasePath(system.file("extdata/capabilities.json", :
bytecode version mismatch; using eval
Error in sort.list(y) : invalid input 'À' in 'utf8towcs'
Calls: data.frame ... as.data.frame -> as.data.frame.character -> factor -> sort.list
Error in execution. Check the output for more information.
Error in eval(expr, envir, enclos) :
Error in execution. Check the output for more information.
Calls: source -> withVisible -> eval -> eval -> .Call
Execution halted
But wait! There's more...
All of the testing examples I've show thus far have been with R version 3.3.3. This is what got installed when I installed Machine Learning Services with my SQL Server 2017 instance.
EXEC sp_execute_external_script
@language = N'R',
@script = N'dfReturn <- as.data.frame(as.character(version["version.string"]))',
@output_data_1_name = N'dfReturn';
-----------------------------------
R version 3.3.3 (2017-03-06)
(1 row affected)
On another instance of SQL Server 2017, I was able to upgrade the R version to 3.5.2 (this was more troublesome than I expected--there were many hoops to jump through).
EXEC sp_execute_external_script
@language = N'R',
@script = N'dfReturn <- as.data.frame(as.character(version["version.string"]))',
@output_data_1_name = N'dfReturn';
-----------------------------------
R version 3.5.2 (2018-12-20)
(1 row affected)
With the newer version of R, I've not been able to reproduce the "bytecode version mismatch" error. ASCII codes from 128 - 255 return a � character, while ASCII codes 256 and above return NULL:
DECLARE @QueryIn NVARCHAR(MAX) = N'
SELECT v.number, CHAR(v.number)
FROM master..spt_values v
WHERE v.type = ''p''
AND v.number <= 2048;';
EXEC sp_execute_external_script
@language = N'R',
@script = N'OutputDataSet <- InputDataSet',
@input_data_1 = @QueryIn
WITH RESULT SETS (
(
[ASCII Code] INT,
[Character Value] NCHAR(1)
)
)
Mapping types between one system and another can be tricky. Tough choices sometimes have to be made. In closing, I'll simply recommend you cast SQL types CHAR and VARCHAR to NCHAR and NVARCHAR respectively for input data sets to sp_execute_external_script. Happy coding, everyone!
Comments