radishintegrations-blog - Tumblr blog

radishintegrations-blog · 7 years ago

Photo

Using a tMSSQLSP component as your source for a SQL result set seems like a pretty straightforward concept, especially if you come from an SSIS background. However, there are a few things that make it not so intuitive.

I’m quite green when it comes to Talend, but have well over a decade of experience in ETL design in the Microsoft stack. Recently, I had the opportunity through my employer to work on this specific issue with a Talend-centric consulting firm, ETL Advisors. If you’re looking to add extra skills to your team, you should check them out.

Back to the topic at hand:

First of all, you’ll need to pass in any parameters to the stored procedure, which you can find various articles for. I won’t go into great detail here.

Second, the resultset that is your “select statement” within the stored procedure will need to be schema-defined as an object within Talend, and mapped as a RECORD SET.

Third, on a tParseRecordSet object that you’ll connect to the main row out from your tMSSQLSP object, you’ll need to define out the schema of all the columns that come from the stored procedure results. As you have likely encountered in your ETL career, this can sometimes be dozens of columns. If you hate yourself, get 2 monitors and start hand keying all of the output schema. If you want a possible shortcut, keep reading.

When you are in the output portion of the schema definition screen, you can import from an XML file using the folder icon highlighted below.

Where would you get this XML from? Well, with a little help from your friends, that’s where. SQL server has a ton of system stored procedures and DMVs that give you all types of information. I’ve enhanced the “sp_describe_first_result_set” in a way that presents it as valid XML. You’ll be able to copy the XML column, and paste it into an XML file that gets imported above.

This is accomplished with the following stored procedure creation/execution to suit your needs. There is plenty of help on how to use sp_describe_first_result_set natively, if you want to just see it in action without my enhancements.

But in the meantime, here’s my T-SQL if you want to get started on the rest of your life. Keep in mind there are a few data types that are probably not included, and it is mainly because I didn’t see a quick way to map them to Talend-equivalents. (Feel free to edit it up!) :

--+PROCEDURE TO COMPILE TO A LOCAL DATABASE WHERE USE IS WARRANTED ALTER PROCEDURE USP_HELPER_INTERROGATE_SP ( @PROC AS NVARCHAR(max), @PARAMS_IN AS NVARCHAR(MAX) ) AS BEGIN --+author: Radish --+date: 3/8/2018 --+comments: For demonstration purposes only, results will very with more complex stored procedures

--+/// --+create a holder table variable for manipulation of the returns results from the sp DECLARE @temp TABLE ( is_hidden BIT , column_ordinal INT, [name] SYSNAME, is_nullable BIT, system_type_id INT, system_type_name NVARCHAR(256), max_length SMALLINT, [precision] TINYINT, scale TINYINT, collation_name SYSNAME NULL, user_type_id INT, user_type_database SYSNAME NULL, user_type_schema SYSNAME NULL, user_type_name SYSNAME NULL, assembly_qualified_type_name NVARCHAR(4000), xml_collection_id INT, xml_collection_database SYSNAME NULL, xml_collection_schema SYSNAME NULL, xml_collection_name SYSNAME NULL, is_xml_document BIT, is_case_sensitive BIT , is_fixed_length_clr_type BIT, source_server SYSNAME NULL, source_database SYSNAME NULL, source_schema SYSNAME NULL, source_table SYSNAME NULL, source_column SYSNAME NULL, is_identity_column BIT, is_part_of_unique_key BIT, is_updateable BIT, is_computed_column BIT, is_sparse_column_set BIT, ordinal_in_order_by_list SMALLINT, order_by_is_descending SMALLINT, order_by_list_length SMALLINT, [tds_type_id] INT, [tds_length] INT, [tds_collation_id] INT, [tds_collation_sort_id] INT )

--+/// --+fill the holder table variable with the sp resultset INSERT @temp EXEC sp_describe_first_result_set @tsql = @PROC, @params = @PARAMS_IN --+/// --+ do the talend formatting, from the above results set and a join to the system types for sql server SELECT * FROM ( --+Talend Data Services Platform 6.4.1 XML layout for tParseRecordSet schema OUTPUT SELECT -1 AS column_ordinal, '<?xml version="1.0" encoding="UTF-8"?><schema>' AS XML_SQL_OUTPUT_SCHEMA UNION ALL SELECT column_ordinal, '<column comment="" default="" key="false" label="' + T.name + '" length="'+ CAST(max_length AS VARCHAR(50)) + '" nullable="false" originalDbColumnName="' + T.name+ '" originalLength="-1" pattern="" precision="0" talendType="' + CASE WHEN T.system_type_id IN (106,108,122) THEN 'id_BigDecimal' WHEN T.system_type_id = 104 THEN 'id_Boolean' WHEN system_type_id IN ( 40, 42,61 ) THEN 'id_Date' WHEN T.system_type_id IN (62) THEN 'id_Float' WHEN system_type_id IN ( 52, 56, 63 ) THEN 'id_Integer' WHEN T.system_type_id IN (127) THEN 'id_Long' WHEN system_type_id IN ( 39, 47 ) THEN 'id_String' WHEN T.system_type_id IN (48) THEN 'id_Short' ELSE 'id_String' END + '" type="' + UPPER(ST.name) + '"/>' FROM @temp T INNER JOIN sys.systypes ST ON T.system_type_id = ST.xusertype UNION ALL SELECT 999999999 AS column_ordinal, '</schema>' ) A ORDER BY column_ordinal END

And finally, below is an example execution of this stored procedure against a popular system stored procedure (but you should be able to put most user defined stored procedures in the place of the first parameter). Also, please note the parameter list for your stored procedure is supplied in the second ‘’, in the format ‘@[parameter name] DATATYPE, et. al’. So if you have a @NAME VARCHAR(20) and a @AGE INT, you’d pass that in as ‘@NAME VARCHAR, @AGE INT’

--+HELPFUL FOR GETTING SCHEMA DETAILS FROM FIRST RESULTSET IN A STORED PROC

--+example use with basic stored procedure on any recent SQL Server instance

--- stored proc DB.SCHEMA.NAME--,-- comma delimited params list @[NAME] DATATYPE,... EXEC DBO.USP_HELPER_INTERROGATE_SP 'msdb.sys.sp_databases' , ''

#DBA #SQL SERVER #TSQL #TALEND #INTEGRATIONS #STORED PROCEDURES #METADATA #IHATETYPING

1 note · View note

radishintegrations-blog · 7 years ago

Photo

Talend and java combined don’t always throw the most intuitive errors available on planet Earth. The null reference/unknown source from Screen A, are being thrown from an attempt at creating a new BigDecimal from a parsed in string value sourced from a flat file. Screen B is a grab of the tJavaRow where the BigDecimal function is trying to be applied. Screen C is showing on the advanced tab, of the tFileInputDelimited, you have the option of setting defaults to any column which comes in empty. Finally, Screen D is showing what a default trim level would be on any defined input columns from a metadata layout for the flat file.

The moral of the story is, for a flat file input that you are planning to pass through a tJavaRow for code manipulation - it would be in your best interest as a productive integration member on planet Earth, to trim and default any columns you have plans taking from a string to a numeric value. This error turned out to be a value of “864.00_” being passed in, where the _ represents a blank space. And everyone knows that Taylor isn’t going to have a blank space, and write your integrations-loving name!

#integrations #BigDecimal #Talend #tJavaRow #null #unknown source #Java #garbage in garbage out #RadishIntegrations

1 note · View note