Objective
After writing an article about “Looping Through Flat Files using SSIS 2016”, I decided to blog about the same subject, but this time for Excel. Yes, it is a very old subject, but I see that many companies are still following very old blog posts (like mine from 2010) without noticing that the old approaches need polishing, such as:
- Converting the SSIS package to SSIS 2016+
- Using the SSIS Project Model
- SSIS Project Parameters
- Package Parameters
- New techniques/tricks/approaches/designs
- Package Parts
- SSISDB
- Etc…
I decided to create a small ETL package that extracts all the sheets from Excel (97-2003) files into a SQL table and applies as many items from the above list as I can, so let’s start.
Requirements
1. A main folder, in my case “C:\P_C\2018\SQL2016\SSIS\Exl2SQL-Exl97-2003”; its path is set in the package parameter called “uPara_MainFolderPath”.
2. Two subfolders in the above folder: “SampleFile” and “ToBeProcessed”.
3. A sample Excel (97-2003) file called “SampleFile97-2003.xls”, with data, in the “SampleFile” folder.
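The folder layout above can also be created by a script rather than by hand. The following is a minimal Python sketch, not part of the SSIS project; the function name `create_folder_layout` is made up for illustration, and a temporary directory stands in for the real main folder.

```python
import tempfile
from pathlib import Path


def create_folder_layout(main_folder: Path) -> list[Path]:
    """Create the main folder plus the two subfolders the package expects."""
    subfolders = [main_folder / "SampleFile", main_folder / "ToBeProcessed"]
    for folder in subfolders:
        # parents=True also creates the main folder if it does not exist yet.
        folder.mkdir(parents=True, exist_ok=True)
    return subfolders


# A temporary directory stands in for the real main folder path.
with tempfile.TemporaryDirectory() as tmp:
    created = create_folder_layout(Path(tmp) / "Exl2SQL-Exl97-2003")
    print([folder.name for folder in created])  # ['SampleFile', 'ToBeProcessed']
```

To use it for real, pass the same path you put in “uPara_MainFolderPath” instead of the temporary directory.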
Overview of the SSIS Project and Package
Variables at 3 Levels: "Project", "Package Parameter" and "Package Variables"
After creating an SSIS package, the first thing to do is create variables at 3 levels/locations within the SSIS project and package.
1 - Variables at the “Project Parameter” level
Create 2 Variables, one for your destination SQL server and one for your destination DB name.
2 - Variables at the “Package Parameters” level
3 - Variables at the “Package Variables” level
Except for the “uVar_ExcelConnectionString” variable, all other expression-based variables will have their expressions cleared at run time (my new design and technique). The only “Package Variable” you might want to change at design time is the “uVar_ExcelConnectionString” variable.
You can see that I have added 3 new package variables related to the Excel file and its objects. The only ones you might need to change during development are the Excel connection string variable (for example, I added IMEX=1) and the sheet name.
1- "uVar_ExcelConnectionString": The Excel connection string, built by this expression:
"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + @[User::uVar_SourceFile] + ";Extended Properties=\"EXCEL 8.0;HDR=YES;IMEX=1\";"
2- "uVar_ExcelSheetName": Set this to the default sheet name of the file in the SampleFile folder. This name only helps you during development; at run time it is overwritten with a new sheet name on each iteration.
3- "uVar_ExcelSheetObjectList": An SSIS package Object variable that will hold the list of Excel sheet names.
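To see what the connection string expression in item 1 evaluates to, here is a minimal Python sketch of the same string concatenation. It assumes “uVar_SourceFile” holds the full path of the current Excel file; the function name and the sample path are hypothetical.

```python
def build_excel_connection_string(source_file: str) -> str:
    """Mirror the SSIS expression: Jet provider, Excel 8.0, HDR=YES, IMEX=1."""
    return (
        "Provider=Microsoft.Jet.OLEDB.4.0;Data Source="
        + source_file
        + ';Extended Properties="EXCEL 8.0;HDR=YES;IMEX=1";'
    )


# Hypothetical file path, for illustration only.
example = build_excel_connection_string(r"C:\Temp\SampleFile97-2003.xls")
print(example)
```

The escaped quotes (\") in the SSIS expression become plain double quotes in the final string, which is what the provider expects around the Extended Properties value.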
Source and Destination Connection Objects
We need one source “Excel File” connection object to do the reading and one SQL destination “OLEDB” connection object to do the writing.
1.“Excel File” (Source)
2.“OLEDB” (Destination)
Package Flow
The package has 2 main sections. The first creates the backup folder, clears the expressions, and sets some other variables. The second loops through the Excel (97-2003) files one by one and, within each file, through the sheets one by one; it does the ETL and finally moves the file to the backup folder. I will not explain these two sections, except for how I set up the second “For Each Loop” in SSIS. I have also added a T-SQL script to create the destination table (look for the DestinationClientTable.sql file).
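As a rough illustration of the second section, the nested looping can be sketched in Python. This is only a model of the control flow, not the package itself; `iterate_sheets`, `list_sheets`, and the fake workbook data are assumptions made for the example.

```python
from typing import Callable, Iterable, Iterator


def iterate_sheets(
    files: Iterable[str],
    list_sheets: Callable[[str], list[str]],
) -> Iterator[tuple[str, str]]:
    """Yield (file, sheet) pairs: outer loop over files, inner loop over sheets."""
    for file_path in files:                    # first For Each Loop (files)
        for sheet in list_sheets(file_path):   # second For Each Loop (sheets)
            yield file_path, sheet             # the data flow would run here
        # After the last sheet, the package moves the file to the backup folder.


# Stand-in data: file names mapped to their sheet names.
fake_workbooks = {"a.xls": ["Sheet1$"], "b.xls": ["Jan$", "Feb$"]}
pairs = list(iterate_sheets(fake_workbooks, lambda f: fake_workbooks[f]))
print(pairs)  # [('a.xls', 'Sheet1$'), ('b.xls', 'Jan$'), ('b.xls', 'Feb$')]
```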
The first loop is set up in 3 sections; see my previous blog post, “Looping Through Flat Files using SSIS 2016”, for how to do it.
After the first loop you will see a .NET Script Task, “SCRIPT-----Get The List of Excel Sheets”, that extracts the list of Excel sheets and places it into the variable “uVar_ExcelSheetObjectList”, which is an SSIS Object variable.
The .NET code that loops through the sheets in each Excel file goes like this.
Public Sub Main()
    '--------------------------------------------------------
    ' Variables used by this script:
    '   User::uVar_ExcelConnectionString
    '   User::uVar_ExcelSheetObjectList
    '--------------------------------------------------------
    '--- Added by SQL Data Side Inc. Nik-Shahriar Nikkhah
    ' Don't forget to add >>>>> Imports System.Data.OleDb
    Try
        Dim connectionString As String = Dts.Variables("User::uVar_ExcelConnectionString").Value.ToString()
        Dim sheetNames As New System.Collections.Generic.List(Of String)

        ' Using closes the connection even if an exception is thrown.
        Using excelConnection As New OleDbConnection(connectionString)
            excelConnection.Open()
            ' Each row of the "Tables" schema describes one table-like object in the workbook.
            Dim tablesInFile As DataTable = excelConnection.GetSchema("Tables")

            For Each tableInFile As DataRow In tablesInFile.Rows
                ' Other columns available: TABLE_TYPE, TABLE_SCHEMA, TABLE_CATALOG
                Dim currentTable As String = tableInFile.Item("TABLE_NAME").ToString()
                ' Sheet names that contain spaces come back wrapped in single quotes.
                currentTable = currentTable.Replace("'", "")
                ' Worksheets end with "$"; other entries (e.g. named ranges) do not.
                If currentTable.EndsWith("$") Then
                    sheetNames.Add(currentTable)
                End If
            Next
        End Using

        ' Store a String array (empty if no sheets were found) for the second For Each Loop.
        Dts.Variables("User::uVar_ExcelSheetObjectList").Value = sheetNames.ToArray()
        Dts.TaskResult = ScriptResults.Success
    Catch ex As Exception
        ' Report the error instead of swallowing it silently.
        Dts.Events.FireError(0, "Get The List of Excel Sheets", ex.Message, String.Empty, 0)
        Dts.TaskResult = ScriptResults.Failure
    End Try
End Sub
------------------------------
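The filtering rule in the script (strip single quotes, keep only names ending in "$") can be tried outside SSIS. Below is a minimal Python sketch of that rule; the function name and the sample TABLE_NAME values are made up for illustration and do not come from a real workbook.

```python
def real_sheet_names(table_names: list[str]) -> list[str]:
    """Keep only worksheet entries, mimicking the filter in the script."""
    sheets = []
    for name in table_names:
        cleaned = name.replace("'", "")  # drop quotes around names with spaces
        if cleaned.endswith("$"):        # worksheet names end with "$"
            sheets.append(cleaned)
    return sheets


# Made-up TABLE_NAME values, for illustration only.
sample = ["Sheet1$", "'Client List$'", "Sheet1$_xlnm#_FilterDatabase", "MyNamedRange"]
print(real_sheet_names(sample))  # ['Sheet1$', 'Client List$']
```

Entries such as named ranges or filter ranges do not end with "$" after cleaning, so they are skipped and never reach the data flow.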
Then comes the second loop. It uses the “uVar_ExcelSheetObjectList” package variable (populated in the “SCRIPT-----Get The List of Excel Sheets” step), which contains the list of Excel sheets, and loops through them one by one. On each iteration it places the current sheet name into the “uVar_ExcelSheetName” variable, which is used by the source object in the DFT.
You can set up the second loop in 2 steps.
1.Collection
2.Variable Mapping
Please check the DFT settings
How To Test?
- Set your SQL Server name and database name in the project parameters.
- Create the destination table (you can use DestinationClientTable.sql).
- Create a folder; in my case it is “C:\P_C\2018\SQL2016\SSIS\Exl2SQL-Exl97-2003”.
- Copy the above path into the “uPara_MainFolderPath” parameter.
- Create 2 subfolders: 1- SampleFile 2- ToBeProcessed.
- Copy your SampleFile97-2003.xls (97-2003 format) file into the SampleFile folder (I have provided two files; one of them, “SampleFile97-2003 - MultipleSheet.xls”, has multiple sheets).
- Copy all of your source files into the ToBeProcessed folder, or use SampleFile97-2003.xls (or “SampleFile97-2003 - MultipleSheet.xls”) to get the ball rolling.
- Run your SSIS package.
- Using Windows Explorer, go to the path defined in the “uPara_MainFolderPath” parameter; you will see a subfolder named “BackupFolder”.
- Your files have now moved from the ToBeProcessed folder to the BackupFolder folder, under a subfolder for the run date and time: “…\BackupFolder\yyyy-mm\yyyy-mm-dd-HHmmss\”.
- Also check your destination table; my sample uses the [dbo].[Client] table.
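The backup folder naming can be sketched as follows. This Python example assumes the pattern means year-month for the first folder and year-month-day-HourMinuteSecond for the second; the function name and the root path are hypothetical, and the package itself builds this path with its own variables.

```python
import ntpath
from datetime import datetime


def backup_folder_for(backup_root: str, run_time: datetime) -> str:
    """Build <root>\\yyyy-mm\\yyyy-mm-dd-HHmmss for a given run time."""
    month_folder = run_time.strftime("%Y-%m")
    run_folder = run_time.strftime("%Y-%m-%d-%H%M%S")
    # ntpath joins with backslashes regardless of the OS running this sketch.
    return ntpath.join(backup_root, month_folder, run_folder)


# Hypothetical root and run time, for illustration only.
print(backup_folder_for(r"C:\Main\BackupFolder", datetime(2018, 3, 5, 14, 7, 9)))
# C:\Main\BackupFolder\2018-03\2018-03-05-140709
```

Grouping runs by month first keeps each month's backups together, while the timestamped subfolder keeps separate runs on the same day from overwriting each other.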
Please note:
- If your ToBeProcessed folder (source folder) is located in another folder or on another machine, you can enter its valid path in the “uPara_ToBeProcessedFolder” parameter.
- If your backup folder is located in another folder or on another machine, you can enter its valid path in the “uPara_BackupMainFolderPath” parameter.
Do you want the code? Click here (from Google).
References…
- http://plexussql.blogspot.ca/2010/04/looping-through-excel-files-and-sheets.html
- https://docs.microsoft.com/en-us/sql/integration-services/connection-manager/excel-connection-manager
- https://docs.microsoft.com/en-us/sql/integration-services/connection-manager/connect-to-an-excel-workbook
- https://docs.microsoft.com/en-us/sql/integration-services/extending-packages-scripting-task-examples/working-with-excel-files-with-the-script-task
- https://docs.microsoft.com/en-us/sql/integration-services/load-data-to-from-excel-with-ssis
- http://www.madeiradata.com/load-data-excel-ssis-32-bit-vs-64-bit/