Showing posts with label SSIS Package. Show all posts
Showing posts with label SSIS Package. Show all posts
Saturday, 3 November 2012
Wednesday, 31 October 2012
Write data from multiple tables to single flat file.
Tuesday, 30 October 2012
Handling extra/missing columns in Delimited Flat file
Delimited Flat files (CSV) are one of the important non-relational
source for SSIS ETL but if we get robust file all the time then we can
use inbuild flat file source in SSIS. Surprisingly we are not that much
lucky and get the data in which columns are missing (or we can say extra
columns) like as follow...
Flat File INPUT
we Want Flatfile OUTPUT
- Flat file source which reads all data into one column and then splits them based on index of the comma in a derived column.
Here I'm giving you the demonstration to handle such file with
the help of derived column.We have to write expression (sometime
complex) to get the proper data in column format. In the above example I
am getting all the data as String and after splitting we are saving it
as string only (If you want to change the data type then you can use
Data Conversion task). If number of columns are more then this approach
is not recommended, In that case you should go with the first 2
approach.
Step1: Inside the Data Flow Task (DFT), Select a Flat File Source and Derived column.
Step2: Configure the FlatFile Source in such a way that you will get all input data in single column.
Note: We are keeping the Column name as "C" in the example.
Step3: Double click the derived column and use the following logic.
- For separating the column "C1" from the column "C", we need to fetch the data from position 1 to first occurrence of delimiter " , " so finding the first delimiter we have to use FINDSTRING SSIS function (Returns the location of the specified occurrence of a string within a character expression).
- SUBSTRING(C,1,FINDSTRING(C,",",1) - 1)
- Here we have to do "-1" because we have to exclude delimiter from the column value.
- For separating the column "C2" from the column "C", we have to fetch the data after the first delimiter to 2nd delimiter but before that we have to check whether we have 2nd delimiter in place
- FINDSTRING(C,",",2) != 0 ? SUBSTRING(C,FINDSTRING(C,",",1) + 1,FINDSTRING(C,",",2) - FINDSTRING(C,",",1) - 1) : SUBSTRING(C,FINDSTRING(C,",",1) + 1,LEN(C) - FINDSTRING(C,",",2))
- For separating the column "C3" from the column "C"
- FINDSTRING(C,",",2) == 0 ? "NULL" : (FINDSTRING(C,",",3) != 0 ? SUBSTRING(C,FINDSTRING(C,",",2) + 1,FINDSTRING(C,",",3) - FINDSTRING(C,",",2) - 1) : "NULL")
- For separating the Last column "C4" from the column "C",we have to fetch the data after the third delimiter to end of the row. and for calculating the last position of the row, we have to use LEN SSIS function
- FINDSTRING(C,",",3) == 0 ? "NULL" : SUBSTRING(C,FINDSTRING(C,",",3) + 1,LEN(C) - FINDSTRING(C,",",2))
Thursday, 18 October 2012
SSIS Package
Introduction
| EmpID | EmpName | DOB | DOJ | Salary |
| 1 | User1 | 1/1/1976 | 1/4/2000 | 20000 |
| 2 | User2 | 1/2/1976 | 1/5/2000 | 20000 |
| 3 | User3 | 1/3/1976 | 1/6/2000 | 20000 |
| 4 | User4 | 1/4/1976 | 1/7/2000 | 30000 |
| 5 | User5 | 1/5/1976 | 1/8/2000 | 20000 |
| 6 | User6 | 1/6/1976 | 1/9/2000 | 40000 |
| 7 | User7 | 1/7/1976 | 1/10/2000 | 20000 |
| 8 | User8 | 1/8/1976 | 1/11/2000 | 35000 |
| 9 | User9 | 1/9/1976 | 1/12/2000 | 20000 |
| 10 | User10 | 1/10/1976 | 1/6/2000 | 20000 |
Steps to Create SSIS Package
1. Open business intelligence development studio.2. Click on File-> New -> Project.
3. Select Integration service project in new project window and give the appropriate name and location for project. And click ok.
- Tool Box on left side bar
- Solution Explorer on upper right bar
- Property Window on lower right bar
- Control flow, data flow, event Handlers, Package Explorer in tab windows
- Connection Manager Window in the bottom
- In General Tab, enter connection manager name and description
(optional). Select source file, file format and delimiter. If first row
of source file contains headers, then select the checkbox “Column names
in the first data row".
- Select Column tab and check whether all columns are properly mapped or not.
- Select advance tab. Here you can add, remove or modify columns as per output stream requirement.
- Select preview tab to check how your output will look like:
8. Now Drag Data Flow Task from the Toolbox into the Control Flow Container.
- Data flow sources - Source makes data from different external data sources available to the other components in the data flow.
- Data flow transformations - Transformations can perform tasks such as updating, summarizing, cleaning, merging, and distributing data.
- Data flow destinations - Destination writes the data from a data flow to a specific data store, or creates an in-memory dataset.
- Connection Manager - Here we will specify source connection manager which we created for source file. If source file contains
nullvalues, select “Retain null values from Source as null values in the data flow” checkbox. - Columns -This tab allows the user to select required output columns and user can also change the output column names.
- Error Output - Using this tab, the user can decide the behavior of the component in case of failure. There are three options:
- Ignore Failure: Selecting this will ignore any failure while reading rows from source and the package will continue executing even any error occurred.
- Redirect Row: Selecting this will redirect the failed rows to other component which is connected with the error precedence constraints.
- Fail component: Selecting this will stop the execution of package in case of failure.
HigherSalary: [Salary] > 20000 (Redirect records if salary is greater than 20000)LowerSalary: For rest of the records
HigherSalary” conditional output and click ok.Records with Salary > 20000”GREEN”, it means package has run successfully, if there is any error, the component which has failed to execute will be shown in “RED” Color. We can see the package execution steps in the “Progress” tab.
Subscribe to:
Comments (Atom)