Showing posts with label SSIS. Show all posts
Showing posts with label SSIS. Show all posts

Oracle OLE DB source in SSIS

An OLE DB source to Oracle can be added to a SSIS package.

Example data

Server: locksnlatches.mydomain.com
Oracle instance/SID: dwh
Username/schema: scott
Password: tiger

An OLE DB source to Oracle can be added as follows:
  • From the menubar, select SSIS -> New connection... -> OLEDB -> Add -> New
    Provider: Oracle Provider for OLE DB
    Server or filename: locksnlatches/dwh
    User name: scott
    Password: tiger
Notes: 
  • The "Oracle Provider for OLE DB" provider always exists in SSIS. It is a wrapper around the Oracle Client driver that is installed when installing Integration Services. However, the Oracle Client has to be installed separately. It can be downloaded from the Oracle website: instantclient-basiclite-nt-11.2.0.2.0. Install the 32 bit version of the driver to be able to use it in the design environment. The Visual Studio / SSIS IDE does not support 64 bit components
  • It is possible to use the 64 bit Oracle Client driver during execution of the package, e.g. via dtexec, but it needs to be installed additionally. The SSIS package has to be configured to use this different version during runtime.
  • If the server cannot be found, then use the fully qualified name. One reason might be that the Oracle server is located in a different (Windows) domain as the SSIS server.

Use 32-bits Excel Source with 64-bits SSIS

The following is still work in progress and more of a braindump than anything else, but I've put it here, because someone might already find some use for it

Straight out of the box, you cannot use the Excel Datasource in SSIS on a 64-bits Windows platform , because the Excel Datasource uses a 32-bits OLEDB library. Microsoft has not released a 64-bits version of this library.

0) Use MADE 2010
There is 64 bits version for Office 2010, so for Excel 2010. It is called "Microsoft Access Database Engine  2010 Redistributable" of which there is a 32- and 64-bits version. We're interested in the 64-bits version, but this package can only be installed after removing all 32 bits Office applications!

a) Install MADE 2010 while NOT removing 32 bits Office applications:

AccessDatabaseEngine_x64.exe /passive

Notes:

  • THE SYSTEM WILL AUTOMATICALLY REBOOT, SO SAVE WORK BEFOREHAND!
  • Start Excel once to go through the configuration process. Start it a second time. It the configuration process starts again, then disable it by renaming this file to "setup_disabled.exe":
    c:\Program Files (x86)\Common Files\microsoft shared\OFFICE14\Office Setup Controller\setup.exe
  • run the repair option for the Microsoft Access database engine 2010 program inside "add/remove programs" 

b) If that doesn't work, then also install the Data Connectivity Components:
http://www.microsoft.com/en-us/download/confirmation.aspx?id=23734

1) Trick Windows to use 32-bits ODBC data sources
Try adding the SysWow64 folder to the beginning of your path:

Path=%systemroot%\SysWow64;...old values here...

When done via the command prompt via the SET-command, it will be undone when that window is closed.

2) Run SSIS in 32-bits mode
Use the 32-bit DTEXEC.EXE to execute the package containing the Excel Datasource. It is located in "Program Files (x86)\Microsoft Sql Server\90\Dts\Binn" (or for SSIS 2008: "Program Files (x86)\Microsoft Sql Server\10\Dts\Binn")

You can schedule this using "task scheduler" or SQL Agent.

3) Compile package as 32-bits
Package configuration properties -> Debugging -> Debugging Options -> Run64BtRuntime -> set to False.

4) Even dirtier trick: convert Excel spreadsheet to CSV beforehand
This can be done in several ways, with minimal coding required:
  • If Excel is installed, then create a VBS script that uses CreateObject("Excel.Application") to create an Excel object. Then open the XLS file and save it as a CSV using this object.
  • ... or a perl script using the "Spreadsheet::ParseExcel" module
  • ... or a powershell script using "comobject Excel.Application"
  • Install OpenOffice and program a small macro in it's OpenOffice basic language. Then create a VBS or DOS script that opens OpenOffice and calls the macro with parameters, one being the source XLS, and one being the target CSV.
  • Without installing anything on the server running SSIS: write a small C# application that utilizes the Excel OLEDB data provider built into Windows to read data from the Excel sheet. Then write the results to a CSV file.
  • ... or write a small C# application that utilizes the opensource Excel Data Reader library to read data from the Excel sheet. Then write the results to a CSV file. This library appears to be buggy however.
  • Freeware commandline tool: XLS2CSV by Zoom Technology. Parses only the first sheet. The column separator and text quotes are configurable throught an ini file
  • Freeware commandline tool: XLS2CSV by Tom Crow. Parses all sheets to separate CSV files
ODBC/DSN configuration can be found here:
  • 32 bits ODBC connections: %windir%\syswow64\odbcad32.exe
  • 64 bits ODBC connections: %windir%\system32\odbcad32.exe
Notes:
  • ADO also uses the OLEDB library.
  • The OLEDB library is a piece of code that gets loaded and compiled into the calling application at runtime.
  • If you use the MADE 2010 Redistributable package, then be aware of a small bug when connecting to Office 2010. You have to use "MICROSOFT.ACE.OLEDB.12.0" as the driver name in stead of the specified "MICROSOFT.ACE.OLEDB.14.0".
  • Although previous Office versions need to be de-installed before installing MADE 2010, they can be reinstalled afterwards.

Databases

Row-oriented:
Oracle DB
SQL Server
SQL Server CE
MySql: InnoDB
Sybase Advantage Database Server (ADS)
MS Access

Column-oriented:
Extreme-DB
InfiniDB Enterprise Edition
Infobright
MonetDB
ParAccel Analytic Database
Sybase IQ
Sybase Adaptive Server Enterprise (RDBMS & column-oriented)
Vectorwise
Vertica

Appliance:
Aster Data (MPP, MapReduce)
Greenplum (MPP, MapReduce)
Netezza (MPP)
Teradata (MPP)

In-process:
SQLite

Cannot load package & required privilege is not held by the client

How to start a filesystem dtsx package from sql agent, without a config file or without adding a package password to the command line:

- ssis: make note of the package owner, should be windows (domain) account
- ssis: package: encrypt with userkey
- os: sql agent service should run under LocalSystem account
- os: add package owner to local groups: SQLServer2005DTSUser$SERVER, SQLServer2005SQLAgent$SERVER
- ssms: create credential for package owner
- ssms: create proxy using created credential, add all principals, add all subsystems
- ssms: use the created proxy as "run as" user for the job step that starts the package

There is no easy option to specify the password for a package from sql agent, hence the "encrypt with userkey".

See also: http://support.microsoft.com/kb/918760

Background information
Principals have access to a proxy. A proxy is a mapping to a credential. A credential is a mapping to a Windows account.

1. SSMS / Security / Credential, create a credential. Set identity to an existing account that has the appropriate rights to run the job.
2. SQL Agent / Proxies / SSIS Package Execution, create a new proxy
2a. Proxy / General / Credential name: select the credential from step 1
2b. Proxy / General / Subsystems: select all
2c. Proxy / Principals / MSDB Role, add principals: sqlagentuserrole, sqlagentreaderrole, sqlagentoperatorrole. This are the principals that can access the proxy.
2d. The account must have "logon as a batch job" privileges (check with secpol.msc)
4. Set SQL Server Agent service account to LocalSystem
5. If needed, run a job step under another account. Typically: a SQL Server account based on a Windows account: SQL Server Agent / Jobs / job / Steps / Edit / Advanced / Run as user,
- select proxy
- type SSIS package
- package source File system

If the "A required privilege is not held by the client" error occurs (from Microsoft website):

1. Set the SQL Server Agent service account in SQL Server Configuration Manager to the LocalSystem account.
2. Stop and then start the SQL Server Agent service.
3. Reset the SQL Server Agent service account in SQL Server Configuration Manager back to the original account.
4.Stop and then start the SQL Server Agent service.

From Marc at Stackoverflow:

So it boils down to really just two accounts:

one account is needed to run the SQL Server Agent Windows service - this is a Windows account on your machine / server which needs to have enough permissions to run the service, start and stop it - either use LocalSystem, Network Service, or whatever other Windows account you have to run services with

The other account would be the account to run your SQL Server Agent steps under - that's typically a SQL Server account (which could be based on a Windows account), and it needs enough privileges inside SQL Server to do its job, e.g. it needs access to the database objects and all. I would strive to have just one account for each app that runs the SQL Server jobs - makes life a whole lot easier!

From Influent at SQLTeam:

1. set ProtectionLevel to DontSaveSensitive in package properties in Visual Studio (using SSPI anyway so not sure it mattered)
2. created configuration file (had to alter path to config file in dtsx file since it's different on servers than in dev environment)
3. instead of importing the package into MSDB, set the job step Package Source to "File system" and pointed to dtsx file on local drive of server
4. set the Configurations tab of the job step to point to the dtsConfig file created in step 2
5. made sure the AD user kicking off the job was part of local server groups: Administrators, SQLServer2005DTSUser$SERVER, and SQLServer2005SQLAgent$SERVER (I eventually removed Administrator permissions because I only needed it to log to Windows Event Log)
6. made the associated SQL user to the user in step 5 part of the SQLAgentUserRole in the msdb database
7. made sure the local user assigned to the SQL Agent service was part of local server groups: Administrators, SQLServer2005DTSUser$SERVER, and SQLServer2005SQLAgent$SERVER

From xprog.blogspot.com:

1) Executed as user: DOMAIN\user. The process could not be created for step 1 of job 0xB013D0354C8CBD46B79E948740EF5441 (reason: 1314). The step failed.
The error 1314 is "A required privilege is not held by the client".

This message indicates that the SQL Server Service account doesn't have the required rights to switch the security context to the proxy account.

To fix it verify:
1. The proxy account have the "Act as part of the operating system" privilege.
2. The SQL Server Account is in the group
SQLServer2005MSSQLUser$$
3. The Proxy account is in the group
SQLServer2005DTSLUser$$

SQL Server Agent Account must have 'Adjust memory quotas for a process' (or 'Increase quotas' in pre-Windows 2003) privilege on SQL Server.

From FriggleFragle at MSDN:

an issue i found was that sql server agent and ssis had been changed to not use the local system account. this caused me to continually get the dreaded --> "A required privilege is not held by the client" error. i fought with it all afternoon and eventually updated the services to use the local system account and it worked. perhaps this will save some other as much time as i wasted.

Concatenating text files

Jamie Thomson has a post on concatenating multiple text files (using the ForEach Loop) and subsequently loading this concatenated file in one go. You can find the post here:

http://blogs.conchango.com/jamiethomson/archive/2006/06/22/4116.aspx

I use to do this with a small batch file. The batch file is called concat.bat and takes in the following parameters: %1 for the extension of the files to concatenate, e.g. txt and %2 for the folder in which the files are stored.

The body of the batch file is as follows:
set I=
for /F %%f in ('dir "%2"*.%1 /B /O:N /A:-D') do set I=%%f
if not defined I goto EXIT
copy "%2"*.%1 "%2"\backup
copy "%2"*.%1 "%2"%I:~0,5%concat.tmp
if exist "%2"%I:~0,5%concat.tmp del "%2"*.%1 > NUL
ren "%2"*concat.tmp *.%1
:EXIT
Example of the usage of the batch file:
The batch files is called with %1 = .txt and %2 = c:\temp. Say there are three files in c:\temp called test1.txt, test2.txt, and test3.txt with create dates 2006/01/01, 2006/01/02, and 2006/01/03 respectively. The result is one textfile called text1.txt containing the content of all three files.

Explanation of each line of code:

Set I=
dimensions the variable I.

for /F %%f in ('dir "%2"*.%1 /B /O:N /A:-D') do set I=%%f
iterates over all files in folder c:\temp with extension ".txt", ordered descending by create date. Each iteration "do set I=%%f" assigns the filename to variable I. After iteration I contains the value of the last filename in the list. This is the file created first according to create date, since the list is sorted decendantly.

if not defined I goto EXIT
checks if I has a value, if not, the batch ends. I.e. there where no .txt files in the folder.

copy "%2"*.%1 "%2"\backup
copies the three sourcefiles to a backup folder. This is done for auditing purposes.

copy "%2"*.%1 "%2"%I:~0,5%concat.tmp
concatenates all .txt files to one file with the filename stored in I. So, a file called test1concat.tmp is created, containing the union/concatenation of the three files.

if exist "%2"%I:~0,5%concat.tmp del "%2"*.%1 > NUL
does a check to ensure the .tmp file has been created and deletes the three source files, so the inbox in emptied.

ren "%2"*concat.tmp *.%1
renames the .tmp file to the final filename, in this case text1.txt.

EXIT:
exits the batch file.

Currently it only works if all files have the same filename length, so 5 (="textX") in the case of the example. Of course this can be optimized, e.g. search for the first dot and take the left part of the file name.

I have no idea if this approach is better with regard to performance. I haven't tested it on thousands of files and have not done a comparison to the approach used in the post by Jamie. On the other hand, this approach will also work nicely in DTS, where there is no ForEach Loop.

Note:
The "for /F" functionality is not available in all Windows versions. It is available in Windows 2003 server and Windows XP.