Wednesday, May 06, 2009

Task to watch for files in a directory missing in SSIS Task Repository

I'm reading: Task to watch for files in a directory missing in SSIS Task RepositoryTweet this !
I have worked on a couple of huge data warehousing projects. I have observed that, one of the common requirement in a typical ETL project remains of watching the files coming down from the source system. And this becomes a must if the source system in a Mainframe system.

The general architecture remains like this:
  1. Feeds arrive from the source system in case if the source is some legacy system like mainframe. Also there may be jobs which fetch data from the OLTP system via push/pull methodology and then it's staged into a text file.
  2. Feeds are guided to the target ETL directory through some schedulers, for ex. Autosys
  3. A File Watcher remains on the ETL system which detects the presence of the file/feed, picks up the same and then after ETL, the same gets loaded to either a staging DB or to the destination DB.
  4. Generally it's BCPed to the staging DB, and after application of business rules, it's loaded into the destination DB or data warehouse.

In SSIS what I have seen missing is the File Watcher task. It needs to be implemented by either infinite looping to watch for a file (it's a very bad methodology of implementing the functionality) or using WMI for the same, or finally thru a work-around by signalling mechanism or through scheduling.

I found a similar implementation of the same in SQLIS, where a component called "File Watcher Task" is available. But seems like it has serious limitations and is also not bug free. I am just assuming this from the comments and components description. May be Microsoft should make this functionality available through some service packs or at least in the next version.

No comments:

Related Posts with Thumbnails