Thursday, March 29, 2012

Distributed ETL server

I work in the data warehouse team of my organization. We are currently rearchitecting our server environment. One of the ideas on the table is to devote a separate server for ETL processing. The databases would reside on several other servers. The ETL server would run the SSIS packages. I'm questioning if this would be a good idea.

The database servers would continue to carry the load of the query processing for the ETL. But with the ETL process on a separate server, the resultsets would need to go over the network for the SSIS package to then work on them. Plus, they would then have to go back over the wire to the destination server.

Are there advantages to this setup? Does this setup have better scalability? Or, would it be better to run the ETL from either the source or destination database server?

Thanks for your consideration.

Yes, maybe and maybe.

It makes sense in that you have a nice server with dedicated CPU and memory for the ETL. This is especially good for memory intensive stuff like lookups, (64bit helps too) and there is no CPU contention with SQL, but as you say you then incur the extra network cost.

The real answer lies in how much work is being done, CPU and memory wise, versus the extra network cost. I think that will depend on package complexity, types of transformations used, amount of data and general network performance, Better test it if you want a real answer.

The only other big question; is your DW SQL box busy getting hit by other "users"? Would it be advantageous to protect that box by offloading the ETL? Maybe the network cost would slow the ETL down, but protect resources for regular DW queries such that they perform better and that is more important than the network cost and general speed decline that is normally associated with extra network hops. Saying that, depending on the transformation work being done, the network hit may not be a factor. Perhaps the source extract network hit is more costly that the more localised ETL to DW hit.

It really depends....!

No comments:

Post a Comment