BizTalk SQL Jobs

Guys,
You would have been playing around BizTalk for a long time. If you have  been staying close to the Production environment you would get a chance to see  the real time issues of the SQL agent jobs that are used in BizTalk. Me  as a functional consultant in BizTalk have seen some interesting scenarios  happening in our Production environment wherein our processing came to a halt.  Do you believe if this is caused by these agent jobs? Yes it is 🙂

I  will give you a fair analysis of what I have done to understand these jobs. I  have bundled it together with what i got from the internet and from my  friends.

BizTalk Server 2006 includes the following SQL Server Agent jobs  to assist you in managing the BizTalk Server databases:

SQL Agent should  be running at all times.

The following is the list of Jobs that are used  by BizTalk for internal processing.

1. Backup BizTalk Server –  LogShipping
2. CleanupBTFExpiredEntriesJob_BizTalkMgmtDb
3. DTA Purge  (BizTalkDTADb)
4. MessageBox_DeadProcesses_Cleanup_BizTalkMsgBoxDb
5.  MessageBox_Message_Cleanup_BizTalkMsgBoxDb
6.  MessageBox_Message_ManageRefCountLog_BizTalkMsgBoxDb
7.  MessageBox_Parts_Cleanup_BizTalkMsgBoxDb
8.  MessageBox_UpdateStats_BizTalkMsgBoxDb
9.  Operations_OperateOnInstances_OnMaster_BizTalkMsgBoxDb
10.  PurgeSubscriptionsJob_BizTalkMsgBoxDb
11.  Rules_Database_Cleanup_BizTalkRuleEngineDb
12.  TrackedMessages_Copy_BizTalkMsgBoxDb

JOB: Backup BizTalk
Server – LogShipping

Criticality: Very High. Estimate 1-2 days  of continuous outage will take down system as disk space will run out.

Functional description and purpose.

The most important job in the  environment! This job performs full database and log backups of the BizTalk  Server databases and ships the logs to a remote SQL server. It is  important to make sure that all database backups are done transactionally so  this job to goes to each database in the system and transactionally marks the  log. It then takes a backup of all logs and truncates the logs [to control database growth].

You use the Backup BizTalk Server job to back up all of  the databases in your BizTalk Server source system, except for some databases
used by Business Activity Services (BAS) and Business Activity Monitoring (BAM).  The source system is the server or group of servers that contain live data.  Because the BAS and BAM databases have different backup and restore  requirements, these databases are backed up and restored using other  methods.

Additionally, the SQL job does not delete old backup files out  of the box. You must manage those files to conserve disk space by backing them
up and deleting them after new full backups have been  created.

Frequency:

Occurs every 1 day(s), every 2 hour(s) between
12:00:00AM and 11:59:59 PM.

A full backup happens once a day at 12:00AM.
Incrementals happen based on the schedule throughout the day every 2 hours.

Dependencies:
There is no dependency to other jobs. But the following  databases are backed up by this job by default:

1. BAM Primary Import  database (BAMPrimaryImport)
2. BizTalk Tracking database (BizTalkDTADB)
3.  BizTalk Management database (BizTalkMgmtDB)
4. BizTalk MessageBox database (BizTalkMsgBoxDB)
5. Rule Engine database (BizTalkRuleEngineDB)
6. SSO database
7. BizTalk Base EDI database (R2 only)

Additional databases  can be backed up by adding a reference in table
BizTalkMgmt..adm_OtherBackupDatabases. The following have been added to the
current list of backups:

1. BAM Archive Database (BAMArchive)
2. SSO  database (SSODB)

Failure:
In case of any hardware failures,  the backed up temporary logshipping database can be used to restore the system  from last known logmark. Data that was processed during the problem occurrence  will be lost. This can’t be avoided.

It is also important to note if the  job fails consecutively for a long time, the BizTalk database “data files” and  “log files” will continue to grow exponentially. This in turn will affect the  performance of the whole BizTalk environment and eventually consume the entire  disk space if not corrected immediately.

——

JOB: DTA
Purge (BizTalkDTADb)

Criticality: Medium

Functional
description and purpose

As BizTalk Server processes more and more data on  your system, the BizTalk Tracking (BizTalkDTADb) database continues to grow in  size. Unchecked growth decreases system performance and may generate errors in  the Tracking Data Delivery Service (TDDS). In addition to general tracking data,  tracked messages can also accumulate in the MessageBox database, causing poor  disk performance.  While previous versions of BizTalk Server included sample  scripts for archiving tracked messages and purging the BizTalk Tracking  database, BizTalk Server 2006 automates both processes using the DTA Purge and  Archive job. By archiving and purging data from the BizTalk Tracking database,
you can maintain a healthy system, as well as keep your tracking data archived  for future use. Because BizTalk Tracking database archives accumulate over time  and consume disk space, it is a good idea to move the BizTalk Tracking database  archives to secondary storage on a regular basis.

Frequency:
Occurs
every 1 day(s), every 1 minute(s) between 12:00:00AM and 11:59:59
PM.
Dependencies:
There is no dependency for this
job.

Failure:
As the job is scheduled to run every minute, there won’t  be any problem if this fails once. But if the job keeps on running for a long  time, then we have an issue. The BizTalkDTADB database file size will grow  eventually, adding it to the load for the job. This in turn will affect the
performance of the whole BizTalk environment.

——–

JOB:
MessageBox_DeadProcesses_Cleanup_BizTalkMsgBoxDb

Criticality:
High

Functional description and purpose
This job detects when a  BizTalk Server host instance (NT service) has stopped and releases all work that  was being done by that host instance so that it can be worked on by another host  instance.

Frequency:
Occurs every 1 day(s), every 1 minute(s) between
12:00:00AM and 11:59:59 PM.

Dependencies:
There is no dependency for  this job.

Failure:
Consider the case if Receive Host on machine_xx has been stopped due to some problems. Automatically all the messages that was  previously handled by this instance would be transferred to machine_xy. If there  were a problem with this job, then all the messages that were received by  machine_xx would not be processed until the job is running again to re-assign  the work.

—–

JOB:
MessageBox_Message_Cleanup_BizTalkMsgBoxDb

Criticality: Very
High. Estimate 1 day of continuous outage will have significant performance
impact on system.

Functional description and purpose

Do not
manually start this job.

This job removes all messages that are no  longer being referenced by any subscribers in the BizTalk MessageBox  (BizTalkMsgBoxDb) database tables.

This is an unscheduled job, which is  started by the MessageBox_Message_ManageRefCountLog_BizTalkMsgBoxDb job.

Whenever a message is published in the BizTalkMsgBoxDB, a ref count is  calculated. This would be the sum of all matching subscribers for this message.  Whenever a subscriber subscribes a message, 1 is deducted from ref count.  Finally when the ref count reaches 0 that message should be removed from the  Message Box. This work is handled by this job.

Frequency:
This job is
not scheduled. This job stems out from
MessageBox_Message_ManageRefCountLog_BizTalkMsgBoxDb. Both the jobs are
responsible for cleaning up the messages in the MessageBoxDB.

Dependencies:
This job depends on the job  “MessageBox_Message_ManageRefCountLog_BizTalkMsgBoxDb”.

Failure:
As  this job is run by the MessageBox_Message_ManageRefCountLog_BizTalkMsgBoxDb job,  we actually don’t get any problems. But consider the case, if our environment is  receiving more load when compared to the amount that it can process. In that  case, the job would take some time to clean up the messages. As it takes a long  time, more messages adds up to the load, making it to execute for a long time.
This will have an impact of environment causing it to slow down the message that  is received. The key point is that over a period of time, the MessageBoxDB  increases in size and will make the environment to throttle.

——

JOB:
MessageBox_Message_ManageRefCountLog_BizTalkMsgBoxDb

Criticality:
Very High. Estimate 1 day of continuous outage will have significant performance  impact on system.

Functional description and purpose

This  job manages the reference count logs for messages and determines when any  subscriber no longer references a message. Even thought this SQL Server Agent  job is scheduled to run once per minute, the stored procedure that is called by  this job contains logic to ensure that the stored procedure runs continually.  This is by design behavior and should not be  modified.

Frequency:
Occurs every 1 day(s), every 1 minute(s) between
12:00:00AM and 11:59:59 PM.
This will start the  MessageBox_Message_Cleanup_BizTalkMsgBoxDb job. If that job completes, then this  job will fail. But as per the design, even after its failure, it will  automatically start. So it will resemble like a long running job (But actually  there is no issue in that).

Dependencies:
This job depends on the job  “MessageBox_Message_Cleanup_BizTalkMsgBoxDb”.

Failure:
Failure of this  job doesn’t affect the environment. But it is actually the cleanup job that is  more important. As previously mentioned the cleanup should complete as soon as  possible, other wise we are gonna have a worse problem  ahead.

—–

JOB:
MessageBox_Parts_Cleanup_BizTalkMsgBoxDb

Criticality:
Medium

Functional description and purpose

This job removes all  message parts that are no longer being referenced by any messages in the BizTalk  MessageBox (BizTalkMsgBoxDb) database tables. All messages are made up of one or  more message parts, which contain the actual message  data.

Frequency:
Occurs every 1 day(s), every 1 minute(s) between
12:00:00AM and 11:59:59 PM.

Dependencies:
There is no dependency on  other jobs.

Failure:
As each message contains one or more message  parts, they have to be removed when any subscribers no longer reference them.  This job calls a stored procedure that truncates tables PartRefCountLog1 and  PartRefCountLog2. The Stored procedure places a exclusive lock on both these   tables. If there are any shared locks on these tables when this job is running,  it would not be able to truncate the records, making the table record size to  increase drastically. Due to heavy load, the job will take more time to complete affecting the overral performance.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s