Understanding the DATALOSS Advisory in TIBCO Rendezvous
While working with TIBCO Rendezvous you have probably run into the problem of DATALOSS and may be aware of how severe its consequences can be; in the worst case it can lead to a TIBCO storm (a situation where a TIBCO publisher floods the network with so many messages that it exhausts the bandwidth of the WAN links, resulting in a complete breakdown of network lines and communication). This TIBCO tutorial continues my TIBCO tutorial series, and in this short tutorial I will explain what DATALOSS is in TIBCO and how we can minimize or prevent DATALOSS in TIBCO RV.
To understand DATALOSS in TIBCO Rendezvous, what causes it, and how we can prevent it, let's take a step back and see how exactly TIBCO Rendezvous (TIBCO RV) works.
TIBCO Rendezvous, or TIBCO RV, provides a messaging solution: a TIBCO publisher publishes messages on a multicast network, and a TIBCO subscriber listens on the same multicast network, on the same service, and on a particular topic, also referred to as a subject.
Whenever a message arrives on that service, the TIBCO daemon, also referred to as the TIBCO Rendezvous daemon (RVD), checks whether the subscriber has registered interest in the incoming message; if so, it delivers the message to the program.
If the program is too busy or too slow to process incoming messages, some of the messages may expire and be dropped at the RVD level, and retransmission of those messages will be requested.
Every RVD has a pre-configured reliability parameter, usually 30-60 seconds. For that long, the Rendezvous daemon keeps the messages it sends out in memory so that it can service retransmission requests from TIBCO subscribers.
When an RVD receives a retransmission request, it sends the requested messages again, unless the reliability interval (30-60 seconds) has elapsed since the original message was sent. In that case the messages are no longer in memory, and the requesting RVD issues a DATALOSS.INBOUND.BCAST advisory. If the message is still in memory, the sender retransmits it. So imagine that 1 subscriber out of 100 is slow in processing messages and makes frequent retransmission requests: the sending RVD will bombard the network again and again with the same messages, which starts eating network bandwidth and in the worst case could result in a TIBCO Rendezvous storm.
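The retention behaviour described above can be illustrated with a small toy model. This is only a sketch in plain Python, not TIBCO code: the class `SenderBuffer`, its methods, and the use of sequence numbers are all hypothetical names chosen for illustration, and the exact boundary handling of the reliability interval is an assumption.

```python
from collections import OrderedDict


class SenderBuffer:
    """Toy model of a sending daemon's retransmission buffer (not TIBCO code)."""

    def __init__(self, reliability_seconds=60.0):
        # Assumed reliability interval; the article quotes 30-60 seconds.
        self.reliability_seconds = reliability_seconds
        # sequence number -> (send time, payload), oldest first
        self._sent = OrderedDict()

    def send(self, seq, payload, now):
        """Remember a sent message so it can be retransmitted later."""
        self._sent[seq] = (now, payload)

    def _expire(self, now):
        """Drop messages that are older than the reliability interval."""
        while self._sent:
            seq, (sent_at, _) = next(iter(self._sent.items()))
            if now - sent_at <= self.reliability_seconds:
                break
            self._sent.popitem(last=False)

    def request_retransmission(self, seq, now):
        """Return the payload if it is still held, else None.

        None stands for the case where the receiver would end up
        reporting a DATALOSS advisory.
        """
        self._expire(now)
        entry = self._sent.get(seq)
        return entry[1] if entry is not None else None


if __name__ == "__main__":
    buf = SenderBuffer(reliability_seconds=60)
    buf.send(1, "price update", now=0)
    print(buf.request_retransmission(1, now=30))  # still in memory
    print(buf.request_retransmission(1, now=61))  # expired
```

Every successful call to `request_retransmission` in this model corresponds to the same message being put on the network again, which is why a single slow subscriber that keeps asking can multiply traffic for everyone.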
DATALOSS can have various causes. I have listed some of the common ones that could potentially result in DATALOSS and a subsequent TIBCO storm:
1) The subscriber is too heavily loaded and doesn't have enough CPU to process the messages.
2) Either the publishers don't have the resources available to retransmit the requested messages, or the subscribers can't process the messages they are receiving quickly enough.
3) There are problems at the network layer.
DATALOSS can be INBOUND or OUTBOUND. INBOUND means an incoming message has been lost, and OUTBOUND means an outgoing message has been lost; which one you see depends on which RVD, the sending or the receiving one, issues the DATALOSS advisory.
How to identify DATALOSS in TIBCO RV
As stated above, in case of DATALOSS the Rendezvous daemon (RVD) will issue a DATALOSS.INBOUND.BCAST advisory. If you see these advisory messages in your log file frequently, it's time to raise an alert and inform the network team. You can also check at what times these advisories arrive and look at your machine's CPU load during those periods to spot any local problems.
The TIBCO reliability parameter plays an important role in avoiding DATALOSS, and a careful, well-chosen value for it can minimize DATALOSS.
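As a rough illustration of checking a log for these advisories, the sketch below counts DATALOSS advisories per minute so the spikes can be compared with CPU load for the same period. It assumes, purely for the example, that each log line starts with a `YYYY-MM-DD HH:MM:SS` timestamp and contains the advisory name as text; your application's log format will likely differ, and the function name `count_dataloss_per_minute` is hypothetical.

```python
import re
from collections import Counter

# Matches advisory names such as DATALOSS.INBOUND.BCAST anywhere in a line.
ADVISORY = re.compile(r"DATALOSS\.(INBOUND|OUTBOUND)\.\w+")
# Assumed log prefix "YYYY-MM-DD HH:MM:SS"; only the minute part is captured.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2})")


def count_dataloss_per_minute(lines):
    """Count DATALOSS advisories per (minute, direction) in log lines."""
    counts = Counter()
    for line in lines:
        advisory = ADVISORY.search(line)
        if advisory is None:
            continue  # not a DATALOSS advisory line
        stamp = TIMESTAMP.match(line)
        minute = stamp.group(1) if stamp else "unknown"
        counts[(minute, advisory.group(1))] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical log lines, made up for this example.
    sample = [
        "2024-01-01 10:15:02 advisory DATALOSS.INBOUND.BCAST",
        "2024-01-01 10:15:40 advisory DATALOSS.INBOUND.BCAST",
        "2024-01-01 10:16:01 regular application message",
    ]
    for (minute, direction), n in sorted(count_dataloss_per_minute(sample).items()):
        print(minute, direction, n)
```

Minutes with a high count are the ones worth comparing against CPU load on the subscribing machine before escalating to the network team.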
If you would like to learn more about TIBCO RV, you can see my earlier TIBCO RV tutorials here:
TIBCO tutorial part 1
TIBCO tutorial part 2
TIBCO tutorial part 3
TIBCO tutorial part 4
Hi We are getting tibco storms in our network. Any suggestions how to reduce data losses caused by retransmissions
Hi @Anonymous, please find the slow consumers, usually an old desktop or server that is not able to handle messages at the rate the producer is publishing. You can do that by finding all clients subscribing to that topic and analyzing them. Unless you remove the slow consumers, they will keep losing data, making retransmission requests, and causing TIBCO storms.