February 15, 2020 / kiranpatils

RedisTimeoutException with Sitecore 9.0.2 – 9.3

Challenge:

Recently, I was involved in Sitecore 9 Azure PaaS implementations for a few of our enterprise clients, who had a huge amount of data and traffic, and multi-region CD servers to handle the load.

We faced RedisTimeout challenges during the 9.0.2 launch, and recently we faced similar challenges during a 9.3 upgrade. They were milder in the latter case than in the former. But in both cases we spent a good amount of time and learnt something new, so I thought I would share it with you. Hopefully it reduces your resolution time.

So, if you are also haunted by RedisTimeout exceptions, your launch is due soon, and you are looking for solutions, this post might help you.

Solution:

I won’t spend much time explaining the basic details. But I would strongly recommend that you read the following great posts:

If you’ve already been through the above posts, great! Please keep reading.

Before we talk about the solution, let’s talk about why this happens.

If you want to know more about session state and how it works, I would strongly recommend reading this great post: https://mhwelander.net/2016/05/19/lets-talk-about-session-state/

By this time, your basic understanding should be clear. In XP Scaled and multi-region setups, most applications will prefer out-of-process (OutProc) session state, for the obvious reasons explained in the post above.

So, if you are expecting a lot of traffic, your CD servers need enough power to handle the session threads, and your Redis server needs to be powerful enough to handle those session requests.

When you are troubleshooting RedisTimeout issues, find your bottleneck first. Here are a few of our learnings for identifying the bottleneck:

  1. Check the CPU usage of the CD server. Is it high? Then your CD server needs more power. (Just a note: if your CD’s CPU is high, there might be other reasons for it, which need to be checked. But for the simplicity of this post, we are assuming all other issues have already been taken care of.) Give it a tier upgrade and check again.
  2. If you are not noticing high CPU on the CD server and Redis is overloaded, then it is good to add more power to your Redis instance.
  3. If you’ve upgraded both of them to a higher tier and you are still noticing Redis errors, then you should apply the configurations given in the KB article and mentioned below. Please keep reading this post.
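
The triage order above can be sketched as a small decision function. This is only an illustration in Python: the function name, the returned labels, and the 80% threshold are assumptions made for the example, not values from Sitecore or Azure. Your own load tests should decide what "high" means.

```python
# Hypothetical sketch of the bottleneck triage described above.
# The threshold and the action labels are illustrative assumptions.

HIGH_LOAD_PERCENT = 80.0  # assumed cut-off for "high"; tune from your own load tests


def next_step(cd_cpu_percent: float,
              redis_load_percent: float,
              still_timing_out: bool,
              threshold: float = HIGH_LOAD_PERCENT) -> str:
    """Return the next action to try, checking the CD server first."""
    if cd_cpu_percent >= threshold:
        # Step 1: the CD server itself is the bottleneck.
        return "upgrade-cd-tier"
    if redis_load_percent >= threshold:
        # Step 2: CD is fine, but Redis is overloaded.
        return "upgrade-redis-tier"
    if still_timing_out:
        # Step 3: both tiers look healthy, yet timeouts persist.
        return "apply-kb-configuration"
    return "no-action"


if __name__ == "__main__":
    print(next_step(92.0, 40.0, True))   # CD CPU is high, so fix that first
    print(next_step(35.0, 88.0, True))   # Redis is the bottleneck
    print(next_step(35.0, 30.0, True))   # tiers are fine, move to the KB settings
```

The CD check comes first on purpose: a busy CD server can cause timeouts on its own, so scaling Redis before ruling that out may not change anything.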

Can you show me some more symptoms?

If you are not sure whether you are impacted by RedisTimeout exceptions, the following are a few symptoms:

AI Logs


Sitecore Logs


Azure tools (Latest) query: traces | where timestamp > now(-1d) | where message contains "Exception" or customDimensions.StackTrace contains "Exception" | where customDimensions.Role == "CD" | sort by timestamp desc | limit 100
Dump file analysis

I’ve faced this error on multiple versions, and the solution for each version was slightly different. So, I will divide the solution into two parts:

Whichever version you are on, you must refer to this KB article: https://kb.sitecore.net/articles/464570. Please don’t get overwhelmed by the information it provides; take a few deep breaths and read it, because it’s good info. I personally feel the solution steps are not as straightforward as they should have been. But we are all learning together 🙂

  • Sitecore 9.0.2 : We noticed a lot of performance challenges, and during troubleshooting we found that our applications had been set up (~2 years ago) as per Sitecore’s old tier recommendations, which could not handle the load. We also had to upgrade our Redis tier to C2 (C1 might work if your site is not under heavy load).
    • We did some load testing and performance tuning, and we learnt that upgrading the CD apps to P2V2 helped. (Do some performance testing to find the right tier; P1V2 might be a good one to go with. It is also good to move away from Standard tiers to Premium tiers, as they are available at the same price with better hardware! Please refer to: https://azure.microsoft.com/en-us/updates/announcing-pricing-decrease-for-azure-app-service-on-the-premium-plan/)
    • We also had to upgrade our Redis instances to C2 (if you have moderate load, then C1 might work for you!)
    • Apart from this, we applied the steps mentioned in the KB article: https://kb.sitecore.net/articles/464570#Solution(XP8.0-9.0)
      • Apply Step#1-3
      • But please make sure you read the Notes section first. The values given in the KB article are just for reference, and you need to do some load testing to find the optimal values for your needs.
      • I’ve created a gist with our configurations: https://gist.github.com/klpatil/9c19fe6f0b8c3cfa8e9ed76629e16c79
      • Please pay close attention to the following values in the gist; we updated them for better results:
        • pollingInterval : Specifies the time interval in seconds that the session state provider uses to check if any sessions have expired. The default value is: 2.
        • executionTimeout
        • timeoutBetweenLockAttempts
        • Sitecore.Support.210408.config : The values given in this file especially need close attention.
  • Sitecore 9.3 : With the learnings from our last project (9.0.2), we were not expecting any Redis errors. But life is not as easy as it seems :-). We again got haunted by RedisTimeout exceptions during our performance testing period. Here’s what we learnt.
    • We saw the odd error during the UAT period, but during load testing we noticed that the Redis error count went up. That’s when we started to investigate further, before go-live:
    • Tiers : Luckily, we didn’t face many challenges with tiers. The CD servers were good. We only had to upgrade Redis from C1 to C2 to handle the load.
    • Sitecore patch as per the KB article: As you must have noticed, the article says "Fixed in 9.2". If you scroll further, it suggests some steps: https://kb.sitecore.net/articles/464570#Solution(XP9.2InitialReleaseOrLater), and those steps are also not super clear. Feeling the same? We felt the same, so we reached out to Sitecore support. After a couple of back-and-forths, Sitecore support agreed that those steps need some work for 9.3. Here are our learnings until Sitecore updates the KB article:
      • The good part is that you don’t need to apply any DLL patch for this.
      • You just need some configuration updates, as listed in this gist: https://gist.github.com/klpatil/0b395c5cf46a8ff5f04118ea49f05546
      • The biggest learning was that the maxConcurrencyLevel value should not be changed at all. As per the 9.3 logic, if this value is not provided, 9.3 handles it in conjunction with the batch size; otherwise it uses the value you provided in config. As per our learning, the default logic is optimal.
    • Load testing : The above steps reduced RedisTimeout exceptions drastically. But whenever the client ran a JMeter load test, they still saw RedisTimeout exceptions. After some consultation with Sitecore support and some analysis, we learnt that the load testing script was sending unrealistic traffic and was also not handling cookies properly.
  • Still facing the error? : If you are facing the error even after following the steps given above, it might be a good idea to create a memory dump using procdump and analyze it with the help of Sitecore support.
D:\home\site\wwwroot\App_Data\dumps>D:\devtools\sysinternals\procdump -accepteula -e 1 -f "StackExchange.Redis.RedisTimeoutException" -ma 285441
D:\devtools\sysinternals\procdump -accepteula -e 1 -f "StackExchange.Redis.RedisTimeoutException" -ma <PID>

Both our environments are serving live traffic, and we have stable systems with no RedisTimeout exceptions!

I hope this post helps you fix your Redis timeout exceptions, or avoid them before you see them. Special thanks to the Sitecore support team and my colleagues Joseph and Khushboo for all their help during troubleshooting!
