Lync SBA’s: The Good The Bad, and The Annoying

If you’ve deployed Lync Enterprise Voice, you’ve at least had a discussion around Survivable Branch Appliances. You may have even deployed them. After having deployed (or assisted in deploying) and supported over 25 of them, I’ve learned quite a few lessons about SBA’s that I thought I would share.

First off: sizing. If you look at the official Microsoft documentation, they say that an SBA can support between 25 and 1,000 users. However, the SBA vendors do not offer just one single SBA option. They offer options with 2GB RAM and 4GB RAM. They offer SBA’s with different CPU’s. So is there a disconnect between what Microsoft says on TechNet and what the vendors are providing? Can a 2GB SBA really support 500 users?

That really ends up being the wrong question, or at least it’s not the only question. I have some 2GB SBA’s that cannot support 25 users. The Front End service keeps crashing every day or so. This behavior is not exhibited on the 4GB models.

It turns out that the supported user count isn’t the only metric you need to review when sizing an SBA. You need to understand how large your entire Lync deployment is (or will be). If you will only have a few thousand users then I think a 2GB SBA would work out fine. But if you have tens of thousands or hundreds of thousands of users then don’t even consider a low powered SBA. As an example: we have an 8 person office using a 4GB SBA and they experience none of the issues that our larger offices experience with a 2GB SBA. All of those user accounts, all 10,000 or 50,000 or 200,000, get at least partially replicated to the SQL store on the SBA’s (at least I think they do; I’ve never looked into the DB to see what all is in there). And those large databases eat RAM. And without RAM, the Lync Front End service crashes.

There are other issues to consider before buying a low-powered SBA. How often will you need to monitor or troubleshoot? If Lync barely runs in 2GB of RAM, how well will your logging tools perform? We have crashed the Lync Front End service on a 4GB RAM SBA just by trying to open a log file in Snooper that was too large for the SBA to handle. Lesson learned. So now when dealing with large log files we have to copy them off of the SBA to our local PC’s to open them in OCSLogger/Snooper. All of this adds delay while the users in that location can’t make or receive phone calls.

With a low powered SBA, other things will take longer too. Your patching window will need to be longer just because installing a Lync Cumulative Update or Windows patches will take longer. You surely need to run antivirus. You may also run an SCCM/SCOM agent or two. Antimalware? IDS/IPS agent? Any other management stuff running and chewing up CPU and RAM? If you add up the price difference for the extra RAM and/or the faster CPU, will you make that money back through less downtime and faster issue resolution?

Installing and configuring an SBA is completely different than bringing up any other Lync role. Traditionally, you add a device to Topology and then either run the setup off the Lync CD for a first time install or you run bootstrapper to add the new features. You do this via remote desktop and life is good. If something goes wrong, you can just uninstall Lync and start the install over again. This is pretty much how you install all software you’ve ever installed on a Windows Server.

But with an SBA, things are different before you even turn the thing on. First, you have to add the SBA to your Active Directory and then manually add an SPN value to that computer object. I’m sure someone who’s good at AD can explain why this is needed on an SBA but not needed for any other Lync role.

Next, after publishing Topology, you do not remote desktop to the machine and install Lync off the CD (or .iso image). Instead, you connect to a vendor-written website on the SBA to configure the server. These web-based installers handle all sorts of things such as renaming the server, adding it to a domain, and changing the password of the Administrator account. Of course they do all of this via HTTP by default, so if security is important to you, the first thing you do is waste 10 minutes installing a certificate on IIS on the SBA.

All that these web-based installers do is wrap PowerShell into a web GUI and invariably all of them have issues. For example, I have never successfully completed a certificate request through the Web installer. The other fun thing is that these SBA’s don’t have an uninstall option for Lync. So if things go wrong for whatever reason you can’t just uninstall Lync and start the install over again. You have to re-image the entire thing and set the whole thing back to scratch. Fortunately this doesn’t happen often.

But my core issue is figuring out what the point is of this web-based installer? Why not just ship a copy of the .iso with the SBA and install it just like you do every other Lync role?

In my imagination I see a bunch of Microsoft people sitting in a conference room:

Forward Thinker:  “Hey, how can a Lync administrator install an SBA when they only have limited connectivity to the device? Like, they only have a dial-up modem connection to the site or a firewall policy limits their access?”

Everyone else in the room: “WEB BASED INSTALL!!!!!”.

And so we get stuck with a web based installer but in reality the web based installer solves no issues. It only creates them. If you only have a 56K connection to a site, you probably shouldn’t be installing Lync in that site in the first place, at least not an SBA. Go with a Standard Edition. What if you only have HTTP(S) access to the site? Well, you can then install Lync but you can’t do any logging or troubleshooting with OCS Logger so you better never have an issue. In other words, this has always seemed to me to be a solution in need of a problem.

This is also one of the reasons why I greatly prefer to deploy an SBS over an SBA: I can install off an iso and I’m not limited to under-powered hardware. However, depending on how your organization is structured, you may want to limit the amount of hardware (and “ownership” of that hardware) at a remote location. So an appliance makes sense which is why we continue to push them out.

When shopping for an SBA, there are some key points to ask the vendors you are comparing:

1. How easy is it to upgrade the SBA? Based on the above diatribe, you can’t just uninstall Lync 2010 and install Lync 2013. You have to *completely* re-install the server with a brand new copy of Windows and run through the whole rotten Web-based installer again. Some of the vendors make the upgrade process generally painless by letting you download an image and then flipping a switch on the server to boot to a new partition. These are easy to upgrade remotely. Others require you to download an image and overwrite the existing installation and this has to be done via a USB key or some other transport. These are harder as you may need to do some of the upgrade steps via a serial/terminal connection. (How exactly do I do this over HTTP? I can’t. Another reason the web installer is pointless.)

2. Can your vendor provide some semblance of local support if you have offices scattered all over the globe? Some vendors are a bit more global than others and this could become an issue regarding sourcing equipment and supporting them. It becomes a bigger issue if a part fails on the gateway and a vendor who claims to be global can’t get you parts because those parts are caught up in customs.

3. How good is their support? I’ve dealt with three different SBA vendors. Two of them are great with support, one of them not so much. And to my surprise, things I heard “on the street” about the support at these vendors did not match my reality when I worked with them. So ask the vendor how easy it is to open tickets, how quickly tickets get a response, how easy or difficult it is to set up a voice call for support, etc. I don’t know if there is an easy way to get real information out of a vendor, so talk with peers about their experiences with a vendor. Alternately, if you are working with a Lync support organization, ask them how well they can support the gateway side of the product and about their experience with the support organizations of the gateway vendor. Note that I am not calling anyone out here, so don’t ask me in the comments which one of the three I’ve had the most difficulty with. I won’t say.

One other thing to keep in mind: The vendor is on the hook for supporting both Windows and Lync on the SBA. So if the Front End service crashes, don’t call Microsoft. Call the SBA vendor.

4. Manageability. Your network guys have tools that monitor their routers and switches and firewalls. Can they also monitor this device? Some of the vendors sell their own monitoring software. Check those out and compare them. Can Vendor A’s software also monitor and manage Vendor B’s gateway? Can I write custom scripts to manage or monitor the gateways myself? How easily can I extract reports from your solution and link them with my Lync monitoring reports? Can I push out firmware upgrades? Can I centrally back up my configurations? Do you have a SCOM Management Pack?

5. Completeness of Vision. This isn’t a hard and fast set of questions or requirements of a vendor. But you do want to make sure that the vendor is completely committed to Lync as one of the core facets of their business. You want to make sure that the gateway can handle whatever screwball telecommunications connection you need to use in whatever screwball location. As an example, we had to connect to a screwy SIP trunk provider, and in order to make the connection work the gateway had to manipulate the SIP headers being sent to the SIP provider. I was impressed that this feature was available, but then this completeness of features is one of the reasons we use this vendor. I have full confidence that this vendor will be able to handle anything we ever need to connect to our gateways.

Make sure that your SBA’s can route to your Edge servers. As calls come in to an SBA from the gateway, Lync will go through its whole STUN/TURN/ICE game and that includes seeing if using the Edge is a good option. But if the SBA cannot reach the Edge servers then calls will fail. There are some workarounds to this issue but if you have a properly configured network you won’t need to use them. We have one office that is always messing up their DNS servers. We ended up having to add our Edge servers to the local Hosts file on the SBA so that the SBA could reliably resolve and connect to the Edge servers.
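One way to check both halves of that problem (name resolution and reachability) from the SBA or any machine on the same subnet is a small script. This is only a minimal Python sketch, not a Microsoft or vendor tool. The function name `check_host` is made up for illustration, and the host name and port in the usage note are placeholders; substitute the internal FQDN of your own Edge servers and whichever TCP ports your deployment actually uses.

```python
import socket


def check_host(host, port, timeout=3.0):
    """Return (resolved_ok, connected_ok) for a host name and TCP port."""
    try:
        # Step 1: name resolution. This is what a broken DNS server breaks.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return (False, False)
    # Step 2: try each resolved address until one accepts a TCP connection.
    for family, socktype, proto, _canon, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as s:
                s.settimeout(timeout)
                s.connect(sockaddr)
                return (True, True)
        except OSError:
            continue
    return (True, False)
```

Calling `check_host("edge.example.com", 443)` (placeholder values) returns a pair of booleans. A result of `(False, False)` points at DNS, which is the case the hosts file entry works around, while `(True, False)` points at routing or a firewall. It only tests TCP, so it says nothing about UDP media paths.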

Don’t put in an SBA thinking it will solve all of your congested WAN problems. Sure, if you can keep calls off the WAN that will address a portion of your WAN congestion. But if your WAN fills up the SBA could start dropping calls (inability to reach Edge) and/or putting your SBA-homed users into limited functionality mode (inability to reach parent pool).

And no matter what, make sure you have QoS working across your WAN. Someone could be copying a large file across the WAN link and during that time Lync can’t deliver calls and/or your users go into limited functionality mode. QoS helps avert this.

Since I’m talking about congested WAN’s I may as well bring this up: configure the client policy for all of your remote users to use web based address book lookups in the Lync client instead of downloading the address book. Even if the bandwidth is negligible between the two, consider this problem:

We were migrating remote users from Lync 2010 to Lync 2013. One week later we got reports from the network group that Lync was crushing the WAN connections to one of our remote offices. After some work we figured out that everyone in the office was downloading the Lync address book at about the same time and there wasn’t enough WAN bandwidth to support this. We effectively knocked that office off the WAN due to address book downloads. We changed the client policy to Address Book Web Query and told everyone in the office to sign out/in on their Lync client. Within an hour or so the traffic calmed down. We changed our global policy to Address Book Web Query only.

Conferencing. Installing an SBA does not change the way Lync dial-in conferencing works. An SBA/SBS cannot be a conferencing server. So if you publish a dial-in conferencing number that is hosted by the SBA, keep in mind that all traffic on that conference is still going across the WAN to your Front End servers. You may actually be increasing your WAN bandwidth usage with people now calling the number at the remote office to join meetings. Also, know how many available lines or SIP trunks you have connecting your gateway to the phone system. If you only have 10 SIP channels you can only have 10 callers dialing in to that dial-in conferencing number. The 11th caller gets a busy signal. This could also prevent customers from calling you because all 10 channels are being used for the conference.
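The channel arithmetic is simple enough to sketch. The two Python helpers below are hypothetical and only model the counting described above; they assume every PSTN call through the gateway, conference or otherwise, occupies exactly one channel.

```python
def channels_left(total_channels, conference_callers, other_calls=0):
    """Return how many PSTN channels remain free on the gateway."""
    in_use = conference_callers + other_calls
    return max(total_channels - in_use, 0)


def blocked_callers(total_channels, attempted_calls):
    """Return how many attempted calls would get a busy signal."""
    return max(attempted_calls - total_channels, 0)
```

With the numbers from the example, `blocked_callers(10, 11)` gives 1 (the 11th caller), and `channels_left(10, 10)` gives 0, which is why a customer calling in during a full conference cannot get through.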

Don’t blindly add a dial-in conferencing number to an SBA. Be sure that the local users know how the voice is routed and what the maximum number of invitees should be. Also make sure QoS is enabled on the WAN so people dialing in do not have a bad meeting experience.

Give your gateways a fallback route. We didn’t do this initially but we have gone back and fixed it. When we initially configured our gateways, we only configured a connection from the gateway to our SBA. So what happens if the SBA crashes or is getting upgraded or patched? All calling fails as the gateway can’t reach the Mediation service on the SBA. Instead, set up a Mediation server in your parent pool to be a fallback route (both inbound and outbound) in case the SBA is unavailable. While calls will now be travelling over your WAN during an outage, calls can still be made and received.
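The fallback idea boils down to an ordered list of targets with a health check. Real gateways configure this in their own routing tables, so the Python below is just a sketch of the logic, not any vendor’s configuration syntax. The target names `sba-mediation` and `pool-mediation` and the `is_up` callback are assumptions made up for illustration.

```python
def pick_route(routes, is_up):
    """Return the first reachable target in priority order, or None.

    routes: list of target names, highest priority first.
    is_up:  callable that takes a target name and returns True if reachable.
    """
    for target in routes:
        if is_up(target):
            return target
    return None


# Hypothetical names: the SBA's Mediation service first, then a
# Mediation server in the parent pool reached over the WAN.
ROUTES = ["sba-mediation", "pool-mediation"]
```

With only the SBA in the list, an SBA outage leaves `pick_route` returning `None`, which is the "all calling fails" case. Adding the parent pool entry means calls keep flowing, at the cost of crossing the WAN.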

And be sure you have QoS configured on your WAN so that these calls don’t sound terrible.

I used to think that SBA’s were neat little devices. Now I kind of hate them. Not because they perform poorly. A properly sized SBA can handle 800 or more users in the largest of environments and once deployed we kind of forget they even exist. But upgrading them, configuring them, troubleshooting them, and dealing with their quirks is just a giant pain. I would love it if Microsoft nuked the entire install process in Lync vNext and just made it the exact same process used to install every other piece of Lync. I’m a big fan of the SBS precisely because every complaint I have about the SBA’s doesn’t exist with an SBS. You install it the same way you install everything else. You aren’t limited by overpriced and under-powered hardware. Microsoft handles the support. If they could take this flexibility and put it into the SBA model then life would be just that little bit better.



    • soder on 2014/11/05 at 04:17

    I agree with all you said, simply as it is. That tells a lot about the real value of SBA class of equipment on the market.

    Vendors delivered us their crappy, buggy, incomplete Web-based setups, and took away the flexibility to deal with powershell, and most importantly: MANAGING THOSE DAMN CERTIFICATES!

    You wont name vendors, I do. Its still not possible to export the damn SBA Lync certificate from Sonus mgmt website (or at least was not possible with the current release 3-4 months ago), if I want to re-image that crap, need to request new certificate for Lync SBA every time. They only implemented mgmt of the SBC certificate (the PSTN gateway part, if using SIP over TLS), but still not the SBA part. At least they issue new SW release every 2-3 months, and seems working hard to fix bugs.

    Audiocodes (the other vendor I worked with in the past for 4-5 years) were also very quick to issue new sw releases if I reported any issue to them (I had direct contact of one of their support guys in Israel). Of course I worked for a business partner of them, so definitely not an end-user treatment what I received from them. But they also made capital large mistakes on a daily basis in their SW. Like development happened without any planning. I simply dont understand. Their HW seems solid, but the SW running on top of it, the same crap as any others in the industry.

    Just by looking at their changelogs (Audiocodes, Sonus), even though I have been working in this industry for 8+ years now, I am still constantly amazed how this crappily written software can run critical infrastructure like telephony all over the world.. amazing..!

    • Andy Mac on 2014/11/05 at 08:27

    Well worded ………. I agree with most of what you say.

    • Aimperial on 2016/11/23 at 13:51

    Did you find a way to avoid the time ringing on PSTN calls from a site with SBA when connection outage occurs and Edge server is not available?
    All this on the STUN/TURN/ICE ritual.

    1. There is no way to remove this timeout. It’s “just the way it works”.
