Lync Regions and assigning Dial-In Conference Numbers

I consider myself really sharp at Lync. However, there are still times when I have “duh” moments about things I feel like I should have known for years. Since Lync is a pretty complex and varied system, there are just some things I never sat down and properly figured out. Dial-in conferencing Regions are one of them. I’ve probably set up dozens of dial-in conferencing numbers over the past few years but there were still a few things I just missed. This article will do a fairly deep dive into Regions and dial-in conferencing numbers.

So first off: What is a Region? A Region is anything you want it to be. It’s just some text. A Region is independent of an Active Directory site or a subnet or anything similar. A Region could be “Earth”. A Region could be “Germany”. A Region could be “My bedroom in the basement of mom’s house”. In practice, a Region is used to define where a dial-in conferencing phone number resides. So if I have a dial-in number in Indianapolis and one in Stockholm, the Regions could be “Indianapolis” and “Stockholm” or they could be “United States” and “Sweden” or they could be “North America” and “Europe”. The Regions are just a way to let people know where the dial-in conferencing phone number is located.

None of this should be too mind blowing.

So where do you configure these region names? If you answer: “In the same place where you configured the dial-in conferencing number” then you would be wrong. You actually configure them in Dial Plans. There is a reason for this that I will get to later.

In order to set a region, open any dial plan and in the “Dial-in Conferencing Region” box, type in anything you want. In the example below, I entered United States. Also note that this is being set on the “Global” dial plan.

If you want to see your existing dial-in conferencing Regions, you can run this PowerShell command:

Get-CsDialPlan | Select-Object DialInConferencingRegion

If I run that command I see the following:

Region_List
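As a slightly expanded sketch (not part of the original walkthrough), adding the Identity column shows which dial plan each Region comes from, and a second pipeline collapses the values into a unique list. It assumes only Get-CsDialPlan and standard PowerShell cmdlets in the Lync Management Shell.

```powershell
# Show each dial plan next to the region it defines
Get-CsDialPlan | Select-Object Identity, DialInConferencingRegion

# List each distinct region once, skipping dial plans with no region set
Get-CsDialPlan |
    Where-Object { $_.DialInConferencingRegion } |
    Select-Object -ExpandProperty DialInConferencingRegion |
    Sort-Object -Unique
```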

OK this is great. I have a region. Now how do I assign it to a Dial-In Conference number?

Assuming you don’t have one yet, let me show you quickly how to create a dial-in conferencing number. In Lync Control Panel, navigate to the “Conferencing” section and then the “Dial-in Access Number” tab. Click “New”.

Add_DIC

Look at the above image. The “Display Number” can be anything you want. It can be in any format you want. In this case I formatted it the standard way phone numbers in the North American Numbering Plan tend to be formatted. The Display Name field is really a comment for you, the administrator, so you know why this number was added. Line URI must be the exact, normalized phone number that Lync will use to answer the call. This number needs to be exactly right. The SIP URI can be anything you want. I often make it match the Line URI but you can use anything you want. The Pool is used to tell Lync which of your Lync pools will receive the inbound call. Finally, you can select a Primary language and up to four secondary languages for this dial-in access number.

And where do we define the Region? All the way down at the bottom. I have to scroll down so you can see it.

DIC_Associated_Region

After clicking the Add button you are given a list in which to select your region. In this case, we only have one available region.

DIC_Select_Region

Once you select that region, you are returned to the main page for adding a Dial-In Access number. Click “Commit” and you now have a Dial-In Access number.

For those excited by PowerShell, you can perform all of the above using the following PowerShell commands:

Set-CsDialPlan -Identity "Global" -DialInConferencingRegion "United States"
New-CsDialInConferencingAccessNumber -PrimaryUri "sip:+13175551212@flinchbot.com" -DisplayNumber "(317) 555-1212" -DisplayName "Indianapolis" -LineUri "tel:+13175551212" -Pool lyncpool.flinchbot.com -PrimaryLanguage "en-US" -Regions "United States"
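To double-check the result from the shell, something along these lines should work. It is a minimal sketch that assumes the Get-CsDialInConferencingAccessNumber cmdlet and the “United States” Region name used above. Nothing here changes any configuration.

```powershell
# List every dial-in access number with the region(s) attached to it
Get-CsDialInConferencingAccessNumber |
    Select-Object DisplayNumber, DisplayName, LineUri, Regions

# Show only the numbers tied to the "United States" region
Get-CsDialInConferencingAccessNumber |
    Where-Object { $_.Regions -contains "United States" } |
    Select-Object DisplayNumber, LineUri
```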

So how can we see what this now looks like? Point a web browser to your dial-in simple URL. In my case, that is https://dialin.flinchbot.com and I see the following:

DialIn1

 

If you don’t know your dial-in Simple URL, you can run the following PowerShell command:

Get-CsSimpleUrlConfiguration
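That command returns the whole simple URL collection. If you only want the dial-in entry, a sketch like the following narrows it down. It assumes each simple URL entry exposes the Component, Domain, and ActiveUrl properties, and that the dial-in component is named “Dialin”.

```powershell
# Expand the SimpleUrl collection and keep only the dial-in entries
Get-CsSimpleUrlConfiguration |
    Select-Object -ExpandProperty SimpleUrl |
    Where-Object { $_.Component -eq "Dialin" } |
    Select-Object Domain, ActiveUrl
```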

Now more has happened here beyond just having created a new dial in number for the United States.

[Here is the part I never really knew until this week…]

What I have also done is assigned a default dial-in number for all of my users. This may seem obvious but it really isn’t (at least to my thick Hoosier skull (or I guess my now Nashvillian skull)). The reason that all of my users now have a default dial in number is because I assigned the Region to the Global dial plan. I currently only have one dial plan so everyone gets this dial plan by default. Since the dial plan controls the region, all of my users are now considered to be in the United States region.

So what happens if I create a new dial plan? Does the Global Region still define the Region for users assigned to a user-level (or site-level) dial plan? Let me create a user-level dial plan without a Region and see what happens.

Testing this gets a little tricky because we can’t just go to the dial-in simple URL because that just shows a list of all the defined numbers, not the default number for a given user. Since I created a user-level dial plan, I had to assign that dial plan to a specific user. In order for them to see what their dial-in Region is, they need to create an Outlook meeting and then use the Lync Meeting calendar add-in (alternately, a “Meet Now” meeting directly from the Lync client).

Doing all of those steps, I see that the Global dial-in Region applies if the user-level dial plan has no Region defined.

DIC_1
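For anyone who wants to repeat this test from PowerShell, here is a rough sketch. The dial plan name “RegionTest” and the account testuser@flinchbot.com are hypothetical placeholders, so substitute your own.

```powershell
# Hypothetical names: the "RegionTest" dial plan and testuser@flinchbot.com
# Create a user-level dial plan with no region defined
New-CsDialPlan -Identity "RegionTest" -Description "User-level plan with no region"

# Assign it to a single test user
Grant-CsDialPlan -Identity "sip:testuser@flinchbot.com" -PolicyName "RegionTest"

# Confirm the assignment, and that the plan has no region of its own
Get-CsUser -Identity "sip:testuser@flinchbot.com" | Select-Object DisplayName, DialPlan
Get-CsDialPlan -Identity "RegionTest" | Select-Object Identity, DialInConferencingRegion
```

The meeting invite still has to be checked in Outlook or the Lync client. These commands only set up the test.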

So it’s probably a good idea to set a default Region in your Global dial plan. Unless you don’t want everyone to have a default location. Then leave it blank.

Now let’s go to the next step. I want to add a second dial-in number for our office in Nashville. I will call this region “United States – Southeast”. I create a User-Level dial plan and simply type “United States – Southeast” into the “Dial-in conferencing region” field. I then skip over and create a new dial-in access number.
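In PowerShell those two steps might look like the sketch below. The dial plan name “Nashville” and the 615 phone number are made up for illustration. Because a Region is just text, the string has to be typed identically in both commands, so the sketch uses a plain hyphen in the Region name.

```powershell
# Hypothetical dial plan name and phone number, for illustration only
New-CsDialPlan -Identity "Nashville" -DialInConferencingRegion "United States - Southeast"

# The access number picks up the region by name via -Regions
New-CsDialInConferencingAccessNumber -PrimaryUri "sip:+16155551212@flinchbot.com" -DisplayNumber "(615) 555-1212" -DisplayName "Nashville" -LineUri "tel:+16155551212" -Pool lyncpool.flinchbot.com -PrimaryLanguage "en-US" -Regions "United States - Southeast"
```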

After waiting about 5 minutes I reloaded the dialin.flinchbot.com page and I see that I now have a new entry in my list.

DialIn2

So that’s pretty cool. But here’s the part that I just figured out this week:

How do we set the default dial-in number for a user when they create a new meeting invite via Lync? It is obvious now but for years I never really knew. I always told people to manually edit the settings in their Outlook Lync Meeting options tool.

 

Meeting Options 1

Meeting Options 2

Now setting that works fine for individual users. I was asked how this can be changed for an entire office and I…uh…well…uh…stammered an “I don’t know”.

Well now I know. And it’s really obvious to me. Now.

All you have to do is set the user’s dial plan to one that contains the correct Region. Duh! So if I have a user who needs to use the “United States – Southeast” dial-in number as their default, I assign them to the user dial plan I created. If the user needs the generic “United States” as their default Region, I leave their dial plan setting unassigned (i.e., the Global dial plan).

Here is what a meeting invite looks like after I changed my test user from the Global dial plan to the user-level dial plan:

dic 2

Notice that it now sets the default dial-in number to “United States – Southeast” instead of the Global “United States”.
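This is also the answer to the “entire office” question. As a sketch, assuming a user-level dial plan named “Nashville” (a hypothetical name) and that your users’ city is populated in the Active Directory “l” attribute:

```powershell
# One user at a time (testuser@flinchbot.com is a placeholder account)
Grant-CsDialPlan -Identity "sip:testuser@flinchbot.com" -PolicyName "Nashville"

# Everyone whose AD city attribute (l) is Nashville
Get-CsUser -LdapFilter "l=Nashville" | Grant-CsDialPlan -PolicyName "Nashville"

# Put a user back on the Global dial plan by clearing the assignment
Grant-CsDialPlan -Identity "sip:testuser@flinchbot.com" -PolicyName $null
```

Any filter that selects the right users works here; the LDAP filter is just one option.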

To expand on this, if you have an office (or, more likely, a region) with 4 dial plans, you have to make sure that the Region on all 4 dial plans says the same thing. It’s a bit redundant to have to manually type in the same region value into each dial plan.

So use PowerShell!

Set-CsDialPlan -Identity "Atlanta" -DialInConferencingRegion "United States - Southeast"
Set-CsDialPlan -Identity "Tampa" -DialInConferencingRegion "United States - Southeast"
Set-CsDialPlan -Identity "Orlando" -DialInConferencingRegion "United States - Southeast"
Set-CsDialPlan -Identity "Raleigh" -DialInConferencingRegion "United States - Southeast"
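If the list of dial plans grows, the same thing can be written as a loop so that the Region string is typed only once. This is just a compact sketch of the four commands above:

```powershell
# Type the region once so every dial plan gets exactly the same string
$region = "United States - Southeast"

"Atlanta", "Tampa", "Orlando", "Raleigh" | ForEach-Object {
    Set-CsDialPlan -Identity $_ -DialInConferencingRegion $region
}
```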

Here is an advanced scenario you might run into. Let’s say that somehow you get a batch of dial-in conferencing numbers that you need to add to Lync, for example via SIP trunks that bring in a bunch of phone numbers from around the globe to a central location. How do you add those Regions?

The same as you would in any other scenario. You have to create a dial plan and type in the region name. In other words, you create “dummy” dial plans (if necessary), set the region in those dummy dial plans, and then use that region when defining the dial-in access number.

Here is some PowerShell showing how to create 3 “dummy” dial plans and three dial-in access numbers using the regions created in those “dummy” dial plans:

#Argentina
New-CsDialPlan -Identity "DIC-Argentina" -DialInConferencingRegion "Argentina"
New-CsDialInConferencingAccessNumber -PrimaryUri "sip:DIC_Argentina@flinchbot.com" -DisplayNumber "+54123456789" -DisplayName "Argentina" -LineUri "tel:+54123456789" -Pool lyncpool.flinchbot.com -PrimaryLanguage "es-MX" -SecondaryLanguage "en-US" -Regions "Argentina"

#Austria
New-CsDialPlan -Identity "DIC-Austria" -DialInConferencingRegion "Austria"
New-CsDialInConferencingAccessNumber -PrimaryUri "sip:DIC_Austria@flinchbot.com" -DisplayNumber "+43123456789" -DisplayName "Austria" -LineUri "tel:+43123456789" -Pool lyncpool.flinchbot.com -PrimaryLanguage "de-DE" -SecondaryLanguage "en-GB" -Regions "Austria"

#Bahrain
New-CsDialPlan -Identity "DIC-Bahrain" -DialInConferencingRegion "Bahrain"
New-CsDialInConferencingAccessNumber -PrimaryUri "sip:DIC_Bahrain@flinchbot.com" -DisplayNumber "+973123456789" -DisplayName "Bahrain" -LineUri "tel:+973123456789" -Pool lyncpool.flinchbot.com -PrimaryLanguage "en-GB" -Regions "Bahrain"

Note that with New-CsDialPlan I don’t do anything other than provide a name (identity) and set a Region. That’s all. There is no need for anything else because these dial plans will never actually be used as dial plans (i.e., for phone number normalization). They are being created solely to define a Region.
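With a couple dozen numbers, a data-driven version saves some typing. The sketch below reuses the three example countries and placeholder numbers from above. The $numbers table and its key names are my own illustration, and secondary languages are left out for brevity. It also assumes Region names without spaces, since the name is reused in the SIP URI.

```powershell
# Illustrative table: one row per dial-in number (placeholder numbers)
$numbers = @(
    @{ Region = "Argentina"; Number = "+54123456789";  Language = "es-MX" },
    @{ Region = "Austria";   Number = "+43123456789";  Language = "de-DE" },
    @{ Region = "Bahrain";   Number = "+973123456789"; Language = "en-GB" }
)

foreach ($n in $numbers) {
    # "Dummy" dial plan whose only job is to define the region
    New-CsDialPlan -Identity "DIC-$($n.Region)" -DialInConferencingRegion $n.Region

    # Splat the parameters to keep the long command readable
    $params = @{
        PrimaryUri      = "sip:DIC_$($n.Region)@flinchbot.com"
        DisplayNumber   = $n.Number
        DisplayName     = $n.Region
        LineUri         = "tel:$($n.Number)"
        Pool            = "lyncpool.flinchbot.com"
        PrimaryLanguage = $n.Language
        Regions         = $n.Region
    }
    New-CsDialInConferencingAccessNumber @params
}
```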

Now look at the dial in web page:

DialIn3

That looks like a real dial-in page now! Note that my test user, when creating a Lync meeting via Outlook, will still use the “United States – Southeast” number as his default number as he is assigned to that dial plan. If I wanted him to use the Bahrain number as his default dial-in number I would have to move him to that dial plan (and probably add some normalizations too).


I think that about wraps it up. This is stuff I feel like I should have known years ago but for some reason it didn’t click until this past week. That’s probably due to me having to add about 25 SIP-delivered dial-in numbers to our central pools and me actually having to think it all the way through. I now feel bad for those people to whom I gave wrong, or at least mediocre, information about setting the default dial-in numbers.

For me, the big takeaway is that *every* dial plan should have a Region set.
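A quick way to audit that, sketched with nothing more than Get-CsDialPlan and a filter, is to list the dial plans whose Region is still empty:

```powershell
# Dial plans that have no dial-in conferencing region set
Get-CsDialPlan |
    Where-Object { [string]::IsNullOrEmpty($_.DialInConferencingRegion) } |
    Select-Object Identity, Description
```

An empty result means every dial plan has a Region.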

For more (and probably better) explanations, check here and here.

HP Stream 8 review (+ Sway!)

I bought myself an HP Stream 8 a few weeks ago.

A few weeks ago, Microsoft released their new “Sway” product.

So I decided to put these 2 new things together and write a detailed review of the HP Stream 8 using Microsoft Sway. Follow this link to see it:

https://sway.com/lH68Iw8cj5gXTyPw

Validate your Federated Domains

We have hundreds of federated partners defined in our Lync environment. Having this many invariably means that our federation with a partner “breaks” because the partner changes their Access Edge configuration. They could be using closed federation and changed their Access Edge DNS. They could have been configured for Open Federation and switched to closed. They could have historically unreliable Open Federation, so we put in an explicit Access Edge setting that then changes.

It’s also tedious to use NSLookup to manually check each partner’s SRV settings.

So this script addresses these issues.

By default, it will pull in all of your federated partners via the Get-CsAllowedDomain cmdlet. It then cycles through all of these and checks to see what the actual SRV record for _sipfederationtls._tcp.{domain} is set to. It then compares what you have in Lync with what the DNS lookup returns and spits out a .csv file with all of its results. It’s then up to you to do something with this report, such as finding the discrepancies and updating your federation.

This script also supports a one-off check saving you the work of having to do the SRV lookup the manual way. Just run it as Validate-ProxyFQDN -Domain {domain} and it will compare your Lync configuration with what it finds via DNS.
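To give a feel for the comparison involved, here is a minimal sketch of a single-domain check. It is not the downloadable script. It assumes the Resolve-DnsName cmdlet is available (Windows 8 / Server 2012 or later), and contoso.com is a placeholder partner domain.

```powershell
# contoso.com is a placeholder partner domain
$domain = "contoso.com"

# What DNS publishes for the partner's federation SRV record
$srv = Resolve-DnsName -Name "_sipfederationtls._tcp.$domain" -Type SRV -ErrorAction SilentlyContinue |
    Where-Object { $_.Type -eq "SRV" } |
    Sort-Object Priority |
    Select-Object -First 1

# What Lync has configured for the same partner
$allowed = Get-CsAllowedDomain -Identity $domain

# Side-by-side result; Match is False when the two values differ
[pscustomobject]@{
    Domain    = $domain
    LyncProxy = $allowed.ProxyFqdn
    DnsTarget = $srv.NameTarget
    Match     = ($allowed.ProxyFqdn -eq $srv.NameTarget)
}
```

A partner defined without an explicit proxy FQDN will show an empty LyncProxy and Match = False, so read that case as “no Access Edge configured in Lync” rather than as a mismatch.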

I have done some decent testing of this script but please point out any errors or improvements you’d like to see as this all came together pretty quickly.

Click here to download the script.

Lync SBA’s: The Good, The Bad, and The Annoying

If you’ve deployed Lync Enterprise Voice, you’ve at least had a discussion around Survivable Branch Appliances. You may have even deployed them. After having deployed (or assisted in deploying) and supported over 25 of them, I’ve learned quite a few lessons about SBA’s that I thought I would share.

First off: sizing. If you look at the official Microsoft documentation, they say that an SBA can support between 25 and 1,000 users. However, the SBA vendors do not offer just one single SBA option. They offer options with 2GB RAM and 4GB RAM. They offer SBA’s with different CPU’s. So is there a disconnect between what Microsoft says on TechNet and what the vendors are providing? Can a 2GB SBA really support 500 users?

That really ends up being the wrong question, or at least it’s not the only question. I have some 2GB SBA’s that cannot support 25 users. The Front End service keeps crashing every day or so. This behavior is not exhibited on the 4GB models.

It ends up that the supported user count isn’t the only metric you need to review when sizing an SBA. You need to understand how large your entire Lync deployment is (or will be). If you will only have a few thousand users then I think a 2GB SBA would work out fine. But if you have tens of thousands or hundreds of thousands of users then don’t even consider a low-powered SBA. As an example: We have an 8 person office using a 4GB SBA and they experience none of the issues that our larger offices experience with a 2GB SBA. All of those user accounts, all 10,000 or 50,000 or 200,000, get at least partially replicated to the SQL store on the SBA’s (at least I think they do – I’ve never looked into the DB to see what all is in there). And those large databases eat RAM. And without RAM… the Lync Front End service crashes.

There are other issues to consider before buying a low-powered SBA. How often will you need to monitor or troubleshoot? If Lync barely runs in 2GB of RAM, how well will your logging tools perform? We have crashed the Lync Front End Services on a 4GB RAM SBA just by trying to open a log file in Snooper that was too large for the SBA to handle. Lesson learned. So now when dealing with large log files we have to copy them off of the SBA to our local PC’s to open them in OCSlogger/Snooper. All of this adds delay while the users in that location can’t make or receive phone calls.

With a low powered SBA, other things will take longer too. Your patching window will need to be longer just because installing a Lync Cumulative Update or Windows patches will take longer. You surely need to run antivirus. You may also run a SCCM/SCOM agent or two. Antimalware? IDS/IPS agent? Any other management stuff running and chewing up CPU and RAM? If you add up the price difference for the extra RAM and/or the faster CPU, will you save that money in less downtime and quicker time resolving issues?

Installing and configuring an SBA is completely different than bringing up any other Lync role. Traditionally, you add a device to Topology and then either run the setup off the Lync CD for a first time install or you run bootstrapper to add the new features. You do this via remote desktop and life is good. If something goes wrong, you can just uninstall Lync and start the install over again. This is pretty much how you install all software you’ve ever installed on a Windows Server.

But with an SBA, things are different before you even turn the thing on. First, you have to add the SBA to Active Directory and then manually add an SPN value to that computer object. I’m sure someone who’s good at AD can explain why this is needed on an SBA but not needed for any other Lync role.

Next, after publishing Topology, you do not remote desktop to the machine and install Lync off the CD (or .iso image). Instead, you connect to a vendor-written website on the SBA to configure the server. These web-based installers handle all sorts of things such as renaming the server, adding it to a domain, and changing the password of the Administrator account. Of course it does all of this via HTTP by default so if security is important to you the first thing you do is waste 10 minutes to install a certificate on IIS on the SBA.

All that these web-based installers do is wrap PowerShell into a web GUI and invariably all of them have issues. For example, I have never successfully completed a certificate request through the Web installer. The other fun thing is that these SBA’s don’t have an uninstall option for Lync. So if things go wrong for whatever reason you can’t just uninstall Lync and start the install over again. You have to re-image the entire thing and set the whole thing back to scratch. Fortunately this doesn’t happen often.

But my core issue is figuring out what the point is of this web-based installer? Why not just ship a copy of the .iso with the SBA and install it just like you do every other Lync role?

In my imagination I see a bunch of Microsoft people sitting in a conference room:

Forward Thinker:  “Hey, how can a Lync administrator install an SBA when they only have limited connectivity to the device? Like, they only have a dial-up modem connection to the site or a firewall policy limits their access?”

Everyone else in the room: “WEB BASED INSTALL!!!!!”.

And so we get stuck with a web based installer but in reality the web based installer solves no issues. It only creates them. If you only have a 56K connection to a site, you probably shouldn’t be installing Lync in that site in the first place, at least not an SBA. Go with a Standard Edition. What if you only have HTTP(S) access to the site? Well, you can then install Lync but you can’t do any logging or troubleshooting with OCS Logger so you better never have an issue. In other words, this has always seemed to me to be a solution in need of a problem.

This is also one of the reasons why I greatly prefer to deploy an SBS over an SBA: I can install off an iso and I’m not limited to under-powered hardware. However, depending on how your organization is structured, you may want to limit the amount of hardware (and “ownership” of that hardware) at a remote location. So an appliance makes sense which is why we continue to push them out.


When shopping for an SBA, there are some key points to ask the vendors you are comparing:

1. How easy is it to upgrade the SBA? Based on the above diatribe, you can’t just uninstall Lync 2010 and install Lync 2013. You have to *completely* re-install the server with a brand new copy of Windows and run through the whole rotten Web-based installer again. Some of the vendors make the upgrade process generally painless by letting you download an image and then flipping a switch on the server to boot to a new partition. These are easy to upgrade remotely. Others require you to download an image and overwrite the existing installation and this has to be done via a USB key or some other transport. These are harder as you may need to do some of the upgrade steps via a serial/terminal connection. (How exactly do I do this over HTTP? I can’t. Another reason the web installer is pointless.)

2. Can your vendor provide some semblance of local support if you have offices scattered all over the globe? Some vendors are a bit more global than others and this could become an issue regarding sourcing equipment and supporting them. It becomes a bigger issue if a part fails on the gateway and a vendor who claims to be global can’t get you parts because those parts are caught up in customs.

3. How good is their support? I’ve dealt with three different SBA vendors. Two of them are great with support, one of them not so much. And to my surprise, things I heard “on the street” about the support at these vendors did not match my reality when I worked with them. So ask the vendor how easy it is to open tickets, how quickly tickets get a response, how easy or difficult it is to set up a voice call for support, etc. I don’t know if there is an easy way to get real information out of a vendor so talk with peers about their experiences with a vendor. Alternately, if you are working with a Lync support organization, ask them how well they can support the gateway side of the product and about their experience with the support organizations of the gateway vendor. Note that I am not calling anyone out here so don’t ask me in the comments which one of the three I’ve had the most difficulty with. I won’t say.

One other thing to keep in mind: The vendor is on the hook for supporting both Windows and Lync on the SBA. So if the Front End service crashes, don’t call Microsoft. Call the SBA vendor.

4. Manageability. Your network guys have tools that monitor their routers and switches and firewalls. Can they also monitor this device? Some of the vendors sell their own monitoring software. Check those out and compare them. Can Vendor A’s software also monitor and manage Vendor B’s gateway? Can I write custom scripts to manage or monitor the gateways myself? How easily can I extract reports from your solution and link them with my Lync monitoring reports? Can I push out firmware upgrades? Can I centrally back up my configurations? Do you have a SCOM Management Pack?

5. Completeness of Vision. This isn’t a hard and fast set of questions or requirements of a vendor. But you do want to make sure that the vendor is completely committed to Lync as one of the core facets of their business. You want to make sure that no matter what screwball telecommunications connection you need to use in whatever screwball location that the gateway will be able to handle the connection. As an example, we had to connect to a screwy SIP trunk provider and in order to make the connection work the gateway had to manipulate the HTTP headers being sent to the SIP provider. I was impressed that this feature was available but then this completeness of features is one of the reasons we use this vendor. I have full confidence that anything we ever need to connect to our gateways will be able to be handled by this vendor.


Make sure that your SBA’s can route to your Edge servers. As calls come in to an SBA from the gateway, Lync will go through its whole STUN/TURN/ICE game and that includes seeing if using the Edge is a good option. But if the SBA cannot reach the Edge servers then calls will fail. There are some workarounds to this issue but if you have a properly configured network you won’t need to use them. We have one office that is always messing up their DNS servers. We ended up having to add our Edge servers to the local Hosts file on the SBA so that the SBA could reliably resolve and connect to the Edge servers.
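If you want a quick way to see whether an SBA can actually resolve the Edge servers, something like the sketch below may help. The function name is made up for this example and the host names in the usage line are placeholders. It only checks name resolution through the operating system resolver (so Hosts file entries count too); it says nothing about whether the ports are open.

```powershell
# Hypothetical helper: report whether each name resolves from this machine.
function Test-NameResolution {
    param(
        [Parameter(Mandatory = $true)]
        [string[]]$ComputerName
    )
    foreach ($name in $ComputerName) {
        try {
            # Goes through the OS resolver, so Hosts file entries are honoured
            $addresses = [System.Net.Dns]::GetHostAddresses($name)
            $resolved  = $addresses.Count -gt 0
        }
        catch {
            # Any resolution failure is reported as "not resolved"
            $addresses = @()
            $resolved  = $false
        }
        New-Object PSObject -Property @{
            Name      = $name
            Resolved  = $resolved
            Addresses = ($addresses | ForEach-Object { $_.ToString() }) -join ', '
        }
    }
}

# Example with placeholder names; substitute your own Edge servers.
# Test-NameResolution -ComputerName 'edge01.flinchbot.com', 'edge02.flinchbot.com'
```

Run it on the SBA itself, since the whole point is to see what that box can resolve, not what your workstation can.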

Don’t put in an SBA thinking it will solve all of your congested WAN problems. Sure, if you can keep calls off the WAN that will address a portion of your WAN congestion. But if your WAN fills up the SBA could start dropping calls (inability to reach Edge) and/or putting your SBA-homed users into limited functionality mode (inability to reach parent pool).

And no matter what, make sure you have QoS working across your WAN. Someone could be copying a large file across the WAN link and during that time Lync can’t deliver calls and/or your users go into limited functionality mode. QoS helps avert this.

Since I’m talking about congested WAN’s I may as well bring this up: configure the client policy for all of your remote users to use web based address book lookups in the Lync client instead of downloading the address book. Even if the bandwidth is negligible between the two, consider this problem:

We were migrating remote users from Lync 2010 to Lync 2013. One week later we got reports from the network group that Lync was crushing the WAN connections to one of our remote offices. After some work we figured out that everyone in the office was downloading the Lync address book at about the same time and there wasn’t enough WAN bandwidth to support this. We effectively knocked that office off the WAN due to address book downloads. We changed the client policy to Address Book Web Query and told everyone in the office to sign out/in on their Lync client. Within an hour or so the traffic calmed down. We then changed our global policy to Address Book Web Query only.
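For reference, here is a rough sketch of how that policy change might look in PowerShell. Set-CsClientPolicy with -AddressBookAvailability WebSearchOnly is the standard way to force web lookups; the helper function is something I made up for this example, and the Lync calls are left as comments so the helper can be tried on any machine.

```powershell
# Hypothetical helper: return the client policies that still allow the
# address book file download. Works on any objects that expose Identity and
# AddressBookAvailability, such as the output of Get-CsClientPolicy.
function Get-AddressBookDownloadPolicy {
    param([object[]]$Policy)
    # A policy with no value set also falls through here, since it is not WebSearchOnly
    $Policy | Where-Object { $_.AddressBookAvailability -ne 'WebSearchOnly' }
}

# On a machine with the Lync management shell:
# Get-AddressBookDownloadPolicy -Policy (Get-CsClientPolicy) | Select-Object Identity
# Set-CsClientPolicy -Identity Global -AddressBookAvailability WebSearchOnly
```

Clients pick up the new policy at their next sign-in, which is why we had everyone sign out and back in.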


Conferencing. Installing an SBA does not change the way Lync dial-in conferencing works. An SBA/SBS cannot be a conferencing server. So if you publish a dial-in conferencing number that is hosted by the SBA, keep in mind that all traffic on that conference is still going across the WAN to your Front End servers. You may actually be increasing your WAN bandwidth usage with people now calling the number at the remote office to join meetings. Also, know how many available lines or SIP trunks you have connecting your gateway to the phone system. If you only have 10 SIP channels you can only have 10 callers dialing in to that dial-in conferencing number. The 11th caller gets a busy signal. This could also prevent customers from calling you because all 10 channels are being used for the conference.

Don’t blindly add a dial-in conferencing number to an SBA. Be sure that the local users know how the voice is routed and what the maximum number of invitees should be. Also make sure QoS is enabled on the WAN so people dialing in do not have a bad meeting experience.


We didn’t do this initially but we have gone back and fixed this. When we initially configured our gateways, we only configured a connection from the gateway to our SBA. So what happens if the SBA crashes or is getting upgraded or patched? All calling fails as the gateway can’t reach the Mediation service on the SBA. Instead, set up a Mediation server in your parent pool to be a fall back route (both inbound and outbound) in case the SBA is unavailable. While calls will now be travelling over your WAN during an outage, calls can still be made and received.

And be sure you have QoS configured on your WAN so that these calls don’t sound terrible.


I used to think that SBA’s were neat little devices. Now I kind of hate them. Not because they perform poorly. A properly sized SBA can handle 800 or more users in the largest of environments and once deployed we kind of forget they even exist. But upgrading them, configuring them, troubleshooting them, and dealing with their quirks is just a giant pain. I would love it if Microsoft nuked the entire install process in Lync vNext and just made it the exact same process used to install every other piece of Lync. I’m a big fan of the SBS precisely because every complaint I have about the SBA’s doesn’t exist with an SBS. You install it the same way you install everything else. You aren’t limited by overpriced and under-powered hardware. Microsoft handles the support. If they could take this flexibility and put it into the SBA model then life would be just that little bit better.

Moving Immovable Users

This is probably the first of a few blog posts regarding a problem we are facing with our Lync 2013 environment. In short, we have 2 corrupt routing groups right now. Users assigned to those routing groups are unable to add a contact to their buddy list and they cannot change their status.

This tip isn't anything too special and a lot of you may already know this but I'm putting it out there in case someone else runs into this situation.

Our initial thought was to move the users to a different pool which will remove them from one of the bad routing groups. However, we cannot move the users to a different pool. When doing so, we get the errors seen below.

PS C:\Users\flinchbot> Move-CsUser "user@flinchbot.com" -Target pool.flinchbot.com
Confirm
Move-CsUser
[Y] Yes [A] Yes to All [N] No [L] No to All [S] Suspend [?] Help
(default is "Y"):
Move-CsUser : Distributed Component Object Model (DCOM) operation begin move
away failed.
At line:1 char:1
+ Move-CsUser "user@flinchbot.com" -Target pool.flinchbot.com
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 + CategoryInfo : InvalidResult: (:) [Move-CsUser], MoveUserExcept
 ion
 + FullyQualifiedErrorId : FAILED::MoveRetry,Microsoft.Rtc.Management.AD.Cm
 dlets.MoveOcsUserCmdlet
Move-CsUser : Distributed Component Object Model (DCOM) operation
RollbackMoveAway failed "-1007781356".
At line:1 char:1
+ Move-CsUser "user@flinchbot.com" -Target pool.flinchbot.com
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 + CategoryInfo : InvalidResult: (:) [Move-CsUser], MoveUserExcept
 ion
 + FullyQualifiedErrorId : FAILED::MoveRetry,Microsoft.Rtc.Management.AD.Cm
 dlets.MoveOcsUserCmdlet
Move-CsUser : Distributed Component Object Model (DCOM) operation begin move
away failed.
At line:1 char:1
+ Move-CsUser "user@flinchbot.com" -Target pool.flinchbot.com
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 + CategoryInfo : InvalidOperation: (CN=Uk,lre poc..flinchbot,DC
 =com:OCSADUser) [Move-CsUser], MoveUserException
 + FullyQualifiedErrorId : MoveError,Microsoft.Rtc.Management.AD.Cmdlets.Mo
 veOcsUserCmdlet

So that wasn't going to work. So we decided to try a force-move of the users. In general a force-move is to be avoided as this process will move the user but it will throw away, among other things, any contact list entries.

So we did an Export-CsUserData of the user's information first:

PS C:\Users\flinchbot> Export-CsUserData -UserFilter "user@flinchbot.com" -Poolfqdn pool.flinchbot.com -filename "e:\tempuser.zip"

We verified that the data was correct by extracting the .zip file and looking at the .xml file. In there we could see the contact list entries that the user already had.
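If you would rather not extract the file by hand, a sketch like the one below can search the export for a known contact. The function name is made up, and the contact address in the usage line is a placeholder. It assumes .NET 4.5 (which Lync 2013 needs anyway) and makes no assumptions about the XML layout inside the export; it just does a case-sensitive text search of every entry in the zip.

```powershell
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Hypothetical helper: list the entries inside a zip whose text contains $Text.
function Find-TextInZip {
    param(
        [Parameter(Mandatory = $true)][string]$Path,
        [Parameter(Mandatory = $true)][string]$Text
    )
    # ZipFile needs a full file system path, not a PowerShell-relative one
    $fullPath = (Resolve-Path -Path $Path).ProviderPath
    $zip = [System.IO.Compression.ZipFile]::OpenRead($fullPath)
    try {
        foreach ($entry in $zip.Entries) {
            $reader = New-Object System.IO.StreamReader($entry.Open())
            try {
                # Emit the entry name when its content contains the search text
                if ($reader.ReadToEnd().Contains($Text)) { $entry.FullName }
            }
            finally { $reader.Dispose() }
        }
    }
    finally { $zip.Dispose() }
}

# Example: confirm a known contact made it into the export (placeholder address).
# Find-TextInZip -Path 'e:\tempuser.zip' -Text 'colleague@flinchbot.com'
```

No output means the text was not found, which is a good reason to stop before running the force-move.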

Next we did the force-move.

PS C:\Users\flinchbot> Move-CsUser "user@flinchbot.com" -Target pool.flinchbot.com -force
Confirm
Move-CsUser [Using Force will cause data loss!]
[Y] Yes [A] Yes to All [N] No [L] No to All [S] Suspend [?] Help
(default is "Y"):

This moved the user. Finally we restored the data using the Update-CsUserData cmdlet:

PS C:\Users\flinchbot> Update-CsUserData -UserFilter "user@flinchbot.com" -FileName "e:\tempuser.zip" -verbose
VERBOSE: Processing input file e:\tempuser.zip.
VERBOSE: Opening file
C:\Users\flinchbot\AppData\Local\ImportUserDataTemp.Xml.
VERBOSE: Opening file e:\tempuser.zip.
VERBOSE: Processed 1 users so far.
VERBOSE: User user@flinchbot.com specified in User Filter processed.
VERBOSE: Output file C:\Users\flinchbot\AppData\Local\ImportUserDataTemp.Xml
 generated successfully.
VERBOSE: Processing user t-user@flinchbot.com.
VERBOSE: Processed 1 users so far.
Confirm
Are you sure you want to perform this action?
Performing operation "Update-CsUserData" on Target "user@flinchbot.com".
[Y] Yes [A] Yes to All [N] No [L] No to All [S] Suspend [?] Help
(default is "Y"):

After signing out of the user account and signing back in we saw the contacts had been restored. We were also now able to add new users to the contact list as well as update the Lync status.

Moving the user back to their original pool gave the same errors as in the first example above. We need to figure that issue out but at least our users can have full Lync client functionality again even if they are now in the wrong pool.

Quick & Dirty – Gather Shutdown Tracker Events

Today I had the need to see if my Front End servers were shut down “dirty” and when. So I kicked out the following script.

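# Get every computer in the Front End pool
$servers = Get-CsComputer -Pool lyncpool.flinchbot.com
foreach ($server in $servers)
{
 Write-Host $server.Fqdn
 # Event 41 (Kernel-Power) is logged when Windows restarts without a clean shutdown
 Get-EventLog -ComputerName $server.Fqdn -LogName System -InstanceId 41 | Export-Csv shutdowns.csv -Append
}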

Port 5088 Missing from Lync 2013 Documentation

If they had the other Harry Caray, a whole lot of Budweiser would be missing too.

We had an issue where users were able to sign in with Lync mobility but were unable to send and receive IM’s. There are 2 things to note about this scenario:

1. The users are homed on an SBA

2. There are firewalls between the SBA and the parent pool.

So if you don’t have this scenario then you can quit reading now as you won’t ever have this problem.

In order to troubleshoot why our users were unable to successfully use Lync mobility, we jumped into the logs. We reviewed the log from the mobile phone and it showed nothing useful. We enabled the Lync Logging tool on the SBA and had a user log in and try to send an instant message.

Reviewing this log, we saw a request for port 5088 from the SBA to the parent pool. The request was to a specific server in the parent pool and it was from our Survivable Branch Appliance.

If you look at the image below you’ll see this in the Snooper view of the collected log file. The ms-diagnostics line pretty much spells this out as clearly as you could expect.

Look at the circle. It’s 5088!

Port 5088 does not currently exist on the Lync Ports and Protocols page on TechNet. Searching for this port turns up very little outside of this one TechNet article. That article points to the set-cswebserver PowerShell cmdlet which is used to define the web server settings in Lync. If you expand the Parameters section in the article and scroll down to the UcwaSipExternalListeningPort section you will see that this is set to use 5088/tcp by default. This is incorrect as this is the port used by UcwaSipPrimaryListeningPort. This TechNet article has the two ports switched in their documentation (The same error is seen when running get-help set-cswebserver -detailed).

Run get-csservice -Webserver and you will see the default ports. Note that they don’t match the documentation.

 

In other words, even when Microsoft has documented this port in TechNet, they got it wrong. We didn’t see port 5089 in any of our traces so we couldn’t figure out when this port gets used.

After we updated the firewalls in front of our parent pool Lync servers, the problem immediately disappeared and our SBA users were able to successfully IM via their mobile clients.
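A quick way to confirm the firewall change from the SBA side is a plain TCP connect test against port 5088 on each Front End in the parent pool. The sketch below uses only .NET socket classes; the function name and the server name in the usage line are placeholders I made up. A successful connect only proves the port is reachable, not that mobility works end to end.

```powershell
# Hypothetical helper: return $true if a TCP connection to the port succeeds.
function Test-TcpPort {
    param(
        [Parameter(Mandatory = $true)][string]$ComputerName,
        [Parameter(Mandatory = $true)][int]$Port,
        [int]$TimeoutMs = 3000
    )
    $client = New-Object System.Net.Sockets.TcpClient
    try {
        $async = $client.BeginConnect($ComputerName, $Port, $null, $null)
        # Give up if nothing answers in time (typical of a firewall silently dropping packets)
        if (-not $async.AsyncWaitHandle.WaitOne($TimeoutMs)) { return $false }
        # Throws if the connection was refused or otherwise failed
        $client.EndConnect($async)
        return $true
    }
    catch { return $false }
    finally { $client.Close() }
}

# Example, run from the SBA (placeholder server name):
# Test-TcpPort -ComputerName 'fe01.flinchbot.com' -Port 5088
```

Since the request in our trace went to a specific server in the parent pool, test every Front End in the pool rather than just the pool name.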


Our contact at Microsoft has forwarded this omission to the relevant teams so hopefully at some point this will be added to the Lync ports and protocols page.


Credit for figuring this out goes to Antwan who is resurrecting his UC Playa blog. I’m just the one who wrote the article.

Lync 2013 and Useless(?) Topology Updates

We noticed today (and a few days ago, for that matter) that our CMS Replication state was “False” an awful lot of the time. So much so that we thought our CMS Replication was broken. We failed over our CMS role [1] the other day and, after coming back from lunch, all of our replicas were “True”. Well we tried the same trick today and it didn’t fix the problem. We dug deep into the logs and it appeared that everything was actually working correctly. We even went so far as making a simple change (New-csUserPolicy “Delete This Policy”) and verifying after a few minutes that it showed up on a few of our other Lync servers [2]. So we turned our focus to why wasn’t the replication status ever “True”? [3]

I’ll skip ahead a little here and get to the point where we made our little discovery. We exported a topology, then waited a random amount of time – say 5 minutes. Then we exported another copy of topology. We took the DocItemSet.xml file from each export and did a text comparison between the two files. Lo and behold there was a change. What was this Topology change?

A user migration.

Yes, moving a user from one pool to another caused a topo refresh to our servers. What the???

Our production environment is pretty big. As such, there are almost constant changes in the environment – be it updating a dial plan or disabling a user. In other words, it’s essentially dumb luck if we ever see our replication status set to “True” on all of our servers.


I was able to replicate this in my lab which has no automated systems enabling users or other system admins editing dial plans or the like. I can control the environment very tightly.

I exported a copy of the topology. I then ran “Move-csuser flinchbot -Target lync2013se.flinchbot.com”. I then waited 5 minutes and exported the topology a second time. Next I went to this site and copied the first topology file into the left pane and the updated topology file in the right pane. It found 5 changes.

[Image Topo1: look at the bottom right of this image.]

The first is (and I am guessing here) a hash of some sort letting the recipient servers know that there has been a change to the following section (XML node). This is found at a root node in the XML document (I think that’s the right term).  The next change is similar. Like above, I think it’s a marker to point out that within the root node above, this is the specific entry that has changed.

Finally we get to the actual change. Notice that the usercount decrements from 320 to 319. This is the move of the user FROM the source pool.

The fourth change is similar to the second change above – I think it’s just pointing out that “here be changes”.

I have no users on the destination pool (well maybe a random account or two). As such, you can see that the usercount going from 0 to 1 is completely expected if a new user is moved to this specific pool.
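If you want to repeat the comparison without pasting files into a web site, Compare-Object can do a rough version of it. This is only a sketch: the function name is made up, the file paths in the usage line are placeholders, and it assumes you have already pulled DocItemSet.xml out of each export. Compare-Object treats the lines as an unordered set, so it tells you which lines changed but not where, and if the XML is stored as one giant line you will just get the whole line back.

```powershell
# Hypothetical helper: return the lines that differ between two text files.
function Compare-TextFile {
    param(
        [Parameter(Mandatory = $true)][string]$ReferencePath,
        [Parameter(Mandatory = $true)][string]$DifferencePath
    )
    $before = Get-Content -Path $ReferencePath
    $after  = Get-Content -Path $DifferencePath
    Compare-Object -ReferenceObject $before -DifferenceObject $after |
        ForEach-Object {
            # '<=' marks a line only in the first file, '=>' only in the second
            $source = 'After'
            if ($_.SideIndicator -eq '<=') { $source = 'Before' }
            New-Object PSObject -Property @{
                Source = $source
                Line   = $_.InputObject
            }
        }
}

# Example with placeholder paths for the two extracted files:
# Compare-TextFile -ReferencePath 'C:\temp\export1\DocItemSet.xml' -DifferencePath 'C:\temp\export2\DocItemSet.xml'
```

No output means the two files are identical, which in a busy environment would be the surprising result.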


So….the question is why is there a topology update sent out for a user move?

All signs point to Windows Fabric and/or pool pairing being the reason. But why would you spam all of the Lync servers in your entire infrastructure with a change that is only relevant to a subset and then only if they are using Windows Fabric?

And then the change is only the number of users?

If the user count for a pool is set to 1501 in one of these files, is this the event that triggers Windows Fabric to create a new user routing group or to re-balance its groups? It seems an awful brute-force kind of way to do this.

Consider an environment with tens of thousands or hundreds of thousands of users. Users are being created/deleted/moved all the time. Now files are being blasted around the network constantly to inform all of your servers that a user was moved. Admittedly these files tend to be fairly small. In my lab they are 30K in size. In the production environment I help manage these files are much larger.

As a fun side effect, all of these topo pushes will account for additional writes to the SQL XDS database which will fill up your SQL logs faster.

So I don’t know why Microsoft architected it this way. But if you see that your CMS state is False an awful lot then it may very well be normal for your environment.
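One way to get a feel for whether this is "normal" in your environment is to sample the replication status a few times and see how many replicas are stale at each check. The sketch below is just that, a sketch: the helper name is made up, it relies on the ReplicaFqdn and UpToDate properties returned by Get-CsManagementStoreReplicationStatus, and the sampling loop is left as a comment because it needs the Lync management shell.

```powershell
# Hypothetical helper: return the FQDNs of replicas that are not up to date.
# Expects objects with ReplicaFqdn and UpToDate properties, as returned by
# Get-CsManagementStoreReplicationStatus.
function Get-StaleReplica {
    param([object[]]$Status)
    $Status | Where-Object { -not $_.UpToDate } | ForEach-Object { $_.ReplicaFqdn }
}

# On a machine with the Lync management shell, sample once a minute for ten minutes:
# 1..10 | ForEach-Object {
#     $stale = @(Get-StaleReplica -Status (Get-CsManagementStoreReplicationStatus))
#     '{0:HH:mm:ss}  {1} stale replica(s)' -f (Get-Date), $stale.Count
#     Start-Sleep -Seconds 60
# }
```

If the stale list keeps changing from sample to sample, replication is probably just chasing constant changes. If the same server is stale every single time, that one deserves a closer look.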


 

Footnotes:

[1] You can move the active CMS host(s) by stopping the Lync Server File Transfer Agent, Lync Server Master Replicator Agent, and Lync Server Replica Replicator Agent on the current active CMS host(s). This forces an election and one of the other Front End servers will pick up one or both of the roles.

[2] For reference, this was done by running Export-CsConfiguration -Filename export.zip -LocalStore. Looking in the returned export.zip file at the DocItemSet.xml file we found that the change had indeed replicated.

[3] For the record, to check your replication status run Get-CsManagementStoreReplicationStatus.
