New Windows Phone UC App

About 2 years ago I released the Lync News app for Windows Phone. Today that app has been retired and replaced with “flinchböt on UC”, an app which covers Lync as well as Exchange and has a fairly terrible name (I was in a hurry and didn’t give the name any thought). The new app is streamlined compared to the previous one, partly because it was built with App Studio instead of native Visual C++ and partly because the older one was a bloated mess.

So if you have been using the Lync News app on Windows Phone, thanks – but it’s time to uninstall it! This version has way better load times for not only the app but for the Lync feed as well. The Exchange feed is a bit laggy but since I rarely have to deal with Exchange in my job I don’t care that it’s slow.

The app is fairly self-explanatory. The one thing to point out: to see the full, original post, click the URL link at the top of a given article. Otherwise you can read it in a slightly less readable format within the app. You can also pin an article to your Start screen. If you have an article open, tap on the menu, then Share. Pick “Share Link” and you can save the article to OneNote, which is hot. That is a really cool way to save articles.

Here is the link to download the app to your Windows 8 phone.

As a reminder, there is also a similar app for Android that can be found here.

Below are some screenshots.

 

[Screenshots wp_ss_20140505_0002 through wp_ss_20140505_0006]

 

Fun with KHI and Performance Monitor

A few weeks ago I wrote a post basically saying that the Lync Stress Tool was worthless. In it I said you should really monitor the progress of your Lync deployment using Performance Monitor. I also pointed to the Key Health Indicators that Microsoft recommends you use to monitor your Lync installs. Heck, they even have a script to easily install the KHI Data Collector Set into Performance Monitor for you.

As we built our Lync 2013 servers, we installed the KHI Data Collector Set on each server as part of our standard build process. As we have about 40 Lync servers it’s a pain to go back to 40 servers and update the KHI Data Collector Set configuration. For example, we want to change the logging directory off of our c: drive and to the e: drive. We’d also like to launch the performance monitor collection every so often, have it run for a week, and then stop. Manually starting Performance Monitor on 40 servers? This is where PowerShell comes in.

I cobbled together a script to change the settings of the KHI Data Collector Set in Performance Monitor. If the KHI Data Collector Set is not installed on a server, the script installs it. After updating (or installing) the KHI Data Collector Set, it starts it on all of the servers. This is a total time saver. I won’t share the entire script here because I copied the entire Microsoft-written KHI script and buried it in mine. Copyright, plagiarism, etc.

But I will give you enough information to build your own script.

At the top of the script is this:


$arrServers = Import-Csv e:\scripts\servers.csv

This reads in a simple list of all of the servers I want to manipulate. Set the Header in the file to “ServerName”.
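For illustration, here is a rough sketch of what such a file could look like and how Import-Csv exposes it. The server names and the temp-file location are placeholders I made up, not the real ones.

```powershell
# Hypothetical server names; replace with your own Lync servers.
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'servers.csv'
@'
ServerName
lyncfe01.contoso.local
lyncfe02.contoso.local
'@ | Set-Content -Path $csvPath

# Import-Csv turns each row into an object with a ServerName property,
# named after the header line of the file.
$arrServers = Import-Csv -Path $csvPath
foreach ($server in $arrServers) {
    Write-Host "Would process $($server.ServerName)"
}
```

Because the header is “ServerName”, every row can be addressed as $server.ServerName, which is what the rest of the script relies on.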

Next, I pasted in the two functions at the top of the Microsoft Script. I edited the CreateDataCollector function to look like this:


Function CreateDataCollector
{
    Write-Host -ForegroundColor Green "Creating Lync Server 2013 KHI Data Collector on $($server.ServerName)..."

    # Create the KHI counter collector on the remote server, logging to e:\Perflogs
    Invoke-Expression "logman.exe create counter KHI -o e:\Perflogs\KHI_$($server.ServerName) -f csv -si 15 -v mmddhhmm -cf .\LyncServer2013KHIs.config -s $($server.ServerName)"
    # Remove the temporary counter list file
    Remove-Item .\LyncServer2013KHIs.config
}

I edited the Write-Host line to properly display the server name as it comes from the text file we are using. I then deleted a few lines and built my own Invoke-Expression command. Note that in this one I am slipping the server name into the name of the logfile. I am also pointing the logfile to an e:\Perflogs directory.

The CreateKHIsTextFile function is left unchanged.

And then after those 2 functions is the code I cobbled together.

Function StartKHI
{
    # Query throws if the KHI collector set does not exist on the server
    $datacollectorset.Query("KHI", $Server.ServerName)
    # Change already-installed KHI collector set to log to e: drive instead of default c: drive
    Invoke-Expression "logman.exe update KHI -o e:\Perflogs\KHI_$($Server.ServerName) -b 5/1/2014 17:00:00 -e 5/8/2014 17:00:00 -s $($Server.ServerName)"
    # Start the collector set
    $datacollectorset.Start($false)
}

foreach ($Server in $arrServers)
{
 Write-host "Working on" $Server.ServerName "..." -ForegroundColor Green

 $datacollectorset = New-Object -COM Pla.DataCollectorSet;
 try
 {
#If the collector set is not already installed, it errors. If no error, start the collector
 StartKHI
 }
 catch
 {
#Starting the collector crashed, so it's probably not installed. Install it, then start it.
 write-host ("KHI counters not installed on {0}" -f $Server.ServerName) -ForegroundColor Green
 write-host "Installing...." -ForegroundColor Green
 CreateKHIsTextFile
 CreateDataCollector
 StartKHI
 }
}
 

I’ll assume you are fairly well versed in PowerShell, so let me point out the one bit of creativity I had to use. The “$datacollectorset.Query(“KHI”, $Server.Servername)” call returns nothing when it works. When it fails, it blows up and scrawls PowerShell blood all over your screen. So the way to tell whether the KHI is already installed is to use a Try/Catch construct. If the try works, it starts the KHI Data Collector successfully. If it fails, I assume that the KHI Data Collectors haven’t been installed, so I call the Microsoft-written (and slightly edited by me) functions to install them. Once those are done, I go ahead and start the Data Collector.

So using this script, I am able to either install the KHI Data Collectors or to update them with values I want. If you look at the Invoke-Expression line in the StartKHI function, I use the -b and -e parameters. This sets a begin and end time for the collector to run. In this case it is one week. You will probably have to edit this before running your copy.
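Rather than hand-editing the dates each time, the begin and end values could be computed. What follows is only a sketch: the server name is hypothetical, and the date format is an assumption, since logman follows the regional settings of the machine it runs on. The command is printed rather than executed so you can check it first.

```powershell
# Sketch: build a one-week window starting today at 17:00.
$begin = (Get-Date).Date.AddHours(17)
$end   = $begin.AddDays(7)

# Assumption: logman on this machine accepts M/d/yyyy HH:mm:ss.
# Adjust the format string if your regional settings differ.
$fmt     = 'M/d/yyyy HH:mm:ss'
$culture = [System.Globalization.CultureInfo]::InvariantCulture
$b = $begin.ToString($fmt, $culture)
$e = $end.ToString($fmt, $culture)

$serverName = 'lyncfe01.contoso.local'   # hypothetical server name
$cmd = "logman.exe update KHI -b $b -e $e -s $serverName"

# Print the command instead of running it.
Write-Host $cmd
```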


Below is a short script to stop the Data Collector Set. It’s useful when testing.


$arrServers = Import-Csv e:\scripts\servers.csv

foreach ($Server in $arrServers)
{
 Write-host "Working on" $Server.ServerName "..." -ForegroundColor Green

 try
 {
 $datacollectorset = New-Object -COM Pla.DataCollectorSet;
 $datacollectorset.Query("KHI", $Server.ServerName);
 $datacollectorset.Stop($false);
 }
 catch
 {
 write-host "KHI counters already stopped on $($Server.ServerName)" -ForegroundColor Green
 }
}

In the above you don’t really have to use the Try/Catch. It’s just to make things prettier (i.e., less PowerShell blood).


So if you cobble the full script together, you can install the KHI Data Collector set, edit its settings, and start and stop the collector. Pretty useful, especially if you have a lot of servers. Now the next challenge: What do you do with 40 servers-worth of logs?

Find SIP Addresses with Illegal Characters

One of my peers had a Lync 2013 pool-failover scenario. Just about everything worked right except that apparently the Lync Backup Service had been getting hung up and not completing its replication cycles. They opened a case with Microsoft and one of the issues discovered was that Lync Backup Service was hanging on users whose SIP Address had illegal characters. Once they manually fixed these SIP Addresses, the Backup Service was able to complete successfully.

So what characters are illegal in a SIP address (at least so far as Lync is concerned)?

~ | { } [ ] < > ` # ^ & @

We can convert that to a Regular Expression:

^([^~|{}\[\]<>`#^'&@\\]+)$

Once that is done, a quick and dirty script can be written to compare every user against this Regular Expression. The expression only matches a string made up entirely of legal characters, so if a SIP Address does not match it, we can be notified of this.


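# These are the invalid characters ~|{}[]<>`#^&@

$AddressToTest = Get-CsUser
# The backtick is doubled because it is PowerShell's escape character inside double quotes
$regex = "^([^~|{}\[\]<>``#^'&@\\]+)$"

Foreach ($User in $AddressToTest)
{
    # Test only the part between "sip:" and "@"
    If (($User.SipAddress -split "@")[0].Substring(4) -notmatch $regex)
    {
        Write-Host "Invalid username specified." $User.SipAddress
    }
}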


The only fancy part of this script is in the If statement. We can’t compare the entire SIP Address against the regex because the “@” will always be a match. So the Split is used to grab the left hand side of the SIP Address which is the portion that will (most likely) have illegal characters. You’ll also note the “substring” portion in the if statement. This means begin the comparison 4 characters in; skip the “sip:” portion of the returned SIP Address.
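If you want to try the split-and-strip logic without a Lync environment, something like the sketch below should work against plain strings. Test-SipUserPart is a name I made up, and the sample addresses are invented. The pattern here is written the other way around from the script: it looks for any one illegal character instead of requiring that every character be legal.

```powershell
# Characters listed above as illegal, written as a regex character class.
# Single quotes keep PowerShell from treating the backtick as an escape.
$illegalPattern = '[~|{}\[\]<>`#^&@]'

# Hypothetical helper: returns $true when the user part is clean.
function Test-SipUserPart {
    param([string]$SipAddress)
    # Keep the left-hand side of the "@" and drop the leading "sip:".
    $userPart = ($SipAddress -split '@')[0] -replace '^sip:', ''
    # True when no illegal character appears in the user part.
    return ($userPart -notmatch $illegalPattern)
}

Test-SipUserPart 'sip:jane.doe@contoso.com'   # True
Test-SipUserPart 'sip:jane#doe@contoso.com'   # False
```

As with the original script, only the part before the first “@” is examined, so a stray “@” inside the user part would not be caught by this check.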

Note that if you want to test out this script in a lab environment, you can force a user to have any illegal character by editing their SIP Address via ADSIEdit. Also note that set-csuser will permit you to edit a SIP Address and insert a few of the above characters.

Here is sample output from the script:

[Screenshot: Invalid_URI_Capture]

 

Tom Arbuthnot points to a Technet document specifically calling out the unsupported usage of the hyphen and apostrophe here: http://tomtalks.uk/2014/08/apostrophe-and-dash-not-supported-in-user-sip-addresses-in-lync-server-find-problem-sip-uris/

Disabling HTTP in OWAS/WAC

We built our OWAS farms and, like most Lync people, had no clue what we were doing. But they ended up working anyway so hooray for us.

Now that we are begrudgingly learning a little about it we have learned that we should disable HTTP on the pools and run with HTTPS only.

So we tried the obvious command to disable HTTP:

Set-OfficeWebAppsFarm -AllowHTTP $False

That gives this wonderful error:

Set-OfficeWebAppsFarm : A positional parameter cannot be found that accepts argument 'False'.
At line:1 char:1
+ Set-OfficeWebAppsFarm -AllowHTTP $False
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo          : InvalidArgument: (:) [Set-OfficeWebAppsFarm], ParameterBindingException
+ FullyQualifiedErrorId : PositionalParameterNotFound,Microsoft.Office.Web.Apps.Administration.SetFarmCommand

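After asking around, we found that the secret to this command is to use a colon (:) instead of a space ( ) between the parameter and the value. AllowHTTP appears to be a switch parameter, which takes no separate value, so PowerShell treats a space-separated $False as a stray positional argument. As such, this is the proper syntax: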

Set-OfficeWebAppsFarm -AllowHTTP:$False
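The same behavior can be reproduced with any function that declares a switch parameter. Set-DemoFarm below is a made-up stand-in, not the real cmdlet, and I am assuming the real AllowHTTP parameter behaves like an ordinary switch.

```powershell
# Hypothetical stand-in with a switch parameter; not Set-OfficeWebAppsFarm.
function Set-DemoFarm {
    [CmdletBinding()]
    param([switch]$AllowHttp)
    "AllowHttp is $($AllowHttp.IsPresent)"
}

Set-DemoFarm -AllowHttp           # AllowHttp is True
Set-DemoFarm -AllowHttp:$false    # AllowHttp is False

# With a space, $false becomes a separate positional argument. No parameter
# accepts positional input here, so parameter binding fails.
try {
    Set-DemoFarm -AllowHttp $false
} catch {
    "Failed: $($_.Exception.Message)"
}
```

The colon glues the value to the switch, which is the only way to pass an explicit $false to one.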

Note that if you have the SSLOffloaded parameter set to True, you cannot disable AllowHTTP. If you try, you get this warning:

WARNING: When offloading SSL, AllowHttp is automatically enabled.

To work around this, run the following command to set both to false.

Set-OfficeWebAppsFarm -SSLOffloaded:$False -AllowHTTP:$False

For more detail and tips on how to secure your Office Web Apps, see this blog.

LyncSCOM bug

Just a quick note. We have recently been receiving a lot of SCOM alerts like the following in our Lync 2013 environment:

Resolution State: New

Alert Name: [LYNC] Total number of Storage Service EWS Autodiscovery errors.

 

Source: LS Storage Service Component [lync203-1.flinchbot.com]

Path: lync203-1.flinchbot.com

Last modified by: System

Last modified time: 3/4/2014 3:48:54 PM

 

Perf Object Name:

Perf Counter Name: LYSS – Total number of Storage Service EWS Autodiscovery errors.

Perf Counter Value: 186

Error Threshold: 25

Warning Threshold: 1

Consecutive Samples Repeat Count: 2

 

Please see the ‘Product Knowledge’ and the ‘Alert Context’ tab on Alert Properties view for more information.

 

[end of alert description]

These seemed odd because, though I am not at all involved in Exchange, our Exchange guys are sharp and would have caught a misconfigured autodiscover record.

Further, we do not do any archiving so why is a Lync Front End even bothering to check for EWS?

So I checked with my contact at Microsoft and he informed me that this is a bug and we should disable the alert. I don’t know (and don’t care!) if it is a Lync or SCOM error.

The Hidden Logs That Could Crash Your Lync Servers!

How’s that for the title of a blog article! Apparently I’ve been reading too much Huffington Post or something. For the record, I never read that website. I have standards, as low as they may be.

So back to the title and the point of this post. Are there actually hidden log files that could cause some unintended problems with your Lync 2013 environment? Absolutely. I am assuming you are already aware that IIS logs could fill up your local hard drive. It is also a good idea to keep an eye on the trace files created by OCS Logger and Snooper.

However, there are some hidden logfiles that are created by Windows Fabric that could very much fill up your hard drive and it would be a decent challenge to find them. If you are unaware, Lync 2013 sits on top of a technology called Windows Fabric. For a nice overview, check out this Technet blog article as well as this article on masteringlync.com.

By default, Windows Fabric is set to create log files in this hidden system directory:

C:\ProgramData\Windows Fabric\Log\Traces

Once a log file reaches 128MB, it creates a brand new log file. Over time, all of these 128MB log files will fill up your hard drive. When the hard drive gets full it’s very likely that you will see some issues with Lync – yes, even including the potential of one of your Lync servers crashing.

Here is a screenshot of one of my lab servers where I have done nothing to address this potential issue.

According to Windows Explorer, that is 810MB of disk space taken up in my lab by Windows Fabric log files. Note that these are binary log files, so it’s not as if I could read them to see what is happening. As such, these log files are only useful to Microsoft when troubleshooting a potential issue. You know, an issue like your hard drive filling up! I don’t think there is a point in keeping a year’s worth of Windows Fabric log files.

So how do we keep these log files from eating up our drive space?

For the paranoid, create a scheduled task on all of your Front End Servers (and Directors and SBAs/SBSes) to move the logs to some other server that has disk space you want to waste.

For the rest of us looking for an easy, one time fix, run this command from an elevated command prompt (this is not a PowerShell command):

Logman update trace FabricLeaseLayerTraces -f bincirc --cnf

This will change the logging to circular. According to this Technet article, -cnf is used to “create a new file when the log size has been exceeded”, and an extra hyphen in front of an option negates it. So --cnf turns that behavior off: rather than rolling over to a brand new file once the 128MB file size has been reached, the log goes back to the beginning of the same file and continues logging.

So there you go. Either keep an eye on this directory or run the Microsoft-recommended command to make sure these hidden log files don’t cause you unnecessary heartache.
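If you go the “keep an eye on this directory” route, a small size check along these lines might help. Get-FolderSizeMB is a helper name I made up, and this is only a sketch. It assumes PowerShell 3.0 or later for the -File switch.

```powershell
# Sketch: report how much disk space a directory tree uses, in megabytes.
function Get-FolderSizeMB {
    param([Parameter(Mandatory)][string]$Path)
    # -Force includes hidden files; errors (such as a missing path) are ignored.
    $files = Get-ChildItem -Path $Path -File -Recurse -Force -ErrorAction SilentlyContinue
    $bytes = ($files | Measure-Object -Property Length -Sum).Sum
    if ($null -eq $bytes) { $bytes = 0 }
    [math]::Round($bytes / 1MB, 1)
}

$tracePath = 'C:\ProgramData\Windows Fabric\Log\Traces'
if (Test-Path $tracePath) {
    "{0} MB in {1}" -f (Get-FolderSizeMB -Path $tracePath), $tracePath
}
```

Run from a scheduled task, the result could be compared against a threshold of your choosing before anyone gets paged.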

NUMA

Microsoft should have been embarrassed that they publicly claimed support for virtualizing Lync 2013 but were incapable of providing guidance…until last week when they finally released their 14-months overdue white paper.

Why can’t school be like the real-world? If I could re-write my papers (aka – release them way late) that I wrote in college I so could up my college GPA from B- to a solid B!

Far too late to help us, we did get some tidbits of information out of Microsoft months before this paper was published. One of the main tripping points we came across is mentioned in the white paper as such:

Disable non-uniform memory access (NUMA) spanning on the hypervisor, as this can reduce guest performance

One of our environments was having all kinds of performance issues, and disabling NUMA spanning provided a clear boost to performance. As such, the following little meme flew around our office for a few days. Now that this guidance is official, I thought I’d share it with the rest of you.

 

[Image: “newman” meme]

Is the Lync Stress Tool worthless?

There has been a lot of chatter lately about the Lync 2013 Stress tool, particularly since Microsoft just released a new guide about this tool. The guide is very useful, as figuring out the tool on your own is... challenging.

In short, the tool works by simulating a heavy load of traffic against your Lync environment. If your servers can handle the load you have defined then you can be fairly confident that your installation is ready for production.

However, there is a big caveat that needs to be explained before you launch this against your Lync servers that sit in any semblance of a production environment. By “any semblance of a production environment” I mean the Active Directory domain that houses your production or pre-production Lync 2013 servers, any other Lync installs that share the same Lync Organization as the pool you want to test, and anything else that might get pegged harder than usual due to this testing such as network bandwidth or firewalls.

In section 5.1 of the guide, Microsoft even mentions the following:

To stress test Lync Server using LSS, it is best to use an isolated lab environment. The stress testing lab needs to include:

  • Active Directory Domain Services domain controllers

  • Active Directory Certificate Services root certification authority

So if Microsoft says you should only use this in a lab environment, you begin to ask yourself: what is the point of testing lab servers? Well... there isn’t much of a point unless you build an exact duplicate in your lab of what you will put into production. Depending on the size of your environment, this could be a very sizable investment. (It’s not unheard of to have over 30 Lync-related servers in a single pool. Plan on deploying more than 1 pool in a paired-pool config and your lab will get really large (and expensive), though this tool doesn’t stress test every component.)

Alternately, you could bring up your entire Lync environment in a Lab domain, stress test it, then uninstall everything (bootstrapper.exe /scorch) and re-install it into production. Assuming you do everything exactly correct then you will at least have a decent idea that your moving-to-production hardware can handle your anticipated load. But that is an awful lot of work to build your environment twice just to get some metrics.

So then what’s the big deal with just running this in production? Why does Microsoft warn against it?

The guide mentions that you need to build client machines to launch the tests. Each client machine can handle no more than 4,500 simulated endpoints (with Multiple Points of Presence (MPOP), it goes up to 6,300, but for the purposes of this article the focus is on the 4,500 endpoints). Each endpoint is actually a user created in your Active Directory environment, and each one of these users will be Lync enabled. What happens when you Lync enable a few thousand or tens of thousands of users? You need to regenerate the Lync Address Book and push it out to all of your users.

This is exactly what you don’t want your users seeing just because you are testing.

If you are in a small environment then maybe this isn’t a problem. But if you are geographically dispersed and/or your users have limited bandwidth then you can start seeing how there might be issues by throwing abnormally large address books around your network. And if you didn’t think ahead and name your thousands of test users something like ZZZZZZ_LyncUserX then you will have a few thousand new “users” buried smack in the middle of your Lync Address book.

Look at all of those accounts clogging up the address book.

When you remove all of these users a new Address Book will need to be generated and pushed too.

Depending on how robust your AD infrastructure is, do you think your network can handle several thousand users all logging in over a short period of time? Sure you can set the tool to log in users at a rate of one per second but what will this do to any security logging or auditing software you might have in production?

The testing tool can also create a bunch of conference directories that you will have to manually clean up afterwards.

So what should you do instead? Well, ask yourself this: What are you really trying to test? The ability of Lync to handle thousands of connections, or the ability of your servers to support thousands of connections to Lync? Because quite honestly, I trust Microsoft to make Lync scalable enough to handle the maximum load you are looking to run. Where the bottlenecks come in is your server infrastructure. Is your SQL Server properly scaled? Do you have enough bandwidth between servers or is your switch overrun and dropping packets?

Microsoft has released a Key Health Indicators document that works with Windows Performance Monitor to collect the key metrics you need to make sure that your servers running Lync are running well. You can download a script to create these counters in Performance Monitor. They are part of the Network Planning, Monitoring, and Troubleshooting with Lync Server document. Just run the PowerShell script and it will create a set of Key Health Indicators for you to monitor.

The script creates a Lync-specific KHI collector set within Performance Monitor for you.

Now run the KHI collector set for a few days with no one using the servers. This will create your baseline metrics. Now, begin adding or migrating your users to the Lync 2013 servers. Every week run the KHI metrics and see if you notice any unusual spikes. If so, investigate them as these could be pointing out potential bottlenecks such as disk that is too slow or not enough CPU resources.
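One possible way to spot spikes is to average each counter in the baseline CSV and compare it with the same counter in a later log. The sketch below is only a starting point. Both function names are made up, the “more than twice the baseline” rule is an arbitrary assumption, and it assumes the counter values parse as numbers under the current culture.

```powershell
# Sketch: average every counter column in a Performance Monitor CSV log.
function Get-CounterAverage {
    param([Parameter(Mandatory)][string]$CsvPath)
    $rows = @(Import-Csv -Path $CsvPath)
    $averages = @{}
    if ($rows.Count -eq 0) { return $averages }
    # The first column is the timestamp; the remaining columns are counters.
    $counterNames = $rows[0].PSObject.Properties.Name | Select-Object -Skip 1
    foreach ($name in $counterNames) {
        # Keep only samples that parse as numbers (blank samples are skipped).
        $values = @(foreach ($row in $rows) {
            $parsed = 0.0
            if ([double]::TryParse($row.$name, [ref]$parsed)) { $parsed }
        })
        if ($values.Count -gt 0) {
            $averages[$name] = ($values | Measure-Object -Average).Average
        }
    }
    $averages
}

# Sketch: list counters whose current average exceeds Factor x the baseline.
function Find-CounterSpike {
    param([hashtable]$Baseline, [hashtable]$Current, [double]$Factor = 2.0)
    foreach ($name in $Current.Keys) {
        if ($Baseline.ContainsKey($name) -and $Baseline[$name] -gt 0 -and
            $Current[$name] -gt $Factor * $Baseline[$name]) {
            [pscustomobject]@{
                Counter  = $name
                Baseline = $Baseline[$name]
                Current  = $Current[$name]
            }
        }
    }
}
```

Usage would be along the lines of calling Get-CounterAverage once on the baseline log and once on the current week’s log, then passing both results to Find-CounterSpike. Counters with a baseline of zero are skipped by this rule, so those would still need a manual look.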

Using this method will actually let you monitor your environment and let you know if it is handling the actual stress of your deployment and not a theoretical stress in your lab.

Now, how could Microsoft improve the stress tool? Well, create 1 or 10 or 100 users and have them log in 4000 or 400 or 40 times. The tool allows you to have each user log in multiple times but only up to a “100% ratio”. This means if you have 1000 test users you can have up to 2000 sessions with multiple logins (MPOP). However if you need to stress an environment that would need up to 30,000 endpoints you still need to create 15,000 test users.

So to answer the question that is asked in the title of this article: Is the Lync Stress Tool worthless? My answer is that, unless you are in a small deployment or you are really digging deep into Lync architecture, it is basically worthless. Instead, proactively monitor your Lync servers as you would any other production server and should any issues pop up you will be prepared to handle them before they become catastrophic.