Sunday, December 14, 2014

User profile taxonomy properties not getting crawled?

I just discovered some pretty strange behaviour, and since I can't find this limitation mentioned anywhere, I thought I would share it.

I had a custom taxonomy user profile property that remained empty after a full search crawl.
Other taxonomy properties in the same profile looked just fine, but my custom taxonomy property stayed empty no matter what.
This happened for only 2 of about 9000 users, so the property itself was clearly working in general.

So I started removing one field at a time from the profile, rerunning an incremental crawl after each change.
Finally, after removing the 5 terms in the "About me" field, my custom taxonomy property suddenly got crawled.
Then I added the terms back one at a time, and when I added back the 4th term, my property was no longer being crawled.
Maybe that term was corrupted, so I tried removing it and adding a different term, but got the same result.

Then I counted the number of terms on the problem profile like this:
// Namespaces used: System, System.Collections.Generic, Microsoft.SharePoint,
// Microsoft.Office.Server.UserProfiles, Microsoft.SharePoint.Taxonomy
using (SPSite site = new SPSite("https://whatever"))
{
    UserProfileManager manager = new UserProfileManager(SPServiceContext.GetContext(site));
    UserProfile userProfile = manager.GetUserProfile("domain\\whatever");
    int count = 0;
    // Walk every property on the profile and add up the taxonomy terms it holds.
    foreach (ProfileSubtypeProperty property in userProfile.Properties)
    {
        UserProfileValueCollection values = userProfile[property.Name];
        List<Term> terms = new List<Term>(values.GetTaxonomyTerms());
        count += terms.Count;
    }
    Console.WriteLine("Number of terms: " + count);
}

Number of terms: 31

So if I removed one term, my custom taxonomy user profile property got crawled again.
It seems there is a limit of 30 terms per user profile, and anything beyond 30 is not picked up
by the crawler.

I have no idea whether it is possible to increase this limit. But if the limit is there for a reason,
then saving the profile should at least give an error.

If you move the user profile property up in Central Administration (CA), the property gets higher priority when the profile is crawled.
So if it is important that a property gets crawled, move it up.

Here is some code that counts the number of terms in all user profiles, so that you can identify profiles
that are not getting crawled correctly:

// Namespaces used: System, System.Collections.Generic, Microsoft.SharePoint,
// Microsoft.Office.Server.UserProfiles, Microsoft.SharePoint.Taxonomy
using (SPSite site = new SPSite("https://whatever"))
{
    UserProfileManager manager = new UserProfileManager(SPServiceContext.GetContext(site));
    foreach (UserProfile userProfile in manager)
    {
        int count = 0;
        // Add up the taxonomy terms across all properties of this profile.
        foreach (ProfileSubtypeProperty property in userProfile.Properties)
        {
            UserProfileValueCollection values = userProfile[property.Name];
            List<Term> terms = new List<Term>(values.GetTaxonomyTerms());
            count += terms.Count;
        }
        // Report only profiles above the observed 30-term limit.
        if (count > 30)
        {
            Console.WriteLine("Account: " + userProfile.AccountName + " has " + count + " terms");
        }
    }
}
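
Once a profile is flagged, it helps to see which properties hold the most terms, so you know where to trim. Here is a minimal sketch of that, assuming you have already collected a term count per property (for example inside the inner loop above). The TermBudget class, its method names and the sample property names and counts are all made up for illustration, and the limit of 30 is only what I observed, not a documented setting.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TermBudget
{
    // Limit observed in this post; an assumption, not a documented SharePoint setting.
    public const int ObservedLimit = 30;

    // Properties that hold at least one term, largest contributor first.
    public static List<KeyValuePair<string, int>> Breakdown(IDictionary<string, int> termCounts)
    {
        return termCounts
            .Where(p => p.Value > 0)
            .OrderByDescending(p => p.Value)
            .ThenBy(p => p.Key, StringComparer.Ordinal)
            .ToList();
    }

    // Number of terms that would have to be removed to get back to the observed limit.
    public static int TermsOverLimit(IDictionary<string, int> termCounts)
    {
        return Math.Max(0, termCounts.Values.Sum() - ObservedLimit);
    }
}

public static class Program
{
    public static void Main()
    {
        // Hypothetical per-property term counts for one profile.
        var counts = new Dictionary<string, int>
        {
            { "Skills", 20 },
            { "Interests", 8 },
            { "CustomTaxonomy", 3 },
            { "Department", 0 }
        };

        foreach (KeyValuePair<string, int> row in TermBudget.Breakdown(counts))
        {
            Console.WriteLine(row.Key + ": " + row.Value);
        }
        Console.WriteLine("Terms over the observed limit: " + TermBudget.TermsOverLimit(counts));
    }
}
```

In a real run, you would fill the dictionary with each property name and the number of terms returned by GetTaxonomyTerms() for that property, instead of the sample values.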