Monday, December 23, 2013

SharePoint 2013, search driven lightweight article scheduling


I'm currently working on a search-driven intranet. One requirement was that articles should not need approval,
but should still support lightweight scheduling.
That means "unpublished" articles should still be available in standard search, but be removed from all rollups.


Problem 1:
The built-in support for article scheduling requires approval to be turned on.

Problem 2:
The built-in approval logic removes the page entirely from standard search.

Problem 3:
The SharePoint search index does not include null values, so it's not possible to query for articles
with no scheduling (null values).

Problem 4:
KQL ignores the time portion of DateTime fields when filtering results.

So how can we work around these limitations?
We can't unpublish the pages with a custom timer job, because that would remove them from standard search.
So the pages need to stay in a published state even after they have expired.

First of all, approval was turned off, which hides the PublishingStartDate and PublishingExpirationDate fields in the pages list.
This means the fields no longer show up when you edit the properties of an item. That's OK: we can still use the fields in our custom page layouts, even though they are hidden.

As I said, null values are excluded from the search index, so there is no way to filter on them (no start schedule is null, no end schedule is null).
So we need some values that can represent null.
We also need fields that are not hidden, because hidden fields are excluded entirely from the search index.
And don't try to unhide PublishingStartDate / PublishingExpirationDate; it simply won't work.

So what we need to do is create two "hidden" fields that look like this:

<Field ID="{SOMEGUID}"
     Name="MySchedulingStartDate"
     DisplayName="$Resources:My,field_schedulingstart;"
     StaticName="MySchedulingStartDate"
     Type="DateTime"
     Format="DateTime"
     Group="My Intranett"
     ShowInDisplayForm="FALSE"
     ShowInEditForm="FALSE"
     ShowInFileDlg="FALSE"
     ShowInNewForm="FALSE"
     ShowInVersionHistory="FALSE"
     ShowInViewForms="FALSE"
     ShowInListSettings="TRUE"
     Hidden="FALSE"
     Required="FALSE">
</Field>

<Field ID="{SOMEGUID}"
     Name="MySchedulingEndDate"
     DisplayName="$Resources:My,field_schedulingend;"
     StaticName="MySchedulingEndDate"
     Type="DateTime"
     Format="DateTime"
     Group="My Intranett"
     ShowInDisplayForm="FALSE"
     ShowInEditForm="FALSE"
     ShowInFileDlg="FALSE"
     ShowInNewForm="FALSE"
     ShowInVersionHistory="FALSE"
     ShowInViewForms="FALSE"
     ShowInListSettings="TRUE"
     Hidden="FALSE"
     Required="FALSE">
</Field>


These fields are not actually hidden, but they are invisible in edit forms etc., so they are picked up
by the search index.


OK, one problem solved. Next we need to assign these hidden fields a value that can represent
no scheduling start and no expiration date, so that the search index will index them.

So we need an event receiver on the pages list that looks like this:

public class MyEventReceiver : SPItemEventReceiverBase
{
    public override void ItemUpdating(SPItemEventProperties properties, bool isCheckIn)
    {
        try
        {
            base.ItemUpdating(properties, isCheckIn);

            object startValue = properties.AfterProperties[Fields.PublishingStartDateName];
            if (startValue == null)
            {
                properties.AfterProperties[Fields.MySchedulingStartDateName] = SharePointContstants.MinDateIso8601String;
            }
            else
            {
                properties.AfterProperties[Fields.MySchedulingStartDateName] = startValue;
            }

            object endValue = properties.AfterProperties[Fields.PublishingExpirationDateName];
            if (endValue == null)
            {
                properties.AfterProperties[Fields.MySchedulingEndDateName] = SharePointContstants.MaxDateIso8601String;
            }
            else
            {
                properties.AfterProperties[Fields.MySchedulingEndDateName] = endValue;
            }
        }
        catch (Exception exception)
        {
            //Log to ULS
        }
    }
}

So in the event receiver we assign these values as "null" values:
No start schedule = "1900-01-01T00:00:00Z"
No end schedule   = "8900-12-31T00:00:00Z"

DO NOT use DateTime.MinValue or DateTime.MaxValue; they are not compatible with
the DateTime field, so you need to use the values above.
DateTime.MinValue = "0001-01-01 00:00:00"
DateTime.MaxValue = "9999-12-31 23:59:59"
If you try to assign these values, the field is not updated and the assignment is silently ignored.

DO NOT assign a DateTime object to the after properties; you need to assign a string value in the correct datetime format.
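
As a rough sketch of what those constants and the string conversion could look like, here is a small helper in plain .NET. The class name SchedulingDates and the method ToIso8601 are hypothetical and not taken from the original project; only the two sentinel strings come from the text above. Converting to UTC before formatting is also an assumption, so adjust it to match how your dates are stored.

```csharp
using System;
using System.Globalization;

// Hypothetical helper; the names are illustrative, not from the original project.
public static class SchedulingDates
{
    // Sentinel strings from the post that stand in for "no start" and "no end".
    public const string MinDateIso8601String = "1900-01-01T00:00:00Z";
    public const string MaxDateIso8601String = "8900-12-31T00:00:00Z";

    // Formats a DateTime as an ISO 8601 string (yyyy-MM-ddTHH:mm:ssZ), so that
    // a string rather than a DateTime object can be assigned to the after properties.
    // Assumption: the value is converted to UTC first.
    public static string ToIso8601(DateTime value)
    {
        return value.ToUniversalTime()
                    .ToString("yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture);
    }
}
```

With this, a UTC value of 2013-12-01 10:00:00 is formatted as "2013-12-01T10:00:00Z", the same shape as the two sentinel strings.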

So every time the item is updated we will update the hidden fields accordingly.
The hidden fields will always have a DateTime value and will be indexed by the search index.
You also need to create managed search properties for the "hidden" fields, so that we can use them in our search query.

OK, so how can we use this in our queries?

using (KeywordQuery keywordQuery = new KeywordQuery(web))
{
    keywordQuery.TrimDuplicates = false;
    keywordQuery.StartRow = 0;
    keywordQuery.SelectProperties.Add(SearchFields.Title);
    keywordQuery.SelectProperties.Add(SearchFields.MySchedulingStartDate);
    keywordQuery.SelectProperties.Add(SearchFields.MySchedulingEndDate);
    keywordQuery.EnableSorting = true;
    keywordQuery.SortList.Add(SearchFields.ArticleDate, SortDirection.Descending);
    
    string contentType = "<MYContentTypeId>";
    string friendlyUrl = "<MYFriendlyUrl>";
    string now = SPUtility.CreateISO8601DateTimeFromSystemDateTime(DateTime.Now);
    string dateFilter = string.Format(" AND {0}<={2} AND {1}>={2}",SearchFields.MySchedulingStartDate, SearchFields.MySchedulingEndDate, now);
    string query = string.Format("Path:\"{0}\" AND ContentTypeId:\"{1}*\"", friendlyUrl, contentType) + dateFilter;
    keywordQuery.QueryText = query;

    SearchExecutor exec = new SearchExecutor();
    ResultTableCollection resultsCollection = exec.ExecuteQuery(keywordQuery);
    ResultTable resultsTable = resultsCollection.Filter("TableType",
        KnownTableTypes.RelevantResults).FirstOrDefault();
    return resultsTable;
}

Since all items will have a valid DateTime value, the query is quite simple: just <= and >=.
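
To make the resulting query text easier to picture, here is a small standalone sketch that builds the same kind of KQL string without any SharePoint dependencies. The class RollupQuery, its Build method, and the sample path and content type ID in the usage note are all hypothetical; the managed property names are assumed to match the field names.

```csharp
using System;
using System.Globalization;

// Hypothetical helper that only builds the query text; it does not execute a search.
public static class RollupQuery
{
    public static string Build(string path, string contentTypeId,
                               string startProperty, string endProperty, DateTime nowUtc)
    {
        // ISO 8601 timestamp used on the right-hand side of both date comparisons.
        string now = nowUtc.ToString("yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture);

        // start <= now AND end >= now, combined with the path and content type filters.
        return string.Format(CultureInfo.InvariantCulture,
            "Path:\"{0}\" AND ContentTypeId:\"{1}*\" AND {2}<={4} AND {3}>={4}",
            path, contentTypeId, startProperty, endProperty, now);
    }
}
```

For example, Build("/news", "0x0101", "MySchedulingStartDate", "MySchedulingEndDate", ...) with a timestamp of 2013-12-01 10:00:00 gives:
Path:"/news" AND ContentTypeId:"0x0101*" AND MySchedulingStartDate<=2013-12-01T10:00:00Z AND MySchedulingEndDate>=2013-12-01T10:00:00Z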

As I said, the time portion is simply ignored when filtering on DateTime fields in KQL,
and there is no way to handle this in KQL itself.
Another "By Design" from Microsoft, which basically means it was too hard to implement at the time.

The strange thing is that the time portion is returned properly in the results, which is a good thing: we can just refilter when iterating over the results from the search, like this:

DateTime now = DateTime.Now;
foreach (DataRow result in results.ResultRows)
{
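    // Search returns the full timestamps, so convert them to local time for the comparison.
    DateTime startDate = Convert.ToDateTime(result[SearchFields.MySchedulingStartDate]).ToLocalTime();
    DateTime endDate = Convert.ToDateTime(result[SearchFields.MySchedulingEndDate]).ToLocalTime();
    if (startDate <= now && endDate >= now)
    {
        // The article is within its schedule right now; include it in the rollup.
    }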
}

Not a big problem: you will only get false positives on the start and end dates themselves, so there won't be many results to refilter.

Let's say you have an article that expires at "2013-12-01 10:00:00".

This article will appear in the search results for the entire day, and needs to be refiltered on that day only.
So between 2013-12-01 10:00:00 and 2013-12-01 23:59:59 it will appear in the search results and be refiltered out.
The next day it will be filtered out by the search itself, so no refiltering is needed.
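
The example above can be sketched in plain .NET. This is only a model of the behaviour described in this post, not SharePoint code: PassesSearchFilter imitates a date-only comparison, and PassesRefilter is the full comparison done while iterating the results. The class and method names are hypothetical.

```csharp
using System;

// Hypothetical model of the two filtering steps; not part of any SharePoint API.
public static class ScheduleFilter
{
    // Imitates the search filter as described in the post: only the date part is compared.
    public static bool PassesSearchFilter(DateTime start, DateTime end, DateTime now)
    {
        return start.Date <= now.Date && end.Date >= now.Date;
    }

    // Full comparison including the time portion, applied when iterating the results.
    public static bool PassesRefilter(DateTime start, DateTime end, DateTime now)
    {
        return start <= now && end >= now;
    }
}
```

With no start schedule (1900-01-01) and an expiry of 2013-12-01 10:00:00, both checks pass at 09:00 that day. At 15:00 the search filter still passes but the refilter rejects the article. At any time on 2013-12-02 the search filter itself rejects it.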

And of course if you still want to unpublish an article and remove it from the search index, you can still do this manually.

So that's just one way of doing it.
