Indexing DateTime fields – Sitecore 7 Content Search

Reason

The DateTime indexing logic that ships with Sitecore 7 Content Search is flawed: neither the code that stores DateTime values nor the code that retrieves them respects changes to the ContentSearch.DateFormat setting. As a result, DateTime values lose their hours, minutes and seconds when they are stored in the indices, no matter which date time format is specified in the content search configuration.

Shown below is a default and a patched ContentSearch.DateFormat. The format has been changed from “yyyyMMdd” to “s” (top vs. bottom screenshot):

[Screenshot: Default ContentSearch.DateFormat setting]

[Screenshot: Patched ContentSearch.DateFormat setting]

The standard DateFieldReader always stores the field contents with a date resolution of DAY, hence ignoring the above setting. This means that out of the box, the hour, minute and second parts of a Sitecore DateField will not be stored regardless of the ContentSearch.DateFormat used.
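
To see the effect outside Sitecore, the sketch below formats one DateTime value with both format strings. It uses only the .NET base class library; the class name DateFormatDemo and the helper ToIndexString are made up for illustration and are not part of Sitecore.

```csharp
using System;
using System.Globalization;

public static class DateFormatDemo
{
    // Formats a value with a .NET format string, roughly what happens
    // when a date is turned into its string representation for the index.
    public static string ToIndexString(DateTime value, string format)
    {
        return value.ToString(format, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        var published = new DateTime(2013, 10, 24, 14, 30, 15);

        // Day resolution: the time of day is gone.
        Console.WriteLine(ToIndexString(published, "yyyyMMdd")); // 20131024

        // Sortable format: the time of day survives.
        Console.WriteLine(ToIndexString(published, "s"));        // 2013-10-24T14:30:15
    }
}
```

Once a value has been written as “20131024”, the time of day cannot be recovered when it is read back, which is why the format has to be respected at indexing time.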

[Screenshot: Sitecore.ContentSearch.LuceneProvider.FieldReaders.DateFieldReader]

Fortunately, it’s easy to replace the default DateFieldReader with a custom implementation via the content search configuration files.
Unfortunately, once a DateFieldReader that respects ContentSearch.DateFormat is in place, format exceptions (“String was not recognized as a valid DateTime”) are thrown when working with types derived from Sitecore.ContentSearch.SearchTypes.SearchResultItem.
This happens when the properties CreatedDate and Updated are read from the index. These two properties are backed by index fields which are configured with their own date time format, as follows:

[Screenshot: __smallCreatedDate and __smallUpdatedDate configuration]

It turns out that the default date time converter doesn’t handle date time formats that are configured on individual fields, and there’s no obvious way to retrieve the custom storage format once a field’s value has been passed to a TypeConverter for conversion:

[Screenshot: Sitecore.ContentSearch.Converters.IndexFieldDateTimeValueConverter]
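
The exception can be reproduced without Sitecore. The following sketch assumes, as described above, that the global format is “s” while some values are still stored as “yyyyMMdd”. The class SingleFormatParseDemo is hypothetical and only stands in for a converter that knows a single format; it is not Sitecore’s implementation.

```csharp
using System;
using System.Globalization;

public static class SingleFormatParseDemo
{
    // Parses a stored index value with exactly one expected format,
    // mirroring a converter that is unaware of per-field formats.
    public static DateTime Parse(string stored, string expectedFormat)
    {
        return DateTime.ParseExact(stored, expectedFormat,
            CultureInfo.InvariantCulture, DateTimeStyles.None);
    }

    public static void Main()
    {
        // A value written with the global "s" format parses fine.
        Console.WriteLine(Parse("2013-10-24T14:30:15", "s").Hour); // 14

        try
        {
            // A value written with a field-level "yyyyMMdd" format does not.
            Parse("20131024", "s");
        }
        catch (FormatException ex)
        {
            // The message text depends on the runtime and its language settings.
            Console.WriteLine(ex.Message);
        }
    }
}
```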

The remainder of this article describes a fix for the issues outlined above: it overrides the default DateFieldReader and IndexFieldDateTimeValueConverter and introduces a setting for “alternate date time formats”.

Examples are based on .NET 4.5 and Sitecore 7.1 rev. 130926.

Code

The storage and retrieval process can be roughly outlined as follows:

  1. A crawler reads items from Sitecore.
  2. Each item’s fields are read by matching FieldReader implementations (e.g. DateFieldReader, CheckboxFieldReader, ImageFieldReader).
  3. The field value is converted to a string and stored in the index.
  4. When retrieving content directly from the index (i.e. when using LINQ to Sitecore with a class whose properties are marked with the Sitecore.ContentSearch.IndexFieldAttribute), a TypeConverter parses the stored value into an instance of the appropriate runtime type.

Configuration

To replace the default FieldReader and TypeConverter with our custom implementations, save the configuration shown below to a .config-file (e.g. “z.ContentSearch.DateTimeFix.config”) and place it in the “App_Config/Include”-folder.
Note that the file name is prefixed with the letter “z”. Sitecore loads config include files in lexicographical order, so the prefix ensures that our file is loaded after “Sitecore.ContentSearch.Lucene.DefaultIndexConfiguration.config”, which is also located in the “App_Config/Include”-folder. Without the prefix, our configuration would be overridden by the default config and have no effect.
Modify namespace and assembly names as needed.
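
The effect of the “z” prefix can be illustrated with a plain sort. This is only a sketch: it assumes a case-insensitive ordinal comparison as a stand-in for the lexicographical order mentioned above, and the unprefixed file name is hypothetical.

```csharp
using System;
using System.Linq;

public static class IncludeOrderDemo
{
    // Returns the file names in lexicographical order
    // (here: ordinal, case-insensitive; an assumption for illustration).
    public static string[] LoadOrder(params string[] fileNames)
    {
        return fileNames.OrderBy(name => name, StringComparer.OrdinalIgnoreCase).ToArray();
    }

    public static void Main()
    {
        string[] order = LoadOrder(
            "z.ContentSearch.DateTimeFix.config",
            "Sitecore.ContentSearch.Lucene.DefaultIndexConfiguration.config",
            "ContentSearch.DateTimeFix.config"); // hypothetical unprefixed name

        // The unprefixed file sorts before the default configuration,
        // the "z"-prefixed file sorts after it.
        foreach (string name in order)
            Console.WriteLine(name);
    }
}
```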

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <settings>
      <!-- The default date time format used when storing values in the index. -->
      <setting name="ContentSearch.DateFormat">
        <patch:attribute name="value">s</patch:attribute>
      </setting>
      <!-- Alternate date time formats that are used in the index. Multiple values must be separated by pipes. -->
      <setting name="ContentSearch.AlternateDateTimeFormats" value="yyyyMMdd" />
    </settings>

    <contentSearch>
      <configuration>
        <defaultIndexConfiguration>
          <!-- DateTimeFieldReader -->
          <fieldReaders type="Sitecore.ContentSearch.FieldReaders.FieldReaderMap, Sitecore.ContentSearch">
            <mapFieldByTypeName hint="raw:AddFieldReaderByFieldTypeName">
              <fieldReader fieldTypeName="date|datetime">
                <patch:attribute name="fieldReaderType">NamespaceName.DateTimeFieldReader, AssemblyName</patch:attribute>
              </fieldReader>
            </mapFieldByTypeName>
          </fieldReaders>

          <!-- DateTimeConverter -->
          <indexFieldStorageValueFormatter type="Sitecore.ContentSearch.LuceneProvider.Converters.LuceneIndexFieldStorageValueFormatter, Sitecore.ContentSearch.LuceneProvider">
            <converters hint="raw:AddConverter">
              <converter handlesType="System.DateTime">
                <patch:attribute name="typeConverter">NamespaceName.DateTimeConverter, AssemblyName</patch:attribute>
              </converter>
            </converters>
          </indexFieldStorageValueFormatter>
        </defaultIndexConfiguration>
      </configuration>
    </contentSearch>
  </sitecore>
</configuration>

FieldReader

The DateTimeFieldReader simply passes ContentSearchConfigurationSettings.IndexDateFormat (the value of the ContentSearch.DateFormat setting patched above) to DateTime.ToString(...):

using System;
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.FieldReaders;
using Sitecore.ContentSearch.Utilities;
using Sitecore.Data.Fields;

public class DateTimeFieldReader : FieldReader
{
  public override object GetFieldValue(IIndexableDataField field)
  {
    if (field.Value is DateTime)
      return Format((DateTime)field.Value);

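    // Cast first and check for null before relying on the implicit conversion to Field.
    var indexableField = field as SitecoreItemDataField;
    if (indexableField == null)
      return string.Empty;

    Field dataField = indexableField;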

    if (string.IsNullOrEmpty(dataField.Value))
      return string.Empty;

    if (FieldTypeManager.GetField(dataField) is DateField)
      return Format(new DateField(dataField).DateTime);

    return string.Empty;
  }

  private string Format(DateTime dateTime)
  {
    return dateTime.ToString(ContentSearchConfigurationSettings.IndexDateFormat);
  }
}

TypeConverter

Note that the DateTimeConverter makes use of the new ContentSearch.AlternateDateTimeFormats setting. Being forced to maintain all index date time formats in two separate parts of the configuration (as part of the field configurations themselves and as a setting) is suboptimal, but it’s a suitably pragmatic solution which will work until Sitecore provides a proper fix:

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Globalization;
using Sitecore.Configuration;
using Sitecore.ContentSearch.Utilities;

public class DateTimeConverter : System.ComponentModel.DateTimeConverter
{
  private readonly List<string> _alternateDateTimeFormats = new List<string>();
  private readonly string _defaultDateTimeFormat = ContentSearchConfigurationSettings.IndexDateFormat;

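  public DateTimeConverter()
  {
    // The default format is tried first, followed by the alternate formats.
    _alternateDateTimeFormats.Add(_defaultDateTimeFormat);

    // Drop empty entries (e.g. from a missing setting or a stray pipe):
    // DateTime.ParseExact fails as soon as it reaches an empty format string.
    _alternateDateTimeFormats.AddRange(
      Settings.GetSetting("ContentSearch.AlternateDateTimeFormats", string.Empty)
        .Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries));
  }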

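  public override object ConvertFrom(ITypeDescriptorContext context, CultureInfo culture, object value)
  {
    string fieldValue = value as string;
    if (string.IsNullOrEmpty(fieldValue))
      return DateTime.MinValue;

    // Callers may pass a null culture; fall back to the invariant culture.
    DateTimeFormatInfo formatInfo = (culture ?? CultureInfo.InvariantCulture).DateTimeFormat;

    try
    {
      // Accept a value written in any of the known index formats.
      return DateTime.ParseExact(fieldValue, _alternateDateTimeFormats.ToArray(), formatInfo, DateTimeStyles.None);
    }
    catch (FormatException ex)
    {
      // Include the offending value to make index problems easier to diagnose.
      string message = string.Format("{0} Value is '{1}'.", ex.Message, value);
      throw new ArgumentException(message, "value", ex);
    }
  }

  public override object ConvertTo(ITypeDescriptorContext context, CultureInfo culture, object value, Type destinationType)
  {
    DateTimeFormatInfo formatInfo = (culture ?? CultureInfo.InvariantCulture).DateTimeFormat;
    return ((DateTime) value).ToString(_defaultDateTimeFormat, formatInfo);
  }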
}
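
The multi-format parsing that the converter relies on can be tried in isolation. The sketch below hard-codes the two formats used in this article instead of reading them from the Sitecore settings; MultiFormatParseDemo is a hypothetical name used for illustration only.

```csharp
using System;
using System.Globalization;

public static class MultiFormatParseDemo
{
    // Default format first, then the alternate format (hard-coded for the demo).
    private static readonly string[] Formats = { "s", "yyyyMMdd" };

    // Parses a stored value that may have been written in either format.
    public static DateTime Parse(string stored)
    {
        return DateTime.ParseExact(stored, Formats,
            CultureInfo.InvariantCulture, DateTimeStyles.None);
    }

    public static void Main()
    {
        // Written with the "s" format: the time of day is preserved.
        Console.WriteLine(Parse("2013-10-24T14:30:15")
            .ToString("s", CultureInfo.InvariantCulture)); // 2013-10-24T14:30:15

        // Written with "yyyyMMdd": parses as midnight of that day.
        Console.WriteLine(Parse("20131024")
            .ToString("s", CultureInfo.InvariantCulture)); // 2013-10-24T00:00:00
    }
}
```

DateTime.ParseExact accepts the value as soon as one of the supplied formats matches it exactly, so values written with either format can be read by the same converter.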

Example

Shown below is a simple C# class inheriting from the aforementioned SearchResultItem:

[Screenshot: DateTimeExample class]

The following makes up the code behind of a simple ASP.NET test page (SearchResults is a GridView control):

[Screenshot: DateTimeExample code behind]

As can be seen in the output, both the default fields __smallCreatedDate and __smallUpdatedDate as well as the field NewsDate are retrieved properly:

[Screenshot: DateTimeExample output]

The screenshot below is taken from Luke, showing how the field values are stored in Lucene:

[Screenshot: Luke]

7 thoughts on “Indexing DateTime fields – Sitecore 7 Content Search”

  1. Thx for your post, but I get this error on this line:
    ISearchIndex index = ContentSearchManager.GetIndex("sitecore_master_index");

    An exception of type ‘System.InvalidCastException’ occurred in mscorlib.dll but was not handled in user code
    Additional information: Cannot cast from ‘System.String’ to ‘Sitecore.ContentSearch.ProviderIndexConfiguration’.

    I use Sitecore 7 and hope you can help me out here:)

    • Hi K-J!

      The error message you’ve encountered sounds familiar; if you’re experiencing the issue I’m thinking of, it’s caused by the order in which config-include files are read by Sitecore.
      Try one of the following:

      • EITHER prefix your config-include filename with something like “z”, e.g. rename it from “my-file.config” to “z-my-file.config”.
        This should work because Sitecore reads the config-files in lexicographical order, so prefixing your file with “z” makes sure it’s included after the default search configuration files (“Sitecore.ContentSearch.XYZ.config” etc.).
      • OR move your config-include file to a subfolder of the Include-directory, e.g. move it from “/App_Config/Include/my-file.config” to “/App_Config/Include/K-J/my-file.config”.
        This should work because Sitecore reads config-files in subdirectories after files in the top directory.

      Hope this helps!

  2. Nice article – just what I needed. Just a quick note on the prefixing with “z” to ensure the files are loading in the right order. If you put your include files in a sub-folder inside the Include folder, Sitecore will load in the Include folder first and then all the Sub Folders, I’ve moved all my include files to sub folders now and it works great!

  3. Hey Uli!

    I know this was a long time ago, but I was wondering if you ever figured out what was causing the patched version of the setting to be different than the one in the patch file?

    -Elena

    • Hi Elena,

      What you’re describing sounds like an issue related to the order in which config include files are applied/read from disk.
      If you’re using config include files, try putting them in a subfolder, e.g. “App_Config/Include/MyFiles/DateTimeFormat.config”.

    • Or you might be using a different version of Sitecore than I did at the time.
      Part of the content search configuration was moved around a bit in Sitecore 7.2 for instance.
      If you’re lucky, Sitecore has documented this somewhere in the patch notes.

      • Thanks for getting back to me! This was also on a 7.2 instance. Turned out the patched config I was looking at (through showconfig.aspx) was lying to me about what file the patch came from.

        -Elena
