I decided to log the resolution to this error in case someone else comes across it. Here is what I came up with:
SharePoint Portal Server has a limit to the amount of text data that it can filter from
a single document. This limit applies only to the text in the document. It does not
apply to graphics or any other type of content. By default, the limit is 64 KB of text
in a single document. If a document has more text than the limit, SharePoint Portal
Server stops the indexing process, considers the document indexed, and moves on
to the next document.
The resolution is:
SharePoint Portal Server has a registry value that limits the maximum file size that
it will crawl. The registry value is called Max Download Size and is located in the
HKLM\Software\Microsoft\SPS Search\Gathering Manager key. By default, this value
is set to 16 MB. SharePoint Portal Server also has a registry value that defines how
many times larger the filtered text can be than the file it comes from. This value is
called Max Grow Factor and is located in the same Gathering Manager key. By default,
it is set to four. This means that the text filtered from a file can be only up to four
times larger than the file size. So the fix is to raise one or both of these values.
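To make the arithmetic concrete, here is a rough Python sketch of how I read the two settings working together. It is only a model of the description above, not anything taken from SharePoint itself: the function name, the parameter names, and the assumption that a file over the download limit is skipped entirely are all mine.

```python
MB = 1024 * 1024  # bytes per megabyte


def text_limit_bytes(file_size_bytes, max_download_size_mb=16, max_grow_factor=4):
    """Return the most text (in bytes) that would be filtered from a file.

    Hypothetical model of the two registry values described above:
    - files larger than the download limit are assumed not to be crawled (None)
    - otherwise the filtered text is capped at file size * grow factor
    """
    if file_size_bytes > max_download_size_mb * MB:
        return None  # assumed: file is too large to be crawled at all
    return file_size_bytes * max_grow_factor


# A 2 MB file with the default settings described in this post
print(text_limit_bytes(2 * MB) // MB)

# A 20 MB file is over the default 16 MB download limit
print(text_limit_bytes(20 * MB))

# The same 20 MB file after raising the download limit to 32 MB
print(text_limit_bytes(20 * MB, max_download_size_mb=32) // MB)
```

If this reading is right, a document whose text gets cut off needs either a larger Max Download Size (when the file itself is too big) or a larger Max Grow Factor (when the file is small but expands to a lot of text).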
If anyone has any more info, please feel free to reply.