This post continues from the previous one, where I discussed some of the drawbacks of using a network file share.
In that last post I mentioned there are some situations where using a file share can still be useful.
Company A has been storing content on a file share for many, many years. People have been granted access to specific folders, and they also have the freedom to create sub-folders (which they do).
Over time, the file share has become unmanageable. As Adrian pointed out in his post, this has several disadvantages: nothing can be found easily; a lot of the information is inaccessible; collaboration is not as effective as it could be; and so on.
Recently Company A became aware of SharePoint. “Cool!” they thought. “Let’s move everything into a document library. Then we won’t have problems any more.”
Is SharePoint the answer?
I certainly agree that a product like SharePoint can be useful. Once the content is in SharePoint, it can be further categorized using metadata, made accessible through the use of views, etc, and can be easily searched. Company A also thought this way.
Company A considered a couple of options here:
- Move the content from the file share to a SharePoint document library, and then just get SharePoint to index it, so that a search can be done whenever anyone needs to find something.
- Move the content from the file share to the document library, and then add appropriate metadata (enrich), and then also perform a crawl.
Let’s look at the options
Options 1 and 2 are both good. Having all the content in SharePoint means that it’s all there in one place. Security can be applied, as well as versioning.
Option 2 increases the findability of the content even more by adding rich, meaningful metadata. Company A can create a taxonomy that allows the content to be suitably categorised. Combined with customizable views, users can display the content of the document libraries in multiple different ways.
As mentioned, there are terabytes of content in the fileshares.
Moving terabytes of content into a document library would mean that the content database is now terabytes in size. Unless the data is properly optimized and maintained, this will be a big hit on performance.
In the fileshares there are thousands of text documents, spreadsheets, pdfs, images, CAD drawings, project files, mp3 files, films, executables, and a wide assortment of different file formats, from the different applications.
SharePoint can accept all these formats, but by default a lot of them are blocked from being uploaded to a document library, and exceptions have to be made.
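Before deciding anything, it helps to know what is actually sitting in the file share. Here is a minimal Python sketch that counts files and bytes per extension, and flags extensions that appear in a blocked list. Note that `ASSUMED_BLOCKED` is a made-up, illustrative set and not SharePoint's real default list (the real list is whatever your farm is configured with), and the function names are my own.

```python
import os
from collections import defaultdict

# Assumption: an illustrative set only, NOT SharePoint's actual blocked list.
ASSUMED_BLOCKED = {".exe", ".bat", ".dll"}


def inventory_by_extension(root):
    """Return {extension: [file_count, total_bytes]} for every file under root."""
    totals = defaultdict(lambda: [0, 0])
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            ext = os.path.splitext(name)[1].lower()
            path = os.path.join(folder, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # Skip files we cannot read (permissions, broken links, etc.)
                continue
            totals[ext][0] += 1
            totals[ext][1] += size
    return dict(totals)


def blocked_extensions(totals, blocked=ASSUMED_BLOCKED):
    """Return the sorted extensions in the inventory that are in the blocked set."""
    return sorted(ext for ext in totals if ext in blocked)
```

Running `inventory_by_extension` against the root of the share gives a rough idea of how much of the content would end up in the database, and `blocked_extensions` shows which file types would need exceptions.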
The file share has worked well for a long time. The main concerns were:
a. manageability – it was hard to manage security on the fileshares, and to keep track of when files were modified. It was also extremely difficult to navigate the folder structure; and
b. findability – it was almost impossible to locate any files (unless you knew where they were in the first place).
Keeping this in mind, here are a couple of alternatives Company A could consider:
- Keep everything in the fileshare, but configure SharePoint to crawl and index the files in it.
- Keep “static” files in the fileshare, move “dynamic” ones into SharePoint.
Let’s look at the alternatives
The advantage of the first scenario is that you avoid that very, very large database. By setting up the fileshare as a content source, you can configure SharePoint to crawl it. As a result, users can perform searches to find what they want. Scheduling incremental crawls allows SharePoint to pick up any changes that are made to the content of the files.
The disadvantage of this becomes obvious when the security on a file changes: a full crawl is necessary to pick up any security changes. If security changes regularly (new users being granted access to the share, access to documents being changed, etc.), full crawls are needed regularly too, and they can take a very long time (especially with a slow/busy network).
In the second scenario, a little bit of work is required to identify all the documents that are not “active” (i.e. the documents that users are not currently modifying). This would include films, images, executables, and any other files that are not “dynamic”.
Then the company could move any documents that are “dynamic” (still being edited, etc) into SharePoint. Then, as described above, extra metadata can be added to improve findability.
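One rough way to make that split is by last-modified date. The sketch below is only an illustration under my own assumptions: a file counts as “static” if it hasn't been modified for `max_idle_days` days (365 is an arbitrary default), and the function name is hypothetical. A real exercise would probably also look at file type and ask the owners.

```python
import os
import time


def split_static_dynamic(root, max_idle_days=365, now=None):
    """Split files under root into (static, dynamic) lists of paths.

    Assumption: "static" means not modified in the last max_idle_days days.
    """
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * 86400  # seconds in a day
    static, dynamic = [], []
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            try:
                modified = os.path.getmtime(path)
            except OSError:
                # Skip files we cannot read (permissions, broken links, etc.)
                continue
            (static if modified < cutoff else dynamic).append(path)
    return sorted(static), sorted(dynamic)
```

The “dynamic” list is the candidate set for moving into SharePoint; the “static” list stays where it is.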
The fileshare can then be treated as an “archive”, and the security changed to Read Only. This will ensure that no documents get modified, and therefore the content only has to be crawled once.
Alternatively, lock down the file share so that no one can modify the permissions on folders or documents. Because there is no security change, no full crawl is required. Regular incremental crawls can be scheduled to pick up any changes to the content of the documents.
The other day I was watching a demo of AvePoint’s File Share Connector. This connector allowed users working in SharePoint to interact seamlessly with the documents that were actually in the file share.
The obvious advantage of this is that SharePoint functionality is available, without jamming all the files in the database.
I was pretty impressed with what I saw. However – I haven’t used the connector myself in a real-life situation, so I cannot comment on it any further. If you have used it, please feel free to let me know.